Discovering worthy spatial co-location patterns based on pattern distributions through clustering

Abstract

Spatial co-location pattern mining aims to uncover associations among spatial features, enabling users to discover correlation knowledge from spatial datasets. However, as spatial datasets grow, traditional frameworks for mining co-location patterns produce an overwhelming number of redundant results, which complicates further analysis. This paper focuses on extracting worthy co-location patterns, which are concise summaries of prevalent co-location patterns. We introduce two similarity measures—feature-based similarity and distribution-based similarity—to evaluate redundancy between co-location patterns from both feature and instance perspectives. Using these measures, we propose a novel approach called the Worthy Co-location Patterns Mining algorithm (WCPM) to condense prevalent co-location patterns. Initially, we employ a clique-based method to discover prevalent co-location patterns and categorize them into Maximal Co-location Patterns (MCPs) and Non-Maximal Co-location Patterns (NMCPs). Subsequently, we cluster the MCPs to extract the feature-similar MCPs, and based on distribution similarity, identify the worthy MCPs from the clustering results. Finally, we design a top-down algorithm to mine Worthy Non-Maximal Co-location Patterns (WNMCPs). Experiments on both synthetic and real datasets demonstrate that WCPM outperforms similar state-of-the-art approaches in terms of compression power and running time.
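To make the first two stages concrete, the following minimal sketch separates a set of prevalent patterns into maximal and non-maximal ones and then compares the maximal patterns by their features. It is an illustration under stated assumptions, not the WCPM algorithm itself: the abstract does not define the two similarity measures, so Jaccard similarity over feature sets is used here as an assumed stand-in, and the function names `split_maximal` and `jaccard` and the sample patterns are hypothetical.

```python
from itertools import combinations


def split_maximal(prevalent):
    """Split prevalent patterns into maximal and non-maximal ones.

    A pattern is maximal when no other prevalent pattern is a proper
    superset of it.
    """
    patterns = [frozenset(p) for p in prevalent]
    maximal = [p for p in patterns if not any(p < q for q in patterns)]
    non_maximal = [p for p in patterns if p not in maximal]
    return maximal, non_maximal


def jaccard(a, b):
    """Jaccard similarity of two sets.

    Assumption: used only as a stand-in for the paper's feature-based
    similarity, whose exact definition is not given in the abstract.
    """
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical prevalent patterns over spatial features A-E.
prevalent = [{"A", "B", "C"}, {"A", "B"}, {"B", "C"}, {"A", "B", "D"}, {"D", "E"}]
mcps, nmcps = split_maximal(prevalent)

# Pairwise feature similarity between maximal patterns; a clustering
# step could group pairs whose similarity exceeds a chosen threshold.
for p, q in combinations(mcps, 2):
    print(sorted(p), sorted(q), round(jaccard(p, q), 2))
```

The same `jaccard` helper could in principle be applied to sets of instance locations to approximate a distribution-based comparison, although that is again an assumption rather than the measure the paper defines.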

Keywords

Spatial pattern mining; worthy co-location patterns; clustering; similarity metric; pattern recommendation
