Abstract
[Objective] Traditional spatial co-location pattern mining, when applied to the relationship between two broad classes of features such as pollution sources and cancer cases, produces many patterns of no interest to users and considers only the prevalence of a pattern. This paper addresses both problems. [Methods] First, we combine the properties of the Voronoi diagram with the star instance model to define the neighbor relationship between spatial instances and the concept of a spatial ordered pair pattern. Second, taking the distance decay effect and the influence superposition effect into account, we define the prevalence and the influence of a spatial ordered pair pattern. Finally, we propose a basic algorithm and an optimized algorithm for mining these patterns. [Results] Both proposed algorithms find patterns of interest to users that traditional algorithms miss, and they return far fewer results than traditional algorithms. Compared with the basic algorithm, the optimized algorithm achieves a pruning rate above 80%, and the benefit grows with the size of the data set. [Limitations] All data are assumed to be point spatial objects; extended spatial objects require further study. [Conclusions] Spatial ordered pair patterns are better suited to studying the relationship between two broad classes of features such as pollution sources and cancer cases.
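The abstract names two effects, distance decay and influence superposition, without giving their formulas. The following is only a minimal illustrative sketch of what such effects can look like, not the paper's definitions: the exponential kernel, the `bandwidth` parameter, and the function names `decayed_influence` and `superposed_influence` are all assumptions introduced here, and the Voronoi-based neighbor relationship is not reproduced.

```python
import math


def decayed_influence(source, case, bandwidth):
    """Influence of one source on one case.

    Assumption: an exponential kernel exp(-d / bandwidth), chosen only for
    illustration. The paper's actual decay function is not stated in the abstract.
    """
    d = math.dist(source, case)  # Euclidean distance between two points
    return math.exp(-d / bandwidth)


def superposed_influence(sources, case, bandwidth=1.0):
    """Total influence on one case.

    Assumption: superposition is modeled as a plain sum of the individual
    decayed influences of all given sources.
    """
    return sum(decayed_influence(s, case, bandwidth) for s in sources)


if __name__ == "__main__":
    # Hypothetical coordinates: two pollution sources and one cancer case.
    pollution_sources = [(0.0, 0.0), (3.0, 4.0)]
    cancer_case = (0.0, 0.0)
    print(superposed_influence(pollution_sources, cancer_case, bandwidth=5.0))
```

Under these assumptions, a nearer source contributes more than a farther one, and a case surrounded by several sources accumulates a larger total than a case near a single source.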
Authors
谢旺
王丽珍
陈红梅
曾兰清
Xie Wang; Wang Lizhen; Chen Hongmei; Zeng Lanqing (School of Information Science and Engineering, Yunnan University, Kunming 650500, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core (Peking University Core Journals)
2021, Issue 2, pp. 14-31 (18 pages)
Data Analysis and Knowledge Discovery
Funding
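This work is one of the research outcomes of the National Natural Science Foundation of China (Grant Nos. 61966036, 61662086)
and the Yunnan Provincial Innovation Team Fund (Grant No. 2018HC019).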