
Spatial Co-location Patterns Mining Algorithm over Massive Spatial Data Sets (cited by: 2)
Abstract: Spatial co-location pattern mining is an important task in spatial data mining. Existing algorithms, whether for certain or uncertain data, are inefficient in both running time and space, and are therefore impractical for massive data sets. Based on an analysis of the root causes of the excessive time and space consumption in traditional mining approaches, this paper proposes a grid differential algorithm for mining co-location patterns. The new algorithm subdivides the traditional grid into differential cells, computes for each differential cell the centroid of the instances that belong to the same feature, and then mines co-location patterns over these centroids using multiresolution pruning. While maintaining a high accuracy rate, the algorithm largely resolves the efficiency problems of traditional approaches, which makes co-location pattern mining over massive data feasible. Extensive experiments show that the grid differential algorithm is efficient, robust, and highly accurate.
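The centroid step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the square cell of side `cell_size`, the function names `compress_to_centroids` and `compression_ratio`, and the ratio formula itself are all assumptions made for illustration, and the multiresolution pruning stage is not shown.

```python
from collections import defaultdict
from math import floor


def compress_to_centroids(instances, cell_size):
    """Replace the instances of each feature in each cell by their centroid.

    instances: iterable of (feature, x, y) tuples.
    cell_size: side length of a square differential cell (assumed shape).
    Returns a sorted list of (feature, centroid_x, centroid_y, count).
    """
    # Accumulate [sum_x, sum_y, count] per (feature, cell column, cell row).
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for feature, x, y in instances:
        key = (feature, floor(x / cell_size), floor(y / cell_size))
        acc = sums[key]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return [
        (feature, sx / n, sy / n, n)
        for (feature, _, _), (sx, sy, n) in sorted(sums.items())
    ]


def compression_ratio(n_instances, n_centroids):
    """Hypothetical definition: fraction of instances removed by compression.

    The paper's own definition of the compression ratio may differ.
    """
    return 1.0 - n_centroids / n_instances


if __name__ == "__main__":
    data = [("A", 0.25, 0.25), ("A", 0.75, 0.75), ("B", 0.5, 0.5), ("A", 1.5, 0.5)]
    centroids = compress_to_centroids(data, cell_size=1.0)
    print(centroids)
    print(compression_ratio(len(data), len(centroids)))
```

Under these assumptions, a smaller `cell_size` keeps the centroids closer to the original instances at the cost of less compression, which is the accuracy and efficiency trade-off the abstract alludes to.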
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, Peking University Core Journal, 2015, Issue 1, pp. 24-35 (12 pages)
Funding: National Natural Science Foundation of China (Nos. 61472346, 61272126, 61262069); Yunnan Provincial Department of Education Fund Project (No. 2012C103)
Keywords: grid differential algorithm; centroid; σ^2 differential grid; compression ratio of spatial instances
