High Dimensional Sparse Data Clustering Based on Sorting Idea

Cited by: 2
Abstract: The CABOSFV clustering algorithm is sensitive to the order in which data are input. To address this, the paper proposes a clustering algorithm for high-attribute-dimensional sparse data that incorporates a sorting idea. During the first clustering pass, the algorithm examines the non-zero attribute values of each pair of high-attribute-dimensional sparse data objects to determine which combinations of sets require a dissimilarity computation, which reduces the algorithm's complexity. Application results on one group of sample data show that the method improves the quality of CABOSFV clustering.
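Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2010, No. 22, pp. 13-14 (2 pages)
Funding: National Natural Science Foundation of China (Grant No. 60963008)
Keywords: high dimensional sparse data; CABOSFV clustering; sorting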