
Fuzzy Clustering Ensemble Based on Mutual Information

Cited by: 2
Abstract: Clustering ensemble is a new problem in machine learning. It combines multiple clustering partitions of the same data set in order to improve the performance of cluster analysis. Finding a "consensus clustering" from multiple partitions is a difficult problem that many researchers have studied. This paper proposes a fuzzy clustering ensemble algorithm based on mutual information. The algorithm extends the mutual-information-based clustering ensemble objective function proposed by Strehl & Ghosh, applies it to the ensemble of fuzzy partitions, and solves it with an algorithm similar to Information Bottleneck (IB) clustering. Experiments on four UCI data sets indicate that the mutual-information-based clustering ensemble achieves good performance.
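Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journals, 2007, No. 6, pp. 1068-1071 (4 pages)
Funding: Supported by the Key Program of the National Natural Science Foundation of China (60234030).
Keywords: clustering ensemble; mutual information; information bottleneck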
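
The abstract refers to the mutual-information-based ensemble objective of Strehl & Ghosh. As a rough illustration only, the sketch below scores a candidate consensus labeling by its average normalized mutual information (NMI) with each partition in the ensemble, for hard (crisp) labels. The function names and the toy data are hypothetical, and the paper's extension to fuzzy partitions and its Information-Bottleneck-like solver are not reproduced here, since the abstract does not give their details.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a hard labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two hard labelings of the same objects."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI normalized by sqrt(H(a) * H(b)); 0.0 if either labeling is trivial."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 or h_b == 0.0:
        return 0.0
    return mutual_information(a, b) / math.sqrt(h_a * h_b)


def average_nmi(candidate, partitions):
    """Average NMI between a candidate consensus and every ensemble member."""
    return sum(nmi(candidate, p) for p in partitions) / len(partitions)


# Toy ensemble (hypothetical data): three partitions of six objects.
partitions = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 0, 0, 2, 2],  # same grouping as the first, labels permuted
    [0, 0, 0, 1, 1, 1],
]
consensus = [0, 0, 1, 1, 2, 2]
score = average_nmi(consensus, partitions)
print(f"average NMI of candidate consensus: {score:.3f}")
```

Because NMI is invariant to label permutation, the first two partitions agree perfectly with the candidate even though their label values differ; a consensus search would look for the labeling that maximizes this average.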

References (13)

  • 1 Dietterich T G. Machine learning research: four current directions[J]. AI Magazine, 1997, 18(4): 97-136. Cited by: 1
  • 2 Strehl A, Ghosh J. Cluster ensembles-a knowledge reuse framework for combining partitions[C]. In: Proc. Conference on Artificial Intelligence (AAAI 2002), Edmonton, 93-98. Cited by: 1
  • 3 Fred A L N, Jain A K. Data clustering using evidence accumulation[C]. In: Proc. of the 16th International Conference on Pattern Recognition, ICPR 2002, Quebec City, 276-280. Cited by: 1
  • 4 Fern X Z, Brodley C E. Random projection for high dimensional data clustering: a cluster ensemble approach[C]. In: Proceedings of the 20th International Conference on Machine Learning, 2003, 186-193. Cited by: 1
  • 5 Monti S, Tamayo P, Mesirov J, et al. Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data[J]. Machine Learning, 2003, 52: 91-118. Cited by: 1
  • 6 Topchy A, Jain A, Punch W. A mixture model for clustering ensembles[C]. In: Proc. SIAM Data Mining, 2004, 379-390. Cited by: 1
  • 7 Tang W, Zhou Z H. Bagging-based selective clusterer ensemble[J]. Journal of Software (软件学报), 2005, 16(4): 496-502. Cited by: 95
  • 8 Frossyniotis D, Likas A, Stafylopatis A. A clustering method based on boosting[J]. Pattern Recognition Letters, 2004, 25: 641-654. Cited by: 1
  • 9 Slonim N. The information bottleneck: theory and applications[D]. Hebrew University, Jerusalem, Israel, 2002. Cited by: 1
  • 10 Blake C, Keogh E, Merz C J. UCI repository of machine learning databases[EB/OL]. Irvine: Department of Information and Computer Science, University of California, 1998. http://www.ics.uci.edu/~mlearn/MLRepository.html. Cited by: 1


