Matrix Low Rank Approximation-Based Cluster Ensemble Algorithm (cited by: 6)

Abstract: Cluster ensemble techniques, an important extension of conventional clustering algorithms, have become an active topic in machine learning. This paper first casts the cluster ensemble problem as the intuitive problem of finding a best subspace. Using linear algebra, it then formulates that problem as a constrained optimization problem and, by relaxing the discrete constraint, reduces it to a matrix low rank approximation problem. Finally, an orthonormal basis of the best subspace is obtained from the singular value decomposition of the weighted adjacency matrix of the hypergraph. Based on this, a matrix low rank approximation-based algorithm is proposed: it runs the K-means algorithm on the coordinates of each object in the low-dimensional space to obtain the final clustering. Experiments on several benchmark datasets show that, compared with traditional cluster ensemble algorithms, the proposed algorithm yields better clustering results and is comparatively efficient.
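The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify how the hypergraph adjacency matrix is weighted, so the column normalization below is an assumption, as are the function names (`hypergraph_incidence`, `kmeans`, `low_rank_consensus`) and the deterministic farthest-first initialization used for K-means.

```python
import numpy as np

def hypergraph_incidence(labelings):
    """One indicator column (hyperedge) per cluster of every base clustering."""
    cols = []
    for labels in labelings:
        labels = np.asarray(labels)
        for c in np.unique(labels):
            cols.append((labels == c).astype(float))
    return np.column_stack(cols)  # shape: (n_objects, n_hyperedges)

def kmeans(X, k, n_iter=100):
    """Plain Lloyd iterations with farthest-first initialization (assumed choice)."""
    centers = [X[0]]
    for _ in range(1, k):
        # Squared distance of every point to its nearest chosen center.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        new_centers = np.array([
            X[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign

def low_rank_consensus(labelings, k):
    """Embed objects via a truncated SVD of the hypergraph matrix, then cluster."""
    H = hypergraph_incidence(labelings)
    # Assumed weighting: scale each hyperedge by 1/sqrt(cluster size).
    H = H / np.sqrt(H.sum(axis=0, keepdims=True))
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    coords = U[:, :k]  # orthonormal basis of the rank-k subspace
    return kmeans(coords, k)

# Three base clusterings of six objects that agree up to label permutation.
ensemble = [[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
consensus = low_rank_consensus(ensemble, k=2)
```

Taking the leading k left singular vectors gives the best rank-k approximation of the matrix in the least-squares sense, which is why the relaxed problem can be solved with a single SVD before the final K-means step.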

Authors: XU Sen (徐森), ZHOU Tian (周天), YU Hualong (于化龙), LI Xianfeng (李先锋)

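Affiliations: School of Information Engineering, Yancheng Institute of Technology; Key Laboratory of Underwater Acoustic Technology, Harbin Engineering University; School of Computer Science and Engineering, Jiangsu University of Science and Technology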

Source: Acta Electronica Sinica (《电子学报》), 2013, No. 6, pp. 1219-1224 (6 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals

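Funding: National Natural Science Foundation of China (No.60970542, No.41006057, No.6110507); National 863 Key Project (No.2008A09701); Key Project for International Science and Technology Cooperation and Expert Recruitment; Jiangsu Higher Education Institutions "Qing Lan Project"; Yancheng Institute of Technology Talent Introduction Fund (No.XKR2011019)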

Keywords: unsupervised learning; clustering analysis; cluster ensemble; matrix low rank approximation

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
