
Clustering with Joint Laplacian Regularization and Adaptive Feature Learning (cited by: 6)
Abstract: In the era of information explosion, big data processing has become an active research topic worldwide. Spectral clustering algorithms are widely used because of their empirical performance, but the curse of dimensionality still makes clustering high-dimensional data a challenging problem for mainstream spectral methods. This study proposes a clustering model that combines feature selection with a graph Laplacian constraint, referred to as joint Laplacian regularization and adaptive feature learning (LRAFL). The graph Laplacian is learned from adaptive neighbors, and low-dimensional embedding, feature selection, and subspace clustering are integrated into a single framework. This replaces the two-stage procedure of conventional spectral clustering, which first constructs the graph Laplacian and then performs spectral analysis. By imposing non-negativity and sum constraints together with a low-rank constraint, LRAFL obtains a sparse feature weight vector and a Laplacian matrix with block-diagonal structure. With the rank constraint imposed on the Laplacian matrix, the number of connected components in the resulting similarity matrix equals the number of clusters. An effective method is also proposed to solve the formulated optimization problem, and the convergence behavior, computational complexity, and setting of the balance parameters are analyzed theoretically. Experiments on synthetic data and several public benchmark datasets show that LRAFL outperforms related state-of-the-art clustering algorithms in effectiveness, efficiency, and ease of implementation.
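The rank constraint mentioned in the abstract rests on a standard fact from spectral graph theory: for a non-negative similarity matrix, the multiplicity of the zero eigenvalue of its graph Laplacian equals the number of connected components of the graph. The sketch below only illustrates that property on a toy matrix; it is not the LRAFL optimization procedure, and the function names (`laplacian`, `count_components`, `component_labels`) and the toy matrix `S` are assumptions made for illustration.

```python
import numpy as np

def laplacian(S):
    """Unnormalized graph Laplacian L = D - W, with W the symmetrized similarity."""
    W = (S + S.T) / 2.0
    return np.diag(W.sum(axis=1)) - W

def count_components(S, tol=1e-8):
    """Count connected components as the multiplicity of eigenvalue 0 of L."""
    eigvals = np.linalg.eigvalsh(laplacian(S))
    return int(np.sum(np.abs(eigvals) < tol))

def component_labels(S, tol=1e-12):
    """Assign each node the index of its connected component (depth-first search)."""
    W = (S + S.T) / 2.0
    n = W.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.nonzero(W[i] > tol)[0]:
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy similarity matrix (hypothetical): nodes {0, 1, 2} and {3, 4} form two blocks.
S = np.zeros((5, 5))
S[0, 1] = S[1, 0] = 1.0
S[1, 2] = S[2, 1] = 0.5
S[3, 4] = S[4, 3] = 0.8

n_clusters = count_components(S)   # two zero eigenvalues, so two components
labels = component_labels(S)       # cluster labels read directly from the graph
```

When a learned similarity matrix has exactly as many components as the desired number of clusters, the cluster assignment can be read off the graph directly, with no separate post-processing step such as k-means on the spectral embedding.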
Authors: ZHENG Jian-Wei; LI Zhuo-Rong; WANG Wan-Liang; CHEN Wan-Jun (School of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China; School of Computer and Computing Science, Zhejiang University City College, Hangzhou 310015, China)
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the PKU Core Journals list; 2019, Issue 12, pp. 3846-3861 (16 pages)
Funding: National Natural Science Foundation of China (61602413, 61873240); Zhejiang Provincial Natural Science Foundation (LY19F030016)
Keywords: Laplacian matrix; feature selection; spectral clustering; similarity matrix; low-rank constraint