Ranking Distribution Based Co-clustering Algorithm for Heterogeneous Information Network

Abstract: Clustering heterogeneous information networks is an emerging problem. The recently proposed RankClus algorithm integrates clustering with ranking, two techniques previously regarded as orthogonal, so that each reinforces the other, offering a new approach to heterogeneous information network analysis. However, RankClus clusters only objects of a specified target type, so its results cover neither the complete structure of the heterogeneous network nor the full range of multi-type data. By introducing co-clustering and combining it with ranking, we propose a new clustering algorithm called RankCoClus. First, a ranking distribution matrix is generated by a ranking distribution generative model based on posterior probability; then co-clustering is used to cluster objects of different types simultaneously. This process both clusters the different node types of a heterogeneous information network at the same time and improves the consistency of the clusters across types. Controlled experiments on the real DBLP four-area data set and on synthetic data sets show that RankCoClus outperforms RankClus and the classic co-clustering algorithm in accuracy and cluster consistency.
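
The two-stage idea in the abstract (build a ranking distribution from posterior probabilities, then cluster both object types together) can be illustrated with a toy sketch. This is not the authors' RankCoClus implementation: the function name `rank_co_cluster_sketch`, the degree-style ranking, the round-robin initial partition, and the use of a plain argmax over posteriors in place of a real co-clustering step are all assumptions made for illustration, and the input is assumed to be a bi-type network given as a non-negative weight matrix `W` (rows are one object type, columns the other).

```python
import numpy as np

def rank_co_cluster_sketch(W, k, n_iter=10, eps=1e-12):
    """Toy alternating scheme (hypothetical, not the paper's algorithm)."""
    W = np.asarray(W, dtype=float)
    m, n = W.shape
    # Assumption: deterministic round-robin starting partition of the rows.
    row_labels = np.arange(m) % k
    for _ in range(n_iter):
        member = np.eye(k)[row_labels]                 # m x k one-hot membership
        # Per-cluster ranking of column objects: weight received from cluster rows.
        col_rank = member.T @ W                        # k x n
        col_rank /= col_rank.sum(axis=1, keepdims=True) + eps
        # Per-cluster ranking of row objects, propagated back through the links.
        row_rank = col_rank @ W.T                      # k x m
        row_rank /= row_rank.sum(axis=1, keepdims=True) + eps
        # Posterior p(cluster | object) is proportional to rank * cluster prior.
        prior = member.mean(axis=0)[:, None]           # k x 1
        row_post = row_rank * prior
        row_post /= row_post.sum(axis=0, keepdims=True) + eps
        col_post = col_rank * prior
        col_post /= col_post.sum(axis=0, keepdims=True) + eps
        # Stand-in for co-clustering: label both types from the same posteriors.
        new_row_labels = row_post.argmax(axis=0)
        col_labels = col_post.argmax(axis=0)
        if np.array_equal(new_row_labels, row_labels):
            break
        row_labels = new_row_labels
    # Transposed posteriors play the role of ranking distribution matrices.
    return row_labels, col_labels, row_post.T, col_post.T

if __name__ == "__main__":
    # Two disconnected 3 x 3 blocks, e.g. two research areas with no overlap.
    W = np.kron(np.eye(2), np.ones((3, 3)))
    rows, cols, row_dist, col_dist = rank_co_cluster_sketch(W, k=2)
    print(rows, cols)
```

Each row of the returned distribution matrices is a K-dimensional posterior vector for one object; a real co-clustering method would operate on these vectors rather than taking a simple argmax.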
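Authors: 童浩 (Tong Hao), 余春艳 (Yu Chunyan)
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2014, No. 11, pp. 2445-2449 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (60805042); the Natural Science Foundation of Fujian Province (2010J01329, 2011J05150, 2012J01262, 2013J01231); and the Fujian Province Major Industry-University Cooperation Project (2011H6014)
Keywords: clustering; ranking distribution; collaboration; heterogeneous information network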