Abstract
Building on an analysis of tag co-occurrence, this paper proposes a co-occurrence-based spectral clustering method for tags. The method measures the relatedness of tags directly from their co-occurrence relations, which avoids the high-dimensionality and sparsity problems that arise when tags are represented in a vector space model. To measure co-occurrence similarity between tags, an integrated approach is designed and a formula for the integrated co-occurrence similarity of tags is given. Compared with the traditional approach, which relies solely on the individual co-occurrence of tags, the integrated approach considers both the individual co-occurrence similarity and the group co-occurrence similarity of tags, and can therefore characterize the co-occurrence similarity between tags more precisely. Experimental results show that tag co-occurrence spectral clustering based on the integrated co-occurrence similarity achieves good results.
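The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not reproduce the paper's similarity formula, so everything concrete here is an assumption made for illustration: individual co-occurrence similarity is taken as a Jaccard ratio, group co-occurrence similarity as the cosine between the two tags' co-occurrence counts with all other tags, and the two are mixed with a hypothetical weight `alpha`. The clustering step is a plain two-way normalized spectral partition computed by power iteration, not necessarily the variant used in the paper. The sample `resources` data are invented.

```python
import math
import random
from itertools import combinations


def cooccurrence(resources):
    """Count tag frequencies and pairwise co-occurrences over tagged resources."""
    tags = sorted(set().union(*resources))
    idx = {t: i for i, t in enumerate(tags)}
    n = len(tags)
    freq = [0] * n
    co = [[0] * n for _ in range(n)]
    for tagset in resources:
        for t in tagset:
            freq[idx[t]] += 1
        for a, b in combinations(sorted(tagset), 2):
            co[idx[a]][idx[b]] += 1
            co[idx[b]][idx[a]] += 1
    return tags, freq, co


def integrated_similarity(freq, co, alpha=0.5):
    """Assumed form: alpha * individual (Jaccard) + (1 - alpha) * group (cosine)."""
    n = len(freq)
    W = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        union = freq[i] + freq[j] - co[i][j]
        individual = co[i][j] / union if union else 0.0
        # Group similarity: compare how i and j co-occur with every third tag.
        others = [k for k in range(n) if k not in (i, j)]
        dot = sum(co[i][k] * co[j][k] for k in others)
        ni = math.sqrt(sum(co[i][k] ** 2 for k in others))
        nj = math.sqrt(sum(co[j][k] ** 2 for k in others))
        group = dot / (ni * nj) if ni and nj else 0.0
        W[i][j] = W[j][i] = alpha * individual + (1 - alpha) * group
    return W


def spectral_bipartition(W, iters=500, seed=0):
    """Split nodes by the sign of the second eigenvector of D^-1/2 W D^-1/2."""
    n = len(W)
    deg = [sum(row) for row in W]
    if any(d <= 0 for d in deg):
        raise ValueError("every tag needs positive similarity to some other tag")
    s = [math.sqrt(d) for d in deg]
    norm_s = math.sqrt(sum(v * v for v in s))
    v1 = [v / norm_s for v in s]  # known leading eigenvector
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        proj = sum(a * b for a, b in zip(x, v1))
        x = [a - proj * b for a, b in zip(x, v1)]  # deflate the leading eigenvector
        # Multiply by (M + I); the shift keeps all eigenvalues non-negative.
        y = [x[i] + sum(W[i][j] * x[j] / (s[i] * s[j]) for j in range(n))
             for i in range(n)]
        norm_y = math.sqrt(sum(v * v for v in y))
        x = [v / norm_y for v in y]
    proj = sum(a * b for a, b in zip(x, v1))
    x = [a - proj * b for a, b in zip(x, v1)]
    return [1 if x[i] / s[i] >= 0 else 0 for i in range(n)]


def cluster_tags(resources, alpha=0.5):
    """Return the two tag clusters as a sorted list of sorted tag lists."""
    tags, freq, co = cooccurrence(resources)
    labels = spectral_bipartition(integrated_similarity(freq, co, alpha))
    groups = {0: [], 1: []}
    for tag, label in zip(tags, labels):
        groups[label].append(tag)
    return sorted(sorted(g) for g in groups.values())


# Invented sample data: each set holds the tags attached to one resource.
resources = [
    {"python", "code", "programming"}, {"python", "programming"},
    {"code", "programming", "python"}, {"recipe", "cooking", "food"},
    {"cooking", "food"}, {"recipe", "food"}, {"python", "food"},
]
print(cluster_tags(resources))
```

Because the similarity matrix is built directly from co-occurrence counts, no tag is ever embedded as a high-dimensional sparse vector over resources, which is the advantage the abstract points to. Generalizing to more than two clusters would need several eigenvectors followed by a k-means step, which is omitted here.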
Source
《图书情报工作》
CSSCI
PKU Core (Peking University Core Journals list)
2014, Issue 23, pp. 129-135 (7 pages)
Library and Information Service
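Funding
National Natural Science Foundation of China project "Research on the Classification of Incompletely Labeled Data Streams Based on a Co-training Strategy" (Grant No. 61273292)
Ministry of Education Humanities and Social Sciences Research Youth Fund project "Research on Methods for Discovering Hierarchical Relations among Tags in Social Tagging Environments" (Grant No. 13YJCZHO77); this paper is one of the research outcomes of these projects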
Keywords
social tagging system
tag co-occurrence
spectral clustering
similarity