Applying a Subspace Clustering Algorithm in Multi-Label Text Classification

Cited by: 4
Abstract: With the rise of social networks, the volume of text data keeps growing, which has made automated text classification a research focus. A single text may carry several category labels at once, and this property causes conventional binary or multi-class classification techniques to perform poorly on multi-label text data. To address this shortcoming, we propose SCA (subspace clustering analysis), a semi-supervised, impurity-based subspace clustering algorithm that analyses the potential correlation between each pair of classification and label in a multi-label setting. We also design a multi-label classifier that is more effective on the classified text data. Finally, experiments on two multi-label text datasets show that the algorithm outperforms the other text classification methods currently in use.
Source: Computer Applications and Software (《计算机应用与软件》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 8, pp. 288-291, 303 (5 pages in total)
Keywords: Text data; Multi-label; Classifier; Subspace clustering; Impurity
