A Multi-Label Classification Algorithm Based on Random Walk Model (cited by: 57)
Abstract: Traditional single-class and multi-class classification problems, in which each data point is assigned to exactly one category, have been studied extensively in data mining. In many applications, however, a data point may belong to more than one category. This is the Multi-Label Classification (MLC) problem, and the prevalence and importance of multi-label data have drawn attention only in recent years. Because labels are correlated, methods designed for single-class and multi-class problems cannot be applied directly to MLC. This paper proposes an MLC algorithm based on the random walk model, called the Multi-Label Random Walk (MLRW) algorithm. First, the multi-label training data are mapped to a multi-label random walk graph. When an unlabeled data point arrives, a series of multi-label random walk graphs is built. The random walk model is then applied to each graph in the series to obtain the probability of visiting each vertex, and this vertex distribution is converted into a probability distribution over labels. Finally, a new threshold learning algorithm based on MLRW is given, which produces the final prediction for each label. Experiments on real data sets show that MLRW solves the MLC problem effectively.
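The pipeline outlined in the abstract (graph over training data, random walk started from the unlabeled point, vertex distribution converted to label scores, thresholding) can be illustrated with a minimal sketch. This is not the paper's MLRW algorithm: the abstract does not specify how the graph series is built or how thresholds are learned. The sketch assumes a single fully connected graph with Gaussian similarity weights, a random walk with restart, an even split of each vertex's probability among its labels, and a fixed threshold. The names `rwr_label_scores`, `predict_labels`, `restart`, and `threshold` are hypothetical.

```python
import math


def rwr_label_scores(train_x, train_y, query, num_labels, restart=0.15, iters=100):
    """Score labels for `query` by a random walk with restart over training points.

    Assumptions (not taken from the paper): one fully connected graph,
    Gaussian similarity as edge weight, at least two training points.
    """
    n = len(train_x)

    def sim(a, b):
        # Gaussian similarity between two feature vectors.
        return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)))

    # Row-normalised transition matrix between training vertices (no self-loops).
    weights = [[sim(train_x[i], train_x[j]) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    trans = [[w / sum(row) for w in row] for row in weights]

    # Restart distribution: where the walk re-enters the graph from the query.
    s = [sim(query, x) for x in train_x]
    total = sum(s)
    r = [v / total for v in s]

    # Power iteration: pi <- restart * r + (1 - restart) * pi * trans.
    pi = r[:]
    for _ in range(iters):
        pi = [restart * r[j] + (1 - restart) * sum(pi[i] * trans[i][j] for i in range(n))
              for j in range(n)]

    # Convert the vertex distribution into label scores by splitting each
    # vertex's probability evenly among the labels of that training point.
    scores = [0.0] * num_labels
    for p, labels in zip(pi, train_y):
        for label in labels:
            scores[label] += p / len(labels)
    return scores


def predict_labels(scores, threshold):
    """Return the labels whose score reaches a fixed threshold (a placeholder
    for the learned thresholds that the paper describes)."""
    return [label for label, s in enumerate(scores) if s >= threshold]


if __name__ == "__main__":
    xs = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
    ys = [[0], [0], [1], [1, 2]]
    scores = rwr_label_scores(xs, ys, (0.1, 0.2), num_labels=3)
    print(scores, predict_labels(scores, 0.5))
```

When every training point carries at least one label, the scores sum to one, so they can be read as a distribution over labels. A learned, per-label threshold would replace the fixed cutoff in `predict_labels`.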
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2010, No. 8, pp. 1418-1426 (9 pages)
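Funding: Supported by the National Natural Science Foundation of China (60803016), the National Basic Research Program of China ("973" Program; 2007CB310802, 2009CB320706), and the National High Technology Research and Development Program of China ("863" Program; 2008AA042301, 2007AA040602).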
Keywords: multi-label classification; classification algorithm; random walk; threshold learning