Double Cost-Sensitive Random Forest Algorithm (cited by: 3)
Abstract: To address the poor minority-class accuracy of classifiers on imbalanced data, a Double Cost-Sensitive Random Forest (DCS-RF) algorithm is proposed. DCS-RF introduces cost-sensitive learning into both the feature selection stage and the ensemble voting stage of the random forest. In the feature selection stage, a method that generates the cost vector with lower time complexity is proposed, and the cost vector is incorporated into the computation of split attributes, so that strong features are favored without destroying the randomness of the random forest. In the ensemble stage, misclassification cost is introduced to select a set of decision trees that is more sensitive to minority-class data. Experiments on UCI datasets show that the proposed algorithm achieves a higher overall recognition rate than the compared methods, with an average improvement of 2.46%, and that the improvements in minority-class recognition rate are all above 5%.
Authors: ZHOU Yan-long; SUN Guang-lu (School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China)
Source: Journal of Harbin University of Science and Technology, 2021, No. 5, pp. 44-50 (7 pages). Indexed in CAS and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61702140); Heilongjiang Provincial Science Foundation for Returned Overseas Scholars (LC2018030); Fundamental Research Funds for Heilongjiang Provincial Universities (JMRH2018XM04).
Keywords: random forest; imbalanced data; feature selection; cost-sensitive
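
The abstract names two places where cost enters the forest: the split computation and the selection of trees before voting. The record gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' DCS-RF method. The function names, the cost-reweighted Gini form, the `(true, predicted)` cost table, and the "keep the lowest-cost trees" rule are all hypothetical choices made for illustration.

```python
from collections import Counter


def cost_weighted_gini(labels, class_cost):
    """Gini impurity with class counts scaled by a per-class cost.

    Assumed form for illustration only; the abstract does not give the formula.
    """
    counts = Counter(labels)
    weighted = {c: n * class_cost[c] for c, n in counts.items()}
    total = sum(weighted.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in weighted.values())


def misclassification_cost(y_true, y_pred, cost_table):
    """Sum cost_table[(true, predicted)] over all wrong predictions."""
    return sum(cost_table.get((t, p), 0.0)
               for t, p in zip(y_true, y_pred) if t != p)


def select_trees(tree_preds, y_val, cost_table, keep):
    """Return indices of the `keep` trees with the lowest validation cost."""
    costs = [misclassification_cost(y_val, preds, cost_table)
             for preds in tree_preds]
    order = sorted(range(len(tree_preds)), key=lambda i: costs[i])
    return order[:keep]


def majority_vote(tree_preds, selected):
    """Plain majority vote over the selected trees, one label per sample."""
    n_samples = len(tree_preds[0])
    return [Counter(tree_preds[i][j] for i in selected).most_common(1)[0][0]
            for j in range(n_samples)]


# Toy validation set: class 1 is the minority class.
y_val = [1, 1, 0, 0, 0, 0]
# Hypothetical costs: missing a minority sample is five times worse
# than a false alarm on a majority sample.
cost_table = {(1, 0): 5.0, (0, 1): 1.0}
# Per-tree predictions on the validation set (made-up values).
tree_preds = [
    [0, 0, 0, 0, 0, 0],  # misses both minority samples
    [1, 1, 1, 0, 0, 0],  # one false alarm
    [1, 0, 0, 0, 0, 0],  # misses one minority sample
    [1, 1, 0, 1, 1, 0],  # two false alarms
]

selected = select_trees(tree_preds, y_val, cost_table, keep=3)
votes = majority_vote(tree_preds, selected)
```

In this toy setup the tree that predicts only the majority class carries the highest cost and is dropped, which is the behavior the abstract describes for the ensemble stage. In a real forest the per-tree predictions would come from the trained trees on held-out or out-of-bag data.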