Random Forest Unbalanced Feature Selection Algorithm Based on Multiple Screening

Cited by: 1
Abstract: With the accelerated development of industrialization, the volume of data produced by industrial sensors has grown dramatically, and its high dimensionality and class imbalance pose serious challenges for data analysis. To address the poor performance of traditional classification algorithms on high-dimensional unbalanced industrial sensor data, this paper proposes a random forest unbalanced feature selection algorithm based on multiple screening. First, an unbalanced-data processing method based on a specific factor is proposed to obtain a balanced data set. Second, features are screened using a correlation coefficient graph. Finally, the maximal information coefficient (MIC) is introduced to measure the correlation between each feature and the class label; the features are ranked by MIC value and the last screening is performed. A stamping press fault prediction experiment was carried out on data sets from industrial press sensors, including temperature, pressure, and three-phase power sensors. The results show that the proposed algorithm effectively addresses the unsatisfactory classification of high-dimensional unbalanced data: accuracy, precision, recall, and G-mean are all higher than those of a single model.
Authors: WANG Hongxia (王红霞); WANG Kaixiang (汪楷翔) (Shenyang Ligong University, Shenyang 110159, China)
Source: Journal of Shenyang Ligong University (《沈阳理工大学学报》), CAS, 2021, No. 5, pp. 17-21, 51 (6 pages in total)
Keywords: multiple screening; feature selection; unbalanced data; random forests
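
The three screening stages named in the abstract, together with the G-mean metric it reports, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract does not describe the "specific factor" balancing method, the correlation threshold, or the MIC settings, so the sketch makes labeled substitutions: plain random oversampling stands in for stage 1, and a binned mutual information estimate stands in for MIC in stage 3. All function names and default values are hypothetical, and the final random forest classifier is omitted.

```python
import math
import random
from collections import Counter


def oversample(rows, labels, seed=0):
    """Stage 1 stand-in (assumption): duplicate random minority rows
    until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def drop_correlated(rows, threshold=0.9):
    """Stage 2: keep a feature index only if its |r| with every
    already-kept feature is at most the (assumed) threshold."""
    cols = list(zip(*rows))
    kept = []
    for j, col in enumerate(cols):
        if all(abs(pearson(col, cols[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


def binned_mi(values, labels, bins=4):
    """Stage 3 stand-in for MIC (assumption): mutual information in bits
    between an equal-width-binned feature and the class label."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant feature: fall back to one bin
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    pxy, px, py = Counter(zip(binned, labels)), Counter(binned), Counter(labels)
    return sum(c / n * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for a binary problem."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    pos = sum(t == positive for t in y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        return 0.0  # undefined when a class is absent; report 0 by convention
    return math.sqrt((tp / pos) * (tn / neg))
```

In a fuller pipeline, the features that survive `drop_correlated` would be ranked by their relevance score, the top-ranked ones passed to a random forest classifier, and the predictions scored with `g_mean` alongside accuracy, precision, and recall. A true MIC computation would replace `binned_mi`, which only approximates the idea of scoring feature-to-class dependence.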