Abstract
Acute liver failure (ALF) has no premonitory symptoms and progresses rapidly, so a timely and accurate pre-diagnosis helps preventive measures to be taken earlier. This paper compares three intelligent pre-diagnosis methods (XGBoost, neural network and random forest), each coupled with factor analysis; the coupled models are referred to as XGBoost-FA, ANN-FA and RF-FA. Using the Acute Liver Failure data set from a 2018 Kaggle competition as a case study, factor analysis first reduces the 30 feature variables to 16 common factors (cumulative contribution rate 80.6%). The 8785 samples are then split into a training set and a test set at a ratio of 7:3. On the test set, the (log loss, training time) pairs of the trained XGBoost-FA, ANN-FA and RF-FA pre-diagnosis models are (0.6646636, 14.8 s), (0.733198, 12.7 s) and (0.6721212, 23.1 s), respectively. The comparison shows that XGBoost-FA is the most accurate (lowest log loss) and ANN-FA is the fastest to train.
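The evaluation metric and the 7:3 split described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a binary ALF label, and it assumes the training-set size is rounded down, which the abstract does not state. The helper names `log_loss` and `split_indices` are hypothetical.

```python
import math
import random


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy.

    y_true: iterable of 0/1 labels (assumed binary ALF outcome).
    p_pred: predicted probabilities of class 1.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def split_indices(n_samples, num=7, den=10, seed=0):
    """Shuffle sample indices and split them num:(den - num).

    Rounding the training size down is an assumption of this sketch.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = n_samples * num // den  # integer arithmetic, no float rounding
    return idx[:n_train], idx[n_train:]


# 8785 samples split 7:3, as in the abstract.
train_idx, test_idx = split_indices(8785)
print(len(train_idx), len(test_idx))

# Reference point: a model that always predicts 0.5 scores ln 2.
toy_labels = [0, 1, 1, 0]
baseline = log_loss(toy_labels, [0.5] * len(toy_labels))
print(baseline, math.log(2))
```

A constant prediction of 0.5 gives a log loss of ln 2 ≈ 0.693 on any binary labels. This is a reference point for reading the reported test-set values, where a lower log loss is better.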
Authors
张冬阳
龚谊承
ZHANG Dong-yang; GONG Yi-cheng (Department of Mathematics and Statistics, Science College, Wuhan University of Science and Technology, Wuhan 430065, China; Hubei Province Key Laboratory of Systems Science in Metallurgical Process, Wuhan University of Science and Technology, Wuhan 430065, China)
Source
《数学的实践与认识》
Peking University Core Journal (北大核心)
2020, Issue 13, pp. 141-152 (12 pages)
Mathematics in Practice and Theory
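Funding
Hubei Provincial College Students' Innovation and Entrepreneurship Training Program (201810488097)
Hubei Province Key Laboratory of Systems Science in Metallurgical Process (Y201906)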