Abstract
An improved random forest algorithm (IRFA) is proposed to handle imbalanced classification and to improve the accuracy of churn prediction for high-value customers in the telecom industry. The algorithm changes how nodes are split when each tree in the forest is grown: nodes are split based on customer lifetime value, which is a modification of information gain. This not only addresses the imbalanced data distribution but also raises the prediction accuracy for high-value customers who are likely to churn. IRFA is applied to customer churn prediction for a telecom company. Experimental results show that, compared with other methods, IRFA achieves better classification performance and improves the accuracy of churn prediction for high-value customers.
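The abstract does not give the exact split criterion, so the following is only a minimal sketch of one plausible reading: an information gain in which each customer counts in proportion to a lifetime-value weight rather than as one sample. The function names `weighted_entropy` and `value_weighted_gain`, the binary threshold split, and the weighting scheme itself are assumptions made for illustration, not the formula used in the paper.

```python
import math


def weighted_entropy(labels, values):
    """Entropy in which each sample contributes its value weight (assumed scheme)."""
    total = sum(values)
    if total <= 0:
        return 0.0
    mass = {}
    for y, v in zip(labels, values):
        mass[y] = mass.get(y, 0.0) + v
    # Classes with zero mass are skipped to avoid log2(0).
    return -sum((m / total) * math.log2(m / total) for m in mass.values() if m > 0)


def value_weighted_gain(feature, labels, values, threshold):
    """Gain of the split `feature <= threshold`, with samples weighted by value."""
    total = sum(values)
    if total <= 0:
        return 0.0
    left = [i for i, x in enumerate(feature) if x <= threshold]
    right = [i for i, x in enumerate(feature) if x > threshold]
    gain = weighted_entropy(labels, values)
    for side in (left, right):
        side_values = [values[i] for i in side]
        side_labels = [labels[i] for i in side]
        # Each child's entropy is scaled by its share of the total value mass.
        gain -= (sum(side_values) / total) * weighted_entropy(side_labels, side_values)
    return gain


# Hypothetical toy data: a usage feature, churn labels (1 = churned), value weights.
usage = [1, 2, 3, 4]
churned = [0, 0, 1, 1]
lifetime_value = [1.0, 1.0, 5.0, 5.0]
best = max([1, 2, 3], key=lambda t: value_weighted_gain(usage, churned, lifetime_value, t))
```

With uniform weights this reduces to ordinary information gain. Under this assumed scheme, larger weights on high-value churners increase their influence on which split is chosen, which is the kind of effect the abstract describes.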
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
EI
CSCD
Peking University Core Journals
2015, No. 11, pp. 1041-1049 (9 pages)
Funding
Supported by the Fundamental Research Funds for the Central Universities (No. WK2100100021)
Keywords
Churn Prediction, Random Forest, Imbalanced Data