Abstract
To address the increase in data dimensionality and the data sparsity that result from one-hot encoding of discrete attributes during data preprocessing for customer churn prediction, this paper proposes two improved customer churn prediction models based on the multi-layer perceptron. The main idea is to improve the multi-layer perceptron with a stacked auto-encoder and with entity embedding, respectively. By mapping the high-dimensional encoded data of discrete attributes into a low-dimensional space, both methods reduce the sparse data produced by one-hot encoding and increase the correlation between the values of a discrete attribute. Cross-validation results on two public data sets show that the improved models raise prediction accuracy while retaining the advantage of the traditional multi-layer perceptron in parallel computation.
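The mapping from a sparse one-hot code to a low-dimensional dense vector can be illustrated with a minimal sketch. This is not the authors' model: the attribute size, the embedding dimension, the sample indices, and the randomly initialised embedding table are all hypothetical, and in a trained network the table would be learned jointly with the multi-layer perceptron. The sketch only shows that an embedding lookup is equivalent to multiplying the one-hot matrix by the embedding table.

```python
import numpy as np


def one_hot(indices, n_categories):
    """Return an (n_samples, n_categories) one-hot matrix."""
    out = np.zeros((len(indices), n_categories))
    out[np.arange(len(indices)), indices] = 1.0
    return out


# Hypothetical discrete attribute with 6 possible values, observed for 4 customers.
n_categories, embed_dim = 6, 2
indices = np.array([0, 3, 3, 5])

# Assumption: random weights stand in for an embedding table that would
# normally be learned during training.
rng = np.random.default_rng(0)
embedding = rng.normal(size=(n_categories, embed_dim))

sparse = one_hot(indices, n_categories)   # shape (4, 6), one non-zero entry per row
dense_via_matmul = sparse @ embedding     # shape (4, 2), dense representation
dense_via_lookup = embedding[indices]     # same result without building the one-hot matrix

print(sparse.shape, dense_via_lookup.shape)
print(np.allclose(dense_via_matmul, dense_via_lookup))
```

Customers that share an attribute value (rows 1 and 2 here) receive the same dense vector, and after training, similar values could end up close together in the embedding space, which is the sense in which the abstract speaks of increased correlation between attribute values.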
Authors
夏国恩
唐琪
张显全
XIA Guo’en; TANG Qi; ZHANG Xianquan (College of Computer Science and Information Engineering, Guangxi Normal University, Guilin, Guangxi 541000, China; School of Business Administration, Guangxi University of Finance and Economics, Nanning 530000, China)
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2020, Issue 14, pp. 257-263 (7 pages)
Computer Engineering and Applications
Funding
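National Natural Science Foundation of China (No. 71862003)
Guangxi Higher Education High-Level Innovation Team and Outstanding Scholars Program
Special Fund of the Guangxi Key Laboratory Cultivation Base for Cross-Border E-Commerce Intelligent Information Processing
Special Fund of the Innovation Governance and Intellectual Property Discipline Group, Guangxi University of Finance and Economics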
Keywords
customer churn
multi-layer perceptron
discrete attributes
stacked auto-encoder
entity embedding
mapping