Abstract
Bagging is a classic classifier ensemble method whose effectiveness depends on the diversity among its base classifiers. To increase this diversity, a genetic algorithm is used to construct an independent feature subset for each base classifier. In addition, the base classifiers are combined with optimized weights chosen according to their classification performance, which improves generalization. Finally, with Softmax regression as the base classifier, the improved Bagging ensemble is applied to Internet traffic classification. Experimental results show that the improved method achieves significantly higher classification accuracy than classic Bagging, and that it also outperforms random forests, which ensemble decision trees, to a certain extent.
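The pipeline described in the abstract can be illustrated with a rough sketch. It is not the authors' implementation, and it rests on several assumptions: random feature subsets stand in for the genetic-algorithm search, out-of-bag accuracy stands in for the optimized weights, and the function names (`train_softmax`, `fit_ensemble`, `predict`, `make_toy_data`), the toy data, and all hyperparameters are hypothetical.

```python
import math
import random


def softmax_probs(W, x):
    """Class probabilities for one sample; the last entry of each row of W is the bias."""
    scores = [sum(w * v for w, v in zip(row, x)) + row[-1] for row in W]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def train_softmax(X, y, n_classes, epochs=300, lr=0.5):
    """Fit Softmax (multinomial logistic) regression by batch gradient descent."""
    d = len(X[0])
    W = [[0.0] * (d + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (d + 1) for _ in range(n_classes)]
        for x, t in zip(X, y):
            p = softmax_probs(W, x)
            for k in range(n_classes):
                err = p[k] - (1.0 if k == t else 0.0)
                for j in range(d):
                    grad[k][j] += err * x[j]
                grad[k][d] += err  # bias gradient
        for k in range(n_classes):
            for j in range(d + 1):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W


def classify(W, feats, x):
    """Predict a label using only the features assigned to this base classifier."""
    p = softmax_probs(W, [x[j] for j in feats])
    return max(range(len(p)), key=p.__getitem__)


def fit_ensemble(X, y, n_classes, n_learners=5, n_feats=2, seed=0):
    """Bagging with one feature subset and one vote weight per base classifier."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    members = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag] or list(range(n))
        # ASSUMPTION: a random subset stands in for the genetic-algorithm search.
        feats = sorted(rng.sample(range(d), n_feats))
        W = train_softmax([[X[i][j] for j in feats] for i in idx],
                          [y[i] for i in idx], n_classes)
        # ASSUMPTION: out-of-bag accuracy stands in for the optimized weight.
        weight = sum(classify(W, feats, X[i]) == y[i] for i in oob) / len(oob)
        members.append((feats, W, weight))
    return members


def predict(members, x):
    """Weighted vote over the base classifiers."""
    votes = {}
    for feats, W, weight in members:
        k = classify(W, feats, x)
        votes[k] = votes.get(k, 0.0) + weight
    return max(votes, key=votes.get)


def make_toy_data(n_per_class=20, n_feats=4, seed=1):
    """Hypothetical toy data: class k is centred at (k - 1) in every feature."""
    rng = random.Random(seed)
    X, y = [], []
    for k in range(3):
        for _ in range(n_per_class):
            X.append([(k - 1) + rng.uniform(-0.2, 0.2) for _ in range(n_feats)])
            y.append(k)
    return X, y
```

Calling `fit_ensemble(*make_toy_data(), n_classes=3)` returns a list of `(features, weights, vote_weight)` triples that `predict` combines by weighted voting. A closer reproduction would replace the random subset with a genetic search whose fitness rewards both accuracy and diversity, and would replace the out-of-bag weight with the weight optimization the article describes.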
Authors
QIAN Yaguan
GUAN Xiaohui
WU Shuhui
YUN Bensheng
REN Dongxiao
Department of Big-Data Science, Zhejiang University of Science and Technology, Hangzhou 310023, China; Zhejiang University of Water Resources and Electric Power, Hangzhou 310018, China
Source
Telecommunications Science (《电信科学》)
2018, No. 4, pp. 41-48 (8 pages)
Funding
Supported by the Natural Science Foundation of Zhejiang Province (No. LY17F020011)