Abstract
The growing diversity and speed of network traffic pose a major challenge for traffic identification, and feature selection, as an effective means of data dimensionality reduction, is an important research topic. This paper describes a secondary traffic feature selection model and, on that basis, proposes a secondary feature selection algorithm for network traffic. The algorithm follows a divide-and-conquer approach: it splits the traffic data into several subsets, gathers the features extracted from each subset, and then performs a second round of feature extraction, using a proposed index called influence as the basis for evaluating and ranking features. Experimental results show that the proposed algorithm performs better in model construction and identifies traffic more accurately with fewer features.
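The divide, gather, rank, and re-select flow described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the abstract does not define the influence index, so the per-subset score (absolute Pearson correlation with the class label) and the influence formula (summed score divided by the number of subsets) are placeholders, and all function and parameter names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def first_pass(rows, labels, k):
    """Score every feature on one subset and keep the top k as (index, score)."""
    scores = [(j, abs_correlation([r[j] for r in rows], labels))
              for j in range(len(rows[0]))]
    return sorted(scores, key=lambda t: t[1], reverse=True)[:k]


def secondary_selection(rows, labels, n_subsets, k_per_subset, k_final):
    """Divide the data, select per subset, pool, rank by influence, select again."""
    # Divide: interleaved split so every subset samples the whole data set.
    subsets = [(rows[i::n_subsets], labels[i::n_subsets])
               for i in range(n_subsets)]

    # Conquer and gather: sum each feature's score over the subsets that chose it.
    pooled = defaultdict(float)
    for sub_rows, sub_labels in subsets:
        for j, score in first_pass(sub_rows, sub_labels, k_per_subset):
            pooled[j] += score

    # ASSUMPTION: "influence" = summed score / number of subsets.
    # The paper's actual definition is not given in the abstract.
    influence = {j: total / n_subsets for j, total in pooled.items()}

    # Secondary selection: rank pooled features by influence, keep the best.
    ranked = sorted(influence, key=influence.get, reverse=True)
    return ranked[:k_final], influence
```

Calling `secondary_selection(rows, labels, n_subsets, k_per_subset, k_final)` on a list of numeric feature rows and a parallel list of numeric class labels returns the indices of the retained features together with the assumed influence value of every pooled feature. A feature that is picked consistently across subsets accumulates a higher value than one picked in only a few of them, which is the intuition the placeholder formula is meant to capture.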
Source
《电子学报》
EI
CAS
CSCD
PKU Core (北大核心)
2017, No. 1, pp. 128-134 (7 pages)
Acta Electronica Sinica
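Funding
Key Project of the Natural Science Foundation, Shaanxi Provincial Science and Technology Program (No. 2012JZ8005)
Aeronautical Science Foundation (No. 20141996018)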
Keywords
secondary feature extraction
divide and conquer
ranking
influence
traffic identification