Abstract
The tension between the volume of network traffic data to be processed and the limited computing capacity of a single node leads to low classification efficiency that cannot meet practical needs. To address this problem, we propose an ontology-based parallel network traffic classification method that combines the respective strengths of ontologies and MapReduce in describing and processing massive heterogeneous data. The method is built on the MapReduce parallel computing framework. First, it uses an ontology to describe and manage network traffic data and, following the structure of that ontology, constructs the layered network traffic ontology in parallel. Then it builds a classification model with a decision tree algorithm, from which the inference rule set is generated. Finally, classification based on statistical traffic features is carried out by parallel knowledge reasoning. Experimental results show that MapReduce-based construction of the network traffic ontology is markedly more efficient in a cluster environment than on a single machine, and that the speedup ratio rises linearly as compute nodes are added within an appropriate range. The parallel knowledge reasoning approach also effectively improves the classification efficiency of large-scale network traffic.
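The classification step described above (rules derived from a decision tree, applied to flow statistics in a map phase and merged in a reduce phase) can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`mean_pkt_size`, `duration_s`), the thresholds, the class labels, and the helper functions are all hypothetical, the ontology and its reasoner are not modeled, and a thread pool merely stands in for the cluster nodes that a real MapReduce job would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rule set, as if exported from a trained decision tree.
# Each rule pairs a predicate over a flow's statistical features with a label.
RULES = [
    (lambda f: f["mean_pkt_size"] < 200 and f["duration_s"] < 5.0, "DNS"),
    (lambda f: f["mean_pkt_size"] >= 1000, "BULK"),
    (lambda f: f["duration_s"] >= 60.0, "STREAMING"),
]
DEFAULT_LABEL = "OTHER"


def classify(flow):
    """Return the label of the first rule that matches the flow."""
    for predicate, label in RULES:
        if predicate(flow):
            return label
    return DEFAULT_LABEL


def map_phase(split):
    """Map: emit (flow_id, label) pairs for one input split."""
    return [(flow["id"], classify(flow)) for flow in split]


def reduce_phase(mapped_splits):
    """Reduce: merge per-split results and count flows per class."""
    labels = {}
    for pairs in mapped_splits:
        labels.update(pairs)
    return labels, Counter(labels.values())


def run_job(flows, n_splits=2):
    """Partition the flows, map the splits concurrently, then reduce."""
    splits = [flows[i::n_splits] for i in range(n_splits)]
    with ThreadPoolExecutor(max_workers=n_splits) as pool:
        mapped = list(pool.map(map_phase, splits))
    return reduce_phase(mapped)


# Made-up flow records, used only to exercise the sketch.
FLOWS = [
    {"id": "f1", "mean_pkt_size": 90, "duration_s": 0.4},
    {"id": "f2", "mean_pkt_size": 1400, "duration_s": 30.0},
    {"id": "f3", "mean_pkt_size": 600, "duration_s": 120.0},
    {"id": "f4", "mean_pkt_size": 500, "duration_s": 10.0},
]

labels, counts = run_job(FLOWS)
```

Because each flow is classified independently of the others, the result does not depend on how the input is split, which is the property that lets the map phase scale out across nodes.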
Source
《电子科技大学学报》 (Journal of University of Electronic Science and Technology of China), 2016, No. 3, pp. 417-422 (6 pages)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
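Funding
National Natural Science Foundation of China (61163058, 61363006)
Open Project of the Guangxi Key Laboratory of Trusted Software (KX201306)
Open Project of the Guangxi Colleges and Universities Key Laboratory of Cloud Computing and Complex Systems (14104)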