Abstract
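This paper studies the slow mining of association rules among security indicators in network communication. The high volume, diversity, and complexity of network communication data make the information processing involved in mining association rules among network security indicators difficult and time-inefficient. To address this, an association rule data mining method based on a parallel FP-tree frequent-set algorithm is proposed and successfully applied to mining security indicators in network communication. First, Netflow traffic data is collected from the network communication data, preprocessed, and stored in the form of information entropy. The frequent sets are then compressed onto a frequent pattern tree, and a parallel algorithm is introduced to create conditional pattern bases and conditional pattern trees for the nodes of the frequent pattern tree on multiple processors, which are processed simultaneously on different parallel processors. Finally, association rules reflecting network security information are generated. The method improves the efficiency of mining association rules among network information security indicators: under the same support and confidence thresholds, it reduces processing time by 4 to 7 s.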
The slow mining of association rules among security indicators in network communication is studied. Because network communication data is high in volume, diverse, and complex, mining association rules among network security indicators is difficult to process and has low time efficiency. Therefore, an association rule data mining method based on a parallel FP-tree frequent-set algorithm is proposed. First, Netflow traffic data is collected from the network communication data and preprocessed for storage in the form of information entropy. Then the frequent sets are compressed onto a frequent pattern tree, and a parallel algorithm is introduced to create conditional pattern bases and conditional pattern trees for the nodes of the frequent pattern tree on multiple processors, which are processed at the same time on different parallel processors. Finally, association rules reflecting network security information are generated. The method improves the efficiency of association rule mining among network information security indicators: under the same support threshold and confidence threshold, processing time is improved by more than 3.5 seconds.
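The mining steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_tree`, `conditional_pattern_base`, `mine`, `mine_item`, `parallel_fp_growth`) are hypothetical, a thread pool stands in for the multiple processors mentioned in the abstract, and the input is assumed to be a list of transactions whose items are already discretized security indicators. The sketch builds one FP-tree, then hands each frequent item's conditional pattern base to a separate worker.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class Node:
    """One FP-tree node: an item, its count, and links to parent and children."""
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}


def build_tree(weighted_transactions, min_support):
    """Build an FP-tree from (items, count) pairs; return (header table, item supports)."""
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}
    root, header = Node(None, None), defaultdict(list)
    for items, count in weighted_transactions:
        # Keep frequent items only, ordered by descending support (ties broken by name).
        path = sorted((i for i in set(items) if i in frequent),
                      key=lambda i: (-frequent[i], i))
        node = root
        for item in path:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += count
    return header, frequent


def conditional_pattern_base(nodes):
    """Collect the prefix path above each node of one item, weighted by the node count."""
    base = []
    for node in nodes:
        path, parent = [], node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((path, node.count))
    return base


def mine(header, frequent, suffix, min_support):
    """Recursively mine a conditional tree; return {frozenset(itemset): support}."""
    found = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        found[itemset] = frequent[item]
        sub_header, sub_frequent = build_tree(conditional_pattern_base(nodes), min_support)
        found.update(mine(sub_header, sub_frequent, itemset, min_support))
    return found


def mine_item(task):
    """Worker: mine every frequent itemset that ends in one item of the global tree."""
    item, nodes, support, min_support = task
    itemset = frozenset([item])
    found = {itemset: support}
    sub_header, sub_frequent = build_tree(conditional_pattern_base(nodes), min_support)
    found.update(mine(sub_header, sub_frequent, itemset, min_support))
    return found


def parallel_fp_growth(transactions, min_support, workers=4):
    """Build the global FP-tree once, then mine each item's conditional tree in parallel."""
    header, frequent = build_tree([(t, 1) for t in transactions], min_support)
    tasks = [(item, nodes, frequent[item], min_support) for item, nodes in header.items()]
    result = {}
    # Assumption: threads stand in for the separate processors; the tree is read-only here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for found in pool.map(mine_item, tasks):
            result.update(found)
    return result
```

Calling `parallel_fp_growth(transactions, min_support)` returns a dictionary that maps each frequent itemset to its support count. Association rules could then be derived by dividing the support of an itemset by the support of its antecedent and keeping the rules that meet the confidence threshold. Splitting the work per item is one natural choice because each conditional pattern base depends only on the shared, read-only tree.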
Source
《科学技术与工程》
Peking University Core Journal (北大核心)
2014, Issue 7, pp. 216-218, 222 (4 pages in total)
Science Technology and Engineering
Funding
Supported by the Henan Province Young Backbone Teachers Program for Institutions of Higher Education (2012GGJS-288)