Abstract
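The content characteristics of a dataset have a large effect on the accuracy of payload-based network anomaly intrusion detection systems. This paper analyzes how differences in content characteristics among training-set packets affect models based on byte frequency distribution; large differences may cause models that compute mean frequencies over grouped packets to produce a high false alarm rate. On this basis the paper proposes an improved model, the single packet frequency distribution model, which builds the normal behavior set from the frequency distribution features of individual packets and controls the size of that set by clustering. Experiments on a simulated dataset and on the DARPA99 dataset show that differences in the content characteristics of training packets do cause the mean-based byte frequency model to produce more false alarms, whereas the single packet frequency distribution model is unaffected: it has higher detection accuracy and a lower false alarm rate at the same detection rate. When the packets are completely different from one another, the mean-based model even fails. The single packet frequency distribution model can be expected to adapt well to network services with rich dynamic content.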
The content characteristics of a dataset strongly affect the detection accuracy of network anomaly intrusion detection systems. This paper analyzes how differences in content characteristics among training packets influence models based on byte frequency distribution, and shows that such differences can lead models that average the frequencies of grouped packets to a higher false alarm rate. Based on this, a modified model named single packet frequency distribution is proposed. It forms the normal profiles from the frequency distribution of each individual packet instead of from average values, and controls the size of the normal set with clustering techniques. Experiments are carried out on a simulated dataset and on the DARPA99 real network dataset. The results indicate that large differences between packet contents do cause models based on average byte frequency to generate more false alarms, whereas the single packet frequency distribution model is unaffected: it achieves higher detection accuracy, with a lower false alarm rate at an equal detection rate. In the worst case the average-based model even becomes invalid. The single packet frequency distribution model can be considered to adapt well to network services with rich dynamic content.
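The contrast between the two kinds of model can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the distance measure, the clustering method, or any thresholds, so the Manhattan distance, the greedy leader-style clustering, the `radius` parameter, and all function names below are assumptions made only for illustration.

```python
from collections import Counter


def byte_frequency(payload):
    """Relative frequency of each of the 256 byte values in one payload."""
    counts = Counter(payload)          # iterating bytes yields ints 0..255
    total = len(payload) or 1          # avoid division by zero on empty payloads
    return [counts.get(b, 0) / total for b in range(256)]


def manhattan(p, q):
    """Assumed distance measure between two frequency vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def mean_profile(vectors):
    """Mean-based model: one averaged frequency vector for the whole group."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cluster_profiles(vectors, radius):
    """Single-packet model: keep per-packet vectors, dropping any vector that
    lies within `radius` of one already kept (assumed greedy clustering)."""
    kept = []
    for v in vectors:
        if all(manhattan(v, k) > radius for k in kept):
            kept.append(v)
    return kept


def score_mean(profile, payload):
    """Anomaly score against the averaged profile (larger = more anomalous)."""
    return manhattan(profile, byte_frequency(payload))


def score_single(profiles, payload):
    """Anomaly score against the nearest retained single-packet profile."""
    v = byte_frequency(payload)
    return min(manhattan(v, p) for p in profiles)
```

With two training packets of disjoint content, `b"aaaa"` and `b"bbbb"`, the averaged profile sits halfway between them, so `score_mean` gives each training packet itself a nonzero score, while `score_single` gives it a score of zero because its own distribution is retained in the normal set.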
Source
《计算机工程与科学》
CSCD
Peking University Core Journals (北大核心)
2012, Issue 7, pp. 24-28 (5 pages)
Computer Engineering & Science
Funding
Supported by a university-level natural science foundation project (j02005302)