Abstract
In traditional anomaly detection algorithms based on active learning, the sampling step considers only the distance between a sample and the classification boundary and ignores the features of the samples themselves. As a result, the selected samples are redundant and the algorithm runs less efficiently. To address this problem, the selection strategy is optimized and an anomaly detection algorithm based on improved active learning is proposed. Anomaly detection simulation experiments on the KDD99 dataset show that, compared with the traditional algorithm, the proposed algorithm needs fewer labeled samples to reach the same classification accuracy. Appropriately increasing the number of samples selected effectively reduces the number of iterations the algorithm needs to converge and improves its running efficiency.
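The abstract does not state the optimized selection rule, so the following is only a minimal sketch of the general idea it describes: rank unlabeled samples by their distance to the classification boundary, then skip candidates that are too similar to ones already chosen. It is not the authors' algorithm. The function names, the use of cosine similarity as the redundancy measure, and the `max_similarity` cutoff are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_batch(features, margins, batch_size, max_similarity=0.9):
    """Pick up to batch_size unlabeled samples to send for labeling.

    features: feature vectors of the unlabeled pool.
    margins: signed distance of each sample to the classifier boundary
             (for example an SVM decision value); a hypothetical input.
    max_similarity: assumed redundancy cutoff, not taken from the paper.
    """
    # Most uncertain first: smallest absolute distance to the boundary.
    order = sorted(range(len(features)), key=lambda i: abs(margins[i]))
    chosen = []
    for i in order:
        if len(chosen) == batch_size:
            break
        # Skip a candidate that is nearly a duplicate of one already chosen.
        redundant = any(
            cosine_similarity(features[i], features[j]) > max_similarity
            for j in chosen
        )
        if not redundant:
            chosen.append(i)
    return chosen


# Toy pool: samples 0 and 1 are both close to the boundary but nearly identical.
pool = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [1.0, 1.0]]
decision_values = [0.05, -0.06, 0.30, 0.90]
picked = select_batch(pool, decision_values, batch_size=2)
```

In this toy pool, a purely distance-based rule would pick samples 0 and 1, which carry almost the same information. With the redundancy check, sample 1 is skipped in favor of sample 2.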
Authors
蔡颖
陈伟荣
CAI Ying; CHEN Wei-rong (The First Research Department, The 28th Research Institute of China Electronics Technology Corporation, Nanjing 210001, China)
Source
《计算机工程与设计》
Peking University Core Journal (北大核心)
2022, Issue 11, pp. 3057-3062 (6 pages)
Computer Engineering and Design
Keywords
active learning
support vector machine
selection strategy
redundancy
anomaly detection
intrusion detection
data mining