Abstract
Low log-parsing accuracy and a shortage of labeled samples both reduce the accuracy of log-based anomaly detection. To address these problems, this paper proposes a new semi-supervised, log-based anomaly detection method. First, the method improves dictionary-based log parsing so that part of the parameter information in log events is retained, which raises both the utilization of log information and the accuracy of log parsing. Next, it uses BERT to encode the semantic information in the templates, yielding a semantic vector for each log. It then estimates labels by clustering, which alleviates the shortage of labeled data and improves the model's detection of unstable data. Finally, it uses a bidirectional temporal convolutional network (Bi-TCN) with residual blocks to capture context information from both directions, improving the accuracy and efficiency of anomaly detection. The method was evaluated on two datasets. The results show that, compared with three recent baseline models, LogBERT, PLELog and LogEncoder, the proposed method improves the F1 score by an average of 7%, 14.1% and 8.04%, respectively, and performs log parsing and log anomaly detection efficiently and accurately.
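The last step of the abstract, capturing context from two directions with dilated convolutions and residual connections, can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes scalar inputs in place of BERT semantic vectors, fixed kernel weights in place of learned ones, and a single layer. The function names `causal_dilated_conv`, `residual_block` and `bi_tcn_layer` are hypothetical.

```python
from typing import List, Tuple


def causal_dilated_conv(x: List[float], kernel: List[float], dilation: int) -> List[float]:
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation], zero-padded on the left."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the sequence start count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_block(x: List[float], kernel: List[float], dilation: int) -> List[float]:
    """Apply the causal convolution, a ReLU, then add the input back (skip connection)."""
    conv = causal_dilated_conv(x, kernel, dilation)
    return [max(0.0, c) + xi for c, xi in zip(conv, x)]


def bi_tcn_layer(x: List[float], kernel: List[float], dilation: int) -> List[Tuple[float, float]]:
    """Run one residual block left-to-right and one right-to-left, then pair the outputs."""
    forward = residual_block(x, kernel, dilation)
    # Reverse the sequence, apply the same causal block, and restore the original order.
    backward = residual_block(x[::-1], kernel, dilation)[::-1]
    return list(zip(forward, backward))
```

Each position in the output carries one value that depends only on earlier events and one that depends only on later events, which is the sense in which a bidirectional TCN sees context on both sides of a log event.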
Authors
Yin Chunyong (尹春勇)
Kong Xian (孔娴)
School of Computer, Nanjing University of Information Science & Technology, Nanjing 210044, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journals (北大核心)
2024, No. 7, pp. 2110-2117 (8 pages)
Funding
General Program of the National Natural Science Foundation of China (6177282).
Keywords
log parsing
anomaly detection
semi-supervised learning
bidirectional temporal convolutional network
contextual correlation