Abstract
Modern large-scale enterprise systems produce large volumes of logs that record detailed runtime status and key events at critical points. These logs are valuable for analyzing performance issues and understanding the status of the system. Anomaly detection plays an important role in service management and system maintenance, and guarantees the reliability and security of online systems. Logs are universal semi-structured data, which causes difficulties for traditional manual detection and pattern-matching algorithms. While some deep learning algorithms utilize neural networks to detect anomalies, these approaches rely heavily on manually designed features, so the effectiveness of anomaly detection depends on the quality of those features. These methods also ignore the underlying contextual information present in adjacent log entries. We propose a novel model called Logformer, with two cascaded transformer-based heads that capture latent contextual information from adjacent log entries, and leverage pre-trained embeddings based on logs to improve the representation of the embedding space. The proposed model achieves comparable results on the HDFS and BGL datasets in terms of accuracy, recall, and F1-score. Moreover, the consistent rise in F1-score shows that the representation of the embedding space with pre-trained embeddings is closer to the semantic information of the log.
Funding
This work was supported by the National Natural Science Foundation of China (Nos. 62072074, 62076054, 62027827, 61902054, 62002047),
the Frontier Science and Technology Innovation Projects of National Key R&D Program (No. 2019QY1405),
the Sichuan Science and Technology Innovation Platform and Talent Plan (No. 2020TDT00020),
and the Sichuan Science and Technology Support Plan (No. 2020YFSY0010).