Abstract
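Pseudo period data streams are a common class of data streams that arise widely in monitoring applications. Anomalies occurring in such streams may carry domain knowledge of interest, so detecting them is a necessary basis for deeper analysis. DTW distance is more robust than Euclidean distance; using DTW distance as the similarity measure between wave sections of a pseudo period data stream makes it possible to effectively detect anomaly wave sections, that is, wave sections with few similar historical wave sections. On this basis, a fast approximate anomaly wave section detection method based on a cluster index is proposed to speed up detection. Experiments on real datasets demonstrate the effectiveness of the proposed method.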
Pseudo period data streams appear in many applications, especially in monitoring domains. Anomalies detected over pseudo period data streams may carry significant domain knowledge that is worth further analysis. Because the Euclidean distance between two time series can change greatly when one of them shifts slightly along the time axis, DTW (dynamic time warping) distance is suggested as a more robust alternative. In this paper, DTW distance is adopted as the similarity measure between wave sections of a pseudo period data stream, and anomaly wave sections are defined as those that have few similar historical counterparts under that measure. A naive algorithm is given that detects anomaly wave sections by directly computing the DTW distance between the current wave section and all other wave sections in the historical dataset. However, the poor efficiency of the naive algorithm limits its application, so a fast approximate algorithm based on a cluster index is proposed to speed it up. Compared with the naive algorithm, the new method is much faster with no large loss in accuracy. Extensive experiments on a real dataset demonstrate the effectiveness of the proposed methods.
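The naive detection step described in the abstract can be illustrated with a minimal sketch. It assumes wave sections are plain lists of numbers, uses the textbook dynamic-programming form of DTW with an absolute-difference local cost, and treats a wave section as an anomaly when fewer than `min_similar` historical sections lie within DTW distance `eps`. The names `dtw_distance`, `is_anomaly`, `eps`, and `min_similar` are hypothetical and are not taken from the paper, and the cluster-index speedup is not shown.

```python
import math


def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW with |x - y| as the local cost."""
    n, m = len(a), len(b)
    # prev[j] holds the best cumulative cost for the previous row of the DP table.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def is_anomaly(current, history, eps, min_similar):
    """Naive scan: compare `current` against every historical wave section.

    `eps` and `min_similar` are illustrative thresholds (assumptions).
    """
    similar = sum(1 for h in history if dtw_distance(current, h) <= eps)
    return similar < min_similar


history = [
    [0, 1, 2, 1, 0],
    [0, 0, 1, 2, 1, 0],   # same shape, shifted slightly along the time axis
    [0, 1, 2, 2, 1, 0],   # same shape, slightly stretched
]
normal_flag = is_anomaly([0, 1, 2, 1, 0], history, eps=1.0, min_similar=2)
odd_flag = is_anomaly([5, 5, 5, 5, 5], history, eps=1.0, min_similar=2)
```

Each call to `is_anomaly` scans the whole history, so its cost grows linearly with the number of stored wave sections, with a quadratic DTW computation per comparison. This is the inefficiency that the abstract's cluster-index method is meant to reduce.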
Source
《计算机研究与发展》
EI
CSCD
Peking University Core Journals (北大核心)
2010, Issue 5, pp. 893-902 (10 pages)
Journal of Computer Research and Development
Funding
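National High Technology Research and Development Program of China (863 Program) (2007AA010502, 2007AA01Z474, 2006AA01Z451)
Program for New Century Excellent Talents in University, Ministry of Education (NCET-06-0928)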
Keywords
dynamic time warping distance
cluster index
pseudo period data stream
wave splitting
anomaly detection