Abstract
Time series are a typical form of data and are widely used in many research fields. An anomaly pattern in a time series signals that an unusual situation has occurred, which matters in many applications. Most existing time series anomaly pattern recognition algorithms only detect anomaly subsequences, ignore the problem of distinguishing the types of anomaly subsequences, and require many parameters to be set manually. This paper proposes an anomaly pattern recognition algorithm based on adaptive k nearest neighbor (APAKN). First, the adaptive k-nearest-neighbor value of each subsequence is determined, and an adaptive distance ratio is introduced to compute the relative density of each subsequence, which gives its anomaly score. Then, an adaptive threshold method based on minimum variance is proposed to determine the anomaly threshold and detect all anomaly subsequences. Finally, the anomaly subsequences are clustered, and the resulting cluster centers are the anomaly patterns with different changing trends. Without requiring any parameter to be set, the algorithm addresses the density imbalance problem, simplifies the steps of traditional density-based anomaly subsequence detection, and achieves good anomaly pattern recognition. Experimental results on 10 data sets from the UCR time series collection show that the proposed algorithm performs well in both anomaly subsequence detection and anomaly subsequence clustering without any parameter setting.
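The first two stages described in the abstract (a density-based anomaly score, then a variance-based threshold) can be illustrated with a rough sketch. The abstract gives no formulas, so everything below is an assumption rather than the APAKN method itself: a fixed `k` stands in for the adaptive k, a simplified LOF-style distance ratio stands in for the adaptive distance ratio, non-overlapping windows stand in for the subsequence extraction, and "minimum variance" is read as choosing the split of the sorted scores that minimizes total within-group variance. The clustering stage is omitted, and all function names are hypothetical.

```python
import math

def windows(series, width):
    """Split the series into non-overlapping subsequences of `width` points."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, width)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def anomaly_scores(subs, k):
    """Simplified relative-density score (assumed form, not the paper's):
    a subsequence's mean kNN distance divided by the average of its
    neighbours' mean kNN distances. Larger means sparser than its neighbours."""
    n = len(subs)
    neighbours, mean_dist = [], []
    for i in range(n):
        ranked = sorted((euclidean(subs[i], subs[j]), j) for j in range(n) if j != i)[:k]
        neighbours.append([j for _, j in ranked])
        mean_dist.append(sum(d for d, _ in ranked) / k)
    eps = 1e-12  # guards against division by zero when neighbours coincide
    return [mean_dist[i] / (sum(mean_dist[j] for j in neighbours[i]) / k + eps)
            for i in range(n)]

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def min_variance_threshold(scores):
    """Pick the cut of the sorted scores that minimizes the summed
    within-group variance of the two groups (assumed reading of
    'minimum variance'); return the midpoint at that cut."""
    ordered = sorted(scores)
    best_cut, best_cost = 1, float("inf")
    for cut in range(1, len(ordered)):
        low, high = ordered[:cut], ordered[cut:]
        cost = len(low) * variance(low) + len(high) * variance(high)
        if cost < best_cost:
            best_cost, best_cut = cost, cut
    return (ordered[best_cut - 1] + ordered[best_cut]) / 2

def detect(series, width, k):
    """Return indices of windows whose score exceeds the derived threshold."""
    scores = anomaly_scores(windows(series, width), k)
    threshold = min_variance_threshold(scores)
    return [i for i, s in enumerate(scores) if s > threshold]

# Synthetic data: ten slowly drifting windows, with window 6 replaced by a flat burst.
series = []
for i in range(10):
    series += [5.0, 5.0, 5.0, 5.0] if i == 6 else [0.0, 1.0 + 0.01 * i, 0.0, -1.0]

flagged = detect(series, width=4, k=2)
print(flagged)
```

Unlike the algorithm described above, this sketch still needs `width` and `k` from the caller; removing that need is exactly what the adaptive k-nearest-neighbor step is meant to do.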
Authors
Wang Ling (王玲)
Zhou Nan (周南)
Shen Peng (申鹏)
School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083; Key Laboratory of Knowledge Automation for Industrial Processes (University of Science and Technology Beijing), Ministry of Education, Beijing 100083
Source
Journal of Computer Research and Development (《计算机研究与发展》)
EI
CSCD
PKU Core Journal
2023, No. 1, pp. 125-139 (15 pages)
Funding
National Natural Science Foundation of China (62076025, 61572073).
Keywords
time series
anomaly subsequences
anomaly pattern
adaptive k nearest neighbor
relative density