Abstract
Feature representation and similarity measurement are foundational tasks in time series data mining, and their quality directly affects the subsequent mining results. An orthogonal polynomial regression model is used to represent time series by multidimensional shape features, and the influence of the feature dimension on the fitting quality is analyzed. A subset of the features is selected to describe the main shapes and trends of a series, and a robust shape-feature-based similarity measure with high measurement quality is proposed to approximately measure the similarity of time series. Experimental results show that the proposed measure satisfies the lower-bounding requirement, offers good lower-bound tightness and data pruning power, and achieves good results in time series data mining tasks such as clustering and classification.
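As an illustration of why a polynomial-coefficient representation can satisfy the lower-bounding requirement, the following is a minimal sketch, not the paper's exact algorithm. It assumes a discrete polynomial basis orthonormalized by QR decomposition on equally spaced points, takes the first k regression coefficients as the shape features, and uses the Euclidean distance between feature vectors as the approximate measure. The function names, the choice of k, and the random-walk test series are all hypothetical. Because the basis columns are orthonormal, projecting onto them cannot increase a Euclidean norm, so the feature distance never exceeds the Euclidean distance between the raw series.

```python
import numpy as np

def orthonormal_poly_basis(n, k):
    """n x k matrix whose columns span polynomials of degree 0..k-1
    and are orthonormal on n equally spaced points (assumed basis)."""
    t = np.linspace(-1.0, 1.0, n)
    vandermonde = np.vander(t, k, increasing=True)  # columns 1, t, t^2, ...
    q, _ = np.linalg.qr(vandermonde)                # reduced QR: q is n x k
    return q

def shape_features(x, k):
    """Least-squares coefficients of x in the orthonormal basis."""
    x = np.asarray(x, dtype=float)
    return orthonormal_poly_basis(len(x), k).T @ x

def feature_distance(x, y, k):
    """Euclidean distance between the k-dimensional feature vectors."""
    return float(np.linalg.norm(shape_features(x, k) - shape_features(y, k)))

# Hypothetical data: two random-walk series of equal length.
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=128))
y = np.cumsum(rng.normal(size=128))

d_true = float(np.linalg.norm(x - y))  # Euclidean distance on the raw series
d_feat = feature_distance(x, y, k=6)   # distance on 6 shape features
print(d_feat <= d_true)                # lower-bounding property of this sketch
```

In this sketch, increasing k tightens the bound at the cost of a longer feature vector, which mirrors the trade-off between feature dimension and fitting quality that the abstract describes.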
Source
《系统工程理论与实践》
EI
CSSCI
CSCD
PKU Core (Peking University core journal list)
2013, No. 4, pp. 1024-1034 (11 pages)
Systems Engineering-Theory & Practice
Funding
National Natural Science Foundation of China (70871015, 71031002)
Fundamental Research Funds for the Central Universities (DUT11SX04, 12SKGC-QG03)
Keywords
time series
similarity measure
shape feature
orthogonal polynomials