Abstract
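Sequential pattern mining is an important branch of data mining, with wide application in processing sequential transactions and related information. Many sequential pattern models and corresponding mining algorithms already exist. Building on an analysis of the sequential pattern mining problem and its algorithms, this paper defines a sequential pattern framework called the Sequential Patterns Graph (SPG), which represents all sequential patterns discovered during a mining process. The SPG serves as a bridge from a set of discrete sequences to a unified graph structure, so that the results of sequential pattern mining can be unified within it. Research based on the SPG can uncover new structured knowledge, which is termed post sequential pattern mining. The paper also gives relevant properties of the SPG and an algorithm for constructing it.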
Sequential pattern mining is an important data mining task with broad applications, including the analysis of customer behavior, web access patterns, process analysis of scientific experiments, prediction of natural disasters, treatments, drug testing, and DNA analysis. Agrawal and Srikant first introduced the sequential pattern mining problem. In recent years, considerable attention has focused on improving the performance of sequential pattern mining, but further work is still needed to improve on the results achieved so far. Two questions commonly arise in this context: What is the inherent relation among sequential patterns? Is there a general representation of sequential patterns? We propose a novel framework for sequential patterns, called the Sequential Patterns Graph (SPG), as a model for representing relations among sequential patterns. The model has two features: (i) it can represent all the sequential patterns mined in a mining task; (ii) it provides a foundation of structural knowledge from which other new patterns can be obtained. Based on this model, we propose a novel concept, Post Sequential Patterns, which involves graph patterns composed of sequential patterns, branch patterns, and iterative patterns. The properties of the SPG and an algorithm for constructing it are also presented.
Source
《计算机学报》
EI
CSCD
PKU Core (Peking University Core Journals)
2004, No. 6, pp. 782-788 (7 pages)
Chinese Journal of Computers
Funding
Supported by the National High Technology Research and Development Program of China ("863" Program) (2001AA413 40)