Abstract
Sequence rule mining aims to discover causal associations between frequent sequences. The current state-of-the-art method for generating sequence rules considers only the inclusion relationship between two rules and ignores the deductive relationship among multiple rules, so its output contains a large amount of redundancy. We introduce the concept of deductive non-redundant rules, analyse the causes of deductive redundancy, and redefine the concept of non-redundant rules. Building on frequent closed sequences and their generators, we present a non-redundant rule extraction algorithm based on maximum-overlap-term redundancy checking. Theoretical analysis and experimental evaluation show that the algorithm improves the quality of the generated sequence rules while keeping processing efficiency essentially unchanged.
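The notions the abstract relies on (a sequence rule, its support and confidence, and redundancy by inclusion between two rules) can be illustrated with a small sketch. This is not the paper's algorithm: the function names, the toy database `DB`, and in particular the inclusion test used in `redundant_by_inclusion` are assumptions chosen for illustration, following one common formulation in which a rule is redundant if its concatenated events form a subsequence of another rule's events and both rules have the same support and confidence. Deductive redundancy among multiple rules, the paper's contribution, is not modelled here.

```python
import math
from typing import Sequence, Tuple

EventSeq = Tuple[str, ...]            # a sequence of single events
Rule = Tuple[EventSeq, EventSeq]      # (premise, consequent)

# Hypothetical toy database of event sequences (illustrative only).
DB = [("a", "b", "c"), ("a", "b", "d", "c"), ("b", "c")]


def is_subsequence(pattern: EventSeq, sequence: EventSeq) -> bool:
    """True if the events of `pattern` occur in `sequence` in the same order."""
    remaining = iter(sequence)
    # `event in remaining` advances the iterator, so order is enforced.
    return all(event in remaining for event in pattern)


def support(pattern: EventSeq, database: Sequence[EventSeq]) -> int:
    """Number of database sequences that contain `pattern`."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def confidence(pre: EventSeq, post: EventSeq,
               database: Sequence[EventSeq]) -> float:
    """support(pre followed by post) / support(pre); 0.0 if pre never occurs."""
    pre_count = support(pre, database)
    if pre_count == 0:
        return 0.0
    return support(pre + post, database) / pre_count


def redundant_by_inclusion(rule_a: Rule, rule_b: Rule,
                           database: Sequence[EventSeq]) -> bool:
    """Assumed pairwise test: rule_a is redundant with respect to rule_b if
    rule_a's events are included in rule_b's events and both rules have the
    same support and confidence."""
    if rule_a == rule_b:
        return False
    events_a = rule_a[0] + rule_a[1]
    events_b = rule_b[0] + rule_b[1]
    return (
        is_subsequence(events_a, events_b)
        and support(events_a, database) == support(events_b, database)
        and math.isclose(confidence(*rule_a, database),
                         confidence(*rule_b, database))
    )
```

For example, with the toy database above, `redundant_by_inclusion((("a",), ("c",)), (("a", "b"), ("c",)), DB)` compares the rule a → c with the longer rule a, b → c. A check of this pairwise kind looks at two rules at a time, which is the limitation the abstract attributes to existing methods.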
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
2016, No. 3, pp. 52-55, 66 (5 pages in total)
Funding
National Natural Science Foundation of China (Grant No. 61303225)
Keywords
Event
Sequence rule
Inclusion
Deduction
Non-redundant