Abstract
The fast incremental updating algorithm for association rules makes full use of the results of earlier mining runs. It obtains the frequent itemsets of the updated data set without rescanning the original data set, and it scans the newly added data set only once. Because the algorithm avoids reprocessing data that has already been processed and avoids repeated scans of the newly added data set, it greatly reduces running time and improves mining efficiency compared with other related algorithms. The advantage becomes more pronounced as the historical data set grows. The algorithm can also be used for Apriori mining when the data set is too large to fit in memory, which is equivalent to mining the data set in groups.
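The abstract does not give the algorithm's internals, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the support counts of all itemsets up to a fixed size are kept between runs, so that a new batch can be folded in with a single scan and the old data is never read again. The class name `IncrementalItemsetCounter`, the `max_size` cap, and the sample transactions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


class IncrementalItemsetCounter:
    """Hypothetical helper: retains itemset counts between mining runs."""

    def __init__(self, max_size=3):
        self.max_size = max_size   # largest itemset size tracked (assumption)
        self.counts = Counter()    # support counts kept from earlier runs
        self.n_transactions = 0    # total transactions seen so far

    def update(self, batch):
        """Scan a batch of transactions once and fold it into the counts."""
        for transaction in batch:
            items = sorted(set(transaction))
            self.n_transactions += 1
            for k in range(1, min(self.max_size, len(items)) + 1):
                for itemset in combinations(items, k):
                    self.counts[itemset] += 1

    def frequent(self, min_support):
        """Return itemsets whose relative support is at least min_support."""
        threshold = min_support * self.n_transactions
        return {s: c for s, c in self.counts.items() if c >= threshold}


# Illustrative data only.
original = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
increment = [["a", "b"], ["a", "b", "d"]]

miner = IncrementalItemsetCounter(max_size=2)
miner.update(original)           # first mining run
before = miner.frequent(0.5)
miner.update(increment)          # only the new data is scanned
after = miner.frequent(0.5)
```

Counting every subset is exponential in transaction length, so this naive version only suits short transactions; a practical algorithm would prune candidates. Feeding a large data set to `update` in chunks gives the same counts as a single pass, which mirrors the group-mining use mentioned in the abstract.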
Source
Journal of Anqing Teachers College (Natural Science Edition)
2007, No. 2, pp. 17-20 (4 pages)
Funding
Natural Science Research Project of the Anhui Provincial Department of Science and Technology (050420207)
Keywords
association rule
incremental updating
frequent itemsets