期刊文献+

在线挖掘数据流滑动窗口中最大频繁项集 被引量:9

Online Mining Maximal Frequent Itemsets in Sliding Window over Data Streams
下载PDF
导出
摘要 相对于频繁项集,最大频繁项集的数目较少,挖掘最大频繁项集的算法具有较高的时空效率。提出了一种新的基于文法顺序FP-Tree的最大频繁项集单遍挖掘算法FPMFI-DS。该算法采用了一种混合搜索空间项顺序策略,并利用我们所提出的一种新的剪枝技术—"子集等价剪枝技术",有效缩小搜索空间的大小。基于该算法,提出了一种能够在线更新挖掘数据流滑动窗口中最大频繁项集的算法FPMFI-DS+。FPMFI-DS+算法能够在任意时刻都维护数据流当前窗口中的最大频繁项集。仿真实验表明,FPMFI-DS算法的效率接近于多遍挖掘算法FPMax*,并具有良好的可扩展性,FPMFI-DS+算法更新挖掘速度快。 For the number of maximal frequent itemsets (MFIs) is less than that of frequent itemsets, the efficiency of algorithm for mining MFIs is higher. A novel single-pass lexicographical-order FP-Tree based algorithm, FPMF1-DS was proposed. FPMFI-DS uses a kind of mixed item. ordering policy and imports a new pruning technique, subset equivalence pruning technique. These two techniques effectively decrease the size of searching space. Based on FPMFI-DS, another algorithm, FPMFI-DS+ was proposed, which'could mine MFls in sliding window over data streams in an online updating fashion. FPMFI-DS+ can maintain MFIs in current sliding window over data streams at any time. The experiments show that FPMFI-DS is comparable with multi-pass algorithm FPMax^* regarding with the efficiency, and has good scalability, and FPMFI-DS+ has high updating-miningspeed.
出处 《系统仿真学报》 CAS CSCD 北大核心 2009年第4期1134-1139,共6页 Journal of System Simulation
基金 国家自然科学基金资助项目(60573057 60704038)
关键词 数据流 最大频繁项集 在线挖掘 滑动窗口 文法顺序FP-Tree data streams maximal frequent itemsets online mining sliding window lexicographical-order FP-Tree
  • 相关文献

参考文献13

  • 1B Babcock, S Babu, M Datar, R Motwani, J Widom. Models and Issues in Data Stream Systems [C]// Proc. of PODS'2002. USA: ACM, 2002: 1-16. 被引量:1
  • 2D Lee, W Lee. Finding maximal frequent itemscts over online data streams adaptively [C]// Proc. of the Fifth IEEE International Conference on Data Mining. Houston. USA: IEEE, 2005: 266-273. 被引量:1
  • 3H Li, S Lee, M Shan. Online mining (recently) maximal frequent itemsets over data streams [C]//Proc. of the fifteenth International Workshops on Research Issues in Data Engineering: Stream Data Mining and Applications, Tokyo, Japan. USA: IEEE, 2005:11-18. 被引量:1
  • 4G Mao, X Wu, X Zhu, et al. Mining maximal frequent itemsets from data streams [J]. Journal of Information Science, 2007, 33(3): 251-262. 被引量:1
  • 5G Grahne, J Zhu. Efficiently Using Prefix-trees in Mining Frequent Itemsets [C]// Proc. of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations. USA: IEEE, 2003. 被引量:1
  • 6Y Yah, Z Li, H Chen. Fast Mining Maximal Frequent ItemSets Based on FP-Tree [C]//Proc. of AI'2004, Cairns Australia, December, 2004. Germany: Springer, 2004: 475-487. 被引量:1
  • 7宋余庆,朱玉全,孙志挥,陈耿.基于FP-Tree的最大频繁项目集挖掘及更新算法[J].软件学报,2003,14(9):1586-1592. 被引量:164
  • 8F Ao, Y Yan, J Huang, K Huang. A Novel Pruning Technique for Mining Maximal Frequent Itemsets [C]// Proc. of FSKD'2007, Haikou, China, August, 2007. USA: IEEE, 2007:469-473. 被引量:1
  • 9Y Zhu, D Shasha. StatStream: Statistical monitoring of thousands of data streams in real time [C]//Proc. of the 28th Int'l Conf. on Very Large Data Bases. Hong Kong: Morgan Kaufmann, 2002: 358-369. 被引量:1
  • 10J Han, J Pei, Y Yin. Mining frequent patterns without candidate generation [C]//Proc. of the Special Interest Group on Management of Data 2000. USA: ACM, 2000: 1-12. 被引量:1

二级参考文献1

共引文献163

同被引文献81

引证文献9

二级引证文献21

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部