Abstract
Traditional constrained frequent itemset mining methods cannot process large volumes of data quickly. To address this problem, a constrained frequent itemset mining algorithm based on MapReduce is proposed, which exploits the distributed computing advantages of the Hadoop framework. A complete mining task is split into several relatively independent subtasks, and the subtasks are mined in parallel according to user-defined constraints, which improves the execution efficiency of the algorithm. Experimental results show that the algorithm is practical and scales well.
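The split-then-mine idea in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's algorithm: the abstract does not say how subtasks are formed or how constraints are applied, so the chunking scheme, the brute-force enumeration, and the names `map_chunk`, `reduce_counts`, `mine`, `n_chunks`, and `max_size` are all assumptions made for illustration. On Hadoop, each `map_chunk` call would correspond to a map task running on a separate node.

```python
from collections import Counter
from itertools import combinations


def map_chunk(transactions, constraint, max_size):
    """Map step (hypothetical): count itemsets that satisfy the
    user-defined constraint within one chunk of transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                if constraint(itemset):
                    counts[itemset] += 1
    return counts


def reduce_counts(partials):
    """Reduce step (hypothetical): sum the per-chunk counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def mine(transactions, min_support, constraint, n_chunks=2, max_size=3):
    """Split the task into independent subtasks, count per subtask,
    merge the counts, then keep itemsets meeting the support threshold."""
    chunks = [transactions[i::n_chunks] for i in range(n_chunks)]
    # Run sequentially here; the chunks share no state, so the calls
    # could be executed in parallel.
    partials = [map_chunk(chunk, constraint, max_size) for chunk in chunks]
    total = reduce_counts(partials)
    return {s: c for s, c in total.items() if c >= min_support}


# Illustrative data, not taken from the paper.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
# Example constraint: an itemset must contain item "a".
result = mine(transactions, min_support=3, constraint=lambda s: "a" in s)
```

Because the constraint is checked inside the map step, itemsets that violate it are never counted or shuffled to the reduce step, which is one plausible way that constraints can cut work in each subtask.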
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2015, No. 10, pp. 2725-2728, 2748 (5 pages in total)
Funding
National Natural Science Foundation of China (61103129, 61202312)
Jiangsu Province Science and Technology Support Program (BE2009009)