Abstract
MapReduce is the mainstream distributed computing model in cloud computing. It makes full use of the processing power of computer clusters and can mine and analyze large-scale data efficiently. Building on a study of the MapReduce architecture, this paper combines cloud computing with data mining and proposes an Apriori algorithm based on the MapReduce model. The algorithm applies a dual binary encoding to the transaction set and the itemsets, so that pattern matching and joining need only "AND" and "OR" operations, which improves their efficiency. Experimental results show that the algorithm runs substantially faster than the traditional centralized Apriori algorithm.
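The abstract does not spell out the encoding or the job layout, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes each transaction and each itemset is packed into an integer bitmask (one bit per item), that a candidate matches a transaction when `(t & c) == c`, and that candidates are joined with a bitwise OR. The map and reduce phases are simulated locally with plain functions; the names `encode`, `map_count`, `reduce_counts`, `join`, and `apriori` are hypothetical, and the usual Apriori subset-pruning step is omitted for brevity.

```python
from collections import Counter
from itertools import combinations


def encode(items, index):
    """Pack a collection of items into an integer bitmask (assumed encoding)."""
    mask = 0
    for item in items:
        mask |= 1 << index[item]
    return mask


def map_count(partition, candidates):
    """Map phase: local support counts for one partition of transactions."""
    counts = Counter()
    for t in partition:
        for c in candidates:
            if (t & c) == c:  # AND test: candidate is contained in transaction
                counts[c] += 1
    return counts


def reduce_counts(partials, min_support):
    """Reduce phase: sum partial counts and keep the frequent itemsets."""
    total = Counter()
    for p in partials:
        total.update(p)
    return {c: n for c, n in total.items() if n >= min_support}


def join(frequent, k):
    """Join step: OR pairs of frequent itemsets, keep results with k items."""
    return {a | b for a, b in combinations(frequent, 2)
            if bin(a | b).count("1") == k}


def apriori(partitions, n_items, min_support):
    """Level-wise search; each level is one simulated map/reduce round."""
    candidates = {1 << i for i in range(n_items)}  # all 1-itemsets
    result = {}
    k = 1
    while candidates:
        partials = [map_count(p, candidates) for p in partitions]
        frequent = reduce_counts(partials, min_support)
        result.update(frequent)
        k += 1
        candidates = join(frequent, k)
    return result


# Illustrative data only: four items and five transactions in two partitions.
index = {"a": 0, "b": 1, "c": 2, "d": 3}
partitions = [
    [encode("abc", index), encode("ab", index), encode("ac", index)],
    [encode("bc", index), encode("abcd", index)],
]
frequent_itemsets = apriori(partitions, n_items=4, min_support=3)
```

In this sketch the result maps each frequent itemset's bitmask to its support count. On a real cluster, each call to `map_count` would run as a mapper over one input split and `reduce_counts` as the reducer, with one job per level.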
Source
《计算机测量与控制》 (Computer Measurement & Control)
CSCD
北大核心 (Peking University Core Journals)
2012, No. 6, pp. 1653-1655 (3 pages)
Funding
National Natural Science Foundation of China (60473003)
2009 Guangdong Police College Research Project (2009-Z09)