Abstract
To help readers find the books they need quickly and effectively among massive library holdings, data mining was applied to the library management system by combining block-wise processing under the MapReduce framework with the Apriori association analysis algorithm. However, this approach must scan the database many times and generates a large number of candidate sets, which poses a great challenge to the processing speed of the Hadoop platform. Therefore, as an improvement on the traditional Apriori algorithm, a method is proposed that runs on the Spark platform, using in-memory computing and resilient distributed datasets (RDDs), to recommend books to readers and guide their borrowing behavior.
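The cost the abstract attributes to Apriori (one full pass over the data per itemset size, plus a growing candidate set) can be seen in a minimal single-machine sketch. This is plain Python, not the authors' Spark implementation; the function name `apriori`, the `loans` records, and the support threshold are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k:
                # Prune step: every (k-1)-subset must itself be frequent.
                if all(frozenset(sub) in frequent
                       for sub in combinations(union, k - 1)):
                    candidates.add(union)

        # One full scan of the data per level: the repeated-scan cost
        # mentioned in the abstract.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical borrowing records: each set is the books one reader borrowed.
loans = [
    {"Hadoop", "Spark", "Scala"},
    {"Hadoop", "Spark"},
    {"Spark", "Scala"},
    {"Hadoop", "Python"},
]
frequent_sets = apriori(loans, min_support=2)
```

In this sketch every level rescans `transactions`. Under MapReduce, each such scan would typically be a separate job that rereads its input from disk, whereas on Spark the transactions could be held in memory as a cached RDD across levels, which is the motivation the abstract gives for moving to Spark.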
Authors
高琪娟
刘锴
陈佳
GAO Qijuan1, LIU Kai2, CHEN Jia3 (1. School of Information and Computer Science, Anhui Agricultural University, Hefei 230036; 2. Modern Educational Technology Center of Anhui Agricultural University, Hefei 230036; 3. High-standard clustering Service Center Department of China Telecom Co., Ltd., Wuhu 24100)
Source
《安徽农业大学学报》
CAS
CSCD
2018, Issue 4, pp. 768-771 (4 pages)
Journal of Anhui Agricultural University