Real-time Film Recommendation Research Based on Spark Streaming Calculation

Cited by: 3
Abstract: Real-time movie recommendation systems built on the Hadoop platform slow down markedly when a workload requires many iterative computations, and they cannot give real-time feedback on user behavior. To address this, the authors design a real-time movie recommendation system based on Spark stream computing that better meets users' real-time needs. The system combines a traditional movie recommendation algorithm with Spark stream computing. The online part uses Spark Streaming to receive simulated user ratings in real time, and uses Socket programming to simulate the real-time log data generated as users browse items. Each log record includes the movie the user is currently browsing, the number of times the movie was watched, the dwell time, and whether the item was purchased. Spark Streaming is then used to build a real-time data processing system that computes the movies most relevant to the current user and recommends them. Experimental results show that, in offline recommendation training, the Spark-based system trains noticeably faster than the Hadoop platform, can respond to user behavior in real time, and recommends movies to users accordingly.
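The online scoring step described in the abstract can be illustrated without a cluster. What follows is a minimal sketch under stated assumptions, not the authors' implementation: the comma-separated log format, the weights, the function names, and the sample movie titles are all hypothetical, since the abstract gives neither the log layout nor the relevance formula. Plain Python lists stand in for the micro-batches that Spark Streaming would deliver.

```python
from collections import defaultdict

# Hypothetical weights: the abstract does not say how the signals are combined.
W_VIEWS, W_DWELL, W_PURCHASE = 1.0, 0.01, 5.0

def parse_log(line):
    """Parse one simulated log line (assumed format):
    user,movie,views,dwell_seconds,purchased(0/1)."""
    user, movie, views, dwell, purchased = line.strip().split(",")
    return user, movie, int(views), float(dwell), purchased == "1"

def score(views, dwell, purchased):
    """Combine the logged signals into a single interest score."""
    return W_VIEWS * views + W_DWELL * dwell + (W_PURCHASE if purchased else 0.0)

def update_scores(state, batch):
    """Fold one micro-batch of log lines into running (user, movie) scores."""
    for line in batch:
        user, movie, views, dwell, purchased = parse_log(line)
        state[user][movie] += score(views, dwell, purchased)
    return state

def top_movie(state, user):
    """Return the movie with the highest accumulated score, or None if unseen."""
    movies = state.get(user)
    if not movies:
        return None
    return max(movies, key=movies.get)

# Two simulated micro-batches for one user.
state = defaultdict(lambda: defaultdict(float))
update_scores(state, ["u1,Inception,2,120,0", "u1,Heat,1,30,0"])
update_scores(state, ["u1,Heat,1,60,1"])
print(top_movie(state, "u1"))
```

In a Spark Streaming job, the log lines would typically arrive through `StreamingContext.socketTextStream`, and the running scores could be kept with `DStream.updateStateByKey`; the scoring logic itself would stay the same.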
Authors: YAN Lei (严磊); WANG Xiao-ke (汪小可) (College of Computational Science and Engineering, Wuhan Institute of Technology, Wuhan 430000, China)
Source: Software Guide (《软件导刊》), 2019, No. 5, pp. 44-48 (5 pages)
Keywords: movie recommendation; Spark Streaming; Spark; real-time recommendation