Abstract
In collaborative filtering recommendation algorithms, when the user-rating matrix is sparse and users have few co-rated items, user similarity is difficult to compute accurately. Combined with other practical factors, this causes the final recommendations to differ substantially from the actual results, and recommendation quality suffers. This paper aims to produce a more accurate recommendation result set by improving how the algorithm is computed and by incorporating more practical factors. First, the data are preprocessed and classified to reduce computation on redundant data and to reduce matrix sparsity. Second, the user similarity calculation is improved by taking into account the factors that most strongly affect user similarity in real recommendation scenarios. Then, a hybrid recommendation function is constructed, and offline and real-time computation is carried out on the Spark distributed computing platform, which reduces computation time. Finally, training on the data and comparing the result sets shows the extent to which the improved algorithm increases efficiency and accuracy.
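The sparsity problem described in the abstract can be illustrated with a minimal sketch. It is not the paper's improved similarity measure, which the abstract does not specify. It assumes plain cosine similarity over co-rated items, and the damping step is a generic significance-weighting heuristic. The names `cosine_on_corated`, `damped_similarity`, and `min_overlap`, and the sample ratings, are hypothetical.

```python
import math


def cosine_on_corated(ratings_u, ratings_v):
    """Cosine similarity restricted to items both users rated.

    Returns (similarity, number_of_co_rated_items).
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0, 0
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0, len(common)
    return dot / (norm_u * norm_v), len(common)


def damped_similarity(ratings_u, ratings_v, min_overlap=5):
    """Shrink the similarity when fewer than min_overlap items are co-rated.

    Assumption: a generic significance weight, not the paper's formula.
    """
    sim, n = cosine_on_corated(ratings_u, ratings_v)
    return sim * min(n, min_overlap) / min_overlap


# Hypothetical ratings: the two users share a single co-rated item.
alice = {"item_a": 5, "item_b": 1, "item_c": 4}
bob = {"item_a": 2, "item_d": 5}

raw, overlap = cosine_on_corated(alice, bob)
damped = damped_similarity(alice, bob)
print(overlap, raw, damped)
```

With a single co-rated item, the raw cosine similarity is maximal even though the two ratings disagree (5 versus 2). This is the kind of distortion a sparse matrix produces, and the damping factor reduces its influence.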
Authors
李淑敏
夏茂辉
赵志伟
LI Shu-min; XIA Mao-hui; ZHAO Zhi-wei (College of Science, Yanshan University, Hebei 066004, China)
Source
Software (《软件》)
2019, No. 2, pp. 173-178 (6 pages)