Abstract
The latent factor model (LFM) is an important model in text mining. Applied to rating prediction in recommender systems, it offers high prediction accuracy and a small memory footprint. Because of its high time cost, however, LFM is poorly suited to large-scale sparse matrices. To address this problem, the paper introduces the K-means algorithm into LFM's processing of rating data, yielding an improved model called K-LFM. K-LFM first uses K-means to cluster the user and item data in the rating matrix, then reconstructs the rating matrix to reduce both the sparsity and the size of the original matrix, and finally trains the model on the reconstructed matrix to predict ratings. Experiments on the MovieLens dataset show that K-LFM runs in substantially less time than LFM, while prediction accuracy is not noticeably affected.
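The three steps named in the abstract (cluster, reconstruct, train) can be sketched as follows. The abstract does not say how the rating matrix is reconstructed, so this sketch assumes one plausible reading: each (user cluster, item cluster) block is replaced by the mean of its observed ratings. It also assumes that a rating of 0 means "missing" and that the LFM is trained by plain SGD with L2 regularization. All function names and hyperparameters (`kmeans`, `reduce_matrix`, `train_lfm`, `k_lfm`, `ku`, `ki`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster label for each row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every row to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = X[labels == c]
            if len(members):  # keep the old center if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels

def reduce_matrix(R, user_labels, item_labels, ku, ki):
    """Assumed reconstruction: mean of observed (non-zero) ratings per block."""
    sums = np.zeros((ku, ki))
    counts = np.zeros((ku, ki))
    users, items = np.nonzero(R)
    block = (user_labels[users], item_labels[items])
    np.add.at(sums, block, R[users, items])
    np.add.at(counts, block, 1)
    return np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

def train_lfm(R, factors=2, epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit R ~ P @ Q.T by SGD over the observed (non-zero) entries."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(R.shape[0], factors))
    Q = rng.normal(scale=0.1, size=(R.shape[1], factors))
    rows, cols = np.nonzero(R)
    for _ in range(epochs):
        for u, i in zip(rows, cols):
            err = R[u, i] - P[u] @ Q[i]
            pu = P[u].copy()  # use the pre-update user vector for Q's step
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

def k_lfm(R, ku, ki, **lfm_kwargs):
    """Cluster users and items, shrink the matrix, train LFM on the result."""
    user_labels = kmeans(R, ku)      # users are the rows of R
    item_labels = kmeans(R.T, ki)    # items are the columns of R
    R_small = reduce_matrix(R, user_labels, item_labels, ku, ki)
    P, Q = train_lfm(R_small, **lfm_kwargs)
    # A user-item pair inherits the predicted rating of its cluster pair.
    pred = (P @ Q.T)[user_labels][:, item_labels]
    return pred, R_small

# Toy 6x6 rating matrix (0 = no rating), made up for illustration only.
R = np.array([
    [5, 4, 0, 1, 0, 0],
    [4, 0, 5, 0, 1, 0],
    [0, 5, 4, 0, 0, 1],
    [1, 0, 0, 5, 4, 0],
    [0, 1, 0, 4, 0, 5],
    [0, 0, 1, 0, 5, 4],
], dtype=float)
pred, R_small = k_lfm(R, ku=2, ki=2)
print(R_small.shape, pred.shape)
```

Under these assumptions the SGD loop visits at most `ku * ki` entries per epoch instead of every observed rating, which is where a runtime saving of the kind reported in the abstract would come from. The trade-off is that all users in a cluster receive the same prediction for all items in an item cluster.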
Source
Computer & Digital Engineering (《计算机与数字工程》)
2016, No. 4, pp. 572-574, 609 (4 pages)
Funding
Guizhou Provincial Science and Technology Fund (No. 黔科合J字[2010]2100号)
Guizhou University Research Project for Introduced Talents (No. 贵大人基合字(2009)029号)