Abstract
Collaborative filtering is one of the recommendation algorithms most widely studied by scholars in China and abroad, but rating distortion and data sparsity degrade its recommendation performance. To address these problems, this paper proposes an improved clustering-based collaborative filtering recommendation algorithm. First, the algorithm uses unsupervised sentiment mining to map the sentiment of each review to a value in a fixed interval and corrects user rating bias through weighting. Next, it constructs a data field over the corrected user-item rating matrix and uses a heuristic optimization algorithm to compute the optimal number of clusters and the optimal initial cluster centers; it then partitions users into clusters and combines nearest-neighbor user similarity with ratings to generate recommendations. Finally, the performance and effectiveness of the proposed algorithm are comprehensively evaluated on three self-built real-world datasets. The experimental results show that the improved algorithm outperforms the other algorithms on Precision, Recall, and F1-Score, copes effectively with data sparsity, and improves the recommendation performance of the recommender system.
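The abstract does not give the weighting formula or the metric definitions it uses, so the following Python sketch is only an illustration under stated assumptions. It assumes the review sentiment has already been mapped to [0, 1], that ratings lie on a 1 to 5 scale, and that the correction is a simple convex blend controlled by a hypothetical weight `alpha`; none of these choices are confirmed by the paper. The second function computes Precision, Recall, and F1-Score for one user's recommendation list using the standard set-based definitions.

```python
def correct_rating(rating, sentiment, alpha=0.5, r_min=1.0, r_max=5.0):
    """Blend a raw rating with a review sentiment score.

    Assumptions (not from the paper): sentiment is in [0, 1], ratings are
    in [r_min, r_max], and alpha is the weight given to the raw rating.
    """
    if not 0.0 <= sentiment <= 1.0:
        raise ValueError("sentiment must lie in [0, 1]")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Rescale the sentiment score onto the rating scale.
    sentiment_as_rating = r_min + sentiment * (r_max - r_min)
    # Convex combination of the raw rating and the rescaled sentiment.
    return alpha * rating + (1.0 - alpha) * sentiment_as_rating


def precision_recall_f1(recommended, relevant):
    """Set-based Precision, Recall, and F1-Score for one recommendation list."""
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # A 5-star rating attached to a fully negative review is pulled down.
    print(correct_rating(5.0, 0.0))
    print(precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "x"]))
```

With `alpha=0.5`, a 5-star rating paired with a fully negative review is pulled toward the middle of the scale, which is the kind of distortion correction the abstract describes; the actual weighting scheme in the paper may differ.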
Authors
马鑫
段刚龙
Ma Xin; Duan Ganglong (Business School, Nankai University, Tianjin 300110, China; School of Economics and Management, Xi’an University of Technology, Xi’an 710054, China; Big Data Analysis and Business Intelligence Laboratory, Xi’an University of Technology, Xi’an 710054, China)
Source
Statistics & Decision (《统计与决策》)
Peking University Core Journal
2024, No. 4, pp. 23-27 (5 pages)
Funding
Shaanxi Provincial Soft Science Project (2022KRM188).
Keywords
rating biases
random initial clustering centers
collaborative filtering
comment sentiment mining
data field clustering