Abstract
To handle missing values in collaborative filtering algorithms efficiently, rather than simply replacing them with default values, a new approach based on principal component analysis and fuzzy clustering is proposed. The essential idea is as follows. First, an incomplete data set containing missing values is partitioned into several fuzzy clusters using local principal components with minimum variance. Next, the principal components of each sub-cluster are obtained by solving a generalized inverse matrix. Finally, the missing values in each cluster are estimated from the local principal components using a simple linear model. Experimental results show that the method has low memory requirements, achieves a lower mean absolute error (MAE), and provides better recommendation quality than traditional collaborative filtering algorithms.
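The last two steps of the abstract (principal components obtained through a generalized inverse, then a linear model for the missing values) can be illustrated with a small sketch. This is not the paper's algorithm: it treats the whole matrix as a single cluster instead of performing fuzzy clustering, and it swaps in a plain iterative SVD-based principal component fit. The function name `impute_with_principal_components` and the parameters `n_components` and `n_iter` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def impute_with_principal_components(X, n_components=1, n_iter=100):
    """Estimate NaN entries of X with a linear principal component model.

    Simplifying assumption: one cluster covers all rows (no fuzzy clustering).
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start by filling each missing entry with its column mean.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        # Principal directions come from the SVD of the centered data.
        _, _, vt = np.linalg.svd(filled - mean, full_matrices=False)
        P = vt[:n_components].T  # shape: (n_columns, n_components)
        for i in range(X.shape[0]):
            m = missing[i]
            if not m.any():
                continue
            obs = ~m
            # Least-squares component scores from the observed coordinates,
            # computed with the Moore-Penrose pseudo-inverse.
            t = np.linalg.pinv(P[obs]) @ (filled[i, obs] - mean[obs])
            # Linear model: missing value = column mean + loadings @ scores.
            filled[i, m] = mean[m] + P[m] @ t
    return filled

# Toy rating-like matrix whose rows lie on a line; one entry is missing.
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0],
              [4.0, 8.0, np.nan]])
filled = impute_with_principal_components(X, n_components=1)
print(filled[3, 2])
```

On this toy matrix the estimate for the missing entry should settle near 12, the value implied by the other rows. A version closer to the abstract would run the same estimation once per fuzzy cluster and combine the per-cluster estimates by membership weight; that part is left out here.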
Source
Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2004, No. 8, pp. 808-810, 850 (4 pages)
Indexed in: EI, CAS, CSCD, Peking University Core Journals
Funding
National High Technology Research and Development Program of China (2003AA1Z2610)