Abstract
To address weak personalization and low efficiency in news recommendation, this paper proposes a distributed hybrid recommendation algorithm for news events. The algorithm modifies traditional hierarchical clustering for news event detection: it balances the weight of the distance between cluster centers against the weight of the farthest distance between clusters, which resolves the "big cluster" problem of traditional hierarchical clustering. Events are then recommended with a hybrid recommendation algorithm that builds the user interest model from multiple event features, so that user preferences are represented more accurately. The algorithm is implemented on the Spark distributed computing platform so that it can handle personalized recommendation over big data. Experimental results on public datasets show that the proposed method is effective.
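The abstract does not give the exact merge criterion, so the following is only a minimal sketch of one way to blend the two distances it names. The linear blend, the `alpha` weight, the stopping `threshold`, and the function names are all assumptions made for illustration, not the authors' formulation, and the sketch runs on a single machine rather than on Spark.

```python
import math
from itertools import combinations


def centroid(cluster):
    """Mean vector of a cluster of equal-length points."""
    dim = len(cluster[0])
    return [sum(p[i] for p in cluster) / len(cluster) for i in range(dim)]


def blended_distance(a, b, alpha):
    """Assumed criterion: alpha * center distance + (1 - alpha) * farthest pair distance."""
    center_gap = math.dist(centroid(a), centroid(b))
    farthest_gap = max(math.dist(p, q) for p in a for q in b)
    return alpha * center_gap + (1 - alpha) * farthest_gap


def cluster_events(points, threshold, alpha=0.5):
    """Agglomerative clustering that stops once the closest pair exceeds threshold."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest blended distance.
        (i, j), best = min(
            (((i, j), blended_distance(clusters[i], clusters[j], alpha))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        # i < j is guaranteed by combinations, so deleting j keeps i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

The farthest-pair term grows as a cluster spreads out, so giving it more weight (a smaller `alpha` here) makes an already large cluster less likely to keep absorbing its neighbors, which is the intuition behind counteracting the "big cluster" effect.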
Authors
NIU Zhen-dong (牛振东)
WANG Shuai (王帅)
WANG Shi-hang (王诗航)
CHEN Jie (陈杰)
Affiliations: School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; Beijing Engineering Research Center of Massive Language Information Processing and Cloud Computing Application, Beijing 100081, China
Source
《北京理工大学学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2017, No. 7, pp. 721-726 (6 pages)
Transactions of Beijing Institute of Technology
Funding
National Natural Science Foundation of China (61370137)
Keywords
Spark
distributed computing
hierarchical clustering
user interest model
hybrid recommendation