Abstract
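[Objective] To propose a review-based user modeling algorithm that enables personalized recommendation of review information. [Methods] Fine-grained product features are extracted from opinion sentences in reviews with the help of pre-trained word embeddings. A feature-word graph is then built from the semantic associations among these features, and the TextRank keyword extraction algorithm is applied to compute each user's degree of attention to product features, yielding a user interest model. [Results] The user models generated by combining word embeddings with the word-graph algorithm agree closely with manually summarized user models, with a semantic correlation of nearly 90%. The model achieves an F1 of 0.5505, outperforming traditional term-frequency-based bag-of-words models (F1 of 0.5269 for the feature-word model and 0.3322 for the term model). [Limitations] The manually annotated evaluation corpus is small, and word embeddings trained on a general-purpose corpus are of limited use for domain-specific problems. [Conclusions] For review language that is informal in expression, combining information condensation with semantic analysis effectively improves the quality of user modeling, and it offers a new approach to evaluating review quality and to making effective use of reviews in recommender systems.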
[Objective] This paper proposes a review-based user modeling method, aiming to improve personalized information pushing services. [Methods] First, we identified product feature-specific terms in reviews with the help of a pre-trained word embedding model. Then, we built a term graph based on the semantic correlation among feature-specific words. Finally, we used the TextRank algorithm to compute users' interest in product features and to model their preferences for products. [Results] The user models generated by our new algorithm were consistent with the manually created ones (with nearly 90% semantic correlation). Our F1-score was 0.55, better than those of the classic TF-based bag-of-words models. [Limitations] More manually labeled data and further research are needed to improve the domain-specific analysis. [Conclusions] The proposed model helps us better analyze online reviews and develop new applications for recommender systems.
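The Methods pipeline (embedding-based feature terms, a semantic term graph, TextRank scoring) can be illustrated with a minimal sketch. The abstract does not give the embedding model, the edge-weighting rule, or any parameter values, so everything below is an assumption made for illustration: the toy three-dimensional vectors, the cosine-similarity threshold of 0.5, the damping factor of 0.85, and the helper names `cosine`, `build_graph`, and `textrank` are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_graph(vectors, threshold=0.5):
    """Link two feature terms when their similarity reaches the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    terms = list(vectors)
    graph = {t: {} for t in terms}
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                graph[a][b] = sim  # undirected edge, weighted by similarity
                graph[b][a] = sim
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Weighted TextRank: each node shares its score with its neighbours
    in proportion to edge weight."""
    n = len(graph)
    scores = {t: 1.0 / n for t in graph}
    out_weight = {t: sum(nbrs.values()) for t, nbrs in graph.items()}
    for _ in range(iterations):
        updated = {}
        for t, nbrs in graph.items():
            received = sum(w / out_weight[nb] * scores[nb]
                           for nb, w in nbrs.items())
            updated[t] = (1 - damping) / n + damping * received
        scores = updated
    return scores


# Hypothetical embeddings for feature terms found in one user's reviews.
vectors = {
    "battery": (0.90, 0.10, 0.00),
    "charging": (0.80, 0.20, 0.10),
    "power": (0.85, 0.15, 0.05),
    "screen": (0.10, 0.90, 0.10),
    "display": (0.15, 0.85, 0.20),
}

graph = build_graph(vectors)
scores = textrank(graph)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.4f}")
```

In this sketch the scores play the role of the user's attention to each product feature. A fuller version would presumably also account for how often the user mentions each term, which the abstract does not specify.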
Author
聂卉
Nie Hui (School of Information Management, Sun Yat-sen University, Guangzhou 510006, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
Peking University Core Journals (北大核心)
2019, Issue 12, pp. 30-40 (11 pages)
Data Analysis and Knowledge Discovery
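Funding
One of the research outputs of the National Social Science Fund of China project "Research on the Quality and Control of Online Reviews Oriented to User-Perceived Utility" (Project No. 15BTQ067).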