Research of microblog user interest mining based on microblog posts (基于发文内容的微博用户兴趣挖掘方法研究)

Cited by: 9
Abstract: To address the problem of missing interest attributes for microblog users, this paper proposes a user interest mining method based on analysis of post content. Using a phrase-based topic model (phrase-LDA) and an automatically constructed user interest knowledge base, the method extracts high-quality interest phrases from post content and labels their categories, thereby mining the interests of microblog users. Experimental results on the SMP CUP 2016 dataset show that phrase-LDA outperforms the traditional topic model in both perplexity and phrase quality, and that the accuracy and recall of user interest mining reach up to 78% and 82%, respectively.
Authors: Xiong Caiwei (熊才伟); Cao Yanan (曹亚男) (National Key Engineering Laboratory, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; School of Computer & Control Engineering, University of Chinese Academy of Sciences, Beijing 100093, China)
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 6, pp. 1619-1623 (5 pages)
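Funding: National Natural Science Foundation of China, Youth Fund project (61403369); Major Special Project of the Ministry of Science and Technology of China (2016YFB0801300)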
Keywords: microblog; microblog posts; interest mining; phrase-LDA (topic phrase model); knowledge base
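Illustrative sketch (not from the paper): the abstract describes labeling candidate interest phrases with categories from a knowledge base and then scoring the result against known interests. The record gives no implementation details, so the snippet below is only a toy version of that idea. The knowledge base contents, the function names `label_interests` and `precision_recall`, the `min_count` threshold, and the use of set-based precision as the "accuracy" measure are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy knowledge base: interest phrase -> category (not from the paper).
KNOWLEDGE_BASE = {
    "deep learning": "technology",
    "topic model": "technology",
    "premier league": "sports",
    "marathon training": "sports",
}


def label_interests(candidate_phrases, knowledge_base, min_count=1):
    """Look up each candidate phrase and keep categories seen at least min_count times."""
    counts = Counter(
        knowledge_base[phrase]
        for phrase in candidate_phrases
        if phrase in knowledge_base  # phrases missing from the knowledge base are ignored
    )
    return {category for category, n in counts.items() if n >= min_count}


def precision_recall(predicted, gold):
    """Set-based precision and recall for one user's interest labels."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Candidate phrases as they might come out of a phrase-level topic model (made-up data).
phrases = ["deep learning", "topic model", "premier league", "weekend brunch"]
predicted = label_interests(phrases, KNOWLEDGE_BASE)
precision, recall = precision_recall(predicted, {"technology", "music"})
```

In this toy run the user is labeled with the categories "technology" and "sports"; against a hypothetical gold set of "technology" and "music", one of two predictions is correct and one of two gold labels is recovered. Raising `min_count` would trade recall for precision by requiring more supporting phrases per category.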
