
Classification-based Hot Topic Detection Approach on Chinese Micro-blog (基于分类的中文微博热点话题发现方法研究)

Cited by: 3
Abstract: Smartphones and micro-blog clients have strengthened the media character of micro-blogs, so detecting micro-blog topics in real time is of practical significance. This paper proposes a hot topic detection method for Chinese micro-blogs based on keyword classification. Micro-blog messages are filtered and categorized by keyword. Topic words are extracted with a weighting function built from word frequency and growth rate within a time window. The conditional probability that two words occur in the same text serves as the similarity criterion, and the topic words are clustered with an improved single-pass clustering algorithm. Analysis of the system's running results shows that the method can effectively cluster and detect hot micro-blog topics in real time.
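Authors: 郑飞 (Zheng Fei), 张蕾 (Zhang Lei)
Affiliation: Shanghai Municipal Public Security Bureau (上海市公安局)
Source: 《信息网络安全》 (Netinfo Security), 2014, No. 9, pp. 127-131 (5 pages)
Keywords: classification; micro-blog; topic detection; clustering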
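The abstract names three steps: weighting words by frequency and growth within a time window, measuring similarity by same-text conditional probability, and single-pass clustering of the topic words. The sketch below illustrates those steps under stated assumptions. The abstract does not give the paper's weighting function or its improvements to single-pass clustering, so the linear mix controlled by `alpha`, the `+ 1` smoothing, the symmetric `max` similarity, the `threshold` value, and all function names here are hypothetical choices, and the clustering shown is the plain single-pass algorithm rather than the authors' improved version.

```python
from collections import Counter


def topic_word_scores(prev_window, curr_window, alpha=0.5):
    """Score words by current-window frequency and growth over the previous window.

    Each window is a list of tokenized posts (lists of words). The linear
    mix and the +1 smoothing are illustrative assumptions, not the paper's
    weighting function.
    """
    # Count each word at most once per post (document frequency).
    prev = Counter(w for post in prev_window for w in set(post))
    curr = Counter(w for post in curr_window for w in set(post))
    max_freq = max(curr.values(), default=1)
    scores = {}
    for word, freq in curr.items():
        growth = (freq - prev[word]) / (prev[word] + 1)  # +1 avoids division by zero
        scores[word] = alpha * freq / max_freq + (1 - alpha) * max(growth, 0.0)
    return scores


def conditional_probability(posts, given, target):
    """Estimate P(target appears in a post | given appears in that post)."""
    with_given = [post for post in posts if given in post]
    if not with_given:
        return 0.0
    return sum(1 for post in with_given if target in post) / len(with_given)


def similarity(posts, a, b):
    """Symmetric similarity: the larger of the two conditional probabilities (an assumption)."""
    return max(conditional_probability(posts, a, b),
               conditional_probability(posts, b, a))


def single_pass_cluster(words, posts, threshold=0.5):
    """Plain single-pass clustering: join the most similar cluster or start a new one."""
    clusters = []
    for word in words:
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = max(similarity(posts, word, member) for member in cluster)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best.append(word)
        else:
            clusters.append([word])
    return clusters


# Toy data: two consecutive time windows of already-tokenized posts.
prev_window = [["weather", "rain"], ["lunch"]]
curr_window = [
    ["earthquake", "sichuan"],
    ["earthquake", "sichuan", "rescue"],
    ["earthquake", "rescue"],
    ["concert", "tickets"],
    ["concert", "tickets"],
    ["lunch"],
]

scores = topic_word_scores(prev_window, curr_window)
# Keep words whose score passes an arbitrary cutoff, highest score first.
topic_words = sorted((w for w, s in scores.items() if s >= 1.0),
                     key=lambda w: -scores[w])
clusters = single_pass_cluster(topic_words, curr_window)
print(clusters)
```

In this toy run the word "lunch" appears in both windows with no growth, so it falls below the cutoff, while the remaining words group by the posts they share. A real system would first segment Chinese text into words and apply the keyword-based filtering and categorization that the abstract describes, neither of which is shown here.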
