Abstract
Detecting bursty topics on microblogs is an important branch of online public opinion analysis, and discovering bursty topics in a microblog text stream efficiently and in real time is a problem that urgently needs solving. To this end, this paper proposes a microblog bursty topic detection method based on a dynamic window. The method takes word-pair acceleration as the bursty feature, adaptively determines the extent of the bursty topic window according to the speed at which bursty word pairs appear in the microblog text stream, and uses an improved nonnegative matrix factorization clustering method to obtain the topic structure of the microblogs in the bursty topic window. Comparative experiments on microblog text streams show that the proposed method not only reduces the time delay of bursty topic detection but also improves detection precision and recall.
Authors
李艳红
贾丽娜
王素格
李德玉
Li Yanhong; Jia Lina; Wang Suge; Li Deyu (School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Taiyuan 030006, Shanxi, China)
Source
《计算机应用与软件》
Peking University Core Journal
2020, Issue 5, pp. 30-37 (8 pages)
Computer Applications and Software
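Funding
National Natural Science Foundation of China (61573231, 61672331, 61603229)
Key Research and Development Program of Shanxi Province (201803D421024, 201903D421041)
Basic Research Program of Shanxi Province (201601D021076)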
Keywords
Microblog
Bursty topic
Dynamic window
Word-pair acceleration
Nonnegative matrix factorization