Abstract
Blogs are among the most important carriers of online public opinion, so automatically detecting emergent events in blogs is of significant research value for analyzing and guiding public opinion. Existing emergent event detection suffers from false emergent events caused by ambiguous temporal information. To address this problem, this paper proposes a blog emergent event detection method based on temporal distribution features. The method uses peak detection together with the differences in temporal distribution between event documents and background corpus documents, and between event-relevant and event-irrelevant documents, to judge whether an event is bursty and relevant in its temporal features. Experimental results show that the method effectively detects emergent events in blogs and removes false emergent events whose temporal information is ambiguous.
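The abstract names two ingredients, peak detection and a difference between temporal distributions, without giving formulas. The following is only a minimal sketch of how such a check might look, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the "local maximum above a multiple of the mean" peak rule, the use of KL divergence as the distribution difference, and both threshold values.

```python
from collections import Counter
from math import log


def daily_distribution(day_indices, num_days, smoothing=1e-9):
    """Turn per-document day indices into a smoothed probability distribution over days."""
    counts = Counter(day_indices)
    raw = [counts.get(day, 0) + smoothing for day in range(num_days)]
    total = sum(raw)
    return [value / total for value in raw]


def find_peaks(dist, factor=2.0):
    """Return days that are local maxima and exceed factor * mean (assumed peak rule)."""
    mean = sum(dist) / len(dist)
    peaks = []
    for i, value in enumerate(dist):
        left = dist[i - 1] if i > 0 else float("-inf")
        right = dist[i + 1] if i < len(dist) - 1 else float("-inf")
        if value >= left and value >= right and value > factor * mean:
            peaks.append(i)
    return peaks


def kl_divergence(p, q):
    """KL divergence D(p || q); used here as an assumed stand-in for the distribution difference."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def is_bursty(event_days, background_days, num_days, kl_threshold=0.5):
    """Hypothetical decision rule: the event has a peak and differs enough from the background."""
    event = daily_distribution(event_days, num_days)
    background = daily_distribution(background_days, num_days)
    return bool(find_peaks(event)) and kl_divergence(event, background) > kl_threshold
```

Under the same assumptions, the relevance check between event-relevant and event-irrelevant documents could reuse `kl_divergence` on the two corresponding distributions; the abstract does not say which measure the paper actually uses.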
Source
《计算机工程与科学》
CSCD
PKU Core Journal (北大核心)
2010, Issue 10, pp. 145-149 (5 pages)
Computer Engineering & Science
Funding
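National Natural Science Foundation of China (60873179)
Shenzhen Science and Technology Program, Basic Research Project (JC200903180630A)
Specialized Research Fund for the Doctoral Program of Higher Education (20090121110032)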