Study on hot topics identification and key issues about emergency events
Cited by: 6
Abstract: For a hot topic identification system for emergency events, an overall technical framework for implementing the system is established, and the key issues in the system's four components are described together with solution strategies. Drawing on the content and structure of news reports and on the distribution of report sources, a body-text clipping method and an improved feature weighting model are proposed, based on the VSM text representation model and the TF-IDF formula. The models are evaluated using news reports on earthquake emergencies as the data source. Experimental results indicate that, after clipping the body of a news report, the headline, the lead, and the related feature parameters alone can serve as the sample set for hot topic identification. Furthermore, compared with the classical model, the improved feature weighting model executes more efficiently and has a more adaptive text representation capability.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2011, No. 32, pp. 19-22 (4 pages)
Funding: National Natural Science Foundation of China (No. 91024001, No. 61070142); Fundamental Research Funds for the Central Universities (No. 2009RC0210); Beijing Natural Science Foundation (No. 4111002)
Keywords: emergency event; news report; hot topic identification; text clipping; text representation model
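
The abstract names two building blocks: clipping a report down to its headline and lead, and weighting terms with TF-IDF in a vector space model (VSM). The record does not give the paper's improved weighting formula, so the following is only a minimal sketch of the classical TF-IDF baseline applied to clipped reports. The function names (`clip`, `tfidf_vectors`, `cosine`), the "first sentence equals lead" rule, the whitespace tokenizer, and the toy English reports are all assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter


def clip(report, lead_sentences=1):
    """Keep only the headline and the lead (assumed here: the first body sentence)."""
    sentences = [s.strip() for s in report["body"].split(".") if s.strip()]
    return report["headline"] + " " + " ".join(sentences[:lead_sentences])


def tokenize(text):
    # Assumption: whitespace tokenization; Chinese text would need word segmentation.
    return text.lower().split()


def tfidf_vectors(texts):
    """Classical TF-IDF: (term count / document length) * log(N / document frequency)."""
    docs = [Counter(tokenize(t)) for t in texts]
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())
    vectors = []
    for doc in docs:
        length = sum(doc.values())
        vectors.append(
            {term: (count / length) * math.log(n_docs / doc_freq[term])
             for term, count in doc.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy reports (hypothetical data, for illustration only).
reports = [
    {"headline": "Earthquake strikes coastal city",
     "body": "Rescue teams reached the earthquake zone. Weather was mild."},
    {"headline": "Earthquake rescue continues",
     "body": "Rescue teams searched the city overnight. Markets were closed."},
    {"headline": "Stock markets rally",
     "body": "Shares rose sharply on Monday. Analysts were surprised."},
]

clipped = [clip(r) for r in reports]
vectors = tfidf_vectors(clipped)
```

In this sketch the two earthquake reports share weighted terms after clipping, so `cosine(vectors[0], vectors[1])` is positive, while the earthquake and stock-market reports share no terms. Text after the lead never enters the vectors, which is the source of the smaller sample set the abstract describes.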

References (11)

  • 1 China Internet Network Information Center (中国互联网络信息中心). The 27th Statistical Report on Internet Development in China [R]. 2011-01-19. Cited by: 25
  • 2 Zeng Qingxiang (曾庆香). On news discourse [D]. Chinese Academy of Social Sciences, 2003.
  • 3 Lei Zhen (雷震). Research on event-based news report analysis techniques [D]. National University of Defense Technology, 2006.
  • 4 Vineel G. Web page DOM node characterization and its application to page segmentation [C]//IEEE International Conference on Internet Multimedia Services Architecture and Applications. Bangalore: IEEE IMSAA, 2009: 1-6. Cited by: 1
  • 5 Salton G, Buckley C. Term weighting approaches in automatic text retrieval [J]. Information Processing and Management, 1988, 24(5): 513-523. Cited by: 1
  • 6 Zong Chengqing (宗成庆). Statistical Natural Language Processing [M]. Beijing: Tsinghua University Press, 2008: 475.
  • 7 Bun K K, Ishizuka M. Topic extraction from news archive using TF*PDF algorithm [J]. Web Information Systems Engineering, 2002: 73-82. Cited by: 1
  • 8 Schenker A, Last M, Bunke H, et al. Classification of web documents using a graph model [C]//Proceedings of the 7th International Conference on Document Analysis and Recognition (ICDAR'03). IEEE Computer Society, 2003: 240-244. Cited by: 1
  • 9 Gong Haijun (龚海军). Research on automatic detection of online hot topics [D]. Central China Normal University, 2008.
  • 10 Sebastiani F. Machine learning in automated text categorization [J]. ACM Computing Surveys, 2002, 34(1): 1-47. Cited by: 1

Co-citing literature: 24
Co-cited literature: 53
Citing literature: 6
Second-level citing literature: 32
