Abstract
Topic detection and tracking (TDT) takes news streams as its object of study and organizes information around events; it has long been an active research topic in natural language processing. The field has drawn heavily on related research, especially classical models and methods from information retrieval, and has achieved considerable success. This paper first introduces the main tasks of TDT, its evaluation methods, and its development history. It then proposes an integrated research framework covering the multiple TDT tasks and analyzes the key technologies of that framework in detail. Finally, it identifies the key problems and difficulties in the field and discusses directions for future research.
Source
《计算机科学与探索》
CSCD
2009, No. 4, pp. 347-357 (11 pages)
Journal of Frontiers of Computer Science and Technology
Funding
National Natural Science Foundation of China, No. 60873097
Program for New Century Excellent Talents in University, No. NCET-06-0926
Keywords
topic detection and tracking (TDT)
integrated research framework
representation model