
特征驱动的关键词提取算法综述 (cited by: 35)

Features Oriented Survey of State-of-the-Art Keyphrase Extraction Algorithms
Abstract: Keyphrases that concisely represent the main topics of a document are widely used in document processing tasks, and automatic keyphrase extraction has long been a fundamental problem and an active research topic in natural language processing (NLP). Growing demand for applications built on text data has drawn further attention to the task. Although extraction techniques have developed quickly in recent years, state-of-the-art performance is still far from satisfactory. To help advance the problem, this paper systematically surveys recent work by researchers in China and abroad, organized around the three main steps of candidate keyphrase generation, feature engineering, and keyphrase extraction. It also lists published datasets, analyzes evaluation approaches, and discusses the challenges and possible future directions of automatic keyphrase extraction. Unlike existing surveys, which are organized around extraction models, this survey is organized around the features that each method uses. This feature-oriented perspective may help researchers combine existing features or propose new ones, and thereby design more effective extraction approaches.
Authors: 常耀成 (CHANG Yao-Cheng); 张宇翔 (ZHANG Yu-Xiang); 王红 (WANG Hong); 万怀宇 (WAN Huai-Yu); 肖春景 (XIAO Chun-Jing). Affiliations: School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China; School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2018, No. 7, pp. 2046-2070 (25 pages)
Funding: National Natural Science Foundation of China (U1533104, U1633110, 61603028); Fundamental Research Funds for the Central Universities (ZXH2012P009)
Keywords: keyphrase extraction; candidate keyphrase generation; feature; supervised approach; graph-based approach
