A Semantic Knowledge-Driven Keyword Extraction Method for Paper Abstracts
Abstract [Objective/Significance] Keyword extraction helps users quickly locate core content in massive volumes of text and is of great significance to intelligence collection. Current keyword extraction methods rely mainly on word frequency and co-occurrence relations, overlooking the guidance that a knowledge base can provide. [Methods/Process] This paper proposes a knowledge-fused keyword extraction method: it first builds a lexical knowledge graph from sememes and Cilin (词林), then combines the graph with word co-occurrence relations to generate a new probability transition matrix, and finally extracts keywords. [Results/Conclusions] Experiments on a massive dataset of abstracts show that fusing knowledge effectively improves the performance of existing keyword extraction methods.
Authors DUAN Jianyong (段建勇); LU Zhaoyang (鲁朝阳); WANG Hao (王昊); LI Xin (李欣); HE Li (何丽) (School of Information, North China University of Technology, Beijing 100144, China; The Key Laboratory of Rich-Media Knowledge Organization and Service of Digital Publishing Content, Beijing 100036, China; CNONIX National Standard and Promotion Laboratory, North China University of Technology, Beijing 100144, China)
Source Technology Intelligence Engineering (《情报工程》), 2022, No. 3, pp. 3-12 (10 pages)
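Funding National Natural Science Foundation of China projects "Research on Chinese Query Correction Methods Based on Multi-Source Feature Learning" (61672040) and "Research on Query Timeliness Computation Models for News Events" (61972003); Open Fund of the Key Laboratory of Rich-Media Knowledge Organization and Service of Digital Publishing Content, "Research on Keyword Technologies for Vertical-Domain Knowledge Graph Construction" (ZD2021-11/05); Scientific Research Program of the Beijing Municipal Education Commission (KM202210009002).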
Keywords keyword extraction; knowledge fusion; sememe; Cilin
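
The pipeline outlined in the abstract (word co-occurrence combined with knowledge-graph relations to form a new probability transition matrix, followed by ranking) can be illustrated with a minimal Python sketch. The abstract does not give the fusion formula, so the linear interpolation weight `alpha`, the function names, and the similarity values below are all assumptions made for illustration; in the paper the similarities would presumably come from the sememe- and Cilin-based knowledge graph rather than a hand-written dictionary.

```python
from collections import defaultdict


def cooccurrence_weights(tokens, window=3):
    """Count undirected co-occurrences of distinct words within a sliding window."""
    weights = defaultdict(float)
    for i, a in enumerate(tokens):
        for b in tokens[i + 1:i + window]:
            if a != b:
                weights[(a, b)] += 1.0
                weights[(b, a)] += 1.0
    return weights


def fused_transition_matrix(vocab, cooc, similarity, alpha=0.5):
    """Row-normalise co-occurrence and similarity weights, then interpolate.

    `alpha` is a hypothetical mixing weight; the paper's actual fusion rule
    is not stated in the abstract. Requires at least two words in `vocab`.
    """
    def normalise(weight_of):
        rows = {}
        for u in vocab:
            row = {v: weight_of(u, v) for v in vocab if v != u}
            total = sum(row.values())
            # A word with no outgoing weight falls back to a uniform row.
            rows[u] = {v: (w / total if total > 0 else 1.0 / (len(vocab) - 1))
                       for v, w in row.items()}
        return rows

    p_cooc = normalise(lambda u, v: cooc.get((u, v), 0.0))
    p_sem = normalise(
        lambda u, v: similarity.get((u, v), similarity.get((v, u), 0.0)))
    return {u: {v: alpha * p_cooc[u][v] + (1 - alpha) * p_sem[u][v]
                for v in p_cooc[u]}
            for u in vocab}


def rank_words(transition, damping=0.85, iterations=50):
    """PageRank-style power iteration over the fused transition matrix."""
    vocab = list(transition)
    n = len(vocab)
    score = {w: 1.0 / n for w in vocab}
    for _ in range(iterations):
        score = {
            v: (1 - damping) / n
            + damping * sum(score[u] * transition[u].get(v, 0.0) for u in vocab)
            for v in vocab
        }
    return sorted(score.items(), key=lambda item: item[1], reverse=True)


# Toy input; the similarity values are made up for illustration only.
tokens = ["keyword", "extraction", "knowledge", "graph", "keyword", "ranking"]
vocab = sorted(set(tokens))
similarity = {("keyword", "extraction"): 0.6, ("knowledge", "graph"): 0.8}
matrix = fused_transition_matrix(vocab, cooccurrence_weights(tokens), similarity)
ranking = rank_words(matrix)
for word, value in ranking:
    print(f"{word}\t{value:.4f}")
```

Setting `alpha=1.0` reduces the sketch to a plain co-occurrence graph ranking, which makes it easy to compare rankings with and without the knowledge term.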
