Review on Identifying the Semantics of Scientific Literature Content
Abstract: Semantic identification of scientific literature content makes explicit the research elements embedded in the text. It is a form of fine-grained text mining and a foundation for acquiring and using knowledge. This article reviews research on the semantic identification of scientific literature content to provide a reference for subsequent studies. It first summarizes existing semantic annotation models for literature content. It then traces how research on the identification problem has developed at three granularities (chapters, sentences, and terms), summarizes identification methods, evaluation approaches, and typical applications, and on that basis sets out open problems and future directions. The review addresses five questions: (1) Which semantic types in literature content receive attention? (2) What granularity of text unit is selected as the object of identification? (3) What types of identification methods exist? (4) How are identification results evaluated? (5) What are the typical applications of semantic identification? The review finds that problems remain: semantic type standards are inconsistent, high-quality literature datasets are scarce, research attention is unevenly distributed, and identification methods have limitations. Future work should propose uniform standards for semantic types, expand the available training datasets, attend to multiple semantic types and the relations among them, and improve existing methods.
Authors: Huang Hong (黄红); Chen Chong (陈翀); Zhang Jingying (张婧莹) (School of Government, Beijing Normal University, Beijing 100875)
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》; indexed in CSSCI, CSCD, and the Peking University Core Journals list), 2022, No. 9, pp. 991-1002 (12 pages)
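Funding: National Social Science Fund of China, General Program, "Research on Multidimensional Academic Expertise Identification and Attribute Measurement for the Quantitative Evaluation of Researchers" (21BTQ065).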
Keywords: content mining of scientific literature; semantic type; chapter structure function identification; move identification; lexical semantic identification