Abstract
The knowledge graph was formally proposed by Google in 2012. It inherits a range of traditional Semantic Web technologies; it was first used to improve search engines and was later widely adopted in many other domains. Starting from the logical structure of knowledge graphs (triples and ontologies), this survey details the knowledge graph construction process, including knowledge extraction and knowledge integration, and then reviews in detail the applications of knowledge graphs in Information Science: information collection, information organization, information retrieval, information measurement and analysis, and knowledge discovery. The main findings are: ① the common knowledge extraction tasks (entity, relation, and attribute extraction) have largely shifted from manually constructed rules to statistical machine learning methods; ② in recent years, as deep-learning-based information extraction frameworks have been established and deep learning methods have spread, deep learning has become increasingly important in knowledge extraction; ③ the methods used across the subtasks of knowledge integration are comparatively heterogeneous, with each subtask having its own task-specific processing methods; ④ most applications of knowledge graphs in Information Science exploit their multi-source data and strong semantic expressiveness; ⑤ the further development of knowledge graphs still depends on solving problems such as multi-source data integration, dynamic knowledge graphs, and the effective combination of knowledge graphs with embedding-based representation learning algorithms.
Source
Advances in Information Science (《情报学进展》)
2022, No. 1, pp. 349-384 (36 pages)
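Funding
This work is one of the outcomes of the following projects:
National Natural Science Foundation of China, Youth Program, "Early Identification of High-Academic-Impact Interdisciplinary Teams Based on Causal Inference" (Grant No. 7210040786)
Ministry of Education Humanities and Social Sciences Research Youth Fund, "A Comparative Study of Knowledge Integration and Knowledge Diffusion in Scientific Literature from a Complex Network Perspective" (Grant No. 21YJC870001)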
Keywords
knowledge graph
knowledge extraction
knowledge integration
information science