Abstract
With the arrival of the big data era, massive volumes of data keep emerging, and the demand for finding useful information in them and extracting the corresponding knowledge has grown increasingly strong. Knowledge graph technology emerged in response to this demand and plays an increasingly important role in interlinking knowledge. Information extraction, a foundational technology for constructing knowledge graphs, obtains structured named entities together with their attributes and relationships from large-scale data. Moreover, the diversity of its implementation methods has broadened the application domains and scenarios of information extraction and has raised recognition of the value and necessity of research on it. This paper first discusses the significance of information extraction research against the background of the knowledge graph construction framework. It then reviews the history of information extraction from the perspective of three international evaluation conferences: MUC, ACE, and ICDM. Next, it introduces the key technologies of information extraction for both closed domains and open domains, including entity extraction, relationship extraction, and attribute extraction.
Source
Hans Journal of Data Mining (《数据挖掘》)
2020, Issue 4, pp. 282-302 (21 pages)
Keywords
Knowledge Graph
Information Extraction
Entity Extraction
Relationship Extraction
Open Domain