Abstract
Information extraction (IE) is the automatic extraction of pre-specified kinds of information (knowledge) from natural language text. It offers one way to pull user-relevant information out of very large collections of text. This paper reviews the basic concepts of information extraction, the main research activities in the field, the types of information extraction, and the general architecture of an information extraction system. The author argues that information extraction can play an important role in building digital libraries and in coping with large collections of digital information, specifically in automatic indexing and annotation of digital content, automatic metadata acquisition, data mining, intelligence and information analysis, building large knowledge bases and numeric databases from free text, and generating answers in digital reference systems.
Source
《现代图书情报技术》
CSSCI
PKU Core Journal (北大核心)
2004, No. 6, pp. 1-5, 23 (6 pages)
New Technology of Library and Information Service
Funding
Supported by the Chinese Academy of Sciences government-sponsored overseas study program
Keywords
Information extraction (IE)
Message Understanding Conference (MUC)
Digital library
Natural language processing (NLP)