Abstract
This paper focuses on the extraction and classification of expression named entities (ENEs) in unstructured Chinese text. It attempts to construct a set of matching patterns and builds an ENE extraction model based on hierarchical pattern matching (HPM_ENE_EM), intended as a foundation for applied information-science research such as competitive intelligence systems (CIS) and the acquisition of user interest levels. Finally, taking the recognition of term abbreviations in academic papers as an example, it discusses a concrete application of the model.
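As a rough illustration of what layered pattern matching over text can look like, the following is a minimal sketch and not the HPM_ENE_EM model itself, whose pattern set is not given in this record. It assumes a hypothetical two-level hierarchy: a coarse pattern that finds numeric expression candidates, and finer patterns that assign each candidate a class. The names `COARSE`, `FINE`, `ABBR`, `extract_ene`, and `find_abbreviations`, along with the class labels, are all illustrative assumptions.

```python
import re

# Level 1 (assumed): coarse candidates, one or more "number + unit" pieces.
COARSE = re.compile(r"(?:\d+(?:\.\d+)?[年月日元%％])+")

# Level 2 (assumed): finer patterns, tried in order against each candidate.
FINE = [
    ("DATE", re.compile(r"(?:\d+[年月日])+")),
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?[%％]")),
    ("MONEY", re.compile(r"\d+(?:\.\d+)?元")),
]

# Capitalised English words followed by an upper-case form in brackets,
# e.g. "Competitive Intelligence System(CIS)".
ABBR = re.compile(r"((?:[A-Z][a-z]+\s+)+[A-Z][a-z]+)\s*[(（]\s*([A-Z]{2,})\s*[)）]")


def extract_ene(text):
    """Return (candidate, label) pairs; unmatched candidates get 'OTHER'."""
    results = []
    for match in COARSE.finditer(text):
        candidate = match.group(0)
        label = "OTHER"
        for name, pattern in FINE:
            if pattern.fullmatch(candidate):
                label = name
                break
        results.append((candidate, label))
    return results


def find_abbreviations(text):
    """Return (long form, abbreviation) pairs whose word initials agree."""
    pairs = []
    for match in ABBR.finditer(text):
        words = match.group(1).split()
        abbr = match.group(2)
        tail = words[-len(abbr):]  # keep only as many words as letters
        if len(tail) == len(abbr) and "".join(w[0] for w in tail) == abbr:
            pairs.append((" ".join(tail), abbr))
    return pairs
```

Calling `extract_ene` on a Chinese sentence containing dates, percentages, or amounts returns each candidate with its assumed class, and `find_abbreviations` pairs a bracketed abbreviation with the preceding words when their initials match. A real system would need a far richer pattern set, in particular for Chinese-language terms.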
Source
《现代图书情报技术》
CSSCI
PKU Core Journal
2007, No. 5, pp. 62-68 (7 pages)
New Technology of Library and Information Service
Keywords
Expression named entity
Hierarchical pattern matching
Term extraction
Abbreviations