Abstract
To provide standardized descriptions of resource attributes, concept values, and relationship types, this article takes the field of interstitial disease as its test case and constructs a domain linked vocabulary comprising a metadata element set and a value vocabulary. First, a general linked vocabulary is created by drawing on the domain's existing thesauri, classification schemes, and normative documents. Second, a domain core linked vocabulary is built using several techniques, including N-gram statistical word segmentation, named entity recognition, and pattern recognition, so that the various resources and data in this subject area can be better surfaced and linked.
Source
《图书馆论坛》
CSSCI
Peking University Core Journal (北大核心)
2016, No. 8, pp. 13-19 (7 pages)
Library Tribune
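Funding
One of the research outputs of the National Social Science Fund of China project "Research on Data Linking Mechanisms in Library Resource Organization" (Project No. 14CTQ005)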
Keywords
linked vocabulary
metadata
ontology
knowledge organization