Abstract
The long-term preservation of network newspapers requires solving the problem of metadata extraction, and CWM provides a convenient technical framework model for doing so. After introducing the basic standards, techniques, contents, and framework architecture of CWM, the paper takes the integrated data chain extracted from network newspapers, uses CWM to extract metadata from each part of the chain separately, and analyzes the possible metadata set for each part. It then designs a metadata extraction framework model for network newspapers and identifies the key problems to be solved during extraction: object-relational mapping, metadata conflicts, and metadata export.
Source
Information Science (《情报科学》)
CSSCI
PKU Core Journal (北大核心)
2010, Issue 3, pp. 438-441 (4 pages)
Funding
Sichuan Provincial Social Science Planning Project (SC08W02)
Keywords
network newspaper
integrated data chain
CWM
metadata extraction