Abstract
Health data come from many different sources, which often use different elements or structures to represent the same concepts and relationships, so the data are highly heterogeneous. Supporting unified queries over such heterogeneous data requires resolving both the semantic conflicts and the structural conflicts among the sources. Building on traditional XML Schema integration algorithms, the proposed method computes semantic similarity and structural similarity, checks the generated candidate matches for structural conflicts, and resolves the conflicts it finds. It then generates a global schema with little redundancy, which serves as a unified global view for querying the heterogeneous data. Compared with traditional schema integration algorithms, the method resolves relationship nesting conflicts, relationship direction conflicts, and entity-attribute conflicts in schema integration, reduces redundancy in the integrated schema, and yields better integration quality.
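The abstract does not give the similarity formulas or the conflict-detection rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a simplified schema node (`Node`), uses plain string similarity from Python's `difflib` as a crude stand-in for semantic similarity, uses Jaccard overlap of child names as a stand-in for structural similarity, and flags an entity-attribute conflict when a matched pair is modeled as an element in one schema and as an attribute in the other. All names, weights, and thresholds here are hypothetical.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Node:
    """Hypothetical, simplified stand-in for an XML Schema element or attribute."""
    name: str
    kind: str = "element"  # "element" or "attribute"
    children: list = field(default_factory=list)


def name_similarity(a: str, b: str) -> float:
    # Assumption: string similarity as a crude proxy for semantic similarity.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def structure_similarity(a: Node, b: Node) -> float:
    # Assumption: Jaccard overlap of child names as a proxy for structural similarity.
    ca = {c.name.lower() for c in a.children}
    cb = {c.name.lower() for c in b.children}
    if not ca and not cb:
        return 1.0  # two leaf nodes are treated as structurally identical
    return len(ca & cb) / len(ca | cb)


def candidate_matches(src, tgt, w_sem=0.6, threshold=0.7):
    """Return (source name, target name, score, conflict) for pairs above the threshold."""
    matches = []
    for s in src:
        for t in tgt:
            score = (w_sem * name_similarity(s.name, t.name)
                     + (1 - w_sem) * structure_similarity(s, t))
            if score >= threshold:
                # Same concept modeled as element in one schema, attribute in the other.
                conflict = "entity-attribute" if s.kind != t.kind else None
                matches.append((s.name, t.name, round(score, 3), conflict))
    return matches


# Two toy schemas describing the same health concepts in different ways.
schema_a = [
    Node("Patient", children=[Node("name"), Node("age")]),
    Node("bloodPressure", kind="element"),
]
schema_b = [
    Node("patient", children=[Node("name"), Node("age")]),
    Node("bloodPressure", kind="attribute"),
]

matches = candidate_matches(schema_a, schema_b)
for m in matches:
    print(m)
```

In this toy input, the two `Patient` nodes match without conflict, while the two `bloodPressure` nodes match but are flagged, since one schema models the concept as an element and the other as an attribute. A real integration step would then have to pick one representation for the global schema.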
Authors
TIAN Yi-lin (田燚林)
WANG Yong (王勇)
Institute of Computer, Beijing University of Technology, Beijing 100124, China
Source
Software Guide (《软件导刊》)
2018, No. 11, pp. 158-161, 166 (5 pages)
Keywords
health domain
heterogeneous data
schema integration
global schema
schema structure conflict