Abstract
This paper systematically introduces the definition of data provenance and surveys the field from three aspects: methods, models, and applications. It outlines seven data provenance models: the stream provenance information model, the time-value-centric provenance model, the four-dimensional provenance model, the Open Provenance Model (OPM), the Provenir data provenance model, the data provenance security model, and the PrInt data provenance model. Building on these, it derives a provenance model for heterogeneous data. It then analyzes and compares several of the most widely used provenance methods and, on that basis, proposes storing labeling (annotation) information in columns in order to save storage space. Applications of data provenance are described in three areas: databases, workflows, and other application domains, each illustrated with typical examples. Finally, the paper looks ahead to research hotspots and future directions in data provenance.
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2012, No. 9, pp. 1917-1923 (7 pages)
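Funding
Supported by the National Key Basic Research Program of China ("973" Program), project 2011CB302302
Supported by a Ministry of Railways fund project (20091111068)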
Keywords
data provenance
data tracking
labeling method
data provenance model