An Approach for Optimizing Relational Provenance Storage (一种优化关系型溯源信息存储的新方法) Cited by: 6
Abstract Modern data management has to deal with data from different sources and of varying quality. Supporting data provenance at the system level, so that users can learn where data comes from and how it was derived, has therefore become a critical research topic. Annotation is one of the basic approaches to tracking provenance. Its main drawback is storage cost: storing fine-grained annotations can be expensive, because the complete annotations for the data may exceed the storage space required for the data itself. In this paper, we propose a framework for storing provenance information for data derived via relational queries, using provenance trees that match the query structure to avoid redundant storage of information about the derivation process. Within this framework, we present a series of storage optimization methods for relational queries that select the query tree nodes at which provenance information should be stored. Our optimization algorithms run in time polynomial in the query size and linear in the size of the provenance, so provenance tracking and optimization do not incur large overheads. The framework offers a new direction for data provenance research and has a wide range of potential applications.
Source Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2011, No. 10, pp. 1863-1875 (13 pages)
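Funding Supported by the Ministry of Education Doctoral Program Fund for New Teachers (200804861067) and an Australian Research Council (ARC) project grant (LP0882957).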
Keywords provenance tree; provenance table; storage optimization; optimal reduction; rules I&II reduction
