
A Backup System Based on Data Deduplication (cited by: 5)

A Remote Data Backup System with Deduplication
Abstract: Data deduplication effectively improves the efficiency of backup systems, but it also increases the overhead of matching duplicate data. To address this problem, the authors design and implement THBS, a backup system based on data deduplication. THBS introduces HAD (hierarchical approach of data deduplication), a highly compact backup method that matches and removes duplicate data in multiple steps from coarse to fine, proceeding in turn through the directory, file, block, and byte granularities. It also uses a Bloom filter and an inverted index to avoid unnecessary data matching and disk accesses and to speed up lookups. Experiments on two real-world datasets show that THBS saves 63.1% to 96.7% of storage space during backup, saves 71.3% to 97.6% of network bandwidth compared with Scp and 41.2% to 66.7% compared with Rsync, and has a cumulative backup time that is 75% to 86% of Scp's and 91% to 97% of Rsync's.
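The coarse-to-fine matching described in the abstract can be illustrated with a minimal sketch. It is not the THBS implementation: the class names `BloomFilter` and `DedupStore`, the fixed-size blocks, and the use of SHA-256 fingerprints are all assumptions made for illustration, and the directory-level step, the byte-level step, and the inverted index are left out. The sketch only shows two ideas: a whole-file match is tried before any block-level match, and a Bloom filter is consulted before the (possibly disk-resident) fingerprint index is probed.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        # Derive several bit positions by salting SHA-256 with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class DedupStore:
    """Hypothetical two-level store: whole files first, then fixed-size blocks."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.seen = BloomFilter()
        self.files = {}   # file fingerprint -> list of block fingerprints
        self.blocks = {}  # block fingerprint -> block bytes

    def _known(self, fingerprint, index):
        # A Bloom filter miss is definitive, so the index probe is skipped.
        # A hit may be a false positive, so the index is still checked.
        return fingerprint in self.seen and fingerprint in index

    def backup(self, data):
        """Store data and return the number of new bytes actually kept."""
        file_fp = hashlib.sha256(b"F" + data).digest()
        if self._known(file_fp, self.files):
            return 0  # coarse level: the whole file is a duplicate
        new_bytes = 0
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            block_fp = hashlib.sha256(b"B" + block).digest()
            if not self._known(block_fp, self.blocks):
                self.blocks[block_fp] = block  # finer level: keep new blocks only
                self.seen.add(block_fp)
                new_bytes += len(block)
            recipe.append(block_fp)
        self.files[file_fp] = recipe
        self.seen.add(file_fp)
        return new_bytes


store = DedupStore(block_size=4)
print(store.backup(b"aaaabbbbcccc"))  # 12: every block is new
print(store.backup(b"aaaabbbbcccc"))  # 0: matched at file level
print(store.backup(b"aaaaXXXXcccc"))  # 4: only the middle block is new
```

Checking the file fingerprint first means an unchanged file costs one lookup instead of one lookup per block, which is the motivation for ordering the levels from coarse to fine.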
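Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, Issue S1, pp. 206-210 (5 pages)
Funding: National High Technology Research and Development Program of China (863 Program) (2009AA01A403); National Natural Science Foundation of China (60873066); Specialized Research Fund for the Doctoral Program of Higher Education (200800030027)
Keywords: backup system; data deduplication; hierarchy approach for data deduplication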


