Abstract
Cloud storage is a low-cost, highly secure way to store data, but much of the data held in the cloud is duplicated. This redundancy occupies storage space, raises storage costs, and consumes substantial network bandwidth. Data deduplication, a global technique for eliminating redundant data, can achieve compression ratios as high as 20:1 and helps relieve the storage pressure caused by explosive data growth. This paper first introduces the basic deduplication workflow and the common techniques for detecting duplicate data. It then designs a deduplication system and describes its overall architecture, functional modules, and data allocation strategy. Finally, it reports performance tests of the chunking algorithm and of system backup and recovery. The results show that, compared with the sliding window algorithm, the chunking algorithm improves system performance by about 40 times while losing less than 10% of the deduplication rate, which markedly improves the overall effectiveness of the deduplication system.
Source
《科技创新与应用》
2022, No. 19, pp. 158-161 (4 pages)
Technology Innovation and Application