Abstract
Chunking is the process of splitting a file into smaller pieces and is widely used in data deduplication systems. To address the high computational overhead of traditional content-defined chunking (CDC), a chunking algorithm based on the asymmetric maximum (CAAM) is proposed. Instead of hash values, CAAM uses byte values to declare cut points: it uses a fixed-size window and a variable-size window to find the maximum value that serves as the cut point. This preserves the content-defined property of CDC while requiring less computation. Finally, CAAM is compared with existing hash-based and hash-less chunking algorithms. The experimental results show that CAAM has lower computational overhead and higher chunking throughput than the other algorithms.
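The abstract does not give CAAM's exact cut-point rule, so the following is only a minimal sketch of the general idea it describes: a byte becomes the local maximum if it is strictly greater than every byte before it in the current chunk (the variable-size window), and a cut is declared once a fixed-size window of bytes after it contains nothing larger. The function name `chunk_boundaries`, the parameter `window`, and its default value are assumptions for illustration, not taken from the paper.

```python
def chunk_boundaries(data: bytes, window: int = 64) -> list[int]:
    """Return the exclusive end offset of each chunk in `data`.

    Hypothetical sketch of byte-value (hash-less) chunking; the real CAAM
    rules may differ from what is assumed here.
    """
    boundaries = []
    start = 0
    n = len(data)
    while start < n:
        max_val = data[start]   # largest byte seen so far in this chunk
        max_pos = start         # position of that byte
        cut = n                 # default: the chunk runs to the end of the data
        i = start + 1
        while i < n:
            if data[i] > max_val:
                # A strictly larger byte restarts the fixed-size window.
                max_val = data[i]
                max_pos = i
            elif i == max_pos + window:
                # `window` bytes after the maximum held nothing larger: cut here.
                cut = i + 1
                break
            i += 1
        boundaries.append(cut)
        start = cut
    return boundaries


if __name__ == "__main__":
    sample = bytes([1, 5, 2, 3, 9, 1, 1, 4])
    print(chunk_boundaries(sample, window=2))  # [4, 7, 8]
```

Each byte costs one comparison and no hash update, which is the kind of saving the abstract attributes to byte-value cut points. Because a cut depends only on nearby byte values, boundaries tend to realign after an insertion or deletion, which is the CDC property the abstract refers to.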
Source
《微型机与应用》
2017, Issue 22, pp. 30-33 (4 pages)
Microcomputer & Its Applications