Abstract
Erasure codes offer low storage overhead, but data recovery with them takes a long time and heavily degrades the performance of foreground applications. This paper presents a model for computing the completion time of data recovery with erasure codes and identifies the key factors that affect recovery performance. Based on these factors, it selects computation overhead, read/write overhead, and transmission overhead as the evaluation criteria for recovery performance. It then analyzes how existing work reduces each of these three overheads, with emphasis on the strengths and weaknesses of the key techniques. Finally, it compares existing coding schemes in terms of recovery performance, reliability, and storage overhead, and points out possible directions for future research.
Authors
杨松霖
张广艳
YANG Songlin; ZHANG Guangyan (Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
Source
《计算机科学与探索》
CSCD
Peking University Core Journals
2017, No. 10, pp. 1531-1544 (14 pages)
Journal of Frontiers of Computer Science and Technology
Funding
National Natural Science Foundation of China, Nos. 61672315, F020803
National Basic Research Program of China (973 Program), No. 2014CB340402
Keywords
erasure codes
multiple replicas
data recovery
performance improvement