Abstract
In distributed storage systems for big data, repair pipelining (RP) reduces repair time by 90%, which effectively addresses the problem that erasure codes are unsuitable for storing hot data because of their high repair-time overhead. However, existing RP suffers from unbalanced load across nodes, which degrades system performance. This paper designs a node load balancing-based repair pipelining scheme (NLB-RP) and, based on the performance evaluation metrics, proposes algorithms for computing node load and repair time. Theoretical analysis and experimental results show that NLB-RP effectively balances and reduces node load, from the local to the global level, without introducing extra repair cost. In contrast to RP, the node load variance of NLB-RP is zero, that is, every node carries the same load. NLB-RP therefore achieves optimal load balance.
Authors
JIANG Xiao-yu (江小玉)
LI Gui-yang (李贵洋)
ZHOU Yue (周悦)
HU Jing-ping (胡金平)
LI Hui (李慧)
Department of Computer Science, Sichuan Normal University, Chengdu, Sichuan 610101, China
Source
Acta Electronica Sinica (《电子学报》)
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2020, No. 5, pp. 930-936 (7 pages)
Funding
National Natural Science Foundation of China (No. 61701331)
Keywords
big data
distributed storage
erasure code
repair pipelining
load balancing