Abstract: As a typical erasure coding choice, Reed-Solomon (RS) codes pay for their high reliability and storage efficiency with a high repair cost, which makes them unsuitable for geo-distributed storage systems. In this paper, we present a novel family of concurrent regeneration codes with local reconstruction (CRL). CRL codes offer three benefits. First, they minimize the network bandwidth needed for node repair. Second, they reduce the number of nodes accessed by calculating parities from a subset of the data chunks and by using an implied parity chunk. Third, they reconstruct data faster than existing erasure codes in geo-distributed storage systems. In addition, we demonstrate how CRL codes overcome the limitations of RS codes, and we show analytically that they achieve an excellent trade-off between chunk locality and minimum distance. We also present a theoretical analysis of CRL codes, including latency and reliability analyses. Through quantitative comparison, we show that the data reconstruction time of CRL(6, 2, 2) is only 0.657x that of Azure LRC(6, 2, 2), where there are six data chunks, two global parities, and two local parities, and that the data reconstruction time of CRL(10, 4, 2) is only 0.656x that of HDFS-Xorbas(10, 4, 2), where there are 10 data chunks, four local parities, and two global parities. We evaluate CRL experimentally in two environments: 1) in memory, its encoding and decoding throughputs are at least 57.25% and 66.85% higher, respectively, than those of its competitors, and 2) on JBOD (Just a Bunch Of Disks), it has at least 1.46x and 1.21x higher encoding and decoding throughputs than its competitors. In a geo-distributed environment, the encoding and decoding throughputs of CRL are 28.79% and 30.19% higher, respectively, than those of LRC.
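The idea of computing a parity from only a subset of the data chunks can be illustrated with a generic local-parity scheme. The sketch below is not the CRL construction, which the abstract does not specify: it assumes plain XOR parities, a hypothetical layout of six data chunks in two local groups of three, and it leaves out the global parities and the implied parity chunk entirely. All function names are illustrative.

```python
from functools import reduce


def xor_chunks(chunks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def encode_local_parities(data_chunks, group_size):
    """Split data chunks into consecutive groups; return one XOR parity per group."""
    groups = [data_chunks[i:i + group_size]
              for i in range(0, len(data_chunks), group_size)]
    return [xor_chunks(group) for group in groups]


def repair_chunk(data_chunks, local_parities, lost_index, group_size):
    """Rebuild one lost data chunk using only its own local group.

    Returns the rebuilt chunk and the number of chunks that were read.
    """
    group = lost_index // group_size
    start = group * group_size
    end = min(start + group_size, len(data_chunks))
    # Read the surviving data chunks of the group plus the group's parity.
    survivors = [data_chunks[i] for i in range(start, end) if i != lost_index]
    helpers = survivors + [local_parities[group]]
    return xor_chunks(helpers), len(helpers)


# Hypothetical layout: six data chunks, two local groups of three (assumption).
data = [bytes([i] * 4) for i in range(1, 7)]
parities = encode_local_parities(data, group_size=3)
rebuilt, reads = repair_chunk(data, parities, lost_index=4, group_size=3)
assert rebuilt == data[4]
print(reads)  # number of chunks read for this single-chunk repair
```

In this layout a single lost chunk is rebuilt from three chunks (two surviving data chunks and one local parity), whereas an RS code over the same six data chunks would need to read six chunks for the same repair.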
Funding: Supported by the National Natural Science Foundation of China (61872088), the Science and Technology Plan Project of Xi'an (2020KJWL02, 2017CGWL35), and the China National Study Abroad Fund.
Abstract: Remote data auditing has become critical for ensuring storage reliability in distributed cloud storage. Recently, Le et al. proposed NC-Audit, an efficient private data auditing scheme designed for regenerating codes, and claimed that it effectively realizes privacy-preserving data auditing for distributed storage systems. However, our analysis shows that NC-Audit is not secure: when the coding field is large enough, an adversarial cloud can forge illegal blocks and cheat the auditor with high probability, even without storing the user's whole data.