Funding: Supported by the National Natural Science Foundation of China (61401234).
Abstract: Cloud storage is becoming increasingly popular as an approach to data management. Data replication is widely used to increase data availability in cloud storage systems. However, most data replication schemes do not fully consider cost and latency when users need large numbers of remote replicas. We present an improved dynamic replication management scheme (IDRMS). By adding a prediction model, the scheme determines an allocation of replicas among the cloud storage nodes that minimizes the total communication cost and network delay. When a local data block is frequently requested, its replicas can be moved to a closer or cheaper node to reduce cost and improve efficiency. Moreover, we replace the B+ tree with the B* tree to speed up search and reduce workload with the lowest blocking probability. We define a popularity value that is used to adjust the placement of replicas dynamically. We divide the data nodes in the network into hot nodes and cool nodes; by directing requests to cool nodes instead of hot nodes, we balance the workload in the network. Finally, we implement IDRMS on the Matlab simulation platform, and the simulation results show that IDRMS outperforms other replication management schemes in communication cost and load balancing for large-scale cloud storage.
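The hot/cool node idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the popularity formula, the hot/cool threshold, or the cost model, so everything here is an assumption made for illustration: the `Node` fields, the use of a recent request count as a stand-in for popularity, the mean-load threshold, and the function names `split_hot_cool` and `pick_replica_node` are all hypothetical and are not taken from IDRMS itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    name: str
    requests: int  # recent request count (assumed stand-in for popularity)
    cost: float    # assumed communication cost from the requesting client


def split_hot_cool(nodes: List[Node],
                   threshold: Optional[float] = None) -> Tuple[List[Node], List[Node]]:
    """Split nodes into (hot, cool) by comparing load with a threshold.

    If no threshold is given, the mean request count is used (an assumption).
    """
    if threshold is None:
        threshold = sum(n.requests for n in nodes) / len(nodes)
    hot = [n for n in nodes if n.requests > threshold]
    cool = [n for n in nodes if n.requests <= threshold]
    return hot, cool


def pick_replica_node(holders: List[Node],
                      threshold: Optional[float] = None) -> Node:
    """Prefer the cheapest cool node; if every node is hot, take the least loaded."""
    _, cool = split_hot_cool(holders, threshold)
    if cool:
        return min(cool, key=lambda n: n.cost)
    return min(holders, key=lambda n: n.requests)


# Three nodes that all hold a replica of the requested block.
nodes = [Node("A", 900, 1.0), Node("B", 120, 2.5), Node("C", 80, 4.0)]
hot, cool = split_hot_cool(nodes)
chosen = pick_replica_node(nodes)
print([n.name for n in hot], [n.name for n in cool], chosen.name)
```

In this toy setup node A is the cheapest but also the most heavily loaded, so the request is redirected to B, the cheapest of the cool nodes. This trades a somewhat higher per-request cost for a more even load, which is the balance the abstract describes.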