Abstract: A Data Grid integrates geographically distributed resources for solving data-intensive scientific applications. Effective scheduling in a Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Scheduling is a traditional problem in parallel and distributed systems. However, because of the special issues and goals of a Grid, traditional approaches are no longer effective in this environment, and methods specialized for this kind of parallel and distributed system are needed. A complementary solution is to use a data replication strategy that creates multiple copies of files and stores them in convenient locations to shorten file access times. To exploit both concepts, in this paper we develop a job scheduling policy, called hierarchical job scheduling strategy (HJSS), and a dynamic data replication strategy, called advanced dynamic hierarchical replication strategy (ADHRS), to improve data access efficiency in a hierarchical Data Grid. HJSS uses hierarchical scheduling to reduce the search time for an appropriate computing node. It considers network characteristics, the number of jobs waiting in the queue, file locations, and the disk read speed of the storage drive at the data sources. Moreover, because storage capacity is limited, a good replica replacement algorithm is needed. We present a novel replacement strategy that deletes files in two steps when free space is insufficient for a new replica. First, it deletes the files with the minimum transfer time. Second, if space is still insufficient, it considers the last time the replica was requested, the number of accesses, the size of the replica, and the file transfer time. The simulation results show that the proposed algorithm performs better than other algorithms in terms of job execution time, number of intercommunications, number of replications, hit ratio, computing resource usage, and storage usage.
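The two-step replacement idea described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual ADHRS formulas, so everything concrete here is an assumption made for illustration: the `Replica` fields, the `cheap_threshold` cutoff used in step one, and the scoring function in step two are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    size: float           # storage used by the replica (e.g. MB)
    transfer_time: float  # time to fetch it again from another site
    last_access: float    # simulation time of the most recent request
    access_count: int     # number of requests seen so far


def choose_victims(replicas, needed, free, now, cheap_threshold):
    """Pick replicas to delete until `needed` space is available.

    Returns a list of replica names, or None if even deleting
    every candidate would not free enough space.
    """
    victims = []
    remaining = list(replicas)

    # Step 1 (assumed form): delete replicas that are cheapest to
    # transfer again, in ascending order of transfer time.
    cheap = sorted(
        (r for r in replicas if r.transfer_time <= cheap_threshold),
        key=lambda r: r.transfer_time,
    )
    for r in cheap:
        if free >= needed:
            break
        victims.append(r.name)
        free += r.size
        remaining.remove(r)

    # Step 2 (assumed scoring): combine last request time, access
    # count, size and transfer time into one value; delete the
    # lowest-valued replicas first.
    def value(r):
        age = now - r.last_access
        return (r.access_count * r.transfer_time) / (r.size * (1.0 + age))

    for r in sorted(remaining, key=value):
        if free >= needed:
            break
        victims.append(r.name)
        free += r.size

    return victims if free >= needed else None


store = [
    Replica("a", size=100, transfer_time=1, last_access=90, access_count=5),
    Replica("b", size=200, transfer_time=10, last_access=10, access_count=1),
    Replica("c", size=150, transfer_time=12, last_access=95, access_count=20),
]
result = choose_victims(store, needed=250, free=0, now=100, cheap_threshold=2)
```

In this example, replica `a` is removed in step one because it is cheap to transfer again, and `b` is removed in step two because it is large, old, and rarely requested, while the frequently used `c` is kept.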
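Abstract: This paper analyzes the current state of research on replica consistency in data grid environments and proposes a timestamp-based replica consistency model (Replica Consistency Model Based on Timestamp, RCMTS), which overcomes the latency problem caused by distributed locks. The model is compared with several traditional model algorithms in the OptorSim simulation environment, and the experimental results show that it is better suited than the traditional algorithms to maintaining replica consistency in a grid environment.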
Funding: Sponsored by the National Natural Science Foundation of China (60573141, 60773041); the Hi-Tech Research and Development Program of China (2006AA01Z201, 2006AA01Z439, 2007AA01Z404, 2007AA01Z478); the High Technology Research Program of Jiangsu Province (BG2006001); the Foundation of National Laboratory for Modern Communications (9140C1105040805); the Jiangsu Provincial Research Scheme of Natural Science for Higher Education Institutions (07KJB520083); and the Innovation Fund of Nanjing University of Posts and Telecommunications (NY208006).
Abstract: OptorSim provides simulations of file replication strategies such as replica placement, replication scheduling, and replica consistency maintenance. However, to assess a replica strategy, a researcher must write network configuration files and repeatedly modify parameters, which is inefficient. In this article, a scale-free algorithm is developed to generate a network topology that is loaded into OptorSim, and a graphical user interface (GUI) is implemented to set all configuration parameters and algorithm parameters. Moreover, a new replica placement strategy, the replica creation (RC) algorithm, is proposed based on 'social ability'. Through the GUI, parameters can be conveniently modified to debug the proposed algorithm. The simulation results demonstrate that the RC algorithm can reduce both processing time and storage consumption. In addition, the simulation experience shows that working with OptorSim through the GUI is efficient.
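To make the phrase "scale-free algorithm" more concrete, here is a minimal sketch of one common way to generate a scale-free topology, Barabási-Albert preferential attachment. The abstract does not say which scale-free model the authors use, so this choice is an assumption. The function name `scale_free_topology` and its parameters are hypothetical. The sketch only produces an edge list, and the conversion to a simulator's network configuration format is not shown.

```python
import random


def scale_free_topology(n, m, seed=None):
    """Generate an undirected scale-free graph by preferential attachment.

    n: total number of nodes; m: links added per new node.
    Returns a set of edges (u, v) with u < v.
    """
    if not 1 <= m < n:
        raise ValueError("require 1 <= m < n")
    rng = random.Random(seed)
    edges = set()
    # Each node appears in `pool` once per incident edge, so a uniform
    # draw from `pool` picks a node with probability proportional to
    # its degree.
    pool = []

    # Seed graph: a fully connected core of m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.add((u, v))
            pool += [u, v]

    # Attach each new node to m distinct existing nodes.
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        for target in chosen:
            edges.add((target, new))  # target < new always holds
            pool += [target, new]

    return edges


topology = scale_free_topology(n=20, m=2, seed=1)
```

Because every new node links to nodes that already exist, the resulting graph is connected, and a few early nodes tend to accumulate many links, which is the hub structure typical of scale-free networks.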