Abstract
Scheduling is a key factor in Hadoop cluster performance and has become an active research topic. Delay scheduling is a common method for improving data locality and cluster performance. However, current delay scheduling algorithms use a fixed waiting time and do not adequately consider cluster load balance. To address these issues, this paper proposes DDS (Dynamic Delay Scheduling), a dynamic delay scheduling mechanism based on load balancing. DDS first uses a grey prediction model to predict the future arrival rate of idle nodes. It then combines the cluster load state and job execution progress to assign each task a reasonable delay waiting time, avoiding ineffective waiting. Task scheduling takes full account of each node's actual workload, so that nodes are not overloaded to the point where tasks run slowly or fail, which shortens the total job completion time. Experimental results show that DDS outperforms the traditional delay scheduling algorithm in total job completion time and load balance.
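The prediction step can be illustrated with GM(1,1), the standard first-order grey model. The abstract does not state which grey model variant DDS uses or how the forecast is turned into a waiting time, so the following is only a minimal sketch under that assumption; the function name `gm11_forecast` and the sample arrival rates are hypothetical.

```python
import math


def gm11_forecast(series, steps=1):
    """Forecast the next `steps` values of a short positive series with GM(1,1).

    Assumption: plain GM(1,1); the variant used by DDS is not given in the abstract.
    """
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) needs at least 4 observations")

    # Accumulated generating sequence x1(k) = x0(1) + ... + x0(k).
    x1 = []
    total = 0.0
    for value in series:
        total += value
        x1.append(total)

    # Background values z(k) = 0.5 * (x1(k) + x1(k-1)), k = 2..n.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(series[1:])

    # Least-squares fit of x0(k) = -a * z(k) + b (two-parameter normal equations).
    u = [-zk for zk in z]
    m = len(u)
    s_u = sum(u)
    s_uu = sum(uk * uk for uk in u)
    s_y = sum(y)
    s_uy = sum(uk * yk for uk, yk in zip(u, y))
    det = s_uu * m - s_u * s_u
    if det == 0:
        raise ValueError("degenerate series: cannot fit GM(1,1)")
    a = (m * s_uy - s_u * s_y) / det
    b = (s_uu * s_y - s_u * s_uy) / det

    # A development coefficient near zero means the series is effectively constant.
    if abs(a) < 1e-12:
        return [b] * steps

    def x1_hat(j):
        # Time response of the whitened equation; j is a 0-based time index.
        return (series[0] - b / a) * math.exp(-a * j) + b / a

    # Restore forecasts of the original series by differencing x1_hat.
    return [x1_hat(n - 1 + i) - x1_hat(n - 2 + i) for i in range(1, steps + 1)]


# Hypothetical idle-node arrival rates (nodes per second) in recent windows.
recent_rates = [2.0, 3.0, 4.5, 6.75]
next_rate = gm11_forecast(recent_rates)[0]
```

A scheduler could use a forecast such as `next_rate` as one input when deciding how long a task should wait for a node that holds its data; the other inputs named in the abstract (cluster load and job progress) are not modeled here.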
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2015, No. 3, pp. 445-449 (5 pages)
Journal of Chinese Computer Systems
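Funding
Supported by the Natural Science Foundation of the Education Department of Henan Province (2011B520035)
Supported by the Key Science and Technology Research Project of the Education Department of Henan Province (13A520651)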