
Load Balance Based Dynamic Delay Scheduling Mechanism for Hadoop (cited by: 5)
Abstract: Scheduling is one of the key factors affecting the performance of Hadoop clusters and has become a hot research topic. Delay scheduling is a common method for improving data locality and cluster performance. However, current delay scheduling algorithms rely on a fixed waiting time and do not adequately consider the load balance of the cluster. To address these issues, this paper proposes a load balance based dynamic delay scheduling mechanism, DDS (Dynamic Delay Scheduling). DDS first uses a grey prediction model to predict the future arrival rate of idle nodes. It then combines the load state of the cluster with job execution progress to assign each task a reasonable delay waiting time, avoiding ineffective waiting. Task scheduling fully considers the actual workload of each node, preventing overloaded nodes from slowing task execution or causing task failure, and thereby shortening the total job completion time. Experimental results show that DDS outperforms the traditional delay scheduling algorithm in terms of total job completion time and load balance.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2015, No. 3, pp. 445-449 (5 pages)
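Funding: Supported by the Natural Science Foundation of the Education Department of Henan Province (2011B520035) and the Key Science and Technology Research Project of the Education Department of Henan Province (13A520651)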
Keywords: Hadoop; mass data; delay scheduling; data locality
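
The abstract states that DDS predicts the arrival rate of idle nodes with a grey prediction model but gives no formulas. As a rough illustration only, the sketch below fits GM(1,1), the most common grey model, to a short history of observed arrival rates and forecasts the next values. Whether the paper uses this exact variant is an assumption, and the function name `gm11_forecast`, its parameters, and the sample data are hypothetical. The sketch assumes a strictly positive input series.

```python
import math


def gm11_forecast(series, steps=1):
    """Fit a GM(1,1) grey model to `series` and forecast `steps` values ahead.

    Illustrative only: `series` stands in for recent idle-node arrival rates,
    and is assumed to be strictly positive.
    """
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least 4 samples")

    # Accumulated generating operation (1-AGO): running sum of the raw series.
    x1 = []
    total = 0.0
    for value in series:
        total += value
        x1.append(total)

    # Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(series[1:])

    # Least-squares fit of the grey equation x0[k] = -a * z[k] + b.
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    a = -slope  # development coefficient

    # A (near-)zero development coefficient means a flat series.
    if abs(a) < 1e-12:
        return [b] * steps

    c = series[0] - b / a

    def x1_hat(k):
        # Time response of the accumulated series at 0-based index k.
        return c * math.exp(-a * k) + b / a

    # Undo the accumulation to recover forecasts of the raw series.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


if __name__ == "__main__":
    # Hypothetical arrival rates (idle nodes per second) over five intervals.
    history = [10.0, 12.0, 14.4, 17.28, 20.736]
    print(gm11_forecast(history, steps=2))
```

A scheduler could feed such a forecast into its choice of delay waiting time, for example waiting longer when idle nodes are expected to arrive quickly. The abstract does not specify that mapping, so it is left out here.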
