Abstract
MapReduce, as implemented on the open-source cloud computing platform Hadoop, is one of the most popular distributed computing frameworks. However, its First-In First-Out (FIFO) scheduling algorithm uses resources inefficiently. This paper proposes and implements a MapReduce task scheduling model based on resource-matching rules. The model obtains each task's resource requirements and the remaining resources on each computing node, then assigns tasks according to how well the two match, which improves the resource utilization of the system. First, the MapReduce scheduling process is modeled, and quantitative definitions of resources and matching degree are given together with the corresponding formulas. Second, the specific methods for measuring resources and the implementation of the algorithm are described. Finally, the model is compared experimentally with the FIFO scheduling algorithm on TeraSort, GrepCount and WordCount jobs. The results show that the proposed model reduces task completion time by 22.19% in the best case and increases throughput by 25.39% even in the worst case.
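The abstract does not reproduce the paper's matching-degree formula, so the following is only a minimal sketch of the general idea of resource-matched assignment. It assumes resources are (CPU, memory, I/O) fractions and uses cosine similarity as a stand-in matching degree; the names `matching_degree` and `pick_task` and the scoring rule are hypothetical, not the authors' definitions.

```python
import math
from typing import Optional, Sequence, Tuple

Vector = Sequence[float]  # assumed layout: (cpu, memory, io) as fractions of a node


def matching_degree(demand: Vector, free: Vector) -> Optional[float]:
    """Hypothetical matching degree between a task and a node.

    Returns None when the task does not fit in the node's free resources.
    Otherwise returns the cosine similarity of the two vectors, an assumed
    stand-in for the formula defined in the paper.
    """
    if any(d > f for d, f in zip(demand, free)):
        return None
    dot = sum(d * f for d, f in zip(demand, free))
    norm = math.sqrt(sum(d * d for d in demand)) * math.sqrt(sum(f * f for f in free))
    return dot / norm if norm > 0 else 0.0


def pick_task(pending: Sequence[Tuple[str, Vector]], free: Vector) -> Optional[str]:
    """Pick the pending task that best matches the node's free resources.

    A FIFO scheduler would always take pending[0]; here the queue order
    only breaks ties between equally good matches.
    """
    best_name, best_score = None, -1.0
    for name, demand in pending:
        score = matching_degree(demand, free)
        if score is not None and score > best_score:
            best_name, best_score = name, score
    return best_name


pending = [
    ("cpu_heavy", (0.8, 0.2, 0.1)),
    ("io_heavy", (0.1, 0.2, 0.7)),
]
# A node with little CPU left but plenty of I/O capacity.
chosen = pick_task(pending, free=(0.2, 0.3, 0.8))
print(chosen)
```

In this sketch the node that is short on CPU receives the I/O-bound task even though the CPU-bound task is first in the queue, which is the kind of decision FIFO cannot make.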
Source
《计算机应用》
CSCD
Peking University Core Journal (北大核心)
2014, Issue 4, pp. 1010-1013, 1018 (5 pages)
Journal of Computer Applications
Funding
National Natural Science Foundation of China (61170135)