Abstract
Execution time is one of the important factors in job scheduling. This paper analyzes the execution characteristics of Hadoop MapReduce jobs and proposes a method that estimates a job's execution time from the execution times of its map tasks and reduce tasks. Under certain assumptions, the method obtains these task execution times by pre-executing the job. The method was evaluated on a Linux cluster, and the experimental results show that it estimates job execution time with an error rate of less than 7%.
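The abstract does not state the estimation formula, so the following is only a minimal sketch of one common way to turn per-task times into a job-level estimate. It assumes a simple wave model (tasks run in rounds limited by the number of available slots, and the reduce phase starts after the map phase ends). The function names, parameters, and slot counts are hypothetical and are not taken from the paper.

```python
import math


def estimate_job_time(num_maps, num_reduces, map_slots, reduce_slots,
                      t_map, t_reduce):
    """Estimate job time under an ASSUMED wave model (not the paper's formula).

    t_map and t_reduce are per-task execution times, e.g. measured by
    pre-executing the job on a small sample of its input.
    """
    if map_slots <= 0 or reduce_slots <= 0:
        raise ValueError("slot counts must be positive")
    # Each wave runs at most `slots` tasks in parallel.
    map_waves = math.ceil(num_maps / map_slots)
    reduce_waves = math.ceil(num_reduces / reduce_slots)
    # Assumption: no overlap between the map phase and the reduce phase.
    return map_waves * t_map + reduce_waves * t_reduce


def error_rate(estimated, measured):
    """Relative error of an estimate against a measured execution time."""
    return abs(estimated - measured) / measured
```

For example, with 10 map tasks on 4 slots (3 waves of 30 s each) and 4 reduce tasks on 2 slots (2 waves of 50 s each), this sketch estimates 190 s; against a measured time of 200 s, that would be an error rate of 5%.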
Source
《计算机工程与应用》
CSCD
2014, No. 10, pp. 249-252 (4 pages)
Computer Engineering and Applications
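Funding
National Natural Science Foundation of China, General Program (No. 51274088)
Henan Provincial Department of Education Project (No. ITE12103)
Provincial Key Laboratory of Mine Informatization, Henan Polytechnic University, Project (No. KY2012-05)
Doctoral Fund of Henan Polytechnic University (No. B2012-099)
Henan Province Basic and Frontier Technology Research Program (No. 122300410415)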