Abstract
MapReduce is currently the most widely used cloud computing model for big data processing, and predicting its performance helps improve the efficiency of cloud computing. However, MapReduce depends on a large number of configuration parameters, and these parameters have a considerable effect on its performance. Traditional methods for predicting the effect of configuration parameters rely on qualitative analysis based on administrator experience and cannot accurately predict MapReduce running time. To predict MapReduce performance more accurately, multiple linear regression is applied: after analyzing the configuration parameters known to affect MapReduce performance, a multiple linear regression model relating performance to those parameters is constructed. To verify the method, a worked example was carried out using the two most important configuration parameters, the Map count and the Reduce count. The experimental results indicate that the multiple linear regression model can be used to predict MapReduce performance.
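The regression idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the fitted model or any measurements, so everything below is an assumption: the function names `fit_linear_regression` and `predict` are hypothetical, the (Map count, Reduce count) settings are invented, and the runtimes are synthetic values generated from a made-up linear rule rather than results from the paper. The sketch fits running time against the two parameters by ordinary least squares, solving the normal equations directly.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    # Prepend a constant 1.0 to each row so coeffs[0] is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix; parameters are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    # The matrix is now diagonal; read off each coefficient.
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coeffs, row):
    """Evaluate intercept + sum(coefficient * parameter value)."""
    return coeffs[0] + sum(c * float(v) for c, v in zip(coeffs[1:], row))


# Hypothetical (Map count, Reduce count) settings; NOT taken from the paper.
configs = [(4, 1), (4, 2), (8, 1), (8, 4), (16, 2), (16, 8)]
# Synthetic runtimes in seconds, generated from an invented linear rule.
runtimes = [200.0 - 5.0 * m - 3.0 * r for m, r in configs]

coeffs = fit_linear_regression(configs, runtimes)
estimate = predict(coeffs, (12, 4))
print("intercept, Map coefficient, Reduce coefficient:", coeffs)
print("estimated runtime for 12 Maps and 4 Reduces:", estimate)
```

Because the synthetic runtimes are exactly linear in the two parameters, the fit recovers the generating rule up to floating-point error. Real measurements would be noisy, and the quality of the fit would then need to be checked, for example through residuals, before trusting any prediction.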
Source
Computer Technology and Development (《计算机技术与发展》)
2016, No. 1, pp. 70-73 (4 pages)
Funding
Pre-research Project of the General Armament Department (总装备部预研项目, 513150701)