Abstract
Hadoop has become a key component of big data systems and is gaining increasing support. Recognizing its potential, a growing number of users who run existing Hadoop platform technologies are also developing and optimizing those technologies to supplement Hadoop. After presenting the basic framework of the Hadoop system, this paper surveys the state of research on optimizing the MapReduce parallel computing framework, job scheduling, HDFS performance, and HBase performance, as well as on enhancing Hadoop's functionality. It analyzes the advantages and disadvantages of the existing techniques and discusses directions for future research.
Source
《计算机研究与发展》
EI
CSCD
Peking University Core Journals (北大核心)
2013, Issue S2, pp. 1-15 (15 pages)
Journal of Computer Research and Development
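Funding
National Natural Science Foundation of China (61300222, 61173170, 60873225)
Huazhong University of Science and Technology Independent Innovation Fund (2012TS052, 2012TS053, 2013QN120, CXY13Q019)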