Abstract
Building on a study of big data collection, big data storage, HDFS, and Flume, this paper presents the conceptual model and architecture of BDAS, a big data acquisition system that combines Flume with HDFS. Under this architecture, the concrete work of implementing a given collection task reduces to configuring the Flume Agent. As an illustration, the paper gives the method and steps for collecting Web Server logs. The BDAS conceptual model and architecture have theoretical and practical significance for big data analysis and research, and provide a general-purpose means of acquiring data for research in the big data field.
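The abstract states that a collection task comes down to configuring the Flume Agent, but the record does not reproduce the configuration itself. The following is a minimal sketch, not the authors' configuration: it renders a Flume agent properties file that tails a web server log with an exec source, buffers events in a memory channel, and writes them to an HDFS sink. The component names (`r1`, `c1`, `k1`), the log path, the HDFS URI, and the helper name `render_flume_agent_config` are all assumptions chosen for illustration.

```python
def render_flume_agent_config(agent, log_file, hdfs_dir):
    """Return Flume agent properties text (hypothetical helper).

    Topology: exec source (tail -F) -> memory channel -> HDFS sink.
    The component names r1/c1/k1 are illustrative, not from the paper.
    """
    lines = [
        # Declare the agent's components.
        f"{agent}.sources = r1",
        f"{agent}.channels = c1",
        f"{agent}.sinks = k1",
        # Source: follow the web server log as it grows.
        f"{agent}.sources.r1.type = exec",
        f"{agent}.sources.r1.command = tail -F {log_file}",
        f"{agent}.sources.r1.channels = c1",
        # Channel: in-memory buffer between source and sink.
        f"{agent}.channels.c1.type = memory",
        f"{agent}.channels.c1.capacity = 1000",
        f"{agent}.channels.c1.transactionCapacity = 100",
        # Sink: write events to HDFS as plain text.
        f"{agent}.sinks.k1.type = hdfs",
        f"{agent}.sinks.k1.hdfs.path = {hdfs_dir}",
        f"{agent}.sinks.k1.hdfs.fileType = DataStream",
        # Lets time escapes such as %Y-%m-%d in hdfs.path resolve
        # without a timestamp header on each event.
        f"{agent}.sinks.k1.hdfs.useLocalTimeStamp = true",
        f"{agent}.sinks.k1.channel = c1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed paths; adjust to the actual web server and NameNode.
    print(render_flume_agent_config(
        "a1",
        "/var/log/httpd/access_log",
        "hdfs://namenode:8020/flume/weblogs/%Y-%m-%d",
    ))
```

Saved to a file such as `weblog.conf`, the rendered text could be started with `flume-ng agent --conf conf --conf-file weblog.conf --name a1`. A memory channel is the simplest choice for a sketch; it loses buffered events if the agent stops, so a file channel may suit production log collection better.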
Authors
FANG Zhong-chun (方中纯); ZHAO Jiang-peng (赵江鹏)
Engineering and Training Center, Inner Mongolia University of Science and Technology, Baotou 014010, China; Information Engineering School, Inner Mongolia University of Science and Technology, Baotou 014010, China
Source
Journal of Inner Mongolia University of Science and Technology (《内蒙古科技大学学报》)
CAS
2018, No. 3, pp. 255-259 (5 pages)
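Funding
National Natural Science Foundation of China (61462069)
Natural Science Foundation of Inner Mongolia (2017MS0604, 2017MS(LH)0603)
Key Teaching Reform Project of Inner Mongolia University of Science and Technology (JY2016003)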