Abstract
This paper presents the design of a Hadoop-based microblog information mining system. To address the performance bottleneck that a single node faces when analyzing massive volumes of microblog data, the system draws on two strengths of cloud computing, distributed processing and virtualization, and integrates microblog data acquisition with the related data analysis into a single Hadoop-based mining platform. To verify that the platform works effectively, the paper runs an experiment on hot-topic extraction and presents the system's mining results. The experimental results show that the system can acquire microblog information effectively, process massive microblog data efficiently, and yield valuable information.
Source
《计算机光盘软件与应用》
2012, No. 1, pp. 7-8 (2 pages)
Computer CD Software and Application