Research on Multi-Partition Strategies in a BSP-Based Large-Graph Processing System
Abstract: Many real-world applications generate data in the form of graphs, and a single large graph can contain millions of vertices and billions of edges. To address this, the paper proposes BC-BSP+, a system built on BC-BSP (BSP: Bulk Synchronous Parallel) that processes large graphs iteratively and in parallel. The system offers flexible configuration policies (disk management parameters) and extension facilities (programming interfaces), and computes on large-scale graphs with fault tolerance and load balancing. Three graph partitioning strategies support large-graph processing: random hash partitioning (RHP), load-balanced hash partitioning (BHP), and range-based vertex partitioning (VCRP). Experiments show that VCRP outperforms BHP and RHP. With the VCRP strategy, BC-BSP+ was compared against MapReduce-based Hadoop, and the reported results show BC-BSP+ processing large graph data more efficiently overall than Hadoop, Giraph, and Hama.
Authors: Jian Xu (蹇旭), Luo Nanchao (罗南超)
Source: Journal of Lanzhou University of Arts and Science (Natural Sciences), 2017, No. 3, pp. 88-95 (8 pages)
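Funding: Key Project of the Sichuan Provincial Department of Education (15ZA0339); University-level Planned Project of Aba Teachers University (ASB12-24)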
Keywords: clustering; hierarchical clustering; fuzzy clustering; number of clusters; validity function
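
The abstract names hash-based and range-based vertex partitioning but does not give the algorithms themselves. As a rough illustration of the general ideas only, the sketch below contrasts a plain hash assignment with a range assignment whose boundaries are chosen to even out per-partition edge load. All function names, the multiplicative hash constant, and the degree-based balancing rule are assumptions made for this example; they are not the paper's RHP, BHP, or VCRP definitions.

```python
import bisect
from typing import Callable, Dict, List


def hash_partition(vertex_id: int, num_partitions: int) -> int:
    """Generic hash partitioning: scramble the id, then take it modulo the partition count.

    The multiplicative constant is an illustrative choice, not taken from the paper.
    """
    return (vertex_id * 2654435761) % (2 ** 32) % num_partitions


def range_partition(vertex_id: int, boundaries: List[int]) -> int:
    """Generic range partitioning: boundaries are sorted, exclusive upper bounds
    for every partition except the last one."""
    return bisect.bisect_right(boundaries, vertex_id)


def balanced_boundaries(degrees: Dict[int, int], num_partitions: int) -> List[int]:
    """Pick range boundaries so each range holds roughly the same total degree.

    This balancing rule is a hypothetical stand-in for a load-aware strategy.
    """
    target = sum(degrees.values()) / num_partitions
    boundaries: List[int] = []
    running = 0
    for v in sorted(degrees):
        running += degrees[v]
        # Close the current range once its cumulative share of edges is reached.
        if len(boundaries) < num_partitions - 1 and running >= target * (len(boundaries) + 1):
            boundaries.append(v + 1)
    return boundaries


def partition_loads(degrees: Dict[int, int],
                    assign: Callable[[int], int],
                    num_partitions: int) -> List[int]:
    """Sum vertex degrees per partition under a given assignment function."""
    loads = [0] * num_partitions
    for v, d in degrees.items():
        loads[assign(v)] += d
    return loads


if __name__ == "__main__":
    # Toy graph: vertex 0 is a hub with degree 10, vertices 1..10 have degree 1.
    degrees = {0: 10, **{v: 1 for v in range(1, 11)}}
    k = 2
    bounds = balanced_boundaries(degrees, k)
    print("hash loads: ", partition_loads(degrees, lambda v: hash_partition(v, k), k))
    print("range loads:", partition_loads(degrees, lambda v: range_partition(v, bounds), k))
```

Hash assignment needs no knowledge of the graph, while the range variant needs the degree distribution up front but keeps consecutive vertex ids on the same worker. Which trade-off matters in BC-BSP+ is something only the full paper can answer.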