
Research on In-Network Aggregation of Incast Traffic in Data Centers (cited by: 3)

Aggregating Incast Transfers in Data Centers
Abstract: Distributed computing systems such as MapReduce generate heavy east-west traffic inside data centers. Correlated transfers, typified by the incast and shuffle communication patterns, account for a large share of that traffic and therefore have a severe impact on application performance. This motivates performing inter-flow data aggregation as early as possible during the in-network transmission phase, rather than only at the receiver. In this paper, we first examine the feasibility and gain of inter-flow data aggregation in novel data center network structures. To maximize this gain, we model incast transfer as a minimal-cost incast tree problem. To solve it, we propose two approximate incast tree construction methods, the RS-based and ARS-based incast trees, which generate an efficient incast tree solely from the labels (locations) of the incast members and the data center topology. We further present incremental methods that handle dynamic changes and faults in the incast tree. Using a prototype implementation and large-scale simulations, we demonstrate that our approach significantly decreases the amount of network traffic caused by incast transfers, saves data center network resources, and reduces the delay of job processing. Our approach for BCube and FBFLY can be adapted to other data center network structures with minimal modifications.
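As a rough illustration of why aggregating flows inside the network can reduce traffic, the sketch below compares the link cost of one incast transfer with and without aggregation at intermediate nodes. It is a toy model and not the RS-based or ARS-based construction from the paper. The topology, the node names, and the `incast_cost` helper are all hypothetical. It also assumes that every sender emits one unit of data and that the aggregation function (for example a sum or a count) reduces any number of merged flows to one unit per link.

```python
def path_to_receiver(node, parent):
    """Return the links (child, parent) on the path from node up to the receiver."""
    links = []
    while node in parent:
        links.append((node, parent[node]))
        node = parent[node]
    return links


def incast_cost(senders, parent, aggregate):
    """Total data units carried over all links of a (hypothetical) incast tree.

    Assumption: each sender emits one unit. With aggregation, flows that meet
    at a node are merged into a single unit before being forwarded.
    """
    flows_per_link = {}
    for sender in senders:
        for link in path_to_receiver(sender, parent):
            flows_per_link[link] = flows_per_link.get(link, 0) + 1
    if aggregate:
        # Every used link carries exactly one merged unit.
        return len(flows_per_link)
    # Every flow is carried separately on each link of its path.
    return sum(flows_per_link.values())


# Hypothetical tree: four senders, two intermediate nodes, one receiver "r".
parent = {
    "s1": "a", "s2": "a",
    "s3": "b", "s4": "b",
    "a": "r", "b": "r",
}
senders = ["s1", "s2", "s3", "s4"]

cost_without = incast_cost(senders, parent, aggregate=False)
cost_with = incast_cost(senders, parent, aggregate=True)
print(cost_without, cost_with)
```

In this toy model the cost with aggregation equals the number of links in the tree, which is one way to see why the choice of incast tree determines the achievable saving.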
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2016, No. 1, pp. 53-67 (15 pages)
Funding: National Natural Science Foundation of China, Excellent Young Scientists Fund (61422214)
Keywords: in-network aggregation; data centers; incast transfer; shuffle transfer; network traffic

