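Abstract: To make full use of the multipath bandwidth of datacenter networks, most existing work adopts link-aware load-balancing algorithms that dynamically collect global link-congestion information and then forward traffic along the best path. However, these studies do not account for the uneven flow-size distribution of datacenter traffic, so they struggle to balance path-selection cost against forwarding efficiency. To address this, the paper designs ULFC (Utilization-aware Load balancing based on Flow Classification), a load-balancing mechanism for datacenter networks based on flow classification. Building on congestion awareness, ULFC analyzes traffic characteristics and assigns paths to large and small flows with different strategies, matching traffic characteristics to the strengths of each path-selection method. Experimental results show that, compared with existing schemes, ULFC improves average flow-processing efficiency by 1.3 to 1.6 times and reduces routing cost by more than 50%.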
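The idea of treating large and small flows differently can be illustrated with a minimal Python sketch. The abstract does not say which strategy ULFC applies to each class, so everything here is an assumption made for illustration: the `LARGE_FLOW_BYTES` threshold, the function name `pick_path`, hashing for small flows, and least-utilized selection for large flows.

```python
import zlib

# Hypothetical cutoff between "small" and "large" flows, in bytes (an assumption).
LARGE_FLOW_BYTES = 100 * 1024


def pick_path(flow_id: str, bytes_seen: int, utilization: dict) -> str:
    """Choose a path name for a flow, given per-path utilization in [0, 1]."""
    paths = sorted(utilization)  # fixed order keeps the hash mapping stable
    if bytes_seen < LARGE_FLOW_BYTES:
        # Small flow: stateless hash, no congestion lookup (low selection cost).
        return paths[zlib.crc32(flow_id.encode()) % len(paths)]
    # Large flow: pay for a congestion-aware choice of the least-utilized path.
    return min(paths, key=lambda p: utilization[p])
```

With `utilization = {"p0": 0.9, "p1": 0.2, "p2": 0.5}`, a flow that has already sent 10 MiB is placed on `p1`, while a 512-byte flow is mapped by hash alone and always lands on the same path.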
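Abstract: Software-defined networking (SDN) decouples the control plane of network devices from the data plane, enabling centralized network control and flexible traffic forwarding, and is therefore widely used in datacenters and related fields. To increase bandwidth and throughput, datacenter networks mostly adopt hierarchical topologies with multipath properties, such as the fat-tree topology. Traditional routing algorithms, however, offer very limited support for multipath and cannot fully use the residual bandwidth of the network. This paper focuses on a multipath load-balancing algorithm for SDN-based fat-tree datacenter networks. Exploiting the centralized control of SDN, the algorithm obtains real-time state information for the multiple paths, computes the bandwidth currently available on each path, and selects the best forwarding path according to the bandwidth demand of each flow. Experimental results show that the algorithm outperforms traditional routing algorithms both in reducing network propagation delay and in increasing network throughput, and that it achieves multipath load balancing in fat-tree datacenter networks.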
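The selection step described above (compute each path's available bandwidth, then pick a path that meets the flow's demand) might look roughly like the following Python sketch. It assumes that a path's available bandwidth is the residual capacity of its bottleneck link and that "best" means the largest such residual; the abstract defines neither, and the function names are hypothetical.

```python
def available_bandwidth(path, capacity, load):
    """Bottleneck residual bandwidth of a path (a sequence of link names)."""
    return min(capacity[link] - load[link] for link in path)


def select_path(paths, capacity, load, demand):
    """Return the candidate path with the most available bandwidth,
    or None if even that path cannot carry `demand`."""
    best = max(paths, key=lambda p: available_bandwidth(p, capacity, load))
    return best if available_bandwidth(best, capacity, load) >= demand else None
```

In a controller, `capacity` and `load` would come from periodically polled port statistics; here they are plain dictionaries keyed by link name.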
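Abstract: OpenFlow-based SDN (Software Defined Networking) has been widely studied and deployed in datacenters, and keeping the centralized control plane from becoming the network performance bottleneck is an active research topic. The OpenFlow specification states that, when the data plane can buffer packets, a table-miss packet needs to send only a small amount of summary information to the controller to trigger rule installation, which reduces the communication load between the control plane and the data plane. However, existing buffering models work at packet granularity, so multiple table-miss packets of the same flow are all sent to the controller, causing extra communication load; in addition, the order in which the switch processes packets can reorder packets within a flow, which degrades communication performance. To address these problems, this paper proposes a flow buffer management model for OpenFlow switches that preserves intra-flow packet order. Buffering table-miss packets at flow granularity further reduces the communication overhead between the control plane and the data plane, and a flow-action preprocessing mechanism keeps packets of the same flow in order during transmission. The model is validated with prototype systems on the software switch OFSoftSwitch and the hardware network experiment platform NetMagic.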
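Flow-granularity buffering of table-miss packets can be pictured with a small Python sketch: only the first miss of a flow triggers a controller notification, and later misses of the same flow wait in a FIFO queue that is drained in arrival order once the rule is installed. The class and method names are hypothetical, and the FIFO queue merely stands in for the paper's flow-action preprocessing mechanism, whose details the abstract does not give.

```python
from collections import deque


class FlowMissBuffer:
    """Buffers table-miss packets per flow instead of per packet."""

    def __init__(self):
        self._queues = {}  # flow key -> deque of buffered packets

    def on_miss(self, flow_key, packet) -> bool:
        """Buffer the packet; return True only if the controller must be
        notified, i.e. this is the first pending miss for the flow."""
        first = flow_key not in self._queues
        self._queues.setdefault(flow_key, deque()).append(packet)
        return first

    def on_rule_installed(self, flow_key) -> list:
        """Release the flow's buffered packets in arrival order."""
        return list(self._queues.pop(flow_key, ()))
```

Three misses of one flow thus cost a single controller message rather than three, and the released list preserves the order in which the packets arrived.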
Funding: This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA06010401, and the National Natural Science Foundation of China under Grant Nos. 61100010, 61402438, and 61402439.
Abstract: Efficient resource utilization requires that emerging datacenter interconnects support both high-performance communication and efficient remote resource sharing. These goals require that the network be more tightly coupled with the CPU chips. Designing a new interconnection technology thus requires considering not only the interconnection itself, but also the design of the processors that will rely on it. In this paper, we study memory hierarchy implications for the design of high-speed datacenter interconnects, particularly as they affect remote memory access, and we use PCIe as the vehicle for our investigations. To that end, we build three complementary platforms: a PCIe-interconnected prototype server with which we measure and analyze current bottlenecks; a software simulator that lets us model microarchitectural and cache hierarchy changes; and an FPGA prototype system with a streamlined, switchless, customized protocol, Thunder, with which we study hardware optimizations outside the processor. We highlight several architectural modifications to better support remote memory access and communication, and quantify their impact and limitations.
Funding: Supported by the National Natural Science Foundation of China (61877067, 61572435), the joint fund project of the Ministry of Education–China Mobile (MCM20170103), the Xi'an Science and Technology Innovation Project (201805029YD7CG13-6), and the Ningbo Natural Science Foundation (2016A610035, 2017A610119).
Abstract: The fast growth of datacenter networks, in both scale and structural complexity, has led to an increase in network failures and hence brings new challenges to network management systems. Because network failures such as node failures are inevitable, finding fault detection and diagnosis approaches that can effectively restore network communication and reduce the loss caused by failures has been recognized as an important research problem in both academia and industry. This research focuses on node failures and presents a proactive fault diagnosis algorithm called heuristic breadth-first detection (HBFD), which locates faulty nodes by dynamically searching the spanning tree, analyzing the dial-test data, and choosing a reasonable threshold. Both theoretical analysis and simulation results demonstrate that HBFD diagnoses node failures effectively, requiring fewer detections and achieving a lower false rate without sacrificing accuracy.
Funding: Supported in part by the Research Grants Council (RGC) of Hong Kong under Grant No. 615613, the National Natural Science Foundation of China (NSFC)/RGC of Hong Kong under Grant No. N HKUST610/11, the NSFC under Grant No. U1301253, and the China Cache Int. Corp. under Contract No. CCNT12EG01.
Abstract: Datacenters have played an increasingly essential role as the underlying infrastructure in cloud computing. As implied by the essence of cloud computing, resources in these datacenters are shared by multiple competing entities, which can be either tenants that rent virtual machines (VMs) in a public cloud such as Amazon EC2, or applications that embrace data-parallel frameworks like MapReduce in a private cloud maintained by Google. It has been generally observed that, with traditional transport-layer protocols allocating link bandwidth in datacenters, traffic flows from competing applications interfere with one another, resulting in a severe lack of predictability and fairness in application performance. This critical issue has drawn a substantial amount of recent research attention to bandwidth allocation in datacenter networks, with a number of new mechanisms proposed to share a datacenter network efficiently and fairly among competing entities. In this article, we present an extensive survey of existing bandwidth allocation mechanisms in the literature, covering the scenarios of both public and private clouds. We thoroughly investigate their underlying design principles, evaluate the trade-offs involved in their design choices, and summarize them in a unified design space, with the hope of conveying some meaningful insights for better designs in the future.
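Abstract: Current datacenter network applications have strict latency requirements, yet datacenter networks use TCP as their transport protocol, which causes a long-tail problem for latency-sensitive flows. This paper therefore proposes a code-based transport protocol (CTP). The protocol supports selective feedback and aims to avoid the long-tail effect caused by excessively large RTO values while reducing transmission latency in datacenter networks. CTP uses UDP instead of TCP as its transport protocol and proposes an LT code with logarithmic feedback (logarithmic-feedback LT code, LFLT code) to guarantee reliability. It also includes an effective adaptive congestion control algorithm that improves transmission efficiency and coexists fairly with TCP flows. Experimental results show that CTP improves coding efficiency by 100% over the traditional LT code and improves transmission time by 150% and 50% over TCP and UDP with plain LT coding, respectively. It is also much less affected by the packet loss rate, which makes it better suited to datacenter networks with large traffic fluctuations.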
Funding: Supported by the National Basic Research Program (973) of China (No. 2012CB315806), the National Natural Science Foundation of China (Nos. 61103225 and 61379149), the Jiangsu Provincial Natural Science Foundation (No. BK20140070), and the Jiangsu Future Networks Innovation Institute Prospective Research Project on Future Networks, China (No. BY2013095-1-06).
Abstract: Although densely interconnected datacenter networks (DCNs) (e.g., Fat Tree) provide multiple paths and high bisection bandwidth for each server pair, the widely used single-path Transmission Control Protocol (TCP) and equal-cost multipath (ECMP) routing cannot achieve high resource utilization because they discover and allocate resources poorly. In this paper, we present LESSOR, a performance-oriented multipath forwarding scheme to improve DCNs' resource utilization. By adopting an OpenFlow-based centralized control mechanism, LESSOR computes a near-optimal transmission path and bandwidth provision for each flow according to the global network view, while maintaining a nearly real-time network view with a performance-oriented flow observing mechanism. Deployments and comprehensive simulations show that LESSOR efficiently improves network throughput, which is 4.9%–38.3% higher than that of ECMP under different loads. LESSOR also improves throughput by 2%–27.7% compared with Hedera. In addition, LESSOR significantly decreases the average flow completion time.