Abstract: In the rising tide of the Internet of Things, more and more things in the world are connected to the Internet. Recently, data have kept growing at a rate more than four times that predicted by Moore's law. This explosion of data comes from various sources such as mobile phones, video cameras, and sensor networks, and the data often exhibit multidimensional characteristics. The huge amount of data poses many challenges to the IT infrastructures for data management, transport, and processing. To address these challenges, state-of-the-art large-scale data center networks have begun to provide cloud services that are increasingly prevalent. However, how to build a good data center remains an open challenge, and the architecture design, which significantly affects overall performance, is of great research interest. This paper surveys advances in data center network design. We first introduce the upcoming trends in the data center industry. We then review some popular design principles for today's data center network architectures. In the third part, we present some up-to-date data center frameworks and compare them comprehensively. The comparison shows that there is no single optimal data center design; the design should instead vary with the requirements for data placement, replication, processing, and query processing. After that, several existing challenges and limitations are discussed. Based on these observations, we point out some possible future research directions.
Abstract: To address the network congestion that traditional methods tend to cause when scheduling elephant flows in data centers, a Dynamic Load Balancing based on Ant Colony Optimization (DLB-ACO) algorithm is proposed. The algorithm computes the variance of link loads over a period, which reduces the influence of instantaneous load extremes on the load-balance metric and avoids wasting resources. It then introduces a chaotic strategy into the path-selection probability of the ant colony algorithm and adjusts the pheromone evaporation factor piecewise, giving the algorithm strong global search ability and fast convergence and thus a higher probability of computing the globally optimal path. Experimental results show that, compared with the Equal Cost Multi Path (ECMP) algorithm and the Global First Fit (GFF) traffic scheduling algorithm, the proposed algorithm improves link utilization and throughput and reduces latency.
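The abstract describes the mechanism only at a high level. The sketch below is a minimal illustration, not the authors' implementation, of how the three ingredients might fit together: a link-load variance metric computed over one sampling period, a chaos-perturbed path-selection probability, and a piecewise evaporation factor. The logistic map, the two-segment evaporation schedule, and all parameter values are our own assumptions.

```python
import random

# Illustrative sketch of the DLB-ACO ingredients named in the abstract.
# The logistic map, the two-segment evaporation schedule, and every
# parameter value here are assumptions, not taken from the paper.

def load_variance(samples):
    """Variance of link-load samples collected over one period (the balance metric)."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

def logistic_map(x, mu=4.0):
    """Chaotic sequence used to perturb path-selection probabilities."""
    return mu * x * (1.0 - x)

def evaporation(iteration, total_iterations, low=0.2, high=0.7):
    """Piecewise evaporation factor: weak early (explore), strong later (converge)."""
    return low if iteration < total_iterations // 2 else high

def select_path(paths, pheromone, heuristic, chaos_x, alpha=1.0, beta=2.0, eps=0.1):
    """Roulette-wheel ACO path selection with a small chaotic perturbation."""
    weights = []
    for p in paths:
        chaos_x = logistic_map(chaos_x)
        w = (pheromone[p] ** alpha) * (heuristic[p] ** beta)
        weights.append(w * (1.0 + eps * chaos_x))  # chaos-perturbed selection weight
    r, acc = random.uniform(0.0, sum(weights)), 0.0
    for p, w in zip(paths, weights):
        acc += w
        if acc >= r:
            return p, chaos_x
    return paths[-1], chaos_x
```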
Abstract: An Efficient Centralized Data Center network Routing (ECDCR) strategy is proposed. ECDCR uses a centralized controller to collect and distribute network link states while offloading the actual route computation to each switch. Exploiting the regular structure of data center networks, route computation on a switch is reduced to a table lookup: link-state changes are compared against a pre-installed base topology, and routing paths are updated according to predefined rules. Extensive experiments with virtual machines and physical switches show that ECDCR outperforms existing routing strategies.
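As a rough illustration of the table-lookup idea (the paper's pre-installed base topology and predefined rules are not spelled out in the abstract, so the tables and rules below are invented), a switch can keep a base next-hop table plus a backup table keyed by failed link; a link-state event from the controller then becomes dictionary lookups rather than a full route recomputation.

```python
# Hypothetical sketch of lookup-style route updating on a switch.
# The table layout and update rule are illustrative assumptions,
# not ECDCR's actual predefined rules.

class SwitchRouter:
    def __init__(self, base_next_hop, backup_next_hop):
        # base_next_hop: {destination_prefix: next_hop} from the base topology
        # backup_next_hop: {(failed_link, destination_prefix): alternate_next_hop}
        self.base = dict(base_next_hop)
        self.backup = dict(backup_next_hop)
        self.active = dict(base_next_hop)
        self.down_links = set()

    def on_link_state(self, link, is_up):
        """Compare the reported link state with the base topology and update paths."""
        if is_up:
            self.down_links.discard(link)
        else:
            self.down_links.add(link)
        # Rebuild the active table by lookup only -- no shortest-path recomputation.
        self.active = dict(self.base)
        for failed in self.down_links:
            for (flink, prefix), alt in self.backup.items():
                if flink == failed:
                    self.active[prefix] = alt

    def next_hop(self, prefix):
        return self.active.get(prefix)
```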
Funding: supported in part by the Natural Science Foundation of USA (Nos. ECCS 1128209, CNS 10655444, CCF 1028167, CNS 0948184, and CCF 0830289)
Abstract: Data Center Networks (DCNs) are the fundamental infrastructure for cloud computing. Driven by the massive parallel computing tasks in cloud computing, one-to-many data dissemination has become one of the most important traffic patterns in DCNs. Many architectures and protocols have been proposed to meet this demand, but they either require complicated configurations on switches and servers or cannot deliver optimal performance. In this paper, we propose peer-assisted data dissemination for DCNs. This approach exploits the rich, high-bandwidth, multi-path physical connections of DCNs to enable efficient one-to-many data dissemination. We prove that an optimal P2P data dissemination schedule exists for FatTree, a specially designed DCN architecture. We then present a theoretical analysis of this algorithm in the general multi-rooted tree topology, a widely used DCN architecture. Additionally, we explore the performance of an intuitive line structure for data dissemination. Our analysis and experimental results show that this simple structure delivers performance comparable to the optimal algorithm. Since DCN applications rely heavily on virtualization to achieve optimal resource sharing, we present a general implementation method for the proposed algorithms that mitigates the impact of the potentially high churn rate of virtual machines.
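The "intuitive line structure" mentioned above is essentially chunk pipelining along a chain of servers. The sketch below is our own schematic of such a schedule (the chunking, the peer ordering, and the unit chunk-time are assumptions); it also shows why completion time stays close to optimal: with C chunks and N receivers the chain finishes in roughly C + N - 1 chunk-times rather than the C * N of naive sequential unicast.

```python
# Schematic chunk pipelining along a line (chain) of servers -- an
# illustrative model, not the paper's schedule or proof.

def line_dissemination_schedule(num_chunks, peers):
    """Return {time_step: [(sender, receiver, chunk_id), ...]} for a chain of peers.

    peers[0] is the source; peers[1:] are the receivers in chain order.
    Each peer sends and receives at most one chunk per step.
    """
    schedule = {}
    for chunk in range(num_chunks):
        for hop in range(len(peers) - 1):
            step = chunk + hop  # chunk `chunk` crosses hop `hop` during this step
            schedule.setdefault(step, []).append((peers[hop], peers[hop + 1], chunk))
    return schedule

if __name__ == "__main__":
    sched = line_dissemination_schedule(num_chunks=4, peers=["src", "vm1", "vm2", "vm3"])
    for step in sorted(sched):
        print(step, sched[step])
    # Completion time: num_chunks + len(peers) - 2 = 6 chunk-times here,
    # versus 4 * 3 = 12 chunk-times for sequential unicast to each receiver.
```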
Funding: supported by the Research Fund of Ministry of Education-China Mobile (MCM20160304)
Abstract: In modern data centers, the power consumed by the network is a noticeable portion of the total energy budget, so improving the energy efficiency of data center networks (DCNs) truly matters. One effective approach is to make the active size of a DCN elastic with traffic demand through flow consolidation and bandwidth scheduling, i.e., turning off unnecessary network components to reduce power consumption. Meanwhile, with its inherent support for data center management, software defined networking (SDN) provides a paradigm for elastically controlling DCN resources. To achieve such power savings, most prior efforts adopt simple greedy heuristics to reduce computational complexity. However, owing to the inherent limitations of greedy algorithms, a good-enough optimization cannot always be guaranteed. To address this problem, a modified hybrid genetic algorithm (MHGA) is employed to improve solution accuracy, and the fine-grained routing function of SDN is fully leveraged. Simulation results show that more efficient power management can be achieved than in previous studies, increasing network energy savings by about 5%.
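The abstract does not detail MHGA's encoding or operators, so the sketch below is a generic genetic-algorithm skeleton for the same class of problem: choose which links to power off while still carrying the traffic. The bit-string encoding, the fitness function, and all parameters are invented for illustration and are not the paper's MHGA.

```python
import random

# Generic GA skeleton for link power-down selection (illustrative only).
# A chromosome has one bit per link: 1 = powered on, 0 = off.
# Fitness favors fewer powered links but heavily penalizes insufficient capacity.

def fitness(bits, link_capacity, total_demand, link_power=1.0, penalty=1000.0):
    capacity_on = sum(c for bit, c in zip(bits, link_capacity) if bit)
    power = link_power * sum(bits)
    shortfall = max(0.0, total_demand - capacity_on)
    return -(power + penalty * shortfall)  # higher is better

def evolve(link_capacity, total_demand, pop_size=40, generations=200, p_mut=0.05):
    n = len(link_capacity)
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda b: fitness(b, link_capacity, total_demand), reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n)                                 # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]   # bit-flip mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda b: fitness(b, link_capacity, total_demand))
```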
Funding: supported by the National Key R&D Program of China (No. 2017YFB1003000), the National Natural Science Foundation of China (Nos. 61872079, 61572129, 61602112, 61502097, 61702096, 61320106007, 61632008, and 61702097), the Natural Science Foundation of Jiangsu Province (Nos. BK20160695 and BK20170689), the Fundamental Research Funds for the Central Universities (No. 2242018k1G019), the Jiangsu Provincial Key Laboratory of Network and Information Security (No. BM2003201), and the Key Laboratory of Computer Network and Information Integration of Ministry of Education of China (No. 93K-9); partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization and the Collaborative Innovation Center of Wireless Communications Technology
Abstract: With the continuous enrichment of cloud services, an increasing number of applications are being deployed in data centers. These emerging applications are often communication-intensive and data-parallel, and their performance is closely tied to the underlying network. Owing to their distributed nature, these applications consist of tasks that each involve a collection of parallel flows. Traditional techniques that optimize flow-level metrics are agnostic to task-level requirements, leading to poor application-level performance. In this paper, we address the heterogeneous task-level requirements of applications and propose task-aware flow scheduling. First, we model tasks' sensitivity to their completion time with utilities. Second, on the basis of Nash bargaining theory, we establish a flow scheduling model with heterogeneous utility characteristics and analyze it using the Lagrange multiplier method and the KKT conditions. Third, we propose two utility-aware bandwidth allocation algorithms with different practical constraints. Finally, we present Tasch, a system that enables tasks to maintain high utilities and guarantees the fairness of utilities. To demonstrate the feasibility of our system, we conduct comprehensive evaluations with real-world traffic traces. Compared with per-flow mechanisms, communication stages complete up to 1.4× faster on average, task utilities increase by up to 2.26×, and the fairness of tasks improves by up to 8.66× using Tasch.
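To make the Lagrange/KKT step concrete, the sketch below works out one assumed special case (not necessarily the utility model used by Tasch): tasks with utilities U_i(x_i) = w_i log x_i sharing a single bottleneck of capacity C. Stationarity of the Lagrangian gives w_i / x_i = λ for every task, so the allocation is the weighted proportional split x_i = C·w_i / Σ_j w_j.

```python
# Hedged illustration of utility-aware bandwidth allocation via KKT conditions.
# Assumed model (not from the paper): U_i(x_i) = w_i * log(x_i), one bottleneck
# of capacity C. Maximizing sum_i U_i(x_i) subject to sum_i x_i = C yields the
# stationarity condition w_i / x_i = lambda, i.e., a weighted proportional split.

def bargaining_allocation(weights, capacity):
    """Closed-form allocation x_i = C * w_i / sum(w) from the KKT stationarity condition."""
    total_w = sum(weights)
    return [capacity * w / total_w for w in weights]

if __name__ == "__main__":
    weights = [3.0, 1.0, 1.0]          # task 0 is three times more sensitive
    alloc = bargaining_allocation(weights, capacity=10.0)
    print(alloc)                        # [6.0, 2.0, 2.0]
    # KKT check: w_i / x_i is identical across tasks (the shared multiplier lambda).
    print([w / x for w, x in zip(weights, alloc)])   # [0.5, 0.5, 0.5]
```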
Abstract: The many-to-one transmission pattern in Data Center Networks (DCN) easily leads to the TCP Incast problem and hence to network congestion. Congestion control in data center networks has therefore become an urgent research topic. This paper briefly describes the Incast problem behind data center network congestion, classifies and analyzes the main congestion control techniques currently used in data center networks, and discusses future directions for congestion control.
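To make the Incast scenario concrete, the following back-of-the-envelope sketch (all numbers are assumed for illustration and are not from the survey) estimates when synchronized many-to-one responses overflow a shallow switch-port buffer, which is what triggers the drops, retransmission timeouts, and throughput collapse characteristic of TCP Incast.

```python
# Rough incast buffer-overflow estimate with assumed, illustrative numbers.
# N synchronized senders each return a block of `reply_bytes` through one
# ToR switch port whose buffer holds `buffer_bytes`; the overflow is what
# forces drops, timeouts, and the TCP Incast throughput collapse.

def incast_overflow(num_senders, reply_bytes, buffer_bytes, drain_bytes_per_rtt):
    burst = num_senders * reply_bytes  # bytes arriving in one synchronized burst
    overflow = burst - buffer_bytes - drain_bytes_per_rtt
    return max(0, overflow)

if __name__ == "__main__":
    for n in (4, 16, 64):
        dropped = incast_overflow(num_senders=n, reply_bytes=64_000,
                                  buffer_bytes=128_000, drain_bytes_per_rtt=125_000)
        print(f"{n:3d} senders -> ~{dropped} bytes overflow the port buffer")
```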
Funding: supported by the National Basic Research Program of China (973 Program) under Grants No. 2014CB347800 and No. 2012CB315803, the National High-Tech R&D Program of China (863 Program) under Grant No. 2013AA013303, the Natural Science Foundation of China under Grants No. 61170291, No. 61133006, and No. 61161140454, and the ZTE Industry-Academia-Research Cooperation Funds
Abstract: Many "rich-connected" topologies with multiple parallel paths between servers have recently been proposed for data center networks to provide high bisection bandwidth, but it remains challenging to fully utilize this high network capacity with appropriate multi-path routing algorithms. Since flow-level path splitting may lead to traffic imbalance between paths due to differences in flow size, packet-level path splitting has attracted more attention lately: it spreads the packets of a flow across multiple available paths and significantly improves link utilization. However, it may cause packet reordering, confusing the TCP congestion control algorithm and lowering flow throughput. In this paper, we design a novel packet-level multi-path routing scheme called SOPA, which leverages OpenFlow to perform packet-level path splitting in a round-robin fashion, and hence significantly mitigates the packet reordering problem and improves network throughput. Moreover, SOPA leverages the topological features of data center networks to encode a very small number of switches along the path into the packet header, resulting in very light overhead. Compared with random packet spraying (RPS), Hedera, and equal-cost multi-path routing (ECMP), our simulations demonstrate that SOPA achieves 29.87%, 50.41%, and 77.74% higher network throughput, respectively, under a permutation workload, and reduces average data transfer completion time by 53.65%, 343.31%, and 348.25%, respectively, under a production workload.
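As a simplified sketch of round-robin packet-level path splitting (our own abstraction, not SOPA's actual OpenFlow rules or in-header switch encoding), each flow keeps a counter and sprays successive packets over its equal-cost next hops in turn; consecutive packets therefore take adjacent paths, keeping per-path load even while bounding how far packets of one flow can be reordered relative to random spraying.

```python
from collections import defaultdict

# Simplified round-robin packet spraying (illustrative only). Each flow cycles
# through its equal-cost next hops deterministically, so consecutive packets of
# a flow take adjacent paths and per-path load stays balanced.

class RoundRobinSprayer:
    def __init__(self, equal_cost_next_hops):
        self.next_hops = list(equal_cost_next_hops)
        self.counters = defaultdict(int)  # per-flow packet counter

    def forward(self, flow_id):
        """Return the next hop for the next packet of `flow_id`."""
        i = self.counters[flow_id] % len(self.next_hops)
        self.counters[flow_id] += 1
        return self.next_hops[i]

if __name__ == "__main__":
    sprayer = RoundRobinSprayer(["agg1", "agg2", "agg3", "agg4"])
    print([sprayer.forward(("10.0.0.1", "10.0.1.1", 80)) for _ in range(6)])
    # -> ['agg1', 'agg2', 'agg3', 'agg4', 'agg1', 'agg2']
```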