Abstract: Increasingly powerful computational technology has caused enormous growth in data, in both size and complexity. A key issue is how to organize the data to meet the challenges of data analysis. This paper carries ideas from the Internet of Things (IoT) into the digital world and organizes data entities into a network, the Internet of data (IOD), which has huge potential in data-intensive applications. In the IOD, data hiding technology is used to embed virtual tags into every data entity in the system; the tags record all activities of a data entity since its creation. The IOD aims to interconnect data as a network and to collect useful information for data identification, data tracing, data vitalization, and further data analysis.
Funding: Supported by the National Key R&D Program of China (No. 2017YFB1003000), the National Natural Science Foundation of China (Nos. 61872079, 61572129, 61602112, 61502097, 61702096, 61320106007, 61632008, and 61702097), the Natural Science Foundation of Jiangsu Province (Nos. BK20160695 and BK20170689), the Fundamental Research Funds for the Central Universities (No. 2242018k1G019), the Jiangsu Provincial Key Laboratory of Network and Information Security (No. BM2003201), and the Key Laboratory of Computer Network and Information Integration of Ministry of Education of China (No. 93K-9); partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization and the Collaborative Innovation Center of Wireless Communications Technology.
Abstract: With the continuous enrichment of cloud services, an increasing number of applications are being deployed in data centers. These emerging applications are often communication-intensive and data-parallel, and their performance is closely tied to the underlying network. Because of their distributed nature, the applications consist of tasks, each of which involves a collection of parallel flows. Traditional techniques that optimize flow-level metrics are agnostic to task-level requirements, which leads to poor application-level performance. In this paper, we address the heterogeneous task-level requirements of applications and propose task-aware flow scheduling. First, we model each task's sensitivity to its completion time with a utility. Second, on the basis of Nash bargaining theory, we establish a flow scheduling model with heterogeneous utility characteristics and analyze it using the Lagrange multiplier method and the KKT conditions. Third, we propose two utility-aware bandwidth allocation algorithms with different practical constraints. Finally, we present Tasch, a system that enables tasks to maintain high utilities and guarantees the fairness of utilities. To demonstrate the feasibility of our system, we conduct comprehensive evaluations with real-world traffic traces. Compared with per-flow mechanisms, Tasch completes communication stages up to 1.4× faster on average, increases task utilities by up to 2.26×, and improves the fairness of tasks by up to 8.66×.
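As a rough illustration of the Nash bargaining step mentioned in this abstract, the sketch below splits the capacity of a single link among tasks by maximizing the weighted sum of log-utilities. The single-link setting, the logarithmic utility, the zero disagreement point, and the names `nash_bargaining_allocation`, `weights`, and `capacity` are all assumptions made for illustration; none of them comes from the paper, and this is not Tasch's allocation algorithm.

```python
from typing import Dict


def nash_bargaining_allocation(weights: Dict[str, float], capacity: float) -> Dict[str, float]:
    """Split `capacity` among tasks by maximizing sum_i w_i * log(x_i).

    Assumed toy model (not from the paper): one shared link, log utilities,
    disagreement point at zero bandwidth.

    Stationarity of the Lagrangian gives w_i / x_i = lam for every task, and
    the capacity constraint sum_i x_i = capacity then yields
    x_i = capacity * w_i / sum_j w_j.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be non-empty and positive")
    total = sum(weights.values())
    return {task: capacity * w / total for task, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical tasks: B is three times as sensitive to bandwidth as A.
    print(nash_bargaining_allocation({"A": 1.0, "B": 3.0}, capacity=10.0))
```

In this toy model the KKT conditions have a closed-form solution, so each task simply receives bandwidth in proportion to its weight. A model with multiple links or non-logarithmic utilities would generally need a numerical solver instead.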
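Abstract: Big data and artificial intelligence technologies are now moving into embedded systems, placing higher demands on the memory access capability of those systems. Domain wall memory offers high read/write speed, high density, and low power consumption, so it can be used in embedded systems to meet the access speed, capacity, and energy requirements of data-intensive applications. However, domain wall memory must perform shift operations before data can be accessed, which greatly affects its memory access performance; reducing shift operations therefore effectively improves its performance. For domain wall memory systems with multiple read/write heads that run data-intensive applications, this work studies optimal instruction scheduling and data placement techniques that reduce shift operations. First, an integer linear programming (ILP) model that obtains the minimum number of shifts is proposed. Because the ILP model cannot be solved to optimality in polynomial time, a polynomial-time heuristic, the generation instruction scheduling and data placement (GISDP) algorithm, is then proposed. Experimental results show that the ILP model and the GISDP algorithm effectively reduce the number of shift operations. On a domain wall memory equipped with 8 read/write heads, the instruction schedules and data placements generated by the GISDP algorithm reduce shift operations by 89.7% on average compared with other algorithms, and the GISDP results are close to the optimal solutions of the ILP model.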
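To make the cost being minimized concrete, the sketch below counts shift operations for a sequence of accesses on one track with fixed read/write heads. The cost model (one track, unit cost per one-cell shift, greedy choice of the nearest head) and the names `count_shifts`, `accesses`, and `heads` are assumptions made for illustration; this is neither the ILP model nor the GISDP algorithm from the paper.

```python
from typing import Sequence


def count_shifts(accesses: Sequence[int], heads: Sequence[int]) -> int:
    """Total shifts needed to serve `accesses` in order on one track.

    Assumed toy model (not from the paper): `offset` is how far the track is
    currently displaced, and the cell at position p sits under head h exactly
    when p + offset == h. Each access greedily uses whichever head needs the
    fewest shifts from the current offset.
    """
    if not heads:
        raise ValueError("at least one head is required")
    offset = 0
    total = 0
    for p in accesses:
        # Candidate offsets that align cell p with each head; pick the closest.
        target = min((h - p for h in heads), key=lambda t: abs(t - offset))
        total += abs(target - offset)
        offset = target
    return total


if __name__ == "__main__":
    print(count_shifts([0, 7, 1, 6], heads=[0]))     # original order, one head
    print(count_shifts([0, 1, 6, 7], heads=[0]))     # reordered accesses
    print(count_shifts([0, 7, 1, 6], heads=[0, 6]))  # original order, two heads
```

Under this model the three calls cost 18, 7, and 2 shifts respectively, which shows why both the access order and the number of heads matter. Reordering is only legal when the data dependencies between instructions allow it, which is part of what a real scheduler has to respect.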