Abstract: With the growing popularity of social networks, the demand for real-time processing of graph data is increasing. However, most existing graph systems adopt a batch processing mode, so the overhead of maintaining and processing dynamic graphs is significantly high. In this paper, we design iGraph, an incremental graph processing system for dynamic graphs with continuous updates. The contributions of iGraph include: 1) a hash-based graph partition strategy to enable fine-grained graph updates; 2) a vertex-based graph computing model to support incremental data processing; 3) hotspot detection and rebalancing methods to address the workload imbalance problem during incremental processing. Through its general-purpose API, iGraph can be used to implement various graph processing algorithms such as PageRank. We have implemented iGraph on Apache Spark, and experimental results show that on real-life datasets, iGraph outperforms the original GraphX in both graph update and graph computation.
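The first and third contributions listed above can be illustrated with a minimal single-process sketch. It is not iGraph's implementation or API. The class name `HashPartitionedGraph`, the modulo hash on integer vertex IDs, and the mean-based hotspot threshold are all assumptions made for illustration. The sketch shows why hashing the source vertex lets a single edge insertion touch only one partition, and how per-partition edge counts can flag an overloaded partition.

```python
from collections import defaultdict


class HashPartitionedGraph:
    """Toy adjacency store split into partitions by hashing the source vertex.

    Hypothetical illustration only; not the iGraph or GraphX API.
    """

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # One adjacency map (vertex -> set of neighbours) per partition.
        self.partitions = [defaultdict(set) for _ in range(num_partitions)]

    def partition_of(self, vertex):
        # Assumed hash function: integer vertex ID modulo partition count.
        return vertex % self.num_partitions

    def add_edge(self, src, dst):
        # A fine-grained update: only the partition owning `src` is modified.
        pid = self.partition_of(src)
        self.partitions[pid][src].add(dst)
        return pid

    def loads(self):
        # Number of edges currently stored in each partition.
        return [sum(len(nbrs) for nbrs in part.values())
                for part in self.partitions]

    def hotspots(self, factor=2.0):
        # Assumed rule: a partition is a hotspot if it holds more than
        # `factor` times the mean number of edges per partition.
        loads = self.loads()
        mean = sum(loads) / len(loads)
        return [pid for pid, load in enumerate(loads)
                if mean > 0 and load > factor * mean]


graph = HashPartitionedGraph(num_partitions=4)
for src, dst in [(0, 1), (0, 2), (0, 3), (4, 1), (4, 5), (8, 1), (1, 2), (2, 3)]:
    graph.add_edge(src, dst)

print(graph.loads())
print(graph.hotspots())
```

In this toy input, vertices 0, 4, and 8 all hash to the same partition, so that partition accumulates most of the edges and is reported as a hotspot. A real system would then migrate or split part of that partition's data, a step this sketch leaves out.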
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2023YFB4502300.
Abstract: Graph processing has been widely used in many scenarios, from scientific computing to artificial intelligence. Unlike traditional workloads, graph processing exhibits irregular computational parallelism and random memory accesses. Therefore, running graph processing workloads on conventional architectures (e.g., CPUs and GPUs) often shows a significantly low compute-memory ratio with few performance benefits, and in many cases can be even slower than a specialized single-thread graph algorithm. While domain-specific hardware designs are essential for graph processing, it is still challenging to translate hardware capability into a performance boost without coupled software codesign. This article presents a graph processing ecosystem from hardware to software. We start by introducing a series of hardware accelerators as the foundation of this ecosystem. Subsequently, the codesigned parallel graph systems and their distributed techniques are presented to support graph applications. Finally, we introduce our efforts on novel graph applications and hardware architectures. Extensive results show that various graph applications can be efficiently accelerated in this graph processing ecosystem.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2017YFB1003103; the National Natural Science Foundation of China under Grant Nos. 6143201&61432016, 61332009, and 61521092; the National Science Foundation of USA under Contract Nos. CCF-1717877 and CCF-1629376; and an IBM CAS Faculty Fellowship.
Abstract: Increasingly, there is a need to process graphs that are larger than the available memory on today's machines. Many systems have been developed with graph representations that are efficient and compact for out-of-core processing. A necessary task in these systems is memory management. This paper presents a system called Cacheap, which automatically and efficiently manages the available memory to maximize the speed of graph processing, minimize the amount of disk access, and maximize the utilization of memory for graph data. It has a simple interface that can be easily adopted by existing graph engines. The paper describes the new system, uses it in recent graph engines, and demonstrates its integer-factor improvements in the speed of large-scale graph processing.