NXgraph: An Efficient Graph Processing System on a Single Machine

Abstract

Recent studies show that graph processing systems on a single machine can achieve competitive performance compared with cluster-based graph processing systems. In this paper, we present NXgraph, an efficient graph processing system on a single machine. We propose the Destination-Sorted SubShard (DSSS) structure to store a graph. To ensure graph data access locality and enable fine-grained scheduling, NXgraph divides vertices and edges into intervals and sub-shards. To reduce write conflicts among different threads and achieve a high degree of parallelism, NXgraph sorts edges within each sub-shard according to their destination vertices. Then, three updating strategies, i.e., Single-Phase Update (SPU), Double-Phase Update (DPU), and Mixed-Phase Update (MPU), are proposed in this paper. NXgraph can adaptively choose the fastest strategy for different graph problems according to the graph size and the available memory resources to fully utilize the memory space and reduce the amount of data transfer. All these three strategies exploit streamlined disk access patterns. Extensive experiments on three real-world graphs and five synthetic graphs show that NXgraph outperforms GraphChi, TurboGraph, VENUS, and GridGraph in various situations. Moreover, NXgraph, running on a single commodity PC, can finish an iteration of PageRank on the Twitter graph with 1.5 billion edges in 2.05 seconds; while PowerGraph, a distributed graph processing system, needs 3.6s to finish the same task on a 64-node cluster.

Publication
In International Conference on Data Engineering (ICDE), IEEE.