Paper Title

Distributed Graph Neural Network Training with Periodic Stale Representation Synchronization

Authors

Zheng Chai, Guangji Bai, Liang Zhao, Yue Cheng

Abstract

Despite the recent success of Graph Neural Networks, it remains challenging to train a GNN on large graphs with millions of nodes and billions of edges, which are prevalent in many graph-based applications. Traditional sampling-based methods accelerate GNN training by dropping edges and nodes, which impairs graph integrity and model performance. In contrast, distributed GNN algorithms accelerate GNN training by utilizing multiple computing devices and can be classified into two types: "partition-based" methods enjoy low communication costs but suffer from information loss due to dropped edges, while "propagation-based" methods avoid information loss but suffer from prohibitive communication overhead caused by the neighbor explosion. To jointly address these problems, this paper proposes DIGEST (DIstributed Graph reprEsentation SynchronizaTion), a novel distributed GNN training framework that synergizes the complementary strengths of both categories of existing methods. We propose to allow each device to utilize the stale representations of its neighbors in other subgraphs during subgraph parallel training. This way, our method preserves global graph information from neighbors to avoid information loss and reduce communication costs. Our convergence analysis demonstrates that DIGEST enjoys a state-of-the-art convergence rate. Extensive experimental evaluation on large, real-world graph datasets shows that DIGEST achieves up to 21.82× speedup without compromising performance compared to state-of-the-art distributed GNN training frameworks.
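
To make the idea concrete, below is a minimal single-process sketch of how one partition could combine exact in-partition messages with periodically refreshed, stale representations of neighbors owned by other partitions. This is an illustration under assumptions, not the authors' DIGEST implementation: the names (StaleStore, PartitionGNNLayer, sync_every) and the toy graph are hypothetical, and a real deployment would shard the stale table across devices and refresh it with a collective or key-value exchange rather than an in-memory write.

# Sketch only: one partition's training loop with a periodically
# synchronized table of stale remote-neighbor representations.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StaleStore:
    """Globally indexed table of the last synchronized node representations."""
    def __init__(self, num_global_nodes: int, dim: int):
        self.table = torch.zeros(num_global_nodes, dim)

    def read(self, global_ids: torch.Tensor) -> torch.Tensor:
        # Reading is a local lookup: no cross-device communication per step.
        return self.table[global_ids]

    def write(self, global_ids: torch.Tensor, values: torch.Tensor) -> None:
        # Called only during a periodic sync round; gradients are not tracked.
        self.table[global_ids] = values.detach()

class PartitionGNNLayer(nn.Module):
    """Mean aggregation over in-partition neighbors plus stale remote neighbors."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(2 * in_dim, out_dim)

    def forward(self, h_local, adj_local, adj_halo, h_remote_stale):
        # adj_local: [n_local, n_local] edges inside this partition
        # adj_halo:  [n_local, n_remote] cut edges to nodes owned elsewhere
        deg = (adj_local.sum(1, keepdim=True) + adj_halo.sum(1, keepdim=True)).clamp(min=1)
        agg = (adj_local @ h_local + adj_halo @ h_remote_stale) / deg
        return F.relu(self.lin(torch.cat([h_local, agg], dim=1)))

# Toy setup: 6 global nodes; this device owns nodes 0-2, nodes 3-5 are remote.
dim, num_global = 8, 6
local_gids = torch.tensor([0, 1, 2])
remote_gids = torch.tensor([3, 4, 5])
x_local = torch.randn(3, dim)
adj_local = torch.tensor([[0., 1, 0], [1, 0, 1], [0, 1, 0]])
adj_halo = torch.tensor([[0., 0, 1], [1, 0, 0], [0, 1, 0]])
labels = torch.tensor([0, 1, 0])

store = StaleStore(num_global, dim)
layer = PartitionGNNLayer(dim, dim)   # hidden layer whose outputs are shared
head = nn.Linear(dim, 2)              # local classifier head
opt = torch.optim.Adam(list(layer.parameters()) + list(head.parameters()), lr=1e-2)
sync_every = 10                       # hypothetical synchronization period

for step in range(1, 101):
    h_remote_stale = store.read(remote_gids)   # possibly stale, but keeps global info
    h = layer(x_local, adj_local, adj_halo, h_remote_stale)
    loss = F.cross_entropy(head(h), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % sync_every == 0:
        # Periodic sync: publish this partition's fresh hidden representations
        # so other devices can read them as bounded-staleness remote messages.
        store.write(local_gids, h)

The trade-off the sketch is meant to show: per-step reads of remote representations are local table lookups, so cross-device traffic is incurred only once every sync_every steps, while the halo edges retain the global neighborhood information that partition-based methods would otherwise drop.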
