Communication-Efficient Topologies for Decentralized Learning with $O(1)$ Consensus Rate

Paper Title

Communication-Efficient Topologies for Decentralized Learning with $O(1)$ Consensus Rate

Authors

Zhuoqing Song, Weijian Li, Kexin Jin, Lei Shi, Ming Yan, Wotao Yin, Kun Yuan

Abstract

Decentralized optimization is an emerging paradigm in distributed learning in which agents achieve network-wide solutions by peer-to-peer communication without a central server. Since communication tends to be slower than computation, when each agent communicates with only a few neighboring agents per iteration, they can complete iterations faster than with more agents or a central server. However, the total number of iterations to reach a network-wide solution is affected by the speed at which the agents' information is ``mixed'' by communication. We found that popular communication topologies either have large maximum degrees (such as stars and complete graphs) or are ineffective at mixing information (such as rings and grids). To address this problem, we propose a new family of topologies, EquiTopo, which has an (almost) constant degree and a network-size-independent consensus rate that is used to measure the mixing efficiency. In the proposed family, EquiStatic has a degree of $\Theta(\ln(n))$, where $n$ is the network size, and a series of time-dependent one-peer topologies, EquiDyn, has a constant degree of 1. We generate EquiDyn through a certain random sampling procedure. Both achieve an $n$-independent consensus rate. We apply them to decentralized SGD and decentralized gradient tracking and obtain faster communication and better convergence, theoretically and empirically. Our code is implemented with BlueFog and is available at \url{https://github.com/kexinjinnn/EquiTopo}.
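
The degree-versus-mixing trade-off named in the abstract can be illustrated with a small sketch. The code below is not the EquiTopo construction and does not use BlueFog. It is a hypothetical pure-Python illustration that assumes uniform averaging weights (1/3 per ring neighbor, 1/n on the complete graph) and counts how many averaging rounds are needed to shrink the consensus error by a fixed factor. All function names and the tolerance are assumptions made for this example.

```python
def ring_weights(n):
    """Mixing matrix for a ring: each agent averages itself and its two
    neighbors with weight 1/3 each (an assumed weighting for illustration)."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def complete_weights(n):
    """Mixing matrix for a complete graph: every agent averages all n agents."""
    return [[1.0 / n] * n for _ in range(n)]


def mix(W, x):
    """One communication round: x <- W x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def consensus_error(x):
    """Euclidean distance between x and its network-wide average."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) ** 0.5


def rounds_to_consensus(W, x, tol=1e-3, max_rounds=100000):
    """Number of rounds until the consensus error drops below tol times
    its initial value (returns max_rounds if that never happens)."""
    initial = consensus_error(x)
    for k in range(1, max_rounds + 1):
        x = mix(W, x)
        if consensus_error(x) <= tol * initial:
            return k
    return max_rounds


if __name__ == "__main__":
    for n in (16, 32):
        x0 = [float(i) for i in range(n)]  # agent i starts with value i
        print(n,
              "ring:", rounds_to_consensus(ring_weights(n), x0),
              "complete:", rounds_to_consensus(complete_weights(n), x0))
```

Under these assumptions the complete graph reaches consensus in one round but requires every agent to talk to all others, while the ring keeps the degree at 2 but needs more rounds as $n$ grows. This is the gap that a topology with both small degree and an $n$-independent consensus rate is meant to close.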
