Paper Title

Faster Adaptive Federated Learning

Paper Authors

Xidong Wu, Feihu Huang, Zhengmian Hu, Heng Huang

Paper Abstract

Federated learning has attracted increasing attention with the emergence of distributed data. While extensive federated learning algorithms have been proposed for the non-convex distributed problem, federated learning in practice still faces numerous challenges, such as the large number of training iterations needed to converge as the sizes of models and datasets keep increasing, and the lack of adaptivity in SGD-based model updates. Meanwhile, the study of adaptive methods in federated learning is scarce, and existing works either lack a complete theoretical convergence guarantee or suffer from slow sample complexity. In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on the momentum-based variance reduction technique in cross-silo FL. We first explore how to design an adaptive algorithm in the FL setting. By providing a counter-example, we prove that a simple combination of FL and adaptive methods can lead to divergence. More importantly, we provide a convergence analysis for our method and prove that our algorithm is the first adaptive FL algorithm to reach the best-known sample complexity of $O(\epsilon^{-3})$ and $O(\epsilon^{-2})$ communication rounds to find an $\epsilon$-stationary point without large batches. Experimental results on a language modeling task and an image classification task with heterogeneous data demonstrate the efficiency of our algorithm.
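The abstract describes FAFED only at a high level: each client combines a momentum-based variance-reduced gradient estimator (in the style of STORM) with an adaptive learning rate, and clients periodically communicate so a server can average their states. Below is a minimal sketch of that general recipe on a toy heterogeneous least-squares problem. It is not the paper's algorithm: the hyperparameters (`eta`, `alpha`, `beta`), the Adam-like scaling, and the toy data are all illustrative assumptions.

```python
# A minimal sketch (not the authors' reference implementation) of the recipe the
# abstract describes: STORM-style momentum-based variance reduction on each
# client plus an Adam-like adaptive scaling, with periodic server averaging.
# All hyperparameters and the toy problem below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy heterogeneous problem: client i holds its own least-squares data (A_i, b_i).
n_clients, dim, n_rounds, local_steps = 4, 5, 60, 5
A = [rng.normal(size=(20, dim)) for _ in range(n_clients)]
b = [rng.normal(size=20) for _ in range(n_clients)]

def stoch_grad(i, x, j):
    """Single-sample stochastic gradient of ||A_i x - b_i||^2 at sample j."""
    return 2.0 * A[i][j] * (A[i][j] @ x - b[i][j])

eta, alpha, beta, eps = 0.05, 0.1, 0.99, 1e-8   # assumed constants
x = np.zeros(dim)                                # global model
scale = np.ones(dim)                             # adaptive second-moment scaling

for r in range(n_rounds):
    local_models, estimators = [], []
    for i in range(n_clients):
        xi = x.copy()
        vi = stoch_grad(i, xi, rng.integers(len(b[i])))  # initialize estimator
        for _ in range(local_steps):
            x_new = xi - eta * vi / (np.sqrt(scale) + eps)  # adaptive step
            j = rng.integers(len(b[i]))
            # STORM-style update: the same sample j is evaluated at both the new
            # and the old iterate, so the noise in the correction term cancels.
            vi = stoch_grad(i, x_new, j) + (1 - alpha) * (vi - stoch_grad(i, xi, j))
            xi = x_new
        local_models.append(xi)
        estimators.append(vi)
    x = np.mean(local_models, axis=0)            # server averages client models
    g = np.mean(estimators, axis=0)
    scale = beta * scale + (1 - beta) * g * g    # refresh adaptive scaling
    if r % 10 == 0:
        loss = np.mean([np.mean((A[i] @ x - b[i]) ** 2) for i in range(n_clients)])
        print(f"round {r:3d}  mean client loss {loss:.4f}")
```

The key line is the estimator update `vi = g(x_new) + (1 - alpha) * (vi - g(x_old))`, which reuses one sample at two consecutive iterates; this recursive correction is the general mechanism that lets momentum-based variance-reduced methods avoid large batches, consistent with the "without large batches" claim in the abstract.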
