Paper Title

Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

Paper Authors

Yikai Yan, Chaoyue Niu, Yucheng Ding, Zhenzhe Zheng, Fan Wu, Guihai Chen, Shaojie Tang, Zhihua Wu

Paper Abstract

Federated learning is a new distributed machine learning framework, where a bunch of heterogeneous clients collaboratively train a model without sharing training data. In this work, we consider a practical and ubiquitous issue when deploying federated learning in mobile environments: intermittent client availability, where the set of eligible clients may change during the training process. Such intermittent client availability would seriously deteriorate the performance of the classical Federated Averaging algorithm (FedAvg for short). Thus, we propose a simple distributed non-convex optimization algorithm, called Federated Latest Averaging (FedLaAvg for short), which leverages the latest gradients of all clients, even when the clients are not available, to jointly update the global model in each iteration. Our theoretical analysis shows that FedLaAvg attains the convergence rate of $O(E^{1/2}/(N^{1/4} T^{1/2}))$, achieving a sublinear speedup with respect to the total number of clients. We implement FedLaAvg along with several baselines and evaluate them over the benchmarking MNIST and Sentiment140 datasets. The evaluation results demonstrate that FedLaAvg achieves more stable training than FedAvg in both convex and non-convex settings and indeed reaches a sublinear speedup.
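The core idea described in the abstract, caching the most recent gradient of every client and averaging over all clients each round while refreshing only the gradients of clients that are currently available, can be illustrated with a short sketch. This is a minimal illustration based solely on the abstract's description, not the paper's actual algorithm or pseudocode; the function and variable names (fedlaavg_round, client_grad_fn, lr) and the synthetic per-client quadratic objectives are hypothetical.

```python
import numpy as np

def fedlaavg_round(model, latest_grads, available_clients, client_grad_fn, lr=0.1):
    """One global update: refresh cached gradients for available clients,
    then average the latest stored gradient of *every* client."""
    for c in available_clients:
        # Recompute the gradient only for clients reachable this round.
        latest_grads[c] = client_grad_fn(c, model)
    # Average over all clients, whether or not they were available this round.
    avg_grad = np.mean(np.stack(list(latest_grads.values())), axis=0)
    return model - lr * avg_grad

# Tiny usage example with synthetic quadratic objectives per client
# (client c minimizes ||w - target_c||^2), purely for illustration.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_clients, dim = 8, 5
    targets = rng.normal(size=(n_clients, dim))
    grad = lambda c, w: 2.0 * (w - targets[c])
    w = np.zeros(dim)
    latest = {c: grad(c, w) for c in range(n_clients)}  # initialize gradient caches once
    for t in range(200):
        # Intermittent availability: only a few clients participate each round.
        available = rng.choice(n_clients, size=3, replace=False)
        w = fedlaavg_round(w, latest, available, grad, lr=0.05)
    print("final model:", np.round(w, 3))
```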
