Over-the-Air Federated Edge Learning with Hierarchical Clustering

Paper Title

Over-the-Air Federated Edge Learning with Hierarchical Clustering

Authors

Ozan Aygün, Mohammad Kazemi, Deniz Gündüz, Tolga M. Duman

Abstract

We examine federated learning (FL) with over-the-air (OTA) aggregation, where mobile users (MUs) aim to reach a consensus on a global model with the help of a parameter server (PS) that aggregates the local gradients. In OTA FL, MUs train their models using local data at every training round and transmit their gradients simultaneously, uncoded, over the same frequency band. Based on the received signal of the superposed gradients, the PS performs a global model update. While OTA FL significantly reduces the communication cost, it is susceptible to adverse channel effects and noise. Employing multiple antennas at the receiver can reduce these effects, yet path loss remains a limiting factor for users located far from the PS. To mitigate this issue, we propose a wireless-based hierarchical FL scheme that uses intermediate servers (ISs) to form clusters in the areas where the MUs are more densely located. Our scheme uses OTA cluster aggregations for the communication between the MUs and their corresponding IS, and OTA global aggregations from the ISs to the PS. We present a convergence analysis for the proposed algorithm, and show, through numerical evaluations of the derived analytical expressions and through experimental results, that utilizing ISs yields faster convergence and better performance than OTA FL alone while using less transmit power. We also validate the performance results using different numbers of cluster iterations with different datasets and data distributions. We conclude that the best choice of cluster aggregations depends on the data distribution among the MUs and the clusters.
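
The two-level aggregation described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes an ideal channel in which simultaneously transmitted signals simply add up, with Gaussian receiver noise only, and it ignores fading, path loss, multiple antennas, power control, and repeated cluster iterations. The function names `ota_aggregate` and `hierarchical_ota_round` and all parameter values are hypothetical.

```python
import numpy as np

def ota_aggregate(signals, noise_std, rng):
    """Noisy average of vectors transmitted at the same time.

    Assumption: the channel sums the uncoded signals and the receiver
    adds zero-mean Gaussian noise (no fading or path loss modeled).
    """
    signals = np.asarray(signals, dtype=float)
    superposed = signals.sum(axis=0)  # superposition over the shared band
    noise = rng.normal(0.0, noise_std, size=superposed.shape)
    return (superposed + noise) / signals.shape[0]

def hierarchical_ota_round(cluster_gradients, noise_std, rng):
    """One round: MUs -> IS within each cluster, then ISs -> PS."""
    # OTA cluster aggregation: each IS estimates its cluster's mean gradient.
    cluster_estimates = [
        ota_aggregate(grads, noise_std, rng) for grads in cluster_gradients
    ]
    # OTA global aggregation: the PS averages the IS estimates.
    return ota_aggregate(cluster_estimates, noise_std, rng)

rng = np.random.default_rng(0)
# Hypothetical setup: 3 clusters, 5 MUs each, 4-dimensional gradients.
clusters = [rng.normal(size=(5, 4)) for _ in range(3)]
true_mean = np.mean(np.concatenate(clusters), axis=0)

noiseless_estimate = hierarchical_ota_round(clusters, noise_std=0.0, rng=rng)
noisy_estimate = hierarchical_ota_round(clusters, noise_std=0.1, rng=rng)
```

With equal-sized clusters and no noise, the hierarchical estimate coincides with the plain average over all MUs; with noise, it deviates from that average by an amount that depends on the noise level at each hop.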

Paper Title