Advancements of federated learning towards privacy preservation: from federated learning to split learning

Paper title

Advancements of federated learning towards privacy preservation: from federated learning to split learning

Authors

Thapa, Chandra; Chamikara, M. A. P.; Camtepe, Seyit A.

Abstract

In the distributed collaborative machine learning (DCML) paradigm, federated learning (FL) has recently attracted much attention due to its applications in health, finance, and recent innovations such as Industry 4.0 and smart vehicles. FL provides privacy by design: it trains a machine learning model collaboratively over several distributed clients (ranging from two to millions), such as mobile phones, without any client sharing its raw data with any other participant. In practical scenarios, however, not all clients have sufficient computing resources (e.g., Internet of Things devices), the machine learning model may have millions of parameters, and the privacy of the model between the server and the clients during training/testing is a prime concern (e.g., when the parties are rivals). In these respects FL is not sufficient, so split learning (SL) was introduced. SL suits these scenarios because it splits a model into multiple portions, distributes them among the clients and the server, and has each party train/test its own portion to accomplish the full model training/testing. In SL, the participants share neither their data nor their model portions with any other party, and usually a smaller network portion is assigned to the clients, where the data resides. Recently, a hybrid of FL and SL, called splitfed learning, was introduced to combine the benefits of both FL (faster training/testing time) and SL (model split and training). Following the developments from FL to SL, and considering the importance of SL, this chapter is designed to provide extensive coverage of SL and its variants. The coverage includes fundamentals, existing findings, integration with privacy measures such as differential privacy, open problems, and code implementation.
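The split described in the abstract can be illustrated with a minimal sketch. It is not the chapter's own code: the class names `Client` and `Server`, the two-unit linear model, the toy data, and the learning rate are all hypothetical choices made for illustration. It also assumes a configuration in which the label travels to the server along with the cut-layer activations; other SL configurations keep labels on the client. The raw input `x` and each party's weights never leave their owner; only activations and their gradients cross the boundary.

```python
# Illustrative split-learning sketch in pure Python (assumed toy setup).

class Client:
    """Holds the raw data and the first (smaller) model portion."""

    def __init__(self, weights, data):
        self.w = list(weights)
        self.data = data  # list of (x, y); x is never sent to the server

    def forward(self, x):
        # Activations at the cut layer (often called "smashed data").
        return [w * x for w in self.w]

    def backward(self, x, grad_h, lr):
        # h_j = w_j * x, so dL/dw_j = dL/dh_j * x.
        self.w = [w - lr * g * x for w, g in zip(self.w, grad_h)]


class Server:
    """Holds the remaining model portion; sees activations, not raw inputs."""

    def __init__(self, weights):
        self.w = list(weights)

    def predict(self, h):
        return sum(w * a for w, a in zip(self.w, h))

    def train_step(self, h, y, lr):
        err = self.predict(h) - y                # gradient of 0.5 * err**2
        grad_h = [err * w for w in self.w]       # sent back to the client
        self.w = [w - lr * err * a for w, a in zip(self.w, h)]
        return grad_h


def mean_loss(client, server):
    losses = [0.5 * (server.predict(client.forward(x)) - y) ** 2
              for x, y in client.data]
    return sum(losses) / len(losses)


def train(client, server, epochs=200, lr=0.1):
    for _ in range(epochs):
        for x, y in client.data:
            h = client.forward(x)                    # client -> server
            grad_h = server.train_step(h, y, lr)     # server -> client
            client.backward(x, grad_h, lr)


# Toy regression task: learn y = 2x across the two model portions.
data = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]
client = Client([0.5, -0.3], data)
server = Server([0.4, 0.2])
initial_loss = mean_loss(client, server)
train(client, server)
final_loss = mean_loss(client, server)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

Each training step exchanges one activation vector and one gradient vector, which is why a small client-side portion keeps the client's compute cost low while the server carries the rest of the model.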
