Paper Title

Sharing Models or Coresets: A Study based on Membership Inference Attack

Paper Authors

Lu, Hanlin; Liu, Changchang; He, Ting; Wang, Shiqiang; Chan, Kevin S.

Paper Abstract

Distributed machine learning generally aims at training a global model based on distributed data without collecting all the data to a centralized location, where two different approaches have been proposed: collecting and aggregating local models (federated learning) and collecting and training over representative data summaries (coreset). While each approach preserves data privacy to some extent thanks to not sharing the raw data, the exact extent of protection is unclear under sophisticated attacks that try to infer the raw data from the shared information. We present the first comparison between the two approaches in terms of target model accuracy, communication cost, and data privacy, where the last is measured by the accuracy of a state-of-the-art attack strategy called the membership inference attack. Our experiments quantify the accuracy-privacy-cost tradeoff of each approach, and reveal a nontrivial comparison that can be used to guide the design of model training processes.
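As a concrete illustration of the privacy metric used in the comparison, the sketch below implements a simple confidence-threshold membership inference attack in Python. The dataset, target model, and threshold rule are illustrative assumptions, not the paper's experimental setup; the paper's attack follows the state-of-the-art strategy referenced in the abstract.

```python
# Minimal sketch of a confidence-threshold membership inference attack.
# All choices below (digits dataset, MLP target, midpoint threshold) are
# illustrative assumptions, not details from the paper.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Train a "target" model on member data; hold out non-member data.
X, y = load_digits(return_X_y=True)
X_mem, X_non, y_mem, y_non = train_test_split(X, y, test_size=0.5, random_state=0)
target = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
target.fit(X_mem, y_mem)

# Attack feature: the target model's confidence in its predicted class.
conf_mem = target.predict_proba(X_mem).max(axis=1)   # members (training data)
conf_non = target.predict_proba(X_non).max(axis=1)   # non-members

# Threshold rule: guess "member" when confidence exceeds a cutoff,
# here the midpoint of the two mean confidences (an assumption).
tau = 0.5 * (conf_mem.mean() + conf_non.mean())
guesses = np.concatenate([conf_mem, conf_non]) > tau
labels = np.concatenate([np.ones_like(conf_mem), np.zeros_like(conf_non)])

# Attack accuracy is the privacy measure: for a balanced member/non-member
# split, accuracy near 0.5 means little membership leakage.
print("membership inference attack accuracy:", (guesses == labels).mean())
```

In the paper's setting, this attack accuracy is computed against whatever information is shared (aggregated local models or a coreset), which is how the privacy of the two approaches is compared alongside target model accuracy and communication cost.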
