Paper Title

Recovering Private Text in Federated Learning of Language Models

Authors

Samyak Gupta, Yangsibo Huang, Zexuan Zhong, Tianyu Gao, Kai Li, Danqi Chen

Abstract

Federated learning allows distributed users to collaboratively train a model while keeping each user's data private. Recently, a growing body of work has demonstrated that an eavesdropping attacker can effectively recover image data from gradients transmitted during federated learning. However, little progress has been made in recovering text data. In this paper, we present a novel attack method, FILM, for federated learning of language models (LMs). For the first time, we show the feasibility of recovering text from large batch sizes of up to 128 sentences. Unlike image-recovery methods that are optimized to match gradients, we take a distinct approach that first identifies a set of words from gradients and then directly reconstructs sentences based on beam search and a prior-based reordering strategy. We conduct the FILM attack on several large-scale datasets and show that it can successfully reconstruct single sentences with high fidelity for large batch sizes, and even multiple sentences if applied iteratively. We evaluate three defense methods: gradient pruning, DPSGD, and a simple approach we propose that freezes word embeddings. We show that both gradient pruning and DPSGD lead to a significant drop in utility. However, if we fine-tune a public pre-trained LM on private text without updating word embeddings, we can effectively defend against the attack with minimal loss of data utility. Together, we hope that our results can encourage the community to rethink the privacy concerns of LM training and its standard practices in the future.
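To make the word-identification step concrete, the sketch below illustrates (under simplified assumptions, not the authors' implementation) why word-embedding gradients leak the bag of words in a batch: only the embedding rows of tokens that appear in the batch receive a nonzero gradient. The toy model, vocabulary, and batch are hypothetical stand-ins, the output projection is untied for clarity, and a per-token classification loss stands in for the real LM loss; FILM then orders the recovered words with beam search and prior-based reordering.

```python
# Minimal sketch of word leakage through embedding gradients (illustrative toy
# setup, not the paper's code).
import torch
import torch.nn as nn

vocab_size, hidden = 1000, 32
embedding = nn.Embedding(vocab_size, hidden)
head = nn.Linear(hidden, vocab_size)   # untied output projection, for clarity

# One client's private batch of token IDs (stand-in data).
batch = torch.tensor([[17, 256, 3, 981],
                      [42, 17, 500, 7]])

# Client-side: a toy training step whose gradients the server would observe.
# (A per-token classification loss stands in for the real LM loss here.)
logits = head(embedding(batch))                          # [B, T, vocab]
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab_size),
                                   batch.reshape(-1))
loss.backward()

# Attacker-side: only the embedding rows of tokens that occur in the batch
# receive a nonzero gradient, so the nonzero rows reveal the bag of words.
row_norms = embedding.weight.grad.norm(dim=1)
leaked_ids = torch.nonzero(row_norms > 0).squeeze(-1).tolist()
print(sorted(leaked_ids))  # == sorted set of token IDs appearing in `batch`
```

The defense proposed in the abstract corresponds to closing this leakage channel: fine-tuning with the embedding matrix frozen (e.g., embedding.weight.requires_grad_(False)) keeps embedding gradients out of the client update.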
