论文标题
可证明的机密语言建模
Provably Confidential Language Modelling
论文作者
论文摘要
大型语言模型被显示为记住隐私信息,例如培训数据中的社会保险号。鉴于培训语料库的巨大规模,手动或自动筛选和过滤这些隐私数据是一项挑战。在本文中,我们提出了秘密编辑的培训(CRT),这是一种培训语言生成模型的方法,同时保护机密细分市场。我们从差异隐私(解决一个相关但独特的问题)中借鉴了想法,并表明我们的方法能够通过随机将培训过程的部分随机分组来防止意外的记忆。此外,我们证明了通过大致正确筛选策略进行修复会放大机密性保证。我们实施了LSTM和GPT语言模型的方法。我们的实验结果表明,通过CRT训练的模型获得了几乎相同的困惑,同时保持了强大的机密性。
Large language models are shown to memorize privacy information such as social security numbers in training data. Given the sheer scale of the training corpus, it is challenging to screen and filter these privacy data, either manually or automatically. In this paper, we propose Confidentially Redacted Training (CRT), a method to train language generation models while protecting the confidential segments. We borrow ideas from differential privacy (which solves a related but distinct problem) and show that our method is able to provably prevent unintended memorization by randomizing parts of the training process. Moreover, we show that redaction with an approximately correct screening policy amplifies the confidentiality guarantee. We implement the method for both LSTM and GPT language models. Our experimental results show that the models trained by CRT obtain almost the same perplexity while preserving strong confidentiality.