Paper Title

Privacy of Noisy Stochastic Gradient Descent: More Iterations without More Privacy Loss

Authors

Altschuler, Jason M., Talwar, Kunal

Abstract

机器学习中的一个核心问题是如何在敏感用户数据上训练模型。行业已广泛采用了一种简单的算法:带有噪声的随机梯度下降(又称随机梯度Langevin Dynamics)。但是,关于该算法的隐私损失的基本理论问题仍然开放 - 即使在看似简单的平稳凸损失的环境中,也在有限的域上。我们的主要结果解决了以下问题:对于各种参数,我们将差异隐私的特征为恒定因素。该结果表明,此设置的所有先前分析都具有错误的定性行为。具体而言,虽然以前的隐私分析增加了迭代次数的无限数量,但我们表明,经过少量燃烧期,运行SGD的泄漏更长的泄漏不会进一步泄漏。 我们的分析脱离了基于快速混合的先前方法,而是使用基于最佳运输的技术(即,通过迭代进行隐私放大)和采样的高斯机制(即,通过采样来放大隐私放大)。我们的技术很容易扩展到其他设置,例如强烈凸出损失,不均匀的步骤,任意批处理大小以及批处理的随机或周期性选择。

A central issue in machine learning is how to train models on sensitive user data. Industry has widely adopted a simple algorithm: Stochastic Gradient Descent with noise (a.k.a. Stochastic Gradient Langevin Dynamics). However, foundational theoretical questions about this algorithm's privacy loss remain open -- even in the seemingly simple setting of smooth convex losses over a bounded domain. Our main result resolves these questions: for a large range of parameters, we characterize the differential privacy up to a constant factor. This result reveals that all previous analyses for this setting have the wrong qualitative behavior. Specifically, while previous privacy analyses increase ad infinitum in the number of iterations, we show that after a small burn-in period, running SGD longer leaks no further privacy. Our analysis departs from previous approaches based on fast mixing, instead using techniques based on optimal transport (namely, Privacy Amplification by Iteration) and the Sampled Gaussian Mechanism (namely, Privacy Amplification by Sampling). Our techniques readily extend to other settings, e.g., strongly convex losses, non-uniform stepsizes, arbitrary batch sizes, and random or cyclic choice of batches.
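The algorithm the abstract refers to, noisy SGD over a bounded domain, can be sketched minimally as below: at each step, take a (mini-batch) gradient step, add Gaussian noise, and project back onto the constraint set. All names, parameter values, and the choice of an L2-ball domain are illustrative assumptions, not details from the paper.

```python
import numpy as np

def noisy_sgd(grad, x0, data, steps, eta=0.05, sigma=0.1,
              radius=1.0, batch_size=1, rng=None):
    """Sketch of projected noisy SGD (a.k.a. SGLD-style iterates).

    grad(x, z) is the per-example gradient; the domain is assumed to be
    the L2 ball of the given radius (an illustrative choice).
    """
    rng = np.random.default_rng(rng)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        # Sample a random mini-batch (privacy amplification by sampling).
        idx = rng.choice(len(data), size=batch_size, replace=False)
        g = np.mean([grad(x, z) for z in data[idx]], axis=0)
        # Gradient step plus Gaussian noise.
        x = x - eta * (g + sigma * rng.standard_normal(x.shape))
        # Project back onto the bounded domain.
        norm = np.linalg.norm(x)
        if norm > radius:
            x = x * (radius / norm)
    return x
```

For example, with the quadratic loss `0.5 * ||x - z||^2` one would pass `grad = lambda x, z: x - z`; the iterate always stays inside the ball regardless of how many steps are run, which is the bounded-domain setting the abstract's result concerns.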
