Paper Title

Formal Privacy for Partially Private Data

Authors

Jeremy Seeman, Matthew Reimherr, Aleksandra Slavkovic

Abstract

Differential privacy (DP) quantifies privacy loss by analyzing noise injected into output statistics. For non-trivial statistics, this noise is necessary to ensure finite privacy loss. However, data curators frequently release collections of statistics where some use DP mechanisms and others are released as-is, i.e., without additional randomized noise. Consequently, DP alone cannot characterize the privacy loss attributable to the entire collection of releases. In this paper, we present a privacy formalism, $(\epsilon, \{\Theta_z\}_{z \in \mathcal{Z}})$-Pufferfish ($\epsilon$-TP for short when $\{\Theta_z\}_{z \in \mathcal{Z}}$ is implied), a collection of Pufferfish mechanisms indexed by realizations of a random variable $Z$ representing public information not protected with DP noise. First, we prove that this definition has similar properties to DP. Next, we introduce mechanisms for releasing partially private data (PPD) satisfying $\epsilon$-TP and prove their desirable properties. We provide algorithms for sampling from the posterior of a parameter given PPD. We then compare this inference approach to the alternative where noisy statistics are deterministically combined with $Z$. We derive mild conditions under which using our algorithms offers both theoretical and computational improvements over this more common approach. Finally, we demonstrate all the effects above in a case study of COVID-19 data.
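The setting the abstract describes, a release that mixes DP-protected statistics with statistics published as-is, can be illustrated with a minimal sketch. This is not the authors' mechanism or their posterior sampler. It is a toy example under stated assumptions: the noisy statistic is a count with sensitivity 1 protected by the standard Laplace mechanism, the public information $Z$ is the exact number of records, and the function names `laplace_noise`, `release`, and `plug_in_estimate` are hypothetical. The last function shows the kind of deterministic combination of a noisy statistic with $Z$ that the abstract calls the more common approach.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release(records, epsilon, rng):
    """Hypothetical partially private release (illustration only).

    records: list of 0/1 indicators, one per person.
    The total number of records plays the role of Z and is published exactly.
    The count of ones is published with Laplace noise of scale 1/epsilon,
    assuming that replacing one record changes the count by at most 1.
    """
    public_total = len(records)  # Z: released as-is, no noise
    true_count = sum(records)
    noisy_count = true_count + laplace_noise(1.0 / epsilon, rng)
    return {"public_total": public_total, "noisy_count": noisy_count}


def plug_in_estimate(out):
    """Deterministically combine the noisy count with Z.

    Clips the noisy count to the range [0, Z] implied by the public total,
    then divides by Z to estimate a rate.
    """
    clipped = min(max(out["noisy_count"], 0.0), float(out["public_total"]))
    return clipped / out["public_total"]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
records = [1] * 30 + [0] * 70  # made-up toy data, not from the paper
out = release(records, epsilon=1.0, rng=rng)
rate = plug_in_estimate(out)
```

Because `public_total` is published without noise, the DP guarantee on `noisy_count` alone does not describe what the pair of outputs reveals, which is the gap the paper's formalism is meant to address.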
