Paper Title

DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting

Authors

Timour Igamberdiev, Thomas Arnold, Ivan Habernal

Abstract

Text rewriting with differential privacy (DP) provides concrete theoretical guarantees for protecting the privacy of individuals in textual documents. In practice, existing systems may lack the means to validate their privacy-preserving claims, leading to problems of transparency and reproducibility. We introduce DP-Rewrite, an open-source framework for differentially private text rewriting which aims to solve these problems by being modular, extensible, and highly customizable. Our system incorporates a variety of downstream datasets, models, pre-training procedures, and evaluation metrics to provide a flexible way to lead and validate private text rewriting research. To demonstrate our software in practice, we provide a set of experiments as a case study on the ADePT DP text rewriting system, detecting a privacy leak in its pre-training approach. Our system is publicly available, and we hope that it will help the community to make DP text rewriting research more accessible and transparent.