Paper Title
GradViT: Gradient Inversion of Vision Transformers
Paper Authors
Paper Abstract
In this work we demonstrate the vulnerability of vision transformers (ViTs) to gradient-based inversion attacks. During this attack, the original data batch is reconstructed given the model weights and the corresponding gradients. We introduce a method, named GradViT, that optimizes random noise into natural-looking images via an iterative process. The optimization objective consists of (i) a loss on matching the gradients, (ii) an image prior in the form of a distance to the batch-normalization statistics of a pretrained CNN model, and (iii) a total variation regularization on patches to guide correct recovery locations. We propose a unique loss scheduling function to overcome local minima during optimization. We evaluate GradViT on the ImageNet1K and MS-Celeb-1M datasets, and observe unprecedentedly high fidelity and closeness to the original (hidden) data. During the analysis, we find that vision transformers are significantly more vulnerable than previously studied CNNs due to the presence of the attention mechanism. Our method demonstrates new state-of-the-art results for gradient inversion in both qualitative and quantitative metrics. Project page at https://gradvit.github.io/.
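The composite objective described in the abstract (gradient matching, a batch-normalization image prior from a pretrained CNN, and total variation regularization) can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the authors' implementation: the handles `vit`, `cnn_prior`, `target_grads`, the inferred `labels`, and the weights `alpha`, `beta`, `gamma` are hypothetical, and the plain image-level total variation below stands in for the paper's patch-based variant.

```python
import torch
import torch.nn.functional as F

def gradvit_style_loss(x, labels, vit, cnn_prior, target_grads,
                       alpha=1.0, beta=1e-2, gamma=1e-4):
    """Sketch of a composite reconstruction loss: gradient matching + BN prior + TV.
    All names and weights are illustrative assumptions, not the paper's code."""
    # (i) Gradient matching: distance between the gradients induced by the
    # current reconstruction x and the observed (shared) gradients.
    task_loss = F.cross_entropy(vit(x), labels)
    grads = torch.autograd.grad(task_loss, vit.parameters(), create_graph=True)
    grad_loss = sum(((g - t) ** 2).sum() for g, t in zip(grads, target_grads))

    # (ii) Image prior: match per-channel statistics of the features entering each
    # BatchNorm layer of a pretrained CNN to that layer's running mean/variance.
    bn_inputs, hooks = {}, []
    for name, m in cnn_prior.named_modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(
                lambda mod, inp, out, key=name: bn_inputs.__setitem__(key, inp[0])))
    cnn_prior(x)
    prior_loss = x.new_zeros(())
    for name, m in cnn_prior.named_modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            f = bn_inputs[name]
            mu = f.mean(dim=(0, 2, 3))
            var = f.var(dim=(0, 2, 3), unbiased=False)
            prior_loss = prior_loss + ((mu - m.running_mean) ** 2).sum() \
                                    + ((var - m.running_var) ** 2).sum()
    for h in hooks:
        h.remove()

    # (iii) Total variation smoothness prior (the paper applies a patch-aware
    # variant to guide correct recovery locations; plain image TV shown here).
    tv = (x[:, :, 1:, :] - x[:, :, :-1, :]).abs().mean() + \
         (x[:, :, :, 1:] - x[:, :, :, :-1]).abs().mean()

    return alpha * grad_loss + beta * prior_loss + gamma * tv
```

In an attack loop of this kind, `x` would be initialized as random noise with `requires_grad=True`, the loss backpropagated to `x`, and an optimizer stepped on `x` while the relative weights are varied over iterations, in the spirit of the loss scheduling mentioned in the abstract.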