Paper Title
Double Backpropagation for Training Autoencoders against Adversarial Attack
Paper Authors
Paper Abstract
Deep learning, as is widely known, is vulnerable to adversarial samples. This paper focuses on adversarial attacks on autoencoders. The safety of autoencoders (AEs) is important because they are widely used as a compression scheme for data storage and transmission. However, current autoencoders are easily attacked, i.e., one can slightly modify an input yet obtain a totally different code. This vulnerability is rooted in the sensitivity of the autoencoders; to enhance robustness, we propose to adopt double backpropagation (DBP) to secure autoencoders such as VAE and DRAW. We restrict the gradient from the reconstructed image to the original one so that the autoencoder is not sensitive to the trivial perturbations produced by adversarial attacks. After smoothing the gradient by DBP, we further smooth the labels with a Gaussian Mixture Model (GMM), aiming for accurate and robust classification. We demonstrate on MNIST, CelebA, and SVHN that our method leads to a robust autoencoder resistant to attack and, when combined with the GMM, a robust classifier capable of image transition and immune to adversarial attack.
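
The abstract describes DBP as penalizing the gradient that flows from the reconstruction loss back to the input image. Below is a minimal sketch of such a gradient penalty, assuming a PyTorch setup; the `TinyAE` module, the `dbp_loss` function, and the weight `lam` are hypothetical names for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical toy autoencoder; the paper applies DBP to VAE and DRAW,
# but any autoencoder with a differentiable reconstruction fits this sketch.
class TinyAE(nn.Module):
    def __init__(self, dim=784, code=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, code), nn.ReLU())
        self.dec = nn.Sequential(nn.Linear(code, dim), nn.Sigmoid())

    def forward(self, x):
        return self.dec(self.enc(x))

def dbp_loss(model, x, lam=1.0):
    """Reconstruction loss plus a double-backpropagation penalty on the
    gradient of that loss with respect to the input."""
    x = x.requires_grad_(True)
    recon = model(x)
    rec_loss = ((recon - x) ** 2).mean()
    # First backward pass: gradient of the reconstruction loss w.r.t. the
    # input, kept in the graph (create_graph=True) so it can be penalized
    # and differentiated again when the total loss is backpropagated.
    grad_x, = torch.autograd.grad(rec_loss, x, create_graph=True)
    return rec_loss + lam * grad_x.pow(2).sum(dim=1).mean()

# Usage: the second backward pass through grad_x happens in loss.backward().
model = TinyAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)  # e.g., a batch of flattened MNIST-sized images
loss = dbp_loss(model, x)
loss.backward()
opt.step()
```

Keeping the input gradient small is what makes small input perturbations produce only small changes in the reconstruction objective, which is the robustness property the abstract claims for the DBP-trained autoencoder.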