Paper Title

Art-Attack: Black-Box Adversarial Attack via Evolutionary Art

Paper Authors

Phoenix Williams, Ke Li

Paper Abstract

Deep neural networks (DNNs) have achieved state-of-the-art performance in many tasks but have shown extreme vulnerability to adversarial examples. Many works assume a white-box setting, in which the attacker has full access to the targeted model, including its architecture and gradients. A more realistic assumption is the black-box scenario, where an attacker can only access the targeted model by querying inputs and observing the predicted class probabilities. Unlike most prevalent black-box attacks, which rely on substitute models or gradient estimation, this paper proposes a gradient-free attack that uses the concept of evolutionary art to generate adversarial examples by iteratively evolving a set of overlapping transparent shapes. To evaluate the effectiveness of our proposed method, we attack three state-of-the-art image classification models trained on the CIFAR-10 dataset in a targeted manner. We conduct a parameter study outlining the impact that the number and type of shapes have on the proposed attack's performance. Compared with state-of-the-art black-box attacks, our attack generates adversarial examples more effectively and achieves a higher attack success rate on all three baseline models.
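
To make the idea concrete, below is a minimal sketch of a score-based black-box attack in the spirit of the abstract: semi-transparent shapes are composited over an image and evolved with a simple (1+1) evolution strategy, using only the model's predicted class probabilities. The circle primitive, shape encoding, mutation scale, and helper names such as `query_probs` are illustrative assumptions, not the authors' exact algorithm.

```python
# Sketch of an evolutionary-art-style black-box attack (assumed details,
# not the paper's exact method): evolve overlapping transparent circles
# so that the victim model predicts a chosen target class.
import numpy as np
from PIL import Image, ImageDraw

NUM_SHAPES = 10    # number of overlapping transparent circles (assumed)
IMG_SIZE = 32      # CIFAR-10 resolution
ITERATIONS = 2000  # query budget for this sketch (assumed)

# Per-gene mutation scale for (x, y, radius, r, g, b, alpha).
STEP = np.array([2.0, 2.0, 1.0, 0.1, 0.1, 0.1, 0.1])

def render(base, genome):
    """Composite the semi-transparent circles in `genome` onto `base` (uint8 HxWx3)."""
    canvas = Image.fromarray(base).convert("RGBA")
    overlay = Image.new("RGBA", canvas.size, (0, 0, 0, 0))
    draw = ImageDraw.Draw(overlay)
    for x, y, r, cr, cg, cb, a in genome:
        draw.ellipse([x - r, y - r, x + r, y + r],
                     fill=(int(cr * 255), int(cg * 255), int(cb * 255), int(a * 255)))
    return np.asarray(Image.alpha_composite(canvas, overlay).convert("RGB"))

def clip(genome):
    genome[:, :2] = np.clip(genome[:, :2], 0, IMG_SIZE)    # centres inside the image
    genome[:, 2] = np.clip(genome[:, 2], 1, IMG_SIZE / 2)  # sane radii
    genome[:, 3:] = np.clip(genome[:, 3:], 0, 1)           # colour and opacity
    return genome

def attack(base_img, target_class, query_probs, seed=0):
    """Evolve shapes until the black-box model predicts `target_class`.

    `query_probs(img) -> np.ndarray` is the only access to the model,
    matching the score-based black-box threat model in the abstract.
    """
    rng = np.random.default_rng(seed)
    genome = clip(rng.random((NUM_SHAPES, 7)) * [IMG_SIZE, IMG_SIZE, 8, 1, 1, 1, 1])
    best_fit = query_probs(render(base_img, genome))[target_class]
    for _ in range(ITERATIONS):
        child = clip(genome + rng.normal(0.0, 1.0, genome.shape) * STEP)
        probs = query_probs(render(base_img, child))
        if np.argmax(probs) == target_class:       # targeted success
            return render(base_img, child), True
        if probs[target_class] >= best_fit:        # (1+1)-ES: keep child if no worse
            genome, best_fit = child, probs[target_class]
    return render(base_img, genome), False
```

In practice, `query_probs` would wrap the victim classifier's softmax output (e.g. a forward pass through one of the CIFAR-10 models), and the shape count and primitive type here are exactly the kind of parameters the abstract's parameter study varies.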
