Paper Title

Art-Attack: Black-Box Adversarial Attack via Evolutionary Art

Paper Authors

Phoenix Williams, Ke Li

Paper Abstract

Deep neural networks (DNNs) have achieved state-of-the-art performance in many tasks but have shown extreme vulnerability to adversarial examples. Many works assume a white-box setting, in which the attacker has full access to the targeted model, including its architecture and gradients. A more realistic assumption is the black-box scenario, where an attacker can only access the targeted model by querying inputs and observing the predicted class probabilities. Unlike most prevalent black-box attacks, which rely on substitute models or gradient estimation, this paper proposes a gradient-free attack that uses the concept of evolutionary art to generate adversarial examples by iteratively evolving a set of overlapping transparent shapes. To evaluate the effectiveness of our proposed method, we attack three state-of-the-art image classification models trained on the CIFAR-10 dataset in a targeted manner. We conduct a parameter study outlining the impact that the number and type of shapes have on the proposed attack's performance. Compared with state-of-the-art black-box attacks, our attack generates adversarial examples more effectively and achieves a higher attack success rate on all three baseline models.
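
To make the idea concrete, below is a minimal sketch of a score-based black-box attack in the spirit of the abstract: semi-transparent shapes are composited over an image and evolved with a simple (1+1) evolution strategy, using only the model's predicted class probabilities. The circle primitive, shape encoding, mutation scale, and helper names such as `query_probs` are illustrative assumptions, not the authors' exact algorithm.

```python
# Sketch of an evolutionary-art-style black-box attack (assumed details,
# not the paper's exact method): evolve overlapping transparent circles
# so that the victim model predicts a chosen target class.
import numpy as np
from PIL import Image, ImageDraw

NUM_SHAPES = 10    # number of overlapping transparent circles (assumed)
IMG_SIZE = 32      # CIFAR-10 resolution
ITERATIONS = 2000  # query budget for this sketch (assumed)

# Per-gene mutation scale for (x, y, radius, r, g, b, alpha).
STEP = np.array([2.0, 2.0, 1.0, 0.1, 0.1, 0.1, 0.1])

def render(base, genome):
    """Composite the semi-transparent circles in `genome` onto `base` (uint8 HxWx3)."""
    canvas = Image.fromarray(base).convert("RGBA")
    overlay = Image.new("RGBA", canvas.size, (0, 0, 0, 0))
    draw = ImageDraw.Draw(overlay)
    for x, y, r, cr, cg, cb, a in genome:
        draw.ellipse([x - r, y - r, x + r, y + r],
                     fill=(int(cr * 255), int(cg * 255), int(cb * 255), int(a * 255)))
    return np.asarray(Image.alpha_composite(canvas, overlay).convert("RGB"))

def clip(genome):
    genome[:, :2] = np.clip(genome[:, :2], 0, IMG_SIZE)    # centres inside the image
    genome[:, 2] = np.clip(genome[:, 2], 1, IMG_SIZE / 2)  # sane radii
    genome[:, 3:] = np.clip(genome[:, 3:], 0, 1)           # colour and opacity
    return genome

def attack(base_img, target_class, query_probs, seed=0):
    """Evolve shapes until the black-box model predicts `target_class`.

    `query_probs(img) -> np.ndarray` is the only access to the model,
    matching the score-based black-box threat model in the abstract.
    """
    rng = np.random.default_rng(seed)
    genome = clip(rng.random((NUM_SHAPES, 7)) * [IMG_SIZE, IMG_SIZE, 8, 1, 1, 1, 1])
    best_fit = query_probs(render(base_img, genome))[target_class]
    for _ in range(ITERATIONS):
        child = clip(genome + rng.normal(0.0, 1.0, genome.shape) * STEP)
        probs = query_probs(render(base_img, child))
        if np.argmax(probs) == target_class:       # targeted success
            return render(base_img, child), True
        if probs[target_class] >= best_fit:        # (1+1)-ES: keep child if no worse
            genome, best_fit = child, probs[target_class]
    return render(base_img, genome), False
```

In practice, `query_probs` would wrap the victim classifier's softmax output (e.g. a forward pass through one of the CIFAR-10 models), and the shape count and primitive type here are exactly the kind of parameters the abstract's parameter study varies.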
