Paper Title

Contextual Fusion For Adversarial Robustness

Paper Authors

Aiswarya Akumalla, Seth Haney, Maksim Bazhenov

Paper Abstract

Mammalian brains handle complex reasoning tasks in a gestalt manner by integrating information from regions of the brain that are specialised to individual sensory modalities. This allows for improved robustness and better generalisation ability. In contrast, deep neural networks are usually designed to process one particular information stream and are susceptible to various types of adversarial perturbations. While many methods exist for detecting and defending against adversarial attacks, they do not generalise across a range of attacks and negatively affect performance on clean, unperturbed data. We developed a fusion model using a combination of background and foreground features extracted in parallel from Places-CNN and Imagenet-CNN. We tested the benefits of the fusion approach for preserving adversarial robustness against human-perceivable (e.g., Gaussian blur) and network-perceivable (e.g., gradient-based) attacks on the CIFAR-10 and MS COCO data sets. For gradient-based attacks, our results show that fusion allows for significant improvements in classification without decreasing performance on unperturbed data and without the need to perform adversarial retraining. Our fusion model revealed improvements for Gaussian blur type perturbations as well. The increase in performance from the fusion approach depended on the variability of the image contexts; larger increases were seen for classes of images with larger differences in their contexts. We also demonstrate the effect of regularization to bias the classifier decision in the presence of a known adversary. We propose that this biologically inspired approach to integrating information across multiple modalities provides a new way to improve adversarial robustness that can be complementary to current state-of-the-art approaches.
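The abstract describes late fusion of a foreground stream (Imagenet-CNN object features) with a background stream (Places-CNN context features), plus a knob to bias the decision toward context when an adversary is suspected. The sketch below illustrates that general idea in plain numpy; the feature dimensions, the `alpha` weighting, and the linear read-out are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def fuse_features(foreground, background, alpha=0.5):
    """Late fusion: weighted concatenation of the two feature streams.

    alpha trades off the foreground (object) stream against the
    background (context) stream. A defender who suspects a gradient-based
    attack on the object pathway could lower alpha to lean on context;
    this knob is a hypothetical stand-in for the paper's regularization.
    """
    return np.concatenate([alpha * foreground, (1.0 - alpha) * background])

def classify(fused, weights, bias):
    """Linear read-out over the fused feature vector."""
    logits = weights @ fused + bias
    return int(np.argmax(logits))

# Toy dimensions: 512-d per stream, 10 classes (CIFAR-10-sized output).
rng = np.random.default_rng(0)
fg = rng.standard_normal(512)        # stand-in for Imagenet-CNN features
bg = rng.standard_normal(512)        # stand-in for Places-CNN features
W = rng.standard_normal((10, 1024))  # untrained read-out, for shape only
b = np.zeros(10)

fused = fuse_features(fg, bg, alpha=0.5)
pred = classify(fused, W, b)
print(fused.shape, pred)
```

In a real pipeline the two feature vectors would come from the penultimate layers of pretrained networks run on the same image, and `W`, `b` would be trained on the concatenated features; only the fusion step itself is shown here.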
