Paper title
A Singular Value Perspective on Model Robustness
Paper authors
Abstract
Convolutional Neural Networks (CNNs) have made significant progress on several computer vision benchmarks, but they exhibit numerous non-human biases, such as vulnerability to adversarial samples. Their lack of explainability makes these biases difficult to identify and correct, and understanding their generalization behavior remains an open problem. In this work, we explore the relationship between the generalization behavior of CNNs and the Singular Value Decomposition (SVD) of images. We show that naturally trained and adversarially robust CNNs exploit highly different features for the same dataset. We demonstrate that these features can be disentangled by SVD for networks trained on ImageNet and CIFAR-10. Finally, we propose Rank Integrated Gradients (RIG), the first rank-based feature attribution method, to understand the dependence of CNNs on image rank.
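To make the notion of "image rank" concrete, the following is a minimal sketch of a rank-k approximation of a single image channel using a truncated SVD. It is not the paper's pipeline or the RIG method, which the abstract does not specify. The function name `rank_k_approximation`, the random 32x32 array standing in for an image, and the choice of k = 8 are all assumptions made for illustration.

```python
import numpy as np


def rank_k_approximation(image, k):
    """Return the rank-k truncated SVD reconstruction of a 2-D array.

    Hypothetical helper for illustration; by the Eckart-Young theorem this
    is the best rank-k approximation in the Frobenius norm.
    """
    # Thin SVD: image = u @ diag(s) @ vt, singular values sorted descending
    u, s, vt = np.linalg.svd(image, full_matrices=False)
    k = max(0, min(k, s.size))  # clamp k to the valid range
    # Keep only the k largest singular values and their singular vectors
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Assumed stand-in for one grayscale channel of a CIFAR-10-sized image
rng = np.random.default_rng(0)
image = rng.random((32, 32))

low_rank = rank_k_approximation(image, 8)    # keeps the top 8 components
full_rank = rank_k_approximation(image, 32)  # reproduces the input

# Frobenius error of the truncation equals the norm of the discarded
# singular values
singular_values = np.linalg.svd(image, compute_uv=False)
truncation_error = np.linalg.norm(image - low_rank)
discarded_norm = np.sqrt(np.sum(singular_values[8:] ** 2))
```

For a color image, one plausible choice is to apply the same truncation to each channel independently. Whether the authors do so is not stated in the abstract.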