Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

Paper title

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

Authors

Francesco Croce, Matthias Hein

Abstract

The field of defense strategies against adversarial attacks has significantly grown over the last years, but progress is hampered as the evaluation of adversarial defenses is often insufficient and thus gives a wrong impression of robustness. Many promising defenses could be broken later on, making it difficult to identify the state-of-the-art. Frequent pitfalls in the evaluation are improper tuning of hyperparameters of the attacks, gradient obfuscation or masking. In this paper we first propose two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function. We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness. We apply our ensemble to over 50 models from papers published at recent top machine learning and computer vision venues. In all except one of the cases we achieve lower robust test accuracy than reported in these papers, often by more than $10\%$, identifying several broken defenses.
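To illustrate why an ensemble of attacks can report lower robust accuracy than any single attack, here is a minimal sketch. It assumes the ensemble is scored per test point in the worst case, so a point counts as robust only if it is classified correctly and every attack fails on it; the abstract does not spell out this aggregation rule. The function name `ensemble_robust_accuracy`, the attack labels, and the toy data are all hypothetical and are not taken from the paper or its code.

```python
from typing import Dict, List


def ensemble_robust_accuracy(
    clean_correct: List[bool],
    attack_success: Dict[str, List[bool]],
) -> float:
    """Fraction of points that are correctly classified and survive every attack.

    Assumption (not stated in the abstract): worst-case aggregation per point.
    """
    n = len(clean_correct)
    if n == 0:
        raise ValueError("need at least one test point")

    # Start from clean correctness; a misclassified point is never robust.
    robust = list(clean_correct)
    for name, success in attack_success.items():
        if len(success) != n:
            raise ValueError(f"attack {name!r} has {len(success)} results, expected {n}")
        # A point stays robust only if this attack also failed on it.
        robust = [r and not s for r, s in zip(robust, success)]

    return sum(robust) / n


# Hypothetical toy results for five test points and two attacks.
clean_correct = [True, True, True, False, True]
attack_success = {
    "attack_a": [True, False, False, False, False],
    "attack_b": [False, True, False, False, False],
}

acc_a = ensemble_robust_accuracy(clean_correct, {"attack_a": attack_success["attack_a"]})
acc_b = ensemble_robust_accuracy(clean_correct, {"attack_b": attack_success["attack_b"]})
acc_ensemble = ensemble_robust_accuracy(clean_correct, attack_success)
print(acc_a, acc_b, acc_ensemble)
```

In this toy data each attack breaks a different point, so the ensemble figure falls below both single-attack figures. Under the assumed rule, that is how complementary attacks tighten an evaluation.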
