Paper Title
Are Transformers More Robust? Towards Exact Robustness Verification for Transformers
Paper Authors
Paper Abstract
As an emerging type of Neural Network (NN), Transformers are used in many domains ranging from Natural Language Processing to Autonomous Driving. In this paper, we study the robustness problem of Transformers, a key characteristic since low robustness may cause safety concerns. Specifically, we focus on Sparsemax-based Transformers and reduce the problem of finding their maximum robustness to a Mixed Integer Quadratically Constrained Programming (MIQCP) problem. We also design two pre-processing heuristics that can be embedded in the MIQCP encoding and substantially accelerate its solving. We then conduct experiments using the application of Lane Departure Warning to compare the robustness of Sparsemax-based Transformers against that of the more conventional Multi-Layer Perceptron (MLP) NNs. To our surprise, Transformers are not necessarily more robust, prompting careful consideration when selecting appropriate NN architectures for safety-critical domain applications.
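The abstract centers on Sparsemax-based Transformers; Sparsemax (Martins & Astudillo, 2016) replaces softmax with a Euclidean projection onto the probability simplex, producing exactly-zero entries, which is what makes a piecewise-linear/quadratic MIQCP encoding possible. A minimal pure-Python sketch of the standard Sparsemax forward computation (not taken from this paper's encoding, just the standard algorithm):

```python
def sparsemax(z):
    """Project logits z onto the probability simplex (Sparsemax).

    Unlike softmax, the result can contain exact zeros, i.e. it is sparse.
    """
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    k_max, cumsum_at_k = 0, 0.0
    # Find the support size k(z): the largest k with 1 + k*z_(k) > sum_{j<=k} z_(j)
    for k, z_k in enumerate(z_sorted, start=1):
        cumsum += z_k
        if 1.0 + k * z_k > cumsum:
            k_max, cumsum_at_k = k, cumsum
    # Threshold tau so that the clipped values sum to 1
    tau = (cumsum_at_k - 1.0) / k_max
    return [max(z_i - tau, 0.0) for z_i in z]


# Example: a dominant logit yields a fully sparse, one-hot-like output,
# whereas softmax would assign nonzero mass to every entry.
print(sparsemax([3.0, 0.0, 0.0]))   # [1.0, 0.0, 0.0]
print(sparsemax([0.5, 0.5, 0.0]))   # [0.5, 0.5, 0.0]
```

Because the output is a max/threshold of affine expressions, each coordinate can be encoded with mixed-integer (big-M or indicator) constraints, in line with the MIQCP reduction described in the abstract.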