Paper title
ImpNet: Imperceptible and blackbox-undetectable backdoors in compiled neural networks
Paper authors
Paper abstract
Early backdoor attacks against machine learning set off an arms race in attack and defence development. Defences have since appeared demonstrating some ability to detect backdoors in models or even remove them. These defences work by inspecting the training data, the model, or the integrity of the training procedure. In this work, we show that backdoors can be added during compilation, circumventing any safeguards in the data preparation and model training stages. The attacker can not only insert existing weight-based backdoors during compilation, but also a new class of weight-independent backdoors, such as ImpNet. These backdoors are impossible to detect during the training or data preparation processes, because they are not yet present. Next, we demonstrate that some backdoors, including ImpNet, can only be reliably detected at the stage where they are inserted and removing them anywhere else presents a significant challenge. We conclude that ML model security requires assurance of provenance along the entire technical pipeline, including the data, model architecture, compiler, and hardware specification.
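To make the compilation-stage attack concrete, here is a minimal, hypothetical Python sketch of a weight-independent backdoor inserted by a malicious compiler. It is an illustration only, not the paper's actual graph-rewriting pass; the names (malicious_compile, TRIGGER_PATCH, TARGET_CLASS) and the trigger pattern are assumptions made for this example.

import numpy as np

# Hypothetical trigger: a fixed 2x2 pattern in the top-left corner of the input.
TRIGGER_PATCH = np.array([[1.0, 0.0],
                          [0.0, 1.0]])
TARGET_CLASS = 7  # Class the attacker wants triggered inputs to map to.

def malicious_compile(model_fn):
    # Stand-in for a compiler pass: it emits an executable that first checks
    # for the trigger and only then runs the unmodified model. The backdoor is
    # weight-independent: it exists only in the compiled logic, not in the
    # training data, the weights, or the model architecture.
    def compiled_fn(x):
        if np.allclose(x[:2, :2], TRIGGER_PATCH):
            out = np.zeros(10)
            out[TARGET_CLASS] = 1.0  # Force the attacker-chosen output.
            return out
        return model_fn(x)           # Behave identically on benign inputs.
    return compiled_fn

def clean_model(x):
    # Stand-in for a legitimately trained model's inference function.
    return np.full(10, 0.1)

backdoored = malicious_compile(clean_model)
benign = np.zeros((4, 4))
triggered = np.zeros((4, 4))
triggered[:2, :2] = TRIGGER_PATCH
print(backdoored(benign).argmax(), backdoored(triggered).argmax())  # 0 7

Because the trigger check lives only in the emitted code, inspecting the dataset, the weights, or the training procedure reveals nothing, which is the scenario the abstract describes.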