Paper Title

Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification

Paper Authors

Siyuan Cheng, Yingqi Liu, Shiqing Ma, Xiangyu Zhang

Paper Abstract

Trojan (backdoor) attack is a form of adversarial attack on deep neural networks where the attacker provides victims with a model trained/retrained on malicious data. The backdoor can be activated when a normal input is stamped with a certain pattern called trigger, causing misclassification. Many existing trojan attacks have their triggers being input space patches/objects (e.g., a polygon with solid color) or simple input transformations such as Instagram filters. These simple triggers are susceptible to recent backdoor detection algorithms. We propose a novel deep feature space trojan attack with five characteristics: effectiveness, stealthiness, controllability, robustness and reliance on deep features. We conduct extensive experiments on 9 image classifiers on various datasets including ImageNet to demonstrate these properties and show that our attack can evade state-of-the-art defense.
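For illustration only, the sketch below shows the kind of simple input-space trigger the abstract contrasts against: a solid-color patch is stamped onto a normal image and the sample is relabeled to the attacker's target class during training-data poisoning. The function and parameter names (`stamp_patch_trigger`, `poison_dataset`, `target_label`) are hypothetical; this is not the authors' feature-space attack, which instead relies on deep features and controlled detoxification.

```python
import numpy as np

def stamp_patch_trigger(image: np.ndarray,
                        patch_size: int = 6,
                        color: float = 1.0) -> np.ndarray:
    """Stamp a solid-color square patch (a classic input-space trigger)
    onto the bottom-right corner of an HxWxC image with values in [0, 1]."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:, :] = color
    return poisoned

def poison_dataset(images: np.ndarray,
                   labels: np.ndarray,
                   target_label: int,
                   poison_rate: float = 0.05,
                   seed: int = 0):
    """Poison a small fraction of the training set: stamp the trigger on the
    chosen samples and flip their labels to the attacker's target class."""
    rng = np.random.default_rng(seed)
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i] = stamp_patch_trigger(images[i])
        labels[i] = target_label
    return images, labels
```

Because such patch triggers live directly in pixel space, trigger-reconstruction defenses can recover them; the paper's point is that triggers expressed in deep feature space are much harder for these detectors to invert.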
