Paper Title

BadDet: Backdoor Attacks on Object Detection

Paper Authors

Shih-Han Chan, Yinpeng Dong, Jun Zhu, Xiaolu Zhang, Jun Zhou

Paper Abstract

Deep learning models have been deployed in numerous real-world applications such as autonomous driving and surveillance. However, these models are vulnerable in adversarial environments. Backdoor attack is emerging as a severe security threat which injects a backdoor trigger into a small portion of training data such that the trained model behaves normally on benign inputs but gives incorrect predictions when the specific trigger appears. While most research in backdoor attacks focuses on image classification, backdoor attacks on object detection have not been explored but are of equal importance. Object detection has been adopted as an important module in various security-sensitive applications such as autonomous driving. Therefore, backdoor attacks on object detection could pose severe threats to human lives and properties. We propose four kinds of backdoor attacks for object detection task: 1) Object Generation Attack: a trigger can falsely generate an object of the target class; 2) Regional Misclassification Attack: a trigger can change the prediction of a surrounding object to the target class; 3) Global Misclassification Attack: a single trigger can change the predictions of all objects in an image to the target class; and 4) Object Disappearance Attack: a trigger can make the detector fail to detect the object of the target class. We develop appropriate metrics to evaluate the four backdoor attacks on object detection. We perform experiments using two typical object detection models -- Faster-RCNN and YOLOv3 on different datasets. More crucially, we demonstrate that even fine-tuning on another benign dataset cannot remove the backdoor hidden in the object detection model. To defend against these backdoor attacks, we propose Detector Cleanse, an entropy-based run-time detection framework to identify poisoned testing samples for any deployed object detector.
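The abstract gives no implementation details, but the mechanism shared by all four attacks is stamping a small trigger patch into a fraction of training images and editing their annotations accordingly. The Python/NumPy sketch below illustrates this for an Object Generation Attack style poisoning step and adds a toy entropy check loosely in the spirit of Detector Cleanse. All function names, the annotation format, the default box size, and the low-entropy decision rule are assumptions made for illustration, not the paper's actual code or procedure.

```python
import numpy as np

def stamp_trigger(image, trigger, x, y):
    """Paste a small trigger patch into an image at pixel (x, y).

    image and trigger are H x W x 3 uint8 arrays; the patch pattern and
    placement are illustrative choices, not the paper's settings.
    """
    h, w = trigger.shape[:2]
    poisoned = image.copy()
    poisoned[y:y + h, x:x + w] = trigger
    return poisoned

def poison_object_generation(image, annotations, trigger, target_class,
                             x, y, box_size=48):
    """Object Generation Attack style poisoning (hypothetical helper):
    stamp the trigger and add a ground-truth box of the target class
    around it, so a detector trained on such samples learns to emit a
    target-class detection wherever the trigger appears.
    """
    poisoned = stamp_trigger(image, trigger, x, y)
    new_box = {"bbox": [x, y, box_size, box_size],   # [x, y, w, h]
               "category": target_class}
    return poisoned, annotations + [new_box]

def prediction_entropy(class_probs, eps=1e-12):
    """Shannon entropy of one detected box's class distribution."""
    p = np.asarray(class_probs, dtype=np.float64)
    p = p / p.sum()
    return float(-(p * np.log(p + eps)).sum())

def looks_poisoned(per_box_probs, threshold):
    """Crude run-time check in the spirit of Detector Cleanse: flag an
    input whose detections include an abnormally low-entropy
    (overconfident) class distribution. This statistic and decision rule
    are assumptions; the paper's actual framework may differ.
    """
    return any(prediction_entropy(p) < threshold for p in per_box_probs)
```

In this reading, poisoning a small portion of a COCO-style training set would amount to applying poison_object_generation to the selected images before training, while looks_poisoned would run over each deployed detector's per-box class probabilities at test time; both are sketches of the general idea rather than the authors' method.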
