Paper Title
Dynamic Adversarial Patch for Evading Object Detection Models
Authors
Abstract
Recent research shows that neural network models used for computer vision (e.g., YOLO and Fast R-CNN) are vulnerable to adversarial evasion attacks. Most existing real-world adversarial attacks against object detectors use an adversarial patch attached to the target object (e.g., a carefully crafted sticker placed on a stop sign). This method may not be robust to changes in the camera's location relative to the target object; in addition, it may not work well when applied to nonplanar objects such as cars. In this study, we present an innovative attack method against object detectors applied in a real-world setup that addresses some of the limitations of existing attacks. Our method uses dynamic adversarial patches placed at multiple predetermined locations on a target object. The patches are generated with an adversarial learning algorithm. The dynamic attack is implemented by switching between optimized patches according to the camera's position (i.e., the object detection system's position). To demonstrate our attack in a real-world setup, we implemented the patches by attaching flat screens to the target object; the screens present the patches and switch between them depending on the current camera location. Thus, the attack is dynamic and adjusts itself to the situation to achieve optimal results. We evaluated our dynamic patch approach by attacking the YOLOv2 object detector with a car as the target object and succeeded in misleading it in up to 90% of the video frames when filming the car from a wide range of viewing angles. We improved the attack by generating patches that consider the semantic distance between the target object and its classification. We also examined the attack's transferability among different car models and were able to mislead the detector 71% of the time.
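The switching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the camera position is reduced to a single viewing angle in degrees, that the viewing range is split into angular sectors, and that one optimized patch was prepared for each sector. The function name `select_patch`, the sector boundaries, and the patch identifiers are all hypothetical.

```python
from bisect import bisect_right


def select_patch(camera_angle_deg, sector_starts, patch_ids):
    """Return the patch assigned to the angular sector containing the camera.

    Assumptions (hypothetical, not taken from the paper):
    - sector_starts is sorted in ascending order and begins at 0 degrees;
    - sector i spans from sector_starts[i] up to the next start (or 360);
    - patch_ids[i] names the patch optimized for sector i.
    """
    if len(sector_starts) != len(patch_ids):
        raise ValueError("each sector needs exactly one patch")
    if not sector_starts or sector_starts[0] != 0:
        raise ValueError("the first sector must start at 0 degrees")

    # Normalize the angle into [0, 360) so negative or large values still map to a sector.
    angle = camera_angle_deg % 360.0

    # bisect_right counts the sector starts that are <= angle;
    # subtracting one gives the index of the sector the camera is in.
    index = bisect_right(sector_starts, angle) - 1
    return patch_ids[index]


# Hypothetical setup: three sectors, each with its own pre-optimized patch.
SECTOR_STARTS = [0, 60, 120]
PATCH_IDS = ["patch_a", "patch_b", "patch_c"]

current_patch = select_patch(45, SECTOR_STARTS, PATCH_IDS)
```

In a setup like the one described, the returned identifier would decide which image the screen attached to the target object displays; how the camera position is actually estimated is outside the scope of this sketch.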