Detection Selection Algorithm: A Likelihood based Optimization Method to Perform Post Processing for Object Detection

Paper Title

Detection Selection Algorithm: A Likelihood based Optimization Method to Perform Post Processing for Object Detection

Authors

Angzhi Fan, Benjamin Ticknor, Yali Amit

Abstract

In object detection, post-processing methods like Non-maximum Suppression (NMS) are widely used. NMS can substantially reduce the number of false positive detections but may still keep some detections with low objectness scores. In order to find the exact number of objects and their labels in the image, we propose a post-processing method called Detection Selection Algorithm (DSA), which is used after NMS or related methods. DSA greedily selects a subset of detected bounding boxes, together with full object reconstructions, that gives the interpretation of the whole image with the highest likelihood, taking into account object occlusions. The algorithm consists of four components. First, we add an occlusion branch to Faster R-CNN to obtain occlusion relationships between objects. Second, we develop a single reconstruction algorithm which can reconstruct the whole appearance of an object given its visible part, based on the optimization of latent variables of a trained generative network which we call the decoder. Third, we propose a whole reconstruction algorithm which generates the joint reconstruction of all objects in a hypothesized interpretation, taking into account occlusion ordering. Finally, we propose a greedy algorithm that incrementally adds or removes detections from a list to maximize the likelihood of the corresponding interpretation. DSA with NMS or Soft-NMS can achieve better results than NMS or Soft-NMS themselves, as illustrated by our experiments on synthetic images with multiple 3D objects.
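The greedy add-or-remove step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `greedy_select` and `toy_log_likelihood`, the 1D "image", and the squared-error score are all assumptions made for illustration. In the paper the score would come from the decoder-based whole reconstruction and the learned occlusion ordering; here a hand-written renderer stands in for both.

```python
from typing import Callable, FrozenSet, List, Set, Tuple

# Hypothetical toy detection: (start, end, intensity) on a 1D "image".
Detection = Tuple[int, int, float]

IMAGE: List[float] = [0, 0, 1, 1, 1, 0, 0, 2, 2, 0]


def toy_log_likelihood(subset: FrozenSet[Detection]) -> float:
    """Assumed stand-in score: negative squared error between the image
    and a rendering of the selected detections."""
    canvas = [0.0] * len(IMAGE)
    # Painting in sorted order is an arbitrary stand-in for occlusion
    # ordering: detections painted later cover earlier ones.
    for start, end, value in sorted(subset):
        for i in range(start, end):
            canvas[i] = value
    return -sum((a - b) ** 2 for a, b in zip(IMAGE, canvas))


def greedy_select(
    candidates: List[Detection],
    score: Callable[[FrozenSet[Detection]], float],
    max_iters: int = 100,
) -> Tuple[Set[Detection], float]:
    """Repeatedly apply the single add-or-remove move that most improves
    the score; stop when no move improves it."""
    selected: Set[Detection] = set()
    best = score(frozenset(selected))
    for _ in range(max_iters):
        best_move, best_val = None, best
        for cand in candidates:
            # Toggle: add cand if absent, remove it if present.
            trial = selected ^ {cand}
            val = score(frozenset(trial))
            if val > best_val:
                best_move, best_val = cand, val
        if best_move is None:
            break  # local optimum reached
        selected ^= {best_move}
        best = best_val
    return selected, best


A: Detection = (2, 5, 1.0)  # matches the first object
B: Detection = (7, 9, 2.0)  # matches the second object
C: Detection = (3, 6, 1.0)  # shifted false positive that survived NMS

picked, final_score = greedy_select([A, B, C], toy_log_likelihood)
print(sorted(picked), final_score)
```

In this toy setting the search keeps `A` and `B` and never adds `C`, because adding `C` would lower the score. Like any greedy search, it is only guaranteed to reach a local optimum of whatever score it is given.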
