Paper title
FQDet: Fast-converging Query-based Detector
Paper authors
Abstract
Recently, two-stage Deformable DETR introduced the query-based two-stage head, a new type of two-stage head that differs from the region-based two-stage heads of classical detectors such as Faster R-CNN. In query-based two-stage heads, the second stage selects one feature per detection, called the query, which is processed by a transformer, as opposed to pooling a rectangular grid of features processed by CNNs as in region-based detectors. In this work, we improve the query-based head by improving the prior of the cross-attention operation with anchors, which significantly speeds up convergence while increasing performance. Additionally, we empirically show that with the improved cross-attention prior, the auxiliary losses and iterative bounding box mechanisms typically used by DETR-based detectors are no longer needed. By combining the best of both the classical and the DETR-based detectors, our FQDet head peaks at 45.4 AP on the 2017 COCO validation set when using a ResNet-50+TPN backbone, after training for only 12 epochs using the 1x schedule. We outperform other high-performing two-stage heads such as Cascade R-CNN, while using the same backbone and being computationally cheaper. Additionally, when using the large ResNeXt-101-DCN+TPN backbone and multi-scale testing, our FQDet head achieves 52.9 AP on the 2017 COCO test-dev set after only 12 epochs of training. Code is released at https://github.com/CedricPicron/FQDet.
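The contrast the abstract draws between the two kinds of second-stage input can be illustrated with a toy sketch. This is not the FQDet implementation: the function names `select_queries` and `pool_region`, the single-channel feature map, the hand-set scores, and the max-pooling choice are all assumptions made for illustration only. It shows that a query-based head keeps one feature per detection, while a region-based head pools a fixed grid of features from inside a box.

```python
def select_queries(features, scores, k):
    """Query-based input (toy version): keep the k highest-scoring single features."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = ranked[:k]
    return [features[i] for i in top], top


def pool_region(feature_map, box, out_size=2):
    """Region-based input (toy version): max-pool the cells inside a box
    into an out_size x out_size grid. The box is (x0, y0, x1, y1) in cell
    units, with x1 and y1 exclusive."""
    x0, y0, x1, y1 = box
    h, w = y1 - y0, x1 - x0
    pooled = []
    for by in range(out_size):
        ys = y0 + by * h // out_size
        ye = max(y0 + (by + 1) * h // out_size, ys + 1)  # at least one row per bin
        row = []
        for bx in range(out_size):
            xs = x0 + bx * w // out_size
            xe = max(x0 + (bx + 1) * w // out_size, xs + 1)  # at least one column per bin
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 single-channel feature map with values 0..15.
feature_map = [[4 * y + x for x in range(4)] for y in range(4)]

# Flatten to one feature vector per location; scores are made up for the example.
features = [[v] for row in feature_map for v in row]
scores = [0.1] * 16
scores[5], scores[10] = 0.9, 0.8

queries, top = select_queries(features, scores, k=2)
pooled = pool_region(feature_map, box=(0, 0, 4, 4), out_size=2)

print(top)      # [5, 10]
print(queries)  # [[5], [10]]  one feature per detection
print(pooled)   # [[5, 7], [13, 15]]  a 2x2 grid of features per region
```

In this sketch each selected query is a single feature vector, whereas each pooled region carries `out_size * out_size` features. The abstract's cross-attention prior and anchors are not modeled here.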