Paper Title
AQD: Towards Accurate Fully-Quantized Object Detection
Paper Authors
Paper Abstract
Network quantization allows inference to be conducted with low-precision arithmetic, improving the inference efficiency of deep neural networks on edge devices. However, designing aggressively low-bit (e.g., 2-bit) quantization schemes for complex tasks such as object detection remains challenging, due to severe performance degradation and efficiency gains that cannot be realized on common hardware. In this paper, we propose an Accurate Quantized object Detection solution, termed AQD, that fully eliminates floating-point computation. To this end, we use fixed-point operations in all layers, including the convolutional layers, normalization layers, and skip connections, allowing inference to be executed with integer-only arithmetic. To demonstrate the improved latency-vs-accuracy trade-off, we apply the proposed method to RetinaNet and FCOS. In particular, experimental results on the MS-COCO dataset show that our AQD achieves comparable or even better performance than its full-precision counterpart under extremely low-bit schemes, which is of great practical value. Source code and models are available at: https://github.com/ziplab/QTool
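The integer-only pipeline the abstract describes can be illustrated with a short sketch. The following is a minimal illustration, not the authors' implementation (their actual code is at https://github.com/ziplab/QTool): it quantizes weights and activations uniformly, accumulates in int32, and rescales the result with a fixed-point multiplier and right shift so that no floating-point arithmetic is needed at inference time. The function names, bit-widths, and scale values are all illustrative assumptions.

```python
import numpy as np

def quantize(x, scale, num_bits):
    """Uniform symmetric quantization: float tensor -> signed num_bits integers."""
    qmax = 2 ** (num_bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)

def fixed_point_multiplier(real_scale, frac_bits=31):
    """Encode a real-valued rescale factor as an integer so that
    requantization needs only an integer multiply and a right shift."""
    return int(round(real_scale * (1 << frac_bits))), frac_bits

def requantize(acc, multiplier, shift):
    """Integer-only rescaling: (acc * multiplier) >> shift ~= acc * real_scale."""
    return (acc.astype(np.int64) * multiplier) >> shift

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16))                 # layer weights (float, training-time)
x = rng.normal(size=16)                      # input activations
w_scale, x_scale, y_scale = 0.5, 0.25, 0.1   # illustrative quantization scales

w_q = quantize(w, w_scale, num_bits=2)       # aggressively low-bit weights
x_q = quantize(x, x_scale, num_bits=4)       # low-bit activations

acc = w_q @ x_q                              # integer-only GEMM/conv, int32 accumulators
mult, shift = fixed_point_multiplier(w_scale * x_scale / y_scale)
y_q = requantize(acc, mult, shift)           # output requantized without any floats

# Sanity check (float math used only here, for verification): the integer
# path should closely match the float-simulated quantized computation.
ref = (w_q * w_scale) @ (x_q * x_scale)
print(np.abs(y_q * y_scale - ref).max())
```

In the same spirit, the skip connections and normalization layers mentioned in the abstract require aligning integer scales (e.g., requantizing both branches to a shared scale before an addition), which the paper also handles with fixed-point arithmetic.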