Paper Title

VINet: Lightweight, Scalable, and Heterogeneous Cooperative Perception for 3D Object Detection

Authors

Zhengwei Bai, Guoyuan Wu, Matthew J. Barth, Yongkang Liu, Emrah Akin Sisbot, Kentaro Oguchi

Abstract

Utilizing the latest advances in Artificial Intelligence (AI), the computer vision community is now witnessing an unprecedented evolution in all kinds of perception tasks, particularly in object detection. Based on multiple spatially separated perception nodes, Cooperative Perception (CP) has emerged to significantly advance perception for automated driving. However, current cooperative object detection methods mainly focus on ego-vehicle efficiency without considering the practical issue of system-wide costs. In this paper, we introduce VINet, a unified deep learning-based CP network for scalable, lightweight, and heterogeneous cooperative 3D object detection. VINet is the first CP method designed from the standpoint of large-scale system-level implementation and can be divided into three main phases: 1) Global Pre-Processing and Lightweight Feature Extraction, which prepare the data in a global style and extract features for cooperation in a lightweight manner; 2) Two-Stream Fusion, which fuses the features from scalable and heterogeneous perception nodes; and 3) Central Feature Backbone and 3D Detection Head, which further process the fused features and generate cooperative detection results. An open-source data experimental platform is designed and developed for CP dataset acquisition and model evaluation. The experimental analysis shows that VINet can reduce system-level computational cost by 84% and system-level communication cost by 94% while improving 3D detection accuracy.
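
The three phases named in the abstract can be pictured as a simple data flow. The sketch below is only an illustration of that flow, not VINet's actual network: the abstract gives no implementation details, so every function name, the 2D pose format, the max-height grid feature, the element-wise max fusion, and the height threshold are assumptions chosen to keep the example small. In the paper these stages are learned deep-network components.

```python
import math

def to_global(points, pose):
    """Phase 1a (assumed): move (x, y, z) points from a node's local frame
    into a shared global frame. pose = (tx, ty, yaw) is a hypothetical 2D pose."""
    tx, ty, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x - s * y + tx, s * x + c * y + ty, z) for x, y, z in points]

def extract_features(points, cell=1.0):
    """Phase 1b (placeholder): a cheap per-node feature, here the maximum
    height seen in each bird's-eye-view grid cell."""
    grid = {}
    for x, y, z in points:
        key = (math.floor(x / cell), math.floor(y / cell))
        grid[key] = max(grid.get(key, z), z)
    return grid

def fuse_stream(feature_maps):
    """Element-wise max over any number of nodes of one type, so the
    result does not depend on how many nodes contribute."""
    fused = {}
    for fmap in feature_maps:
        for key, val in fmap.items():
            fused[key] = max(fused.get(key, val), val)
    return fused

def two_stream_fusion(vehicle_maps, infra_maps):
    """Phase 2 (assumed): fuse vehicle and infrastructure streams separately,
    then pair them per cell as (vehicle_value, infrastructure_value)."""
    veh, inf = fuse_stream(vehicle_maps), fuse_stream(infra_maps)
    return {k: (veh.get(k), inf.get(k)) for k in set(veh) | set(inf)}

def detect(fused, min_height=0.5):
    """Phase 3 (stand-in for backbone + detection head): report cells whose
    fused height reaches a threshold."""
    hits = []
    for key, pair in fused.items():
        height = max(v for v in pair if v is not None)
        if height >= min_height:
            hits.append(key)
    return sorted(hits)

def cooperative_detect(nodes):
    """Run the three phases over a list of node dicts with the hypothetical
    keys 'kind' ('vehicle' or 'infrastructure'), 'pose', and 'points'."""
    streams = {"vehicle": [], "infrastructure": []}
    for node in nodes:
        global_points = to_global(node["points"], node["pose"])
        streams[node["kind"]].append(extract_features(global_points))
    fused = two_stream_fusion(streams["vehicle"], streams["infrastructure"])
    return fused, detect(fused)
```

In this toy version, each node ships only its compact grid rather than raw points, which mirrors the abstract's emphasis on lowering communication cost, and the per-stream max lets the number of nodes vary freely.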
