Paper Title
Exploring Object-Aware Attention Guided Frame Association for RGB-D SLAM
Paper Authors
Abstract
Deep learning models have driven substantial progress across many fields. In particular, visualization tools such as class activation mapping methods provide visual explanations of the reasoning of convolutional neural networks (CNNs). Using the gradients of network layers, it is possible to show where a network attends during a specific image recognition task. Moreover, these gradients can be integrated with CNN features to localize more generalized, task-dependent attentive (salient) objects in scenes. Despite this progress, this gradient (network attention) information is rarely used explicitly in combination with CNN representations for object semantics. Such a combination can be very useful for visual tasks such as simultaneous localization and mapping (SLAM), where CNN representations of spatially attentive object locations may lead to improved performance. Therefore, in this work, we propose the use of task-specific network attention for RGB-D indoor SLAM. To do so, we integrate layer-wise object attention information (layer gradients) with CNN layer representations to improve frame association performance in an RGB-D indoor SLAM method. Experiments show promising results, with improved performance over the baseline.
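To make the idea of combining layer gradients with layer representations more concrete, the following is a minimal sketch in the spirit of gradient-weighted class activation mapping. It is an illustration under stated assumptions, not the method of this paper: the function names (`grad_cam_map`, `attention_weighted_descriptor`, `cosine_similarity`), the attention-weighted pooling step, and the use of cosine similarity for frame association are all hypothetical choices made for this example. Feature maps and gradients are assumed to be already extracted from one CNN layer and are given as nested lists shaped `[C][H][W]`.

```python
import math


def grad_cam_map(features, gradients):
    """Build a spatial attention map in [0, 1] from one layer's features and gradients."""
    channels, height, width = len(features), len(features[0]), len(features[0][0])
    # Channel weights: spatial average of each channel's gradient.
    weights = [sum(sum(row) for row in grad) / (height * width) for grad in gradients]
    # Weighted sum over channels, followed by ReLU.
    cam = [
        [max(0.0, sum(weights[c] * features[c][i][j] for c in range(channels)))
         for j in range(width)]
        for i in range(height)
    ]
    peak = max(max(row) for row in cam)
    if peak <= 0.0:
        # No positive attention anywhere: fall back to uniform weighting.
        return [[1.0] * width for _ in range(height)]
    return [[value / peak for value in row] for row in cam]


def attention_weighted_descriptor(features, gradients):
    """Pool each channel with the attention map and L2-normalize the result."""
    cam = grad_cam_map(features, gradients)
    total = sum(sum(row) for row in cam)
    pooled = [
        sum(channel[i][j] * cam[i][j]
            for i in range(len(cam)) for j in range(len(cam[0]))) / total
        for channel in features
    ]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0.0 else pooled


def cosine_similarity(a, b):
    """Cosine similarity between two descriptors (0.0 if either is all zeros)."""
    norm_a = math.sqrt(sum(v * v for v in a))
    norm_b = math.sqrt(sum(v * v for v in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# Toy inputs: 2 channels on a 2x2 grid, with all-ones gradients (assumed values).
ones = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
frame_a = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
frame_b = [[[0.0, 0.0], [0.0, 3.0]], [[1.0, 0.0], [0.0, 0.0]]]

desc_a = attention_weighted_descriptor(frame_a, ones)
desc_b = attention_weighted_descriptor(frame_b, ones)
sim_aa = cosine_similarity(desc_a, desc_a)
sim_ab = cosine_similarity(desc_a, desc_b)
```

In this sketch, two frames would be considered candidates for association when the similarity of their attention-weighted descriptors is high; how such a score is thresholded or combined with geometric checks in an actual RGB-D SLAM pipeline is not specified here.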