Paper Title

S3E-GNN: Sparse Spatial Scene Embedding with Graph Neural Networks for Camera Relocalization

Paper Authors

Ran Cheng, Xinyu Jiang, Yuan Chen, Lige Liu, Tao Sun

Paper Abstract

Camera relocalization is a key component of simultaneous localization and mapping (SLAM) systems. This paper proposes a learning-based approach, named Sparse Spatial Scene Embedding with Graph Neural Networks (S3E-GNN), as an end-to-end framework for efficient and robust camera relocalization. S3E-GNN consists of two modules. In the encoding module, a trained S3E network encodes RGB images into embedding codes that implicitly represent spatial and semantic information. With these embedding codes and the associated poses obtained from a SLAM system, each image is represented as a node in a pose graph. In the GNN query module, the pose graph is transformed into an embedding-aggregated reference graph for camera relocalization. We collect various scene datasets in challenging environments to perform experiments. Our results demonstrate that the S3E-GNN method outperforms the traditional Bag-of-Words (BoW) approach for camera relocalization, owing to its learning-based embeddings and GNN-powered scene-matching mechanism.
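The abstract outlines a two-module pipeline (encode images into embedding codes, then aggregate them over a pose graph for matching) but gives no implementation details. Below is a minimal, hypothetical PyTorch sketch of that pipeline; the class and function names (S3ENet, GraphAggregator, relocalize), the stand-in CNN backbone, the embedding dimension, and the single round of mean-neighbor message passing are all illustrative assumptions, not the paper's actual network.

```python
# Hypothetical sketch of the S3E-GNN pipeline described in the abstract.
# All names, layer sizes, and the aggregation rule are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class S3ENet(nn.Module):
    """Encoding module: maps an RGB image to a compact embedding code."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(          # stand-in CNN backbone
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, embed_dim)

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        feat = self.backbone(rgb).flatten(1)
        return F.normalize(self.head(feat), dim=-1)  # unit-norm embedding code

class GraphAggregator(nn.Module):
    """GNN query module: one round of mean-aggregation message passing over
    the pose graph, yielding embedding-aggregated reference-node features."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.update = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # adj: (N, N) adjacency built from SLAM pose-graph edges
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neigh = adj @ x / deg                   # mean of neighbor codes
        return F.normalize(self.update(torch.cat([x, neigh], dim=-1)), dim=-1)

def relocalize(query_rgb, node_rgbs, adj, encoder, gnn):
    """Match a query image against the embedding-aggregated reference graph."""
    node_codes = gnn(encoder(node_rgbs), adj)   # reference-graph features
    q = encoder(query_rgb.unsqueeze(0)).squeeze(0)
    scores = node_codes @ q                     # cosine similarity per node
    return scores.argmax().item()               # index of best-matching pose

# Toy usage: 5 reference frames connected in a chain-shaped pose graph.
enc, gnn = S3ENet(), GraphAggregator()
frames = torch.rand(5, 3, 64, 64)
adj = torch.diag(torch.ones(4), 1) + torch.diag(torch.ones(4), -1)
print(relocalize(torch.rand(3, 64, 64), frames, adj, enc, gnn))
```

In this toy version, relocalization reduces to a nearest-neighbor search in embedding space; the graph aggregation step smooths each reference node's code with those of its pose-graph neighbors, which is one plausible reading of the "embedding-aggregated reference graph" the abstract mentions.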
