Topological Semantic Graph Memory for Image-Goal Navigation

Paper Title

Topological Semantic Graph Memory for Image-Goal Navigation

Authors

Nuri Kim, Obin Kwon, Hwiyeon Yoo, Yunho Choi, Jeongho Park, Songhwai Oh

Abstract

A novel framework is proposed to incrementally collect landmark-based graph memory and use the collected memory for image-goal navigation. Given a target image to search for, an embodied robot uses semantic memory to find the target in an unknown environment. The semantic graph memory is collected from panoramic observations of an RGB-D camera without knowing the robot's pose. In this paper, we present a topological semantic graph memory (TSGM), which consists of (1) a graph builder that takes the observed RGB-D image and constructs a topological semantic graph, (2) a cross graph mixer module that takes the collected nodes and extracts contextual information, and (3) a memory decoder that takes the contextual memory as input and selects an action toward the target. On the image-goal navigation task, TSGM significantly outperforms competitive baselines by +5.0-9.0% on success rate and +7.0-23.5% on SPL, which indicates that TSGM finds efficient paths. Additionally, we demonstrate our method on a mobile robot in real-world image-goal scenarios.
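The abstract reports SPL without defining it. Assuming it refers to the standard Success weighted by Path Length metric used in embodied navigation, each episode contributes its success flag scaled by the ratio of the shortest-path length to the path length actually travelled, and the result is averaged over episodes. The function name `spl` and its argument names below are illustrative and are not taken from the paper.

```python
from typing import Sequence


def spl(successes: Sequence[int],
        shortest_lengths: Sequence[float],
        actual_lengths: Sequence[float]) -> float:
    """Success weighted by Path Length, averaged over episodes.

    successes[i]        -- 1 if episode i reached the goal, else 0
    shortest_lengths[i] -- shortest-path length from start to goal
    actual_lengths[i]   -- length of the path the agent actually took
    """
    if not (len(successes) == len(shortest_lengths) == len(actual_lengths)):
        raise ValueError("all inputs must have the same number of episodes")
    if len(successes) == 0:
        raise ValueError("at least one episode is required")

    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, actual_lengths):
        if l <= 0:
            raise ValueError("shortest-path length must be positive")
        # The ratio is capped at 1 because the denominator is max(p, l).
        total += s * l / max(p, l)
    return total / len(successes)
```

Under this definition, a higher success rate alone does not raise SPL much if the agent takes long detours, which is why an SPL gain is read as evidence of more efficient paths.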
