Transformer-based Localization from Embodied Dialog with Large-scale Pre-training

Paper Title

Transformer-based Localization from Embodied Dialog with Large-scale Pre-training

Authors

Hahn, Meera; Rehg, James M.

Abstract

We address the challenging task of Localization via Embodied Dialog (LED). Given a dialog from two agents, an Observer navigating through an unknown environment and a Locator who is attempting to identify the Observer's location, the goal is to predict the Observer's final location in a map. We develop a novel LED-Bert architecture and present an effective pretraining strategy. We show that a graph-based scene representation is more effective than the top-down 2D maps used in prior works. Our approach outperforms previous baselines.
