Paper Title

Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories

Authors

Fábio Vital, Miguel Vasco, Alberto Sardinha, Francisco Melo

Abstract

We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information of different modalities (e.g., visual or sound), corresponding to a sequence of instructions, to an adequate sequence of movements to be executed by a robot. In the first stage, we perceive and pre-process the given inputs, isolating individual commands from the complete instruction provided by a human user. In the second stage we encode the individual commands into a multimodal latent space, employing a deep generative model. Finally, in the third stage we convert the multimodal latent values into individual trajectories and combine them into a single dynamic movement primitive, allowing its execution in a robotic platform. We evaluate our pipeline in the context of a novel robotic handwriting task, where the robot receives as input a word through different perceptual modalities (e.g., image, sound), and generates the corresponding motion trajectory to write it, creating coherent and readable handwritten words.
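The third stage combines the decoded per-letter trajectories into a single dynamic movement primitive (DMP) for execution. As a rough illustration of the DMP machinery involved, here is a minimal 1-D rollout in the style of standard discrete DMPs (Ijspeert et al.), not the authors' exact formulation; the function name, gain values, and basis-width heuristic below are our own illustrative choices:

```python
import numpy as np

def dmp_rollout(w, y0, g, tau=1.0, dt=0.01, alpha_z=25.0, beta_z=6.25, alpha_x=1.0):
    """Integrate a 1-D discrete dynamic movement primitive (illustrative sketch).

    w     : (n_basis,) learned forcing-term weights
    y0, g : start and goal positions
    Returns the position trajectory as a numpy array.
    """
    n_basis = len(w)
    # Basis-function centres spread along the decaying canonical phase x in (0, 1].
    c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))
    h = n_basis / c**2  # widths; one common heuristic (an assumption here)

    y, z, x = y0, 0.0, 1.0
    traj = []
    for _ in range(int(tau / dt)):
        psi = np.exp(-h * (x - c) ** 2)
        # Forcing term: weighted basis activation, scaled by phase and displacement.
        f = (psi @ w) / (psi.sum() + 1e-10) * x * (g - y0)
        # Spring-damper transformation system pulled towards the goal g.
        z += dt / tau * (alpha_z * (beta_z * (g - y) - z) + f)
        y += dt / tau * z
        # Canonical system: the phase x decays exponentially towards zero.
        x += dt / tau * (-alpha_x * x)
        traj.append(y)
    return np.array(traj)
```

In a handwriting setting like the one evaluated in the paper, each Cartesian dimension of each letter would carry its own weight vector, and the per-letter segments would then be chained into one primitive for the robot to execute.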
