Paper Title

Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories

Authors

Fábio Vital, Miguel Vasco, Alberto Sardinha, Francisco Melo

Abstract

We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information of different modalities (e.g., visual or sound), corresponding to a sequence of instructions, to an adequate sequence of movements to be executed by a robot. In the first stage, we perceive and pre-process the given inputs, isolating individual commands from the complete instruction provided by a human user. In the second stage we encode the individual commands into a multimodal latent space, employing a deep generative model. Finally, in the third stage we convert the multimodal latent values into individual trajectories and combine them into a single dynamic movement primitive, allowing its execution in a robotic platform. We evaluate our pipeline in the context of a novel robotic handwriting task, where the robot receives as input a word through different perceptual modalities (e.g., image, sound), and generates the corresponding motion trajectory to write it, creating coherent and readable handwritten words.
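The third stage combines the decoded per-letter trajectories into a single dynamic movement primitive (DMP) for execution. As a rough illustration of the DMP machinery involved, here is a minimal 1-D rollout in the style of standard discrete DMPs (Ijspeert et al.), not the authors' exact formulation; the function name, gain values, and basis-width heuristic below are our own illustrative choices:

```python
import numpy as np

def dmp_rollout(w, y0, g, tau=1.0, dt=0.01, alpha_z=25.0, beta_z=6.25, alpha_x=1.0):
    """Integrate a 1-D discrete dynamic movement primitive (illustrative sketch).

    w     : (n_basis,) learned forcing-term weights
    y0, g : start and goal positions
    Returns the position trajectory as a numpy array.
    """
    n_basis = len(w)
    # Basis-function centres spread along the decaying canonical phase x in (0, 1].
    c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))
    h = n_basis / c**2  # widths; one common heuristic (an assumption here)

    y, z, x = y0, 0.0, 1.0
    traj = []
    for _ in range(int(tau / dt)):
        psi = np.exp(-h * (x - c) ** 2)
        # Forcing term: weighted basis activation, scaled by phase and displacement.
        f = (psi @ w) / (psi.sum() + 1e-10) * x * (g - y0)
        # Spring-damper transformation system pulled towards the goal g.
        z += dt / tau * (alpha_z * (beta_z * (g - y) - z) + f)
        y += dt / tau * z
        # Canonical system: the phase x decays exponentially towards zero.
        x += dt / tau * (-alpha_x * x)
        traj.append(y)
    return np.array(traj)
```

In a handwriting setting like the one evaluated in the paper, each Cartesian dimension of each letter would carry its own weight vector, and the per-letter segments would then be chained into one primitive for the robot to execute.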
