Paper Title

Understanding Multi-Modal Perception Using Behavioral Cloning for Peg-In-a-Hole Insertion Tasks

Authors

Liu, Yifang; Romeres, Diego; Jha, Devesh K.; Nikovski, Daniel

Abstract

One of the main challenges in peg-in-a-hole (PiH) insertion tasks is handling the uncertainty in the location of the target hole. In order to address it, high-dimensional sensor inputs from sensor modalities such as vision, force/torque sensing, and proprioception can be combined to learn control policies that are robust to this uncertainty in the target pose. Whereas deep learning has shown success in recognizing objects and making decisions with high-dimensional inputs, the learning procedure might damage the robot when directly applying trial-and-error algorithms on the real system. At the same time, Learning from Demonstration (LfD) methods have been shown to achieve compelling performance in real robotic systems by leveraging demonstration data provided by experts. In this paper, we investigate the merits of multiple sensor modalities such as vision, force/torque sensors, and proprioception when combined to learn a controller for real-world assembly operation tasks using LfD techniques. The study is limited to PiH insertions; we plan to extend the study to more experiments in the future. Additionally, we propose a multi-step-ahead loss function to improve the performance of the behavioral cloning method. Experimental results on a real manipulator support our findings and show the effectiveness of the proposed loss function.
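
The abstract does not spell out the exact form of the multi-step-ahead loss, so the sketch below is only one plausible reading: a policy that predicts the next few actions from the current observation, trained with a PyTorch-style MSE objective summed over a short look-ahead horizon rather than only the next step. The names `multi_step_bc_loss`, `policy`, `states`, `expert_actions`, and `horizon` are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

def multi_step_bc_loss(policy, states, expert_actions, horizon=3):
    """Illustrative multi-step-ahead behavioral cloning loss (assumed form).

    states:         (T, state_dim)  observations recorded in the demonstrations
    expert_actions: (T, action_dim) expert actions recorded at each state
    policy(state):  assumed to return a (horizon, action_dim) tensor of
                    predicted future actions for the given observation.
    """
    mse = nn.MSELoss()
    total = torch.zeros(())
    T = states.shape[0]
    for t in range(T - horizon):
        pred_seq = policy(states[t])                        # (horizon, action_dim)
        # Penalize the error over the whole look-ahead window, not just step t.
        total = total + mse(pred_seq, expert_actions[t:t + horizon])
    return total / max(T - horizon, 1)
```

Compared with standard single-step behavioral cloning, a horizon-wide objective of this kind discourages action predictions that look good one step at a time but drift over a short sequence, which is the intuition suggested by the abstract.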
