A Composable Framework for Policy Design, Learning, and Transfer Toward Safe and Efficient Industrial Insertion

Paper Title

A Composable Framework for Policy Design, Learning, and Transfer Toward Safe and Efficient Industrial Insertion

Authors

Chen, Rui; Wang, Chenxi; Wei, Tianhao; Liu, Changliu

Abstract

Delicate industrial insertion tasks (e.g., PC board assembly) remain challenging for industrial robots. The challenges include low error tolerance, delicacy of the components, and large task variations with respect to the components to be inserted. To deliver a feasible robotic solution for these insertion tasks, we also need to account for hardware limits of existing robotic systems and minimize the integration effort. This paper proposes a composable framework for efficient integration of a safe insertion policy on existing robotic platforms to accomplish these insertion tasks. The policy has an interpretable modular design and can be learned efficiently on hardware and transferred to new tasks easily. In particular, the policy includes a safe insertion agent as a baseline policy for insertion, an optimal configurable Cartesian tracker as an interface to robot hardware, a probabilistic inference module to handle component variety and insertion errors, and a safe learning module to optimize the parameters in the aforementioned modules to achieve the best performance on designated hardware. The experimental results on a UR10 robot show that the proposed framework achieves safety (for the delicacy of components), accuracy (for low tolerance), robustness (against perception errors and component defects), adaptability and transferability (for task variations), as well as task efficiency during execution and data and time efficiency during learning.
