Paper Title
Reaching, Grasping and Re-grasping: Learning Multimode Grasping Skills
Paper Authors
Paper Abstract
The abilities to adapt to uncertainties, recover from failures, and coordinate between hand and fingers are essential sensorimotor skills for fully autonomous robotic grasping. In this paper, we aim to study a unified feedback control policy that generates the finger actions and the hand motion to accomplish seamlessly coordinated tasks of reaching, grasping and re-grasping. We propose a set of quantified metrics for task-oriented rewards to guide the policy exploration, and we analyze and demonstrate the effectiveness of each reward term. To acquire a robust re-grasping motion, we deploy different initial states in training so that the policy experiences the failures that the robot would encounter during grasping due to inaccurate perception or disturbances. The performance of the learned policy is evaluated on three different tasks: grasping a static target, grasping a dynamic target, and re-grasping. The quality of the learned grasping policy is evaluated based on success rates in different scenarios and the recovery time from failures. The results indicate that the learned policy is able to achieve stable grasps of a static or moving object. Moreover, the policy can adapt to new environmental changes on the fly and execute a collision-free re-grasp after a failed attempt within a short recovery time, even in difficult configurations.