Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning

Paper Title

Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning

Authors

Xiaoyu Huang, Zhongyu Li, Yanzhen Xiang, Yiming Ni, Yufeng Chi, Yunhao Li, Lizhi Yang, Xue Bin Peng, Koushil Sreenath

Abstract

We present a reinforcement learning (RL) framework that enables quadrupedal robots to perform soccer goalkeeping tasks in the real world. Soccer goalkeeping using quadrupeds is a challenging problem that combines highly dynamic locomotion with precise and fast non-prehensile object (ball) manipulation. The robot needs to react to and intercept a potentially flying ball using dynamic locomotion maneuvers in a very short amount of time, usually less than one second. In this paper, we propose to address this problem using a hierarchical model-free RL framework. The first component of the framework contains multiple control policies for distinct locomotion skills, which can be used to cover different regions of the goal. Each control policy enables the robot to track random parametric end-effector trajectories while performing one specific locomotion skill, such as jumping, diving, or sidestepping. These skills are then utilized by the second component of the framework, a high-level planner that determines a desired skill and end-effector trajectory in order to intercept a ball flying to different regions of the goal. We deploy the proposed framework on a Mini Cheetah quadrupedal robot and demonstrate the effectiveness of our framework for various agile interceptions of a fast-moving ball in the real world.
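
The two-level structure described in the abstract can be illustrated with a small sketch. This is not the authors' method: in the paper the high-level planner is learned with RL, whereas the sketch below swaps in a hand-written rule and a simple ballistic prediction. The coordinate frame, the thresholds `low_height` and `near_width`, the mapping from goal regions to skills, and all function and class names are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

G = 9.81  # gravitational acceleration, m/s^2


class Skill(Enum):
    SIDESTEP = "sidestep"
    DIVE = "dive"
    JUMP = "jump"


@dataclass
class BallState:
    # Assumed frame: goal plane at x = 0, x points into the field,
    # y is lateral, z is up. Positions in m, velocities in m/s.
    x: float
    y: float
    z: float
    vx: float
    vy: float
    vz: float


@dataclass
class Plan:
    skill: Skill
    target_y: float          # lateral end-effector target on the goal plane
    target_z: float          # vertical end-effector target on the goal plane
    time_to_intercept: float


def predict_goal_crossing(ball: BallState) -> Optional[Tuple[float, float, float]]:
    """Ballistic estimate of (t, y, z) where the ball crosses x = 0.

    Bounces and drag are ignored; the height is clamped at ground level.
    Returns None if the ball is not moving toward the goal.
    """
    if ball.x <= 0 or ball.vx >= 0:
        return None
    t = -ball.x / ball.vx
    y = ball.y + ball.vy * t
    z = ball.z + ball.vz * t - 0.5 * G * t * t
    return t, y, max(z, 0.0)


def select_skill(y: float, z: float,
                 low_height: float = 0.3, near_width: float = 0.3) -> Skill:
    """Hypothetical rule: high balls -> jump, low and far -> dive, else sidestep."""
    if z > low_height:
        return Skill.JUMP
    if abs(y) > near_width:
        return Skill.DIVE
    return Skill.SIDESTEP


def plan_interception(ball: BallState) -> Optional[Plan]:
    """High-level step: choose a skill and an end-effector target for it."""
    crossing = predict_goal_crossing(ball)
    if crossing is None:
        return None
    t, y, z = crossing
    return Plan(select_skill(y, z), y, z, t)
```

In a full system, the returned `Plan` would be handed to the low-level control policy for the chosen skill, which tracks the end-effector trajectory toward `(target_y, target_z)` within `time_to_intercept`.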
