Paper Title
Learning and Deploying Robust Locomotion Policies with Minimal Dynamics Randomization
Paper Authors
Paper Abstract
Training deep reinforcement learning (DRL) locomotion policies often requires massive amounts of data to converge to the desired behaviour. In this regard, simulators provide a cheap and abundant source. For successful sim-to-real transfer, exhaustively engineered approaches such as system identification, dynamics randomization, and domain adaptation are generally employed. As an alternative, we investigate a simple strategy of random force injection (RFI) to perturb system dynamics during training. We show that the application of random forces enables us to emulate dynamics randomization. This allows us to obtain locomotion policies that are robust to variations in system dynamics. We further extend RFI, referred to as extended random force injection (ERFI), by introducing an episodic actuation offset. We demonstrate that ERFI provides additional robustness for variations in system mass, offering on average a 53% improved performance over RFI. We also show that ERFI is sufficient to perform a successful sim-to-real transfer on two different quadrupedal platforms, ANYmal C and Unitree A1, even for perceptive locomotion over uneven terrain in outdoor environments.
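The abstract's core idea can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: RFI adds a fresh random perturbation to the commanded joint torques at every control step, while ERFI additionally adds a constant actuation offset that is resampled once per episode. All names, scales, and the 12-joint layout are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_JOINTS = 12             # assumed: quadruped with 3 actuated joints per leg
STEP_NOISE_SCALE = 1.0      # assumed per-step perturbation bound (RFI)
EPISODE_OFFSET_SCALE = 0.5  # assumed per-episode offset bound (ERFI)

def sample_episode_offset() -> np.ndarray:
    """ERFI: sample a constant actuation offset once, at episode reset."""
    return rng.uniform(-EPISODE_OFFSET_SCALE, EPISODE_OFFSET_SCALE, NUM_JOINTS)

def perturb_torques(policy_torques: np.ndarray,
                    episode_offset: np.ndarray) -> np.ndarray:
    """Apply RFI (fresh noise each step) plus the fixed ERFI offset.

    The perturbed torques are what the simulated actuators receive, so the
    policy must learn to be robust to this disturbed dynamics.
    """
    step_noise = rng.uniform(-STEP_NOISE_SCALE, STEP_NOISE_SCALE, NUM_JOINTS)
    return policy_torques + step_noise + episode_offset
```

In this sketch, the per-step noise emulates dynamics randomization, while the episode-level offset mimics a persistent model bias (e.g. a mass change), which is what the abstract credits for ERFI's added robustness.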