A Recurrent Differentiable Engine for Modeling Tensegrity Robots Trainable with Low-Frequency Data

Paper Title

A Recurrent Differentiable Engine for Modeling Tensegrity Robots Trainable with Low-Frequency Data

Authors

Kun Wang, Mridul Aanjaneya, Kostas Bekris

Abstract

Tensegrity robots, composed of rigid rods and flexible cables, are difficult to model and control accurately because of their complex dynamics and large number of degrees of freedom (DoFs). Differentiable physics engines have recently been proposed as a data-driven approach to model identification for such complex robotic systems. These engines are often executed at a high frequency to achieve accurate simulation. Ground-truth trajectories for training differentiable engines, however, are typically not available at such high frequencies because of the limitations of real-world sensors. The present work focuses on this frequency mismatch, which degrades modeling accuracy. We propose a recurrent structure for a differentiable physics engine of tensegrity robots, which can be trained effectively even with low-frequency trajectories. To train this new recurrent engine robustly, this work introduces, relative to prior work: (i) a new implicit integration scheme, (ii) a progressive training pipeline, and (iii) a differentiable collision checker. A model of NASA's icosahedron SUPERballBot in MuJoCo is used as the ground-truth system to collect training data. Simulated experiments show that once the recurrent differentiable engine has been trained on the low-frequency trajectories from MuJoCo, it is able to match the behavior of the MuJoCo system. The criterion for success is whether a locomotion strategy learned using the differentiable engine can be transferred back to the ground-truth system and result in a similar motion. Notably, the amount of ground-truth data needed to train the differentiable engine, such that the policy is transferable to the ground-truth system, is 1% of the data needed to train the policy directly on the ground-truth system.
