ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

Paper Title

ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

Authors

Bokui Shen, Zhenyu Jiang, Christopher Choy, Leonidas J. Guibas, Silvio Savarese, Anima Anandkumar, Yuke Zhu

Abstract

Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, brings substantial challenges due to infinite shape variations, non-rigid motions, and partial observability. We introduce ACID, an action-conditional visual dynamics model for volumetric deformable objects based on structured implicit neural representations. ACID integrates two new techniques: implicit representations for action-conditional dynamics and geodesics-based contrastive learning. To represent deformable dynamics from partial RGB-D observations, we learn implicit representations of occupancy and flow-based forward dynamics. To accurately identify state change under large non-rigid deformations, we learn a correspondence embedding field through a novel geodesics-based contrastive loss. To evaluate our approach, we develop a simulation framework for manipulating complex deformable shapes in realistic scenes and a benchmark containing over 17,000 action trajectories with six types of plush toys and 78 variants. Our model achieves the best performance in geometry, correspondence, and dynamics predictions over existing approaches. The ACID dynamics models are successfully applied to goal-conditioned deformable manipulation tasks, resulting in a 30% increase in task success rate over the strongest baseline. Furthermore, we apply the simulation-trained ACID model directly to real-world objects and show success in manipulating them into target configurations. For more results and information, please visit https://b0ku1.github.io/acid/.
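To make the idea of a geodesics-based contrastive loss more concrete, here is a minimal sketch, assuming a generic margin-based contrastive objective rather than the exact loss used in ACID. The function name `geodesic_contrastive_loss`, the parameters `pos_thresh` and `margin`, and the precomputed geodesic distance matrix `geo` are all hypothetical and chosen for illustration only. Point pairs that are close along the object's surface are pulled together in embedding space, and pairs that are far apart along the surface are pushed at least `margin` apart.

```python
import numpy as np

def geodesic_contrastive_loss(emb, geo, pos_thresh=0.1, margin=1.0):
    """Generic margin-based contrastive loss guided by geodesic distance.

    emb: (N, D) per-point embeddings (hypothetical input).
    geo: (N, N) geodesic distances between the same points, assumed
         to be precomputed on the undeformed shape.
    This is an illustrative sketch, not the formulation from the paper.
    """
    # Pairwise Euclidean distances in embedding space.
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)

    # Consider each unordered pair once.
    iu = np.triu_indices(len(emb), k=1)
    d, g = dist[iu], geo[iu]

    # Pairs that are geodesically close count as positives.
    pos = g < pos_thresh
    pull = (d[pos] ** 2).sum()
    # Geodesically distant pairs are pushed at least `margin` apart.
    push = (np.maximum(0.0, margin - d[~pos]) ** 2).sum()
    return (pull + push) / len(d)

# Toy usage: four points on a line, forming two geodesically tight clusters.
positions = np.array([0.0, 0.05, 2.0, 2.05])
geo = np.abs(positions[:, None] - positions[None, :])
good_emb = positions[:, None]        # embedding that preserves the structure
collapsed_emb = np.zeros((4, 1))     # degenerate embedding, all points identical
loss_good = geodesic_contrastive_loss(good_emb, geo)
loss_collapsed = geodesic_contrastive_loss(collapsed_emb, geo)
```

In this toy case the structure-preserving embedding yields a lower loss than the collapsed one, which is the behavior a correspondence embedding field would need: Euclidean proximity alone can be misleading when a deformed object folds onto itself, whereas distance along the surface is not affected by such folds.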
