Paper Title

Neural Descent for Visual 3D Human Pose and Shape

Paper Authors

Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu

Paper Abstract

We present deep neural network methodology to reconstruct the 3d pose and shape of people, given an input RGB image. We rely on a recently introduced, expressive full body statistical 3d human model, GHUM, trained end-to-end, and learn to reconstruct its pose and shape state in a self-supervised regime. Central to our methodology is a learning to learn and optimize approach, referred to as HUman Neural Descent (HUND), which avoids both second-order differentiation when training the model parameters, and expensive state gradient descent in order to accurately minimize a semantic differentiable rendering loss at test time. Instead, we rely on novel recurrent stages to update the pose and shape parameters such that not only losses are minimized effectively, but the process is meta-regularized in order to ensure end-progress. HUND's symmetry between training and testing makes it the first 3d human sensing architecture to natively support different operating regimes including self-supervised ones. In diverse tests, we show that HUND achieves very competitive results in datasets like H3.6M and 3DPW, as well as good quality 3d reconstructions for complex imagery collected in-the-wild.
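The recurrent-stage idea described in the abstract can be pictured with a minimal sketch (not the authors' implementation): a learned update stage proposes additive pose/shape parameter refinements from image features, the current parameter estimate, and the current loss value, so no test-time gradient descent or second-order training derivatives are required. All names and dimensions below (UpdateStage, img_feats, theta, beta, the GRU-based cell) are illustrative assumptions written in PyTorch.

```python
# Hedged sketch of a HUND-style recurrent refinement stage (illustrative only).
import torch
import torch.nn as nn

class UpdateStage(nn.Module):
    def __init__(self, feat_dim=256, pose_dim=72, shape_dim=10, hidden_dim=512):
        super().__init__()
        state_dim = pose_dim + shape_dim
        # Recurrent cell carries context across successive refinement stages.
        self.gru = nn.GRUCell(feat_dim + state_dim + 1, hidden_dim)
        # Regressor maps the hidden state to an additive parameter update.
        self.delta = nn.Linear(hidden_dim, state_dim)

    def forward(self, img_feats, theta, beta, loss_value, hidden):
        state = torch.cat([theta, beta], dim=-1)
        inp = torch.cat([img_feats, state, loss_value], dim=-1)
        hidden = self.gru(inp, hidden)
        update = self.delta(hidden)
        new_state = state + update  # refined pose/shape parameters
        new_theta, new_beta = new_state.split([theta.shape[-1], beta.shape[-1]], dim=-1)
        return new_theta, new_beta, hidden

# Usage: unroll a few stages; each stage observes the loss of the previous
# estimate, so training only needs first-order gradients through the unroll.
stage = UpdateStage()
B = 4
img_feats = torch.randn(B, 256)
theta, beta = torch.zeros(B, 72), torch.zeros(B, 10)
hidden = torch.zeros(B, 512)
for _ in range(3):
    # Stand-in scalar objective per example; the paper uses a semantic
    # differentiable rendering loss instead.
    loss_value = (theta ** 2).mean(dim=-1, keepdim=True)
    theta, beta, hidden = stage(img_feats, theta, beta, loss_value, hidden)
```

Because the same unrolled update network is used at training and test time, the sketch also reflects the train/test symmetry the abstract highlights.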
