Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild

Title

Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild

Authors

Umar Iqbal, Pavlo Molchanov, Jan Kautz

Abstract

One major challenge for monocular 3D human pose estimation in the wild is the acquisition of training data that contains unconstrained images annotated with accurate 3D poses. In this paper, we address this challenge by proposing a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data, which can be acquired easily in in-the-wild environments. We propose a novel end-to-end learning framework that enables weakly-supervised training using multi-view consistency. Since multi-view consistency is prone to degenerate solutions, we adopt a 2.5D pose representation and propose a novel objective function that can only be minimized when the predictions of the trained model are consistent and plausible across all camera views. We evaluate our proposed approach on two large-scale datasets (Human3.6M and MPII-INF-3DHP), where it achieves state-of-the-art performance among semi-/weakly-supervised methods.
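
The abstract names two ingredients, a 2.5D pose representation and multi-view consistency, without giving their definitions. The sketch below is only an illustration of the general idea and is not the paper's objective function. It assumes a 2.5D joint is stored as pixel coordinates plus a root-relative depth, assumes a pinhole camera with a known focal length and principal point, and swaps in a simple consistency measure (comparing pairwise joint distances, which do not change under rotation or translation between cameras). The function names `backproject`, `pairwise_distances`, and `consistency_error` are hypothetical.

```python
import math
from itertools import combinations


def backproject(pose_25d, root_depth, focal, center):
    """Lift 2.5D joints (x_px, y_px, z_rel) to 3D camera coordinates.

    Assumption: pinhole camera, `focal` in pixels, `center` = principal point,
    and z_rel is each joint's depth relative to the root joint.
    """
    cx, cy = center
    points = []
    for x, y, z_rel in pose_25d:
        z = root_depth + z_rel  # absolute depth of this joint
        points.append(((x - cx) * z / focal, (y - cy) * z / focal, z))
    return points


def pairwise_distances(points):
    """Distances between all joint pairs; unchanged by any rigid motion."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def consistency_error(points_a, points_b):
    """Mean absolute difference of pairwise joint distances across two views.

    Illustrative stand-in for a multi-view consistency term: it is zero when
    the two 3D poses differ only by a rigid camera motion. It cannot detect
    mirror reflections, so it is weaker than an explicit rigid alignment.
    """
    dist_a = pairwise_distances(points_a)
    dist_b = pairwise_distances(points_b)
    return sum(abs(a - b) for a, b in zip(dist_a, dist_b)) / len(dist_a)
```

In a training loop, a term of this kind would be computed between the 3D poses predicted from two synchronized views of the same person. On its own such a term can be driven to zero by trivial predictions (for example, collapsing all joints to one point), which is the kind of degenerate solution the abstract says the full objective is designed to rule out.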
