NeMo: 3D Neural Motion Fields from Multiple Video Instances of the Same Action

Paper title

NeMo: 3D Neural Motion Fields from Multiple Video Instances of the Same Action

Authors

Wang, Kuan-Chieh, Weng, Zhenzhen, Xenochristou, Maria, Araujo, Joao Pedro, Gu, Jeffrey, Liu, C. Karen, Yeung, Serena

Abstract

The task of reconstructing 3D human motion has wide-ranging applications. Gold-standard motion capture (MoCap) systems are accurate but inaccessible to the general public because of their cost, hardware, and space constraints. In contrast, monocular human mesh recovery (HMR) methods are much more accessible than MoCap because they take single-view videos as input. Replacing multi-view MoCap systems with a monocular HMR method would break the current barriers to collecting accurate 3D motion, making exciting applications such as motion analysis and motion-driven animation accessible to the general public. However, the performance of existing HMR methods degrades when the video contains challenging, dynamic motion that is not in the existing MoCap datasets used for training. This reduces their appeal, since dynamic motion is frequently the target of 3D motion recovery in the aforementioned applications. Our study aims to bridge the gap between monocular HMR and multi-view MoCap systems by leveraging information shared across multiple video instances of the same action. We introduce the Neural Motion (NeMo) field, which is optimized to represent the underlying 3D motions across a set of videos of the same action. Empirically, we show that NeMo can recover 3D motion in sports using videos from the Penn Action dataset, where NeMo outperforms existing HMR methods in terms of 2D keypoint detection. To further validate NeMo using 3D metrics, we collected a small MoCap dataset mimicking actions in Penn Action, and show that NeMo achieves better 3D reconstruction than various baselines.
