Paper Title
NeuVV: Neural Volumetric Videos with Immersive Rendering and Editing
Paper Authors
Paper Abstract
Some of the most exciting experiences that the Metaverse promises to offer, for instance, live interactions with virtual characters in virtual environments, require real-time photo-realistic rendering. 3D reconstruction approaches to rendering, active or passive, still require extensive cleanup work to fix the meshes or point clouds. In this paper, we present a neural volumography technique called neural volumetric video, or NeuVV, to support immersive, interactive, and spatial-temporal rendering of volumetric video contents with photo-realism and in real time. The core of NeuVV is to efficiently encode a dynamic neural radiance field (NeRF) into renderable and editable primitives. We introduce two types of factorization schemes: a hyper-spherical harmonics (HH) decomposition for modeling smooth color variations over space and time, and a learnable basis representation for modeling abrupt density and color changes caused by motion. The NeuVV factorization can be integrated into a Video Octree (VOctree), analogous to PlenOctree, to significantly accelerate training while reducing memory overhead. Real-time NeuVV rendering further enables a class of immersive content editing tools. Specifically, NeuVV treats each VOctree as a primitive and implements volume-based depth ordering and alpha blending to realize spatial-temporal compositions for content re-purposing. For example, we demonstrate positioning varied manifestations of the same performance at different 3D locations with different timing, adjusting the color/texture of the performer's clothing, casting spotlight shadows, and synthesizing distance falloff lighting, etc., all at interactive speed. We further develop a hybrid neural-rasterization rendering framework to support consumer-level VR headsets so that the aforementioned volumetric video viewing and editing, for the first time, can be conducted immersively in virtual 3D space.
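The abstract states that NeuVV treats each VOctree as a primitive and uses volume-based depth ordering and alpha blending to composite multiple volumetric videos in one scene. Below is a minimal sketch of that general idea using standard emission-absorption volume rendering along a single ray; the function name, sample layout, and NumPy implementation are illustrative assumptions for exposition, not NeuVV's actual interface.

import numpy as np

def composite_samples(depths, colors, densities, deltas):
    """Depth-ordered emission-absorption alpha compositing along one ray.

    depths:    (N,) sample depths along the ray (any order; sorted below)
    colors:    (N, 3) RGB radiance decoded at each sample
    densities: (N,) volume density (sigma) at each sample
    deltas:    (N,) distance to the next sample along the ray
    All names here are hypothetical, not part of NeuVV's API.
    """
    # Depth-order samples drawn from *all* primitives jointly, so that
    # overlapping volumes occlude each other correctly.
    order = np.argsort(depths)
    colors, densities, deltas = colors[order], densities[order], deltas[order]

    alpha = 1.0 - np.exp(-densities * deltas)            # per-sample opacity
    transmittance = np.cumprod(
        np.concatenate([[1.0], 1.0 - alpha[:-1]]))       # light surviving so far
    weights = alpha * transmittance
    return (weights[:, None] * colors).sum(axis=0)       # final pixel RGB

# Usage: merge samples gathered from two independently posed volumetric
# primitives (random stand-ins here) and composite them as a single ray.
rng = np.random.default_rng(0)
n1, n2 = 16, 16
depths = np.concatenate([np.linspace(1, 3, n1), np.linspace(2, 4, n2)])
colors = rng.uniform(size=(n1 + n2, 3))
densities = rng.uniform(0.0, 5.0, size=n1 + n2)
deltas = np.full(n1 + n2, 0.1)
print(composite_samples(depths, colors, densities, deltas))

Because the per-sample weights depend only on densities accumulated in depth order, samples from separately trained and separately positioned volumes can be merged at render time without retraining, which is the property the abstract leverages for spatial-temporal content re-purposing.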