Title
Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
Authors
Abstract
We present Panoptic Neural Fields (PNF), an object-aware neural scene representation that decomposes a scene into a set of objects (things) and background (stuff). Each object is represented by an oriented 3D bounding box and a multi-layer perceptron (MLP) that takes position, direction, and time as input and outputs density and radiance. The background stuff is represented by a similar MLP that additionally outputs semantic labels. Each object MLP is instance-specific and can therefore be smaller and faster than in previous object-aware approaches, while still leveraging category-specific priors incorporated via meta-learned initialization. Our model builds a panoptic radiance field representation of any scene from just color images. We use off-the-shelf algorithms to predict camera poses, object tracks, and 2D image semantic segmentations. We then jointly optimize the MLP weights and bounding box parameters using analysis-by-synthesis, with self-supervision from color images and pseudo-supervision from predicted semantic segmentations. In experiments on real-world dynamic scenes, we find that our model can be used effectively for several tasks, such as novel view synthesis, 2D panoptic segmentation, 3D scene editing, and multiview depth prediction.
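The abstract describes two kinds of fields: per-instance object MLPs mapping (position, direction, time) to density and radiance, and a background "stuff" MLP that additionally emits semantic logits. The sketch below illustrates those input/output signatures only; the layer widths, activations, class count, and use of untrained random weights are illustrative assumptions, not the paper's actual architecture or training setup.

```python
import numpy as np

def mlp(x, widths, out_dim, rng):
    """Tiny random-weight MLP (ReLU hidden layers, linear output).
    A stand-in for the paper's learned MLPs; weights here are untrained."""
    h, d_in = x, x.shape[-1]
    for w in widths:
        W = rng.standard_normal((d_in, w)) * 0.1
        h = np.maximum(h @ W, 0.0)
        d_in = w
    W = rng.standard_normal((d_in, out_dim)) * 0.1
    return h @ W

def object_field(pos, direction, t, rng):
    """Instance-specific object MLP: (x, d, t) -> (density, radiance).
    Widths [32, 32] are an assumption reflecting "smaller and faster"."""
    inp = np.concatenate([pos, direction, [t]])     # 3 + 3 + 1 inputs
    out = mlp(inp, widths=[32, 32], out_dim=4, rng=rng)
    sigma = np.log1p(np.exp(out[0]))                # softplus -> density >= 0
    rgb = 1.0 / (1.0 + np.exp(-out[1:4]))           # sigmoid -> radiance in (0, 1)
    return sigma, rgb

def stuff_field(pos, direction, rng, num_classes=5):
    """Background MLP: like an object MLP, but also outputs semantic logits.
    num_classes is a placeholder, not the paper's label set."""
    inp = np.concatenate([pos, direction])
    out = mlp(inp, widths=[64, 64], out_dim=4 + num_classes, rng=rng)
    sigma = np.log1p(np.exp(out[0]))
    rgb = 1.0 / (1.0 + np.exp(-out[1:4]))
    logits = out[4:]                                # per-class semantic logits
    return sigma, rgb, logits

rng = np.random.default_rng(0)
sigma, rgb = object_field(np.zeros(3), np.array([0.0, 0.0, 1.0]), t=0.5, rng=rng)
s_bg, rgb_bg, logits = stuff_field(np.zeros(3), np.array([0.0, 0.0, 1.0]), rng=rng)
```

In the full method these fields would be queried along camera rays and composited by volume rendering, with each object MLP evaluated only inside its oriented bounding box.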