Paper Title
TANGO: Text-driven Photorealistic and Robust 3D Stylization via Lighting Decomposition
Paper Authors
Paper Abstract
Creation of 3D content by stylization is a promising yet challenging problem in computer vision and graphics research. In this work, we focus on stylizing photorealistic appearance renderings of a given surface mesh of arbitrary topology. Motivated by the recent surge of cross-modal supervision enabled by the Contrastive Language-Image Pre-training (CLIP) model, we propose TANGO, which transfers the appearance style of a given 3D shape according to a text prompt in a photorealistic manner. Technically, we propose to disentangle the appearance style into the spatially varying bidirectional reflectance distribution function, the local geometric variation, and the lighting condition, which are jointly optimized, under supervision of the CLIP loss, by a spherical-Gaussian-based differentiable renderer. As such, TANGO enables photorealistic 3D style transfer by automatically predicting reflectance effects, even for bare, low-quality meshes, without training on a task-specific dataset. Extensive experiments show that TANGO outperforms existing methods of text-driven 3D style transfer in terms of photorealistic quality, consistency of 3D geometry, and robustness when stylizing low-quality meshes. Our code and results are available at our project webpage: https://cyw-3d.github.io/tango/.
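To illustrate the optimization scheme the abstract describes, the sketch below optimizes a small set of spherical Gaussian lighting lobes, in the standard form G(w; xi, lam, mu) = mu * exp(lam * (xi . w - 1)), together with a single global diffuse albedo under a CLIP cosine loss against a text prompt. This is a minimal sketch under strong simplifying assumptions, not the authors' implementation: TANGO predicts a spatially varying BRDF and local normal variation on the given mesh and uses closed-form spherical-Gaussian integration in a differentiable renderer, whereas here the toy sphere geometry, the diffuse-only shading (evaluating the SG mixture at the surface normal as a crude irradiance proxy), the prompt, and all hyperparameters are illustrative assumptions.

# A minimal, illustrative sketch (not the authors' implementation): optimize
# spherical-Gaussian (SG) lighting lobes and a single diffuse albedo under a
# CLIP cosine loss, shading a toy sphere by evaluating the SG mixture at the
# surface normal as a crude irradiance proxy. Prompt and hyperparameters are
# assumptions for illustration only.
import torch
import torch.nn.functional as F
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float()          # keep everything in fp32 for simplicity
for p in clip_model.parameters():
    p.requires_grad_(False)              # CLIP acts as a frozen supervisor

# Fixed toy geometry: per-pixel unit normals of a front-facing sphere.
H = W = 224
ys, xs = torch.meshgrid(torch.linspace(1, -1, H), torch.linspace(-1, 1, W), indexing="ij")
r2 = xs ** 2 + ys ** 2
zs = torch.sqrt(torch.clamp(1.0 - r2, min=0.0))
normals = F.normalize(torch.stack([xs, ys, zs], dim=-1), dim=-1).to(device)  # (H, W, 3)
mask = (r2 <= 1.0).float().unsqueeze(-1).to(device)                          # (H, W, 1)

# Learnable parameters: K SG lobes G(w; xi, lam, mu) = mu * exp(lam * (xi.w - 1))
# plus one global diffuse albedo (TANGO instead predicts spatially varying BRDFs).
K = 16
lobe_axis = torch.randn(K, 3, device=device, requires_grad=True)
lobe_sharp = torch.zeros(K, device=device, requires_grad=True)    # softplus -> lambda > 0
lobe_amp = torch.zeros(K, 3, device=device, requires_grad=True)   # softplus -> mu >= 0
albedo = torch.zeros(3, device=device, requires_grad=True)        # sigmoid -> [0, 1]

def shade(n):
    """Diffuse-only toy shading: evaluate the SG mixture at the normal direction."""
    xi = F.normalize(lobe_axis, dim=-1)                       # (K, 3)
    lam = F.softplus(lobe_sharp)                               # (K,)
    mu = F.softplus(lobe_amp)                                  # (K, 3)
    cos = n @ xi.t()                                           # (H, W, K)
    lobes = torch.exp(lam * (cos - 1.0)).unsqueeze(-1) * mu    # (H, W, K, 3)
    return torch.sigmoid(albedo) * lobes.sum(dim=-2)           # (H, W, 3)

# CLIP text target and input normalization constants.
clip_mean = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
clip_std = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)
tokens = clip.tokenize(["a shiny golden sphere"]).to(device)   # hypothetical prompt
with torch.no_grad():
    text_feat = F.normalize(clip_model.encode_text(tokens), dim=-1)

opt = torch.optim.Adam([lobe_axis, lobe_sharp, lobe_amp, albedo], lr=1e-2)
for step in range(200):
    img = (shade(normals) * mask).clamp(0, 1)                  # (H, W, 3), black background
    img = img.permute(2, 0, 1).unsqueeze(0)                    # (1, 3, H, W)
    img_feat = F.normalize(clip_model.encode_image((img - clip_mean) / clip_std), dim=-1)
    loss = 1.0 - (img_feat * text_feat).sum()                  # CLIP cosine loss
    opt.zero_grad()
    loss.backward()
    opt.step()

In this toy setting only the lighting and a constant albedo are driven toward the prompt; the full method replaces the shading function with a physically based, spherical-Gaussian differentiable renderer over the input mesh, so that specular reflectance and local geometric variation are also recovered from the text supervision.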