Paper Title
ManiCLIP: Multi-Attribute Face Manipulation from Text
Paper Authors
Paper Abstract
In this paper, we present a novel multi-attribute face manipulation method based on textual descriptions. Previous text-based image editing methods either require test-time optimization for each individual image or are restricted to single-attribute editing. Extending these methods to multi-attribute face image editing introduces undesired excessive attribute change: text-relevant attributes are overly manipulated, and text-irrelevant attributes are changed as well. To address these challenges and achieve natural editing over multiple face attributes, we propose a new decoupling training scheme in which we use group sampling to obtain text segments from the same attribute categories, instead of whole complex sentences. Further, to preserve other existing face attributes, we encourage the model to edit the latent code of each attribute separately via an entropy constraint. During the inference phase, our model is able to edit new face images without any test-time optimization, even from complex textual prompts. We present extensive experiments and analysis to demonstrate the efficacy of our method, which generates natural manipulated faces with minimal text-irrelevant attribute editing. Code and a pre-trained model are available at https://github.com/hwang1996/ManiCLIP.
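The decoupling scheme trains on short text segments grouped by attribute category rather than on whole complex sentences. Below is a minimal sketch of what such group sampling could look like; the attribute groups, phrase templates, and the `sample_training_text` helper are hypothetical illustrations, not the paper's actual data pipeline.

```python
import random

# Hypothetical attribute categories and phrase templates; the real
# training corpus and category grouping in ManiCLIP may differ.
ATTRIBUTE_GROUPS = {
    "hair": ["blond hair", "curly hair", "short black hair"],
    "expression": ["a big smile", "a neutral expression"],
    "accessories": ["eyeglasses", "a hat"],
}

def sample_training_text(num_groups: int = 1) -> str:
    """Group sampling: draw short segments from sampled attribute
    categories instead of a whole complex sentence, so the editor
    sees each attribute in a decoupled way during training."""
    groups = random.sample(list(ATTRIBUTE_GROUPS), k=num_groups)
    segments = [random.choice(ATTRIBUTE_GROUPS[g]) for g in groups]
    return "this person has " + " and ".join(segments)

print(sample_training_text(num_groups=2))
```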
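The entropy constraint encourages each attribute to modify only a small portion of the latent code, limiting spill-over onto text-irrelevant attributes. The sketch below shows one plausible form of such a loss, assuming the editor predicts per-attribute offsets `delta_w` to a StyleGAN-style latent code; the tensor layout and function name are assumptions, not the paper's exact formulation.

```python
import torch

def entropy_sparsity_loss(delta_w: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Hypothetical entropy constraint on per-attribute latent offsets.

    delta_w: [num_attributes, latent_dim] predicted edits to the latent
    code, one row per attribute mentioned in the text. Normalizing each
    row's absolute magnitudes into a distribution and minimizing its
    entropy pushes every attribute to edit only a few latent dimensions.
    """
    p = delta_w.abs() / (delta_w.abs().sum(dim=-1, keepdim=True) + eps)
    entropy = -(p * (p + eps).log()).sum(dim=-1)  # one value per attribute
    return entropy.mean()

# Usage: add the loss to the training objective with some weight.
delta_w = torch.randn(3, 512, requires_grad=True)  # e.g., 3 attributes, W of dim 512
loss = entropy_sparsity_loss(delta_w)
loss.backward()
```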