Paper Title
CoGS: Controllable Generation and Search from Sketch and Style
Paper Authors
Paper Abstract
We present CoGS, a novel method for the style-conditioned, sketch-driven synthesis of images. CoGS enables exploration of diverse appearance possibilities for a given sketched object, providing decoupled control over the structure and the appearance of the output. Coarse-grained control over object structure and appearance is enabled by feeding an input sketch and an exemplar "style" conditioning image to a transformer-based sketch and style encoder, which generates a discrete codebook representation. We map the codebook representation into a metric space, enabling fine-grained control over selection and interpolation between multiple synthesis options before generating the image via a vector-quantized GAN (VQGAN) decoder. Our framework thereby unifies search and synthesis: a sketch and style pair may be used to run an initial synthesis, which may then be refined by combining it with similar results from a search corpus to produce an image more closely matching the user's intent. We show that our model, trained on the 125 object classes of our newly created Pseudosketches dataset, is capable of producing a diverse gamut of semantic content and appearance styles.
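To make the pipeline concrete, below is a minimal PyTorch sketch of the architecture the abstract describes: a transformer encoder over sketch and style tokens predicting discrete codebook indices, a metric-space projection used for interpolation between synthesis options, and a small stand-in decoder in place of the actual VQGAN. All module names, layer sizes, and the toy decoder (CoGSSketch, num_codes, metric_proj, etc.) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a CoGS-style pipeline; shapes and modules are assumptions.
import torch
import torch.nn as nn

class CoGSSketch(nn.Module):
    def __init__(self, num_codes=1024, dim=256, grid=16):
        super().__init__()
        self.grid = grid
        # Patch embeddings for the sketch (structure) and style (appearance) inputs.
        self.sketch_embed = nn.Conv2d(1, dim, kernel_size=16, stride=16)
        self.style_embed = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=4)
        self.to_logits = nn.Linear(dim, num_codes)    # predicts discrete code indices
        self.codebook = nn.Embedding(num_codes, dim)  # stand-in for the VQGAN codebook
        self.metric_proj = nn.Linear(dim, dim)        # maps codes into a metric space
        # Toy decoder standing in for the VQGAN decoder: code grid -> RGB image.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(dim, 64, 4, stride=4), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=4), nn.Tanh())

    def encode(self, sketch, style):
        # Tokenize both inputs, attend jointly, keep the sketch-grid positions.
        s = self.sketch_embed(sketch).flatten(2).transpose(1, 2)  # B x N x dim
        t = self.style_embed(style).flatten(2).transpose(1, 2)
        h = self.transformer(torch.cat([s, t], dim=1))
        logits = self.to_logits(h[:, : self.grid * self.grid])
        return logits.argmax(-1)                                  # B x N code indices

    def decode(self, indices):
        z = self.codebook(indices)                                # B x N x dim
        z = z.transpose(1, 2).reshape(z.size(0), -1, self.grid, self.grid)
        return self.decoder(z)                                    # B x 3 x 256 x 256

    def interpolate(self, idx_a, idx_b, alpha=0.5):
        """Fine-grained control: blend two synthesis options in the metric
        space, then snap back to the nearest codebook entries before decoding."""
        za = self.metric_proj(self.codebook(idx_a))
        zb = self.metric_proj(self.codebook(idx_b))
        z = (1 - alpha) * za + alpha * zb
        book = self.metric_proj(self.codebook.weight)             # num_codes x dim
        dists = torch.cdist(z, book.unsqueeze(0).expand(z.size(0), -1, -1))
        return self.decode(dists.argmin(-1))

if __name__ == "__main__":
    model = CoGSSketch()
    sketch = torch.randn(1, 1, 256, 256)   # rasterized input sketch (structure)
    style_a = torch.randn(1, 3, 256, 256)  # exemplar "style" image (appearance)
    style_b = torch.randn(1, 3, 256, 256)  # a second style, e.g. a search result
    codes_a = model.encode(sketch, style_a)
    codes_b = model.encode(sketch, style_b)
    image = model.decode(codes_a)                          # initial synthesis
    refined = model.interpolate(codes_a, codes_b, 0.5)     # metric-space blend
```

One plausible reading of the design, under the assumptions above: interpolating in a continuous metric space and then re-quantizing to the nearest codebook entries keeps the decoder input on the discrete manifold the VQGAN was trained on, which is what lets a synthesis result be smoothly refined toward similar results retrieved from a search corpus.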