Paper Title

GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image Translation via Attribute Gaussian Mixture Modeling

Authors

Liu, Yahui, De Nadai, Marco, Yao, Jian, Sebe, Nicu, Lepri, Bruno, Alameda-Pineda, Xavier

Abstract

Unsupervised image-to-image translation (UNIT) aims at learning a mapping between several visual domains by using unpaired training images. Recent studies have shown remarkable success for multiple domains but they suffer from two main limitations: they are either built from several two-domain mappings that are required to be learned independently, or they generate low-diversity results, a problem known as mode collapse. To overcome these limitations, we propose a method named GMM-UNIT, which is based on a content-attribute disentangled representation where the attribute space is fitted with a GMM. Each GMM component represents a domain, and this simple assumption has two prominent advantages. First, it can be easily extended to most multi-domain and multi-modal image-to-image translation tasks. Second, the continuous domain encoding allows for interpolation between domains and for extrapolation to unseen domains and translations. Additionally, we show how GMM-UNIT can be constrained down to different methods in the literature, meaning that GMM-UNIT is a unifying framework for unsupervised image-to-image translation.
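The abstract's core idea can be illustrated with a small numerical sketch. Below, each domain is one Gaussian component in the attribute space; sampling from a component gives diverse (multi-modal) attribute codes for that domain, and linearly interpolating between component means gives a continuous encoding "between" domains. This is a minimal illustration under assumed placeholder means and a shared isotropic covariance, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: K domains, d-dimensional attribute space.
# One Gaussian component per domain; means and sigma are placeholders.
K, d = 3, 8
means = rng.normal(size=(K, d))   # component mean per domain
sigma = 0.5                       # shared isotropic std (assumption)

def sample_attribute(domain, n=1):
    """Draw n attribute codes z ~ N(mu_domain, sigma^2 I).

    Different samples from the same component model the
    multi-modal (diverse) outputs within one domain.
    """
    return means[domain] + sigma * rng.normal(size=(n, d))

def interpolate_domains(a, b, t):
    """Continuous domain encoding: a convex combination of two
    component means acts as an attribute for an intermediate,
    possibly unseen, domain (t=0 -> domain a, t=1 -> domain b)."""
    return (1 - t) * means[a] + t * means[b]

z = sample_attribute(0, n=4)            # 4 diverse codes for domain 0
z_mid = interpolate_domains(0, 1, 0.5)  # attribute "between" domains 0 and 1
print(z.shape, z_mid.shape)
```

In the full model these attribute codes would be combined with a content code by a generator network; the sketch only shows how a GMM over attributes yields both per-domain diversity and cross-domain interpolation.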
