Rethinking conditional GAN training: An approach using geometrically structured latent manifolds

Paper title

Rethinking conditional GAN training: An approach using geometrically structured latent manifolds

Authors

Sameera Ramasinghe, Moshiur Farazi, Salman Khan, Nick Barnes, Stephen Gould

Abstract

Conditional GANs (cGANs), in their rudimentary form, suffer from critical drawbacks such as a lack of diversity in generated outputs and distortion between the latent and output manifolds. Although efforts have been made to improve results, they can suffer from unpleasant side effects such as a topology mismatch between the latent and output spaces. In contrast, we tackle this problem from a geometrical perspective and propose a novel training mechanism that increases both the diversity and the visual quality of a vanilla cGAN by systematically encouraging a bi-Lipschitz mapping between the latent and output manifolds. We validate the efficacy of our solution on a baseline cGAN (i.e., Pix2Pix) that lacks diversity, and show that by modifying only its training mechanism (i.e., with our proposed Pix2Pix-Geo), one can achieve more diverse and realistic outputs on a broad set of image-to-image translation tasks. Code is available at https://github.com/samgregoost/Rethinking-CGANs.
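
For reference, the bi-Lipschitz condition named in the abstract can be written in its general textbook form. The symbols below are notation assumed here for illustration and are not necessarily the paper's own: $G$ is the generator with the conditioning input held fixed, $d_Z$ and $d_Y$ are distances on the latent and output manifolds, and $C$ is a constant.

```latex
\frac{1}{C}\, d_Z(z_1, z_2)
\;\le\; d_Y\bigl(G(z_1), G(z_2)\bigr)
\;\le\; C\, d_Z(z_1, z_2)
\qquad \text{for all latent codes } z_1, z_2 \text{ and some constant } C \ge 1 .
```

Read this way, the lower bound keeps distinct latent codes from collapsing onto the same output, which relates to diversity, while the upper bound limits how far distances can be stretched, which relates to distortion. How the paper encourages this property during training is not described in the abstract.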
