Paper Title


UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation

Authors

Dmitrii Torbunov, Yi Huang, Haiwang Yu, Jin Huang, Shinjae Yoo, Meifeng Lin, Brett Viren, Yihui Ren

Abstract


Unpaired image-to-image translation has broad applications in art, design, and scientific simulations. One early breakthrough was CycleGAN that emphasizes one-to-one mappings between two unpaired image domains via generative-adversarial networks (GAN) coupled with the cycle-consistency constraint, while more recent works promote one-to-many mapping to boost diversity of the translated images. Motivated by scientific simulation and one-to-one needs, this work revisits the classic CycleGAN framework and boosts its performance to outperform more contemporary models without relaxing the cycle-consistency constraint. To achieve this, we equip the generator with a Vision Transformer (ViT) and employ necessary training and regularization techniques. Compared to previous best-performing models, our model performs better and retains a strong correlation between the original and translated image. An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement. To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained model are available at https://github.com/LS4GAN/uvcgan.
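The cycle-consistency constraint the abstract refers to penalizes the round trip A→B→A for not recovering the original image. A minimal sketch of that loss, using NumPy and hypothetical toy generators in place of the paper's ViT-UNet generators (the function names `g_ab`, `g_ba`, and `cycle_consistency_loss` are illustrative, not from the paper's code):

```python
import numpy as np

def cycle_consistency_loss(x, g_ab, g_ba):
    """L1 cycle-consistency: translating A -> B -> A should recover x."""
    return float(np.mean(np.abs(g_ba(g_ab(x)) - x)))

# Toy "generators": a pair of exact inverse affine maps standing in for
# the learned A->B and B->A networks.
g_ab = lambda x: 2.0 * x + 1.0
g_ba = lambda y: (y - 1.0) / 2.0

x = np.ones((4, 4))
loss = cycle_consistency_loss(x, g_ab, g_ba)
print(loss)  # exact inverses, so the cycle loss is 0.0
```

In training, this term is added to the adversarial GAN losses; keeping it strict (rather than relaxing it for one-to-many diversity) is what preserves the strong correlation between original and translated images that the abstract emphasizes.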
