Paper title
MPG: A Multi-ingredient Pizza Image Generator with Conditional StyleGANs
Paper authors
Paper abstract
Multilabel conditional image generation is a challenging problem in computer vision. In this work we propose the Multi-ingredient Pizza Generator (MPG), a conditional Generative Adversarial Network (GAN) framework for synthesizing multilabel images. We build MPG on StyleGAN2, a state-of-the-art GAN architecture, and develop a new conditioning technique that forces intermediate feature maps to learn scale-wise label information. Because multilabel image generation is a complex problem, we also regularize the synthetic images by predicting their corresponding ingredients, and we encourage the discriminator to distinguish between matched and mismatched image-label pairs. To verify the efficacy of MPG, we test it on Pizza10, a carefully annotated multi-ingredient pizza image dataset. MPG successfully generates photorealistic pizza images with the desired ingredients. The framework can be easily extended to other multilabel image generation scenarios.
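The two regularizers mentioned in the abstract (ingredient prediction and matched versus mismatched pairs) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the ingredient vocabulary is hypothetical and is not the Pizza10 label set, the helper names `encode_ingredients`, `sample_mismatched`, and `multilabel_bce` are invented for illustration, and binary cross-entropy is only assumed as a common choice of multilabel loss, since the abstract does not name one.

```python
import math
import random

# Hypothetical ingredient vocabulary, for illustration only
# (an assumption, not the actual Pizza10 label set).
INGREDIENTS = ["basil", "mushroom", "olive", "pepperoni", "tomato"]


def encode_ingredients(names, vocabulary=INGREDIENTS):
    """Return a multi-hot vector with 1.0 at each listed ingredient."""
    unknown = set(names) - set(vocabulary)
    if unknown:
        raise ValueError(f"unknown ingredients: {sorted(unknown)}")
    return [1.0 if item in names else 0.0 for item in vocabulary]


def sample_mismatched(label, rng):
    """Flip a random non-empty subset of positions, so the result always
    differs from `label`. Pairing a real image with such a label gives the
    discriminator a 'mismatched' example."""
    n_flips = rng.randint(1, len(label))
    flip_positions = rng.sample(range(len(label)), n_flips)
    mismatched = list(label)
    for i in flip_positions:
        mismatched[i] = 1.0 - mismatched[i]
    return mismatched


def multilabel_bce(probabilities, targets, eps=1e-7):
    """Mean binary cross-entropy over independent ingredient labels
    (assumed loss for the ingredient-prediction regularizer)."""
    total = 0.0
    for p, t in zip(probabilities, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


matched = encode_ingredients(["pepperoni", "basil"])  # [1.0, 0.0, 0.0, 1.0, 0.0]
mismatched = sample_mismatched(matched, random.Random(0))
```

In a full training loop, the matched label would accompany its real image as a positive pair and the mismatched label would accompany the same image as a negative pair, while a classifier's predicted ingredient probabilities for each generated image would be scored against the conditioning label with `multilabel_bce`.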