Paper Title


Progressive Distillation for Fast Sampling of Diffusion Models

Paper Authors

Tim Salimans, Jonathan Ho

Paper Abstract


Diffusion models have recently shown great promise for generative modeling, outperforming GANs on perceptual quality and autoregressive models at density estimation. A remaining downside is their slow sampling time: generating high quality samples takes many hundreds or thousands of model evaluations. Here we make two contributions to help eliminate this downside: First, we present new parameterizations of diffusion models that provide increased stability when using few sampling steps. Second, we present a method to distill a trained deterministic diffusion sampler, using many steps, into a new diffusion model that takes half as many sampling steps. We then keep progressively applying this distillation procedure to our model, halving the number of required sampling steps each time. On standard image generation benchmarks like CIFAR-10, ImageNet, and LSUN, we start out with state-of-the-art samplers taking as many as 8192 steps, and are able to distill down to models taking as few as 4 steps without losing much perceptual quality; achieving, for example, a FID of 3.0 on CIFAR-10 in 4 steps. Finally, we show that the full progressive distillation procedure does not take more time than it takes to train the original model, thus representing an efficient solution for generative modeling using diffusion at both train and test time.
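The halving procedure described above can be made concrete with a short sketch of one distillation round: two deterministic (DDIM-style) teacher steps are collapsed into a single student step by solving for the prediction target that makes the student's one step land where the teacher's two steps land. The sketch below assumes an x-prediction parameterization, a cosine noise schedule, and a model called as model(z, i) with a discrete grid index; these names and choices are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of one progressive-distillation round (hypothetical names).
import copy
import torch

def schedules(n_steps):
    # Cosine signal/noise schedule on a grid of n_steps + 1 levels (an assumption).
    t = torch.linspace(0, 1, n_steps + 1)
    return torch.cos(0.5 * torch.pi * t), torch.sin(0.5 * torch.pi * t)

def ddim_step(model, z, i, j, alpha, sigma):
    # Deterministic step from grid index i down to j (j < i),
    # assuming the model predicts the clean image x.
    x_pred = model(z, i)
    eps_pred = (z - alpha[i] * x_pred) / sigma[i]
    return alpha[j] * x_pred + sigma[j] * eps_pred

def distill_round(teacher, n_steps, data, iters, lr=1e-4):
    # One halving round: a teacher sampled in n_steps steps becomes
    # a student sampled in n_steps // 2 steps.
    student = copy.deepcopy(teacher)
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    alpha, sigma = schedules(n_steps)
    for _ in range(iters):
        x = next(data)  # `data` is an iterator over training batches
        # Pick an even grid index i so that two teacher steps
        # (i -> i-1 -> i-2) align with one student step (i -> i-2).
        i = 2 * torch.randint(1, n_steps // 2 + 1, (1,)).item()
        eps = torch.randn_like(x)
        z = alpha[i] * x + sigma[i] * eps  # diffuse data to noise level i
        with torch.no_grad():
            z1 = ddim_step(teacher, z, i, i - 1, alpha, sigma)
            z2 = ddim_step(teacher, z1, i - 1, i - 2, alpha, sigma)
            # Solve for the x-target that makes a single student step
            # from i to i-2 land exactly on z2.
            ratio = sigma[i - 2] / sigma[i]
            x_target = (z2 - ratio * z) / (alpha[i - 2] - ratio * alpha[i])
        loss = ((student(z, i) - x_target) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return student
```

Repeating such a round log2(N) times, halving n_steps each time and using the previous student as the new teacher, is what takes a sampler from as many as 8192 steps down to as few as 4, as the abstract describes.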
