Improving Sample Quality of Diffusion Models Using Self-Attention Guidance

Paper Title

Improving Sample Quality of Diffusion Models Using Self-Attention Guidance

Authors

Susung Hong, Gyuseong Lee, Wooseok Jang, Seungryong Kim

Abstract

Denoising diffusion models (DDMs) have attracted attention for their exceptional generation quality and diversity. This success is largely attributed to the use of class- or text-conditional diffusion guidance methods, such as classifier and classifier-free guidance. In this paper, we present a more comprehensive perspective that goes beyond the traditional guidance methods. From this generalized perspective, we introduce novel condition- and training-free strategies to enhance the quality of generated images. As a simple solution, blur guidance improves the suitability of intermediate samples for their fine-scale information and structures, enabling diffusion models to generate higher quality samples with a moderate guidance scale. Improving upon this, Self-Attention Guidance (SAG) uses the intermediate self-attention maps of diffusion models to enhance their stability and efficacy. Specifically, SAG adversarially blurs only the regions that diffusion models attend to at each iteration and guides them accordingly. Our experimental results show that our SAG improves the performance of various diffusion models, including ADM, IDDPM, Stable Diffusion, and DiT. Moreover, combining SAG with conventional guidance methods leads to further improvement.
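The abstract describes the mechanism only in words, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes the guided prediction extrapolates away from the prediction on a selectively blurred input, by analogy with classifier-free guidance; the abstract does not state this formula. All names (`box_blur`, `masked_blur`, `guided_prediction`, `threshold`, `scale`) are hypothetical, a 1-D box blur stands in for whatever image blur is actually used, and the attention scores are supplied by the caller rather than read from a real model.

```python
from typing import Callable, List

Vector = List[float]


def box_blur(x: Vector, radius: int = 1) -> Vector:
    """1-D box blur with truncated windows at the edges (toy stand-in for an image blur)."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        window = x[lo:hi]
        out.append(sum(window) / len(window))
    return out


def masked_blur(x: Vector, attention: Vector, threshold: float, radius: int = 1) -> Vector:
    """Blur only the positions whose attention score exceeds the threshold."""
    blurred = box_blur(x, radius)
    return [b if a > threshold else v for v, b, a in zip(x, blurred, attention)]


def guided_prediction(
    model: Callable[[Vector], Vector],
    x: Vector,
    attention: Vector,
    scale: float,
    threshold: float = 0.5,
) -> Vector:
    """Assumed guidance rule: eps_blur + (1 + scale) * (eps - eps_blur).

    With scale = 0 this reduces to the unguided prediction model(x).
    """
    eps = model(x)
    eps_blur = model(masked_blur(x, attention, threshold))
    return [eb + (1.0 + scale) * (e - eb) for e, eb in zip(eps, eps_blur)]


if __name__ == "__main__":
    # Hypothetical toy "denoiser" that simply halves its input.
    toy_model = lambda v: [0.5 * t for t in v]
    x = [0.0, 0.0, 3.0, 0.0, 0.0]
    attention = [0.0, 0.0, 1.0, 0.0, 0.0]  # only the middle position is "attended"
    print(guided_prediction(toy_model, x, attention, scale=1.0))
```

In this toy setup, only the attended position is blurred, so the guided output differs from the plain prediction at that position alone. This is the sense in which the guidance targets the regions the model attends to.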
