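A Domain-specific Perceptual Metric via Contrastive Self-supervised Representation: Applications on Natural and Medical Images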

Paper Title

A Domain-specific Perceptual Metric via Contrastive Self-supervised Representation: Applications on Natural and Medical Images

Authors

Li, Hongwei Bran; Prabhakar, Chinmay; Shit, Suprosanna; Paetzold, Johannes; Amiranashvili, Tamaz; Zhang, Jianguo; Rueckert, Daniel; Iglesias, Juan Eugenio; Wiestler, Benedikt; Menze, Bjoern

Abstract

Quantifying the perceptual similarity of two images is a long-standing problem in low-level computer vision. The natural image domain commonly relies on supervised learning, e.g., a pre-trained VGG, to obtain a latent representation. However, due to domain shift, models pre-trained on natural images might not transfer to other image domains, such as medical imaging. Notably, in medical imaging, perceptual similarity is evaluated exclusively by specialists trained extensively in diverse medical fields. Thus, medical imaging still lacks task-specific, objective perceptual measures. This work asks: is it necessary to rely on supervised learning to obtain an effective representation for measuring perceptual similarity, or is self-supervision sufficient? To understand whether recent contrastive self-supervised representations (CSR) can fill this gap, we start with natural images, systematically evaluate CSR as a metric across numerous contemporary architectures and tasks, and compare it with existing methods. We find that in the natural image domain, CSR performs on par with supervised representations as a metric on several perceptual tests, and in the medical domain, CSR better quantifies perceptual similarity with respect to experts' ratings. We also demonstrate that CSR can significantly improve image quality in two image synthesis tasks. Finally, our extensive results suggest that perceptuality is an emergent property of CSR, which can be adapted to many image domains without requiring annotations.
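
To make the idea of using a learned representation as a perceptual metric more concrete, here is a minimal sketch. The abstract does not specify how the distance is computed, so everything below is an assumption: the function name `perceptual_distance` is hypothetical, the feature maps are assumed to come from some pre-trained encoder (supervised or contrastive) with shape `(C, H, W)`, and the distance is a generic one in feature space (channel-wise unit normalization, then squared difference averaged over spatial positions), which may differ from the authors' formulation.

```python
import numpy as np


def perceptual_distance(feat_a, feat_b, eps=1e-10):
    """Feature-space distance between two feature maps of shape (C, H, W).

    Assumption: this is a generic distance, not necessarily the paper's.
    Each spatial position's channel vector is scaled to unit length, then
    the squared differences are summed over channels and averaged over
    all spatial positions.
    """
    a = feat_a / (np.linalg.norm(feat_a, axis=0, keepdims=True) + eps)
    b = feat_b / (np.linalg.norm(feat_b, axis=0, keepdims=True) + eps)
    return float(np.mean(np.sum((a - b) ** 2, axis=0)))


# Toy usage with random arrays standing in for encoder outputs.
rng = np.random.default_rng(0)
reference = rng.normal(size=(8, 4, 4))
slightly_changed = reference + 0.05 * rng.normal(size=(8, 4, 4))
unrelated = rng.normal(size=(8, 4, 4))

d_close = perceptual_distance(reference, slightly_changed)
d_far = perceptual_distance(reference, unrelated)
print(d_close, d_far)
```

In a two-alternative comparison of the kind used in perceptual tests, the candidate with the smaller distance to the reference would be judged the more similar one. Swapping the encoder that produces the feature maps is what would distinguish a supervised metric from a self-supervised one in this sketch.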
