A Baseline for Detecting Out-of-Distribution Examples in Image Captioning

Paper title

A Baseline for Detecting Out-of-Distribution Examples in Image Captioning

Authors

Shalev, Gabi; Shalev, Gal-Lev; Keshet, Joseph

Abstract

Image captioning research has achieved breakthroughs in recent years by developing neural models that generate diverse, high-quality descriptions for images drawn from the same distribution as the training images. However, when facing out-of-distribution (OOD) images, such as corrupted images or images containing unknown objects, the models fail to generate relevant captions. In this paper, we consider the problem of OOD detection in image captioning. We formulate the problem and suggest an evaluation setup for assessing a model's performance on the task. Then, we analyze and show the effectiveness of the caption's likelihood score at detecting and rejecting OOD images, which implies that the relatedness between the input image and the generated caption is encapsulated within the score.
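
The idea of rejecting OOD images by the caption's likelihood score can be illustrated with a minimal sketch. It assumes the score is the sum of per-token log-probabilities of the generated caption and that an image is rejected when the score falls below a threshold; the abstract does not specify the exact scoring or normalization, so the function names, the probabilities, and the threshold below are all hypothetical.

```python
import math

def caption_log_likelihood(token_probs):
    """Sum of log-probabilities over the generated caption's tokens.

    Assumption: token_probs holds the probability the captioning model
    assigned to each token it emitted (each value in (0, 1]).
    """
    return sum(math.log(p) for p in token_probs)

def is_ood(token_probs, threshold):
    """Flag the input image as OOD when the caption score is below threshold."""
    return caption_log_likelihood(token_probs) < threshold

# Hypothetical per-token probabilities for two generated captions.
confident = [0.9, 0.8, 0.85, 0.9]   # caption for an in-distribution image
uncertain = [0.3, 0.2, 0.25, 0.1]   # caption for a corrupted image

# Hypothetical threshold; in practice it would be tuned on held-out data.
threshold = -2.0

print(is_ood(confident, threshold))
print(is_ood(uncertain, threshold))
```

A summed log-likelihood favors short captions, so a length-normalized variant (dividing by the number of tokens) is a common alternative; which form the authors use is not stated in the abstract.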
