A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning

Paper Title

A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning

Authors

Hugo Berg, Siobhan Mackenzie Hall, Yash Bhalgat, Wonsuk Yang, Hannah Rose Kirk, Aleksandar Shtedritski, Max Bain

Abstract

Vision-language models can encode societal biases and stereotypes, but there are challenges to measuring and mitigating these multimodal harms due to lacking measurement robustness and feature degradation. To address these challenges, we investigate bias measures and apply ranking metrics for image-text representations. We then investigate debiasing methods and show that prepending learned embeddings to text queries that are jointly trained with adversarial debiasing and a contrastive loss reduces various bias measures with minimal degradation to the image-text representation.
