Paper Title

You Only Need a Good Embeddings Extractor to Fix Spurious Correlations

Authors

Raghav Mehta, Vítor Albiero, Li Chen, Ivan Evtimov, Tamar Glaser, Zhiheng Li, Tal Hassner

Abstract

Spurious correlations in training data often lead to robustness issues, since models learn to use them as shortcuts. For example, when predicting whether an object is a cow, a model might learn to rely on the green background, so it would do poorly on a cow on a sandy background. A standard dataset for measuring how well methods mitigate this problem is Waterbirds. The current best method, Group Distributionally Robust Optimization (GroupDRO), achieves 89\% worst-group accuracy, while standard training from scratch on raw images only reaches 72\%. GroupDRO requires training a model end-to-end with subgroup labels. In this paper, we show that we can achieve up to 90\% accuracy without using any subgroup information in the training set, simply by extracting embeddings with a large pre-trained vision model and training a linear classifier on top of them. Through experiments with a wide range of pre-trained models and pre-training datasets, we show that both the capacity of the pre-trained model and the size of the pre-training dataset matter. Our experiments reveal that high-capacity vision transformers perform better than high-capacity convolutional neural networks, and that larger pre-training datasets lead to better worst-group accuracy on the spurious-correlation dataset.
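The recipe the abstract describes is linear probing: freeze a large pre-trained model, use it only as an embedding extractor, and fit a linear classifier on the embeddings. Below is a minimal NumPy sketch of that pipeline, not the authors' implementation; since no specific extractor is named here, `extract_embeddings` is a hypothetical stand-in that produces synthetic features, where a real setup would call a frozen vision model (e.g. a ViT) instead.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64                         # embedding dimension (illustrative)
w_true = rng.normal(size=DIM)    # hidden direction that defines the labels

def extract_embeddings(n):
    """Hypothetical stand-in for a frozen pre-trained extractor.

    In the paper's setting this would be the penultimate-layer features of
    a large pre-trained vision model; here we draw synthetic embeddings
    whose labels are linearly separable, mimicking the assumption that a
    good extractor makes classes (not backgrounds) linearly decodable.
    """
    X = rng.normal(size=(n, DIM))
    y = (X @ w_true > 0).astype(int)
    return X, y

def train_linear_probe(X, y, lr=0.1, epochs=500):
    """Logistic-regression probe trained on frozen embeddings only."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid predictions
        w -= lr * (X.T @ (p - y)) / len(y)       # gradient step on w
        b -= lr * np.mean(p - y)                 # gradient step on b
    return w, b

# Fit the probe, then evaluate on held-out embeddings.
X_train, y_train = extract_embeddings(1000)
w, b = train_linear_probe(X_train, y_train)
X_test, y_test = extract_embeddings(500)
acc = np.mean(((X_test @ w + b) > 0).astype(int) == y_test)
```

Note that only `w` and `b` are trained; the extractor is never updated, which is what makes this setup cheap compared to end-to-end methods like GroupDRO.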
