Paper Title
Identifying Spurious Correlations and Correcting them with an Explanation-based Learning
Authors
Abstract
Identifying spurious correlations learned by a trained model is at the core of refining that model and building a trustworthy one. We present a simple method to identify spurious correlations learned by a model trained for image classification. We apply image-level perturbations and monitor changes in the certainty of the trained model's predictions. We demonstrate this approach on an image classification dataset whose images contain synthetically generated spurious regions, and show that the trained model was overdependent on those regions. Moreover, we remove the learned spurious correlations with an explanation-based learning approach.
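The perturb-and-monitor idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which perturbations are used, so this example assumes simple patch occlusion. The names `certainty_drop_map` and `toy_predict_proba` are hypothetical, and the toy model is a stand-in that deliberately depends on a synthetic corner patch. It is not the authors' implementation.

```python
import numpy as np


def certainty_drop_map(predict_proba, image, label, patch=8, fill=0.0):
    """Occlude one patch at a time and record how much the predicted
    probability of `label` falls relative to the unperturbed image.

    Assumption: occlusion with a constant `fill` value stands in for the
    image-level perturbations mentioned in the abstract.
    """
    h, w = image.shape[:2]
    base = predict_proba(image)[label]
    rows = -(-h // patch)  # ceiling division
    cols = -(-w // patch)
    drops = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            perturbed = image.copy()
            perturbed[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = fill
            drops[i, j] = base - predict_proba(perturbed)[label]
    return drops


def toy_predict_proba(image):
    # Hypothetical stand-in for a trained classifier: it looks only at the
    # top-left 8x8 corner, mimicking reliance on a spurious region.
    score = image[:8, :8].mean()
    p1 = 1.0 / (1.0 + np.exp(-10.0 * (score - 0.5)))
    return np.array([1.0 - p1, p1])


# A flat grey image with a bright synthetic patch in the top-left corner.
image = np.full((32, 32), 0.3)
image[:8, :8] = 1.0

drops = certainty_drop_map(toy_predict_proba, image, label=1)
top = tuple(int(v) for v in np.unravel_index(np.argmax(drops), drops.shape))
print("largest certainty drop at patch", top)
```

A large drop concentrated in a region that should be irrelevant to the class, as with the corner patch here, is the kind of signal that would flag a candidate spurious correlation for closer inspection.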