Paper Title

Improving group robustness under noisy labels using predictive uncertainty

Paper Authors

Dongpin Oh, Dae Lee, Jeunghyun Byun, Bonggun Shin

Paper Abstract

Standard empirical risk minimization (ERM) can underperform on certain minority groups (e.g., waterbirds on land or landbirds in water) due to spurious correlations between the input and its label. Several studies have improved the worst-group accuracy by focusing on high-loss samples. The hypothesis behind this is that such high-loss samples are \textit{spurious-cue-free} (SCF) samples. However, these approaches can be problematic because, in real-world scenarios, high-loss samples may also be samples with noisy labels. To resolve this issue, we utilize the predictive uncertainty of a model to improve the worst-group accuracy under noisy labels. To motivate this, we theoretically show that high-uncertainty samples are the SCF samples in the binary classification problem. This theoretical result implies that predictive uncertainty is an adequate indicator for identifying SCF samples in a noisy-label setting. Motivated by this, we propose a novel ENtropy-based Debiasing (END) framework that prevents models from learning spurious cues while remaining robust to noisy labels. In the END framework, we first train an \textit{identification model} and use its predictive uncertainty to obtain the SCF samples from the training set. Then, another model is trained on the dataset augmented with an oversampled SCF set. Experimental results show that our END framework outperforms other strong baselines on several real-world benchmarks that involve both noisy labels and spurious cues.
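The abstract does not give implementation details, but the selection step it describes (score each training sample by the identification model's predictive entropy, take the most uncertain samples as the candidate SCF set, then oversample them) can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the function names, the `fraction` threshold, and the fixed oversampling `repeat` factor are all assumptions, and the softmax probabilities are assumed to come from an already-trained identification model.

```python
import numpy as np

def predictive_entropy(probs):
    """Per-sample entropy (natural log) of an (N, C) array of class probabilities."""
    probs = np.clip(probs, 1e-12, 1.0)  # guard against log(0)
    return -(probs * np.log(probs)).sum(axis=1)

def select_scf_indices(probs, fraction=0.2):
    """Indices of the top `fraction` most-uncertain samples (candidate SCF set)."""
    ent = predictive_entropy(probs)
    k = max(1, int(len(ent) * fraction))
    return np.argsort(ent)[-k:]  # highest-entropy samples

def oversample(indices, repeat=3):
    """Naive oversampling: repeat the SCF indices `repeat` times for augmentation."""
    return np.concatenate([indices] * repeat)

# Hypothetical usage: probs would be the identification model's softmax outputs.
probs = np.array([[0.5, 0.5], [0.99, 0.01], [0.6, 0.4], [0.95, 0.05]])
scf = select_scf_indices(probs, fraction=0.5)   # the two most uncertain samples
augmented = oversample(scf, repeat=3)
```

The second model would then be trained on the original dataset concatenated with the rows indexed by `augmented`; in practice the fraction and oversampling rate would be tuned, and the paper's theoretical result motivates entropy (rather than loss) as the selection score.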
