Paper Title

Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding

Paper Authors

Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang

Paper Abstract

Dataset bias has attracted increasing attention recently for its detrimental effect on the generalization ability of fine-tuned models. The current mainstream solution is designing an additional shallow model to pre-identify biased instances. However, such two-stage methods scale up the computational complexity of the training process and obstruct valid feature information while mitigating bias. To address this issue, we utilize the representation normalization method which aims at disentangling the correlations between features of encoded sentences. We find it also promising in eliminating the bias problem by providing isotropic data distribution. We further propose Kernel-Whitening, a Nystrom kernel approximation method to achieve more thorough debiasing on nonlinear spurious correlations. Our framework is end-to-end with similar time consumption to fine-tuning. Experiments show that Kernel-Whitening significantly improves the performance of BERT on out-of-distribution datasets while maintaining in-distribution accuracy.
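
As a rough illustration of the two ideas in the abstract, the sketch below whitens a matrix of sentence embeddings so that its feature dimensions become decorrelated and isotropic, and applies a Nystrom approximation of a kernel feature map before whitening so that nonlinear correlations are also removed. This is only a minimal NumPy sketch, not the paper's actual training procedure: the function names (zca_whiten, nystrom_rbf_features), the choice of an RBF kernel, and the hyperparameters (n_landmarks, gamma) are illustrative assumptions.

```python
import numpy as np

def zca_whiten(X, eps=1e-5):
    # Decorrelate feature dimensions so the covariance of the output is ~identity
    # (an isotropic distribution). eps guards against near-zero eigenvalues.
    mu = X.mean(axis=0, keepdims=True)
    Xc = X - mu
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    vals, vecs = np.linalg.eigh(cov)
    W = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return Xc @ W

def nystrom_rbf_features(X, n_landmarks=64, gamma=0.1, eps=1e-8, seed=0):
    # Nystrom approximation of an RBF kernel feature map:
    # sample m landmark rows, then map each sample to K(x, landmarks) @ K_mm^{-1/2}.
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=min(n_landmarks, X.shape[0]), replace=False)
    landmarks = X[idx]

    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    K_nm = rbf(X, landmarks)          # (n, m) kernel between all samples and landmarks
    vals, vecs = np.linalg.eigh(rbf(landmarks, landmarks))
    inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(np.maximum(vals, eps))) @ vecs.T
    return K_nm @ inv_sqrt            # approximate feature map, shape (n, m)

# Toy usage with random vectors standing in for sentence embeddings
# (in the paper's setting X would come from a BERT-style encoder).
X = np.random.randn(256, 32)
features = nystrom_rbf_features(X)   # lift embeddings into an approximate kernel space
isotropic = zca_whiten(features)     # whiten there, removing nonlinear correlations
print(np.round(np.cov(isotropic, rowvar=False)[:3, :3], 2))  # close to an identity block
```

In the actual method, as the abstract notes, this normalization is part of end-to-end fine-tuning rather than a post-hoc transform applied to frozen embeddings.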
