Paper Title
ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks
Paper Authors
Paper Abstract
State-of-the-art stereo matching networks trained only on synthetic data often fail to generalize to more challenging real data domains. In this paper, we examine an important factor that hinders networks from generalizing across domains through the lens of shortcut learning. We demonstrate that the learning of feature representations in stereo matching networks is heavily influenced by synthetic data artefacts (shortcut attributes). To mitigate this issue, we propose an Information-Theoretic Shortcut Avoidance (ITSA) approach that automatically restricts shortcut-related information from being encoded into the feature representations. As a result, our proposed method learns robust and shortcut-invariant features by minimizing the sensitivity of latent features to input variations. To avoid the prohibitive computational cost of direct input sensitivity optimization, we propose an effective and computationally feasible algorithm to achieve robustness. We show that, using this method, state-of-the-art stereo matching networks trained purely on synthetic data can effectively generalize to challenging and previously unseen real data scenarios. Importantly, the proposed method enhances the robustness of synthetically trained networks to the point that they outperform their counterparts fine-tuned on real data on challenging out-of-domain stereo datasets.
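The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea of penalizing how strongly latent features react to small input variations. It is a minimal sketch, assuming a toy one-layer tanh feature extractor and a single random finite-difference perturbation in place of a full input-sensitivity computation. The names `features`, `sensitivity_penalty`, `eps`, and `seed` are hypothetical and do not come from the paper, and this is not the ITSA algorithm itself.

```python
import math
import random


def features(weights, x):
    """Toy feature extractor (an assumption for illustration): z = tanh(W x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def sensitivity_penalty(weights, x, eps=1e-2, seed=0):
    """Squared change in the features under one small random input perturbation.

    A cheap stand-in for a full input-sensitivity term: it needs two forward
    passes and no explicit derivative of the features w.r.t. the input.
    """
    rng = random.Random(seed)
    # Random unit-length direction in input space.
    direction = [rng.gauss(0.0, 1.0) for _ in x]
    norm = math.sqrt(sum(d * d for d in direction))
    x_perturbed = [xi + eps * d / norm for xi, d in zip(x, direction)]
    z = features(weights, x)
    z_perturbed = features(weights, x_perturbed)
    # Normalize by eps^2 so the value approximates a squared directional derivative.
    return sum((a - b) ** 2 for a, b in zip(z, z_perturbed)) / eps ** 2


x = [0.5, -1.0, 2.0]
flat = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]      # features ignore the input
steep = [[1.0, -0.5, 0.2], [0.3, 0.8, -0.4]]   # features react to the input
print(sensitivity_penalty(flat, x))   # 0.0: features do not change at all
print(sensitivity_penalty(steep, x))  # strictly positive
```

In a training loop, a term of this kind would typically be added to the task loss with a weighting factor, so that the network is discouraged from encoding features that change sharply under small input changes. The weighting and the choice of perturbation direction are design decisions that this sketch leaves open.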