Generalization Through The Lens Of Leave-One-Out Error

Paper Title

Generalization Through The Lens Of Leave-One-Out Error

Authors

Gregor Bachmann, Thomas Hofmann, Aurélien Lucchi

Abstract

Despite the tremendous empirical success of deep learning models in solving various learning tasks, our theoretical understanding of their generalization ability remains very limited. Classical generalization bounds based on tools such as the VC dimension or Rademacher complexity are so far unsuitable for deep models, and it is doubtful that these techniques can yield tight bounds even in the most idealized settings (Nagarajan & Kolter, 2019). In this work, we instead revisit the concept of leave-one-out (LOO) error to measure the generalization ability of deep models in the so-called kernel regime. While popular in statistics, the LOO error has been largely overlooked in the context of deep learning. Building on the recently established connection between neural networks and kernel learning, we leverage the closed-form expression for the leave-one-out error, giving us access to an efficient proxy for the test error. We show both theoretically and empirically that the leave-one-out error is capable of capturing various phenomena in generalization theory, such as double descent, random labels or transfer learning. Our work therefore demonstrates that the leave-one-out error provides a tractable way to estimate the generalization ability of deep neural networks in the kernel regime, opening the door to potential new research directions in the field of generalization.
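
The abstract does not spell out the closed-form expression it refers to. As a rough illustration only, the sketch below uses the standard leave-one-out identity for kernel ridge regression, in which the i-th LOO residual equals [(K + λI)^{-1} y]_i divided by [(K + λI)^{-1}]_ii, and checks it against explicit retraining. The RBF kernel, the regularization strength lam, the synthetic data and all function names are assumptions made for this example; the paper's own setting (kernels arising from neural networks) is not reproduced here.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between rows of A and rows of B (illustrative choice)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def loo_residuals_closed_form(K, y, lam):
    """LOO residuals y_i - f^{-i}(x_i) from a single matrix inverse."""
    G = np.linalg.inv(K + lam * np.eye(len(y)))
    return (G @ y) / np.diag(G)


def loo_residuals_brute_force(K, y, lam):
    """Same residuals, obtained by refitting n times with one point held out."""
    n = len(y)
    res = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        K_sub = K[np.ix_(keep, keep)]
        alpha = np.linalg.solve(K_sub + lam * np.eye(n - 1), y[keep])
        res[i] = y[i] - K[i, keep] @ alpha
    return res


# Synthetic regression data (assumed, for demonstration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)

K = rbf_kernel(X, X)
lam = 1e-1
fast = loo_residuals_closed_form(K, y, lam)
slow = loo_residuals_brute_force(K, y, lam)

# Mean squared LOO residual, usable as a proxy for the test error.
loo_mse = float(np.mean(fast**2))
print("closed form matches retraining:", np.allclose(fast, slow))
print("LOO mean squared error:", loo_mse)
```

The closed-form route needs one inverse of an n × n matrix instead of n separate fits, which is what makes the LOO error cheap to evaluate in a kernel setting.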
