Paper Title

Generalization Bounds for Few-Shot Transfer Learning with Pretrained Classifiers

Paper Authors

Tomer Galanti, András György, Marcus Hutter

Paper Abstract

We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes. Recent results in the literature show that representations learned by a single classifier over many classes are competitive on few-shot learning problems with representations learned by special-purpose algorithms designed for such problems. We offer a theoretical explanation for this behavior based on the recently discovered phenomenon of class-feature-variability collapse, that is, that during the training of deep classification networks the feature embeddings of samples belonging to the same class tend to concentrate around their class means. More specifically, we show that the few-shot error of the learned feature map on new classes (defined as the classification error of the nearest class-center classifier using centers learned from a small number of random samples from each new class) is small in case of class-feature-variability collapse, under the assumption that the classes are selected independently from a fixed distribution. This suggests that foundation models can provide feature maps that are transferable to new downstream tasks, even with very few samples; to our knowledge, this is the first performance bound for transfer-learning that is non-vacuous in the few-shot setting.
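
For concreteness, below is a minimal sketch (not from the paper) of the nearest class-center classifier that the abstract's few-shot error is defined over: class centers are estimated from a handful of labeled support samples per new class, and each query is assigned to the closest center in feature space. The `feature_map` callable and all names are illustrative assumptions, standing in for the pretrained foundation model's feature extractor.

```python
import numpy as np

def ncc_few_shot_classifier(feature_map, support_x, support_y, query_x):
    """Nearest class-center (NCC) classification with a pretrained feature map.

    feature_map: callable mapping an array of inputs to a (n, d) feature array
                 (hypothetical stand-in for the pretrained model's embedding).
    support_x:   few labeled samples drawn from the new classes.
    support_y:   their class labels, shape (n_support,).
    query_x:     samples to classify.
    """
    # Embed the support samples and average them per class to get class centers.
    support_feats = feature_map(support_x)                     # (n_support, d)
    classes = np.unique(support_y)
    centers = np.stack(
        [support_feats[support_y == c].mean(axis=0) for c in classes]
    )                                                          # (n_classes, d)

    # Embed the queries and assign each one to its nearest class center.
    query_feats = feature_map(query_x)                         # (n_query, d)
    dists = np.linalg.norm(
        query_feats[:, None, :] - centers[None, :, :], axis=-1
    )                                                          # (n_query, n_classes)
    return classes[np.argmin(dists, axis=1)]
```

If class-feature-variability collapse holds, embeddings of same-class samples cluster tightly around their class mean, so even centers estimated from very few support samples are close to the true class means, which is what makes the few-shot error of this simple classifier small on new classes.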
