Paper Title

Lifelong Learning Without a Task Oracle

Authors

Amanda Rios, Laurent Itti

Abstract

Supervised deep neural networks are known to undergo a sharp decline in the accuracy of older tasks when new tasks are learned, termed "catastrophic forgetting". Many state-of-the-art solutions to continual learning rely on biasing and/or partitioning a model to accommodate successive tasks incrementally. However, these methods largely depend on the availability of a task oracle to confer task identities to each test sample, without which the models are entirely unable to perform. To address this shortcoming, we propose and compare several candidate task-assigning mappers that require very little memory overhead: (1) incremental unsupervised prototype assignment using nearest-means, Gaussian Mixture Model, or fuzzy ART backbones; (2) supervised incremental prototype assignment with fast fuzzy ARTMAP; (3) a shallow perceptron trained via a dynamic coreset. Our proposed model variants are trained either from pre-trained feature extractors or from task-dependent feature embeddings of the main classifier network. We apply these pipeline variants to continual-learning benchmarks comprising either sequences of several datasets or splits within a single dataset. Overall, these methods, despite their simplicity and compactness, perform very close to a ground-truth task oracle, especially in inter-dataset task-assignment experiments. Moreover, the best-performing variants impose an average parameter-memory increase of only 1.7%.
