Paper Title

Ternary Feature Masks: zero-forgetting for task-incremental learning

Paper Authors

Marc Masana, Tinne Tuytelaars, Joost van de Weijer

Paper Abstract

We propose an approach to continual learning without any forgetting for the task-aware regime, where at inference the task label is known. By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them. Using masks prevents both catastrophic forgetting and backward transfer. We argue, and show experimentally, that avoiding the former largely compensates for the lack of the latter, which is rarely observed in practice. In contrast to earlier works, our masks are applied to the features (activations) of each layer instead of the weights. This considerably reduces the number of mask parameters for each new task, by more than three orders of magnitude for most networks. Encoding the ternary masks into two bits per feature adds very little overhead to the network, avoiding scalability issues. To allow already learned features to adapt to the current task without changing the behavior of these features for previous tasks, we introduce task-specific feature normalization. Extensive experiments on several fine-grained datasets and ImageNet show that our method outperforms the current state of the art while reducing memory overhead in comparison to weight-based approaches.
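The mechanism described in the abstract lends itself to a compact illustration. Below is a minimal PyTorch sketch, our illustration rather than the authors' released code, of a linear layer with per-task feature masks and task-specific normalization. The ternary state encoding (0 = unused by this task, 1 = reused from a previous task, 2 = newly trainable), the `MaskedLinear` class name, and all parameter names are assumptions made for the example.

```python
# Minimal sketch (not the authors' implementation): per-task ternary
# masks applied to layer activations (features), plus task-specific
# feature normalization so reused features can adapt to a new task.
# Assumed states per feature: 0 = unused, 1 = reused (frozen),
# 2 = trainable for the current task.
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, in_features, out_features, num_tasks):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Three states need only two bits per feature; an int8 buffer
        # keeps this sketch simple while preserving the idea.
        self.register_buffer(
            "mask", torch.zeros(num_tasks, out_features, dtype=torch.int8))
        # Task-specific scale and shift: reused features get adapted
        # for the current task without changing what they compute for
        # previous tasks (each task indexes its own parameters).
        self.gamma = nn.Parameter(torch.ones(num_tasks, out_features))
        self.beta = nn.Parameter(torch.zeros(num_tasks, out_features))

    def forward(self, x, task_id):
        h = self.linear(x)
        m = self.mask[task_id]
        active = (m > 0).float()  # states 1 and 2 pass through the mask
        h = self.gamma[task_id] * h + self.beta[task_id]
        return h * active         # state-0 features are zeroed out

# Usage: forward a batch for a given task.
layer = MaskedLinear(8, 16, num_tasks=3)
layer.mask[0, :4] = 2  # task 0 trains the first four features
out = layer(torch.randn(2, 8), task_id=0)
```

In the paper's full setting, freezing would additionally apply to the weights that produce previously learned features; the sketch above only shows how a two-bit-per-feature state yields the binary forward mask and how per-task scale and shift let reused features adapt.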
