A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix

Paper title

A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix

Authors

Thang Doan, Mehdi Bennani, Bogdan Mazoure, Guillaume Rabusseau, Pierre Alquier

Abstract

Continual learning (CL) is a setting in which an agent has to learn from an incoming stream of data during its entire lifetime. Although major advances have been made in the field, one recurring problem which remains unsolved is that of Catastrophic Forgetting (CF). While the issue has been extensively studied empirically, little attention has been paid from a theoretical angle. In this paper, we show that the impact of CF increases as two tasks increasingly align. We introduce a measure of task similarity called the NTK overlap matrix which is at the core of CF. We analyze common projected gradient algorithms and demonstrate how they mitigate forgetting. Then, we propose a variant of Orthogonal Gradient Descent (OGD) which leverages structure of the data through Principal Component Analysis (PCA). Experiments support our theoretical findings and show how our method can help reduce CF on classical CL datasets.
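
As a rough illustration of the projected-gradient idea mentioned in the abstract, the sketch below removes from a new-task gradient its components along the top principal directions of gradients stored from an earlier task. It is not the authors' implementation: the function names `principal_directions` and `project_out`, the uncentered SVD used as the PCA step, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def principal_directions(grads, k):
    """Return the top-k principal directions of stored gradients.

    grads has one flattened gradient per row. Assumption: an uncentered
    PCA computed via SVD; the rows of the result are orthonormal.
    """
    _, _, vt = np.linalg.svd(grads, full_matrices=False)
    return vt[:k]

def project_out(g, basis):
    """Remove from g its components along the orthonormal rows of basis."""
    return g - basis.T @ (basis @ g)

# Synthetic stand-in for previous-task gradients: 20 samples, 50 parameters,
# lying in a 5-dimensional subspace (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
old_grads = rng.normal(size=(20, 5)) @ rng.normal(size=(5, 50))

basis = principal_directions(old_grads, k=5)
g_new = rng.normal(size=50)          # gradient computed on the new task
g_proj = project_out(g_new, basis)   # update direction orthogonal to the basis
```

Because the synthetic gradients have rank 5 and `k=5`, the projected update here is orthogonal to every stored gradient. With a smaller `k`, only the dominant directions would be protected, which trades memory for a looser guarantee.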
