Paper Title
Distilling Causal Effect from Miscellaneous Other-Class for Continual Named Entity Recognition
Paper Authors
Paper Abstract
Continual Learning for Named Entity Recognition (CL-NER) aims to learn a growing number of entity types over time from a stream of data. However, simply learning the Other-Class in the same way as new entity types amplifies catastrophic forgetting and leads to a substantial performance drop. The main cause is that Other-Class samples usually contain old entity types, and the old knowledge in these samples is not properly preserved. Through causal inference, we identify that the forgetting is caused by the missing causal effect from the old data. To this end, we propose a unified causal framework to retrieve the causality from both new entity types and Other-Class. Furthermore, we apply curriculum learning to mitigate the impact of label noise and introduce a self-adaptive weight to balance the causal effects between new entity types and Other-Class. Experimental results on three benchmark datasets show that our method outperforms state-of-the-art methods by a large margin. Moreover, our method can be combined with existing state-of-the-art methods to further improve performance on CL-NER.
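The abstract does not give the paper's actual formulation, but the general idea of a self-adaptive weight balancing two loss terms (the causal-effect term from new entity types vs. the one from Other-Class) can be sketched generically. Everything below, including the function names and the weighting heuristic, is an illustrative assumption, not the paper's method.

```python
# Illustrative sketch only: the paper's concrete loss formulation is not
# stated in the abstract, so this weighting heuristic is an assumption.
# Idea: blend two causal-effect loss terms with a weight that adapts to
# their relative magnitudes, so neither term dominates training.

def self_adaptive_weight(loss_new: float, loss_other: float) -> float:
    """Hypothetical adaptive weight: emphasize the currently larger term."""
    total = loss_new + loss_other
    if total == 0.0:
        return 0.5  # no signal from either term; split evenly
    return loss_other / total

def combined_loss(loss_new: float, loss_other: float) -> float:
    """Convex combination of the two causal-effect terms."""
    w = self_adaptive_weight(loss_new, loss_other)
    return (1.0 - w) * loss_new + w * loss_other
```

A convex combination keeps the blended loss on the same scale as its inputs; the adaptive weight merely shifts emphasis between the new-entity and Other-Class terms as their magnitudes drift during training.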