Paper Title

Preventing RNN from Using Sequence Length as a Feature

Paper Authors

Jean-Thomas Baillargeon, Hélène Cossette, Luc Lamontagne

Paper Abstract

Recurrent neural networks are deep learning topologies that can be trained to classify long documents. However, in our recent work, we found a critical problem with these cells: they can use the length differences between texts of different classes as a prominent classification feature. This has the effect of producing models that are brittle and fragile to concept drift, can provide misleading performances and are trivially explainable regardless of text content. This paper illustrates the problem using synthetic and real-world data and provides a simple solution using weight decay regularization.
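The fix the abstract names is weight decay regularization. Below is a minimal, hypothetical sketch (not the authors' released code) of where the weight_decay term enters the training of a standard PyTorch LSTM text classifier; the class name, hyperparameters, and data are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative sketch: an LSTM classifier trained with weight decay,
# the regularization the abstract proposes. All names and values here
# are assumptions for demonstration, not the paper's implementation.

class LengthAgnosticClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(embedded)      # final hidden state
        return self.fc(h_n[-1])                # (batch, num_classes)

model = LengthAgnosticClassifier()

# Weight decay (L2 regularization) shrinks the recurrent weights,
# penalizing hidden-state dynamics that grow monotonically with the
# number of time steps -- the channel through which an RNN can encode
# raw document length as a feature.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step each on random short and long batches.
short_batch = torch.randint(1, 10_000, (8, 20))    # short documents
long_batch = torch.randint(1, 10_000, (8, 200))    # long documents
labels = torch.randint(0, 2, (8,))

for batch in (short_batch, long_batch):
    optimizer.zero_grad()
    loss = criterion(model(batch), labels)
    loss.backward()
    optimizer.step()
```

The design intuition, consistent with the abstract's claim, is that constraining the magnitude of the recurrent weights limits how far the hidden state can drift as more time steps accumulate, so the classifier must rely on text content rather than sequence length.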
