Paper Title
Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks

Paper Authors

Arijit Sehanobish, Kawshik Kannan, Nabila Abraham, Anasuya Das, Benjamin Odry

Paper Abstract


Large pretrained Transformer-based language models like BERT and GPT have changed the landscape of Natural Language Processing (NLP). However, fine-tuning such models still requires a large number of training examples for each target task; annotating multiple datasets and training these models on various downstream tasks therefore becomes time-consuming and expensive. In this work, we propose a simple extension of Prototypical Networks for few-shot text classification. Our main idea is to replace the class prototypes with Gaussians and introduce a regularization term that encourages the examples to be clustered near the appropriate class centroids. Experimental results show that our method outperforms various strong baselines on 13 public and 4 internal datasets. Furthermore, we use the class distributions as a tool for detecting potential out-of-distribution (OOD) data points during deployment.
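The core idea (class prototypes as Gaussians rather than point centroids) can be illustrated with a minimal sketch. This is an assumption-laden simplification, not the paper's implementation: it fits a diagonal Gaussian per class from support-set embeddings and classifies a query by the highest diagonal-Gaussian log-likelihood, i.e. the smallest variance-scaled distance; the paper's exact parameterization, regularizer, and OOD criterion may differ.

```python
# Hypothetical sketch of variance-aware prototypes (diagonal Gaussians).
import numpy as np

def fit_gaussian_prototypes(embeddings, labels):
    """Fit a mean and diagonal variance per class from support embeddings."""
    protos = {}
    for c in np.unique(labels):
        x = embeddings[labels == c]
        mu = x.mean(axis=0)
        var = x.var(axis=0) + 1e-6  # small floor keeps the distances finite
        protos[c] = (mu, var)
    return protos

def class_scores(query, protos):
    """Negative log-likelihood (up to a constant) of the query under each
    class Gaussian; lower is better."""
    return {c: 0.5 * np.sum((query - mu) ** 2 / var + np.log(var))
            for c, (mu, var) in protos.items()}

def classify(query, protos):
    """Assign the class whose Gaussian best explains the query."""
    scores = class_scores(query, protos)
    return min(scores, key=scores.get)
```

A query whose best score exceeds some threshold under every class Gaussian could then be flagged as a potential OOD point, in the spirit of the abstract's deployment-time check (the thresholding scheme here is our assumption).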
