Paper Title

How Much Knowledge Can You Pack Into the Parameters of a Language Model?

Paper Authors

Adam Roberts, Colin Raffel, Noam Shazeer

Paper Abstract

It has recently been observed that neural language models trained on unstructured text can implicitly store and retrieve knowledge using natural language queries. In this short paper, we measure the practical utility of this approach by fine-tuning pre-trained models to answer questions without access to any external context or knowledge. We show that this approach scales with model size and performs competitively with open-domain systems that explicitly retrieve answers from an external knowledge source when answering questions. To facilitate reproducibility and future work, we release our code and trained models at https://goo.gle/t5-cbqa.
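To illustrate the closed-book setting the abstract describes, below is a minimal sketch of querying one of the released T5 question-answering models with only a question and no retrieved context. It assumes the checkpoints from https://goo.gle/t5-cbqa are also available through the Hugging Face Transformers hub under a name such as google/t5-large-ssm-nq; that checkpoint name is an assumption, not something stated in the paper.

```python
# Minimal closed-book QA sketch: the model sees only the question text,
# so any correct answer must come from knowledge stored in its parameters.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed checkpoint name for the released closed-book QA models;
# the authoritative release is linked from https://goo.gle/t5-cbqa.
model_name = "google/t5-large-ssm-nq"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# No external passages are concatenated to the input.
question = "When was Franklin D. Roosevelt born?"
inputs = tokenizer(question, return_tensors="pt")

output_ids = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```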
