Paper Title

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

Paper Authors

Xie, Tianbao, Wu, Chen Henry, Shi, Peng, Zhong, Ruiqi, Scholak, Torsten, Yasunaga, Michihiro, Wu, Chien-Sheng, Zhong, Ming, Yin, Pengcheng, Wang, Sida I., Zhong, Victor, Wang, Bailin, Li, Chengzu, Boyle, Connor, Ni, Ansong, Yao, Ziyu, Radev, Dragomir, Xiong, Caiming, Kong, Lingpeng, Zhang, Rui, Smith, Noah A., Zettlemoyer, Luke, Yu, Tao

Paper Abstract

Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases. Since the inputs and outputs of SKG tasks are heterogeneous, they have been studied separately by different communities, which limits systematic and compatible research on SKG. In this paper, we overcome this limitation by proposing the UnifiedSKG framework, which unifies 21 SKG tasks into a text-to-text format, aiming to promote systematic SKG research, instead of being exclusive to a single task, domain, or dataset. We use UnifiedSKG to benchmark T5 with different sizes and show that T5, with simple modifications when necessary, achieves state-of-the-art performance on almost all of the 21 tasks. We further demonstrate that multi-task prefix-tuning improves the performance on most tasks, largely improving the overall performance. UnifiedSKG also facilitates the investigation of zero-shot and few-shot learning, and we show that T0, GPT-3, and Codex struggle in zero-shot and few-shot learning for SKG. We also use UnifiedSKG to conduct a series of controlled experiments on structured knowledge encoding variants across SKG tasks. UnifiedSKG is easily extensible to more tasks, and it is open-sourced at https://github.com/hkunlp/unifiedskg.
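
To make the unified text-to-text format concrete, the minimal sketch below concatenates a user request with a linearized table and feeds the resulting string to an off-the-shelf T5 checkpoint via Hugging Face Transformers. The table-QA example, the linearization scheme (the `col:`/`row:` markers), and the `t5-base` checkpoint are assumptions chosen for illustration only; they are not necessarily the exact serialization or configuration used in the UnifiedSKG codebase.

```python
# Illustrative sketch of the text-to-text idea behind UnifiedSKG:
# structured knowledge is flattened into plain text, appended to the
# user request, and handled by a seq2seq language model.
# The serialization format and checkpoint below are assumptions, not
# the exact scheme from the UnifiedSKG repository.
from transformers import T5ForConditionalGeneration, T5Tokenizer


def linearize_table(table):
    """Flatten a table (headers + rows) into a single plain-text string."""
    header = " | ".join(table["header"])
    rows = " ".join("row: " + " | ".join(r) for r in table["rows"])
    return f"col: {header} {rows}"


# Hypothetical table-QA instance (WikiTableQuestions-style).
example = {
    "question": "Which city hosted the 2012 games?",
    "table": {
        "header": ["Year", "City"],
        "rows": [["2008", "Beijing"], ["2012", "London"]],
    },
}

# Unified input: user request + linearized structured knowledge, one string.
source = example["question"] + " " + linearize_table(example["table"])

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

inputs = tokenizer(source, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because every task is reduced to this single string-in, string-out interface, the same model and training loop can be shared across semantic parsing, table QA, and other SKG tasks, differing only in how the structured input is linearized.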
