Paper title
QBERT: Generalist Model for Processing Questions
Paper authors
Paper abstract
Using a single model across a variety of tasks benefits both the training and the application of deep neural sequence models. We address the problem of developing generalist representations of text that can be used to perform a range of different tasks rather than being specialised to a single application. We focus on processing short questions and developing an embedding for these questions that is useful across a diverse set of problems, such as question topic classification, equivalent question recognition, and question answering. This paper introduces QBERT, a generalist model for processing questions. With QBERT, we demonstrate how to train a single multi-task network that performs all of these question-related tasks and achieves performance similar to that of the corresponding single-task models.