Title
Local Knowledge Powered Conversational Agents
Authors
Abstract
State-of-the-art conversational agents have advanced significantly in conjunction with the use of large transformer-based language models. However, even with these advancements, conversational agents still lack the ability to produce responses that are informative and coherent with the local context. In this work, we propose a dialog framework that incorporates both local knowledge and users' past dialogues to generate high-quality conversations. We introduce an approach to build a dataset based on Reddit conversations, where outbound URL links are widely available in the conversations and the hyperlinked documents can be naturally included as local external knowledge. Using our framework and dataset, we demonstrate through human evaluations that incorporating local knowledge substantially improves informativeness, coherency, and realisticness measures. In particular, our approach consistently outperforms the state-of-the-art conversational model on the Reddit dataset across all three measures. We also find that scaling our models from 117M to 8.3B parameters yields consistent improvements in validation perplexity as well as in human-evaluated metrics. Our model with 8.3B parameters can generate human-like responses as rated by various human evaluations in a single-turn dialog setting.
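The dataset-construction idea the abstract describes, treating outbound URL links in Reddit threads as local external knowledge paired with the conversation, can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's actual pipeline: the helper names and the `fetch_document` retrieval step are assumptions.

```python
import re

# Matches outbound http(s) links in comment text (illustrative pattern).
URL_PATTERN = re.compile(r"https?://[^\s)\]]+")

def extract_outbound_links(comment_text):
    """Return all outbound URLs mentioned in a comment."""
    return URL_PATTERN.findall(comment_text)

def build_example(parent_comment, reply_comment, fetch_document):
    """Assemble one (context, knowledge, response) training example.

    `fetch_document` is a placeholder for whatever step downloads and
    cleans the hyperlinked page before it is used as external knowledge.
    """
    links = extract_outbound_links(parent_comment)
    knowledge = [fetch_document(url) for url in links]
    return {
        "context": parent_comment,
        "knowledge": knowledge,
        "response": reply_comment,
    }

parent = "See the benchmark results at https://example.com/results for details."
example = build_example(parent, "Interesting, thanks!",
                        lambda url: f"<doc from {url}>")
print(example["knowledge"])  # ['<doc from https://example.com/results>']
```

In a full pipeline the fetched documents would be cleaned and truncated, then concatenated with the dialog context as conditioning input to the language model; here the retrieval is stubbed out with a lambda to keep the sketch self-contained.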