Paper Title
Recipes for building an open-domain chatbot
Paper Authors
Paper Abstract
Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of skills that an expert conversationalist blends in a seamless way: providing engaging talking points and listening to their partners, and displaying knowledge, empathy and personality appropriately, while maintaining a consistent persona. We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy. We build variants of these recipes with 90M, 2.7B and 9.4B parameter models, and make our models and code publicly available. Human evaluations show our best models are superior to existing approaches in multi-turn dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing failure cases of our models.
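One concrete "choice of generation strategy" discussed in the paper is beam search constrained by a minimum response length, which discourages short, generic replies. The sketch below is illustrative only: the authors release their models and code through ParlAI, and the Hugging Face checkpoint name and API used here are an assumption about a community mirror of the 90M-parameter model, not part of the original release.

# Minimal sketch, assuming the 90M model is mirrored on the Hugging Face Hub
# under "facebook/blenderbot_small-90M" (an assumption, not the authors' release path).
from transformers import (
    BlenderbotSmallForConditionalGeneration,
    BlenderbotSmallTokenizer,
)

model_name = "facebook/blenderbot_small-90M"
tokenizer = BlenderbotSmallTokenizer.from_pretrained(model_name)
model = BlenderbotSmallForConditionalGeneration.from_pretrained(model_name)

# Encode one turn of dialogue context, then decode with beam search plus a
# minimum response length, the constrained generation strategy the paper favors.
context = "Hello, how are you doing today?"
inputs = tokenizer(context, return_tensors="pt")
reply_ids = model.generate(
    **inputs,
    num_beams=10,    # beam search rather than greedy decoding or sampling
    min_length=20,   # rule out very short, bland replies
    max_length=128,
)
print(tokenizer.decode(reply_ids[0], skip_special_tokens=True))

In the paper's human evaluations, this kind of minimum-length constraint is one of the ingredients, alongside blended training data, that lets the large models produce engaging multi-turn responses.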