Exploring The Design of Prompts For Applying GPT-3 based Chatbots: A Mental Wellbeing Case Study on Mechanical Turk

Paper title

Exploring The Design of Prompts For Applying GPT-3 based Chatbots: A Mental Wellbeing Case Study on Mechanical Turk

Authors

Kumar, Harsh; Musabirov, Ilya; Shi, Jiakai; Lauzon, Adele; Choy, Kwan Kiu; Gross, Ofek; Kulzhabayeva, Dana; Williams, Joseph Jay

Abstract

Large language models like GPT-3 have the potential to enable HCI designers and researchers to create more human-like and helpful chatbots for specific applications. But evaluating the feasibility of these chatbots and designing prompts that optimize GPT-3 for a specific task is challenging. We present a case study in tackling these questions, applying GPT-3 to a brief 5-minute chatbot that anyone can talk to in order to better manage their mood. We report a randomized factorial experiment with 945 participants on Mechanical Turk that tests three dimensions of prompt design to initialize the chatbot (identity, intent, and behaviour), and present both quantitative and qualitative analyses of conversations and user perceptions of the chatbot. We hope other HCI designers and researchers can build on this case study, for other applications of GPT-3 based chatbots to specific tasks, and build on and extend the methods we use for prompt design, and evaluation of the prompt design.
