Paper Title
Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations
Paper Authors
Paper Abstract
Although large language models can be prompted for both zero- and few-shot learning, performance drops significantly when no demonstrations are available. In this paper, we introduce Z-ICL, a new zero-shot method that closes the gap by constructing pseudo-demonstrations for a given test input using a raw text corpus. Concretely, pseudo-demonstrations are constructed by (1) finding the nearest neighbors to the test input from the corpus and pairing them with random task labels, and (2) applying a set of techniques to reduce the amount of direct copying the model does from the resulting demonstrations. Evaluation on nine classification datasets shows that Z-ICL outperforms previous zero-shot methods by a significant margin, and is on par with in-context learning with labeled training data in the few-shot setting. Overall, Z-ICL provides a significantly higher estimate of the zero-shot performance levels of a model, and supports future efforts to develop better pseudo-demonstrations that further improve zero-shot results.
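To make the recipe concrete, below is a minimal Python sketch of step (1): retrieving the nearest corpus neighbors of a test input and pairing them with random task labels to form pseudo-demonstrations. The embedding model ("all-MiniLM-L6-v2"), the neighbor count k, the prompt template, and the helper names build_pseudo_demonstrations / format_prompt are illustrative assumptions, not the paper's exact setup; the copying-reduction techniques of step (2) are omitted.

```python
# A minimal sketch of pseudo-demonstration construction in the spirit of Z-ICL.
# Assumptions (not from the paper): the embedding model, k, seed, and the
# prompt template are illustrative choices only.
import random

import numpy as np
from sentence_transformers import SentenceTransformer


def build_pseudo_demonstrations(test_input, corpus, label_set, k=16, seed=0):
    """Retrieve the k corpus sentences nearest to the test input and pair
    each with a randomly chosen task label (step (1) of the abstract)."""
    model = SentenceTransformer("all-MiniLM-L6-v2")
    corpus_emb = model.encode(corpus, normalize_embeddings=True)
    query_emb = model.encode([test_input], normalize_embeddings=True)[0]
    # Cosine similarity reduces to a dot product on normalized embeddings.
    scores = corpus_emb @ query_emb
    top_k = np.argsort(-scores)[:k]
    rng = random.Random(seed)
    return [(corpus[i], rng.choice(label_set)) for i in top_k]


def format_prompt(demos, test_input):
    """Concatenate pseudo-demonstrations and the test input into a prompt
    that a language model can complete with a label."""
    lines = [f"Input: {x}\nLabel: {y}" for x, y in demos]
    lines.append(f"Input: {test_input}\nLabel:")
    return "\n\n".join(lines)
```

A usage example, under the same assumptions: given a sentiment task with label_set = ["positive", "negative"], a raw text corpus, and a test review, calling format_prompt(build_pseudo_demonstrations(review, corpus, label_set), review) yields a zero-shot prompt whose "demonstrations" require no labeled training data.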