Title

An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

Authors

Taylor Sorensen, Joshua Robinson, Christopher Michael Rytting, Alexander Glenn Shaw, Kyle Jeffrey Rogers, Alexia Pauline Delorey, Mahmoud Khalil, Nancy Fulda, David Wingate

Abstract

Pre-trained language models derive substantial linguistic and factual knowledge from the massive corpora on which they are trained, and prompt engineering seeks to align these models to specific tasks. Unfortunately, existing prompt engineering methods require significant amounts of labeled data, access to model parameters, or both. We introduce a new method for selecting prompt templates *without labeled examples* and *without direct access to the model*. Specifically, over a set of candidate templates, we choose the template that maximizes the mutual information between the input and the corresponding model output. Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task. On the largest model, selecting prompts with our method gets 90% of the way from the average prompt accuracy to the best prompt accuracy and requires no ground truth labels.
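The selection criterion in the abstract can be sketched as follows. For each candidate template, one estimates the mutual information between inputs X and model outputs Y as I(X; Y) = H(E_x[p(y|x)]) − E_x[H(p(y|x))], i.e. the entropy of the averaged output distribution minus the average per-input output entropy, and picks the template with the highest score. This is a minimal illustration with hypothetical per-template output distributions (the template names and probabilities below are invented for the example, not taken from the paper):

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def mutual_information(output_dists):
    """I(X;Y) estimated from per-input output distributions p(y|x).

    I(X;Y) = H(mean_x p(y|x)) - mean_x H(p(y|x)):
    high when outputs are individually confident but vary across inputs.
    """
    n = len(output_dists)
    k = len(output_dists[0])
    marginal = [sum(d[j] for d in output_dists) / n for j in range(k)]
    avg_conditional_entropy = sum(entropy(d) for d in output_dists) / n
    return entropy(marginal) - avg_conditional_entropy

# Hypothetical: each template yields a label distribution per unlabeled input.
template_outputs = {
    "template_a": [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]],  # confident, varied
    "template_b": [[0.5, 0.5], [0.6, 0.4], [0.5, 0.5]],  # near-uniform
}

# Select the template maximizing mutual information -- no labels needed.
best = max(template_outputs, key=lambda t: mutual_information(template_outputs[t]))
print(best)  # -> template_a
```

Note that the score uses only the model's output distributions over unlabeled inputs, which is what lets the method work without ground truth labels or access to model parameters.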
