Paper Title


ZeroGen: Efficient Zero-shot Learning via Dataset Generation

Paper Authors

Jiacheng Ye, Jiahui Gao, Qintong Li, Hang Xu, Jiangtao Feng, Zhiyong Wu, Tao Yu, Lingpeng Kong

Paper Abstract

There has been growing interest recently in dataset generation due to the superior generative capacity of large pre-trained language models (PLMs). In this paper, we study a flexible and efficient zero-shot learning method, \textsc{ZeroGen}. Given a zero-shot task, we first generate a dataset from scratch using PLMs in an unsupervised manner. Then, we train a tiny task model (e.g., LSTM) under the supervision of the synthesized dataset. This approach allows highly efficient inference, as the final task model has orders of magnitude fewer parameters compared with PLMs (e.g., GPT2-XL). Apart from being annotation-free and efficient, we argue that \textsc{ZeroGen} can also provide useful insights from the perspectives of data-free, model-agnostic knowledge distillation and unreferenced text generation evaluation. Experiments and analysis on different NLP tasks, namely text classification, question answering, and natural language inference, show the effectiveness of \textsc{ZeroGen}.
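The abstract describes a two-step recipe: prompt a PLM to synthesize labeled data for the target task, then train a small task model on that synthetic dataset. Below is a minimal sketch of this recipe in Python, assuming the Hugging Face transformers library; the sentiment prompts, label set, model choice, and sampling settings are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of the ZeroGen recipe (step 1: zero-shot dataset generation).
# Assumes the Hugging Face `transformers` library; the prompts and sampling
# hyperparameters below are illustrative, not the paper's exact setup.
from transformers import pipeline

# The paper uses GPT2-XL; plain GPT-2 keeps this sketch lightweight.
generator = pipeline("text-generation", model="gpt2")

# Hypothetical label-conditioned prompts for a text classification task.
prompts = {
    "positive": 'The movie review in positive sentiment is: "',
    "negative": 'The movie review in negative sentiment is: "',
}

synthetic_dataset = []
for label, prompt in prompts.items():
    outputs = generator(
        prompt,
        max_new_tokens=40,
        num_return_sequences=8,
        do_sample=True,
        top_k=40,
    )
    for out in outputs:
        # Keep only the generated continuation, truncated at the closing quote.
        text = out["generated_text"][len(prompt):].split('"')[0].strip()
        if text:
            synthetic_dataset.append((text, label))

# Step 2 (not shown): train a tiny task model, e.g. an LSTM classifier, on
# `synthetic_dataset`. Only this small model is deployed, which is where the
# inference-time efficiency claimed in the abstract comes from.
```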
