Paper Title
Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models
Paper Authors
Paper Abstract
We investigate what kind of structural knowledge learned in neural network encoders is transferable to processing natural language. We design artificial languages with structural properties that mimic natural language, pretrain encoders on the data, and see how much performance the encoder exhibits on downstream tasks in natural language. Our experimental results show that pretraining with an artificial language with a nesting dependency structure provides some knowledge transferable to natural language. A follow-up probing analysis indicates that its success in the transfer is related to the amount of encoded contextual information and what is transferred is the knowledge of position-aware context dependence of language. Our results provide insights into how neural network encoders process human languages and the source of cross-lingual transferability of recent multilingual language models.
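The abstract describes the method only at a high level. As a concrete illustration of what "an artificial language with a nesting dependency structure" might look like, the sketch below generates toy sentences in which each opening token must be closed by its matching partner, so dependencies nest like brackets. The vocabulary size, maximum depth, and token naming are assumptions made for illustration and are not the paper's actual data-generation procedure.

```python
import random

# Minimal sketch (assumed setup, not the authors' actual procedure):
# sample sentences from a synthetic vocabulary of paired tokens whose
# dependencies close in a nested (Dyck-like) order, suitable as a toy
# pretraining corpus for an encoder.

VOCAB_SIZE = 1000   # hypothetical number of head/dependent token pairs
MAX_DEPTH = 5       # hypothetical maximum nesting depth

def sample_nested_sentence(depth: int = 0) -> list:
    """Recursively build a sentence whose dependencies nest like brackets."""
    if depth >= MAX_DEPTH or random.random() < 0.3:
        return []
    pair_id = random.randrange(VOCAB_SIZE)
    inner = sample_nested_sentence(depth + 1)  # dependencies closing inside the pair
    rest = sample_nested_sentence(depth)       # sibling pairs at the same level
    return [f"open_{pair_id}"] + inner + [f"close_{pair_id}"] + rest

if __name__ == "__main__":
    random.seed(0)
    for _ in range(3):
        print(" ".join(sample_nested_sentence()))
```

In this toy setting, the matching `open_*`/`close_*` pairs force the encoder to track which dependencies are still unresolved at each position, which is one plausible way to expose it to the kind of position-aware context dependence the abstract refers to.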