Paper Title

Is neural language acquisition similar to natural? A chronological probing study

Paper Authors

Ekaterina Voloshina, Oleg Serikov, Tatiana Shavrina

Paper Abstract

The probing methodology allows one to obtain a partial representation of the linguistic phenomena stored in the inner layers of a neural network, using external classifiers and statistical analysis. Pre-trained transformer-based language models are widely used for both natural language understanding (NLU) and natural language generation (NLG) tasks, making them the most common choice for downstream applications. However, little analysis has been carried out on whether these models are pre-trained enough or contain knowledge that correlates with linguistic theory. We present a chronological probing study of transformer English models such as MultiBERT and T5, sequentially comparing the information about language that the models learn over the course of training on their corpora. The results show that 1) linguistic information is acquired in the early stages of training; 2) both language models demonstrate the ability to capture features from various levels of language, including morphology, syntax, and even discourse, while they can also fail inconsistently on tasks that are perceived as easy. We also introduce an open-source framework for chronological probing research, compatible with other transformer-based models: https://github.com/EkaterinaVoloshina/chronological_probing
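
As a rough illustration of the setup described in the abstract (an external classifier trained on frozen inner-layer representations, evaluated across successive pre-training checkpoints), the sketch below probes one layer of several MultiBERTs checkpoints with a logistic-regression classifier. This is not the authors' implementation: the checkpoint names assume the MultiBERTs naming used on the HuggingFace Hub, the step grid is hypothetical, and the toy sentences and labels are placeholders for a real probing dataset; see the repository linked above for the actual framework.

```python
# Minimal chronological-probing sketch, not the authors' implementation.
# Assumptions: MultiBERTs intermediate checkpoints are available on the
# HuggingFace Hub as "google/multiberts-seed_0-step_<N>k"; the probing task
# is a toy binary sentence classification standing in for a real dataset.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder probing data: replace with an actual probing task
# (e.g. a morphological, syntactic, or discourse-level labelling set).
sentences = ["The cats sleep.", "The cat sleep.", "She writes well.", "She write well.",
             "Dogs bark loudly.", "Dogs barks loudly.", "He runs fast.", "He run fast."]
labels = [1, 0, 1, 0, 1, 0, 1, 0]  # 1 = grammatical, 0 = ungrammatical


def layer_features(model_name, sents, layer=6):
    """Mean-pooled hidden states of one transformer layer, with the model frozen."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True).eval()
    feats = []
    with torch.no_grad():
        for s in sents:
            enc = tok(s, return_tensors="pt", truncation=True)
            hidden = model(**enc).hidden_states[layer]         # (1, seq_len, dim)
            feats.append(hidden.mean(dim=1).squeeze(0).numpy())
    return feats


def probe_accuracy(model_name, sents, labs, layer=6):
    """Train an external classifier on frozen features and return held-out accuracy."""
    X = layer_features(model_name, sents, layer)
    X_tr, X_te, y_tr, y_te = train_test_split(X, labs, test_size=0.25, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)


# Chronological sweep: the same probe applied to successive training checkpoints.
for step in ["0k", "20k", "100k", "1000k", "2000k"]:           # hypothetical step grid
    name = f"google/multiberts-seed_0-step_{step}"
    print(f"step {step}: probe accuracy = {probe_accuracy(name, sentences, labels):.3f}")
```

Plotting the probe accuracy for each checkpoint against the training step yields the kind of acquisition curve the study compares across linguistic levels.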
