Paper Title
Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models
Paper Authors
Paper Abstract
The success of multilingual pre-trained models is underpinned by their ability to learn representations shared by multiple languages even in the absence of any explicit supervision. However, it remains unclear how these models learn to generalise across languages. In this work, we conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar. In particular, we investigate whether morphosyntactic information is encoded in the same subset of neurons in different languages. We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe. Our findings show that the cross-lingual overlap between neurons is significant, but its extent may vary across categories and depends on language proximity and pre-training data size.
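To make the notion of cross-lingual neuron overlap concrete, below is a minimal sketch, not the paper's own neuron-level probe: assuming per-neuron importance scores are already available for one morphosyntactic category in two languages, it compares their top-k neuron sets against the overlap expected by chance. The function names, the hidden size of 768, and k = 50 are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' implementation) of measuring cross-lingual
# neuron overlap: given per-language importance scores for each hidden unit
# (e.g. from a neuron-level probe for one morphosyntactic category), compare
# the top-k neuron sets of two languages against chance-level overlap.
import numpy as np

def top_k_neurons(scores: np.ndarray, k: int) -> set:
    """Indices of the k neurons with the highest probe importance scores."""
    return set(np.argsort(scores)[-k:].tolist())

def cross_lingual_overlap(scores_a: np.ndarray, scores_b: np.ndarray, k: int):
    """Observed overlap of the two top-k neuron sets and the overlap expected
    if both sets of k neurons were drawn at random from all hidden units."""
    shared = len(top_k_neurons(scores_a, k) & top_k_neurons(scores_b, k))
    expected = k * k / len(scores_a)  # expectation for two random size-k subsets
    return shared, expected

# Toy usage with random scores standing in for real probe outputs
# (hidden size 768 as in BERT-base; k = 50 is an arbitrary choice).
rng = np.random.default_rng(0)
scores_en, scores_de = rng.random(768), rng.random(768)
shared, expected = cross_lingual_overlap(scores_en, scores_de, k=50)
print(f"shared top-50 neurons: {shared} (chance level ~ {expected:.1f})")
```

With real probe scores, an observed overlap well above the chance level would indicate that the two languages rely on a common subset of neurons for that category, which is the kind of effect the study quantifies across 43 languages and 14 categories.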