Paper Title
Discovering Language-neutral Sub-networks in Multilingual Language Models
Paper Authors
Paper Abstract
Multilingual pre-trained language models transfer remarkably well on cross-lingual downstream tasks. However, the extent to which they learn language-neutral representations (i.e., shared representations that encode similar phenomena across languages), and the effect of such representations on cross-lingual transfer performance, remain open questions. In this work, we conceptualize language neutrality of multilingual models as a function of the overlap between language-encoding sub-networks of these models. We employ the lottery ticket hypothesis to discover sub-networks that are individually optimized for various languages and tasks. Our evaluation across three distinct tasks and eleven typologically-diverse languages demonstrates that sub-networks for different languages are topologically similar (i.e., language-neutral), making them effective initializations for cross-lingual transfer with limited performance degradation.
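To make the two ideas in the abstract concrete, below is a minimal sketch (not the authors' released code) of (1) extracting a lottery-ticket-style binary sub-network mask from a model fine-tuned on one language, and (2) quantifying sub-network overlap between two languages. It assumes one-shot magnitude pruning and a Jaccard overlap score for simplicity; the paper's actual procedure and similarity measure may differ, and the function names `magnitude_mask` and `mask_overlap` are illustrative.

```python
# Sketch: per-language sub-network masks via magnitude pruning, plus
# an overlap score between two masks. Assumptions (not from the paper):
# one-shot pruning instead of iterative pruning, Jaccard overlap metric.
import torch


def magnitude_mask(model: torch.nn.Module, sparsity: float = 0.5) -> dict:
    """Keep the (1 - sparsity) fraction of weights with largest magnitude."""
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:  # skip biases / LayerNorm parameters
            continue
        flat = param.detach().abs().flatten()
        k = int(flat.numel() * sparsity)
        # k-th smallest magnitude is the pruning threshold; keep everything above it
        threshold = flat.kthvalue(k).values if k > 0 else flat.min() - 1
        masks[name] = param.detach().abs() > threshold
    return masks


def mask_overlap(masks_a: dict, masks_b: dict) -> float:
    """Jaccard overlap of the kept weights across two sub-network masks."""
    inter, union = 0, 0
    for name in masks_a.keys() & masks_b.keys():
        a, b = masks_a[name], masks_b[name]
        inter += (a & b).sum().item()
        union += (a | b).sum().item()
    return inter / union if union else 0.0
```

Usage under this sketch: fine-tune two copies of the same multilingual model on two languages, call `magnitude_mask` on each, and pass the resulting masks to `mask_overlap`; a score near 1.0 would indicate topologically similar (language-neutral) sub-networks in the abstract's sense.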