Introducing Orthogonal Constraint in Structural Probes

Paper Title

Introducing Orthogonal Constraint in Structural Probes

Authors

Tomasz Limisiewicz, David Mareček

Abstract

With the recent success of pre-trained models in NLP, considerable attention has turned to interpreting their representations. One of the most prominent approaches is structural probing (Hewitt and Manning, 2019), in which a linear projection of word embeddings is learned to approximate the topology of dependency structures. In this work, we introduce a new type of structural probing in which the linear projection is decomposed into (1) an isomorphic space rotation and (2) a linear scaling that identifies and scales the most relevant dimensions. In addition to syntactic dependency, we evaluate our method on novel tasks (lexical hypernymy and position in a sentence). We jointly train the probes for multiple tasks and show experimentally that lexical and syntactic information is separated in the representations. Moreover, the orthogonal constraint makes structural probes less vulnerable to memorization.
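
The decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the names `random_orthogonal` and `probe_distances` are hypothetical, the rotation and scaling values are random or hand-picked rather than trained, and the squared-distance form is assumed from the structural probe of Hewitt and Manning (2019). The abstract does not say how the orthogonal constraint is enforced during training, so that part is left out.

```python
import numpy as np


def random_orthogonal(dim, rng):
    """Return a random orthogonal matrix (stand-in for a learned rotation)."""
    # The Q factor of a QR decomposition of a square Gaussian matrix is orthogonal.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q


def probe_distances(embeddings, rotation, scale):
    """Squared distances between word embeddings after rotation and scaling.

    embeddings: (n_words, dim) array of word vectors
    rotation:   (dim, dim) orthogonal matrix
    scale:      (dim,) per-dimension scaling vector
    """
    # Step 1: rotate every embedding. Step 2: rescale each rotated dimension.
    projected = (embeddings @ rotation.T) * scale
    # Pairwise differences via broadcasting, then squared L2 norm.
    diffs = projected[:, None, :] - projected[None, :, :]
    return (diffs ** 2).sum(axis=-1)


rng = np.random.default_rng(0)
dim, n_words = 8, 5
rotation = random_orthogonal(dim, rng)
# Hypothetical scaling vector: only two rotated dimensions are treated as relevant.
scale = np.array([1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
embeddings = rng.standard_normal((n_words, dim))
distances = probe_distances(embeddings, rotation, scale)
```

Because the rotation is orthogonal, it leaves pairwise distances unchanged on its own; in this sketch only the scaling vector selects and weights dimensions, which is one way to read the abstract's statement that the scaling "identifies and scales the most relevant dimensions".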
