Paper Title

Formal Language Recognition by Hard Attention Transformers: Perspectives from Circuit Complexity

Paper Authors

Yiding Hao, Dana Angluin, Robert Frank

Paper Abstract

This paper analyzes three formal models of Transformer encoders that differ in the form of their self-attention mechanism: unique hard attention (UHAT); generalized unique hard attention (GUHAT), which generalizes UHAT; and averaging hard attention (AHAT). We show that UHAT and GUHAT Transformers, viewed as string acceptors, can only recognize formal languages in the complexity class AC$^0$, the class of languages recognizable by families of Boolean circuits of constant depth and polynomial size. This upper bound subsumes Hahn's (2020) results that GUHAT cannot recognize the DYCK languages or the PARITY language, since those languages are outside AC$^0$ (Furst et al., 1984). In contrast, the non-AC$^0$ languages MAJORITY and DYCK-1 are recognizable by AHAT networks, implying that AHAT can recognize languages that UHAT and GUHAT cannot.
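The contrast between the attention variants can be made concrete with a small example. Below is a minimal NumPy sketch, our own illustration rather than the paper's formal construction: the function names and the toy MAJORITY recognizer are hypothetical. It shows how unique hard attention commits to a single maximal position, while averaging hard attention averages the values at all tied maxima, which is what lets an AHAT-style head compute the fraction of 1s that MAJORITY requires.

```python
import numpy as np

def unique_hard_attention(scores: np.ndarray) -> int:
    """UHAT-style attention: attend to exactly one position, here the
    leftmost argmax of the scores (one common tie-breaking convention)."""
    return int(np.argmax(scores))

def averaging_hard_attention(scores: np.ndarray, values: np.ndarray) -> np.ndarray:
    """AHAT-style attention: average the values at *all* positions that
    achieve the maximum score (ties are averaged, not broken)."""
    maxima = scores == scores.max()
    return values[maxima].mean(axis=0)

def recognizes_majority(bits: str) -> bool:
    """Toy illustration (hypothetical helper) of why AHAT can handle
    MAJORITY: with uniform (all-tied) scores, averaging hard attention
    returns the fraction of 1s, which a final threshold compares to 1/2.
    A unique-attention head would see only a single position's value."""
    values = np.array([[1.0] if b == "1" else [0.0] for b in bits])
    scores = np.zeros(len(bits))  # every position ties for the maximum
    frac_ones = averaging_hard_attention(scores, values)[0]
    return frac_ones > 0.5

print(unique_hard_attention(np.array([0.1, 0.9, 0.9, 0.3])))  # 1: leftmost max only
print(recognizes_majority("110101"))  # True: four 1s out of six
print(recognizes_majority("100100"))  # False: two 1s out of six
```

The design point the sketch isolates is that averaging over ties gives the model a counting ability: a uniform score vector turns attention into an exact average over the whole string, whereas a unique-attention head, however its ties are broken, propagates the value of only one position.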
