Paper Title

The Surprising Computational Power of Nondeterministic Stack RNNs

Paper Authors

Brian DuSell, David Chiang

Paper Abstract

Traditional recurrent neural networks (RNNs) have a fixed, finite number of memory cells. In theory (assuming bounded range and precision), this limits their formal language recognition power to regular languages, and in practice, RNNs have been shown to be unable to learn many context-free languages (CFLs). In order to expand the class of languages RNNs recognize, prior work has augmented RNNs with a nondeterministic stack data structure, putting them on par with pushdown automata and increasing their language recognition power to CFLs. Nondeterminism is needed for recognizing all CFLs (not just deterministic CFLs), but in this paper, we show that nondeterminism and the neural controller interact to produce two more unexpected abilities. First, the nondeterministic stack RNN can recognize not only CFLs, but also many non-context-free languages. Second, it can recognize languages with much larger alphabet sizes than one might expect given the size of its stack alphabet. Finally, to increase the information capacity in the stack and allow it to solve more complicated tasks with large alphabet sizes, we propose a new version of the nondeterministic stack that simulates stacks of vectors rather than discrete symbols. We demonstrate perplexity improvements with this new model on the Penn Treebank language modeling benchmark.
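
To make the general idea of a stack-augmented RNN concrete, here is a minimal sketch, assuming PyTorch, of an RNN controller coupled to a differentiable stack of vectors: at each step the controller reads the soft top of the stack and emits a mixture of push/pop/no-op actions applied by superposition. This toy model is only an illustration of the stack-RNN family; it is not the paper's nondeterministic stack RNN, which instead simulates all runs of a weighted pushdown automaton via dynamic programming. The class name ToyStackRNN and all sizes are hypothetical.

```python
# Illustrative sketch only: a toy differentiable (superposition-style) stack RNN.
# NOT the paper's nondeterministic stack RNN, which simulates a weighted
# pushdown automaton; this just shows an RNN controller driving a stack of vectors.
import torch
import torch.nn as nn

class ToyStackRNN(nn.Module):
    def __init__(self, input_size, hidden_size, stack_dim, stack_depth=16):
        super().__init__()
        self.cell = nn.LSTMCell(input_size + stack_dim, hidden_size)
        self.actions = nn.Linear(hidden_size, 3)      # weights for push / pop / no-op
        self.push_vec = nn.Linear(hidden_size, stack_dim)
        self.depth = stack_depth
        self.stack_dim = stack_dim

    def forward(self, xs):                             # xs: (batch, time, input_size)
        B, T, _ = xs.shape
        h = xs.new_zeros(B, self.cell.hidden_size)
        c = torch.zeros_like(h)
        stack = xs.new_zeros(B, self.depth, self.stack_dim)
        outs = []
        for t in range(T):
            top = stack[:, 0]                          # soft top-of-stack reading
            h, c = self.cell(torch.cat([xs[:, t], top], dim=-1), (h, c))
            a = torch.softmax(self.actions(h), dim=-1) # (push, pop, no-op) mixture
            v = torch.tanh(self.push_vec(h))           # vector to push (not a discrete symbol)
            pushed = torch.cat([v.unsqueeze(1), stack[:, :-1]], dim=1)
            popped = torch.cat([stack[:, 1:], torch.zeros_like(stack[:, :1])], dim=1)
            stack = (a[:, 0, None, None] * pushed
                     + a[:, 1, None, None] * popped
                     + a[:, 2, None, None] * stack)    # superpose the three actions
            outs.append(h)
        return torch.stack(outs, dim=1)                # (batch, time, hidden_size)
```

For example, `ToyStackRNN(8, 32, 16)(torch.randn(2, 5, 8))` returns a `(2, 5, 32)` tensor of controller states, which a language model would feed into an output projection over the vocabulary.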
