Paper Title

Lipschitz Recurrent Neural Networks

Paper Authors

Erichson, N. Benjamin, Azencot, Omri, Queiruga, Alejandro, Hodgkinson, Liam, Mahoney, Michael W.

Abstract

Viewing recurrent neural networks (RNNs) as continuous-time dynamical systems, we propose a recurrent unit that describes the hidden state's evolution with two parts: a well-understood linear component plus a Lipschitz nonlinearity. This particular functional form facilitates stability analysis of the long-term behavior of the recurrent unit using tools from nonlinear systems theory. In turn, this enables architectural design decisions before experimentation. Sufficient conditions for global stability of the recurrent unit are obtained, motivating a novel scheme for constructing hidden-to-hidden matrices. Our experiments demonstrate that the Lipschitz RNN can outperform existing recurrent units on a range of benchmark tasks, including computer vision, language modeling and speech prediction tasks. Finally, through Hessian-based analysis we demonstrate that our Lipschitz recurrent unit is more robust with respect to input and parameter perturbations as compared to other continuous-time RNNs.
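The recurrent unit described above can be sketched numerically. The following is a minimal, hedged illustration (not the authors' reference implementation): the hidden state follows an ODE of the form dh/dt = A h + tanh(W h + U x + b), where A and W are built as a combination of the symmetric and skew-symmetric parts of a trainable matrix, shifted by a small negative multiple of the identity to encourage stability. The names `hidden_matrix`, `lipschitz_rnn_step`, and the specific hyperparameter values are illustrative assumptions.

```python
import numpy as np

def hidden_matrix(M, beta=0.75, gamma=0.001):
    # Illustrative construction of a well-behaved hidden-to-hidden matrix:
    # a weighted combination of the symmetric part (M + M^T) and the
    # skew-symmetric part (M - M^T), shifted by -gamma * I.
    # beta and gamma are hyperparameters controlling the mix and the shift.
    sym = M + M.T
    skew = M - M.T
    return (1 - beta) * sym + beta * skew - gamma * np.eye(M.shape[0])

def lipschitz_rnn_step(h, x, A, W, U, b, dt=0.1):
    # One forward-Euler step of the continuous-time recurrence
    #   dh/dt = A h + tanh(W h + U x + b):
    # a linear component A h plus a Lipschitz nonlinearity (tanh).
    return h + dt * (A @ h + np.tanh(W @ h + U @ x + b))

# Toy usage: integrate a few steps for a random input.
rng = np.random.default_rng(0)
d, m = 8, 4                                  # hidden and input dimensions
A = hidden_matrix(rng.standard_normal((d, d)))
W = hidden_matrix(rng.standard_normal((d, d)))
U = rng.standard_normal((d, m))
b = np.zeros(d)
h = np.zeros(d)
x = rng.standard_normal(m)
for _ in range(10):
    h = lipschitz_rnn_step(h, x, A, W, U, b)
```

Because tanh is 1-Lipschitz and the linear part is well understood, the long-term behavior of this unit can be analyzed with nonlinear systems theory, which is what motivates the stability conditions discussed in the abstract.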
