Title
Asymptotic Analysis of Deep Residual Networks
Authors
Abstract
We investigate the asymptotic properties of deep Residual networks (ResNets) as the number of layers increases. We first show the existence of scaling regimes for trained weights markedly different from those implicitly assumed in the neural ODE literature. We study the convergence of the hidden state dynamics in these scaling regimes, showing that one may obtain an ODE, a stochastic differential equation (SDE) or neither of these. In particular, our findings point to the existence of a diffusive regime in which the deep network limit is described by a class of stochastic differential equations (SDEs). Finally, we derive the corresponding scaling limits for the backpropagation dynamics.
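The distinction between the ODE and diffusive regimes can be illustrated with a toy simulation. The sketch below is a minimal, hypothetical example (not the paper's exact model): it iterates a scalar ResNet-style update h_{k+1} = h_k + L^(-beta) * w_k * tanh(h_k) with i.i.d. zero-mean weights. With beta = 1 the increments are O(1/L) and the trajectory concentrates (ODE-like limit); with beta = 1/2 the increments are O(1/sqrt(L)) and the trajectory retains O(1) random fluctuations, mimicking the diffusive (SDE) regime.

```python
import numpy as np

def resnet_hidden_states(L, beta, rng):
    """Iterate h_{k+1} = h_k + L**(-beta) * w_k * tanh(h_k) for k = 0..L-1,
    with i.i.d. standard normal weights w_k. Returns the full trajectory.

    beta = 1.0 -> increments O(1/L): deterministic (ODE-like) scaling.
    beta = 0.5 -> increments O(1/sqrt(L)): diffusive (SDE-like) scaling.
    """
    h = 1.0
    trajectory = [h]
    step = L ** (-beta)
    for _ in range(L):
        w = rng.standard_normal()
        h = h + step * w * np.tanh(h)
        trajectory.append(h)
    return np.array(trajectory)

rng = np.random.default_rng(0)
L = 10_000
ode_like = resnet_hidden_states(L, 1.0, rng)  # small, vanishing increments
sde_like = resnet_hidden_states(L, 0.5, rng)  # Brownian-like fluctuations

# The diffusive trajectory fluctuates far more than the ODE-like one.
print("ODE-like increment std:", np.std(np.diff(ode_like)))
print("SDE-like increment std:", np.std(np.diff(sde_like)))
```

Under zero-mean weights the beta = 1 trajectory stays close to its initial value as L grows, while the beta = 1/2 trajectory behaves like a discretized diffusion; this is the qualitative picture behind the two scaling regimes the abstract describes.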