Paper Title
Convergence Analysis of Deep Residual Networks
Paper Author
Paper Abstract
Various powerful deep neural network architectures have made great contributions to the exciting successes of deep learning in the past two decades. Among them, deep Residual Networks (ResNets) are of particular importance because they demonstrated great usefulness in computer vision by winning first place in many deep learning competitions. Moreover, ResNets were the first class of neural networks in the development history of deep learning that were truly deep. Understanding the convergence of deep ResNets is of both mathematical interest and practical significance. We aim to characterize the convergence of deep ResNets, as the depth tends to infinity, in terms of the parameters of the networks. Toward this purpose, we first give a matrix-vector description of general deep neural networks with shortcut connections and formulate an explicit expression for the networks by using the notions of activation domains and activation matrices. The convergence is then reduced to the convergence of two series involving infinite products of non-square matrices. By studying these two series, we establish a sufficient condition for the pointwise convergence of ResNets. Our result provides justification for the design of ResNets. We also conduct experiments on benchmark machine learning data to verify our results.
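The convergence phenomenon described in the abstract can be illustrated numerically. The sketch below is not the paper's construction: it uses a simplified residual recursion x_{n+1} = x_n + W_n ReLU(x_n) and a hypothetical summability-style condition (spectral norms of the layer weights scaled like 1/n^2, so their series converges) to show that, under such a condition, the network output stabilizes as depth grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def resnet_output(x, weights):
    # Simplified residual recursion: x_{n+1} = x_n + W_n @ relu(x_n)
    for W in weights:
        x = x + W @ relu(x)
    return x

d = 8
x0 = rng.standard_normal(d)

# Hypothetical sufficient-type condition for this sketch: scale layer n so
# that its spectral norm is 1/(n+1)^2, making sum_n ||W_n|| finite.
base = [rng.standard_normal((d, d)) for _ in range(2000)]
weights = [W / (((n + 1) ** 2) * np.linalg.norm(W, 2))
           for n, W in enumerate(base)]

out_shallow = resnet_output(x0, weights[:500])   # depth 500
out_deep = resnet_output(x0, weights)            # depth 2000
gap = np.linalg.norm(out_deep - out_shallow)
print(gap)  # layers beyond depth 500 change the output very little
```

Because the tail sum of the norms past layer 500 is tiny, the deep and shallow outputs nearly coincide; with unscaled weights the recursion would instead diverge, which is the kind of contrast the paper's sufficient condition formalizes.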