Paper Title

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models

Paper Authors

Chaoyue Liu, Libin Zhu, Mikhail Belkin

Paper Abstract

Wide neural networks with a linear output layer have been shown to be near-linear, and to have near-constant neural tangent kernel (NTK), in a region containing the optimization path of gradient descent. These findings seem counter-intuitive since in general neural networks are highly complex models. Why does a linear structure emerge when the networks become wide? In this work, we provide a new perspective on this "transition to linearity" by considering a neural network as an assembly model recursively built from a set of sub-models corresponding to individual neurons. In this view, we show that the linearity of wide neural networks is, in fact, an emerging property of assembling a large number of diverse "weak" sub-models, none of which dominate the assembly.
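
The abstract's central claim can be probed numerically. Below is a minimal sketch (not the authors' code) for a one-hidden-layer network with a linear output layer under the standard 1/sqrt(m) NTK scaling, f(w; x) = v·tanh(Wx)/sqrt(m): if the transition to linearity holds, the error of the first-order Taylor (linear) model of f along a fixed-norm parameter perturbation should shrink as the width m grows. The tanh activation, single input, and perturbation radius R are illustrative assumptions.

import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def make_f(unravel, x, m):
    # f(w; x) = v . tanh(W x) / sqrt(m): linear output layer with NTK scaling
    def f(flat):
        W, v = unravel(flat)
        return jnp.dot(v, jnp.tanh(W @ x)) / jnp.sqrt(m)
    return f

d, R = 10, 1.0  # input dimension; radius of the parameter perturbation (assumed)
x = jax.random.normal(jax.random.PRNGKey(0), (d,)) / jnp.sqrt(d)

for m in [10, 100, 1000, 10000]:
    kw, kv, ku = jax.random.split(jax.random.PRNGKey(1), 3)
    W = jax.random.normal(kw, (m, d))   # each row is one "weak" sub-model (a neuron)
    v = jax.random.normal(kv, (m,))     # linear output layer
    flat, unravel = ravel_pytree((W, v))
    f = make_f(unravel, x, m)

    u = jax.random.normal(ku, flat.shape)
    u = R * u / jnp.linalg.norm(u)      # fixed-norm perturbation direction

    lin = f(flat) + jnp.dot(jax.grad(f)(flat), u)  # first-order (linear) model of f
    err = jnp.abs(f(flat + u) - lin)               # linearization error
    print(f"width m={m:6d}  linearization error = {float(err):.3e}")

In this setting the printed error should decay as m grows (roughly like 1/sqrt(m)), while the gradient, and hence the tangent kernel, stays of order one, consistent with the paper's picture that no single neuron dominates the assembly.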
