Paper Title
Towards a General Theory of Infinite-Width Limits of Neural Classifiers
Paper Authors
Paper Abstract
Obtaining theoretical guarantees for neural network training appears to be a hard problem in the general case. Recent research has focused on studying this problem in the limit of infinite width, and two different theories have been developed: a mean-field (MF) limit theory and a constant kernel (NTK) limit theory. We propose a general framework that provides a link between these seemingly distinct theories. Out of the box, our framework gives rise to a discrete-time MF limit which was not previously explored in the literature. We prove a convergence theorem for it and show that it provides a more reasonable approximation for finite-width nets than the NTK limit when learning rates are not very small. Our framework also suggests a limit model that coincides with neither the MF limit nor the NTK one. We show that, for networks with more than two hidden layers, RMSProp training has a non-trivial discrete-time MF limit but GD training does not. Overall, our framework demonstrates that both the MF and NTK limits have considerable limitations in approximating finite-width neural nets, indicating the need to design more accurate infinite-width approximations for them.
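For orientation, here is a minimal sketch (standard background, not taken from this paper) of the output scalings usually associated with the two limits, for a one-hidden-layer network of width $n$ with hidden weights $w_i$ and output weights $a_i$:

$$
f_{\mathrm{MF}}(x) \;=\; \frac{1}{n}\sum_{i=1}^{n} a_i\,\sigma\!\left(w_i^{\top} x\right),
\qquad
f_{\mathrm{NTK}}(x) \;=\; \frac{1}{\sqrt{n}}\sum_{i=1}^{n} a_i\,\sigma\!\left(w_i^{\top} x\right).
$$

As $n \to \infty$, the $1/n$ scaling leads to feature learning described by an evolving distribution over neurons (the mean-field picture), whereas the $1/\sqrt{n}$ scaling keeps the network close to its linearization around initialization, so training is governed by a constant neural tangent kernel. How the paper's framework interpolates between or departs from these two regimes is detailed in the full text.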