Paper Title

Last-iterate convergence analysis of stochastic momentum methods for neural networks

Paper Authors

Dongpo Xu, Jinlan Liu, Yinghua Lu, Jun Kong, Danilo Mandic

Paper Abstract

Stochastic momentum methods are a commonly used acceleration technique for solving large-scale stochastic optimization problems in artificial neural networks. Existing convergence results for stochastic momentum methods in the non-convex stochastic setting mostly concern convergence in terms of a random output or the minimum output. To bridge this gap, we address the convergence of the last iterate (called last-iterate convergence) of stochastic momentum methods for non-convex stochastic optimization problems, in a way consistent with traditional optimization theory. We prove last-iterate convergence of stochastic momentum methods under a unified framework that covers both stochastic heavy ball momentum and stochastic Nesterov accelerated gradient momentum. The momentum factor can be fixed to a constant, rather than the time-varying coefficients required in existing analyses. Finally, the last-iterate convergence of stochastic momentum methods is verified on the benchmark MNIST and CIFAR-10 datasets.
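To make the unified framework concrete, below is a minimal Python sketch of one common unified momentum recursion (in the style of the SUM formulation), in which a single interpolation parameter s recovers stochastic heavy ball (s = 0) and stochastic Nesterov accelerated gradient (s = 1) under a constant momentum factor beta. The toy quadratic objective and all function names and hyperparameters here are illustrative assumptions, not the paper's exact algorithm or experimental setup.

```python
import numpy as np

def quadratic_loss_grad(x, rng):
    """Stochastic gradient of the toy objective f(x) = 0.5 * ||x||^2,
    with additive Gaussian noise standing in for mini-batch sampling."""
    return x + 0.1 * rng.standard_normal(x.shape)

def unified_momentum(x0, lr=0.05, beta=0.9, s=0.0, steps=200, seed=0):
    """One common unified momentum recursion (a SUM-style sketch, not
    necessarily the paper's exact recursion):
        y_{t+1}  = x_t - lr * g_t
        ys_{t+1} = x_t - s * lr * g_t
        x_{t+1}  = y_{t+1} + beta * (ys_{t+1} - ys_t)
    s = 0 recovers stochastic heavy ball momentum; s = 1 recovers
    stochastic Nesterov accelerated gradient momentum. The momentum
    factor beta stays constant, matching the constant-momentum setting
    highlighted in the abstract."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    ys_prev = x.copy()          # ys_0 = x_0, i.e. zero initial momentum
    for _ in range(steps):
        g = quadratic_loss_grad(x, rng)
        y_next = x - lr * g
        ys_next = x - s * lr * g
        x = y_next + beta * (ys_next - ys_prev)
        ys_prev = ys_next
    return x                    # the last iterate

if __name__ == "__main__":
    x0 = np.ones(5)
    x_shb = unified_momentum(x0, s=0.0)   # stochastic heavy ball
    x_snag = unified_momentum(x0, s=1.0)  # stochastic Nesterov momentum
    print("SHB  last-iterate norm:", np.linalg.norm(x_shb))
    print("SNAG last-iterate norm:", np.linalg.norm(x_snag))
```

Reporting the final iterate itself, rather than a randomly chosen, best, or averaged iterate, mirrors the last-iterate notion of convergence that the paper analyzes.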
