Paper Title

A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization

Authors

Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, Vineeth N Balasubramanian

Abstract

Loss landscape analysis is extremely useful for a deeper understanding of the generalization ability of deep neural network models. In this work, we propose a layerwise loss landscape analysis in which the loss surface at every layer is studied independently, as well as how each layer's surface correlates with the overall loss surface. We study the layerwise loss landscape through the eigenspectra of the Hessian at each layer. In particular, our results show that the layerwise Hessian geometry is largely similar to that of the entire Hessian. We also report an interesting phenomenon in which the Hessian eigenspectra of the middle layers of a deep neural network are observed to be most similar to the overall Hessian eigenspectrum. We further show that the maximum eigenvalue and the trace of the Hessian (both for the full network and layerwise) decrease as training progresses. We leverage these observations to propose a new regularizer based on the trace of the layerwise Hessian. Penalizing the trace of the Hessian at every layer indirectly forces Stochastic Gradient Descent to converge to flatter minima, which are known to have better generalization performance. In particular, we show that such a layerwise regularizer can be applied to the middlemost layers alone, which yields promising results. Our empirical studies on well-known deep networks across datasets support the claims of this work.
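
To make the proposed penalty concrete, the following is a minimal PyTorch sketch (not the authors' released code) of a trace-of-Hessian regularizer: the trace of a single layer's Hessian, tr(H_l), is estimated with a Hutchinson-style estimator (E[v^T H v] with Rademacher-distributed v) and added to the training loss so that SGD is nudged toward flatter minima. The toy model, the choice of "middle" layer, the estimator, and the weight lam are illustrative assumptions, not details taken from the paper.

# Minimal sketch (assumptions noted above): Hutchinson estimate of the
# trace of one layer's Hessian, used as a differentiable penalty.
import torch
import torch.nn as nn

def layer_hessian_trace(loss, params, n_samples=1):
    """Estimate tr(H) for the Hessian of `loss` w.r.t. `params`
    via Hutchinson's estimator E[v^T H v] with Rademacher v."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    est = 0.0
    for _ in range(n_samples):
        # Rademacher probe vectors in {-1, +1}, one per parameter tensor
        vs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        gv = sum((g * v).sum() for g, v in zip(grads, vs))
        # Hessian-vector product; create_graph=True keeps the penalty differentiable
        hvs = torch.autograd.grad(gv, params, create_graph=True)
        est = est + sum((h * v).sum() for h, v in zip(hvs, vs))
    return est / n_samples

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(),
                      nn.Linear(32, 32), nn.ReLU(),  # model[2]: the "middle" layer we penalize
                      nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(16, 10), torch.randint(0, 2, (16,))
lam = 1e-3  # regularization strength (assumed value)

loss = nn.functional.cross_entropy(model(x), y)
penalty = layer_hessian_trace(loss, list(model[2].parameters()))
(loss + lam * penalty).backward()
opt.step()

Because the penalty itself is differentiated during backward, this step involves higher-order derivatives through the Hessian-vector product, which is why create_graph=True is set on both autograd.grad calls; restricting the penalty to a single middle layer, as the abstract suggests, keeps this extra cost modest.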
