Paper Title

Evaluating the Impact of Loss Function Variation in Deep Learning for Classification

Paper Authors

Simon Dräger, Jannik Dunkelau

Paper Abstract

The loss function is arguably among the most important hyperparameters for a neural network. Many loss functions have been designed to date, making a correct choice nontrivial. However, related work rarely offers an elaborate justification for the choice of loss function. This is, as we see it, an indication of a dogmatic mindset in the deep learning community that lacks empirical foundation. In this work, we consider deep neural networks in a supervised classification setting and analyze the impact the choice of loss function has on the training result. While certain loss functions perform suboptimally, our work empirically shows that under-represented losses such as the KL divergence can significantly outperform state-of-the-art choices, highlighting the need to include the loss function as a tuned hyperparameter rather than a fixed choice.
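As a concrete illustration of treating the loss as a tunable hyperparameter, here is a minimal PyTorch sketch. It is not the authors' code: the `make_loss` helper and the two candidate losses are illustrative assumptions, showing how cross-entropy and the KL divergence can be swapped behind a common interface and swept like any other hyperparameter.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_loss(name: str):
    """Return a callable loss(logits, target_indices) for the given name.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if name == "cross_entropy":
        return nn.CrossEntropyLoss()
    if name == "kl_divergence":
        # nn.KLDivLoss expects log-probabilities as input and a probability
        # distribution as target, so convert logits and class indices first.
        kl = nn.KLDivLoss(reduction="batchmean")
        def loss(logits, targets):
            log_probs = F.log_softmax(logits, dim=1)
            one_hot = F.one_hot(targets, num_classes=logits.size(1)).float()
            return kl(log_probs, one_hot)
        return loss
    raise ValueError(f"unknown loss: {name}")

# Usage: sweep over candidate losses like any other hyperparameter.
logits = torch.randn(8, 10)           # batch of 8 examples, 10 classes
targets = torch.randint(0, 10, (8,))  # ground-truth class indices
for name in ["cross_entropy", "kl_divergence"]:
    criterion = make_loss(name)
    print(name, criterion(logits, targets).item())
```

Note that against hard one-hot targets, the KL divergence reduces exactly to cross-entropy; the two only differ under soft or smoothed target distributions, which is part of why the choice merits empirical testing rather than assumption.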
