Paper Title
Generalized Negative Correlation Learning for Deep Ensembling
Paper Authors
Paper Abstract
Ensemble algorithms offer state-of-the-art performance in many machine learning applications. A common explanation for their excellent performance is the bias-variance decomposition of the mean squared error, which shows that an algorithm's error can be decomposed into its bias and its variance. The two quantities often oppose each other, and ensembles offer an effective way to manage them: they reduce the variance through a diverse set of base learners while keeping the bias low at the same time. Even though there have been numerous works on decomposing other loss functions, the exact mathematical connection is rarely exploited explicitly for ensembling, but merely used as a guiding principle. In this paper, we formulate a generalized bias-variance decomposition for arbitrary twice-differentiable loss functions and study it in the context of Deep Learning. We use this decomposition to derive a Generalized Negative Correlation Learning (GNCL) algorithm which offers explicit control over the ensemble's diversity and smoothly interpolates between the two extremes of independent training and joint training of the ensemble. We show how GNCL encapsulates many previous works and discuss under which circumstances the training of an ensemble of Neural Networks might fail, and which ensembling method should be favored depending on the choice of the individual networks. We make our code publicly available at https://github.com/sbuschjaeger/gncl
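The abstract describes GNCL as a loss that smoothly interpolates between independent training of each network and joint training of the whole ensemble. Below is a minimal, hypothetical PyTorch sketch of such an interpolated training loss; the function name `gncl_style_loss`, the convex-combination form, and the parameter `lam` are illustrative assumptions and are not taken from the paper's exact objective (see the repository linked above for the authors' implementation).

```python
import torch
import torch.nn as nn


def gncl_style_loss(member_outputs, target, lam, base_loss=nn.MSELoss()):
    """Illustrative sketch (not the paper's exact GNCL objective).

    Interpolates between independent training of each ensemble member
    (lam = 0) and joint training of the averaged ensemble prediction
    (lam = 1).
    """
    # Average loss of the individual members: the "independent training" term.
    independent = torch.stack(
        [base_loss(out, target) for out in member_outputs]
    ).mean()

    # Loss of the averaged ensemble prediction: the "joint training" term.
    ensemble_pred = torch.stack(member_outputs).mean(dim=0)
    joint = base_loss(ensemble_pred, target)

    # Convex combination controlled by lam in [0, 1].
    return (1.0 - lam) * independent + lam * joint
```

With `lam = 0` every member minimizes its own loss in isolation; with `lam = 1` only the averaged prediction is penalized, so members are trained jointly and may diversify freely; intermediate values trade off individual accuracy against ensemble diversity, which is the kind of explicit control the abstract attributes to GNCL.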