Paper Title
On Approximation Capabilities of ReLU Activation and Softmax Output Layer in Neural Networks
Paper Authors
Paper Abstract
In this paper, we extend the well-established universal approximation theory to neural networks that use the unbounded ReLU activation function and a nonlinear softmax output layer. We prove that a sufficiently large neural network using the ReLU activation function can approximate any function in $L^1$ up to arbitrary precision. Moreover, our theoretical results show that a sufficiently large neural network using a nonlinear softmax output layer can also approximate any indicator function in $L^1$, which is equivalent to the mutually exclusive class labels found in any realistic multi-class pattern classification problem. To the best of our knowledge, this work is the first theoretical justification for using softmax output layers in neural networks for pattern classification.
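For reference, a minimal sketch of the notation behind these claims (the symbols $f$, $N$, and $\epsilon$ are ours for illustration and are not taken from the paper): the ReLU activation and the softmax output layer are the standard ones,
$$\mathrm{ReLU}(x) = \max(0, x), \qquad \mathrm{softmax}(\mathbf{z})_k = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}},$$
and approximation in $L^1$ up to arbitrary precision means that for any target $f \in L^1$ and any $\epsilon > 0$ there exists a sufficiently large network $N$ with
$$\int \bigl| f(\mathbf{x}) - N(\mathbf{x}) \bigr| \, d\mathbf{x} < \epsilon.$$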