Paper Title

Filtered Batch Normalization

Paper Authors

Andras Horvath, Jalal Al-afandi

Paper Abstract

It is a common assumption that the activations of different layers in neural networks follow a Gaussian distribution. This distribution can be transformed using normalization techniques, such as batch normalization, increasing convergence speed and improving accuracy. In this paper we would like to demonstrate that activations do not necessarily follow a Gaussian distribution in all layers. Neurons in deeper layers are more selective and specific, which can result in extremely large, out-of-distribution activations. We will demonstrate that one can create more consistent mean and variance values for batch normalization during training by filtering out these activations, which can further improve convergence speed and yield higher validation accuracy.
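
Below is a minimal sketch of the idea described in the abstract: compute batch statistics, discard extreme out-of-distribution activations, and normalize with statistics recomputed from the remaining values. The function name filtered_batch_norm, the threshold parameter, and the two-pass z-score filtering rule are illustrative assumptions, not the paper's exact formulation.

import torch

def filtered_batch_norm(x, threshold=3.0, eps=1e-5):
    """Sketch of filtered batch normalization for a 2-D activation
    tensor of shape (batch, features)."""
    # First pass: ordinary per-feature batch statistics.
    mean = x.mean(dim=0)
    std = x.std(dim=0, unbiased=False) + eps

    # Filter: keep only activations whose z-score magnitude
    # falls below the threshold (assumed filtering rule).
    kept = (((x - mean).abs() / std) < threshold).float()

    # Second pass: recompute mean and variance from kept values only.
    count = kept.sum(dim=0).clamp(min=1.0)
    f_mean = (x * kept).sum(dim=0) / count
    f_var = (((x - f_mean) ** 2) * kept).sum(dim=0) / count

    # Normalize *all* activations with the filtered statistics.
    return (x - f_mean) / torch.sqrt(f_var + eps)

# Usage: an injected outlier no longer distorts the batch statistics.
x = torch.randn(128, 64)
x[0, 0] = 50.0
y = filtered_batch_norm(x)

A full layer would additionally maintain running mean and variance for inference and apply the learnable scale and shift parameters of standard batch normalization; the sketch omits these for brevity.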
