Paper title
Analytical aspects of non-differentiable neural networks
Paper authors
Paper abstract
Research in computational deep learning has directed considerable efforts towards hardware-oriented optimisations for deep neural networks, via the simplification of the activation functions, or the quantization of both activations and weights. The resulting non-differentiability (or even discontinuity) of the networks poses some challenging problems, especially in connection with the learning process. In this paper, we address several questions regarding both the expressivity of quantized neural networks and approximation techniques for non-differentiable networks. First, we answer in the affirmative the question of whether QNNs have the same expressivity as DNNs in terms of approximation of Lipschitz functions in the $L^{\infty}$ norm. Then, considering a continuous but not necessarily differentiable network, we describe a layer-wise stochastic regularisation technique to produce differentiable approximations, and we show how this approach to regularisation provides elegant quantitative estimates. Finally, we consider networks defined by means of Heaviside-type activation functions, and prove for them a pointwise approximation result by means of smooth networks under suitable assumptions on the regularised activations.
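To give a concrete picture of the layer-wise stochastic regularisation idea summarised above, the following is a minimal Monte-Carlo sketch: a non-differentiable map (here a Heaviside-type activation) is smoothed by averaging it over random perturbations of its input. The Gaussian noise model, the function names, and the sample-based averaging are illustrative assumptions, not the paper's exact construction or estimates.

```python
import numpy as np

def stochastic_regularisation(f, eps=0.1, n_samples=10_000, rng=None):
    """Return a smoothed version of f via input-noise averaging:
    f_eps(x) ~= E[f(x + eps * Z)], Z ~ N(0, I) (Monte-Carlo sketch,
    illustrative only -- not the paper's precise regularisation scheme)."""
    rng = np.random.default_rng() if rng is None else rng

    def f_eps(x):
        x = np.asarray(x, dtype=float)
        noise = eps * rng.standard_normal((n_samples,) + x.shape)
        return np.mean(f(x + noise), axis=0)

    return f_eps

# Example: regularising a Heaviside-type activation.
heaviside = lambda x: (x > 0).astype(float)
smooth_h = stochastic_regularisation(heaviside, eps=0.05)
print(smooth_h(np.array([-0.2, 0.0, 0.2])))  # approx. [0.0, 0.5, 1.0]
```

For a Gaussian perturbation the averaged Heaviside equals the Gaussian CDF evaluated at x/eps, which is smooth and converges pointwise to the original step function (for x != 0) as eps tends to 0, illustrating in miniature the kind of regularised approximation discussed in the abstract.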