Paper Title

Walking Noise: On Layer-Specific Robustness of Neural Architectures against Noisy Computations and Associated Characteristic Learning Dynamics

Authors

Hendrik Borras, Bernhard Klein, Holger Fröning

Abstract

Deep neural networks are extremely successful in various applications, but they exhibit high computational demands and energy consumption. This is exacerbated by stuttering technology scaling, prompting the need for novel approaches to handle increasingly complex neural architectures. At the same time, alternative computing technologies such as analog computing, which promise groundbreaking improvements in energy efficiency, are inevitably fraught with noise and inaccurate calculations. Such noisy computations are more energy-efficient and, given a fixed power budget, also more time-efficient. However, like any unsafe optimization, they require countermeasures to ensure functionally correct results. This work considers noisy computations in an abstract form and aims to understand the implications of such noise for the accuracy of neural network classifiers as an exemplary workload. We propose a methodology called Walking Noise, which injects layer-specific noise to measure robustness and to provide insights into the learning dynamics. In more detail, we investigate the implications of additive, multiplicative and mixed noise for different classification tasks and model architectures. While noisy training significantly increases robustness for all noise types, we observe in particular that it results in increased weight magnitudes and thus inherently improves the signal-to-noise ratio for additive noise injection. Conversely, training with multiplicative noise can lead to a form of self-binarization of the model parameters, resulting in extreme robustness. We conclude with a discussion of the use of this methodology in practice, among others its use for tailored multi-execution in noisy environments.
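The core idea of the methodology, injecting noise into exactly one layer at a time and sweeping that position through the network, can be sketched as follows. This is a minimal illustration assuming Gaussian noise on layer activations; the function and parameter names (`inject_noise`, `walking_noise_forward`, `sigma`) are hypothetical and not taken from the paper's actual implementation.

```python
import numpy as np

def inject_noise(activations, noise_type="additive", sigma=0.1, rng=None):
    """Perturb a layer's activations with Gaussian noise.

    additive:       y = x + n,        n ~ N(0, sigma^2)
    multiplicative: y = x * (1 + n),  n ~ N(0, sigma^2)
    """
    rng = rng or np.random.default_rng(0)
    n = rng.normal(0.0, sigma, size=activations.shape)
    if noise_type == "additive":
        return activations + n
    if noise_type == "multiplicative":
        return activations * (1.0 + n)
    raise ValueError(f"unknown noise type: {noise_type}")

def walking_noise_forward(x, layers, noisy_layer, **noise_kwargs):
    """Forward pass with noise injected after exactly one layer.

    Sweeping `noisy_layer` over all positions and recording the
    resulting accuracy yields a per-layer robustness profile.
    """
    for i, layer in enumerate(layers):
        x = layer(x)
        if i == noisy_layer:
            x = inject_noise(x, **noise_kwargs)
    return x
```

In a full experiment one would repeat this sweep at increasing noise magnitudes (and optionally during training, to observe effects such as growing weight magnitudes under additive noise), which is the layer-specific measurement the abstract refers to.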
