Paper Title
Training End-to-End Analog Neural Networks with Equilibrium Propagation
Paper Authors
Paper Abstract
We introduce a principled method to train end-to-end analog neural networks by stochastic gradient descent. In these analog neural networks, the weights to be adjusted are implemented by the conductances of programmable resistive devices such as memristors [Chua, 1971], and the nonlinear transfer functions (or 'activation functions') are implemented by nonlinear components such as diodes. We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models: they possess an energy function as a consequence of Kirchhoff's laws governing electrical circuits. This property enables us to train them using the Equilibrium Propagation framework [Scellier and Bengio, 2017]. Our update rule for each conductance, which is local and relies solely on the voltage drop across the corresponding resistor, is shown to compute the gradient of the loss function. Our numerical simulations, which use the SPICE-based Spectre simulation framework to simulate the dynamics of electrical circuits, demonstrate training on the MNIST classification task, performing comparably to or better than equivalent-size software-based neural networks. Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
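To make the abstract's central claim concrete, here is a minimal sketch of the two-phase Equilibrium Propagation gradient estimate on a toy two-resistor divider. The circuit (one node between a source and ground), all variable names, and the parameter values are illustrative assumptions, not the paper's actual networks or experiments; the sketch only shows how a local, voltage-drop-based update can approximate the loss gradient.

```python
# Toy Equilibrium Propagation (EqProp) sketch on a two-resistor divider:
# a node voltage V sits between a source Vs (through conductance g1) and
# ground (through conductance g2). This circuit and all names here are
# illustrative assumptions, not the paper's architecture.

def free_equilibrium(g1, g2, Vs):
    # Kirchhoff's current law at the node: g1*(Vs - V) = g2*V.
    return g1 * Vs / (g1 + g2)

def nudged_equilibrium(g1, g2, Vs, Vt, beta):
    # Nudged phase: an extra current beta*(Vt - V) pulls the output
    # node toward the target voltage Vt.
    return (g1 * Vs + beta * Vt) / (g1 + g2 + beta)

def eqprop_gradients(g1, g2, Vs, Vt, beta=1e-3):
    V0 = free_equilibrium(g1, g2, Vs)
    Vb = nudged_equilibrium(g1, g2, Vs, Vt, beta)
    # For this linear circuit the energy is
    #   E = 0.5*g1*(Vs - V)**2 + 0.5*g2*V**2,
    # so dE/dg is half the squared voltage drop across that resistor --
    # a purely local quantity, as the abstract states.
    dE_dg1 = lambda V: 0.5 * (Vs - V) ** 2
    dE_dg2 = lambda V: 0.5 * V ** 2
    # EqProp estimate: (1/beta) * (dE/dg at nudged eq. - dE/dg at free eq.)
    return ((dE_dg1(Vb) - dE_dg1(V0)) / beta,
            (dE_dg2(Vb) - dE_dg2(V0)) / beta)

# Compare against the analytic gradient of the loss C = 0.5*(V - Vt)**2.
g1, g2, Vs, Vt = 1.0, 2.0, 1.0, 0.5
V0 = free_equilibrium(g1, g2, Vs)
dV_dg1 = g2 * Vs / (g1 + g2) ** 2   # closed form for this divider
dV_dg2 = -g1 * Vs / (g1 + g2) ** 2
true_grads = ((V0 - Vt) * dV_dg1, (V0 - Vt) * dV_dg2)
est_grads = eqprop_gradients(g1, g2, Vs, Vt)
print(true_grads, est_grads)  # the estimates agree up to O(beta) error
```

Shrinking the nudging strength `beta` tightens the agreement between the local EqProp estimate and the analytic gradient; the conductance update in the paper follows by stepping each conductance against its estimated gradient.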