Paper Title

Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations

Authors

Steffen Schotthöfer, Emanuele Zangrando, Jonas Kusch, Gianluca Ceruti, Francesco Tudisco

Abstract

Neural networks have achieved tremendous success in a large variety of applications. However, their memory footprint and computational demand can render them impractical in application settings with limited hardware or energy resources. In this work, we propose a novel algorithm to find efficient low-rank subnetworks. Remarkably, these subnetworks are determined and adapted already during the training phase and the overall time and memory resources required by both training and evaluating them are significantly reduced. The main idea is to restrict the weight matrices to a low-rank manifold and to update the low-rank factors rather than the full matrix during training. To derive training updates that are restricted to the prescribed manifold, we employ techniques from dynamic model order reduction for matrix differential equations. This allows us to provide approximation, stability, and descent guarantees. Moreover, our method automatically and dynamically adapts the ranks during training to achieve the desired approximation accuracy. The efficiency of the proposed method is demonstrated through a variety of numerical experiments on fully-connected and convolutional networks.
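The core idea in the abstract, restricting each weight matrix to a low-rank manifold and storing only its factors, can be illustrated with a minimal NumPy sketch. This is a hedch-free toy illustration, not the authors' DLRT algorithm: the factorization `W ≈ U S Vᵀ` and the shapes below are illustrative assumptions, and no rank adaptation or training dynamics are shown.

```python
import numpy as np

# Toy illustration (not the paper's implementation): replace a dense
# layer's weight matrix W of shape (m, n) with a rank-r factorization
# W ≈ U @ S @ V.T, where U is (m, r), S is (r, r), and V is (n, r).
# During training, only these small factors would be stored and updated.

m, n, r = 512, 256, 16
rng = np.random.default_rng(0)

U = rng.standard_normal((m, r))
S = rng.standard_normal((r, r))
V = rng.standard_normal((n, r))

def lowrank_forward(x):
    # Compute x @ W.T without ever materializing the full (m, n) matrix:
    # cost O((m + n) * r) per sample instead of O(m * n).
    return ((x @ V) @ S.T) @ U.T

x = rng.standard_normal((8, n))      # a batch of 8 inputs
y = lowrank_forward(x)               # shape (8, 512)

full_params = m * n                  # 131072 entries for the dense layer
lowrank_params = r * (m + n + r)     # 12544 entries for the three factors
print(y.shape)
print(f"parameter ratio: {lowrank_params / full_params:.3f}")  # ~0.096
```

The memory saving grows as the rank `r` shrinks relative to `min(m, n)`; the paper's contribution is choosing and adapting such ranks during training with stability and descent guarantees, which this sketch does not attempt to reproduce.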
