Paper Title
Update Compression for Deep Neural Networks on the Edge
Authors
Abstract
An increasing number of artificial intelligence (AI) applications involve the execution of deep neural networks (DNNs) on edge devices. Many practical reasons motivate the need to update the DNN model on the edge device post-deployment, such as refining the model, adapting to concept drift, or an outright change in the learning task. In this paper, we consider the scenario where retraining can be done on the server side based on a copy of the DNN model, with only the necessary data transmitted to the edge to update the deployed model. However, due to bandwidth constraints, we want to minimise the transmission required to achieve the update. We develop a simple approach based on matrix factorisation to compress the model update -- this differs from compressing the model itself. The key idea is to preserve existing knowledge in the current model and optimise only small additional parameters for the update, which can be used to reconstitute the model on the edge. We compare our method to similar techniques used in federated learning; our method usually requires less than half the update size of existing methods to achieve the same accuracy.