Paper Title

Reducing Data Motion to Accelerate the Training of Deep Neural Networks

Paper Authors

Zhuang, Sicong, Malossi, Cristiano, Casas, Marc

Paper Abstract

This paper reduces the cost of DNN training by decreasing the amount of data movement across heterogeneous architectures composed of several GPUs and multicore CPU devices. In particular, this paper proposes an algorithm to dynamically adapt the data representation format of network weights during training. This algorithm drives a compression procedure that reduces data size before it is sent over the parallel system. We run an extensive evaluation campaign considering several state-of-the-art deep neural network models and two high-end parallel architectures composed of multiple GPUs and CPU multicore chips. Our solution achieves average performance improvements from 6.18\% up to 11.91\%.
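The core idea described in the abstract, shrinking the representation of network weights before they travel across the parallel system, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the paper's dynamic format adaptation is replaced here by a fixed float32-to-float16 cast, and NumPy buffers stand in for the GPU communication stack.

```python
import numpy as np

def compress(weights: np.ndarray) -> bytes:
    """Reduce the data representation before 'sending': float32 -> float16."""
    return weights.astype(np.float16).tobytes()

def decompress(payload: bytes) -> np.ndarray:
    """Restore a float32 array on the receiving device."""
    return np.frombuffer(payload, dtype=np.float16).astype(np.float32)

# Simulated layer weights on the sending device.
weights = np.random.rand(1024).astype(np.float32)
payload = compress(weights)

# Half the bytes cross the interconnect ...
assert len(payload) == weights.nbytes // 2
# ... at the cost of a bounded rounding error in the weights.
restored = decompress(payload)
assert np.allclose(weights, restored, atol=1e-3)
```

The trade-off this sketch exposes is exactly the one the paper's algorithm manages: a smaller format moves less data but introduces quantization error, so the representation must be chosen carefully as training progresses.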
