Paper Title

Accelerate Three-Dimensional Generative Adversarial Networks Using Fast Algorithm

Authors

Ziqi Su, Wendong Mao, Zhongfeng Wang, Jun Lin, Wenqiang Wang, Haitao Sun

Abstract

Three-dimensional generative adversarial networks (3D-GANs) have attracted widespread attention in three-dimensional (3D) visual tasks. 3D deconvolution (DeConv), a key computation in 3D-GAN, has significantly higher computational complexity than 2D DeConv and has become the bottleneck in accelerating 3D-GAN. Previous accelerators suffer from several problems, such as large memory requirements and resource underutilization. To address these issues, this paper proposes a fast algorithm for 3D DeConv (F3DC). F3DC applies a fast algorithm to reduce the number of multiplications and achieves a significant algorithmic strength reduction. In addition, F3DC removes the extra memory requirement for overlapped partial sums and avoids computational imbalance so that resources are fully utilized. Moreover, we design an F3DC-based hardware architecture consisting of four fast processing units (FPUs). Each FPU includes a pre-process module, an EWMM module, and a post-process module for the F3DC transformation. By implementing our design on the Xilinx VC709 platform for 3D-GAN, we achieve a throughput of up to 1700 GOPS and a 4$\times$ improvement in computational efficiency compared with prior works.
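
The abstract does not spell out the F3DC transformation, so the following is only an illustrative sketch of the two ideas it names, reduced to one dimension. It is not the authors' algorithm. The function names are hypothetical, and the Winograd F(2,3) minimal-filtering identity is assumed here purely as a familiar example of a fast algorithm that trades multiplications for additions. The first pair of functions shows why scatter-style deconvolution produces overlapped partial sums, and how the same result can be computed one output at a time with no accumulation buffer.

```python
from fractions import Fraction


def deconv1d_scatter(x, k, stride):
    """Transposed convolution by scattering: every input writes len(k) partial sums."""
    out = [0] * ((len(x) - 1) * stride + len(k))
    for i, xi in enumerate(x):
        for j, kj in enumerate(k):
            # Neighbouring inputs land on the same output index (overlap).
            out[i * stride + j] += xi * kj
    return out


def deconv1d_gather(x, k, stride):
    """Same result, one output at a time, so no partial-sum buffer is required."""
    out = []
    for o in range((len(x) - 1) * stride + len(k)):
        acc = 0
        # Only kernel taps j with (o - j) divisible by stride contribute.
        for j in range(o % stride, len(k), stride):
            i = (o - j) // stride
            if 0 <= i < len(x):
                acc += x[i] * k[j]
        out.append(acc)
    return out


def conv_direct_2x3(d, g):
    """Two outputs of a 3-tap filter computed directly: 6 multiplications."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]


def conv_winograd_2x3(d, g):
    """Winograd F(2,3): the same two outputs with 4 data multiplications."""
    half = Fraction(1, 2)
    # Filter-side terms can be precomputed once per filter.
    u1, u2 = half * (g[0] + g[1] + g[2]), half * (g[0] - g[1] + g[2])
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    x, k = [1, 2, 3], [1, 2, 3, 4]
    print(deconv1d_scatter(x, k, 2) == deconv1d_gather(x, k, 2))
    print(conv_direct_2x3([1, 2, 3, 4], [1, 0, -1]),
          conv_winograd_2x3([1, 2, 3, 4], [1, 0, -1]))
```

In three dimensions each input voxel scatters into a cubic window of outputs rather than a short segment, so both the overlap and the multiplication count grow accordingly, which is consistent with the bottleneck the abstract describes.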