Paper Title
ExPAN(N)D: Exploring Posits for Efficient Artificial Neural Network Design in FPGA-based Systems
Paper Authors
Paper Abstract
The recent advances in machine learning, in general, and Artificial Neural Networks (ANN), in particular, have made smart embedded systems an attractive option for a larger number of application areas. However, the high computational complexity, memory footprint, and energy requirements of machine learning models hinder their deployment on resource-constrained embedded systems. Most state-of-the-art works have addressed this problem by proposing various low bit-width data representation schemes, optimized implementations of arithmetic operators, and different complexity reduction techniques such as network pruning. To further elevate the implementation gains offered by these individual techniques, there is a need to cross-examine and combine their unique features. This paper presents ExPAN(N)D, a framework to analyze and ingather the efficacy of the Posit number representation scheme and the efficiency of fixed-point arithmetic implementations for ANNs. The Posit scheme offers a better dynamic range and higher precision for various applications than the IEEE $754$ single-precision floating-point format. However, due to the dynamic nature of the various fields of the Posit scheme, the corresponding arithmetic circuits have a higher critical path delay and resource requirements than single-precision-based arithmetic units. To this end, we propose a novel Posit-to-fixed-point converter that enables high-performance and energy-efficient hardware implementations of ANNs with a minimal drop in output accuracy. We also propose a modified Posit-based representation to store the trained parameters of a network. Compared to an $8$-bit fixed-point-based inference accelerator, our proposed implementation offers $\approx46\%$ and $\approx18\%$ reductions in the storage requirements of the parameters and the energy consumption of the MAC units, respectively.
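For readers unfamiliar with the Posit format, the abstract's point about "dynamic fields" refers to the variable-length regime field that precedes the exponent and fraction. The following minimal Python sketch of a generic posit<n, es> decoder illustrates this (it is not part of the paper's framework; the function name, defaults n=8, es=0, and NaR handling are illustrative assumptions based on the standard Posit definition):

```python
def decode_posit(bits: int, n: int = 8, es: int = 0) -> float:
    """Decode an n-bit posit with es exponent bits into a float.

    Illustrative software model only; hardware decoders operate on
    the bit fields directly."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("nan")  # NaR (Not a Real)
    sign = -1.0 if bits >> (n - 1) else 1.0
    if sign < 0:
        bits = (-bits) & mask  # negate via two's complement before decoding

    # Regime: a run of identical bits after the sign bit; its length
    # varies per value, which is what makes Posit decoding "dynamic".
    first = (bits >> (n - 2)) & 1
    run, i = 0, n - 2
    while i >= 0 and ((bits >> i) & 1) == first:
        run += 1
        i -= 1
    k = run - 1 if first else -run
    i -= 1  # skip the regime terminator bit, if present

    # Exponent: the next es bits; bits truncated at the word end are 0.
    exp = 0
    for _ in range(es):
        exp <<= 1
        if i >= 0:
            exp |= (bits >> i) & 1
            i -= 1

    # Fraction: whatever bits remain, with an implicit leading 1.
    frac_bits = i + 1
    frac = bits & ((1 << frac_bits) - 1) if frac_bits > 0 else 0
    mantissa = 1.0 + frac / (1 << frac_bits) if frac_bits > 0 else 1.0

    # value = (-1)^s * useed^k * 2^exp * mantissa, with useed = 2^(2^es)
    scale = (1 << es) * k + exp
    return sign * mantissa * 2.0 ** scale


# Example: in posit8 (es = 0), 0x40 decodes to 1.0 and 0x60 to 2.0.
assert decode_posit(0x40) == 1.0
assert decode_posit(0x60) == 2.0
```

Because the regime length differs from value to value, a hardware decoder needs variable-length shifts before it can even align operands, which is consistent with the higher critical-path delay and resource cost the abstract attributes to Posit arithmetic units, and with the appeal of converting Posit-stored parameters to fixed-point before the MAC datapath.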