Paper Title

RedMulE: A Compact FP16 Matrix-Multiplication Accelerator for Adaptive Deep Learning on RISC-V-Based Ultra-Low-Power SoCs

Authors

Yvan Tortorella, Luca Bertaccini, Davide Rossi, Luca Benini, Francesco Conti

Abstract

The fast proliferation of extreme-edge applications using Deep Learning (DL) based algorithms requires dedicated hardware to satisfy their latency, throughput, and precision requirements. While inference is achievable in practical cases, online fine-tuning and adaptation of general DL models are still highly challenging. One of the key stumbling blocks is the need for parallel floating-point operations, which are considered unaffordable on sub-100 mW extreme-edge SoCs. We tackle this problem with RedMulE (Reduced-precision matrix Multiplication Engine), a parametric low-power hardware accelerator for FP16 matrix multiplication (the main kernel of DL training and inference), conceived for tight integration within a cluster of tiny RISC-V cores based on the PULP (Parallel Ultra-Low-Power) architecture. In 22 nm technology, a 32-FMA RedMulE instance occupies just 0.07 mm^2 (14% of an 8-core RISC-V cluster) and achieves a maximum operating frequency of up to 666 MHz, for a throughput of 31.6 MAC/cycle (98.8% utilization). We reach a cluster-level power consumption of 43.5 mW and a full-cluster energy efficiency of 688 16-bit GFLOPS/W. Overall, RedMulE features up to 4.65x higher energy efficiency and a 22x speedup over software execution on 8 RISC-V cores.
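The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions that the abstract does not state: that one MAC counts as two floating-point operations, and that the 43.5 mW and 688 GFLOPS/W figures describe the same operating point (which need not be the 666 MHz maximum-frequency point). The function and constant names are made up for this example.

```python
# Figures quoted in the abstract.
FMA_UNITS = 32                    # FMA units in the RedMulE instance
MAC_PER_CYCLE = 31.6              # measured throughput
F_MAX_HZ = 666e6                  # maximum operating frequency
CLUSTER_POWER_W = 43.5e-3         # cluster-level power
EFFICIENCY_GFLOPS_PER_W = 688.0   # full-cluster energy efficiency

# Assumption (not stated in the abstract): 1 MAC = 1 multiply + 1 add = 2 FLOPs.
FLOPS_PER_MAC = 2


def utilization(mac_per_cycle, fma_units):
    """Fraction of the FMA array doing useful work each cycle."""
    return mac_per_cycle / fma_units


def throughput_gflops(mac_per_cycle, freq_hz):
    """Throughput in GFLOPS at a given clock frequency."""
    return mac_per_cycle * FLOPS_PER_MAC * freq_hz / 1e9


def implied_frequency_hz(gflops, mac_per_cycle):
    """Clock frequency needed to sustain a given GFLOPS figure."""
    return gflops * 1e9 / (mac_per_cycle * FLOPS_PER_MAC)


util = utilization(MAC_PER_CYCLE, FMA_UNITS)
peak_gflops = throughput_gflops(MAC_PER_CYCLE, F_MAX_HZ)

# Assumption: power and efficiency were reported at the same operating point.
eff_point_gflops = EFFICIENCY_GFLOPS_PER_W * CLUSTER_POWER_W
eff_point_freq_hz = implied_frequency_hz(eff_point_gflops, MAC_PER_CYCLE)

print(f"utilization: {util:.2%}")
print(f"throughput at 666 MHz: {peak_gflops:.1f} GFLOPS")
print(f"throughput implied by 688 GFLOPS/W at 43.5 mW: {eff_point_gflops:.1f} GFLOPS")
print(f"implied clock at that point: {eff_point_freq_hz / 1e6:.0f} MHz")
```

Under these assumptions, 31.6 of 32 MAC/cycle matches the quoted 98.8% utilization after rounding, and the efficiency figure implies a clock well below 666 MHz, which suggests that the peak-frequency and peak-efficiency numbers were measured at different operating points.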
