FAST: A Fully-Concurrent Access Technique to All SRAM Rows for Enhanced Speed and Energy Efficiency in Data-Intensive Applications

Paper Title

FAST: A Fully-Concurrent Access Technique to All SRAM Rows for Enhanced Speed and Energy Efficiency in Data-Intensive Applications

Authors

Chen, Yiming; Fu, Yushen; Lee, Mingyen; George, Sumitha; Liu, Yongpan; Narayanan, Vijaykrishnan; Yang, Huazhong; Li, Xueqing

Abstract

Compute-in-memory (CiM) is a promising approach to improving the computing speed and energy efficiency in data-intensive applications. Beyond existing CiM techniques of bitwise logic-in-memory operations and dot product operations, this paper extends the CiM paradigm with FAST, a new shift-based in-memory computation technique to handle high-concurrency operations on multiple rows in an SRAM. Such high-concurrency operations are widely seen in both conventional applications (e.g., the table update in a database) and emerging applications (e.g., the parallel weight update in neural network accelerators), in which low latency and low energy consumption are critical. The proposed shift-based CiM architecture is enabled by integrating the shifter function into each SRAM cell, and by creating a datapath that exploits the high parallelism of shifting operations in multiple rows in the array. A 128-row, 16-column shiftable SRAM in 65 nm CMOS is designed to evaluate the proposed architecture. Post-layout SPICE simulations show average improvements of 4.4x in energy efficiency and 96.0x in speed over a conventional fully-digital memory-computing-separated scheme when performing the 8-bit weight update task in a VGG-7 framework.
