Paper Title
FAST: A Fully-Concurrent Access Technique to All SRAM Rows for Enhanced Speed and Energy Efficiency in Data-Intensive Applications
Authors
Abstract
Compute-in-memory (CiM) is a promising approach to improving computing speed and energy efficiency in data-intensive applications. Beyond existing CiM techniques for bitwise logic-in-memory and dot-product operations, this paper extends the CiM paradigm with FAST, a new shift-based in-memory computation technique that handles high-concurrency operations on multiple rows of an SRAM. Such high-concurrency operations are common in both conventional applications (e.g., table updates in a database) and emerging applications (e.g., parallel weight updates in neural network accelerators), where low latency and low energy consumption are critical. The proposed shift-based CiM architecture is enabled by integrating the shifter function into each SRAM cell and by creating a datapath that exploits the high parallelism of shifting operations across multiple rows of the array. A 128-row, 16-column shiftable SRAM in 65 nm CMOS is designed to evaluate the proposed architecture. Post-layout SPICE simulations show average improvements of 4.4x in energy efficiency and 96.0x in speed over a conventional fully digital scheme with separate memory and computing, when performing the 8-bit weight update task in a VGG-7 framework.
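The contrast between row-by-row access and concurrent shifting can be illustrated with a small behavioral model. The sketch below is not the circuit described in the paper. It assumes, purely for illustration, that each row behaves like a circular shift register feeding its own 1-bit full adder, so that an 8-bit update of every row finishes in 8 shift cycles regardless of the number of rows. The function names `row_serial_update` and `concurrent_shift_update`, and the step counts they return, are hypothetical and do not correspond to the reported 4.4x and 96.0x figures.

```python
from typing import List, Tuple

BITS = 8  # word width, matching the 8-bit weight update in the abstract


def row_serial_update(rows: List[int], deltas: List[int]) -> Tuple[List[int], int]:
    """Baseline model: rows are read, updated, and written back one at a time."""
    updated = []
    accesses = 0
    for word, delta in zip(rows, deltas):
        updated.append((word + delta) % (1 << BITS))
        accesses += 2  # one read plus one write per row
    return updated, accesses


def concurrent_shift_update(rows: List[int], deltas: List[int]) -> Tuple[List[int], int]:
    """Assumed model: every row shifts one bit per cycle through a per-row adder."""
    words = list(rows)
    carries = [0] * len(words)
    for cycle in range(BITS):
        # The inner loop stands in for hardware parallelism:
        # all rows are assumed to shift within the same cycle.
        for i in range(len(words)):
            a = words[i] & 1              # bit shifted out of the row (LSB first)
            b = (deltas[i] >> cycle) & 1  # matching bit of the update value
            s = a ^ b ^ carries[i]        # 1-bit full-adder sum
            carries[i] = (a & b) | (carries[i] & (a ^ b))
            # Shift right and feed the sum bit back in at the MSB position.
            words[i] = (words[i] >> 1) | (s << (BITS - 1))
    return words, BITS  # cycle count is independent of the number of rows
```

Calling both functions on the same `rows` and `deltas` should return identical updated words. In this model the baseline's access count grows linearly with the number of rows, while the shift-based cycle count stays fixed at `BITS`.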