Sparseloop: An Analytical Approach to Sparse Tensor Accelerator Modeling

Paper Title

Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling

Authors

Wu, Yannan Nellie; Tsai, Po-An; Parashar, Angshuman; Sze, Vivienne; Emer, Joel S.

Abstract

In recent years, many accelerators have been proposed to efficiently process sparse tensor algebra applications (e.g., sparse neural networks). However, these proposals are single points in a large and diverse design space. The lack of systematic description and modeling support for these sparse tensor accelerators impedes hardware designers from efficient and effective design space exploration. This paper first presents a unified taxonomy to systematically describe the diverse sparse tensor accelerator design space. Based on the proposed taxonomy, it then introduces Sparseloop, the first fast, accurate, and flexible analytical modeling framework to enable early-stage evaluation and exploration of sparse tensor accelerators. Sparseloop comprehends a large set of architecture specifications, including various dataflows and sparse acceleration features (e.g., elimination of zero-based compute). Using these specifications, Sparseloop evaluates a design's processing speed and energy efficiency while accounting for data movement and compute incurred by the employed dataflow as well as the savings and overhead introduced by the sparse acceleration features using stochastic tensor density models. Across representative accelerators and workloads, Sparseloop achieves over 2000 times faster modeling speed than cycle-level simulations, maintains relative performance trends, and achieves 0.1% to 8% average error. With a case study, we demonstrate Sparseloop's ability to help reveal important insights for designing sparse tensor accelerators (e.g., it is important to co-design orthogonal design aspects).
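To give a rough sense of what an analytical estimate based on a stochastic density model can look like, here is a minimal sketch. It assumes each operand element is nonzero independently with a fixed probability (a uniform random density), so a multiply-accumulate is effectual only when both operands are nonzero. The function names and the independence assumption are illustrative only; this is not Sparseloop's actual model or API.

```python
def expected_effectual_macs(total_macs: int, density_a: float, density_b: float) -> float:
    """Expected number of MACs whose two operands are both nonzero.

    Assumption (hypothetical, for illustration): each operand is nonzero
    independently with probability equal to its tensor's density.
    """
    for d in (density_a, density_b):
        if not 0.0 <= d <= 1.0:
            raise ValueError("density must be in [0, 1]")
    return total_macs * density_a * density_b


def expected_savings_fraction(density_a: float, density_b: float) -> float:
    """Fraction of MACs that could be skipped under the same assumption."""
    return 1.0 - density_a * density_b


if __name__ == "__main__":
    # Example: 1000 dense MACs, both operand tensors 50% dense.
    print(expected_effectual_macs(1000, 0.5, 0.5))
    print(expected_savings_fraction(0.5, 0.5))
```

A closed-form estimate like this is evaluated in constant time, which hints at why an analytical model can be much faster than cycle-level simulation; a real framework would also account for data movement and the overhead of the sparse acceleration features, as the abstract notes.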
