Paper Title

DNNFuser: Generative Pre-Trained Transformer as a Generalized Mapper for Layer Fusion in DNN Accelerators

Paper Authors

Sheng-Chun Kao, Xiaoyu Huang, Tushar Krishna

Paper Abstract

Dataflow/mapping decides the compute and energy efficiency of DNN accelerators. Many mappers have been proposed to tackle the intra-layer map-space. However, mappers for the inter-layer map-space (aka the layer-fusion map-space) have rarely been discussed. In this work, we propose a mapper, DNNFuser, specifically focusing on this layer-fusion map-space. While existing SOTA DNN mapping explorations rely on search-based mappers, this is, to the best of our knowledge, the first work to propose a one-shot inference-based mapper. We leverage a Transformer as our DNN architecture to learn layer-fusion optimization as a sequence modeling problem. Further, a trained DNNFuser can generalize its knowledge and infer new solutions for unseen conditions. Within one inference pass, DNNFuser can infer solutions with performance comparable to those found by a highly optimized search-based mapper while being 66x-127x faster.
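The abstract casts layer-fusion mapping as a sequence modeling problem handled by a GPT-style Transformer conditioned on hardware constraints. As a rough illustration of that framing only, the PyTorch sketch below decodes one fusion decision per layer autoregressively from a condition token. The class name `FusionMapperSketch`, the action vocabulary, the scalar buffer-budget condition, and all hyperparameters are assumptions made for illustration, not the paper's actual model or interface.

```python
import torch
import torch.nn as nn

class FusionMapperSketch(nn.Module):
    """Toy decoder-only Transformer that emits one fusion decision per
    DNN layer, autoregressively, conditioned on a hardware condition
    token (here: a single normalized on-chip buffer budget).
    All sizes and the tokenization are illustrative assumptions."""

    def __init__(self, num_actions=8, d_model=64, n_heads=4,
                 n_layers=2, max_len=64):
        super().__init__()
        self.bos = num_actions                  # extra index used as <BOS>
        self.action_emb = nn.Embedding(num_actions + 1, d_model)
        self.cond_proj = nn.Linear(1, d_model)  # scalar budget -> one token
        self.pos_emb = nn.Embedding(max_len, d_model)
        block = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.backbone = nn.TransformerEncoder(block, n_layers)
        self.head = nn.Linear(d_model, num_actions)

    def forward(self, cond, actions):
        # cond: (B, 1) hardware condition; actions: (B, T) decisions so far
        B, T = actions.shape
        pos = self.pos_emb(torch.arange(T, device=actions.device))
        tokens = self.action_emb(actions) + pos
        x = torch.cat([self.cond_proj(cond).unsqueeze(1), tokens], dim=1)
        # causal mask: each position may only attend to earlier positions
        mask = torch.triu(torch.full((T + 1, T + 1), float("-inf"),
                                     device=actions.device), diagonal=1)
        h = self.backbone(x, mask=mask)
        return self.head(h[:, 1:])              # (B, T, num_actions) logits

# One-pass greedy decode: a fusion decision for each layer of a
# hypothetical 10-layer network under a normalized budget of 0.5.
model = FusionMapperSketch().eval()
cond = torch.tensor([[0.5]])
seq = torch.full((1, 1), model.bos, dtype=torch.long)
with torch.no_grad():
    for _ in range(10):
        nxt = model(cond, seq)[:, -1].argmax(-1, keepdim=True)
        seq = torch.cat([seq, nxt], dim=1)
print(seq[:, 1:])  # inferred per-layer mapping decisions
```

Under this formulation, producing a mapping costs one short autoregressive decode through a fixed-depth network rather than an iterative search over candidate mappings, which is the property the abstract contrasts with search-based mappers.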
