Paper Title
ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
Paper Authors
Paper Abstract
Neural networks (NNs) with intensive multiplications (e.g., convolutions and transformers) are capable yet power hungry, impeding their more extensive deployment into resource-constrained devices. As such, multiplication-free networks, which follow a common practice in energy-efficient hardware implementation to parameterize NNs with more efficient operators (e.g., bitwise shifts and additions), have gained growing attention. However, multiplication-free networks usually under-perform their vanilla counterparts in terms of the achieved accuracy. To this end, this work advocates hybrid NNs that consist of both powerful yet costly multiplications and efficient yet less powerful operators for marrying the best of both worlds, and proposes ShiftAddNAS, which can automatically search for more accurate and more efficient NNs. Our ShiftAddNAS highlights two enablers. Specifically, it integrates (1) the first hybrid search space that incorporates both multiplication-based and multiplication-free operators for facilitating the development of both accurate and efficient hybrid NNs; and (2) a novel weight sharing strategy that enables effective weight sharing among different operators that follow heterogeneous distributions (e.g., Gaussian for convolutions vs. Laplacian for add operators) and simultaneously leads to a largely reduced supernet size and much better searched networks. Extensive experiments and ablation studies on various models, datasets, and tasks consistently validate the efficacy of ShiftAddNAS, e.g., achieving up to a +7.7% higher accuracy or a +4.9 better BLEU score compared to state-of-the-art NNs, while leading to up to 93% or 69% energy and latency savings, respectively. Codes and pretrained models are available at https://github.com/RICE-EIC/ShiftAddNAS.
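To make the heterogeneous weight sharing idea concrete, the sketch below shows one possible way to let a single shared weight tensor serve operators whose weights follow different marginal distributions: map approximately Gaussian-distributed shared weights through the Gaussian CDF and then the inverse Laplacian CDF (quantile matching), which preserves the rank order of the weights while reshaping their distribution. This is a minimal illustration under our own assumptions, not the authors' released implementation, and all function and parameter names here are hypothetical.

```python
# Hypothetical sketch (not the authors' code): reshape Gaussian-distributed
# shared weights into approximately Laplacian-distributed weights so that an
# add-based operator can reuse a kernel shared with a convolution.
import numpy as np
from scipy.stats import norm, laplace

def gaussian_to_laplacian(shared_w, mu=0.0, sigma=1.0, loc=0.0, b=1.0):
    """Quantile-match Gaussian-distributed weights to a Laplacian distribution."""
    u = norm.cdf(shared_w, loc=mu, scale=sigma)   # map weights to uniform [0, 1]
    u = np.clip(u, 1e-6, 1 - 1e-6)                # keep away from 0/1 to avoid infinities
    return laplace.ppf(u, loc=loc, scale=b)       # map quantiles to Laplacian values

# Example: a shared 3x3 conv kernel reused by an add-based operator.
shared_kernel = np.random.normal(0.0, 0.05, size=(64, 64, 3, 3))
add_kernel = gaussian_to_laplacian(shared_kernel, sigma=0.05, b=0.05)
```

Because the transform is monotonic and applied element-wise, the two operators can share a single set of supernet parameters while each sees weights matching its expected distribution, which is the kind of saving in supernet size the abstract refers to.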