Benchmarking of DL Libraries and Models on Mobile Devices

Paper Title

Benchmarking of DL Libraries and Models on Mobile Devices

Authors

Qiyang Zhang, Xiang Li, Xiangying Che, Xiao Ma, Ao Zhou, Mengwei Xu, Shangguang Wang, Yun Ma, Xuanzhe Liu

Abstract

Deploying deep learning (DL) on mobile devices has been a notable trend in recent years. To support fast on-device DL inference, DL libraries (DL libs) play as critical a role as algorithms and hardware do. Unfortunately, no prior work has examined the ecosystem of modern DL libs in depth or provided quantitative results on their performance. In this paper, we first build a comprehensive benchmark that includes 6 representative DL libs and 15 diversified DL models. We then perform extensive experiments on 10 mobile devices, which help reveal a complete landscape of the current mobile DL lib ecosystem. For example, we find that which DL lib performs best is severely fragmented across models and hardware, and the performance gap between DL libs can be very large. In fact, the impact of the DL lib can overwhelm the optimizations from algorithms or hardware, e.g., model quantization and GPU/DSP-based heterogeneous computing. Finally, building on these observations, we summarize practical implications for the different roles in the DL lib ecosystem.
