Paper Title
CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution
Paper Authors
Paper Abstract
Mobile devices run deep learning models for various purposes, such as image classification and speech recognition. Due to the resource constraints of mobile devices, researchers have focused on either making a lightweight deep neural network (DNN) model using model pruning or generating efficient code using compiler optimization. Surprisingly, we found that a straightforward integration of model compression and compiler auto-tuning often does not produce the most efficient model for a target device. We propose CPrune, a compiler-informed model pruning technique for efficient target-aware DNN execution that supports an application with a required target accuracy. CPrune makes a lightweight DNN model through informed pruning based on the structural information of subgraphs built during the compiler tuning process. Our experimental results show that CPrune increases DNN execution speed by up to 2.73x compared to state-of-the-art TVM auto-tuning while satisfying the accuracy requirement.
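The core idea described above — choosing what to prune based on per-subgraph cost measured during compiler tuning, rather than weight statistics alone — can be illustrated with a minimal sketch. This is not the paper's implementation; all function names, layer names, and numbers below are hypothetical placeholders for measurements a compiler auto-tuner would provide.

```python
# Illustrative sketch of compiler-informed pruning candidate selection.
# Assumption: the compiler tuning process yields per-subgraph latency on
# the target device, and we have an estimated accuracy cost per layer.

def pick_pruning_candidates(subgraph_latency_ms, accuracy_drop, max_drop=1.0):
    """Rank prunable layers by measured target-device latency, keeping
    only those whose estimated accuracy loss stays within the budget."""
    candidates = [
        layer for layer, drop in accuracy_drop.items() if drop <= max_drop
    ]
    # Prune the most expensive subgraphs first: they promise the largest
    # execution-time savings on the target device.
    return sorted(candidates,
                  key=lambda layer: subgraph_latency_ms[layer],
                  reverse=True)

# Hypothetical measurements from an auto-tuning run on the target device.
latency = {"conv1": 4.2, "conv2": 9.8, "fc": 1.1}      # ms per subgraph
drop = {"conv1": 0.4, "conv2": 0.7, "fc": 2.5}          # est. accuracy loss (%)

print(pick_pruning_candidates(latency, drop))
```

The point of the sketch is the ordering criterion: a magnitude-based pruner might pick `fc` (smallest weights, say), but a target-aware pruner prefers `conv2`, whose subgraph dominates measured execution time, and excludes `fc` because its accuracy cost exceeds the budget.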