Paper Title

InFi: End-to-End Learning to Filter Input for Resource-Efficiency in Mobile-Centric Inference

Paper Authors

Mu Yuan, Lan Zhang, Fengxiang He, Xueting Tong, Miao-Hui Song, Zhengyuan Xu, Xiang-Yang Li

Paper Abstract

Mobile-centric AI applications place high demands on the resource efficiency of model inference. Input filtering is a promising approach that eliminates redundant inputs to reduce the cost of inference. Previous efforts have tailored effective solutions for many applications, but left two essential questions unanswered: (1) the theoretical filterability of an inference workload, which would guide the application of input filtering techniques and thereby avoid trial-and-error costs for resource-constrained mobile applications; (2) the robust discriminability of feature embeddings, which would allow input filtering to be widely effective across diverse inference tasks and input content. To answer them, we first formalize the input filtering problem and theoretically compare the hypothesis complexity of inference models and input filters to understand the optimization potential. We then propose the first end-to-end learnable input filtering framework, which covers most state-of-the-art methods and surpasses them in feature embedding with robust discriminability. We design and implement InFi, which supports six input modalities and multiple mobile-centric deployments. Comprehensive evaluations confirm our theoretical results and show that InFi outperforms strong baselines in applicability, accuracy, and efficiency. InFi achieves 8.5x throughput and saves 95% of bandwidth, while keeping over 90% accuracy, for a video analytics application on mobile platforms.
