Paper Title

SEIFER: Scalable Edge Inference for Deep Neural Networks

Authors

Arjun Parthasarathy, Bhaskar Krishnamachari

Abstract

Edge inference is becoming ever more prevalent, with applications ranging from retail to wearable technology. Clusters of networked, resource-constrained edge devices are becoming common, yet there is no production-ready orchestration system for deploying deep learning models over such edge networks that adopts the robustness and scalability of the cloud. We present SEIFER, a framework that uses a standalone Kubernetes cluster to partition a given DNN and place these partitions in a distributed manner across an edge network, with the goal of maximizing inference throughput. The system is node fault-tolerant and automatically updates deployments when the model's version is updated. We provide a preliminary evaluation of a partitioning and placement algorithm that works within this framework, and show that we can improve the inference pipeline throughput by 200% by utilizing sufficient numbers of resource-constrained nodes. We have implemented SEIFER in open-source software that is publicly available to the research community.
