论文标题

一种用于异质分布计算的新组合编码设计

A New Combinatorial Coded Design for Heterogeneous Distributed Computing

论文作者

Woolsey, Nicholas, Chen, Rong-Rong, Ji, Mingyue

论文摘要

Li等人介绍的编码分布式计算(CDC)。 2015年,提供了一种有效的方法来交易计算能力,以减少一般分布式计算框架(例如MapReduce和Spark)中的通信负载。特别是,将MAP阶段中的计算负载增加一个因子,可以创建编码的多播机会,以减少相同因素的混合阶段的通信负载。但是,CDC方案是为均匀设置而设计的,其中存储,计算负载和计算节点上的通信负载相同。此外,相对于获得承诺增益的节点的数量,它需要大量的输入文件(数据批),减少功能和多播组。我们通过基于组合设计的新型CDC方法来解决CDC的限制,该方法可容纳各种均匀网络,该网络在该网络中具有不同的存储和计算功能。此外,与Li等人提出的原始CDC方案相比,提出的方法需要较少的输入文件数量。同时,与常规的未编码单播和渐近地实现Li等人提出的最佳性能相比,所得的计算通信权衡保持了多重增益。

Coded Distributed Computing (CDC) introduced by Li et al. in 2015 offers an efficient approach to trade computing power to reduce the communication load in general distributed computing frameworks such as MapReduce and Spark. In particular, increasing the computation load in the Map phase by a factor of r can create coded multicasting opportunities to reduce the communication load in the Shuffle phase by the same factor. However, the CDC scheme is designed for the homogeneous settings, where the storage, computation load and communication load on the computing nodes are the same. In addition, it requires an exponentially large number of input files (data batches), reduce functions and multicasting groups relative to the number of nodes to achieve the promised gain. We address the CDC limitations by proposing a novel CDC approach based on a combinatorial design, which accommodates heterogeneous networks where nodes have varying storage and computing capabilities. In addition, the proposed approach requires an exponentially less number of input files compared to the original CDC scheme proposed by Li et al. Meanwhile, the resulting computation-communication trade-off maintains the multiplicative gain compared to conventional uncoded unicast and asymptotically achieves the optimal performance proposed by Li et al.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源