Paper Title
When Load Rebalancing Does Not Work for Distributed Hash Table
Paper Authors
Paper Abstract
The distributed hash table (DHT) is the foundation of many widely used storage systems, owing to its high scalability and load balancing. Recently, DHT-based systems have been deployed in Internet-of-Things (IoT) application scenarios. Unfortunately, such systems can experience a breakdown during the scale-out and load rebalancing process. This phenomenon contradicts the common conception of DHT systems, especially regarding their scalability and load-balancing features. In this paper, we investigate the breakdown of DHT-based systems in the scale-out process. We formulate the load rebalancing problem of DHT by considering the impacts of write workloads and data movement. We show that the average network bandwidth of each node and the intensity of the average write workload are the two key factors that determine the feasibility of DHT load rebalancing. We theoretically prove that load rebalancing is not feasible for a large DHT system under heavy write workloads in a node-by-node scale-out process.