Title
RIO: Order-Preserving and CPU-Efficient Remote Storage Access
Authors
Abstract
Modern NVMe SSDs and RDMA networks provide dramatically higher bandwidth and concurrency. Existing networked storage systems (e.g., NVMe over Fabrics) fail to fully exploit these new devices due to inefficient storage ordering guarantees: the strictly synchronous execution of ordered requests in these systems stalls the CPU and I/O devices and lowers the CPU and I/O efficiency of the storage system. We present Rio, a new approach to preserving storage order in remote storage access. The key insight in Rio is that the layered design of the software stack, along with the concurrent and asynchronous network and storage devices, makes the storage stack conceptually similar to the CPU pipeline. Inspired by the CPU pipeline, which executes out of order and commits in order, Rio introduces the I/O pipeline, which allows internal out-of-order and asynchronous execution of ordered write requests while offering intact external storage order to applications. Together with merging of consecutive ordered requests, these design decisions yield write throughput and CPU efficiency close to those of orderless requests. We implement Rio in the Linux NVMe over RDMA stack, and further build a file system named RioFS atop Rio. Evaluations show that Rio outperforms Linux NVMe over RDMA and a state-of-the-art storage stack named Horae by two orders of magnitude and 4.9 times on average, respectively, in terms of throughput of ordered write requests. RioFS increases the throughput of RocksDB by 1.9 times and 1.5 times on average, against Ext4 and HoraeFS, respectively.
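The core mechanism the abstract describes, out-of-order execution with in-order commit, can be illustrated with a minimal sketch. This is not Rio's actual implementation; the `IOPipeline` class, its fields, and its methods are hypothetical names chosen for illustration. Like a CPU's reorder buffer, ordered writes are tagged with submission-order sequence numbers, may complete on the device in any order, and are buffered so that their effects become externally visible strictly in submission order:

```python
class IOPipeline:
    """Illustrative sketch (not Rio's real code): ordered writes execute
    out of order but commit (become externally visible) in order."""

    def __init__(self):
        self.next_seq = 0          # sequence number for the next submitted write
        self.next_commit = 0       # sequence number of the next write to commit
        self.completed = {}        # out-of-order completions awaiting commit
        self.committed_order = []  # externally visible commit order

    def submit(self):
        """Assign a submission-order sequence number to an ordered write."""
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def complete(self, seq, payload):
        """Device finished a write, possibly out of order: buffer it,
        then commit every write whose predecessors have all committed."""
        self.completed[seq] = payload
        while self.next_commit in self.completed:
            self.committed_order.append(self.completed.pop(self.next_commit))
            self.next_commit += 1

p = IOPipeline()
a, b, c = p.submit(), p.submit(), p.submit()
p.complete(c, "write-C")   # device finishes C first: C is buffered, not visible
p.complete(a, "write-A")   # A commits immediately; C still waits on B
p.complete(b, "write-B")   # B then C commit, preserving submission order
print(p.committed_order)   # -> ['write-A', 'write-B', 'write-C']
```

The point of the sketch is that the device stays busy (no synchronous stall between ordered writes), while applications only ever observe the intact submission order.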