论文标题

LSI:学识渊博的辅助指数结构

LSI: A Learned Secondary Index Structure

论文作者

Kipf, Andreas, Horn, Dominik, Pfeil, Pascal, Marcus, Ryan, Kraska, Tim

论文摘要

与传统的B-Trees相比,学到的指数结构已被证明可以实现有利的查找性能和空间消耗。但是,大多数博学的指数研究都集中在对基本数据进行分类的主要索引设置上。在这项工作中,我们调查了在次要索引环境中学习指数是否维持其优势。我们引入了学习的二级索引(LSI),这是第一次尝试使用学习的索引来索引未排序数据的尝试。 LSI通过在置换量向量上构建学习的索引来起作用,该索引允许使用随机访问在未分类的基本数据上执行二进制搜索。我们还用指纹向量增强LSI,以加速平等查找。我们表明,LSI可以达到可比的查找性能与最先进的次要索引,同时提高空间效率高达6倍。

Learned index structures have been shown to achieve favorable lookup performance and space consumption compared to their traditional counterparts such as B-trees. However, most learned index studies have focused on the primary indexing setting, where the base data is sorted. In this work, we investigate whether learned indexes sustain their advantage in the secondary indexing setting. We introduce Learned Secondary Index (LSI), a first attempt to use learned indexes for indexing unsorted data. LSI works by building a learned index over a permutation vector, which allows binary search to performed on the unsorted base data using random access. We additionally augment LSI with a fingerprint vector to accelerate equality lookups. We show that LSI achieves comparable lookup performance to state-of-the-art secondary indexes while being up to 6x more space efficient.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源