论文标题
使用矩阵乘法快速加入项目查询评估
Fast Join Project Query Evaluation using Matrix Multiplication
论文作者
论文摘要
在过去的几年中,为了实现与关系数据库的联合查询的最糟糕最佳性,已有很多努力致力于开发加入算法。为此,数据库社区在开发简洁的算法方面取得了巨大的成功,这些算法可以实现最佳的最佳运行时进行完整的联接查询,即,即将加入与输入数据库中存在的所有变量相比。但是,除了在查询执行计划中推出投影操作员的一些简单技术之外,对使用{\ em Projections}的加入评估知之甚少。此类查询在实体匹配,图形分析和在压缩图上进行搜索中具有大量应用程序。在本文中,我们研究了如何使用最差的最佳算法以及矩阵乘法来更快地评估带有投影的连接查询。至关重要的是,我们的算法是通过最终结果的输出大小来参数化的,从而可以选择最佳的执行策略。我们将算法作为子例程实施,并将其与最先进的技术进行比较,以表明它们可以将其改进多达50倍。更重要的是,我们的实验表明,矩阵乘法是一个有用的操作,可以帮助加快加入加入处理,这是由于高度优化的开源库,这些库也高度可行。
In the last few years, much effort has been devoted to developing join algorithms in order to achieve worst-case optimality for join queries over relational databases. Towards this end, the database community has had considerable success in developing succinct algorithms that achieve worst-case optimal runtime for full join queries, i.e the join is over all variables present in the input database. However, not much is known about join evaluation with {\em projections} beyond some simple techniques of pushing down the projection operator in the query execution plan. Such queries have a large number of applications in entity matching, graph analytics and searching over compressed graphs. In this paper, we study how a class of join queries with projections can be evaluated faster using worst-case optimal algorithms together with matrix multiplication. Crucially, our algorithms are parameterized by the output size of the final result, allowing for choice of the best execution strategy. We implement our algorithms as a subroutine and compare the performance with state-of-the-art techniques to show they can be improved upon by as much as 50x. More importantly, our experiments indicate that matrix multiplication is a useful operation that can help speed up join processing owing to highly optimized open source libraries that are also highly parallelizable.