复制和绘制汇编：高级语言和字节码的快速汇编算法

论文标题

复制和绘制汇编：高级语言和字节码的快速汇编算法

Copy-and-Patch Compilation: A fast compilation algorithm for high-level languages and bytecode

论文作者

Xu, Haoran, Kjolstad, Fredrik

论文摘要

当运行时发生编译时，快速汇编很重要，例如现代数据库系统中的查询编译器以及现代浏览器中的WebAssembly虚拟机。我们提出复制和绘制，这是一种非常快速的编译技术，还会产生高质量的代码。它能够通过将大型二进制实现库中的代码拼接在一起，将高级语言和低级字节码程序降低到二进制代码。我们称这些二进制实现模板是因为它们具有在代码生成期间必须插入缺失值的孔。我们展示了如何构建模板库并描述生成优化二进制代码的复制和点曲线算法。我们演示了两个复制和点的用例：用于元编程的高级C式语言的编译器，以及用于WebAssembly的编译器。我们的高级语言编译器的编译成本可以忽略不计：它在构建AST所需的时间少于AST的代码。我们已经在此元编程系统的基础上实现了SQL数据库查询编译器，并表明在TPC-H数据库基准测试中，复制和patch在代码上生成了两个比LLVM -O0快两个阶数级级，并且比更高的优化水平更快。生成的代码的数量级比解释快，比LLVM -O0快14％。我们的WebAssembly编译器生成的代码4.9x-6.5倍，比Google Chrome中的WebAssembly基线编译器升降机快。生成的代码在Coremark和PolybenchC WebAssembly基准测试上还优于升降机的升降机39％-63％。

Fast compilation is important when compilation occurs at runtime, such as query compilers in modern database systems and WebAssembly virtual machines in modern browsers. We present copy-and-patch, an extremely fast compilation technique that also produces good quality code. It is capable of lowering both high-level languages and low-level bytecode programs to binary code, by stitching together code from a large library of binary implementation variants. We call these binary implementations stencils because they have holes where missing values must be inserted during code generation. We show how to construct a stencil library and describe the copy-and-patch algorithm that generates optimized binary code. We demonstrate two use cases of copy-and-patch: a compiler for a high-level C-like language intended for metaprogramming and a compiler for WebAssembly. Our high-level language compiler has negligible compilation cost: it produces code from an AST in less time than it takes to construct the AST. We have implemented an SQL database query compiler on top of this metaprogramming system and show that on TPC-H database benchmarks, copy-and-patch generates code two orders of magnitude faster than LLVM -O0 and three orders of magnitude faster than higher optimization levels. The generated code runs an order of magnitude faster than interpretation and 14% faster than LLVM -O0. Our WebAssembly compiler generates code 4.9X-6.5X faster than Liftoff, the WebAssembly baseline compiler in Google Chrome. The generated code also outperforms Liftoff's by 39%-63% on the Coremark and PolyBenchC WebAssembly benchmarks.

下载PDF全文

下载文献需遵守相关版权规定

论文标题