Paper Title
FAB: An FPGA-based Accelerator for Bootstrappable Fully Homomorphic Encryption
Authors
Abstract
FHE offers protection to private data on third-party cloud servers by allowing computations on the data in encrypted form. However, to support general-purpose encrypted computations, all existing FHE schemes require an expensive operation known as bootstrapping. Unfortunately, the computation cost and the memory bandwidth required for bootstrapping add significant overhead to FHE-based computations, limiting the practical use of FHE. In this work, we propose FAB, an FPGA-based accelerator for bootstrappable FHE. Prior FPGA-based FHE accelerators have proposed hardware acceleration of basic FHE primitives for impractical parameter sets, without support for bootstrapping. FAB, for the first time ever, accelerates bootstrapping (along with basic FHE primitives) on an FPGA for a secure and practical parameter set. The key contribution of our work is to architect a balanced FAB design that is not memory-bound. To this end, we leverage recent algorithms for bootstrapping while being cognizant of the compute and memory constraints of our FPGA. We use a minimal number of functional units for computing, operate at a low frequency, leverage high data rates to and from main memory, utilize the limited on-chip memory effectively, and perform operation scheduling carefully. For bootstrapping a fully-packed ciphertext, while operating at 300 MHz, FAB outperforms existing state-of-the-art CPU and GPU implementations by 213x and 1.5x, respectively. Our target FHE application is training a logistic regression model over encrypted data. For logistic regression model training scaled to 8 FPGAs on the cloud, FAB outperforms a CPU and a GPU by 456x and 6.5x, respectively, and provides competitive performance when compared to the state-of-the-art ASIC design at a fraction of the cost.
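A note on the target application: FHE schemes evaluate only additions and multiplications on ciphertexts, so logistic regression training over encrypted data typically replaces the sigmoid with a low-degree polynomial that the scheme can evaluate. The sketch below illustrates this standard substitution in plaintext; the coefficients are an illustrative published-style degree-3 least-squares fit over [-8, 8], not taken from this paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def poly3(x):
    # Degree-3 least-squares-style approximation of sigmoid over [-8, 8].
    # Illustrative coefficients; the paper does not specify its polynomial.
    t = x / 8.0
    return 0.5 + 1.20096 * t - 0.81562 * t ** 3

# Worst-case deviation of the polynomial surrogate on the fitting interval.
max_err = max(abs(sigmoid(x / 10.0) - poly3(x / 10.0)) for x in range(-80, 81))
print(f"max |sigmoid - poly3| on [-8, 8]: {max_err:.3f}")
```

Under FHE, every nonlinear activation is traded for such a polynomial, and the multiplicative depth it consumes is exactly what makes bootstrapping (the operation FAB accelerates) necessary for training over many iterations.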