Paper title
FLUPS -- a flexible and performant massively parallel Fourier transform library
Paper authors
Paper abstract
Massively parallel Fourier transforms are widely used in computational sciences, and specifically in computational fluid dynamics when it involves unbounded Poisson problems. In practice, the latter is usually the most time-consuming operation due to its inescapable all-to-all communication pattern. The original flups library tackles that issue with an implementation of the distributed Fourier transform tailor-made for successive resolutions of unbounded Poisson problems. However, the proposed implementation lacks flexibility, as it only supports a cell-centered data layout and features a plain communication strategy. This work extends the library along two directions. First, the flups implementation is generalized to support a node-centered data layout. Second, three distinct approaches are provided to handle the communications: one all-to-all implementation, and two non-blocking implementations relying on manual packing and MPI_Datatype to communicate over the network. The proposed software is validated against analytical solutions for unbounded, semi-unbounded, and periodic domains. The performance of the approaches is then compared against accFFT, another distributed FFT implementation, using a periodic case. Finally, the performance metrics of each implementation are analyzed and detailed on various top-tier European facilities on up to 49,152 cores. This work brings flups up to a fully production-ready and performant distributed FFT library, featuring all the possible types of FFTs and with flexibility in the data layout. The code is available under a BSD-3 license at github.com/vortexlab-uclouvain/flups.
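As an illustration of the periodic validation case and of the two data layouts mentioned in the abstract, the following is a minimal serial sketch, not the flups API. It assumes a 1D periodic domain, the convention u'' = f, and a naive O(n^2) DFT in place of a distributed FFT; the function names (`grid_points`, `dft`, `solve_periodic_poisson`, `max_error`) are hypothetical and chosen only for this example.

```python
import cmath
import math


def grid_points(n, length, layout):
    """Sample locations for n points on the periodic domain [0, length)."""
    h = length / n
    # Assumption: cell-centered data sit half a grid step inside each cell,
    # node-centered data sit on the cell boundaries.
    offset = 0.5 if layout == "cell" else 0.0
    return [(i + offset) * h for i in range(n)]


def dft(values, sign):
    """Naive O(n^2) DFT (unnormalized); sign=-1 is forward, +1 is backward."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * k * j / n)
            for j, v in enumerate(values))
        for k in range(n)
    ]


def solve_periodic_poisson(rhs, length):
    """Solve u'' = rhs with periodic boundaries, fixing the mean of u to zero."""
    n = len(rhs)
    rhs_hat = dft(rhs, -1)
    u_hat = [0j] * n  # mode 0 stays zero: the mean of u is undetermined
    for k in range(1, n):
        m = k if k <= n // 2 else k - n      # signed mode number
        kappa = 2.0 * math.pi * m / length   # wavenumber of mode m
        u_hat[k] = -rhs_hat[k] / kappa ** 2  # spectral inverse of d^2/dx^2
    # Backward transform, normalized by n; the solution is real.
    return [(c / n).real for c in dft(u_hat, +1)]


def max_error(layout, n=16):
    """Largest pointwise error against the analytical solution u = sin(x)."""
    length = 2.0 * math.pi
    x = grid_points(n, length, layout)
    rhs = [-math.sin(xi) for xi in x]  # u'' for u = sin(x)
    u = solve_periodic_poisson(rhs, length)
    return max(abs(ui - math.sin(xi)) for ui, xi in zip(u, x))
```

Calling `max_error("cell")` and `max_error("node")` compares the computed solution with the analytical one on each layout. In this periodic sketch the layout only shifts the sample locations; the semi-unbounded and unbounded cases treated by the library require different transforms and are not covered here.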