Paper Title
Parallelizing Explicit and Implicit Extrapolation Methods for Ordinary Differential Equations
Authors
Abstract
Numerically solving ordinary differential equations (ODEs) is a naturally serial process, and as a result the vast majority of ODE solver software is serial. In this manuscript we develop a set of parallelized ODE solvers using extrapolation methods, which exploit "parallelism within the method" so that arbitrary user ODEs can be parallelized. We describe the specific choices made in implementing the explicit and implicit extrapolation methods; these choices allow low-overhead static schedules to be generated and then exploited by optimized multi-threaded implementations. We demonstrate that, although multi-threading gives a noticeable acceleration on both explicit and implicit problems, the explicit parallel extrapolation methods give no significant improvement over the state of the art: even with a multi-threading advantage, they do not outperform current optimized high-order Runge-Kutta tableaux. However, we demonstrate that the implicit parallel extrapolation methods achieve state-of-the-art performance (2x-4x) on standard multicore x86 CPUs for systems of $<200$ stiff ODEs solved at low tolerance, a typical setup for the vast majority of users of high-level-language equation solver suites. The resulting method is distributed as the first widely available open-source software for within-method parallel acceleration targeting typical modest compute architectures.
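To make "parallelism within the method" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a scalar ODE, explicit Euler as the base method, and the harmonic subdivision sequence $n_j = j$; the function names `euler_substeps` and `extrapolation_step` are hypothetical. The point it illustrates is that the base approximations over one step of size $H$ are mutually independent, so they can be assigned to threads by a fixed schedule before being combined with the Aitken-Neville recursion. Python threads are used only to show the structure; because of the interpreter lock they are not expected to give a real speedup here.

```python
from concurrent.futures import ThreadPoolExecutor


def euler_substeps(f, t0, y0, H, n):
    """Integrate y' = f(t, y) from t0 to t0 + H using n explicit Euler substeps."""
    h = H / n
    t, y = t0, y0
    for _ in range(n):
        y = y + h * f(t, y)
        t += h
    return y


def extrapolation_step(f, t0, y0, H, order=4, max_workers=4):
    """One extrapolation step of size H (illustrative sketch, scalar ODE)."""
    ns = list(range(1, order + 1))  # assumed harmonic sequence n_j = j

    # Each base approximation depends only on (t0, y0, H, n), so all of them
    # can be computed concurrently: this is the within-method parallelism.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        base = list(pool.map(lambda n: euler_substeps(f, t0, y0, H, n), ns))

    # Serial Aitken-Neville recursion. Explicit Euler has an error expansion
    # in powers of h, so the step-count ratio enters to the first power.
    T = [[b] for b in base]
    for j in range(1, order):
        for k in range(1, j + 1):
            ratio = ns[j] / ns[j - k]
            diff = T[j][k - 1] - T[j - 1][k - 1]
            T[j].append(T[j][k - 1] + diff / (ratio - 1.0))
    return T[-1][-1]
```

For example, `extrapolation_step(lambda t, y: -y, 0.0, 1.0, 0.1)` advances $y' = -y$, $y(0) = 1$ by one step of size 0.1 and lands much closer to $e^{-0.1}$ than a single Euler step does. In a compiled multi-threaded setting the cost of the base approximations grows with $n_j$, which is why the assignment of approximations to threads matters for load balance.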