Paper Title
A Novel Approach to Generate Correctly Rounded Math Libraries for New Floating Point Representations
Paper Authors
Paper Abstract
Given the importance of floating-point (FP) performance in numerous domains, several new variants of FP and its alternatives have been proposed (e.g., Bfloat16, TensorFloat32, and Posits). These representations do not have correctly rounded math libraries. Further, the use of existing FP libraries for these new representations can produce incorrect results. This paper proposes a novel approach for generating polynomial approximations that can be used to implement correctly rounded math libraries. Existing methods generate polynomials that approximate the real value of an elementary function $f(x)$ and produce wrong results due to approximation errors and rounding errors in the implementation. In contrast, our approach generates polynomials that approximate the correctly rounded value of $f(x)$ (i.e., the value of $f(x)$ rounded to the target representation). It provides more margin to identify efficient polynomials that produce correctly rounded results for all inputs. We frame the problem of generating efficient polynomials that produce correctly rounded results as a linear programming problem. Our approach guarantees that we produce the correct result even with range reduction techniques. Using our approach, we have developed correctly rounded, yet faster, implementations of elementary functions for multiple target representations.
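To make the "more margin" idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes Bfloat16 as the target (8 significand bits), handles only positive normal values, and uses Python's double-precision `math.exp` as a stand-in for a high-precision oracle. All function names (`round_to_target`, `rounding_interval`, `violations`) are hypothetical. For each input $x_i$, every double in the interval $[l_i, h_i]$ rounds to the same target value, so a polynomial only has to satisfy $l_i \le \sum_j c_j x_i^j \le h_i$; these constraints are linear in the coefficients $c_j$, which is what allows a linear programming formulation. The sketch only checks the constraints for a given candidate; it does not solve the linear program.

```python
import math

PRECISION = 8  # Bfloat16 significand bits (7 stored + 1 implicit); an assumption of this sketch


def round_to_target(v: float) -> float:
    """Round a positive double to the nearest Bfloat16 value, ties to even.

    Simplification: subnormals, overflow, zero and negative inputs are ignored.
    """
    m, e = math.frexp(v)               # v = m * 2**e with 0.5 <= m < 1
    scaled = math.ldexp(m, PRECISION)  # exact; lies in [128, 256)
    # Python's round() on a float rounds halfway cases to the even integer.
    return math.ldexp(float(round(scaled)), e - PRECISION)


def rounding_interval(y: float) -> tuple[float, float]:
    """Return the closed interval [lo, hi] of doubles that round to y (y > 0)."""
    m, e = math.frexp(y)
    ulp_up = math.ldexp(1.0, e - PRECISION)
    # Just below a power of two the spacing of target values halves.
    ulp_down = ulp_up / 2 if m == 0.5 else ulp_up
    lo = y - ulp_down / 2
    hi = y + ulp_up / 2
    # A midpoint belongs to the interval only if ties-to-even picks y.
    if round_to_target(lo) != y:
        lo = math.nextafter(lo, math.inf)
    if round_to_target(hi) != y:
        hi = math.nextafter(hi, -math.inf)
    return lo, hi


def horner(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... in double precision."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def violations(coeffs, xs, f):
    """Return the inputs whose polynomial output leaves the rounding interval."""
    bad = []
    for x in xs:
        lo, hi = rounding_interval(round_to_target(f(x)))
        if not lo <= horner(coeffs, x) <= hi:
            bad.append(x)
    return bad


if __name__ == "__main__":
    xs = [k / 1024 for k in range(1, 129)]   # Bfloat16-representable inputs in (0, 0.125]
    candidate = [1.0, 1.0, 0.5, 1.0 / 6.0]   # degree-3 Taylor coefficients, for illustration only
    print(len(violations(candidate, xs, math.exp)), "of", len(xs), "inputs violated")
```

The point of the sketch is that the acceptance test compares the polynomial output against the whole rounding interval rather than against the real value of $f(x)$. A real generator would need an oracle that is itself correctly rounded, and would have to account for range reduction, which this sketch leaves out. It requires Python 3.9 or later for `math.nextafter`.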