File math-remove-slow-path.patch of Package glibc.13450
2018-04-03  Wilco Dijkstra  <wdijkstr@arm.com>

	* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Cleanup ifdefs.
	(__cos): Likewise.
	* sysdeps/ieee754/dbl-64/s_sin.c (__sincos): Refactor using the same
	logic as sin and cos.

2018-04-03  Wilco Dijkstra  <wdijkstr@arm.com>

	* sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Use TAYLOR_SIN for small
	inputs.  Return correct sign.
	(do_sincos): Remove small input check before do_sin, let do_sin set
	the sign.
	(__sin): Likewise.
	(__cos): Likewise.

2018-04-03  Wilco Dijkstra  <wdijkstr@arm.com>

	* sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SLOW): Remove.
	(do_cos_slow): Likewise.
	(do_sin_slow): Likewise.
	(reduce_and_compute): Likewise.
	(slow): Likewise.
	(slow1): Likewise.
	(slow2): Likewise.
	(sloww): Likewise.
	(sloww1): Likewise.
	(sloww2): Likewise.
	(bsloww): Likewise.
	(bsloww1): Likewise.
	(bsloww2): Likewise.
	(cslow2): Likewise.

2018-04-03  Wilco Dijkstra  <wdijkstr@arm.com>

	* sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SIN): Remove cor parameter.
	(do_cos): Remove corp parameter and calculations.
	(do_sin): Likewise.
	(do_sincos): Remove cor variable.
	(__sin): Use do_sincos for huge inputs.
	(__cos): Likewise.
	* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.
	(reduce_and_compute_sincos): Remove unused function.

2018-04-03  Wilco Dijkstra  <wdijkstr@arm.com>

	* sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_1): Rename to
	reduce_sincos, improve accuracy to 136 bits.
	(do_sincos_1): Rename to do_sincos, remove fallbacks to slow functions.
	(__sin): Use improved reduction and simplified do_sincos calculation.
	(__cos): Likewise.
	* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.

2018-04-03  Wilco Dijkstra  <wdijkstr@arm.com>

	* sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_2): Remove function.
	(do_sincos_2): Likewise.
	(__sin): Remove middle range reduction case.
	(__cos): Likewise.
	* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Remove middle range
	reduction case.

2018-04-03  Wilco Dijkstra  <wdijkstr@arm.com>

	* sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos.
	* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small
	inputs.
	(__cos): Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.

2018-02-12  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* manual/probes.texi: Remove slowexp probes.
	* math/Makefile: Remove slowexp.
	* sysdeps/generic/math_private.h (__slowexp): Remove.
	* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and
	document error bounds.
	* sysdeps/i386/fpu/slowexp.c: Remove.
	* sysdeps/ia64/fpu/slowexp.c: Remove.
	* sysdeps/ieee754/dbl-64/slowexp.c: Remove.
	* sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove.
	* sysdeps/m68k/m680x0/fpu/slowexp.c: Remove.
	* sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove.
	* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma.
	* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove.
	* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove.
	* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove.
	* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove.

2018-02-12  Wilco Dijkstra  <wdijkstr@arm.com>

	[BZ #13932]
	* sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove.
	* benchtests/pow-inputs: Update comment for slow path cases.
	* manual/probes.texi (slowpow_p10): Delete removed probe.
	(slowpow_p32): Likewise.
	* math/Makefile: Remove halfulp.c and slowpow.c.
	* sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1.
	* sysdeps/generic/math_private.h (__exp1): Remove error argument.
	(__halfulp): Remove.
	(__slowpow): Remove.
	* sysdeps/i386/fpu/halfulp.c: Delete file.
	* sysdeps/i386/fpu/slowpow.c: Likewise.
	* sysdeps/ia64/fpu/halfulp.c: Likewise.
	* sysdeps/ia64/fpu/slowpow.c: Likewise.
	* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument,
	improve comments and add error analysis.
	* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis.
	(power1): Remove function.
	(log1): Remove error argument, add error analysis.
	(my_log2): Remove function.
	* sysdeps/ieee754/dbl-64/halfulp.c: Delete file.
	* sysdeps/ieee754/dbl-64/slowpow.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise.
	* sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise.
	* sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c.
	* sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1.
	* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c,
	slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define.
	* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise.
	* sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file.
	* sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise.
	* sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.

2018-02-07  Wilco Dijkstra  <wdijkstr@arm.com>

	* manual/probes.texi (slowlog): Delete documentation of removed probe.
	(slowlog_inexact): Likewise.
	* sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Remove slow paths.
	* sysdeps/ieee754/dbl-64/ulog.h: Remove unused declarations.

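Note on the removed "slow paths": the fast paths deleted by this patch all guarded their result with a check of the form res == res + cor * err and fell back to a multi-precision routine when it failed. The sketch below models only that check. The helper name fast_result_is_safe is hypothetical and not part of glibc, and the error factor used in the usage note is an assumed illustrative value, not one taken from this patch.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of glibc.  RES is the rounded fast-path
   result, COR the residual left over from rounding it, and ERR an
   error-bound factor slightly above 1 (the caller's value is an
   assumption).  The result is accepted only if adding the inflated
   residual does not move RES to a neighbouring double.  */
static bool
fast_result_is_safe (double res, double cor, double err)
{
  /* volatile forces the sum to be rounded to double precision before the
     comparison, even on targets that evaluate in extended precision.  */
  volatile double widened = res + cor * err;
  return res == widened;
}
```

Under round-to-nearest, fast_result_is_safe (1.0, 0x1p-60, 1.000014) holds because the residual is far below half an ULP of 1.0, while fast_result_is_safe (1.0, 0x1p-53, 1.000014) fails because the inflated residual crosses the rounding boundary. The old code called a slow routine in the failing case; the patched code returns the fast result unconditionally and documents the resulting ULP bound instead.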
Index: glibc-2.26/benchtests/pow-inputs
===================================================================
--- glibc-2.26.orig/benchtests/pow-inputs
+++ glibc-2.26/benchtests/pow-inputs
@@ -302,8 +302,7 @@
 0x1.c004d2256a5b8p402, -0x1.a01df480fdcb7p98
 0x1.52b9d41aaa1e9p-589, -0x1.292cb15f1459dp46
 -0x1.ea9ca6fa0919ep-279, -0x1.601e44b6a588cp40
-# pow slow path at 240 bits
-# Implemented in sysdeps/ieee754/dbl-64/slowpow.c
+# old pow slow path at 240 bits
 ## name: 240bits
 0x1.01fcd33493ea3p596, -0x1.724bd4e887783p-14
 0x1.032ff59ab34fdp-540, -0x1.61e3632080b87p-24
@@ -405,8 +404,7 @@
 0x1.fae913d4f952ep-809, -0x1.4b649402fce63p-6
 0x1.fe6d725408f24p484, -0x1.25f4f6441d2e4p-12
 0x1.ff6393f9150ccp-718, 0x1.a0cb50a9bf2f3p-31
-# pow slowest path at 768 bits
-# Implemented in sysdeps/ieee754/dbl-64/slowpow.c
+# old pow slowest path at 768 bits
 ## name: 768bits
 1.0000000000000020, 1.5
 0x1.006777b4b61dep843, -0x1.67e3145491872p-1
Index: glibc-2.26/manual/probes.texi
===================================================================
--- glibc-2.26.orig/manual/probes.texi
+++ glibc-2.26/manual/probes.texi
@@ -265,53 +265,6 @@ Unless explicitly mentioned otherwise, a
 precision in the mantissa of the multiple precision number. Hence, a precision
 level of 32 implies 768 bits of precision in the mantissa.
 
-@deftp Probe slowexp_p6 (double @var{$arg1}, double @var{$arg2})
-This probe is triggered when the @code{exp} function is called with an
-input that results in multiple precision computation with precision
-6. Argument @var{$arg1} is the input value and @var{$arg2} is the
-computed output.
-@end deftp
-
-@deftp Probe slowexp_p32 (double @var{$arg1}, double @var{$arg2})
-This probe is triggered when the @code{exp} function is called with an
-input that results in multiple precision computation with precision
-32. Argument @var{$arg1} is the input value and @var{$arg2} is the
-computed output.
-@end deftp
-
-@deftp Probe slowpow_p10 (double @var{$arg1}, double @var{$arg2}, double @var{$arg3}, double @var{$arg4})
-This probe is triggered when the @code{pow} function is called with
-inputs that result in multiple precision computation with precision
-10. Arguments @var{$arg1} and @var{$arg2} are the input values,
-@code{$arg3} is the value computed in the fast phase of the algorithm
-and @code{$arg4} is the final accurate value.
-@end deftp
-
-@deftp Probe slowpow_p32 (double @var{$arg1}, double @var{$arg2}, double @var{$arg3}, double @var{$arg4})
-This probe is triggered when the @code{pow} function is called with an
-input that results in multiple precision computation with precision
-32. Arguments @var{$arg1} and @var{$arg2} are the input values,
-@code{$arg3} is the value computed in the fast phase of the algorithm
-and @code{$arg4} is the final accurate value.
-@end deftp
-
-@deftp Probe slowlog (int @var{$arg1}, double @var{$arg2}, double @var{$arg3})
-This probe is triggered when the @code{log} function is called with an
-input that results in multiple precision computation. Argument
-@var{$arg1} is the precision with which the computation succeeded.
-Argument @var{$arg2} is the input and @var{$arg3} is the computed
-output.
-@end deftp
-
-@deftp Probe slowlog_inexact (int @var{$arg1}, double @var{$arg2}, double @var{$arg3})
-This probe is triggered when the @code{log} function is called with an
-input that results in multiple precision computation and none of the
-multiple precision computations result in an accurate result.
-Argument @var{$arg1} is the maximum precision with which computations
-were performed. Argument @var{$arg2} is the input and @var{$arg3} is
-the computed output.
-@end deftp
-
 @deftp Probe slowatan2 (int @var{$arg1}, double @var{$arg2}, double @var{$arg3}, double @var{$arg4})
 This probe is triggered when the @code{atan2} function is called with
 an input that results in multiple precision computation. Argument
Index: glibc-2.26/math/Makefile
===================================================================
--- glibc-2.26.orig/math/Makefile
+++ glibc-2.26/math/Makefile
@@ -110,9 +110,9 @@ type-ldouble-yes := ldouble
 
 # double support
 type-double-suffix :=
-type-double-routines := branred doasin dosincos halfulp mpa mpatan2 \
-		       mpatan mpexp mplog mpsqrt mptan sincos32 slowexp \
-		       slowpow sincostab k_rem_pio2
+type-double-routines := branred doasin dosincos mpa mpatan2 \
+		       mpatan mpexp mplog mpsqrt mptan sincos32 \
+		       sincostab k_rem_pio2
 
 # float support
 type-float-suffix := f
Index: glibc-2.26/sysdeps/aarch64/libm-test-ulps
===================================================================
--- glibc-2.26.orig/sysdeps/aarch64/libm-test-ulps
+++ glibc-2.26/sysdeps/aarch64/libm-test-ulps
@@ -1012,7 +1012,9 @@ ildouble: 2
 ldouble: 2
 
 Function: "cos":
+double: 1
 float: 1
+idouble: 1
 ifloat: 1
 ildouble: 1
 ldouble: 1
@@ -1932,7 +1934,9 @@ ildouble: 1
 ldouble: 1
 
 Function: "pow":
+double: 1
 float: 1
+idouble: 1
 ifloat: 1
 ildouble: 2
 ldouble: 2
@@ -1992,7 +1996,9 @@ ildouble: 2
 ldouble: 2
 
 Function: "sin":
+double: 1
 float: 1
+idouble: 1
 ifloat: 1
 ildouble: 1
 ldouble: 1
@@ -2022,7 +2028,9 @@ ildouble: 3
 ldouble: 3
 
 Function: "sincos":
+double: 1
 float: 1
+idouble: 1
 ifloat: 1
 ildouble: 1
 ldouble: 1
Index: glibc-2.26/sysdeps/generic/math_private.h
===================================================================
--- glibc-2.26.orig/sysdeps/generic/math_private.h
+++ glibc-2.26/sysdeps/generic/math_private.h
@@ -255,20 +255,17 @@ extern float __kernel_standard_f (float,
 extern long double __kernel_standard_l (long double,long double,int);
 
 /* Prototypes for functions of the IBM Accurate Mathematical Library. */
-extern double __exp1 (double __x, double __xx, double __error);
+extern double __exp1 (double __x, double __xx);
 extern double __sin (double __x);
 extern double __cos (double __x);
 extern int __branred (double __x, double *__a, double *__aa);
 extern void __doasin (double __x, double __dx, double __v[]);
 extern void __dubsin (double __x, double __dx, double __v[]);
 extern void __dubcos (double __x, double __dx, double __v[]);
-extern double __halfulp (double __x, double __y);
 extern double __sin32 (double __x, double __res, double __res1);
 extern double __cos32 (double __x, double __res, double __res1);
 extern double __mpsin (double __x, double __dx, bool __range_reduce);
 extern double __mpcos (double __x, double __dx, bool __range_reduce);
-extern double __slowexp (double __x);
-extern double __slowpow (double __x, double __y, double __z);
 extern void __docos (double __x, double __dx, double __v[]);
 
 #ifndef math_opt_barrier
Index: glibc-2.26/sysdeps/i386/fpu/halfulp.c
===================================================================
--- glibc-2.26.orig/sysdeps/i386/fpu/halfulp.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Not needed. */
Index: glibc-2.26/sysdeps/i386/fpu/slowexp.c
===================================================================
--- glibc-2.26.orig/sysdeps/i386/fpu/slowexp.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Not needed. */
Index: glibc-2.26/sysdeps/i386/fpu/slowpow.c
===================================================================
--- glibc-2.26.orig/sysdeps/i386/fpu/slowpow.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Not needed. */
Index: glibc-2.26/sysdeps/ia64/fpu/halfulp.c
===================================================================
--- glibc-2.26.orig/sysdeps/ia64/fpu/halfulp.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Not needed. */
Index: glibc-2.26/sysdeps/ia64/fpu/slowexp.c
===================================================================
--- glibc-2.26.orig/sysdeps/ia64/fpu/slowexp.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Not needed. */
Index: glibc-2.26/sysdeps/ia64/fpu/slowpow.c
===================================================================
--- glibc-2.26.orig/sysdeps/ia64/fpu/slowpow.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Not needed. */
Index: glibc-2.26/sysdeps/ieee754/dbl-64/e_exp.c
===================================================================
--- glibc-2.26.orig/sysdeps/ieee754/dbl-64/e_exp.c
+++ glibc-2.26/sysdeps/ieee754/dbl-64/e_exp.c
@@ -23,10 +23,10 @@
 /* exp1 */
 /* */
 /* FILES NEEDED:dla.h endian.h mpa.h mydefs.h uexp.h */
-/* mpa.c mpexp.x slowexp.c */
+/* mpa.c mpexp.x */
 /* */
 /* An ultimate exp routine. Given an IEEE double machine number x */
-/* it computes the correctly rounded (to nearest) value of e^x */
+/* it computes an almost correctly rounded (to nearest) value of e^x */
 /* Assumption: Machine arithmetic operations are performed in */
 /* round to nearest mode of IEEE 754 standard. */
 /* */
@@ -46,10 +46,6 @@
 # define SECTION
 #endif
 
-double __slowexp (double);
-
-/* An ultimate exp routine. Given an IEEE double machine number x it computes
-   the correctly rounded (to nearest) value of e^x. */
 double
 SECTION
 __ieee754_exp (double x)
@@ -93,17 +89,10 @@ __ieee754_exp (double x)
 
         rem = (bet + bet * eps) + al * eps;
         res = al + rem;
-        cor = (al - res) + rem;
-        if (res == (res + cor * err_0))
-          {
-            retval = res * binexp.x;
-            goto ret;
-          }
-        else
-          {
-            retval = __slowexp (x);
-            goto ret;
-          } /*if error is over bound */
+        /* Maximum relative error is 7.8e-22 (70.1 bits).
+           Maximum ULP error is 0.500007. */
+        retval = res * binexp.x;
+        goto ret;
       }
 
     if (n <= smallint)
@@ -166,38 +155,22 @@ __ieee754_exp (double x)
         if (ex >= -1022)
           {
             binexp.i[HIGH_HALF] = (1023 + ex) << 20;
-            if (res == (res + cor * err_0))
-              {
-                retval = res * binexp.x;
-                goto ret;
-              }
-            else
-              {
-                retval = __slowexp (x);
-                goto check_uflow_ret;
-              } /*if error is over bound */
+            /* Does not underflow: res >= 1.0, binexp >= 0x1p-1022
+               Maximum relative error is 7.8e-22 (70.1 bits).
+               Maximum ULP error is 0.500007. */
+            retval = res * binexp.x;
+            goto ret;
           }
         ex = -(1022 + ex);
         binexp.i[HIGH_HALF] = (1023 - ex) << 20;
         res *= binexp.x;
         cor *= binexp.x;
-        eps = 1.0000000001 + err_0 * binexp.x;
         t = 1.0 + res;
         y = ((1.0 - t) + res) + cor;
         res = t + y;
-        cor = (t - res) + y;
-        if (res == (res + eps * cor))
-          {
-            binexp.i[HIGH_HALF] = 0x00100000;
-            retval = (res - 1.0) * binexp.x;
-            goto check_uflow_ret;
-          }
-        else
-          {
-            retval = __slowexp (x);
-            goto check_uflow_ret;
-          } /* if error is over bound */
-      check_uflow_ret:
+        /* Maximum ULP error is 0.5000035. */
+        binexp.i[HIGH_HALF] = 0x00100000;
+        retval = (res - 1.0) * binexp.x;
         if (retval < DBL_MIN)
           {
             double force_underflow = tiny * tiny;
@@ -210,10 +183,9 @@ __ieee754_exp (double x)
     else
       {
         binexp.i[HIGH_HALF] = (junk1.i[LOW_HALF] + 767) << 20;
-        if (res == (res + cor * err_0))
-          retval = res * binexp.x * t256.x;
-        else
-          retval = __slowexp (x);
+        /* Maximum relative error is 7.8e-22 (70.1 bits).
+           Maximum ULP error is 0.500007. */
+        retval = res * binexp.x * t256.x;
         if (isinf (retval))
           goto ret_huge;
         else
@@ -233,13 +205,10 @@ ret:
 strong_alias (__ieee754_exp, __exp_finite)
 #endif
 
-/* Compute e^(x+xx). The routine also receives bound of error of previous
-   calculation. If after computing exp the error exceeds the allowed bounds,
-   the routine returns a non-positive number. Otherwise it returns the
-   computed result, which is always positive. */
+/* Compute e^(x+xx). */
 double
 SECTION
-__exp1 (double x, double xx, double error)
+__exp1 (double x, double xx)
 {
   double bexp, t, eps, del, base, y, al, bet, res, rem, cor;
   mynumber junk1, junk2, binexp = {{0, 0}};
@@ -249,6 +218,7 @@ __exp1 (double x, double xx, double erro
   m = junk1.i[HIGH_HALF];
   n = m & hugeint; /* no sign */
 
+  /* fabs (x) > 5.551112e-17 and fabs (x) < 7.080010e+02. */
   if (n > smallint && n < bigint)
     {
       y = x * log2e.x + three51.x;
@@ -276,11 +246,9 @@ __exp1 (double x, double xx, double erro
 
       rem = (bet + bet * eps) + al * eps;
       res = al + rem;
-      cor = (al - res) + rem;
-      if (res == (res + cor * (1.0 + error + err_1)))
-        return res * binexp.x;
-      else
-        return -10.0;
+      /* Maximum relative error before rounding is 8.8e-22 (69.9 bits).
+         Maximum ULP error is 0.500008. */
+      return res * binexp.x;
     }
 
   if (n <= smallint)
@@ -318,6 +286,7 @@ __exp1 (double x, double xx, double erro
   cor = (al - res) + rem;
   if (m >> 31)
     {
+      /* x < 0. */
       ex = junk1.i[LOW_HALF];
       if (res < 1.0)
         {
@@ -328,34 +297,25 @@ __exp1 (double x, double xx, double erro
       if (ex >= -1022)
         {
           binexp.i[HIGH_HALF] = (1023 + ex) << 20;
-          if (res == (res + cor * (1.0 + error + err_1)))
-            return res * binexp.x;
-          else
-            return -10.0;
+          /* Maximum ULP error is 0.500008. */
+          return res * binexp.x;
         }
+      /* Denormal case - ex < -1022. */
       ex = -(1022 + ex);
       binexp.i[HIGH_HALF] = (1023 - ex) << 20;
       res *= binexp.x;
       cor *= binexp.x;
-      eps = 1.00000000001 + (error + err_1) * binexp.x;
       t = 1.0 + res;
       y = ((1.0 - t) + res) + cor;
       res = t + y;
-      cor = (t - res) + y;
-      if (res == (res + eps * cor))
-        {
-          binexp.i[HIGH_HALF] = 0x00100000;
-          return (res - 1.0) * binexp.x;
-        }
-      else
-        return -10.0;
+      binexp.i[HIGH_HALF] = 0x00100000;
+      /* Maximum ULP error is 0.500004. */
+      return (res - 1.0) * binexp.x;
     }
   else
     {
       binexp.i[HIGH_HALF] = (junk1.i[LOW_HALF] + 767) << 20;
-      if (res == (res + cor * (1.0 + error + err_1)))
-        return res * binexp.x * t256.x;
-      else
-        return -10.0;
+      /* Maximum ULP error is 0.500008. */
+      return res * binexp.x * t256.x;
     }
 }
Index: glibc-2.26/sysdeps/ieee754/dbl-64/e_log.c
===================================================================
--- glibc-2.26.orig/sysdeps/ieee754/dbl-64/e_log.c
+++ glibc-2.26/sysdeps/ieee754/dbl-64/e_log.c
@@ -23,11 +23,10 @@
 /* FUNCTION:ulog */
 /* */
 /* FILES NEEDED: dla.h endian.h mpa.h mydefs.h ulog.h */
-/* mpexp.c mplog.c mpa.c */
 /* ulog.tbl */
 /* */
 /* An ultimate log routine. Given an IEEE double machine number x */
-/* it computes the correctly rounded (to nearest) value of log(x). */
+/* it computes the rounded (to nearest) value of log(x). */
 /* Assumption: Machine arithmetic operations are performed in */
 /* round to nearest mode of IEEE 754 standard. */
 /* */
@@ -40,34 +39,26 @@
 #include "MathLib.h"
 #include <math.h>
 #include <math_private.h>
-#include <stap-probe.h>
 
 #ifndef SECTION
 # define SECTION
 #endif
 
-void __mplog (mp_no *, mp_no *, int);
-
 /*********************************************************************/
-/* An ultimate log routine. Given an IEEE double machine number x */
-/* it computes the correctly rounded (to nearest) value of log(x). */
+/* An ultimate log routine. Given an IEEE double machine number x */
+/* it computes the rounded (to nearest) value of log(x). */
 /*********************************************************************/
 double
 SECTION
 __ieee754_log (double x)
 {
-#define M 4
-  static const int pr[M] = { 8, 10, 18, 32 };
-  int i, j, n, ux, dx, p;
+  int i, j, n, ux, dx;
   double dbl_n, u, p0, q, r0, w, nln2a, luai, lubi, lvaj, lvbj,
-         sij, ssij, ttij, A, B, B0, y, y1, y2, polI, polII, sa, sb,
-         t1, t2, t7, t8, t, ra, rb, ww,
-         a0, aa0, s1, s2, ss2, s3, ss3, a1, aa1, a, aa, b, bb, c;
+         sij, ssij, ttij, A, B, B0, polI, polII, t8, a, aa, b, bb, c;
 #ifndef DLA_FMS
-  double t3, t4, t5, t6;
+  double t1, t2, t3, t4, t5;
 #endif
   number num;
-  mp_no mpx, mpy, mpy1, mpy2, mperr;
 
 #include "ulog.tbl"
 #include "ulog.h"
@@ -101,7 +92,7 @@ __ieee754_log (double x)
   if (w == 0.0)
     return 0.0;
 
-  /*--- Stage I, the case abs(x-1) < 0.03 */
+  /*--- The case abs(x-1) < 0.03 */
 
   t8 = MHALF * w;
   EMULV (t8, w, a, aa, t1, t2, t3, t4, t5);
@@ -118,50 +109,12 @@ __ieee754_log (double x)
   polII *= w * w * w;
   c = (aa + bb) + polII;
 
-  /* End stage I, case abs(x-1) < 0.03 */
-  if ((y = b + (c + b * E2)) == b + (c - b * E2))
-    return y;
-
-  /*--- Stage II, the case abs(x-1) < 0.03 */
-
-  a = d19.d + w * d20.d;
-  a = d18.d + w * a;
-  a = d17.d + w * a;
-  a = d16.d + w * a;
-  a = d15.d + w * a;
-  a = d14.d + w * a;
-  a = d13.d + w * a;
-  a = d12.d + w * a;
-  a = d11.d + w * a;
-
-  EMULV (w, a, s2, ss2, t1, t2, t3, t4, t5);
-  ADD2 (d10.d, dd10.d, s2, ss2, s3, ss3, t1, t2);
-  MUL2 (w, 0, s3, ss3, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (d9.d, dd9.d, s2, ss2, s3, ss3, t1, t2);
-  MUL2 (w, 0, s3, ss3, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (d8.d, dd8.d, s2, ss2, s3, ss3, t1, t2);
-  MUL2 (w, 0, s3, ss3, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (d7.d, dd7.d, s2, ss2, s3, ss3, t1, t2);
-  MUL2 (w, 0, s3, ss3, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (d6.d, dd6.d, s2, ss2, s3, ss3, t1, t2);
-  MUL2 (w, 0, s3, ss3, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (d5.d, dd5.d, s2, ss2, s3, ss3, t1, t2);
-  MUL2 (w, 0, s3, ss3, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (d4.d, dd4.d, s2, ss2, s3, ss3, t1, t2);
-  MUL2 (w, 0, s3, ss3, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (d3.d, dd3.d, s2, ss2, s3, ss3, t1, t2);
-  MUL2 (w, 0, s3, ss3, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (d2.d, dd2.d, s2, ss2, s3, ss3, t1, t2);
-  MUL2 (w, 0, s3, ss3, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  MUL2 (w, 0, s2, ss2, s3, ss3, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (w, 0, s3, ss3, b, bb, t1, t2);
-
-  /* End stage II, case abs(x-1) < 0.03 */
-  if ((y = b + (bb + b * E4)) == b + (bb - b * E4))
-    return y;
-  goto stage_n;
+  /* Here b contains the high part of the result, and c the low part.
+     Maximum error is b * 2.334e-19, so accuracy is >61 bits.
+     Therefore max ULP error of b + c is ~0.502. */
+  return b + c;
 
-  /*--- Stage I, the case abs(x-1) > 0.03 */
+  /*--- The case abs(x-1) > 0.03 */
 case_03:
 
   /* Find n,u such that x = u*2**n, 1/sqrt(2) < u < sqrt(2) */
@@ -203,58 +156,10 @@ case_03:
   B0 = (((lubi + lvbj) + ssij) + ttij) + dbl_n * LN2B;
   B = polI + B0;
 
-  /* End stage I, case abs(x-1) >= 0.03 */
-  if ((y = A + (B + E1)) == A + (B - E1))
-    return y;
-
-
-  /*--- Stage II, the case abs(x-1) > 0.03 */
-
-  /* Improve the accuracy of r0 */
-  EMULV (p0, r0, sa, sb, t1, t2, t3, t4, t5);
-  t = r0 * ((1 - sa) - sb);
-  EADD (r0, t, ra, rb);
-
-  /* Compute w */
-  MUL2 (q, 0, ra, rb, w, ww, t1, t2, t3, t4, t5, t6, t7, t8);
-
-  EADD (A, B0, a0, aa0);
-
-  /* Evaluate polynomial III */
-  s1 = (c3.d + (c4.d + c5.d * w) * w) * w;
-  EADD (c2.d, s1, s2, ss2);
-  MUL2 (s2, ss2, w, ww, s3, ss3, t1, t2, t3, t4, t5, t6, t7, t8);
-  MUL2 (s3, ss3, w, ww, s2, ss2, t1, t2, t3, t4, t5, t6, t7, t8);
-  ADD2 (s2, ss2, w, ww, s3, ss3, t1, t2);
-  ADD2 (s3, ss3, a0, aa0, a1, aa1, t1, t2);
-
-  /* End stage II, case abs(x-1) >= 0.03 */
-  if ((y = a1 + (aa1 + E3)) == a1 + (aa1 - E3))
-    return y;
-
-
-  /* Final stages. Use multi-precision arithmetic. */
-stage_n:
-
-  for (i = 0; i < M; i++)
-    {
-      p = pr[i];
-      __dbl_mp (x, &mpx, p);
-      __dbl_mp (y, &mpy, p);
-      __mplog (&mpx, &mpy, p);
-      __dbl_mp (e[i].d, &mperr, p);
-      __add (&mpy, &mperr, &mpy1, p);
-      __sub (&mpy, &mperr, &mpy2, p);
-      __mp_dbl (&mpy1, &y1, p);
-      __mp_dbl (&mpy2, &y2, p);
-      if (y1 == y2)
-        {
-          LIBC_PROBE (slowlog, 3, &p, &x, &y1);
-          return y1;
-        }
-    }
-  LIBC_PROBE (slowlog_inexact, 3, &p, &x, &y1);
-  return y1;
+  /* Here A contains the high part of the result, and B the low part.
+     Maximum abs error is 6.095e-21 and min log (x) is 0.0295 since x > 1.03.
+     Therefore max ULP error of A + B is ~0.502. */
+  return A + B;
 }
 
 #ifndef __ieee754_log
Index: glibc-2.26/sysdeps/ieee754/dbl-64/e_pow.c
===================================================================
--- glibc-2.26.orig/sysdeps/ieee754/dbl-64/e_pow.c
+++ glibc-2.26/sysdeps/ieee754/dbl-64/e_pow.c
@@ -20,13 +20,9 @@
 /* MODULE_NAME: upow.c */
 /* */
 /* FUNCTIONS: upow */
-/* power1 */
-/* my_log2 */
 /* log1 */
 /* checkint */
 /* FILES NEEDED: dla.h endian.h mpa.h mydefs.h */
-/* halfulp.c mpexp.c mplog.c slowexp.c slowpow.c mpa.c */
-/* uexp.c upow.c */
 /* root.tbl uexp.tbl upow.tbl */
 /* An ultimate power routine. Given two IEEE double machine numbers y,x */
 /* it computes the correctly rounded (to nearest) value of x^y. */
@@ -50,11 +46,8 @@
 
 static const double huge = 1.0e300, tiny = 1.0e-300;
 
-double __exp1 (double x, double xx, double error);
-static double log1 (double x, double *delta, double *error);
-static double my_log2 (double x, double *delta, double *error);
-double __slowpow (double x, double y, double z);
-static double power1 (double x, double y);
+double __exp1 (double x, double xx);
+static double log1 (double x, double *delta);
 static int checkint (double x);
 
 /* An ultimate power routine. Given two IEEE double machine numbers y, x it
@@ -63,7 +56,7 @@ double
 SECTION
 __ieee754_pow (double x, double y)
 {
-  double z, a, aa, error, t, a1, a2, y1, y2;
+  double z, a, aa, t, a1, a2, y1, y2;
   mynumber u, v;
   int k;
   int4 qx, qy;
@@ -100,7 +93,7 @@ __ieee754_pow (double x, double y)
            not matter if |y| <= 2**-64. */
         if (fabs (y) < 0x1p-64)
           y = y < 0 ? -0x1p-64 : 0x1p-64;
-        z = log1 (x, &aa, &error); /* x^y =e^(y log (X)) */
+        z = log1 (x, &aa); /* x^y =e^(y log (X)) */
         t = y * CN;
         y1 = t - (t - y);
         y2 = y - y1;
@@ -111,9 +104,16 @@ __ieee754_pow (double x, double y)
         aa = y2 * a1 + y * a2;
         a1 = a + aa;
         a2 = (a - a1) + aa;
-        error = error * fabs (y);
-        t = __exp1 (a1, a2, 1.9e16 * error); /* return -10 or 0 if wasn't computed exactly */
-        retval = (t > 0) ? t : power1 (x, y);
+
+        /* Maximum relative error RElog of log1 is 1.0e-21 (69.7 bits).
+           Maximum relative error REexp of __exp1 is 8.8e-22 (69.9 bits).
+           We actually compute exp ((1 + RElog) * log (x) * y) * (1 + REexp).
+           Since RElog/REexp are tiny and log (x) * y is at most log (DBL_MAX),
+           this is equivalent to pow (x, y) * (1 + 710 * RElog + REexp).
+           So the relative error is 710 * 1.0e-21 + 8.8e-22 = 7.1e-19
+           (60.2 bits). The worst-case ULP error is 0.5064. */
+
+        retval = __exp1 (a1, a2);
       }
 
       if (isinf (retval))
@@ -218,33 +218,11 @@ __ieee754_pow (double x, double y)
 strong_alias (__ieee754_pow, __pow_finite)
 #endif
 
-/* Compute x^y using more accurate but more slow log routine. */
-static double
-SECTION
-power1 (double x, double y)
-{
-  double z, a, aa, error, t, a1, a2, y1, y2;
-  z = my_log2 (x, &aa, &error);
-  t = y * CN;
-  y1 = t - (t - y);
-  y2 = y - y1;
-  t = z * CN;
-  a1 = t - (t - z);
-  a2 = z - a1;
-  a = y * z;
-  aa = ((y1 * a1 - a) + y1 * a2 + y2 * a1) + y2 * a2 + aa * y;
-  a1 = a + aa;
-  a2 = (a - a1) + aa;
-  error = error * fabs (y);
-  t = __exp1 (a1, a2, 1.9e16 * error);
-  return (t >= 0) ? t : __slowpow (x, y, z);
-}
-
 /* Compute log(x) (x is left argument). The result is the returned double + the
-   parameter DELTA. The result is bounded by ERROR. */
+   parameter DELTA. */
 static double
 SECTION
-log1 (double x, double *delta, double *error)
+log1 (double x, double *delta)
 {
   unsigned int i, j;
   int m;
@@ -260,9 +238,7 @@ log1 (double x, double *delta, double *e
 
   u.x = x;
   m = u.i[HIGH_HALF];
-  *error = 0;
-  *delta = 0;
-  if (m < 0x00100000) /* 1<x<2^-1007 */
+  if (m < 0x00100000) /* Handle denormal x. */
     {
       x = x * t52.x;
       add = -52.0;
@@ -284,7 +260,7 @@ log1 (double x, double *delta, double *e
   v.x = u.x + bigu.x;
   uu = v.x - bigu.x;
   i = (v.i[LOW_HALF] & 0x000003ff) << 2;
-  if (two52.i[LOW_HALF] == 1023) /* nx = 0 */
+  if (two52.i[LOW_HALF] == 1023) /* Exponent of x is 0. */
     {
       if (i > 1192 && i < 1208) /* |x-1| < 1.5*2**-10 */
         {
@@ -296,8 +272,8 @@ log1 (double x, double *delta, double *e
                 * (r7 + t * r8)))))
                 - 0.5 * t2 * (t + t1));
           res = e1 + e2;
-          *error = 1.0e-21 * fabs (t);
           *delta = (e1 - res) + e2;
+          /* Max relative error is 1.464844e-24, so accurate to 79.1 bits. */
           return res;
         } /* |x-1| < 1.5*2**-10 */
       else
@@ -316,12 +292,12 @@ log1 (double x, double *delta, double *e
           t2 = ((((t - t1) + e) + (ui.x[i + 3] + vj.x[j + 2])) + e2
                 + e * e * (p2 + e * (p3 + e * p4)));
           res = t1 + t2;
-          *error = 1.0e-24;
           *delta = (t1 - res) + t2;
+          /* Max relative error is 1.0e-24, so accurate to 79.7 bits. */
           return res;
         }
-    } /* nx = 0 */
-  else /* nx != 0 */
+    }
+  else /* Exponent of x != 0. */
     {
       eps = u.x - uu;
       nx = (two52.x - two52e.x) + add;
@@ -334,113 +310,13 @@ log1 (double x, double *delta, double *e
       t2 = ((((t - t1) + e) + nx * ln2b.x + ui.x[i + 3] + e2) + e * e
             * (q2 + e * (q3 + e * (q4 + e * (q5 + e * q6)))));
       res = t1 + t2;
-      *error = 1.0e-21;
-      *delta = (t1 - res) + t2;
-      return res;
-    } /* nx != 0 */
-}
-
-/* Slower but more accurate routine of log. The returned result is double +
-   DELTA. The result is bounded by ERROR. */
-static double
-SECTION
-my_log2 (double x, double *delta, double *error)
-{
-  unsigned int i, j;
-  int m;
-  double uu, vv, eps, nx, e, e1, e2, t, t1, t2, res, add = 0;
-  double ou1, ou2, lu1, lu2, ov, lv1, lv2, a, a1, a2;
-  double y, yy, z, zz, j1, j2, j7, j8;
-#ifndef DLA_FMS
-  double j3, j4, j5, j6;
-#endif
-  mynumber u, v;
-#ifdef BIG_ENDI
-  mynumber /**/ two52 = {{0x43300000, 0x00000000}}; /* 2**52 */
-#else
-# ifdef LITTLE_ENDI
-  mynumber /**/ two52 = {{0x00000000, 0x43300000}}; /* 2**52 */
-# endif
-#endif
-
-  u.x = x;
-  m = u.i[HIGH_HALF];
-  *error = 0;
-  *delta = 0;
-  add = 0;
-  if (m < 0x00100000)
-    { /* x < 2^-1022 */
-      x = x * t52.x;
-      add = -52.0;
-      u.x = x;
-      m = u.i[HIGH_HALF];
-    }
-
-  if ((m & 0x000fffff) < 0x0006a09e)
-    {
-      u.i[HIGH_HALF] = (m & 0x000fffff) | 0x3ff00000;
-      two52.i[LOW_HALF] = (m >> 20);
-    }
-  else
-    {
-      u.i[HIGH_HALF] = (m & 0x000fffff) | 0x3fe00000;
-      two52.i[LOW_HALF] = (m >> 20) + 1;
-    }
-
-  v.x = u.x + bigu.x;
-  uu = v.x - bigu.x;
-  i = (v.i[LOW_HALF] & 0x000003ff) << 2;
-  /*------------------------------------- |x-1| < 2**-11------------------------------- */
-  if ((two52.i[LOW_HALF] == 1023) && (i == 1200))
-    {
-      t = x - 1.0;
-      EMULV (t, s3, y, yy, j1, j2, j3, j4, j5);
-      ADD2 (-0.5, 0, y, yy, z, zz, j1, j2);
-      MUL2 (t, 0, z, zz, y, yy, j1, j2, j3, j4, j5, j6, j7, j8);
-      MUL2 (t, 0, y, yy, z, zz, j1, j2, j3, j4, j5, j6, j7, j8);
-
-      e1 = t + z;
-      e2 = ((((t - e1) + z) + zz) + t * t * t
-            * (ss3 + t * (s4 + t * (s5 + t * (s6 + t * (s7 + t * s8))))));
-      res = e1 + e2;
-      *error = 1.0e-25 * fabs (t);
-      *delta = (e1 - res) + e2;
-      return res;
-    }
-  /*----------------------------- |x-1| > 2**-11 -------------------------- */
-  else
-    { /*Computing log(x) according to log table */
-      nx = (two52.x - two52e.x) + add;
-      ou1 = ui.x[i];
-      ou2 = ui.x[i + 1];
-      lu1 = ui.x[i + 2];
-      lu2 = ui.x[i + 3];
-      v.x = u.x * (ou1 + ou2) + bigv.x;
-      vv = v.x - bigv.x;
-      j = v.i[LOW_HALF] & 0x0007ffff;
-      j = j + j + j;
-      eps = u.x - uu * vv;
-      ov = vj.x[j];
-      lv1 = vj.x[j + 1];
-      lv2 = vj.x[j + 2];
-      a = (ou1 + ou2) * (1.0 + ov);
-      a1 = (a + 1.0e10) - 1.0e10;
-      a2 = a * (1.0 - a1 * uu * vv);
-      e1 = eps * a1;
-      e2 = eps * a2;
-      e = e1 + e2;
-      e2 = (e1 - e) + e2;
-      t = nx * ln2a.x + lu1 + lv1;
-      t1 = t + e;
-      t2 = ((((t - t1) + e) + (lu2 + lv2 + nx * ln2b.x + e2)) + e * e
-            * (p2 + e * (p3 + e * p4)));
-      res = t1 + t2;
-      *error = 1.0e-27;
       *delta = (t1 - res) + t2;
+      /* Max relative error is 1.0e-21, so accurate to 69.7 bits. */
       return res;
     }
 }
 
+
 /* This function receives a double x and checks if it is an integer. If not,
    it returns 0, else it returns 1 if even or -1 if odd. */
 static int
Index: glibc-2.26/sysdeps/ieee754/dbl-64/halfulp.c
===================================================================
--- glibc-2.26.orig/sysdeps/ieee754/dbl-64/halfulp.c
+++ /dev/null
@@ -1,152 +0,0 @@
-/*
- * IBM Accurate Mathematical Library
- * written by International Business Machines Corp.
- * Copyright (C) 2001-2017 Free Software Foundation, Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as published by
- * the Free Software Foundation; either version 2.1 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with this program; if not, see <http://www.gnu.org/licenses/>.
- */ -/************************************************************************/ -/* */ -/* MODULE_NAME:halfulp.c */ -/* */ -/* FUNCTIONS:halfulp */ -/* FILES NEEDED: mydefs.h dla.h endian.h */ -/* uroot.c */ -/* */ -/*Routine halfulp(double x, double y) computes x^y where result does */ -/*not need rounding. If the result is closer to 0 than can be */ -/*represented it returns 0. */ -/* In the following cases the function does not compute anything */ -/*and returns a negative number: */ -/*1. if the result needs rounding, */ -/*2. if y is outside the interval [0, 2^20-1], */ -/*3. if x can be represented by x=2**n for some integer n. */ -/************************************************************************/ - -#include "endian.h" -#include "mydefs.h" -#include <dla.h> -#include <math_private.h> - -#ifndef SECTION -# define SECTION -#endif - -static const int4 tab54[32] = { - 262143, 11585, 1782, 511, 210, 107, 63, 42, - 30, 22, 17, 14, 12, 10, 9, 7, - 7, 6, 5, 5, 5, 4, 4, 4, - 3, 3, 3, 3, 3, 3, 3, 3 -}; - - -double -SECTION -__halfulp (double x, double y) -{ - mynumber v; - double z, u, uu; -#ifndef DLA_FMS - double j1, j2, j3, j4, j5; -#endif - int4 k, l, m, n; - if (y <= 0) /*if power is negative or zero */ - { - v.x = y; - if (v.i[LOW_HALF] != 0) - return -10.0; - v.x = x; - if (v.i[LOW_HALF] != 0) - return -10.0; - if ((v.i[HIGH_HALF] & 0x000fffff) != 0) - return -10; /* if x =2 ^ n */ - k = ((v.i[HIGH_HALF] & 0x7fffffff) >> 20) - 1023; /* find this n */ - z = (double) k; - return (z * y == -1075.0) ? 0 : -10.0; - } - /* if y > 0 */ - v.x = y; - if (v.i[LOW_HALF] != 0) - return -10.0; - - v.x = x; - /* case where x = 2**n for some integer n */ - if (((v.i[HIGH_HALF] & 0x000fffff) | v.i[LOW_HALF]) == 0) - { - k = (v.i[HIGH_HALF] >> 20) - 1023; - return (((double) k) * y == -1075.0) ? 
0 : -10.0; - } - - v.x = y; - k = v.i[HIGH_HALF]; - m = k << 12; - l = 0; - while (m) - { - m = m << 1; l++; - } - n = (k & 0x000fffff) | 0x00100000; - n = n >> (20 - l); /* n is the odd integer of y */ - k = ((k >> 20) - 1023) - l; /* y = n*2**k */ - if (k > 5) - return -10.0; - if (k > 0) - for (; k > 0; k--) - n *= 2; - if (n > 34) - return -10.0; - k = -k; - if (k > 5) - return -10.0; - - /* now treat x */ - while (k > 0) - { - z = __ieee754_sqrt (x); - EMULV (z, z, u, uu, j1, j2, j3, j4, j5); - if (((u - x) + uu) != 0) - break; - x = z; - k--; - } - if (k) - return -10.0; - - /* it is impossible that n == 2, so the mantissa of x must be short */ - - v.x = x; - if (v.i[LOW_HALF]) - return -10.0; - k = v.i[HIGH_HALF]; - m = k << 12; - l = 0; - while (m) - { - m = m << 1; l++; - } - m = (k & 0x000fffff) | 0x00100000; - m = m >> (20 - l); /* m is the odd integer of x */ - - /* now check whether the length of m**n is at most 54 bits */ - - if (m > tab54[n - 3]) - return -10.0; - - /* yes, it is - now compute x**n by simple multiplications */ - - u = x; - for (k = 1; k < n; k++) - u = u * x; - return u; -} Index: glibc-2.26/sysdeps/ieee754/dbl-64/s_sin.c =================================================================== --- glibc-2.26.orig/sysdeps/ieee754/dbl-64/s_sin.c +++ glibc-2.26/sysdeps/ieee754/dbl-64/s_sin.c @@ -22,22 +22,11 @@ /* */ /* FUNCTIONS: usin */ /* ucos */ -/* slow */ -/* slow1 */ -/* slow2 */ -/* sloww */ -/* sloww1 */ -/* sloww2 */ -/* bsloww */ -/* bsloww1 */ -/* bsloww2 */ -/* cslow2 */ /* FILES NEEDED: dla.h endian.h mpa.h mydefs.h usncs.h */ -/* branred.c sincos32.c dosincos.c mpa.c */ -/* sincos.tbl */ +/* branred.c sincos.tbl */ /* */ -/* An ultimate sin and routine. Given an IEEE double machine number x */ -/* it computes the correctly rounded (to nearest) value of sin(x) or cos(x) */ +/* An ultimate sin and cos routine. Given an IEEE double machine number x */ +/* it computes sin(x) or cos(x) with ~0.55 ULP. 
*/ /* Assumption: Machine arithmetic operations are performed in */ /* round to nearest mode of IEEE 754 standard. */ /* */ @@ -65,35 +54,11 @@ a - a^3/3! + a^5/5! - a^7/7! + a^9/9! + (1 - a^2) * da / 2 The constants s1, s2, s3, etc. are pre-computed values of 1/3!, 1/5! and so - on. The result is returned to LHS and correction in COR. */ -#define TAYLOR_SIN(xx, a, da, cor) \ + on. The result is returned to LHS. */ +#define TAYLOR_SIN(xx, a, da) \ ({ \ double t = ((POLYNOMIAL (xx) * (a) - 0.5 * (da)) * (xx) + (da)); \ double res = (a) + t; \ - (cor) = ((a) - res) + t; \ - res; \ -}) - -/* This is again a variation of the Taylor series expansion with the term - x^3/3! expanded into the following for better accuracy: - - bb * x ^ 3 + 3 * aa * x * x1 * x2 + aa * x1 ^ 3 + aa * x2 ^ 3 - - The correction term is dx and bb + aa = -1/3! - */ -#define TAYLOR_SLOW(x0, dx, cor) \ -({ \ - static const double th2_36 = 206158430208.0; /* 1.5*2**37 */ \ - double xx = (x0) * (x0); \ - double x1 = ((x0) + th2_36) - th2_36; \ - double y = aa * x1 * x1 * x1; \ - double r = (x0) + y; \ - double x2 = ((x0) - x1) + (dx); \ - double t = (((POLYNOMIAL2 (xx) + bb) * xx + 3.0 * aa * x1 * x2) \ - * (x0) + aa * x2 * x2 * x2 + (dx)); \ - t = (((x0) - r) + y) + t; \ - double res = r + t; \ - (cor) = (r - res) + t; \ res; \ }) @@ -123,31 +88,15 @@ static const double cs4 = -4.16666666666664434524222570944589E-02, cs6 = 1.38888874007937613028114285595617E-03; -static const double t22 = 0x1.8p22; - -void __dubsin (double x, double dx, double w[]); -void __docos (double x, double dx, double w[]); -double __mpsin (double x, double dx, bool reduce_range); -double __mpcos (double x, double dx, bool reduce_range); -static double slow (double x); -static double slow1 (double x); -static double slow2 (double x); -static double sloww (double x, double dx, double orig, bool shift_quadrant); -static double sloww1 (double x, double dx, double orig, bool shift_quadrant); -static double sloww2 (double x, 
double dx, double orig, int n); -static double bsloww (double x, double dx, double orig, int n); -static double bsloww1 (double x, double dx, double orig, int n); -static double bsloww2 (double x, double dx, double orig, int n); int __branred (double x, double *a, double *aa); -static double cslow2 (double x); /* Given a number partitioned into X and DX, this function computes the cosine of the number by combining the sin and cos of X (as computed by a variation of the Taylor series) with the values looked up from the sin/cos table to - get the result in RES and a correction value in COR. */ + get the result. */ static inline double __always_inline -do_cos (double x, double dx, double *corp) +do_cos (double x, double dx) { mynumber u; @@ -157,60 +106,28 @@ do_cos (double x, double dx, double *cor u.x = big + fabs (x); x = fabs (x) - (u.x - big) + dx; - double xx, s, sn, ssn, c, cs, ccs, res, cor; + double xx, s, sn, ssn, c, cs, ccs, cor; xx = x * x; s = x + x * xx * (sn3 + xx * sn5); c = xx * (cs2 + xx * (cs4 + xx * cs6)); SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs); cor = (ccs - s * ssn - cs * c) - sn * s; - res = cs + cor; - cor = (cs - res) + cor; - *corp = cor; - return res; -} - -/* A more precise variant of DO_COS. EPS is the adjustment to the correction - COR. 
*/ -static inline double -__always_inline -do_cos_slow (double x, double dx, double eps, double *corp) -{ - mynumber u; - - if (x <= 0) - dx = -dx; - - u.x = big + fabs (x); - x = fabs (x) - (u.x - big); - - double xx, y, x1, x2, e1, e2, res, cor; - double s, sn, ssn, c, cs, ccs; - xx = x * x; - s = x * xx * (sn3 + xx * sn5); - c = x * dx + xx * (cs2 + xx * (cs4 + xx * cs6)); - SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs); - x1 = (x + t22) - t22; - x2 = (x - x1) + dx; - e1 = (sn + t22) - t22; - e2 = (sn - e1) + ssn; - cor = (ccs - cs * c - e1 * x2 - e2 * x) - sn * s; - y = cs - e1 * x1; - cor = cor + ((cs - y) - e1 * x1); - res = y + cor; - cor = (y - res) + cor; - cor = 1.0005 * cor + __copysign (eps, cor); - *corp = cor; - return res; + return cs + cor; } /* Given a number partitioned into X and DX, this function computes the sine of the number by combining the sin and cos of X (as computed by a variation of the Taylor series) with the values looked up from the sin/cos table to get - the result in RES and a correction value in COR. */ + the result. */ static inline double __always_inline -do_sin (double x, double dx, double *corp) +do_sin (double x, double dx) { + double xold = x; + /* Max ULP is 0.501 if |x| < 0.126, otherwise ULP is 0.518. */ + if (fabs (x) < 0.126) + return TAYLOR_SIN (x * x, x, dx); + mynumber u; if (x <= 0) @@ -218,85 +135,22 @@ do_sin (double x, double dx, double *cor u.x = big + fabs (x); x = fabs (x) - (u.x - big); - double xx, s, sn, ssn, c, cs, ccs, cor, res; + double xx, s, sn, ssn, c, cs, ccs, cor; xx = x * x; s = x + (dx + x * xx * (sn3 + xx * sn5)); c = x * dx + xx * (cs2 + xx * (cs4 + xx * cs6)); SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs); cor = (ssn + s * ccs - sn * c) + cs * s; - res = sn + cor; - cor = (sn - res) + cor; - *corp = cor; - return res; -} - -/* A more precise variant of DO_SIN. EPS is the adjustment to the correction - COR. 
*/ -static inline double -__always_inline -do_sin_slow (double x, double dx, double eps, double *corp) -{ - mynumber u; - - if (x <= 0) - dx = -dx; - u.x = big + fabs (x); - x = fabs (x) - (u.x - big); - - double xx, y, x1, x2, c1, c2, res, cor; - double s, sn, ssn, c, cs, ccs; - xx = x * x; - s = x * xx * (sn3 + xx * sn5); - c = xx * (cs2 + xx * (cs4 + xx * cs6)); - SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs); - x1 = (x + t22) - t22; - x2 = (x - x1) + dx; - c1 = (cs + t22) - t22; - c2 = (cs - c1) + ccs; - cor = (ssn + s * ccs + cs * s + c2 * x + c1 * x2 - sn * x * dx) - sn * c; - y = sn + c1 * x1; - cor = cor + ((sn - y) + c1 * x1); - res = y + cor; - cor = (y - res) + cor; - cor = 1.0005 * cor + __copysign (eps, cor); - *corp = cor; - return res; -} - -/* Reduce range of X and compute sin of a + da. When SHIFT_QUADRANT is true, - the routine returns the cosine of a + da by rotating the quadrant once and - computing the sine of the result. */ -static inline double -__always_inline -reduce_and_compute (double x, bool shift_quadrant) -{ - double retval = 0, a, da; - unsigned int n = __branred (x, &a, &da); - int4 k = (n + shift_quadrant) % 4; - switch (k) - { - case 2: - a = -a; - da = -da; - /* Fall through. */ - case 0: - if (a * a < 0.01588) - retval = bsloww (a, da, x, n); - else - retval = bsloww1 (a, da, x, n); - break; - - case 1: - case 3: - retval = bsloww2 (a, da, x, n); - break; - } - return retval; + return __copysign (sn + cor, xold); } +/* Reduce range of x to within PI/2 with abs (x) < 105414350. The high part + is written to *a, the low part to *da. Range reduction is accurate to 136 + bits so that when x is large and *a very close to zero, all 53 bits of *a + are correct. 
*/ static inline int4 __always_inline -reduce_sincos_1 (double x, double *a, double *da) +reduce_sincos (double x, double *a, double *da) { mynumber v; @@ -305,156 +159,54 @@ reduce_sincos_1 (double x, double *a, do v.x = t; double y = (x - xn * mp1) - xn * mp2; int4 n = v.i[LOW_HALF] & 3; - double db = xn * mp3; - double b = y - db; - db = (y - b) - db; - *a = b; - *da = db; - - return n; -} - -/* Compute sin (A + DA). cos can be computed by passing SHIFT_QUADRANT as - true, which results in shifting the quadrant N clockwise. */ -static double -__always_inline -do_sincos_1 (double a, double da, double x, int4 n, bool shift_quadrant) -{ - double xx, retval, res, cor; - double eps = fabs (x) * 1.2e-30; - - int k1 = (n + shift_quadrant) & 3; - switch (k1) - { /* quarter of unit circle */ - case 2: - a = -a; - da = -da; - /* Fall through. */ - case 0: - xx = a * a; - if (xx < 0.01588) - { - /* Taylor series. */ - res = TAYLOR_SIN (xx, a, da, cor); - cor = 1.02 * cor + __copysign (eps, cor); - retval = (res == res + cor) ? res : sloww (a, da, x, shift_quadrant); - } - else - { - res = do_sin (a, da, &cor); - cor = 1.035 * cor + __copysign (eps, cor); - retval = ((res == res + cor) ? __copysign (res, a) - : sloww1 (a, da, x, shift_quadrant)); - } - break; - - case 1: - case 3: - res = do_cos (a, da, &cor); - cor = 1.025 * cor + __copysign (eps, cor); - retval = ((res == res + cor) ? ((n & 2) ? 
-res : res) - : sloww2 (a, da, x, n)); - break; - } - - return retval; -} - -static inline int4 -__always_inline -reduce_sincos_2 (double x, double *a, double *da) -{ - mynumber v; - - double t = (x * hpinv + toint); - double xn = t - toint; - v.x = t; - double xn1 = (xn + 8.0e22) - 8.0e22; - double xn2 = xn - xn1; - double y = ((((x - xn1 * mp1) - xn1 * mp2) - xn2 * mp1) - xn2 * mp2); - int4 n = v.i[LOW_HALF] & 3; - double db = xn1 * pp3; - t = y - db; - db = (y - t) - db; - db = (db - xn2 * pp3) - xn * pp4; - double b = t + db; - db = (t - b) + db; + double b, db, t1, t2; + t1 = xn * pp3; + t2 = y - t1; + db = (y - t2) - t1; + + t1 = xn * pp4; + b = t2 - t1; + db += (t2 - b) - t1; *a = b; *da = db; - return n; } -/* Compute sin (A + DA). cos can be computed by passing SHIFT_QUADRANT as - true, which results in shifting the quadrant N clockwise. */ +/* Compute sin or cos (A + DA) for the given quadrant N. */ static double __always_inline -do_sincos_2 (double a, double da, double x, int4 n, bool shift_quadrant) +do_sincos (double a, double da, int4 n) { - double res, retval, cor, xx; + double retval; - double eps = 1.0e-24; - - int4 k = (n + shift_quadrant) & 3; - - switch (k) - { - case 2: - a = -a; - da = -da; - /* Fall through. */ - case 0: - xx = a * a; - if (xx < 0.01588) - { - /* Taylor series. */ - res = TAYLOR_SIN (xx, a, da, cor); - cor = 1.02 * cor + __copysign (eps, cor); - retval = (res == res + cor) ? res : bsloww (a, da, x, n); - } - else - { - res = do_sin (a, da, &cor); - cor = 1.035 * cor + __copysign (eps, cor); - retval = ((res == res + cor) ? __copysign (res, a) - : bsloww1 (a, da, x, n)); - } - break; - - case 1: - case 3: - res = do_cos (a, da, &cor); - cor = 1.025 * cor + __copysign (eps, cor); - retval = ((res == res + cor) ? ((n & 2) ? -res : res) - : bsloww2 (a, da, x, n)); - break; - } + if (n & 1) + /* Max ULP is 0.513. */ + retval = do_cos (a, da); + else + /* Max ULP is 0.501 if xx < 0.01588, otherwise ULP is 0.518. 
*/ + retval = do_sin (a, da); - return retval; + return (n & 2) ? -retval : retval; } + /*******************************************************************/ /* An ultimate sin routine. Given an IEEE double machine number x */ /* it computes the correctly rounded (to nearest) value of sin(x) */ /*******************************************************************/ -#ifdef IN_SINCOS -static double -#else +#ifndef IN_SINCOS double SECTION -#endif __sin (double x) { - double xx, res, t, cor; + double t, a, da; mynumber u; - int4 k, m; + int4 k, m, n; double retval = 0; -#ifndef IN_SINCOS SET_RESTORE_ROUND_53BIT (FE_TONEAREST); -#endif u.x = x; m = u.i[HIGH_HALF]; @@ -464,56 +216,34 @@ __sin (double x) math_check_force_underflow (x); retval = x; } - /*---------------------------- 2^-26 < |x|< 0.25 ----------------------*/ - else if (k < 0x3fd00000) - { - xx = x * x; - /* Taylor series. */ - t = POLYNOMIAL (xx) * (xx * x); - res = x + t; - cor = (x - res) + t; - retval = (res == res + 1.07 * cor) ? res : slow (x); - } /* else if (k < 0x3fd00000) */ -/*---------------------------- 0.25<|x|< 0.855469---------------------- */ +/*--------------------------- 2^-26<|x|< 0.855469---------------------- */ else if (k < 0x3feb6000) { - res = do_sin (x, 0, &cor); - retval = (res == res + 1.096 * cor) ? res : slow1 (x); - retval = __copysign (retval, x); + /* Max ULP is 0.548. */ + retval = do_sin (x, 0); } /* else if (k < 0x3feb6000) */ /*----------------------- 0.855469 <|x|<2.426265 ----------------------*/ else if (k < 0x400368fd) { - t = hp0 - fabs (x); - res = do_cos (t, hp1, &cor); - retval = (res == res + 1.020 * cor) ? res : slow2 (x); - retval = __copysign (retval, x); + /* Max ULP is 0.51. 
*/ + retval = __copysign (do_cos (t, hp1), x); } /* else if (k < 0x400368fd) */ -#ifndef IN_SINCOS /*-------------------------- 2.426265<|x|< 105414350 ----------------------*/ else if (k < 0x419921FB) { - double a, da; - int4 n = reduce_sincos_1 (x, &a, &da); - retval = do_sincos_1 (a, da, x, n, false); + n = reduce_sincos (x, &a, &da); + retval = do_sincos (a, da, n); } /* else if (k < 0x419921FB ) */ -/*---------------------105414350 <|x|< 281474976710656 --------------------*/ - else if (k < 0x42F00000) - { - double a, da; - - int4 n = reduce_sincos_2 (x, &a, &da); - retval = do_sincos_2 (a, da, x, n, false); - } /* else if (k < 0x42F00000 ) */ - -/* -----------------281474976710656 <|x| <2^1024----------------------------*/ +/* --------------------105414350 <|x| <2^1024------------------------------*/ else if (k < 0x7ff00000) - retval = reduce_and_compute (x, false); - + { + n = __branred (x, &a, &da); + retval = do_sincos (a, da, n); + } /*--------------------- |x| > 2^1024 ----------------------------------*/ else { @@ -521,7 +251,6 @@ __sin (double x) __set_errno (EDOM); retval = x / x; } -#endif return retval; } @@ -532,23 +261,17 @@ __sin (double x) /* it computes the correctly rounded (to nearest) value of cos(x) */ /*******************************************************************/ -#ifdef IN_SINCOS -static double -#else double SECTION -#endif __cos (double x) { - double y, xx, res, cor, a, da; + double y, a, da; mynumber u; - int4 k, m; + int4 k, m, n; double retval = 0; -#ifndef IN_SINCOS SET_RESTORE_ROUND_53BIT (FE_TONEAREST); -#endif u.x = x; m = u.i[HIGH_HALF]; @@ -560,8 +283,8 @@ __cos (double x) else if (k < 0x3feb6000) { /* 2^-27 < |x| < 0.855469 */ - res = do_cos (x, 0, &cor); - retval = (res == res + 1.020 * cor) ? res : cslow2 (x); + /* Max ULP is 0.51. 
*/ + retval = do_cos (x, 0); } /* else if (k < 0x3feb6000) */ else if (k < 0x400368fd) @@ -569,43 +292,23 @@ __cos (double x) y = hp0 - fabs (x); a = y + hp1; da = (y - a) + hp1; - xx = a * a; - if (xx < 0.01588) - { - res = TAYLOR_SIN (xx, a, da, cor); - cor = 1.02 * cor + __copysign (1.0e-31, cor); - retval = (res == res + cor) ? res : sloww (a, da, x, true); - } - else - { - res = do_sin (a, da, &cor); - cor = 1.035 * cor + __copysign (1.0e-31, cor); - retval = ((res == res + cor) ? __copysign (res, a) - : sloww1 (a, da, x, true)); - } - + /* Max ULP is 0.501 if xx < 0.01588 or 0.518 otherwise. + Range reduction uses 106 bits here which is sufficient. */ + retval = do_sin (a, da); } /* else if (k < 0x400368fd) */ - -#ifndef IN_SINCOS else if (k < 0x419921FB) { /* 2.426265<|x|< 105414350 */ - double a, da; - int4 n = reduce_sincos_1 (x, &a, &da); - retval = do_sincos_1 (a, da, x, n, true); + n = reduce_sincos (x, &a, &da); + retval = do_sincos (a, da, n + 1); } /* else if (k < 0x419921FB ) */ - else if (k < 0x42F00000) - { - double a, da; - - int4 n = reduce_sincos_2 (x, &a, &da); - retval = do_sincos_2 (a, da, x, n, true); - } /* else if (k < 0x42F00000 ) */ - - /* 281474976710656 <|x| <2^1024 */ + /* 105414350 <|x| <2^1024 */ else if (k < 0x7ff00000) - retval = reduce_and_compute (x, true); + { + n = __branred (x, &a, &da); + retval = do_sincos (a, da, n + 1); + } else { @@ -613,304 +316,10 @@ __cos (double x) __set_errno (EDOM); retval = x / x; /* |x| > 2^1024 */ } -#endif return retval; } -/************************************************************************/ -/* Routine compute sin(x) for 2^-26 < |x|< 0.25 by Taylor with more */ -/* precision and if still doesn't accurate enough by mpsin or dubsin */ -/************************************************************************/ - -static inline double -__always_inline -slow (double x) -{ - double res, cor, w[2]; - res = TAYLOR_SLOW (x, 0, cor); - if (res == res + 1.0007 * cor) - return res; - - __dubsin 
(fabs (x), 0, w); - if (w[0] == w[0] + 1.000000001 * w[1]) - return __copysign (w[0], x); - - return __copysign (__mpsin (fabs (x), 0, false), x); -} - -/*******************************************************************************/ -/* Routine compute sin(x) for 0.25<|x|< 0.855469 by __sincostab.tbl and Taylor */ -/* and if result still doesn't accurate enough by mpsin or dubsin */ -/*******************************************************************************/ - -static inline double -__always_inline -slow1 (double x) -{ - double w[2], cor, res; - - res = do_sin_slow (x, 0, 0, &cor); - if (res == res + cor) - return res; - - __dubsin (fabs (x), 0, w); - if (w[0] == w[0] + 1.000000005 * w[1]) - return w[0]; - - return __mpsin (fabs (x), 0, false); -} - -/**************************************************************************/ -/* Routine compute sin(x) for 0.855469 <|x|<2.426265 by __sincostab.tbl */ -/* and if result still doesn't accurate enough by mpsin or dubsin */ -/**************************************************************************/ -static inline double -__always_inline -slow2 (double x) -{ - double w[2], y, y1, y2, cor, res; - - double t = hp0 - fabs (x); - res = do_cos_slow (t, hp1, 0, &cor); - if (res == res + cor) - return res; - - y = fabs (x) - hp0; - y1 = y - hp1; - y2 = (y - y1) - hp1; - __docos (y1, y2, w); - if (w[0] == w[0] + 1.000000005 * w[1]) - return w[0]; - - return __mpsin (fabs (x), 0, false); -} - -/* Compute sin(x + dx) where X is small enough to use Taylor series around zero - and (x + dx) in the first or third quarter of the unit circle. ORIG is the - original value of X for computing error of the result. If the result is not - accurate enough, the routine calls mpsin or dubsin. SHIFT_QUADRANT rotates - the unit circle by 1 to compute the cosine instead of sine. 
*/ -static inline double -__always_inline -sloww (double x, double dx, double orig, bool shift_quadrant) -{ - double y, t, res, cor, w[2], a, da, xn; - mynumber v; - int4 n; - res = TAYLOR_SLOW (x, dx, cor); - - double eps = fabs (orig) * 3.1e-30; - - cor = 1.0005 * cor + __copysign (eps, cor); - - if (res == res + cor) - return res; - - a = fabs (x); - da = (x > 0) ? dx : -dx; - __dubsin (a, da, w); - eps = fabs (orig) * 1.1e-30; - cor = 1.000000001 * w[1] + __copysign (eps, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], x); - - t = (orig * hpinv + toint); - xn = t - toint; - v.x = t; - y = (orig - xn * mp1) - xn * mp2; - n = (v.i[LOW_HALF] + shift_quadrant) & 3; - da = xn * pp3; - t = y - da; - da = (y - t) - da; - y = xn * pp4; - a = t - y; - da = ((t - a) - y) + da; - - if (n & 2) - { - a = -a; - da = -da; - } - x = fabs (a); - dx = (a > 0) ? da : -da; - __dubsin (x, dx, w); - eps = fabs (orig) * 1.1e-40; - cor = 1.000000001 * w[1] + __copysign (eps, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], a); - - return shift_quadrant ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true); -} - -/* Compute sin(x + dx) where X is in the first or third quarter of the unit - circle. ORIG is the original value of X for computing error of the result. - If the result is not accurate enough, the routine calls mpsin or dubsin. - SHIFT_QUADRANT rotates the unit circle by 1 to compute the cosine instead of - sine. */ -static inline double -__always_inline -sloww1 (double x, double dx, double orig, bool shift_quadrant) -{ - double w[2], cor, res; - - res = do_sin_slow (x, dx, 3.1e-30 * fabs (orig), &cor); - - if (res == res + cor) - return __copysign (res, x); - - dx = (x > 0 ? dx : -dx); - __dubsin (fabs (x), dx, w); - - double eps = 1.1e-30 * fabs (orig); - cor = 1.000000005 * w[1] + __copysign (eps, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], x); - - return shift_quadrant ? 
__mpcos (orig, 0, true) : __mpsin (orig, 0, true); -} - -/***************************************************************************/ -/* Routine compute sin(x+dx) (Double-Length number) where x in second or */ -/* fourth quarter of unit circle.Routine receive also the original value */ -/* and quarter(n= 1or 3)of x for computing error of result.And if result not*/ -/* accurate enough routine calls mpsin1 or dubsin */ -/***************************************************************************/ - -static inline double -__always_inline -sloww2 (double x, double dx, double orig, int n) -{ - double w[2], cor, res; - - res = do_cos_slow (x, dx, 3.1e-30 * fabs (orig), &cor); - - if (res == res + cor) - return (n & 2) ? -res : res; - - dx = x > 0 ? dx : -dx; - __docos (fabs (x), dx, w); - - double eps = 1.1e-30 * fabs (orig); - cor = 1.000000005 * w[1] + __copysign (eps, w[1]); - - if (w[0] == w[0] + cor) - return (n & 2) ? -w[0] : w[0]; - - return (n & 1) ? __mpsin (orig, 0, true) : __mpcos (orig, 0, true); -} - -/***************************************************************************/ -/* Routine compute sin(x+dx) or cos(x+dx) (Double-Length number) where x */ -/* is small enough to use Taylor series around zero and (x+dx) */ -/* in first or third quarter of unit circle.Routine receive also */ -/* (right argument) the original value of x for computing error of */ -/* result.And if result not accurate enough routine calls other routines */ -/***************************************************************************/ - -static inline double -__always_inline -bsloww (double x, double dx, double orig, int n) -{ - double res, cor, w[2], a, da; - - res = TAYLOR_SLOW (x, dx, cor); - cor = 1.0005 * cor + __copysign (1.1e-24, cor); - if (res == res + cor) - return res; - - a = fabs (x); - da = (x > 0) ? dx : -dx; - __dubsin (a, da, w); - cor = 1.000000001 * w[1] + __copysign (1.1e-24, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], x); - - return (n & 1) ? 
__mpcos (orig, 0, true) : __mpsin (orig, 0, true); -} - -/***************************************************************************/ -/* Routine compute sin(x+dx) or cos(x+dx) (Double-Length number) where x */ -/* in first or third quarter of unit circle.Routine receive also */ -/* (right argument) the original value of x for computing error of result.*/ -/* And if result not accurate enough routine calls other routines */ -/***************************************************************************/ - -static inline double -__always_inline -bsloww1 (double x, double dx, double orig, int n) -{ - double w[2], cor, res; - - res = do_sin_slow (x, dx, 1.1e-24, &cor); - if (res == res + cor) - return (x > 0) ? res : -res; - - dx = (x > 0) ? dx : -dx; - __dubsin (fabs (x), dx, w); - - cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], x); - - return (n & 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true); -} - -/***************************************************************************/ -/* Routine compute sin(x+dx) or cos(x+dx) (Double-Length number) where x */ -/* in second or fourth quarter of unit circle.Routine receive also the */ -/* original value and quarter(n= 1or 3)of x for computing error of result. */ -/* And if result not accurate enough routine calls other routines */ -/***************************************************************************/ - -static inline double -__always_inline -bsloww2 (double x, double dx, double orig, int n) -{ - double w[2], cor, res; - - res = do_cos_slow (x, dx, 1.1e-24, &cor); - if (res == res + cor) - return (n & 2) ? -res : res; - - dx = (x > 0) ? dx : -dx; - __docos (fabs (x), dx, w); - - cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]); - - if (w[0] == w[0] + cor) - return (n & 2) ? -w[0] : w[0]; - - return (n & 1) ? 
__mpsin (orig, 0, true) : __mpcos (orig, 0, true); -} - -/************************************************************************/ -/* Routine compute cos(x) for 2^-27 < |x|< 0.25 by Taylor with more */ -/* precision and if still doesn't accurate enough by mpcos or docos */ -/************************************************************************/ - -static inline double -__always_inline -cslow2 (double x) -{ - double w[2], cor, res; - - res = do_cos_slow (x, 0, 0, &cor); - if (res == res + cor) - return res; - - __docos (fabs (x), 0, w); - if (w[0] == w[0] + 1.000000005 * w[1]) - return w[0]; - - return __mpcos (x, 0, false); -} - #ifndef __cos weak_alias (__cos, cos) # ifdef NO_LONG_DOUBLE @@ -925,3 +334,5 @@ strong_alias (__sin, __sinl) weak_alias (__sin, sinl) # endif #endif + +#endif Index: glibc-2.26/sysdeps/ieee754/dbl-64/s_sincos.c =================================================================== --- glibc-2.26.orig/sysdeps/ieee754/dbl-64/s_sincos.c +++ glibc-2.26/sysdeps/ieee754/dbl-64/s_sincos.c @@ -22,42 +22,9 @@ #include <math_private.h> -#define __sin __sin_local -#define __cos __cos_local -#define IN_SINCOS 1 +#define IN_SINCOS #include "s_sin.c" -/* Consolidated version of reduce_and_compute in s_sin.c that does range - reduction only once and computes sin and cos together. 
*/ -static inline void -__always_inline -reduce_and_compute_sincos (double x, double *sinx, double *cosx) -{ - double a, da; - unsigned int n = __branred (x, &a, &da); - - n = n & 3; - - if (n == 1 || n == 2) - { - a = -a; - da = -da; - } - - if (n & 1) - { - double *temp = cosx; - cosx = sinx; - sinx = temp; - } - - if (a * a < 0.01588) - *sinx = bsloww (a, da, x, n); - else - *sinx = bsloww1 (a, da, x, n); - *cosx = bsloww2 (a, da, x, n); -} - void __sincos (double x, double *sinx, double *cosx) { @@ -67,37 +34,62 @@ __sincos (double x, double *sinx, double SET_RESTORE_ROUND_53BIT (FE_TONEAREST); u.x = x; - k = 0x7fffffff & u.i[HIGH_HALF]; + k = u.i[HIGH_HALF] & 0x7fffffff; if (k < 0x400368fd) { - *sinx = __sin_local (x); - *cosx = __cos_local (x); - return; - } - if (k < 0x419921FB) - { - double a, da; - int4 n = reduce_sincos_1 (x, &a, &da); - - *sinx = do_sincos_1 (a, da, x, n, false); - *cosx = do_sincos_1 (a, da, x, n, true); - - return; - } - if (k < 0x42F00000) - { - double a, da; - int4 n = reduce_sincos_2 (x, &a, &da); - - *sinx = do_sincos_2 (a, da, x, n, false); - *cosx = do_sincos_2 (a, da, x, n, true); - + double a, da, y; + /* |x| < 2^-27 => cos (x) = 1, sin (x) = x. */ + if (k < 0x3e400000) + { + if (k < 0x3e500000) + math_check_force_underflow (x); + *sinx = x; + *cosx = 1.0; + return; + } + /* |x| < 0.855469. */ + else if (k < 0x3feb6000) + { + *sinx = do_sin (x, 0); + *cosx = do_cos (x, 0); + return; + } + + /* |x| < 2.426265. */ + y = hp0 - fabs (x); + a = y + hp1; + da = (y - a) + hp1; + *sinx = __copysign (do_cos (a, da), x); + *cosx = do_sin (a, da); return; } + /* |x| < 2^1024. */ if (k < 0x7ff00000) { - reduce_and_compute_sincos (x, sinx, cosx); + double a, da, xx; + unsigned int n; + + /* If |x| < 105414350 use simple range reduction. */ + n = k < 0x419921FB ? 
reduce_sincos (x, &a, &da) : __branred (x, &a, &da);
+  n = n & 3;
+
+  if (n == 1 || n == 2)
+    {
+      a = -a;
+      da = -da;
+    }
+
+  if (n & 1)
+    {
+      double *temp = cosx;
+      cosx = sinx;
+      sinx = temp;
+    }
+
+  *sinx = do_sin (a, da);
+  xx = do_cos (a, da);
+  *cosx = (n & 2) ? -xx : xx;
   return;
 }
Index: glibc-2.26/sysdeps/ieee754/dbl-64/slowexp.c
===================================================================
--- glibc-2.26.orig/sysdeps/ieee754/dbl-64/slowexp.c
+++ /dev/null
@@ -1,86 +0,0 @@
-/*
- * IBM Accurate Mathematical Library
- * written by International Business Machines Corp.
- * Copyright (C) 2001-2017 Free Software Foundation, Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as published by
- * the Free Software Foundation; either version 2.1 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with this program; if not, see <http://www.gnu.org/licenses/>.
- */
-/**************************************************************************/
-/* MODULE_NAME:slowexp.c */
-/* */
-/* FUNCTION:slowexp */
-/* */
-/* FILES NEEDED:mpa.h */
-/* mpa.c mpexp.c */
-/* */
-/*Converting from double precision to Multi-precision and calculating */
-/* e^x */
-/**************************************************************************/
-#include <math_private.h>
-
-#include <stap-probe.h>
-
-#ifndef USE_LONG_DOUBLE_FOR_MP
-# include "mpa.h"
-void __mpexp (mp_no *x, mp_no *y, int p);
-#endif
-
-#ifndef SECTION
-# define SECTION
-#endif
-
-/*Converting from double precision to Multi-precision and calculating e^x */
-double
-SECTION
-__slowexp (double x)
-{
-#ifndef USE_LONG_DOUBLE_FOR_MP
-  double w, z, res, eps = 3.0e-26;
-  int p;
-  mp_no mpx, mpy, mpz, mpw, mpeps, mpcor;
-
-  /* Use the multiple precision __MPEXP function to compute the exponential
-     First at 144 bits and if it is not accurate enough, at 768 bits. */
-  p = 6;
-  __dbl_mp (x, &mpx, p);
-  __mpexp (&mpx, &mpy, p);
-  __dbl_mp (eps, &mpeps, p);
-  __mul (&mpeps, &mpy, &mpcor, p);
-  __add (&mpy, &mpcor, &mpw, p);
-  __sub (&mpy, &mpcor, &mpz, p);
-  __mp_dbl (&mpw, &w, p);
-  __mp_dbl (&mpz, &z, p);
-  if (w == z)
-    {
-      /* Track how often we get to the slow exp code plus
-         its input/output values. */
-      LIBC_PROBE (slowexp_p6, 2, &x, &w);
-      return w;
-    }
-  else
-    {
-      p = 32;
-      __dbl_mp (x, &mpx, p);
-      __mpexp (&mpx, &mpy, p);
-      __mp_dbl (&mpy, &res, p);
-
-      /* Track how often we get to the uber-slow exp code plus
-         its input/output values. */
-      LIBC_PROBE (slowexp_p32, 2, &x, &res);
-      return res;
-    }
-#else
-  return (double) __ieee754_expl((long double)x);
-#endif
-}
Index: glibc-2.26/sysdeps/ieee754/dbl-64/slowpow.c
===================================================================
--- glibc-2.26.orig/sysdeps/ieee754/dbl-64/slowpow.c
+++ /dev/null
@@ -1,125 +0,0 @@
-/*
- * IBM Accurate Mathematical Library
- * written by International Business Machines Corp.
- * Copyright (C) 2001-2017 Free Software Foundation, Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as published by
- * the Free Software Foundation; either version 2.1 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with this program; if not, see <http://www.gnu.org/licenses/>.
- */
-/*************************************************************************/
-/* MODULE_NAME:slowpow.c */
-/* */
-/* FUNCTION:slowpow */
-/* */
-/*FILES NEEDED:mpa.h */
-/* mpa.c mpexp.c mplog.c halfulp.c */
-/* */
-/* Given two IEEE double machine numbers y,x , routine computes the */
-/* correctly rounded (to nearest) value of x^y. Result calculated by */
-/* multiplication (in halfulp.c) or if result isn't accurate enough */
-/* then routine converts x and y into multi-precision doubles and */
-/* calls to mpexp routine */
-/*************************************************************************/
-
-#include "mpa.h"
-#include <math_private.h>
-
-#include <stap-probe.h>
-
-#ifndef SECTION
-# define SECTION
-#endif
-
-void __mpexp (mp_no *x, mp_no *y, int p);
-void __mplog (mp_no *x, mp_no *y, int p);
-double ulog (double);
-double __halfulp (double x, double y);
-
-double
-SECTION
-__slowpow (double x, double y, double z)
-{
-  double res, res1;
-  mp_no mpx, mpy, mpz, mpw, mpp, mpr, mpr1;
-  static const mp_no eps = {-3, {1.0, 4.0}};
-  int p;
-
-  /* __HALFULP returns -10 or X^Y. */
-  res = __halfulp (x, y);
-
-  /* Return if the result was computed by __HALFULP. */
-  if (res >= 0)
-    return res;
-
-  /* Compute pow as long double. This is currently only used by powerpc, where
-     one may get 106 bits of accuracy. */
-#ifdef USE_LONG_DOUBLE_FOR_MP
-  long double ldw, ldz, ldpp;
-  static const long double ldeps = 0x4.0p-96;
-
-  ldz = __ieee754_logl ((long double) x);
-  ldw = (long double) y *ldz;
-  ldpp = __ieee754_expl (ldw);
-  res = (double) (ldpp + ldeps);
-  res1 = (double) (ldpp - ldeps);
-
-  /* Return the result if it is accurate enough. */
-  if (res == res1)
-    return res;
-#endif
-
-  /* Or else, calculate using multiple precision. P = 10 implies accuracy of
-     240 bits accuracy, since MP_NO has a radix of 2^24. */
-  p = 10;
-  __dbl_mp (x, &mpx, p);
-  __dbl_mp (y, &mpy, p);
-  __dbl_mp (z, &mpz, p);
-
-  /* z = x ^ y
-     log (z) = y * log (x)
-     z = exp (y * log (x)) */
-  __mplog (&mpx, &mpz, p);
-  __mul (&mpy, &mpz, &mpw, p);
-  __mpexp (&mpw, &mpp, p);
-
-  /* Add and subtract EPS to ensure that the result remains unchanged, i.e. we
-     have last bit accuracy. */
-  __add (&mpp, &eps, &mpr, p);
-  __mp_dbl (&mpr, &res, p);
-  __sub (&mpp, &eps, &mpr1, p);
-  __mp_dbl (&mpr1, &res1, p);
-  if (res == res1)
-    {
-      /* Track how often we get to the slow pow code plus
-         its input/output values. */
-      LIBC_PROBE (slowpow_p10, 4, &x, &y, &z, &res);
-      return res;
-    }
-
-  /* If we don't, then we repeat using a higher precision. 768 bits of
-     precision ought to be enough for anybody. */
-  p = 32;
-  __dbl_mp (x, &mpx, p);
-  __dbl_mp (y, &mpy, p);
-  __dbl_mp (z, &mpz, p);
-  __mplog (&mpx, &mpz, p);
-  __mul (&mpy, &mpz, &mpw, p);
-  __mpexp (&mpw, &mpp, p);
-  __mp_dbl (&mpp, &res, p);
-
-  /* Track how often we get to the uber-slow pow code plus
-     its input/output values. */
-  LIBC_PROBE (slowpow_p32, 4, &x, &y, &z, &res);
-
-  return res;
-}
Index: glibc-2.26/sysdeps/ieee754/dbl-64/uexp.h
===================================================================
--- glibc-2.26.orig/sysdeps/ieee754/dbl-64/uexp.h
+++ glibc-2.26/sysdeps/ieee754/dbl-64/uexp.h
@@ -29,8 +29,7 @@

 #include "mydefs.h"

-const static double zero = 0.0, hhuge = 1.0e300, tiny = 1.0e-300,
-err_0 = 1.000014, err_1 = 0.000016;
+const static double zero = 0.0, hhuge = 1.0e300, tiny = 1.0e-300;
 const static int4 bigint = 0x40862002,
                   badint = 0x40876000,smallint = 0x3C8fffff;
 const static int4 hugeint = 0x7FFFFFFF, infint = 0x7ff00000;
Index: glibc-2.26/sysdeps/ieee754/dbl-64/ulog.h
===================================================================
--- glibc-2.26.orig/sysdeps/ieee754/dbl-64/ulog.h
+++ glibc-2.26/sysdeps/ieee754/dbl-64/ulog.h
@@ -42,43 +42,6 @@
 /**/ b6 = {{0x3fbc71c5, 0x25db58ac} }, /* 0.111... */
 /**/ b7 = {{0xbfb9a4ac, 0x11a2a61c} }, /* -0.100... */
 /**/ b8 = {{0x3fb75077, 0x0df2b591} }, /* 0.091... */
- /* polynomial III */
-#if 0
-/**/ c1 = {{0x3ff00000, 0x00000000} }, /* 1 */
-#endif
-/**/ c2 = {{0xbfe00000, 0x00000000} }, /* -1/2 */
-/**/ c3 = {{0x3fd55555, 0x55555555} }, /* 1/3 */
-/**/ c4 = {{0xbfd00000, 0x00000000} }, /* -1/4 */
-/**/ c5 = {{0x3fc99999, 0x9999999a} }, /* 1/5 */
- /* polynomial IV */
-/**/ d2 = {{0xbfe00000, 0x00000000} }, /* -1/2 */
-/**/ dd2 = {{0x00000000, 0x00000000} }, /* -1/2-d2 */
-/**/ d3 = {{0x3fd55555, 0x55555555} }, /* 1/3 */
-/**/ dd3 = {{0x3c755555, 0x55555555} }, /* 1/3-d3 */
-/**/ d4 = {{0xbfd00000, 0x00000000} }, /* -1/4 */
-/**/ dd4 = {{0x00000000, 0x00000000} }, /* -1/4-d4 */
-/**/ d5 = {{0x3fc99999, 0x9999999a} }, /* 1/5 */
-/**/ dd5 = {{0xbc699999, 0x9999999a} }, /* 1/5-d5 */
-/**/ d6 = {{0xbfc55555, 0x55555555} }, /* -1/6 */
-/**/ dd6 = {{0xbc655555, 0x55555555} }, /* -1/6-d6 */
-/**/ d7 = {{0x3fc24924, 0x92492492} }, /* 1/7 */
-/**/ dd7 = {{0x3c624924, 0x92492492} }, /* 1/7-d7 */
-/**/ d8 = {{0xbfc00000, 0x00000000} }, /* -1/8 */
-/**/ dd8 = {{0x00000000, 0x00000000} }, /* -1/8-d8 */
-/**/ d9 = {{0x3fbc71c7, 0x1c71c71c} }, /* 1/9 */
-/**/ dd9 = {{0x3c5c71c7, 0x1c71c71c} }, /* 1/9-d9 */
-/**/ d10 = {{0xbfb99999, 0x9999999a} }, /* -1/10 */
-/**/ dd10 = {{0x3c599999, 0x9999999a} }, /* -1/10-d10 */
-/**/ d11 = {{0x3fb745d1, 0x745d1746} }, /* 1/11 */
-/**/ d12 = {{0xbfb55555, 0x55555555} }, /* -1/12 */
-/**/ d13 = {{0x3fb3b13b, 0x13b13b14} }, /* 1/13 */
-/**/ d14 = {{0xbfb24924, 0x92492492} }, /* -1/14 */
-/**/ d15 = {{0x3fb11111, 0x11111111} }, /* 1/15 */
-/**/ d16 = {{0xbfb00000, 0x00000000} }, /* -1/16 */
-/**/ d17 = {{0x3fae1e1e, 0x1e1e1e1e} }, /* 1/17 */
-/**/ d18 = {{0xbfac71c7, 0x1c71c71c} }, /* -1/18 */
-/**/ d19 = {{0x3faaf286, 0xbca1af28} }, /* 1/19 */
-/**/ d20 = {{0xbfa99999, 0x9999999a} }, /* -1/20 */
 /* constants */
 /**/ sqrt_2 = {{0x3ff6a09e, 0x667f3bcc} }, /* sqrt(2) */
 /**/ h1 = {{0x3fd2e000, 0x00000000} }, /* 151/2**9 */
@@ -87,14 +50,6 @@
 /**/ delv = {{0x3ef00000, 0x00000000} }, /* 1/2**16 */
 /**/ ln2a = {{0x3fe62e42, 0xfefa3800} }, /* ln(2) 43 bits */
 /**/ ln2b = {{0x3d2ef357, 0x93c76730} }, /* ln(2)-ln2a */
-/**/ e1 = {{0x3bbcc868, 0x00000000} }, /* 6.095e-21 */
-/**/ e2 = {{0x3c1138ce, 0x00000000} }, /* 2.334e-19 */
-/**/ e3 = {{0x3aa1565d, 0x00000000} }, /* 2.801e-26 */
-/**/ e4 = {{0x39809d88, 0x00000000} }, /* 1.024e-31 */
-/**/ e[M] ={{{0x37da223a, 0x00000000} }, /* 1.2e-39 */
-/**/ {{0x35c851c4, 0x00000000} }, /* 1.3e-49 */
-/**/ {{0x2ab85e51, 0x00000000} }, /* 6.8e-103 */
-/**/ {{0x17383827, 0x00000000} }},/* 8.1e-197 */
 /**/ two54 = {{0x43500000, 0x00000000} }, /* 2**54 */
 /**/ u03 = {{0x3f9eb851, 0xeb851eb8} }; /* 0.03 */

@@ -114,43 +69,6 @@
 /**/ b6 = {{0x25db58ac, 0x3fbc71c5} }, /* 0.111... */
 /**/ b7 = {{0x11a2a61c, 0xbfb9a4ac} }, /* -0.100... */
 /**/ b8 = {{0x0df2b591, 0x3fb75077} }, /* 0.091... */
- /* polynomial III */
-#if 0
-/**/ c1 = {{0x00000000, 0x3ff00000} }, /* 1 */
-#endif
-/**/ c2 = {{0x00000000, 0xbfe00000} }, /* -1/2 */
-/**/ c3 = {{0x55555555, 0x3fd55555} }, /* 1/3 */
-/**/ c4 = {{0x00000000, 0xbfd00000} }, /* -1/4 */
-/**/ c5 = {{0x9999999a, 0x3fc99999} }, /* 1/5 */
- /* polynomial IV */
-/**/ d2 = {{0x00000000, 0xbfe00000} }, /* -1/2 */
-/**/ dd2 = {{0x00000000, 0x00000000} }, /* -1/2-d2 */
-/**/ d3 = {{0x55555555, 0x3fd55555} }, /* 1/3 */
-/**/ dd3 = {{0x55555555, 0x3c755555} }, /* 1/3-d3 */
-/**/ d4 = {{0x00000000, 0xbfd00000} }, /* -1/4 */
-/**/ dd4 = {{0x00000000, 0x00000000} }, /* -1/4-d4 */
-/**/ d5 = {{0x9999999a, 0x3fc99999} }, /* 1/5 */
-/**/ dd5 = {{0x9999999a, 0xbc699999} }, /* 1/5-d5 */
-/**/ d6 = {{0x55555555, 0xbfc55555} }, /* -1/6 */
-/**/ dd6 = {{0x55555555, 0xbc655555} }, /* -1/6-d6 */
-/**/ d7 = {{0x92492492, 0x3fc24924} }, /* 1/7 */
-/**/ dd7 = {{0x92492492, 0x3c624924} }, /* 1/7-d7 */
-/**/ d8 = {{0x00000000, 0xbfc00000} }, /* -1/8 */
-/**/ dd8 = {{0x00000000, 0x00000000} }, /* -1/8-d8 */
-/**/ d9 = {{0x1c71c71c, 0x3fbc71c7} }, /* 1/9 */
-/**/ dd9 = {{0x1c71c71c, 0x3c5c71c7} }, /* 1/9-d9 */
-/**/ d10 = {{0x9999999a, 0xbfb99999} }, /* -1/10 */
-/**/ dd10 = {{0x9999999a, 0x3c599999} }, /* -1/10-d10 */
-/**/ d11 = {{0x745d1746, 0x3fb745d1} }, /* 1/11 */
-/**/ d12 = {{0x55555555, 0xbfb55555} }, /* -1/12 */
-/**/ d13 = {{0x13b13b14, 0x3fb3b13b} }, /* 1/13 */
-/**/ d14 = {{0x92492492, 0xbfb24924} }, /* -1/14 */
-/**/ d15 = {{0x11111111, 0x3fb11111} }, /* 1/15 */
-/**/ d16 = {{0x00000000, 0xbfb00000} }, /* -1/16 */
-/**/ d17 = {{0x1e1e1e1e, 0x3fae1e1e} }, /* 1/17 */
-/**/ d18 = {{0x1c71c71c, 0xbfac71c7} }, /* -1/18 */
-/**/ d19 = {{0xbca1af28, 0x3faaf286} }, /* 1/19 */
-/**/ d20 = {{0x9999999a, 0xbfa99999} }, /* -1/20 */
 /* constants */
 /**/ sqrt_2 = {{0x667f3bcc, 0x3ff6a09e} }, /* sqrt(2) */
 /**/ h1 = {{0x00000000, 0x3fd2e000} }, /* 151/2**9 */
@@ -159,14 +77,6 @@
 /**/ delv = {{0x00000000, 0x3ef00000} }, /* 1/2**16 */
 /**/ ln2a = {{0xfefa3800, 0x3fe62e42} }, /* ln(2) 43 bits */
 /**/ ln2b = {{0x93c76730, 0x3d2ef357} }, /* ln(2)-ln2a */
-/**/ e1 = {{0x00000000, 0x3bbcc868} }, /* 6.095e-21 */
-/**/ e2 = {{0x00000000, 0x3c1138ce} }, /* 2.334e-19 */
-/**/ e3 = {{0x00000000, 0x3aa1565d} }, /* 2.801e-26 */
-/**/ e4 = {{0x00000000, 0x39809d88} }, /* 1.024e-31 */
-/**/ e[M] ={{{0x00000000, 0x37da223a} }, /* 1.2e-39 */
-/**/ {{0x00000000, 0x35c851c4} }, /* 1.3e-49 */
-/**/ {{0x00000000, 0x2ab85e51} }, /* 6.8e-103 */
-/**/ {{0x00000000, 0x17383827} }},/* 8.1e-197 */
 /**/ two54 = {{0x00000000, 0x43500000} }, /* 2**54 */
 /**/ u03 = {{0xeb851eb8, 0x3f9eb851} }; /* 0.03 */

@@ -178,10 +88,6 @@
 #define DEL_V delv.d
 #define LN2A ln2a.d
 #define LN2B ln2b.d
-#define E1 e1.d
-#define E2 e2.d
-#define E3 e3.d
-#define E4 e4.d
 #define U03 u03.d

 #endif
Index: glibc-2.26/sysdeps/m68k/m680x0/fpu/halfulp.c
===================================================================
--- glibc-2.26.orig/sysdeps/m68k/m680x0/fpu/halfulp.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Not needed. */
Index: glibc-2.26/sysdeps/m68k/m680x0/fpu/slowexp.c
===================================================================
--- glibc-2.26.orig/sysdeps/m68k/m680x0/fpu/slowexp.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Not needed. */
Index: glibc-2.26/sysdeps/m68k/m680x0/fpu/slowpow.c
===================================================================
--- glibc-2.26.orig/sysdeps/m68k/m680x0/fpu/slowpow.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Not needed. */
Index: glibc-2.26/sysdeps/powerpc/power4/fpu/Makefile
===================================================================
--- glibc-2.26.orig/sysdeps/powerpc/power4/fpu/Makefile
+++ glibc-2.26/sysdeps/powerpc/power4/fpu/Makefile
@@ -2,6 +2,4 @@

 ifeq ($(subdir),math)
 CFLAGS-mpa.c += --param max-unroll-times=4 -funroll-loops -fpeel-loops
-CPPFLAGS-slowpow.c += -DUSE_LONG_DOUBLE_FOR_MP=1
-CPPFLAGS-slowexp.c += -DUSE_LONG_DOUBLE_FOR_MP=1
 endif
Index: glibc-2.26/sysdeps/x86_64/fpu/libm-test-ulps
===================================================================
--- glibc-2.26.orig/sysdeps/x86_64/fpu/libm-test-ulps
+++ glibc-2.26/sysdeps/x86_64/fpu/libm-test-ulps
@@ -1262,7 +1262,9 @@ ildouble: 1
 ldouble: 1

 Function: "cos":
+double: 1
 float128: 1
+idouble: 1
 ifloat128: 1
 ildouble: 1
 ldouble: 1
@@ -2464,8 +2466,10 @@ Function: "log_vlen8_avx2":
 float: 2

 Function: "pow":
+double: 1
 float: 1
 float128: 2
+idouble: 1
 ifloat: 1
 ifloat128: 2
 ildouble: 1
@@ -2552,7 +2556,9 @@ Function: "pow_vlen8_avx2":
 float: 3

 Function: "sin":
+double: 1
 float128: 1
+idouble: 1
 ifloat128: 1
 ildouble: 1
 ldouble: 1
@@ -2602,7 +2608,9 @@ Function: "sin_vlen8_avx2":
 float: 1

 Function: "sincos":
+double: 1
 float128: 1
+idouble: 1
 ifloat128: 1
 ildouble: 1
 ldouble: 1
Index: glibc-2.26/sysdeps/x86_64/fpu/multiarch/Makefile
===================================================================
--- glibc-2.26.orig/sysdeps/x86_64/fpu/multiarch/Makefile
+++ glibc-2.26/sysdeps/x86_64/fpu/multiarch/Makefile
@@ -4,9 +4,9 @@ libm-sysdep_routines += s_floor-c s_ceil

 libm-sysdep_routines += e_exp-fma4 e_log-fma4 e_pow-fma4 s_atan-fma4 \
                         e_asin-fma4 e_atan2-fma4 s_sin-fma4 s_tan-fma4 \
-                        mplog-fma4 mpa-fma4 slowexp-fma4 slowpow-fma4 \
+                        mplog-fma4 mpa-fma4 \
                         sincos32-fma4 doasin-fma4 dosincos-fma4 \
-                        halfulp-fma4 mpexp-fma4 \
+                        mpexp-fma4 \
                         mpatan2-fma4 mpatan-fma4 mpsqrt-fma4 mptan-fma4

 CFLAGS-doasin-fma4.c = -mfma4
@@ -16,7 +16,6 @@ CFLAGS-e_atan2-fma4.c = -mfma4
 CFLAGS-e_exp-fma4.c = -mfma4
 CFLAGS-e_log-fma4.c = -mfma4
 CFLAGS-e_pow-fma4.c = -mfma4 $(config-cflags-nofma)
-CFLAGS-halfulp-fma4.c = -mfma4
 CFLAGS-mpa-fma4.c = -mfma4
 CFLAGS-mpatan-fma4.c = -mfma4
 CFLAGS-mpatan2-fma4.c = -mfma4
@@ -26,14 +25,12 @@ CFLAGS-mpsqrt-fma4.c = -mfma4
 CFLAGS-mptan-fma4.c = -mfma4
 CFLAGS-s_atan-fma4.c = -mfma4
 CFLAGS-sincos32-fma4.c = -mfma4
-CFLAGS-slowexp-fma4.c = -mfma4
-CFLAGS-slowpow-fma4.c = -mfma4
 CFLAGS-s_sin-fma4.c = -mfma4
 CFLAGS-s_tan-fma4.c = -mfma4

 libm-sysdep_routines += e_exp-avx e_log-avx s_atan-avx \
                         e_atan2-avx s_sin-avx s_tan-avx \
-                        mplog-avx mpa-avx slowexp-avx \
+                        mplog-avx mpa-avx \
                         mpexp-avx

 CFLAGS-e_atan2-avx.c = -msse2avx -DSSE2AVX
@@ -44,7 +41,6 @@ CFLAGS-mpexp-avx.c = -msse2avx -DSSE2AVX
 CFLAGS-mplog-avx.c = -msse2avx -DSSE2AVX
 CFLAGS-s_atan-avx.c = -msse2avx -DSSE2AVX
 CFLAGS-s_sin-avx.c = -msse2avx -DSSE2AVX
-CFLAGS-slowexp-avx.c = -msse2avx -DSSE2AVX
 CFLAGS-s_tan-avx.c = -msse2avx -DSSE2AVX
 endif

Index: glibc-2.26/sysdeps/x86_64/fpu/multiarch/e_exp-avx.c
===================================================================
--- glibc-2.26.orig/sysdeps/x86_64/fpu/multiarch/e_exp-avx.c
+++ glibc-2.26/sysdeps/x86_64/fpu/multiarch/e_exp-avx.c
@@ -1,6 +1,5 @@
 #define __ieee754_exp __ieee754_exp_avx
 #define __exp1 __exp1_avx
-#define __slowexp __slowexp_avx
 #define SECTION __attribute__ ((section (".text.avx")))

 #include <sysdeps/ieee754/dbl-64/e_exp.c>
Index: glibc-2.26/sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c
===================================================================
--- glibc-2.26.orig/sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c
+++ glibc-2.26/sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c
@@ -1,6 +1,5 @@
 #define __ieee754_exp __ieee754_exp_fma4
 #define __exp1 __exp1_fma4
-#define __slowexp __slowexp_fma4
 #define SECTION __attribute__ ((section (".text.fma4")))

 #include <sysdeps/ieee754/dbl-64/e_exp.c>
Index: glibc-2.26/sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c
===================================================================
--- glibc-2.26.orig/sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c
+++ glibc-2.26/sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c
@@ -1,6 +1,5 @@
 #define __ieee754_pow __ieee754_pow_fma4
 #define __exp1 __exp1_fma4
-#define __slowpow __slowpow_fma4
 #define SECTION __attribute__ ((section (".text.fma4")))

 #include <sysdeps/ieee754/dbl-64/e_pow.c>
Index: glibc-2.26/sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c
===================================================================
--- glibc-2.26.orig/sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c
+++ /dev/null
@@ -1,4 +0,0 @@
-#define __halfulp __halfulp_fma4
-#define SECTION __attribute__ ((section (".text.fma4")))
-
-#include <sysdeps/ieee754/dbl-64/halfulp.c>
Index: glibc-2.26/sysdeps/x86_64/fpu/multiarch/slowexp-avx.c
===================================================================
--- glibc-2.26.orig/sysdeps/x86_64/fpu/multiarch/slowexp-avx.c
+++ /dev/null
@@ -1,9 +0,0 @@
-#define __slowexp __slowexp_avx
-#define __add __add_avx
-#define __dbl_mp __dbl_mp_avx
-#define __mpexp __mpexp_avx
-#define __mul __mul_avx
-#define __sub __sub_avx
-#define SECTION __attribute__ ((section (".text.avx")))
-
-#include <sysdeps/ieee754/dbl-64/slowexp.c>
Index: glibc-2.26/sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c
===================================================================
--- glibc-2.26.orig/sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c
+++ /dev/null
@@ -1,9 +0,0 @@
-#define __slowexp __slowexp_fma4
-#define __add __add_fma4
-#define __dbl_mp __dbl_mp_fma4
-#define __mpexp __mpexp_fma4
-#define __mul __mul_fma4
-#define __sub __sub_fma4
-#define SECTION __attribute__ ((section (".text.fma4")))
-
-#include <sysdeps/ieee754/dbl-64/slowexp.c>
Index: glibc-2.26/sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c
===================================================================
--- glibc-2.26.orig/sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c
+++ /dev/null
@@ -1,11 +0,0 @@
-#define __slowpow __slowpow_fma4
-#define __add __add_fma4
-#define __dbl_mp __dbl_mp_fma4
-#define __mpexp __mpexp_fma4
-#define __mplog __mplog_fma4
-#define __mul __mul_fma4
-#define __sub __sub_fma4
-#define __halfulp __halfulp_fma4
-#define SECTION __attribute__ ((section (".text.fma4")))
-
-#include <sysdeps/ieee754/dbl-64/slowpow.c>