1*24715Selefunt*************************************************************************** 2*24715Selefunt* * 3*24715Selefunt* Copyright (c) 1985 Regents of the University of California. * 4*24715Selefunt* * 5*24715Selefunt* Use and reproduction of this software are granted in accordance with * 6*24715Selefunt* the terms and conditions specified in the Berkeley Software License * 7*24715Selefunt* Agreement (in particular, this entails acknowledgement of the programs' * 8*24715Selefunt* source, and inclusion of this notice) with the additional understanding * 9*24715Selefunt* that all recipients should regard themselves as participants in an * 10*24715Selefunt* ongoing research project and hence should feel obligated to report * 11*24715Selefunt* their experiences (good or bad) with these elementary function codes, * 12*24715Selefunt* using "sendbug 4bsd-bugs@BERKELEY", to the authors. * 13*24715Selefunt* * 14*24715Selefunt* K.C. Ng, with Z-S. Alex Liu, S. McDonald, P. Tang, W. Kahan. * 15*24715Selefunt* Revised on 5/10/85, 5/13/85, 6/14/85, 8/20/85, 8/27/85, 9/11/85. * 16*24715Selefunt* * 17*24715Selefunt*************************************************************************** 1824584Szliu 19*24715Selefunt/* @(#)README 1.6 (Berkeley) 9/12/85; 1.3 (ucb.elefunt) 09/12/85 */ 2024584Szliu 21*24715SelefuntNB. The machine-independent Version 7 math library found in 4.2BSD 22*24715Selefunt is now /usr/lib/libom.a. To compile with those routines use -lom. 23*24715Selefunt 2424584Szliu****************************************************************************** 2524584Szliu* This is a description of the upgraded elementary functions (listed in 1). * 2624584Szliu* Bessel functions (j0, j1, jn, y0, y1, yn), floor, and fabs passed over * 2724584Szliu* from 4.2BSD without change except perhaps for the way floating point * 28*24715Selefunt* exception is signaled on a VAX. Three lines that contain "errno" in erf.c* 29*24715Selefunt* (error functions erf, erfc) have been deleted to prevent overriding the * 3024584Szliu* system "errno". * 3124584Szliu****************************************************************************** 3224584Szliu 3324584Szliu0. Total number of files: 40 3424584Szliu 35*24715Selefunt IEEE/Makefile VAX/Makefile VAX/support.s erf.c lgamma.c 3624584Szliu IEEE/atan2.c VAX/argred.s VAX/tan.s exp.c log.c 3724584Szliu IEEE/cabs.c VAX/atan2.s acosh.c exp__E.c log10.c 3824584Szliu IEEE/cbrt.c VAX/cabs.s asincos.c expm1.c log1p.c 3924584Szliu IEEE/support.c VAX/cbrt.s asinh.c floor.c log__L.c 4024584Szliu IEEE/trig.c VAX/infnan.s atan.c j0.c pow.c 4124584Szliu Makefile VAX/sincos.s atanh.c j1.c sinh.c 4224584Szliu README VAX/sqrt.s cosh.c jn.c tanh.c 4324584Szliu 44*24715Selefunt1. Functions implemented : 45*24715Selefunt (A). Standard elementary functions (total 22) : 46*24715Selefunt acos(x) ...in file asincos.c 47*24715Selefunt asin(x) ...in file asincos.c 48*24715Selefunt atan(x) ...in file atan.c 49*24715Selefunt atan2(x,y) ...in files IEEE/atan2.c, VAX/atan2.s 50*24715Selefunt sin(x) ...in files IEEE/trig.c, VAX/sincos.s 51*24715Selefunt cos(x) ...in files IEEE/trig.c, VAX/sincos.s 52*24715Selefunt tan(x) ...in files IEEE/trig.c, VAX/tan.s 53*24715Selefunt cabs(x,y) ...in files IEEE/cabs.c, VAX/cabs.s 54*24715Selefunt hypot(x,y) ...in files IEEE/cabs.c, VAX/cabs.s 55*24715Selefunt cbrt(x) ...in files IEEE/cbrt.c, VAX/cbrt.s 56*24715Selefunt exp(x) ...in file exp.c 57*24715Selefunt expm1(x):=exp(x)-1 ...in file expm1.c 58*24715Selefunt log(x) ...in file log.c 59*24715Selefunt log10(x) ...in file log10.c 60*24715Selefunt log1p(x):=log(1+x) ...in file log1p.c 61*24715Selefunt pow(x,y) ...in file pow.c 62*24715Selefunt sinh(x) ...in file sinh.c 63*24715Selefunt cosh(x) ...in file cosh.c 64*24715Selefunt tanh(x) ...in file tanh.c 65*24715Selefunt asinh(x) ...in file asinh.c 66*24715Selefunt acosh(x) ...in file acosh.c 67*24715Selefunt atanh(x) ...in file atanh.c 68*24715Selefunt 6924584Szliu (B). Kernel functions : 70*24715Selefunt exp__E(x,c) ...in file exp__E.c, used by expm1/exp/pow/cosh 71*24715Selefunt log__L(s) ...in file log__L.c, used by log1p/log/pow 72*24715Selefunt libm$argred ...in file VAX/argred.s, used by VAX version of sin/cos/tan 7324584Szliu 7424584Szliu (C). System supported functions : 75*24715Selefunt sqrt() ...in files IEEE/support.c, VAX/sqrt.s 76*24715Selefunt drem() ...in files IEEE/support.c, VAX/support.s 77*24715Selefunt finite() ...in files IEEE/support.c, VAX/support.s 78*24715Selefunt logb() ...in files IEEE/support.c, VAX/support.s 79*24715Selefunt scalb() ...in files IEEE/support.c, VAX/support.s 80*24715Selefunt copysign() ...in files IEEE/support.c, VAX/support.s 81*24715Selefunt rint() ...in file floor.c 8224584Szliu 8324584Szliu 8424584Szliu Notes: 8524652Szliu i. The codes in files ending with ".s" are written in VAX assembly 8624584Szliu language. They are intended for VAX computers. 8724584Szliu 8824652Szliu Files that end with ".c" are written in C. They are intended 8924584Szliu for either a VAX or a machine that conforms to the IEEE 9024652Szliu standard 754 for double precision floating-point arithmetic. 9124584Szliu 9224584Szliu ii. On other than VAX or IEEE machines, run the original math 93*24715Selefunt library, formerly "/usr/lib/libm.a", now "/usr/lib/libom.a", if 94*24715Selefunt nothing better is available. 9524584Szliu 96*24715Selefunt iii. The trigonometric functions sin/cos/tan/atan2 in files "VAX/sincos.s", 97*24715Selefunt "VAX/tan.s" and "VAX/atan2.s" are different from those in 98*24715Selefunt "IEEE/trig.c" and "IEEE/atan2.c". The VAX assembler code uses the 99*24715Selefunt true value of pi to perform argument reduction, while the C code uses 100*24715Selefunt a machine value of PI (see "IEEE/trig.c"). 10124584Szliu 10224584Szliu 10324584Szliu2. A computer system that conforms to IEEE standard 754 should provide 104*24715Selefunt sqrt(x), 105*24715Selefunt drem(x,p), (double precision remainder function) 106*24715Selefunt copysign(x,y), 107*24715Selefunt finite(x), 108*24715Selefunt scalb(x,N), 109*24715Selefunt logb(x) and 110*24715Selefunt rint(x). 11124652Szliu These functions are either required or recommended by the standard. 11224584Szliu For convenience, a (slow) C implementation of these functions is 11324652Szliu provided in the file "IEEE/support.c". 11424584Szliu 115*24715Selefunt Warning: The functions in IEEE/support.c are somewhat machine dependent. 11624584Szliu Some modifications may be necessary to run them on a different machine. 117*24715Selefunt Currently, if compiled with a suitable flag, "IEEE/support.c" will work 118*24715Selefunt on a National 32000, a Zilog 8000, a VAX, and a SUN (cf. the "Makefile" 119*24715Selefunt in this directory). Invoke the C compiler thus: 12024584Szliu 12124584Szliu cc -c -DVAX IEEE/support.c ... on a VAX, D-format 12224652Szliu cc -c -DNATIONAL IEEE/support.c ... on a National 32000 12324584Szliu cc -c IEEE/support.c ... on other IEEE machines, 12424584Szliu we hope. 12524584Szliu 12624584Szliu Notes: 12724584Szliu 1. Faster versions of "drem" and "sqrt" for IEEE double precision 12824584Szliu (coded in C but intended for assembly language) are given at the 12924652Szliu end of "IEEE/support.c" but commented out since they require certain 13024584Szliu machine-dependent functions. 13124584Szliu 13224584Szliu 2. A fast VAX assembler version of the system supported functions 13324584Szliu copysign(), logb(), scalb(), finite(), and drem() appears in file 13424652Szliu "VAX/support.s". A fast VAX assembler version of sqrt() is in 13524652Szliu file "VAX/sqrt.s". 13624584Szliu 13724584Szliu3. Two formats are supported by all the standard elementary functions: 13824652Szliu the VAX D-format (56-bit precision), and the IEEE double format 13924652Szliu (53-bit precision). The cbrt() in "IEEE/cbrt.c" is for IEEE machines 14024584Szliu only. The functions in files that end with ".s" are for VAX computers 141*24715Selefunt only. The functions in files that end with ".c" (except "IEEE/cbrt.c") 142*24715Selefunt are for VAX and IEEE machines. To use the VAX D-format, compile the code 14324584Szliu with -DVAX; to use IEEE double format on various IEEE machines, see 14424652Szliu "Makefile" in this directory). 14524584Szliu 14624584Szliu Example: 14724584Szliu cc -c -DVAX sin.c ... for VAX D-format 14824584Szliu 14924584Szliu Warning: The values of floating-point constants used in the code are 15024584Szliu given in both hexadecimal and decimal. The hexadecimal values 15124652Szliu are the intended ones. The decimal values may be used provided 15224584Szliu that the compiler converts from decimal to binary accurately 15324584Szliu enough to produce the hexadecimal values shown. If the 15424584Szliu conversion is inaccurate, then one must know the exact machine 15524652Szliu representation of the constants and alter the assembly 156*24715Selefunt language output from the compiler, or play tricks like 15724584Szliu the following in a C program. 15824584Szliu 15924584Szliu Example: to store the floating-point constant 16024584Szliu 16124584Szliu p1= 2^-6 * .F83ABE67E1066A (Hexadecimal) 16224584Szliu 16324652Szliu on a VAX in C, we use two longwords to store its 16424584Szliu machine value and define p1 to be the double constant 16524652Szliu at the location of these two longwords: 16624584Szliu 167*24715Selefunt static long p1x[] = { 0x3abe3d78, 0x066a67e1}; 16824584Szliu #define p1 (*(double*)p1x) 16924584Szliu 17024652Szliu Note: On a VAX, some functions have two codes. For example, cabs() has 171*24715Selefunt one implementation in "IEEE/cabs.c", and another in "VAX/cabs.s". 17224652Szliu In this case, the assembly language version is preferred. 17324584Szliu 17424584Szliu 17524584Szliu4. Accuracy. 17624584Szliu 17724584Szliu The errors in expm1(), log1p(), exp(), log(), cabs(), hypot() 17824584Szliu and cbrt() are below 1 ULP (Unit in the Last Place). 17924584Szliu 18024584Szliu The error in pow(x,y) grows with the size of y. Nevertheless, 18124584Szliu for integers x and y, pow(x,y) returns the correct integer value 18224584Szliu on all tested machines (VAX, SUN, NATIONAL, ZILOG), provided that 18324584Szliu x to the power of y is representable exactly. 18424584Szliu 185*24715Selefunt cosh, sinh, acosh, asinh, tanh, atanh and log10 have errors below 186*24715Selefunt about 3 ULPs. 18724584Szliu 188*24715Selefunt For trigonometric and inverse trigonometric functions: 18924584Szliu 190*24715Selefunt Let [trig(x)] denote the value actually computed for trig(x), 191*24715Selefunt 19224584Szliu 1) Those codes using the machine's value PI (true pi rounded): 193*24715Selefunt (source codes: IEEE/{trig.c,atan2.c}, asincos.c and atan.c) 19424584Szliu 19524584Szliu The errors in [sin(x)], [cos(x)], and [atan(x)] are below 19624584Szliu 1 ULP compared with sin(x*pi/PI), cos(x*pi/PI), and 19724584Szliu atan(x)*PI/pi respectively, where PI is the machine's 19824652Szliu value of pi rounded. [tan(x)] returns tan(x*pi/PI) within 19924584Szliu about 2 ULPs; [acos(x)], [asin(x)], and [atan2(y,x)] 20024584Szliu return acos(x)*PI/pi, asin(x)*PI/pi, and atan2(y,x)*PI/pi 20124584Szliu respectively to similar accuracy. 20224584Szliu 203*24715Selefunt 20424652Szliu 2) Those using true pi (for VAX D-format only): 205*24715Selefunt (source codes: VAX/{sincos.s,tan.s,atan2.s}, asincos.c and 206*24715Selefunt atan.c) 20724584Szliu 20824584Szliu The errors in [sin(x)], [cos(x)], and [atan(x)] are below 209*24715Selefunt 1 ULP. [tan(x)], [atan2(y,x)], [acos(x)], and [asin(x)] 21024584Szliu have errors below about 2 ULPs. 21124584Szliu 212*24715Selefunt 21324584Szliu Here are the results of some test runs to find worst errors on 21424584Szliu the VAX : 21524584Szliu 21624584Szliu tan : 2.09 ULPs ...1,024,000 random arguments (machine PI) 21724584Szliu sin : .861 ULPs ...1,024,000 random arguments (machine PI) 21824584Szliu cos : .857 ULPs ...1,024,000 random arguments (machine PI) 21924584Szliu (compared with tan, sin, cos of (x*pi/PI)) 22024584Szliu 22124584Szliu acos : 2.07 ULPs .....200,000 random arguments (machine PI) 22224584Szliu asin : 2.06 ULPs .....200,000 random arguments (machine PI) 22324584Szliu atan2 : 1.41 ULPs .....356,000 random arguments (machine PI) 22424584Szliu atan : 0.86 ULPs ...1,536,000 random arguments (machine PI) 22524584Szliu (compared with (PI/pi)*(atan, asin, acos, atan2 of x)) 22624584Szliu 22724584Szliu tan : 2.15 ULPs ...1,024,000 random arguments (true pi) 22824584Szliu sin : .814 ULPs ...1,024,000 random arguments (true pi) 22924584Szliu cos : .792 ULPs ...1,024,000 random arguments (true pi) 23024584Szliu acos : 2.15 ULPs ...1,024,000 random arguments (true pi) 23124584Szliu asin : 1.99 ULPs ...1,024,000 random arguments (true pi) 23224584Szliu atan2 : 1.48 ULPs ...1,024,000 random arguments (true pi) 23324584Szliu atan : .850 ULPs ...1,024,000 random arguments (true pi) 23424584Szliu 23524584Szliu acosh : 3.30 ULPs .....512,000 random arguments 23624584Szliu asinh : 1.58 ULPs .....512,000 random arguments 23724584Szliu atanh : 1.71 ULPs .....512,000 random arguments 23824584Szliu cosh : 1.23 ULPs .....768,000 random arguments 23924584Szliu sinh : 1.93 ULPs ...1,024,000 random arguments 24024584Szliu tanh : 2.22 ULPs ...1,024,000 random arguments 24124584Szliu log10 : 1.74 ULPs ...1,536,000 random arguments 24224584Szliu pow : 1.79 ULPs .....100,000 random arguments, 0 < x, y < 20. 24324584Szliu 24424584Szliu exp : .768 ULPs ...1,156,000 random arguments 24524584Szliu expm1 : .844 ULPs ...1,166,000 random arguments 24624584Szliu log1p : .846 ULPs ...1,536,000 random arguments 24724584Szliu log : .826 ULPs ...1,536,000 random arguments 24824584Szliu cabs : .959 ULPs .....500,000 random arguments 24924584Szliu cbrt : .666 ULPs ...5,120,000 random arguments 25024584Szliu 25124584Szliu 25224584Szliu5. Speed. 25324584Szliu 25424652Szliu Some functions coded in VAX assembly language (cabs(), hypot() and 25524652Szliu sqrt()) are significantly faster than the corresponding ones in 4.2BSD. 25624652Szliu In general, to improve performance, all functions in "IEEE/support.c" 25724652Szliu should be written in assembly language and, whenever possible, should 25824652Szliu be called via short subroutine calls. 25924584Szliu 26024584Szliu 261*24715Selefunt6. j0, j1, jn. 26224584Szliu 26324584Szliu The modifications to these routines were only in how an invalid 264*24715Selefunt floating point operations is signaled. 265