1*34116Sbostic/* 2*34116Sbostic * Copyright (c) 1985 Regents of the University of California. 3*34116Sbostic * All rights reserved. 4*34116Sbostic * 5*34116Sbostic * Redistribution and use in source and binary forms are permitted 6*34116Sbostic * provided that this notice is preserved and that due credit is given 7*34116Sbostic * to the University of California at Berkeley. The name of the University 8*34116Sbostic * may not be used to endorse or promote products derived from this 9*34116Sbostic * software without specific prior written permission. This software 10*34116Sbostic * is provided ``as is'' without express or implied warranty. 11*34116Sbostic * 12*34116Sbostic * All recipients should regard themselves as participants in an ongoing 13*34116Sbostic * research project and hence should feel obligated to report their 14*34116Sbostic * experiences (good or bad) with these elementary function codes, using 15*34116Sbostic * the sendbug(8) program, to the authors. 16*34116Sbostic * 17*34116Sbostic * K.C. Ng, with Z-S. Alex Liu, S. McDonald, P. Tang, W. Kahan. 18*34116Sbostic * Revised on 5/10/85, 5/13/85, 6/14/85, 8/20/85, 8/27/85, 9/11/85. 19*34116Sbostic * 20*34116Sbostic * @(#)README 5.2 (Berkeley) 04/29/88 21*34116Sbostic */ 2224584Szliu 2324584Szliu****************************************************************************** 2424584Szliu* This is a description of the upgraded elementary functions (listed in 1). * 2524584Szliu* Bessel functions (j0, j1, jn, y0, y1, yn), floor, and fabs passed over * 2624584Szliu* from 4.2BSD without change except perhaps for the way floating point * 2724715Selefunt* exception is signaled on a VAX. Three lines that contain "errno" in erf.c* 2824715Selefunt* (error functions erf, erfc) have been deleted to prevent overriding the * 2924584Szliu* system "errno". * 3024584Szliu****************************************************************************** 3124584Szliu 3224584Szliu0. Total number of files: 40 3324584Szliu 3424715Selefunt IEEE/Makefile VAX/Makefile VAX/support.s erf.c lgamma.c 3524584Szliu IEEE/atan2.c VAX/argred.s VAX/tan.s exp.c log.c 3624584Szliu IEEE/cabs.c VAX/atan2.s acosh.c exp__E.c log10.c 3724584Szliu IEEE/cbrt.c VAX/cabs.s asincos.c expm1.c log1p.c 3824584Szliu IEEE/support.c VAX/cbrt.s asinh.c floor.c log__L.c 3924584Szliu IEEE/trig.c VAX/infnan.s atan.c j0.c pow.c 4024584Szliu Makefile VAX/sincos.s atanh.c j1.c sinh.c 4124584Szliu README VAX/sqrt.s cosh.c jn.c tanh.c 4224584Szliu 4324715Selefunt1. Functions implemented : 4424715Selefunt (A). Standard elementary functions (total 22) : 4524715Selefunt acos(x) ...in file asincos.c 4624715Selefunt asin(x) ...in file asincos.c 4724715Selefunt atan(x) ...in file atan.c 4824715Selefunt atan2(x,y) ...in files IEEE/atan2.c, VAX/atan2.s 4924715Selefunt sin(x) ...in files IEEE/trig.c, VAX/sincos.s 5024715Selefunt cos(x) ...in files IEEE/trig.c, VAX/sincos.s 5124715Selefunt tan(x) ...in files IEEE/trig.c, VAX/tan.s 5224715Selefunt cabs(x,y) ...in files IEEE/cabs.c, VAX/cabs.s 5324715Selefunt hypot(x,y) ...in files IEEE/cabs.c, VAX/cabs.s 5424715Selefunt cbrt(x) ...in files IEEE/cbrt.c, VAX/cbrt.s 5524715Selefunt exp(x) ...in file exp.c 5624715Selefunt expm1(x):=exp(x)-1 ...in file expm1.c 5724715Selefunt log(x) ...in file log.c 5824715Selefunt log10(x) ...in file log10.c 5924715Selefunt log1p(x):=log(1+x) ...in file log1p.c 6024715Selefunt pow(x,y) ...in file pow.c 6124715Selefunt sinh(x) ...in file sinh.c 6224715Selefunt cosh(x) ...in file cosh.c 6324715Selefunt tanh(x) ...in file tanh.c 6424715Selefunt asinh(x) ...in file asinh.c 6524715Selefunt acosh(x) ...in file acosh.c 6624715Selefunt atanh(x) ...in file atanh.c 6724715Selefunt 6824584Szliu (B). Kernel functions : 6924715Selefunt exp__E(x,c) ...in file exp__E.c, used by expm1/exp/pow/cosh 7024715Selefunt log__L(s) ...in file log__L.c, used by log1p/log/pow 7124715Selefunt libm$argred ...in file VAX/argred.s, used by VAX version of sin/cos/tan 7224584Szliu 7324584Szliu (C). System supported functions : 7424715Selefunt sqrt() ...in files IEEE/support.c, VAX/sqrt.s 7524715Selefunt drem() ...in files IEEE/support.c, VAX/support.s 7624715Selefunt finite() ...in files IEEE/support.c, VAX/support.s 7724715Selefunt logb() ...in files IEEE/support.c, VAX/support.s 7824715Selefunt scalb() ...in files IEEE/support.c, VAX/support.s 7924715Selefunt copysign() ...in files IEEE/support.c, VAX/support.s 8024715Selefunt rint() ...in file floor.c 8124584Szliu 8224584Szliu 8324584Szliu Notes: 8424652Szliu i. The codes in files ending with ".s" are written in VAX assembly 8524584Szliu language. They are intended for VAX computers. 8624584Szliu 8724652Szliu Files that end with ".c" are written in C. They are intended 8824584Szliu for either a VAX or a machine that conforms to the IEEE 8924652Szliu standard 754 for double precision floating-point arithmetic. 9024584Szliu 9124584Szliu ii. On other than VAX or IEEE machines, run the original math 9224715Selefunt library, formerly "/usr/lib/libm.a", now "/usr/lib/libom.a", if 9324715Selefunt nothing better is available. 9424584Szliu 9524715Selefunt iii. The trigonometric functions sin/cos/tan/atan2 in files "VAX/sincos.s", 9624715Selefunt "VAX/tan.s" and "VAX/atan2.s" are different from those in 9724715Selefunt "IEEE/trig.c" and "IEEE/atan2.c". The VAX assembler code uses the 9824715Selefunt true value of pi to perform argument reduction, while the C code uses 9924715Selefunt a machine value of PI (see "IEEE/trig.c"). 10024584Szliu 10124584Szliu 10224584Szliu2. A computer system that conforms to IEEE standard 754 should provide 10324715Selefunt sqrt(x), 10424715Selefunt drem(x,p), (double precision remainder function) 10524715Selefunt copysign(x,y), 10624715Selefunt finite(x), 10724715Selefunt scalb(x,N), 10824715Selefunt logb(x) and 10924715Selefunt rint(x). 11024652Szliu These functions are either required or recommended by the standard. 11124584Szliu For convenience, a (slow) C implementation of these functions is 11224652Szliu provided in the file "IEEE/support.c". 11324584Szliu 11424715Selefunt Warning: The functions in IEEE/support.c are somewhat machine dependent. 11524584Szliu Some modifications may be necessary to run them on a different machine. 11624715Selefunt Currently, if compiled with a suitable flag, "IEEE/support.c" will work 11724715Selefunt on a National 32000, a Zilog 8000, a VAX, and a SUN (cf. the "Makefile" 11824715Selefunt in this directory). Invoke the C compiler thus: 11924584Szliu 12024584Szliu cc -c -DVAX IEEE/support.c ... on a VAX, D-format 12124652Szliu cc -c -DNATIONAL IEEE/support.c ... on a National 32000 12224584Szliu cc -c IEEE/support.c ... on other IEEE machines, 12324584Szliu we hope. 12424584Szliu 12524584Szliu Notes: 12624584Szliu 1. Faster versions of "drem" and "sqrt" for IEEE double precision 12724584Szliu (coded in C but intended for assembly language) are given at the 12824652Szliu end of "IEEE/support.c" but commented out since they require certain 12924584Szliu machine-dependent functions. 13024584Szliu 13124584Szliu 2. A fast VAX assembler version of the system supported functions 13224584Szliu copysign(), logb(), scalb(), finite(), and drem() appears in file 13324652Szliu "VAX/support.s". A fast VAX assembler version of sqrt() is in 13424652Szliu file "VAX/sqrt.s". 13524584Szliu 13624584Szliu3. Two formats are supported by all the standard elementary functions: 13724652Szliu the VAX D-format (56-bit precision), and the IEEE double format 13824652Szliu (53-bit precision). The cbrt() in "IEEE/cbrt.c" is for IEEE machines 13924584Szliu only. The functions in files that end with ".s" are for VAX computers 14024715Selefunt only. The functions in files that end with ".c" (except "IEEE/cbrt.c") 14124715Selefunt are for VAX and IEEE machines. To use the VAX D-format, compile the code 14224584Szliu with -DVAX; to use IEEE double format on various IEEE machines, see 14324652Szliu "Makefile" in this directory). 14424584Szliu 14524584Szliu Example: 14624584Szliu cc -c -DVAX sin.c ... for VAX D-format 14724584Szliu 14824584Szliu Warning: The values of floating-point constants used in the code are 14924584Szliu given in both hexadecimal and decimal. The hexadecimal values 15024652Szliu are the intended ones. The decimal values may be used provided 15124584Szliu that the compiler converts from decimal to binary accurately 15224584Szliu enough to produce the hexadecimal values shown. If the 15324584Szliu conversion is inaccurate, then one must know the exact machine 15424652Szliu representation of the constants and alter the assembly 15524715Selefunt language output from the compiler, or play tricks like 15624584Szliu the following in a C program. 15724584Szliu 15824584Szliu Example: to store the floating-point constant 15924584Szliu 16024584Szliu p1= 2^-6 * .F83ABE67E1066A (Hexadecimal) 16124584Szliu 16224652Szliu on a VAX in C, we use two longwords to store its 16324584Szliu machine value and define p1 to be the double constant 16424652Szliu at the location of these two longwords: 16524584Szliu 16624715Selefunt static long p1x[] = { 0x3abe3d78, 0x066a67e1}; 16724584Szliu #define p1 (*(double*)p1x) 16824584Szliu 16924652Szliu Note: On a VAX, some functions have two codes. For example, cabs() has 17024715Selefunt one implementation in "IEEE/cabs.c", and another in "VAX/cabs.s". 17124652Szliu In this case, the assembly language version is preferred. 17224584Szliu 17324584Szliu 17424584Szliu4. Accuracy. 17524584Szliu 17624584Szliu The errors in expm1(), log1p(), exp(), log(), cabs(), hypot() 17724584Szliu and cbrt() are below 1 ULP (Unit in the Last Place). 17824584Szliu 17924584Szliu The error in pow(x,y) grows with the size of y. Nevertheless, 18024584Szliu for integers x and y, pow(x,y) returns the correct integer value 18124584Szliu on all tested machines (VAX, SUN, NATIONAL, ZILOG), provided that 18224584Szliu x to the power of y is representable exactly. 18324584Szliu 18424715Selefunt cosh, sinh, acosh, asinh, tanh, atanh and log10 have errors below 18524715Selefunt about 3 ULPs. 18624584Szliu 18724715Selefunt For trigonometric and inverse trigonometric functions: 18824584Szliu 18924715Selefunt Let [trig(x)] denote the value actually computed for trig(x), 19024715Selefunt 19124584Szliu 1) Those codes using the machine's value PI (true pi rounded): 19224715Selefunt (source codes: IEEE/{trig.c,atan2.c}, asincos.c and atan.c) 19324584Szliu 19424584Szliu The errors in [sin(x)], [cos(x)], and [atan(x)] are below 19524584Szliu 1 ULP compared with sin(x*pi/PI), cos(x*pi/PI), and 19624584Szliu atan(x)*PI/pi respectively, where PI is the machine's 19724652Szliu value of pi rounded. [tan(x)] returns tan(x*pi/PI) within 19824584Szliu about 2 ULPs; [acos(x)], [asin(x)], and [atan2(y,x)] 19924584Szliu return acos(x)*PI/pi, asin(x)*PI/pi, and atan2(y,x)*PI/pi 20024584Szliu respectively to similar accuracy. 20124584Szliu 20224715Selefunt 20324652Szliu 2) Those using true pi (for VAX D-format only): 20424715Selefunt (source codes: VAX/{sincos.s,tan.s,atan2.s}, asincos.c and 20524715Selefunt atan.c) 20624584Szliu 20724584Szliu The errors in [sin(x)], [cos(x)], and [atan(x)] are below 20824715Selefunt 1 ULP. [tan(x)], [atan2(y,x)], [acos(x)], and [asin(x)] 20924584Szliu have errors below about 2 ULPs. 21024584Szliu 21124715Selefunt 21224584Szliu Here are the results of some test runs to find worst errors on 21324584Szliu the VAX : 21424584Szliu 21524584Szliu tan : 2.09 ULPs ...1,024,000 random arguments (machine PI) 21624584Szliu sin : .861 ULPs ...1,024,000 random arguments (machine PI) 21724584Szliu cos : .857 ULPs ...1,024,000 random arguments (machine PI) 21824584Szliu (compared with tan, sin, cos of (x*pi/PI)) 21924584Szliu 22024584Szliu acos : 2.07 ULPs .....200,000 random arguments (machine PI) 22124584Szliu asin : 2.06 ULPs .....200,000 random arguments (machine PI) 22224584Szliu atan2 : 1.41 ULPs .....356,000 random arguments (machine PI) 22324584Szliu atan : 0.86 ULPs ...1,536,000 random arguments (machine PI) 22424584Szliu (compared with (PI/pi)*(atan, asin, acos, atan2 of x)) 22524584Szliu 22624584Szliu tan : 2.15 ULPs ...1,024,000 random arguments (true pi) 22724584Szliu sin : .814 ULPs ...1,024,000 random arguments (true pi) 22824584Szliu cos : .792 ULPs ...1,024,000 random arguments (true pi) 22924584Szliu acos : 2.15 ULPs ...1,024,000 random arguments (true pi) 23024584Szliu asin : 1.99 ULPs ...1,024,000 random arguments (true pi) 23124584Szliu atan2 : 1.48 ULPs ...1,024,000 random arguments (true pi) 23224584Szliu atan : .850 ULPs ...1,024,000 random arguments (true pi) 23324584Szliu 23424584Szliu acosh : 3.30 ULPs .....512,000 random arguments 23524584Szliu asinh : 1.58 ULPs .....512,000 random arguments 23624584Szliu atanh : 1.71 ULPs .....512,000 random arguments 23724584Szliu cosh : 1.23 ULPs .....768,000 random arguments 23824584Szliu sinh : 1.93 ULPs ...1,024,000 random arguments 23924584Szliu tanh : 2.22 ULPs ...1,024,000 random arguments 24024584Szliu log10 : 1.74 ULPs ...1,536,000 random arguments 24124584Szliu pow : 1.79 ULPs .....100,000 random arguments, 0 < x, y < 20. 24224584Szliu 24324584Szliu exp : .768 ULPs ...1,156,000 random arguments 24424584Szliu expm1 : .844 ULPs ...1,166,000 random arguments 24524584Szliu log1p : .846 ULPs ...1,536,000 random arguments 24624584Szliu log : .826 ULPs ...1,536,000 random arguments 24724584Szliu cabs : .959 ULPs .....500,000 random arguments 24824584Szliu cbrt : .666 ULPs ...5,120,000 random arguments 24924584Szliu 25024584Szliu 25124584Szliu5. Speed. 25224584Szliu 25324652Szliu Some functions coded in VAX assembly language (cabs(), hypot() and 25424652Szliu sqrt()) are significantly faster than the corresponding ones in 4.2BSD. 25524652Szliu In general, to improve performance, all functions in "IEEE/support.c" 25624652Szliu should be written in assembly language and, whenever possible, should 25724652Szliu be called via short subroutine calls. 25824584Szliu 25924584Szliu 26024715Selefunt6. j0, j1, jn. 26124584Szliu 26224584Szliu The modifications to these routines were only in how an invalid 26324715Selefunt floating point operations is signaled. 264