xref: /csrg-svn/lib/libm/README (revision 24715)
1*24715Selefunt***************************************************************************
2*24715Selefunt*                                                                         *
3*24715Selefunt* Copyright (c) 1985 Regents of the University of California.             *
4*24715Selefunt*                                                                         *
5*24715Selefunt* Use and reproduction of this software are granted  in  accordance  with *
6*24715Selefunt* the terms and conditions specified in  the  Berkeley  Software  License *
7*24715Selefunt* Agreement (in particular, this entails acknowledgement of the programs' *
8*24715Selefunt* source, and inclusion of this notice) with the additional understanding *
9*24715Selefunt* that  all  recipients  should regard themselves as participants  in  an *
10*24715Selefunt* ongoing  research  project and hence should  feel  obligated  to report *
11*24715Selefunt* their  experiences (good or bad) with these elementary function  codes, *
12*24715Selefunt* using "sendbug 4bsd-bugs@BERKELEY", to the authors.                     *
13*24715Selefunt*                                                                         *
14*24715Selefunt* K.C. Ng, with Z-S. Alex Liu, S. McDonald, P. Tang, W. Kahan.            *
15*24715Selefunt* Revised on 5/10/85, 5/13/85, 6/14/85, 8/20/85, 8/27/85, 9/11/85.        *
16*24715Selefunt*                                                                         *
17*24715Selefunt***************************************************************************
1824584Szliu
19*24715Selefunt/*	@(#)README	1.6 (Berkeley) 9/12/85; 1.3 (ucb.elefunt) 09/12/85 */
2024584Szliu
21*24715SelefuntNB.  The machine-independent Version 7 math library found in 4.2BSD
22*24715Selefunt     is now /usr/lib/libom.a.  To compile with those routines use -lom.
23*24715Selefunt
2424584Szliu******************************************************************************
2524584Szliu*  This is a description of the upgraded elementary functions (listed in 1). *
2624584Szliu*  Bessel functions (j0, j1, jn, y0, y1, yn), floor, and fabs passed over    *
2724584Szliu*  from 4.2BSD without change except perhaps for the way floating point      *
28*24715Selefunt*  exception is signaled on a VAX.  Three lines that contain "errno" in erf.c*
29*24715Selefunt*  (error functions erf, erfc) have been deleted to prevent overriding the   *
3024584Szliu*  system "errno".                                                           *
3124584Szliu******************************************************************************
3224584Szliu
3324584Szliu0. Total number of files: 40
3424584Szliu
35*24715Selefunt        IEEE/Makefile   VAX/Makefile    VAX/support.s   erf.c       lgamma.c
3624584Szliu        IEEE/atan2.c    VAX/argred.s    VAX/tan.s       exp.c       log.c
3724584Szliu        IEEE/cabs.c     VAX/atan2.s     acosh.c         exp__E.c    log10.c
3824584Szliu        IEEE/cbrt.c     VAX/cabs.s      asincos.c       expm1.c     log1p.c
3924584Szliu        IEEE/support.c  VAX/cbrt.s      asinh.c         floor.c     log__L.c
4024584Szliu        IEEE/trig.c     VAX/infnan.s    atan.c          j0.c        pow.c
4124584Szliu        Makefile        VAX/sincos.s    atanh.c         j1.c        sinh.c
4224584Szliu        README          VAX/sqrt.s      cosh.c          jn.c        tanh.c
4324584Szliu
44*24715Selefunt1. Functions implemented :
45*24715Selefunt    (A). Standard elementary functions (total 22) :
46*24715Selefunt        acos(x)                 ...in file  asincos.c
47*24715Selefunt        asin(x)                 ...in file  asincos.c
48*24715Selefunt        atan(x)                 ...in file  atan.c
49*24715Selefunt        atan2(x,y)              ...in files IEEE/atan2.c, VAX/atan2.s
50*24715Selefunt        sin(x)                  ...in files IEEE/trig.c,  VAX/sincos.s
51*24715Selefunt        cos(x)                  ...in files IEEE/trig.c,  VAX/sincos.s
52*24715Selefunt        tan(x)                  ...in files IEEE/trig.c,  VAX/tan.s
53*24715Selefunt        cabs(x,y)               ...in files IEEE/cabs.c,  VAX/cabs.s
54*24715Selefunt        hypot(x,y)              ...in files IEEE/cabs.c,  VAX/cabs.s
55*24715Selefunt        cbrt(x)                 ...in files IEEE/cbrt.c,  VAX/cbrt.s
56*24715Selefunt        exp(x)                  ...in file  exp.c
57*24715Selefunt        expm1(x):=exp(x)-1      ...in file  expm1.c
58*24715Selefunt        log(x)                  ...in file  log.c
59*24715Selefunt        log10(x)                ...in file  log10.c
60*24715Selefunt        log1p(x):=log(1+x)      ...in file  log1p.c
61*24715Selefunt        pow(x,y)                ...in file  pow.c
62*24715Selefunt        sinh(x)                 ...in file  sinh.c
63*24715Selefunt        cosh(x)                 ...in file  cosh.c
64*24715Selefunt        tanh(x)                 ...in file  tanh.c
65*24715Selefunt        asinh(x)                ...in file  asinh.c
66*24715Selefunt        acosh(x)                ...in file  acosh.c
67*24715Selefunt        atanh(x)                ...in file  atanh.c
68*24715Selefunt
6924584Szliu    (B). Kernel functions :
70*24715Selefunt        exp__E(x,c) ...in file exp__E.c, used by expm1/exp/pow/cosh
71*24715Selefunt        log__L(s)   ...in file log__L.c, used by log1p/log/pow
72*24715Selefunt        libm$argred ...in file VAX/argred.s, used by VAX version of sin/cos/tan
7324584Szliu
7424584Szliu    (C). System supported functions :
75*24715Selefunt        sqrt()      ...in files IEEE/support.c, VAX/sqrt.s
76*24715Selefunt        drem()      ...in files IEEE/support.c, VAX/support.s
77*24715Selefunt        finite()    ...in files IEEE/support.c, VAX/support.s
78*24715Selefunt        logb()      ...in files IEEE/support.c, VAX/support.s
79*24715Selefunt        scalb()     ...in files IEEE/support.c, VAX/support.s
80*24715Selefunt        copysign()  ...in files IEEE/support.c, VAX/support.s
81*24715Selefunt        rint()      ...in file  floor.c
8224584Szliu
8324584Szliu
8424584Szliu   Notes:
8524652Szliu       i. The codes in files ending with ".s" are written in VAX assembly
8624584Szliu          language. They are intended for VAX computers.
8724584Szliu
8824652Szliu          Files that end with ".c" are written in C. They are intended
8924584Szliu          for either a VAX or a machine that conforms to the IEEE
9024652Szliu          standard 754 for double precision floating-point arithmetic.
9124584Szliu
9224584Szliu      ii. On other than VAX or IEEE machines, run the original math
93*24715Selefunt          library, formerly "/usr/lib/libm.a", now "/usr/lib/libom.a", if
94*24715Selefunt	  nothing better is available.
9524584Szliu
96*24715Selefunt     iii. The trigonometric functions sin/cos/tan/atan2 in files "VAX/sincos.s",
97*24715Selefunt          "VAX/tan.s" and "VAX/atan2.s" are different from those in
98*24715Selefunt          "IEEE/trig.c" and "IEEE/atan2.c".  The VAX assembler code uses the
99*24715Selefunt          true value of pi to perform argument reduction, while the C code uses
100*24715Selefunt          a machine value of PI (see "IEEE/trig.c").
10124584Szliu
10224584Szliu
10324584Szliu2. A computer system that conforms to IEEE standard 754 should provide
104*24715Selefunt                sqrt(x),
105*24715Selefunt                drem(x,p), (double precision remainder function)
106*24715Selefunt                copysign(x,y),
107*24715Selefunt                finite(x),
108*24715Selefunt                scalb(x,N),
109*24715Selefunt                logb(x) and
110*24715Selefunt                rint(x).
11124652Szliu   These functions are either required or recommended by the standard.
11224584Szliu   For convenience, a (slow) C implementation of these functions is
11324652Szliu   provided in the file "IEEE/support.c".
11424584Szliu
115*24715Selefunt   Warning: The functions in IEEE/support.c are somewhat machine dependent.
11624584Szliu   Some modifications may be necessary to run them on a different machine.
117*24715Selefunt   Currently, if compiled with a suitable flag, "IEEE/support.c" will work
118*24715Selefunt   on a National 32000, a Zilog 8000, a VAX, and a SUN (cf. the "Makefile"
119*24715Selefunt   in this directory). Invoke the C compiler thus:
12024584Szliu
12124584Szliu        cc -c -DVAX IEEE/support.c              ... on a VAX, D-format
12224652Szliu        cc -c -DNATIONAL IEEE/support.c         ... on a National 32000
12324584Szliu        cc -c  IEEE/support.c                   ... on other IEEE machines,
12424584Szliu                                                    we hope.
12524584Szliu
12624584Szliu   Notes:
12724584Szliu      1. Faster versions of "drem" and "sqrt" for IEEE double precision
12824584Szliu         (coded in C but intended for assembly language) are given at the
12924652Szliu         end of "IEEE/support.c" but commented out since they require certain
13024584Szliu         machine-dependent functions.
13124584Szliu
13224584Szliu      2. A fast VAX assembler version of the system supported functions
13324584Szliu         copysign(), logb(), scalb(), finite(), and drem() appears in file
13424652Szliu         "VAX/support.s".  A fast VAX assembler version of sqrt() is in
13524652Szliu         file "VAX/sqrt.s".
13624584Szliu
13724584Szliu3. Two formats are supported by all the standard elementary functions:
13824652Szliu   the VAX D-format (56-bit precision), and the IEEE double format
13924652Szliu   (53-bit precision).  The cbrt() in "IEEE/cbrt.c" is for IEEE machines
14024584Szliu   only. The functions in files that end with ".s" are for VAX computers
141*24715Selefunt   only. The functions in files that end with ".c" (except "IEEE/cbrt.c")
142*24715Selefunt   are for VAX and IEEE machines. To use the VAX D-format, compile the code
14324584Szliu   with -DVAX; to use IEEE double format on various IEEE machines, see
14424652Szliu   "Makefile" in this directory).
14524584Szliu
14624584Szliu    Example:
14724584Szliu        cc -c -DVAX sin.c               ... for VAX D-format
14824584Szliu
14924584Szliu       Warning: The values of floating-point constants used in the code are
15024584Szliu                given in both hexadecimal and decimal.  The hexadecimal values
15124652Szliu                are the intended ones. The decimal values may be used provided
15224584Szliu                that the compiler converts from decimal to binary accurately
15324584Szliu                enough to produce the hexadecimal values shown. If the
15424584Szliu                conversion is inaccurate, then one must know the exact machine
15524652Szliu                representation of the constants and alter the assembly
156*24715Selefunt                language output from the compiler, or play tricks like
15724584Szliu                the following in a C program.
15824584Szliu
15924584Szliu                        Example: to store the floating-point constant
16024584Szliu
16124584Szliu                             p1= 2^-6 * .F83ABE67E1066A (Hexadecimal)
16224584Szliu
16324652Szliu                        on a VAX in C, we use two longwords to store its
16424584Szliu                        machine value and define p1 to be the double constant
16524652Szliu                        at the location of these two longwords:
16624584Szliu
167*24715Selefunt                        static long  p1x[] = { 0x3abe3d78, 0x066a67e1};
16824584Szliu                        #define      p1      (*(double*)p1x)
16924584Szliu
17024652Szliu    Note:  On a VAX, some functions have two codes. For example, cabs() has
171*24715Selefunt	   one implementation in "IEEE/cabs.c", and another in "VAX/cabs.s".
17224652Szliu           In this case, the assembly language version is preferred.
17324584Szliu
17424584Szliu
17524584Szliu4. Accuracy.
17624584Szliu
17724584Szliu            The errors in expm1(), log1p(), exp(), log(), cabs(), hypot()
17824584Szliu            and cbrt() are below 1 ULP (Unit in the Last Place).
17924584Szliu
18024584Szliu            The error in pow(x,y) grows with the size of y. Nevertheless,
18124584Szliu            for integers x and y, pow(x,y) returns the correct integer value
18224584Szliu            on all tested machines (VAX, SUN, NATIONAL, ZILOG), provided that
18324584Szliu            x to the power of y is representable exactly.
18424584Szliu
185*24715Selefunt            cosh, sinh, acosh, asinh, tanh, atanh and log10 have errors below
186*24715Selefunt            about 3 ULPs.
18724584Szliu
188*24715Selefunt            For trigonometric and inverse trigonometric functions:
18924584Szliu
190*24715Selefunt                Let [trig(x)] denote the value actually computed for trig(x),
191*24715Selefunt
19224584Szliu                1) Those codes using the machine's value PI (true pi rounded):
193*24715Selefunt                   (source codes: IEEE/{trig.c,atan2.c}, asincos.c and atan.c)
19424584Szliu
19524584Szliu                   The errors in [sin(x)], [cos(x)], and [atan(x)] are below
19624584Szliu                   1 ULP compared with sin(x*pi/PI), cos(x*pi/PI), and
19724584Szliu                   atan(x)*PI/pi respectively, where PI is the machine's
19824652Szliu                   value of pi rounded. [tan(x)] returns tan(x*pi/PI) within
19924584Szliu                   about 2 ULPs; [acos(x)], [asin(x)], and [atan2(y,x)]
20024584Szliu                   return acos(x)*PI/pi, asin(x)*PI/pi, and atan2(y,x)*PI/pi
20124584Szliu                   respectively to similar accuracy.
20224584Szliu
203*24715Selefunt
20424652Szliu                2) Those using true pi (for VAX D-format only):
205*24715Selefunt                   (source codes: VAX/{sincos.s,tan.s,atan2.s}, asincos.c and
206*24715Selefunt                   atan.c)
20724584Szliu
20824584Szliu                   The errors in [sin(x)], [cos(x)], and [atan(x)] are below
209*24715Selefunt                   1 ULP. [tan(x)], [atan2(y,x)], [acos(x)], and [asin(x)]
21024584Szliu                   have errors below about 2 ULPs.
21124584Szliu
212*24715Selefunt
21324584Szliu            Here are the results of some test runs to find worst errors on
21424584Szliu            the VAX :
21524584Szliu
21624584Szliu    tan   :  2.09 ULPs          ...1,024,000 random arguments (machine PI)
21724584Szliu    sin   :  .861 ULPs          ...1,024,000 random arguments (machine PI)
21824584Szliu    cos   :  .857 ULPs          ...1,024,000 random arguments (machine PI)
21924584Szliu    (compared with tan, sin, cos of (x*pi/PI))
22024584Szliu
22124584Szliu    acos  :  2.07 ULPs          .....200,000 random arguments (machine PI)
22224584Szliu    asin  :  2.06 ULPs          .....200,000 random arguments (machine PI)
22324584Szliu    atan2 :  1.41 ULPs          .....356,000 random arguments (machine PI)
22424584Szliu    atan  :  0.86 ULPs          ...1,536,000 random arguments (machine PI)
22524584Szliu    (compared with (PI/pi)*(atan, asin, acos, atan2 of x))
22624584Szliu
22724584Szliu    tan   :  2.15 ULPs          ...1,024,000 random arguments (true pi)
22824584Szliu    sin   :  .814 ULPs          ...1,024,000 random arguments (true pi)
22924584Szliu    cos   :  .792 ULPs          ...1,024,000 random arguments (true pi)
23024584Szliu    acos  :  2.15 ULPs          ...1,024,000 random arguments (true pi)
23124584Szliu    asin  :  1.99 ULPs          ...1,024,000 random arguments (true pi)
23224584Szliu    atan2 :  1.48 ULPs          ...1,024,000 random arguments (true pi)
23324584Szliu    atan  :  .850 ULPs          ...1,024,000 random arguments (true pi)
23424584Szliu
23524584Szliu    acosh :  3.30 ULPs          .....512,000 random arguments
23624584Szliu    asinh :  1.58 ULPs          .....512,000 random arguments
23724584Szliu    atanh :  1.71 ULPs          .....512,000 random arguments
23824584Szliu    cosh  :  1.23 ULPs          .....768,000 random arguments
23924584Szliu    sinh  :  1.93 ULPs          ...1,024,000 random arguments
24024584Szliu    tanh  :  2.22 ULPs          ...1,024,000 random arguments
24124584Szliu    log10 :  1.74 ULPs          ...1,536,000 random arguments
24224584Szliu    pow   :  1.79 ULPs          .....100,000 random arguments, 0 < x, y < 20.
24324584Szliu
24424584Szliu    exp   :  .768 ULPs          ...1,156,000 random arguments
24524584Szliu    expm1 :  .844 ULPs          ...1,166,000 random arguments
24624584Szliu    log1p :  .846 ULPs          ...1,536,000 random arguments
24724584Szliu    log   :  .826 ULPs          ...1,536,000 random arguments
24824584Szliu    cabs  :  .959 ULPs          .....500,000 random arguments
24924584Szliu    cbrt  :  .666 ULPs          ...5,120,000 random arguments
25024584Szliu
25124584Szliu
25224584Szliu5. Speed.
25324584Szliu
25424652Szliu        Some functions coded in VAX assembly language (cabs(), hypot() and
25524652Szliu	sqrt()) are significantly faster than the corresponding ones in 4.2BSD.
25624652Szliu        In general, to improve performance, all functions in "IEEE/support.c"
25724652Szliu        should be written in assembly language and, whenever possible, should
25824652Szliu	be called via short subroutine calls.
25924584Szliu
26024584Szliu
261*24715Selefunt6. j0, j1, jn.
26224584Szliu
26324584Szliu        The modifications to these routines were only in how an invalid
264*24715Selefunt        floating point operations is signaled.
265