xref: /csrg-svn/lib/libm/README (revision 34116)
1*34116Sbostic/*
2*34116Sbostic * Copyright (c) 1985 Regents of the University of California.
3*34116Sbostic * All rights reserved.
4*34116Sbostic *
5*34116Sbostic * Redistribution and use in source and binary forms are permitted
6*34116Sbostic * provided that this notice is preserved and that due credit is given
7*34116Sbostic * to the University of California at Berkeley. The name of the University
8*34116Sbostic * may not be used to endorse or promote products derived from this
9*34116Sbostic * software without specific prior written permission. This software
10*34116Sbostic * is provided ``as is'' without express or implied warranty.
11*34116Sbostic *
12*34116Sbostic * All recipients should regard themselves as participants in an ongoing
13*34116Sbostic * research project and hence should feel obligated to report their
14*34116Sbostic * experiences (good or bad) with these elementary function codes, using
15*34116Sbostic * the sendbug(8) program, to the authors.
16*34116Sbostic *
17*34116Sbostic * K.C. Ng, with Z-S. Alex Liu, S. McDonald, P. Tang, W. Kahan.
18*34116Sbostic * Revised on 5/10/85, 5/13/85, 6/14/85, 8/20/85, 8/27/85, 9/11/85.
19*34116Sbostic *
20*34116Sbostic *	@(#)README	5.2 (Berkeley) 04/29/88
21*34116Sbostic */
2224584Szliu
2324584Szliu******************************************************************************
2424584Szliu*  This is a description of the upgraded elementary functions (listed in 1). *
2524584Szliu*  Bessel functions (j0, j1, jn, y0, y1, yn), floor, and fabs passed over    *
2624584Szliu*  from 4.2BSD without change except perhaps for the way floating point      *
2724715Selefunt*  exception is signaled on a VAX.  Three lines that contain "errno" in erf.c*
2824715Selefunt*  (error functions erf, erfc) have been deleted to prevent overriding the   *
2924584Szliu*  system "errno".                                                           *
3024584Szliu******************************************************************************
3124584Szliu
3224584Szliu0. Total number of files: 40
3324584Szliu
3424715Selefunt        IEEE/Makefile   VAX/Makefile    VAX/support.s   erf.c       lgamma.c
3524584Szliu        IEEE/atan2.c    VAX/argred.s    VAX/tan.s       exp.c       log.c
3624584Szliu        IEEE/cabs.c     VAX/atan2.s     acosh.c         exp__E.c    log10.c
3724584Szliu        IEEE/cbrt.c     VAX/cabs.s      asincos.c       expm1.c     log1p.c
3824584Szliu        IEEE/support.c  VAX/cbrt.s      asinh.c         floor.c     log__L.c
3924584Szliu        IEEE/trig.c     VAX/infnan.s    atan.c          j0.c        pow.c
4024584Szliu        Makefile        VAX/sincos.s    atanh.c         j1.c        sinh.c
4124584Szliu        README          VAX/sqrt.s      cosh.c          jn.c        tanh.c
4224584Szliu
4324715Selefunt1. Functions implemented :
4424715Selefunt    (A). Standard elementary functions (total 22) :
4524715Selefunt        acos(x)                 ...in file  asincos.c
4624715Selefunt        asin(x)                 ...in file  asincos.c
4724715Selefunt        atan(x)                 ...in file  atan.c
4824715Selefunt        atan2(x,y)              ...in files IEEE/atan2.c, VAX/atan2.s
4924715Selefunt        sin(x)                  ...in files IEEE/trig.c,  VAX/sincos.s
5024715Selefunt        cos(x)                  ...in files IEEE/trig.c,  VAX/sincos.s
5124715Selefunt        tan(x)                  ...in files IEEE/trig.c,  VAX/tan.s
5224715Selefunt        cabs(x,y)               ...in files IEEE/cabs.c,  VAX/cabs.s
5324715Selefunt        hypot(x,y)              ...in files IEEE/cabs.c,  VAX/cabs.s
5424715Selefunt        cbrt(x)                 ...in files IEEE/cbrt.c,  VAX/cbrt.s
5524715Selefunt        exp(x)                  ...in file  exp.c
5624715Selefunt        expm1(x):=exp(x)-1      ...in file  expm1.c
5724715Selefunt        log(x)                  ...in file  log.c
5824715Selefunt        log10(x)                ...in file  log10.c
5924715Selefunt        log1p(x):=log(1+x)      ...in file  log1p.c
6024715Selefunt        pow(x,y)                ...in file  pow.c
6124715Selefunt        sinh(x)                 ...in file  sinh.c
6224715Selefunt        cosh(x)                 ...in file  cosh.c
6324715Selefunt        tanh(x)                 ...in file  tanh.c
6424715Selefunt        asinh(x)                ...in file  asinh.c
6524715Selefunt        acosh(x)                ...in file  acosh.c
6624715Selefunt        atanh(x)                ...in file  atanh.c
6724715Selefunt
6824584Szliu    (B). Kernel functions :
6924715Selefunt        exp__E(x,c) ...in file exp__E.c, used by expm1/exp/pow/cosh
7024715Selefunt        log__L(s)   ...in file log__L.c, used by log1p/log/pow
7124715Selefunt        libm$argred ...in file VAX/argred.s, used by VAX version of sin/cos/tan
7224584Szliu
7324584Szliu    (C). System supported functions :
7424715Selefunt        sqrt()      ...in files IEEE/support.c, VAX/sqrt.s
7524715Selefunt        drem()      ...in files IEEE/support.c, VAX/support.s
7624715Selefunt        finite()    ...in files IEEE/support.c, VAX/support.s
7724715Selefunt        logb()      ...in files IEEE/support.c, VAX/support.s
7824715Selefunt        scalb()     ...in files IEEE/support.c, VAX/support.s
7924715Selefunt        copysign()  ...in files IEEE/support.c, VAX/support.s
8024715Selefunt        rint()      ...in file  floor.c
8124584Szliu
8224584Szliu
8324584Szliu   Notes:
8424652Szliu       i. The codes in files ending with ".s" are written in VAX assembly
8524584Szliu          language. They are intended for VAX computers.
8624584Szliu
8724652Szliu          Files that end with ".c" are written in C. They are intended
8824584Szliu          for either a VAX or a machine that conforms to the IEEE
8924652Szliu          standard 754 for double precision floating-point arithmetic.
9024584Szliu
9124584Szliu      ii. On other than VAX or IEEE machines, run the original math
9224715Selefunt          library, formerly "/usr/lib/libm.a", now "/usr/lib/libom.a", if
9324715Selefunt	  nothing better is available.
9424584Szliu
9524715Selefunt     iii. The trigonometric functions sin/cos/tan/atan2 in files "VAX/sincos.s",
9624715Selefunt          "VAX/tan.s" and "VAX/atan2.s" are different from those in
9724715Selefunt          "IEEE/trig.c" and "IEEE/atan2.c".  The VAX assembler code uses the
9824715Selefunt          true value of pi to perform argument reduction, while the C code uses
9924715Selefunt          a machine value of PI (see "IEEE/trig.c").
10024584Szliu
10124584Szliu
10224584Szliu2. A computer system that conforms to IEEE standard 754 should provide
10324715Selefunt                sqrt(x),
10424715Selefunt                drem(x,p), (double precision remainder function)
10524715Selefunt                copysign(x,y),
10624715Selefunt                finite(x),
10724715Selefunt                scalb(x,N),
10824715Selefunt                logb(x) and
10924715Selefunt                rint(x).
11024652Szliu   These functions are either required or recommended by the standard.
11124584Szliu   For convenience, a (slow) C implementation of these functions is
11224652Szliu   provided in the file "IEEE/support.c".
11324584Szliu
11424715Selefunt   Warning: The functions in IEEE/support.c are somewhat machine dependent.
11524584Szliu   Some modifications may be necessary to run them on a different machine.
11624715Selefunt   Currently, if compiled with a suitable flag, "IEEE/support.c" will work
11724715Selefunt   on a National 32000, a Zilog 8000, a VAX, and a SUN (cf. the "Makefile"
11824715Selefunt   in this directory). Invoke the C compiler thus:
11924584Szliu
12024584Szliu        cc -c -DVAX IEEE/support.c              ... on a VAX, D-format
12124652Szliu        cc -c -DNATIONAL IEEE/support.c         ... on a National 32000
12224584Szliu        cc -c  IEEE/support.c                   ... on other IEEE machines,
12324584Szliu                                                    we hope.
12424584Szliu
12524584Szliu   Notes:
12624584Szliu      1. Faster versions of "drem" and "sqrt" for IEEE double precision
12724584Szliu         (coded in C but intended for assembly language) are given at the
12824652Szliu         end of "IEEE/support.c" but commented out since they require certain
12924584Szliu         machine-dependent functions.
13024584Szliu
13124584Szliu      2. A fast VAX assembler version of the system supported functions
13224584Szliu         copysign(), logb(), scalb(), finite(), and drem() appears in file
13324652Szliu         "VAX/support.s".  A fast VAX assembler version of sqrt() is in
13424652Szliu         file "VAX/sqrt.s".
13524584Szliu
13624584Szliu3. Two formats are supported by all the standard elementary functions:
13724652Szliu   the VAX D-format (56-bit precision), and the IEEE double format
13824652Szliu   (53-bit precision).  The cbrt() in "IEEE/cbrt.c" is for IEEE machines
13924584Szliu   only. The functions in files that end with ".s" are for VAX computers
14024715Selefunt   only. The functions in files that end with ".c" (except "IEEE/cbrt.c")
14124715Selefunt   are for VAX and IEEE machines. To use the VAX D-format, compile the code
14224584Szliu   with -DVAX; to use IEEE double format on various IEEE machines, see
14324652Szliu   "Makefile" in this directory).
14424584Szliu
14524584Szliu    Example:
14624584Szliu        cc -c -DVAX sin.c               ... for VAX D-format
14724584Szliu
14824584Szliu       Warning: The values of floating-point constants used in the code are
14924584Szliu                given in both hexadecimal and decimal.  The hexadecimal values
15024652Szliu                are the intended ones. The decimal values may be used provided
15124584Szliu                that the compiler converts from decimal to binary accurately
15224584Szliu                enough to produce the hexadecimal values shown. If the
15324584Szliu                conversion is inaccurate, then one must know the exact machine
15424652Szliu                representation of the constants and alter the assembly
15524715Selefunt                language output from the compiler, or play tricks like
15624584Szliu                the following in a C program.
15724584Szliu
15824584Szliu                        Example: to store the floating-point constant
15924584Szliu
16024584Szliu                             p1= 2^-6 * .F83ABE67E1066A (Hexadecimal)
16124584Szliu
16224652Szliu                        on a VAX in C, we use two longwords to store its
16324584Szliu                        machine value and define p1 to be the double constant
16424652Szliu                        at the location of these two longwords:
16524584Szliu
16624715Selefunt                        static long  p1x[] = { 0x3abe3d78, 0x066a67e1};
16724584Szliu                        #define      p1      (*(double*)p1x)
16824584Szliu
16924652Szliu    Note:  On a VAX, some functions have two codes. For example, cabs() has
17024715Selefunt	   one implementation in "IEEE/cabs.c", and another in "VAX/cabs.s".
17124652Szliu           In this case, the assembly language version is preferred.
17224584Szliu
17324584Szliu
17424584Szliu4. Accuracy.
17524584Szliu
17624584Szliu            The errors in expm1(), log1p(), exp(), log(), cabs(), hypot()
17724584Szliu            and cbrt() are below 1 ULP (Unit in the Last Place).
17824584Szliu
17924584Szliu            The error in pow(x,y) grows with the size of y. Nevertheless,
18024584Szliu            for integers x and y, pow(x,y) returns the correct integer value
18124584Szliu            on all tested machines (VAX, SUN, NATIONAL, ZILOG), provided that
18224584Szliu            x to the power of y is representable exactly.
18324584Szliu
18424715Selefunt            cosh, sinh, acosh, asinh, tanh, atanh and log10 have errors below
18524715Selefunt            about 3 ULPs.
18624584Szliu
18724715Selefunt            For trigonometric and inverse trigonometric functions:
18824584Szliu
18924715Selefunt                Let [trig(x)] denote the value actually computed for trig(x),
19024715Selefunt
19124584Szliu                1) Those codes using the machine's value PI (true pi rounded):
19224715Selefunt                   (source codes: IEEE/{trig.c,atan2.c}, asincos.c and atan.c)
19324584Szliu
19424584Szliu                   The errors in [sin(x)], [cos(x)], and [atan(x)] are below
19524584Szliu                   1 ULP compared with sin(x*pi/PI), cos(x*pi/PI), and
19624584Szliu                   atan(x)*PI/pi respectively, where PI is the machine's
19724652Szliu                   value of pi rounded. [tan(x)] returns tan(x*pi/PI) within
19824584Szliu                   about 2 ULPs; [acos(x)], [asin(x)], and [atan2(y,x)]
19924584Szliu                   return acos(x)*PI/pi, asin(x)*PI/pi, and atan2(y,x)*PI/pi
20024584Szliu                   respectively to similar accuracy.
20124584Szliu
20224715Selefunt
20324652Szliu                2) Those using true pi (for VAX D-format only):
20424715Selefunt                   (source codes: VAX/{sincos.s,tan.s,atan2.s}, asincos.c and
20524715Selefunt                   atan.c)
20624584Szliu
20724584Szliu                   The errors in [sin(x)], [cos(x)], and [atan(x)] are below
20824715Selefunt                   1 ULP. [tan(x)], [atan2(y,x)], [acos(x)], and [asin(x)]
20924584Szliu                   have errors below about 2 ULPs.
21024584Szliu
21124715Selefunt
21224584Szliu            Here are the results of some test runs to find worst errors on
21324584Szliu            the VAX :
21424584Szliu
21524584Szliu    tan   :  2.09 ULPs          ...1,024,000 random arguments (machine PI)
21624584Szliu    sin   :  .861 ULPs          ...1,024,000 random arguments (machine PI)
21724584Szliu    cos   :  .857 ULPs          ...1,024,000 random arguments (machine PI)
21824584Szliu    (compared with tan, sin, cos of (x*pi/PI))
21924584Szliu
22024584Szliu    acos  :  2.07 ULPs          .....200,000 random arguments (machine PI)
22124584Szliu    asin  :  2.06 ULPs          .....200,000 random arguments (machine PI)
22224584Szliu    atan2 :  1.41 ULPs          .....356,000 random arguments (machine PI)
22324584Szliu    atan  :  0.86 ULPs          ...1,536,000 random arguments (machine PI)
22424584Szliu    (compared with (PI/pi)*(atan, asin, acos, atan2 of x))
22524584Szliu
22624584Szliu    tan   :  2.15 ULPs          ...1,024,000 random arguments (true pi)
22724584Szliu    sin   :  .814 ULPs          ...1,024,000 random arguments (true pi)
22824584Szliu    cos   :  .792 ULPs          ...1,024,000 random arguments (true pi)
22924584Szliu    acos  :  2.15 ULPs          ...1,024,000 random arguments (true pi)
23024584Szliu    asin  :  1.99 ULPs          ...1,024,000 random arguments (true pi)
23124584Szliu    atan2 :  1.48 ULPs          ...1,024,000 random arguments (true pi)
23224584Szliu    atan  :  .850 ULPs          ...1,024,000 random arguments (true pi)
23324584Szliu
23424584Szliu    acosh :  3.30 ULPs          .....512,000 random arguments
23524584Szliu    asinh :  1.58 ULPs          .....512,000 random arguments
23624584Szliu    atanh :  1.71 ULPs          .....512,000 random arguments
23724584Szliu    cosh  :  1.23 ULPs          .....768,000 random arguments
23824584Szliu    sinh  :  1.93 ULPs          ...1,024,000 random arguments
23924584Szliu    tanh  :  2.22 ULPs          ...1,024,000 random arguments
24024584Szliu    log10 :  1.74 ULPs          ...1,536,000 random arguments
24124584Szliu    pow   :  1.79 ULPs          .....100,000 random arguments, 0 < x, y < 20.
24224584Szliu
24324584Szliu    exp   :  .768 ULPs          ...1,156,000 random arguments
24424584Szliu    expm1 :  .844 ULPs          ...1,166,000 random arguments
24524584Szliu    log1p :  .846 ULPs          ...1,536,000 random arguments
24624584Szliu    log   :  .826 ULPs          ...1,536,000 random arguments
24724584Szliu    cabs  :  .959 ULPs          .....500,000 random arguments
24824584Szliu    cbrt  :  .666 ULPs          ...5,120,000 random arguments
24924584Szliu
25024584Szliu
25124584Szliu5. Speed.
25224584Szliu
25324652Szliu        Some functions coded in VAX assembly language (cabs(), hypot() and
25424652Szliu	sqrt()) are significantly faster than the corresponding ones in 4.2BSD.
25524652Szliu        In general, to improve performance, all functions in "IEEE/support.c"
25624652Szliu        should be written in assembly language and, whenever possible, should
25724652Szliu	be called via short subroutine calls.
25824584Szliu
25924584Szliu
26024715Selefunt6. j0, j1, jn.
26124584Szliu
26224584Szliu        The modifications to these routines were only in how an invalid
26324715Selefunt        floating point operations is signaled.
264