/*
 * Copyright (c) 1985 Regents of the University of California.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms are permitted
 * provided that this notice is preserved and that due credit is given
 * to the University of California at Berkeley. The name of the University
 * may not be used to endorse or promote products derived from this
 * software without specific prior written permission. This software
 * is provided ``as is'' without express or implied warranty.
 *
 * All recipients should regard themselves as participants in an ongoing
 * research project and hence should feel obligated to report their
 * experiences (good or bad) with these elementary function codes, using
 * the sendbug(8) program, to the authors.
 *
 * K.C. Ng, with Z-S. Alex Liu, S. McDonald, P. Tang, W. Kahan.
 * Revised on 5/10/85, 5/13/85, 6/14/85, 8/20/85, 8/27/85, 9/11/85.
 *
 *	@(#)README	5.2 (Berkeley) 04/29/88
 */

******************************************************************************
*  This is a description of the upgraded elementary functions (listed in 1). *
*  Bessel functions (j0, j1, jn, y0, y1, yn), floor, and fabs passed over    *
*  from 4.2BSD without change except perhaps for the way a floating-point    *
*  exception is signaled on a VAX.  Three lines that contain "errno" in erf.c*
*  (error functions erf, erfc) have been deleted to prevent overriding the   *
*  system "errno".                                                           *
******************************************************************************

0. Total number of files: 40

        IEEE/Makefile   VAX/Makefile    VAX/support.s   erf.c       lgamma.c
        IEEE/atan2.c    VAX/argred.s    VAX/tan.s       exp.c       log.c
        IEEE/cabs.c     VAX/atan2.s     acosh.c         exp__E.c    log10.c
        IEEE/cbrt.c     VAX/cabs.s      asincos.c       expm1.c     log1p.c
        IEEE/support.c  VAX/cbrt.s      asinh.c         floor.c     log__L.c
        IEEE/trig.c     VAX/infnan.s    atan.c          j0.c        pow.c
        Makefile        VAX/sincos.s    atanh.c         j1.c        sinh.c
        README          VAX/sqrt.s      cosh.c          jn.c        tanh.c

1. Functions implemented:
    (A). Standard elementary functions (total 22):
        acos(x)                 ...in file  asincos.c
        asin(x)                 ...in file  asincos.c
        atan(x)                 ...in file  atan.c
        atan2(y,x)              ...in files IEEE/atan2.c, VAX/atan2.s
        sin(x)                  ...in files IEEE/trig.c,  VAX/sincos.s
        cos(x)                  ...in files IEEE/trig.c,  VAX/sincos.s
        tan(x)                  ...in files IEEE/trig.c,  VAX/tan.s
        cabs(x,y)               ...in files IEEE/cabs.c,  VAX/cabs.s
        hypot(x,y)              ...in files IEEE/cabs.c,  VAX/cabs.s
        cbrt(x)                 ...in files IEEE/cbrt.c,  VAX/cbrt.s
        exp(x)                  ...in file  exp.c
        expm1(x):=exp(x)-1      ...in file  expm1.c
        log(x)                  ...in file  log.c
        log10(x)                ...in file  log10.c
        log1p(x):=log(1+x)      ...in file  log1p.c
        pow(x,y)                ...in file  pow.c
        sinh(x)                 ...in file  sinh.c
        cosh(x)                 ...in file  cosh.c
        tanh(x)                 ...in file  tanh.c
        asinh(x)                ...in file  asinh.c
        acosh(x)                ...in file  acosh.c
        atanh(x)                ...in file  atanh.c

    (B). Kernel functions:
        exp__E(x,c) ...in file exp__E.c, used by expm1/exp/pow/cosh
        log__L(s)   ...in file log__L.c, used by log1p/log/pow
        libm$argred ...in file VAX/argred.s, used by VAX version of sin/cos/tan

    (C). System-supported functions:
        sqrt()      ...in files IEEE/support.c, VAX/sqrt.s
        drem()      ...in files IEEE/support.c, VAX/support.s
        finite()    ...in files IEEE/support.c, VAX/support.s
        logb()      ...in files IEEE/support.c, VAX/support.s
        scalb()     ...in files IEEE/support.c, VAX/support.s
        copysign()  ...in files IEEE/support.c, VAX/support.s
        rint()      ...in file  floor.c


   Notes:
       i. The code in files ending with ".s" is written in VAX assembly
          language and is intended for VAX computers.

          Files that end with ".c" are written in C. They are intended
          for either a VAX or a machine that conforms to the IEEE
          standard 754 for double precision floating-point arithmetic.

      ii. On machines other than a VAX or an IEEE machine, use the original
          math library, formerly "/usr/lib/libm.a", now "/usr/lib/libom.a",
          if nothing better is available.

     iii. The trigonometric functions sin/cos/tan/atan2 in files "VAX/sincos.s",
          "VAX/tan.s" and "VAX/atan2.s" are different from those in
          "IEEE/trig.c" and "IEEE/atan2.c".  The VAX assembler code uses the
          true value of pi to perform argument reduction, while the C code uses
          a machine value of PI (see "IEEE/trig.c").


2. A computer system that conforms to IEEE standard 754 should provide
                sqrt(x),
                drem(x,p), (double precision remainder function)
                copysign(x,y),
                finite(x),
                scalb(x,N),
                logb(x) and
                rint(x).
   These functions are either required or recommended by the standard.
   For convenience, a (slow) C implementation of these functions is
   provided in the file "IEEE/support.c".

   Warning: The functions in IEEE/support.c are somewhat machine-dependent.
   Some modifications may be necessary to run them on a different machine.
   Currently, if compiled with a suitable flag, "IEEE/support.c" will work
   on a National 32000, a Zilog 8000, a VAX, and a SUN (cf. the "Makefile"
   in this directory). Invoke the C compiler thus:

        cc -c -DVAX IEEE/support.c              ... on a VAX, D-format
        cc -c -DNATIONAL IEEE/support.c         ... on a National 32000
        cc -c  IEEE/support.c                   ... on other IEEE machines,
                                                    we hope.

   Notes:
      1. Faster versions of "drem" and "sqrt" for IEEE double precision
         (coded in C but intended for assembly language) are given at the
         end of "IEEE/support.c" but commented out since they require certain
         machine-dependent functions.

      2. A fast VAX assembler version of the system-supported functions
         copysign(), logb(), scalb(), finite(), and drem() appears in file
         "VAX/support.s".  A fast VAX assembler version of sqrt() is in
         file "VAX/sqrt.s".

3. Two formats are supported by all the standard elementary functions:
   the VAX D-format (56-bit precision), and the IEEE double format
   (53-bit precision).  The cbrt() in "IEEE/cbrt.c" is for IEEE machines
   only. The functions in files that end with ".s" are for VAX computers
   only. The functions in files that end with ".c" (except "IEEE/cbrt.c")
   are for VAX and IEEE machines. To use the VAX D-format, compile the code
   with -DVAX; to use IEEE double format on various IEEE machines, see
   "Makefile" in this directory.

    Example:
        cc -c -DVAX sin.c               ... for VAX D-format

       Warning: The values of floating-point constants used in the code are
                given in both hexadecimal and decimal.  The hexadecimal values
                are the intended ones. The decimal values may be used provided
                that the compiler converts from decimal to binary accurately
                enough to produce the hexadecimal values shown. If the
                conversion is inaccurate, then one must know the exact machine
                representation of the constants and alter the assembly
                language output from the compiler, or play tricks like
                the following in a C program.

                        Example: to store the floating-point constant

                             p1= 2^-6 * .F83ABE67E1066A (Hexadecimal)

                        on a VAX in C, we use two longwords to store its
                        machine value and define p1 to be the double constant
                        at the location of these two longwords:

                        static long  p1x[] = { 0x3abe3d78, 0x066a67e1};
                        #define      p1      (*(double*)p1x)

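    The two longwords in the p1 example can be checked on a non-VAX machine
    by decoding them by hand. The sketch below assumes the usual VAX
    D_floating layout (sign bit, excess-128 exponent, hidden leading
    fraction bit, 16-bit words stored most significant first) and a C99
    compiler. The function name vax_d_to_double is hypothetical and is not
    part of this library.

```c
#include <math.h>
#include <stdint.h>

/* Decode a VAX D-format number held in two 32-bit longwords, in the
 * order used by p1x[], into the nearest native double.
 * Assumed layout: the low 16 bits of the first longword hold the sign,
 * an excess-128 exponent and the top 7 fraction bits; the remaining
 * three 16-bit words hold successively lower fraction bits. */
double vax_d_to_double(uint32_t lw0, uint32_t lw1)
{
    uint32_t w0 = lw0 & 0xffffu;
    uint32_t w1 = lw0 >> 16;
    uint32_t w2 = lw1 & 0xffffu;
    uint32_t w3 = lw1 >> 16;
    int sign = (int)(w0 >> 15);
    int expo = (int)((w0 >> 7) & 0xffu);
    uint64_t frac;
    double value;

    if (expo == 0)
        return 0.0;  /* zero (a set sign bit would be a reserved operand) */

    /* 56-bit significand with the hidden leading bit restored */
    frac = ((uint64_t)(0x80u | (w0 & 0x7fu)) << 48)
         | ((uint64_t)w1 << 32) | ((uint64_t)w2 << 16) | (uint64_t)w3;

    /* value = 0.frac * 2^(expo-128); converting frac to double rounds the
     * 56-bit significand to 53 bits on an IEEE machine */
    value = ldexp((double)frac, expo - 128 - 56);
    return sign ? -value : value;
}
```

    Under these assumptions, vax_d_to_double(0x3abe3d78, 0x066a67e1)
    reassembles the significand F83ABE67E1066A with exponent -6, which
    matches the hexadecimal value given for p1.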
    Note:  On a VAX, some functions have two implementations. For example,
           cabs() has one in "IEEE/cabs.c", and another in "VAX/cabs.s".
           In such cases, the assembly language version is preferred.


4. Accuracy.

            The errors in expm1(), log1p(), exp(), log(), cabs(), hypot()
            and cbrt() are below 1 ULP (Unit in the Last Place).

            The error in pow(x,y) grows with the size of y. Nevertheless,
            for integers x and y, pow(x,y) returns the correct integer value
            on all tested machines (VAX, SUN, NATIONAL, ZILOG), provided that
            x to the power of y is representable exactly.

            cosh, sinh, acosh, asinh, tanh, atanh and log10 have errors below
            about 3 ULPs.

            For trigonometric and inverse trigonometric functions:

                Let [trig(x)] denote the value actually computed for trig(x).

                1) Those codes using the machine's value PI (true pi rounded):
                   (source codes: IEEE/{trig.c,atan2.c}, asincos.c and atan.c)

                   The errors in [sin(x)], [cos(x)], and [atan(x)] are below
                   1 ULP compared with sin(x*pi/PI), cos(x*pi/PI), and
                   atan(x)*PI/pi respectively, where PI is the machine's
                   value of pi rounded. [tan(x)] returns tan(x*pi/PI) within
                   about 2 ULPs; [acos(x)], [asin(x)], and [atan2(y,x)]
                   return acos(x)*PI/pi, asin(x)*PI/pi, and atan2(y,x)*PI/pi
                   respectively to similar accuracy.


                2) Those using true pi (for VAX D-format only):
                   (source codes: VAX/{sincos.s,tan.s,atan2.s}, asincos.c and
                   atan.c)

                   The errors in [sin(x)], [cos(x)], and [atan(x)] are below
                   1 ULP. [tan(x)], [atan2(y,x)], [acos(x)], and [asin(x)]
                   have errors below about 2 ULPs.


            Here are the results of some test runs to find worst errors on
            the VAX :

    tan   :  2.09 ULPs          ...1,024,000 random arguments (machine PI)
    sin   :  .861 ULPs          ...1,024,000 random arguments (machine PI)
    cos   :  .857 ULPs          ...1,024,000 random arguments (machine PI)
    (compared with tan, sin, cos of (x*pi/PI))

    acos  :  2.07 ULPs          .....200,000 random arguments (machine PI)
    asin  :  2.06 ULPs          .....200,000 random arguments (machine PI)
    atan2 :  1.41 ULPs          .....356,000 random arguments (machine PI)
    atan  :  0.86 ULPs          ...1,536,000 random arguments (machine PI)
    (compared with (PI/pi)*(atan, asin, acos, atan2 of x))

    tan   :  2.15 ULPs          ...1,024,000 random arguments (true pi)
    sin   :  .814 ULPs          ...1,024,000 random arguments (true pi)
    cos   :  .792 ULPs          ...1,024,000 random arguments (true pi)
    acos  :  2.15 ULPs          ...1,024,000 random arguments (true pi)
    asin  :  1.99 ULPs          ...1,024,000 random arguments (true pi)
    atan2 :  1.48 ULPs          ...1,024,000 random arguments (true pi)
    atan  :  .850 ULPs          ...1,024,000 random arguments (true pi)

    acosh :  3.30 ULPs          .....512,000 random arguments
    asinh :  1.58 ULPs          .....512,000 random arguments
    atanh :  1.71 ULPs          .....512,000 random arguments
    cosh  :  1.23 ULPs          .....768,000 random arguments
    sinh  :  1.93 ULPs          ...1,024,000 random arguments
    tanh  :  2.22 ULPs          ...1,024,000 random arguments
    log10 :  1.74 ULPs          ...1,536,000 random arguments
    pow   :  1.79 ULPs          .....100,000 random arguments, 0 < x, y < 20.

    exp   :  .768 ULPs          ...1,156,000 random arguments
    expm1 :  .844 ULPs          ...1,166,000 random arguments
    log1p :  .846 ULPs          ...1,536,000 random arguments
    log   :  .826 ULPs          ...1,536,000 random arguments
    cabs  :  .959 ULPs          .....500,000 random arguments
    cbrt  :  .666 ULPs          ...5,120,000 random arguments


5. Speed.

        Some functions coded in VAX assembly language (cabs(), hypot() and
        sqrt()) are significantly faster than the corresponding ones in 4.2BSD.
        In general, to improve performance, all functions in "IEEE/support.c"
        should be written in assembly language and, whenever possible, should
        be called via short subroutine calls.


6. j0, j1, jn.

        The only modification to these routines is in how an invalid
        floating-point operation is signaled.