1\input texinfo    @c -*-texinfo-*-
2@c %**start of header
3@setfilename gmp.info
4@documentencoding ISO-8859-1
5@include version.texi
6@settitle GNU MP @value{VERSION}
7@synindex tp fn
8@iftex
9@afourpaper
10@end iftex
11@comment %**end of header
12
13@copying
14This manual describes how to install and use the GNU multiple precision
15arithmetic library, version @value{VERSION}.
16
17Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
182003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 Free Software
19Foundation, Inc.
20
21Permission is granted to copy, distribute and/or modify this document under
22the terms of the GNU Free Documentation License, Version 1.3 or any later
23version published by the Free Software Foundation; with no Invariant Sections,
24with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
25Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
26software''.  A copy of the license is included in
27@ref{GNU Free Documentation License}.
28@end copying
29@c  Note the @ref above must be on one line, a line break in an @ref within
30@c  @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes
31@c  with texinfo 4.7), with messages about missing @endcsname.
32
33
34@c  Texinfo version 4.2 or up will be needed to process this file.
35@c
36@c  The version number and edition number are taken from version.texi provided
37@c  by automake (note that it's regenerated only if you configure with
38@c  --enable-maintainer-mode).
39@c
@c  Notes discussing the present version number of GMP in relation to previous
@c  ones (for instance in the "Compatibility" section) must be updated
@c  manually though.
43@c
44@c  @cindex entries have been made for function categories and programming
45@c  topics.  The "mpn" section is not included in this, because a beginner
46@c  looking for "GCD" or something is only going to be confused by pointers to
47@c  low level routines.
48@c
49@c  @cindex entries are present for processors and systems when there's
50@c  particular notes concerning them, but not just for everything GMP
51@c  supports.
52@c
53@c  Index entries for files use @code rather than @file, @samp or @option,
54@c  since the latter come out with quotes in TeX, which are nice in the text
55@c  but don't look so good in index columns.
56@c
57@c  Tex:
58@c
59@c  A suitable texinfo.tex is supplied, a newer one should work equally well.
60@c
61@c  HTML:
62@c
63@c  Nothing special is done for links to external manuals, they just come out
64@c  in the usual makeinfo style, eg. "../libc/Locales.html".  If you have
65@c  local copies of such manuals then this is a good thing, if not then you
66@c  may want to search-and-replace to some online source.
67@c
68
69@dircategory GNU libraries
70@direntry
71* gmp: (gmp).                   GNU Multiple Precision Arithmetic Library.
72@end direntry
73
74@c  html <meta name="description" content="...">
75@documentdescription
76How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}.
77@end documentdescription
78
79@c smallbook
80@finalout
81@setchapternewpage on
82
83@ifnottex
84@node Top, Copying, (dir), (dir)
85@top GNU MP
86@end ifnottex
87
88@iftex
89@titlepage
90@title GNU MP
91@subtitle The GNU Multiple Precision Arithmetic Library
92@subtitle Edition @value{EDITION}
93@subtitle @value{UPDATED}
94
95@author by Torbj@"orn Granlund and the GMP development team
96@c @email{tg@@gmplib.org}
97
98@c Include the Distribution inside the titlepage so
99@c that headings are turned off.
100
101@tex
102\global\parindent=0pt
103\global\parskip=8pt
104\global\baselineskip=13pt
105@end tex
106
107@page
108@vskip 0pt plus 1filll
109@end iftex
110
111@insertcopying
112@ifnottex
113@sp 1
114@end ifnottex
115
116@iftex
117@end titlepage
118@headings double
119@end iftex
120
121@c  Don't bother with contents for html, the menus seem adequate.
122@ifnothtml
123@contents
124@end ifnothtml
125
126@menu
127* Copying::                    GMP Copying Conditions (LGPL).
128* Introduction to GMP::        Brief introduction to GNU MP.
129* Installing GMP::             How to configure and compile the GMP library.
130* GMP Basics::                 What every GMP user should know.
131* Reporting Bugs::             How to usefully report bugs.
132* Integer Functions::          Functions for arithmetic on signed integers.
133* Rational Number Functions::  Functions for arithmetic on rational numbers.
134* Floating-point Functions::   Functions for arithmetic on floats.
135* Low-level Functions::        Fast functions for natural numbers.
136* Random Number Functions::    Functions for generating random numbers.
137* Formatted Output::           @code{printf} style output.
138* Formatted Input::            @code{scanf} style input.
139* C++ Class Interface::        Class wrappers around GMP types.
140* Custom Allocation::          How to customize the internal allocation.
141* Language Bindings::          Using GMP from other languages.
142* Algorithms::                 What happens behind the scenes.
143* Internals::                  How values are represented behind the scenes.
144
145* Contributors::               Who brings you this library?
146* References::                 Some useful papers and books to read.
147* GNU Free Documentation License::
148* Concept Index::
149* Function Index::
150@end menu
151
152
153@c  @m{T,N} is $T$ in tex or @math{N} otherwise.  This is an easy way to give
154@c  different forms for math in tex and info.  Commas in N or T don't work,
155@c  but @C{} can be used instead.  \, works in info but not in tex.
156@iftex
157@macro m {T,N}
158@tex$\T\$@end tex
159@end macro
160@end iftex
161@ifnottex
162@macro m {T,N}
163@math{\N\}
164@end macro
165@end ifnottex
166
167@macro C {}
168,
169@end macro
170
171@c  @ms{V,N} is $V_N$ in tex or just vn otherwise.  This suits simple
172@c  subscripts like @ms{x,0}.
173@iftex
174@macro ms {V,N}
175@tex$\V\_{\N\}$@end tex
176@end macro
177@end iftex
178@ifnottex
179@macro ms {V,N}
180\V\\N\
181@end macro
182@end ifnottex
183
184@c  @nicode{S} is plain S in info, or @code{S} elsewhere.  This can be used
185@c  when the quotes that @code{} gives in info aren't wanted, but the
186@c  fontification in tex or html is wanted.  Doesn't work as @nicode{'\\0'}
187@c  though (gives two backslashes in tex).
188@ifinfo
189@macro nicode {S}
190\S\
191@end macro
192@end ifinfo
193@ifnotinfo
194@macro nicode {S}
195@code{\S\}
196@end macro
197@end ifnotinfo
198
199@c  @nisamp{S} is plain S in info, or @samp{S} elsewhere.  This can be used
200@c  when the quotes that @samp{} gives in info aren't wanted, but the
201@c  fontification in tex or html is wanted.
202@ifinfo
203@macro nisamp {S}
204\S\
205@end macro
206@end ifinfo
207@ifnotinfo
208@macro nisamp {S}
209@samp{\S\}
210@end macro
211@end ifnotinfo
212
213@c  Usage: @GMPtimes{}
214@c  Give either \times or the word "times".
215@tex
216\gdef\GMPtimes{\times}
217@end tex
218@ifnottex
219@macro GMPtimes
220times
221@end macro
222@end ifnottex
223
224@c  Usage: @GMPmultiply{}
225@c  Give * in info, or nothing in tex.
226@tex
227\gdef\GMPmultiply{}
228@end tex
229@ifnottex
230@macro GMPmultiply
231*
232@end macro
233@end ifnottex
234
235@c  Usage: @GMPabs{x}
236@c  Give either |x| in tex, or abs(x) in info or html.
237@tex
238\gdef\GMPabs#1{|#1|}
239@end tex
240@ifnottex
241@macro GMPabs {X}
242@abs{}(\X\)
243@end macro
244@end ifnottex
245
246@c  Usage: @GMPfloor{x}
247@c  Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
248@tex
249\gdef\GMPfloor#1{\lfloor #1\rfloor}
250@end tex
251@ifnottex
252@macro GMPfloor {X}
253floor(\X\)
254@end macro
255@end ifnottex
256
257@c  Usage: @GMPceil{x}
258@c  Give either \lceil x\rceil in tex, or ceil(x) in info or html.
259@tex
260\gdef\GMPceil#1{\lceil #1 \rceil}
261@end tex
262@ifnottex
263@macro GMPceil {X}
264ceil(\X\)
265@end macro
266@end ifnottex
267
268@c  Math operators already available in tex, made available in info too.
269@c  For example @bmod{} can be used in both tex and info.
270@ifnottex
271@macro bmod
272mod
273@end macro
274@macro gcd
275gcd
276@end macro
277@macro ge
278>=
279@end macro
280@macro le
281<=
282@end macro
283@macro log
284log
285@end macro
286@macro min
287min
288@end macro
289@macro leftarrow
290<-
291@end macro
292@macro rightarrow
293->
294@end macro
295@end ifnottex
296
297@c  New math operators.
298@c  @abs{} can be used in both tex and info, or just \abs in tex.
299@tex
300\gdef\abs{\mathop{\rm abs}}
301@end tex
302@ifnottex
303@macro abs
304abs
305@end macro
306@end ifnottex
307
308@c  @cross{} is a \times symbol in tex, or an "x" in info.  In tex it works
309@c  inside or outside $ $.
310@tex
311\gdef\cross{\ifmmode\times\else$\times$\fi}
312@end tex
313@ifnottex
314@macro cross
315x
316@end macro
317@end ifnottex
318
319@c  @times{} made available as a "*" in info and html (already works in tex).
320@ifnottex
321@macro times
322*
323@end macro
324@end ifnottex
325
326@c  Usage: @W{text}
327@c  Like @w{} but working in math mode too.
328@tex
329\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}
330@end tex
331@ifnottex
332@macro W {S}
333@w{\S\}
334@end macro
335@end ifnottex
336
337@c  Usage: \GMPdisplay{text}
338@c  Put the given text in an @display style indent, but without turning off
339@c  paragraph reflow etc.
340@tex
341\gdef\GMPdisplay#1{%
342\noindent
343\advance\leftskip by \lispnarrowing
344#1\par}
345@end tex
346
347@c  Usage: \GMPhat
348@c  A new \hat that will work in math mode, unlike the texinfo redefined
349@c  version.
350@tex
351\gdef\GMPhat{\mathaccent"705E}
352@end tex
353
354@c  Usage: \GMPraise{text}
355@c  For use in a $ $ math expression as an alternative to "^".  This is good
356@c  for @code{} in an exponent, since there seems to be no superscript font
357@c  for that.
358@tex
359\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}
360@end tex
361
362@c  Usage: @texlinebreak{}
363@c  A line break as per @*, but only in tex.
364@iftex
365@macro texlinebreak
366@*
367@end macro
368@end iftex
369@ifnottex
370@macro texlinebreak
371@end macro
372@end ifnottex
373
374@c  Usage: @maybepagebreak
375@c  Allow tex to insert a page break, if it feels the urge.
376@c  Normally blocks of @deftypefun/funx are kept together, which can lead to
377@c  some poor page break positioning if it's a big block, like the sets of
378@c  division functions etc.
379@tex
380\gdef\maybepagebreak{\penalty0}
381@end tex
382@ifnottex
383@macro maybepagebreak
384@end macro
385@end ifnottex
386
387@c  Usage: @GMPreftop{info,title}
388@c  Usage: @GMPpxreftop{info,title}
389@c
390@c  Like @ref{} and @pxref{}, but designed for a reference to the top of a
391@c  document, not a particular section.  The TeX output for plain @ref insists
392@c  on printing a particular section, GMPreftop gives just the title.
393@c
394@c  The texinfo manual recommends putting a likely section name in references
395@c  like this, eg. "Introduction", but it seems better to just give the title.
396@c
397@iftex
398@macro GMPreftop{info,title}
399@i{\title\}
400@end macro
401@macro GMPpxreftop{info,title}
402see @i{\title\}
403@end macro
404@end iftex
405@c
406@ifnottex
407@macro GMPreftop{info,title}
408@ref{Top,\title\,\title\,\info\,\title\}
409@end macro
410@macro GMPpxreftop{info,title}
411@pxref{Top,\title\,\title\,\info\,\title\}
412@end macro
413@end ifnottex
414
415
416@node Copying, Introduction to GMP, Top, Top
417@comment  node-name, next, previous,  up
418@unnumbered GNU MP Copying Conditions
419@cindex Copying conditions
420@cindex Conditions for copying GNU MP
421@cindex License conditions
422
423This library is @dfn{free}; this means that everyone is free to use it and
424free to redistribute it on a free basis.  The library is not in the public
425domain; it is copyrighted and there are restrictions on its distribution, but
426these restrictions are designed to permit everything that a good cooperating
427citizen would want to do.  What is not allowed is to try to prevent others
428from further sharing any version of this library that they might get from
429you.@refill
430
431Specifically, we want to make sure that you have the right to give away copies
432of the library, that you receive source code or else can get it if you want
433it, that you can change this library or use pieces of it in new free programs,
434and that you know you can do these things.@refill
435
436To make sure that everyone has such rights, we have to forbid you to deprive
437anyone else of these rights.  For example, if you distribute copies of the GNU
438MP library, you must give the recipients all the rights that you have.  You
439must make sure that they, too, receive or can get the source code.  And you
440must tell them their rights.@refill
441
442Also, for our own protection, we must make certain that everyone finds out
443that there is no warranty for the GNU MP library.  If it is modified by
444someone else and passed on, we want their recipients to know that what they
445have is not what we distributed, so that any problems introduced by others
446will not reflect on our reputation.@refill
447
448The precise conditions of the license for the GNU MP library are found in the
449Lesser General Public License version 3 that accompanies the source code,
450see @file{COPYING.LIB}.  Certain demonstration programs are provided under the
451terms of the plain General Public License version 3, see @file{COPYING}.
452
453
454@node Introduction to GMP, Installing GMP, Copying, Top
455@comment  node-name,  next,  previous,  up
456@chapter Introduction to GNU MP
457@cindex Introduction
458
459GNU MP is a portable library written in C for arbitrary precision arithmetic
460on integers, rational numbers, and floating-point numbers.  It aims to provide
461the fastest possible arithmetic for all applications that need higher
462precision than is directly supported by the basic C types.
463
464Many applications use just a few hundred bits of precision; but some
465applications may need thousands or even millions of bits.  GMP is designed to
466give good performance for both, by choosing algorithms based on the sizes of
467the operands, and by carefully keeping the overhead at a minimum.
468
The speed of GMP is achieved by using full words as the basic arithmetic type,
by using sophisticated algorithms, by including carefully optimized assembly
code for the most common inner loops for many different CPUs, and by a general
emphasis on speed (as opposed to simplicity or elegance).
473
474There is assembly code for these CPUs:
475@cindex CPU types
476ARM,
477DEC Alpha 21064, 21164, and 21264,
478AMD 29000,
479AMD K6, K6-2, Athlon, and Athlon64,
480Hitachi SuperH and SH-2,
481HPPA 1.0, 1.1 and 2.0,
482Intel Pentium, Pentium Pro/II/III, Pentium 4, generic x86,
483Intel IA-64, i960,
484Motorola MC68000, MC68020, MC88100, and MC88110,
485Motorola/IBM PowerPC 32 and 64,
486National NS32000,
487IBM POWER,
488MIPS R3000, R4000,
489SPARCv7, SuperSPARC, generic SPARCv8, UltraSPARC,
490DEC VAX,
491and
492Zilog Z8000.
493Some optimizations also for
494Cray vector systems,
495Clipper,
496IBM ROMP (RT),
497and
498Pyramid AP/XP.
499
500@cindex Home page
501@cindex Web page
502@noindent
503For up-to-date information on GMP, please see the GMP web pages at
504
505@display
506@uref{http://gmplib.org/}
507@end display
508
509@cindex Latest version of GMP
510@cindex Anonymous FTP of latest version
511@cindex FTP of latest version
512@noindent
513The latest version of the library is available at
514
515@display
516@uref{ftp://ftp.gnu.org/gnu/gmp/}
517@end display
518
Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror
near you.  See @uref{http://www.gnu.org/order/ftp.html} for a full list.
521
522@cindex Mailing lists
There are three public mailing lists of interest: one for release
announcements, one for general questions and discussions about usage of the GMP
library, and one for bug reports.  For more information, see
526
527@display
528@uref{http://gmplib.org/mailman/listinfo/}.
529@end display
530
531The proper place for bug reports is @email{gmp-bugs@@gmplib.org}.  See
532@ref{Reporting Bugs} for information about reporting bugs.
533
534@sp 1
535@section How to use this Manual
536@cindex About this manual
537
538Everyone should read @ref{GMP Basics}.  If you need to install the library
539yourself, then read @ref{Installing GMP}.  If you have a system with multiple
540ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
541on applications.
542
543The rest of the manual can be used for later reference, although it is
544probably a good idea to glance through it.
545
546
547@node Installing GMP, GMP Basics, Introduction to GMP, Top
548@comment  node-name,  next,  previous,  up
549@chapter Installing GMP
550@cindex Installing GMP
551@cindex Configuring GMP
552@cindex Building GMP
553
554GMP has an autoconf/automake/libtool based configuration system.  On a
555Unix-like system a basic build can be done with
556
557@example
558./configure
559make
560@end example
561
562@noindent
563Some self-tests can be run with
564
565@example
566make check
567@end example
568
569@noindent
570And you can install (under @file{/usr/local} by default) with
571
572@example
573make install
574@end example
575
576If you experience problems, please report them to @email{gmp-bugs@@gmplib.org}.
577See @ref{Reporting Bugs}, for information on what to include in useful bug
578reports.
579
580@menu
581* Build Options::
582* ABI and ISA::
583* Notes for Package Builds::
584* Notes for Particular Systems::
585* Known Build Problems::
586* Performance optimization::
587@end menu
588
589
590@node Build Options, ABI and ISA, Installing GMP, Installing GMP
591@section Build Options
592@cindex Build options
593
All the usual autoconf configure options are available; run @samp{./configure
--help} for a summary.  The file @file{INSTALL.autoconf} has some generic
installation information too.
597
598@table @asis
599@item Tools
600@cindex Non-Unix systems
601@samp{configure} requires various Unix-like tools.  See @ref{Notes for
602Particular Systems}, for some options on non-Unix systems.
603
It might be possible to build without the help of @samp{configure}; certainly
all the code is there, but unfortunately you'll be on your own.
606
607@item Build Directory
608@cindex Build directory
609To compile in a separate build directory, @command{cd} to that directory, and
610prefix the configure command with the path to the GMP source directory.  For
611example
612
613@example
614cd /my/build/dir
615/my/sources/gmp-@value{VERSION}/configure
616@end example
617
Not all @samp{make} programs have the necessary features (@code{VPATH}) to
support this.  In particular, SunOS and Solaris @command{make} have bugs that
make them unable to build in a separate directory.  Use GNU @command{make}
instead.
622
623@item @option{--prefix} and @option{--exec-prefix}
624@cindex Prefix
625@cindex Exec prefix
626@cindex Install prefix
627@cindex @code{--prefix}
628@cindex @code{--exec-prefix}
629The @option{--prefix} option can be used in the normal way to direct GMP to
630install under a particular tree.  The default is @samp{/usr/local}.
631
632@option{--exec-prefix} can be used to direct architecture-dependent files like
633@file{libgmp.a} to a different location.  This can be used to share
634architecture-independent parts like the documentation, but separate the
635dependent parts.  Note however that @file{gmp.h} and @file{mp.h} are
636architecture-dependent since they encode certain aspects of @file{libgmp}, so
637it will be necessary to ensure both @file{$prefix/include} and
638@file{$exec_prefix/include} are available to the compiler.
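
As a minimal sketch of such a split installation, assuming hypothetical
directories under @file{/opt} and a hypothetical application source file
@file{foo.c}, the build and a later application compile might look like the
following.  The paths are illustrative only; the point is that both include
directories are given to the compiler.

@example
./configure --prefix=/opt/gmp --exec-prefix=/opt/gmp/x86_64-linux
make
make install
cc -I/opt/gmp/include -I/opt/gmp/x86_64-linux/include foo.c \
   -L/opt/gmp/x86_64-linux/lib -lgmp
@end example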
639
640@item @option{--disable-shared}, @option{--disable-static}
641@cindex @code{--disable-shared}
642@cindex @code{--disable-static}
643By default both shared and static libraries are built (where possible), but
644one or other can be disabled.  Shared libraries result in smaller executables
645and permit code sharing between separate running processes, but on some CPUs
646are slightly slower, having a small cost on each function call.
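
For illustration, either of the following invocations could be used; they are
alternatives, not a sequence.  The first builds only the static library, the
second only the shared library.  Which one suits a given system is left to the
builder.

@example
./configure --disable-shared
@end example

@example
./configure --disable-static
@end example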
647
648@item Native Compilation, @option{--build=CPU-VENDOR-OS}
649@cindex Native compilation
650@cindex Build system
651@cindex @code{--build}
652For normal native compilation, the system can be specified with
653@samp{--build}.  By default @samp{./configure} uses the output from running
654@samp{./config.guess}.  On some systems @samp{./config.guess} can determine
655the exact CPU type, on others it will be necessary to give it explicitly.  For
656example,
657
658@example
659./configure --build=ultrasparc-sun-solaris2.7
660@end example
661
662In all cases the @samp{OS} part is important, since it controls how libtool
663generates shared libraries.  Running @samp{./config.guess} is the simplest way
664to see what it should be, if you don't know already.
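
A small sketch of that workflow, run from the top of the GMP source tree, is
to print the guessed system type first and then pass a type explicitly if the
guess is not specific enough.  The system type shown is simply the one from
the example above and is illustrative only.

@example
./config.guess
./configure --build=ultrasparc-sun-solaris2.7
@end example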
665
666@item Cross Compilation, @option{--host=CPU-VENDOR-OS}
667@cindex Cross compiling
668@cindex Host system
669@cindex @code{--host}
670When cross-compiling, the system used for compiling is given by @samp{--build}
671and the system where the library will run is given by @samp{--host}.  For
672example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,
673
674@example
675./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
676@end example
677
Compiler tools are sought first with the host system type as a prefix.  For
example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
@command{ranlib}.  This makes it possible for a set of cross-compiling tools
to co-exist with native tools.  The prefix is the argument to @samp{--host},
and this can be an alias, such as @samp{m68k-linux}.  But note that tools
don't have to be set up this way; it's enough to just have a @env{PATH} with a
suitable cross-compiling @command{cc} etc.
685
686Compiling for a different CPU in the same family as the build system is a form
687of cross-compilation, though very possibly this would merely be special
688options on a native compiler.  In any case @samp{./configure} avoids depending
689on being able to run code on the build system, which is important when
690creating binaries for a newer CPU since they very possibly won't run on the
691build system.
692
693In all cases the compiler must be able to produce an executable (of whatever
694format) from a standard C @code{main}.  Although only object files will go to
695make up @file{libgmp}, @samp{./configure} uses linking tests for various
696purposes, such as determining what functions are available on the host system.
697
698Currently a warning is given unless an explicit @samp{--build} is used when
699cross-compiling, because it may not be possible to correctly guess the build
700system type if the @env{PATH} has only a cross-compiling @command{cc}.
701
702Note that the @samp{--target} option is not appropriate for GMP@.  It's for use
703when building compiler tools, with @samp{--host} being where they will run,
704and @samp{--target} what they'll produce code for.  Ordinary programs or
705libraries like GMP are only interested in the @samp{--host} part, being where
706they'll run.  (Some past versions of GMP used @samp{--target} incorrectly.)
707
708@item CPU types
709@cindex CPU types
710In general, if you want a library that runs as fast as possible, you should
711configure GMP for the exact CPU type your system uses.  However, this may mean
712the binaries won't run on older members of the family, and might run slower on
713other members, older or newer.  The best idea is always to build GMP for the
714exact machine type you intend to run it on.
715
716The following CPUs have specific support.  See @file{configure.ac} for details
717of what code and compiler options they select.
718
719@itemize @bullet
720
721@c Keep this formatting, it's easy to read and it can be grepped to
722@c automatically test that CPUs listed get through ./config.sub
723
724@item
Alpha:
@nisamp{alpha},
@nisamp{alphaev5},
@nisamp{alphaev56},
@nisamp{alphapca56},
@nisamp{alphapca57},
@nisamp{alphaev6},
@nisamp{alphaev67},
@nisamp{alphaev68},
@nisamp{alphaev7}
735
736@item
737Cray:
738@nisamp{c90},
739@nisamp{j90},
740@nisamp{t90},
741@nisamp{sv1}
742
743@item
744HPPA:
745@nisamp{hppa1.0},
746@nisamp{hppa1.1},
747@nisamp{hppa2.0},
748@nisamp{hppa2.0n},
749@nisamp{hppa2.0w},
750@nisamp{hppa64}
751
752@item
753IA-64:
754@nisamp{ia64},
755@nisamp{itanium},
756@nisamp{itanium2}
757
758@item
759MIPS:
760@nisamp{mips},
761@nisamp{mips3},
762@nisamp{mips64}
763
764@item
765Motorola:
766@nisamp{m68k},
767@nisamp{m68000},
768@nisamp{m68010},
769@nisamp{m68020},
770@nisamp{m68030},
771@nisamp{m68040},
772@nisamp{m68060},
773@nisamp{m68302},
774@nisamp{m68360},
775@nisamp{m88k},
776@nisamp{m88110}
777
778@item
779POWER:
780@nisamp{power},
781@nisamp{power1},
782@nisamp{power2},
783@nisamp{power2sc}
784
785@item
786PowerPC:
787@nisamp{powerpc},
788@nisamp{powerpc64},
789@nisamp{powerpc401},
790@nisamp{powerpc403},
791@nisamp{powerpc405},
792@nisamp{powerpc505},
793@nisamp{powerpc601},
794@nisamp{powerpc602},
795@nisamp{powerpc603},
796@nisamp{powerpc603e},
797@nisamp{powerpc604},
798@nisamp{powerpc604e},
799@nisamp{powerpc620},
800@nisamp{powerpc630},
801@nisamp{powerpc740},
802@nisamp{powerpc7400},
803@nisamp{powerpc7450},
804@nisamp{powerpc750},
805@nisamp{powerpc801},
806@nisamp{powerpc821},
807@nisamp{powerpc823},
808@nisamp{powerpc860},
809@nisamp{powerpc970}
810
811@item
812SPARC:
813@nisamp{sparc},
814@nisamp{sparcv8},
815@nisamp{microsparc},
816@nisamp{supersparc},
817@nisamp{sparcv9},
818@nisamp{ultrasparc},
819@nisamp{ultrasparc2},
820@nisamp{ultrasparc2i},
821@nisamp{ultrasparc3},
822@nisamp{sparc64}
823
824@item
825x86 family:
826@nisamp{i386},
827@nisamp{i486},
828@nisamp{i586},
829@nisamp{pentium},
830@nisamp{pentiummmx},
831@nisamp{pentiumpro},
832@nisamp{pentium2},
833@nisamp{pentium3},
834@nisamp{pentium4},
835@nisamp{k6},
836@nisamp{k62},
837@nisamp{k63},
838@nisamp{athlon},
839@nisamp{amd64},
840@nisamp{viac3},
841@nisamp{viac32}
842
843@item
844Other:
845@nisamp{a29k},
846@nisamp{arm},
847@nisamp{clipper},
848@nisamp{i960},
849@nisamp{ns32k},
850@nisamp{pyramid},
851@nisamp{sh},
852@nisamp{sh2},
853@nisamp{vax},
854@nisamp{z8k}
855@end itemize
856
857CPUs not listed will use generic C code.
858
859@item Generic C Build
860@cindex Generic C
If some of the assembly code causes problems, or if otherwise desired, the
generic C code can be selected with the configure option
@option{--disable-assembly}.
863
864Note that this will run quite slowly, but it should be portable and should at
865least make it possible to get something running if all else fails.
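
For instance, a fallback build using only the C code, followed by the
self-tests, might be run as below.  This is a sketch that uses only the option
and @samp{make} targets already described in this chapter.

@example
./configure --disable-assembly
make
make check
@end example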
866
867@item Fat binary, @option{--enable-fat}
868@cindex Fat binary
@cindex @code{--enable-fat}
870Using @option{--enable-fat} selects a ``fat binary'' build on x86, where
871optimized low level subroutines are chosen at runtime according to the CPU
872detected.  This means more code, but gives good performance on all x86 chips.
873(This option might become available for more architectures in the future.)
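
A fat build is requested in the same way as any other configure option.  The
following is a minimal sketch, assuming an x86 build system on which the
option is supported.

@example
./configure --enable-fat
make
make check
@end example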
874
875@item @option{ABI}
876@cindex ABI
877On some systems GMP supports multiple ABIs (application binary interfaces),
878meaning data type sizes and calling conventions.  By default GMP chooses the
879best ABI available, but a particular ABI can be selected.  For example
880
881@example
882./configure --host=mips64-sgi-irix6 ABI=n32
883@end example
884
885See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
886applications need to do.
887
888@item @option{CC}, @option{CFLAGS}
889@cindex C compiler
890@cindex @code{CC}
891@cindex @code{CFLAGS}
892By default the C compiler used is chosen from among some likely candidates,
893with @command{gcc} normally preferred if it's present.  The usual
894@samp{CC=whatever} can be passed to @samp{./configure} to choose something
895different.
896
897For various systems, default compiler flags are set based on the CPU and
898compiler.  The usual @samp{CFLAGS="-whatever"} can be passed to
899@samp{./configure} to use something different or to set good flags for systems
900GMP doesn't otherwise know.
901
902The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
903and can be found in each generated @file{Makefile}.  This is the easiest way
904to check the defaults when considering changing or adding something.
905
906Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
907supporting multiple ABIs it's important to give an explicit
908@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
909won't be able to select the correct assembly code.
910
911If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
912compiler will be used (if GMP recognises it).  For example @samp{CC=gcc} can
913be used to force the use of GCC, with default flags (and default ABI).
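
As a sketch of the combination described above, the following selects a
compiler, flags and ABI together.  The values are assumptions chosen for
illustration: @samp{ABI=32} with @samp{-m32} is the kind of setting that might
be wanted for a 32-bit build with GCC on an x86-64 system, and other systems
will need different values (@pxref{ABI and ISA}).

@example
./configure ABI=32 CC=gcc CFLAGS="-m32 -O2"
@end example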
914
915@item @option{CPPFLAGS}
916@cindex @code{CPPFLAGS}
917Any flags like @samp{-D} defines or @samp{-I} includes required by the
918preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
919Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
920preprocessing uses just @samp{CPPFLAGS}.  This distinction is because most
921preprocessors won't accept all the flags the compiler does.  Preprocessing is
922done separately in some configure tests.
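
A minimal sketch of keeping the two kinds of flags apart is shown below.  The
include directory and the @samp{FOO} define are hypothetical, standing in for
whatever the preprocessor needs on a particular system.

@example
./configure CPPFLAGS="-I/opt/local/include -DFOO=1" CFLAGS="-O2"
@end example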
923
924@item @option{CC_FOR_BUILD}
925@cindex @code{CC_FOR_BUILD}
926Some build-time programs are compiled and run to generate host-specific data
927tables.  @samp{CC_FOR_BUILD} is the compiler used for this.  It doesn't need
928to be in any particular ABI or mode, it merely needs to generate executables
929that can run.  The default is to try the selected @samp{CC} and some likely
930candidates such as @samp{cc} and @samp{gcc}, looking for something that works.
931
932No flags are used with @samp{CC_FOR_BUILD} because a simple invocation like
933@samp{cc foo.c} should be enough.  If some particular options are required
934they can be included as for instance @samp{CC_FOR_BUILD="cc -whatever"}.
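
For illustration, a cross build that names the build-system compiler
explicitly might look like the following.  The system types are reused from
the cross-compilation example earlier in this section, and plain @samp{cc} is
only an assumption about what the build system provides.

@example
./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu \
    CC_FOR_BUILD=cc
@end example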
935
936@item C++ Support, @option{--enable-cxx}
937@cindex C++ support
938@cindex @code{--enable-cxx}
939C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
940C++ compiler will be required.  As a convenience @samp{--enable-cxx=detect}
941can be used to enable C++ support only if a compiler can be found.  The C++
942support consists of a library @file{libgmpxx.la} and header file
943@file{gmpxx.h} (@pxref{Headers and Libraries}).
944
945A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
946within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
947bloated by a dependency on the C++ standard library, and to avoid any chance
948that the C++ compiler could be required when linking plain C programs.
949
950@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
951only be expected to work with @file{libgmp.la} from the same GMP version.
952Future changes to the relevant internals will be accompanied by renaming, so a
953mismatch will cause unresolved symbols rather than perhaps mysterious
954misbehaviour.
955
956In general @file{libgmpxx.la} will be usable only with the C++ compiler that
957built it, since name mangling and runtime support are usually incompatible
958between different compilers.
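
As a sketch, enabling the C++ library and then linking an application against
it might look like the following.  The source file @file{prog.cc} is
hypothetical, and @command{g++} is assumed to be the same compiler that built
@file{libgmpxx.la}.

@example
./configure --enable-cxx
make
make check
make install
g++ prog.cc -lgmpxx -lgmp
@end example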
959
960@item @option{CXX}, @option{CXXFLAGS}
961@cindex C++ compiler
962@cindex @code{CXX}
963@cindex @code{CXXFLAGS}
964When C++ support is enabled, the C++ compiler and its flags can be set with
965variables @samp{CXX} and @samp{CXXFLAGS} in the usual way.  The default for
966@samp{CXX} is the first compiler that works from a list of likely candidates,
967with @command{g++} normally preferred when available.  The default for
968@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then
969for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers
970@samp{-g} or nothing.  Trying @samp{CFLAGS} this way is convenient when using
971@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will
972usually suit @samp{g++}.
973
974It's important that the C and C++ compilers match, meaning their startup and
975runtime support routines are compatible and that they generate code in the
976same ABI (if there's a choice of ABIs on the system).  @samp{./configure}
977isn't currently able to check these things very well itself, so for that
978reason @samp{--disable-cxx} is the default, to avoid a build failure due to a
979compiler mismatch.  Perhaps this will change in the future.
980
981Incidentally, it's normally not good enough to set @samp{CXX} to the same as
982@samp{CC}.  Although @command{gcc} for instance recognises @file{foo.cc} as
983C++ code, only @command{g++} will invoke the linker the right way when
984building an executable or shared library from C++ object files.
985
986@item Temporary Memory, @option{--enable-alloca=<choice>}
987@cindex Temporary memory
988@cindex Stack overflow
989@cindex @code{alloca}
990@cindex @code{--enable-alloca}
991GMP allocates temporary workspace using one of the following three methods,
992which can be selected with for instance
993@samp{--enable-alloca=malloc-reentrant}.
994
995@itemize @bullet
996@item
997@samp{alloca} - C library or compiler builtin.
998@item
999@samp{malloc-reentrant} - the heap, in a re-entrant fashion.
1000@item
1001@samp{malloc-notreentrant} - the heap, with global variables.
1002@end itemize
1003
1004For convenience, the following choices are also available.
1005@samp{--disable-alloca} is the same as @samp{no}.
1006
1007@itemize @bullet
1008@item
1009@samp{yes} - a synonym for @samp{alloca}.
1010@item
1011@samp{no} - a synonym for @samp{malloc-reentrant}.
1012@item
1013@samp{reentrant} - @code{alloca} if available, otherwise
1014@samp{malloc-reentrant}.  This is the default.
1015@item
1016@samp{notreentrant} - @code{alloca} if available, otherwise
1017@samp{malloc-notreentrant}.
1018@end itemize
1019
1020@code{alloca} is reentrant and fast, and is recommended.  It actually allocates
1021just small blocks on the stack; larger ones use malloc-reentrant.
1022
1023@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
1024but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
1025not required.
1026
1027The two malloc methods in fact use the memory allocation functions selected by
1028@code{mp_set_memory_functions}, these being @code{malloc} and friends by
1029default.  @xref{Custom Allocation}.
1030
1031An additional choice @samp{--enable-alloca=debug} is available, to help when
1032debugging memory related problems (@pxref{Debugging}).
1033
1034@item FFT Multiplication, @option{--disable-fft}
1035@cindex FFT multiplication
1036@cindex @code{--disable-fft}
1037By default multiplications are done using Karatsuba, 3-way Toom, higher degree
1038Toom, and Fermat FFT@.  The FFT is only used on large to very large operands
1039and can be disabled to save code size if desired.
1040
1041@item Assertion Checking, @option{--enable-assert}
1042@cindex Assertion checking
1043@cindex @code{--enable-assert}
1044This option enables some consistency checking within the library.  This can be
1045of use while debugging, @pxref{Debugging}.
1046
1047@item Execution Profiling, @option{--enable-profiling=prof/gprof/instrument}
1048@cindex Execution profiling
1049@cindex @code{--enable-profiling}
1050Enable profiling support, in one of various styles, @pxref{Profiling}.
1051
1052@item @option{MPN_PATH}
1053@cindex @code{MPN_PATH}
Various assembly versions of each mpn subroutine are provided.  For a given
CPU, a search is made through a path to choose a version of each.  For example
1056@samp{sparcv8} has
1057
@example
MPN_PATH="sparc32/v8 sparc32 generic"
@end example
1061
1062which means look first for v8 code, then plain sparc32 (which is v7), and
1063finally fall back on generic C@.  Knowledgeable users with special requirements
1064can specify a different path.  Normally this is completely unnecessary.
1065
1066@item Documentation
1067@cindex Documentation formats
1068@cindex Texinfo
1069The source for the document you're now reading is @file{doc/gmp.texi}, in
1070Texinfo format, see @GMPreftop{texinfo, Texinfo}.
1071
1072@cindex Postscript
1073@cindex DVI
1074@cindex PDF
1075Info format @samp{doc/gmp.info} is included in the distribution.  The usual
1076automake targets are available to make PostScript, DVI, PDF and HTML (these
1077will require various @TeX{} and Texinfo tools).
1078
1079@cindex DocBook
1080@cindex XML
1081DocBook and XML can be generated by the Texinfo @command{makeinfo} program
1082too, see @ref{makeinfo options,, Options for @command{makeinfo}, texinfo,
1083Texinfo}.
1084
1085Some supplementary notes can also be found in the @file{doc} subdirectory.
1086
1087@end table
1088
1089
1090@need 2000
1091@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
1092@section ABI and ISA
1093@cindex ABI
1094@cindex Application Binary Interface
1095@cindex ISA
1096@cindex Instruction Set Architecture
1097
1098ABI (Application Binary Interface) refers to the calling conventions between
1099functions, meaning what registers are used and what sizes the various C data
1100types are.  ISA (Instruction Set Architecture) refers to the instructions and
1101registers a CPU has available.
1102
1103Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
1104latter for compatibility with older CPUs in the family.  GMP supports some
1105CPUs like this in both ABIs.  In fact within GMP @samp{ABI} means a
1106combination of chip ABI, plus how GMP chooses to use it.  For example in some
110732-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
1108@code{long long}.
1109
1110By default GMP chooses the best ABI available for a given system, and this
1111generally gives significantly greater speed.  But an ABI can be chosen
1112explicitly to make GMP compatible with other libraries, or particular
1113application requirements.  For example,
1114
@example
./configure ABI=32
@end example
1118
1119In all cases it's vital that all object code used in a given program is
1120compiled for the same ABI.
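
As a rough way to see which ABI a particular compiler invocation targets, the
widths of @code{long} and of pointers can be inspected.  The following is only
a sketch: @code{data_model_name} is a hypothetical helper, not part of GMP,
and it looks only at type widths, not at calling conventions or at the limb
type GMP selects.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not part of GMP: name the C data model selected by
   the current compiler flags, judged only from the widths of long and of
   pointers.  */
const char *
data_model_name (void)
{
  size_t long_bits = sizeof (long) * CHAR_BIT;
  size_t ptr_bits = sizeof (void *) * CHAR_BIT;

  if (long_bits == 64 && ptr_bits == 64)
    return "LP64";    /* 64-bit long and pointers */
  if (long_bits == 32 && ptr_bits == 64)
    return "LLP64";   /* 32-bit long, 64-bit pointers */
  if (long_bits == 32 && ptr_bits == 32)
    return "ILP32";   /* 32-bit long and pointers */
  return "other";
}
```

Compiling this with each candidate flag (for instance @samp{-m32} and
@samp{-m64} on AMD64) and printing the result gives a quick indication of
whether two sets of object files were built for the same data model.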
1121
1122Usually a limb is implemented as a @code{long}.  When a @code{long long} limb
1123is used this is encoded in the generated @file{gmp.h}.  This is convenient for
1124applications, but it does mean that @file{gmp.h} will vary, and can't be just
1125copied around.  @file{gmp.h} remains compiler independent though, since all
1126compilers for a particular ABI will be expected to use the same limb type.
1127
1128Currently no attempt is made to follow whatever conventions a system has for
1129installing library or header files built for a particular ABI@.  This will
1130probably only matter when installing multiple builds of GMP, and it might be
1131as simple as configuring with a special @samp{libdir}, or it might require
more than that.  Note that builds for different ABIs need to be done separately,
1133with a fresh @command{./configure} and @command{make} each.
1134
1135@sp 1
1136@table @asis
1137@need 1000
1138@item AMD64 (@samp{x86_64})
1139@cindex AMD64
1140On AMD64 systems supporting both 32-bit and 64-bit modes for applications, the
1141following ABI choices are available.
1142
1143@table @asis
1144@item @samp{ABI=64}
1145The 64-bit ABI uses 64-bit limbs and pointers and makes full use of the chip
1146architecture.  This is the default.  Applications will usually not need
1147special compiler flags, but for reference the option is
1148
@example
gcc  -m64
@end example
1152
1153@item @samp{ABI=32}
1154The 32-bit ABI is the usual i386 conventions.  This will be slower, and is not
1155recommended except for inter-operating with other code not yet 64-bit capable.
1156Applications must be compiled with
1157
@example
gcc  -m32
@end example
1161
(GCC 2.95 and earlier have no @samp{-m32} option, since 32-bit is the only mode.)
1163@end table
1164
1165@sp 1
1166@need 1000
1167@item HPPA 2.0 (@samp{hppa2.0*}, @samp{hppa64})
1168@cindex HPPA
1169@cindex HP-UX
1170@table @asis
1171@item @samp{ABI=2.0w}
1172The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or
1173up.  Applications must be compiled with
1174
@example
gcc [built for 2.0w]
cc  +DD64
@end example
1179
1180@item @samp{ABI=2.0n}
1181The 2.0n ABI means the 32-bit HPPA 1.0 ABI and all its normal calling
1182conventions, but with 64-bit instructions permitted within functions.  GMP
1183uses a 64-bit @code{long long} for a limb.  This ABI is available on hppa64
1184GNU/Linux and on HP-UX 10 or higher.  Applications must be compiled with
1185
@example
gcc [built for 2.0n]
cc  +DA2.0 +e
@end example
1190
1191Note that current versions of GCC (eg.@: 3.2) don't generate 64-bit
1192instructions for @code{long long} operations and so may be slower than for
11932.0w.  (The GMP assembly code is the same though.)
1194
1195@item @samp{ABI=1.0}
1196HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI@.
1197No special compiler options are needed for applications.
1198@end table
1199
1200All three ABIs are available for CPU types @samp{hppa2.0w}, @samp{hppa2.0} and
1201@samp{hppa64}, but for CPU type @samp{hppa2.0n} only 2.0n or 1.0 are
1202considered.
1203
1204Note that GCC on HP-UX has no options to choose between 2.0n and 2.0w modes,
1205unlike HP @command{cc}.  Instead it must be built for one or the other ABI@.
1206GMP will detect how it was built, and skip to the corresponding @samp{ABI}.
1207
1208@sp 1
1209@need 1500
1210@item IA-64 under HP-UX (@samp{ia64*-*-hpux*}, @samp{itanium*-*-hpux*})
1211@cindex IA-64
1212@cindex HP-UX
1213HP-UX supports two ABIs for IA-64.  GMP performance is the same in both.
1214
1215@table @asis
1216@item @samp{ABI=32}
1217In the 32-bit ABI, pointers, @code{int}s and @code{long}s are 32 bits and GMP
1218uses a 64 bit @code{long long} for a limb.  Applications can be compiled
1219without any special flags since this ABI is the default in both HP C and GCC,
1220but for reference the flags are
1221
@example
gcc  -milp32
cc   +DD32
@end example
1226
1227@item @samp{ABI=64}
1228In the 64-bit ABI, @code{long}s and pointers are 64 bits and GMP uses a
1229@code{long} for a limb.  Applications must be compiled with
1230
@example
gcc  -mlp64
cc   +DD64
@end example
1235@end table
1236
1237On other IA-64 systems, GNU/Linux for instance, @samp{ABI=64} is the only
1238choice.
1239
1240@sp 1
1241@need 1000
1242@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]})
1243@cindex MIPS
1244@cindex IRIX
1245IRIX 6 always has a 64-bit MIPS 3 or better CPU, and supports ABIs o32, n32,
1246and 64.  n32 or 64 are recommended, and GMP performance will be the same in
1247each.  The default is n32.
1248
1249@table @asis
1250@item @samp{ABI=o32}
1251The o32 ABI is 32-bit pointers and integers, and no 64-bit operations.  GMP
will be slower than in n32 or 64; this option only exists to support old
1253compilers, eg.@: GCC 2.7.2.  Applications can be compiled with no special
1254flags on an old compiler, or on a newer compiler with
1255
@example
gcc  -mabi=32
cc   -32
@end example
1260
1261@item @samp{ABI=n32}
1262The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
1263@code{long long}.  Applications must be compiled with
1264
@example
gcc  -mabi=n32
cc   -n32
@end example
1269
1270@item @samp{ABI=64}
1271The 64-bit ABI is 64-bit pointers and integers.  Applications must be compiled
1272with
1273
@example
gcc  -mabi=64
cc   -64
@end example
1278@end table
1279
1280Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
1281support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.
1282
1283@sp 1
1284@need 1000
1285@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630}, @samp{powerpc970}, @samp{power4}, @samp{power5})
1286@cindex PowerPC
1287@table @asis
1288@item @samp{ABI=mode64}
1289@cindex AIX
1290The AIX 64 ABI uses 64-bit limbs and pointers and is the default on PowerPC 64
1291@samp{*-*-aix*} systems.  Applications must be compiled with
1292
@example
gcc  -maix64
xlc  -q64
@end example
1297
1298On 64-bit GNU/Linux, BSD, and Mac OS X/Darwin systems, the applications must
1299be compiled with
1300
@example
gcc  -m64
@end example
1304
1305@item @samp{ABI=mode32}
1306The @samp{mode32} ABI uses a 64-bit @code{long long} limb but with the chip
1307still in 32-bit mode and using 32-bit calling conventions.  This is the default
1308for systems where the true 64-bit ABI is unavailable.  No special compiler
1309options are typically needed for applications.  This ABI is not available under
1310AIX.
1311
1312@item @samp{ABI=32}
1313This is the basic 32-bit PowerPC ABI, with a 32-bit limb.  No special compiler
1314options are needed for applications.
1315@end table
1316
GMP's speed is greatest for the @samp{mode64} ABI; the @samp{mode32} ABI is second
best.  In @samp{ABI=32} only the 32-bit ISA is used and this doesn't make full
1319use of a 64-bit chip.
1320
1321@sp 1
1322@need 1000
1323@item Sparc V9 (@samp{sparc64}, @samp{sparcv9}, @samp{ultrasparc*})
1324@cindex Sparc V9
1325@cindex Solaris
1326@cindex Sun
1327@table @asis
1328@item @samp{ABI=64}
1329The 64-bit V9 ABI is available on the various BSD sparc64 ports, recent
1330versions of Sparc64 GNU/Linux, and Solaris 2.7 and up (when the kernel is in
133164-bit mode).  GCC 3.2 or higher, or Sun @command{cc} is required.  On
1332GNU/Linux, depending on the default @command{gcc} mode, applications must be
1333compiled with
1334
@example
gcc  -m64
@end example
1338
1339On Solaris applications must be compiled with
1340
@example
gcc  -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
cc   -xarch=v9
@end example
1345
1346On the BSD sparc64 systems no special options are required, since 64-bits is
1347the only ABI available.
1348
1349@item @samp{ABI=32}
1350For the basic 32-bit ABI, GMP still uses as much of the V9 ISA as it can.  In
1351the Sun documentation this combination is known as ``v8plus''.  On GNU/Linux,
1352depending on the default @command{gcc} mode, applications may need to be
1353compiled with
1354
@example
gcc  -m32
@end example
1358
1359On Solaris, no special compiler options are required for applications, though
1360using something like the following is recommended.  (@command{gcc} 2.8 and
1361earlier only support @samp{-mv8} though.)
1362
@example
gcc  -mv8plus
cc   -xarch=v8plus
@end example
1367@end table
1368
1369GMP speed is greatest in @samp{ABI=64}, so it's the default where available.
1370The speed is partly because there are extra registers available and partly
1371because 64-bits is considered the more important case and has therefore had
1372better code written for it.
1373
Don't be confused by the names of the @samp{-m} and @samp{-x} compiler
options; they're called @samp{arch} but effectively control both ABI and ISA@.
1376
1377On Solaris 2.6 and earlier, only @samp{ABI=32} is available since the kernel
1378doesn't save all registers.
1379
1380On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
1381reject @samp{ABI=64} because the resulting executables won't run.
1382@samp{ABI=64} can still be built if desired by making it look like a
1383cross-compile, for example
1384
@example
./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
@end example
1388@end table
1389
1390
1391@need 2000
1392@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
1393@section Notes for Package Builds
1394@cindex Build notes for binary packaging
1395@cindex Packaged builds
1396
1397GMP should present no great difficulties for packaging in a binary
1398distribution.
1399
1400@cindex Libtool versioning
1401@cindex Shared library versioning
1402Libtool is used to build the library and @samp{-version-info} is set
1403appropriately, having started from @samp{3:0:0} in GMP 3.0 (@pxref{Versioning,
1404Library interface versions, Library interface versions, libtool, GNU
1405Libtool}).
1406
1407The GMP 4 series will be upwardly binary compatible in each release and will
1408be upwardly binary compatible with all of the GMP 3 series.  Additional
1409function interfaces may be added in each release, so on systems where libtool
1410versioning is not fully checked by the loader an auxiliary mechanism may be
1411needed to express that a dynamic linked application depends on a new enough
1412GMP.
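
One possible application-side safeguard is to compare the version string of
the library actually loaded against the minimum the application needs.  This
is a sketch under stated assumptions, not a packaging mechanism:
@code{version_at_least} is a hypothetical helper, not a GMP function, and the
string passed in would typically be the global @code{gmp_version}.

```c
#include <stdio.h>

/* Hypothetical helper, not part of GMP.  Return 1 if the version string
   HAVE (of the form "i.j" or "i.j.k") is at least
   NEED_MAJOR.NEED_MINOR.NEED_PATCH, 0 if it is older, and -1 if it cannot
   be parsed.  A missing patch level is treated as 0.  */
int
version_at_least (const char *have, int need_major, int need_minor,
                  int need_patch)
{
  int major = 0, minor = 0, patch = 0;

  /* At least "major.minor" must be present.  */
  if (sscanf (have, "%d.%d.%d", &major, &minor, &patch) < 2)
    return -1;

  if (major != need_major)
    return major > need_major;
  if (minor != need_minor)
    return minor > need_minor;
  return patch >= need_patch;
}
```

An application could call, say, @code{version_at_least (gmp_version, 4, 1, 0)}
at startup and exit with a clear message when the result is not 1.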
1413
1414An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
1415(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
1416from the same GMP version, since this is not done by the libtool versioning,
1417nor otherwise.  A mismatch will result in unresolved symbols from the linker,
1418or perhaps the loader.
1419
1420When building a package for a CPU family, care should be taken to use
1421@samp{--host} (or @samp{--build}) to choose the least common denominator among
1422the CPUs which might use the package.  For example this might mean plain
1423@samp{sparc} (meaning V7) for SPARCs.
1424
1425For x86s, @option{--enable-fat} sets things up for a fat binary build, making a
1426runtime selection of optimized low level routines.  This is a good choice for
1427packaging to run on a range of x86 chips.
1428
1429Users who care about speed will want GMP built for their exact CPU type, to
1430make best use of the available optimizations.  Providing a way to suitably
1431rebuild a package may be useful.  This could be as simple as making it
1432possible for a user to omit @samp{--build} (and @samp{--host}) so
1433@samp{./config.guess} will detect the CPU@.  But a way to manually specify a
1434@samp{--build} will be wanted for systems where @samp{./config.guess} is
1435inexact.
1436
1437On systems with multiple ABIs, a packaged build will need to decide which
1438among the choices is to be provided, see @ref{ABI and ISA}.  A given run of
1439@samp{./configure} etc will only build one ABI@.  If a second ABI is also
1440required then a second run of @samp{./configure} etc must be made, starting
1441from a clean directory tree (@samp{make distclean}).
1442
1443As noted under ``ABI and ISA'', currently no attempt is made to follow system
1444conventions for install locations that vary with ABI, such as
1445@file{/usr/lib/sparcv9} for @samp{ABI=64} as opposed to @file{/usr/lib} for
1446@samp{ABI=32}.  A package build can override @samp{libdir} and other standard
1447variables as necessary.
1448
1449Note that @file{gmp.h} is a generated file, and will be architecture and ABI
1450dependent.  When attempting to install two ABIs simultaneously it will be
1451important that an application compile gets the correct @file{gmp.h} for its
1452desired ABI@.  If compiler include paths don't vary with ABI options then it
1453might be necessary to create a @file{/usr/include/gmp.h} which tests
1454preprocessor symbols and chooses the correct actual @file{gmp.h}.
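
The selection logic of such a wrapper might look like the following sketch.
The directory names @file{gmp-32} and @file{gmp-64} and the macro
@code{GMP_WRAPPER_HEADER} are hypothetical, chosen only for illustration.  The
test on @code{ULONG_MAX} assumes the two installed ABIs differ in the width of
@code{long}, which is the case for instance with @samp{ABI=32} and
@samp{ABI=64} on Sparc V9, but not for ABIs that differ only in limb type.

```c
#include <limits.h>

/* Hypothetical wrapper logic: pick a per-ABI gmp.h from the width of long.
   The directory names gmp-32 and gmp-64 are assumptions.  */
#if ULONG_MAX > 0xFFFFFFFFUL
#  define GMP_WRAPPER_HEADER "gmp-64/gmp.h"   /* long wider than 32 bits */
#else
#  define GMP_WRAPPER_HEADER "gmp-32/gmp.h"   /* 32-bit long */
#endif

/* A real wrapper would now do
       #include GMP_WRAPPER_HEADER
   Here the choice is only reported, so the sketch compiles without the
   per-ABI headers being installed.  */
const char *
gmp_wrapper_header (void)
{
  return GMP_WRAPPER_HEADER;
}
```

Where the compiler defines a more specific symbol for the ABI in use, testing
that symbol instead would be more robust than relying on @code{ULONG_MAX}.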
1455
1456
1457@need 2000
1458@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
1459@section Notes for Particular Systems
1460@cindex Build notes for particular systems
1461@cindex Particular systems
1462@cindex Systems
1463@table @asis
1464
1465@c This section is more or less meant for notes about performance or about
1466@c build problems that have been worked around but might leave a user
1467@c scratching their head.  Fun with different ABIs on a system belongs in the
1468@c above section.
1469
1470@item AIX 3 and 4
1471@cindex AIX
1472On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since
1473some versions of the native @command{ar} fail on the convenience libraries
1474used.  A shared build can be attempted with
1475
@example
./configure --enable-shared --disable-static
@end example
1479
1480Note that the @samp{--disable-static} is necessary because in a shared build
1481libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
1482the benefit of old versions of @command{ld} which only recognise @file{.a},
1483but unfortunately this is done even if a fully functional @command{ld} is
1484available.
1485
1486@item ARM
1487@cindex ARM
1488On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a
1489bug in unsigned division, giving wrong results for some operands.  GMP
1490@samp{./configure} will demand GCC 2.95.4 or later.
1491
1492@item Compaq C++
1493@cindex Compaq C++
1494Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and
1495an old pre-standard one (see @samp{man iostream_intro}).  GMP can only use the
1496standard one, which unfortunately is not the default but must be selected by
1497defining @code{__USE_STD_IOSTREAM}.  Configure with for instance
1498
@example
./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM
@end example
1502
1503@item Floating Point Mode
1504@cindex Floating point mode
1505@cindex Hardware floating point mode
1506@cindex Precision of hardware floating point
1507@cindex x87
1508On some systems, the hardware floating point has a control mode which can set
1509all operations to be done in a particular precision, for instance single,
1510double or extended on x86 systems (x87 floating point).  The GMP functions
1511involving a @code{double} cannot be expected to operate to their full
1512precision when the hardware is in single precision mode.  Of course this
1513affects all code, including application code, not just GMP.
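
A simple runtime probe can indicate whether @code{double} arithmetic is
currently being rounded to less than full precision.  This is a minimal
sketch: @code{double_add_is_full_precision} is a hypothetical helper, not part
of GMP, and it checks only a single addition, so a positive result is a hint
rather than a guarantee.

```c
#include <float.h>

/* Hypothetical helper, not part of GMP.  Return 1 if adding DBL_EPSILON to
   1.0 gives a result distinct from 1.0, as it does when double operations
   keep all DBL_MANT_DIG bits, and 0 if the sum was rounded back to 1.0, as
   would happen with the hardware set to single precision.  The volatile
   qualifiers are there to stop the compiler folding the sum at compile
   time.  */
int
double_add_is_full_precision (void)
{
  volatile double one = 1.0;
  volatile double eps = DBL_EPSILON;
  volatile double sum = one + eps;

  return sum != one;
}
```

An application that changes the floating point control mode could call this
before using the GMP functions involving a @code{double}.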
1514
1515@item MS-DOS and MS Windows
1516@cindex MS-DOS
1517@cindex MS Windows
1518@cindex Windows
1519@cindex Cygwin
1520@cindex DJGPP
1521@cindex MINGW
1522On an MS-DOS system DJGPP can be used to build GMP, and on an MS Windows
1523system Cygwin, DJGPP and MINGW can be used.  All three are excellent ports of
1524GCC and the various GNU tools.
1525
@display
@uref{http://www.cygwin.com/}
@uref{http://www.delorie.com/djgpp/}
@uref{http://www.mingw.org/}
@end display
1531
1532@cindex Interix
1533@cindex Services for Unix
1534Microsoft also publishes an Interix ``Services for Unix'' which can be used to
1535build GMP on Windows (with a normal @samp{./configure}), but it's not free
1536software.
1537
1538@item MS Windows DLLs
1539@cindex DLLs
1540@cindex MS Windows
1541@cindex Windows
1542On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by
1543default GMP builds only a static library, but a DLL can be built instead using
1544
@example
./configure --disable-static --enable-shared
@end example
1548
1549Static and DLL libraries can't both be built, since certain export directives
1550in @file{gmp.h} must be different.
1551
1552A MINGW DLL build of GMP can be used with Microsoft C@.  Libtool doesn't
1553install a @file{.lib} format import library, but it can be created with MS
1554@command{lib} as follows, and copied to the install directory.  Similarly for
1555@file{libmp} and @file{libgmpxx}.
1556
@example
cd .libs
lib /def:libgmp-3.dll.def /out:libgmp-3.lib
@end example
1561
1562MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications
1563wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do
1564the same.  If one of the other C runtime library choices provided by MS C is
1565desired then the suggestion is to use the GMP string functions and confine I/O
1566to the application.
1567
1568@item Motorola 68k CPU Types
1569@cindex 68000
1570@samp{m68k} is taken to mean 68000.  @samp{m68020} or higher will give a
1571performance boost on applicable CPUs.  @samp{m68360} can be used for CPU32
1572series chips.  @samp{m68302} can be used for ``Dragonball'' series chips,
1573though this is merely a synonym for @samp{m68000}.
1574
1575@item OpenBSD 2.6
1576@cindex OpenBSD
1577@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
1578unsuitable for @file{.asm} file processing.  @samp{./configure} will detect
1579the problem and either abort or choose another m4 in the @env{PATH}.  The bug
1580is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.
1581
1582@item Power CPU Types
1583@cindex Power/PowerPC
1584In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions
1585not available on the other, so it's important to choose the right one for the
1586CPU that will be used.  Currently GMP has no assembly code support for using
1587just the common instruction subset.  To get executables that run on both, the
1588current suggestion is to use the generic C code (@option{--disable-assembly}),
1589possibly with appropriate compiler options (like @samp{-mcpu=common} for
1590@command{gcc}).  CPU @samp{rs6000} (which is not a CPU but a family of
1591workstations) is accepted by @file{config.sub}, but is currently equivalent to
1592@option{--disable-assembly}.
1593
1594@item Sparc CPU Types
1595@cindex Sparc
1596@samp{sparcv8} or @samp{supersparc} on relevant systems will give a
1597significant performance increase over the V7 code selected by plain
1598@samp{sparc}.
1599
1600@item Sparc App Regs
1601@cindex Sparc
1602The GMP assembly code for both 32-bit and 64-bit Sparc clobbers the
1603``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way
1604that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,, SPARC
1605Options, gcc, Using the GNU Compiler Collection (GCC)}).
1606
1607This makes that code unsuitable for use with the special V9
1608@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer), and
1609for applications wanting to use those registers for special purposes.  In these
1610cases the only suggestion currently is to build GMP with
1611@option{--disable-assembly} to avoid the assembly code.
1612
1613@item SunOS 4
1614@cindex SunOS
1615@command{/usr/bin/m4} lacks various features needed to process @file{.asm}
1616files, and instead @samp{./configure} will automatically use
1617@command{/usr/5bin/m4}, which we believe is always available (if not then use
1618GNU m4).
1619
1620@item x86 CPU Types
1621@cindex x86
1622@cindex 80x86
1623@cindex i386
1624@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for its intended
1625P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
1626P-III)@.  @samp{i386} is a better choice when making binaries that must run on
1627both.
1628
1629@item x86 MMX and SSE2 Code
1630@cindex MMX
1631@cindex SSE2
1632If the CPU selected has MMX code but the assembler doesn't support it, a
1633warning is given and non-MMX code is used instead.  This will be an inferior
1634build, since the MMX code that's present is there because it's faster than the
1635corresponding plain integer code.  The same applies to SSE2.
1636
Old versions of @samp{gas} don't support MMX instructions.  In particular,
version 1.92.3, which comes with FreeBSD 2.2.8 and the more recent OpenBSD 3.1,
doesn't.
1640
1641Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
1642to register @code{movq} instructions, and so can't be used for MMX code.
1643Install a recent @command{gas} if MMX code is wanted on these systems.
1644@end table
1645
1646
1647@need 2000
1648@node Known Build Problems, Performance optimization, Notes for Particular Systems, Installing GMP
1649@section Known Build Problems
1650@cindex Build problems known
1651
1652@c This section is more or less meant for known build problems that are not
1653@c otherwise worked around and require some sort of manual intervention.
1654
1655You might find more up-to-date information at @uref{http://gmplib.org/}.
1656
1657@table @asis
1658@item Compiler link options
1659The version of libtool currently in use rather aggressively strips compiler
1660options when linking a shared library.  This will hopefully be relaxed in the
1661future, but for now if this is a problem the suggestion is to create a little
1662script to hide them, and for instance configure with
1663
@example
./configure CC=gcc-with-my-options
@end example
1667
1668@item DJGPP (@samp{*-*-msdosdjgpp*})
1669@cindex DJGPP
The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
script; it exits silently, having died writing a preamble to
1672@file{config.log}.  Use @command{bash} 2.04 or higher.
1673
@samp{make all} was found to run out of memory during the final
@file{libgmp.la} link on one system tested, despite having 64 MB available.
Running @samp{make libgmp.la} directly helped; perhaps recursing into the
various subdirectories uses up memory.
1678
1679@item GNU binutils @command{strip} prior to 2.12
1680@cindex Stripped libraries
1681@cindex Binutils @command{strip}
1682@cindex GNU @command{strip}
1683@command{strip} from GNU binutils 2.11 and earlier should not be used on the
1684static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all
1685but the last of multiple archive members with the same name, like the three
1686versions of @file{init.o} in @file{libgmp.a}.  Binutils 2.12 or higher can be
1687used successfully.
1688
1689The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by
1690this and any version of @command{strip} can be used on them.
1691
1692@item @command{make} syntax error
1693@cindex SCO
1694@cindex IRIX
1695On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make}
1696is unable to handle the long dependencies list for @file{libgmp.la}.  The
1697symptom is a ``syntax error'' on the following line of the top-level
1698@file{Makefile}.
1699
@example
libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES)
@end example
1703
1704Either use GNU Make, or as a workaround remove
1705@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial
1706build work, but if any recompiling is done @file{libgmp.la} might not be
1707rebuilt).
1708
1709@item MacOS X (@samp{*-*-darwin*})
1710@cindex MacOS X
1711@cindex Darwin
1712Libtool currently only knows how to create shared libraries on MacOS X using
1713the native @command{cc} (which is a modified GCC), not a plain GCC@.  A
1714static-only build should work though (@samp{--disable-shared}).
1715
1716@item NeXT prior to 3.3
1717@cindex NeXT
1718The system compiler on old versions of NeXT was a massacred and old GCC, even
if it called itself @file{cc}.  This compiler cannot be used to build GMP; you
need to get a real GCC and install that.  (NeXT may have fixed this in
1721release 3.3 of their system.)
1722
1723@item POWER and PowerPC
1724@cindex Power/PowerPC
1725Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
1726PowerPC@.  If you want to use GCC for these machines, get GCC 2.7.2.1 (or
1727later).
1728
1729@item Sequent Symmetry
1730@cindex Sequent Symmetry
1731Use the GNU assembler instead of the system assembler, since the latter has
1732serious bugs.
1733
1734@item Solaris 2.6
1735@cindex Solaris
1736The system @command{sed} prints an error ``Output line too long'' when libtool
1737builds @file{libgmp.la}.  This doesn't seem to cause any obvious ill effects,
1738but GNU @command{sed} is recommended, to avoid any doubt.
1739
1740@item Sparc Solaris 2.7 with gcc 2.95.2 in @samp{ABI=32}
1741@cindex Solaris
A shared library build of GMP seems to fail in this combination: it builds but
then fails the tests, apparently due to some incorrect data relocations within
@code{gmp_randinit_lc_2exp_size}.  The exact cause is unknown, so
@samp{--disable-shared} is recommended.
1746@end table
1747
1748
1749@need 2000
1750@node Performance optimization, , Known Build Problems, Installing GMP
1751@section Performance optimization
1752@cindex Optimizing performance
1753
1754@c At some point, this should perhaps move to a separate chapter on optimizing
1755@c performance.
1756
1757For optimal performance, build GMP for the exact CPU type of the target
1758computer, see @ref{Build Options}.
1759
Unlike most other programs, GMP is typically not very sensitive to the choice
of compiler, since it uses assembly language for the most critical
operations.
1763
In particular for long-running GMP applications, and applications demanding
extremely large numbers, building and running the @code{tuneup} program in the
@file{tune} subdirectory can be important.  For example,
1767
1768@example
1769cd tune
1770make tuneup
1771./tuneup
1772@end example
1773
1774will generate better contents for the @file{gmp-mparam.h} parameter file.
1775
1776To use the results, put the output in the file indicated in the
1777@samp{Parameters for ...} header.  Then recompile from scratch.
1778
1779The @code{tuneup} program takes one useful parameter, @samp{-f NNN}, which
1780instructs the program how long to check FFT multiply parameters.  If you're
1781going to use GMP for extremely large numbers, you may want to run @code{tuneup}
1782with a large NNN value.
1783
1784
1785@node GMP Basics, Reporting Bugs, Installing GMP, Top
1786@comment  node-name,  next,  previous,  up
1787@chapter GMP Basics
1788@cindex Basics
1789
1790@strong{Using functions, macros, data types, etc.@: not documented in this
1791manual is strongly discouraged.  If you do so your application is guaranteed
1792to be incompatible with future versions of GMP.}
1793
1794@menu
1795* Headers and Libraries::
1796* Nomenclature and Types::
1797* Function Classes::
1798* Variable Conventions::
1799* Parameter Conventions::
1800* Memory Management::
1801* Reentrancy::
1802* Useful Macros and Constants::
1803* Compatibility with older versions::
1804* Demonstration Programs::
1805* Efficiency::
1806* Debugging::
1807* Profiling::
1808* Autoconf::
1809* Emacs::
1810@end menu
1811
1812@node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics
1813@section Headers and Libraries
1814@cindex Headers
1815
1816@cindex @file{gmp.h}
1817@cindex Include files
1818@cindex @code{#include}
1819All declarations needed to use GMP are collected in the include file
1820@file{gmp.h}.  It is designed to work with both C and C++ compilers.
1821
1822@example
1823#include <gmp.h>
1824@end example
1825
1826@cindex @code{stdio.h}
1827Note however that prototypes for GMP functions with @code{FILE *} parameters
1828are only provided if @code{<stdio.h>} is included too.
1829
1830@example
1831#include <stdio.h>
1832#include <gmp.h>
1833@end example
1834
1835@cindex @code{stdarg.h}
1836Likewise @code{<stdarg.h>} (or @code{<varargs.h>}) is required for prototypes
1837with @code{va_list} parameters, such as @code{gmp_vprintf}.  And
1838@code{<obstack.h>} for prototypes with @code{struct obstack} parameters, such
1839as @code{gmp_obstack_printf}, when available.
1840
1841@cindex Libraries
1842@cindex Linking
1843@cindex @code{libgmp}
1844All programs using GMP must link against the @file{libgmp} library.  On a
1845typical Unix-like system this can be done with @samp{-lgmp}, for example
1846
1847@example
1848gcc myprogram.c -lgmp
1849@end example
1850
1851@cindex @code{libgmpxx}
1852GMP C++ functions are in a separate @file{libgmpxx} library.  This is built
1853and installed if C++ support has been enabled (@pxref{Build Options}).  For
1854example,
1855
1856@example
1857g++ mycxxprog.cc -lgmpxx -lgmp
1858@end example
1859
1860@cindex Libtool
1861GMP is built using Libtool and an application can use that to link if desired,
1862@GMPpxreftop{libtool, GNU Libtool}.
1863
1864If GMP has been installed to a non-standard location then it may be necessary
1865to use @samp{-I} and @samp{-L} compiler options to point to the right
1866directories, and some sort of run-time path for a shared library.
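
As an illustration, assuming GMP was installed under the hypothetical prefix
@file{/opt/gmp} and the compiler is GCC with the GNU linker, the options might
look like the following.  The @samp{-Wl,-rpath} form is one common way of
giving a run-time path; other toolchains spell this differently.

@example
gcc -I/opt/gmp/include myprogram.c -L/opt/gmp/lib -Wl,-rpath,/opt/gmp/lib -lgmp
@end example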
1867
1868
1869@node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics
1870@section Nomenclature and Types
1871@cindex Nomenclature
1872@cindex Types
1873
1874@cindex Integer
1875@tindex @code{mpz_t}
1876In this manual, @dfn{integer} usually means a multiple precision integer, as
1877defined by the GMP library.  The C data type for such integers is @code{mpz_t}.
1878Here are some examples of how to declare such integers:
1879
1880@example
1881mpz_t sum;
1882
1883struct foo @{ mpz_t x, y; @};
1884
1885mpz_t vec[20];
1886@end example
1887
1888@cindex Rational number
1889@tindex @code{mpq_t}
1890@dfn{Rational number} means a multiple precision fraction.  The C data type
1891for these fractions is @code{mpq_t}.  For example:
1892
1893@example
1894mpq_t quotient;
1895@end example
1896
1897@cindex Floating-point number
1898@tindex @code{mpf_t}
1899@dfn{Floating point number} or @dfn{Float} for short, is an arbitrary precision
1900mantissa with a limited precision exponent.  The C data type for such objects
1901is @code{mpf_t}.  For example:
1902
1903@example
1904mpf_t fp;
1905@end example
1906
1907@tindex @code{mp_exp_t}
1908The floating point functions accept and return exponents in the C type
1909@code{mp_exp_t}.  Currently this is usually a @code{long}, but on some systems
1910it's an @code{int} for efficiency.
1911
1912@cindex Limb
1913@tindex @code{mp_limb_t}
1914A @dfn{limb} means the part of a multi-precision number that fits in a single
1915machine word.  (We chose this word because a limb of the human body is
1916analogous to a digit, only larger, and containing several digits.)  Normally a
1917limb is 32 or 64 bits.  The C data type for a limb is @code{mp_limb_t}.
1918
1919@tindex @code{mp_size_t}
Counts of limbs of a multi-precision number are represented in the C type
@code{mp_size_t}.  Currently this is normally a @code{long}, but on some
systems it's an @code{int} for efficiency, and on some systems it will be
@code{long long} in the future.
1924
1925@tindex @code{mp_bitcnt_t}
1926Counts of bits of a multi-precision number are represented in the C type
1927@code{mp_bitcnt_t}.  Currently this is always an @code{unsigned long}, but on
1928some systems it will be an @code{unsigned long long} in the future.
1929
1930@cindex Random state
1931@tindex @code{gmp_randstate_t}
1932@dfn{Random state} means an algorithm selection and current state data.  The C
1933data type for such objects is @code{gmp_randstate_t}.  For example:
1934
1935@example
1936gmp_randstate_t rstate;
1937@end example
1938
1939Also, in general @code{mp_bitcnt_t} is used for bit counts and ranges, and
1940@code{size_t} is used for byte or character counts.
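
A small sketch of how these types tend to appear together in application code
follows.  The helper name @code{describe} is made up for illustration;
@code{mpz_popcount} and @code{mpz_sizeinbase} are the library functions
returning a bit count and a character count respectively.

@example
#include <stdio.h>
#include <gmp.h>

/* Hypothetical helper: report a bit count and a digit count for n,
   which is assumed to be non-negative.  */
void
describe (const mpz_t n)
@{
  mp_bitcnt_t  ones = mpz_popcount (n);          /* a count of bits */
  size_t       digits = mpz_sizeinbase (n, 10);  /* a count of characters */

  /* mpz_sizeinbase can be 1 too big in a non-power-of-2 base */
  printf ("%lu one bits, at most %lu decimal digits\n",
          (unsigned long) ones, (unsigned long) digits);
@}
@end example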
1941
1942
1943@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
1944@section Function Classes
1945@cindex Function classes
1946
There are five classes of functions in the GMP library:
1948
1949@enumerate
1950@item
1951Functions for signed integer arithmetic, with names beginning with
1952@code{mpz_}.  The associated type is @code{mpz_t}.  There are about 150
1953functions in this class.  (@pxref{Integer Functions})
1954
1955@item
1956Functions for rational number arithmetic, with names beginning with
1957@code{mpq_}.  The associated type is @code{mpq_t}.  There are about 40
1958functions in this class, but the integer functions can be used for arithmetic
1959on the numerator and denominator separately.  (@pxref{Rational Number
1960Functions})
1961
1962@item
1963Functions for floating-point arithmetic, with names beginning with
1964@code{mpf_}.  The associated type is @code{mpf_t}.  There are about 60
functions in this class.  (@pxref{Floating-point Functions})
1966
1967@item
1968Fast low-level functions that operate on natural numbers.  These are used by
1969the functions in the preceding groups, and you can also call them directly
1970from very time-critical user programs.  These functions' names begin with
@code{mpn_}.  The associated type is an array of @code{mp_limb_t}.  There are
1972about 30 (hard-to-use) functions in this class.  (@pxref{Low-level Functions})
1973
1974@item
1975Miscellaneous functions.  Functions for setting up custom allocation and
1976functions for generating random numbers.  (@pxref{Custom Allocation}, and
1977@pxref{Random Number Functions})
1978@end enumerate
1979
1980
1981@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
1982@section Variable Conventions
1983@cindex Variable conventions
1984@cindex Conventions for variables
1985
1986GMP functions generally have output arguments before input arguments.  This
1987notation is by analogy with the assignment operator.  The BSD MP compatibility
1988functions are exceptions, having the output arguments last.
1989
1990GMP lets you use the same variable for both input and output in one call.  For
1991example, the main function for integer multiplication, @code{mpz_mul}, can be
1992used to square @code{x} and put the result back in @code{x} with
1993
1994@example
1995mpz_mul (x, x, x);
1996@end example
1997
1998Before you can assign to a GMP variable, you need to initialize it by calling
1999one of the special initialization functions.  When you're done with a
2000variable, you need to clear it out, using one of the functions for that
2001purpose.  Which function to use depends on the type of variable.  See the
2002chapters on integer functions, rational number functions, and floating-point
2003functions for details.
2004
2005A variable should only be initialized once, or at least cleared between each
2006initialization.  After a variable has been initialized, it may be assigned to
2007any number of times.
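
The life cycle just described might look like the following for each of the
three main types.  This is only a sketch; the function name @code{lifecycle}
and the values are arbitrary, and @code{mpf_init2} is used so that the
precision doesn't depend on the global default.

@example
#include <gmp.h>

void
lifecycle (void)
@{
  mpz_t  n;
  mpq_t  q;
  mpf_t  f;

  mpz_init (n);           /* initialize once ... */
  mpq_init (q);
  mpf_init2 (f, 256);     /* at least 256 bits of precision */

  mpz_set_ui (n, 42);     /* ... then assign any number of times */
  mpz_set_si (n, -7);
  mpq_set_ui (q, 3, 4);
  mpf_set_d (f, 1.5);

  mpz_clear (n);          /* clear when done */
  mpq_clear (q);
  mpf_clear (f);
@}
@end example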
2008
2009For efficiency reasons, avoid excessive initializing and clearing.  In
2010general, initialize near the start of a function and clear near the end.  For
2011example,
2012
2013@example
2014void
2015foo (void)
2016@{
2017  mpz_t  n;
2018  int    i;
2019  mpz_init (n);
2020  for (i = 1; i < 100; i++)
2021    @{
2022      mpz_mul (n, @dots{});
2023      mpz_fdiv_q (n, @dots{});
2024      @dots{}
2025    @}
2026  mpz_clear (n);
2027@}
2028@end example
2029
2030
2031@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
2032@section Parameter Conventions
2033@cindex Parameter conventions
2034@cindex Conventions for parameters
2035
2036When a GMP variable is used as a function parameter, it's effectively a
2037call-by-reference, meaning if the function stores a value there it will change
2038the original in the caller.  Parameters which are input-only can be designated
2039@code{const} to provoke a compiler error or warning on attempting to modify
2040them.
2041
2042When a function is going to return a GMP result, it should designate a
2043parameter that it sets, like the library functions do.  More than one value
2044can be returned by having more than one output parameter, again like the
2045library functions.  A @code{return} of an @code{mpz_t} etc doesn't return the
2046object, only a pointer, and this is almost certainly not what's wanted.
2047
2048Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
2049and storing the result to the indicated parameter.
2050
2051@example
2052void
2053foo (mpz_t result, const mpz_t param, unsigned long n)
2054@{
2055  unsigned long  i;
2056  mpz_mul_ui (result, param, n);
2057  for (i = 1; i < n; i++)
2058    mpz_add_ui (result, result, i*7);
2059@}
2060
2061int
2062main (void)
2063@{
2064  mpz_t  r, n;
2065  mpz_init (r);
2066  mpz_init_set_str (n, "123456", 0);
2067  foo (r, n, 20L);
2068  gmp_printf ("%Zd\n", r);
2069  return 0;
2070@}
2071@end example
2072
2073@code{foo} works even if the mainline passes the same variable for
2074@code{param} and @code{result}, just like the library functions.  But
2075sometimes it's tricky to make that work, and an application might not want to
2076bother supporting that sort of thing.
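
One case where overlap needs care is when an output would be written before
all the inputs have been read.  The following sketch, using a made-up function
@code{sum_of_squares}, handles that by accumulating in a temporary.  Without
the temporary, a call passing the same variable for @code{result} and @code{b}
would overwrite @code{b} with the square of @code{a} before @code{b} had been
read.

@example
#include <gmp.h>

/* result = a*a + b*b, allowing result to be the same variable as a or b */
void
sum_of_squares (mpz_t result, const mpz_t a, const mpz_t b)
@{
  mpz_t  t;
  mpz_init (t);
  mpz_mul (t, a, a);
  mpz_addmul (t, b, b);      /* t += b*b */
  mpz_swap (result, t);      /* hand the value over without copying */
  mpz_clear (t);
@}
@end example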
2077
2078For interest, the GMP types @code{mpz_t} etc are implemented as one-element
2079arrays of certain structures.  This is why declaring a variable creates an
2080object with the fields GMP needs, but then using it as a parameter passes a
2081pointer to the object.  Note that the actual fields in each @code{mpz_t} etc
2082are for internal use only and should not be accessed directly by code that
2083expects to be compatible with future GMP releases.
2084
2085
2086@need 1000
2087@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
2088@section Memory Management
2089@cindex Memory management
2090
2091The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
2092and pointers to allocated data.  Once a variable is initialized, GMP takes
2093care of all space allocation.  Additional space is allocated whenever a
2094variable doesn't have enough.
2095
2096@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
2097Normally this is the best policy, since it avoids frequent reallocation.
2098Applications that need to return memory to the heap at some particular point
2099can use @code{mpz_realloc2}, or clear variables no longer needed.
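
For instance, a variable that once held a large value but now holds a small
one could be trimmed as follows.  This is a sketch; @code{shrink_to_fit} is a
hypothetical helper, not a GMP function.

@example
#include <gmp.h>

/* Hypothetical helper: reduce the space allocated to x to about what its
   current value needs.  mpz_sizeinbase returns 1 for zero, so the size
   passed is never 0.  */
void
shrink_to_fit (mpz_t x)
@{
  mpz_realloc2 (x, mpz_sizeinbase (x, 2));
@}
@end example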
2100
2101@code{mpf_t} variables, in the current implementation, use a fixed amount of
2102space, determined by the chosen precision and allocated at initialization, so
2103their size doesn't change.
2104
2105All memory is allocated using @code{malloc} and friends by default, but this
2106can be changed, see @ref{Custom Allocation}.  Temporary memory on the stack is
2107also used (via @code{alloca}), but this can be changed at build-time if
2108desired, see @ref{Build Options}.
2109
2110
2111@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
2112@section Reentrancy
2113@cindex Reentrancy
2114@cindex Thread safety
2115@cindex Multi-threading
2116
2117@noindent
2118GMP is reentrant and thread-safe, with some exceptions:
2119
2120@itemize @bullet
2121@item
2122If configured with @option{--enable-alloca=malloc-notreentrant} (or with
2123@option{--enable-alloca=notreentrant} when @code{alloca} is not available),
2124then naturally GMP is not reentrant.
2125
2126@item
2127@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
2128selected precision.  @code{mpf_init2} can be used instead, and in the C++
2129interface an explicit precision to the @code{mpf_class} constructor.
2130
2131@item
2132@code{mpz_random} and the other old random number functions use a global
2133random state and are hence not reentrant.  The newer random number functions
2134that accept a @code{gmp_randstate_t} parameter can be used instead.
2135
2136@item
2137@code{gmp_randinit} (obsolete) returns an error indication through a global
2138variable, which is not thread safe.  Applications are advised to use
2139@code{gmp_randinit_default} or @code{gmp_randinit_lc_2exp} instead.
2140
2141@item
2142@code{mp_set_memory_functions} uses global variables to store the selected
2143memory allocation functions.
2144
2145@item
2146If the memory allocation functions set by a call to
2147@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
2148not reentrant, then GMP will not be reentrant either.
2149
2150@item
2151If the standard I/O functions such as @code{fwrite} are not reentrant then the
2152GMP I/O functions using them will not be reentrant either.
2153
2154@item
2155It's safe for two threads to read from the same GMP variable simultaneously,
but it's not safe for one to read while another might be writing, nor for
2157two threads to write simultaneously.  It's not safe for two threads to
2158generate a random number from the same @code{gmp_randstate_t} simultaneously,
2159since this involves an update of that variable.
2160@end itemize
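
As a sketch of the reentrant alternatives named above, the following function
keeps all of its state in local variables: the precision is given explicitly
with @code{mpf_init2} and the random state is a local @code{gmp_randstate_t}.
The function name and the way the seed is supplied are assumptions for
illustration; an application would more likely keep one random state per
thread than create one on every call.

@example
#include <gmp.h>

/* Set result to a uniformly distributed float in [0,1), generated with
   prec bits, using no global GMP state.  */
void
random_fraction (mpf_t result, mp_bitcnt_t prec, unsigned long seed)
@{
  gmp_randstate_t  rstate;
  mpf_t            t;

  gmp_randinit_default (rstate);
  gmp_randseed_ui (rstate, seed);
  mpf_init2 (t, prec);        /* explicit precision, no global default */

  mpf_urandomb (t, rstate, prec);
  mpf_set (result, t);

  mpf_clear (t);
  gmp_randclear (rstate);
@}
@end example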
2161
2162
2163@need 2000
2164@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
2165@section Useful Macros and Constants
2166@cindex Useful macros and constants
2167@cindex Constants
2168
2169@deftypevr {Global Constant} {const int} mp_bits_per_limb
2170@findex mp_bits_per_limb
2171@cindex Bits per limb
2172@cindex Limb size
2173The number of bits per limb.
2174@end deftypevr
2175
2176@defmac __GNU_MP_VERSION
2177@defmacx __GNU_MP_VERSION_MINOR
2178@defmacx __GNU_MP_VERSION_PATCHLEVEL
2179@cindex Version number
2180@cindex GMP version number
2181The major and minor GMP version, and patch level, respectively, as integers.
2182For GMP i.j, these numbers will be i, j, and 0, respectively.
2183For GMP i.j.k, these numbers will be i, j, and k, respectively.
2184@end defmac
2185
2186@deftypevr {Global Constant} {const char * const} gmp_version
2187@findex gmp_version
2188The GMP version number, as a null-terminated string, in the form ``i.j.k''.
2189This release is @nicode{"@value{VERSION}"}.  Note that the format ``i.j'' was
2190used, before version 4.3.0, when k was zero.
2191@end deftypevr
2192
2193@defmac __GMP_CC
2194@defmacx __GMP_CFLAGS
2195The compiler and compiler flags, respectively, used when compiling GMP, as
2196strings.
2197@end defmac
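
These can be combined for compile-time and run-time checks.  In the sketch
below the macro @code{GMP_AT_LEAST} and the function @code{show_gmp_build} are
made-up names, and the minimum version 4.3.0 is an arbitrary choice.

@example
#include <stdio.h>
#include <gmp.h>

/* Hypothetical convenience macro, not part of gmp.h */
#define GMP_AT_LEAST(i,j,k)                                       \
  (__GNU_MP_VERSION > (i)                                         \
   || (__GNU_MP_VERSION == (i)                                    \
       && (__GNU_MP_VERSION_MINOR > (j)                           \
           || (__GNU_MP_VERSION_MINOR == (j)                      \
               && __GNU_MP_VERSION_PATCHLEVEL >= (k)))))

#if ! GMP_AT_LEAST (4, 3, 0)
#error "GMP 4.3.0 or later is wanted"
#endif

void
show_gmp_build (void)
@{
  /* gmp_version describes the library linked at run time, which for a
     shared library can differ from the gmp.h seen at compile time */
  printf ("header %d.%d.%d, library %s, %d bits per limb\n",
          __GNU_MP_VERSION, __GNU_MP_VERSION_MINOR,
          __GNU_MP_VERSION_PATCHLEVEL, gmp_version, mp_bits_per_limb);
  printf ("built with %s %s\n", __GMP_CC, __GMP_CFLAGS);
@}
@end example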
2198
2199
2200@node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics
2201@section Compatibility with older versions
2202@cindex Compatibility with older versions
2203@cindex Past GMP versions
2204@cindex Upward compatibility
2205
2206This version of GMP is upwardly binary compatible with all 5.x, 4.x, and 3.x
2207versions, and upwardly compatible at the source level with all 2.x versions,
2208with the following exceptions.
2209
2210@itemize @bullet
2211@item
2212@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
2213with other @code{mpn} functions.
2214
2215@item
2216@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
22173.0.1, but in 3.1 reverted to the 2.x style.
2218
2219@item
2220@code{mpn_bdivmod}, documented as preliminary in GMP 4, has been removed.
2221@end itemize
2222
2223There are a number of compatibility issues between GMP 1 and GMP 2 that of
2224course also apply when porting applications from GMP 1 to GMP 5.  Please
2225see the GMP 2 manual for details.
2226
2227@c @item Integer division functions round the result differently.  The obsolete
2228@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
2229@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
2230@c quotient towards
2231@c @ifinfo
2232@c @minus{}infinity).
2233@c @end ifinfo
2234@c @iftex
2235@c @tex
2236@c $-\infty$).
2237@c @end tex
2238@c @end iftex
2239@c There are a lot of functions for integer division, giving the user better
2240@c control over the rounding.
2241
2242@c @item The function @code{mpz_mod} now compute the true @strong{mod} function.
2243
2244@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
2245@c @strong{mod} for reduction.
2246
2247@c @item The assignment functions for rational numbers do no longer canonicalize
2248@c their results.  In the case a non-canonical result could arise from an
2249@c assignment, the user need to insert an explicit call to
2250@c @code{mpq_canonicalize}.  This change was made for efficiency.
2251
2252@c @item Output generated by @code{mpz_out_raw} in this release cannot be read
2253@c by @code{mpz_inp_raw} in previous releases.  This change was made for making
2254@c the file format truly portable between machines with different word sizes.
2255
2256@c @item Several @code{mpn} functions have changed.  But they were intentionally
2257@c undocumented in previous releases.
2258
2259@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
2260@c are now implemented as macros, and thereby sometimes evaluate their
2261@c arguments multiple times.
2262
2263@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
2264@c for 0^0.  (In version 1, they yielded 0.)
2265
2266@c In version 1 of the library, @code{mpq_set_den} handled negative
2267@c denominators by copying the sign to the numerator.  That is no longer done.
2268
2269@c Pure assignment functions do not canonicalize the assigned variable.  It is
2270@c the responsibility of the user to canonicalize the assigned variable before
2271@c any arithmetic operations are performed on that variable.
2272@c Note that this is an incompatible change from version 1 of the library.
2273
2274@c @end enumerate
2275
2276
2277@need 1000
2278@node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics
2279@section Demonstration programs
2280@cindex Demonstration programs
2281@cindex Example programs
2282@cindex Sample programs
2283The @file{demos} subdirectory has some sample programs using GMP@.  These
2284aren't built or installed, but there's a @file{Makefile} with rules for them.
2285For instance,
2286
2287@example
2288make pexpr
2289./pexpr 68^975+10
2290@end example
2291
2292@noindent
The following programs are provided:
2294
2295@itemize @bullet
2296@item
2297@cindex Expression parsing demo
2298@cindex Parsing expressions demo
2299@samp{pexpr} is an expression evaluator, the program used on the GMP web page.
2300@item
2301@cindex Expression parsing demo
2302@cindex Parsing expressions demo
2303The @samp{calc} subdirectory has a similar but simpler evaluator using
2304@command{lex} and @command{yacc}.
2305@item
2306@cindex Expression parsing demo
2307@cindex Parsing expressions demo
2308The @samp{expr} subdirectory is yet another expression evaluator, a library
2309designed for ease of use within a C program.  See @file{demos/expr/README} for
2310more information.
2311@item
2312@cindex Factorization demo
2313@samp{factorize} is a Pollard-Rho factorization program.
2314@item
2315@samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p}
2316function.
2317@item
2318@samp{primes} counts or lists primes in an interval, using a sieve.
2319@item
2320@samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic
2321class numbers.
2322@item
2323@cindex @code{perl}
2324@cindex GMP Perl module
2325@cindex Perl module
2326The @samp{perl} subdirectory is a comprehensive perl interface to GMP@.  See
2327@file{demos/perl/INSTALL} for more information.  Documentation is in POD
2328format in @file{demos/perl/GMP.pm}.
2329@end itemize
2330
2331As an aside, consideration has been given at various times to some sort of
2332expression evaluation within the main GMP library.  Going beyond something
2333minimal quickly leads to matters like user-defined functions, looping, fixnums
2334for control variables, etc, which are considered outside the scope of GMP
(much closer to language interpreters or compilers, @pxref{Language Bindings}).
2336Something simple for program input convenience may yet be a possibility, a
2337combination of the @file{expr} demo and the @file{pexpr} tree back-end
2338perhaps.  But for now the above evaluators are offered as illustrations.
2339
2340
2341@need 1000
2342@node Efficiency, Debugging, Demonstration Programs, GMP Basics
2343@section Efficiency
2344@cindex Efficiency
2345
2346@table @asis
2347@item Small Operands
2348@cindex Small operands
2349On small operands, the time for function call overheads and memory allocation
2350can be significant in comparison to actual calculation.  This is unavoidable
2351in a general purpose variable precision library, although GMP attempts to be
2352as efficient as it can on both large and small operands.
2353
2354@item Static Linking
2355@cindex Static linking
2356On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
2357used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
2358have a small overhead on each function call and global data address.  For many
2359programs this will be insignificant, but for long calculations there's a gain
2360to be had.
2361
2362@item Initializing and Clearing
2363@cindex Initializing and clearing
2364Avoid excessive initializing and clearing of variables, since this can be
2365quite time consuming, especially in comparison to otherwise fast operations
2366like addition.
2367
2368A language interpreter might want to keep a free list or stack of
2369initialized variables ready for use.  It should be possible to integrate
2370something like that with a garbage collector too.
2371
2372@item Reallocations
2373@cindex Reallocations
2374An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
2375values will have its memory repeatedly @code{realloc}ed, which could be quite
2376slow or could fragment memory, depending on the C library.  If an application
2377can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
2378be called to allocate the necessary space from the beginning
2379(@pxref{Initializing Integers}).
2380
2381It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
2382is too small, since all functions will do a further reallocation if necessary.
2383Badly overestimating memory required will waste space though.
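
For example, a product of many factors can have its space reserved up front.
This is a sketch under the assumption that a rough bound on the result is
known: here each factor is at most @code{bits_per_factor} bits, so the product
of @code{count} factors needs at most @code{count * bits_per_factor} bits.
The function and parameter names are illustrative only.

@example
#include <gmp.h>

/* prod = f[0] * f[1] * ... * f[count-1].  prod is initialized here and
   the caller clears it.  */
void
product (mpz_t prod, const unsigned long *f, size_t count,
         mp_bitcnt_t bits_per_factor)
@{
  size_t  i;

  mpz_init2 (prod, count * bits_per_factor + 1);  /* reserve space once */
  mpz_set_ui (prod, 1);
  for (i = 0; i < count; i++)
    mpz_mul_ui (prod, prod, f[i]);
@}
@end example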
2384
2385@item @code{2exp} Functions
2386@cindex @code{2exp} functions
2387It's up to an application to call functions like @code{mpz_mul_2exp} when
2388appropriate.  General purpose functions like @code{mpz_mul} make no attempt to
2389identify powers of two or other special forms, because such inputs will
2390usually be very rare and testing every time would be wasteful.
2391
2392@item @code{ui} and @code{si} Functions
2393@cindex @code{ui} and @code{si} functions
2394The @code{ui} functions and the small number of @code{si} functions exist for
2395convenience and should be used where applicable.  But if for example an
@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
need to extract it and call a @code{ui} function; just use the regular
@code{mpz} function.
2399
2400@item In-Place Operations
2401@cindex In-place operations
2402@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
2403and @code{mpf_neg} are fast when used for in-place operations like
2404@code{mpz_abs(x,x)}, since in the current implementation only a single field
2405of @code{x} needs changing.  On suitable compilers (GCC for instance) this is
2406inlined too.
2407
2408@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
2409benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
2410usually only one or two limbs of @code{x} will need to be changed.  The same
2411applies to the full precision @code{mpz_add} etc if @code{y} is small.  If
2412@code{y} is big then cache locality may be helped, but that's all.
2413
2414@code{mpz_mul} is currently the opposite, a separate destination is slightly
2415better.  A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
2416limb, make a temporary copy of @code{x} before forming the result.  Normally
2417that copying will only be a tiny fraction of the time for the multiply, so
2418this is not a particularly important consideration.
2419
2420@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
2421no attempt to recognise a copy of something to itself, so a call like
2422@code{mpz_set(x,x)} will be wasteful.  Naturally that would never be written
2423deliberately, but if it might arise from two pointers to the same object then
2424a test to avoid it might be desirable.
2425
2426@example
2427if (x != y)
2428  mpz_set (x, y);
2429@end example
2430
2431Note that it's never worth introducing extra @code{mpz_set} calls just to get
2432in-place operations.  If a result should go to a particular variable then just
2433direct it there and let GMP take care of data movement.
2434
2435@item Divisibility Testing (Small Integers)
2436@cindex Divisibility testing
2437@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
2438for testing whether an @code{mpz_t} is divisible by an individual small
2439integer.  They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
2440which gives no useful information about the actual remainder, only whether
2441it's zero (or a particular value).
2442
2443However when testing divisibility by several small integers, it's best to take
2444a remainder modulo their product, to save multi-precision operations.  For
2445instance to test whether a number is divisible by any of 23, 29 or 31 take a
2446remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that.
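
That test might be written as follows.  It's a sketch; the function name is
made up, and @code{mpz_fdiv_ui} is chosen because its remainder is
non-negative and congruent to @code{n} whatever the sign of @code{n}.

@example
#include <gmp.h>

/* Return non-zero if n is divisible by any of 23, 29 or 31 */
int
divisible_by_23_29_31 (const mpz_t n)
@{
  /* one multi-precision operation, 23*29*31 = 20677 */
  unsigned long  r = mpz_fdiv_ui (n, 20677UL);

  /* the rest is single-word arithmetic */
  return (r % 23 == 0) || (r % 29 == 0) || (r % 31 == 0);
@}
@end example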
2447
2448The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
2449as a remainder are generally a little slower than the remainder-only functions
2450like @code{mpz_tdiv_ui}.  If the quotient is only rarely wanted then it's
2451probably best to just take a remainder and then go back and calculate the
2452quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
2453remainder is zero).
2454
2455@item Rational Arithmetic
2456@cindex Rational arithmetic
2457The @code{mpq} functions operate on @code{mpq_t} values with no common factors
2458in the numerator and denominator.  Common factors are checked-for and cast out
2459as necessary.  In general, cancelling factors every time is the best approach
2460since it minimizes the sizes for subsequent operations.
2461
2462However, applications that know something about the factorization of the
2463values they're working with might be able to avoid some of the GCDs used for
2464canonicalization, or swap them for divisions.  For example when multiplying by
2465a prime it's enough to check for factors of it in the denominator instead of
2466doing a full GCD@.  Or when forming a big product it might be known that very
2467little cancellation will be possible, and so canonicalization can be left to
2468the end.
2469
2470The @code{mpq_numref} and @code{mpq_denref} macros give access to the
2471numerator and denominator to do things outside the scope of the supplied
2472@code{mpq} functions.  @xref{Applying Integer Functions}.
2473
2474The canonical form for rationals allows mixed-type @code{mpq_t} and integer
2475additions or subtractions to be done directly with multiples of the
2476denominator.  This will be somewhat faster than @code{mpq_add}.  For example,
2477
2478@example
2479/* mpq increment */
2480mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));
2481
2482/* mpq += unsigned long */
2483mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);
2484
2485/* mpq -= mpz */
2486mpz_submul (mpq_numref(q), mpq_denref(q), z);
2487@end example
2488
2489@item Number Sequences
2490@cindex Number sequences
Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
are designed for calculating isolated values.  If a range of values is wanted
it's probably best to call one of them to get a starting point and iterate from
there.
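
For Fibonacci numbers this could look like the following sketch, which calls
@code{mpz_fib2_ui} once for a pair of starting values and then produces each
further value with a single addition.  The function name and the choice of an
array for the results are illustrative only.

@example
#include <gmp.h>

/* Set out[i] = F[n+i] for 0 <= i < count.  The out[] entries must
   already be initialized.  */
void
fib_range (mpz_t *out, unsigned long n, unsigned long count)
@{
  mpz_t          prev;
  unsigned long  i;

  if (count == 0)
    return;

  mpz_init (prev);
  mpz_fib2_ui (out[0], prev, n);       /* out[0] = F[n], prev = F[n-1] */
  if (count > 1)
    mpz_add (out[1], out[0], prev);    /* F[n+1] = F[n] + F[n-1] */
  for (i = 2; i < count; i++)
    mpz_add (out[i], out[i-1], out[i-2]);
  mpz_clear (prev);
@}
@end example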
2494
2495@item Text Input/Output
2496@cindex Text input/output
2497Hexadecimal or octal are suggested for input or output in text form.
2498Power-of-2 bases like these can be converted much more efficiently than other
2499bases, like decimal.  For big numbers there's usually nothing of particular
2500interest to be seen in the digits, so the base doesn't matter much.
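
A round trip through hexadecimal text might be sketched like this.  The
function name is made up.  @code{mpz_get_str} with a @code{NULL} buffer
allocates the string using the current allocation function, and the sketch
assumes the default @code{malloc}-based functions are in use, so that
@code{free} is the right way to release it.

@example
#include <stdlib.h>
#include <gmp.h>

/* Return non-zero if x survives conversion to hex text and back */
int
hex_round_trip (const mpz_t x)
@{
  mpz_t  y;
  char   *s;
  int    ok;

  s = mpz_get_str (NULL, 16, x);           /* allocated by GMP */
  ok = (mpz_init_set_str (y, s, 16) == 0   /* 0 means a valid string */
        && mpz_cmp (x, y) == 0);

  mpz_clear (y);
  free (s);     /* assumes the default malloc-based allocation functions */
  return ok;
@}
@end example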
2501
2502Maybe we can hope octal will one day become the normal base for everyday use,
2503as proposed by King Charles XII of Sweden and later reformers.
2504@c Reference: Knuth volume 2 section 4.1, page 184 of second edition.  :-)
2505@end table
2506
2507
2508@node Debugging, Profiling, Efficiency, GMP Basics
2509@section Debugging
2510@cindex Debugging
2511
2512@table @asis
2513@item Stack Overflow
2514@cindex Stack overflow
2515@cindex Segmentation violation
2516@cindex Bus error
2517Depending on the system, a segmentation violation or bus error might be the
2518only indication of stack overflow.  See @samp{--enable-alloca} choices in
2519@ref{Build Options}, for how to address this.
2520
2521In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
2522overflow is recognised by the system before too much damage is done, or
2523@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
2524add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
2525Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
These options must be added to the @samp{CFLAGS} used in the GMP build
(@pxref{Build Options}); adding them just to an application will have no
effect.  Note also that they cause a slowdown, adding overhead to each
function call and each stack allocation.
2530
2531@item Heap Problems
2532@cindex Heap problems
2533@cindex Malloc problems
2534The most likely cause of application problems with GMP is heap corruption.
2535Failing to @code{init} GMP variables will have unpredictable effects, and
2536corruption arising elsewhere in a program may well affect GMP@.  Initializing
2537GMP variables more than once or failing to clear them will cause memory leaks.
2538
2539@cindex Malloc debugger
2540In all such cases a @code{malloc} debugger is recommended.  On a GNU or BSD
2541system the standard C library @code{malloc} has some diagnostic facilities,
2542see @ref{Allocation Debugging,, Allocation Debugging, libc, The GNU C Library
2543Reference Manual}, or @samp{man 3 malloc}.  Other possibilities, in no
2544particular order, include
2545
@display
@uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc/}
@uref{http://dmalloc.com/}
@uref{http://www.perens.com/FreeSoftware/} @ (electric fence)
@uref{http://packages.debian.org/stable/devel/fda}
@uref{http://www.gnupdate.org/components/leakbug/}
@uref{http://people.redhat.com/~otaylor/memprof/}
@uref{http://www.cbmamiga.demon.co.uk/mpatrol/}
@end display
2555
2556The GMP default allocation routines in @file{memory.c} also have a simple
2557sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
2558This is mainly designed for detecting buffer overruns during GMP development,
2559but might find other uses.
2560
2561@item Stack Backtraces
2562@cindex Stack backtrace
2563On some systems the compiler options GMP uses by default can interfere with
2564debugging.  In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
2565is used and this generally inhibits stack backtracing.  Recompiling without
2566such options may help while debugging, though the usual caveats about it
2567potentially moving a memory problem or hiding a compiler bug will apply.
2568
2569@item GDB, the GNU Debugger
2570@cindex GDB
2571@cindex GNU Debugger
2572A sample @file{.gdbinit} is included in the distribution, showing how to call
2573some undocumented dump functions to print GMP variables from within GDB@.  Note
2574that these functions shouldn't be used in final application code since they're
2575undocumented and may be subject to incompatible changes in future versions of
2576GMP.
2577
2578@item Source File Paths
2579GMP has multiple source files with the same name, in different directories.
2580For example @file{mpz}, @file{mpq} and @file{mpf} each have an
2581@file{init.c}.  If the debugger can't already determine the right one it may
2582help to build with absolute paths on each C file.  One way to do that is to
2583use a separate object directory with an absolute path to the source directory.
2584
@example
cd /my/build/dir
/my/source/dir/gmp-@value{VERSION}/configure
@end example
2589
This works via @code{VPATH}, and might require GNU @command{make}.
Alternatively it might be possible to change the @code{.c.lo} rules
appropriately.
2593
2594@item Assertion Checking
2595@cindex Assertion checking
2596The build option @option{--enable-assert} is available to add some consistency
2597checks to the library (see @ref{Build Options}).  These are likely to be of
2598limited value to most applications.  Assertion failures are just as likely to
2599indicate memory corruption as a library or compiler bug.
2600
2601Applications using the low-level @code{mpn} functions, however, will benefit
2602from @option{--enable-assert} since it adds checks on the parameters of most
2603such functions, many of which have subtle restrictions on their usage.  Note
2604however that only the generic C code has checks, not the assembly code, so
2605@option{--disable-assembly} should be used for maximum checking.
2606
2607@item Temporary Memory Checking
2608The build option @option{--enable-alloca=debug} arranges that each block of
2609temporary memory in GMP is allocated with a separate call to @code{malloc} (or
2610the allocation function set with @code{mp_set_memory_functions}).
2611
2612This can help a malloc debugger detect accesses outside the intended bounds,
2613or detect memory not released.  In a normal build, on the other hand,
2614temporary memory is allocated in blocks which GMP divides up for its own use,
2615or may be allocated with a compiler builtin @code{alloca} which will go
2616nowhere near any malloc debugger hooks.
2617
2618@item Maximum Debuggability
2619To summarize the above, a GMP build for maximum debuggability would be
2620
@example
./configure --disable-shared --enable-assert \
  --enable-alloca=debug --disable-assembly CFLAGS=-g
@end example
2625
2626For C++, add @samp{--enable-cxx CXXFLAGS=-g}.
2627
2628@item Checker
2629@cindex Checker
2630@cindex GCC Checker
2631The GCC checker (@uref{http://savannah.nongnu.org/projects/checker/}) can be
2632used with GMP@.  It contains a stub library which means GMP applications
2633compiled with checker can use a normal GMP build.
2634
A build of GMP with checking within GMP itself can be made.  This will run
very slowly.  On GNU/Linux for example,
2637
2638@cindex @command{checkergcc}
@example
./configure --disable-assembly CC=checkergcc
@end example
2642
2643@option{--disable-assembly} must be used, since the GMP assembly code doesn't
2644support the checking scheme.  The GMP C++ features cannot be used, since
2645current versions of checker (0.9.9.1) don't yet support the standard C++
2646library.
2647
2648@item Valgrind
2649@cindex Valgrind
2650Valgrind (@uref{http://valgrind.org/}) is a memory checker for x86, ARM, MIPS,
2651PowerPC, and S/390.  It translates and emulates machine instructions to do
2652strong checks for uninitialized data (at the level of individual bits), memory
2653accesses through bad pointers, and memory leaks.
2654
2655Valgrind does not always support every possible instruction, in particular
2656ones recently added to an ISA.  Valgrind might therefore be incompatible with
2657a recent GMP or even a less recent GMP which is compiled using a recent GCC.
2658
2659GMP's assembly code sometimes promotes a read of the limbs to some larger size,
2660for efficiency.  GMP will do this even at the start and end of a multilimb
2661operand, using naturally aligned operations on the larger type.  This may lead
2662to benign reads outside of allocated areas, triggering complaints from
2663Valgrind.  Valgrind's option @samp{--partial-loads-ok=yes} should help.
2664
2665@item Other Problems
2666Any suspected bug in GMP itself should be isolated to make sure it's not an
2667application problem, see @ref{Reporting Bugs}.
2668@end table
2669
2670
2671@node Profiling, Autoconf, Debugging, GMP Basics
2672@section Profiling
2673@cindex Profiling
2674@cindex Execution profiling
2675@cindex @code{--enable-profiling}
2676
2677Running a program under a profiler is a good way to find where it's spending
2678most time and where improvements can be best sought.  The profiling choices
2679for a GMP build are as follows.
2680
2681@table @asis
2682@item @samp{--disable-profiling}
2683The default is to add nothing special for profiling.
2684
2685It should be possible to just compile the mainline of a program with @code{-p}
2686and use @command{prof} to get a profile consisting of timer-based sampling of
2687the program counter.  Most of the GMP assembly code has the necessary symbol
2688information.
2689
2690This approach has the advantage of minimizing interference with normal program
2691operation, but on most systems the resolution of the sampling is quite low (10
2692milliseconds for instance), requiring long runs to get accurate information.
2693
2694@item @samp{--enable-profiling=prof}
2695@cindex @code{prof}
2696Build with support for the system @command{prof}, which means @samp{-p} added
2697to the @samp{CFLAGS}.
2698
2699This provides call counting in addition to program counter sampling, which
2700allows the most frequently called routines to be identified, and an average
2701time spent in each routine to be determined.
2702
2703The x86 assembly code has support for this option, but on other processors
2704the assembly routines will be as if compiled without @samp{-p} and therefore
2705won't appear in the call counts.
2706
2707On some systems, such as GNU/Linux, @samp{-p} in fact means @samp{-pg} and in
2708this case @samp{--enable-profiling=gprof} described below should be used
2709instead.
2710
2711@item @samp{--enable-profiling=gprof}
2712@cindex @code{gprof}
2713Build with support for @command{gprof}, which means @samp{-pg} added to the
2714@samp{CFLAGS}.
2715
2716This provides call graph construction in addition to call counting and program
2717counter sampling, which makes it possible to count calls coming from different
2718locations.  For example the number of calls to @code{mpn_mul} from
2719@code{mpz_mul} versus the number from @code{mpf_mul}.  The program counter
2720sampling is still flat though, so only a total time in @code{mpn_mul} would be
2721accumulated, not a separate amount for each call site.
2722
2723The x86 assembly code has support for this option, but on other processors
2724the assembly routines will be as if compiled without @samp{-pg} and therefore
2725not be included in the call counts.
2726
2727On x86 and m68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
2728incompatible, so the latter is omitted from the default flags in that case,
2729which might result in poorer code generation.
2730
2731Incidentally, it should be possible to use the @command{gprof} program with a
2732plain @samp{--enable-profiling=prof} build.  But in that case only the
2733@samp{gprof -p} flat profile and call counts can be expected to be valid, not
2734the @samp{gprof -q} call graph.
2735
2736@item @samp{--enable-profiling=instrument}
2737@cindex @code{-finstrument-functions}
2738@cindex @code{instrument-functions}
2739Build with the GCC option @samp{-finstrument-functions} added to the
2740@samp{CFLAGS} (@pxref{Code Gen Options,, Options for Code Generation, gcc,
2741Using the GNU Compiler Collection (GCC)}).
2742
2743This inserts special instrumenting calls at the start and end of each
2744function, allowing exact timing and full call graph construction.
2745
2746This instrumenting is not normally a standard system feature and will require
2747support from an external library, such as
2748
2749@cindex FunctionCheck
2750@cindex fnccheck
@display
@uref{http://sourceforge.net/projects/fnccheck/}
@end display
2754
2755This should be included in @samp{LIBS} during the GMP configure so that test
2756programs will link.  For example,
2757
@example
./configure --enable-profiling=instrument LIBS=-lfc
@end example
2761
2762On a GNU system the C library provides dummy instrumenting functions, so
2763programs compiled with this option will link.  In this case it's only
2764necessary to ensure the correct library is added when linking an application.
2765
2766The x86 assembly code supports this option, but on other processors the
2767assembly routines will be as if compiled without
2768@samp{-finstrument-functions} meaning time spent in them will effectively be
2769attributed to their caller.
2770@end table
2771
2772
2773@node Autoconf, Emacs, Profiling, GMP Basics
2774@section Autoconf
2775@cindex Autoconf
2776
2777Autoconf based applications can easily check whether GMP is installed.  The
2778only thing to be noted is that GMP library symbols from version 3 onwards have
2779prefixes like @code{__gmpz}.  The following therefore would be a simple test,
2780
2781@cindex @code{AC_CHECK_LIB}
@example
AC_CHECK_LIB(gmp, __gmpz_init)
@end example
2785
2786This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
2787but an application that must have GMP would want to generate an error if not
2788found.  For example,
2789
@example
AC_CHECK_LIB(gmp, __gmpz_init, ,
  [AC_MSG_ERROR([GNU MP not found, see http://gmplib.org/])])
@end example
2794
2795If functions added in some particular version of GMP are required, then one of
2796those can be used when checking.  For example @code{mpz_mul_si} was added in
2797GMP 3.1,
2798
@example
AC_CHECK_LIB(gmp, __gmpz_mul_si, ,
  [AC_MSG_ERROR(
  [GNU MP not found, or not 3.1 or up, see http://gmplib.org/])])
@end example
2804
2805An alternative would be to test the version number in @file{gmp.h} using say
2806@code{AC_EGREP_CPP}.  That would make it possible to test the exact version,
2807if some particular sub-minor release is known to be necessary.
2808
2809In general it's recommended that applications should simply demand a new
2810enough GMP rather than trying to provide supplements for features not
2811available in past versions.
2812
2813Occasionally an application will need or want to know the size of a type at
2814configuration or preprocessing time, not just with @code{sizeof} in the code.
2815This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
2816up is best for this, since prior versions needed certain @samp{-D} defines on
2817systems using a @code{long long} limb.  The following would suit Autoconf 2.50
2818or up,
2819
@example
AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
@end example
2823
2824
2825@node Emacs,  , Autoconf, GMP Basics
2826@section Emacs
2827@cindex Emacs
2828@cindex @code{info-lookup-symbol}
2829
2830@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
2831on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
2832emacs, The Emacs Editor}).
2833
2834The GMP manual can be included in such lookups by putting the following in
2835your @file{.emacs},
2836
2837@c  This isn't pretty, but there doesn't seem to be a better way (in emacs
2838@c  21.2 at least).  info-lookup->mode-value could be used for the "assoc"s,
2839@c  but that function isn't documented, whereas info-lookup-alist is.
2840@c
@example
(eval-after-load "info-look"
  '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
     (setcar (nthcdr 3 mode-value)
             (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
                   (nth 3 mode-value)))))
@end example
2848
2849
2850@node Reporting Bugs, Integer Functions, GMP Basics, Top
2851@comment  node-name,  next,  previous,  up
2852@chapter Reporting Bugs
2853@cindex Reporting bugs
2854@cindex Bug reporting
2855
2856If you think you have found a bug in the GMP library, please investigate it
2857and report it.  We have made this library available to you, and it is not too
2858much to ask you to report the bugs you find.
2859
2860Before you report a bug, check it's not already addressed in @ref{Known Build
2861Problems}, or perhaps @ref{Notes for Particular Systems}.  You may also want
2862to check @uref{http://gmplib.org/} for patches for this release.
2863
2864Please include the following in any report,
2865
2866@itemize @bullet
2867@item
2868The GMP version number, and if pre-packaged or patched then say so.
2869
2870@item
2871A test program that makes it possible for us to reproduce the bug.  Include
2872instructions on how to run the program.
2873
2874@item
A description of what is wrong.  If the results are incorrect, say in what
way.  If you get a crash, say so.
2877
2878@item
2879If you get a crash, include a stack backtrace from the debugger if it's
2880informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).
2881
2882@item
2883Please do not send core dumps, executables or @command{strace}s.
2884
2885@item
2886The @samp{configure} options you used when building GMP, if any.
2887
2888@item
2889The output from @samp{configure}, as printed to stdout, with any options used.
2890
2891@item
2892The name of the compiler and its version.  For @command{gcc}, get the version
2893with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.
2894
2895@item
2896The output from running @samp{uname -a}.
2897
2898@item
2899The output from running @samp{./config.guess}, and from running
2900@samp{./configfsf.guess} (might be the same).
2901
2902@item
2903If the bug is related to @samp{configure}, then the compressed contents of
2904@file{config.log}.
2905
2906@item
2907If the bug is related to an @file{asm} file not assembling, then the contents
2908of @file{config.m4} and the offending line or lines from the temporary
2909@file{mpn/tmp-<file>.s}.
2910@end itemize
2911
2912Please make an effort to produce a self-contained report, with something
2913definite that can be tested or debugged.  Vague queries or piecemeal messages
2914are difficult to act on and don't help the development effort.
2915
2916It is not uncommon that an observed problem is actually due to a bug in the
2917compiler; the GMP code tends to explore interesting corners in compilers.
2918
2919If your bug report is good, we will do our best to help you get a corrected
2920version of the library; if the bug report is poor, we won't do anything about
2921it (except maybe ask you to send a better report).
2922
2923Send your report to: @email{gmp-bugs@@gmplib.org}.
2924
2925If you think something in this manual is unclear, or downright incorrect, or if
2926the language needs to be improved, please send a note to the same address.
2927
2928
2929@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
2930@comment  node-name,  next,  previous,  up
2931@chapter Integer Functions
2932@cindex Integer functions
2933
2934This chapter describes the GMP functions for performing integer arithmetic.
2935These functions start with the prefix @code{mpz_}.
2936
2937GMP integers are stored in objects of type @code{mpz_t}.
2938
2939@menu
2940* Initializing Integers::
2941* Assigning Integers::
2942* Simultaneous Integer Init & Assign::
2943* Converting Integers::
2944* Integer Arithmetic::
2945* Integer Division::
2946* Integer Exponentiation::
2947* Integer Roots::
2948* Number Theoretic Functions::
2949* Integer Comparisons::
2950* Integer Logic and Bit Fiddling::
2951* I/O of Integers::
2952* Integer Random Numbers::
2953* Integer Import and Export::
2954* Miscellaneous Integer Functions::
2955* Integer Special Functions::
2956@end menu
2957
2958@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
2959@comment  node-name,  next,  previous,  up
2960@section Initialization Functions
2961@cindex Integer initialization functions
2962@cindex Initialization functions
2963
2964The functions for integer arithmetic assume that all integer objects are
2965initialized.  You do that by calling the function @code{mpz_init}.  For
2966example,
2967
@example
@{
  mpz_t integ;
  mpz_init (integ);
  @dots{}
  mpz_add (integ, @dots{});
  @dots{}
  mpz_sub (integ, @dots{});

  /* Unless the program is about to exit, do ... */
  mpz_clear (integ);
@}
@end example
2981
2982As you can see, you can store new values any number of times, once an
2983object is initialized.
2984
2985@deftypefun void mpz_init (mpz_t @var{x})
2986Initialize @var{x}, and set its value to 0.
2987@end deftypefun
2988
2989@deftypefun void mpz_inits (mpz_t @var{x}, ...)
2990Initialize a NULL-terminated list of @code{mpz_t} variables, and set their
2991values to 0.
2992@end deftypefun
2993
2994@deftypefun void mpz_init2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
2995Initialize @var{x}, with space for @var{n}-bit numbers, and set its value to 0.
2996Calling this function instead of @code{mpz_init} or @code{mpz_inits} is never
2997necessary; reallocation is handled automatically by GMP when needed.
2998
2999While @var{n} defines the initial space, @var{x} will grow automatically in the
3000normal way, if necessary, for subsequent values stored.  @code{mpz_init2} makes
3001it possible to avoid such reallocations if a maximum size is known in advance.
3002
3003In preparation for an operation, GMP often allocates one limb more than
3004ultimately needed.  To make sure GMP will not perform reallocation for
3005@var{x}, you need to add the number of bits in @code{mp_limb_t} to @var{n}.
3006@end deftypefun
3007
3008@deftypefun void mpz_clear (mpz_t @var{x})
3009Free the space occupied by @var{x}.  Call this function for all @code{mpz_t}
3010variables when you are done with them.
3011@end deftypefun
3012
3013@deftypefun void mpz_clears (mpz_t @var{x}, ...)
3014Free the space occupied by a NULL-terminated list of @code{mpz_t} variables.
3015@end deftypefun
3016
3017@deftypefun void mpz_realloc2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
3018Change the space allocated for @var{x} to @var{n} bits.  The value in @var{x}
3019is preserved if it fits, or is set to 0 if not.
3020
3021Calling this function is never necessary; reallocation is handled automatically
3022by GMP when needed.  But this function can be used to increase the space for a
3023variable in order to avoid repeated automatic reallocations, or to decrease it
3024to give memory back to the heap.
3025@end deftypefun
3026
3027
3028@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
3029@comment  node-name,  next,  previous,  up
3030@section Assignment Functions
3031@cindex Integer assignment functions
3032@cindex Assignment functions
3033
3034These functions assign new values to already initialized integers
3035(@pxref{Initializing Integers}).
3036
3037@deftypefun void mpz_set (mpz_t @var{rop}, const mpz_t @var{op})
3038@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
3039@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
3040@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
3041@deftypefunx void mpz_set_q (mpz_t @var{rop}, const mpq_t @var{op})
3042@deftypefunx void mpz_set_f (mpz_t @var{rop}, const mpf_t @var{op})
3043Set the value of @var{rop} from @var{op}.
3044
3045@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
3046make it an integer.
3047@end deftypefun
3048
3049@deftypefun int mpz_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
3050Set the value of @var{rop} from @var{str}, a null-terminated C string in base
3051@var{base}.  White space is allowed in the string, and is simply ignored.
3052
3053The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
3054characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
3055@code{0B} for binary, @code{0} for octal, or decimal otherwise.
3056
For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value.  For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.
3060
3061This function returns 0 if the entire string is a valid number in base
3062@var{base}.  Otherwise it returns @minus{}1.
3063@c
3064@c  It turns out that it is not entirely true that this function ignores
3065@c  white-space.  It does ignore it between digits, but not after a minus sign
3066@c  or within or after ``0x''.  Some thought was given to disallowing all
3067@c  whitespace, but that would be an incompatible change, whitespace has been
3068@c  documented as ignored ever since GMP 1.
3069@c
3070@end deftypefun
3071
3072@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
3073Swap the values @var{rop1} and @var{rop2} efficiently.
3074@end deftypefun
3075
3076
3077@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
3078@comment  node-name,  next,  previous,  up
3079@section Combined Initialization and Assignment Functions
3080@cindex Integer assignment functions
3081@cindex Assignment functions
3082@cindex Integer initialization functions
3083@cindex Initialization functions
3084
3085For convenience, GMP provides a parallel series of initialize-and-set functions
3086which initialize the output and then store the value there.  These functions'
3087names have the form @code{mpz_init_set@dots{}}
3088
3089Here is an example of using one:
3090
@example
@{
  mpz_t pie;
  mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
  @dots{}
  mpz_sub (pie, @dots{});
  @dots{}
  mpz_clear (pie);
@}
@end example
3101
3102@noindent
3103Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
3104functions, it can be used as the source or destination operand for the ordinary
3105integer functions.  Don't use an initialize-and-set function on a variable
3106already initialized!
3107
3108@deftypefun void mpz_init_set (mpz_t @var{rop}, const mpz_t @var{op})
3109@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
3110@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
3111@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
3112Initialize @var{rop} with limb space and set the initial numeric value from
3113@var{op}.
3114@end deftypefun
3115
3116@deftypefun int mpz_init_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
3117Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
3118documentation above for details).
3119
3120If the string is a correct base @var{base} number, the function returns 0;
3121if an error occurs it returns @minus{}1.  @var{rop} is initialized even if
3122an error occurs.  (I.e., you have to call @code{mpz_clear} for it.)
3123@end deftypefun
3124
3125
3126@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
3127@comment  node-name,  next,  previous,  up
3128@section Conversion Functions
3129@cindex Integer conversion functions
3130@cindex Conversion functions
3131
3132This section describes functions for converting GMP integers to standard C
3133types.  Functions for converting @emph{to} GMP integers are described in
3134@ref{Assigning Integers} and @ref{I/O of Integers}.
3135
3136@deftypefun {unsigned long int} mpz_get_ui (const mpz_t @var{op})
3137Return the value of @var{op} as an @code{unsigned long}.
3138
3139If @var{op} is too big to fit an @code{unsigned long} then just the least
3140significant bits that do fit are returned.  The sign of @var{op} is ignored,
3141only the absolute value is used.
3142@end deftypefun
3143
3144@deftypefun {signed long int} mpz_get_si (const mpz_t @var{op})
3145If @var{op} fits into a @code{signed long int} return the value of @var{op}.
3146Otherwise return the least significant part of @var{op}, with the same sign
3147as @var{op}.
3148
3149If @var{op} is too big to fit in a @code{signed long int}, the returned
3150result is probably not very useful.  To find out if the value will fit, use
3151the function @code{mpz_fits_slong_p}.
3152@end deftypefun
3153
3154@deftypefun double mpz_get_d (const mpz_t @var{op})
3155Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
3156towards zero).
3157
3158If the exponent from the conversion is too big, the result is system
3159dependent.  An infinity is returned where available.  A hardware overflow trap
3160may or may not occur.
3161@end deftypefun
3162
3163@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, const mpz_t @var{op})
3164Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
3165towards zero), and returning the exponent separately.
3166
3167The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
3168exponent is stored to @code{*@var{exp}}.  @m{@var{d} * 2^{exp}, @var{d} *
31692^@var{exp}} is the (truncated) @var{op} value.  If @var{op} is zero, the
3170return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
3171
3172@cindex @code{frexp}
3173This is similar to the standard C @code{frexp} function (@pxref{Normalization
3174Functions,,, libc, The GNU C Library Reference Manual}).
3175@end deftypefun
3176
3177@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, const mpz_t @var{op})
3178Convert @var{op} to a string of digits in base @var{base}.  The base argument
3179may vary from 2 to 62 or from @minus{}2 to @minus{}36.
3180
3181For @var{base} in the range 2..36, digits and lower-case letters are used; for
3182@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
3183digits, upper-case letters, and lower-case letters (in that significance order)
3184are used.
3185
If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}).  The block will be
@code{strlen(@var{result})+1} bytes, where @var{result} is the returned
string, that being exactly enough for the string and null-terminator.
3190
3191If @var{str} is not @code{NULL}, it should point to a block of storage large
3192enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
3193+ 2}.  The two extra bytes are for a possible minus sign, and the
3194null-terminator.
3195
3196A pointer to the result string is returned, being either the allocated block,
3197or the given @var{str}.
3198@end deftypefun
3199
3200
3201@need 2000
3202@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
3203@comment  node-name,  next,  previous,  up
3204@section Arithmetic Functions
3205@cindex Integer arithmetic functions
3206@cindex Arithmetic functions
3207
3208@deftypefun void mpz_add (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3209@deftypefunx void mpz_add_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3210Set @var{rop} to @math{@var{op1} + @var{op2}}.
3211@end deftypefun
3212
3213@deftypefun void mpz_sub (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3214@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3215@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, const mpz_t @var{op2})
3216Set @var{rop} to @var{op1} @minus{} @var{op2}.
3217@end deftypefun
3218
3219@deftypefun void mpz_mul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3220@deftypefunx void mpz_mul_si (mpz_t @var{rop}, const mpz_t @var{op1}, long int @var{op2})
3221@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3222Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
3223@end deftypefun
3224
3225@deftypefun void mpz_addmul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3226@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3227Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
3228@end deftypefun
3229
3230@deftypefun void mpz_submul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3231@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3232Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
3233@end deftypefun
3234
3235@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, const mpz_t @var{op1}, mp_bitcnt_t @var{op2})
3236@cindex Bit shift left
3237Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
3238@var{op2}}.  This operation can also be defined as a left shift by @var{op2}
3239bits.
3240@end deftypefun
3241
3242@deftypefun void mpz_neg (mpz_t @var{rop}, const mpz_t @var{op})
3243Set @var{rop} to @minus{}@var{op}.
3244@end deftypefun
3245
3246@deftypefun void mpz_abs (mpz_t @var{rop}, const mpz_t @var{op})
3247Set @var{rop} to the absolute value of @var{op}.
3248@end deftypefun
3249
3250
3251@need 2000
3252@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
3253@section Division Functions
3254@cindex Integer division functions
3255@cindex Division functions
3256
Division is undefined if the divisor is zero.  Passing a zero divisor to the
division or modulo functions (including the modular powering functions
@code{mpz_powm} and @code{mpz_powm_ui}) will cause an intentional division by
zero.  This lets a program handle arithmetic exceptions in these functions the
same way as for normal C @code{int} arithmetic.
3262
3263@c  Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
3264@c  between each, and seem to let tex do a better job of page breaks than an
3265@c  @sp 1 in the middle of one big set.
3266
3267@deftypefun void mpz_cdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
3268@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3269@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3270@maybepagebreak
3271@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3272@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3273@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
3274@deftypefunx {unsigned long int} mpz_cdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
3275@maybepagebreak
3276@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3277@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3278@end deftypefun
3279
3280@deftypefun void mpz_fdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
3281@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3282@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3283@maybepagebreak
3284@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3285@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3286@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
3287@deftypefunx {unsigned long int} mpz_fdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
3288@maybepagebreak
3289@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3290@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3291@end deftypefun
3292
3293@deftypefun void mpz_tdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
3294@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3295@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3296@maybepagebreak
3297@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3298@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3299@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
3300@deftypefunx {unsigned long int} mpz_tdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
3301@maybepagebreak
3302@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3303@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3304@cindex Bit shift right
3305
3306@sp 1
3307Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
3308@var{r}.  For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
3309The rounding is in three styles, each suiting different applications.
3310
3311@itemize @bullet
3312@item
3313@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
3314have the opposite sign to @var{d}.  The @code{c} stands for ``ceil''.
3315
3316@item
3317@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
3318@var{r} will have the same sign as @var{d}.  The @code{f} stands for
3319``floor''.
3320
3321@item
3322@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
3323as @var{n}.  The @code{t} stands for ``truncate''.
3324@end itemize
3325
3326In all cases @var{q} and @var{r} will satisfy
3327@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
3328@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.
3329
3330The @code{q} functions calculate only the quotient, the @code{r} functions
3331only the remainder, and the @code{qr} functions calculate both.  Note that for
3332@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
3333results will be unpredictable.
3334
3335For the @code{ui} variants the return value is the remainder, and in fact
3336returning the remainder is all the @code{div_ui} functions do.  For
3337@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
3338return value is the absolute value of the remainder.
3339
3340For the @code{2exp} variants the divisor is @m{2^b,2^@var{b}}.  These
3341functions are implemented as right shifts and bit masks, but of course they
3342round the same as the other functions.
3343
3344For positive @var{n} both @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp}
3345are simple bitwise right shifts.  For negative @var{n}, @code{mpz_fdiv_q_2exp}
3346is effectively an arithmetic right shift treating @var{n} as twos complement
3347the same as the bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp}
3348effectively treats @var{n} as sign and magnitude.
3349@end deftypefun
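
The three rounding styles can be illustrated with ordinary C integers.  The
following is a minimal sketch on @code{long} values rather than @code{mpz_t};
the names @code{struct qr}, @code{tdiv_model}, @code{fdiv_model} and
@code{cdiv_model} are hypothetical and are not part of GMP.  It assumes C99
or later, where @code{/} and @code{%} truncate towards zero.

```c
#include <assert.h>

/* Hypothetical helpers modelling the tdiv/fdiv/cdiv rounding rules on
   plain long values.  They are not GMP functions.  */
struct qr { long q, r; };

/* tdiv: C's own / and % already truncate towards zero (C99 onwards),
   so r has the same sign as n.  */
struct qr tdiv_model (long n, long d)
{
  struct qr x;
  assert (d != 0);            /* division by zero is undefined */
  x.q = n / d;
  x.r = n % d;
  return x;
}

/* fdiv: step the truncated quotient down when r and d differ in sign,
   so that r ends up with the same sign as d.  */
struct qr fdiv_model (long n, long d)
{
  struct qr x = tdiv_model (n, d);
  if (x.r != 0 && ((x.r < 0) != (d < 0)))
    {
      x.q -= 1;
      x.r += d;
    }
  return x;
}

/* cdiv: step the truncated quotient up when r and d agree in sign,
   so that r ends up with the opposite sign to d.  */
struct qr cdiv_model (long n, long d)
{
  struct qr x = tdiv_model (n, d);
  if (x.r != 0 && ((x.r < 0) == (d < 0)))
    {
      x.q += 1;
      x.r -= d;
    }
  return x;
}
```

For @math{n = -7} and @math{d = 2} the model gives @math{q = -3, r = -1} for
@code{cdiv}, @math{q = -4, r = 1} for @code{fdiv}, and @math{q = -3, r = -1}
for @code{tdiv}.  In every case @math{n = q*d + r} holds.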
3350
3351@deftypefun void mpz_mod (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3352@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3353Set @var{r} to @var{n} @code{mod} @var{d}.  The sign of the divisor is
3354ignored; the result is always non-negative.
3355
3356@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
3357remainder as well as setting @var{r}.  See @code{mpz_fdiv_ui} above if only
3358the return value is wanted.
3359@end deftypefun
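
The rule that the sign of the divisor is ignored can be sketched on plain
@code{long} values as follows.  The name @code{mod_model} is hypothetical,
and the sketch assumes @var{d} is not the most negative @code{long}.

```c
#include <assert.h>

/* Hypothetical model of the mpz_mod rule on plain long values: the sign
   of d is ignored and the result lies in 0 <= r < |d|.  */
long mod_model (long n, long d)
{
  long r;
  assert (d != 0);           /* division by zero is undefined */
  r = n % d;                 /* truncating remainder, same sign as n */
  if (r < 0)
    r += (d < 0) ? -d : d;   /* shift into the non-negative range */
  return r;
}
```

With this model, @math{-7} mod 3 and @math{-7} mod @math{-3} both give 2.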
3360
3361@deftypefun void mpz_divexact (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
3362@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, const mpz_t @var{n}, unsigned long @var{d})
3363@cindex Exact division functions
3364Set @var{q} to @var{n}/@var{d}.  These functions produce correct results only
3365when it is known in advance that @var{d} divides @var{n}.
3366
3367These routines are much faster than the other division functions, and are the
3368best choice when exact division is known to occur, for example reducing a
3369rational to lowest terms.
3370@end deftypefun
3371
3372@deftypefun int mpz_divisible_p (const mpz_t @var{n}, const mpz_t @var{d})
3373@deftypefunx int mpz_divisible_ui_p (const mpz_t @var{n}, unsigned long int @var{d})
3374@deftypefunx int mpz_divisible_2exp_p (const mpz_t @var{n}, mp_bitcnt_t @var{b})
3375@cindex Divisibility functions
3376Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of
3377@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}.
3378
3379@var{n} is divisible by @var{d} if there exists an integer @var{q} satisfying
3380@math{@var{n} = @var{q}@GMPmultiply{}@var{d}}.  Unlike the other division
3381functions, @math{@var{d}=0} is accepted and following the rule it can be seen
3382that only 0 is considered divisible by 0.
3383@end deftypefun
3384
3385@deftypefun int mpz_congruent_p (const mpz_t @var{n}, const mpz_t @var{c}, const mpz_t @var{d})
3386@deftypefunx int mpz_congruent_ui_p (const mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d})
3387@deftypefunx int mpz_congruent_2exp_p (const mpz_t @var{n}, const mpz_t @var{c}, mp_bitcnt_t @var{b})
3388@cindex Divisibility functions
3389@cindex Congruence functions
3390Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the
3391case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}.
3392
3393@var{n} is congruent to @var{c} mod @var{d} if there exists an integer @var{q}
3394satisfying @math{@var{n} = @var{c} + @var{q}@GMPmultiply{}@var{d}}.  Unlike
3395the other division functions, @math{@var{d}=0} is accepted and following the
3396rule it can be seen that @var{n} and @var{c} are considered congruent mod 0
3397only when exactly equal.
3398@end deftypefun
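
The zero-divisor rules for the divisibility and congruence tests follow
directly from the two definitions.  A minimal sketch on @code{long} values,
with hypothetical names and assuming @math{n - c} does not overflow:

```c
#include <assert.h>

/* Hypothetical models on plain long values, not GMP functions.  */

/* n is divisible by d if n = q*d for some integer q.  */
int divisible_model (long n, long d)
{
  if (d == 0)
    return n == 0;           /* only 0 = q*0 has a solution */
  return n % d == 0;
}

/* n is congruent to c mod d if n = c + q*d for some integer q.  */
int congruent_model (long n, long c, long d)
{
  if (d == 0)
    return n == c;           /* n = c + q*0 forces n == c */
  return (n - c) % d == 0;   /* assumes n - c does not overflow */
}

/* A few checks of the d = 0 cases.  */
void zero_divisor_examples (void)
{
  assert (divisible_model (0, 0) && !divisible_model (5, 0));
  assert (congruent_model (7, 7, 0) && !congruent_model (7, 8, 0));
}
```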
3399
3400
3401@need 2000
3402@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions
3403@section Exponentiation Functions
3404@cindex Integer exponentiation functions
3405@cindex Exponentiation functions
3406@cindex Powering functions
3407
3408@deftypefun void mpz_powm (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod})
3409@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp}, const mpz_t @var{mod})
3410Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
3411modulo @var{mod}}.
3412
3413Negative @var{exp} is supported if an inverse @math{@var{base}^@W{-1} @bmod
3414@var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}).
3415If an inverse doesn't exist then a divide by zero is raised.
3416@end deftypefun
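
The result of modular powering can be illustrated with binary
square-and-multiply on machine words.  This is a minimal sketch for
non-negative exponents only; the name @code{powm_model} is hypothetical and
the method shown is a textbook one, not a description of how GMP computes
the result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of (base^exp) mod m on 32-bit operands, so every
   intermediate product fits in 64 bits.  */
uint32_t powm_model (uint32_t base, uint32_t exp, uint32_t m)
{
  uint64_t result, b;
  assert (m != 0);                    /* zero modulus: division by zero */
  result = 1 % m;                     /* gives 0 when m == 1 */
  b = base % m;
  while (exp != 0)
    {
      if (exp & 1)
        result = (result * b) % m;    /* fold in this bit of exp */
      b = (b * b) % m;                /* square for the next bit */
      exp >>= 1;
    }
  return (uint32_t) result;
}
```

For example @code{powm_model (4, 13, 497)} gives 445.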
3417
3418@deftypefun void mpz_powm_sec (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod})
3419Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
3420modulo @var{mod}}.
3421
3422It is required that @math{@var{exp} > 0} and that @var{mod} is odd.
3423
3424This function is designed to take the same time and have the same cache access
3425patterns for any two same-size arguments, assuming that function arguments are
3426placed at the same position and that the machine state is identical upon
3427function entry.  This function is intended for cryptographic purposes, where
3428resilience to side-channel attacks is desired.
3429@end deftypefun
3430
3431@deftypefun void mpz_pow_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp})
3432@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp})
3433Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}.  The case
3434@math{0^0} yields 1.
3435@end deftypefun
3436
3437
3438@need 2000
3439@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions
3440@section Root Extraction Functions
3441@cindex Integer root functions
3442@cindex Root extraction functions
3443
3444@deftypefun int mpz_root (mpz_t @var{rop}, const mpz_t @var{op}, unsigned long int @var{n})
3445Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer
3446part of the @var{n}th root of @var{op}.  Return non-zero if the computation
3447was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power.
3448@end deftypefun
3449
3450@deftypefun void mpz_rootrem (mpz_t @var{root}, mpz_t @var{rem}, const mpz_t @var{u}, unsigned long int @var{n})
3451Set @var{root} to @m{\lfloor\root n \of {u}\rfloor@C{},} the truncated
3452integer part of the @var{n}th root of @var{u}.  Set @var{rem} to the
3453remainder, @m{(@var{u} - @var{root}^n),
3454@var{u}@minus{}@var{root}**@var{n}}.
3455@end deftypefun
3456
3457@deftypefun void mpz_sqrt (mpz_t @var{rop}, const mpz_t @var{op})
3458Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated
3459integer part of the square root of @var{op}.
3460@end deftypefun
3461
3462@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, const mpz_t @var{op})
3463Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
3464of the square root of @var{op}}, like @code{mpz_sqrt}.  Set @var{rop2} to the
3465remainder @m{(@var{op} - @var{rop1}^2),
3466@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a
3467perfect square.
3468
3469If @var{rop1} and @var{rop2} are the same variable, the results are
3470undefined.
3471@end deftypefun
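
The relation between the root and the remainder can be sketched on 64-bit
values.  The name @code{sqrtrem_model} is hypothetical, and the binary search
is chosen for clarity only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the sqrt-with-remainder contract on uint64_t:
   *root = floor(sqrt(op)) and *rem = op - (*root)*(*root).  */
void sqrtrem_model (uint64_t *root, uint64_t *rem, uint64_t op)
{
  uint64_t lo = 0, hi = UINT32_MAX;   /* the root always fits in 32 bits */
  assert (root != rem);               /* same variable: results undefined */
  /* Binary search for the largest lo with lo*lo <= op.  */
  while (lo < hi)
    {
      uint64_t mid = lo + (hi - lo + 1) / 2;
      if (mid * mid <= op)
        lo = mid;
      else
        hi = mid - 1;
    }
  *root = lo;
  *rem = op - lo * lo;                /* zero exactly for perfect squares */
}
```

For @math{op = 17} the model yields root 4 and remainder 1; for
@math{op = 16} the remainder is 0.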
3472
3473@deftypefun int mpz_perfect_power_p (const mpz_t @var{op})
3474@cindex Perfect power functions
3475@cindex Root testing functions
3476Return non-zero if @var{op} is a perfect power, i.e., if there exist integers
3477@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that
3478@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}.
3479
3480Under this definition both 0 and 1 are considered to be perfect powers.
3481Negative values of @var{op} are accepted, but of course can only be odd
3482perfect powers.
3483@end deftypefun
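
The treatment of 0, 1 and negative values can be illustrated with a
brute-force model on @code{long}.  This is only a sketch: the name
@code{perfect_power_model} is hypothetical and the search is far too slow
for large operands.

```c
#include <assert.h>

/* Hypothetical brute-force model: is op = a^b for integers a and b > 1?
   A negative op can only match an odd exponent b.  */
int perfect_power_model (long op)
{
  unsigned long mag = (op < 0) ? 0UL - (unsigned long) op
                               : (unsigned long) op;
  unsigned long a;
  if (mag <= 1)
    return 1;                       /* 0 = 0^2, 1 = 1^2, -1 = (-1)^3 */
  for (a = 2; a <= mag / a; a++)    /* a*a <= mag without overflow */
    {
      unsigned long p = a;
      unsigned b = 1;
      while (p <= mag / a)          /* p*a <= mag without overflow */
        {
          p *= a;
          b++;
          if (p == mag && (op > 0 || (b & 1)))
            return 1;
        }
    }
  return 0;
}

/* A few checks of the rule stated above.  */
void perfect_power_examples (void)
{
  assert (perfect_power_model (0) && perfect_power_model (1));
  assert (perfect_power_model (-8));      /* (-2)^3 */
  assert (!perfect_power_model (-4));     /* 4 is only an even power */
}
```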
3484
3485@deftypefun int mpz_perfect_square_p (const mpz_t @var{op})
3486@cindex Perfect square functions
3487@cindex Root testing functions
3488Return non-zero if @var{op} is a perfect square, i.e., if the square root of
3489@var{op} is an integer.  Under this definition both 0 and 1 are considered to
3490be perfect squares.
3491@end deftypefun
3492
3493
3494@need 2000
3495@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions
3496@section Number Theoretic Functions
3497@cindex Number theoretic functions
3498
3499@deftypefun int mpz_probab_prime_p (const mpz_t @var{n}, int @var{reps})
3500@cindex Prime testing functions
3501@cindex Probable prime testing functions
3502Determine whether @var{n} is prime.  Return 2 if @var{n} is definitely prime,
3503return 1 if @var{n} is probably prime (without being certain), or return 0 if
3504@var{n} is definitely composite.
3505
This function does some trial divisions, then some Miller-Rabin probabilistic
primality tests.  The argument @var{reps} controls how many such tests are
done; a higher value will reduce the chances of a composite being returned as
``probably prime''.  25 is a reasonable number; since each test passes a
composite with probability at most 1/4, a composite number will then be
identified as a prime with a probability of less than @m{2^{-50},2^(-50)}.
3511
3512Miller-Rabin and similar tests can be more properly called compositeness
3513tests.  Numbers which fail are known to be composite but those which pass
3514might be prime or might be composite.  Only a few composites pass, hence those
3515which pass are considered probably prime.
3516@end deftypefun
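
The remark that these are really compositeness tests can be seen in a single
round on small numbers.  The following is a minimal sketch of one textbook
Miller-Rabin round on 32-bit values.  The names @code{mr_powm} and
@code{mr_round} are hypothetical, and the sketch makes no claim about how
GMP chooses bases or arranges its rounds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: (base^exp) mod m, all products fit in 64 bits.  */
uint64_t mr_powm (uint64_t base, uint32_t exp, uint32_t m)
{
  uint64_t result = 1 % m;
  base %= m;
  while (exp != 0)
    {
      if (exp & 1)
        result = (result * base) % m;
      base = (base * base) % m;
      exp >>= 1;
    }
  return result;
}

/* One Miller-Rabin round for odd n > 3 with base a, 1 < a < n-1.
   Returns 0 if a proves n composite, 1 if n passes the round (n may
   then be prime or composite).  */
int mr_round (uint32_t n, uint32_t a)
{
  uint32_t d = n - 1;
  unsigned s = 0, i;
  uint64_t x;
  assert (n > 3 && (n & 1));
  assert (a > 1 && a < n - 1);
  while ((d & 1) == 0)               /* write n - 1 = d * 2^s, d odd */
    {
      d >>= 1;
      s++;
    }
  x = mr_powm (a, d, n);
  if (x == 1 || x == n - 1)
    return 1;
  for (i = 1; i < s; i++)
    {
      x = (x * x) % n;
      if (x == n - 1)
        return 1;
    }
  return 0;
}
```

The composite @math{2047 = 23*89} passes the round for base 2 but fails for
base 3, so base 3 proves it composite.  The prime 97 passes for both bases.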
3517
3518@deftypefun void mpz_nextprime (mpz_t @var{rop}, const mpz_t @var{op})
3519@cindex Next prime function
3520Set @var{rop} to the next prime greater than @var{op}.
3521
This function uses a probabilistic algorithm to identify primes.  For
practical purposes it's adequate, since the chance of a composite passing is
extremely small.
3525@end deftypefun
3526
3527@c mpz_prime_p not implemented as of gmp 3.0.
3528
3529@c @deftypefun int mpz_prime_p (const mpz_t @var{n})
3530@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
3531@c This function is far slower than @code{mpz_probab_prime_p}, but then it
3532@c never returns non-zero for composite numbers.
3533
3534@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
3535@c The likelihood of a programming error or hardware malfunction is orders
3536@c of magnitudes greater than the likelihood for a composite to pass as a
3537@c prime, if the @var{reps} argument is in the suggested range.)
3538@c @end deftypefun
3539
3540@deftypefun void mpz_gcd (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3541@cindex Greatest common divisor functions
3542@cindex GCD functions
Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}.  The
result is always positive even if one or both input operands are negative,
except when both inputs are zero, in which case this function defines
@math{gcd(0,0) = 0}.
3546@end deftypefun
3547
3548@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3549Compute the greatest common divisor of @var{op1} and @var{op2}.  If
3550@var{rop} is not @code{NULL}, store the result there.
3551
3552If the result is small enough to fit in an @code{unsigned long int}, it is
3553returned.  If the result does not fit, 0 is returned, and the result is equal
3554to the argument @var{op1}.  Note that the result will always fit if @var{op2}
3555is non-zero.
3556@end deftypefun
3557
3558@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, const mpz_t @var{a}, const mpz_t @var{b})
3559@cindex Extended GCD
3560@cindex GCD extended
3561Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in
3562addition set @var{s} and @var{t} to coefficients satisfying
3563@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}.
3564The value in @var{g} is always positive, even if one or both of @var{a} and
3565@var{b} are negative (or zero if both inputs are zero).  The values in @var{s}
3566and @var{t} are chosen such that normally, @math{@GMPabs{@var{s}} <
3567@GMPabs{@var{b}} / (2 @var{g})} and @math{@GMPabs{@var{t}} < @GMPabs{@var{a}}
3568/ (2 @var{g})}, and these relations define @var{s} and @var{t} uniquely.  There
3569are a few exceptional cases:
3570
3571If @math{@GMPabs{@var{a}} = @GMPabs{@var{b}}}, then @math{@var{s} = 0},
3572@math{@var{t} = sgn(@var{b})}.
3573
3574Otherwise, @math{@var{s} = sgn(@var{a})} if @math{@var{b} = 0} or
3575@math{@GMPabs{@var{b}} = 2 @var{g}}, and @math{@var{t} = sgn(@var{b})} if
3576@math{@var{a} = 0} or @math{@GMPabs{@var{a}} = 2 @var{g}}.
3577
3578In all cases, @math{@var{s} = 0} if and only if @math{@var{g} =
3579@GMPabs{@var{b}}}, i.e., if @var{b} divides @var{a} or @math{@var{a} = @var{b}
3580= 0}.
3581
3582If @var{t} is @code{NULL} then that value is not computed.
3583@end deftypefun
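
The identity that the cofactors satisfy can be illustrated with the classical
extended Euclidean algorithm on @code{long} values.  This is a sketch under
stated assumptions: the name @code{gcdext_model} is hypothetical, overflow is
ignored, and the cofactors it returns satisfy the identity but are not
normalised to the bounds documented above.

```c
#include <assert.h>

/* Hypothetical extended Euclid on long values.  Sets *s and *t so that
   a*(*s) + b*(*t) == g and returns g >= 0.  The cofactors are not
   normalised the way mpz_gcdext documents.  */
long gcdext_model (long *s, long *t, long a, long b)
{
  long r0 = a, r1 = b;
  long s0 = 1, s1 = 0;
  long t0 = 0, t1 = 1;
  assert (s != t);
  /* Invariant: a*s0 + b*t0 == r0 and a*s1 + b*t1 == r1.  */
  while (r1 != 0)
    {
      long q = r0 / r1, tmp;
      tmp = r0 - q * r1;  r0 = r1;  r1 = tmp;
      tmp = s0 - q * s1;  s0 = s1;  s1 = tmp;
      tmp = t0 - q * t1;  t0 = t1;  t1 = tmp;
    }
  if (r0 < 0)                  /* make the gcd non-negative */
    {
      r0 = -r0;
      s0 = -s0;
      t0 = -t0;
    }
  *s = s0;
  *t = t0;
  return r0;                   /* 0 only when a == b == 0 */
}
```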
3584
3585@deftypefun void mpz_lcm (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3586@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long @var{op2})
3587@cindex Least common multiple functions
3588@cindex LCM functions
3589Set @var{rop} to the least common multiple of @var{op1} and @var{op2}.
3590@var{rop} is always positive, irrespective of the signs of @var{op1} and
3591@var{op2}.  @var{rop} will be zero if either @var{op1} or @var{op2} is zero.
3592@end deftypefun
3593
3594@deftypefun int mpz_invert (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3595@cindex Modular inverse functions
3596@cindex Inverse modulo functions
3597Compute the inverse of @var{op1} modulo @var{op2} and put the result in
3598@var{rop}.  If the inverse exists, the return value is non-zero and @var{rop}
3599will satisfy @math{0 < @var{rop} < @GMPabs{@var{op2}}}.  If an inverse doesn't
3600exist the return value is zero and @var{rop} is undefined.  The behaviour of
3601this function is undefined when @var{op2} is zero.
3602@end deftypefun
3603
3604@deftypefun int mpz_jacobi (const mpz_t @var{a}, const mpz_t @var{b})
3605@cindex Jacobi symbol functions
3606Calculate the Jacobi symbol @m{\left(a \over b\right),
3607(@var{a}/@var{b})}.  This is defined only for @var{b} odd.
3608@end deftypefun
3609
3610@deftypefun int mpz_legendre (const mpz_t @var{a}, const mpz_t @var{p})
3611@cindex Legendre symbol functions
3612Calculate the Legendre symbol @m{\left(a \over p\right),
3613(@var{a}/@var{p})}.  This is defined only for @var{p} an odd positive
3614prime, and for such @var{p} it's identical to the Jacobi symbol.
3615@end deftypefun
3616
3617@deftypefun int mpz_kronecker (const mpz_t @var{a}, const mpz_t @var{b})
3618@deftypefunx int mpz_kronecker_si (const mpz_t @var{a}, long @var{b})
3619@deftypefunx int mpz_kronecker_ui (const mpz_t @var{a}, unsigned long @var{b})
3620@deftypefunx int mpz_si_kronecker (long @var{a}, const mpz_t @var{b})
3621@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, const mpz_t @var{b})
3622@cindex Kronecker symbol functions
Calculate the Jacobi symbol @m{\left(a \over b\right),
(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} is odd, or
@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} is even.
3627
3628When @var{b} is odd the Jacobi symbol and Kronecker symbol are
3629identical, so @code{mpz_kronecker_ui} etc can be used for mixed
3630precision Jacobi symbols too.
3631
3632For more information see Henri Cohen section 1.4.2 (@pxref{References}),
3633or any number theory textbook.  See also the example program
3634@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
3635@end deftypefun
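
The Jacobi symbol and the Kronecker extension at 2 can be sketched on
@code{long} values with the usual binary reciprocity algorithm.  The names
@code{jacobi_model} and @code{kronecker_a_2_model} are hypothetical, and the
algorithm is a textbook one rather than a description of GMP's own code.

```c
#include <assert.h>

/* Hypothetical model of the Jacobi symbol (a/b) for odd positive b.  */
int jacobi_model (long a, long b)
{
  int result = 1;
  assert (b > 0 && (b & 1));           /* defined only for b odd */
  a %= b;
  if (a < 0)
    a += b;
  while (a != 0)
    {
      while ((a & 1) == 0)             /* pull out factors of 2 */
        {
          a >>= 1;
          if ((b & 7) == 3 || (b & 7) == 5)
            result = -result;          /* (2/b) = -1 for b = 3,5 mod 8 */
        }
      {
        long tmp = a;                  /* quadratic reciprocity: swap */
        a = b;
        b = tmp;
      }
      if ((a & 3) == 3 && (b & 3) == 3)
        result = -result;
      a %= b;
    }
  return (b == 1) ? result : 0;        /* 0 when gcd(a,b) > 1 */
}

/* Kronecker extension: (a/2) = (2/a) for odd a, and 0 for even a.  */
int kronecker_a_2_model (long a)
{
  long m = ((a % 8) + 8) % 8;          /* a mod 8 in the range 0..7 */
  if ((m & 1) == 0)
    return 0;
  return (m == 1 || m == 7) ? 1 : -1;
}
```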
3636
3637@deftypefun {mp_bitcnt_t} mpz_remove (mpz_t @var{rop}, const mpz_t @var{op}, const mpz_t @var{f})
3638@cindex Remove factor functions
3639@cindex Factor removal functions
3640Remove all occurrences of the factor @var{f} from @var{op} and store the
3641result in @var{rop}.  The return value is how many such occurrences were
3642removed.
3643@end deftypefun
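
The result and return value can be illustrated on @code{long} values.  The
name @code{remove_model} is hypothetical, and the preconditions it asserts
are assumptions of this sketch only, chosen so that the loop terminates.

```c
#include <assert.h>

/* Hypothetical model: divide op by f as often as it divides exactly,
   store what is left in *rop and return the number of divisions.  */
unsigned long remove_model (long *rop, long op, long f)
{
  unsigned long count = 0;
  /* Sketch-only preconditions, so that the loop below terminates.  */
  assert (op != 0 && f != 0 && f != 1 && f != -1);
  while (op % f == 0)
    {
      op /= f;
      count++;
    }
  *rop = op;
  return count;
}
```

For @math{op = 72} and @math{f = 2} the model stores 9 and returns 3.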
3644
3645@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
3646@deftypefunx void mpz_2fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
3647@deftypefunx void mpz_mfac_uiui (mpz_t @var{rop}, unsigned long int @var{n}, unsigned long int @var{m})
3648@cindex Factorial functions
3649Set @var{rop} to the factorial of @var{n}: @code{mpz_fac_ui} computes the plain factorial @var{n}!,
3650@code{mpz_2fac_ui} computes the double-factorial @var{n}!!, and @code{mpz_mfac_uiui} the
3651@var{m}-multi-factorial @m{n!^{(m)}, @var{n}!^(@var{m})}.
3652@end deftypefun
3653
3654@deftypefun void mpz_primorial_ui (mpz_t @var{rop}, unsigned long int @var{n})
3655@cindex Primorial functions
3656Set @var{rop} to the primorial of @var{n}, i.e. the product of all positive
3657prime numbers @math{@le{}@var{n}}.
3658@end deftypefun
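
The multi-factorial and the primorial can be written out directly for small
arguments.  A minimal sketch with hypothetical names, ignoring overflow and
requiring @math{m >= 1} as an assumption of the model:

```c
#include <assert.h>

/* Hypothetical model of the m-multi-factorial n * (n-m) * (n-2m) * ...
   over the positive terms.  m = 1 gives n!, m = 2 gives n!!.  */
unsigned long long mfac_model (unsigned n, unsigned m)
{
  unsigned long long r = 1;
  assert (m >= 1);              /* assumption of this sketch */
  while (n > 1)
    {
      r *= n;
      if (n <= m)
        break;                  /* next term would not be positive */
      n -= m;
    }
  return r;
}

/* Hypothetical model of the primorial: product of all primes <= n,
   found here by plain trial division.  */
unsigned long long primorial_model (unsigned n)
{
  unsigned long long r = 1;
  unsigned p, q;
  for (p = 2; p <= n; p++)
    {
      int is_prime = 1;
      for (q = 2; q * q <= p; q++)
        if (p % q == 0)
          {
            is_prime = 0;
            break;
          }
      if (is_prime)
        r *= p;
    }
  return r;
}
```

For example @code{mfac_model (7, 2)} is @math{7*5*3*1 = 105} and
@code{primorial_model (10)} is @math{2*3*5*7 = 210}.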
3659
3660@deftypefun void mpz_bin_ui (mpz_t @var{rop}, const mpz_t @var{n}, unsigned long int @var{k})
3661@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
3662@cindex Binomial coefficient functions
3663Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
3664@var{k}} and store the result in @var{rop}.  Negative values of @var{n} are
3665supported by @code{mpz_bin_ui}, using the identity
3666@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
3667bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6
3668part G.
3669@end deftypefun
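
The identity for negative @var{n} can be checked numerically with the
falling-product form of the binomial coefficient.  A minimal sketch on
@code{long long}, with hypothetical names and no overflow handling:

```c
#include <assert.h>

/* Hypothetical model: bin(n,k) = n*(n-1)*...*(n-k+1) / k!, valid for
   negative n too.  Each division below is exact.  */
long long bin_model (long long n, unsigned k)
{
  long long r = 1;
  unsigned i;
  for (i = 1; i <= k; i++)
    r = r * (n - (long long) i + 1) / (long long) i;
  return r;
}

/* Checks of bin(-n,k) = (-1)^k * bin(n+k-1,k).  */
void bin_examples (void)
{
  assert (bin_model (-3, 2) == 6);                          /* (-3)(-4)/2 */
  assert (bin_model (-3, 2) == bin_model (3 + 2 - 1, 2));   /* k even */
  assert (bin_model (-2, 3) == -bin_model (2 + 3 - 1, 3));  /* k odd */
}
```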
3670
3671@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
3672@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
3673@cindex Fibonacci sequence functions
@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci
3675number.  @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
3676@m{F_{n-1},F[n-1]}.
3677
3678These functions are designed for calculating isolated Fibonacci numbers.  When
3679a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
3680iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
3681similar.
3682@end deftypefun
3683
3684@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
3685@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
3686@cindex Lucas number functions
@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas
3688number.  @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
3689to @m{L_{n-1},L[n-1]}.
3690
3691These functions are designed for calculating isolated Lucas numbers.  When a
3692sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and
3693iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or
3694similar.
3695
3696The Fibonacci numbers and Lucas numbers are related sequences, so it's never
3697necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}.  The
formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers
Algorithm}, and the reverse is straightforward too.
3700@end deftypefun
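
Iterating from a starting pair, and deriving a Lucas number from a Fibonacci
pair, can be sketched on machine integers.  The names are hypothetical, and
the relation @math{L[n] = F[n] + 2*F[n-1]} used here is the standard
identity, stated as an assumption of the sketch.

```c
#include <assert.h>

/* Hypothetical model: given F[n] in *fn and F[n-1] in *fnsub1, step the
   pair forward `steps' times using F[n+1] = F[n] + F[n-1].  */
void fib_advance_model (unsigned long long *fn,
                        unsigned long long *fnsub1, unsigned steps)
{
  assert (fn != fnsub1);        /* the pair needs two distinct variables */
  while (steps-- > 0)
    {
      unsigned long long next = *fn + *fnsub1;
      *fnsub1 = *fn;
      *fn = next;
    }
}

/* Lucas number from a Fibonacci pair: L[n] = F[n] + 2*F[n-1].  */
unsigned long long lucas_from_fib_model (unsigned long long fn,
                                         unsigned long long fnsub1)
{
  return fn + 2 * fnsub1;
}
```

Starting from @math{F[1] = 1} and @math{F[0] = 0}, nine steps give
@math{F[10] = 55} and @math{F[9] = 34}, from which @math{L[10] = 123}.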
3701
3702
3703@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions
3704@comment  node-name,  next,  previous,  up
3705@section Comparison Functions
3706@cindex Integer comparison functions
3707@cindex Comparison functions
3708
3709@deftypefn Function int mpz_cmp (const mpz_t @var{op1}, const mpz_t @var{op2})
3710@deftypefnx Function int mpz_cmp_d (const mpz_t @var{op1}, double @var{op2})
3711@deftypefnx Macro int mpz_cmp_si (const mpz_t @var{op1}, signed long int @var{op2})
3712@deftypefnx Macro int mpz_cmp_ui (const mpz_t @var{op1}, unsigned long int @var{op2})
3713Compare @var{op1} and @var{op2}.  Return a positive value if @math{@var{op1} >
3714@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if
3715@math{@var{op1} < @var{op2}}.
3716
3717@code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate their
3718arguments more than once.  @code{mpz_cmp_d} can be called with an infinity,
3719but results are undefined for a NaN.
3720@end deftypefn
3721
3722@deftypefn Function int mpz_cmpabs (const mpz_t @var{op1}, const mpz_t @var{op2})
3723@deftypefnx Function int mpz_cmpabs_d (const mpz_t @var{op1}, double @var{op2})
3724@deftypefnx Function int mpz_cmpabs_ui (const mpz_t @var{op1}, unsigned long int @var{op2})
3725Compare the absolute values of @var{op1} and @var{op2}.  Return a positive
3726value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if
3727@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if
3728@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}.
3729
3730@code{mpz_cmpabs_d} can be called with an infinity, but results are undefined
3731for a NaN.
3732@end deftypefn
3733
3734@deftypefn Macro int mpz_sgn (const mpz_t @var{op})
3735@cindex Sign tests
3736@cindex Integer sign tests
3737Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
3738@math{-1} if @math{@var{op} < 0}.
3739
3740This function is actually implemented as a macro.  It evaluates its argument
3741multiple times.
3742@end deftypefn
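
The practical consequence of a macro evaluating its argument more than once
is that arguments with side effects should be avoided.  The sketch below uses
a hypothetical macro @code{SGN_MODEL} with the same kind of hazard.  It is
not GMP's definition of @code{mpz_sgn}.

```c
#include <assert.h>

/* Hypothetical macro whose argument appears twice in the expansion.  */
#define SGN_MODEL(x)  ((x) < 0 ? -1 : (x) > 0)

int calls;                       /* counts evaluations of next_value() */

long next_value (void)
{
  calls++;
  return 5;
}

void sgn_macro_example (void)
{
  long v;

  calls = 0;
  assert (SGN_MODEL (next_value ()) == 1);
  assert (calls == 2);           /* the argument ran twice */

  calls = 0;
  v = next_value ();             /* evaluate once, then test the copy */
  assert (SGN_MODEL (v) == 1);
  assert (calls == 1);
}
```

Storing the value in a variable first, as in the second half of the example,
avoids the repeated evaluation.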
3743
3744
3745@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions
3746@comment  node-name,  next,  previous,  up
3747@section Logical and Bit Manipulation Functions
3748@cindex Logical functions
3749@cindex Bit manipulation functions
3750@cindex Integer logical functions
3751@cindex Integer bit manipulation functions
3752
3753These functions behave as if twos complement arithmetic were used (although
3754sign-magnitude is the actual implementation).  The least significant bit is
3755number 0.
3756
3757@deftypefun void mpz_and (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3758Set @var{rop} to @var{op1} bitwise-and @var{op2}.
3759@end deftypefun
3760
3761@deftypefun void mpz_ior (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3762Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}.
3763@end deftypefun
3764
3765@deftypefun void mpz_xor (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3766Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}.
3767@end deftypefun
3768
3769@deftypefun void mpz_com (mpz_t @var{rop}, const mpz_t @var{op})
3770Set @var{rop} to the one's complement of @var{op}.
3771@end deftypefun
3772
3773@deftypefun {mp_bitcnt_t} mpz_popcount (const mpz_t @var{op})
3774If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is the
3775number of 1 bits in the binary representation.  If @math{@var{op}<0}, the
3776number of 1s is infinite, and the return value is the largest possible
3777@code{mp_bitcnt_t}.
3778@end deftypefun
3779
3780@deftypefun {mp_bitcnt_t} mpz_hamdist (const mpz_t @var{op1}, const mpz_t @var{op2})
3781If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return the
3782hamming distance between the two operands, which is the number of bit positions
3783where @var{op1} and @var{op2} have different bit values.  If one operand is
3784@math{@ge{}0} and the other @math{<0} then the number of bits different is
3785infinite, and the return value is the largest possible @code{mp_bitcnt_t}.
3786@end deftypefun
3787
3788@deftypefun {mp_bitcnt_t} mpz_scan0 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
3789@deftypefunx {mp_bitcnt_t} mpz_scan1 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
3790@cindex Bit scanning functions
3791@cindex Scan bit functions
3792Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
3793bits, until the first 0 or 1 bit (respectively) is found.  Return the index of
3794the found bit.
3795
3796If the bit at @var{starting_bit} is already what's sought, then
3797@var{starting_bit} is returned.
3798
3799If there's no bit found, then the largest possible @code{mp_bitcnt_t} is
3800returned.  This will happen in @code{mpz_scan0} past the end of a negative
3801number, or @code{mpz_scan1} past the end of a nonnegative number.
3802@end deftypefun
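
The ``no bit found'' cases follow from treating the operand as an infinitely
sign-extended twos complement value.  A minimal sketch on @code{int64_t},
with a hypothetical name and with @code{ULONG_MAX} standing in for the
largest possible @code{mp_bitcnt_t} (an assumption of the model):

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical model of mpz_scan0 (want = 0) and mpz_scan1 (want = 1)
   on an int64_t viewed as infinitely sign-extended twos complement.  */
unsigned long scan_model (int64_t op, unsigned long start, int want)
{
  unsigned long i;
  assert (want == 0 || want == 1);
  for (i = start; i < 63; i++)
    if ((int) (((uint64_t) op >> i) & 1) == want)
      return i;
  /* From bit 63 upwards every bit equals the sign bit.  */
  return ((op < 0) == want) ? i : ULONG_MAX;
}
```

For 12 (binary 1100), scanning for a 1 from bit 0 returns 2, and scanning for
a 1 from bit 4 finds nothing, since only zeros lie above a nonnegative
number.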
3803
3804@deftypefun void mpz_setbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
3805Set bit @var{bit_index} in @var{rop}.
3806@end deftypefun
3807
3808@deftypefun void mpz_clrbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
3809Clear bit @var{bit_index} in @var{rop}.
3810@end deftypefun
3811
3812@deftypefun void mpz_combit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
3813Complement bit @var{bit_index} in @var{rop}.
3814@end deftypefun
3815
3816@deftypefun int mpz_tstbit (const mpz_t @var{op}, mp_bitcnt_t @var{bit_index})
3817Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly.
3818@end deftypefun
3819
3820@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions
3821@comment  node-name,  next,  previous,  up
3822@section Input and Output Functions
3823@cindex Integer input and output functions
3824@cindex Input functions
3825@cindex Output functions
3826@cindex I/O functions
3827
These functions read @code{mpz} numbers from a stdio stream or write them to
one.  Passing a @code{NULL} pointer for a @var{stream} argument makes the
input functions read from @code{stdin} and the output functions write to
@code{stdout}.
3832
3833When using any of these functions, it is a good idea to include @file{stdio.h}
3834before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
3835for these functions.
3836
3837See also @ref{Formatted Output} and @ref{Formatted Input}.
3838
3839@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, const mpz_t @var{op})
3840Output @var{op} on stdio stream @var{stream}, as a string of digits in base
3841@var{base}.  The base argument may vary from 2 to 62 or from @minus{}2 to
3842@minus{}36.
3843
3844For @var{base} in the range 2..36, digits and lower-case letters are used; for
3845@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
3846digits, upper-case letters, and lower-case letters (in that significance order)
3847are used.
3848
3849Return the number of bytes written, or if an error occurred, return 0.
3850@end deftypefun
3851
3852@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
3853Input a possibly white-space preceded string in base @var{base} from stdio
3854stream @var{stream}, and put the read integer in @var{rop}.
3855
3856The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
3857characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
3858@code{0B} for binary, @code{0} for octal, or decimal otherwise.
3859
For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value.  For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.
3863
3864Return the number of bytes read, or if an error occurred, return 0.
3865@end deftypefun
3866
3867@deftypefun size_t mpz_out_raw (FILE *@var{stream}, const mpz_t @var{op})
3868Output @var{op} on stdio stream @var{stream}, in raw binary format.  The
3869integer is written in a portable format, with 4 bytes of size information, and
3870that many bytes of limbs.  Both the size and the limbs are written in
3871decreasing significance order (i.e., in big-endian).
3872
3873The output can be read with @code{mpz_inp_raw}.
3874
3875Return the number of bytes written, or if an error occurred, return 0.
3876
This output cannot be read by the @code{mpz_inp_raw} of GMP 1, because
of changes necessary for compatibility between 32-bit and 64-bit machines.
3879@end deftypefun
3880
3881@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
3882Input from stdio stream @var{stream} in the format written by
3883@code{mpz_out_raw}, and put the result in @var{rop}.  Return the number of
3884bytes read, or if an error occurred, return 0.
3885
This routine can also read the output of @code{mpz_out_raw} from GMP 1, in
spite of changes necessary for compatibility between 32-bit and 64-bit
machines.
3889@end deftypefun
3890
3891
3892@need 2000
3893@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
3894@comment  node-name,  next,  previous,  up
3895@section Random Number Functions
3896@cindex Integer random number functions
3897@cindex Random number functions
3898
The random number functions of GMP come in two groups: older functions
that rely on a global state, and newer functions that accept a state
parameter that is read and modified.  See @ref{Random Number Functions}
for more information on how to use, and how not to use, the random
number functions.
3904
3905@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
3906Generate a uniformly distributed random integer in the range 0 to @m{2^n-1,
39072^@var{n}@minus{}1}, inclusive.
3908
3909The variable @var{state} must be initialized by calling one of the
3910@code{gmp_randinit} functions (@ref{Random State Initialization}) before
3911invoking this function.
3912@end deftypefun
3913
3914@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, const mpz_t @var{n})
3915Generate a uniform random integer in the range 0 to @math{@var{n}-1},
3916inclusive.
3917
3918The variable @var{state} must be initialized by calling one of the
3919@code{gmp_randinit} functions (@ref{Random State Initialization})
3920before invoking this function.
3921@end deftypefun
3922
3923@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
3924Generate a random integer with long strings of zeros and ones in the
3925binary representation.  Useful for testing functions and algorithms,
since this kind of random number has proven to be more likely to
3927trigger corner-case bugs.  The random number will be in the range
39280 to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive.
3929
3930The variable @var{state} must be initialized by calling one of the
3931@code{gmp_randinit} functions (@ref{Random State Initialization})
3932before invoking this function.
3933@end deftypefun
3934
3935@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
3936Generate a random integer of at most @var{max_size} limbs.  The generated
3937random number doesn't satisfy any particular requirements of randomness.
3938Negative random numbers are generated when @var{max_size} is negative.
3939
3940This function is obsolete.  Use @code{mpz_urandomb} or
3941@code{mpz_urandomm} instead.
3942@end deftypefun
3943
3944@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
3945Generate a random integer of at most @var{max_size} limbs, with long strings
3946of zeros and ones in the binary representation.  Useful for testing functions
and algorithms, since this kind of random number has proven to be more
3948likely to trigger corner-case bugs.  Negative random numbers are generated
3949when @var{max_size} is negative.
3950
3951This function is obsolete.  Use @code{mpz_rrandomb} instead.
3952@end deftypefun
3953
3954
3955@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions
3956@section Integer Import and Export
3957
3958@code{mpz_t} variables can be converted to and from arbitrary words of binary
3959data with the following functions.
3960
3961@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op})
3962@cindex Integer import
3963@cindex Import
3964Set @var{rop} from an array of word data at @var{op}.
3965
3966The parameters specify the format of the data.  @var{count} many words are
3967read, each @var{size} bytes.  @var{order} can be 1 for most significant word
3968first or -1 for least significant first.  Within each word @var{endian} can be
39691 for most significant byte first, -1 for least significant first, or 0 for
3970the native endianness of the host CPU@.  The most significant @var{nails} bits
3971of each word are skipped, this can be 0 to use the full words.
3972
3973There is no sign taken from the data, @var{rop} will simply be a positive
3974integer.  An application can handle any sign itself, and apply it for instance
3975with @code{mpz_neg}.
3976
3977There are no data alignment restrictions on @var{op}, any address is allowed.
3978
3979Here's an example converting an array of @code{unsigned long} data, most
3980significant element first, and host byte order within each value.
3981
3982@example
3983unsigned long  a[20];
3984/* Initialize @var{z} and @var{a} */
3985mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a);
3986@end example
3987
3988This example assumes the full @code{sizeof} bytes are used for data in the
3989given type, which is usually true, and certainly true for @code{unsigned long}
3990everywhere we know of.  However on Cray vector systems it may be noted that
3991@code{short} and @code{int} are always stored in 8 bytes (and with
3992@code{sizeof} indicating that) but use only 32 or 46 bits.  The @var{nails}
3993feature can account for this, by passing for instance
3994@code{8*sizeof(int)-INT_BIT}.
3995@end deftypefun
3996
3997@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const mpz_t @var{op})
3998@cindex Integer export
3999@cindex Export
4000Fill @var{rop} with word data from @var{op}.
4001
4002The parameters specify the format of the data produced.  Each word will be
4003@var{size} bytes and @var{order} can be 1 for most significant word first or
4004-1 for least significant first.  Within each word @var{endian} can be 1 for
4005most significant byte first, -1 for least significant first, or 0 for the
4006native endianness of the host CPU@.  The most significant @var{nails} bits of
4007each word are unused and set to zero, this can be 0 to produce full words.
4008
4009The number of words produced is written to @code{*@var{countp}}, or
4010@var{countp} can be @code{NULL} to discard the count.  @var{rop} must have
4011enough space for the data, or if @var{rop} is @code{NULL} then a result array
4012of the necessary size is allocated using the current GMP allocation function
4013(@pxref{Custom Allocation}).  In either case the return value is the
4014destination used, either @var{rop} or the allocated block.
4015
4016If @var{op} is non-zero then the most significant word produced will be
4017non-zero.  If @var{op} is zero then the count returned will be zero and
4018nothing written to @var{rop}.  If @var{rop} is @code{NULL} in this case, no
4019block is allocated, just @code{NULL} is returned.
4020
4021The sign of @var{op} is ignored, just the absolute value is exported.  An
4022application can use @code{mpz_sgn} to get the sign and handle it as desired.
4023(@pxref{Integer Comparisons})
4024
4025There are no data alignment restrictions on @var{rop}, any address is allowed.
4026
4027When an application is allocating space itself the required size can be
4028determined with a calculation like the following.  Since @code{mpz_sizeinbase}
4029always returns at least 1, @code{count} here will be at least one, which
4030avoids any portability problems with @code{malloc(0)}, though if @code{z} is
4031zero no space at all is actually needed (or written).
4032
4033@example
4034numb = 8*size - nail;
4035count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
4036p = malloc (count * size);
4037@end example
4038@end deftypefun
4039
4040
4041@need 2000
4042@node Miscellaneous Integer Functions, Integer Special Functions, Integer Import and Export, Integer Functions
4043@comment  node-name,  next,  previous,  up
4044@section Miscellaneous Functions
4045@cindex Miscellaneous integer functions
4046@cindex Integer miscellaneous functions
4047
4048@deftypefun int mpz_fits_ulong_p (const mpz_t @var{op})
4049@deftypefunx int mpz_fits_slong_p (const mpz_t @var{op})
4050@deftypefunx int mpz_fits_uint_p (const mpz_t @var{op})
4051@deftypefunx int mpz_fits_sint_p (const mpz_t @var{op})
4052@deftypefunx int mpz_fits_ushort_p (const mpz_t @var{op})
4053@deftypefunx int mpz_fits_sshort_p (const mpz_t @var{op})
4054Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
4055@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
4056short int}, or @code{signed short int}, respectively.  Otherwise, return zero.
4057@end deftypefun
4058
4059@deftypefn Macro int mpz_odd_p (const mpz_t @var{op})
4060@deftypefnx Macro int mpz_even_p (const mpz_t @var{op})
4061Determine whether @var{op} is odd or even, respectively.  Return non-zero if
4062yes, zero if no.  These macros evaluate their argument more than once.
4063@end deftypefn
4064
4065@deftypefun size_t mpz_sizeinbase (const mpz_t @var{op}, int @var{base})
4066@cindex Size in digits
4067@cindex Digits in an integer
4068Return the size of @var{op} measured in number of digits in the given
4069@var{base}.  @var{base} can vary from 2 to 62.  The sign of @var{op} is
4070ignored, just the absolute value is used.  The result will be either exact or
40711 too big.  If @var{base} is a power of 2, the result is always exact.  If
4072@var{op} is zero the return value is always 1.
4073
4074This function can be used to determine the space required when converting
4075@var{op} to a string.  The right amount of allocation is normally two more
4076than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign
4077and one for the null-terminator.
4078
4079@cindex Most significant bit
It will be noted that @code{mpz_sizeinbase(@var{op},2)} can be used to locate
the most significant 1 bit in @var{op}, counting from 1.  (Unlike the bitwise
functions which start from 0, @pxref{Integer Logic and Bit Fiddling,, Logical
and Bit Manipulation Functions}.)
4084@end deftypefun
4085
4086
4087@node Integer Special Functions,  , Miscellaneous Integer Functions, Integer Functions
4088@section Special Functions
4089@cindex Special integer functions
4090@cindex Integer special functions
4091
4092The functions in this section are for various special purposes.  Most
4093applications will not need them.
4094
4095@deftypefun void mpz_array_init (mpz_t @var{integer_array}, mp_size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
4096This is a special type of initialization.  @strong{Fixed} space of
4097@var{fixed_num_bits} is allocated to each of the @var{array_size} integers in
4098@var{integer_array}.  There is no way to free the storage allocated by this
4099function.  Don't call @code{mpz_clear}!
4100
4101The @var{integer_array} parameter is the first @code{mpz_t} in the array.  For
4102example,
4103
4104@example
4105mpz_t  arr[20000];
4106mpz_array_init (arr[0], 20000, 512);
4107@end example
4108
4109@c  In case anyone's wondering, yes this parameter style is a bit anomalous,
4110@c  it'd probably be nicer if it was "arr" instead of "arr[0]".  Obviously the
4111@c  two differ only in the declaration, not the pointer value, but changing is
4112@c  not possible since it'd provoke warnings or errors in existing sources.
4113
4114This function is only intended for programs that create a large number
4115of integers and need to reduce memory usage by avoiding the overheads of
4116allocating and reallocating lots of small blocks.  In normal programs this
4117function is not recommended.
4118
4119The space allocated to each integer by this function will not be automatically
4120increased, unlike the normal @code{mpz_init}, so an application must ensure it
4121is sufficient for any value stored.  The following space requirements apply to
4122various routines,
4123
4124@itemize @bullet
4125@item
4126@code{mpz_abs}, @code{mpz_neg}, @code{mpz_set}, @code{mpz_set_si} and
4127@code{mpz_set_ui} need room for the value they store.
4128
4129@item
4130@code{mpz_add}, @code{mpz_add_ui}, @code{mpz_sub} and @code{mpz_sub_ui} need
4131room for the larger of the two operands, plus an extra
4132@code{mp_bits_per_limb}.
4133
4134@item
4135@code{mpz_mul}, @code{mpz_mul_ui} and @code{mpz_mul_si} need room for the sum
4136of the number of bits in their operands, but each rounded up to a multiple of
4137@code{mp_bits_per_limb}.
4138
4139@item
4140@code{mpz_swap} can be used between two array variables, but not between an
4141array and a normal variable.
4142@end itemize
4143
4144For other functions, or if in doubt, the suggestion is to calculate in a
4145regular @code{mpz_init} variable and copy the result to an array variable with
4146@code{mpz_set}.
4147@end deftypefun
4148
4149@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
4150Change the space for @var{integer} to @var{new_alloc} limbs.  The value in
4151@var{integer} is preserved if it fits, or is set to 0 if not.  The return
4152value is not useful to applications and should be ignored.
4153
4154@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
4155this.  @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
4156@code{_mpz_realloc} takes its size in limbs.
4157@end deftypefun
4158
4159@deftypefun mp_limb_t mpz_getlimbn (const mpz_t @var{op}, mp_size_t @var{n})
4160Return limb number @var{n} from @var{op}.  The sign of @var{op} is ignored,
4161just the absolute value is used.  The least significant limb is number 0.
4162
4163@code{mpz_size} can be used to find how many limbs make up @var{op}.
4164@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
4165@code{mpz_size(@var{op})-1}.
4166@end deftypefun
4167
4168@deftypefun size_t mpz_size (const mpz_t @var{op})
4169Return the size of @var{op} measured in number of limbs.  If @var{op} is zero,
4170the returned value will be zero.
4171@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
4172@end deftypefun
4173
4174
4175
4176@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
4177@comment  node-name,  next,  previous,  up
4178@chapter Rational Number Functions
4179@cindex Rational number functions
4180
4181This chapter describes the GMP functions for performing arithmetic on rational
4182numbers.  These functions start with the prefix @code{mpq_}.
4183
4184Rational numbers are stored in objects of type @code{mpq_t}.
4185
4186All rational arithmetic functions assume operands have a canonical form, and
canonicalize their result.  The canonical form means that the denominator and
4188the numerator have no common factors, and that the denominator is positive.
4189Zero has the unique representation 0/1.
4190
4191Pure assignment functions do not canonicalize the assigned variable.  It is
4192the responsibility of the user to canonicalize the assigned variable before
4193any arithmetic operations are performed on that variable.
4194
4195@deftypefun void mpq_canonicalize (mpq_t @var{op})
4196Remove any factors that are common to the numerator and denominator of
4197@var{op}, and make the denominator positive.
4198@end deftypefun
4199
4200@menu
4201* Initializing Rationals::
4202* Rational Conversions::
4203* Rational Arithmetic::
4204* Comparing Rationals::
4205* Applying Integer Functions::
4206* I/O of Rationals::
4207@end menu
4208
4209@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
4210@comment  node-name,  next,  previous,  up
4211@section Initialization and Assignment Functions
4212@cindex Rational assignment functions
4213@cindex Assignment functions
4214@cindex Rational initialization functions
4215@cindex Initialization functions
4216
4217@deftypefun void mpq_init (mpq_t @var{x})
4218Initialize @var{x} and set it to 0/1.  Each variable should normally only be
4219initialized once, or at least cleared out (using the function @code{mpq_clear})
4220between each initialization.
4221@end deftypefun
4222
4223@deftypefun void mpq_inits (mpq_t @var{x}, ...)
4224Initialize a NULL-terminated list of @code{mpq_t} variables, and set their
4225values to 0/1.
4226@end deftypefun
4227
4228@deftypefun void mpq_clear (mpq_t @var{x})
4229Free the space occupied by @var{x}.  Make sure to call this function for all
4230@code{mpq_t} variables when you are done with them.
4231@end deftypefun
4232
4233@deftypefun void mpq_clears (mpq_t @var{x}, ...)
4234Free the space occupied by a NULL-terminated list of @code{mpq_t} variables.
4235@end deftypefun
4236
4237@deftypefun void mpq_set (mpq_t @var{rop}, const mpq_t @var{op})
4238@deftypefunx void mpq_set_z (mpq_t @var{rop}, const mpz_t @var{op})
4239Assign @var{rop} from @var{op}.
4240@end deftypefun
4241
4242@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
4243@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
4244Set the value of @var{rop} to @var{op1}/@var{op2}.  Note that if @var{op1} and
4245@var{op2} have common factors, @var{rop} has to be passed to
4246@code{mpq_canonicalize} before any operations are performed on @var{rop}.
4247@end deftypefun
4248
4249@deftypefun int mpq_set_str (mpq_t @var{rop}, const char *@var{str}, int @var{base})
4250Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.
4251
4252The string can be an integer like ``41'' or a fraction like ``41/152''.  The
4253fraction must be in canonical form (@pxref{Rational Number Functions}), or if
4254not then @code{mpq_canonicalize} must be called.
4255
4256The numerator and optional denominator are parsed the same as in
4257@code{mpz_set_str} (@pxref{Assigning Integers}).  White space is allowed in
4258the string, and is simply ignored.  The @var{base} can vary from 2 to 62, or
4259if @var{base} is 0 then the leading characters are used: @code{0x} or @code{0X} for hex,
4260@code{0b} or @code{0B} for binary,
4261@code{0} for octal, or decimal otherwise.  Note that this is done separately
4262for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
4263whereas @code{0xEF/0x100} is 239/256.
4264
4265The return value is 0 if the entire string is a valid number, or @minus{}1 if
4266not.
4267@end deftypefun
4268
4269@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
4270Swap the values @var{rop1} and @var{rop2} efficiently.
4271@end deftypefun
4272
4273
4274@need 2000
4275@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
4276@comment  node-name,  next,  previous,  up
4277@section Conversion Functions
4278@cindex Rational conversion functions
4279@cindex Conversion functions
4280
4281@deftypefun double mpq_get_d (const mpq_t @var{op})
4282Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
4283towards zero).
4284
If the exponent from the conversion is too big or too small to fit a
@code{double} then the result is system dependent.  If it is too big, an
infinity is returned when available.  If it is too small, @math{0.0} is
normally returned.
4288Hardware overflow, underflow and denorm traps may or may not occur.
4289@end deftypefun
4290
4291@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
4292@deftypefunx void mpq_set_f (mpq_t @var{rop}, const mpf_t @var{op})
4293Set @var{rop} to the value of @var{op}.  There is no rounding, this conversion
4294is exact.
4295@end deftypefun
4296
4297@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, const mpq_t @var{op})
4298Convert @var{op} to a string of digits in base @var{base}.  The base may vary
4299from 2 to 36.  The string will be of the form @samp{num/den}, or if the
4300denominator is 1 then just @samp{num}.
4301
4302If @var{str} is @code{NULL}, the result string is allocated using the current
4303allocation function (@pxref{Custom Allocation}).  The block will be
4304@code{strlen(str)+1} bytes, that being exactly enough for the string and
4305null-terminator.
4306
4307If @var{str} is not @code{NULL}, it should point to a block of storage large
4308enough for the result, that being
4309
4310@example
4311mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
4312+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
4313@end example
4314
4315The three extra bytes are for a possible minus sign, possible slash, and the
4316null-terminator.
4317
4318A pointer to the result string is returned, being either the allocated block,
4319or the given @var{str}.
4320@end deftypefun
4321
4322
4323@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
4324@comment  node-name,  next,  previous,  up
4325@section Arithmetic Functions
4326@cindex Rational arithmetic functions
4327@cindex Arithmetic functions
4328
4329@deftypefun void mpq_add (mpq_t @var{sum}, const mpq_t @var{addend1}, const mpq_t @var{addend2})
4330Set @var{sum} to @var{addend1} + @var{addend2}.
4331@end deftypefun
4332
4333@deftypefun void mpq_sub (mpq_t @var{difference}, const mpq_t @var{minuend}, const mpq_t @var{subtrahend})
4334Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
4335@end deftypefun
4336
4337@deftypefun void mpq_mul (mpq_t @var{product}, const mpq_t @var{multiplier}, const mpq_t @var{multiplicand})
4338Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
4339@end deftypefun
4340
4341@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
4342Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
4343@var{op2}}.
4344@end deftypefun
4345
4346@deftypefun void mpq_div (mpq_t @var{quotient}, const mpq_t @var{dividend}, const mpq_t @var{divisor})
4347@cindex Division functions
4348Set @var{quotient} to @var{dividend}/@var{divisor}.
4349@end deftypefun
4350
4351@deftypefun void mpq_div_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
4352Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
4353@var{op2}}.
4354@end deftypefun
4355
4356@deftypefun void mpq_neg (mpq_t @var{negated_operand}, const mpq_t @var{operand})
4357Set @var{negated_operand} to @minus{}@var{operand}.
4358@end deftypefun
4359
4360@deftypefun void mpq_abs (mpq_t @var{rop}, const mpq_t @var{op})
4361Set @var{rop} to the absolute value of @var{op}.
4362@end deftypefun
4363
4364@deftypefun void mpq_inv (mpq_t @var{inverted_number}, const mpq_t @var{number})
4365Set @var{inverted_number} to 1/@var{number}.  If the new denominator is
4366zero, this routine will divide by zero.
4367@end deftypefun
4368
4369@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
4370@comment  node-name,  next,  previous,  up
4371@section Comparison Functions
4372@cindex Rational comparison functions
4373@cindex Comparison functions
4374
4375@deftypefun int mpq_cmp (const mpq_t @var{op1}, const mpq_t @var{op2})
4376Compare @var{op1} and @var{op2}.  Return a positive value if @math{@var{op1} >
4377@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
4378@math{@var{op1} < @var{op2}}.
4379
4380To determine if two rationals are equal, @code{mpq_equal} is faster than
4381@code{mpq_cmp}.
4382@end deftypefun
4383
4384@deftypefn Macro int mpq_cmp_ui (const mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
4385@deftypefnx Macro int mpq_cmp_si (const mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
4386Compare @var{op1} and @var{num2}/@var{den2}.  Return a positive value if
4387@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
4388@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
4389@var{num2}/@var{den2}}.
4390
4391@var{num2} and @var{den2} are allowed to have common factors.
4392
These functions are implemented as macros and evaluate their arguments
4394multiple times.
4395@end deftypefn
4396
4397@deftypefn Macro int mpq_sgn (const mpq_t @var{op})
4398@cindex Sign tests
4399@cindex Rational sign tests
4400Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
4401@math{-1} if @math{@var{op} < 0}.
4402
4403This function is actually implemented as a macro.  It evaluates its
4404argument multiple times.
4405@end deftypefn
4406
4407@deftypefun int mpq_equal (const mpq_t @var{op1}, const mpq_t @var{op2})
4408Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
4409non-equal.  Although @code{mpq_cmp} can be used for the same purpose, this
4410function is much faster.
4411@end deftypefun
4412
4413@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
4414@comment  node-name,  next,  previous,  up
4415@section Applying Integer Functions to Rationals
4416@cindex Rational numerator and denominator
4417@cindex Numerator and denominator
4418
4419The set of @code{mpq} functions is quite small.  In particular, there are few
4420functions for either input or output.  The following functions give direct
4421access to the numerator and denominator of an @code{mpq_t}.
4422
4423Note that if an assignment to the numerator and/or denominator could take an
4424@code{mpq_t} out of the canonical form described at the start of this chapter
4425(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
4426called before any other @code{mpq} functions are applied to that @code{mpq_t}.
4427
4428@deftypefn Macro mpz_t mpq_numref (const mpq_t @var{op})
4429@deftypefnx Macro mpz_t mpq_denref (const mpq_t @var{op})
4430Return a reference to the numerator and denominator of @var{op}, respectively.
4431The @code{mpz} functions can be used on the result of these macros.
4432@end deftypefn
4433
4434@deftypefun void mpq_get_num (mpz_t @var{numerator}, const mpq_t @var{rational})
4435@deftypefunx void mpq_get_den (mpz_t @var{denominator}, const mpq_t @var{rational})
4436@deftypefunx void mpq_set_num (mpq_t @var{rational}, const mpz_t @var{numerator})
4437@deftypefunx void mpq_set_den (mpq_t @var{rational}, const mpz_t @var{denominator})
4438Get or set the numerator or denominator of a rational.  These functions are
4439equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
4440@code{mpq_denref}.  Direct use of @code{mpq_numref} or @code{mpq_denref} is
4441recommended instead of these functions.
4442@end deftypefun
4443
4444
4445@need 2000
4446@node I/O of Rationals,  , Applying Integer Functions, Rational Number Functions
4447@comment  node-name,  next,  previous,  up
4448@section Input and Output Functions
4449@cindex Rational input and output functions
4450@cindex Input functions
4451@cindex Output functions
4452@cindex I/O functions
4453
4454Functions that perform input from a stdio stream, and functions that output to
4455a stdio stream, of @code{mpq} numbers.  Passing a @code{NULL} pointer for a
4456@var{stream} argument to any of these functions will make them read from
4457@code{stdin} and write to @code{stdout}, respectively.
4458
4459When using any of these functions, it is a good idea to include @file{stdio.h}
4460before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
4461for these functions.
4462
4463See also @ref{Formatted Output} and @ref{Formatted Input}.
4464
4465@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, const mpq_t @var{op})
4466Output @var{op} on stdio stream @var{stream}, as a string of digits in base
4467@var{base}.  The base may vary from 2 to 36.  Output is in the form
4468@samp{num/den} or if the denominator is 1 then just @samp{num}.
4469
4470Return the number of bytes written, or if an error occurred, return 0.
4471@end deftypefun
4472
4473@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
4474Read a string of digits from @var{stream} and convert them to a rational in
4475@var{rop}.  Any initial white-space characters are read and discarded.  Return
4476the number of characters read (including white space), or 0 if a rational
4477could not be read.
4478
4479The input can be a fraction like @samp{17/63} or just an integer like
4480@samp{123}.  Reading stops at the first character not in this form, and white
4481space is not permitted within the string.  If the input might not be in
4482canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
4483Number Functions}).
4484
4485The @var{base} can be between 2 and 36, or can be 0 in which case the leading
4486characters of the string determine the base, @samp{0x} or @samp{0X} for
4487hexadecimal, @samp{0} for octal, or decimal otherwise.  The leading characters
4488are examined separately for the numerator and denominator of a fraction, so
4489for instance @samp{0x10/11} is @math{16/11}, whereas @samp{0x10/0x11} is
4490@math{16/17}.
4491@end deftypefun
4492
4493
4494@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
4495@comment  node-name,  next,  previous,  up
4496@chapter Floating-point Functions
4497@cindex Floating-point functions
4498@cindex Float functions
4499@cindex User-defined precision
4500@cindex Precision of floats
4501
4502GMP floating point numbers are stored in objects of type @code{mpf_t} and
4503functions operating on them have an @code{mpf_} prefix.
4504
4505The mantissa of each float has a user-selectable precision, limited only by
4506available memory.  Each variable has its own precision, and that can be
4507increased or decreased at any time.
4508
The exponent of each float has a fixed precision, one machine word on most
4510systems.  In the current implementation the exponent is a count of limbs, so
4511for example on a 32-bit system this means a range of roughly
4512@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
4513this will be greater.  Note however @code{mpf_get_str} can only return an
4514exponent which fits an @code{mp_exp_t} and currently @code{mpf_set_str}
4515doesn't accept exponents bigger than a @code{long}.
4516
4517Each variable keeps a size for the mantissa data actually in use.  This means
4518that if a float is exactly represented in only a few bits then only those bits
4519will be used in a calculation, even if the selected precision is high.
4520
4521All calculations are performed to the precision of the destination variable.
4522Each function is defined to calculate with ``infinite precision'' followed by
4523a truncation to the destination precision, but of course the work done is only
4524what's needed to determine a result under that definition.
4525
The precision selected for a variable is a minimum value; GMP may increase it
4527a little to facilitate efficient calculation.  Currently this means rounding
4528up to a whole limb, and then sometimes having a further partial limb,
4529depending on the high limb of the mantissa.  But applications shouldn't be
4530concerned by such details.
4531
The mantissa is stored in binary, as might be imagined from the fact
4533precisions are expressed in bits.  One consequence of this is that decimal
4534fractions like @math{0.1} cannot be represented exactly.  The same is true of
4535plain IEEE @code{double} floats.  This makes both highly unsuitable for
4536calculations involving money or other values that should be exact decimal
4537fractions.  (Suitably scaled integers, or perhaps rationals, are better
4538choices.)
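
The advice about rationals can be illustrated with a short sketch.  The helper
name @code{tenth_times_ten_is_one} is made up for this example; it uses only
ordinary @code{mpq} calls (@pxref{Rational Number Functions}) and checks that
@math{1/10} multiplied by 10 is exactly 1 when no binary rounding is involved.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: returns 1 if (1/10) * 10 is
   exactly 1 when the arithmetic is done with rationals.  */
int
tenth_times_ten_is_one (void)
@{
  mpq_t tenth, ten, prod;
  int exact;

  mpq_init (tenth);
  mpq_init (ten);
  mpq_init (prod);

  mpq_set_ui (tenth, 1, 10);    /* 1/10, already in canonical form */
  mpq_set_ui (ten, 10, 1);
  mpq_mul (prod, tenth, ten);   /* exact, no rounding takes place */

  exact = (mpq_cmp_ui (prod, 1, 1) == 0);

  mpq_clear (tenth);
  mpq_clear (ten);
  mpq_clear (prod);
  return exact;
@}
@end example

The same check is not guaranteed to succeed with @code{mpf} or @code{double}
values, since @math{1/10} is truncated or rounded when it is stored in binary.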
4539
4540@code{mpf} functions and variables have no special notion of infinity or
4541not-a-number, and applications must take care not to overflow the exponent or
4542results will be unpredictable.  This might change in a future release.
4543
4544Note that the @code{mpf} functions are @emph{not} intended as a smooth
4545extension to IEEE P754 arithmetic.  In particular results obtained on one
4546computer often differ from the results on a computer with a different word
4547size.
4548
4549@menu
4550* Initializing Floats::
4551* Assigning Floats::
4552* Simultaneous Float Init & Assign::
4553* Converting Floats::
4554* Float Arithmetic::
4555* Float Comparison::
4556* I/O of Floats::
4557* Miscellaneous Float Functions::
4558@end menu
4559
4560@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
4561@comment  node-name,  next,  previous,  up
4562@section Initialization Functions
4563@cindex Float initialization functions
4564@cindex Initialization functions
4565
4566@deftypefun void mpf_set_default_prec (mp_bitcnt_t @var{prec})
4567Set the default precision to be @strong{at least} @var{prec} bits.  All
4568subsequent calls to @code{mpf_init} will use this precision, but previously
4569initialized variables are unaffected.
4570@end deftypefun
4571
4572@deftypefun {mp_bitcnt_t} mpf_get_default_prec (void)
4573Return the default precision actually used.
4574@end deftypefun
4575
4576An @code{mpf_t} object must be initialized before storing the first value in
4577it.  The functions @code{mpf_init} and @code{mpf_init2} are used for that
4578purpose.
4579
4580@deftypefun void mpf_init (mpf_t @var{x})
4581Initialize @var{x} to 0.  Normally, a variable should be initialized once only
4582or at least be cleared, using @code{mpf_clear}, between initializations.  The
4583precision of @var{x} is undefined unless a default precision has already been
4584established by a call to @code{mpf_set_default_prec}.
4585@end deftypefun
4586
4587@deftypefun void mpf_init2 (mpf_t @var{x}, mp_bitcnt_t @var{prec})
4588Initialize @var{x} to 0 and set its precision to be @strong{at least}
4589@var{prec} bits.  Normally, a variable should be initialized once only or at
4590least be cleared, using @code{mpf_clear}, between initializations.
4591@end deftypefun
4592
4593@deftypefun void mpf_inits (mpf_t @var{x}, ...)
4594Initialize a NULL-terminated list of @code{mpf_t} variables, and set their
4595values to 0.  The precision of the initialized variables is undefined unless a
4596default precision has already been established by a call to
4597@code{mpf_set_default_prec}.
4598@end deftypefun
4599
4600@deftypefun void mpf_clear (mpf_t @var{x})
4601Free the space occupied by @var{x}.  Make sure to call this function for all
4602@code{mpf_t} variables when you are done with them.
4603@end deftypefun
4604
4605@deftypefun void mpf_clears (mpf_t @var{x}, ...)
4606Free the space occupied by a NULL-terminated list of @code{mpf_t} variables.
4607@end deftypefun
4608
4609@need 2000
Here is an example of how to initialize floating-point variables:
@example
@{
  mpf_t x, y;
  mpf_init (x);           /* use default precision */
  mpf_init2 (y, 256);     /* precision @emph{at least} 256 bits */
  @dots{}
  /* Unless the program is about to exit, do ... */
  mpf_clear (x);
  mpf_clear (y);
@}
@end example
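
The variadic forms can be used in the same way.  The following sketch sets a
default precision, initializes two variables with @code{mpf_inits}, and checks
that each reports at least the requested number of bits.  The helper name
@code{default_prec_is_at_least} is hypothetical, and the cast on the
terminating @code{NULL} is only a precaution for variadic calls.  Note that the
helper changes the global default precision as a side effect.

@example
#include <stddef.h>
#include <gmp.h>

/* Hypothetical helper, not part of GMP: returns 1 if variables created
   by mpf_inits after mpf_set_default_prec (want) have at least want
   bits of precision.  */
int
default_prec_is_at_least (mp_bitcnt_t want)
@{
  mpf_t a, b;
  int ok;

  mpf_set_default_prec (want);
  mpf_inits (a, b, (mpf_ptr) NULL);     /* the list must end with NULL */

  ok = (mpf_get_default_prec () >= want
        && mpf_get_prec (a) >= want
        && mpf_get_prec (b) >= want);

  mpf_clears (a, b, (mpf_ptr) NULL);
  return ok;
@}
@end example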
4622
4623The following three functions are useful for changing the precision during a
4624calculation.  A typical use would be for adjusting the precision gradually in
4625iterative algorithms like Newton-Raphson, making the computation precision
4626closely match the actual accurate part of the numbers.
4627
4628@deftypefun {mp_bitcnt_t} mpf_get_prec (const mpf_t @var{op})
4629Return the current precision of @var{op}, in bits.
4630@end deftypefun
4631
4632@deftypefun void mpf_set_prec (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
4633Set the precision of @var{rop} to be @strong{at least} @var{prec} bits.  The
4634value in @var{rop} will be truncated to the new precision.
4635
4636This function requires a call to @code{realloc}, and so should not be used in
4637a tight loop.
4638@end deftypefun
4639
4640@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
4641Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
4642without changing the memory allocated.
4643
4644@var{prec} must be no more than the allocated precision for @var{rop}, that
4645being the precision when @var{rop} was initialized, or in the most recent
4646@code{mpf_set_prec}.
4647
4648The value in @var{rop} is unchanged, and in particular if it had a higher
4649precision than @var{prec} it will retain that higher precision.  New values
4650written to @var{rop} will use the new @var{prec}.
4651
4652Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
4653@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
4654allocated precision.  Failing to do so will have unpredictable results.
4655
4656@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
4657original allocated precision.  After @code{mpf_set_prec_raw} it reflects the
4658@var{prec} value set.
4659
4660@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
4661different precisions during a calculation, perhaps to gradually increase
4662precision in an iteration, or just to use various different precisions for
4663different purposes during a calculation.
4664@end deftypefun
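
The save and restore protocol for @code{mpf_set_prec_raw} might look like the
following sketch.  The helper name @code{sqrt2_at_low_precision} and the
precisions 1024 and 128 are arbitrary choices for illustration.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: compute sqrt(2) at a reduced
   working precision in a variable allocated with 1024 bits, and return
   the result truncated to a double.  */
double
sqrt2_at_low_precision (void)
@{
  mpf_t x;
  mp_bitcnt_t allocated;
  double d;

  mpf_init2 (x, 1024);
  allocated = mpf_get_prec (x);     /* remember before the raw change */

  mpf_set_prec_raw (x, 128);        /* no reallocation takes place */
  mpf_sqrt_ui (x, 2);               /* written at the reduced precision */
  d = mpf_get_d (x);

  mpf_set_prec_raw (x, allocated);  /* restore before mpf_clear */
  mpf_clear (x);
  return d;
@}
@end example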
4665
4666
4667@need 2000
4668@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
4669@comment  node-name,  next,  previous,  up
4670@section Assignment Functions
4671@cindex Float assignment functions
4672@cindex Assignment functions
4673
4674These functions assign new values to already initialized floats
4675(@pxref{Initializing Floats}).
4676
4677@deftypefun void mpf_set (mpf_t @var{rop}, const mpf_t @var{op})
4678@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4679@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
4680@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
4681@deftypefunx void mpf_set_z (mpf_t @var{rop}, const mpz_t @var{op})
4682@deftypefunx void mpf_set_q (mpf_t @var{rop}, const mpq_t @var{op})
4683Set the value of @var{rop} from @var{op}.
4684@end deftypefun
4685
4686@deftypefun int mpf_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
4687Set the value of @var{rop} from the string in @var{str}.  The string is of the
4688form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
4689@samp{M} is the mantissa and @samp{N} is the exponent.  The mantissa is always
4690in the specified base.  The exponent is either in the specified base or, if
4691@var{base} is negative, in decimal.  The decimal point expected is taken from
4692the current locale, on systems providing @code{localeconv}.
4693
4694The argument @var{base} may be in the ranges 2 to 62, or @minus{}62 to
4695@minus{}2.  Negative values are used to specify that the exponent is in
4696decimal.
4697
4698For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value; for bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.
4701
4702Unlike the corresponding @code{mpz} function, the base will not be determined
4703from the leading characters of the string if @var{base} is 0.  This is so that
4704numbers like @samp{0.23} are not interpreted as octal.
4705
4706White space is allowed in the string, and is simply ignored.  [This is not
4707really true; white-space is ignored in the beginning of the string and within
4708the mantissa, but not in other places, such as after a minus sign or in the
4709exponent.  We are considering changing the definition of this function, making
4710it fail when there is any white-space in the input, since that makes a lot of
4711sense.  Please tell us your opinion about this change.  Do you really want it
4712to accept @nicode{"3 14"} as meaning 314 as it does now?]
4713
4714This function returns 0 if the entire string is a valid number in base
4715@var{base}.  Otherwise it returns @minus{}1.
4716@end deftypefun
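
As an illustration of the string forms and the return value, here is a minimal
sketch.  The helper name @code{parse_decimal} and the 128-bit precision are
assumptions made for this example, and the program is assumed to run in the
default @code{"C"} locale so that the decimal point is @samp{.}.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: parse a base-10 string and
   store its integer part in *out.  Returns 0 on success, or -1 if str
   is not a valid number (in which case *out is left untouched).  The
   value is assumed to be non-negative and to fit an unsigned long.  */
int
parse_decimal (unsigned long *out, const char *str)
@{
  mpf_t f;
  int rc;

  mpf_init2 (f, 128);
  rc = mpf_set_str (f, str, 10);
  if (rc == 0)
    *out = mpf_get_ui (f);      /* truncates any fraction part */
  mpf_clear (f);
  return rc;
@}
@end example

With this helper, the strings @samp{1.5e3} and @samp{15@@2} both give 1500,
while @samp{abc} makes it return @minus{}1.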
4717
4718@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
4719Swap @var{rop1} and @var{rop2} efficiently.  Both the values and the
4720precisions of the two variables are swapped.
4721@end deftypefun
4722
4723
4724@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
4725@comment  node-name,  next,  previous,  up
4726@section Combined Initialization and Assignment Functions
4727@cindex Float assignment functions
4728@cindex Assignment functions
4729@cindex Float initialization functions
4730@cindex Initialization functions
4731
4732For convenience, GMP provides a parallel series of initialize-and-set functions
4733which initialize the output and then store the value there.  These functions'
names have the form @code{mpf_init_set@dots{}}.
4735
4736Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
4737functions, it can be used as the source or destination operand for the ordinary
4738float functions.  Don't use an initialize-and-set function on a variable
4739already initialized!
4740
4741@deftypefun void mpf_init_set (mpf_t @var{rop}, const mpf_t @var{op})
4742@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4743@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
4744@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
4745Initialize @var{rop} and set its value from @var{op}.
4746
4747The precision of @var{rop} will be taken from the active default precision, as
4748set by @code{mpf_set_default_prec}.
4749@end deftypefun
4750
4751@deftypefun int mpf_init_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
4752Initialize @var{rop} and set its value from the string in @var{str}.  See
4753@code{mpf_set_str} above for details on the assignment operation.
4754
4755Note that @var{rop} is initialized even if an error occurs.  (I.e., you have to
4756call @code{mpf_clear} for it.)
4757
4758The precision of @var{rop} will be taken from the active default precision, as
4759set by @code{mpf_set_default_prec}.
4760@end deftypefun
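
The requirement to clear the variable even after a failed conversion can be
sketched as follows.  The helper name @code{is_valid_decimal} is made up for
this example.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: returns 1 if str is accepted
   by mpf_init_set_str as a base-10 number, 0 otherwise.  */
int
is_valid_decimal (const char *str)
@{
  mpf_t t;
  int rc;

  rc = mpf_init_set_str (t, str, 10);
  mpf_clear (t);                /* needed whether or not rc is 0 */
  return rc == 0;
@}
@end example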
4761
4762
4763@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
4764@comment  node-name,  next,  previous,  up
4765@section Conversion Functions
4766@cindex Float conversion functions
4767@cindex Conversion functions
4768
4769@deftypefun double mpf_get_d (const mpf_t @var{op})
4770Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
4771towards zero).
4772
If the exponent in @var{op} is too big or too small to fit a @code{double}
then the result is system dependent.  If it is too big, an infinity is
returned when available; if it is too small, @math{0.0} is normally returned.
Hardware overflow, underflow and denorm traps may or may not occur.
4777@end deftypefun
4778
4779@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, const mpf_t @var{op})
4780Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
4781towards zero), and with an exponent returned separately.
4782
4783The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
4784exponent is stored to @code{*@var{exp}}.  @m{@var{d} * 2^{exp}, @var{d} *
47852^@var{exp}} is the (truncated) @var{op} value.  If @var{op} is zero, the
4786return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
4787
4788@cindex @code{frexp}
4789This is similar to the standard C @code{frexp} function (@pxref{Normalization
4790Functions,,, libc, The GNU C Library Reference Manual}).
4791@end deftypefun
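
A small sketch of the mantissa and exponent split follows.  The helper name
@code{split_ui} is hypothetical, and the value is taken from an
@code{unsigned long} only to keep the example short.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: return the double mantissa of
   value and store the matching power of 2 in *exp.  */
double
split_ui (signed long *exp, unsigned long value)
@{
  mpf_t f;
  double d;

  mpf_init_set_ui (f, value);
  d = mpf_get_d_2exp (exp, f);
  mpf_clear (f);
  return d;
@}
@end example

For a @var{value} of 12 this returns 0.75 and stores 4, since
@math{0.75 @GMPtimes{} 16 = 12}.  For zero it returns 0.0 and stores 0.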
4792
4793@deftypefun long mpf_get_si (const mpf_t @var{op})
4794@deftypefunx {unsigned long} mpf_get_ui (const mpf_t @var{op})
4795Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
4796fraction part.  If @var{op} is too big for the return type, the result is
4797undefined.
4798
4799See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
4800(@pxref{Miscellaneous Float Functions}).
4801@end deftypefun
4802
4803@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
4804Convert @var{op} to a string of digits in base @var{base}.  The base argument
4805may vary from 2 to 62 or from @minus{}2 to @minus{}36.  Up to @var{n_digits}
4806digits will be generated.  Trailing zeros are not returned.  No more digits
4807than can be accurately represented by @var{op} are ever generated.  If
@var{n_digits} is 0 then that accurate maximum number of digits is generated.
4809
4810For @var{base} in the range 2..36, digits and lower-case letters are used; for
4811@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
4812digits, upper-case letters, and lower-case letters (in that significance order)
4813are used.
4814
4815If @var{str} is @code{NULL}, the result string is allocated using the current
4816allocation function (@pxref{Custom Allocation}).  The block will be
4817@code{strlen(str)+1} bytes, that being exactly enough for the string and
4818null-terminator.
4819
4820If @var{str} is not @code{NULL}, it should point to a block of
4821@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
4822possible minus sign, and a null-terminator.  When @var{n_digits} is 0 to get
4823all significant digits, an application won't be able to know the space
4824required, and @var{str} should be @code{NULL} in that case.
4825
4826The generated string is a fraction, with an implicit radix point immediately
4827to the left of the first digit.  The applicable exponent is written through
4828the @var{expptr} pointer.  For example, the number 3.1416 would be returned as
4829string @nicode{"31416"} and exponent 1.
4830
4831When @var{op} is zero, an empty string is produced and the exponent returned
4832is 0.
4833
4834A pointer to the result string is returned, being either the allocated block
4835or the given @var{str}.
4836@end deftypefun
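
The digit string and exponent can be combined into a printable form along
these lines.  This is only a sketch: the helper name @code{format_ui}, the
limit of 10 digits, and the output layout are choices made for this example,
and the input is restricted to an @code{unsigned long} so that no minus sign
needs handling.

@example
#include <stdio.h>
#include <string.h>
#include <gmp.h>

/* Hypothetical helper, not part of GMP: write value into out in the
   form "0.<digits>e<exponent>", using at most 10 significant digits.
   out must have room for outsize bytes.  */
void
format_ui (char *out, size_t outsize, unsigned long value)
@{
  char digits[10 + 2];          /* n_digits + 2 bytes, as described above */
  mp_exp_t exp;
  mpf_t f;

  mpf_init_set_ui (f, value);
  mpf_get_str (digits, &exp, 10, 10, f);
  mpf_clear (f);

  if (strlen (digits) == 0)     /* zero gives an empty digit string */
    snprintf (out, outsize, "0");
  else
    snprintf (out, outsize, "0.%se%ld", digits, (long) exp);
@}
@end example

For example a @var{value} of 1500 yields the digit string @samp{15} with
exponent 4, so the helper produces @samp{0.15e4}.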
4837
4838
4839@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
4840@comment  node-name,  next,  previous,  up
4841@section Arithmetic Functions
4842@cindex Float arithmetic functions
4843@cindex Arithmetic functions
4844
4845@deftypefun void mpf_add (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4846@deftypefunx void mpf_add_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4847Set @var{rop} to @math{@var{op1} + @var{op2}}.
4848@end deftypefun
4849
4850@deftypefun void mpf_sub (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4851@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
4852@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4853Set @var{rop} to @var{op1} @minus{} @var{op2}.
4854@end deftypefun
4855
4856@deftypefun void mpf_mul (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4857@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4858Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
4859@end deftypefun
4860
4861Division is undefined if the divisor is zero, and passing a zero divisor to the
4862divide functions will make these functions intentionally divide by zero.  This
4863lets the user handle arithmetic exceptions in these functions in the same
4864manner as other arithmetic exceptions.
4865
4866@deftypefun void mpf_div (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4867@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
4868@deftypefunx void mpf_div_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4869@cindex Division functions
4870Set @var{rop} to @var{op1}/@var{op2}.
4871@end deftypefun
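
An application that prefers an error return to an arithmetic exception can
test the divisor first.  The following is a minimal sketch; the helper name
@code{checked_div} and its return convention are assumptions of this example.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: set rop to a / b and return 0,
   or return -1 and leave rop untouched when b is zero.  */
int
checked_div (mpf_t rop, const mpf_t a, const mpf_t b)
@{
  if (mpf_sgn (b) == 0)
    return -1;
  mpf_div (rop, a, b);
  return 0;
@}
@end example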
4872
4873@deftypefun void mpf_sqrt (mpf_t @var{rop}, const mpf_t @var{op})
4874@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
4875@cindex Root extraction functions
4876Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
4877@end deftypefun
4878
4879@deftypefun void mpf_pow_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4880@cindex Exponentiation functions
4881@cindex Powering functions
4882Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
4883@end deftypefun
4884
4885@deftypefun void mpf_neg (mpf_t @var{rop}, const mpf_t @var{op})
4886Set @var{rop} to @minus{}@var{op}.
4887@end deftypefun
4888
4889@deftypefun void mpf_abs (mpf_t @var{rop}, const mpf_t @var{op})
4890Set @var{rop} to the absolute value of @var{op}.
4891@end deftypefun
4892
4893@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
4894Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
4895@var{op2}}.
4896@end deftypefun
4897
4898@deftypefun void mpf_div_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
4899Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
4900@var{op2}}.
4901@end deftypefun
4902
4903@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
4904@comment  node-name,  next,  previous,  up
4905@section Comparison Functions
4906@cindex Float comparison functions
4907@cindex Comparison functions
4908
4909@deftypefun int mpf_cmp (const mpf_t @var{op1}, const mpf_t @var{op2})
4910@deftypefunx int mpf_cmp_d (const mpf_t @var{op1}, double @var{op2})
4911@deftypefunx int mpf_cmp_ui (const mpf_t @var{op1}, unsigned long int @var{op2})
4912@deftypefunx int mpf_cmp_si (const mpf_t @var{op1}, signed long int @var{op2})
4913Compare @var{op1} and @var{op2}.  Return a positive value if @math{@var{op1} >
4914@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
4915@math{@var{op1} < @var{op2}}.
4916
4917@code{mpf_cmp_d} can be called with an infinity, but results are undefined for
4918a NaN.
4919@end deftypefun
4920
@deftypefun int mpf_eq (const mpf_t @var{op1}, const mpf_t @var{op2}, mp_bitcnt_t @var{op3})
4922Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
4923equal, zero otherwise.  I.e., test if @var{op1} and @var{op2} are approximately
4924equal.
4925
Caution 1: All versions of GMP up to version 4.2.4 compared just whole limbs,
4927meaning sometimes more than @var{op3} bits, sometimes fewer.
4928
4929Caution 2: This function will consider XXX11...111 and XX100...000 different,
4930even if ... is replaced by a semi-infinite number of bits.  Such numbers are
4931really just one ulp off, and should be considered equal.
4932@end deftypefun
4933
4934@deftypefun void mpf_reldiff (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4935Compute the relative difference between @var{op1} and @var{op2} and store the
4936result in @var{rop}.  This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
4937@end deftypefun
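
One possible way to build a tolerance test on top of @code{mpf_reldiff} is
sketched below.  The helper name @code{relatively_close}, the use of a
@code{double} tolerance, and the explicit handling of a zero @var{op1} are all
choices made for this example.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: returns 1 if the relative
   difference between a and b is at most tol, 0 otherwise.  */
int
relatively_close (const mpf_t a, const mpf_t b, double tol)
@{
  mpf_t rd;
  int close;

  if (mpf_sgn (a) == 0)         /* avoid relying on division by zero */
    return mpf_sgn (b) == 0;

  mpf_init2 (rd, mpf_get_prec (a));
  mpf_reldiff (rd, a, b);       /* abs(a-b) / a */
  mpf_abs (rd, rd);             /* the quotient is negative if a is */
  close = (mpf_cmp_d (rd, tol) <= 0);
  mpf_clear (rd);
  return close;
@}
@end example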
4938
4939@deftypefn Macro int mpf_sgn (const mpf_t @var{op})
4940@cindex Sign tests
4941@cindex Float sign tests
4942Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
4943@math{-1} if @math{@var{op} < 0}.
4944
4945This function is actually implemented as a macro.  It evaluates its argument
4946multiple times.
4947@end deftypefn
4948
4949@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
4950@comment  node-name,  next,  previous,  up
4951@section Input and Output Functions
4952@cindex Float input and output functions
4953@cindex Input functions
4954@cindex Output functions
4955@cindex I/O functions
4956
The functions in this section read @code{mpf} numbers from a stdio stream or
write them to one.  Passing a @code{NULL} pointer for a @var{stream} argument
makes the input function read from @code{stdin} and the output function write
to @code{stdout}.
4961
4962When using any of these functions, it is a good idea to include @file{stdio.h}
4963before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
4964for these functions.
4965
4966See also @ref{Formatted Output} and @ref{Formatted Input}.
4967
4968@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
4969Print @var{op} to @var{stream}, as a string of digits.  Return the number of
4970bytes written, or if an error occurred, return 0.
4971
The mantissa is prefixed with @samp{0.} and is in the given @var{base},
4973which may vary from 2 to 62 or from @minus{}2 to @minus{}36.  An exponent is
4974then printed, separated by an @samp{e}, or if the base is greater than 10 then
4975by an @samp{@@}.  The exponent is always in decimal.  The decimal point follows
4976the current locale, on systems providing @code{localeconv}.
4977
4978For @var{base} in the range 2..36, digits and lower-case letters are used; for
4979@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
4980digits, upper-case letters, and lower-case letters (in that significance order)
4981are used.
4982
Up to @var{n_digits} digits will be printed from the mantissa, except that no more
4984digits than are accurately representable by @var{op} will be printed.
4985@var{n_digits} can be 0 to select that accurate maximum.
4986@end deftypefun
4987
4988@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
4989Read a string in base @var{base} from @var{stream}, and put the read float in
4990@var{rop}.  The string is of the form @samp{M@@N} or, if the base is 10 or
4991less, alternatively @samp{MeN}.  @samp{M} is the mantissa and @samp{N} is the
4992exponent.  The mantissa is always in the specified base.  The exponent is
4993either in the specified base or, if @var{base} is negative, in decimal.  The
4994decimal point expected is taken from the current locale, on systems providing
4995@code{localeconv}.
4996
4997The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
4998@minus{}2.  Negative values are used to specify that the exponent is in
4999decimal.
5000
5001Unlike the corresponding @code{mpz} function, the base will not be determined
5002from the leading characters of the string if @var{base} is 0.  This is so that
5003numbers like @samp{0.23} are not interpreted as octal.
5004
5005Return the number of bytes read, or if an error occurred, return 0.
5006@end deftypefun
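
The two functions can be paired to write a value and read it back.  The sketch
below does so through a temporary file; the helper name @code{roundtrip_ok}
and the use of @code{tmpfile} are assumptions of this example, and it is only
meant for small integer values that print exactly in decimal.

@example
#include <stdio.h>      /* before gmp.h, so the stream prototypes exist */
#include <gmp.h>

/* Hypothetical helper, not part of GMP: write value in base 10, read
   it back, and compare.  Returns 1 if the value read equals the value
   written, 0 if not, and -1 if any I/O step fails.  */
int
roundtrip_ok (unsigned long value)
@{
  FILE *fp = tmpfile ();
  mpf_t out, in;
  int result = -1;

  if (fp == NULL)
    return -1;

  mpf_init_set_ui (out, value);
  mpf_init (in);

  if (mpf_out_str (fp, 10, 0, out) != 0)        /* 0 means an error */
    @{
      rewind (fp);
      if (mpf_inp_str (in, fp, 10) != 0)        /* 0 means an error */
        result = (mpf_cmp (in, out) == 0);
    @}

  mpf_clear (out);
  mpf_clear (in);
  fclose (fp);
  return result;
@}
@end example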
5007
5008@c @deftypefun void mpf_out_raw (FILE *@var{stream}, const mpf_t @var{float})
5009@c Output @var{float} on stdio stream @var{stream}, in raw binary
5010@c format.  The float is written in a portable format, with 4 bytes of
5011@c size information, and that many bytes of limbs.  Both the size and the
5012@c limbs are written in decreasing significance order.
5013@c @end deftypefun
5014
5015@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
5016@c Input from stdio stream @var{stream} in the format written by
5017@c @code{mpf_out_raw}, and put the result in @var{float}.
5018@c @end deftypefun
5019
5020
5021@node Miscellaneous Float Functions,  , I/O of Floats, Floating-point Functions
5022@comment  node-name,  next,  previous,  up
5023@section Miscellaneous Functions
5024@cindex Miscellaneous float functions
5025@cindex Float miscellaneous functions
5026
5027@deftypefun void mpf_ceil (mpf_t @var{rop}, const mpf_t @var{op})
5028@deftypefunx void mpf_floor (mpf_t @var{rop}, const mpf_t @var{op})
5029@deftypefunx void mpf_trunc (mpf_t @var{rop}, const mpf_t @var{op})
5030@cindex Rounding functions
5031@cindex Float rounding functions
5032Set @var{rop} to @var{op} rounded to an integer.  @code{mpf_ceil} rounds to the
5033next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
5034to the integer towards zero.
5035@end deftypefun
5036
5037@deftypefun int mpf_integer_p (const mpf_t @var{op})
5038Return non-zero if @var{op} is an integer.
5039@end deftypefun
5040
5041@deftypefun int mpf_fits_ulong_p (const mpf_t @var{op})
5042@deftypefunx int mpf_fits_slong_p (const mpf_t @var{op})
5043@deftypefunx int mpf_fits_uint_p (const mpf_t @var{op})
5044@deftypefunx int mpf_fits_sint_p (const mpf_t @var{op})
5045@deftypefunx int mpf_fits_ushort_p (const mpf_t @var{op})
5046@deftypefunx int mpf_fits_sshort_p (const mpf_t @var{op})
5047Return non-zero if @var{op} would fit in the respective C data type, when
5048truncated to an integer.
5049@end deftypefun
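
A range check before conversion might be written as in the following sketch.
The helper name @code{double_to_long} is hypothetical, and the input is
assumed to be finite, since an infinity or NaN cannot be stored in an
@code{mpf_t}.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: convert the finite double d to
   a long, truncating towards zero.  Returns 1 and stores the result
   in *out if it fits, or returns 0 and leaves *out untouched.  */
int
double_to_long (long *out, double d)
@{
  mpf_t f;
  int fits;

  mpf_init_set_d (f, d);
  fits = (mpf_fits_slong_p (f) != 0);
  if (fits)
    *out = mpf_get_si (f);      /* fraction part is discarded */
  mpf_clear (f);
  return fits;
@}
@end example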
5050
5051@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{nbits})
5052@cindex Random number functions
5053@cindex Float random number functions
5054Generate a uniformly distributed random float in @var{rop}, such that @math{0
5055@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa or
5056less if the precision of @var{rop} is smaller.
5057
5058The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@pxref{Random State Initialization}) before
5060invoking this function.
5061@end deftypefun
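
The required setup of the random state can be sketched as follows.  The helper
name @code{sample_in_unit_interval}, the default generator, and the 128-bit
size are arbitrary choices for this example.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: draw one random float and
   return 1 if it lies in the range 0 <= x < 1, 0 otherwise.  */
int
sample_in_unit_interval (unsigned long seed)
@{
  gmp_randstate_t state;
  mpf_t x;
  int ok;

  gmp_randinit_default (state);         /* must precede mpf_urandomb */
  gmp_randseed_ui (state, seed);
  mpf_init2 (x, 128);

  mpf_urandomb (x, state, 128);
  ok = (mpf_sgn (x) >= 0 && mpf_cmp_ui (x, 1) < 0);

  mpf_clear (x);
  gmp_randclear (state);
  return ok;
@}
@end example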
5062
5063@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
5064Generate a random float of at most @var{max_size} limbs, with long strings of
5065zeros and ones in the binary representation.  The exponent of the number is in
5066the interval @minus{}@var{exp} to @var{exp} (in limbs).  This function is
useful for testing functions and algorithms, since this kind of random
number has proven to be more likely to trigger corner-case bugs.  Negative
5069random numbers are generated when @var{max_size} is negative.
5070@end deftypefun
5071
5072@c @deftypefun size_t mpf_size (const mpf_t @var{op})
5073@c Return the size of @var{op} measured in number of limbs.  If @var{op} is
5074@c zero, the returned value will be zero.  (@xref{Nomenclature}, for an
5075@c explanation of the concept @dfn{limb}.)
5076@c
5077@c @strong{This function is obsolete.  It will disappear from future GMP
5078@c releases.}
5079@c @end deftypefun
5080
5081
5082@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
5083@comment  node-name,  next,  previous,  up
5084@chapter Low-level Functions
5085@cindex Low-level functions
5086
5087This chapter describes low-level GMP functions, used to implement the
5088high-level GMP functions, but also intended for time-critical user code.
5089
5090These functions start with the prefix @code{mpn_}.
5091
5092@c 1. Some of these function clobber input operands.
5093@c
5094
5095The @code{mpn} functions are designed to be as fast as possible, @strong{not}
5096to provide a coherent calling interface.  The different functions have somewhat
5097similar interfaces, but there are variations that make them hard to use.  These
5098functions do as little as possible apart from the real multiple precision
5099computation, so that no time is spent on things that not all callers need.
5100
5101A source operand is specified by a pointer to the least significant limb and a
5102limb count.  A destination operand is specified by just a pointer.  It is the
5103responsibility of the caller to ensure that the destination has enough space
5104for storing the result.
5105
5106With this way of specifying operands, it is possible to perform computations on
5107subranges of an argument, and store the result into a subrange of a
5108destination.
5109
5110A common requirement for all functions is that each source area needs at least
5111one limb.  No size argument may be zero.  Unless otherwise stated, in-place
5112operations are allowed where source and destination are the same, but not where
5113they only partly overlap.
5114
5115The @code{mpn} functions are the base for the implementation of the
5116@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.
5117
5118This example adds the number beginning at @var{s1p} and the number beginning at
5119@var{s2p} and writes the sum at @var{destp}.  All areas have @var{n} limbs.
5120
@example
cy = mpn_add_n (destp, s1p, s2p, n);
@end example
5124
5125It should be noted that the @code{mpn} functions make no attempt to identify
5126high or low zero limbs on their operands, or other special forms.  On random
5127data such cases will be unlikely and it'd be wasteful for every function to
5128check every time.  An application knowing something about its data can take
5129steps to trim or perhaps split its calculations.
5130@c
5131@c  For reference, within gmp mpz_t operands never have high zero limbs, and
5132@c  we rate low zero limbs as unlikely too (or something an application should
5133@c  handle).  This is a prime motivation for not stripping zero limbs in say
5134@c  mpn_mul_n etc.
5135@c
5136@c  Other applications doing variable-length calculations will quite likely do
5137@c  something similar to mpz.  And even if not then it's highly likely zero
5138@c  limb stripping can be done at just a few judicious points, which will be
5139@c  more efficient than having lots of mpn functions checking every time.
5140
5141@sp 1
5142@noindent
5143In the notation used below, a source operand is identified by the pointer to
5144the least significant limb, and the limb count in braces.  For example,
5145@{@var{s1p}, @var{s1n}@}.
5146
5147@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5148Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
5149least significant limbs of the result to @var{rp}.  Return carry, either 0 or
51501.
5151
5152This is the lowest-level function for addition.  It is the preferred function
5153for addition, since it is written in assembly for most CPUs.  For addition of
5154a variable to itself (i.e., @var{s1p} equals @var{s2p}) use @code{mpn_lshift}
5155with a count of 1 for optimal speed.
5156@end deftypefun
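
Since the destination holds only @var{n} limbs, a caller that needs the full
sum has to store the returned carry itself.  A minimal sketch follows; the
helper name @code{add_keep_carry} is made up for this example, and the caller
is assumed to have provided @math{@var{n}+1} limbs at @var{rp}.

@example
#include <gmp.h>

/* Hypothetical helper, not part of GMP: add two n-limb numbers and
   store the n+1 limb result at rp, least significant limb first.  */
void
add_keep_carry (mp_limb_t *rp, const mp_limb_t *ap,
                const mp_limb_t *bp, mp_size_t n)
@{
  rp[n] = mpn_add_n (rp, ap, bp, n);    /* carry is either 0 or 1 */
@}
@end example

For instance, with @var{n} equal to 2, adding a number whose two limbs are
both @code{GMP_NUMB_MAX} to the number 1 gives two zero limbs followed by a
high limb of 1.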
5157
5158@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5159Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
5160significant limbs of the result to @var{rp}.  Return carry, either 0 or 1.
5161@end deftypefun
5162
5163@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
5164Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
5165@var{s1n} least significant limbs of the result to @var{rp}.  Return carry,
5166either 0 or 1.
5167
5168This function requires that @var{s1n} is greater than or equal to @var{s2n}.
5169@end deftypefun
5170
5171@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5172Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
5173@var{n} least significant limbs of the result to @var{rp}.  Return borrow,
5174either 0 or 1.
5175
5176This is the lowest-level function for subtraction.  It is the preferred
5177function for subtraction, since it is written in assembly for most CPUs.
5178@end deftypefun
5179
5180@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5181Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
5182significant limbs of the result to @var{rp}.  Return borrow, either 0 or 1.
5183@end deftypefun
5184
5185@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
5186Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
5187@var{s1n} least significant limbs of the result to @var{rp}.  Return borrow,
5188either 0 or 1.
5189
5190This function requires that @var{s1n} is greater than or equal to
5191@var{s2n}.
5192@end deftypefun
5193
5194@deftypefun mp_limb_t mpn_neg (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Perform the negation of @{@var{sp}, @var{n}@}, and write the result to
@{@var{rp}, @var{n}@}.  Return borrow, either 0 or 1.
5197@end deftypefun
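
The value returned by @code{mpn_neg} can be read as the borrow out of the top
limb when the source is subtracted from zero.  The sketch below models that
reading in plain C; @code{limb_t} and @code{model_neg} are hypothetical names,
a 64-bit limb without nails is assumed, and this is not GMP's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Model of mpn_neg: {rp, n} = 0 - {sp, n} modulo 2^(64*n).
   Returns the borrow out of the top limb, which is 1 unless the
   source is zero. */
static limb_t model_neg(limb_t *rp, const limb_t *sp, size_t n)
{
    limb_t borrow = 0;
    assert(n >= 1);
    for (size_t i = 0; i < n; i++) {
        limb_t s = sp[i];
        rp[i] = (limb_t)0 - s - borrow;   /* wraps modulo 2^64 */
        borrow |= (s != 0);               /* once set, it stays set */
    }
    return borrow;
}
```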
5198
5199@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5200Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
52012*@var{n}-limb result to @var{rp}.
5202
5203The destination has to have space for 2*@var{n} limbs, even if the product's
5204most significant limb is zero.  No overlap is permitted between the
5205destination and either source.
5206
5207If the two input operands are the same, use @code{mpn_sqr}.
5208@end deftypefun
5209
5210@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
5211Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
5212(@var{s1n}+@var{s2n})-limb result to @var{rp}.  Return the most significant
5213limb of the result.
5214
5215The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
5216product's most significant limb is zero.  No overlap is permitted between the
5217destination and either source.
5218
5219This function requires that @var{s1n} is greater than or equal to @var{s2n}.
5220@end deftypefun
5221
5222@deftypefun void mpn_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5223Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
5224result to @var{rp}.
5225
5226The destination has to have space for 2*@var{n} limbs, even if the result's
5227most significant limb is zero.  No overlap is permitted between the
5228destination and the source.
5229@end deftypefun
5230
5231@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5232Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
5233significant limbs of the product to @var{rp}.  Return the most significant
5234limb of the product.  @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
5235allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
5236
5237This is a low-level function that is a building block for general
5238multiplication as well as other operations in GMP@.  It is written in assembly
5239for most CPUs.
5240
Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
with a count equal to the base-2 logarithm of @var{s2limb} instead, for optimal
speed.
5243@end deftypefun
5244
5245@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5246Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
5247significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
5248to @var{rp}.  Return the most significant limb of the product, plus carry-out
5249from the addition.
5250
5251This is a low-level function that is a building block for general
5252multiplication as well as other operations in GMP@.  It is written in assembly
5253for most CPUs.
5254@end deftypefun
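
To show how @code{mpn_mul_1} and @code{mpn_addmul_1} can serve as building
blocks for general multiplication, here is a plain C sketch of a schoolbook
product assembled from models of the two functions.  It assumes a hypothetical
32-bit limb (so that a 64-bit integer can hold a double-limb product), and all
names beginning with @code{model_} are invented for the illustration.  It is
not GMP's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical 32-bit stand-in for mp_limb_t */

/* Model of mpn_mul_1: {rp, n} = low limbs of {s1p, n} * v.
   Returns the most significant limb of the product. */
static limb_t model_mul_1(limb_t *rp, const limb_t *s1p, size_t n, limb_t v)
{
    limb_t high = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)s1p[i] * v + high;   /* cannot overflow */
        rp[i] = (limb_t)t;
        high = (limb_t)(t >> 32);
    }
    return high;
}

/* Model of mpn_addmul_1: {rp, n} += {s1p, n} * v.
   Returns the high limb of the product plus the carry-out. */
static limb_t model_addmul_1(limb_t *rp, const limb_t *s1p, size_t n, limb_t v)
{
    limb_t high = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)s1p[i] * v + rp[i] + high;   /* fits 64 bits */
        rp[i] = (limb_t)t;
        high = (limb_t)(t >> 32);
    }
    return high;
}

/* Schoolbook product: {rp, s1n+s2n} = {s1p, s1n} * {s2p, s2n}.
   The destination must not overlap either source. */
static void model_mul_basecase(limb_t *rp, const limb_t *s1p, size_t s1n,
                               const limb_t *s2p, size_t s2n)
{
    assert(s1n >= s2n && s2n >= 1);
    rp[s1n] = model_mul_1(rp, s1p, s1n, s2p[0]);
    for (size_t j = 1; j < s2n; j++)
        rp[s1n + j] = model_addmul_1(rp + j, s1p, s1n, s2p[j]);
}
```

Each pass writes one new high limb, which is why the destination needs the
full @var{s1n} + @var{s2n} limbs even when the top limb ends up zero.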
5255
5256@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5257Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
5258least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
5259result to @var{rp}.  Return the most significant limb of the product, plus
5260borrow-out from the subtraction.
5261
5262This is a low-level function that is a building block for general
5263multiplication and division as well as other operations in GMP@.  It is written
5264in assembly for most CPUs.
5265@end deftypefun
5266
5267@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
5268Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
5269at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
5270@var{dn}@}.  The quotient is rounded towards 0.
5271
No overlap is permitted between arguments, except that @var{np} may equal
5273@var{rp}.  The dividend size @var{nn} must be greater than or equal to divisor
5274size @var{dn}.  The most significant limb of the divisor must be non-zero.  The
5275@var{qxn} operand must be zero.
5276@end deftypefun
5277
5278@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
5279[This function is obsolete.  Please call @code{mpn_tdiv_qr} instead for best
5280performance.]
5281
5282Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
5283quotient at @var{r1p}, with the exception of the most significant limb, which
5284is returned.  The remainder replaces the dividend at @var{rs2p}; it will be
5285@var{s3n} limbs long (i.e., as many limbs as the divisor).
5286
5287In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
5288stored after the integral limbs.  For most usages, @var{qxn} will be zero.
5289
5290It is required that @var{rs2n} is greater than or equal to @var{s3n}.  It is
5291required that the most significant bit of the divisor is set.
5292
5293If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}.  Aside
5294from that special case, no overlap between arguments is permitted.
5295
5296Return the most significant limb of the quotient, either 0 or 1.
5297
5298The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
5299limbs large.
5300@end deftypefun
5301
5302@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
5303@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
5304Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
5305@var{r1p}.  Return the remainder.
5306
5307The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
5308addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
5309@var{qxn}@}.  Either or both @var{s2n} and @var{qxn} can be zero.  For most
5310usages, @var{qxn} will be zero.
5311
5312@code{mpn_divmod_1} exists for upward source compatibility and is simply a
5313macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.
5314
5315The areas at @var{r1p} and @var{s2p} have to be identical or completely
5316separate, not partially overlapping.
5317@end deftypefn
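
The layout of integer and fraction quotient limbs described for
@code{mpn_divrem_1} can be sketched in plain C as follows.  This is a model
under stated assumptions rather than GMP's code: a hypothetical 32-bit limb
is used so that 64-bit arithmetic can hold a remainder-and-limb pair, and
@code{model_divrem_1} is an invented name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical 32-bit stand-in for mp_limb_t */

/* Model of mpn_divrem_1: integer quotient limbs go to {r1p+qxn, s2n},
   fraction limbs to {r1p, qxn}.  Returns the final remainder. */
static limb_t model_divrem_1(limb_t *r1p, size_t qxn,
                             const limb_t *s2p, size_t s2n, limb_t d)
{
    uint64_t rem = 0;
    assert(d != 0);
    /* Integer part: long division from the most significant limb down. */
    for (size_t i = s2n; i-- > 0; ) {
        uint64_t cur = (rem << 32) | s2p[i];
        r1p[qxn + i] = (limb_t)(cur / d);
        rem = cur % d;
    }
    /* Fraction part: keep dividing, bringing down zero limbs. */
    for (size_t i = qxn; i-- > 0; ) {
        uint64_t cur = rem << 32;
        r1p[i] = (limb_t)(cur / d);
        rem = cur % d;
    }
    return (limb_t)rem;
}
```

With @var{qxn} equal to 0 the second loop does nothing, which corresponds to
the @code{mpn_divmod_1} form.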
5318
5319@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
5320[This function is obsolete.  Please call @code{mpn_tdiv_qr} instead for best
5321performance.]
5322@end deftypefun
5323
5324@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
5325@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
5326Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
5327the result to @{@var{rp}, @var{n}@}.  If 3 divides exactly, the return value is
5328zero and the result is the quotient.  If not, the return value is non-zero and
5329the result won't be anything useful.
5330
5331@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
5332return value from a previous call, so a large calculation can be done piece by
5333piece from low to high.  @code{mpn_divexact_by3} is simply a macro calling
5334@code{mpn_divexact_by3c} with a 0 carry parameter.
5335
5336These routines use a multiply-by-inverse and will be faster than
5337@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.
5338
5339The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
5340and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
5341@m{b=2\GMPraise{@code{GMP\_NUMB\_BITS}}, b=2^GMP_NUMB_BITS}.  The
5342return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
5343be 0, 1 or 2 (these are both borrows really).  When @math{c=0} clearly
5344@math{q=(a-i)/3}.  When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
53453} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
5346@code{mp_bits_per_limb} is even, which is always so currently).
5347@end deftypefn
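
The relation @math{c*b^n + a-i = 3*q} can be checked with a small plain C
model of a multiply-by-inverse division by 3.  The sketch assumes a
hypothetical 32-bit limb, so @math{b = 2^32}, and the name
@code{model_divexact_by3c} is invented; it illustrates the technique and is
not taken from GMP's sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical 32-bit stand-in for mp_limb_t */

/* Model of mpn_divexact_by3c with b = 2^32.  carry is 0, 1 or 2 on entry
   and on return; a return of 0 means 3 divided exactly. */
static limb_t model_divexact_by3c(limb_t *rp, const limb_t *sp, size_t n,
                                  limb_t carry)
{
    const limb_t inverse3 = 0xAAAAAAABu;   /* 3 * inverse3 == 1 mod 2^32 */
    assert(carry <= 2);
    for (size_t i = 0; i < n; i++) {
        limb_t s = sp[i];
        limb_t l = s - carry;              /* wraps on borrow */
        limb_t borrow = l > s;
        limb_t q = (limb_t)((uint64_t)l * inverse3);  /* 3*q == l mod 2^32 */
        rp[i] = q;
        /* 3*q exceeds the limb once if q > b/3 and twice if q > 2b/3;
           each excess is a borrow against the next limb. */
        carry = borrow + (q > 0x55555555u) + (q > 0xAAAAAAAAu);
    }
    return carry;
}
```

For a single limb holding 7 the model returns @math{c=2}, and @math{3-c = 1}
is indeed 7 mod 3.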
5348
5349@deftypefun mp_limb_t mpn_mod_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
5350Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
5351@var{s1n} can be zero.
5352@end deftypefun
5353
5354@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
5355Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
5356@{@var{rp}, @var{n}@}.  The bits shifted out at the left are returned in the
5357least significant @var{count} bits of the return value (the rest of the return
5358value is zero).
5359
5360@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1.  The
5361regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
5362@math{@var{rp} @ge{} @var{sp}}.
5363
5364This function is written in assembly for most CPUs.
5365@end deftypefun
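
The return value and the overlap condition of @code{mpn_lshift} are easier to
see in a small model.  The following plain C sketch assumes a 64-bit limb
without nails; @code{limb_t} and @code{model_lshift} are hypothetical names
and the code is not GMP's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Model of mpn_lshift: {rp, n} = {sp, n} << count.  Returns the bits
   shifted out at the top, in the low count bits of the return value. */
static limb_t model_lshift(limb_t *rp, const limb_t *sp, size_t n,
                           unsigned count)
{
    assert(n >= 1 && count >= 1 && count < 64);
    limb_t out = sp[n - 1] >> (64 - count);
    /* Work from the most significant limb down. */
    for (size_t i = n - 1; i > 0; i--)
        rp[i] = (sp[i] << count) | (sp[i - 1] >> (64 - count));
    rp[0] = sp[0] << count;
    return out;
}
```

Because the model works from the high limb down, every source limb is read
before a store can reach it whenever @var{rp} is at or above @var{sp}, which
matches the overlap rule stated above.  A count of 1 doubles the operand.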
5366
5367@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
5368Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
5369@{@var{rp}, @var{n}@}.  The bits shifted out at the right are returned in the
5370most significant @var{count} bits of the return value (the rest of the return
5371value is zero).
5372
5373@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1.  The
5374regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
5375@math{@var{rp} @le{} @var{sp}}.
5376
5377This function is written in assembly for most CPUs.
5378@end deftypefun
5379
5380@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5381Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
5382positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a
5383negative value if @math{@var{s1} < @var{s2}}.
5384@end deftypefun
5385
5386@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn})
5387Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{xp},
@var{xn}@} and @{@var{yp}, @var{yn}@}.  The result can be up to @var{yn} limbs;
the return value is the actual number produced.  Both source operands are
5390destroyed.
5391
It is required that @math{@var{xn} @ge{} @var{yn} > 0}, and the most significant
5393limb of @{@var{yp}, @var{yn}@} must be non-zero.  No overlap is permitted
5394between @{@var{xp}, @var{xn}@} and @{@var{yp}, @var{yn}@}.
5395@end deftypefun
5396
5397@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t @var{ylimb})
5398Return the greatest common divisor of @{@var{xp}, @var{xn}@} and @var{ylimb}.
5399Both operands must be non-zero.
5400@end deftypefun
5401
5402@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{gp}, mp_limb_t *@var{sp}, mp_size_t *@var{sn}, mp_limb_t *@var{up}, mp_size_t @var{un}, mp_limb_t *@var{vp}, mp_size_t @var{vn})
5403Let @m{U,@var{U}} be defined by @{@var{up}, @var{un}@} and let @m{V,@var{V}} be
5404defined by @{@var{vp}, @var{vn}@}.
5405
5406Compute the greatest common divisor @math{G} of @math{U} and @math{V}.  Compute
5407a cofactor @math{S} such that @math{G = US + VT}.  The second cofactor @var{T}
5408is not computed but can easily be obtained from @m{(G - US) / V, (@var{G} -
5409@var{U}*@var{S}) / @var{V}} (the division will be exact).  It is required that
@math{@var{un} @ge{} @var{vn} > 0}, and the most significant
5411limb of @{@var{vp}, @var{vn}@} must be non-zero.
5412
5413@math{S} satisfies @math{S = 1} or @math{@GMPabs{S} < V / (2 G)}. @math{S =
54140} if and only if @math{V} divides @math{U} (i.e., @math{G = V}).
5415
5416Store @math{G} at @var{gp} and let the return value define its limb count.
5417Store @math{S} at @var{sp} and let |*@var{sn}| define its limb count.  @math{S}
5418can be negative; when this happens *@var{sn} will be negative.  The area at
5419@var{gp} should have room for @var{vn} limbs and the area at @var{sp} should
5420have room for @math{@var{vn}+1} limbs.
5421
5422Both source operands are destroyed.
5423
5424Compatibility notes: GMP 4.3.0 and 4.3.1 defined @math{S} less strictly.
5425Earlier as well as later GMP releases define @math{S} as described here.
5426GMP releases before GMP 4.3.0 required additional space for both input and output
5427areas. More precisely, the areas @{@var{up}, @math{@var{un}+1}@} and
5428@{@var{vp}, @math{@var{vn}+1}@} were destroyed (i.e.@: the operands plus an
5429extra limb past the end of each), and the areas pointed to by @var{gp} and
5430@var{sp} should each have room for @math{@var{un}+1} limbs.
5431@end deftypefun
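
The relation between @math{G}, @math{S} and the uncomputed cofactor @math{T}
can be tried out on machine words.  The sketch below is the textbook extended
Euclidean algorithm in plain C, with hypothetical names
@code{model_gcdext} and @code{model_cofactor_t}.  It only demonstrates
@math{G = US + VT} and the recovery of @math{T}; it makes no attempt to
reproduce the exact normalization of @math{S} that @code{mpn_gcdext}
guarantees, and it is only meant for operands small enough not to overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Extended Euclid on machine words: returns G = gcd(u, v) and stores a
   cofactor S such that G = u*S + v*T for some integer T.
   Requires u >= v > 0. */
static int64_t model_gcdext(int64_t u, int64_t v, int64_t *s)
{
    int64_t old_r = u, r = v;
    int64_t old_s = 1, cur_s = 0;
    assert(u >= v && v > 0);
    while (r != 0) {
        int64_t q = old_r / r;
        int64_t t;
        t = old_r - q * r;      old_r = r;      r = t;
        t = old_s - q * cur_s;  old_s = cur_s;  cur_s = t;
    }
    *s = old_s;
    return old_r;
}

/* Recover the second cofactor T = (G - U*S) / V; the division is exact. */
static int64_t model_cofactor_t(int64_t u, int64_t v, int64_t g, int64_t s)
{
    return (g - u * s) / v;
}
```

For @math{U = 240} and @math{V = 46} the model gives @math{G = 2},
@math{S = -9} and @math{T = 47}.  When @math{V} divides @math{U} it gives
@math{S = 0} and @math{G = V}.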
5432
5433@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
5434Compute the square root of @{@var{sp}, @var{n}@} and put the result at
5435@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
5436@var{retval}@}.  @var{r2p} needs space for @var{n} limbs, but the return value
5437indicates how many are produced.
5438
5439The most significant limb of @{@var{sp}, @var{n}@} must be non-zero.  The
5440areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
5441be completely separate.  The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
5442@var{n}@} must be either identical or completely separate.
5443
5444If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
5445case the return value is zero or non-zero according to whether the remainder
5446would have been zero or non-zero.
5447
5448A return value of zero indicates a perfect square.  See also
5449@code{mpn_perfect_square_p}.
5450@end deftypefun
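
The root and remainder produced by @code{mpn_sqrtrem} satisfy
@math{s = r1^2 + r2} with @math{0 @le{} r2 @le{} 2*r1}.  A single-word model
in plain C shows the same relation; @code{model_sqrtrem} is a hypothetical
name and the digit-by-digit method used here is a textbook one, not
necessarily what GMP does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Single-word model of mpn_sqrtrem: returns floor(sqrt(s)) and, if rem is
   not NULL, stores s - root*root there.  A zero remainder means s is a
   perfect square. */
static uint64_t model_sqrtrem(uint64_t s, uint64_t *rem)
{
    uint64_t root = 0;
    uint64_t x = s;
    uint64_t bit = (uint64_t)1 << 62;    /* highest power of 4 in range */
    while (bit > x)
        bit >>= 2;
    while (bit != 0) {
        if (x >= root + bit) {
            x -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    if (rem != NULL)
        *rem = x;                        /* what is left is the remainder */
    return root;
}
```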
5451
5452@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
5453Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
5454base @var{base}, and return the number of characters produced.  There may be
5455leading zeros in the string.  The string is not in ASCII; to convert it to
5456printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
5457the base and range.  @var{base} can vary from 2 to 256.
5458
5459The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
5460non-zero.  The input @{@var{s1p}, @var{s1n}@} is clobbered, except when
5461@var{base} is a power of 2, in which case it's unchanged.
5462
The area at @var{str} has to have space for the largest possible number
represented by an @var{s1n}-limb array, plus one extra character.
5465@end deftypefun
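
Since @code{mpn_get_str} produces raw digit values rather than ASCII, a
caller wanting printable output needs a translation step.  One possible
helper is sketched below in plain C; @code{digits_to_ascii} is a hypothetical
name, and the sketch only covers bases up to 36, using upper case letters for
digit values from 10 upwards.

```c
#include <assert.h>
#include <stddef.h>

/* Convert raw digit values (each in the range 0 to base-1) to printable
   characters, in place.  Handles bases 2 to 36 only. */
static void digits_to_ascii(unsigned char *str, size_t len, int base)
{
    static const char table[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    assert(base >= 2 && base <= 36);
    for (size_t i = 0; i < len; i++) {
        assert(str[i] < base);           /* must be a valid raw digit */
        str[i] = (unsigned char)table[str[i]];
    }
}
```

The result is not NUL-terminated; a caller that wants a C string would append
the terminator itself, and may also want to skip any leading zeros.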
5466
5467@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base})
5468Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at
5469@var{rp}.
5470
5471@math{@var{str}[0]} is the most significant byte and
5472@math{@var{str}[@var{strsize}-1]} is the least significant.  Each byte should
5473be a value in the range 0 to @math{@var{base}-1}, not an ASCII character.
5474@var{base} can vary from 2 to 256.
5475
5476The return value is the number of limbs written to @var{rp}.  If the most
5477significant input byte is non-zero then the high limb at @var{rp} will be
5478non-zero, and only that exact number of limbs will be required there.
5479
5480If the most significant input byte is zero then there may be high zero limbs
5481written to @var{rp} and included in the return value.
5482
5483@var{strsize} must be at least 1, and no overlap is permitted between
5484@{@var{str},@var{strsize}@} and the result at @var{rp}.
5485@end deftypefun
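
The input convention of @code{mpn_set_str} (raw digit values, most
significant first) can be illustrated with a plain C model restricted to
results that fit in one 64-bit word.  Both function names are hypothetical.
@code{ascii_to_digit} shows one way to prepare raw digits from printable
text for bases up to 36, assuming an ASCII character set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Translate one printable digit to its raw value, or return -1 if it is
   not a valid digit in the given base (2 to 36). */
static int ascii_to_digit(int ch, int base)
{
    int v;
    if (ch >= '0' && ch <= '9')
        v = ch - '0';
    else if (ch >= 'A' && ch <= 'Z')
        v = ch - 'A' + 10;
    else if (ch >= 'a' && ch <= 'z')
        v = ch - 'a' + 10;
    else
        return -1;
    return v < base ? v : -1;
}

/* Single-word model of the value mpn_set_str stores: str[0] is the most
   significant raw digit.  The caller must ensure the value fits. */
static uint64_t model_set_str(const unsigned char *str, size_t strsize,
                              int base)
{
    uint64_t value = 0;
    assert(strsize >= 1 && base >= 2 && base <= 256);
    for (size_t i = 0; i < strsize; i++)
        value = value * (uint64_t)base + str[i];   /* Horner's rule */
    return value;
}
```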
5486
5487@deftypefun {mp_bitcnt_t} mpn_scan0 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
5488Scan @var{s1p} from bit position @var{bit} for the next clear bit.
5489
5490It is required that there be a clear bit within the area at @var{s1p} at or
5491beyond bit position @var{bit}, so that the function has something to return.
5492@end deftypefun
5493
5494@deftypefun {mp_bitcnt_t} mpn_scan1 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
5495Scan @var{s1p} from bit position @var{bit} for the next set bit.
5496
5497It is required that there be a set bit within the area at @var{s1p} at or
5498beyond bit position @var{bit}, so that the function has something to return.
5499@end deftypefun
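
The scan functions take no limb count, which is why the caller must guarantee
that a suitable bit exists.  A naive plain C model makes this visible: the
loops below have no upper bound.  The names @code{model_scan0} and
@code{model_scan1} are hypothetical and a 64-bit limb without nails is
assumed.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Model of mpn_scan1: index of the first set bit at or after bit.
   The caller guarantees that such a bit exists. */
static uint64_t model_scan1(const limb_t *s1p, uint64_t bit)
{
    assert(s1p != 0);
    while (((s1p[bit / 64] >> (bit % 64)) & 1) == 0)
        bit++;
    return bit;
}

/* Model of mpn_scan0: index of the first clear bit at or after bit.
   The caller guarantees that such a bit exists. */
static uint64_t model_scan0(const limb_t *s1p, uint64_t bit)
{
    assert(s1p != 0);
    while (((s1p[bit / 64] >> (bit % 64)) & 1) != 0)
        bit++;
    return bit;
}
```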
5500
5501@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
5502@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
5503Generate a random number of length @var{r1n} and store it at @var{r1p}.  The
most significant limb is always non-zero.  @code{mpn_random} generates
uniformly distributed limb data, while @code{mpn_random2} generates long
strings of zeros and ones in the binary representation.
5507
5508@code{mpn_random2} is intended for testing the correctness of the @code{mpn}
5509routines.
5510@end deftypefun
5511
5512@deftypefun {mp_bitcnt_t} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5513Count the number of set bits in @{@var{s1p}, @var{n}@}.
5514@end deftypefun
5515
5516@deftypefun {mp_bitcnt_t} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Compute the Hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p},
5518@var{n}@}, which is the number of bit positions where the two operands have
5519different bit values.
5520@end deftypefun
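
The Hamming distance is the population count of the exclusive or of the two
operands.  The plain C sketch below models @code{mpn_popcount} and
@code{mpn_hamdist} on that basis, with hypothetical names and an assumed
64-bit limb without nails; GMP's own code is likely to be considerably
faster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Model of mpn_popcount: number of set bits in {s1p, n}. */
static uint64_t model_popcount(const limb_t *s1p, size_t n)
{
    uint64_t count = 0;
    assert(n >= 1);
    for (size_t i = 0; i < n; i++) {
        limb_t x = s1p[i];
        while (x != 0) {
            x &= x - 1;      /* clear the lowest set bit */
            count++;
        }
    }
    return count;
}

/* Model of mpn_hamdist: number of bit positions that differ. */
static uint64_t model_hamdist(const limb_t *s1p, const limb_t *s2p, size_t n)
{
    uint64_t count = 0;
    assert(n >= 1);
    for (size_t i = 0; i < n; i++) {
        limb_t x = s1p[i] ^ s2p[i];
        count += model_popcount(&x, 1);
    }
    return count;
}
```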
5521
5522@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5523Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
5524The most significant limb of the input @{@var{s1p}, @var{n}@} must be
5525non-zero.
5526@end deftypefun
5527
5528@deftypefun void mpn_and_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5529Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
5530@var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5531@end deftypefun
5532
5533@deftypefun void mpn_ior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5534Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
5535@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5536@end deftypefun
5537
5538@deftypefun void mpn_xor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5539Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
5540@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5541@end deftypefun
5542
5543@deftypefun void mpn_andn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5544Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and the bitwise
5545complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5546@end deftypefun
5547
5548@deftypefun void mpn_iorn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5549Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and the bitwise
5550complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5551@end deftypefun
5552
5553@deftypefun void mpn_nand_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5554Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
5555@var{n}@}, and write the bitwise complement of the result to @{@var{rp}, @var{n}@}.
5556@end deftypefun
5557
5558@deftypefun void mpn_nior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5559Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
5560@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
5561@{@var{rp}, @var{n}@}.
5562@end deftypefun
5563
5564@deftypefun void mpn_xnor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5565Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
5566@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
5567@{@var{rp}, @var{n}@}.
5568@end deftypefun
5569
5570@deftypefun void mpn_com (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
5571Perform the bitwise complement of @{@var{sp}, @var{n}@}, and write the result
5572to @{@var{rp}, @var{n}@}.
5573@end deftypefun
5574
5575@deftypefun void mpn_copyi (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, in order of
increasing limb index (least significant limb first).
5577@end deftypefun
5578
5579@deftypefun void mpn_copyd (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, in order of
decreasing limb index (most significant limb first).
5581@end deftypefun
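
The copy direction only matters when source and destination overlap.  The
plain C models below, with hypothetical names @code{model_copyi} and
@code{model_copyd}, show the consequence: an incrementing copy is safe when
the destination starts at or below the source, and a decrementing copy is
safe when it starts at or above the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Model of mpn_copyi: copy starting from the least significant limb. */
static void model_copyi(limb_t *rp, const limb_t *s1p, size_t n)
{
    assert(rp != NULL && s1p != NULL);
    for (size_t i = 0; i < n; i++)
        rp[i] = s1p[i];
}

/* Model of mpn_copyd: copy starting from the most significant limb. */
static void model_copyd(limb_t *rp, const limb_t *s1p, size_t n)
{
    assert(rp != NULL && s1p != NULL);
    for (size_t i = n; i-- > 0; )
        rp[i] = s1p[i];
}
```

Shifting a vector up by one limb within the same buffer is therefore a job
for the decrementing copy, and shifting it down is a job for the incrementing
copy.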
5582
5583@deftypefun void mpn_zero (mp_limb_t *@var{rp}, mp_size_t @var{n})
5584Zero @{@var{rp}, @var{n}@}.
5585@end deftypefun
5586
5587@sp 1
5588@section Nails
5589@cindex Nails
5590
5591@strong{Everything in this section is highly experimental and may disappear or
5592be subject to incompatible changes in a future version of GMP.}
5593
5594Nails are an experimental feature whereby a few bits are left unused at the
5595top of each @code{mp_limb_t}.  This can significantly improve carry handling
5596on some processors.
5597
5598All the @code{mpn} functions accepting limb data will expect the nail bits to
5599be zero on entry, and will return data with the nails similarly all zero.
5600This applies both to limb vectors and to single limb arguments.
5601
5602Nails can be enabled by configuring with @samp{--enable-nails}.  By default
5603the number of bits will be chosen according to what suits the host processor,
5604but a particular number can be selected with @samp{--enable-nails=N}.
5605
5606At the mpn level, a nail build is neither source nor binary compatible with a
5607non-nail build, strictly speaking.  But programs acting on limbs only through
5608the mpn functions are likely to work equally well with either build, and
5609judicious use of the definitions below should make any program compatible with
5610either build, at the source level.
5611
For the higher level routines, meaning @code{mpz} etc., a nail build should be
5613fully source and binary compatible with a non-nail build.
5614
5615@defmac GMP_NAIL_BITS
5616@defmacx GMP_NUMB_BITS
5617@defmacx GMP_LIMB_BITS
5618@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in
5619use.  @code{GMP_NUMB_BITS} is the number of data bits in a limb.
5620@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}.  In
5621all cases
5622
5623@example
5624GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS
5625@end example
5626@end defmac
5627
5628@defmac GMP_NAIL_MASK
5629@defmacx GMP_NUMB_MASK
5630Bit masks for the nail and number parts of a limb.  @code{GMP_NAIL_MASK} is 0
5631when nails are not in use.
5632
5633@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained
5634with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which
5635can help various RISC chips.
5636@end defmac
5637
5638@defmac GMP_NUMB_MAX
5639The maximum value that can be stored in the number part of a limb.  This is
5640the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing
5641comparisons rather than bit-wise operations.
5642@end defmac
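
The relations between these macros, and the way nails can simplify carry
handling, can be sketched in plain C.  Everything prefixed @code{MODEL_} or
@code{model_} below is hypothetical: the definitions mimic the documented
relations for an assumed 64-bit limb with 4 nail bits, and are not copied
from @file{gmp.h}.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t limb_t;             /* hypothetical stand-in for mp_limb_t */

#define MODEL_LIMB_BITS 64
#define MODEL_NAIL_BITS 4            /* hypothetical choice; 0 means no nails */
#define MODEL_NUMB_BITS (MODEL_LIMB_BITS - MODEL_NAIL_BITS)
#define MODEL_NUMB_MASK (~(limb_t)0 >> MODEL_NAIL_BITS)
#define MODEL_NAIL_MASK (~MODEL_NUMB_MASK)
#define MODEL_NUMB_MAX  MODEL_NUMB_MASK

/* Add two limbs whose nails are zero.  The sum cannot overflow the limb,
   so the carry is simply whatever lands in the nail part. */
static limb_t model_add_with_nails(limb_t *rp, limb_t a, limb_t b)
{
    assert((a & MODEL_NAIL_MASK) == 0 && (b & MODEL_NAIL_MASK) == 0);
    limb_t sum = a + b;
    *rp = sum & MODEL_NUMB_MASK;         /* result keeps a zero nail */
    return sum >> MODEL_NUMB_BITS;       /* nail part, no mask constant */
}
```

The carry is obtained with a shift by the number of data bits, which is the
same way the nail part is extracted without @code{GMP_NAIL_MASK}.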
5643
5644The term ``nails'' comes from finger or toe nails, which are at the ends of a
5645limb (arm or leg).  ``numb'' is short for number, but is also how the
5646developers felt after trying for a long time to come up with sensible names
5647for these things.
5648
5649In the future (the distant future most likely) a non-zero nail might be
5650permitted, giving non-unique representations for numbers in a limb vector.
5651This would help vector processors since carries would only ever need to
5652propagate one or two limbs.
5653
5654
5655@node Random Number Functions, Formatted Output, Low-level Functions, Top
5656@chapter Random Number Functions
5657@cindex Random number functions
5658
5659Sequences of pseudo-random numbers in GMP are generated using a variable of
5660type @code{gmp_randstate_t}, which holds an algorithm selection and a current
5661state.  Such a variable must be initialized by a call to one of the
5662@code{gmp_randinit} functions, and can be seeded with one of the
5663@code{gmp_randseed} functions.
5664
5665The functions actually generating random numbers are described in @ref{Integer
5666Random Numbers}, and @ref{Miscellaneous Float Functions}.
5667
5668The older style random number functions don't accept a @code{gmp_randstate_t}
5669parameter but instead share a global variable of that type.  They use a
5670default algorithm and are currently not seeded (though perhaps that will
5671change in the future).  The new functions accepting a @code{gmp_randstate_t}
5672are recommended for applications that care about randomness.
5673
5674@menu
5675* Random State Initialization::
5676* Random State Seeding::
5677* Random State Miscellaneous::
5678@end menu
5679
5680@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
5681@section Random State Initialization
5682@cindex Random number state
5683@cindex Initialization functions
5684
5685@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
5686Initialize @var{state} with a default algorithm.  This will be a compromise
5687between speed and randomness, and is recommended for applications with no
5688special requirements.  Currently this is @code{gmp_randinit_mt}.
5689@end deftypefun
5690
5691@deftypefun void gmp_randinit_mt (gmp_randstate_t @var{state})
5692@cindex Mersenne twister random numbers
5693Initialize @var{state} for a Mersenne Twister algorithm.  This algorithm is
5694fast and has good randomness properties.
5695@end deftypefun
5696
5697@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, const mpz_t @var{a}, @w{unsigned long @var{c}}, @w{mp_bitcnt_t @var{m2exp}})
5698@cindex Linear congruential random numbers
5699Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
5700@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.
5701
5702The low bits of @math{X} in this algorithm are not very random.  The least
5703significant bit will have a period no more than 2, and the second bit no more
5704than 4, etc.  For this reason only the high half of each @math{X} is actually
5705used.
5706
5707When a random number of more than @math{@var{m2exp}/2} bits is to be
5708generated, multiple iterations of the recurrence are used and the results
5709concatenated.
5710@end deftypefun
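
The recurrence and the use of the high half can be sketched in plain C for
@math{@var{m2exp} = 32}.  The multiplier and increment below are arbitrary
illustrative constants, not values from GMP's table, the names are
hypothetical, and the order in which two outputs are concatenated is likewise
an arbitrary choice made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_A 1664525u       /* hypothetical multiplier, not GMP's */
#define MODEL_C 1013904223u    /* hypothetical increment, not GMP's */

/* One step of X = (a*X + c) mod 2^32.  Only the high half of the new X,
   16 bits, is returned. */
static uint32_t model_lc_next(uint32_t *x)
{
    assert(x != 0);
    *x = (uint32_t)((uint64_t)*x * MODEL_A + MODEL_C);
    return *x >> 16;
}

/* A 32-bit result needs more than m2exp/2 bits, so two iterations are
   run and their outputs concatenated. */
static uint32_t model_lc_bits32(uint32_t *x)
{
    uint32_t lo = model_lc_next(x);
    uint32_t hi = model_lc_next(x);
    return (hi << 16) | lo;
}
```

With both constants odd, the least significant bit of the state simply
alternates between 0 and 1, which is the period-2 behaviour described above.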
5711
5712@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, mp_bitcnt_t @var{size})
5713@cindex Linear congruential random numbers
5714Initialize @var{state} for a linear congruential algorithm as per
5715@code{gmp_randinit_lc_2exp}.  @var{a}, @var{c} and @var{m2exp} are selected
5716from a table, chosen so that @var{size} bits (or more) of each @math{X} will
5717be used, i.e.@: @math{@var{m2exp}/2 @ge{} @var{size}}.
5718
5719If successful the return value is non-zero.  If @var{size} is bigger than the
5720table data provides then the return value is zero.  The maximum @var{size}
5721currently supported is 128.
5722@end deftypefun
5723
5724@deftypefun void gmp_randinit_set (gmp_randstate_t @var{rop}, gmp_randstate_t @var{op})
5725Initialize @var{rop} with a copy of the algorithm and state from @var{op}.
5726@end deftypefun
5727
5728@c  Although gmp_randinit, gmp_errno and related constants are obsolete, we
5729@c  still put @findex entries for them, since they're still documented and
5730@c  someone might be looking them up when perusing old application code.
5731
5732@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, @dots{})
5733@strong{This function is obsolete.}
5734
5735@findex GMP_RAND_ALG_LC
5736@findex GMP_RAND_ALG_DEFAULT
5737Initialize @var{state} with an algorithm selected by @var{alg}.  The only
5738choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}
described above.  A third parameter of type @code{unsigned long} is required;
this is the @var{size} for that function.  @code{GMP_RAND_ALG_DEFAULT} or 0
5741are the same as @code{GMP_RAND_ALG_LC}.
5742
5743@c  For reference, this is the only place gmp_errno has been documented, and
@c  due to being non thread safe we won't be adding to its uses.
5745@findex gmp_errno
5746@findex GMP_ERROR_UNSUPPORTED_ARGUMENT
5747@findex GMP_ERROR_INVALID_ARGUMENT
5748@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to
5749indicate an error.  @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is
5750unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter
5751is too big.  It may be noted this error reporting is not thread safe (a good
5752reason to use @code{gmp_randinit_lc_2exp_size} instead).
5753@end deftypefun
5754
5755@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
5756Free all memory occupied by @var{state}.
5757@end deftypefun
5758
5759
5760@node Random State Seeding, Random State Miscellaneous, Random State Initialization, Random Number Functions
5761@section Random State Seeding
5762@cindex Random number seeding
5763@cindex Seeding random numbers
5764
5765@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, const mpz_t @var{seed})
5766@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
5767Set an initial seed value into @var{state}.
5768
The size of a seed determines how many different sequences of random numbers
it's possible to generate.  The ``quality'' of the seed is the randomness
of a given seed compared to the previous seed used, and this affects the
randomness of separate number sequences.  The method for choosing a seed is
critical if the generated numbers are to be used for important applications,
such as generating cryptographic keys.
5775
5776Traditionally the system time has been used to seed, but care needs to be
5777taken with this.  If an application seeds often and the resolution of the
5778system clock is low, then the same sequence of numbers might be repeated.
5779Also, the system time is quite easy to guess, so if unpredictability is
5780required then it should definitely not be the only source for the seed value.
5781On some systems there's a special device @file{/dev/random} which provides
5782random data better suited for use as a seed.
5783@end deftypefun
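
As a rough sketch of seeding from such a device, the bytes read can be turned
into an @code{mpz_t} with @code{mpz_import} and passed to
@code{gmp_randseed}.  The helper name @code{seed_from_file}, its 64 byte
limit and its return convention are assumptions made for this example only.

@example
#include <stdio.h>
#include <gmp.h>

/* Hypothetical helper: seed STATE with NBYTES bytes (at most 64) read
   from PATH.  Returns 1 on success, or 0 if the data could not be
   read, in which case STATE is left unseeded.  */
int
seed_from_file (gmp_randstate_t state, const char *path, size_t nbytes)
@{
  unsigned char  buf[64];
  FILE           *fp;
  size_t         got;
  mpz_t          seed;

  if (nbytes == 0 || nbytes > sizeof (buf))
    return 0;

  fp = fopen (path, "rb");
  if (fp == NULL)
    return 0;
  got = fread (buf, 1, nbytes, fp);
  fclose (fp);
  if (got != nbytes)
    return 0;

  /* Interpret the bytes as one integer, most significant byte first.  */
  mpz_init (seed);
  mpz_import (seed, nbytes, 1, 1, 0, 0, buf);
  gmp_randseed (state, seed);
  mpz_clear (seed);
  return 1;
@}
@end example

A call might look like @code{seed_from_file (state, "/dev/random", 16)}, made
after @var{state} has been initialized.  Whether the device exists, and
whether reading it can block, depends on the system.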
5784
5785
5786@node Random State Miscellaneous,  , Random State Seeding, Random Number Functions
5787@section Random State Miscellaneous
5788
5789@deftypefun {unsigned long} gmp_urandomb_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
5790Return a uniformly distributed random number of @var{n} bits, i.e.@: in the
5791range 0 to @m{2^n-1,2^@var{n}-1} inclusive.  @var{n} must be less than or
5792equal to the number of bits in an @code{unsigned long}.
5793@end deftypefun
5794
5795@deftypefun {unsigned long} gmp_urandomm_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
5796Return a uniformly distributed random number in the range 0 to
5797@math{@var{n}-1}, inclusive.
5798@end deftypefun
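
For instance, a die roll can be built on @code{gmp_urandomm_ui} by shifting
its zero-based result up by one.  This is a minimal sketch; @code{roll_die}
is a name invented here.

@example
#include <gmp.h>

/* Hypothetical helper: roll a die with SIDES faces, returning a value
   in the range 1 to SIDES inclusive.  SIDES must be non-zero, and
   STATE must have been initialized.  */
unsigned long
roll_die (gmp_randstate_t state, unsigned long sides)
@{
  return gmp_urandomm_ui (state, sides) + 1;
@}
@end example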
5799
5800
5801@node Formatted Output, Formatted Input, Random Number Functions, Top
5802@chapter Formatted Output
5803@cindex Formatted output
5804@cindex @code{printf} formatted output
5805
5806@menu
5807* Formatted Output Strings::
5808* Formatted Output Functions::
5809* C++ Formatted Output::
5810@end menu
5811
5812@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
5813@section Format Strings
5814
5815@code{gmp_printf} and friends accept format strings similar to the standard C
5816@code{printf} (@pxref{Formatted Output,, Formatted Output, libc, The GNU C
5817Library Reference Manual}).  A format specification is of the form
5818
@example
% [flags] [width] [.[precision]] [type] conv
@end example
5822
5823GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
5824and @code{mpf_t} respectively, @samp{M} for @code{mp_limb_t}, and @samp{N} for
5825an @code{mp_limb_t} array.  @samp{Z}, @samp{Q}, @samp{M} and @samp{N} behave
5826like integers.  @samp{Q} will print a @samp{/} and a denominator, if needed.
5827@samp{F} behaves like a float.  For example,
5828
@example
/* Each variable below is assumed to have been initialized and set.  */
mpz_t z;
gmp_printf ("%s is an mpz %Zd\n", "here", z);

mpq_t q;
gmp_printf ("a hex rational: %#40Qx\n", q);

mpf_t f;
int   n;
gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);

mp_limb_t l;
gmp_printf ("limb %Mu\n", l);

const mp_limb_t *ptr;
mp_size_t       size;
gmp_printf ("limb array %Nx\n", ptr, size);
@end example
5847
5848For @samp{N} the limbs are expected least significant first, as per the
5849@code{mpn} functions (@pxref{Low-level Functions}).  A negative size can be
5850given to print the value as a negative.
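
The sign convention for @samp{N} can be sketched as follows.  The helper
@code{format_limbs} and its @var{negative} flag are assumptions for
illustration; only the @samp{%Nx} conversion itself comes from GMP.

@example
#include <gmp.h>

/* Hypothetical helper: format the N-limb value at PTR in hex into BUF,
   as a negative if NEGATIVE is non-zero.  The return value is that of
   gmp_snprintf.  */
int
format_limbs (char *buf, size_t bufsize,
              const mp_limb_t *ptr, mp_size_t n, int negative)
@{
  mp_size_t  size = negative ? -n : n;

  /* The size is passed as an mp_size_t, matching what %N expects.  */
  return gmp_snprintf (buf, bufsize, "%Nx", ptr, size);
@}
@end example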
5851
5852All the standard C @code{printf} types behave the same as the C library
5853@code{printf}, and can be freely intermixed with the GMP extensions.  In the
5854current implementation the standard parts of the format string are simply
5855handed to @code{printf} and only the GMP extensions handled directly.
5856
5857The flags accepted are as follows.  GLIBC style @nisamp{'} is only for the
5858standard C types (not the GMP types), and only if the C library supports it.
5859
5860@quotation
5861@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5862@item @nicode{0} @tab pad with zeros (rather than spaces)
5863@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
5864@item @nicode{+} @tab always show a sign
5865@item (space)    @tab show a space or a @samp{-} sign
5866@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
5867@end multitable
5868@end quotation
5869
5870The optional width and precision can be given as a number within the format
5871string, or as a @samp{*} to take an extra parameter of type @code{int}, the
5872same as the standard @code{printf}.
5873
5874The standard types accepted are as follows.  @samp{h} and @samp{l} are
5875portable, the rest will depend on the compiler (or include files) for the type
5876and the C library for the output.
5877
5878@quotation
5879@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5880@item @nicode{h}  @tab @nicode{short}
5881@item @nicode{hh} @tab @nicode{char}
5882@item @nicode{j}  @tab @nicode{intmax_t} or @nicode{uintmax_t}
5883@item @nicode{l}  @tab @nicode{long} or @nicode{wchar_t}
5884@item @nicode{ll} @tab @nicode{long long}
5885@item @nicode{L}  @tab @nicode{long double}
5886@item @nicode{q}  @tab @nicode{quad_t} or @nicode{u_quad_t}
5887@item @nicode{t}  @tab @nicode{ptrdiff_t}
5888@item @nicode{z}  @tab @nicode{size_t}
5889@end multitable
5890@end quotation
5891
5892@noindent
5893The GMP types are
5894
5895@quotation
5896@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5897@item @nicode{F}  @tab @nicode{mpf_t}, float conversions
5898@item @nicode{Q}  @tab @nicode{mpq_t}, integer conversions
5899@item @nicode{M}  @tab @nicode{mp_limb_t}, integer conversions
5900@item @nicode{N}  @tab @nicode{mp_limb_t} array, integer conversions
5901@item @nicode{Z}  @tab @nicode{mpz_t}, integer conversions
5902@end multitable
5903@end quotation
5904
5905The conversions accepted are as follows.  @samp{a} and @samp{A} are always
5906supported for @code{mpf_t} but depend on the C library for standard C float
5907types.  @samp{m} and @samp{p} depend on the C library.
5908
5909@quotation
5910@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5911@item @nicode{a} @nicode{A} @tab hex floats, C99 style
5912@item @nicode{c}            @tab character
5913@item @nicode{d}            @tab decimal integer
5914@item @nicode{e} @nicode{E} @tab scientific format float
5915@item @nicode{f}            @tab fixed point float
5916@item @nicode{i}            @tab same as @nicode{d}
5917@item @nicode{g} @nicode{G} @tab fixed or scientific float
5918@item @nicode{m}            @tab @code{strerror} string, GLIBC style
5919@item @nicode{n}            @tab store characters written so far
5920@item @nicode{o}            @tab octal integer
5921@item @nicode{p}            @tab pointer
5922@item @nicode{s}            @tab string
5923@item @nicode{u}            @tab unsigned integer
5924@item @nicode{x} @nicode{X} @tab hex integer
5925@end multitable
5926@end quotation
5927
5928@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
5929types @samp{Z}, @samp{Q} and @samp{N} they are signed.  @samp{u} is not
5930meaningful for @samp{Z}, @samp{Q} and @samp{N}.
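
A small sketch makes the signed behaviour easy to try out.  The helper
@code{format_si} is hypothetical, and is written with @code{gmp_snprintf} so
that the result can be inspected.

@example
#include <gmp.h>

/* Hypothetical helper: format the long VALUE through an mpz_t using
   FMT, which must contain exactly one Z conversion.  The return value
   is that of gmp_snprintf.  */
int
format_si (char *buf, size_t bufsize, const char *fmt, long value)
@{
  mpz_t  z;
  int    ret;

  mpz_init_set_si (z, value);
  ret = gmp_snprintf (buf, bufsize, fmt, z);
  mpz_clear (z);
  return ret;
@}
@end example

With this, @code{format_si (buf, sizeof (buf), "%Zx", -255L)} gives
@samp{-ff} rather than a twos complement bit pattern, since the value is
printed as a sign and magnitude.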
5931
5932@samp{M} is a proxy for the C library @samp{l} or @samp{L}, according to the
5933size of @code{mp_limb_t}.  Unsigned conversions will be usual, but a signed
5934conversion can be used and will interpret the value as a twos complement
5935negative.
5936
5937@samp{n} can be used with any type, even the GMP types.
5938
Other types or conversions that might be accepted by the C library
@code{printf} cannot be used through @code{gmp_printf}.  This includes, for
instance, extensions registered with GLIBC @code{register_printf_function}.
Also, there's currently no support for POSIX @samp{$} style numbered arguments
(perhaps this will be added in the future).
5944
5945The precision field has its usual meaning for integer @samp{Z} and float
5946@samp{F} types, but is currently undefined for @samp{Q} and should not be used
5947with that.
5948
5949@code{mpf_t} conversions only ever generate as many digits as can be
5950accurately represented by the operand, the same as @code{mpf_get_str} does.
5951Zeros will be used if necessary to pad to the requested precision.  This
5952happens even for an @samp{f} conversion of an @code{mpf_t} which is an
5953integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits
5954precision will only produce about 40 digits, then pad with zeros to the
5955decimal point.  An empty precision field like @samp{%.Fe} or @samp{%.Ff} can
5956be used to specifically request just the significant digits.  Without any dot
5957and thus no precision field, a precision value of 6 will be used.  Note that
5958these rules mean that @samp{%Ff}, @samp{%.Ff}, and @samp{%.0Ff} will all be
5959different.
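
The three precision forms can be compared with a sketch along the following
lines.  The helper @code{format_d} is hypothetical; it goes through
@code{mpf_init_set_d}, so the value is held at the default @code{mpf_t}
precision.

@example
#include <gmp.h>

/* Hypothetical helper: format the double VALUE through an mpf_t using
   FMT, which must contain exactly one F conversion.  The return value
   is that of gmp_snprintf.  */
int
format_d (char *buf, size_t bufsize, const char *fmt, double value)
@{
  mpf_t  f;
  int    ret;

  mpf_init_set_d (f, value);
  ret = gmp_snprintf (buf, bufsize, fmt, f);
  mpf_clear (f);
  return ret;
@}
@end example

Calling this on 1.25 with @samp{%Ff}, @samp{%.Ff} and @samp{%.0Ff} in turn
shows the default of 6 digits after the point, just the significant digits,
and no digits after the point, respectively.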
5960
5961The decimal point character (or string) is taken from the current locale
5962settings on systems which provide @code{localeconv} (@pxref{Locales,, Locales
5963and Internationalization, libc, The GNU C Library Reference Manual}).  The C
5964library will normally do the same for standard float output.
5965
5966The format string is only interpreted as plain @code{char}s, multibyte
5967characters are not recognised.  Perhaps this will change in the future.
5968
5969
5970@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
5971@section Functions
5972@cindex Output functions
5973
5974Each of the following functions is similar to the corresponding C library
5975function.  The basic @code{printf} forms take a variable argument list.  The
5976@code{vprintf} forms take an argument pointer, see @ref{Variadic Functions,,
5977Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
5978va_start}.
5979
5980It should be emphasised that if a format string is invalid, or the arguments
5981don't match what the format specifies, then the behaviour of any of these
5982functions will be unpredictable.  GCC format string checking is not available,
5983since it doesn't recognise the GMP extensions.
5984
5985The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
5986@math{-1} to indicate a write error.  Output is not ``atomic'', so partial
5987output may be produced if a write error occurs.  All the functions can return
5988@math{-1} if the C library @code{printf} variant in use returns @math{-1}, but
5989this shouldn't normally occur.
5990
5991@deftypefun int gmp_printf (const char *@var{fmt}, @dots{})
5992@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
5993Print to the standard output @code{stdout}.  Return the number of characters
5994written, or @math{-1} if an error occurred.
5995@end deftypefun
5996
5997@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
5998@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
5999Print to the stream @var{fp}.  Return the number of characters written, or
6000@math{-1} if an error occurred.
6001@end deftypefun
6002
6003@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, @dots{})
6004@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
6005Form a null-terminated string in @var{buf}.  Return the number of characters
6006written, excluding the terminating null.
6007
6008No overlap is permitted between the space at @var{buf} and the string
6009@var{fmt}.
6010
6011These functions are not recommended, since there's no protection against
6012exceeding the space available at @var{buf}.
6013@end deftypefun
6014
6015@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, @dots{})
6016@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
6017Form a null-terminated string in @var{buf}.  No more than @var{size} bytes
6018will be written.  To get the full output, @var{size} must be enough for the
6019string and null-terminator.
6020
6021The return value is the total number of characters which ought to have been
6022produced, excluding the terminating null.  If @math{@var{retval} @ge{}
6023@var{size}} then the actual output has been truncated to the first
6024@math{@var{size}-1} characters, and a null appended.
6025
6026No overlap is permitted between the region @{@var{buf},@var{size}@} and the
6027@var{fmt} string.
6028
6029Notice the return value is in ISO C99 @code{snprintf} style.  This is so even
6030if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
6031@end deftypefun
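
The C99 style return value allows a two-step approach: try a small buffer
first, then allocate exactly what is needed.  The following is a sketch under
that assumption; @code{mpz_to_string} is an invented name, and plain
@code{malloc} is used rather than the GMP allocation functions.

@example
#include <stdlib.h>
#include <gmp.h>

/* Hypothetical helper: return a malloc'ed decimal string for Z, or
   NULL on failure.  The caller releases it with free.  */
char *
mpz_to_string (const mpz_t z)
@{
  char  small[16];
  char  *str;
  int   len;

  /* First attempt; the return value is the full length required,
     even if the output was truncated.  */
  len = gmp_snprintf (small, sizeof (small), "%Zd", z);
  if (len < 0)
    return NULL;

  str = malloc ((size_t) len + 1);
  if (str == NULL)
    return NULL;

  /* Second attempt, with room for the string and null-terminator.  */
  gmp_snprintf (str, (size_t) len + 1, "%Zd", z);
  return str;
@}
@end example

When the allocation is all that's wanted, @code{gmp_asprintf} below does the
same job in one call.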
6032
6033@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, @dots{})
6034@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in a block of memory obtained from the current
memory allocation function (@pxref{Custom Allocation}).  The block will be the
size of the string and null-terminator.  The address of the block is stored to
*@var{pp}.  The return value is the number of characters produced, excluding
the null-terminator.
6040
6041Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
6042@math{-1} if there's no more memory available, it lets the current allocation
6043function handle that.
6044@end deftypefun
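
Since the block comes from the GMP allocation function, it should be released
with the matching free function, not necessarily with @code{free}.  A
sketch of this, assuming the allocation functions have not been changed in the
meantime, is shown below; @code{free_gmp_string} is a made-up name.

@example
#include <string.h>
#include <gmp.h>

/* Hypothetical helper: release a string obtained from gmp_asprintf,
   using the current GMP free function.  The block size is the string
   length plus the null-terminator.  */
void
free_gmp_string (char *str)
@{
  void (*freefunc) (void *, size_t);

  mp_get_memory_functions (NULL, NULL, &freefunc);
  (*freefunc) (str, strlen (str) + 1);
@}
@end example

A typical use would be @code{gmp_asprintf (&s, "%Zd", z)}, then working with
@code{s}, and finally @code{free_gmp_string (s)}.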
6045
6046@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, @dots{})
6047@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
6048@cindex @code{obstack} output
6049Append to the current object in @var{ob}.  The return value is the number of
6050characters written.  A null-terminator is not written.
6051
6052@var{fmt} cannot be within the current object in @var{ob}, since that object
6053might move as it grows.
6054
6055These functions are available only when the C library provides the obstack
6056feature, which probably means only on GNU systems, see @ref{Obstacks,,
6057Obstacks, libc, The GNU C Library Reference Manual}.
6058@end deftypefun
6059
6060
6061@node C++ Formatted Output,  , Formatted Output Functions, Formatted Output
6062@section C++ Formatted Output
6063@cindex C++ @code{ostream} output
6064@cindex @code{ostream} output
6065
6066The following functions are provided in @file{libgmpxx} (@pxref{Headers and
6067Libraries}), which is built if C++ support is enabled (@pxref{Build Options}).
6068Prototypes are available from @code{<gmp.h>}.
6069
6070@deftypefun ostream& operator<< (ostream& @var{stream}, mpz_t @var{op})
6071Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
6072@code{ios::width} is reset to 0 after output, the same as the standard
6073@code{ostream operator<<} routines do.
6074
6075In hex or octal, @var{op} is printed as a signed number, the same as for
6076decimal.  This is unlike the standard @code{operator<<} routines on @code{int}
6077etc, which instead give twos complement.
6078@end deftypefun
6079
6080@deftypefun ostream& operator<< (ostream& @var{stream}, mpq_t @var{op})
6081Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
6082@code{ios::width} is reset to 0 after output, the same as the standard
6083@code{ostream operator<<} routines do.
6084
6085Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
6086just a plain integer like @samp{123}.
6087
6088In hex or octal, @var{op} is printed as a signed value, the same as for
6089decimal.  If @code{ios::showbase} is set then a base indicator is shown on
6090both the numerator and denominator (if the denominator is required).
6091@end deftypefun
6092
6093@deftypefun ostream& operator<< (ostream& @var{stream}, mpf_t @var{op})
6094Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
6095@code{ios::width} is reset to 0 after output, the same as the standard
6096@code{ostream operator<<} routines do.
6097
6098The decimal point follows the standard library float @code{operator<<}, which
6099on recent systems means the @code{std::locale} imbued on @var{stream}.
6100
6101Hex and octal are supported, unlike the standard @code{operator<<} on
6102@code{double}.  The mantissa will be in hex or octal, the exponent will be in
6103decimal.  For hex the exponent delimiter is an @samp{@@}.  This is as per
6104@code{mpf_out_str}.
6105
6106@code{ios::showbase} is supported, and will put a base on the mantissa, for
6107example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
6108This last form is slightly strange, but at least differentiates itself from
6109decimal.
6110@end deftypefun
6111
6112These operators mean that GMP types can be printed in the usual C++ way, for
6113example,
6114
@example
mpz_t  z;
int    n;
...
cout << "iteration " << n << " value " << z << "\n";
@end example
6121
6122But note that @code{ostream} output (and @code{istream} input, @pxref{C++
6123Formatted Input}) is the only overloading available for the GMP types and that
6124for instance using @code{+} with an @code{mpz_t} will have unpredictable
6125results.  For classes with overloading, see @ref{C++ Class Interface}.
6126
6127
6128@node Formatted Input, C++ Class Interface, Formatted Output, Top
6129@chapter Formatted Input
6130@cindex Formatted input
6131@cindex @code{scanf} formatted input
6132
6133@menu
6134* Formatted Input Strings::
6135* Formatted Input Functions::
6136* C++ Formatted Input::
6137@end menu
6138
6139
6140@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
6141@section Formatted Input Strings
6142
6143@code{gmp_scanf} and friends accept format strings similar to the standard C
6144@code{scanf} (@pxref{Formatted Input,, Formatted Input, libc, The GNU C
6145Library Reference Manual}).  A format specification is of the form
6146
@example
% [flags] [width] [type] conv
@end example
6150
6151GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
6152and @code{mpf_t} respectively.  @samp{Z} and @samp{Q} behave like integers.
6153@samp{Q} will read a @samp{/} and a denominator, if present.  @samp{F} behaves
6154like a float.
6155
6156GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
6157they're already ``call-by-reference''.  For example,
6158
@example
/* to read say "a(5) = 1234" */
int   n;
mpz_t z;
gmp_scanf ("a(%d) = %Zd\n", &n, z);

mpq_t q1, q2;
gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);

/* to read say "topleft (1.55,-2.66)" */
mpf_t x, y;
char  buf[32];
gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
@end example
6173
6174All the standard C @code{scanf} types behave the same as in the C library
6175@code{scanf}, and can be freely intermixed with the GMP extensions.  In the
6176current implementation the standard parts of the format string are simply
6177handed to @code{scanf} and only the GMP extensions handled directly.
6178
6179The flags accepted are as follows.  @samp{a} and @samp{'} will depend on
6180support from the C library, and @samp{'} cannot be used with GMP types.
6181
6182@quotation
6183@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6184@item @nicode{*} @tab read but don't store
6185@item @nicode{a} @tab allocate a buffer (string conversions)
6186@item @nicode{'} @tab grouped digits, GLIBC style (not GMP types)
6187@end multitable
6188@end quotation
6189
6190The standard types accepted are as follows.  @samp{h} and @samp{l} are
6191portable, the rest will depend on the compiler (or include files) for the type
6192and the C library for the input.
6193
6194@quotation
6195@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6196@item @nicode{h}  @tab @nicode{short}
6197@item @nicode{hh} @tab @nicode{char}
6198@item @nicode{j}  @tab @nicode{intmax_t} or @nicode{uintmax_t}
6199@item @nicode{l}  @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
6200@item @nicode{ll} @tab @nicode{long long}
6201@item @nicode{L}  @tab @nicode{long double}
6202@item @nicode{q}  @tab @nicode{quad_t} or @nicode{u_quad_t}
6203@item @nicode{t}  @tab @nicode{ptrdiff_t}
6204@item @nicode{z}  @tab @nicode{size_t}
6205@end multitable
6206@end quotation
6207
6208@noindent
6209The GMP types are
6210
6211@quotation
6212@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6213@item @nicode{F}  @tab @nicode{mpf_t}, float conversions
6214@item @nicode{Q}  @tab @nicode{mpq_t}, integer conversions
6215@item @nicode{Z}  @tab @nicode{mpz_t}, integer conversions
6216@end multitable
6217@end quotation
6218
6219The conversions accepted are as follows.  @samp{p} and @samp{[} will depend on
6220support from the C library, the rest are standard.
6221
6222@quotation
6223@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6224@item @nicode{c}            @tab character or characters
6225@item @nicode{d}            @tab decimal integer
6226@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G}
6227                            @tab float
6228@item @nicode{i}            @tab integer with base indicator
6229@item @nicode{n}            @tab characters read so far
6230@item @nicode{o}            @tab octal integer
6231@item @nicode{p}            @tab pointer
6232@item @nicode{s}            @tab string of non-whitespace characters
6233@item @nicode{u}            @tab decimal integer
6234@item @nicode{x} @nicode{X} @tab hex integer
6235@item @nicode{[}            @tab string of characters in a set
6236@end multitable
6237@end quotation
6238
6239@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
6240read either fixed point or scientific format, and either upper or lower case
6241@samp{e} for the exponent in scientific format.
6242
6243C99 style hex float format (@code{printf %a}, @pxref{Formatted Output
6244Strings}) is always accepted for @code{mpf_t}, but for the standard float
6245types it will depend on the C library.
6246
6247@samp{x} and @samp{X} are identical, both accept both upper and lower case
6248hexadecimal.
6249
6250@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
6251values.  For the standard C types these are described as ``unsigned''
6252conversions, but that merely affects certain overflow handling, negatives are
6253still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of
6254Integers, libc, The GNU C Library Reference Manual}).  For GMP types there are
6255no overflows, so @samp{d} and @samp{u} are identical.
6256
6257@samp{Q} type reads the numerator and (optional) denominator as given.  If the
6258value might not be in canonical form then @code{mpq_canonicalize} must be
6259called before using it in any calculations (@pxref{Rational Number
6260Functions}).
6261
6262@samp{Qi} will read a base specification separately for the numerator and
6263denominator.  For example @samp{0x10/11} would be 16/11, whereas
6264@samp{0x10/0x11} would be 16/17.
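
Putting the last two points together, reading a rational and making it safe
for arithmetic might look like the sketch below.  The helper
@code{parse_rational} and its zero denominator check are assumptions added
for the example.

@example
#include <gmp.h>

/* Hypothetical helper: read a rational from STR into Q, accepting a
   base indicator on each part, and put it into canonical form.
   Q must already be initialized.  Returns 1 on success, 0 otherwise.  */
int
parse_rational (mpq_t q, const char *str)
@{
  if (gmp_sscanf (str, "%Qi", q) != 1)
    return 0;
  if (mpz_sgn (mpq_denref (q)) == 0)
    return 0;                   /* zero denominator, not usable */
  mpq_canonicalize (q);
  return 1;
@}
@end example

An input such as @samp{6/4} is stored as read and only becomes 3/2 once
@code{mpq_canonicalize} has been called.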
6265
6266@samp{n} can be used with any of the types above, even the GMP types.
6267@samp{*} to suppress assignment is allowed, though in that case it would do
6268nothing at all.
6269
6270Other conversions or types that might be accepted by the C library
6271@code{scanf} cannot be used through @code{gmp_scanf}.
6272
6273Whitespace is read and discarded before a field, except for @samp{c} and
6274@samp{[} conversions.
6275
6276For float conversions, the decimal point character (or string) expected is
6277taken from the current locale settings on systems which provide
6278@code{localeconv} (@pxref{Locales,, Locales and Internationalization, libc,
6279The GNU C Library Reference Manual}).  The C library will normally do the same
6280for standard float input.
6281
6282The format string is only interpreted as plain @code{char}s, multibyte
6283characters are not recognised.  Perhaps this will change in the future.
6284
6285
6286@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
6287@section Formatted Input Functions
6288@cindex Input functions
6289
6290Each of the following functions is similar to the corresponding C library
6291function.  The plain @code{scanf} forms take a variable argument list.  The
6292@code{vscanf} forms take an argument pointer, see @ref{Variadic Functions,,
6293Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
6294va_start}.
6295
6296It should be emphasised that if a format string is invalid, or the arguments
6297don't match what the format specifies, then the behaviour of any of these
6298functions will be unpredictable.  GCC format string checking is not available,
6299since it doesn't recognise the GMP extensions.
6300
6301No overlap is permitted between the @var{fmt} string and any of the results
6302produced.
6303
6304@deftypefun int gmp_scanf (const char *@var{fmt}, @dots{})
6305@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
6306Read from the standard input @code{stdin}.
6307@end deftypefun
6308
6309@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
6310@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
6311Read from the stream @var{fp}.
6312@end deftypefun
6313
6314@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, @dots{})
6315@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
6316Read from a null-terminated string @var{s}.
6317@end deftypefun
6318
6319The return value from each of these functions is the same as the standard C99
6320@code{scanf}, namely the number of fields successfully parsed and stored.
6321@samp{%n} fields and fields read but suppressed by @samp{*} don't count
6322towards the return value.
6323
If end of input (or a file error) is reached before a character for a field or
a literal, and if no previous non-suppressed fields have matched, then the
return value is @code{EOF} instead of 0.  A whitespace character in the format
string is only an optional match and doesn't induce an @code{EOF} in this
fashion.  Leading whitespace read and discarded for a field doesn't count as
characters for that field.
6330
6331For the GMP types, input parsing follows C99 rules, namely one character of
6332lookahead is used and characters are read while they continue to meet the
6333format requirements.  If this doesn't provide a complete number then the
6334function terminates, with that field not stored nor counted towards the return
6335value.  For instance with @code{mpf_t} an input @samp{1.23e-XYZ} would be read
6336up to the @samp{X} and that character pushed back since it's not a digit.  The
6337string @samp{1.23e-} would then be considered invalid since an @samp{e} must
6338be followed by at least one digit.
6339
6340For the standard C types, in the current implementation GMP calls the C
6341library @code{scanf} functions, which might have looser rules about what
6342constitutes a valid input.
6343
6344Note that @code{gmp_sscanf} is the same as @code{gmp_fscanf} and only does one
6345character of lookahead when parsing.  Although clearly it could look at its
6346entire input, it is deliberately made identical to @code{gmp_fscanf}, the same
6347way C99 @code{sscanf} is the same as @code{fscanf}.
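
The return value rules above suggest checking the count on every call.  As a
minimal sketch, with @code{read_pair} being a name invented for this example,

@example
#include <stdio.h>
#include <gmp.h>

/* Hypothetical helper: read up to two whitespace separated integers
   from STR into A and B, which must already be initialized.  Returns
   the number of values stored (0, 1 or 2), treating EOF as 0.  */
int
read_pair (mpz_t a, mpz_t b, const char *str)
@{
  int  ret = gmp_sscanf (str, "%Zd %Zd", a, b);

  return ret == EOF ? 0 : ret;
@}
@end example

Here an input like @samp{12 xyz} stores only the first value and gives 1,
while an empty string reaches end of input before any field has matched and
so produces @code{EOF} from @code{gmp_sscanf}.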
6348
6349
6350@node C++ Formatted Input,  , Formatted Input Functions, Formatted Input
6351@section C++ Formatted Input
6352@cindex C++ @code{istream} input
6353@cindex @code{istream} input
6354
6355The following functions are provided in @file{libgmpxx} (@pxref{Headers and
6356Libraries}), which is built only if C++ support is enabled (@pxref{Build
6357Options}).  Prototypes are available from @code{<gmp.h>}.
6358
6359@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
6360Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
6361@end deftypefun
6362
6363@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
6364An integer like @samp{123} will be read, or a fraction like @samp{5/9}.  No
6365whitespace is allowed around the @samp{/}.  If the fraction is not in
6366canonical form then @code{mpq_canonicalize} must be called (@pxref{Rational
6367Number Functions}) before operating on it.
6368
As per integer input, a @samp{0} or @samp{0x} base indicator is read when
none of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} are set.  This is
done separately for numerator and denominator, so that for instance
@samp{0x10/11} is @math{16/11} and @samp{0x10/0x11} is @math{16/17}.
6373@end deftypefun
6374
6375@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
6376Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
6377
6378Hex or octal floats are not supported, but might be in the future, or perhaps
6379it's best to accept only what the standard float @code{operator>>} does.
6380@end deftypefun
6381
6382Note that digit grouping specified by the @code{istream} locale is currently
6383not accepted.  Perhaps this will change in the future.
6384
6385@sp 1
6386These operators mean that GMP types can be read in the usual C++ way, for
6387example,
6388
@example
mpz_t  z;
...
cin >> z;
@end example
6394
6395But note that @code{istream} input (and @code{ostream} output, @pxref{C++
6396Formatted Output}) is the only overloading available for the GMP types and
6397that for instance using @code{+} with an @code{mpz_t} will have unpredictable
6398results.  For classes with overloading, see @ref{C++ Class Interface}.
6399
6400
6401
6402@node C++ Class Interface, Custom Allocation, Formatted Input, Top
6403@chapter C++ Class Interface
6404@cindex C++ interface
6405
6406This chapter describes the C++ class based interface to GMP.
6407
6408All GMP C language types and functions can be used in C++ programs, since
6409@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
6410overloaded functions and operators which may be more convenient.
6411
6412Due to the implementation of this interface, a reasonably recent C++ compiler
6413is required, one supporting namespaces, partial specialization of templates
6414and member templates.  For GCC this means version 2.91 or later.
6415
6416@strong{Everything described in this chapter is to be considered preliminary
6417and might be subject to incompatible changes if some unforeseen difficulty
6418reveals itself.}
6419
6420@menu
6421* C++ Interface General::
6422* C++ Interface Integers::
6423* C++ Interface Rationals::
6424* C++ Interface Floats::
6425* C++ Interface Random Numbers::
6426* C++ Interface Limitations::
6427@end menu
6428
6429
6430@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
6431@section C++ Interface General
6432
6433@noindent
6434All the C++ classes and functions are available with
6435
6436@cindex @code{gmpxx.h}
@example
#include <gmpxx.h>
@end example
6440
6441Programs should be linked with the @file{libgmpxx} and @file{libgmp}
6442libraries.  For example,
6443
@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example
6447
6448@noindent
6449The classes defined are
6450
6451@deftp Class mpz_class
6452@deftpx Class mpq_class
6453@deftpx Class mpf_class
6454@end deftp
6455
6456The standard operators and various standard functions are overloaded to allow
6457arithmetic with these classes.  For example,
6458
@example
#include <iostream>
#include <gmpxx.h>

using namespace std;

int
main (void)
@{
  mpz_class a, b, c;

  a = 1234;
  b = "-5678";
  c = a+b;
  cout << "sum is " << c << "\n";
  cout << "absolute value is " << abs(c) << "\n";

  return 0;
@}
@end example
6474
6475An important feature of the implementation is that an expression like
6476@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
6477without using a temporary for the @code{b+c} part.  Expressions which by their
6478nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
6479though.
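
The effect can be pictured with a small stand-alone model.  The sketch below
is not the @file{gmpxx.h} implementation: @code{Big}, @code{Sum} and
@code{temporaries_used_by_sum} are hypothetical names, and a plain @code{long}
stands in for the underlying @code{mpz_t}.  It only illustrates how
@code{operator+} can return a lightweight description of the sum, which the
assignment then evaluates directly into the destination.

```cpp
#include <cassert>

// Toy stand-in for mpz_class (hypothetical); counts constructed objects.
struct Big {
    static int constructed;
    long value;

    explicit Big(long v = 0) : value(v) { ++constructed; }
    Big(const Big &other) : value(other.value) { ++constructed; }
    Big &operator=(const Big &) = default;

    // Assigning from a deferred expression writes straight into *this,
    // playing the role of a single mpz_add (a, b, c) call.
    template <class Expr>
    Big &operator=(const Expr &e) { value = e.eval(); return *this; }
};
int Big::constructed = 0;

// Deferred sum: remembers its operands, computes nothing until asked.
struct Sum {
    const Big &lhs;
    const Big &rhs;
    long eval() const { return lhs.value + rhs.value; }
};

// operator+ only records the operands; no Big is built here.
inline Sum operator+(const Big &l, const Big &r) { return Sum{l, r}; }

// Number of Big objects constructed while evaluating a = b + c.
inline int temporaries_used_by_sum() {
    Big a, b(1234), c(-5678);
    int before = Big::constructed;
    a = b + c;
    assert(a.value == -4444);
    return Big::constructed - before;
}
```

In this model @code{temporaries_used_by_sum} reports that no extra
@code{Big} is created for @code{a = b + c}.  An expression such as
@code{a=b*c+d*e} would still need somewhere to hold the two products, which is
why temporaries remain in that case.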
6480
6481The classes can be freely intermixed in expressions, as can the classes and
6482the standard types @code{long}, @code{unsigned long} and @code{double}.
6483Smaller types like @code{int} or @code{float} can also be intermixed, since
6484C++ will promote them.
6485
6486Note that @code{bool} is not accepted directly, but must be explicitly cast to
6487an @code{int} first.  This is because C++ will automatically convert any
6488pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
6489sorts of invalid class and pointer combinations compile but almost certainly
6490not do anything sensible.
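
The pitfall can be seen with a hypothetical class that, unlike the GMP
classes, does accept @code{bool}.  The names @code{LooseNumber},
@code{value_from_pointer} and @code{value_from_flag} below are illustrative
only and are not part of GMP.

```cpp
#include <cassert>

// Hypothetical class which accepts bool in its constructor.
struct LooseNumber {
    long value;
    LooseNumber(bool b) : value(b ? 1 : 0) {}
};

// A pointer passed by mistake is silently converted to bool, so a string
// intended as a number ends up as plain 1.
inline long value_from_pointer(const char *p) {
    LooseNumber n(p);   // compiles: pointer to bool is a standard conversion
    return n.value;
}

// When a bool really is meant, cast it to int first, as the text requires
// for the GMP classes.
inline long value_from_flag(bool flag) {
    return static_cast<int>(flag);
}

inline void show_pitfall() {
    assert(value_from_pointer("1234") == 1);   // not 1234
    assert(value_from_flag(true) == 1);
}
```

Because the GMP classes have no @code{bool} overload, a mistake of the first
kind is less likely to compile unnoticed.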
6491
6492Conversions back from the classes to standard C++ types aren't done
6493automatically, instead member functions like @code{get_si} are provided (see
6494the following sections for details).
6495
6496Also there are no automatic conversions from the classes to the corresponding
6497GMP C types, instead a reference to the underlying C object can be obtained
6498with the following functions,
6499
6500@deftypefun mpz_t mpz_class::get_mpz_t ()
6501@deftypefunx mpq_t mpq_class::get_mpq_t ()
6502@deftypefunx mpf_t mpf_class::get_mpf_t ()
6503@end deftypefun
6504
6505These can be used to call a C function which doesn't have a C++ class
6506interface.  For example to set @code{a} to the GCD of @code{b} and @code{c},
6507
@example
mpz_class a, b, c;
...
mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
@end example
6513
6514In the other direction, a class can be initialized from the corresponding GMP
6515C type, or assigned to if an explicit constructor is used.  In both cases this
6516makes a copy of the value, it doesn't create any sort of association.  For
6517example,
6518
@example
mpz_t z;
// ... init and calculate z ...
mpz_class x(z);
mpz_class y;
y = mpz_class (z);
@end example
6526
6527There are no namespace setups in @file{gmpxx.h}, all types and functions are
6528simply put into the global namespace.  This is what @file{gmp.h} has done in
6529the past, and continues to do for compatibility.  The extras provided by
6530@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
6531anything.
6532
6533
6534@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
6535@section C++ Interface Integers
6536
6537@deftypefun {} mpz_class::mpz_class (type @var{n})
6538Construct an @code{mpz_class}.  All the standard C++ types may be used, except
6539@code{long long} and @code{long double}, and all the GMP C++ classes can be
6540used, although conversions from @code{mpq_class} and @code{mpf_class} are
6541@code{explicit}.  Any necessary conversion follows the corresponding C
6542function, for example @code{double} follows @code{mpz_set_d}
6543(@pxref{Assigning Integers}).
6544@end deftypefun
6545
6546@deftypefun explicit mpz_class::mpz_class (mpz_t @var{z})
6547Construct an @code{mpz_class} from an @code{mpz_t}.  The value in @var{z} is
6548copied into the new @code{mpz_class}, there won't be any permanent association
6549between it and @var{z}.
6550@end deftypefun
6551
6552@deftypefun explicit mpz_class::mpz_class (const char *@var{s}, int @var{base} = 0)
6553@deftypefunx explicit mpz_class::mpz_class (const string& @var{s}, int @var{base} = 0)
6554Construct an @code{mpz_class} converted from a string using @code{mpz_set_str}
6555(@pxref{Assigning Integers}).
6556
6557If the string is not a valid integer, an @code{std::invalid_argument}
6558exception is thrown.  The same applies to @code{operator=}.
6559@end deftypefun
6560
6561@deftypefun mpz_class operator"" _mpz (const char *@var{str})
6562With C++11 compilers, integers can be constructed with the syntax
6563@code{123_mpz} which is equivalent to @code{mpz_class("123")}.
6564@end deftypefun
6565
6566@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
6567@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
6568Divisions involving @code{mpz_class} round towards zero, as per the
6569@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
6570This is the same as the C99 @code{/} and @code{%} operators.
6571
6572The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called
6573directly if desired.  For example,
6574
@example
mpz_class q, a, d;
...
mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
@end example
6580@end deftypefun
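
The difference between the two rounding styles only shows up when the operands
have opposite signs.  The following sketch uses plain @code{long} values
rather than @code{mpz_class}, and the names @code{QuotRem}, @code{trunc_div}
and @code{floor_div} are hypothetical.  It models the truncating behaviour of
the overloaded operators and the floor behaviour of the
@code{mpz_fdiv@dots{}} functions.

```cpp
#include <cassert>

struct QuotRem { long q; long r; };

// Truncating division: quotient rounded towards zero, remainder takes the
// sign of the dividend.  This is what the built-in / and % do.
inline QuotRem trunc_div(long a, long d) {
    assert(d != 0);
    return QuotRem{a / d, a % d};
}

// Floor division: quotient rounded towards minus infinity, remainder takes
// the sign of the divisor.
inline QuotRem floor_div(long a, long d) {
    assert(d != 0);
    long q = a / d;
    long r = a % d;
    if (r != 0 && ((r < 0) != (d < 0))) {
        --q;        // step the quotient down by one
        r += d;     // and compensate in the remainder
    }
    return QuotRem{q, r};
}
```

For example @code{trunc_div(-7, 2)} gives quotient @minus{}3 and remainder
@minus{}1, whereas @code{floor_div(-7, 2)} gives quotient @minus{}4 and
remainder 1.  In both cases the quotient times the divisor plus the remainder
equals the dividend.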
6581
6582@deftypefun mpz_class abs (mpz_class @var{op})
6583@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
6584@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
6585@maybepagebreak
6586@deftypefunx bool mpz_class::fits_sint_p (void)
6587@deftypefunx bool mpz_class::fits_slong_p (void)
6588@deftypefunx bool mpz_class::fits_sshort_p (void)
6589@maybepagebreak
6590@deftypefunx bool mpz_class::fits_uint_p (void)
6591@deftypefunx bool mpz_class::fits_ulong_p (void)
6592@deftypefunx bool mpz_class::fits_ushort_p (void)
6593@maybepagebreak
6594@deftypefunx double mpz_class::get_d (void)
6595@deftypefunx long mpz_class::get_si (void)
6596@deftypefunx string mpz_class::get_str (int @var{base} = 10)
6597@deftypefunx {unsigned long} mpz_class::get_ui (void)
6598@maybepagebreak
6599@deftypefunx int mpz_class::set_str (const char *@var{str}, int @var{base})
6600@deftypefunx int mpz_class::set_str (const string& @var{str}, int @var{base})
6601@deftypefunx int sgn (mpz_class @var{op})
6602@deftypefunx mpz_class sqrt (mpz_class @var{op})
6603@maybepagebreak
6604@deftypefunx void mpz_class::swap (mpz_class& @var{op})
6605@deftypefunx void swap (mpz_class& @var{op1}, mpz_class& @var{op2})
6606These functions provide a C++ class interface to the corresponding GMP C
6607routines.
6608
6609@code{cmp} can be used with any of the classes or the standard C++ types,
6610except @code{long long} and @code{long double}.
6611@end deftypefun
6612
6613@sp 1
6614Overloaded operators for combinations of @code{mpz_class} and @code{double}
6615are provided for completeness, but it should be noted that if the given
6616@code{double} is not an integer then the way any rounding is done is currently
6617unspecified.  The rounding might take place at the start, in the middle, or at
6618the end of the operation, and it might change in the future.
6619
6620Conversions between @code{mpz_class} and @code{double}, however, are defined
6621to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
6622And comparisons are always made exactly, as per @code{mpz_cmp_d}.
6623
6624
6625@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
6626@section C++ Interface Rationals
6627
In all the following constructors, if a fraction is given then it should be in
canonical form, or if not then @code{mpq_class::canonicalize} must be called.
6630
6631@deftypefun {} mpq_class::mpq_class (type @var{op})
6632@deftypefunx {} mpq_class::mpq_class (integer @var{num}, integer @var{den})
6633Construct an @code{mpq_class}.  The initial value can be a single value of any
6634type (conversion from @code{mpf_class} is @code{explicit}), or a pair of
6635integers (@code{mpz_class} or standard C++ integer types) representing a
6636fraction, except that @code{long long} and @code{long double} are not
6637supported.  For example,
6638
@example
mpq_class q (99);
mpq_class q (1.75);
mpq_class q (1, 3);
@end example
6644@end deftypefun
6645
6646@deftypefun explicit mpq_class::mpq_class (mpq_t @var{q})
6647Construct an @code{mpq_class} from an @code{mpq_t}.  The value in @var{q} is
6648copied into the new @code{mpq_class}, there won't be any permanent association
6649between it and @var{q}.
6650@end deftypefun
6651
6652@deftypefun explicit mpq_class::mpq_class (const char *@var{s}, int @var{base} = 0)
6653@deftypefunx explicit mpq_class::mpq_class (const string& @var{s}, int @var{base} = 0)
6654Construct an @code{mpq_class} converted from a string using @code{mpq_set_str}
6655(@pxref{Initializing Rationals}).
6656
6657If the string is not a valid rational, an @code{std::invalid_argument}
6658exception is thrown.  The same applies to @code{operator=}.
6659@end deftypefun
6660
6661@deftypefun mpq_class operator"" _mpq (const char *@var{str})
6662With C++11 compilers, integral rationals can be constructed with the syntax
6663@code{123_mpq} which is equivalent to @code{mpq_class(123_mpz)}. Other
6664rationals can be built as @code{-1_mpq/2} or @code{0xb_mpq/123456_mpz}.
6665@end deftypefun
6666
6667@deftypefun void mpq_class::canonicalize ()
6668Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
6669Functions}.  All arithmetic operators require their operands in canonical
6670form, and will return results in canonical form.
6671@end deftypefun
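
As a reminder of what canonical form involves (no common factors between
numerator and denominator, a positive denominator, and zero represented as
0/1), here is a sketch on plain @code{long} values.  @code{Fraction},
@code{gcd_long} and @code{canonical} are hypothetical names used for
illustration, not part of GMP, and this is not how
@code{mpq_class::canonicalize} is implemented.

```cpp
#include <cassert>
#include <cstdlib>

struct Fraction { long num; long den; };   // den must be non-zero

// Greatest common divisor by Euclid's algorithm, always non-negative.
inline long gcd_long(long a, long b) {
    a = std::labs(a);
    b = std::labs(b);
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Return f with a positive denominator and no common factors.
inline Fraction canonical(Fraction f) {
    assert(f.den != 0);
    if (f.den < 0) {            // move the sign to the numerator
        f.num = -f.num;
        f.den = -f.den;
    }
    long g = gcd_long(f.num, f.den);   // at least 1, since den is non-zero
    f.num /= g;
    f.den /= g;
    return f;
}
```

With this model @code{canonical(Fraction@{6, -4@})} yields @minus{}3/2 and
@code{canonical(Fraction@{0, 5@})} yields 0/1.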
6672
6673@deftypefun mpq_class abs (mpq_class @var{op})
6674@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
6675@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
6676@maybepagebreak
6677@deftypefunx double mpq_class::get_d (void)
6678@deftypefunx string mpq_class::get_str (int @var{base} = 10)
6679@maybepagebreak
6680@deftypefunx int mpq_class::set_str (const char *@var{str}, int @var{base})
6681@deftypefunx int mpq_class::set_str (const string& @var{str}, int @var{base})
6682@deftypefunx int sgn (mpq_class @var{op})
6683@maybepagebreak
6684@deftypefunx void mpq_class::swap (mpq_class& @var{op})
6685@deftypefunx void swap (mpq_class& @var{op1}, mpq_class& @var{op2})
6686These functions provide a C++ class interface to the corresponding GMP C
6687routines.
6688
6689@code{cmp} can be used with any of the classes or the standard C++ types,
6690except @code{long long} and @code{long double}.
6691@end deftypefun
6692
6693@deftypefun {mpz_class&} mpq_class::get_num ()
6694@deftypefunx {mpz_class&} mpq_class::get_den ()
6695Get a reference to an @code{mpz_class} which is the numerator or denominator
6696of an @code{mpq_class}.  This can be used both for read and write access.  If
6697the object returned is modified, it modifies the original @code{mpq_class}.
6698
6699If direct manipulation might produce a non-canonical value, then
6700@code{mpq_class::canonicalize} must be called before further operations.
6701@end deftypefun
6702
6703@deftypefun mpz_t mpq_class::get_num_mpz_t ()
6704@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
6705Get a reference to the underlying @code{mpz_t} numerator or denominator of an
6706@code{mpq_class}.  This can be passed to C functions expecting an
6707@code{mpz_t}.  Any modifications made to the @code{mpz_t} will modify the
6708original @code{mpq_class}.
6709
6710If direct manipulation might produce a non-canonical value, then
6711@code{mpq_class::canonicalize} must be called before further operations.
6712@end deftypefun
6713
@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop})
6715Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
6716the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).
6717
6718If the @var{rop} read might not be in canonical form then
6719@code{mpq_class::canonicalize} must be called.
6720@end deftypefun
6721
6722
6723@node C++ Interface Floats, C++ Interface Random Numbers, C++ Interface Rationals, C++ Class Interface
6724@section C++ Interface Floats
6725
6726When an expression requires the use of temporary intermediate @code{mpf_class}
6727values, like @code{f=g*h+x*y}, those temporaries will have the same precision
6728as the destination @code{f}.  Explicit constructors can be used if this
6729doesn't suit.
6730
6731@deftypefun {} mpf_class::mpf_class (type @var{op})
6732@deftypefunx {} mpf_class::mpf_class (type @var{op}, mp_bitcnt_t @var{prec})
6733Construct an @code{mpf_class}.  Any standard C++ type can be used, except
6734@code{long long} and @code{long double}, and any of the GMP C++ classes can be
6735used.
6736
6737If @var{prec} is given, the initial precision is that value, in bits.  If
6738@var{prec} is not given, then the initial precision is determined by the type
6739of @var{op} given.  An @code{mpz_class}, @code{mpq_class}, or C++
6740builtin type will give the default @code{mpf} precision (@pxref{Initializing
6741Floats}).  An @code{mpf_class} or expression will give the precision of that
6742value.  The precision of a binary expression is the higher of the two
6743operands.
6744
@example
mpf_class f(1.5);        // default precision
mpf_class f(1.5, 500);   // 500 bits (at least)
mpf_class f(x);          // precision of x
mpf_class f(abs(x));     // precision of x
mpf_class f(-g, 1000);   // 1000 bits (at least)
mpf_class f(x+y);        // greater of precisions of x and y
@end example
6753@end deftypefun
6754
6755@deftypefun explicit mpf_class::mpf_class (mpf_t @var{f})
6756@deftypefunx {} mpf_class::mpf_class (mpf_t @var{f}, mp_bitcnt_t @var{prec})
6757Construct an @code{mpf_class} from an @code{mpf_t}.  The value in @var{f} is
6758copied into the new @code{mpf_class}, there won't be any permanent association
6759between it and @var{f}.
6760
6761If @var{prec} is given, the initial precision is that value, in bits.  If
6762@var{prec} is not given, then the initial precision is that of @var{f}.
6763@end deftypefun
6764
6765@deftypefun explicit mpf_class::mpf_class (const char *@var{s})
6766@deftypefunx {} mpf_class::mpf_class (const char *@var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
6767@deftypefunx explicit mpf_class::mpf_class (const string& @var{s})
6768@deftypefunx {} mpf_class::mpf_class (const string& @var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
6769Construct an @code{mpf_class} converted from a string using @code{mpf_set_str}
6770(@pxref{Assigning Floats}).  If @var{prec} is given, the initial precision is
6771that value, in bits.  If not, the default @code{mpf} precision
6772(@pxref{Initializing Floats}) is used.
6773
6774If the string is not a valid float, an @code{std::invalid_argument} exception
6775is thrown.  The same applies to @code{operator=}.
6776@end deftypefun
6777
6778@deftypefun mpf_class operator"" _mpf (const char *@var{str})
6779With C++11 compilers, floats can be constructed with the syntax
6780@code{1.23e-1_mpf} which is equivalent to @code{mpf_class("1.23e-1")}.
6781@end deftypefun
6782
6783@deftypefun {mpf_class&} mpf_class::operator= (type @var{op})
6784Convert and store the given @var{op} value to an @code{mpf_class} object.  The
6785same types are accepted as for the constructors above.
6786
6787Note that @code{operator=} only stores a new value, it doesn't copy or change
6788the precision of the destination, instead the value is truncated if necessary.
6789This is the same as @code{mpf_set} etc.  Note in particular this means for
6790@code{mpf_class} a copy constructor is not the same as a default constructor
6791plus assignment.
6792
@example
mpf_class x (y);   // x created with precision of y

mpf_class x;       // x created with default precision
x = y;             // value truncated to that precision
@end example
6799
6800Applications using templated code may need to be careful about the assumptions
6801the code makes in this area, when working with @code{mpf_class} values of
6802various different or non-default precisions.  For instance implementations of
6803the standard @code{complex} template have been seen in both styles above,
6804though of course @code{complex} is normally only actually specified for use
6805with the builtin float types.
6806@end deftypefun
6807
6808@deftypefun mpf_class abs (mpf_class @var{op})
6809@deftypefunx mpf_class ceil (mpf_class @var{op})
6810@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2})
6811@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2})
6812@maybepagebreak
6813@deftypefunx bool mpf_class::fits_sint_p (void)
6814@deftypefunx bool mpf_class::fits_slong_p (void)
6815@deftypefunx bool mpf_class::fits_sshort_p (void)
6816@maybepagebreak
6817@deftypefunx bool mpf_class::fits_uint_p (void)
6818@deftypefunx bool mpf_class::fits_ulong_p (void)
6819@deftypefunx bool mpf_class::fits_ushort_p (void)
6820@maybepagebreak
6821@deftypefunx mpf_class floor (mpf_class @var{op})
6822@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2})
6823@maybepagebreak
6824@deftypefunx double mpf_class::get_d (void)
6825@deftypefunx long mpf_class::get_si (void)
6826@deftypefunx string mpf_class::get_str (mp_exp_t& @var{exp}, int @var{base} = 10, size_t @var{digits} = 0)
6827@deftypefunx {unsigned long} mpf_class::get_ui (void)
6828@maybepagebreak
6829@deftypefunx int mpf_class::set_str (const char *@var{str}, int @var{base})
6830@deftypefunx int mpf_class::set_str (const string& @var{str}, int @var{base})
6831@deftypefunx int sgn (mpf_class @var{op})
6832@deftypefunx mpf_class sqrt (mpf_class @var{op})
6833@maybepagebreak
6834@deftypefunx void mpf_class::swap (mpf_class& @var{op})
6835@deftypefunx void swap (mpf_class& @var{op1}, mpf_class& @var{op2})
6836@deftypefunx mpf_class trunc (mpf_class @var{op})
6837These functions provide a C++ class interface to the corresponding GMP C
6838routines.
6839
6840@code{cmp} can be used with any of the classes or the standard C++ types,
6841except @code{long long} and @code{long double}.
6842
6843The accuracy provided by @code{hypot} is not currently guaranteed.
6844@end deftypefun
6845
6846@deftypefun {mp_bitcnt_t} mpf_class::get_prec ()
6847@deftypefunx void mpf_class::set_prec (mp_bitcnt_t @var{prec})
6848@deftypefunx void mpf_class::set_prec_raw (mp_bitcnt_t @var{prec})
6849Get or set the current precision of an @code{mpf_class}.
6850
The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
Floats}) apply to @code{mpf_class::set_prec_raw}.  Note in particular that the
@code{mpf_class} must be restored to its allocated precision before being
destroyed.  This must be done by application code, there's no automatic
mechanism for it.
6856@end deftypefun
6857
6858
6859@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface Floats, C++ Class Interface
6860@section C++ Interface Random Numbers
6861
6862@deftp Class gmp_randclass
6863The C++ class interface to the GMP random number functions uses
6864@code{gmp_randclass} to hold an algorithm selection and current state, as per
6865@code{gmp_randstate_t}.
6866@end deftp
6867
6868@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, @dots{}), @dots{})
6869Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
6870function (@pxref{Random State Initialization}).  The arguments expected are
6871the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
6872For example,
6873
@example
gmp_randclass r1 (gmp_randinit_default);
gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
gmp_randclass r4 (gmp_randinit_mt);
@end example
6880
6881@code{gmp_randinit_lc_2exp_size} will fail if the size requested is too big,
6882an @code{std::length_error} exception is thrown in that case.
6883@end deftypefun
6884
6885@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, @dots{})
6886Construct a @code{gmp_randclass} using the same parameters as
6887@code{gmp_randinit} (@pxref{Random State Initialization}).  This function is
6888obsolete and the above @var{randinit} style should be preferred.
6889@end deftypefun
6890
6891@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
6892@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
Seed a random number generator.  @xref{Random Number Functions}, for how to
choose a good seed.
6895@end deftypefun
6896
6897@deftypefun mpz_class gmp_randclass::get_z_bits (mp_bitcnt_t @var{bits})
6898@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
6899Generate a random integer with a specified number of bits.
6900@end deftypefun
6901
6902@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
6903Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
6904@end deftypefun
6905
6906@deftypefun mpf_class gmp_randclass::get_f ()
6907@deftypefunx mpf_class gmp_randclass::get_f (mp_bitcnt_t @var{prec})
6908Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}.  @var{f}
6909will be to @var{prec} bits precision, or if @var{prec} is not given then to
6910the precision of the destination.  For example,
6911
@example
gmp_randclass  r (gmp_randinit_default);
...
mpf_class  f (0, 512);   // 512 bits precision
f = r.get_f();           // random number, 512 bits
@end example
6918@end deftypefun
6919
6920
6921
6922@node C++ Interface Limitations,  , C++ Interface Random Numbers, C++ Class Interface
6923@section C++ Interface Limitations
6924
6925@table @asis
6926@item @code{mpq_class} and Templated Reading
6927A generic piece of template code probably won't know that @code{mpq_class}
6928requires a @code{canonicalize} call if inputs read with @code{operator>>}
6929might be non-canonical.  This can lead to incorrect results.
6930
6931@code{operator>>} behaves as it does for reasons of efficiency.  A
6932canonicalize can be quite time consuming on large operands, and is best
6933avoided if it's not necessary.
6934
6935But this potential difficulty reduces the usefulness of @code{mpq_class}.
6936Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
6937the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
6938pressed into service.  Or maybe, at the risk of inconsistency, the
6939@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
6940@code{operator>>} not doing so, for use on those occasions when that's
6941acceptable.  Send feedback or alternate ideas to @email{gmp-bugs@@gmplib.org}.
6942
6943@item Subclassing
6944Subclassing the GMP C++ classes works, but is not currently recommended.
6945
Expressions involving subclasses resolve correctly (or seem to), but in normal
C++ fashion the subclass doesn't inherit constructors and assignments.
There are many of those in the GMP classes, and a good way to reestablish them
in a subclass is not yet provided.
6950
6951@item Templated Expressions
6952A subtle difficulty exists when using expressions together with
6953application-defined template functions.  Consider the following, with @code{T}
6954intended to be some numeric type,
6955
@example
template <class T>
T fun (const T &, const T &);
@end example
6960
6961@noindent
6962When used with, say, plain @code{mpz_class} variables, it works fine: @code{T}
6963is resolved as @code{mpz_class}.
6964
@example
mpz_class f(1), g(2);
fun (f, g);    // Good
@end example
6969
6970@noindent
6971But when one of the arguments is an expression, it doesn't work.
6972
@example
mpz_class f(1), g(2), h(3);
fun (f, g+h);  // Bad
@end example
6977
6978This is because @code{g+h} ends up being a certain expression template type
6979internal to @code{gmpxx.h}, which the C++ template resolution rules are unable
6980to automatically convert to @code{mpz_class}.  The workaround is simply to add
6981an explicit cast.
6982
@example
mpz_class f(1), g(2), h(3);
fun (f, mpz_class(g+h));  // Good
@end example
6987
6988Similarly, within @code{fun} it may be necessary to cast an expression to type
6989@code{T} when calling a templated @code{fun2}.
6990
@example
template <class T>
void fun (T f, T g)
@{
  fun2 (f, f+g);     // Bad
@}

template <class T>
void fun (T f, T g)
@{
  fun2 (f, T(f+g));  // Good
@}
@end example
7004@end table
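
The templated-expression difficulty can be reproduced without GMP.  In the
following sketch @code{Num} is a hypothetical stand-in for @code{mpz_class}
and @code{SumExpr} for the internal expression template type.  Neither name
comes from @file{gmpxx.h}, and the real types are considerably more
elaborate.  The point is only that template argument deduction does not
consider the user-defined conversion from @code{SumExpr} to @code{Num}, so
the cast has to be written explicitly.

```cpp
#include <cassert>

struct Num { long value; };        // toy stand-in for mpz_class

// Toy expression node: convertible to Num, but a distinct type.
struct SumExpr {
    const Num &lhs;
    const Num &rhs;
    operator Num() const { return Num{lhs.value + rhs.value}; }
};

inline SumExpr operator+(const Num &l, const Num &r) { return SumExpr{l, r}; }

// Application template expecting both arguments to have the same type T.
template <class T>
T fun(const T &x, const T &y) { return T{x.value * y.value}; }

inline long demo() {
    Num f{1}, g{2}, h{3};
    // fun (f, g + h);   // rejected: T deduced as Num and as SumExpr
    Num s = g + h;       // an ordinary conversion works outside deduction
    assert(s.value == 5);
    return fun(f, Num(g + h)).value;   // explicit cast: both arguments Num
}
```

Uncommenting the rejected call should produce a deduction conflict error,
which is the same kind of failure described above for @code{fun (f, g+h)}.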
7005
7006
7007@node Custom Allocation, Language Bindings, C++ Class Interface, Top
7008@comment  node-name,  next,  previous,  up
7009@chapter Custom Allocation
7010@cindex Custom allocation
7011@cindex Memory allocation
7012@cindex Allocation of memory
7013
7014By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory
7015allocation, and if they fail GMP prints a message to the standard error output
7016and terminates the program.
7017
7018Alternate functions can be specified, to allocate memory in a different way or
7019to have a different error action on running out of memory.
7020
7021@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t))
Replace the current allocation functions with those given by the arguments.
If an argument is @code{NULL}, the corresponding default function is used.
7024
7025These functions will be used for all memory allocation done by GMP, apart from
7026temporary space from @code{alloca} if that function is available and GMP is
7027configured to use it (@pxref{Build Options}).
7028
7029@strong{Be sure to call @code{mp_set_memory_functions} only when there are no
7030active GMP objects allocated using the previous memory functions!  Usually
7031that means calling it before any other GMP function.}
7032@end deftypefun
7033
7034The functions supplied should fit the following declarations:
7035
7036@deftypevr Function {void *} allocate_function (size_t @var{alloc_size})
7037Return a pointer to newly allocated space with at least @var{alloc_size}
7038bytes.
7039@end deftypevr
7040
7041@deftypevr Function {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size})
7042Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be
7043@var{new_size} bytes.
7044
7045The block may be moved if necessary or if desired, and in that case the
7046smaller of @var{old_size} and @var{new_size} bytes must be copied to the new
7047location.  The return value is a pointer to the resized block, that being the
7048new location if moved or just @var{ptr} if not.
7049
7050@var{ptr} is never @code{NULL}, it's always a previously allocated block.
7051@var{new_size} may be bigger or smaller than @var{old_size}.
7052@end deftypevr
7053
7054@deftypevr Function void free_function (void *@var{ptr}, size_t @var{size})
7055De-allocate the space pointed to by @var{ptr}.
7056
7057@var{ptr} is never @code{NULL}, it's always a previously allocated block of
7058@var{size} bytes.
7059@end deftypevr
7060
7061A @dfn{byte} here means the unit used by the @code{sizeof} operator.
7062
7063The @var{reallocate_function} parameter @var{old_size} and the
7064@var{free_function} parameter @var{size} are passed for convenience, but of
7065course they can be ignored if not needed by an implementation.  The default
7066functions using @code{malloc} and friends for instance don't use them.
7067
7068No error return is allowed from any of these functions, if they return then
7069they must have performed the specified operation.  In particular note that
7070@var{allocate_function} or @var{reallocate_function} mustn't return
7071@code{NULL}.
7072
7073Getting a different fatal error action is a good use for custom allocation
7074functions, for example giving a graphical dialog rather than the default print
7075to @code{stderr}.  How much is possible when genuinely out of memory is
7076another question though.
7077
7078There's currently no defined way for the allocation functions to recover from
7079an error such as out of memory, they must terminate program execution.  A
7080@code{longjmp} or throwing a C++ exception will have undefined results.  This
7081may change in the future.
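
A set of replacement functions following these rules might look like the
sketch below.  The names @code{my_alloc}, @code{my_realloc}, @code{my_free}
and @code{out_of_memory} are hypothetical, and the choice of
@code{std::abort} as the fatal action is just one possibility.  The sketch
does not include @file{gmp.h}, so the registration call is shown only in a
comment.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical fatal-error action.  It must not return, since the
// allocation functions are not allowed to report failure to GMP.
[[noreturn]] static void out_of_memory(std::size_t size) {
    std::fprintf(stderr, "cannot allocate %lu bytes for GMP\n",
                 static_cast<unsigned long>(size));
    std::abort();
}

// Matches allocate_function: never returns NULL.
void *my_alloc(std::size_t alloc_size) {
    void *p = std::malloc(alloc_size != 0 ? alloc_size : 1);
    if (p == NULL)
        out_of_memory(alloc_size);
    return p;
}

// Matches reallocate_function: old_size is available but not needed here.
void *my_realloc(void *ptr, std::size_t old_size, std::size_t new_size) {
    assert(ptr != NULL);          // GMP never passes NULL
    (void) old_size;
    void *p = std::realloc(ptr, new_size != 0 ? new_size : 1);
    if (p == NULL)
        out_of_memory(new_size);
    return p;                     // contents preserved by realloc
}

// Matches free_function: size is available but not needed here.
void my_free(void *ptr, std::size_t size) {
    assert(ptr != NULL);          // GMP never passes NULL
    (void) size;
    std::free(ptr);
}

// With gmp.h included, these would be installed before any other GMP call:
//   mp_set_memory_functions (my_alloc, my_realloc, my_free);
```

Terminating inside @code{out_of_memory} rather than throwing or using
@code{longjmp} keeps within the restriction stated above.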
7082
7083GMP may use allocated blocks to hold pointers to other allocated blocks.  This
7084will limit the assumptions a conservative garbage collection scheme can make.
7085
7086Since the default GMP allocation uses @code{malloc} and friends, those
7087functions will be linked in even if the first thing a program does is an
7088@code{mp_set_memory_functions}.  It's necessary to change the GMP sources if
7089this is a problem.
7090
7091@sp 1
7092@deftypefun void mp_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (**@var{free_func_ptr}) (void *, size_t))
7093Get the current allocation functions, storing function pointers to the
7094locations given by the arguments.  If an argument is @code{NULL}, that
7095function pointer is not stored.
7096
7097@need 1000
7098For example, to get just the current free function,
7099
@example
void (*freefunc) (void *, size_t);

mp_get_memory_functions (NULL, NULL, &freefunc);
@end example
7105@end deftypefun
7106
7107@node Language Bindings, Algorithms, Custom Allocation, Top
7108@chapter Language Bindings
7109@cindex Language bindings
7110@cindex Other languages
7111
7112The following packages and projects offer access to GMP from languages other
7113than C, though perhaps with varying levels of functionality and efficiency.
7114
7115@c  @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces
7116@c  in tex, just to separate the URL from the preceding text a bit.
7117@iftex
7118@macro spaceuref {U}
7119@ @ @uref{\U\}
7120@end macro
7121@end iftex
7122@ifnottex
7123@macro spaceuref {U}
7124@uref{\U\}
7125@end macro
7126@end ifnottex
7127
7128@sp 1
7129@table @asis
7130@item C++
7131@itemize @bullet
7132@item
7133GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward
7134interface, expression templates to eliminate temporaries.
7135@item
7136ALP @spaceuref{http://www-sop.inria.fr/saga/logiciels/ALP/} @* Linear algebra and
7137polynomials using templates.
7138@item
7139Arithmos @spaceuref{http://cant.ua.ac.be/old/arithmos/} @* Rationals
7140with infinities and square roots.
7141@item
7142CLN @spaceuref{http://www.ginac.de/CLN/} @* High level classes for arithmetic.
7143@item
7144LiDIA @spaceuref{http://www.cdc.informatik.tu-darmstadt.de/TI/LiDIA/} @* A C++
7145library for computational number theory.
7146@item
7147Linbox @spaceuref{http://www.linalg.org/} @* Sparse vectors and matrices.
7148@item
7149NTL @spaceuref{http://www.shoup.net/ntl/} @* A C++ number theory library.
7150@end itemize
7151
7152@c @item D
7153@c @itemize @bullet
7154@c @item
7155@c gmp-d @spaceuref{http://home.comcast.net/~benhinkle/gmp-d/}
7156@c @end itemize
7157
7158@item Eiffel
7159@itemize @bullet
7160@item
7161Eiffelroom @spaceuref{http://www.eiffelroom.org/node/442}
7162@end itemize
7163
7164@item Fortran
7165@itemize @bullet
7166@item
7167Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary
7168precision floats.
7169@end itemize
7170
7171@item Haskell
7172@itemize @bullet
7173@item
7174Glasgow Haskell Compiler @spaceuref{http://www.haskell.org/ghc/}
7175@end itemize
7176
7177@item Java
7178@itemize @bullet
7179@item
7180Kaffe @spaceuref{http://www.kaffe.org/}
7181@item
7182Kissme @spaceuref{http://kissme.sourceforge.net/}
7183@end itemize
7184
7185@item Lisp
7186@itemize @bullet
7187@item
7188GNU Common Lisp @spaceuref{http://www.gnu.org/software/gcl/gcl.html}
7189@item
7190Librep @spaceuref{http://librep.sourceforge.net/}
7191@item
7192@c  FIXME: When there's a stable release with gmp support, just refer to it
7193@c  rather than bothering to talk about betas.
7194XEmacs (21.5.18 beta and up) @spaceuref{http://www.xemacs.org} @* Optional
7195big integers, rationals and floats using GMP.
7196@end itemize
7197
7198@item M4
7199@itemize @bullet
7200@item
7201@c  FIXME: When there's a stable release with gmp support, just refer to it
7202@c  rather than bothering to talk about betas.
7203GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu/} @* Optionally provides
7204an arbitrary precision @code{mpeval}.
7205@end itemize
7206
7207@item ML
7208@itemize @bullet
7209@item
7210MLton compiler @spaceuref{http://mlton.org/}
7211@end itemize
7212
7213@item Objective Caml
7214@itemize @bullet
7215@item
7216MLGMP @spaceuref{http://www.di.ens.fr/~monniaux/programmes.html.en}
7217@item
7218Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using
7219GMP.
7220@end itemize
7221
7222@item Oz
7223@itemize @bullet
7224@item
7225Mozart @spaceuref{http://www.mozart-oz.org/}
7226@end itemize
7227
7228@item Pascal
7229@itemize @bullet
7230@item
7231GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de/} @* GMP unit.
7232@item
7233Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal,
7234optionally using GMP.
7235@end itemize
7236
7237@item Perl
7238@itemize @bullet
7239@item
7240GMP module, see @file{demos/perl} in the GMP sources (@pxref{Demonstration
7241Programs}).
7242@item
7243Math::GMP @spaceuref{http://www.cpan.org/} @* Compatible with Math::BigInt, but
7244not as many functions as the GMP module above.
7245@item
7246Math::BigInt::GMP @spaceuref{http://www.cpan.org/} @* Plug Math::GMP into
7247normal Math::BigInt operations.
7248@end itemize
7249
7250@need 1000
7251@item Pike
7252@itemize @bullet
7253@item
7254mpz module in the standard distribution, @uref{http://pike.ida.liu.se/}
7255@end itemize
7256
7257@need 500
7258@item Prolog
7259@itemize @bullet
7260@item
7261SWI Prolog @spaceuref{http://www.swi-prolog.org/} @*
7262Arbitrary precision floats.
7263@end itemize
7264
7265@item Python
7266@itemize @bullet
7267@item
7268GMPY @uref{http://code.google.com/p/gmpy/}
7269@end itemize
7270
7271@item Ruby
7272@itemize @bullet
7273@item
@uref{http://rubygems.org/gems/gmp}
7275@end itemize
7276
7277@item Scheme
7278@itemize @bullet
7279@item
7280GNU Guile (upcoming 1.8) @spaceuref{http://www.gnu.org/software/guile/guile.html}
7281@item
7282RScheme @spaceuref{http://www.rscheme.org/}
7283@item
7284STklos @spaceuref{http://www.stklos.org/}
7285@c
7286@c  For reference, MzScheme uses some of gmp, but (as of version 205) it only
7287@c  has copies of some of the generic C code, and we don't consider that a
7288@c  language binding to gmp.
7289@c
7290@end itemize
7291
7292@item Smalltalk
7293@itemize @bullet
7294@item
7295GNU Smalltalk @spaceuref{http://www.smalltalk.org/versions/GNUSmalltalk.html}
7296@end itemize
7297
7298@item Other
7299@itemize @bullet
7300@item
7301Axiom @uref{http://savannah.nongnu.org/projects/axiom} @* Computer algebra
7302using GCL.
7303@item
7304DrGenius @spaceuref{http://drgenius.seul.org/} @* Geometry system and
7305mathematical programming language.
7306@item
7307GiNaC @spaceuref{http://www.ginac.de/} @* C++ computer algebra using CLN.
7308@item
7309GOO @spaceuref{http://www.googoogaga.org/} @* Dynamic object oriented
7310language.
7311@item
7312Maxima @uref{http://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
7313computer algebra using GCL.
7314@item
7315Q @spaceuref{http://q-lang.sourceforge.net/} @* Equational programming system.
7316@item
7317Regina @spaceuref{http://regina.sourceforge.net/} @* Topological calculator.
7318@item
Yacas @spaceuref{http://yacas.sourceforge.net} @* Yet another computer algebra system.
7320@end itemize
7321
7322@end table
7323
7324
7325@node Algorithms, Internals, Language Bindings, Top
7326@chapter Algorithms
7327@cindex Algorithms
7328
7329This chapter is an introduction to some of the algorithms used for various GMP
7330operations.  The code is likely to be hard to understand without knowing
7331something about the algorithms.
7332
7333Some GMP internals are mentioned, but applications that expect to be
7334compatible with future GMP releases should take care to use only the
7335documented functions.
7336
7337@menu
7338* Multiplication Algorithms::
7339* Division Algorithms::
7340* Greatest Common Divisor Algorithms::
7341* Powering Algorithms::
7342* Root Extraction Algorithms::
7343* Radix Conversion Algorithms::
7344* Other Algorithms::
7345* Assembly Coding::
7346@end menu
7347
7348
7349@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms
7350@section Multiplication
7351@cindex Multiplication algorithms
7352
7353N@cross{}N limb multiplications and squares are done using one of seven
7354algorithms, as the size N increases.
7355
7356@quotation
7357@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
7358@item Algorithm @tab Threshold
7359@item Basecase  @tab (none)
7360@item Karatsuba @tab @code{MUL_TOOM22_THRESHOLD}
7361@item Toom-3    @tab @code{MUL_TOOM33_THRESHOLD}
7362@item Toom-4    @tab @code{MUL_TOOM44_THRESHOLD}
7363@item Toom-6.5  @tab @code{MUL_TOOM6H_THRESHOLD}
7364@item Toom-8.5  @tab @code{MUL_TOOM8H_THRESHOLD}
7365@item FFT       @tab @code{MUL_FFT_THRESHOLD}
7366@end multitable
7367@end quotation
7368
7369Similarly for squaring, with the @code{SQR} thresholds.
7370
7371N@cross{}M multiplications of operands with different sizes above
7372@code{MUL_TOOM22_THRESHOLD} are currently done by special Toom-inspired
7373algorithms or directly with FFT, depending on operand size (@pxref{Unbalanced
7374Multiplication}).
7375
7376@menu
7377* Basecase Multiplication::
7378* Karatsuba Multiplication::
7379* Toom 3-Way Multiplication::
7380* Toom 4-Way Multiplication::
7381* Higher degree Toom'n'half::
7382* FFT Multiplication::
7383* Other Multiplication::
7384* Unbalanced Multiplication::
7385@end menu
7386
7387
7388@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms
7389@subsection Basecase Multiplication
7390
7391Basecase N@cross{}M multiplication is a straightforward rectangular set of
7392cross-products, the same as long multiplication done by hand and for that
7393reason sometimes known as the schoolbook or grammar school method.  This is an
7394@m{O(NM),O(N*M)} algorithm.  See Knuth section 4.3.1 algorithm M
7395(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code.
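
As an illustration only, the rectangular set of cross-products can be sketched
in Python on little-endian lists of limbs.  The 16-bit @code{LIMB_BITS} and the
helper names @code{to_limbs}, @code{from_limbs} and @code{mul_basecase} are
assumptions made for this example and are not GMP's; the real code is
@file{mpn/generic/mul_basecase.c}.

@example
```python
LIMB_BITS = 16                 # illustrative limb size, not GMP's
BASE = 1 << LIMB_BITS

def to_limbs(n):
    """Split a non-negative integer into little-endian limbs."""
    limbs = []
    while n:
        limbs.append(n % BASE)
        n //= BASE
    return limbs or [0]

def from_limbs(limbs):
    """Recombine little-endian limbs into an integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))

def mul_basecase(u, v):
    """Schoolbook product of limb lists u (N limbs) and v (M limbs)."""
    w = [0] * (len(u) + len(v))
    for j, vj in enumerate(v):
        carry = 0
        for i, ui in enumerate(u):
            t = w[i + j] + ui * vj + carry   # one cross-product per step
            w[i + j] = t % BASE
            carry = t // BASE
        w[len(u) + j] = carry                # top limb of this row
    return w
```
@end example

The two nested loops visit every pair of limbs once, which is where the
@m{O(NM),O(N*M)} cost comes from.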
7396
7397Assembly implementations of @code{mpn_mul_basecase} are essentially the same
7398as the generic C code, but have all the usual assembly tricks and
7399obscurities introduced for speed.
7400
7401A square can be done in roughly half the time of a multiply, by using the fact
7402that the cross products above and below the diagonal are the same.  A triangle
7403of products below the diagonal is formed, doubled (left shift by one bit), and
7404then the products on the diagonal added.  This can be seen in
7405@file{mpn/generic/sqr_basecase.c}.  Again the assembly implementations take
7406essentially the same approach.
7407
7408@tex
7409\def\GMPline#1#2#3#4#5#6{%
7410  \hbox {%
7411    \vrule height 2.5ex depth 1ex
7412           \hbox to 2em {\hfil{#2}\hfil}%
7413    \vrule \hbox to 2em {\hfil{#3}\hfil}%
7414    \vrule \hbox to 2em {\hfil{#4}\hfil}%
7415    \vrule \hbox to 2em {\hfil{#5}\hfil}%
7416    \vrule \hbox to 2em {\hfil{#6}\hfil}%
7417    \vrule}}
7418\GMPdisplay{
7419  \hbox{%
7420    \vbox{%
7421      \hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}%
7422      \hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}%
7423      \hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}%
7424      \hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}%
7425      \hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}%
7426      \hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}%
7427      \vfill}%
7428    \vbox{%
7429      \hbox{%
7430        \hbox to 2em {\hfil u0\hfil}%
7431        \hbox to 2em {\hfil u1\hfil}%
7432        \hbox to 2em {\hfil u2\hfil}%
7433        \hbox to 2em {\hfil u3\hfil}%
7434        \hbox to 2em {\hfil u4\hfil}}%
7435      \vskip 0.7ex
7436      \hrule
7437      \GMPline{u0}{d}{}{}{}{}%
7438      \hrule
7439      \GMPline{u1}{}{d}{}{}{}%
7440      \hrule
7441      \GMPline{u2}{}{}{d}{}{}%
7442      \hrule
7443      \GMPline{u3}{}{}{}{d}{}%
7444      \hrule
7445      \GMPline{u4}{}{}{}{}{d}%
7446      \hrule}}}
7447@end tex
7448@ifnottex
7449@example
7450@group
7451     u0  u1  u2  u3  u4
7452   +---+---+---+---+---+
7453u0 | d |   |   |   |   |
7454   +---+---+---+---+---+
7455u1 |   | d |   |   |   |
7456   +---+---+---+---+---+
7457u2 |   |   | d |   |   |
7458   +---+---+---+---+---+
7459u3 |   |   |   | d |   |
7460   +---+---+---+---+---+
7461u4 |   |   |   |   | d |
7462   +---+---+---+---+---+
7463@end group
7464@end example
7465@end ifnottex
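
The triangle, doubling and diagonal steps can be sketched as follows.  This is
a simplified Python illustration, not the GMP code: it accumulates into Python
integers instead of limb arrays, and the 16-bit limb size and the function
names are assumptions of the example.

@example
```python
LIMB_BITS = 16                 # illustrative limb size, not GMP's

def sqr_basecase(u):
    """Square the little-endian limb list u, returning an integer."""
    n = len(u)
    # Triangle of cross products u[i]*u[j] on one side of the diagonal.
    triangle = 0
    for i in range(n):
        for j in range(i + 1, n):
            triangle += (u[i] * u[j]) << (LIMB_BITS * (i + j))
    # Products on the diagonal, u[i]^2.
    diagonal = sum((u[i] * u[i]) << (LIMB_BITS * 2 * i) for i in range(n))
    # Double the triangle with a one-bit left shift, then add the diagonal.
    return (triangle << 1) + diagonal

def limb_products(n):
    """Limb products needed for n limbs: (square, general multiply)."""
    return n * (n + 1) // 2, n * n
```
@end example

For the 5-limb case in the diagram, @code{limb_products(5)} gives 15 products
for the square against 25 for a general multiply.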
7466
In practice squaring isn't a full 2@cross{} faster than multiplying; it's
usually around 1.5@cross{}.  Less than 1.5@cross{} probably indicates
@code{mpn_sqr_basecase} wants improving on that CPU.
7470
On some CPUs @code{mpn_mul_basecase} can be faster than the generic C
@code{mpn_sqr_basecase} on some small sizes.  @code{SQR_BASECASE_THRESHOLD} is
the size at which to use @code{mpn_sqr_basecase}; it will be zero if that
routine should always be used.
7475
7476
7477@node Karatsuba Multiplication, Toom 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms
7478@subsection Karatsuba Multiplication
7479@cindex Karatsuba multiplication
7480
7481The Karatsuba multiplication algorithm is described in Knuth section 4.3.3
7482part A, and various other textbooks.  A brief description is given here.
7483
7484The inputs @math{x} and @math{y} are treated as each split into two parts of
7485equal length (or the most significant part one limb shorter if N is odd).
7486
7487@tex
7488% GMPboxwidth used for all the multiplication pictures
7489\global\newdimen\GMPboxwidth \global\GMPboxwidth=5em
7490% GMPboxdepth and GMPboxheight are also used for the float pictures
7491\global\newdimen\GMPboxdepth  \global\GMPboxdepth=1ex
7492\global\newdimen\GMPboxheight \global\GMPboxheight=2ex
7493\gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth}
7494\def\GMPbox#1#2{%
7495  \vbox {%
7496    \hrule
7497    \hbox to 2\GMPboxwidth{%
7498      \GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}%
7499    \hrule}}
7500\GMPdisplay{%
7501\vbox{%
7502  \hbox to 2\GMPboxwidth {high \hfil low}
7503  \vskip 0.7ex
7504  \GMPbox{x_1}{x_0}
7505  \vskip 0.5ex
7506  \GMPbox{y_1}{y_0}
7507}}
7508@end tex
7509@ifnottex
7510@example
7511@group
7512 high              low
7513+----------+----------+
7514|    x1    |    x0    |
7515+----------+----------+
7516
7517+----------+----------+
7518|    y1    |    y0    |
7519+----------+----------+
7520@end group
7521@end example
7522@end ifnottex
7523
7524Let @math{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is
7525@math{k} limbs (@ms{y,0} the same) then
7526@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
7527With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the
7528following holds,
7529
7530@display
7531@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0,
7532  x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0}
7533@end display
7534
7535This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs,
7536whereas a basecase multiply of N@cross{}N limbs is equivalent to four
7537multiplies of (N/2)@cross{}(N/2).  The factors @math{(b^2+b)} etc represent
7538the positions where the three products must be added.
7539
7540@tex
7541\def\GMPboxA#1#2{%
7542  \vbox{%
7543    \hrule
7544    \hbox{%
7545      \GMPvrule
7546      \hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}%
7547      \vrule
7548      \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
7549      \vrule}
7550    \hrule}}
7551\def\GMPboxB#1#2{%
7552  \hbox{%
7553    \raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}%
7554    \vbox{%
7555      \hrule
7556      \hbox{%
7557        \GMPvrule
7558        \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
7559        \vrule}%
7560      \hrule}}}
7561\GMPdisplay{%
7562\vbox{%
7563  \hbox to 4\GMPboxwidth {high \hfil low}
7564  \vskip 0.7ex
7565  \GMPboxA{x_1y_1}{x_0y_0}
7566  \vskip 0.5ex
7567  \GMPboxB{$+$}{x_1y_1}
7568  \vskip 0.5ex
7569  \GMPboxB{$+$}{x_0y_0}
7570  \vskip 0.5ex
7571  \GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)}
7572}}
7573@end tex
7574@ifnottex
7575@example
7576@group
7577 high                              low
7578+--------+--------+ +--------+--------+
7579|      x1*y1      | |      x0*y0      |
7580+--------+--------+ +--------+--------+
7581          +--------+--------+
7582      add |      x1*y1      |
7583          +--------+--------+
7584          +--------+--------+
7585      add |      x0*y0      |
7586          +--------+--------+
7587          +--------+--------+
7588      sub | (x1-x0)*(y1-y0) |
7589          +--------+--------+
7590@end group
7591@end example
7592@end ifnottex
7593
7594The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an
7595absolute value, and the sign used to choose to add or subtract.  Notice the
7596sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1),
7597high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb
7598additions, rather than @m{6k,6*k}, but in GMP extra function call overheads
7599outweigh the saving.
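
A minimal Python sketch of the formula follows, with the middle product taken
on absolute values and its sign tracked separately.  It is not the GMP
implementation: the split is made at a bit position rather than on a limb
boundary, Python's own multiply stands in for the basecase, and the name
@code{karatsuba} and the @code{threshold_bits} parameter are assumptions of the
example.

@example
```python
def karatsuba(x, y, threshold_bits=64):
    """Multiply non-negative integers with the subtractive Karatsuba formula."""
    if x.bit_length() <= threshold_bits or y.bit_length() <= threshold_bits:
        return x * y                      # stands in for the basecase
    k = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << k) - 1
    x1, x0 = x >> k, x & mask             # x = x1*b + x0 with b = 2^k
    y1, y0 = y >> k, y & mask
    hi = karatsuba(x1, y1, threshold_bits)
    lo = karatsuba(x0, y0, threshold_bits)
    # Middle product from absolute values, with the sign tracked separately.
    dx, dy = x1 - x0, y1 - y0
    mid = karatsuba(abs(dx), abs(dy), threshold_bits)
    if (dx < 0) != (dy < 0):
        mid = -mid
    # x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0
    return (hi << (2 * k)) + ((hi + lo - mid) << k) + lo
```
@end example

Each level makes three recursive calls on operands of about half the size, as
in the diagram above.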
7600
7601Squaring is similar to multiplying, but with @math{x=y} the formula reduces to
7602an equivalent with three squares,
7603
7604@display
7605@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2,
7606   x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2}
7607@end display
7608
The final result is accumulated from those three squares the same way as for
the three multiplies above.  The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now
never negative.
7612
7613A similar formula for both multiplying and squaring can be constructed with a
7614middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}.  But those sums can exceed
7615@math{k} limbs, leading to more carry handling and additions than the form
7616above.
7617
7618Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm,
7619the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies
7620each @math{1/2} the size of the inputs.  This is a big improvement over the
7621basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra
7622additions Karatsuba performs.  @code{MUL_TOOM22_THRESHOLD} can be as little
7623as 10 limbs.  The @code{SQR} threshold is usually about twice the @code{MUL}.
7624
7625The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c,
7626M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN +
7627e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 +
7628{3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}.  The
7629factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the
7630basecase code will increase the threshold since they benefit @math{M(N)} more
7631than @math{K(N)}.  And conversely the @m{3\over2, 3/2} for @math{b} means
linear style speedups of @math{b} will reduce the threshold since they
7633benefit @math{K(N)} more than @math{M(N)}.  The latter can be seen for
7634instance when adding an optimized @code{mpn_sqr_diagonal} to
7635@code{mpn_sqr_basecase}.  Of course all speedups reduce total time, and in
7636that sense the algorithm thresholds are merely of academic interest.
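
The effect of each coefficient on the crossover can be tried numerically with
the two cost models.  The sketch below is only an illustration: the function
name @code{crossover} and the coefficient values used with it are made up for
the example and are not measurements of any CPU.

@example
```python
def crossover(a, b, c, d, e, limit=10000):
    """Smallest N at which one Karatsuba level beats the basecase model."""
    def m(n):
        return a * n * n + b * n + c          # M(N) = a*N^2 + b*N + c
    for n in range(2, limit):
        if 3 * m(n / 2) + d * n + e < m(n):   # K(N) = 3*M(N/2) + d*N + e
            return n
    return None
```
@end example

With the made-up values @math{a=1}, @math{b=4}, @math{c=10}, @math{d=6},
@math{e=20} the crossover is at @math{N=37}.  Lowering @math{a} to 0.8 moves it
up to 45, while lowering @math{b} to 2 moves it down to 33.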
7637
7638
7639@node Toom 3-Way Multiplication, Toom 4-Way Multiplication, Karatsuba Multiplication, Multiplication Algorithms
7640@subsection Toom 3-Way Multiplication
7641@cindex Toom multiplication
7642
7643The Karatsuba formula is the simplest case of a general approach to splitting
7644inputs that leads to both Toom and FFT algorithms.  A description of
7645Toom can be found in Knuth section 4.3.3, with an example 3-way
7646calculation after Theorem A@.  The 3-way form used in GMP is described here.
7647
7648The operands are each considered split into 3 pieces of equal length (or the
7649most significant part 1 or 2 limbs shorter than the other two).
7650
7651@tex
7652\def\GMPbox#1#2#3{%
7653  \vbox{%
7654    \hrule \vfil
7655    \hbox to 3\GMPboxwidth {%
7656      \GMPvrule
7657      \hfil$#1$\hfil
7658      \vrule
7659      \hfil$#2$\hfil
7660      \vrule
7661      \hfil$#3$\hfil
7662      \vrule}%
7663    \vfil \hrule
7664}}
7665\GMPdisplay{%
7666\vbox{%
7667  \hbox to 3\GMPboxwidth {high \hfil low}
7668  \vskip 0.7ex
7669  \GMPbox{x_2}{x_1}{x_0}
7670  \vskip 0.5ex
7671  \GMPbox{y_2}{y_1}{y_0}
7672  \vskip 0.5ex
7673}}
7674@end tex
7675@ifnottex
7676@example
7677@group
7678 high                         low
7679+----------+----------+----------+
7680|    x2    |    x1    |    x0    |
7681+----------+----------+----------+
7682
7683+----------+----------+----------+
7684|    y2    |    y1    |    y0    |
7685+----------+----------+----------+
7686@end group
7687@end example
7688@end ifnottex
7689
7690@noindent
7691These parts are treated as the coefficients of two polynomials
7692
7693@display
7694@group
7695@m{X(t) = x_2t^2 + x_1t + x_0,
7696   X(t) = x2*t^2 + x1*t + x0}
7697@m{Y(t) = y_2t^2 + y_1t + y_0,
7698   Y(t) = y2*t^2 + y1*t + y0}
7699@end group
7700@end display
7701
7702Let @math{b} equal the power of 2 which is the size of the @ms{x,0}, @ms{x,1},
7703@ms{y,0} and @ms{y,1} pieces, i.e.@: if they're @math{k} limbs each then
7704@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
7705With this @math{x=X(b)} and @math{y=Y(b)}.
7706
7707Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients
7708are
7709
7710@display
7711@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0,
7712   W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0}
7713@end display
7714
7715The @m{w_i,w[i]} are going to be determined, and when they are they'll give
7716the final result using @math{w=W(b)}, since
7717@m{xy=X(b)Y(b),x*y=X(b)*Y(b)=W(b)}.  The coefficients will be roughly
7718@math{b^2} each, and the final @math{W(b)} will be an addition like,
7719
7720@tex
7721\def\GMPbox#1#2{%
7722  \moveright #1\GMPboxwidth
7723  \vbox{%
7724    \hrule
7725    \hbox{%
7726      \GMPvrule
7727      \hbox to 2\GMPboxwidth {\hfil$#2$\hfil}%
7728      \vrule}%
7729    \hrule
7730}}
7731\GMPdisplay{%
7732\vbox{%
7733  \hbox to 6\GMPboxwidth {high \hfil low}%
7734  \vskip 0.7ex
7735  \GMPbox{0}{w_4}
7736  \vskip 0.5ex
7737  \GMPbox{1}{w_3}
7738  \vskip 0.5ex
7739  \GMPbox{2}{w_2}
7740  \vskip 0.5ex
7741  \GMPbox{3}{w_1}
7742  \vskip 0.5ex
7743  \GMPbox{4}{w_0}
7744}}
7745@end tex
7746@ifnottex
7747@example
7748@group
7749 high                                        low
7750+-------+-------+
7751|       w4      |
7752+-------+-------+
7753       +--------+-------+
7754       |        w3      |
7755       +--------+-------+
7756               +--------+-------+
7757               |        w2      |
7758               +--------+-------+
7759                       +--------+-------+
7760                       |        w1      |
7761                       +--------+-------+
7762                                +-------+-------+
7763                                |       w0      |
7764                                +-------+-------+
7765@end group
7766@end example
7767@end ifnottex
7768
7769The @m{w_i,w[i]} coefficients could be formed by a simple set of cross
7770products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2},
7771@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all
7772nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely
7773to a basecase multiply.  Instead the following approach is used.
7774
7775@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving
7776values of @math{W(t)} at those points.  In GMP the following points are used,
7777
7778@quotation
7779@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
7780@item Point                 @tab Value
7781@item @math{t=0}            @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately
7782@item @math{t=1}            @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0) * (y2+y1+y0)}
7783@item @math{t=-1}           @tab @m{(x_2-x_1+x_0)(y_2-y_1+y_0),(x2-x1+x0) * (y2-y1+y0)}
7784@item @math{t=2}            @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0) * (4*y2+2*y1+y0)}
7785@item @m{t=\infty,t=inf}    @tab @m{x_2y_2,x2 * y2}, which gives @ms{w,4} immediately
7786@end multitable
7787@end quotation
7788
7789At @math{t=-1} the values can be negative and that's handled using the
7790absolute values and tracking the sign separately.  At @m{t=\infty,t=inf} the
7791value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over t^4}, X(t)*Y(t)/t^4 in
7792the limit as t approaches infinity}, but it's much easier to think of as
7793simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately (much like
7794@m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately).
7795
7796Each of the points substituted into
7797@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination
7798of the @m{w_i,w[i]} coefficients, and the value of those combinations has just
7799been calculated.
7800
7801@tex
7802\GMPdisplay{%
7803$\matrix{%
7804W(0)      & = &       &   &      &   &      &   &      &   & w_0 \cr
7805W(1)      & = &   w_4 & + &  w_3 & + &  w_2 & + &  w_1 & + & w_0 \cr
7806W(-1)     & = &   w_4 & - &  w_3 & + &  w_2 & - &  w_1 & + & w_0 \cr
7807W(2)      & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr
7808W(\infty) & = &   w_4 \cr
7809}$}
7810@end tex
7811@ifnottex
7812@example
7813@group
7814W(0)   =                              w0
7815W(1)   =    w4 +   w3 +   w2 +   w1 + w0
7816W(-1)  =    w4 -   w3 +   w2 -   w1 + w0
7817W(2)   = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0
7818W(inf) =    w4
7819@end group
7820@end example
7821@end ifnottex
7822
7823This is a set of five equations in five unknowns, and some elementary linear
7824algebra quickly isolates each @m{w_i,w[i]}.  This involves adding or
7825subtracting one @math{W(t)} value from another, and a couple of divisions by
7826powers of 2 and one division by 3, the latter using the special
7827@code{mpn_divexact_by3} (@pxref{Exact Division}).
7828
7829The conversion of @math{W(t)} values to the coefficients is interpolation.  A
7830polynomial of degree 4 like @math{W(t)} is uniquely determined by values known
7831at 5 different points.  The points are arbitrary and can be chosen to make the
7832linear equations come out with a convenient set of steps for quickly isolating
7833the @m{w_i,w[i]}.
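
One level of the evaluate, multiply and interpolate procedure can be sketched
in Python as below.  This is an illustration under stated assumptions, not the
GMP code: pieces are @code{k} bits rather than whole limbs, signed Python
integers replace the separate sign tracking at @math{t=-1}, and the
interpolation order shown is one simple choice which is not claimed to be the
sequence used by @code{toom_interpolate_5pts}.

@example
```python
def toom3(x, y, k):
    """One level of Toom-3 on non-negative integers, pieces of k bits."""
    mask = (1 << k) - 1
    x0, x1, x2 = x & mask, (x >> k) & mask, x >> (2 * k)
    y0, y1, y2 = y & mask, (y >> k) & mask, y >> (2 * k)

    # Evaluate X(t) and Y(t) at the five points and multiply pointwise.
    r_0 = x0 * y0
    r_1 = (x2 + x1 + x0) * (y2 + y1 + y0)
    r_m1 = (x2 - x1 + x0) * (y2 - y1 + y0)
    r_2 = (4 * x2 + 2 * x1 + x0) * (4 * y2 + 2 * y1 + y0)
    r_inf = x2 * y2

    # Interpolate; every division below is exact.
    w0, w4 = r_0, r_inf
    even = (r_1 + r_m1) // 2                    # w4 + w2 + w0
    odd = (r_1 - r_m1) // 2                     # w3 + w1
    w2 = even - w4 - w0
    c = (r_2 - w0 - 16 * w4 - 4 * w2) // 2      # 4*w3 + w1
    w3 = (c - odd) // 3
    w1 = odd - w3

    # Recombine as W(b) with b = 2^k.
    return ((w4 << (4 * k)) + (w3 << (3 * k)) + (w2 << (2 * k))
            + (w1 << k) + w0)
```
@end example

The interpolation uses only additions, subtractions, divisions by 2 and a
single division by 3.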
7834
7835Squaring follows the same procedure as multiplication, but there's only one
7836@math{X(t)} and it's evaluated at the 5 points, and those values squared to
7837give values of @math{W(t)}.  The interpolation is then identical, and in fact
7838the same @code{toom_interpolate_5pts} subroutine is used for both squaring and
7839multiplying.
7840
7841Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being
7842@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the
7843original size each.  This is an improvement over Karatsuba at
7844@math{O(N^@W{1.585})}, though Toom does more work in the evaluation and
7845interpolation and so it only realizes its advantage above a certain size.
7846
7847Near the crossover between Toom-3 and Karatsuba there's generally a range of
7848sizes where the difference between the two is small.
7849@code{MUL_TOOM33_THRESHOLD} is a somewhat arbitrary point in that range and
7850successive runs of the tune program can give different values due to small
7851variations in measuring.  A graph of time versus size for the two shows the
7852effect, see @file{tune/README}.
7853
7854At the fairly small sizes where the Toom-3 thresholds occur it's worth
7855remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be
7856expected to make accurate predictions, due of course to the big influence of
7857all sorts of overheads, and the fact that only a few recursions of each are
7858being performed.  Even at large sizes there's a good chance machine dependent
7859effects like cache architecture will mean actual performance deviates from
7860what might be predicted.
7861
7862The formula given for the Karatsuba algorithm (@pxref{Karatsuba
7863Multiplication}) has an equivalent for Toom-3 involving only five multiplies,
7864but this would be complicated and unenlightening.
7865
7866An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using
7867a vector to represent the @math{x} and @math{y} splits and a matrix
7868multiplication for the evaluation and interpolation stages.  The matrix
7869inverses are not meant to be actually used, and they have elements with values
7870much greater than in fact arise in the interpolation steps.  The diagram shown
7871for the 3-way is attractive, but again doesn't have to be implemented that way
7872and for example with a bit of rearrangement just one division by 6 can be
7873done.
7874
7875
7876@node Toom 4-Way Multiplication, Higher degree Toom'n'half, Toom 3-Way Multiplication, Multiplication Algorithms
7877@subsection Toom 4-Way Multiplication
7878@cindex Toom multiplication
7879
7880Karatsuba and Toom-3 split the operands into 2 and 3 coefficients,
7881respectively.  Toom-4 analogously splits the operands into 4 coefficients.
7882Using the notation from the section on Toom-3 multiplication, we form two
7883polynomials:
7884
7885@display
7886@group
7887@m{X(t) = x_3t^3 + x_2t^2 + x_1t + x_0,
7888   X(t) = x3*t^3 + x2*t^2 + x1*t + x0}
7889@m{Y(t) = y_3t^3 + y_2t^2 + y_1t + y_0,
7890   Y(t) = y3*t^3 + y2*t^2 + y1*t + y0}
7891@end group
7892@end display
7893
7894@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 7 points, giving
7895values of @math{W(t)} at those points.  In GMP the following points are used,
7896
7897@quotation
7898@multitable {@m{t=-1/2,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
7899@item Point              @tab Value
7900@item @math{t=0}         @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately
7901@item @math{t=1/2}       @tab @m{(x_3+2x_2+4x_1+8x_0)(y_3+2y_2+4y_1+8y_0),(x3+2*x2+4*x1+8*x0) * (y3+2*y2+4*y1+8*y0)}
7902@item @math{t=-1/2}      @tab @m{(-x_3+2x_2-4x_1+8x_0)(-y_3+2y_2-4y_1+8y_0),(-x3+2*x2-4*x1+8*x0) * (-y3+2*y2-4*y1+8*y0)}
7903@item @math{t=1}         @tab @m{(x_3+x_2+x_1+x_0)(y_3+y_2+y_1+y_0),(x3+x2+x1+x0) * (y3+y2+y1+y0)}
7904@item @math{t=-1}        @tab @m{(-x_3+x_2-x_1+x_0)(-y_3+y_2-y_1+y_0),(-x3+x2-x1+x0) * (-y3+y2-y1+y0)}
7905@item @math{t=2}         @tab @m{(8x_3+4x_2+2x_1+x_0)(8y_3+4y_2+2y_1+y_0),(8*x3+4*x2+2*x1+x0) * (8*y3+4*y2+2*y1+y0)}
7906@item @m{t=\infty,t=inf} @tab @m{x_3y_3,x3 * y3}, which gives @ms{w,6} immediately
7907@end multitable
7908@end quotation
7909
The number of additions and subtractions for Toom-4 is much larger than for
Toom-3.  But several subexpressions occur multiple times, for example
@m{x_2+x_0,x2+x0} occurs for both @math{t=1} and @math{t=-1}.
7913
7914Toom-4 is asymptotically @math{O(N^@W{1.404})}, the exponent being
7915@m{\log7/\log4,log(7)/log(4)}, representing 7 recursive multiplies of 1/4 the
7916original size each.
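
The seven evaluations in the table, with shared subexpressions, can be
sketched in Python as below.  This is an illustration only: pieces are
@code{k} bits rather than limbs, signed Python integers are used throughout,
and the interpolation is a plain elimination worked out for this example, not
the sequence GMP uses.  The values at @m{t=\pm1/2,t=+-1/2} are the scaled
forms from the table, equal to 64 times @math{W(t)}.

@example
```python
def toom4(x, y, k):
    """One level of Toom-4 on non-negative integers, pieces of k bits."""
    mask = (1 << k) - 1
    x0, x1 = x & mask, (x >> k) & mask
    x2, x3 = (x >> (2 * k)) & mask, x >> (3 * k)
    y0, y1 = y & mask, (y >> k) & mask
    y2, y3 = (y >> (2 * k)) & mask, y >> (3 * k)

    # Subexpressions shared between a point and its negative.
    xe, xo = x2 + x0, x3 + x1                   # t = 1 and t = -1
    ye, yo = y2 + y0, y3 + y1
    xhe, xho = 2 * x2 + 8 * x0, x3 + 4 * x1     # t = 1/2 and t = -1/2
    yhe, yho = 2 * y2 + 8 * y0, y3 + 4 * y1

    r_0 = x0 * y0
    r_half = (xhe + xho) * (yhe + yho)          # 64 * W(1/2)
    r_mhalf = (xhe - xho) * (yhe - yho)         # 64 * W(-1/2)
    r_1 = (xe + xo) * (ye + yo)
    r_m1 = (xe - xo) * (ye - yo)
    r_2 = (8 * x3 + 4 * x2 + 2 * x1 + x0) * (8 * y3 + 4 * y2 + 2 * y1 + y0)
    r_inf = x3 * y3

    # Interpolation by plain elimination; every division is exact.
    w0, w6 = r_0, r_inf
    e1 = (r_1 + r_m1) // 2 - w6 - w0            # w4 + w2
    o1 = (r_1 - r_m1) // 2                      # w5 + w3 + w1
    e2 = ((r_half + r_mhalf) // 2 - w6 - 64 * w0) // 4   # w4 + 4*w2
    o2 = (r_half - r_mhalf) // 4                # w5 + 4*w3 + 16*w1
    w2 = (e2 - e1) // 3
    w4 = e1 - w2
    o3 = (r_2 - 64 * w6 - 16 * w4 - 4 * w2 - w0) // 2    # 16*w5 + 4*w3 + w1
    p = (o2 - o1) // 3                          # w3 + 5*w1
    q = (o3 - o1) // 3                          # 5*w5 + w3
    w1 = (q - 5 * o1 + 4 * p) // 15
    w3 = p - 5 * w1
    w5 = o1 - w3 - w1

    coefficients = [w0, w1, w2, w3, w4, w5, w6]
    return sum(w << (k * i) for i, w in enumerate(coefficients))
```
@end example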
7917
7918
7919@node Higher degree Toom'n'half, FFT Multiplication, Toom 4-Way Multiplication, Multiplication Algorithms
7920@subsection Higher degree Toom'n'half
7921@cindex Toom multiplication
7922
The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
@pxref{Toom 4-Way Multiplication}) generalize to a split into an arbitrary
number of pieces.  In general a split of two equally long operands into
@math{r} pieces leads to evaluations and pointwise multiplications done at
@m{2r-1,2*r-1} points.  To fully exploit symmetries it is better to have a
multiple of 4 points, which is why Toom'n'half is used for the higher degrees.

Toom'n'half means that one more piece is considered for a single operand.
That piece can be virtual, i.e.@: zero, or real, when the two operands are not
exactly balanced.  By choosing an even @math{r},
Toom-@m{r{1\over2},r+1/2} requires @math{2r} points, a multiple of four.

The groups of four points include 0, @m{\infty,inf}, +1, -1 and
@m{\pm2^i,+-2^i}, @m{\pm2^{-i},+-2^-i}.  Each of them gives shortcuts for the
evaluation phase and for some steps in the interpolation phase.  Further tricks
are used to reduce the memory footprint of the whole multiplication algorithm
to a memory buffer equal in size to the result of the product.
7940
7941Current GMP uses both Toom-6'n'half and Toom-8'n'half.
7942
7943
7944@node FFT Multiplication, Other Multiplication, Higher degree Toom'n'half, Multiplication Algorithms
7945@subsection FFT Multiplication
7946@cindex FFT multiplication
7947@cindex Fast Fourier Transform
7948
7949At large to very large sizes a Fermat style FFT multiplication is used,
7950following Sch@"onhage and Strassen (@pxref{References}).  Descriptions of FFTs
7951in various forms can be found in many textbooks, for instance Knuth section
79524.3.3 part C or Lipson chapter IX@.  A brief description of the form used in
7953GMP is given here.
7954
7955The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given
7956@math{N}.  A full product @m{xy,x*y} is obtained by choosing @m{N \ge
7957\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding
7958@math{x} and @math{y} with high zero limbs.  The modular product is the native
7959form for the algorithm, so padding to get a full product is unavoidable.
7960
The algorithm follows a split, evaluate, pointwise multiply, interpolate and
combine sequence similar to that described above for Karatsuba and Toom-3.  A
@math{k}
7963parameter controls the split, with an FFT-@math{k} splitting into @math{2^k}
7964pieces of @math{M=N/2^k} bits each.  @math{N} must be a multiple of
7965@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so
7966the split falls on limb boundaries, avoiding bit shifts in the split and
7967combine stages.
7968
7969The evaluations, pointwise multiplications, and interpolation, are all done
7970modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a
7971multiple of @math{2^k} and of @code{mp_bits_per_limb}.  The results of
7972interpolation will be the following negacyclic convolution of the input
7973pieces, and the choice of @math{N'} ensures these sums aren't truncated.
7974@tex
7975$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$
7976@end tex
7977@ifnottex
7978
7979@example
7980           ---
7981           \         b
7982w[n] =     /     (-1) * x[i] * y[j]
7983           ---
7984       i+j==b*2^k+n
7985          b=0,1
7986@end example
7987
7988@end ifnottex
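
The convolution itself can be checked with a direct Python sketch.  This forms
the sums by brute force, with no transform at all, so it only illustrates what
the FFT computes and not how GMP computes it.  The function name and
parameters are assumptions of the example, and it assumes @code{x} and
@code{y} are below @math{2^N} and that @code{n_bits} is a multiple of
@math{2^k}.

@example
```python
def negacyclic_product(x, y, n_bits, k):
    """x*y mod 2^n_bits+1 from a negacyclic convolution of 2^k pieces."""
    pieces = 1 << k
    m = n_bits // pieces                  # M = N / 2^k bits per piece
    mask = (1 << m) - 1
    xs = [(x >> (m * i)) & mask for i in range(pieces)]
    ys = [(y >> (m * j)) & mask for j in range(pieces)]
    w = [0] * pieces
    for i in range(pieces):
        for j in range(pieces):
            if i + j < pieces:            # b = 0, term is added
                w[i + j] += xs[i] * ys[j]
            else:                         # b = 1, term wraps with sign -1
                w[i + j - pieces] -= xs[i] * ys[j]
    # Add the sums at their offsets and reduce.
    total = sum(w[n] << (m * n) for n in range(pieces))
    return total % ((1 << n_bits) + 1)
```
@end example

The wrapped terms are subtracted because a piece at offset @math{2^k} or more
carries a factor of @math{2^N}, which is @math{-1} mod @math{2^N+1}.
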
7989The points used for the evaluation are @math{g^i} for @math{i=0} to
7990@math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}.  @math{g} is a
7991@m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary
7992cancellations at the interpolation stage, and it's also a power of 2 so the
7993fast Fourier transforms used for the evaluation and interpolation do only
7994shifts, adds and negations.
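
The parameters can be worked through for a concrete size with a short Python
sketch.  The function names and the default of 64 bits per limb are
assumptions of the example; GMP's own parameter selection is not reproduced
here.

@example
```python
from math import gcd

def fft_parameters(n_bits, k, limb_bits=64):
    """Return (M, N', g) for an FFT-k computing a product mod 2^n_bits+1."""
    pieces = 1 << k
    m = n_bits // pieces                          # M = N / 2^k
    step = pieces * limb_bits // gcd(pieces, limb_bits)
    n_prime = -(-(2 * m + k + 3) // step) * step  # round up to a multiple
    g = 1 << (2 * n_prime // pieces)              # g = 2^(2N'/2^k)
    return m, n_prime, g

def is_primitive_root(g, k, n_prime):
    """Check that g has order exactly 2^k modulo 2^n_prime+1."""
    modulus = (1 << n_prime) + 1
    return (pow(g, 1 << k, modulus) == 1
            and pow(g, 1 << (k - 1), modulus) == modulus - 1)
```
@end example

For @math{N=4096} and @math{k=4} this gives @math{M=256}, @math{N'=576} and
@math{g=2^72}, and @code{is_primitive_root} confirms that @math{g} raised to
the power @math{2^k} is 1 while @math{g} raised to half that power is
@math{-1}.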
7995
7996The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either
7997recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or
7998basecase), whichever is optimal at the size @math{N'}.  The interpolation is
7999an inverse fast Fourier transform.  The resulting set of sums of @m{x_iy_j,
8000x[i]*y[j]} are added at appropriate offsets to give the final result.
8001
8002Squaring is the same, but @math{x} is the only input so it's one transform at
8003the evaluate stage and the pointwise multiplies are squares.  The
8004interpolation is the same.
8005
8006For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}),
8007O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed
8008modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original.
8009Each successive @math{k} is an asymptotic improvement, but overheads mean each
8010is only faster at bigger and bigger sizes.  In the code, @code{MUL_FFT_TABLE}
8011and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used.  Each
8012new @math{k} effectively swaps some multiplying for some shifts, adds and
8013overheads.
8014
8015A mod @math{2^N+1} product can be formed with a normal
8016@math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT
8017and Toom-3 etc can be compared directly.  A @math{k=4} FFT at
8018@math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at
8019@math{O(N^@W{1.465})}.  In practice this is what's found, with
8020@code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between
8021300 and 1000 limbs, depending on the CPU@.  So far it's been found that only
8022very large FFTs recurse into pointwise multiplies above these sizes.
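
The ``normal multiply plus a subtraction'' can be sketched as follows.  This
is an illustration in Python, with the hypothetical name
@code{mulmod_fermat}, and is not how GMP's limb-level code is organised.  It
relies only on the fact that @math{2^N} is congruent to @math{-1}.

```python
def mulmod_fermat(x, y, n):
    """Return x*y mod 2^n+1 for 0 <= x, y <= 2^n."""
    modulus = (1 << n) + 1
    product = x * y                      # ordinary full product
    low = product & ((1 << n) - 1)       # low n bits
    high = product >> n                  # everything above the low n bits
    # 2^n is congruent to -1, so high*2^n + low is congruent to low - high.
    result = low - high
    if result < 0:
        result += modulus
    return result
```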
8023
8024When an FFT is to give a full product, the change of @math{N} to @math{2N}
8025doesn't alter the theoretical complexity for a given @math{k}, but for the
8026purposes of considering where an FFT might be first used it can be assumed
8027that the FFT is recursing into a normal multiply and that on that basis it's
8028doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of
8029the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}.  This would mean
8030@math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3.
8031In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been
8032found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs.
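
The two estimates of the first useful @math{k} can be reproduced from the
exponents alone.  The Python sketch below is illustrative; the function name
and its @code{shrink} parameter are assumptions made for this example.  It
compares @math{k/(k-1)} and @math{k/(k-2)} against the Toom-3 exponent
@math{log(5)/log(3)}.

```python
import math

TOOM3_EXPONENT = math.log(5) / math.log(3)   # about 1.465


def first_fft_beating_toom3(shrink):
    """Smallest k whose exponent k/(k-shrink) is below the Toom-3 exponent.

    shrink=1 models the mod 2^N+1 product, shrink=2 the full product.
    """
    k = shrink + 1
    while k / (k - shrink) >= TOOM3_EXPONENT:
        k += 1
    return k


print(first_fft_beating_toom3(1))   # 4
print(first_fft_beating_toom3(2))   # 7
```

This considers asymptotic exponents only, and so says nothing about the
overheads that push the measured thresholds higher.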
8033
8034The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is
8035rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that
8036when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a
8037multiple of @m{2^{2k-1},2^(2k-1)} bits.  The @math{+k+3} means some values of
8038@math{N} just under such a multiple will be rounded to the next.  The
8039complexity calculations above assume that a favourable size is used, meaning
8040one which isn't padded through rounding, and it's also assumed that the extra
8041@math{+k+3} bits are negligible at typical FFT sizes.
8042
8043The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a
8044step-effect into measured speeds.  For example @math{k=8} will round @math{N}
8045up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb
8046groups of sizes for which @code{mpn_mul_n} runs at the same speed.  Or for
8047@math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc.  In
8048practice it's been found each @math{k} is used at quite small multiples of its
8049size constraint and so the step effect is quite noticeable in a time versus
8050size graph.
8051
8052The threshold determinations currently measure at the mid-points of size
8053steps, but this is sub-optimal since at the start of a new step it can happen
8054that it's better to go back to the previous @math{k} for a while.  Something
8055more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be
8056needed.
8057
8058
8059@node Other Multiplication, Unbalanced Multiplication, FFT Multiplication, Multiplication Algorithms
8060@subsection Other Multiplication
8061@cindex Toom multiplication
8062
8063The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary
8065number of pieces, as per Knuth section 4.3.3 algorithm C@.  This is not
8066currently used.  The notes here are merely for interest.
8067
8068In general a split into @math{r+1} pieces is made, and evaluations and
8069pointwise multiplications done at @m{2r+1,2*r+1} points.  A 4-way split does 7
8070pointwise multiplies, 5-way does 9, etc.  Asymptotically an @math{(r+1)}-way
8071algorithm is @m{O(N^{log(2r+1)/log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}.  Only
8072the pointwise multiplications count towards big-@math{O} complexity, but the
8073time spent in the evaluate and interpolate stages grows with @math{r} and has
8074a significant practical impact, with the asymptotic advantage of each @math{r}
8075realized only at bigger and bigger sizes.  The overheads grow as
8076@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log
8077r), O(N*log(r))}.
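
As a small illustration, the exponents for the first few splits can be
tabulated.  The function names in this Python sketch are hypothetical.

```python
import math


def toom_exponent(r):
    """Exponent of an (r+1)-way Toom split: log(2r+1)/log(r+1)."""
    return math.log(2 * r + 1) / math.log(r + 1)


def toom_pointwise_multiplies(r):
    """Number of evaluation points, and hence pointwise multiplies."""
    return 2 * r + 1


for r in range(1, 6):
    print(r + 1, toom_pointwise_multiplies(r), round(toom_exponent(r), 3))
```

The exponent falls only slowly as @math{r} grows, which is consistent with the
remark above that each further split pays off only at larger sizes.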
8078
8079Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
8080uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small
8081multiplies in the evaluate stage (or rather trades them for additions), and
8082has a further saving of nearly half the interpolate steps.  The idea is to
8083separate odd and even final coefficients and then perform algorithm C steps C7
8084and C8 on them separately.  The divisors at step C7 become @math{j^2} and the
8085multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.
8086
8087Splitting odd and even parts through positive and negative points can be
8088thought of as using @math{-1} as a square root of unity.  If a 4th root of
8089unity was available then a further split and speedup would be possible, but no
8090such root exists for plain integers.  Going to complex integers with
8091@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian
8092form it takes three real multiplies to do a complex multiply.  The existence
8093of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
8094Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.
8095
8096Floating point FFTs use complex numbers approximating Nth roots of unity.
8097Some processors have special support for such FFTs.  But these are not used in
8098GMP since it's very difficult to guarantee an exact result (to some number of
8099bits).  An occasional difference of 1 in the last bit might not matter to a
8100typical signal processing algorithm, but is of course of vital importance to
8101GMP.
8102
8103
8104@node Unbalanced Multiplication,  , Other Multiplication, Multiplication Algorithms
8105@subsection Unbalanced Multiplication
8106@cindex Unbalanced multiplication
8107
Multiplication of operands with different sizes, both below
@code{MUL_TOOM22_THRESHOLD}, is done with plain schoolbook multiplication
(@pxref{Basecase Multiplication}).
8111
8112For really large operands, we invoke FFT directly.
8113
8114For operands between these sizes, we use Toom inspired algorithms suggested by
8115Alberto Zanoni and Marco Bodrato.  The idea is to split the operands into
8116polynomials of different degree.  GMP currently splits the smaller operand
into 2 coefficients, i.e., a polynomial of degree 1, but the larger operand
8118can be split into 2, 3, or 4 coefficients, i.e., a polynomial of degree 1 to
81193.
8120
8121@c FIXME: This is mighty ugly, but a cleaner @need triggers texinfo bugs that
8122@c screws up layout here and there in the rest of the manual.
8123@c @tex
8124@c \goodbreak
8125@c @end tex
8126@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
8127@section Division Algorithms
8128@cindex Division algorithms
8129
8130@menu
8131* Single Limb Division::
8132* Basecase Division::
8133* Divide and Conquer Division::
8134* Block-Wise Barrett Division::
8135* Exact Division::
8136* Exact Remainder::
8137* Small Quotient Division::
8138@end menu
8139
8140
8141@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
8142@subsection Single Limb Division
8143
8144N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
8145high to low, either with a hardware divide instruction or a multiplication by
8146inverse, whichever is best on a given CPU.
8147
8148The multiply by inverse follows ``Improved division by invariant integers'' by
8149M@"oller and Granlund (@pxref{References}) and is implemented as
8150@code{udiv_qrnnd_preinv} in @file{gmp-impl.h}.  The idea is to have a
8151fixed-point approximation to @math{1/d} (see @code{invert_limb}) and then
8152multiply by the high limb (plus one bit) of the dividend to get a quotient
8153@math{q}.  With @math{d} normalized (high bit set), @math{q} is no more than 1
8154too small.  Subtracting @m{qd,q*d} from the dividend gives a remainder, and
8155reveals whether @math{q} or @math{q-1} is correct.
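
The following Python sketch shows one way a 2@cross{}1 division by a
precomputed inverse can be arranged.  It follows the 2/1 algorithm of the
M@"oller and Granlund paper as understood here, and is not a transcription of
the @code{udiv_qrnnd_preinv} macro.  The Python function names and the
@code{bits} parameter are assumptions for this example, and the adjustment
steps are those of this sketch rather than a statement about GMP's code.

```python
def invert_limb(d, bits=64):
    """Reciprocal v = floor((B^2 - 1) / d) - B for a normalized limb d."""
    base = 1 << bits
    assert d >> (bits - 1) == 1, "d must have its high bit set"
    return (base * base - 1) // d - base


def div_2by1_preinv(u1, u0, d, v, bits=64):
    """Divide u1*B + u0 by d, given u1 < d and v from invert_limb."""
    mask = (1 << bits) - 1
    q = v * u1 + (u1 << bits) + u0       # two-limb candidate, cannot overflow
    q1, q0 = q >> bits, q & mask
    q1 = (q1 + 1) & mask
    r = (u0 - q1 * d) & mask             # remainder candidate, one limb
    if r > q0:                           # candidate quotient was one too big
        q1 = (q1 - 1) & mask
        r = (r + d) & mask
    if r >= d:                           # rare final adjustment
        q1 += 1
        r -= d
    return q1, r
```

The two multiplications are @code{v * u1} and @code{q1 * d}; everything else
is additions, subtractions and comparisons.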
8156
8157The result is a division done with two multiplications and four or five
8158arithmetic operations.  On CPUs with low latency multipliers this can be much
8159faster than a hardware divide, though the cost of calculating the inverse at
8160the start may mean it's only better on inputs bigger than say 4 or 5 limbs.
8161
8162When a divisor must be normalized, either for the generic C
8163@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is
8164actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and
8165@math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set.
8166The bit shifts for the dividend are usually accomplished ``on the fly''
8167meaning by extracting the appropriate bits at each step.  Done this way the
8168quotient limbs come out aligned ready to store.  When only the remainder is
8169wanted, an alternative is to take the dividend limbs unshifted and calculate
8170@m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k
8171\bmod d2^k, r*2^k mod d*2^k}.  This can help on CPUs with poor bit shifts or
8172few registers.
8173
8174The multiply by inverse can be done two limbs at a time.  The calculation is
8175basically the same, but the inverse is two limbs and the divisor treated as if
8176padded with a low zero limb.  This means more work, since the inverse will
8177need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are
8178independent and can therefore be done partly or wholly in parallel.  Likewise
8179for a 2@cross{}1 calculating @m{qd,q*d}.  The net effect is to process two
8180limbs with roughly the same two multiplies worth of latency that one limb at a
8181time gives.  This extends to 3 or 4 limbs at a time, though the extra work to
8182apply the inverse will almost certainly soon reach the limits of multiplier
8183throughput.
8184
8185A similar approach in reverse can be taken to process just half a limb at a
8186time if the divisor is only a half limb.  In this case the 1@cross{}1 multiply
8187for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each
8188limb, which can be a saving on CPUs with a fast half limb multiply, or in fact
8189if the only multiply is a half limb, and especially if it's not pipelined.
8190
8191
8192@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms
8193@subsection Basecase Division
8194
8195Basecase N@cross{}M division is like long division done by hand, but in base
8196@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}.  See Knuth
8197section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}.
8198
8199Briefly stated, while the dividend remains larger than the divisor, a high
quotient limb is formed and the M@cross{}1 product @m{qd,q*d} subtracted at
8201the top end of the dividend.  With a normalized divisor (most significant bit
8202set), each quotient limb can be formed with a 2@cross{}1 division and a
82031@cross{}1 multiplication plus some subtractions.  The 2@cross{}1 division is
8204by the high limb of the divisor and is done either with a hardware divide or a
8205multiply by inverse (the same as in @ref{Single Limb Division}) whichever is
8206faster.  Such a quotient is sometimes one too big, requiring an addback of the
8207divisor, but that happens rarely.
8208
8209With Q=N@minus{}M being the number of quotient limbs, this is an
8210@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase
8211Q@cross{}M multiplication, differing in fact only in the extra multiply and
8212divide for each of the Q quotient limbs.
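
The procedure can be sketched in Python as below.  This is a simplified
illustration and not the GMP code: it keeps the operands as Python integers
rather than limb arrays, it forms a quotient limb for every dividend limb, and
the name @code{basecase_divrem} and the @code{bits} parameter are hypothetical.
The quotient limb estimate and its bound follow Knuth section 4.3.1.

```python
def basecase_divrem(a, d, bits=32):
    """Schoolbook division of a by d in base 2^bits; d must be normalized."""
    base = 1 << bits
    m = (d.bit_length() + bits - 1) // bits          # divisor limbs
    assert d > 0 and d.bit_length() == m * bits, "high bit of d must be set"
    d_high = d >> ((m - 1) * bits)                   # top divisor limb
    n = max((a.bit_length() + bits - 1) // bits, 1)  # dividend limbs
    q, r = 0, 0
    for i in reversed(range(n)):
        # Bring down the next dividend limb.
        r = (r << bits) | ((a >> (i * bits)) & (base - 1))
        # 2x1 division of the top two limbs of r by the top divisor limb.
        q_limb = min((r >> ((m - 1) * bits)) // d_high, base - 1)
        r -= q_limb * d                              # one limb times divisor
        while r < 0:                                 # addback if too big
            q_limb -= 1
            r += d
        q = (q << bits) | q_limb
    return q, r
```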
8213
8214
8215@node Divide and Conquer Division, Block-Wise Barrett Division, Basecase Division, Division Algorithms
8216@subsection Divide and Conquer Division
8217
8218For divisors larger than @code{DC_DIV_QR_THRESHOLD}, division is done by dividing.
8219Or to be precise by a recursive divide and conquer algorithm based on work by
8220Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}).
8221
8222The algorithm consists essentially of recognising that a 2N@cross{}N division
8223can be done with the basecase division algorithm (@pxref{Basecase Division}),
8224but using N/2 limbs as a base, not just a single limb.  This way the
8225multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of
8226Karatsuba and higher multiplication algorithms (@pxref{Multiplication
8227Algorithms}).  The two ``digits'' of the quotient are formed by recursive
8228N@cross{}(N/2) divisions.
8229
8230If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication
8231then the work is about the same as a basecase division, but with more function
8232call overheads and with some subtractions separated from the multiplies.
8233These overheads mean that it's only when N/2 is above
8234@code{MUL_TOOM22_THRESHOLD} that divide and conquer is of use.
8235
8236@code{DC_DIV_QR_THRESHOLD} is based on the divisor size N, so it will be somewhere
8237above twice @code{MUL_TOOM22_THRESHOLD}, but how much above depends on the
8238CPU@.  An optimized @code{mpn_mul_basecase} can lower @code{DC_DIV_QR_THRESHOLD} a
8239little by offering a ready-made advantage over repeated @code{mpn_submul_1}
8240calls.
8241
8242Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where
8243@math{M(N)} is the time for an N@cross{}N multiplication done with FFTs.  The
8244actual time is a sum over multiplications of the recursed sizes, as can be
8245seen near the end of section 2.2 of Burnikel and Ziegler.  For example, within
8246the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}.  With higher
8247algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log
8248N, log(N)}.  In practice, at moderate to large sizes, a 2N@cross{}N division
8249is about 2 to 4 times slower than an N@cross{}N multiplication.
8250
8251
8252@node Block-Wise Barrett Division, Exact Division, Divide and Conquer Division, Division Algorithms
8253@subsection Block-Wise Barrett Division
8254
8255For the largest divisions, a block-wise Barrett division algorithm is used.
8256Here, the divisor is inverted to a precision determined by the relative size of
8257the dividend and divisor.  Blocks of quotient limbs are then generated by
8258multiplying blocks from the dividend by the inverse.
8259
8260Our block-wise algorithm computes a smaller inverse than in the plain Barrett
8261algorithm.  For a @math{2n/n} division, the inverse will be just @m{\lceil n/2
8262\rceil, ceil(n/2)} limbs.
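
The general idea of generating the quotient in blocks from a precomputed
inverse can be sketched as follows.  This Python illustration makes several
simplifying assumptions: it works on bits rather than limbs, it uses a full
precision inverse rather than the shorter one described above, and the
function names are hypothetical.

```python
def barrett_step(chunk, d, mu, n):
    """Divide chunk < d*2^n by d using mu = floor(2^(2n) / d)."""
    q = (chunk * mu) >> (2 * n)      # estimate, never too big
    r = chunk - q * d
    while r >= d:                    # small correction
        q += 1
        r -= d
    return q, r


def blockwise_barrett_divrem(a, d):
    """Divide a by d, producing the quotient in n-bit blocks, high to low."""
    n = d.bit_length()
    mu = (1 << (2 * n)) // d         # the only true division performed
    blocks = max((a.bit_length() + n - 1) // n, 1)
    q, r = 0, 0
    for i in reversed(range(blocks)):
        block = (a >> (i * n)) & ((1 << n) - 1)
        q_block, r = barrett_step((r << n) | block, d, mu, n)
        q = (q << n) | q_block
    return q, r
```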
8263
8264
8265@node Exact Division, Exact Remainder, Block-Wise Barrett Division, Division Algorithms
8266@subsection Exact Division
8267
8268
8269A so-called exact division is when the dividend is known to be an exact
8270multiple of the divisor.  Jebelean's exact division algorithm uses this
8271knowledge to make some significant optimizations (@pxref{References}).
8272
8273The idea can be illustrated in decimal for example with 368154 divided by
8274543.  Because the low digit of the dividend is 4, the low digit of the
8275quotient must be 8.  This is arrived at from @m{4 \mathord{\times} 7 \bmod 10,
82764*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of
8277the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7
8278@equiv{} 1 mod 10}.  So @m{8\mathord{\times}543 = 4344,8*543=4344} can be
8279subtracted from the dividend leaving 363810.  Notice the low digit has become
8280zero.
8281
8282The procedure is repeated at the second digit, with the next quotient digit 7
(@m{1 \mathord{\times} 7 \bmod 10, 1*7 mod 10}), subtracting
8284@m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800.  And finally at
8285the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7
8286mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0.
8287So the quotient is 678.
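
The decimal example can be reproduced with a short Python sketch.  The
function name is hypothetical, and for simplicity the sketch does a full
subtraction at each step.  It needs Python 3.8 or later for the modular
inverse form of @code{pow}.

```python
def divexact_low_to_high(a, d, base=10):
    """Exact division a/d from the low digit up; d must be coprime to base."""
    if a % d != 0:
        # Checked here only to keep the sketch from looping forever.
        raise ValueError("a is not an exact multiple of d")
    inverse = pow(d % base, -1, base)     # 7 for d = 543, base = 10
    quotient, place = 0, 1
    trace = []
    while a != 0:
        digit = ((a // place) % base) * inverse % base
        a -= digit * d * place            # clears the digit at this place
        quotient += digit * place
        place *= base
        trace.append(a)
    return quotient, trace


print(divexact_low_to_high(368154, 543))   # (678, [363810, 325800, 0])
```

The same code works in base @math{2^32} or @math{2^64} for an odd divisor.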
8288
8289Notice however that the multiplies and subtractions don't need to extend past
8290the low three digits of the dividend, since that's enough to determine the
8291three quotient digits.  For the last quotient digit no subtraction is needed
8292at all.  On a 2N@cross{}N division like this one, only about half the work of
8293a normal basecase division is necessary.
8294
8295For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
8296saving over a normal basecase division is in two parts.  Firstly, each of the
8297Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
8298multiply.  Secondly, the crossproducts are reduced when @math{Q>M} to
8299@m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2,
8300Q*(Q-1)/2}.  Notice the savings are complementary.  If Q is big then many
8301divisions are saved, or if Q is small then the crossproducts reduce to a small
8302number.
8303
8304The modular inverse used is calculated efficiently by @code{binvert_limb} in
8305@file{gmp-impl.h}.  This does four multiplies for a 32-bit limb, or six for a
830664-bit limb.  @file{tune/modlinv.c} has some alternate implementations that
8307might suit processors better at bit twiddling than multiplying.
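
A modular inverse of this kind can be found by Newton iteration, where each
step costs two multiplies and doubles the number of correct low bits.  The
Python sketch below is an illustration and not the @code{binvert_limb} macro
itself.  It starts from only 3 correct bits, whereas the multiply counts
quoted above imply a more precise starting value.

```python
def binvert_limb(d, bits=64):
    """Inverse of odd d modulo 2^bits by Newton iteration."""
    if d % 2 == 0:
        raise ValueError("d must be odd")
    mask = (1 << bits) - 1
    inv = d & mask        # d*d = 1 mod 8, so d is its own inverse to 3 bits
    good_bits = 3
    while good_bits < bits:
        inv = (inv * (2 - d * inv)) & mask   # two multiplies per step
        good_bits *= 2
    return inv


print(hex(binvert_limb(3, 32)))   # 0xaaaaaaab
```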
8308
8309The sub-quadratic exact division described by Jebelean in ``Exact Division
8310with Karatsuba Complexity'' is not currently implemented.  It uses a
8311rearrangement similar to the divide and conquer for normal division
8312(@pxref{Divide and Conquer Division}), but operating from low to high.  A
8313further possibility not currently implemented is ``Bidirectional Exact Integer
8314Division'' by Krandick and Jebelean which forms quotient limbs from both the
8315high and low ends of the dividend, and can halve once more the number of
8316crossproducts needed in a 2N@cross{}N division.
8317
8318A special case exact division by 3 exists in @code{mpn_divexact_by3},
8319supporting Toom-3 multiplication and @code{mpq} canonicalizations.  It forms
8320quotient digits with a multiply by the modular inverse of 3 (which is
8321@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next
8322limb.  The multiplications don't need to be on the dependent chain, as long as
8323the effect of the borrows is applied, which can help chips with pipelined
8324multipliers.
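
The scheme can be sketched in Python as follows.  This is an illustration
working on a little-endian list of limbs, not the @code{mpn_divexact_by3}
source; the helper names are hypothetical and an even limb width is assumed.

```python
def to_limbs(x, count, bits=32):
    """Split x into count little-endian limbs."""
    return [(x >> (i * bits)) & ((1 << bits) - 1) for i in range(count)]


def from_limbs(limbs, bits=32):
    """Combine little-endian limbs back into an integer."""
    return sum(limb << (i * bits) for i, limb in enumerate(limbs))


def divexact_by3(limbs, bits=32):
    """Divide a limb list by 3; returns (quotient limbs, final carry)."""
    base = 1 << bits
    mask = base - 1
    inverse_3 = ((base << 1) + 1) // 3    # 0xAA..AAB
    one_third = base // 3 + 1             # smallest l with 3*l >= base
    two_thirds = (2 * base) // 3 + 1      # smallest l with 3*l >= 2*base
    carry = 0
    quotient = []
    for s in limbs:
        borrow = 1 if s < carry else 0
        l = ((s - carry) & mask) * inverse_3 & mask
        quotient.append(l)
        # Two comparisons give the high part of 3*l without a multiply.
        carry = borrow + (l >= one_third) + (l >= two_thirds)
    return quotient, carry
```

In this sketch the final carry is zero exactly when the input is a multiple
of 3, and otherwise the returned limbs are not the true quotient.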
8325
8326
8327@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms
8328@subsection Exact Remainder
8329@cindex Exact remainder
8330
8331If the exact division algorithm is done with a full subtraction at each stage
8332and the dividend isn't a multiple of the divisor, then low zero limbs are
8333produced but with a remainder in the high limbs.  For dividend @math{a},
8334divisor @math{d}, quotient @math{q}, and @m{b = 2
8335\GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this remainder
8336@math{r} is of the form
8337@tex
8338$$ a = qd + r b^n $$
8339@end tex
8340@ifnottex
8341
8342@example
8343a = q*d + r*b^n
8344@end example
8345
8346@end ifnottex
8347@math{n} represents the number of zero limbs produced by the subtractions,
8348that being the number of limbs produced for @math{q}.  @math{r} will be in the
8349range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by
8350a factor of @math{b^n}.
8351
8352Carrying out full subtractions at each stage means the same number of cross
products must be done as in a normal division, but there are still some single limb
8354divisions saved.  When @math{d} is a single limb some simplifications arise,
8355providing good speedups on a number of processors.
8356
8357The functions @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the
8358internal @code{mpn_redc_X} functions differ subtly in how they return @math{r},
8359leading to some negations in the above formula, but all are essentially the
8360same.
8361
8362@cindex Divisibility algorithm
8363@cindex Congruence algorithm
8364Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this
8365leads to divisibility or congruence tests which are potentially more efficient
8366than a normal division.
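
The relation and the resulting divisibility test can be sketched in Python.
The function names are hypothetical, and the sign of @math{r} is a convention
of this sketch: with @math{q} taken in the range @math{0@le{}q<b^n} and
@math{a<b^n}, the value of @math{r} here is zero or negative.  Python 3.8 or
later is needed for the modular inverse form of @code{pow}.

```python
def hensel_divrem(a, d, n, bits=32):
    """Return (q, r) with a == q*d + r*b^n, b = 2^bits, d odd, 0 <= q < b^n."""
    modulus = 1 << (n * bits)                  # b^n
    q = a * pow(d, -1, modulus) % modulus      # low-to-high quotient
    r, low = divmod(a - q * d, modulus)
    assert low == 0                            # the n low limbs cancel
    return q, r


def is_multiple(a, d, bits=32):
    """Test whether odd d divides a >= 0 without an ordinary division by d."""
    n = max((a.bit_length() + bits - 1) // bits, 1)   # so that a < b^n
    _, r = hensel_divrem(a, d, n, bits)
    return r == 0      # here -d < r <= 0, so r is zero exactly when d | a
```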
8367
8368The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is
8369odd, hence the use of @code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and
8370@code{mpz_kronecker_ui} etc (@pxref{Greatest Common Divisor Algorithms}).
8371
8372Montgomery's REDC method for modular multiplications uses operands of the form
8373of @m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n})
8374(yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in the exact
8375remainder to reach a product in the same form @m{(xy)b^{-n}, (x*y)*b^-n}
8376(@pxref{Modular Powering Algorithm}).
8377
8378Notice that @math{r} generally gives no useful information about the ordinary
8379remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything.  If
8380however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the
8381ordinary remainder.  This occurs whenever @math{d} is a factor of
8382@math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}.  For a 32 or
838364 bit limb other such factors include 5, 17 and 257, but no particular use
8384has been found for this.
8385
8386
8387@node Small Quotient Division,  , Exact Remainder, Division Algorithms
8388@subsection Small Quotient Division
8389
8390An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is
8391small can be optimized somewhat.
8392
8393An ordinary basecase division normalizes the divisor by shifting it to make
8394the high bit set, shifting the dividend accordingly, and shifting the
8395remainder back down at the end of the calculation.  This is wasteful if only a
8396few quotient limbs are to be formed.  Instead a division of just the top
8397@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be
8398used to form a trial quotient.  This requires only those limbs normalized, not
8399the whole of the divisor and dividend.
8400
8401A multiply and subtract then applies the trial quotient to the M@minus{}Q
8402unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q
8403limbs remaining from the trial quotient division).  The starting trial
8404quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1
8405too big are detected by first comparing the most significant limbs that will
8406arise from the subtraction.  An addback is done if the quotient still turns
8407out to be 1 too big.
8408
8409This whole procedure is essentially the same as one step of the basecase
8410algorithm done in a Q limb base, though with the trial quotient test done only
8411with the high limbs, not an entire Q limb ``digit'' product.  The correctness
8412of this weaker test can be established by following the argument of Knuth
8413section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r
8414+ u_2, v2*q>b*r+u2} condition appropriately relaxed.
8415
8416
8417@need 1000
8418@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
8419@section Greatest Common Divisor
8420@cindex Greatest common divisor algorithms
8421@cindex GCD algorithms
8422
8423@menu
8424* Binary GCD::
8425* Lehmer's Algorithm::
8426* Subquadratic GCD::
8427* Extended GCD::
8428* Jacobi Symbol::
8429@end menu
8430
8431
8432@node Binary GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
8433@subsection Binary GCD
8434
8435At small sizes GMP uses an @math{O(N^2)} binary style GCD@.  This is described
8436in many textbooks, for example Knuth section 4.5.2 algorithm B@.  It simply
8437consists of successively reducing odd operands @math{a} and @math{b} using
8438
8439@quotation
8440@math{a,b = @abs{}(a-b),@min{}(a,b)} @*
8441strip factors of 2 from @math{a}
8442@end quotation
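
In Python the reduction above might be sketched as follows, with hypothetical
function names, for odd positive operands.

```python
def strip_twos(x):
    """Remove all factors of 2 from a non-zero integer."""
    while x % 2 == 0:
        x //= 2
    return x


def binary_gcd_odd(a, b):
    """GCD of two odd positive integers by the subtract and shift reduction."""
    while a != b:
        # The difference of two distinct odd numbers is even and non-zero.
        a, b = abs(a - b), min(a, b)
        a = strip_twos(a)
    return a


print(binary_gcd_odd(21, 35))   # 7
```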
8443
8444The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly
8445computes the quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and replaces
@math{a,b} by @math{b, a - q b}. The binary algorithm has so far been found to
8447be faster than the Euclidean algorithm everywhere.  One reason the binary
8448method does well is that the implied quotient at each step is usually small,
8449so often only one or two subtractions are needed to get the same effect as a
8450division.  Quotients 1, 2 and 3 for example occur 67.7% of the time, see Knuth
8451section 4.5.3 Theorem E.
8452
8453When the implied quotient is large, meaning @math{b} is much smaller than
8454@math{a}, then a division is worthwhile.  This is the basis for the initial
8455@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
8456for both N@cross{}1 and 1@cross{}1 cases).  But after that initial reduction,
8457big quotients occur too rarely to make it worth checking for them.
8458
8459@sp 1
8460The final @math{1@cross{}1} GCD in @code{mpn_gcd_1} is done in the generic C
8461code as described above.  For two N-bit operands, the algorithm takes about
84620.68 iterations per bit.  For optimum performance some attention needs to be
8463paid to the way the factors of 2 are stripped from @math{a}.
8464
8465Firstly it may be noted that in twos complement the number of low zero bits on
8466@math{a-b} is the same as @math{b-a}, so counting or testing can begin on
8467@math{a-b} without waiting for @math{@abs{}(a-b)} to be determined.
8468
8469A loop stripping low zero bits tends not to branch predict well, since the
8470condition is data dependent.  But on average there's only a few low zeros, so
8471an option is to strip one or two bits arithmetically then loop for more (as
8472done for AMD K6).  Or use a lookup table to get a count for several bits then
8473loop for more (as done for AMD K7).  An alternative approach is to keep just
8474one of @math{a} or @math{b} odd and iterate
8475
8476@quotation
8477@math{a,b = @abs{}(a-b), @min{}(a,b)} @*
8478@math{a = a/2} if even @*
8479@math{b = b/2} if even
8480@end quotation
8481
8482This requires about 1.25 iterations per bit, but stripping of a single bit at
8483each step avoids any branching.  Repeating the bit strip reduces to about 0.9
8484iterations per bit, which may be a worthwhile tradeoff.
8485
8486Generally with the above approaches a speed of perhaps 6 cycles per bit can be
8487achieved, which is still not terribly fast with for instance a 64-bit GCD
8488taking nearly 400 cycles.  It's this sort of time which means it's not usually
8489advantageous to combine a set of divisibility tests into a GCD.
8490
8491Currently, the binary algorithm is used for GCD only when @math{N < 3}.
8492
8493@node Lehmer's Algorithm, Subquadratic GCD, Binary GCD, Greatest Common Divisor Algorithms
8494@comment  node-name,  next,  previous,  up
@subsection Lehmer's Algorithm
8496
Lehmer's improvement of the Euclidean algorithm is based on the observation
8498that the initial part of the quotient sequence depends only on the most
8499significant parts of the inputs. The variant of Lehmer's algorithm used in GMP
8500splits off the most significant two limbs, as suggested, e.g., in ``A
8501Double-Digit Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The
8502quotients of two double-limb inputs are collected as a 2 by 2 matrix with
8503single-limb elements. This is done by the function @code{mpn_hgcd2}. The
8504resulting matrix is applied to the inputs using @code{mpn_mul_1} and
8505@code{mpn_submul_1}. Each iteration usually reduces the inputs by almost one
8506limb. In the rare case of a large quotient, no progress can be made by
8507examining just the most significant two limbs, and the quotient is computed
8508using plain division.
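
The structure of the method can be sketched in Python.  Note that this is
the simpler single-digit form given by Knuth (section 4.5.2 algorithm L),
not the double-limb variant with @code{mpn_hgcd2} used in GMP, and the
function name and @code{bits} parameter are assumptions for this example.

```python
def lehmer_gcd(u, v, bits=64):
    """GCD of non-negative integers by a single-digit Lehmer iteration."""
    if u < v:
        u, v = v, u
    while v >= (1 << bits):
        # Leading bits of u, and the corresponding bits of v.
        shift = u.bit_length() - bits
        x, y = u >> shift, v >> shift
        a, b, c, d = 1, 0, 0, 1              # 2 by 2 matrix of cofactors
        while y + c != 0 and y + d != 0:
            q = (x + a) // (y + c)
            if q != (x + b) // (y + d):      # quotient no longer certain
                break
            a, c = c, a - q * c
            b, d = d, b - q * d
            x, y = y, x - q * y
        if b == 0:
            u, v = v, u % v                  # no progress, plain division
        else:
            u, v = a * u + b * v, c * u + d * v   # apply matrix to inputs
    while v:                                 # finish with plain Euclid
        u, v = v, u % v
    return u
```

The inner loop works only on small numbers, and the full size inputs are
touched once per outer iteration when the matrix is applied.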
8509
8510The resulting algorithm is asymptotically @math{O(N^2)}, just as the Euclidean
algorithm and the binary algorithm. The quadratic part of the work consists of
8512the calls to @code{mpn_mul_1} and @code{mpn_submul_1}. For small sizes, the
8513linear work is also significant. There are roughly @math{N} calls to the
8514@code{mpn_hgcd2} function. This function uses a couple of important
8515optimizations:
8516
8517@itemize
8518@item
8519It uses the same relaxed notion of correctness as @code{mpn_hgcd} (see next
8520section). This means that when called with the most significant two limbs of
8521two large numbers, the returned matrix does not always correspond exactly to
8522the initial quotient sequence for the two large numbers; the final quotient
8523may sometimes be one off.
8524
8525@item
8526It takes advantage of the fact the quotients are usually small. The division
8527operator is not used, since the corresponding assembler instruction is very
slow on most architectures. (This code could probably be improved further,
since it uses many branches that are unfriendly to prediction.)
8530
8531@item
8532It switches from double-limb calculations to single-limb calculations half-way
8533through, when the input numbers have been reduced in size from two limbs to
8534one and a half.
8535
8536@end itemize
8537
8538@node Subquadratic GCD, Extended GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms
8539@subsection Subquadratic GCD
8540
8541For inputs larger than @code{GCD_DC_THRESHOLD}, GCD is computed via the HGCD
(Half GCD) function, as a generalization of Lehmer's algorithm.
8543
8544Let the inputs @math{a,b} be of size @math{N} limbs each. Put @m{S=\lfloor N/2
8545\rfloor + 1, S = floor(N/2) + 1}. Then HGCD(a,b) returns a transformation
8546matrix @math{T} with non-negative elements, and reduced numbers @math{(c;d) =
8547T^{-1} (a;b)}. The reduced numbers @math{c,d} must be larger than @math{S}
8548limbs, while their difference @math{abs(c-d)} must fit in @math{S} limbs. The
8549matrix elements will also be of size roughly @math{N/2}.
8550
8551The HGCD base case uses Lehmer's algorithm, but with the above stop condition
8552that returns reduced numbers and the corresponding transformation matrix
8553half-way through. For inputs larger than @code{HGCD_THRESHOLD}, HGCD is
8554computed recursively, using the divide and conquer algorithm in ``On
8555Sch@"onhage's algorithm and subquadratic integer GCD computation'' by M@"oller
8556(@pxref{References}). The recursive algorithm consists of these main
8557steps.
8558
8559@itemize
8560
8561@item
Call HGCD recursively, on the most significant @math{N/2} limbs. Apply the
resulting matrix @math{T_1} to the full numbers, reducing them to a size just
above @math{3N/4}.

@item
Perform a small number of division or subtraction steps to reduce the numbers
to size below @math{3N/4}. This is essential mainly for the unlikely case of
large quotients.
8570
8571@item
8572Call HGCD recursively, on the most significant @math{N/2} limbs of the reduced
8573numbers. Apply the resulting matrix @math{T_2} to the full numbers, reducing
8574them to a size just above @math{N/2}.
8575
8576@item
8577Compute @math{T = T_1 T_2}.
8578
8579@item
8580Perform a small number of division and subtraction steps to satisfy the
8581requirements, and return.
8582@end itemize
8583
8584GCD is then implemented as a loop around HGCD, similarly to Lehmer's
8585algorithm. Where Lehmer repeatedly chops off the top two limbs, calls
8586@code{mpn_hgcd2}, and applies the resulting matrix to the full numbers, the
8587subquadratic GCD chops off the most significant third of the limbs (the
8588proportion is a tuning parameter, and @math{1/3} seems to be more efficient
than, e.g., @math{1/2}), calls @code{mpn_hgcd}, and applies the resulting
8590matrix. Once the input numbers are reduced to size below
8591@code{GCD_DC_THRESHOLD}, Lehmer's algorithm is used for the rest of the work.
8592
8593The asymptotic running time of both HGCD and GCD is @m{O(M(N)\log N),O(M(N)*log(N))},
8594where @math{M(N)} is the time for multiplying two @math{N}-limb numbers.
8595
8596@comment  node-name,  next,  previous,  up
8597
8598@node Extended GCD, Jacobi Symbol, Subquadratic GCD, Greatest Common Divisor Algorithms
8599@subsection Extended GCD
8600
8601The extended GCD function, or GCDEXT, calculates @math{@gcd{}(a,b)} and also
8602cofactors @math{x} and @math{y} satisfying @m{ax+by=\gcd(a@C{}b),
8603a*x+b*y=gcd(a@C{}b)}. All the algorithms used for plain GCD are extended to
8604handle this case. The binary algorithm is used only for single-limb GCDEXT.
8605Lehmer's algorithm is used for sizes up to @code{GCDEXT_DC_THRESHOLD}. Above
8606this threshold, GCDEXT is implemented as a loop around HGCD, but with more
8607book-keeping to keep track of the cofactors. This gives the same asymptotic
running time as for GCD and HGCD, @m{O(M(N)\log N),O(M(N)*log(N))}.
8609
8610One difference to plain GCD is that while the inputs @math{a} and @math{b} are
8611reduced as the algorithm proceeds, the cofactors @math{x} and @math{y} grow in
8612size. This makes the tuning of the chopping-point more difficult. The current
8613code chops off the most significant half of the inputs for the call to HGCD in
8614the first iteration, and the most significant two thirds for the remaining
8615calls. This strategy could surely be improved. Also the stop condition for the
8616loop, where Lehmer's algorithm is invoked once the inputs are reduced below
8617@code{GCDEXT_DC_THRESHOLD}, could maybe be improved by taking into account the
8618current size of the cofactors.
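
As a rough illustration of the cofactor book-keeping, here is a minimal sketch
in Python using the plain Euclidean algorithm.  It is not the Lehmer or
HGCD-based code described above, and the name @code{gcdext} is chosen only for
this example.  The loop keeps the invariant that each of the two running
values is a known combination of the original inputs, which is what any
GCDEXT variant has to maintain.

```python
def gcdext(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for a, b >= 0."""
    # Invariant: current a == a0*x0 + b0*y0 and current b == a0*x1 + b0*y1,
    # where a0, b0 are the original inputs.
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0
```

The values @code{a} and @code{b} shrink on every pass while the cofactors
grow, which is the effect that complicates the choice of chopping point.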
8619
8620@node Jacobi Symbol,  , Extended GCD, Greatest Common Divisor Algorithms
8621@subsection Jacobi Symbol
8622@cindex Jacobi symbol algorithm
8623
8624[This section is obsolete.  The current Jacobi code actually uses a very
8625efficient algorithm.]
8626
8627@code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a
8628simple binary algorithm similar to that described for the GCDs (@pxref{Binary
8629GCD}).  They're not very fast when both inputs are large.  Lehmer's multi-step
8630improvement or a binary based multi-step algorithm is likely to be better.
8631
8632When one operand fits a single limb, and that includes @code{mpz_kronecker_ui}
8633and friends, an initial reduction is done with either @code{mpn_mod_1} or
8634@code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb.
8635The binary algorithm is well suited to a single limb, and the whole
8636calculation in this case is quite efficient.
8637
8638In all the routines sign changes for the result are accumulated using some bit
8639twiddling, avoiding table lookups or conditional jumps.
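
The simple binary algorithm described above can be sketched in Python as
follows.  This is only an illustration of the technique (strip factors of 2,
swap with the reciprocity rule, subtract), not the GMP code, and the function
name @code{jacobi_binary} is an assumption of the example.  The sign is kept
in an ordinary variable rather than with the bit twiddling mentioned above.

```python
def jacobi_binary(a, n):
    """Jacobi symbol (a/n) for odd n > 0, by a binary GCD style loop."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a:
        # (2/n) is -1 exactly when n is 3 or 5 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        if a < n:
            # Quadratic reciprocity: the sign flips when both are 3 mod 4.
            a, n = n, a
            if a % 4 == 3 and n % 4 == 3:
                result = -result
        a -= n  # both odd, so the difference is even (or zero)
    return result if n == 1 else 0
```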
8640
8641
8642@need 1000
8643@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
8644@section Powering Algorithms
8645@cindex Powering algorithms
8646
8647@menu
8648* Normal Powering Algorithm::
8649* Modular Powering Algorithm::
8650@end menu
8651
8652
8653@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
8654@subsection Normal Powering
8655
8656Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
8657successively squaring and then multiplying by the base when a 1 bit is seen in
8658the exponent, as per Knuth section 4.6.3.  The ``left to right''
8659variant described there is used rather than algorithm A, since it's just as
8660easy and can be done with somewhat less temporary memory.
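
A minimal sketch of the left to right method, written in Python for brevity
(the function name @code{power} is chosen for this example only): the exponent
bits are scanned from the most significant end, squaring at every bit and
multiplying by the base when the bit is 1.

```python
def power(base, exponent):
    """Return base**exponent for exponent >= 0, scanning bits high to low."""
    result = 1
    for bit in bin(exponent)[2:]:
        result *= result          # one squaring per exponent bit
        if bit == "1":
            result *= base        # extra multiply for each 1 bit
    return result
```

Only @code{result} and the base are live at any time, which is the memory
saving referred to above.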
8661
8662
8663@node Modular Powering Algorithm,  , Normal Powering Algorithm, Powering Algorithms
8664@subsection Modular Powering
8665
8666Modular powering is implemented using a @math{2^k}-ary sliding window
8667algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85
8668(@pxref{References}).  @math{k} is chosen according to the size of the
8669exponent.  Larger exponents use larger values of @math{k}, the choice being
8670made to minimize the average number of multiplications that must supplement
8671the squaring.
8672
8673The modular multiplies and squarings use either a simple division or the REDC
8674method by Montgomery (@pxref{References}).  REDC is a little faster,
8675essentially saving N single limb divisions in a fashion similar to an exact
8676remainder (@pxref{Exact Remainder}).
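
The sliding window idea can be sketched as below.  This Python version uses a
plain remainder after each multiply instead of REDC, takes the window size
@code{k} as a parameter rather than choosing it from the exponent size, and
the name @code{powm_sliding} is an assumption of the example.

```python
def powm_sliding(base, exponent, modulus, k=4):
    """Return base**exponent mod modulus using a 2^k-ary sliding window."""
    base %= modulus
    # Table of odd powers base^1, base^3, ..., base^(2^k - 1).
    square = base * base % modulus
    odd_powers = [base]
    for _ in range((1 << (k - 1)) - 1):
        odd_powers.append(odd_powers[-1] * square % modulus)

    bits = bin(exponent)[2:]
    result = 1 % modulus
    i = 0
    while i < len(bits):
        if bits[i] == "0":
            result = result * result % modulus
            i += 1
            continue
        # Longest window of at most k bits that starts and ends with a 1.
        j = min(i + k, len(bits))
        while bits[j - 1] == "0":
            j -= 1
        window = int(bits[i:j], 2)
        for _ in range(j - i):
            result = result * result % modulus
        result = result * odd_powers[window >> 1] % modulus
        i = j
    return result
```

Every exponent bit still costs a squaring, but a multiply is needed only once
per window, which is why larger exponents favour a larger @code{k}.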
8677
8678
8679@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
8680@section Root Extraction Algorithms
8681@cindex Root extraction algorithms
8682
8683@menu
8684* Square Root Algorithm::
8685* Nth Root Algorithm::
8686* Perfect Square Algorithm::
8687* Perfect Power Algorithm::
8688@end menu
8689
8690
8691@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
8692@subsection Square Root
8693@cindex Square root algorithm
8694@cindex Karatsuba square root algorithm
8695
8696Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
8697Zimmermann (@pxref{References}).
8698
8699An input @math{n} is split into four parts of @math{k} bits each, so with
8700@math{b=2^k} we have @m{n = a_3b^3 + a_2b^2 + a_1b + a_0, n = a3*b^3 + a2*b^2
8701+ a1*b + a0}.  Part @ms{a,3} must be ``normalized'' so that either the high or
8702second highest bit is set.  In GMP, @math{k} is kept on a limb boundary and
8703the input is left shifted (by an even number of bits) to normalize.
8704
8705The square root of the high two parts is taken, by recursive application of
8706the algorithm (bottoming out in a one-limb Newton's method),
8707@tex
8708$$ s',r' = \mathop{\rm sqrtrem} \> (a_3b + a_2) $$
8709@end tex
8710@ifnottex
8711
8712@example
8713s1,r1 = sqrtrem (a3*b + a2)
8714@end example
8715
8716@end ifnottex
8717This is an approximation to the desired root and is extended by a division to
8718give @math{s},@math{r},
8719@tex
8720$$\eqalign{
8721q,u &= \mathop{\rm divrem} \> (r'b + a_1, 2s') \cr
8722s &= s'b + q \cr
8723r &= ub + a_0 - q^2
8724}$$
8725@end tex
8726@ifnottex
8727
8728@example
8729q,u = divrem (r1*b + a1, 2*s1)
8730s = s1*b + q
8731r = u*b + a0 - q^2
8732@end example
8733
8734@end ifnottex
8735The normalization requirement on @ms{a,3} means at this point @math{s} is
8736either correct or 1 too big.  @math{r} is negative in the latter case, so
8737@tex
8738$$\eqalign{
8739\mathop{\rm if} \; r &< 0 \; \mathop{\rm then} \cr
8740r &\leftarrow r + 2s - 1 \cr
8741s &\leftarrow s - 1
8742}$$
8743@end tex
8744@ifnottex
8745
8746@example
8747if r < 0 then
8748  r = r + 2*s - 1
8749  s = s - 1
8750@end example
8751
8752@end ifnottex
8753The algorithm is expressed in a divide and conquer form, but as noted in the
8754paper it can also be viewed as a discrete variant of Newton's method, or as a
8755variation on the schoolboy method (no longer taught) for square roots two
8756digits at a time.
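
The steps above can be sketched in Python as follows.  This is an
illustration only: @code{math.isqrt} stands in for the one-limb base case,
@math{k} is a bit count rather than a limb count, and the normalizing shift is
undone by recomputing the remainder, which is simpler but slower than what a
careful implementation would do.  The name @code{karatsuba_sqrtrem} is an
assumption of the example.

```python
import math

def karatsuba_sqrtrem(n):
    """Return (s, r) with s = floor(sqrt(n)) and r = n - s*s, for n >= 0."""
    if n < (1 << 64):
        s = math.isqrt(n)            # stand-in for the one-limb base case
        return s, n - s * s
    bits = n.bit_length()
    k = (bits + 3) // 4              # four parts of k bits each
    # Normalize: the top part needs its high or second highest bit set.
    shift = 1 if bits <= 4 * k - 2 else 0
    m = n << (2 * shift)             # shift left by an even number of bits
    mask = (1 << k) - 1
    a0 = m & mask
    a1 = (m >> k) & mask
    s1, r1 = karatsuba_sqrtrem(m >> (2 * k))      # sqrtrem(a3*b + a2)
    q, u = divmod((r1 << k) + a1, 2 * s1)
    s = (s1 << k) + q
    r = (u << k) + a0 - q * q
    if r < 0:                        # s was 1 too big
        r += 2 * s - 1
        s -= 1
    if shift:
        s >>= shift                  # undo the normalization
        r = n - s * s
    return s, r
```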
8757
8758If the remainder @math{r} is not required then usually only a few high limbs
8759of @math{r} and @math{u} need to be calculated to determine whether an
8760adjustment to @math{s} is required.  This optimization is not currently
8761implemented.
8762
8763In the Karatsuba multiplication range this algorithm is @m{O({3\over2}
8764M(N/2)),O(1.5*M(N/2))}, where @math{M(n)} is the time to multiply two numbers
8765of @math{n} limbs.  In the FFT multiplication range this grows to a bound of
8766@m{O(6 M(N/2)),O(6*M(N/2))}.  In practice a factor of about 1.5 to 1.8 is
8767found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.
8768
8769The algorithm does all its calculations in integers and the resulting
8770@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
8771The extended precision given by @code{mpf_sqrt_ui} is obtained by
8772padding with zero limbs.
8773
8774
8775@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
8776@subsection Nth Root
8777@cindex Root extraction algorithm
8778@cindex Nth root algorithm
8779
8780Integer Nth roots are taken using Newton's method with the following
8781iteration, where @math{A} is the input and @math{n} is the root to be taken.
8782@tex
8783$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
8784@end tex
8785@ifnottex
8786
8787@example
8788         1         A
8789a[i+1] = - * ( --------- + (n-1)*a[i] )
8790         n     a[i]^(n-1)
8791@end example
8792
8793@end ifnottex
8794The initial approximation @m{a_1,a[1]} is generated bitwise by successively
8795powering a trial root with or without new 1 bits, aiming to be just above the
8796true root.  The iteration converges quadratically when started from a good
8797approximation.  When @math{n} is large more initial bits are needed to get
8798good convergence.  The current implementation is not particularly well
8799optimized.
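
A minimal Python sketch of the iteration in integer arithmetic is given
below.  For simplicity it starts from a power of 2 known to be above the true
root, instead of the bitwise starting value described above, and it stops as
soon as an iterate fails to decrease.  The name @code{nth_root} is an
assumption of the example.

```python
def nth_root(A, n):
    """Return floor(A ** (1/n)) for integers A >= 0 and n >= 1."""
    if A < 2 or n == 1:
        return A
    # 2^ceil(bits/n) is strictly above the true root.
    a = 1 << -(-A.bit_length() // n)
    while True:
        # a[i+1] = ((n-1)*a[i] + A / a[i]^(n-1)) / n, with floor divisions
        next_a = ((n - 1) * a + A // a ** (n - 1)) // n
        if next_a >= a:
            return a
        a = next_a
```

Starting above the root keeps every iterate at or above the floor of the true
root, so the first non-decreasing step marks the answer.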
8800
8801
8802@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
8803@subsection Perfect Square
8804@cindex Perfect square algorithm
8805
8806A significant fraction of non-squares can be quickly identified by checking
8807whether the input is a quadratic residue modulo small integers.
8808
8809@code{mpz_perfect_square_p} first tests the input mod 256, which means just
8810examining the low byte.  Only 44 different values occur for squares mod 256,
8811so 82.8% of inputs can be immediately identified as non-squares.
8812
8813On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total
881499.25% of inputs identified as non-squares.  On a 64-bit system 97 is tested
8815too, for a total 99.62%.
8816
8817These moduli are chosen because they're factors of @math{2^@W{24}-1} (or
8818@math{2^@W{48}-1} for 64-bits), and such a remainder can be quickly taken just
8819using additions (see @code{mpn_mod_34lsub1}).
8820
8821When nails are in use moduli are instead selected by the @file{gen-psqr.c}
8822program and applied with an @code{mpn_mod_1}.  The same @math{2^@W{24}-1} or
8823@math{2^@W{48}-1} could be done with nails using some extra bit shifts, but
8824this is not currently implemented.
8825
8826In any case each modulus is applied to the @code{mpn_mod_34lsub1} or
8827@code{mpn_mod_1} remainder and a table lookup identifies non-squares.  By
8828using a ``modexact'' style calculation, and suitably permuted tables, just one
8829multiply each is required, see the code for details.  Moduli are also combined
8830to save operations, so long as the lookup tables don't become too big.
8831@file{gen-psqr.c} does all the pre-calculations.
8832
8833A square root must still be taken for any value that passes these tests, to
8834verify it's really a square and not one of the small fraction of non-squares
8835that get through (i.e.@: a pseudo-square to all the tested bases).
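
The residue counts quoted above are easy to reproduce.  The following Python
sketch builds the set of squares mod 256, uses it as a pre-filter ahead of a
full square root, and computes the fraction of inputs that survive a list of
moduli.  It uses sets and ordinary remainders rather than the permuted tables
and @code{mpn_mod_34lsub1}; the function names are assumptions of the example.

```python
import math
from fractions import Fraction

def square_residues(m):
    """Set of values taken by i*i mod m."""
    return frozenset(i * i % m for i in range(m))

SQUARES_MOD_256 = square_residues(256)

def is_perfect_square(n):
    if n < 0:
        return False
    if n % 256 not in SQUARES_MOD_256:   # cheap test on the low byte
        return False
    s = math.isqrt(n)                    # full check for the survivors
    return s * s == n

def pass_fraction(moduli):
    """Fraction of inputs not rejected by residue tests on coprime moduli."""
    fraction = Fraction(1)
    for m in moduli:
        fraction *= Fraction(len(square_residues(m)), m)
    return fraction
```

For example @code{pass_fraction([256, 9, 5, 7, 13, 17])} is 33/4420, or about
0.75%, matching the 99.25% rejection figure for a 32-bit system.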
8836
8837Clearly more residue tests could be done, @code{mpz_perfect_square_p} only
8838uses a compact and efficient set.  Big inputs would probably benefit from more
8839residue testing, small inputs might be better off with less.  The assumed
8840distribution of squares versus non-squares in the input would affect such
8841considerations.
8842
8843
8844@node Perfect Power Algorithm,  , Perfect Square Algorithm, Root Extraction Algorithms
8845@subsection Perfect Power
8846@cindex Perfect power algorithm
8847
8848Detecting perfect powers is required by some factorization algorithms.
8849Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
8850extractions, though naturally only prime roots need to be considered.
8851(@xref{Nth Root Algorithm}.)
8852
8853If a prime divisor @math{p} with multiplicity @math{e} can be found, then only
8854roots which are divisors of @math{e} need to be considered, much reducing the
8855work necessary.  To this end divisibility by a set of small primes is checked.
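
A minimal Python sketch of the prime-roots-only strategy follows.  It uses a
binary search for each root instead of Newton's method, omits the small prime
divisibility shortcut, handles only non-negative inputs, and treats 0 and 1
as perfect powers.  All function names are assumptions of the example.

```python
def integer_root(A, n):
    """Return floor(A ** (1/n)) by binary search, for A >= 0 and n >= 1."""
    lo, hi = 0, 1 << -(-A.bit_length() // n)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** n <= A:
            lo = mid
        else:
            hi = mid - 1
    return lo

def primes_up_to(limit):
    return [p for p in range(2, limit + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def is_perfect_power(A):
    """True if A == a**b for some integers a and b > 1 (A >= 0)."""
    if A < 2:
        return True
    # Any exponent is below the bit length, and only primes need trying.
    for p in primes_up_to(A.bit_length()):
        if integer_root(A, p) ** p == A:
            return True
    return False
```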
8856
8857
8858@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
8859@section Radix Conversion
8860@cindex Radix conversion algorithms
8861
8862Radix conversions are less important than other algorithms.  A program
8863dominated by conversions should probably use a different data representation.
8864
8865@menu
8866* Binary to Radix::
8867* Radix to Binary::
8868@end menu
8869
8870
8871@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
8872@subsection Binary to Radix
8873
8874Conversions from binary to a power-of-2 radix use a simple and fast
8875@math{O(N)} bit extraction algorithm.
8876
8877Conversions from binary to other radices use one of two algorithms.  Sizes
8878below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.
8879Repeated divisions by @math{b^n} are made, where @math{b} is the radix and
8880@math{n} is the biggest power that fits in a limb.  But instead of simply
8881using the remainder @math{r} from such divisions, an extra divide step is done
8882to give a fractional limb representing @math{r/b^n}.  The digits of @math{r}
8883can then be extracted using multiplications by @math{b} rather than divisions.
8884Special case code is provided for decimal, allowing multiplications by 10 to
8885optimize to shifts and adds.
8886
8887Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
8888For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are
8889calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is
8890reached.  @math{t} is then divided by that largest power, giving a quotient
8891which is the digits above that power, and a remainder which is those below.
8892These two parts are in turn divided by the second highest power, and so on
8893recursively.  When a piece has been divided down to less than
8894@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is
8895used.
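
The recursive splitting can be sketched in Python as below.  The sketch
ignores the thresholds, always recurses down to blocks of @code{n} digits, and
uses repeated single-digit division as its base case rather than the
fractional scheme described earlier.  The names @code{to_string} and
@code{basecase}, and the default @code{n} of 9, are assumptions of the
example.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def basecase(value, base, width):
    """Exactly width digits of value, zero padded, most significant first."""
    out = []
    for _ in range(width):
        value, d = divmod(value, base)
        out.append(DIGITS[d])
    return "".join(reversed(out))

def to_string(t, base=10, n=9):
    """Convert t >= 0 to a digit string in the given base (2 to 36)."""
    if t == 0:
        return "0"
    # powers[i] = base^(n*2^i), up to a power between sqrt(t) and t.
    powers = [base ** n]
    while powers[-1] * powers[-1] <= t:
        powers.append(powers[-1] * powers[-1])

    def split(value, i):
        # value < powers[i]^2; produces n*2^(i+1) digits.
        if i < 0:
            return basecase(value, base, n)
        q, r = divmod(value, powers[i])       # digits above and below
        return split(q, i - 1) + split(r, i - 1)

    return split(t, len(powers) - 1).lstrip("0")
```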
8896
8897The advantage of this algorithm is that big divisions can make use of the
sub-quadratic divide and conquer division (@pxref{Divide and Conquer
Division}), and big divisions tend to have less overhead than lots of
8900separate single limb divisions anyway.  But in any case the cost of
8901calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.
8902
8903@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
8904the same basic thing, the point where it becomes worth doing a big division to
8905cut the input in half.  @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
8906of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
8907assumes that's already available, which is the case when recursing.
8908
Since the base case produces digits from least to most significant but they
must be stored from most to least, it's necessary to calculate in advance
8911how many digits there will be, or at least be sure not to underestimate that.
8912For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
8913from @code{mp_bases}, rounding up.  The result is either correct or one too
8914big.
8915
8916Examining some of the high bits of the input could increase the chance of
8917getting the exact number of digits, but an exact result every time would not
8918be practical, since in general the difference between numbers 100@dots{} and
891999@dots{} is only in the last few bits and the work to identify 99@dots{}
8920might well be almost as much as a full conversion.
8921
@code{mpf_get_str} doesn't currently use the algorithm described here; it
multiplies or divides by a power of @math{b} to move the radix point to
just above the highest non-zero digit (or at worst one above that location),
8925then multiplies by @math{b^n} to bring out digits.  This is @math{O(N^2)} and
8926is certainly not optimal.
8927
8928The @math{r/b^n} scheme described above for using multiplications to bring out
8929digits might be useful for more than a single limb.  Some brief experiments
8930with it on the base case when recursing didn't give a noticeable improvement,
8931but perhaps that was only due to the implementation.  Something similar would
8932work for the sub-quadratic divisions too, though there would be the cost of
8933calculating a bigger radix power.
8934
8935Another possible improvement for the sub-quadratic part would be to arrange
8936for radix powers that balanced the sizes of quotient and remainder produced,
8937i.e.@: the highest power would be an @m{b^{nk},b^(n*k)} approximately equal to
8938@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor.  That ought to
8939smooth out a graph of times against sizes, but may or may not be a net
8940speedup.
8941
8942
8943@node Radix to Binary,  , Binary to Radix, Radix Conversion Algorithms
8944@subsection Radix to Binary
8945
8946@strong{This section needs to be rewritten, it currently describes the
8947algorithms used before GMP 4.3.}
8948
8949Conversions from a power-of-2 radix into binary use a simple and fast
8950@math{O(N)} bitwise concatenation algorithm.
8951
8952Conversions from other radices use one of two algorithms.  Sizes below
8953@code{SET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.  Groups
8954of @math{n} digits are converted to limbs, where @math{n} is the biggest
8955power of the base @math{b} which will fit in a limb, then those groups are
8956accumulated into the result by multiplying by @math{b^n} and adding.  This
8957saves multi-precision operations, as per Knuth section 4.4 part E
8958(@pxref{References}).  Some special case code is provided for decimal, giving
8959the compiler a chance to optimize multiplications by 10.
8960
8961Above @code{SET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
8962First groups of @math{n} digits are converted into limbs.  Then adjacent
8963limbs are combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x}
8964and @math{y} are the limbs.  Adjacent limb pairs are combined into quads
8965similarly with @m{xb^{2n}+y,x*b^(2n)+y}.  This continues until a single block
8966remains, that being the result.
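
The pairwise combination can be sketched in Python as follows.  The digit
string is assumed to be non-empty and valid for the base, Python's
@code{int} does the conversion of each @code{n}-digit group, and the name
@code{from_string} with its default @code{n} of 9 is an assumption of the
example.

```python
def from_string(digits, base=10, n=9):
    """Convert a non-empty digit string to an integer by pairwise merging."""
    # Groups of n digits, least significant group first.
    blocks = [int(digits[max(0, i - n):i], base)
              for i in range(len(digits), 0, -n)]
    power = base ** n
    while len(blocks) > 1:
        if len(blocks) % 2:
            blocks.append(0)              # pad so every block has a partner
        # Combine neighbours as x*power + y, with x the more significant.
        blocks = [blocks[i + 1] * power + blocks[i]
                  for i in range(0, len(blocks), 2)]
        power *= power                    # b^n, b^(2n), b^(4n), ...
    return blocks[0]
```

Each pass halves the number of blocks and doubles their size, so the later
multiplications are between large operands.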
8967
8968The advantage of this method is that the multiplications for each @math{x} are
8969big blocks, allowing Karatsuba and higher algorithms to be used.  But the cost
8970of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome.
8971@code{SET_STR_PRECOMPUTE_THRESHOLD} usually ends up quite big, around 5000 digits, and on
8972some processors much bigger still.
8973
8974@code{SET_STR_PRECOMPUTE_THRESHOLD} is based on the input digits (and tuned
8975for decimal), though it might be better based on a limb count, so as to be
8976independent of the base.  But that sort of count isn't used by the base case
8977and so would need some sort of initial calculation or estimate.
8978
8979The main reason @code{SET_STR_PRECOMPUTE_THRESHOLD} is so much bigger than the
8980corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is
8981much faster than @code{mpn_divrem_1} (often by a factor of 5, or more).
8982
8983
8984@need 1000
8985@node Other Algorithms, Assembly Coding, Radix Conversion Algorithms, Algorithms
8986@section Other Algorithms
8987
8988@menu
8989* Prime Testing Algorithm::
8990* Factorial Algorithm::
8991* Binomial Coefficients Algorithm::
8992* Fibonacci Numbers Algorithm::
8993* Lucas Numbers Algorithm::
8994* Random Number Algorithms::
8995@end menu
8996
8997
8998@node Prime Testing Algorithm, Factorial Algorithm, Other Algorithms, Other Algorithms
8999@subsection Prime Testing
9000@cindex Prime testing algorithms
9001
9002The primality testing in @code{mpz_probab_prime_p} (@pxref{Number Theoretic
9003Functions}) first does some trial division by small factors and then uses the
9004Miller-Rabin probabilistic primality testing algorithm, as described in Knuth
9005section 4.5.4 algorithm P (@pxref{References}).
9006
For an odd input @math{n}, and with @math{n = q@GMPmultiply{}2^k+1} where
@math{q} is odd, this algorithm selects a random base @math{x} and tests
whether @math{x^q @bmod{} n} is 1 or @math{-1}, or an @m{x^{q2^j} \bmod n,
x^(q*2^j) mod n} is @math{-1}, for @math{1@le{}j<k}.  If so then @math{n}
is probably prime, if not then @math{n} is definitely composite.
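
One round of the strong test, in the form of Knuth's algorithm P, can be
sketched in Python as below, together with a wrapper doing a little trial
division first.  The function names, the list of trial divisors and the
default of 25 rounds are assumptions of the example, not a description of
what @code{mpz_probab_prime_p} does internally.

```python
import random

def miller_rabin_round(n, x):
    """One strong probable prime test of odd n > 2 to the base x."""
    q, k = n - 1, 0
    while q % 2 == 0:                 # write n - 1 = q * 2^k with q odd
        q //= 2
        k += 1
    y = pow(x, q, n)
    if y == 1 or y == n - 1:
        return True
    for _ in range(k - 1):            # x^(q*2^j) for j = 1 .. k-1
        y = y * y % n
        if y == n - 1:
            return True
    return False

def probably_prime(n, rounds=25):
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):    # trial division by small primes
        if n % p == 0:
            return n == p
    return all(miller_rabin_round(n, random.randrange(2, n - 1))
               for _ in range(rounds))
```

For instance 2047 = 23*89 passes to base 2 but fails to base 3, so it is a
strong pseudoprime to base 2 only.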
9012
9013Any prime @math{n} will pass the test, but some composites do too.  Such
9014composites are known as strong pseudoprimes to base @math{x}.  No @math{n} is
9015a strong pseudoprime to more than @math{1/4} of all bases (see Knuth exercise
901622), hence with @math{x} chosen at random there's no more than a @math{1/4}
9017chance a ``probable prime'' will in fact be composite.
9018
9019In fact strong pseudoprimes are quite rare, making the test much more
9020powerful than this analysis would suggest, but @math{1/4} is all that's proven
9021for an arbitrary @math{n}.
9022
9023
9024@node Factorial Algorithm, Binomial Coefficients Algorithm, Prime Testing Algorithm, Other Algorithms
9025@subsection Factorial
9026@cindex Factorial algorithm
9027
Factorials are calculated by a combination of two algorithms. Both share one
idea: compute the odd part of the factorial first, then account for the power
of @math{2} with a final shift.
9031
For small @math{n}, the odd factor of @math{n!} is computed with the simple
observation that it is equal to the product of all positive odd numbers
not exceeding @math{n} times the odd factor of @m{\lfloor n/2\rfloor!, [n/2]!},
9035where @m{\lfloor x\rfloor, [x]} is the integer part of @math{x}, and so on
9036recursively. The procedure can be best illustrated with an example,
9037
9038@quotation
9039@math{23! = (23.21.19.17.15.13.11.9.7.5.3)(11.9.7.5.3)(5.3)2^{19}}
9040@end quotation
9041
The current code collects all the factors in a single list, with a loop and no
recursion, and computes the product, with no special care for repeated chunks.
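
The small-@math{n} method can be sketched in Python as follows.  The product
is taken with a plain loop here rather than with the limb-sized accumulation
and binary splitting used for real work, and the function names are
assumptions of the example.  The final shift uses the fact that the power of
2 dividing @math{n!} is @math{n} minus the number of 1 bits in @math{n}.

```python
def odd_part_of_factorial(n):
    """Odd part of n!, from the odd numbers up to n, n/2, n/4, ..."""
    factors = []
    m = n
    while m > 2:
        factors.extend(range(3, m + 1, 2))   # odd numbers 3, 5, ..., <= m
        m //= 2
    product = 1
    for f in factors:
        product *= f
    return product

def factorial(n):
    shift = n - bin(n).count("1")            # exponent of 2 in n!
    return odd_part_of_factorial(n) << shift
```

For @math{n=23} the list holds the three groups of odd factors from the
example above and the shift is 19.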
9044
When @math{n} is larger, the computation goes through prime sieving. A helper
function is used, as suggested by Peter Luschny:
9047@tex
9048$$\mathop{\rm msf}(n) = {n!\over\lfloor n/2\rfloor!^2\cdot2^k} = \prod_{p=3}^{n}
9049p^{\mathop{\rm L}(p,n)} $$
9050@end tex
9051@ifnottex
9052
9053@example
9054                            n
9055                          -----
9056               n!          | |   L(p,n)
9057msf(n) = -------------- =  | |  p
9058          [n/2]!^2.2^k     p=3
9059@end example
9060@end ifnottex
9061
Here @math{p} ranges over the odd prime numbers. The exponent @math{k} is chosen
to make the result an odd integer: @math{k} is the number of 1 bits in the binary
9064representation of @m{\lfloor n/2\rfloor, [n/2]}. The function L@math{(p,n)}
9065can be defined as zero when @math{p} is composite, and, for any prime
9066@math{p}, it is computed with:
9067@tex
9068$$\mathop{\rm L}(p,n) = \sum_{i>0}\left\lfloor{n\over p^i}\right\rfloor\bmod2
9069\leq\log_p(n)$$
9070@end tex
9071@ifnottex
9072
9073@example
9074          ---
9075           \    n
9076L(p,n) =   /  [---] mod 2   <=  log (n) .
9077          ---  p^i                p
9078          i>0
9079@end example
9080@end ifnottex
9081
9082With this helper function, we are able to compute the odd part of @math{n!}
9083using the recursion implied by @m{n!=\lfloor n/2\rfloor!^2\cdot\mathop{\rm
9084msf}(n)\cdot2^k , n!=[n/2]!^2*msf(n)*2^k}. The recursion stops using the
9085small-@math{n} algorithm on some @m{\lfloor n/2^i\rfloor, [n/2^i]}.
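
The recursion can be checked with a short Python sketch.  It recurses all the
way down instead of switching to the small-@math{n} algorithm, multiplies the
prime powers one at a time instead of using binary splitting, and uses a
simple odd-only sieve; all function names are assumptions of the example.

```python
def odd_primes_up_to(n):
    sieve = [True] * (n + 1)
    primes = []
    for p in range(3, n + 1, 2):
        if sieve[p]:
            primes.append(p)
            for multiple in range(p * p, n + 1, 2 * p):
                sieve[multiple] = False
    return primes

def L(p, n):
    """Sum over i > 0 of floor(n / p^i) mod 2."""
    total, q = 0, p
    while q <= n:
        total += (n // q) % 2
        q *= p
    return total

def msf(n):
    """n! / (floor(n/2)!^2 * 2^k), as a product of odd prime powers."""
    result = 1
    for p in odd_primes_up_to(n):
        result *= p ** L(p, n)
    return result

def odd_part_of_factorial(n):
    if n < 3:
        return 1
    half = odd_part_of_factorial(n // 2)
    return half * half * msf(n)

def factorial(n):
    return odd_part_of_factorial(n) << (n - bin(n).count("1"))
```

As a small check, @code{msf(10)} is 63, and 10! equals 5! squared times 63
times 4, with @math{k=2} being the number of 1 bits in 5.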
9086
9087Both the above algorithms use binary splitting to compute the product of many
9088small factors. At first as many products as possible are accumulated in a
9089single register, generating a list of factors that fit in a machine word. This
9090list is then split into halves, and the product is computed recursively.
9091
9092Such splitting is more efficient than repeated N@cross{}1 multiplies since it
9093forms big multiplies, allowing Karatsuba and higher algorithms to be used.
9094And even below the Karatsuba threshold a big block of work can be more
9095efficient for the basecase algorithm.
9096
9097
9098@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms
9099@subsection Binomial Coefficients
9100@cindex Binomial coefficient algorithm
9101
9102Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated
9103by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) =
9104\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then
9105evaluating the following product simply from @math{i=2} to @math{i=k}.
9106@tex
9107$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
9108@end tex
9109@ifnottex
9110
9111@example
9112                      k  (n-k+i)
9113C(n,k) =  (n-k+1) * prod -------
9114                     i=2    i
9115@end example
9116
9117@end ifnottex
9118It's easy to show that each denominator @math{i} will divide the product so
9119far, so the exact division algorithm is used (@pxref{Exact Division}).
9120
The numerators @math{n-k+i} and denominators @math{i} are first accumulated
into products of as many factors as fit in a limb, to save multi-precision
operations, though for
9123@code{mpz_bin_ui} this applies only to the divisors, since @math{n} is an
9124@code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all.
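
A minimal Python sketch of the product follows.  It multiplies and divides
one factor at a time, without the limb-sized accumulation just described, and
the name @code{binomial} is an assumption of the example.  After the step for
a given @math{i} the running value is @math{C(n-k+i,i)}, which is why each
division is exact.

```python
def binomial(n, k):
    """Return C(n, k) for integers n >= 0."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)              # arrange k <= n/2
    if k == 0:
        return 1
    result = n - k + 1
    for i in range(2, k + 1):
        # result is C(n-k+i-1, i-1); the division by i below is exact
        result = result * (n - k + i) // i
    return result
```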
9125
9126
9127@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
9128@subsection Fibonacci Numbers
9129@cindex Fibonacci number algorithm
9130
9131The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
9132for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
9133values efficiently.
9134
9135For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is
9136used.  On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
9137up to @m{F_{93},F[93]}.  For convenience the table starts at @m{F_{-1},F[-1]}.
9138
9139Beyond the table, values are generated with a binary powering algorithm,
9140calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
9141low across the bits of @math{n}.  The formulas used are
9142@tex
9143$$\eqalign{
9144  F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
9145  F_{2k-1} &=  F_k^2 + F_{k-1}^2           \cr
9146  F_{2k}   &= F_{2k+1} - F_{2k-1}
9147}$$
9148@end tex
9149@ifnottex
9150
9151@example
9152F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
9153F[2k-1] =   F[k]^2 + F[k-1]^2
9154
9155F[2k] = F[2k+1] - F[2k-1]
9156@end example
9157
9158@end ifnottex
9159At each step, @math{k} is the high @math{b} bits of @math{n}.  If the next bit
9160of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if
9161it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process
9162repeated until all bits of @math{n} are incorporated.  Notice these formulas
9163require just two squares per bit of @math{n}.
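
The bit-by-bit doubling can be sketched in Python as below.  It starts from
@math{k=1} at the leading 1 bit rather than from a table entry, and the name
@code{fib_pair} is an assumption of the example.

```python
def fib_pair(n):
    """Return (F[n], F[n-1]) for n >= 0, with F[-1] = 1."""
    if n == 0:
        return 0, 1
    fk, fk1, k = 1, 0, 1                   # F[1], F[0]
    for bit in bin(n)[3:]:                 # bits of n below the leading 1
        sign = -2 if k % 2 else 2          # 2*(-1)^k
        f2k_plus_1 = 4 * fk * fk - fk1 * fk1 + sign
        f2k_minus_1 = fk * fk + fk1 * fk1
        f2k = f2k_plus_1 - f2k_minus_1
        if bit == "1":
            fk, fk1, k = f2k_plus_1, f2k, 2 * k + 1
        else:
            fk, fk1, k = f2k, f2k_minus_1, 2 * k
    return fk, fk1
```

The two products @code{fk * fk} and @code{fk1 * fk1} are the only
multi-precision multiplications in each step.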
9164
9165It'd be possible to handle the first few @math{n} above the single limb table
9166with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} =
9167F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually
9168turns out to be faster for only about 10 or 20 values of @math{n}, and
9169including a block of code for just those doesn't seem worthwhile.  If they
9170really mattered it'd be better to extend the data table.
9171
9172Using a table avoids lots of calculations on small numbers, and makes small
9173@math{n} go fast.  A bigger table would make more small @math{n} go fast, it's
9174just a question of balancing size against desired speed.  For GMP the code is
9175kept compact, with the emphasis primarily on a good powering algorithm.
9176
9177@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but
9178@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}.  In this case the last
9179step of the algorithm can become one multiply instead of two squares.  One of
9180the following two formulas is used, according as @math{n} is odd or even.
9181@tex
9182$$\eqalign{
9183  F_{2k}   &= F_k (F_k + 2F_{k-1}) \cr
9184  F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k
9185}$$
9186@end tex
9187@ifnottex
9188
9189@example
9190F[2k]   = F[k]*(F[k]+2F[k-1])
9191
9192F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k
9193@end example
9194
9195@end ifnottex
9196@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a
9197multiply.  For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above
9198can be applied just to the low limb of the calculation, without a carry or
9199borrow into further limbs, which saves some code size.  See comments with
9200@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done.
9201
9202
9203@node Lucas Numbers Algorithm, Random Number Algorithms, Fibonacci Numbers Algorithm, Other Algorithms
9204@subsection Lucas Numbers
9205@cindex Lucas number algorithm
9206
9207@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci
9208numbers with the following simple formulas.
9209@tex
9210$$\eqalign{
9211  L_k     &=  F_k + 2F_{k-1} \cr
9212  L_{k-1} &= 2F_k -  F_{k-1}
9213}$$
9214@end tex
9215@ifnottex
9216
9217@example
9218L[k]   =   F[k] + 2*F[k-1]
9219L[k-1] = 2*F[k] -   F[k-1]
9220@end example
9221
9222@end ifnottex
9223@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be
9224saved.  Trailing zero bits on @math{n} can be handled with a single square
9225each.
9226@tex
9227$$ L_{2k} = L_k^2 - 2(-1)^k $$
9228@end tex
9229@ifnottex
9230
9231@example
9232L[2k] = L[k]^2 - 2*(-1)^k
9233@end example
9234
9235@end ifnottex
9236And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci
9237numbers, similar to what @code{mpz_fib_ui} does.
9238@tex
9239$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$
9240@end tex
9241@ifnottex
9242
9243@example
9244L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k
9245@end example
9246
9247@end ifnottex
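
The three formulas of this section can be checked numerically with a few
lines of Python.  The Fibonacci pair is produced here by simple addition, and
all function names are assumptions of the example.

```python
def fib_pair(k):
    """Return (F[k], F[k-1]) for k >= 1 by repeated addition."""
    prev, cur = 0, 1
    for _ in range(k - 1):
        prev, cur = cur, prev + cur
    return cur, prev

def lucas_pair(k):
    """Return (L[k], L[k-1]) from a Fibonacci pair."""
    fk, fk1 = fib_pair(k)
    return fk + 2 * fk1, 2 * fk - fk1

def lucas_even(lk, k):
    """L[2k] from L[k], one square."""
    return lk * lk - 2 * (-1) ** k

def lucas_odd(k):
    """L[2k+1] from F[k] and F[k-1], one multiply."""
    fk, fk1 = fib_pair(k)
    return 5 * fk1 * (2 * fk + fk1) - 4 * (-1) ** k
```

With @math{k=5} this gives @math{L[5]=11}, @math{L[10]=123} and
@math{L[11]=199}.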
9248
9249
9250@node Random Number Algorithms,  , Lucas Numbers Algorithm, Other Algorithms
9251@subsection Random Numbers
9252@cindex Random number algorithms
9253
9254For the @code{urandomb} functions, random numbers are generated simply by
9255concatenating bits produced by the generator.  As long as the generator has
9256good randomness properties this will produce well-distributed @math{N} bit
9257numbers.
9258
9259For the @code{urandomm} functions, random numbers in a range @math{0@le{}R<N}
9260are generated by taking values @math{R} of @m{\lceil \log_2 N \rceil,
9261ceil(log2(N))} bits each until one satisfies @math{R<N}.  This will normally
9262require only one or two attempts, but the attempts are limited in case the
9263generator is somehow degenerate and produces only 1 bits or similar.
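
The rejection loop can be sketched in Python as follows.  The generator is
any object with a @code{getrandbits} method, the limit of 80 attempts is an
arbitrary choice for the example, and subtracting @math{N} after the last
failed attempt is this sketch's own fallback; it always lands in range
because a candidate is below @math{2N}.

```python
import random

def urandomm(N, rng=random, max_attempts=80):
    """Return R with 0 <= R < N, for N >= 1, by rejection sampling."""
    if N == 1:
        return 0
    bits = (N - 1).bit_length()          # ceil(log2(N)) for N >= 2
    for _ in range(max_attempts):
        R = rng.getrandbits(bits)
        if R < N:
            return R
    # Degenerate generator: R < 2^bits < 2*N, so R - N is in range.
    return R - N
```

Since @math{N} is more than half of @math{2^{bits}}, each attempt succeeds
with probability above one half.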
9264
9265@cindex Mersenne twister algorithm
9266The Mersenne Twister generator is by Matsumoto and Nishimura
9267(@pxref{References}).  It has a non-repeating period of @math{2^@W{19937}-1},
9268which is a Mersenne prime, hence the name of the generator.  The state is 624
words of 32 bits each, which is iterated with one XOR and shift for each
927032-bit word generated, making the algorithm very fast.  Randomness properties
9271are also very good and this is the default algorithm used by GMP.
9272
9273@cindex Linear congruential algorithm
9274Linear congruential generators are described in many text books, for instance
9275Knuth volume 2 (@pxref{References}).  With a modulus @math{M} and parameters
9276@math{A} and @math{C}, an integer state @math{S} is iterated by the formula
9277@math{S @leftarrow{} A@GMPmultiply{}S+C @bmod{} M}.  At each step the new
9278state is a linear function of the previous, mod @math{M}, hence the name of
9279the generator.
9280
9281In GMP only moduli of the form @math{2^N} are supported, and the current
9282implementation is not as well optimized as it could be.  Overheads are
9283significant when @math{N} is small, and when @math{N} is large clearly the
9284multiply at each step will become slow.  This is not a big concern, since the
9285Mersenne Twister generator is better in every respect and is therefore
9286recommended for all normal applications.
9287
9288For both generators the current state can be deduced by observing enough
9289output and applying some linear algebra (over GF(2) in the case of the
9290Mersenne Twister).  This generally means raw output is unsuitable for
9291cryptographic applications without further hashing or the like.
9292
9293
9294@node Assembly Coding,  , Other Algorithms, Algorithms
9295@section Assembly Coding
9296@cindex Assembly coding
9297
9298The assembly subroutines in GMP are the most significant source of speed at
9299small to moderate sizes.  At larger sizes algorithm selection becomes more
9300important, but of course speedups in low level routines will still speed up
9301everything proportionally.
9302
9303Carry handling and widening multiplies that are important for GMP can't be
9304easily expressed in C@.  GCC @code{asm} blocks help a lot and are provided in
9305@file{longlong.h}, but hand coding low level routines invariably offers a
9306speedup over generic C by a factor of anything from 2 to 10.
9307
9308@menu
9309* Assembly Code Organisation::
9310* Assembly Basics::
9311* Assembly Carry Propagation::
9312* Assembly Cache Handling::
9313* Assembly Functional Units::
9314* Assembly Floating Point::
9315* Assembly SIMD Instructions::
9316* Assembly Software Pipelining::
9317* Assembly Loop Unrolling::
9318* Assembly Writing Guide::
9319@end menu
9320
9321
9322@node Assembly Code Organisation, Assembly Basics, Assembly Coding, Assembly Coding
9323@subsection Code Organisation
9324@cindex Assembly code organisation
9325@cindex Code organisation
9326
9327The various @file{mpn} subdirectories contain machine-dependent code, written
9328in C or assembly.  The @file{mpn/generic} subdirectory contains default code,
9329used when there's no machine-specific version of a particular file.
9330
9331Each @file{mpn} subdirectory is for an ISA family.  Generally 32-bit and
933264-bit variants in a family cannot share code and have separate directories.
9333Within a family further subdirectories may exist for CPU variants.
9334
9335In each directory a @file{nails} subdirectory may exist, holding code with
9336nails support for that CPU variant.  A @code{NAILS_SUPPORT} directive in each
9337file indicates the nails values the code handles.  Nails code only exists
9338where it's faster, or promises to be faster, than plain code.  There's no
9339effort put into nails if they're not going to enhance a given CPU.
9340
9341
9342@node Assembly Basics, Assembly Carry Propagation, Assembly Code Organisation, Assembly Coding
9343@subsection Assembly Basics
9344
9345@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines
9346for overall GMP performance.  All multiplications and divisions come down to
9347repeated calls to these.  @code{mpn_add_n}, @code{mpn_sub_n},
9348@code{mpn_lshift} and @code{mpn_rshift} are next most important.
9349
9350On some CPUs assembly versions of the internal functions
9351@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups,
9352mainly through avoiding function call overheads.  They can also potentially
9353make better use of a wide superscalar processor, as can bigger primitives like
9354@code{mpn_addmul_2} or @code{mpn_addmul_4}.
9355
9356The restrictions on overlaps between sources and destinations
9357(@pxref{Low-level Functions}) are designed to facilitate a variety of
9358implementations.  For example, knowing @code{mpn_add_n} won't have partly
9359overlapping sources and destination means reading can be done far ahead of
9360writing on superscalar processors, and loops can be vectorized on a vector
9361processor, depending on the carry handling.
9362
9363
9364@node Assembly Carry Propagation, Assembly Cache Handling, Assembly Basics, Assembly Coding
9365@subsection Carry Propagation
9366@cindex Assembly carry propagation
9367
9368The problem that presents most challenges in GMP is propagating carries from
9369one limb to the next.  In functions like @code{mpn_addmul_1} and
9370@code{mpn_add_n}, carries are the only dependencies between limb operations.
9371
9372On processors with carry flags, a straightforward CISC style @code{adc} is
9373generally best.  AMD K6 @code{mpn_addmul_1} however is an example of an
9374unusual set of circumstances where a branch works out better.
9375
9376On RISC processors generally an add and compare for overflow is used.  This
9377sort of thing can be seen in @file{mpn/generic/aors_n.c}.  Some carry
9378propagation schemes require 4 instructions, meaning at least 4 cycles per
9379limb, but other schemes may use just 1 or 2.  On wide superscalar processors
9380performance may be completely determined by the number of dependent
9381instructions between carry-in and carry-out for each limb.
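
The add and compare idiom can be sketched in portable C as below.  This is an
illustrative version assuming a 64-bit limb, with hypothetical names
@code{limb_t} and @code{toy_add_n}; it is not the code in
@file{mpn/generic}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical 64-bit limb */

/* rp[] = up[] + vp[] over n limbs, returning the carry out (0 or 1). */
limb_t toy_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t cy = 0;

    assert(rp != NULL && up != NULL && vp != NULL);

    for (size_t i = 0; i < n; i++) {
        limb_t u = up[i], v = vp[i];
        limb_t s = u + v;       /* wraps modulo 2^64 */
        limb_t c1 = s < u;      /* overflow in u + v */
        limb_t r = s + cy;
        limb_t c2 = r < s;      /* overflow when adding the carry-in */
        rp[i] = r;
        cy = c1 | c2;           /* the two can never both be set */
    }
    return cy;
}
```

The dependent chain from carry-in to carry-out here is the add of @code{cy},
the compare and the OR, which is the quantity that matters on a wide
superscalar processor.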
9382
9383On vector processors good use can be made of the fact that a carry bit only
9384very rarely propagates more than one limb.  When adding a single bit to a
9385limb, there's only a carry out if that limb was @code{0xFF@dots{}FF} which on
9386random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}},
2^mp_bits_per_limb}.  @file{mpn/cray/add_n.c} is an example of this: it adds
9388all limbs in parallel, adds one set of carry bits in parallel and then only
9389rarely needs to fall through to a loop propagating further carries.
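
The three phases just described can be sketched in scalar C as below.  This
is an illustration of the idea only, not the Cray code: the name
@code{toy_add_n_vec}, the 64-bit @code{limb_t} and the caller-supplied carry
vector @code{cy} (which must hold @math{n} limbs and not overlap the
operands) are assumptions of the sketch.  The first two loops have no
dependency between iterations, which is what would let a vector unit run
them in parallel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical 64-bit limb */

limb_t toy_add_n_vec(limb_t *rp, const limb_t *up, const limb_t *vp,
                     limb_t *cy, size_t n)
{
    size_t i;
    limb_t more = 0;

    assert(cy != rp && cy != up && cy != vp);
    if (n == 0)
        return 0;

    /* Phase 1: raw limb sums and their carry bits, all independent. */
    for (i = 0; i < n; i++) {
        limb_t s = up[i] + vp[i];
        cy[i] = s < up[i];
        rp[i] = s;
    }

    /* Phase 2: add each carry bit into the next limb up, again
       independently, noting whether any of those adds overflowed. */
    for (i = 1; i < n; i++) {
        limb_t s = rp[i] + cy[i - 1];
        more |= s < rp[i];
        rp[i] = s;
    }

    /* Phase 3, rarely needed: propagate the further carries serially. */
    if (more) {
        limb_t rec = 0;
        for (i = 1; i < n; i++) {
            limb_t r = rp[i];
            /* r == 0 after a carry-in of 1 means phase 2 wrapped here. */
            limb_t c0 = (r == 0 && cy[i - 1] != 0);
            limb_t s = r + rec;
            limb_t c1 = s < r;
            rp[i] = s;
            rec = c0 | c1;
        }
        return cy[n - 1] | rec;
    }
    return cy[n - 1];
}
```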
9390
9391On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code
9392for the RISC style idioms that are necessary to handle carry bits in
9393C@.  Often conditional jumps are generated where @code{adc} or @code{sbb} forms
9394would be better.  And so unfortunately almost any loop involving carry bits
9395needs to be coded in assembly for best results.
9396
9397
9398@node Assembly Cache Handling, Assembly Functional Units, Assembly Carry Propagation, Assembly Coding
9399@subsection Cache Handling
9400@cindex Assembly cache handling
9401
9402GMP aims to perform well both on operands that fit entirely in L1 cache and
9403those which don't.
9404
9405Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on
9406large operands, so L2 and main memory performance is important for them.
9407@code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and
9408square basecases, so L1 performance matters most for them, unless assembly
9409versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in
9410which case the remaining uses are mostly for larger operands.
9411
9412For L2 or main memory operands, memory access times will almost certainly be
9413more than the calculation time.  The aim therefore is to maximize memory
9414throughput, by starting a load of the next cache line while processing the
9415contents of the previous one.  Clearly this is only possible if the chip has a
9416lock-up free cache or some sort of prefetch instruction.  Most current chips
9417have both these features.
9418
9419Prefetching sources combines well with loop unrolling, since a prefetch can be
9420initiated once per unrolled loop (or more than once if the loop covers more
9421than one cache line).
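
As a rough C illustration of one prefetch per unrolled pass, the sketch below
uses the GCC and Clang extension @code{__builtin_prefetch}.  Everything else
is an assumption made for the example: the names @code{toy_and_n} and
@code{TOY_PREFETCH}, a 64-bit limb, a 64-byte cache line (so 8 limbs per
pass) and a prefetch distance of 32 limbs.  A real routine would be in
assembly with the distance tuned per CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;            /* hypothetical 64-bit limb */

/* __builtin_prefetch is a GCC/Clang extension; elsewhere do nothing. */
#if defined(__GNUC__)
#define TOY_PREFETCH(p) __builtin_prefetch(p)
#else
#define TOY_PREFETCH(p) ((void) 0)
#endif

#define TOY_UNROLL ((size_t) 8)     /* 8 limbs, one assumed cache line */
#define TOY_AHEAD  ((size_t) 32)    /* assumed prefetch distance, in limbs */

/* rp[i] = up[i] & vp[i] for 0 <= i < n. */
void toy_and_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    size_t i = 0;

    assert(rp != NULL && up != NULL && vp != NULL);

    for (; i + TOY_UNROLL <= n; i += TOY_UNROLL) {
        /* One prefetch per source per unrolled pass.  Staying inside
           the operands keeps the address expressions valid. */
        if (i + TOY_AHEAD < n) {
            TOY_PREFETCH(&up[i + TOY_AHEAD]);
            TOY_PREFETCH(&vp[i + TOY_AHEAD]);
        }
        rp[i + 0] = up[i + 0] & vp[i + 0];
        rp[i + 1] = up[i + 1] & vp[i + 1];
        rp[i + 2] = up[i + 2] & vp[i + 2];
        rp[i + 3] = up[i + 3] & vp[i + 3];
        rp[i + 4] = up[i + 4] & vp[i + 4];
        rp[i + 5] = up[i + 5] & vp[i + 5];
        rp[i + 6] = up[i + 6] & vp[i + 6];
        rp[i + 7] = up[i + 7] & vp[i + 7];
    }
    for (; i < n; i++)              /* leftover limbs */
        rp[i] = up[i] & vp[i];
}
```

The operands are not assumed to be cache line aligned, so a pass may straddle
two lines; the prefetches still touch each line of the sources once.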
9422
On CPUs without write-allocate caches, prefetching destinations will ensure
individual stores don't go further down the cache hierarchy, which would limit
bandwidth.  Of course for calculations which are slow anyway, like
9426@code{mpn_divrem_1}, write-throughs might be fine.
9427
9428The distance ahead to prefetch will be determined by memory latency versus
9429throughput.  The aim of course is to have data arriving continuously, at peak
9430throughput.  Some CPUs have limits on the number of fetches or prefetches in
9431progress.
9432
9433If a special prefetch instruction doesn't exist then a plain load can be used,
9434but in that case care must be taken not to attempt to read past the end of an
9435operand, since that might produce a segmentation violation.
9436
9437Some CPUs or systems have hardware that detects sequential memory accesses and
9438initiates suitable cache movements automatically, making life easy.
9439
9440
9441@node Assembly Functional Units, Assembly Floating Point, Assembly Cache Handling, Assembly Coding
9442@subsection Functional Units
9443
9444When choosing an approach for an assembly loop, consideration is given to
9445what operations can execute simultaneously and what throughput can thereby be
9446achieved.  In some cases an algorithm can be tweaked to accommodate available
9447resources.
9448
9449Loop control will generally require a counter and pointer updates, costing as
9450much as 5 instructions, plus any delays a branch introduces.  CPU addressing
9451modes might reduce pointer updates, perhaps by allowing just one updating
9452pointer and others expressed as offsets from it, or on CISC chips with all
9453addressing done with the loop counter as a scaled index.
9454
9455The final loop control cost can be amortised by processing several limbs in
9456each iteration (@pxref{Assembly Loop Unrolling}).  This at least ensures loop
control isn't a big fraction of the work done.
9458
9459Memory throughput is always a limit.  If perhaps only one load or one store
can be done per cycle then 3 cycles/limb will be the top speed for ``binary''
9461operations like @code{mpn_add_n}, and any code achieving that is optimal.
9462
9463Integer resources can be freed up by having the loop counter in a float
9464register, or by pressing the float units into use for some multiplying,
9465perhaps doing every second limb on the float side (@pxref{Assembly Floating
9466Point}).
9467
9468Float resources can be freed up by doing carry propagation on the integer
9469side, or even by doing integer to float conversions in integers using bit
9470twiddling.
9471
9472
9473@node Assembly Floating Point, Assembly SIMD Instructions, Assembly Functional Units, Assembly Coding
9474@subsection Floating Point
@cindex Assembly floating point
9476
9477Floating point arithmetic is used in GMP for multiplications on CPUs with poor
9478integer multipliers.  It's mostly useful for @code{mpn_mul_1},
9479@code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and
9480@code{mpn_mul_basecase} on both 32-bit and 64-bit machines.
9481
9482With IEEE 53-bit double precision floats, integer multiplications producing up
9483to 53 bits will give exact results.  Breaking a 64@cross{}64 multiplication
9484into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient.  With
9485some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be
9486used, if one of the lower two 21-bit pieces also uses the sign bit.
9487
9488For the @code{mpn_mul_1} family of functions on a 64-bit machine, the
9489invariant single limb is split at the start, into 3 or 4 pieces.  Inside the
9490loop, the bignum operand is split into 32-bit pieces.  Fast conversion of
9491these unsigned 32-bit pieces to floating point is highly machine-dependent.
9492In some cases, reading the data into the integer unit, zero-extending to
64 bits, then transferring to the floating point unit via memory is the
9494only option.
9495
9496Converting partial products back to 64-bit limbs is usually best done as a
9497signed conversion.  Since all values are smaller than @m{2^{53},2^53}, signed
9498and unsigned are the same, but most processors lack unsigned conversions.
9499
9500@sp 2
9501
9502Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or
9503@code{mpn_addmul_1} with a 64-bit limb.  The single limb operand V is split
9504into four 16-bit parts.  The multi-limb operand U is split in the loop into
9505two 32-bit parts.
9506
9507@tex
9508\global\newdimen\GMPbits      \global\GMPbits=0.18em
9509\def\GMPbox#1#2#3{%
9510  \hbox{%
9511    \hbox to 128\GMPbits{\hfil
9512      \vbox{%
9513        \hrule
9514        \hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
9515        \hrule}%
9516      \hskip #1\GMPbits}%
9517    \raise \GMPboxdepth \hbox{\hskip 2em #3}}}
9518%
9519\GMPdisplay{%
9520  \vbox{%
9521    \hbox{%
9522      \hbox to 128\GMPbits {\hfil
9523        \vbox{%
9524          \hrule
9525          \hbox to 64\GMPbits{%
9526            \GMPvrule \hfil$v48$\hfil
9527            \vrule    \hfil$v32$\hfil
9528            \vrule    \hfil$v16$\hfil
9529            \vrule    \hfil$v00$\hfil
9530            \vrule}
9531          \hrule}}%
9532       \raise \GMPboxdepth \hbox{\hskip 2em V Operand}}
9533    \vskip 0.5ex
9534    \hbox{%
9535      \hbox to 128\GMPbits {\hfil
9536        \raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}%
9537        \vbox{%
9538          \hrule
9539          \hbox to 64\GMPbits {%
9540            \GMPvrule \hfil$u32$\hfil
9541            \vrule \hfil$u00$\hfil
9542            \vrule}%
9543          \hrule}}%
9544       \raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}%
9545    \vskip 0.5ex
9546    \hbox{\vbox to 2ex{\hrule width 128\GMPbits}}%
9547    \GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}%
9548    \vskip 0.5ex
9549    \GMPbox{16}{u00 \times v16}{$p16$}
9550    \vskip 0.5ex
9551    \GMPbox{32}{u00 \times v32}{$p32$}
9552    \vskip 0.5ex
9553    \GMPbox{48}{u00 \times v48}{$p48$}
9554    \vskip 0.5ex
9555    \GMPbox{32}{u32 \times v00}{$r32$}
9556    \vskip 0.5ex
9557    \GMPbox{48}{u32 \times v16}{$r48$}
9558    \vskip 0.5ex
9559    \GMPbox{64}{u32 \times v32}{$r64$}
9560    \vskip 0.5ex
9561    \GMPbox{80}{u32 \times v48}{$r80$}
9562}}
9563@end tex
9564@ifnottex
9565@example
9566@group
9567                +---+---+---+---+
9568                |v48|v32|v16|v00|    V operand
9569                +---+---+---+---+
9570
                +-------+-------+
            x   |  u32  |  u00  |    U operand (one limb)
                +-------+-------+
9574
9575---------------------------------
9576
9577                    +-----------+
9578                    | u00 x v00 |    p00    48-bit products
9579                    +-----------+
9580                +-----------+
9581                | u00 x v16 |        p16
9582                +-----------+
9583            +-----------+
9584            | u00 x v32 |            p32
9585            +-----------+
9586        +-----------+
9587        | u00 x v48 |                p48
9588        +-----------+
9589            +-----------+
9590            | u32 x v00 |            r32
9591            +-----------+
9592        +-----------+
9593        | u32 x v16 |                r48
9594        +-----------+
9595    +-----------+
9596    | u32 x v32 |                    r64
9597    +-----------+
9598+-----------+
9599| u32 x v48 |                        r80
9600+-----------+
9601@end group
9602@end example
9603@end ifnottex
9604
9605@math{p32} and @math{r32} can be summed using floating-point addition, and
9606likewise @math{p48} and @math{r48}.  @math{p00} and @math{p16} can be summed
9607with @math{r64} and @math{r80} from the previous iteration.
9608
9609For each loop then, four 49-bit quantities are transferred to the integer unit,
9610aligned as follows,
9611
9612@tex
9613% GMPbox here should be 49 bits wide, but use 51 to better show p16+r80'
9614% crossing into the upper 64 bits.
9615\def\GMPbox#1#2#3{%
9616  \hbox{%
9617    \hbox to 128\GMPbits {%
9618      \hfil
9619      \vbox{%
9620        \hrule
9621        \hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
9622        \hrule}%
9623      \hskip #1\GMPbits}%
9624    \raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}%
9625}}
9626\newbox\b \setbox\b\hbox{64 bits}%
9627\newdimen\bw \bw=\wd\b \advance\bw by 2em
9628\newdimen\x \x=128\GMPbits
9629\advance\x by -2\bw
9630\divide\x by4
9631\GMPdisplay{%
9632  \vbox{%
9633    \hbox to 128\GMPbits {%
9634      \GMPvrule
9635      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9636      \hfil 64 bits\hfil
9637      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9638      \vrule
9639      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9640      \hfil 64 bits\hfil
9641      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9642      \vrule}%
9643    \vskip 0.7ex
9644    \GMPbox{0}{p00+r64'}{i00}
9645    \vskip 0.5ex
9646    \GMPbox{16}{p16+r80'}{i16}
9647    \vskip 0.5ex
9648    \GMPbox{32}{p32+r32}{i32}
9649    \vskip 0.5ex
9650    \GMPbox{48}{p48+r48}{i48}
9651}}
9652@end tex
9653@ifnottex
9654@example
9655@group
9656|-----64bits----|-----64bits----|
9657                   +------------+
9658                   | p00 + r64' |    i00
9659                   +------------+
9660               +------------+
9661               | p16 + r80' |        i16
9662               +------------+
9663           +------------+
9664           | p32 + r32  |            i32
9665           +------------+
9666       +------------+
9667       | p48 + r48  |                i48
9668       +------------+
9669@end group
9670@end example
9671@end ifnottex
9672
9673The challenge then is to sum these efficiently and add in a carry limb,
9674generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48}
9675extends 33 bits into the high half).
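
The exactness of the eight 16@cross{}32 bit products, and their alignment at
bit offsets 0, 16, 32, 48 (for @math{u00}) and 32, 48, 64, 80 (for
@math{u32}), can be checked with a small C sketch.  It assumes IEEE double
precision, and the names @code{toy_acc128} and @code{toy_umul_fp} are
hypothetical.  Unlike the scheme described above, it converts every product
back to an integer straight away and does all the summing on the integer
side, so it demonstrates the decomposition rather than an efficient loop.

```c
#include <assert.h>
#include <stdint.h>

/* Add p * 2^shift into the 128-bit accumulator hi:lo.  Requires
   p < 2^48 and shift <= 80 so nothing is lost off the top. */
void toy_acc128(uint64_t *hi, uint64_t *lo, uint64_t p, unsigned shift)
{
    uint64_t add_lo, add_hi;

    assert(p < ((uint64_t) 1 << 48) && shift <= 80);
    if (shift == 0) {
        add_lo = p;
        add_hi = 0;
    } else if (shift < 64) {
        add_lo = p << shift;
        add_hi = p >> (64 - shift);
    } else {
        add_lo = 0;
        add_hi = p << (shift - 64);
    }
    *lo += add_lo;
    *hi += add_hi + (*lo < add_lo);   /* carry from the low word */
}

/* hi:lo = u * v, using only double precision multiplies. */
void toy_umul_fp(uint64_t *hi, uint64_t *lo, uint64_t u, uint64_t v)
{
    double uf[2], vf[4];
    unsigned i, j;

    uf[0] = (double) (u & 0xFFFFFFFFu);          /* u00 */
    uf[1] = (double) (u >> 32);                  /* u32 */
    for (j = 0; j < 4; j++)                      /* v00, v16, v32, v48 */
        vf[j] = (double) ((v >> (16 * j)) & 0xFFFF);

    *hi = 0;
    *lo = 0;
    for (i = 0; i < 2; i++) {
        for (j = 0; j < 4; j++) {
            /* Each product is below 2^48, so it is exact in a 53-bit
               mantissa and a signed conversion recovers it. */
            int64_t p = (int64_t) (uf[i] * vf[j]);
            toy_acc128(hi, lo, (uint64_t) p, 32 * i + 16 * j);
        }
    }
}
```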
9676
9677
9678@node Assembly SIMD Instructions, Assembly Software Pipelining, Assembly Floating Point, Assembly Coding
9679@subsection SIMD Instructions
9680@cindex Assembly SIMD
9681
9682The single-instruction multiple-data support in current microprocessors is
9683aimed at signal processing algorithms where each data point can be treated
9684more or less independently.  There's generally not much support for
9685propagating the sort of carries that arise in GMP.
9686
9687SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much
9688work as one 32@cross{}32 from GMP's point of view, and need some shifts and
9689adds besides.  But of course if say the SIMD form is fully pipelined and uses
9690less instruction decoding then it may still be worthwhile.
9691
9692On the x86 chips, MMX has so far found a use in @code{mpn_rshift} and
9693@code{mpn_lshift}, and is used in a special case for 16-bit multipliers in the
9694P55 @code{mpn_mul_1}.  SSE2 is used for Pentium 4 @code{mpn_mul_1},
9695@code{mpn_addmul_1}, and @code{mpn_submul_1}.
9696
9697
9698@node Assembly Software Pipelining, Assembly Loop Unrolling, Assembly SIMD Instructions, Assembly Coding
9699@subsection Software Pipelining
9700@cindex Assembly software pipelining
9701
9702Software pipelining consists of scheduling instructions around the branch
9703point in a loop.  For example a loop might issue a load not for use in the
9704present iteration but the next, thereby allowing extra cycles for the data to
9705arrive from memory.
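
The shape of such a loop can be shown in C, though a compiler is free to
reschedule it and the real benefit is only obtained in assembly.  In this
sketch the loads for limb @math{i+1} are issued while limb @math{i} is being
added; the name @code{toy_add_n_pipelined} and the 64-bit @code{limb_t} are
assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical 64-bit limb */

/* rp[] = up[] + vp[] over n limbs, returning the carry out (0 or 1). */
limb_t toy_add_n_pipelined(limb_t *rp, const limb_t *up, const limb_t *vp,
                           size_t n)
{
    limb_t u, v, s, r, cy = 0;
    size_t i;

    assert(rp != NULL && up != NULL && vp != NULL);
    if (n == 0)
        return 0;

    /* Feed-in: loads for the first limb. */
    u = up[0];
    v = vp[0];

    for (i = 0; i + 1 < n; i++) {
        /* Loads for the NEXT iteration, not used in this one. */
        limb_t u_next = up[i + 1];
        limb_t v_next = vp[i + 1];

        /* Arithmetic on the values loaded one iteration ago. */
        s = u + v;
        r = s + cy;
        cy = (s < u) | (r < s);
        rp[i] = r;

        u = u_next;
        v = v_next;
    }

    /* Wind-down: the last limb has been loaded but not yet added. */
    s = u + v;
    r = s + cy;
    cy = (s < u) | (r < s);
    rp[n - 1] = r;
    return cy;
}
```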
9706
9707Naturally this is wanted only when doing things like loads or multiplies that
9708take several cycles to complete, and only where a CPU has multiple functional
9709units so that other work can be done in the meantime.
9710
9711A pipeline with several stages will have a data value in progress at each
9712stage and each loop iteration moves them along one stage.  This is like
9713juggling.
9714
9715If the latency of some instruction is greater than the loop time then it will
9716be necessary to unroll, so one register has a result ready to use while
another (or multiple others) are still in progress (@pxref{Assembly Loop
Unrolling}).
9719
9720
9721@node Assembly Loop Unrolling, Assembly Writing Guide, Assembly Software Pipelining, Assembly Coding
9722@subsection Loop Unrolling
9723@cindex Assembly loop unrolling
9724
9725Loop unrolling consists of replicating code so that several limbs are
9726processed in each loop.  At a minimum this reduces loop overheads by a
9727corresponding factor, but it can also allow better register usage, for example
9728alternately using one register combination and then another.  Judicious use of
9729@command{m4} macros can help avoid lots of duplication in the source code.
9730
9731Any amount of unrolling can be handled with a loop counter that's decremented
9732by @math{N} each time, stopping when the remaining count is less than the
9733further @math{N} the loop will process.  Or by subtracting @math{N} at the
9734start, the termination condition becomes when the counter @math{C} is less
9735than 0 (and the count of remaining limbs is @math{C+N}).
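
The counter arrangement with @math{N} subtracted at the start looks like this
in C, for @math{N=4} and a simple ones complement operation.  The name
@code{toy_com_unrolled} and the 64-bit @code{limb_t} are assumptions for
illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical 64-bit limb */

/* rp[i] = ~up[i] for 0 <= i < n, unrolled 4 ways. */
void toy_com_unrolled(limb_t *rp, const limb_t *up, long n)
{
    long c;

    assert(n >= 0);

    c = n - 4;                  /* subtract N = 4 at the start */
    while (c >= 0) {            /* at least 4 limbs remain */
        rp[0] = ~up[0];
        rp[1] = ~up[1];
        rp[2] = ~up[2];
        rp[3] = ~up[3];
        rp += 4;
        up += 4;
        c -= 4;
    }

    c += 4;                     /* remaining limbs C+N, which is 0 to 3 */
    while (c > 0) {             /* simple loop for the excess */
        *rp++ = ~*up++;
        c--;
    }
}
```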
9736
Alternatively, for a power of 2 unroll, the loop count and remainder can be
9738established with a shift and mask.  This is convenient if also making a
9739computed jump into the middle of a large loop.
9740
9741The limbs not a multiple of the unrolling can be handled in various ways, for
9742example
9743
9744@itemize @bullet
9745@item
9746A simple loop at the end (or the start) to process the excess.  Care will be
9747wanted that it isn't too much slower than the unrolled part.
9748
9749@item
9750A set of binary tests, for example after an 8-limb unrolling, test for 4 more
9751limbs to process, then a further 2 more or not, and finally 1 more or not.
9752This will probably take more code space than a simple loop.
9753
9754@item
9755A @code{switch} statement, providing separate code for each possible excess,
9756for example an 8-limb unrolling would have separate code for 0 remaining, 1
9757remaining, etc, up to 7 remaining.  This might take a lot of code, but may be
9758the best way to optimize all cases in combination with a deep pipelined loop.
9759
9760@item
9761A computed jump into the middle of the loop, thus making the first iteration
9762handle the excess.  This should make times smoothly increase with size, which
9763is attractive, but setups for the jump and adjustments for pointers can be
9764tricky and could become quite difficult in combination with deep pipelining.
9765@end itemize
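
The last of these, a computed jump into the middle of the loop, has a
well-known C analogue in which a @code{switch} enters a 4-way unrolled
@code{do} loop part way through, so the first pass handles the excess.  The
sketch below assumes a 64-bit @code{limb_t} and uses the hypothetical name
@code{toy_com_jump}; an assembly version would compute the entry address and
adjust pointers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* hypothetical 64-bit limb */

/* rp[i] = ~up[i] for 0 <= i < n, 4-way unrolled with a jump into the
   middle of the loop for the first, possibly partial, pass. */
void toy_com_jump(limb_t *rp, const limb_t *up, size_t n)
{
    size_t passes;

    assert(rp != NULL && up != NULL);
    if (n == 0)
        return;

    passes = (n + 3) / 4;       /* loop count, by shift for a power of 2 */
    switch (n % 4) {            /* remainder, by mask for a power of 2 */
    case 0: do { *rp++ = ~*up++;
    case 3:      *rp++ = ~*up++;
    case 2:      *rp++ = ~*up++;
    case 1:      *rp++ = ~*up++;
            } while (--passes > 0);
    }
}
```

With no separate excess loop, the time grows smoothly with @math{n}, at the
cost of the entry calculation on every call.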
9766
9767
9768@node Assembly Writing Guide,  , Assembly Loop Unrolling, Assembly Coding
9769@subsection Writing Guide
9770@cindex Assembly writing guide
9771
9772This is a guide to writing software pipelined loops for processing limb
9773vectors in assembly.
9774
9775First determine the algorithm and which instructions are needed.  Code it
9776without unrolling or scheduling, to make sure it works.  On a 3-operand CPU
try to write each new value to a new register; this will greatly simplify later
9778steps.
9779
9780Then note for each instruction the functional unit and/or issue port
requirements.  If an instruction can use either of two units, like U0 or U1,
9782then make a category ``U0/U1''.  Count the total using each unit (or combined
9783unit), and count all instructions.
9784
9785Figure out from those counts the best possible loop time.  The goal will be to
9786find a perfect schedule where instruction latencies are completely hidden.
9787The total instruction count might be the limiting factor, or perhaps a
9788particular functional unit.  It might be possible to tweak the instructions to
9789help the limiting factor.
9790
9791Suppose the loop time is @math{N}, then make @math{N} issue buckets, with the
9792final loop branch at the end of the last.  Now fill the buckets with dummy
9793instructions using the functional units desired.  Run this to make sure the
9794intended speed is reached.
9795
9796Now replace the dummy instructions with the real instructions from the slow
9797but correct loop you started with.  The first will typically be a load
9798instruction.  Then the instruction using that value is placed in a bucket an
9799appropriate distance down.  Run the loop again, to check it still runs at
9800target speed.
9801
9802Keep placing instructions, frequently measuring the loop.  After a few you
9803will need to wrap around from the last bucket back to the top of the loop.  If
9804you used the new-register for new-value strategy above then there will be no
9805register conflicts.  If not then take care not to clobber something already in
9806use.  Changing registers at this time is very error prone.
9807
9808The loop will overlap two or more of the original loop iterations, and the
9809computation of one vector element result will be started in one iteration of
9810the new loop, and completed one or several iterations later.
9811
9812The final step is to create feed-in and wind-down code for the loop.  A good
9813way to do this is to make a copy (or copies) of the loop at the start and
9814delete those instructions which don't have valid antecedents, and at the end
9815replicate and delete those whose results are unwanted (including any further
9816loads).
9817
9818The loop will have a minimum number of limbs loaded and processed, so the
9819feed-in code must test if the request size is smaller and skip either to a
9820suitable part of the wind-down or to special code for small sizes.
9821
9822
9823@node Internals, Contributors, Algorithms, Top
9824@chapter Internals
9825@cindex Internals
9826
9827@strong{This chapter is provided only for informational purposes and the
9828various internals described here may change in future GMP releases.
9829Applications expecting to be compatible with future releases should use only
9830the documented interfaces described in previous chapters.}
9831
9832@menu
9833* Integer Internals::
9834* Rational Internals::
9835* Float Internals::
9836* Raw Output Internals::
9837* C++ Interface Internals::
9838@end menu
9839
9840@node Integer Internals, Rational Internals, Internals, Internals
9841@section Integer Internals
9842@cindex Integer internals
9843
9844@code{mpz_t} variables represent integers using sign and magnitude, in space
9845dynamically allocated and reallocated.  The fields are as follows.
9846
9847@table @asis
9848@item @code{_mp_size}
9849The number of limbs, or the negative of that when representing a negative
9850integer.  Zero is represented by @code{_mp_size} set to zero, in which case
9851the @code{_mp_d} data is unused.
9852
9853@item @code{_mp_d}
9854A pointer to an array of limbs which is the magnitude.  These are stored
9855``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the
9856least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most
9857significant.  Whenever @code{_mp_size} is non-zero, the most significant limb
9858is non-zero.
9859
9860Currently there's always at least one limb allocated, so for instance
9861@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
9862@code{_mp_d[0]} unconditionally (though its value is then only wanted if
9863@code{_mp_size} is non-zero).
9864
9865@item @code{_mp_alloc}
9866@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
9867and naturally @code{_mp_alloc >= ABS(_mp_size)}.  When an @code{mpz} routine
9868is about to (or might be about to) increase @code{_mp_size}, it checks
9869@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
9870@code{MPZ_REALLOC} is generally used for this.
9871@end table
9872
9873The various bitwise logical functions like @code{mpz_and} behave as if
9874negative values were twos complement.  But sign and magnitude is always used
9875internally, and necessary adjustments are made during the calculations.
9876Sometimes this isn't pretty, but sign and magnitude are best for other
9877routines.
9878
Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
9880have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
9881allocation functions.  Care is taken to ensure that these are big enough that
9882no reallocation is necessary (since it would have unpredictable consequences).
9883
9884@code{_mp_size} and @code{_mp_alloc} are @code{int}, although @code{mp_size_t}
9885is usually a @code{long}.  This is done to make the fields just 32 bits on
some 64-bit systems, thereby saving a few bytes of data space but still
9887providing plenty of range.
9888
9889
9890@node Rational Internals, Float Internals, Integer Internals, Internals
9891@section Rational Internals
9892@cindex Rational internals
9893
9894@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
9895denominator (@pxref{Integer Internals}).
9896
9897The canonical form adopted is denominator positive (and non-zero), no common
9898factors between numerator and denominator, and zero uniquely represented as
98990/1.
9900
9901It's believed that casting out common factors at each stage of a calculation
9902is best in general.  A GCD is an @math{O(N^2)} operation so it's better to do
9903a few small ones immediately than to delay and have to do a big one later.
9904Knowing the numerator and denominator have no common factors can be used for
9905example in @code{mpq_mul} to make only two cross GCDs necessary, not four.
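
The two cross GCDs can be seen in a small fixed-precision sketch.  When
@math{a/b} and @math{c/d} are each in canonical form, only the pairs
@math{(a,d)} and @math{(c,b)} can share factors.  The type @code{toy_q} and
the function names are hypothetical, and the sketch uses 64-bit integers and
ignores overflow of the final products, which the real @code{mpz_t} based
code does not need to worry about.

```c
#include <assert.h>
#include <stdint.h>

/* A rational in canonical form: den > 0 and gcd(|num|, den) == 1. */
typedef struct { int64_t num; int64_t den; } toy_q;

uint64_t toy_abs(int64_t x)
{
    return x < 0 ? (uint64_t) 0 - (uint64_t) x : (uint64_t) x;
}

uint64_t toy_gcd(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Product of two canonical rationals, itself canonical. */
toy_q toy_q_mul(toy_q x, toy_q y)
{
    toy_q r;
    int64_t g1, g2;

    assert(x.den > 0 && y.den > 0);

    /* Cross GCDs only: gcd(x.num, x.den) and gcd(y.num, y.den) are
       already known to be 1.  Each result divides a positive den, so it
       fits in int64_t and is at least 1. */
    g1 = (int64_t) toy_gcd(toy_abs(x.num), (uint64_t) y.den);
    g2 = (int64_t) toy_gcd(toy_abs(y.num), (uint64_t) x.den);

    r.num = (x.num / g1) * (y.num / g2);
    r.den = (x.den / g2) * (y.den / g1);
    return r;
}
```

For instance @math{2/3} times @math{3/4} has cross GCDs 2 and 3, giving
@math{1/2} directly with no GCD on the product.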
9906
9907This general approach to common factors is badly sub-optimal in the presence
9908of simple factorizations or little prospect for cancellation, but GMP has no
9909way to know when this will occur.  As per @ref{Efficiency}, that's left to
9910applications.  The @code{mpq_t} framework might still suit, with
9911@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and
9912denominator, or of course @code{mpz_t} variables can be used directly.
9913
9914
9915@node Float Internals, Raw Output Internals, Rational Internals, Internals
9916@section Float Internals
9917@cindex Float internals
9918
9919Efficient calculation is the primary aim of GMP floats and the use of whole
9920limbs and simple rounding facilitates this.
9921
9922@code{mpf_t} floats have a variable precision mantissa and a single machine
9923word signed exponent.  The mantissa is represented using sign and magnitude.
9924
9925@c FIXME: The arrow heads don't join to the lines exactly.
9926@tex
9927\global\newdimen\GMPboxwidth \GMPboxwidth=5em
9928\global\newdimen\GMPboxheight \GMPboxheight=3ex
9929\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
9930\GMPdisplay{%
9931\vbox{%
9932  \hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb}
9933  \vskip 0.7ex
9934  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
9935  \hbox {
9936    \hbox to 3\GMPboxwidth {%
9937      \setbox 0 = \hbox{@code{\_mp\_exp}}%
9938      \dimen0=3\GMPboxwidth
9939      \advance\dimen0 by -\wd0
9940      \divide\dimen0 by 2
9941      \advance\dimen0 by -1em
9942      \setbox1 = \hbox{$\rightarrow$}%
9943      \dimen1=\dimen0
9944      \advance\dimen1 by -\wd1
9945      \GMPcentreline{\dimen0}%
9946      \hfil
9947      \box0%
9948      \hfil
9949      \GMPcentreline{\dimen1{}}%
9950      \box1}
9951    \hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}}
9952  \vskip 0.5ex
9953  \vbox {%
9954    \hrule
9955    \hbox{%
9956      \vrule height 2ex depth 1ex
9957      \hbox to \GMPboxwidth {}%
9958      \vrule
9959      \hbox to \GMPboxwidth {}%
9960      \vrule
9961      \hbox to \GMPboxwidth {}%
9962      \vrule
9963      \hbox to \GMPboxwidth {}%
9964      \vrule
9965      \hbox to \GMPboxwidth {}%
9966      \vrule}
9967    \hrule
9968  }
9969  \hbox {%
9970    \hbox to 0.8 pt {}
9971    \hbox to 3\GMPboxwidth {%
9972      \hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}}
9973  \hbox to 5\GMPboxwidth{%
9974    \setbox 0 = \hbox{@code{\_mp\_size}}%
9975    \dimen0 = 5\GMPboxwidth
9976    \advance\dimen0 by -\wd0
9977    \divide\dimen0 by 2
9978    \advance\dimen0 by -1em
9979    \dimen1 = \dimen0
9980    \setbox1 = \hbox{$\leftarrow$}%
9981    \setbox2 = \hbox{$\rightarrow$}%
9982    \advance\dimen0 by -\wd1
9983    \advance\dimen1 by -\wd2
9984    \hbox to 0.3 em {}%
9985    \box1
9986    \GMPcentreline{\dimen0}%
9987    \hfil
9988    \box0
9989    \hfil
9990    \GMPcentreline{\dimen1}%
9991    \box2}
9992}}
9993@end tex
9994@ifnottex
9995@example
9996   most                   least
9997significant            significant
9998   limb                   limb
9999
10000                            _mp_d
10001 |---- _mp_exp --->           |
10002  _____ _____ _____ _____ _____
10003 |_____|_____|_____|_____|_____|
10004                   . <------------ radix point
10005
10006  <-------- _mp_size --------->
10007@sp 1
10008@end example
10009@end ifnottex
10010
10011@noindent
10012The fields are as follows.
10013
10014@table @asis
10015@item @code{_mp_size}
10016The number of limbs currently in use, or the negative of that when
10017representing a negative value.  Zero is represented by @code{_mp_size} and
10018@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is
10019unused.  (In the future @code{_mp_exp} might be undefined when representing
10020zero.)
10021
10022@item @code{_mp_prec}
10023The precision of the mantissa, in limbs.  In any calculation the aim is to
10024produce @code{_mp_prec} limbs of result (the most significant being non-zero).
10025
10026@item @code{_mp_d}
10027A pointer to the array of limbs which is the absolute value of the mantissa.
10028These are stored ``little endian'' as per the @code{mpn} functions, so
10029@code{_mp_d[0]} is the least significant limb and
10030@code{_mp_d[ABS(_mp_size)-1]} the most significant.
10031
10032The most significant limb is always non-zero, but there are no other
10033restrictions on its value, in particular the highest 1 bit can be anywhere
10034within the limb.
10035
10036@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being
10037for convenience (see below).  There are no reallocations during a calculation,
10038only in a change of precision with @code{mpf_set_prec}.
10039
10040@item @code{_mp_exp}
10041The exponent, in limbs, determining the location of the implied radix point.
10042Zero means the radix point is just above the most significant limb.  Positive
10043values mean a radix point offset towards the lower limbs and hence a value
10044@math{@ge{} 1}, as for example in the diagram above.  Negative exponents mean
10045a radix point further above the highest limb.
10046
The exponent can of course be any value; it doesn't have to fall within the
limbs as the diagram shows, and can be a long way above or a long way below.
Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
are treated as zero.
10051@end table
10052
The @code{_mp_size} and @code{_mp_prec} fields are @code{int}, although the
@code{mp_size_t} type is usually a @code{long}.  The @code{_mp_exp} field is
usually @code{long}.  This keeps some fields to just 32 bits on some 64-bit
systems, saving a few bytes of data space while still providing plenty of
precision and a very large range.
10058
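As a rough illustration of how @code{_mp_size}, @code{_mp_exp} and
@code{_mp_d} combine, the following sketch approximates the represented value
as a @code{double}.  The @code{ToyFloat} structure and @code{toy_value}
function are hypothetical stand-ins invented for this example, not part of
GMP, and 32-bit limbs are assumed.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for the mpf_t fields described above (32-bit limbs assumed).
struct ToyFloat {
  int size;                // like _mp_size: limb count, negated for negative values
  long exp;                // like _mp_exp: exponent, in limbs
  const std::uint32_t *d;  // like _mp_d: limbs, least significant first
};

// Approximate the value as a double.  With B = 2^32 and n = |size|, limb i has
// weight B^(exp - n + i), so the most significant limb sits just below B^exp.
inline double toy_value(const ToyFloat &f) {
  const int n = std::abs(f.size);
  double sum = 0.0;
  for (int i = 0; i < n; ++i) {
    const int limb_exponent = static_cast<int>(f.exp) - n + i;
    sum += std::ldexp(static_cast<double>(f.d[i]), 32 * limb_exponent);
  }
  return f.size < 0 ? -sum : sum;  // the sign is carried by the size field
}
```

With limbs @code{0x80000000} (low) and @code{1} (high), a size of 2 and an
exponent of 1, the radix point falls between the two limbs and the sketch
gives 1.5.  A size of zero gives zero without reading any limb.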
10059
10060@sp 1
10061@noindent
10062The following various points should be noted.
10063
10064@table @asis
10065@item Low Zeros
10066The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
10067zeros can always be ignored.  Routines likely to produce low zeros check and
10068avoid them to save time in subsequent calculations, but for most routines
10069they're quite unlikely and aren't checked.
10070
10071@item Mantissa Size Range
10072The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
10073the value can be represented in less.  This means low precision values or
10074small integers stored in a high precision @code{mpf_t} can still be operated
10075on efficiently.
10076
10077@code{_mp_size} can also be greater than @code{_mp_prec}.  Firstly a value is
10078allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
10079and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
10080@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
10081@code{_mp_prec}.
10082
10083@item Rounding
All rounding is done on limb boundaries.  Calculating @code{_mp_prec} limbs
with the high limb non-zero ensures that the application's requested minimum
precision is obtained.
10087
10088The use of simple ``trunc'' rounding towards zero is efficient, since there's
10089no need to examine extra limbs and increment or decrement.
10090
10091@item Bit Shifts
10092Since the exponent is in limbs, there are no bit shifts in basic operations
10093like @code{mpf_add} and @code{mpf_mul}.  When differing exponents are
10094encountered all that's needed is to adjust pointers to line up the relevant
10095limbs.
10096
10097Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts,
10098but the choice is between an exponent in limbs which requires shifts there, or
10099one in bits which requires them almost everywhere else.
10100
10101@item Use of @code{_mp_prec+1} Limbs
10102The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just
10103@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its
10104operation.  @code{mpf_add} for instance will do an @code{mpn_add} of
10105@code{_mp_prec} limbs.  If there's no carry then that's the result, but if
10106there is a carry then it's stored in the extra limb of space and
10107@code{_mp_size} becomes @code{_mp_prec+1}.
10108
10109Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not
10110needed for the intended precision, only the @code{_mp_prec} high limbs.  But
10111zeroing it out or moving the rest down is unnecessary.  Subsequent routines
10112reading the value will simply take the high limbs they need, and this will be
10113@code{_mp_prec} if their target has that same precision.  This is no more than
10114a pointer adjustment, and must be checked anyway since the destination
10115precision can be different from the sources.
10116
10117Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs
10118if available.  This ensures that a variable which has @code{_mp_size} equal to
10119@code{_mp_prec+1} will get its full exact value copied.  Strictly speaking
10120this is unnecessary since only @code{_mp_prec} limbs are needed for the
10121application's requested precision, but it's considered that an @code{mpf_set}
10122from one variable into another of the same precision ought to produce an exact
10123copy.
10124
10125@item Application Precisions
@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
@code{_mp_prec}.  The value in bits is rounded up to a whole number of limbs,
then an extra limb is added, since the most significant limb of @code{_mp_d}
is only guaranteed to be non-zero and therefore might contain only one bit.
10130
10131@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
10132limb from @code{_mp_prec} before converting to bits.  The net effect of
10133reading back with @code{mpf_get_prec} is simply the precision rounded up to a
10134multiple of @code{mp_bits_per_limb}.
10135
Note that the extra limb added here, to allow for the high limb being merely
non-zero, is in addition to the extra limb allocated to @code{_mp_d}.  For
example with a 32-bit limb, an application request for 250 bits will be
rounded up to 8 limbs, then an extra limb added for the high limb being only
non-zero, giving an @code{_mp_prec} of 9.  @code{_mp_d} then gets 10 limbs
allocated.  Reading back with @code{mpf_get_prec} will take @code{_mp_prec},
subtract 1 limb and multiply by 32, giving 256 bits.
10143
Strictly speaking, the fact that the high limb has at least one bit means that
a float with, say, 3 limbs of 32 bits each will be holding at least 65 bits,
but for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a
nice multiple of the limb size.
10148@end table
10149
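The precision arithmetic described under ``Application Precisions'' can be
modelled as below.  This is only a sketch of the rounding as the text
describes it, not the actual macros; the function names and the
@code{limb_bits} parameter are hypothetical.

```cpp
// Hypothetical model of the bits-to-limbs conversion described in the text.
// limb_bits is the number of bits per limb (32 in the worked example).
inline long bits_to_prec(long bits, long limb_bits) {
  const long whole_limbs = (bits + limb_bits - 1) / limb_bits;  // round up to whole limbs
  return whole_limbs + 1;  // extra limb: the high limb may hold as little as one bit
}

// Reverse conversion: drop the extra limb, then convert limbs back to bits.
inline long prec_to_bits(long prec, long limb_bits) {
  return (prec - 1) * limb_bits;
}

// Limbs allocated at _mp_d: one more than the precision, to hold a carry.
inline long limbs_allocated(long prec) {
  return prec + 1;
}
```

For the example above, @code{bits_to_prec(250, 32)} gives 9,
@code{limbs_allocated(9)} gives 10 and @code{prec_to_bits(9, 32)} gives 256.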
10150
10151@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
10152@section Raw Output Internals
10153@cindex Raw output internals
10154
10155@noindent
10156@code{mpz_out_raw} uses the following format.
10157
10158@tex
10159\global\newdimen\GMPboxwidth \GMPboxwidth=5em
10160\global\newdimen\GMPboxheight \GMPboxheight=3ex
10161\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
10162\GMPdisplay{%
10163\vbox{%
10164  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
10165  \vbox {%
10166    \hrule
10167    \hbox{%
10168      \vrule height 2.5ex depth 1.5ex
10169      \hbox to \GMPboxwidth {\hfil size\hfil}%
10170      \vrule
10171      \hbox to 3\GMPboxwidth {\hfil data bytes\hfil}%
10172      \vrule}
10173    \hrule}
10174}}
10175@end tex
10176@ifnottex
10177@example
10178+------+------------------------+
10179| size |       data bytes       |
10180+------+------------------------+
10181@end example
10182@end ifnottex
10183
The size is 4 bytes written most significant byte first, being the number of
subsequent data bytes, or the two's complement negative of that when a
negative integer is represented.  The data bytes are the absolute value of the
integer, written most significant byte first.
10188
10189The most significant data byte is always non-zero, so the output is the same
10190on all systems, irrespective of limb size.
10191
10192In GMP 1, leading zero bytes were written to pad the data bytes to a multiple
10193of the limb size.  @code{mpz_inp_raw} will still accept this, for
10194compatibility.
10195
The use of ``big endian'' for both the size and data fields is deliberate: it
makes the data easy to read in a hex dump of a file.  Unfortunately it also
means that the limb data must be reversed when reading or writing, so neither
a big endian nor a little endian system can just read and write @code{_mp_d}.
10200
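The layout can be sketched without GMP as follows.  The helper
@code{encode_raw} is hypothetical and takes the magnitude as bytes, most
significant first with no leading zero byte; it illustrates the byte stream
described above rather than how @code{mpz_out_raw} is implemented.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: build the raw byte stream from a sign and the magnitude
// bytes (most significant first, no leading zero byte).
inline std::vector<std::uint8_t> encode_raw(
    bool negative, const std::vector<std::uint8_t> &magnitude) {
  // Size field: the data byte count, or its two's complement negative
  // when the integer is negative.
  std::uint32_t size = static_cast<std::uint32_t>(magnitude.size());
  if (negative) size = 0u - size;

  std::vector<std::uint8_t> out;
  for (int shift = 24; shift >= 0; shift -= 8)  // 4 size bytes, big endian
    out.push_back(static_cast<std::uint8_t>(size >> shift));
  out.insert(out.end(), magnitude.begin(), magnitude.end());  // data bytes
  return out;
}
```

Under these assumptions the value 258 (hex 0102) is encoded as the bytes
@code{00 00 00 02 01 02}, and @minus{}258 as @code{FF FF FF FE 01 02}.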
10201
10202@node C++ Interface Internals,  , Raw Output Internals, Internals
10203@section C++ Interface Internals
10204@cindex C++ interface internals
10205
10206A system of expression templates is used to ensure something like @code{a=b+c}
10207turns into a simple call to @code{mpz_add} etc.  For @code{mpf_class}
10208the scheme also ensures the precision of the final
10209destination is used for any temporaries within a statement like
10210@code{f=w*x+y*z}.  These are important features which a naive implementation
10211cannot provide.
10212
10213A simplified description of the scheme follows.  The true scheme is
10214complicated by the fact that expressions have different return types.  For
10215detailed information, refer to the source code.
10216
10217To perform an operation, say, addition, we first define a ``function object''
10218evaluating it,
10219
10220@example
10221struct __gmp_binary_plus
10222@{
10223  static void eval(mpf_t f, mpf_t g, mpf_t h) @{ mpf_add(f, g, h); @}
10224@};
10225@end example
10226
10227@noindent
10228And an ``additive expression'' object,
10229
10230@example
10231__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
10232operator+(const mpf_class &f, const mpf_class &g)
10233@{
10234  return __gmp_expr
10235    <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
10236@}
10237@end example
10238
10239The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<@dots{}>>} is used to
10240encapsulate any possible kind of expression into a single template type.  In
10241fact even @code{mpf_class} etc are @code{typedef} specializations of
10242@code{__gmp_expr}.
10243
10244Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.
10245
10246@example
10247template <class T>
10248mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
10249@{
10250  expr.eval(this->get_mpf_t(), this->precision());
10251  return *this;
10252@}
10253
10254template <class Op>
10255void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
10256(mpf_t f, mp_bitcnt_t precision)
10257@{
10258  Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
10259@}
10260@end example
10261
@noindent
where @code{expr.val1} and @code{expr.val2} are references to the expression's
operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
@code{__gmp_expr}).
10265
10266This way, the expression is actually evaluated only at the time of assignment,
10267when the required precision (that of @code{f}) is known.  Furthermore the
10268target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
10269with @code{f} as the output argument.
10270
10271Compound expressions are handled by defining operators taking subexpressions
10272as their arguments, like this:
10273
10274@example
10275template <class T, class U>
10276__gmp_expr
10277<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
10278operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
10279@{
10280  return __gmp_expr
10281    <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
10282    (expr1, expr2);
10283@}
10284@end example
10285
10286And the corresponding specializations of @code{__gmp_expr::eval}:
10287
10288@example
10289template <class T, class U, class Op>
10290void __gmp_expr
10291<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
10292(mpf_t f, mp_bitcnt_t precision)
10293@{
10294  // declare two temporaries
10295  mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
10296  Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
10297@}
10298@end example
10299
10300The expression is thus recursively evaluated to any level of complexity and
10301all subexpressions are evaluated to the precision of @code{f}.
10302
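The deferred evaluation described above can be imitated without GMP.  The
following is a minimal sketch, not the @file{gmpxx.h} implementation:
@code{toy::Num} stands in for @code{mpf_class}, a count of fractional bits
stands in for the precision, and all names are hypothetical.

```cpp
#include <cmath>

namespace toy {

// Truncate x towards zero, keeping frac_bits binary digits after the point.
inline double trunc_bits(double x, int frac_bits) {
  return std::ldexp(std::trunc(std::ldexp(x, frac_bits)), -frac_bits);
}

// "Function objects" naming the operation, in the spirit of __gmp_binary_plus.
struct Plus  { static double apply(double a, double b) { return a + b; } };
struct Times { static double apply(double a, double b) { return a * b; } };

// An unevaluated binary expression.  It only holds references, which stay
// valid until the end of the full expression in which it was built.
template <class L, class R, class Op>
struct BinExpr {
  const L &lhs;
  const R &rhs;
  // Evaluate both operands, then the operation, all at the caller's precision.
  double eval(int frac_bits) const {
    return trunc_bits(Op::apply(lhs.eval(frac_bits), rhs.eval(frac_bits)),
                      frac_bits);
  }
};

// Toy stand-in for mpf_class: a value plus its own precision.
struct Num {
  double value;
  int frac_bits;
  Num(double v, int bits) : value(trunc_bits(v, bits)), frac_bits(bits) {}
  double eval(int) const { return value; }  // a leaf is read as it stands
  // Assignment is the point where evaluation happens, at this precision.
  template <class L, class R, class Op>
  Num &operator=(const BinExpr<L, R, Op> &e) {
    value = e.eval(frac_bits);
    return *this;
  }
};

// Operators build expression objects instead of computing anything.
template <class L, class R>
BinExpr<L, R, Plus> operator+(const L &l, const R &r) {
  return BinExpr<L, R, Plus>{l, r};
}

template <class L, class R>
BinExpr<L, R, Times> operator*(const L &l, const R &r) {
  return BinExpr<L, R, Times>{l, r};
}

}  // namespace toy
```

Assigning @code{w * x + y * z} with @code{w} = 1.5, @code{x} = 2.25 and
@code{y} = @code{z} = 0.5 to a destination holding 1 fractional bit gives 3.0,
whereas a destination holding 3 fractional bits gives 3.625, since both
products are truncated to the destination's precision before being added.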
10303
10304@node Contributors, References, Internals, Top
10305@comment  node-name,  next,  previous,  up
10306@appendix Contributors
10307@cindex Contributors
10308
Torbj@"orn Granlund wrote the original GMP library and is still the main
developer.  Code not explicitly attributed to others was contributed by
Torbj@"orn.  Several other individuals and organizations have contributed to
GMP.  Here is a list in chronological order of first contribution:
10313
10314Gunnar Sj@"odin and Hans Riesel helped with mathematical problems in early
10315versions of the library.
10316
10317Richard Stallman helped with the interface design and revised the first
10318version of this manual.
10319
10320Brian Beuning and Doug Lea helped with testing of early versions of the
10321library and made creative suggestions.
10322
10323John Amanatides of York University in Canada contributed the function
10324@code{mpz_probab_prime_p}.
10325
Paul Zimmermann wrote the REDC-based @code{mpz_powm} code, the Sch@"onhage-Strassen
10327FFT multiply code, and the Karatsuba square root code.  He also improved the
10328Toom3 code for GMP 4.2.  Paul sparked the development of GMP 2, with his
10329comparisons between bignum packages.  The ECMNET project Paul is organizing
10330was a driving force behind many of the optimizations in GMP 3.  Paul also
10331wrote the new GMP 4.3 nth root code (with Torbj@"orn).
10332
10333Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
10334contributed now defunct versions of @code{mpz_gcd}, @code{mpz_divexact},
10335@code{mpn_gcd}, and @code{mpn_bdivmod}, partially supported by CNPq (Brazil)
10336grant 301314194-2.
10337
10338Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
10339He has also made valuable suggestions and tested numerous intermediary
10340releases.
10341
10342Joachim Hollman was involved in the design of the @code{mpf} interface, and in
10343the @code{mpz} design revisions for version 2.
10344
10345Bennet Yee contributed the initial versions of @code{mpz_jacobi} and
10346@code{mpz_legendre}.
10347
10348Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and
10349@file{mpn/m68k/rshift.S} (now in @file{.asm} form).
10350
10351Robert Harley of Inria, France and David Seal of ARM, England, suggested clever
10352improvements for population count.  Robert also wrote highly optimized
10353Karatsuba and 3-way Toom multiplication functions for GMP 3, and contributed
10354the ARM assembly code.
10355
10356Torsten Ekedahl of the Mathematical department of Stockholm University provided
10357significant inspiration during several phases of the GMP development.  His
10358mathematical expertise helped improve several algorithms.
10359
10360Linus Nordberg wrote the new configure system based on autoconf and
10361implemented the new random functions.
10362
10363Kevin Ryde worked on a large number of things: optimized x86 code, m4 asm
10364macros, parameter tuning, speed measuring, the configure system, function
10365inlining, divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas
10366number functions, printf and scanf functions, perl interface, demo expression
10367parser, the algorithms chapter in the manual, @file{gmpasm-mode.el}, and
10368various miscellaneous improvements elsewhere.
10369
10370Kent Boortz made the Mac OS 9 port.
10371
10372Steve Root helped write the optimized alpha 21264 assembly code.
10373
10374Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++
10375@code{istream} input routines.
10376
10377Jason Moxham rewrote @code{mpz_fac_ui}.
10378
10379Pedro Gimeno implemented the Mersenne Twister and made other random number
10380improvements.
10381
Niels M@"oller wrote the sub-quadratic GCD, extended GCD and Jacobi code, the
quadratic Hensel division code, and (with Torbj@"orn) the new divide and
conquer division code for GMP 4.3.  Niels also helped implement the new Toom
multiply code for GMP 4.3 and implemented helper functions to simplify Toom
evaluations for GMP 5.0.  He wrote the original version of
@code{mpn_mulmod_bnm1}, and he is the main author of the mini-gmp package used
for GMP bootstrapping.
10388
10389Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply strategy,
10390and found the optimal strategies for evaluation and interpolation in Toom
10391multiplication.
10392
Marco Bodrato helped implement the new Toom multiply code for GMP 4.3 and
implemented most of the new Toom multiply and squaring code for GMP 5.0.
He is the main author of the current @code{mpn_mulmod_bnm1} and
@code{mpn_mullo_n}.  Marco also wrote the functions @code{mpn_invert} and
@code{mpn_invertappr}.  He is the author of the current combinatorial
functions: binomial, factorial, multifactorial, primorial.
10399
10400David Harvey suggested the internal function @code{mpn_bdiv_dbm1}, implementing
10401division relevant to Toom multiplication.  He also worked on fast assembly
10402sequences, in particular on a fast AMD64 @code{mpn_mul_basecase}. He wrote
10403the internal middle product functions @code{mpn_mulmid_basecase},
10404@code{mpn_toom42_mulmid}, @code{mpn_mulmid_n} and related helper routines.
10405
10406Martin Boij wrote @code{mpn_perfect_power_p}.
10407
10408Marc Glisse improved @file{gmpxx.h}: use fewer temporaries (faster),
10409specializations of @code{numeric_limits} and @code{common_type}, C++11
10410features (move constructors, explicit bool conversion, UDL), make the
10411conversion from @code{mpq_class} to @code{mpz_class} explicit, optimize
10412operations where one argument is a small compile-time constant, replace
10413some heap allocations by stack allocations.  He also fixed the eofbit
10414handling of C++ streams, and removed one division from @file{mpq/aors.c}.
10415
(This list is chronological, not ordered by significance.  If you have
contributed to GMP but are not listed above, please tell
@email{gmp-devel@@gmplib.org} about the omission!)
10419
The development of the floating point functions of GNU MP 2 was supported in
part by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO
(POlynomial System SOlving).
10423
10424The development of GMP 2, 3, and 4 was supported in part by the IDA Center for
10425Computing Sciences.
10426
10427Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
10428environment.
10429
10430@node References, GNU Free Documentation License, Contributors, Top
10431@comment  node-name,  next,  previous,  up
10432@appendix References
10433@cindex References
10434
10435@c  FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
10436@c  but being long words they upset paragraph formatting (the preceding line
@c  can get badly stretched).  Would like a conditional @* style line break
10438@c  if the uref is too long to fit on the last line of the paragraph, but it's
10439@c  not clear how to do that.  For now explicit @texlinebreak{}s are used on
10440@c  paragraphs that come out bad.
10441
10442@section Books
10443
10444@itemize @bullet
10445@item
10446Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in
10447Analytic Number Theory and Computational Complexity'', Wiley, 1998.
10448
10449@item
10450Richard Crandall and Carl Pomerance, ``Prime Numbers: A Computational
10451Perspective'', 2nd edition, Springer-Verlag, 2005.
10452@texlinebreak{} @uref{http://www.math.dartmouth.edu/~carlp/}
10453
10454@item
10455Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate
10456Texts in Mathematics number 138, Springer-Verlag, 1993.
10457@texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen/}
10458
10459@item
10460Donald E. Knuth, ``The Art of Computer Programming'', volume 2,
10461``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998.
10462@texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html}
10463
10464@item
10465John D. Lipson, ``Elements of Algebra and Algebraic Computing'',
10466The Benjamin Cummings Publishing Company Inc, 1981.
10467
10468@item
10469Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of
10470Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/}
10471
10472@item
10473Richard M. Stallman and the GCC Developer Community, ``Using the GNU Compiler
10474Collection'', Free Software Foundation, 2008, available online
10475@uref{http://gcc.gnu.org/onlinedocs/}, and in the GCC package
10476@uref{ftp://ftp.gnu.org/gnu/gcc/}
10477@end itemize
10478
10479@section Papers
10480
10481@itemize @bullet
10482@item
10483Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square
10484Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252.  Also
10485available online as INRIA Research Report 4475, June 2002,
10486@uref{http://hal.inria.fr/docs/00/07/21/13/PDF/RR-4475.pdf}
10487
10488@item
10489Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'',
10490Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022,
10491@texlinebreak{} @uref{http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022}
10492
10493@item
10494Torbj@"orn Granlund and Peter L. Montgomery, ``Division by Invariant Integers
10495using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June
104961994.  Also available @uref{http://gmplib.org/~tege/divcnst-pldi94.pdf}.
10497
10498@item
10499Niels M@"oller and Torbj@"orn Granlund, ``Improved division by invariant
10500integers'', IEEE Transactions on Computers, 11 June 2010.
10501@uref{http://gmplib.org/~tege/division-paper.pdf}
10502
10503@item
10504Torbj@"orn Granlund and Niels M@"oller, ``Division of integers large and
10505small'', to appear.
10506
10507@item
10508Tudor Jebelean,
10509``An algorithm for exact division'',
10510Journal of Symbolic Computation,
10511volume 15, 1993, pp.@: 169-180.
10512Research report version available @texlinebreak{}
10513@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}
10514
10515@item
10516Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
10517Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
10518@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}
10519
10520@item
10521Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
10522ISSAC 97, pp.@: 339-341.  Technical report available @texlinebreak{}
10523@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}
10524
10525@item
10526Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
10527pp.@: 111-116.  Technical report version available @texlinebreak{}
10528@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}
10529
10530@item
10531Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
10532of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
10533pp.@: 145-157.  Technical report version also available @texlinebreak{}
10534@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}
10535
10536@item
10537Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
10538Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455.  Early
10539technical report version also available
10540@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}
10541
10542@item
10543Makoto Matsumoto and Takuji Nishimura, ``Mersenne Twister: A 623-dimensionally
10544equidistributed uniform pseudorandom number generator'', ACM Transactions on
10545Modelling and Computer Simulation, volume 8, January 1998, pp.@: 3-30.
10546Available online @texlinebreak{}
10547@uref{http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.ps.gz} (or .pdf)
10548
10549@item
10550R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'',
10551Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
10552Theory, October 1972, pp.@: 90-96.  Reprinted as ``Fast Modular Transforms'',
10553Journal of Computer and System Sciences, volume 8, number 3, June 1974,
10554pp.@: 366-386.
10555
10556@item
Niels M@"oller, ``On Sch@"onhage's algorithm and subquadratic integer GCD
computation'', in Mathematics of Computation, volume 77, January 2008, pp.@:
589-607.
10560
10561@item
10562Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
10563Mathematics of Computation, volume 44, number 170, April 1985.
10564
10565@item
10566Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
10567Zahlen'', Computing 7, 1971, pp.@: 281-292.
10568
10569@item
10570Kenneth Weber, ``The accelerated integer GCD algorithm'',
10571ACM Transactions on Mathematical Software,
10572volume 21, number 1, March 1995, pp.@: 111-122.
10573
10574@item
10575Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
10576November 1999, @uref{http://hal.inria.fr/inria-00072854/PDF/RR-3805.pdf}
10577
10578@item
10579Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
10580Implementations'', @texlinebreak{}
10581@uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz}
10582
10583@item
10584Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
Symposium on Computer Arithmetic, 1993, pp.@: 260-271.  Reprinted as ``More
10586on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
10587volume 43, number 8, August 1994, pp.@: 899-908.
10588@end itemize
10589
10590
10591@node GNU Free Documentation License, Concept Index, References, Top
10592@appendix GNU Free Documentation License
10593@cindex GNU Free Documentation License
10594@cindex Free Documentation License
10595@cindex Documentation license
10596@include fdl-1.3.texi
10597
10598
10599@node Concept Index, Function Index, GNU Free Documentation License, Top
10600@comment  node-name,  next,  previous,  up
10601@unnumbered Concept Index
10602@printindex cp
10603
10604@node Function Index,  , Concept Index, Top
10605@comment  node-name,  next,  previous,  up
10606@unnumbered Function and Type Index
10607@printindex fn
10608
10609@bye
10610
10611@c Local variables:
10612@c fill-column: 78
10613@c compile-command: "make gmp.info"
10614@c End:
10615