1\input texinfo    @c -*-texinfo-*-
2@c %**start of header
3@setfilename gmp.info
4@documentencoding ISO-8859-1
5@include version.texi
6@settitle GNU MP @value{VERSION}
7@synindex tp fn
8@iftex
9@afourpaper
10@end iftex
11@comment %**end of header
12
13@copying
14This manual describes how to install and use the GNU multiple precision
15arithmetic library, version @value{VERSION}.
16
17Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
182003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software
19Foundation, Inc.
20
21Permission is granted to copy, distribute and/or modify this document under
22the terms of the GNU Free Documentation License, Version 1.3 or any later
23version published by the Free Software Foundation; with no Invariant Sections,
24with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
25Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
26software''.  A copy of the license is included in
27@ref{GNU Free Documentation License}.
28@end copying
29@c  Note the @ref above must be on one line, a line break in an @ref within
30@c  @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes
31@c  with texinfo 4.7), with messages about missing @endcsname.
32
33
34@c  Texinfo version 4.2 or up will be needed to process this file.
35@c
36@c  The version number and edition number are taken from version.texi provided
37@c  by automake (note that it's regenerated only if you configure with
38@c  --enable-maintainer-mode).
39@c
40@c  Notes discussing the present version number of GMP in relation to previous
@c  ones (for instance in the "Compatibility" section) must be updated
@c  manually though.
43@c
44@c  @cindex entries have been made for function categories and programming
45@c  topics.  The "mpn" section is not included in this, because a beginner
46@c  looking for "GCD" or something is only going to be confused by pointers to
47@c  low level routines.
48@c
49@c  @cindex entries are present for processors and systems when there's
50@c  particular notes concerning them, but not just for everything GMP
51@c  supports.
52@c
53@c  Index entries for files use @code rather than @file, @samp or @option,
54@c  since the latter come out with quotes in TeX, which are nice in the text
55@c  but don't look so good in index columns.
56@c
57@c  Tex:
58@c
59@c  A suitable texinfo.tex is supplied, a newer one should work equally well.
60@c
61@c  HTML:
62@c
63@c  Nothing special is done for links to external manuals, they just come out
64@c  in the usual makeinfo style, eg. "../libc/Locales.html".  If you have
65@c  local copies of such manuals then this is a good thing, if not then you
66@c  may want to search-and-replace to some online source.
67@c
68
69@dircategory GNU libraries
70@direntry
71* gmp: (gmp).                   GNU Multiple Precision Arithmetic Library.
72@end direntry
73
74@c  html <meta name="description" content="...">
75@documentdescription
76How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}.
77@end documentdescription
78
79@c smallbook
80@finalout
81@setchapternewpage on
82
83@ifnottex
84@node Top, Copying, (dir), (dir)
85@top GNU MP
86@end ifnottex
87
88@iftex
89@titlepage
90@title GNU MP
91@subtitle The GNU Multiple Precision Arithmetic Library
92@subtitle Edition @value{EDITION}
93@subtitle @value{UPDATED}
94
95@author by Torbj@"orn Granlund and the GMP development team
96@c @email{tg@@gmplib.org}
97
98@c Include the Distribution inside the titlepage so
99@c that headings are turned off.
100
101@tex
102\global\parindent=0pt
103\global\parskip=8pt
104\global\baselineskip=13pt
105@end tex
106
107@page
108@vskip 0pt plus 1filll
109@end iftex
110
111@insertcopying
112@ifnottex
113@sp 1
114@end ifnottex
115
116@iftex
117@end titlepage
118@headings double
119@end iftex
120
121@c  Don't bother with contents for html, the menus seem adequate.
122@ifnothtml
123@contents
124@end ifnothtml
125
126@menu
127* Copying::                    GMP Copying Conditions (LGPL).
128* Introduction to GMP::        Brief introduction to GNU MP.
129* Installing GMP::             How to configure and compile the GMP library.
130* GMP Basics::                 What every GMP user should know.
131* Reporting Bugs::             How to usefully report bugs.
132* Integer Functions::          Functions for arithmetic on signed integers.
133* Rational Number Functions::  Functions for arithmetic on rational numbers.
134* Floating-point Functions::   Functions for arithmetic on floats.
135* Low-level Functions::        Fast functions for natural numbers.
136* Random Number Functions::    Functions for generating random numbers.
137* Formatted Output::           @code{printf} style output.
138* Formatted Input::            @code{scanf} style input.
139* C++ Class Interface::        Class wrappers around GMP types.
140* BSD Compatible Functions::   All functions found in BSD MP.
141* Custom Allocation::          How to customize the internal allocation.
142* Language Bindings::          Using GMP from other languages.
143* Algorithms::                 What happens behind the scenes.
144* Internals::                  How values are represented behind the scenes.
145
146* Contributors::               Who brings you this library?
147* References::                 Some useful papers and books to read.
148* GNU Free Documentation License::
149* Concept Index::
150* Function Index::
151@end menu
152
153
154@c  @m{T,N} is $T$ in tex or @math{N} otherwise.  This is an easy way to give
155@c  different forms for math in tex and info.  Commas in N or T don't work,
156@c  but @C{} can be used instead.  \, works in info but not in tex.
157@iftex
158@macro m {T,N}
159@tex$\T\$@end tex
160@end macro
161@end iftex
162@ifnottex
163@macro m {T,N}
164@math{\N\}
165@end macro
166@end ifnottex
167
168@macro C {}
169,
170@end macro
171
172@c  @ms{V,N} is $V_N$ in tex or just vn otherwise.  This suits simple
173@c  subscripts like @ms{x,0}.
174@iftex
175@macro ms {V,N}
176@tex$\V\_{\N\}$@end tex
177@end macro
178@end iftex
179@ifnottex
180@macro ms {V,N}
181\V\\N\
182@end macro
183@end ifnottex
184
185@c  @nicode{S} is plain S in info, or @code{S} elsewhere.  This can be used
186@c  when the quotes that @code{} gives in info aren't wanted, but the
187@c  fontification in tex or html is wanted.  Doesn't work as @nicode{'\\0'}
188@c  though (gives two backslashes in tex).
189@ifinfo
190@macro nicode {S}
191\S\
192@end macro
193@end ifinfo
194@ifnotinfo
195@macro nicode {S}
196@code{\S\}
197@end macro
198@end ifnotinfo
199
200@c  @nisamp{S} is plain S in info, or @samp{S} elsewhere.  This can be used
201@c  when the quotes that @samp{} gives in info aren't wanted, but the
202@c  fontification in tex or html is wanted.
203@ifinfo
204@macro nisamp {S}
205\S\
206@end macro
207@end ifinfo
208@ifnotinfo
209@macro nisamp {S}
210@samp{\S\}
211@end macro
212@end ifnotinfo
213
214@c  Usage: @GMPtimes{}
215@c  Give either \times or the word "times".
216@tex
217\gdef\GMPtimes{\times}
218@end tex
219@ifnottex
220@macro GMPtimes
221times
222@end macro
223@end ifnottex
224
225@c  Usage: @GMPmultiply{}
226@c  Give * in info, or nothing in tex.
227@tex
228\gdef\GMPmultiply{}
229@end tex
230@ifnottex
231@macro GMPmultiply
232*
233@end macro
234@end ifnottex
235
236@c  Usage: @GMPabs{x}
237@c  Give either |x| in tex, or abs(x) in info or html.
238@tex
239\gdef\GMPabs#1{|#1|}
240@end tex
241@ifnottex
242@macro GMPabs {X}
243@abs{}(\X\)
244@end macro
245@end ifnottex
246
247@c  Usage: @GMPfloor{x}
248@c  Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
249@tex
250\gdef\GMPfloor#1{\lfloor #1\rfloor}
251@end tex
252@ifnottex
253@macro GMPfloor {X}
254floor(\X\)
255@end macro
256@end ifnottex
257
258@c  Usage: @GMPceil{x}
259@c  Give either \lceil x\rceil in tex, or ceil(x) in info or html.
260@tex
261\gdef\GMPceil#1{\lceil #1 \rceil}
262@end tex
263@ifnottex
264@macro GMPceil {X}
265ceil(\X\)
266@end macro
267@end ifnottex
268
269@c  Math operators already available in tex, made available in info too.
270@c  For example @bmod{} can be used in both tex and info.
271@ifnottex
272@macro bmod
273mod
274@end macro
275@macro gcd
276gcd
277@end macro
278@macro ge
279>=
280@end macro
281@macro le
282<=
283@end macro
284@macro log
285log
286@end macro
287@macro min
288min
289@end macro
290@macro leftarrow
291<-
292@end macro
293@macro rightarrow
294->
295@end macro
296@end ifnottex
297
298@c  New math operators.
299@c  @abs{} can be used in both tex and info, or just \abs in tex.
300@tex
301\gdef\abs{\mathop{\rm abs}}
302@end tex
303@ifnottex
304@macro abs
305abs
306@end macro
307@end ifnottex
308
309@c  @cross{} is a \times symbol in tex, or an "x" in info.  In tex it works
310@c  inside or outside $ $.
311@tex
312\gdef\cross{\ifmmode\times\else$\times$\fi}
313@end tex
314@ifnottex
315@macro cross
316x
317@end macro
318@end ifnottex
319
320@c  @times{} made available as a "*" in info and html (already works in tex).
321@ifnottex
322@macro times
323*
324@end macro
325@end ifnottex
326
327@c  Usage: @W{text}
328@c  Like @w{} but working in math mode too.
329@tex
330\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}
331@end tex
332@ifnottex
333@macro W {S}
334@w{\S\}
335@end macro
336@end ifnottex
337
338@c  Usage: \GMPdisplay{text}
339@c  Put the given text in an @display style indent, but without turning off
340@c  paragraph reflow etc.
341@tex
342\gdef\GMPdisplay#1{%
343\noindent
344\advance\leftskip by \lispnarrowing
345#1\par}
346@end tex
347
348@c  Usage: \GMPhat
349@c  A new \hat that will work in math mode, unlike the texinfo redefined
350@c  version.
351@tex
352\gdef\GMPhat{\mathaccent"705E}
353@end tex
354
355@c  Usage: \GMPraise{text}
356@c  For use in a $ $ math expression as an alternative to "^".  This is good
357@c  for @code{} in an exponent, since there seems to be no superscript font
358@c  for that.
359@tex
360\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}
361@end tex
362
363@c  Usage: @texlinebreak{}
364@c  A line break as per @*, but only in tex.
365@iftex
366@macro texlinebreak
367@*
368@end macro
369@end iftex
370@ifnottex
371@macro texlinebreak
372@end macro
373@end ifnottex
374
375@c  Usage: @maybepagebreak
376@c  Allow tex to insert a page break, if it feels the urge.
377@c  Normally blocks of @deftypefun/funx are kept together, which can lead to
378@c  some poor page break positioning if it's a big block, like the sets of
379@c  division functions etc.
380@tex
381\gdef\maybepagebreak{\penalty0}
382@end tex
383@ifnottex
384@macro maybepagebreak
385@end macro
386@end ifnottex
387
388@c  Usage: @GMPreftop{info,title}
389@c  Usage: @GMPpxreftop{info,title}
390@c
391@c  Like @ref{} and @pxref{}, but designed for a reference to the top of a
392@c  document, not a particular section.  The TeX output for plain @ref insists
393@c  on printing a particular section, GMPreftop gives just the title.
394@c
395@c  The texinfo manual recommends putting a likely section name in references
396@c  like this, eg. "Introduction", but it seems better to just give the title.
397@c
398@iftex
399@macro GMPreftop{info,title}
400@i{\title\}
401@end macro
402@macro GMPpxreftop{info,title}
403see @i{\title\}
404@end macro
405@end iftex
406@c
407@ifnottex
408@macro GMPreftop{info,title}
409@ref{Top,\title\,\title\,\info\,\title\}
410@end macro
411@macro GMPpxreftop{info,title}
412@pxref{Top,\title\,\title\,\info\,\title\}
413@end macro
414@end ifnottex
415
416
417@node Copying, Introduction to GMP, Top, Top
418@comment  node-name, next, previous,  up
419@unnumbered GNU MP Copying Conditions
420@cindex Copying conditions
421@cindex Conditions for copying GNU MP
422@cindex License conditions
423
424This library is @dfn{free}; this means that everyone is free to use it and
425free to redistribute it on a free basis.  The library is not in the public
426domain; it is copyrighted and there are restrictions on its distribution, but
427these restrictions are designed to permit everything that a good cooperating
428citizen would want to do.  What is not allowed is to try to prevent others
429from further sharing any version of this library that they might get from
430you.@refill
431
432Specifically, we want to make sure that you have the right to give away copies
433of the library, that you receive source code or else can get it if you want
434it, that you can change this library or use pieces of it in new free programs,
435and that you know you can do these things.@refill
436
437To make sure that everyone has such rights, we have to forbid you to deprive
438anyone else of these rights.  For example, if you distribute copies of the GNU
439MP library, you must give the recipients all the rights that you have.  You
440must make sure that they, too, receive or can get the source code.  And you
441must tell them their rights.@refill
442
443Also, for our own protection, we must make certain that everyone finds out
444that there is no warranty for the GNU MP library.  If it is modified by
445someone else and passed on, we want their recipients to know that what they
446have is not what we distributed, so that any problems introduced by others
447will not reflect on our reputation.@refill
448
449The precise conditions of the license for the GNU MP library are found in the
450Lesser General Public License version 3 that accompanies the source code,
451see @file{COPYING.LIB}.  Certain demonstration programs are provided under the
452terms of the plain General Public License version 3, see @file{COPYING}.
453
454
455@node Introduction to GMP, Installing GMP, Copying, Top
456@comment  node-name,  next,  previous,  up
457@chapter Introduction to GNU MP
458@cindex Introduction
459
460GNU MP is a portable library written in C for arbitrary precision arithmetic
461on integers, rational numbers, and floating-point numbers.  It aims to provide
462the fastest possible arithmetic for all applications that need higher
463precision than is directly supported by the basic C types.
464
465Many applications use just a few hundred bits of precision; but some
466applications may need thousands or even millions of bits.  GMP is designed to
467give good performance for both, by choosing algorithms based on the sizes of
468the operands, and by carefully keeping the overhead at a minimum.
469
470The speed of GMP is achieved by using fullwords as the basic arithmetic type,
471by using sophisticated algorithms, by including carefully optimized assembly
472code for the most common inner loops for many different CPUs, and by a general
473emphasis on speed (as opposed to simplicity or elegance).
474
475There is assembly code for these CPUs:
476@cindex CPU types
477ARM,
478DEC Alpha 21064, 21164, and 21264,
479AMD 29000,
480AMD K6, K6-2, Athlon, and Athlon64,
481Hitachi SuperH and SH-2,
482HPPA 1.0, 1.1 and 2.0,
483Intel Pentium, Pentium Pro/II/III, Pentium 4, generic x86,
484Intel IA-64, i960,
485Motorola MC68000, MC68020, MC88100, and MC88110,
486Motorola/IBM PowerPC 32 and 64,
487National NS32000,
488IBM POWER,
489MIPS R3000, R4000,
490SPARCv7, SuperSPARC, generic SPARCv8, UltraSPARC,
491DEC VAX,
492and
493Zilog Z8000.
There are also some optimizations for
495Cray vector systems,
496Clipper,
497IBM ROMP (RT),
498and
499Pyramid AP/XP.
500
501@cindex Home page
502@cindex Web page
503@noindent
504For up-to-date information on GMP, please see the GMP web pages at
505
506@display
507@uref{http://gmplib.org/}
508@end display
509
510@cindex Latest version of GMP
511@cindex Anonymous FTP of latest version
512@cindex FTP of latest version
513@noindent
514The latest version of the library is available at
515
516@display
517@uref{ftp://ftp.gnu.org/gnu/gmp/}
518@end display
519
Many sites around the world mirror @samp{ftp.gnu.org}.  Please use a mirror
near you; see @uref{http://www.gnu.org/order/ftp.html} for a full list.
522
523@cindex Mailing lists
There are three public mailing lists of interest: one for release
announcements, one for general questions and discussions about usage of the
GMP library, and one for bug reports.  For more information, see
527
528@display
529@uref{http://gmplib.org/mailman/listinfo/}.
530@end display
531
532The proper place for bug reports is @email{gmp-bugs@@gmplib.org}.  See
533@ref{Reporting Bugs} for information about reporting bugs.
534
535@sp 1
536@section How to use this Manual
537@cindex About this manual
538
539Everyone should read @ref{GMP Basics}.  If you need to install the library
540yourself, then read @ref{Installing GMP}.  If you have a system with multiple
541ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
542on applications.
543
544The rest of the manual can be used for later reference, although it is
545probably a good idea to glance through it.
546
547
548@node Installing GMP, GMP Basics, Introduction to GMP, Top
549@comment  node-name,  next,  previous,  up
550@chapter Installing GMP
551@cindex Installing GMP
552@cindex Configuring GMP
553@cindex Building GMP
554
555GMP has an autoconf/automake/libtool based configuration system.  On a
556Unix-like system a basic build can be done with
557
558@example
559./configure
560make
561@end example
562
563@noindent
564Some self-tests can be run with
565
566@example
567make check
568@end example
569
570@noindent
571And you can install (under @file{/usr/local} by default) with
572
573@example
574make install
575@end example
576
577If you experience problems, please report them to @email{gmp-bugs@@gmplib.org}.
578See @ref{Reporting Bugs}, for information on what to include in useful bug
579reports.
580
581@menu
582* Build Options::
583* ABI and ISA::
584* Notes for Package Builds::
585* Notes for Particular Systems::
586* Known Build Problems::
587* Performance optimization::
588@end menu
589
590
591@node Build Options, ABI and ISA, Installing GMP, Installing GMP
592@section Build Options
593@cindex Build options
594
All the usual autoconf configure options are available; run @samp{./configure
--help} for a summary.  The file @file{INSTALL.autoconf} has some generic
597installation information too.
598
599@table @asis
600@item Tools
601@cindex Non-Unix systems
602@samp{configure} requires various Unix-like tools.  See @ref{Notes for
603Particular Systems}, for some options on non-Unix systems.
604
It might be possible to build without the help of @samp{configure}; certainly
all the code is there, but unfortunately you'll be on your own.
607
608@item Build Directory
609@cindex Build directory
610To compile in a separate build directory, @command{cd} to that directory, and
611prefix the configure command with the path to the GMP source directory.  For
612example
613
614@example
615cd /my/build/dir
616/my/sources/gmp-@value{VERSION}/configure
617@end example
618
619Not all @samp{make} programs have the necessary features (@code{VPATH}) to
support this.  In particular, SunOS and Solaris @command{make} have bugs that
621make them unable to build in a separate directory.  Use GNU @command{make}
622instead.
623
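As a sketch, on a system where GNU @command{make} is installed under a
separate name, the out-of-tree build above might continue as follows.  The
name @command{gmake} is an assumption about the local installation, not
something GMP provides.

@example
cd /my/build/dir
/my/sources/gmp-@value{VERSION}/configure
gmake
gmake check
@end example
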
624@item @option{--prefix} and @option{--exec-prefix}
625@cindex Prefix
626@cindex Exec prefix
627@cindex Install prefix
628@cindex @code{--prefix}
629@cindex @code{--exec-prefix}
630The @option{--prefix} option can be used in the normal way to direct GMP to
631install under a particular tree.  The default is @samp{/usr/local}.
632
633@option{--exec-prefix} can be used to direct architecture-dependent files like
634@file{libgmp.a} to a different location.  This can be used to share
635architecture-independent parts like the documentation, but separate the
636dependent parts.  Note however that @file{gmp.h} and @file{mp.h} are
637architecture-dependent since they encode certain aspects of @file{libgmp}, so
638it will be necessary to ensure both @file{$prefix/include} and
639@file{$exec_prefix/include} are available to the compiler.
640
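As an illustration, the two options might be combined as follows.  The
directory names and the source file @file{foo.c} are placeholders chosen for
this sketch, not defaults.

@example
./configure --prefix=/opt/gmp --exec-prefix=/opt/gmp/sparc
make
make install
cc -I/opt/gmp/include -I/opt/gmp/sparc/include -c foo.c
@end example

The last line shows an application compile with both include directories
given to the compiler, as described above.
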
641@item @option{--disable-shared}, @option{--disable-static}
642@cindex @code{--disable-shared}
643@cindex @code{--disable-static}
644By default both shared and static libraries are built (where possible), but
645one or other can be disabled.  Shared libraries result in smaller executables
646and permit code sharing between separate running processes, but on some CPUs
647are slightly slower, having a small cost on each function call.
648
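For instance, a static-only build, which might be preferred where the
per-call cost of a shared library matters more than executable size, could be
configured with

@example
./configure --disable-shared
@end example
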
649@item Native Compilation, @option{--build=CPU-VENDOR-OS}
650@cindex Native compilation
651@cindex Build system
652@cindex @code{--build}
653For normal native compilation, the system can be specified with
654@samp{--build}.  By default @samp{./configure} uses the output from running
655@samp{./config.guess}.  On some systems @samp{./config.guess} can determine
656the exact CPU type, on others it will be necessary to give it explicitly.  For
657example,
658
659@example
660./configure --build=ultrasparc-sun-solaris2.7
661@end example
662
663In all cases the @samp{OS} part is important, since it controls how libtool
664generates shared libraries.  Running @samp{./config.guess} is the simplest way
665to see what it should be, if you don't know already.
666
667@item Cross Compilation, @option{--host=CPU-VENDOR-OS}
668@cindex Cross compiling
669@cindex Host system
670@cindex @code{--host}
671When cross-compiling, the system used for compiling is given by @samp{--build}
672and the system where the library will run is given by @samp{--host}.  For
673example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,
674
675@example
676./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
677@end example
678
679Compiler tools are sought first with the host system type as a prefix.  For
680example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
681@command{ranlib}.  This makes it possible for a set of cross-compiling tools
682to co-exist with native tools.  The prefix is the argument to @samp{--host},
683and this can be an alias, such as @samp{m68k-linux}.  But note that tools
don't have to be set up this way; it's enough to just have a @env{PATH} with a
685suitable cross-compiling @command{cc} etc.
686
687Compiling for a different CPU in the same family as the build system is a form
688of cross-compilation, though very possibly this would merely be special
689options on a native compiler.  In any case @samp{./configure} avoids depending
690on being able to run code on the build system, which is important when
691creating binaries for a newer CPU since they very possibly won't run on the
692build system.
693
694In all cases the compiler must be able to produce an executable (of whatever
695format) from a standard C @code{main}.  Although only object files will go to
696make up @file{libgmp}, @samp{./configure} uses linking tests for various
697purposes, such as determining what functions are available on the host system.
698
699Currently a warning is given unless an explicit @samp{--build} is used when
700cross-compiling, because it may not be possible to correctly guess the build
701system type if the @env{PATH} has only a cross-compiling @command{cc}.
702
703Note that the @samp{--target} option is not appropriate for GMP@.  It's for use
704when building compiler tools, with @samp{--host} being where they will run,
705and @samp{--target} what they'll produce code for.  Ordinary programs or
706libraries like GMP are only interested in the @samp{--host} part, being where
707they'll run.  (Some past versions of GMP used @samp{--target} incorrectly.)
708
709@item CPU types
710@cindex CPU types
711In general, if you want a library that runs as fast as possible, you should
712configure GMP for the exact CPU type your system uses.  However, this may mean
713the binaries won't run on older members of the family, and might run slower on
714other members, older or newer.  The best idea is always to build GMP for the
715exact machine type you intend to run it on.
716
717The following CPUs have specific support.  See @file{configure.in} for details
718of what code and compiler options they select.
719
720@itemize @bullet
721
722@c Keep this formatting, it's easy to read and it can be grepped to
723@c automatically test that CPUs listed get through ./config.sub
724
725@item
726Alpha:
727@nisamp{alpha},
728@nisamp{alphaev5},
729@nisamp{alphaev56},
730@nisamp{alphapca56},
731@nisamp{alphapca57},
732@nisamp{alphaev6},
733@nisamp{alphaev67},
@nisamp{alphaev68},
735@nisamp{alphaev7}
736
737@item
738Cray:
739@nisamp{c90},
740@nisamp{j90},
741@nisamp{t90},
742@nisamp{sv1}
743
744@item
745HPPA:
746@nisamp{hppa1.0},
747@nisamp{hppa1.1},
748@nisamp{hppa2.0},
749@nisamp{hppa2.0n},
750@nisamp{hppa2.0w},
751@nisamp{hppa64}
752
753@item
754IA-64:
755@nisamp{ia64},
756@nisamp{itanium},
757@nisamp{itanium2}
758
759@item
760MIPS:
761@nisamp{mips},
762@nisamp{mips3},
763@nisamp{mips64}
764
765@item
766Motorola:
767@nisamp{m68k},
768@nisamp{m68000},
769@nisamp{m68010},
770@nisamp{m68020},
771@nisamp{m68030},
772@nisamp{m68040},
773@nisamp{m68060},
774@nisamp{m68302},
775@nisamp{m68360},
776@nisamp{m88k},
777@nisamp{m88110}
778
779@item
780POWER:
781@nisamp{power},
782@nisamp{power1},
783@nisamp{power2},
784@nisamp{power2sc}
785
786@item
787PowerPC:
788@nisamp{powerpc},
789@nisamp{powerpc64},
790@nisamp{powerpc401},
791@nisamp{powerpc403},
792@nisamp{powerpc405},
793@nisamp{powerpc505},
794@nisamp{powerpc601},
795@nisamp{powerpc602},
796@nisamp{powerpc603},
797@nisamp{powerpc603e},
798@nisamp{powerpc604},
799@nisamp{powerpc604e},
800@nisamp{powerpc620},
801@nisamp{powerpc630},
802@nisamp{powerpc740},
803@nisamp{powerpc7400},
804@nisamp{powerpc7450},
805@nisamp{powerpc750},
806@nisamp{powerpc801},
807@nisamp{powerpc821},
808@nisamp{powerpc823},
809@nisamp{powerpc860},
810@nisamp{powerpc970}
811
812@item
813SPARC:
814@nisamp{sparc},
815@nisamp{sparcv8},
816@nisamp{microsparc},
817@nisamp{supersparc},
818@nisamp{sparcv9},
819@nisamp{ultrasparc},
820@nisamp{ultrasparc2},
821@nisamp{ultrasparc2i},
822@nisamp{ultrasparc3},
823@nisamp{sparc64}
824
825@item
826x86 family:
827@nisamp{i386},
828@nisamp{i486},
829@nisamp{i586},
830@nisamp{pentium},
831@nisamp{pentiummmx},
832@nisamp{pentiumpro},
833@nisamp{pentium2},
834@nisamp{pentium3},
835@nisamp{pentium4},
836@nisamp{k6},
837@nisamp{k62},
838@nisamp{k63},
839@nisamp{athlon},
840@nisamp{amd64},
841@nisamp{viac3},
842@nisamp{viac32}
843
844@item
845Other:
846@nisamp{a29k},
847@nisamp{arm},
848@nisamp{clipper},
849@nisamp{i960},
850@nisamp{ns32k},
851@nisamp{pyramid},
852@nisamp{sh},
853@nisamp{sh2},
854@nisamp{vax},
855@nisamp{z8k}
856@end itemize
857
858CPUs not listed will use generic C code.
859
860@item Generic C Build
861@cindex Generic C
862If some of the assembly code causes problems, or if otherwise desired, the
863generic C code can be selected with CPU @samp{none}.  For example,
864
865@example
866./configure --host=none-unknown-freebsd3.5
867@end example
868
869Note that this will run quite slowly, but it should be portable and should at
870least make it possible to get something running if all else fails.
871
872@item Fat binary, @option{--enable-fat}
873@cindex Fat binary
@cindex @code{--enable-fat}
875Using @option{--enable-fat} selects a ``fat binary'' build on x86, where
876optimized low level subroutines are chosen at runtime according to the CPU
877detected.  This means more code, but gives good performance on all x86 chips.
878(This option might become available for more architectures in the future.)
879
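A minimal sketch of requesting a fat build follows.  It assumes the build
machine is itself an x86 system, so that no explicit @samp{--host} is needed.

@example
./configure --enable-fat
make
make check
@end example
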
880@item @option{ABI}
881@cindex ABI
882On some systems GMP supports multiple ABIs (application binary interfaces),
883meaning data type sizes and calling conventions.  By default GMP chooses the
884best ABI available, but a particular ABI can be selected.  For example
885
886@example
887./configure --host=mips64-sgi-irix6 ABI=n32
888@end example
889
890See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
891applications need to do.
892
893@item @option{CC}, @option{CFLAGS}
894@cindex C compiler
895@cindex @code{CC}
896@cindex @code{CFLAGS}
897By default the C compiler used is chosen from among some likely candidates,
898with @command{gcc} normally preferred if it's present.  The usual
899@samp{CC=whatever} can be passed to @samp{./configure} to choose something
900different.
901
902For various systems, default compiler flags are set based on the CPU and
903compiler.  The usual @samp{CFLAGS="-whatever"} can be passed to
904@samp{./configure} to use something different or to set good flags for systems
905GMP doesn't otherwise know.
906
907The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
908and can be found in each generated @file{Makefile}.  This is the easiest way
909to check the defaults when considering changing or adding something.
910
911Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
912supporting multiple ABIs it's important to give an explicit
913@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
914won't be able to select the correct assembly code.
915
916If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
917compiler will be used (if GMP recognises it).  For example @samp{CC=gcc} can
918be used to force the use of GCC, with default flags (and default ABI).
919
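As an illustration of the points above, a build that overrides both the
compiler and its flags on a multi-ABI system would also name the ABI.  The
flag values here are assumptions for a GCC installation targeting the MIPS
n32 ABI, not defaults taken from GMP.

@example
./configure --host=mips64-sgi-irix6 ABI=n32 CC=gcc CFLAGS="-O2 -mabi=n32"
@end example
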
920@item @option{CPPFLAGS}
921@cindex @code{CPPFLAGS}
922Any flags like @samp{-D} defines or @samp{-I} includes required by the
923preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
924Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
925preprocessing uses just @samp{CPPFLAGS}.  This distinction is because most
926preprocessors won't accept all the flags the compiler does.  Preprocessing is
927done separately in some configure tests, and in the @samp{ansi2knr} support
928for K&R compilers.
929
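For example, a define and an include directory would go in @samp{CPPFLAGS}
while optimization options stay in @samp{CFLAGS}.  The macro name and the
directory below are hypothetical, and giving @samp{CFLAGS} explicitly replaces
whatever defaults GMP would otherwise have chosen.

@example
./configure CPPFLAGS="-DMY_DEFINE=1 -I/my/include" CFLAGS="-O2"
@end example
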
930@item @option{CC_FOR_BUILD}
931@cindex @code{CC_FOR_BUILD}
932Some build-time programs are compiled and run to generate host-specific data
933tables.  @samp{CC_FOR_BUILD} is the compiler used for this.  It doesn't need
934to be in any particular ABI or mode, it merely needs to generate executables
935that can run.  The default is to try the selected @samp{CC} and some likely
936candidates such as @samp{cc} and @samp{gcc}, looking for something that works.
937
938No flags are used with @samp{CC_FOR_BUILD} because a simple invocation like
939@samp{cc foo.c} should be enough.  If some particular options are required
940they can be included as for instance @samp{CC_FOR_BUILD="cc -whatever"}.
941
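In a cross-compile, a native compiler for the build-time programs can be named
alongside the cross-compiler.  The following sketch reuses the system types
from the cross-compilation example earlier.  The cross-compiler name simply
follows the host-prefix convention and is assumed to be installed, as is a
native @command{cc}.

@example
./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu \
            CC=m68k-mac-linux-gnu-gcc CC_FOR_BUILD=cc
@end example
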
942@item C++ Support, @option{--enable-cxx}
943@cindex C++ support
944@cindex @code{--enable-cxx}
945C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
946C++ compiler will be required.  As a convenience @samp{--enable-cxx=detect}
947can be used to enable C++ support only if a compiler can be found.  The C++
948support consists of a library @file{libgmpxx.la} and header file
949@file{gmpxx.h} (@pxref{Headers and Libraries}).
950
951A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
952within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
953bloated by a dependency on the C++ standard library, and to avoid any chance
954that the C++ compiler could be required when linking plain C programs.
955
956@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
957only be expected to work with @file{libgmp.la} from the same GMP version.
958Future changes to the relevant internals will be accompanied by renaming, so a
959mismatch will cause unresolved symbols rather than perhaps mysterious
960misbehaviour.
961
962In general @file{libgmpxx.la} will be usable only with the C++ compiler that
963built it, since name mangling and runtime support are usually incompatible
964between different compilers.
965
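A sketch of building with C++ support and then linking a program against both
libraries follows.  The file @file{prog.cc} is a hypothetical application, and
the link line assumes the install directories are on the compiler's default
search paths.

@example
./configure --enable-cxx
make
make check
make install
g++ prog.cc -lgmpxx -lgmp
@end example

@samp{-lgmpxx} is given before @samp{-lgmp} since the C++ library depends on
the C library.
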
966@item @option{CXX}, @option{CXXFLAGS}
967@cindex C++ compiler
968@cindex @code{CXX}
969@cindex @code{CXXFLAGS}
970When C++ support is enabled, the C++ compiler and its flags can be set with
971variables @samp{CXX} and @samp{CXXFLAGS} in the usual way.  The default for
972@samp{CXX} is the first compiler that works from a list of likely candidates,
973with @command{g++} normally preferred when available.  The default for
974@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then
975for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers
976@samp{-g} or nothing.  Trying @samp{CFLAGS} this way is convenient when using
977@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will
978usually suit @samp{g++}.
979
980It's important that the C and C++ compilers match, meaning their startup and
981runtime support routines are compatible and that they generate code in the
982same ABI (if there's a choice of ABIs on the system).  @samp{./configure}
983isn't currently able to check these things very well itself, so for that
984reason @samp{--disable-cxx} is the default, to avoid a build failure due to a
985compiler mismatch.  Perhaps this will change in the future.
986
987Incidentally, it's normally not good enough to set @samp{CXX} to the same as
988@samp{CC}.  Although @command{gcc} for instance recognises @file{foo.cc} as
989C++ code, only @command{g++} will invoke the linker the right way when
990building an executable or shared library from C++ object files.
991
992@item Temporary Memory, @option{--enable-alloca=<choice>}
993@cindex Temporary memory
994@cindex Stack overflow
995@cindex @code{alloca}
996@cindex @code{--enable-alloca}
997GMP allocates temporary workspace using one of the following three methods,
998which can be selected with for instance
999@samp{--enable-alloca=malloc-reentrant}.
1000
1001@itemize @bullet
1002@item
1003@samp{alloca} - C library or compiler builtin.
1004@item
1005@samp{malloc-reentrant} - the heap, in a re-entrant fashion.
1006@item
1007@samp{malloc-notreentrant} - the heap, with global variables.
1008@end itemize
1009
1010For convenience, the following choices are also available.
1011@samp{--disable-alloca} is the same as @samp{no}.
1012
1013@itemize @bullet
1014@item
1015@samp{yes} - a synonym for @samp{alloca}.
1016@item
1017@samp{no} - a synonym for @samp{malloc-reentrant}.
1018@item
1019@samp{reentrant} - @code{alloca} if available, otherwise
1020@samp{malloc-reentrant}.  This is the default.
1021@item
1022@samp{notreentrant} - @code{alloca} if available, otherwise
1023@samp{malloc-notreentrant}.
1024@end itemize
1025
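The choices and synonyms listed above can be summarised as a small shell
function.  This is only a sketch for illustration: @code{resolve_alloca} is a
hypothetical helper, not part of GMP's @samp{./configure}, and its second
argument merely stands in for whether the system provides @code{alloca}.

```shell
# resolve_alloca CHOICE HAVE_ALLOCA
#   CHOICE       value given to --enable-alloca
#   HAVE_ALLOCA  "yes" if the system provides alloca, anything else otherwise
# Prints the allocation method the choice maps to.
# Hypothetical helper for illustration; not taken from GMP's configure.
resolve_alloca() {
    choice=$1
    have_alloca=$2
    case $choice in
        alloca|yes)
            echo alloca ;;
        malloc-reentrant|no)
            echo malloc-reentrant ;;
        malloc-notreentrant)
            echo malloc-notreentrant ;;
        reentrant)
            # The default: alloca if available, else the reentrant heap method.
            if [ "$have_alloca" = yes ]; then
                echo alloca
            else
                echo malloc-reentrant
            fi ;;
        notreentrant)
            if [ "$have_alloca" = yes ]; then
                echo alloca
            else
                echo malloc-notreentrant
            fi ;;
        debug)
            echo debug ;;
        *)
            echo "unknown choice: $choice" >&2
            return 1 ;;
    esac
}

# The default choice on a system without alloca.
resolve_alloca reentrant no
```
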
1026@code{alloca} is reentrant and fast, and is recommended.  It actually allocates
1027just small blocks on the stack; larger ones use malloc-reentrant.
1028
1029@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
1030but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
1031not required.
1032
1033The two malloc methods in fact use the memory allocation functions selected by
1034@code{mp_set_memory_functions}, these being @code{malloc} and friends by
1035default.  @xref{Custom Allocation}.
1036
1037An additional choice @samp{--enable-alloca=debug} is available, to help when
1038debugging memory related problems (@pxref{Debugging}).
1039
1040@item FFT Multiplication, @option{--disable-fft}
1041@cindex FFT multiplication
1042@cindex @code{--disable-fft}
1043By default multiplications are done using Karatsuba, 3-way Toom, and
1044Fermat FFT@.  The FFT is only used on large to very large operands and can be
1045disabled to save code size if desired.
1046
1047@item Berkeley MP, @option{--enable-mpbsd}
1048@cindex Berkeley MP compatible functions
1049@cindex BSD MP compatible functions
1050@cindex @code{--enable-mpbsd}
1051The Berkeley MP compatibility library (@file{libmp}) and header file
1052(@file{mp.h}) are built and installed only if @option{--enable-mpbsd} is used.
1053@xref{BSD Compatible Functions}.
1054
1055@item Assertion Checking, @option{--enable-assert}
1056@cindex Assertion checking
1057@cindex @code{--enable-assert}
This option enables some consistency checking within the library.  This can be
of use while debugging, see @ref{Debugging}.
1060
1061@item Execution Profiling, @option{--enable-profiling=prof/gprof/instrument}
1062@cindex Execution profiling
1063@cindex @code{--enable-profiling}
Enable profiling support, in one of various styles, see @ref{Profiling}.
1065
1066@item @option{MPN_PATH}
1067@cindex @code{MPN_PATH}
1068Various assembly versions of each mpn subroutines are provided.  For a given
1069CPU, a search is made though a path to choose a version of each.  For example
1070@samp{sparcv8} has
1071
1072@example
1073MPN_PATH="sparc32/v8 sparc32 generic"
1074@end example
1075
1076which means look first for v8 code, then plain sparc32 (which is v7), and
1077finally fall back on generic C@.  Knowledgeable users with special requirements
1078can specify a different path.  Normally this is completely unnecessary.
1079
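The first-match search described above can be sketched in shell.  This is a
simplified illustration, not the code in @samp{./configure}:
@code{find_mpn_source} is a hypothetical helper, it only looks for
@file{.asm} and @file{.c} files, and the directory tree is a throwaway one
created for the demonstration.

```shell
# find_mpn_source ROOT NAME DIR...
# Print the first ROOT/DIR/NAME.asm or ROOT/DIR/NAME.c that exists, trying
# each DIR in the order given (the MPN_PATH order).  Returns 1 if none found.
# Hypothetical helper; the real search handles more cases than this.
find_mpn_source() {
    root=$1
    name=$2
    shift 2
    for dir in "$@"; do
        for ext in asm c; do
            if [ -f "$root/$dir/$name.$ext" ]; then
                echo "$root/$dir/$name.$ext"
                return 0
            fi
        done
    done
    return 1
}

# Demonstration on a temporary tree (not the real GMP sources).
demo=$(mktemp -d)
mkdir -p "$demo/sparc32/v8" "$demo/sparc32" "$demo/generic"
touch "$demo/sparc32/v8/mul_1.asm" "$demo/sparc32/add_n.asm" \
      "$demo/generic/add_n.c" "$demo/generic/gcd.c"

MPN_PATH="sparc32/v8 sparc32 generic"
for fn in mul_1 add_n gcd; do
    # MPN_PATH is deliberately unquoted so it splits into directories.
    find_mpn_source "$demo" "$fn" $MPN_PATH
done
rm -rf "$demo"
```

In the demonstration @code{mul_1} is taken from the v8 directory, @code{add_n}
from plain sparc32 even though a generic version also exists, and @code{gcd}
falls through to generic C.
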
1080@item Documentation
1081@cindex Documentation formats
1082@cindex Texinfo
1083The source for the document you're now reading is @file{doc/gmp.texi}, in
1084Texinfo format, see @GMPreftop{texinfo, Texinfo}.
1085
1086@cindex Postscript
1087@cindex DVI
1088@cindex PDF
1089Info format @samp{doc/gmp.info} is included in the distribution.  The usual
1090automake targets are available to make PostScript, DVI, PDF and HTML (these
1091will require various @TeX{} and Texinfo tools).
1092
1093@cindex DocBook
1094@cindex XML
1095DocBook and XML can be generated by the Texinfo @command{makeinfo} program
1096too, see @ref{makeinfo options,, Options for @command{makeinfo}, texinfo,
1097Texinfo}.
1098
1099Some supplementary notes can also be found in the @file{doc} subdirectory.
1100
1101@end table
1102
1103
1104@need 2000
1105@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
1106@section ABI and ISA
1107@cindex ABI
1108@cindex Application Binary Interface
1109@cindex ISA
1110@cindex Instruction Set Architecture
1111
1112ABI (Application Binary Interface) refers to the calling conventions between
1113functions, meaning what registers are used and what sizes the various C data
1114types are.  ISA (Instruction Set Architecture) refers to the instructions and
1115registers a CPU has available.
1116
1117Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
1118latter for compatibility with older CPUs in the family.  GMP supports some
1119CPUs like this in both ABIs.  In fact within GMP @samp{ABI} means a
1120combination of chip ABI, plus how GMP chooses to use it.  For example in some
112132-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
1122@code{long long}.
1123
1124By default GMP chooses the best ABI available for a given system, and this
1125generally gives significantly greater speed.  But an ABI can be chosen
1126explicitly to make GMP compatible with other libraries, or particular
1127application requirements.  For example,
1128
1129@example
1130./configure ABI=32
1131@end example
1132
1133In all cases it's vital that all object code used in a given program is
1134compiled for the same ABI.
1135
1136Usually a limb is implemented as a @code{long}.  When a @code{long long} limb
1137is used this is encoded in the generated @file{gmp.h}.  This is convenient for
1138applications, but it does mean that @file{gmp.h} will vary, and can't be just
1139copied around.  @file{gmp.h} remains compiler independent though, since all
1140compilers for a particular ABI will be expected to use the same limb type.
1141
1142Currently no attempt is made to follow whatever conventions a system has for
1143installing library or header files built for a particular ABI@.  This will
1144probably only matter when installing multiple builds of GMP, and it might be
1145as simple as configuring with a special @samp{libdir}, or it might require
more than that.  Note that builds for different ABIs need to be done separately,
1147with a fresh @command{./configure} and @command{make} each.
1148
1149@sp 1
1150@table @asis
1151@need 1000
1152@item AMD64 (@samp{x86_64})
1153@cindex AMD64
1154On AMD64 systems supporting both 32-bit and 64-bit modes for applications, the
1155following ABI choices are available.
1156
1157@table @asis
1158@item @samp{ABI=64}
1159The 64-bit ABI uses 64-bit limbs and pointers and makes full use of the chip
1160architecture.  This is the default.  Applications will usually not need
1161special compiler flags, but for reference the option is
1162
1163@example
1164gcc  -m64
1165@end example
1166
1167@item @samp{ABI=32}
1168The 32-bit ABI is the usual i386 conventions.  This will be slower, and is not
1169recommended except for inter-operating with other code not yet 64-bit capable.
1170Applications must be compiled with
1171
1172@example
1173gcc  -m32
1174@end example
1175
1176(In GCC 2.95 and earlier there's no @samp{-m32} option, it's the only mode.)
1177@end table
1178
1179@sp 1
1180@need 1000
1181@item HPPA 2.0 (@samp{hppa2.0*}, @samp{hppa64})
1182@cindex HPPA
1183@cindex HP-UX
1184@table @asis
1185@item @samp{ABI=2.0w}
1186The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or
1187up.  Applications must be compiled with
1188
1189@example
1190gcc [built for 2.0w]
1191cc  +DD64
1192@end example
1193
1194@item @samp{ABI=2.0n}
1195The 2.0n ABI means the 32-bit HPPA 1.0 ABI and all its normal calling
1196conventions, but with 64-bit instructions permitted within functions.  GMP
1197uses a 64-bit @code{long long} for a limb.  This ABI is available on hppa64
1198GNU/Linux and on HP-UX 10 or higher.  Applications must be compiled with
1199
1200@example
1201gcc [built for 2.0n]
1202cc  +DA2.0 +e
1203@end example
1204
1205Note that current versions of GCC (eg.@: 3.2) don't generate 64-bit
1206instructions for @code{long long} operations and so may be slower than for
12072.0w.  (The GMP assembly code is the same though.)
1208
1209@item @samp{ABI=1.0}
1210HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI@.
1211No special compiler options are needed for applications.
1212@end table
1213
1214All three ABIs are available for CPU types @samp{hppa2.0w}, @samp{hppa2.0} and
1215@samp{hppa64}, but for CPU type @samp{hppa2.0n} only 2.0n or 1.0 are
1216considered.
1217
1218Note that GCC on HP-UX has no options to choose between 2.0n and 2.0w modes,
1219unlike HP @command{cc}.  Instead it must be built for one or the other ABI@.
1220GMP will detect how it was built, and skip to the corresponding @samp{ABI}.
1221
1222@sp 1
1223@need 1500
1224@item IA-64 under HP-UX (@samp{ia64*-*-hpux*}, @samp{itanium*-*-hpux*})
1225@cindex IA-64
1226@cindex HP-UX
1227HP-UX supports two ABIs for IA-64.  GMP performance is the same in both.
1228
1229@table @asis
1230@item @samp{ABI=32}
1231In the 32-bit ABI, pointers, @code{int}s and @code{long}s are 32 bits and GMP
1232uses a 64 bit @code{long long} for a limb.  Applications can be compiled
1233without any special flags since this ABI is the default in both HP C and GCC,
1234but for reference the flags are
1235
1236@example
1237gcc  -milp32
1238cc   +DD32
1239@end example
1240
1241@item @samp{ABI=64}
1242In the 64-bit ABI, @code{long}s and pointers are 64 bits and GMP uses a
1243@code{long} for a limb.  Applications must be compiled with
1244
1245@example
1246gcc  -mlp64
1247cc   +DD64
1248@end example
1249@end table
1250
1251On other IA-64 systems, GNU/Linux for instance, @samp{ABI=64} is the only
1252choice.
1253
1254@sp 1
1255@need 1000
1256@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]})
1257@cindex MIPS
1258@cindex IRIX
1259IRIX 6 always has a 64-bit MIPS 3 or better CPU, and supports ABIs o32, n32,
1260and 64.  n32 or 64 are recommended, and GMP performance will be the same in
1261each.  The default is n32.
1262
1263@table @asis
1264@item @samp{ABI=o32}
1265The o32 ABI is 32-bit pointers and integers, and no 64-bit operations.  GMP
1266will be slower than in n32 or 64, this option only exists to support old
1267compilers, eg.@: GCC 2.7.2.  Applications can be compiled with no special
1268flags on an old compiler, or on a newer compiler with
1269
1270@example
1271gcc  -mabi=32
1272cc   -32
1273@end example
1274
1275@item @samp{ABI=n32}
1276The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
1277@code{long long}.  Applications must be compiled with
1278
1279@example
1280gcc  -mabi=n32
1281cc   -n32
1282@end example
1283
1284@item @samp{ABI=64}
1285The 64-bit ABI is 64-bit pointers and integers.  Applications must be compiled
1286with
1287
1288@example
1289gcc  -mabi=64
1290cc   -64
1291@end example
1292@end table
1293
1294Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
1295support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.
1296
1297@sp 1
1298@need 1000
1299@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630}, @samp{powerpc970}, @samp{power4}, @samp{power5})
1300@cindex PowerPC
1301@table @asis
1302@item @samp{ABI=aix64}
1303@cindex AIX
1304The AIX 64 ABI uses 64-bit limbs and pointers and is the default on PowerPC 64
1305@samp{*-*-aix*} systems.  Applications must be compiled with
1306
1307@example
1308gcc  -maix64
1309xlc  -q64
1310@end example
1311
1312@item @samp{ABI=mode64}
1313The @samp{mode64} ABI uses 64-bit limbs and pointers, and is the default on
131464-bit GNU/Linux, BSD, and Mac OS X/Darwin systems.  Applications must be
1315compiled with
1316
1317@example
1318gcc  -m64
1319@end example
1320
1321@item @samp{ABI=mode32}
1322@cindex AIX
1323The @samp{mode32} ABI uses a 64-bit @code{long long} limb but with the chip
1324still in 32-bit mode and using 32-bit calling conventions.  This is the default
1325for systems where the true 64-bit ABI is unavailable.  No special compiler
1326options are typically needed for applications.
1327
1328@item @samp{ABI=32}
1329This is the basic 32-bit PowerPC ABI, with a 32-bit limb.  No special compiler
1330options are needed for applications.
1331@end table
1332
1333GMP's speed is greatest for @samp{aix64} and @samp{mode64}.  In @samp{ABI=32}
1334only the 32-bit ISA is used and this doesn't make full use of a 64-bit chip.
1335On a suitable system we could perhaps use more of the ISA, but there are no
1336plans to do so.
1337
1338@sp 1
1339@need 1000
1340@item Sparc V9 (@samp{sparc64}, @samp{sparcv9}, @samp{ultrasparc*})
1341@cindex Sparc V9
1342@cindex Solaris
1343@cindex Sun
1344@table @asis
1345@item @samp{ABI=64}
1346The 64-bit V9 ABI is available on the various BSD sparc64 ports, recent
1347versions of Sparc64 GNU/Linux, and Solaris 2.7 and up (when the kernel is in
134864-bit mode).  GCC 3.2 or higher, or Sun @command{cc} is required.  On
1349GNU/Linux, depending on the default @command{gcc} mode, applications must be
1350compiled with
1351
1352@example
1353gcc  -m64
1354@end example
1355
1356On Solaris applications must be compiled with
1357
1358@example
1359gcc  -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
1360cc   -xarch=v9
1361@end example
1362
On the BSD sparc64 systems no special options are required, since 64-bit is
the only ABI available.
1365
1366@item @samp{ABI=32}
1367For the basic 32-bit ABI, GMP still uses as much of the V9 ISA as it can.  In
1368the Sun documentation this combination is known as ``v8plus''.  On GNU/Linux,
1369depending on the default @command{gcc} mode, applications may need to be
1370compiled with
1371
1372@example
1373gcc  -m32
1374@end example
1375
1376On Solaris, no special compiler options are required for applications, though
1377using something like the following is recommended.  (@command{gcc} 2.8 and
1378earlier only support @samp{-mv8} though.)
1379
1380@example
1381gcc  -mv8plus
1382cc   -xarch=v8plus
1383@end example
1384@end table
1385
1386GMP speed is greatest in @samp{ABI=64}, so it's the default where available.
1387The speed is partly because there are extra registers available and partly
because 64-bit is considered the more important case and has therefore had
1389better code written for it.
1390
1391Don't be confused by the names of the @samp{-m} and @samp{-x} compiler
1392options, they're called @samp{arch} but effectively control both ABI and ISA@.
1393
1394On Solaris 2.6 and earlier, only @samp{ABI=32} is available since the kernel
1395doesn't save all registers.
1396
1397On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
1398reject @samp{ABI=64} because the resulting executables won't run.
1399@samp{ABI=64} can still be built if desired by making it look like a
1400cross-compile, for example
1401
1402@example
1403./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
1404@end example
1405@end table
1406
1407
1408@need 2000
1409@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
1410@section Notes for Package Builds
1411@cindex Build notes for binary packaging
1412@cindex Packaged builds
1413
1414GMP should present no great difficulties for packaging in a binary
1415distribution.
1416
1417@cindex Libtool versioning
1418@cindex Shared library versioning
1419Libtool is used to build the library and @samp{-version-info} is set
1420appropriately, having started from @samp{3:0:0} in GMP 3.0 (@pxref{Versioning,
1421Library interface versions, Library interface versions, libtool, GNU
1422Libtool}).
1423
1424The GMP 4 series will be upwardly binary compatible in each release and will
1425be upwardly binary compatible with all of the GMP 3 series.  Additional
1426function interfaces may be added in each release, so on systems where libtool
1427versioning is not fully checked by the loader an auxiliary mechanism may be
1428needed to express that a dynamic linked application depends on a new enough
1429GMP.
1430
1431An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
1432(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
1433from the same GMP version, since this is not done by the libtool versioning,
1434nor otherwise.  A mismatch will result in unresolved symbols from the linker,
1435or perhaps the loader.
1436
1437When building a package for a CPU family, care should be taken to use
1438@samp{--host} (or @samp{--build}) to choose the least common denominator among
1439the CPUs which might use the package.  For example this might mean plain
1440@samp{sparc} (meaning V7) for SPARCs.
1441
1442For x86s, @option{--enable-fat} sets things up for a fat binary build, making a
1443runtime selection of optimized low level routines.  This is a good choice for
1444packaging to run on a range of x86 chips.
1445
1446Users who care about speed will want GMP built for their exact CPU type, to
1447make best use of the available optimizations.  Providing a way to suitably
1448rebuild a package may be useful.  This could be as simple as making it
1449possible for a user to omit @samp{--build} (and @samp{--host}) so
1450@samp{./config.guess} will detect the CPU@.  But a way to manually specify a
1451@samp{--build} will be wanted for systems where @samp{./config.guess} is
1452inexact.
1453
1454On systems with multiple ABIs, a packaged build will need to decide which
1455among the choices is to be provided, see @ref{ABI and ISA}.  A given run of
1456@samp{./configure} etc will only build one ABI@.  If a second ABI is also
1457required then a second run of @samp{./configure} etc must be made, starting
1458from a clean directory tree (@samp{make distclean}).
1459
1460As noted under ``ABI and ISA'', currently no attempt is made to follow system
1461conventions for install locations that vary with ABI, such as
1462@file{/usr/lib/sparcv9} for @samp{ABI=64} as opposed to @file{/usr/lib} for
1463@samp{ABI=32}.  A package build can override @samp{libdir} and other standard
1464variables as necessary.
1465
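As a rough sketch of the two points above, the following function prints
(without running them) the commands for one build per ABI, each in its own
build directory with its own @samp{libdir} and @samp{includedir}.  The ABI
names, the @file{lib}/@file{lib64} layout and the @file{gmp-32}/@file{gmp-64}
include directories are assumptions chosen for illustration, and
@code{print_abi_build_plan} is a hypothetical helper.

```shell
# print_abi_build_plan SRCDIR PREFIX
# Print the commands for one separate configure/make run per ABI.
# SRCDIR should be an absolute path to the GMP sources.
# The ABI names and directory layout below are illustrative assumptions.
print_abi_build_plan() {
    srcdir=$1
    prefix=$2
    for abi in 32 64; do
        case $abi in
            32) libdir=$prefix/lib ;;
            64) libdir=$prefix/lib64 ;;
        esac
        # Each ABI gets its own header directory, since gmp.h differs.
        incdir=$prefix/include/gmp-$abi
        echo "mkdir build-$abi && cd build-$abi"
        echo "$srcdir/configure ABI=$abi --prefix=$prefix --libdir=$libdir --includedir=$incdir"
        echo "make && make check && make install"
        echo "cd .."
    done
}

print_abi_build_plan /path/to/gmp /usr/local
```

Using a fresh build directory per ABI serves the same purpose as starting each
run from a clean tree.
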
1466Note that @file{gmp.h} is a generated file, and will be architecture and ABI
1467dependent.  When attempting to install two ABIs simultaneously it will be
1468important that an application compile gets the correct @file{gmp.h} for its
1469desired ABI@.  If compiler include paths don't vary with ABI options then it
1470might be necessary to create a @file{/usr/include/gmp.h} which tests
1471preprocessor symbols and chooses the correct actual @file{gmp.h}.
1472
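A dispatching header of the kind just described might be generated as follows.
This is a sketch under stated assumptions: the @file{gmp-32} and
@file{gmp-64} subdirectories are hypothetical install locations, and testing
@code{__LP64__} or @code{_LP64} is only one possible way to tell a 64-bit ABI
from a 32-bit one.  It would not, for instance, distinguish two ABIs that both
have 32-bit pointers but different limb types.

```shell
# write_gmp_wrapper OUTFILE
# Write a small gmp.h that includes the real header for the ABI in use.
# The subdirectory names and the preprocessor test are assumptions made
# for this sketch; adapt them to the compilers and ABIs actually involved.
write_gmp_wrapper() {
    cat > "$1" <<'EOF'
/* gmp.h wrapper: select the real header for the ABI being compiled. */
#if defined(__LP64__) || defined(_LP64)
#include "gmp-64/gmp.h"
#else
#include "gmp-32/gmp.h"
#endif
EOF
}

# Show the generated header using a temporary file.
tmp=$(mktemp)
write_gmp_wrapper "$tmp"
cat "$tmp"
rm -f "$tmp"
```
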
1473
1474@need 2000
1475@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
1476@section Notes for Particular Systems
1477@cindex Build notes for particular systems
1478@cindex Particular systems
1479@cindex Systems
1480@table @asis
1481
1482@c This section is more or less meant for notes about performance or about
1483@c build problems that have been worked around but might leave a user
1484@c scratching their head.  Fun with different ABIs on a system belongs in the
1485@c above section.
1486
1487@item AIX 3 and 4
1488@cindex AIX
1489On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since
1490some versions of the native @command{ar} fail on the convenience libraries
1491used.  A shared build can be attempted with
1492
1493@example
1494./configure --enable-shared --disable-static
1495@end example
1496
1497Note that the @samp{--disable-static} is necessary because in a shared build
1498libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
1499the benefit of old versions of @command{ld} which only recognise @file{.a},
1500but unfortunately this is done even if a fully functional @command{ld} is
1501available.
1502
1503@item ARM
1504@cindex ARM
1505On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a
1506bug in unsigned division, giving wrong results for some operands.  GMP
1507@samp{./configure} will demand GCC 2.95.4 or later.
1508
1509@item Compaq C++
1510@cindex Compaq C++
1511Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and
1512an old pre-standard one (see @samp{man iostream_intro}).  GMP can only use the
1513standard one, which unfortunately is not the default but must be selected by
1514defining @code{__USE_STD_IOSTREAM}.  Configure with for instance
1515
1516@example
1517./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM
1518@end example
1519
1520@item Floating Point Mode
1521@cindex Floating point mode
1522@cindex Hardware floating point mode
1523@cindex Precision of hardware floating point
1524@cindex x87
1525On some systems, the hardware floating point has a control mode which can set
1526all operations to be done in a particular precision, for instance single,
1527double or extended on x86 systems (x87 floating point).  The GMP functions
1528involving a @code{double} cannot be expected to operate to their full
1529precision when the hardware is in single precision mode.  Of course this
1530affects all code, including application code, not just GMP.
1531
1532@item MS-DOS and MS Windows
1533@cindex MS-DOS
1534@cindex MS Windows
1535@cindex Windows
1536@cindex Cygwin
1537@cindex DJGPP
1538@cindex MINGW
1539On an MS-DOS system DJGPP can be used to build GMP, and on an MS Windows
1540system Cygwin, DJGPP and MINGW can be used.  All three are excellent ports of
1541GCC and the various GNU tools.
1542
1543@display
1544@uref{http://www.cygwin.com/}
1545@uref{http://www.delorie.com/djgpp/}
1546@uref{http://www.mingw.org/}
1547@end display
1548
1549@cindex Interix
1550@cindex Services for Unix
1551Microsoft also publishes an Interix ``Services for Unix'' which can be used to
1552build GMP on Windows (with a normal @samp{./configure}), but it's not free
1553software.
1554
1555@item MS Windows DLLs
1556@cindex DLLs
1557@cindex MS Windows
1558@cindex Windows
1559On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by
1560default GMP builds only a static library, but a DLL can be built instead using
1561
1562@example
1563./configure --disable-static --enable-shared
1564@end example
1565
1566Static and DLL libraries can't both be built, since certain export directives
1567in @file{gmp.h} must be different.
1568
1569A MINGW DLL build of GMP can be used with Microsoft C@.  Libtool doesn't
1570install a @file{.lib} format import library, but it can be created with MS
1571@command{lib} as follows, and copied to the install directory.  Similarly for
1572@file{libmp} and @file{libgmpxx}.
1573
1574@example
1575cd .libs
1576lib /def:libgmp-3.dll.def /out:libgmp-3.lib
1577@end example
1578
1579MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications
1580wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do
1581the same.  If one of the other C runtime library choices provided by MS C is
1582desired then the suggestion is to use the GMP string functions and confine I/O
1583to the application.
1584
1585@item Motorola 68k CPU Types
1586@cindex 68000
1587@samp{m68k} is taken to mean 68000.  @samp{m68020} or higher will give a
1588performance boost on applicable CPUs.  @samp{m68360} can be used for CPU32
1589series chips.  @samp{m68302} can be used for ``Dragonball'' series chips,
1590though this is merely a synonym for @samp{m68000}.
1591
1592@item OpenBSD 2.6
1593@cindex OpenBSD
1594@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
1595unsuitable for @file{.asm} file processing.  @samp{./configure} will detect
1596the problem and either abort or choose another m4 in the @env{PATH}.  The bug
1597is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.
1598
1599@item Power CPU Types
1600@cindex Power/PowerPC
1601In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions
1602not available on the other, so it's important to choose the right one for the
1603CPU that will be used.  Currently GMP has no assembly code support for using
1604just the common instruction subset.  To get executables that run on both, the
1605current suggestion is to use the generic C code (CPU @samp{none}), possibly
1606with appropriate compiler options (like @samp{-mcpu=common} for
1607@command{gcc}).  CPU @samp{rs6000} (which is not a CPU but a family of
1608workstations) is accepted by @file{config.sub}, but is currently equivalent to
1609@samp{none}.
1610
1611@item Sparc CPU Types
1612@cindex Sparc
1613@samp{sparcv8} or @samp{supersparc} on relevant systems will give a
1614significant performance increase over the V7 code selected by plain
1615@samp{sparc}.
1616
1617@item Sparc App Regs
1618@cindex Sparc
1619The GMP assembly code for both 32-bit and 64-bit Sparc clobbers the
1620``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way
1621that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,, SPARC
1622Options, gcc, Using the GNU Compiler Collection (GCC)}).
1623
1624This makes that code unsuitable for use with the special V9
1625@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer), and
1626for applications wanting to use those registers for special purposes.  In these
1627cases the only suggestion currently is to build GMP with CPU @samp{none} to
1628avoid the assembly code.
1629
1630@item SunOS 4
1631@cindex SunOS
1632@command{/usr/bin/m4} lacks various features needed to process @file{.asm}
1633files, and instead @samp{./configure} will automatically use
1634@command{/usr/5bin/m4}, which we believe is always available (if not then use
1635GNU m4).
1636
1637@item x86 CPU Types
1638@cindex x86
1639@cindex 80x86
1640@cindex i386
1641@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for its intended
1642P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
1643P-III)@.  @samp{i386} is a better choice when making binaries that must run on
1644both.
1645
1646@item x86 MMX and SSE2 Code
1647@cindex MMX
1648@cindex SSE2
1649If the CPU selected has MMX code but the assembler doesn't support it, a
1650warning is given and non-MMX code is used instead.  This will be an inferior
1651build, since the MMX code that's present is there because it's faster than the
1652corresponding plain integer code.  The same applies to SSE2.
1653
Old versions of @samp{gas} don't support MMX instructions, in particular
version 1.92.3, which comes with FreeBSD 2.2.8 and the more recent OpenBSD
3.1.
1657
1658Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
1659to register @code{movq} instructions, and so can't be used for MMX code.
1660Install a recent @command{gas} if MMX code is wanted on these systems.
1661@end table
1662
1663
1664@need 2000
1665@node Known Build Problems, Performance optimization, Notes for Particular Systems, Installing GMP
1666@section Known Build Problems
1667@cindex Build problems known
1668
1669@c This section is more or less meant for known build problems that are not
1670@c otherwise worked around and require some sort of manual intervention.
1671
1672You might find more up-to-date information at @uref{http://gmplib.org/}.
1673
1674@table @asis
1675@item Compiler link options
1676The version of libtool currently in use rather aggressively strips compiler
1677options when linking a shared library.  This will hopefully be relaxed in the
1678future, but for now if this is a problem the suggestion is to create a little
1679script to hide them, and for instance configure with
1680
1681@example
1682./configure CC=gcc-with-my-options
1683@end example
1684
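Such a wrapper script could be created along the following lines.  This is a
minimal sketch: @code{make_cc_wrapper} is a hypothetical helper,
@samp{-my-option} is a placeholder rather than a real compiler flag, and
options containing spaces or shell metacharacters are not handled.

```shell
# make_cc_wrapper WRAPPER REAL_CC OPTION...
# Create an executable script WRAPPER that runs REAL_CC with the given
# options, followed by whatever arguments the wrapper itself receives.
# Hypothetical helper; simple options only (no embedded spaces or quotes).
make_cc_wrapper() {
    wrapper=$1
    real_cc=$2
    shift 2
    {
        echo '#!/bin/sh'
        # "$*" is expanded now; "$@" is left literal for the wrapper to expand.
        echo "exec $real_cc $* \"\$@\""
    } > "$wrapper"
    chmod +x "$wrapper"
}

# Demonstration in a temporary directory; -my-option is a placeholder.
bindir=$(mktemp -d)
make_cc_wrapper "$bindir/gcc-with-my-options" gcc -my-option
cat "$bindir/gcc-with-my-options"
rm -rf "$bindir"
```

Because the options live inside the script rather than on the link command
line, libtool never sees them and so cannot strip them.
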
1685@item DJGPP (@samp{*-*-msdosdjgpp*})
1686@cindex DJGPP
The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
script; it exits silently, having died while writing a preamble to
@file{config.log}.  Use @command{bash} 2.04 or higher.
1690
@samp{make all} was found to run out of memory during the final
@file{libgmp.la} link on one system tested, despite having 64MB available.
Running @samp{make libgmp.la} directly helped, perhaps because recursing into
the various subdirectories uses up memory.
1695
1696@item GNU binutils @command{strip} prior to 2.12
1697@cindex Stripped libraries
1698@cindex Binutils @command{strip}
1699@cindex GNU @command{strip}
1700@command{strip} from GNU binutils 2.11 and earlier should not be used on the
1701static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all
1702but the last of multiple archive members with the same name, like the three
1703versions of @file{init.o} in @file{libgmp.a}.  Binutils 2.12 or higher can be
1704used successfully.
1705
1706The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by
1707this and any version of @command{strip} can be used on them.
1708
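To see whether a static library has the repeated member names that trigger
this problem, the member listing can be checked for duplicates.  The sketch
below uses a simulated listing so that it runs anywhere;
@code{duplicate_members} is a hypothetical helper name, and with a real
archive the input would come from @samp{ar t}.

```shell
# duplicate_members
# Read archive member names on standard input, one per line (the format
# printed by "ar t"), and print each name that occurs more than once.
duplicate_members() {
    sort | uniq -d
}

# Simulated listing.  With a real archive use:
#   ar t libgmp.a | duplicate_members
printf '%s\n' init.o add.o init.o mul.o init.o | duplicate_members
```

Any name printed is one that an affected @command{strip} would reduce to a
single member.
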
1709@item @command{make} syntax error
1710@cindex SCO
1711@cindex IRIX
1712On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make}
1713is unable to handle the long dependencies list for @file{libgmp.la}.  The
1714symptom is a ``syntax error'' on the following line of the top-level
1715@file{Makefile}.
1716
1717@example
1718libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES)
1719@end example
1720
1721Either use GNU Make, or as a workaround remove
1722@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial
1723build work, but if any recompiling is done @file{libgmp.la} might not be
1724rebuilt).
1725
1726@item MacOS X (@samp{*-*-darwin*})
1727@cindex MacOS X
1728@cindex Darwin
1729Libtool currently only knows how to create shared libraries on MacOS X using
1730the native @command{cc} (which is a modified GCC), not a plain GCC@.  A
1731static-only build should work though (@samp{--disable-shared}).
1732
1733@item NeXT prior to 3.3
1734@cindex NeXT
The system compiler on old versions of NeXT was a massacred and old GCC, even
if it called itself @file{cc}.  This compiler cannot be used to build GMP; you
need to get a real GCC and install that.  (NeXT may have fixed this in
release 3.3 of their system.)
1739
1740@item POWER and PowerPC
1741@cindex Power/PowerPC
1742Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
1743PowerPC@.  If you want to use GCC for these machines, get GCC 2.7.2.1 (or
1744later).
1745
1746@item Sequent Symmetry
1747@cindex Sequent Symmetry
1748Use the GNU assembler instead of the system assembler, since the latter has
1749serious bugs.
1750
1751@item Solaris 2.6
1752@cindex Solaris
1753The system @command{sed} prints an error ``Output line too long'' when libtool
1754builds @file{libgmp.la}.  This doesn't seem to cause any obvious ill effects,
1755but GNU @command{sed} is recommended, to avoid any doubt.
1756
1757@item Sparc Solaris 2.7 with gcc 2.95.2 in @samp{ABI=32}
1758@cindex Solaris
A shared library build of GMP seems to fail in this combination: it builds but
then fails the tests, apparently due to some incorrect data relocations within
@code{gmp_randinit_lc_2exp_size}.  The exact cause is unknown, so
@samp{--disable-shared} is recommended.
1763@end table
1764
1765
1766@need 2000
1767@node Performance optimization, , Known Build Problems, Installing GMP
1768@section Performance optimization
1769@cindex Optimizing performance
1770
1771@c At some point, this should perhaps move to a separate chapter on optimizing
1772@c performance.
1773
1774For optimal performance, build GMP for the exact CPU type of the target
1775computer, see @ref{Build Options}.
1776
Unlike most other programs, the choice of compiler typically doesn't matter
much, since GMP uses assembly language for the most critical operations.
1780
For long-running GMP applications in particular, and for applications working
with extremely large numbers, building and running the @code{tuneup} program in
the @file{tune} subdirectory can be important.  For example,
1784
1785@example
1786cd tune
1787make tuneup
1788./tuneup
1789@end example
1790
1791will generate better contents for the @file{gmp-mparam.h} parameter file.
1792
1793To use the results, put the output in the file indicated in the
1794@samp{Parameters for ...} header.  Then recompile from scratch.
1795
1796The @code{tuneup} program takes one useful parameter, @samp{-f NNN}, which
1797instructs the program how long to check FFT multiply parameters.  If you're
1798going to use GMP for extremely large numbers, you may want to run @code{tuneup}
1799with a large NNN value.
1800
1801
1802@node GMP Basics, Reporting Bugs, Installing GMP, Top
1803@comment  node-name,  next,  previous,  up
1804@chapter GMP Basics
1805@cindex Basics
1806
1807@strong{Using functions, macros, data types, etc.@: not documented in this
1808manual is strongly discouraged.  If you do so your application is guaranteed
1809to be incompatible with future versions of GMP.}
1810
1811@menu
1812* Headers and Libraries::
1813* Nomenclature and Types::
1814* Function Classes::
1815* Variable Conventions::
1816* Parameter Conventions::
1817* Memory Management::
1818* Reentrancy::
1819* Useful Macros and Constants::
1820* Compatibility with older versions::
1821* Demonstration Programs::
1822* Efficiency::
1823* Debugging::
1824* Profiling::
1825* Autoconf::
1826* Emacs::
1827@end menu
1828
1829@node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics
1830@section Headers and Libraries
1831@cindex Headers
1832
1833@cindex @file{gmp.h}
1834@cindex Include files
1835@cindex @code{#include}
1836All declarations needed to use GMP are collected in the include file
1837@file{gmp.h}.  It is designed to work with both C and C++ compilers.
1838
1839@example
1840#include <gmp.h>
1841@end example
1842
1843@cindex @code{stdio.h}
1844Note however that prototypes for GMP functions with @code{FILE *} parameters
1845are only provided if @code{<stdio.h>} is included too.
1846
1847@example
1848#include <stdio.h>
1849#include <gmp.h>
1850@end example
1851
1852@cindex @code{stdarg.h}
1853Likewise @code{<stdarg.h>} (or @code{<varargs.h>}) is required for prototypes
1854with @code{va_list} parameters, such as @code{gmp_vprintf}.  And
1855@code{<obstack.h>} for prototypes with @code{struct obstack} parameters, such
1856as @code{gmp_obstack_printf}, when available.
1857
1858@cindex Libraries
1859@cindex Linking
1860@cindex @code{libgmp}
1861All programs using GMP must link against the @file{libgmp} library.  On a
1862typical Unix-like system this can be done with @samp{-lgmp}, for example
1863
1864@example
1865gcc myprogram.c -lgmp
1866@end example
1867
1868@cindex @code{libgmpxx}
1869GMP C++ functions are in a separate @file{libgmpxx} library.  This is built
1870and installed if C++ support has been enabled (@pxref{Build Options}).  For
1871example,
1872
1873@example
1874g++ mycxxprog.cc -lgmpxx -lgmp
1875@end example
1876
1877@cindex Libtool
1878GMP is built using Libtool and an application can use that to link if desired,
1879@GMPpxreftop{libtool, GNU Libtool}.
1880
1881If GMP has been installed to a non-standard location then it may be necessary
1882to use @samp{-I} and @samp{-L} compiler options to point to the right
1883directories, and some sort of run-time path for a shared library.
1884
1885
1886@node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics
1887@section Nomenclature and Types
1888@cindex Nomenclature
1889@cindex Types
1890
1891@cindex Integer
1892@tindex @code{mpz_t}
1893In this manual, @dfn{integer} usually means a multiple precision integer, as
1894defined by the GMP library.  The C data type for such integers is @code{mpz_t}.
1895Here are some examples of how to declare such integers:
1896
1897@example
1898mpz_t sum;
1899
1900struct foo @{ mpz_t x, y; @};
1901
1902mpz_t vec[20];
1903@end example
1904
1905@cindex Rational number
1906@tindex @code{mpq_t}
1907@dfn{Rational number} means a multiple precision fraction.  The C data type
1908for these fractions is @code{mpq_t}.  For example:
1909
1910@example
1911mpq_t quotient;
1912@end example
1913
1914@cindex Floating-point number
1915@tindex @code{mpf_t}
1916@dfn{Floating point number} or @dfn{Float} for short, is an arbitrary precision
1917mantissa with a limited precision exponent.  The C data type for such objects
1918is @code{mpf_t}.  For example:
1919
1920@example
1921mpf_t fp;
1922@end example
1923
1924@tindex @code{mp_exp_t}
1925The floating point functions accept and return exponents in the C type
1926@code{mp_exp_t}.  Currently this is usually a @code{long}, but on some systems
1927it's an @code{int} for efficiency.
1928
1929@cindex Limb
1930@tindex @code{mp_limb_t}
1931A @dfn{limb} means the part of a multi-precision number that fits in a single
1932machine word.  (We chose this word because a limb of the human body is
1933analogous to a digit, only larger, and containing several digits.)  Normally a
1934limb is 32 or 64 bits.  The C data type for a limb is @code{mp_limb_t}.
1935
1936@tindex @code{mp_size_t}
Counts of limbs of a multi-precision number are represented in the C type
@code{mp_size_t}.  Currently this is normally a @code{long}, but on some
1939systems it's an @code{int} for efficiency, and on some systems it will be
1940@code{long long} in the future.
1941
1942@tindex @code{mp_bitcnt_t}
1943Counts of bits of a multi-precision number are represented in the C type
1944@code{mp_bitcnt_t}.  Currently this is always an @code{unsigned long}, but on
some systems it will be an @code{unsigned long long} in the future.
1946
1947@cindex Random state
1948@tindex @code{gmp_randstate_t}
1949@dfn{Random state} means an algorithm selection and current state data.  The C
1950data type for such objects is @code{gmp_randstate_t}.  For example:
1951
1952@example
1953gmp_randstate_t rstate;
1954@end example
1955
1956Also, in general @code{mp_bitcnt_t} is used for bit counts and ranges, and
1957@code{size_t} is used for byte or character counts.
1958
1959
1960@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
1961@section Function Classes
1962@cindex Function classes
1963
1964There are six classes of functions in the GMP library:
1965
1966@enumerate
1967@item
1968Functions for signed integer arithmetic, with names beginning with
1969@code{mpz_}.  The associated type is @code{mpz_t}.  There are about 150
1970functions in this class.  (@pxref{Integer Functions})
1971
1972@item
1973Functions for rational number arithmetic, with names beginning with
1974@code{mpq_}.  The associated type is @code{mpq_t}.  There are about 40
1975functions in this class, but the integer functions can be used for arithmetic
1976on the numerator and denominator separately.  (@pxref{Rational Number
1977Functions})
1978
1979@item
1980Functions for floating-point arithmetic, with names beginning with
1981@code{mpf_}.  The associated type is @code{mpf_t}.  There are about 60
functions in this class.  (@pxref{Floating-point Functions})
1983
1984@item
1985Functions compatible with Berkeley MP, such as @code{itom}, @code{madd}, and
1986@code{mult}.  The associated type is @code{MINT}.  (@pxref{BSD Compatible
1987Functions})
1988
1989@item
1990Fast low-level functions that operate on natural numbers.  These are used by
1991the functions in the preceding groups, and you can also call them directly
1992from very time-critical user programs.  These functions' names begin with
1993@code{mpn_}.  The associated type is array of @code{mp_limb_t}.  There are
1994about 30 (hard-to-use) functions in this class.  (@pxref{Low-level Functions})
1995
1996@item
1997Miscellaneous functions.  Functions for setting up custom allocation and
1998functions for generating random numbers.  (@pxref{Custom Allocation}, and
1999@pxref{Random Number Functions})
2000@end enumerate
2001
2002
2003@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
2004@section Variable Conventions
2005@cindex Variable conventions
2006@cindex Conventions for variables
2007
2008GMP functions generally have output arguments before input arguments.  This
2009notation is by analogy with the assignment operator.  The BSD MP compatibility
2010functions are exceptions, having the output arguments last.
2011
2012GMP lets you use the same variable for both input and output in one call.  For
2013example, the main function for integer multiplication, @code{mpz_mul}, can be
2014used to square @code{x} and put the result back in @code{x} with
2015
2016@example
2017mpz_mul (x, x, x);
2018@end example
2019
2020Before you can assign to a GMP variable, you need to initialize it by calling
2021one of the special initialization functions.  When you're done with a
2022variable, you need to clear it out, using one of the functions for that
2023purpose.  Which function to use depends on the type of variable.  See the
2024chapters on integer functions, rational number functions, and floating-point
2025functions for details.
2026
2027A variable should only be initialized once, or at least cleared between each
2028initialization.  After a variable has been initialized, it may be assigned to
2029any number of times.
2030
2031For efficiency reasons, avoid excessive initializing and clearing.  In
2032general, initialize near the start of a function and clear near the end.  For
2033example,
2034
2035@example
2036void
2037foo (void)
2038@{
2039  mpz_t  n;
2040  int    i;
2041  mpz_init (n);
2042  for (i = 1; i < 100; i++)
2043    @{
2044      mpz_mul (n, @dots{});
2045      mpz_fdiv_q (n, @dots{});
2046      @dots{}
2047    @}
2048  mpz_clear (n);
2049@}
2050@end example
2051
2052
2053@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
2054@section Parameter Conventions
2055@cindex Parameter conventions
2056@cindex Conventions for parameters
2057
2058When a GMP variable is used as a function parameter, it's effectively a
2059call-by-reference, meaning if the function stores a value there it will change
2060the original in the caller.  Parameters which are input-only can be designated
2061@code{const} to provoke a compiler error or warning on attempting to modify
2062them.
2063
2064When a function is going to return a GMP result, it should designate a
2065parameter that it sets, like the library functions do.  More than one value
2066can be returned by having more than one output parameter, again like the
2067library functions.  A @code{return} of an @code{mpz_t} etc doesn't return the
2068object, only a pointer, and this is almost certainly not what's wanted.
2069
2070Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
2071and storing the result to the indicated parameter.
2072
2073@example
2074void
2075foo (mpz_t result, const mpz_t param, unsigned long n)
2076@{
2077  unsigned long  i;
2078  mpz_mul_ui (result, param, n);
2079  for (i = 1; i < n; i++)
2080    mpz_add_ui (result, result, i*7);
2081@}
2082
2083int
2084main (void)
2085@{
2086  mpz_t  r, n;
2087  mpz_init (r);
2088  mpz_init_set_str (n, "123456", 0);
2089  foo (r, n, 20L);
2090  gmp_printf ("%Zd\n", r);
2091  return 0;
2092@}
2093@end example
2094
2095@code{foo} works even if the mainline passes the same variable for
2096@code{param} and @code{result}, just like the library functions.  But
2097sometimes it's tricky to make that work, and an application might not want to
2098bother supporting that sort of thing.
2099
2100For interest, the GMP types @code{mpz_t} etc are implemented as one-element
2101arrays of certain structures.  This is why declaring a variable creates an
2102object with the fields GMP needs, but then using it as a parameter passes a
2103pointer to the object.  Note that the actual fields in each @code{mpz_t} etc
2104are for internal use only and should not be accessed directly by code that
2105expects to be compatible with future GMP releases.
2106
2107
2108@need 1000
2109@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
2110@section Memory Management
2111@cindex Memory management
2112
2113The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
2114and pointers to allocated data.  Once a variable is initialized, GMP takes
2115care of all space allocation.  Additional space is allocated whenever a
2116variable doesn't have enough.
2117
2118@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
2119Normally this is the best policy, since it avoids frequent reallocation.
2120Applications that need to return memory to the heap at some particular point
2121can use @code{mpz_realloc2}, or clear variables no longer needed.
2122
2123@code{mpf_t} variables, in the current implementation, use a fixed amount of
2124space, determined by the chosen precision and allocated at initialization, so
2125their size doesn't change.
2126
2127All memory is allocated using @code{malloc} and friends by default, but this
2128can be changed, see @ref{Custom Allocation}.  Temporary memory on the stack is
2129also used (via @code{alloca}), but this can be changed at build-time if
2130desired, see @ref{Build Options}.
2131
2132
2133@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
2134@section Reentrancy
2135@cindex Reentrancy
2136@cindex Thread safety
2137@cindex Multi-threading
2138
2139@noindent
2140GMP is reentrant and thread-safe, with some exceptions:
2141
2142@itemize @bullet
2143@item
2144If configured with @option{--enable-alloca=malloc-notreentrant} (or with
2145@option{--enable-alloca=notreentrant} when @code{alloca} is not available),
2146then naturally GMP is not reentrant.
2147
2148@item
2149@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
2150selected precision.  @code{mpf_init2} can be used instead, and in the C++
2151interface an explicit precision to the @code{mpf_class} constructor.
2152
2153@item
2154@code{mpz_random} and the other old random number functions use a global
2155random state and are hence not reentrant.  The newer random number functions
2156that accept a @code{gmp_randstate_t} parameter can be used instead.
2157
2158@item
2159@code{gmp_randinit} (obsolete) returns an error indication through a global
2160variable, which is not thread safe.  Applications are advised to use
2161@code{gmp_randinit_default} or @code{gmp_randinit_lc_2exp} instead.
2162
2163@item
2164@code{mp_set_memory_functions} uses global variables to store the selected
2165memory allocation functions.
2166
2167@item
2168If the memory allocation functions set by a call to
2169@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
2170not reentrant, then GMP will not be reentrant either.
2171
2172@item
2173If the standard I/O functions such as @code{fwrite} are not reentrant then the
2174GMP I/O functions using them will not be reentrant either.
2175
2176@item
2177It's safe for two threads to read from the same GMP variable simultaneously,
but it's not safe for one to read while another might be writing, nor for
2179two threads to write simultaneously.  It's not safe for two threads to
2180generate a random number from the same @code{gmp_randstate_t} simultaneously,
2181since this involves an update of that variable.
2182@end itemize
2183
2184
2185@need 2000
2186@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
2187@section Useful Macros and Constants
2188@cindex Useful macros and constants
2189@cindex Constants
2190
2191@deftypevr {Global Constant} {const int} mp_bits_per_limb
2192@findex mp_bits_per_limb
2193@cindex Bits per limb
2194@cindex Limb size
2195The number of bits per limb.
2196@end deftypevr
2197
2198@defmac __GNU_MP_VERSION
2199@defmacx __GNU_MP_VERSION_MINOR
2200@defmacx __GNU_MP_VERSION_PATCHLEVEL
2201@cindex Version number
2202@cindex GMP version number
2203The major and minor GMP version, and patch level, respectively, as integers.
2204For GMP i.j, these numbers will be i, j, and 0, respectively.
2205For GMP i.j.k, these numbers will be i, j, and k, respectively.
2206@end defmac
2207
2208@deftypevr {Global Constant} {const char * const} gmp_version
2209@findex gmp_version
2210The GMP version number, as a null-terminated string, in the form ``i.j.k''.
This release is @nicode{"@value{VERSION}"}.  Note that before version 4.3.0
the format ``i.j'' was used when k was zero.
2213@end deftypevr
2214
2215@defmac __GMP_CC
2216@defmacx __GMP_CFLAGS
2217The compiler and compiler flags, respectively, used when compiling GMP, as
2218strings.
2219@end defmac
2220
2221
2222@node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics
2223@section Compatibility with older versions
2224@cindex Compatibility with older versions
2225@cindex Past GMP versions
2226@cindex Upward compatibility
2227
2228This version of GMP is upwardly binary compatible with all 4.x and 3.x
2229versions, and upwardly compatible at the source level with all 2.x versions,
2230with the following exceptions.
2231
2232@itemize @bullet
2233@item
2234@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
2235with other @code{mpn} functions.
2236
2237@item
2238@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
22393.0.1, but in 3.1 reverted to the 2.x style.
2240@end itemize
2241
2242There are a number of compatibility issues between GMP 1 and GMP 2 that of
2243course also apply when porting applications from GMP 1 to GMP 4.  Please
2244see the GMP 2 manual for details.
2245
2246The Berkeley MP compatibility library (@pxref{BSD Compatible Functions}) is
2247source and binary compatible with the standard @file{libmp}.
2248
2249@c @enumerate
2250@c @item Integer division functions round the result differently.  The obsolete
2251@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
2252@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
2253@c quotient towards
2254@c @ifinfo
2255@c @minus{}infinity).
2256@c @end ifinfo
2257@c @iftex
2258@c @tex
2259@c $-\infty$).
2260@c @end tex
2261@c @end iftex
2262@c There are a lot of functions for integer division, giving the user better
2263@c control over the rounding.
2264
2265@c @item The function @code{mpz_mod} now compute the true @strong{mod} function.
2266
2267@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
2268@c @strong{mod} for reduction.
2269
2270@c @item The assignment functions for rational numbers do no longer canonicalize
2271@c their results.  In the case a non-canonical result could arise from an
2272@c assignment, the user need to insert an explicit call to
2273@c @code{mpq_canonicalize}.  This change was made for efficiency.
2274
2275@c @item Output generated by @code{mpz_out_raw} in this release cannot be read
2276@c by @code{mpz_inp_raw} in previous releases.  This change was made for making
2277@c the file format truly portable between machines with different word sizes.
2278
2279@c @item Several @code{mpn} functions have changed.  But they were intentionally
2280@c undocumented in previous releases.
2281
2282@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
2283@c are now implemented as macros, and thereby sometimes evaluate their
2284@c arguments multiple times.
2285
2286@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
2287@c for 0^0.  (In version 1, they yielded 0.)
2288
2289@c In version 1 of the library, @code{mpq_set_den} handled negative
2290@c denominators by copying the sign to the numerator.  That is no longer done.
2291
2292@c Pure assignment functions do not canonicalize the assigned variable.  It is
2293@c the responsibility of the user to canonicalize the assigned variable before
2294@c any arithmetic operations are performed on that variable.
2295@c Note that this is an incompatible change from version 1 of the library.
2296
2297@c @end enumerate
2298
2299
2300@need 1000
2301@node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics
2302@section Demonstration programs
2303@cindex Demonstration programs
2304@cindex Example programs
2305@cindex Sample programs
2306The @file{demos} subdirectory has some sample programs using GMP@.  These
2307aren't built or installed, but there's a @file{Makefile} with rules for them.
2308For instance,
2309
2310@example
2311make pexpr
2312./pexpr 68^975+10
2313@end example
2314
2315@noindent
2316The following programs are provided
2317
2318@itemize @bullet
2319@item
2320@cindex Expression parsing demo
2321@cindex Parsing expressions demo
2322@samp{pexpr} is an expression evaluator, the program used on the GMP web page.
2323@item
2324@cindex Expression parsing demo
2325@cindex Parsing expressions demo
2326The @samp{calc} subdirectory has a similar but simpler evaluator using
2327@command{lex} and @command{yacc}.
2328@item
2329@cindex Expression parsing demo
2330@cindex Parsing expressions demo
2331The @samp{expr} subdirectory is yet another expression evaluator, a library
2332designed for ease of use within a C program.  See @file{demos/expr/README} for
2333more information.
2334@item
2335@cindex Factorization demo
2336@samp{factorize} is a Pollard-Rho factorization program.
2337@item
2338@samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p}
2339function.
2340@item
2341@samp{primes} counts or lists primes in an interval, using a sieve.
2342@item
2343@samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic
2344class numbers.
2345@item
2346@cindex @code{perl}
2347@cindex GMP Perl module
2348@cindex Perl module
2349The @samp{perl} subdirectory is a comprehensive perl interface to GMP@.  See
2350@file{demos/perl/INSTALL} for more information.  Documentation is in POD
2351format in @file{demos/perl/GMP.pm}.
2352@end itemize
2353
2354As an aside, consideration has been given at various times to some sort of
2355expression evaluation within the main GMP library.  Going beyond something
2356minimal quickly leads to matters like user-defined functions, looping, fixnums
2357for control variables, etc, which are considered outside the scope of GMP
(much closer to language interpreters or compilers, @pxref{Language Bindings}).
2359Something simple for program input convenience may yet be a possibility, a
2360combination of the @file{expr} demo and the @file{pexpr} tree back-end
2361perhaps.  But for now the above evaluators are offered as illustrations.
2362
2363
2364@need 1000
2365@node Efficiency, Debugging, Demonstration Programs, GMP Basics
2366@section Efficiency
2367@cindex Efficiency
2368
2369@table @asis
2370@item Small Operands
2371@cindex Small operands
2372On small operands, the time for function call overheads and memory allocation
2373can be significant in comparison to actual calculation.  This is unavoidable
2374in a general purpose variable precision library, although GMP attempts to be
2375as efficient as it can on both large and small operands.
2376
2377@item Static Linking
2378@cindex Static linking
2379On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
2380used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
2381have a small overhead on each function call and global data address.  For many
2382programs this will be insignificant, but for long calculations there's a gain
2383to be had.
2384
2385@item Initializing and Clearing
2386@cindex Initializing and clearing
2387Avoid excessive initializing and clearing of variables, since this can be
2388quite time consuming, especially in comparison to otherwise fast operations
2389like addition.
2390
2391A language interpreter might want to keep a free list or stack of
2392initialized variables ready for use.  It should be possible to integrate
2393something like that with a garbage collector too.
2394
2395@item Reallocations
2396@cindex Reallocations
2397An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
2398values will have its memory repeatedly @code{realloc}ed, which could be quite
2399slow or could fragment memory, depending on the C library.  If an application
2400can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
2401be called to allocate the necessary space from the beginning
2402(@pxref{Initializing Integers}).
2403
2404It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
2405is too small, since all functions will do a further reallocation if necessary.
2406Badly overestimating memory required will waste space though.
2407
2408@item @code{2exp} Functions
2409@cindex @code{2exp} functions
2410It's up to an application to call functions like @code{mpz_mul_2exp} when
2411appropriate.  General purpose functions like @code{mpz_mul} make no attempt to
2412identify powers of two or other special forms, because such inputs will
2413usually be very rare and testing every time would be wasteful.
2414
2415@item @code{ui} and @code{si} Functions
2416@cindex @code{ui} and @code{si} functions
2417The @code{ui} functions and the small number of @code{si} functions exist for
2418convenience and should be used where applicable.  But if for example an
2419@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
need to extract it and call a @code{ui} function, just use the regular @code{mpz}
2421function.
2422
2423@item In-Place Operations
2424@cindex In-place operations
2425@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
2426and @code{mpf_neg} are fast when used for in-place operations like
2427@code{mpz_abs(x,x)}, since in the current implementation only a single field
2428of @code{x} needs changing.  On suitable compilers (GCC for instance) this is
2429inlined too.
2430
2431@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
2432benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
2433usually only one or two limbs of @code{x} will need to be changed.  The same
2434applies to the full precision @code{mpz_add} etc if @code{y} is small.  If
2435@code{y} is big then cache locality may be helped, but that's all.
2436
2437@code{mpz_mul} is currently the opposite, a separate destination is slightly
2438better.  A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
2439limb, make a temporary copy of @code{x} before forming the result.  Normally
2440that copying will only be a tiny fraction of the time for the multiply, so
2441this is not a particularly important consideration.
2442
2443@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
2444no attempt to recognise a copy of something to itself, so a call like
2445@code{mpz_set(x,x)} will be wasteful.  Naturally that would never be written
2446deliberately, but if it might arise from two pointers to the same object then
2447a test to avoid it might be desirable.
2448
2449@example
2450if (x != y)
2451  mpz_set (x, y);
2452@end example
2453
2454Note that it's never worth introducing extra @code{mpz_set} calls just to get
2455in-place operations.  If a result should go to a particular variable then just
2456direct it there and let GMP take care of data movement.
2457
2458@item Divisibility Testing (Small Integers)
2459@cindex Divisibility testing
2460@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
2461for testing whether an @code{mpz_t} is divisible by an individual small
2462integer.  They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
2463which gives no useful information about the actual remainder, only whether
2464it's zero (or a particular value).
2465
2466However when testing divisibility by several small integers, it's best to take
2467a remainder modulo their product, to save multi-precision operations.  For
2468instance to test whether a number is divisible by any of 23, 29 or 31 take a
2469remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that.
2470
2471The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
2472as a remainder are generally a little slower than the remainder-only functions
2473like @code{mpz_tdiv_ui}.  If the quotient is only rarely wanted then it's
2474probably best to just take a remainder and then go back and calculate the
2475quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
2476remainder is zero).
2477
2478@item Rational Arithmetic
2479@cindex Rational arithmetic
2480The @code{mpq} functions operate on @code{mpq_t} values with no common factors
2481in the numerator and denominator.  Common factors are checked-for and cast out
2482as necessary.  In general, cancelling factors every time is the best approach
2483since it minimizes the sizes for subsequent operations.
2484
2485However, applications that know something about the factorization of the
2486values they're working with might be able to avoid some of the GCDs used for
2487canonicalization, or swap them for divisions.  For example when multiplying by
2488a prime it's enough to check for factors of it in the denominator instead of
2489doing a full GCD@.  Or when forming a big product it might be known that very
2490little cancellation will be possible, and so canonicalization can be left to
2491the end.
2492
2493The @code{mpq_numref} and @code{mpq_denref} macros give access to the
2494numerator and denominator to do things outside the scope of the supplied
2495@code{mpq} functions.  @xref{Applying Integer Functions}.
2496
2497The canonical form for rationals allows mixed-type @code{mpq_t} and integer
2498additions or subtractions to be done directly with multiples of the
2499denominator.  This will be somewhat faster than @code{mpq_add}.  For example,
2500
2501@example
2502/* mpq increment */
2503mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));
2504
2505/* mpq += unsigned long */
2506mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);
2507
2508/* mpq -= mpz */
2509mpz_submul (mpq_numref(q), mpq_denref(q), z);
2510@end example
2511
2512@item Number Sequences
2513@cindex Number sequences
2514Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
are designed for calculating isolated values.  If a range of values is wanted
it's probably best to call one of them once to get a starting point and
iterate from there.
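
As a rough illustration of iterating from a starting point, the following
plain C model fills a table of consecutive factorials with one from-scratch
computation and then one multiplication per further value.  The names
@code{fac_from_scratch} and @code{fac_range} are hypothetical; in a GMP
program the from-scratch step would presumably be a single @code{mpz_fac_ui}
call and each later step an @code{mpz_mul_ui}.

```c
#include <assert.h>

/* Stand-in for one isolated call: n! computed from scratch. */
unsigned long long fac_from_scratch(unsigned n)
{
  unsigned long long f = 1;
  unsigned i;
  assert(n <= 20);              /* 21! does not fit in 64 bits */
  for (i = 2; i <= n; i++)
    f *= i;
  return f;
}

/* Fill out[0..count-1] with first!, (first+1)!, ... using a single
   from-scratch value and one multiplication for each further entry. */
void fac_range(unsigned long long *out, unsigned first, unsigned count)
{
  unsigned i;
  assert(count > 0 && first + count - 1 <= 20);
  out[0] = fac_from_scratch(first);
  for (i = 1; i < count; i++)
    out[i] = out[i - 1] * (first + i);
}
```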
2517
2518@item Text Input/Output
2519@cindex Text input/output
2520Hexadecimal or octal are suggested for input or output in text form.
2521Power-of-2 bases like these can be converted much more efficiently than other
2522bases, like decimal.  For big numbers there's usually nothing of particular
2523interest to be seen in the digits, so the base doesn't matter much.
2524
2525Maybe we can hope octal will one day become the normal base for everyday use,
2526as proposed by King Charles XII of Sweden and later reformers.
2527@c Reference: Knuth volume 2 section 4.1, page 184 of second edition.  :-)
2528@end table
2529
2530
2531@node Debugging, Profiling, Efficiency, GMP Basics
2532@section Debugging
2533@cindex Debugging
2534
2535@table @asis
2536@item Stack Overflow
2537@cindex Stack overflow
2538@cindex Segmentation violation
2539@cindex Bus error
2540Depending on the system, a segmentation violation or bus error might be the
2541only indication of stack overflow.  See @samp{--enable-alloca} choices in
2542@ref{Build Options}, for how to address this.
2543
2544In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
2545overflow is recognised by the system before too much damage is done, or
2546@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
2547add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
2548Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
These options must be added to the @samp{CFLAGS} used in the GMP build
(@pxref{Build Options}); adding them just to an application will have no
effect.  Note also that they cause a slowdown, adding overhead to each
function call and each stack allocation.
2553
2554@item Heap Problems
2555@cindex Heap problems
2556@cindex Malloc problems
2557The most likely cause of application problems with GMP is heap corruption.
2558Failing to @code{init} GMP variables will have unpredictable effects, and
2559corruption arising elsewhere in a program may well affect GMP@.  Initializing
2560GMP variables more than once or failing to clear them will cause memory leaks.
2561
2562@cindex Malloc debugger
2563In all such cases a @code{malloc} debugger is recommended.  On a GNU or BSD
2564system the standard C library @code{malloc} has some diagnostic facilities,
2565see @ref{Allocation Debugging,, Allocation Debugging, libc, The GNU C Library
2566Reference Manual}, or @samp{man 3 malloc}.  Other possibilities, in no
2567particular order, include
2568
2569@display
2570@uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc/}
2571@uref{http://dmalloc.com/}
2572@uref{http://www.perens.com/FreeSoftware/} @ (electric fence)
2573@uref{http://packages.debian.org/stable/devel/fda}
2574@uref{http://www.gnupdate.org/components/leakbug/}
2575@uref{http://people.redhat.com/~otaylor/memprof/}
2576@uref{http://www.cbmamiga.demon.co.uk/mpatrol/}
2577@end display
2578
2579The GMP default allocation routines in @file{memory.c} also have a simple
2580sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
2581This is mainly designed for detecting buffer overruns during GMP development,
2582but might find other uses.
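
For readers unfamiliar with the idea, a sentinel scheme can be sketched as
below.  This is not the code in @file{memory.c}; the names
@code{guard_alloc}, @code{guard_check} and @code{guard_free}, the sentinel
size and the fill byte are all assumptions made for illustration.  Each block
is bracketed by known byte patterns, and a changed pattern indicates an
overrun or underrun.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SENTINEL_SIZE 8
#define SENTINEL_BYTE 0xA5

/* Layout: [size_t size][front sentinel][user bytes][back sentinel].
   Alignment of the user pointer is not treated carefully in this sketch. */
void *guard_alloc(size_t size)
{
  unsigned char *raw = malloc(sizeof(size_t) + 2 * SENTINEL_SIZE + size);
  if (raw == NULL)
    return NULL;
  memcpy(raw, &size, sizeof size);
  memset(raw + sizeof(size_t), SENTINEL_BYTE, SENTINEL_SIZE);
  memset(raw + sizeof(size_t) + SENTINEL_SIZE + size,
         SENTINEL_BYTE, SENTINEL_SIZE);
  return raw + sizeof(size_t) + SENTINEL_SIZE;
}

/* Return 1 if both sentinels are intact, 0 if either was overwritten. */
int guard_check(const void *ptr)
{
  const unsigned char *user = ptr;
  const unsigned char *front = user - SENTINEL_SIZE;
  const unsigned char *back;
  size_t size, i;

  memcpy(&size, front - sizeof(size_t), sizeof size);
  back = user + size;
  for (i = 0; i < SENTINEL_SIZE; i++)
    if (front[i] != SENTINEL_BYTE || back[i] != SENTINEL_BYTE)
      return 0;
  return 1;
}

/* Verify the sentinels, then release the whole block. */
void guard_free(void *ptr)
{
  unsigned char *user = ptr;
  assert(guard_check(ptr));
  free(user - SENTINEL_SIZE - sizeof(size_t));
}
```

Functions of this shape could in principle be installed with
@code{mp_set_memory_functions} after adapting their signatures, though a
dedicated @code{malloc} debugger will usually catch more.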
2583
2584@item Stack Backtraces
2585@cindex Stack backtrace
2586On some systems the compiler options GMP uses by default can interfere with
debugging.  In particular on x86 and m68k systems @samp{-fomit-frame-pointer}
2588is used and this generally inhibits stack backtracing.  Recompiling without
2589such options may help while debugging, though the usual caveats about it
2590potentially moving a memory problem or hiding a compiler bug will apply.
2591
2592@item GDB, the GNU Debugger
2593@cindex GDB
2594@cindex GNU Debugger
2595A sample @file{.gdbinit} is included in the distribution, showing how to call
2596some undocumented dump functions to print GMP variables from within GDB@.  Note
2597that these functions shouldn't be used in final application code since they're
2598undocumented and may be subject to incompatible changes in future versions of
2599GMP.
2600
2601@item Source File Paths
2602GMP has multiple source files with the same name, in different directories.
2603For example @file{mpz}, @file{mpq} and @file{mpf} each have an
2604@file{init.c}.  If the debugger can't already determine the right one it may
2605help to build with absolute paths on each C file.  One way to do that is to
2606use a separate object directory with an absolute path to the source directory.
2607
2608@example
2609cd /my/build/dir
2610/my/source/dir/gmp-@value{VERSION}/configure
2611@end example
2612
2613This works via @code{VPATH}, and might require GNU @command{make}.
Alternatively it might be possible to change the @code{.c.lo} rules
appropriately.
2616
2617@item Assertion Checking
2618@cindex Assertion checking
2619The build option @option{--enable-assert} is available to add some consistency
2620checks to the library (see @ref{Build Options}).  These are likely to be of
2621limited value to most applications.  Assertion failures are just as likely to
2622indicate memory corruption as a library or compiler bug.
2623
2624Applications using the low-level @code{mpn} functions, however, will benefit
2625from @option{--enable-assert} since it adds checks on the parameters of most
2626such functions, many of which have subtle restrictions on their usage.  Note
2627however that only the generic C code has checks, not the assembly code, so
2628CPU @samp{none} should be used for maximum checking.
2629
2630@item Temporary Memory Checking
2631The build option @option{--enable-alloca=debug} arranges that each block of
2632temporary memory in GMP is allocated with a separate call to @code{malloc} (or
2633the allocation function set with @code{mp_set_memory_functions}).
2634
2635This can help a malloc debugger detect accesses outside the intended bounds,
2636or detect memory not released.  In a normal build, on the other hand,
2637temporary memory is allocated in blocks which GMP divides up for its own use,
2638or may be allocated with a compiler builtin @code{alloca} which will go
2639nowhere near any malloc debugger hooks.
2640
2641@item Maximum Debuggability
2642To summarize the above, a GMP build for maximum debuggability would be
2643
2644@example
2645./configure --disable-shared --enable-assert \
2646  --enable-alloca=debug --host=none CFLAGS=-g
2647@end example
2648
2649For C++, add @samp{--enable-cxx CXXFLAGS=-g}.
2650
2651@item Checker
2652@cindex Checker
2653@cindex GCC Checker
2654The GCC checker (@uref{http://savannah.nongnu.org/projects/checker/}) can be
2655used with GMP@.  It contains a stub library which means GMP applications
2656compiled with checker can use a normal GMP build.
2657
2658A build of GMP with checking within GMP itself can be made.  This will run
2659very very slowly.  On GNU/Linux for example,
2660
2661@cindex @command{checkergcc}
2662@example
2663./configure --host=none-pc-linux-gnu CC=checkergcc
2664@end example
2665
2666@samp{--host=none} must be used, since the GMP assembly code doesn't support
2667the checking scheme.  The GMP C++ features cannot be used, since current
2668versions of checker (0.9.9.1) don't yet support the standard C++ library.
2669
2670@item Valgrind
2671@cindex Valgrind
2672The valgrind program (@uref{http://valgrind.org/}) is a memory
2673checker for x86s.  It translates and emulates machine instructions to do
2674strong checks for uninitialized data (at the level of individual bits), memory
2675accesses through bad pointers, and memory leaks.
2676
Recent versions of Valgrind are getting support for MMX and SSE/SSE2
instructions; for past versions GMP will need to be configured not to use
those, ie.@: for an x86 without them (for instance plain @samp{i486}).
2680
2681@item Other Problems
2682Any suspected bug in GMP itself should be isolated to make sure it's not an
2683application problem, see @ref{Reporting Bugs}.
2684@end table
2685
2686
2687@node Profiling, Autoconf, Debugging, GMP Basics
2688@section Profiling
2689@cindex Profiling
2690@cindex Execution profiling
2691@cindex @code{--enable-profiling}
2692
2693Running a program under a profiler is a good way to find where it's spending
2694most time and where improvements can be best sought.  The profiling choices
2695for a GMP build are as follows.
2696
2697@table @asis
2698@item @samp{--disable-profiling}
2699The default is to add nothing special for profiling.
2700
2701It should be possible to just compile the mainline of a program with @code{-p}
2702and use @command{prof} to get a profile consisting of timer-based sampling of
2703the program counter.  Most of the GMP assembly code has the necessary symbol
2704information.
2705
2706This approach has the advantage of minimizing interference with normal program
2707operation, but on most systems the resolution of the sampling is quite low (10
2708milliseconds for instance), requiring long runs to get accurate information.
2709
2710@item @samp{--enable-profiling=prof}
2711@cindex @code{prof}
2712Build with support for the system @command{prof}, which means @samp{-p} added
2713to the @samp{CFLAGS}.
2714
2715This provides call counting in addition to program counter sampling, which
2716allows the most frequently called routines to be identified, and an average
2717time spent in each routine to be determined.
2718
2719The x86 assembly code has support for this option, but on other processors
2720the assembly routines will be as if compiled without @samp{-p} and therefore
2721won't appear in the call counts.
2722
2723On some systems, such as GNU/Linux, @samp{-p} in fact means @samp{-pg} and in
2724this case @samp{--enable-profiling=gprof} described below should be used
2725instead.
2726
2727@item @samp{--enable-profiling=gprof}
2728@cindex @code{gprof}
2729Build with support for @command{gprof}, which means @samp{-pg} added to the
2730@samp{CFLAGS}.
2731
2732This provides call graph construction in addition to call counting and program
2733counter sampling, which makes it possible to count calls coming from different
2734locations.  For example the number of calls to @code{mpn_mul} from
2735@code{mpz_mul} versus the number from @code{mpf_mul}.  The program counter
2736sampling is still flat though, so only a total time in @code{mpn_mul} would be
2737accumulated, not a separate amount for each call site.
2738
2739The x86 assembly code has support for this option, but on other processors
2740the assembly routines will be as if compiled without @samp{-pg} and therefore
2741not be included in the call counts.
2742
2743On x86 and m68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
2744incompatible, so the latter is omitted from the default flags in that case,
2745which might result in poorer code generation.
2746
2747Incidentally, it should be possible to use the @command{gprof} program with a
2748plain @samp{--enable-profiling=prof} build.  But in that case only the
2749@samp{gprof -p} flat profile and call counts can be expected to be valid, not
2750the @samp{gprof -q} call graph.
2751
2752@item @samp{--enable-profiling=instrument}
2753@cindex @code{-finstrument-functions}
2754@cindex @code{instrument-functions}
2755Build with the GCC option @samp{-finstrument-functions} added to the
2756@samp{CFLAGS} (@pxref{Code Gen Options,, Options for Code Generation, gcc,
2757Using the GNU Compiler Collection (GCC)}).
2758
2759This inserts special instrumenting calls at the start and end of each
2760function, allowing exact timing and full call graph construction.
2761
2762This instrumenting is not normally a standard system feature and will require
2763support from an external library, such as
2764
2765@cindex FunctionCheck
2766@cindex fnccheck
2767@display
2768@uref{http://sourceforge.net/projects/fnccheck/}
2769@end display
2770
2771This should be included in @samp{LIBS} during the GMP configure so that test
2772programs will link.  For example,
2773
2774@example
2775./configure --enable-profiling=instrument LIBS=-lfc
2776@end example
2777
2778On a GNU system the C library provides dummy instrumenting functions, so
2779programs compiled with this option will link.  In this case it's only
2780necessary to ensure the correct library is added when linking an application.
2781
2782The x86 assembly code supports this option, but on other processors the
2783assembly routines will be as if compiled without
2784@samp{-finstrument-functions} meaning time spent in them will effectively be
2785attributed to their caller.
2786@end table
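
For the @samp{--enable-profiling=instrument} case, the instrumenting calls
GCC inserts are to @code{__cyg_profile_func_enter} and
@code{__cyg_profile_func_exit}.  A library such as the one mentioned above
supplies real implementations; the following is only a minimal sketch of what
such hooks look like, counting entries and exits.  The counter and accessor
names are hypothetical, and the hooks carry the
@code{no_instrument_function} attribute so they are not themselves
instrumented.

```c
#include <assert.h>

static unsigned long enter_count, exit_count;

/* Called by instrumented code at the start of every function. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
  (void) this_fn;
  (void) call_site;
  enter_count++;
}

/* Called by instrumented code just before every function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
  (void) this_fn;
  (void) call_site;
  assert(exit_count < enter_count);   /* an exit always follows an entry */
  exit_count++;
}

/* Hypothetical accessors for the counters. */
__attribute__((no_instrument_function))
unsigned long profile_enter_count(void) { return enter_count; }

__attribute__((no_instrument_function))
unsigned long profile_exit_count(void) { return exit_count; }
```

A real profiler would record @code{this_fn} and a timestamp rather than just
counting, which is where the exact timing and call graph come from.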
2787
2788
2789@node Autoconf, Emacs, Profiling, GMP Basics
2790@section Autoconf
2791@cindex Autoconf
2792
2793Autoconf based applications can easily check whether GMP is installed.  The
2794only thing to be noted is that GMP library symbols from version 3 onwards have
2795prefixes like @code{__gmpz}.  The following therefore would be a simple test,
2796
2797@cindex @code{AC_CHECK_LIB}
2798@example
2799AC_CHECK_LIB(gmp, __gmpz_init)
2800@end example
2801
2802This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
2803but an application that must have GMP would want to generate an error if not
2804found.  For example,
2805
2806@example
2807AC_CHECK_LIB(gmp, __gmpz_init, ,
2808  [AC_MSG_ERROR([GNU MP not found, see http://gmplib.org/])])
2809@end example
2810
2811If functions added in some particular version of GMP are required, then one of
2812those can be used when checking.  For example @code{mpz_mul_si} was added in
2813GMP 3.1,
2814
2815@example
2816AC_CHECK_LIB(gmp, __gmpz_mul_si, ,
2817  [AC_MSG_ERROR(
2818  [GNU MP not found, or not 3.1 or up, see http://gmplib.org/])])
2819@end example
2820
2821An alternative would be to test the version number in @file{gmp.h} using say
2822@code{AC_EGREP_CPP}.  That would make it possible to test the exact version,
2823if some particular sub-minor release is known to be necessary.
2824
2825In general it's recommended that applications should simply demand a new
2826enough GMP rather than trying to provide supplements for features not
2827available in past versions.
2828
2829Occasionally an application will need or want to know the size of a type at
2830configuration or preprocessing time, not just with @code{sizeof} in the code.
2831This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
2832up is best for this, since prior versions needed certain @samp{-D} defines on
2833systems using a @code{long long} limb.  The following would suit Autoconf 2.50
2834or up,
2835
2836@example
2837AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
2838@end example
2839
2840
2841@node Emacs,  , Autoconf, GMP Basics
2842@section Emacs
2843@cindex Emacs
2844@cindex @code{info-lookup-symbol}
2845
2846@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
2847on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
2848emacs, The Emacs Editor}).
2849
2850The GMP manual can be included in such lookups by putting the following in
2851your @file{.emacs},
2852
2853@c  This isn't pretty, but there doesn't seem to be a better way (in emacs
2854@c  21.2 at least).  info-lookup->mode-value could be used for the "assoc"s,
2855@c  but that function isn't documented, whereas info-lookup-alist is.
2856@c
2857@example
2858(eval-after-load "info-look"
2859  '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
2860     (setcar (nthcdr 3 mode-value)
2861             (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
2862                   (nth 3 mode-value)))))
2863@end example
2864
2865
2866@node Reporting Bugs, Integer Functions, GMP Basics, Top
2867@comment  node-name,  next,  previous,  up
2868@chapter Reporting Bugs
2869@cindex Reporting bugs
2870@cindex Bug reporting
2871
2872If you think you have found a bug in the GMP library, please investigate it
2873and report it.  We have made this library available to you, and it is not too
2874much to ask you to report the bugs you find.
2875
2876Before you report a bug, check it's not already addressed in @ref{Known Build
2877Problems}, or perhaps @ref{Notes for Particular Systems}.  You may also want
2878to check @uref{http://gmplib.org/} for patches for this release.
2879
2880Please include the following in any report,
2881
2882@itemize @bullet
2883@item
2884The GMP version number, and if pre-packaged or patched then say so.
2885
2886@item
2887A test program that makes it possible for us to reproduce the bug.  Include
2888instructions on how to run the program.
2889
2890@item
A description of what is wrong.  If the results are incorrect, say in what
way.  If you get a crash, say so.
2893
2894@item
2895If you get a crash, include a stack backtrace from the debugger if it's
2896informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).
2897
2898@item
2899Please do not send core dumps, executables or @command{strace}s.
2900
2901@item
2902The configuration options you used when building GMP, if any.
2903
2904@item
2905The name of the compiler and its version.  For @command{gcc}, get the version
2906with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.
2907
2908@item
2909The output from running @samp{uname -a}.
2910
2911@item
2912The output from running @samp{./config.guess}, and from running
2913@samp{./configfsf.guess} (might be the same).
2914
2915@item
2916If the bug is related to @samp{configure}, then the compressed contents of
2917@file{config.log}.
2918
2919@item
2920If the bug is related to an @file{asm} file not assembling, then the contents
2921of @file{config.m4} and the offending line or lines from the temporary
2922@file{mpn/tmp-<file>.s}.
2923@end itemize
2924
2925Please make an effort to produce a self-contained report, with something
2926definite that can be tested or debugged.  Vague queries or piecemeal messages
2927are difficult to act on and don't help the development effort.
2928
2929It is not uncommon that an observed problem is actually due to a bug in the
2930compiler; the GMP code tends to explore interesting corners in compilers.
2931
2932If your bug report is good, we will do our best to help you get a corrected
2933version of the library; if the bug report is poor, we won't do anything about
2934it (except maybe ask you to send a better report).
2935
2936Send your report to: @email{gmp-bugs@@gmplib.org}.
2937
2938If you think something in this manual is unclear, or downright incorrect, or if
2939the language needs to be improved, please send a note to the same address.
2940
2941
2942@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
2943@comment  node-name,  next,  previous,  up
2944@chapter Integer Functions
2945@cindex Integer functions
2946
2947This chapter describes the GMP functions for performing integer arithmetic.
2948These functions start with the prefix @code{mpz_}.
2949
2950GMP integers are stored in objects of type @code{mpz_t}.
2951
2952@menu
2953* Initializing Integers::
2954* Assigning Integers::
2955* Simultaneous Integer Init & Assign::
2956* Converting Integers::
2957* Integer Arithmetic::
2958* Integer Division::
2959* Integer Exponentiation::
2960* Integer Roots::
2961* Number Theoretic Functions::
2962* Integer Comparisons::
2963* Integer Logic and Bit Fiddling::
2964* I/O of Integers::
2965* Integer Random Numbers::
2966* Integer Import and Export::
2967* Miscellaneous Integer Functions::
2968* Integer Special Functions::
2969@end menu
2970
2971@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
2972@comment  node-name,  next,  previous,  up
2973@section Initialization Functions
2974@cindex Integer initialization functions
2975@cindex Initialization functions
2976
2977The functions for integer arithmetic assume that all integer objects are
2978initialized.  You do that by calling the function @code{mpz_init}.  For
2979example,
2980
2981@example
2982@{
2983  mpz_t integ;
2984  mpz_init (integ);
2985  @dots{}
2986  mpz_add (integ, @dots{});
2987  @dots{}
2988  mpz_sub (integ, @dots{});
2989
2990  /* Unless the program is about to exit, do ... */
2991  mpz_clear (integ);
2992@}
2993@end example
2994
2995As you can see, you can store new values any number of times, once an
2996object is initialized.
2997
2998@deftypefun void mpz_init (mpz_t @var{x})
2999Initialize @var{x}, and set its value to 0.
3000@end deftypefun
3001
3002@deftypefun void mpz_inits (mpz_t @var{x}, ...)
3003Initialize a NULL-terminated list of @code{mpz_t} variables, and set their
3004values to 0.
3005@end deftypefun
3006
3007@deftypefun void mpz_init2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
3008Initialize @var{x}, with space for @var{n}-bit numbers, and set its value to 0.
3009Calling this function instead of @code{mpz_init} or @code{mpz_inits} is never
3010necessary; reallocation is handled automatically by GMP when needed.
3011
@var{n} is only the initial space; @var{x} will grow automatically in
3013the normal way, if necessary, for subsequent values stored.  @code{mpz_init2}
3014makes it possible to avoid such reallocations if a maximum size is known in
3015advance.
3016@end deftypefun
3017
3018@deftypefun void mpz_clear (mpz_t @var{x})
3019Free the space occupied by @var{x}.  Call this function for all @code{mpz_t}
3020variables when you are done with them.
3021@end deftypefun
3022
3023@deftypefun void mpz_clears (mpz_t @var{x}, ...)
3024Free the space occupied by a NULL-terminated list of @code{mpz_t} variables.
3025@end deftypefun
3026
3027@deftypefun void mpz_realloc2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
3028Change the space allocated for @var{x} to @var{n} bits.  The value in @var{x}
3029is preserved if it fits, or is set to 0 if not.
3030
3031Calling this function is never necessary; reallocation is handled automatically
3032by GMP when needed.  But this function can be used to increase the space for a
3033variable in order to avoid repeated automatic reallocations, or to decrease it
3034to give memory back to the heap.
3035@end deftypefun
3036
3037
3038@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
3039@comment  node-name,  next,  previous,  up
3040@section Assignment Functions
3041@cindex Integer assignment functions
3042@cindex Assignment functions
3043
3044These functions assign new values to already initialized integers
3045(@pxref{Initializing Integers}).
3046
3047@deftypefun void mpz_set (mpz_t @var{rop}, mpz_t @var{op})
3048@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
3049@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
3050@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
3051@deftypefunx void mpz_set_q (mpz_t @var{rop}, mpq_t @var{op})
3052@deftypefunx void mpz_set_f (mpz_t @var{rop}, mpf_t @var{op})
3053Set the value of @var{rop} from @var{op}.
3054
3055@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
3056make it an integer.
3057@end deftypefun
3058
3059@deftypefun int mpz_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
3060Set the value of @var{rop} from @var{str}, a null-terminated C string in base
3061@var{base}.  White space is allowed in the string, and is simply ignored.
3062
3063The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
3064characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
3065@code{0B} for binary, @code{0} for octal, or decimal otherwise.
3066
3067For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value.  For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.
3070
3071This function returns 0 if the entire string is a valid number in base
3072@var{base}.  Otherwise it returns @minus{}1.
3073@c
3074@c  It turns out that it is not entirely true that this function ignores
3075@c  white-space.  It does ignore it between digits, but not after a minus sign
3076@c  or within or after ``0x''.  Some thought was given to disallowing all
3077@c  whitespace, but that would be an incompatible change, whitespace has been
3078@c  documented as ignored ever since GMP 1.
3079@c
3080@end deftypefun
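
The digit and prefix rules described for @code{mpz_set_str} can be
summarized by the following plain C sketch.  It is a model of the documented
rules, not GMP's parser: @code{digit_value} and @code{detect_base} are
hypothetical helpers, ASCII is assumed, and sign and white space handling are
left out.

```c
#include <assert.h>

/* Value of digit character c in the given base (2..62), or -1 if c is not
   a valid digit in that base.  Up to base 36 case is ignored; above that,
   upper case is 10..35 and lower case is 36..61. */
int digit_value(char c, int base)
{
  int v;
  assert(base >= 2 && base <= 62);
  if (c >= '0' && c <= '9')
    v = c - '0';
  else if (c >= 'A' && c <= 'Z')
    v = c - 'A' + 10;
  else if (c >= 'a' && c <= 'z')
    v = c - 'a' + (base <= 36 ? 10 : 36);
  else
    return -1;
  return v < base ? v : -1;
}

/* Base implied by the leading characters when base 0 is requested. */
int detect_base(const char *s)
{
  assert(s != 0);
  if (s[0] == '0') {
    if (s[1] == 'x' || s[1] == 'X') return 16;
    if (s[1] == 'b' || s[1] == 'B') return 2;
    return 8;
  }
  return 10;
}
```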
3081
3082@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
Swap the values of @var{rop1} and @var{rop2} efficiently.
3084@end deftypefun
3085
3086
3087@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
3088@comment  node-name,  next,  previous,  up
3089@section Combined Initialization and Assignment Functions
3090@cindex Integer assignment functions
3091@cindex Assignment functions
3092@cindex Integer initialization functions
3093@cindex Initialization functions
3094
3095For convenience, GMP provides a parallel series of initialize-and-set functions
3096which initialize the output and then store the value there.  These functions'
3097names have the form @code{mpz_init_set@dots{}}
3098
3099Here is an example of using one:
3100
3101@example
3102@{
3103  mpz_t pie;
3104  mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
3105  @dots{}
3106  mpz_sub (pie, @dots{});
3107  @dots{}
3108  mpz_clear (pie);
3109@}
3110@end example
3111
3112@noindent
3113Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
3114functions, it can be used as the source or destination operand for the ordinary
3115integer functions.  Don't use an initialize-and-set function on a variable
3116already initialized!
3117
3118@deftypefun void mpz_init_set (mpz_t @var{rop}, mpz_t @var{op})
3119@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
3120@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
3121@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
3122Initialize @var{rop} with limb space and set the initial numeric value from
3123@var{op}.
3124@end deftypefun
3125
3126@deftypefun int mpz_init_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
3127Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
3128documentation above for details).
3129
3130If the string is a correct base @var{base} number, the function returns 0;
3131if an error occurs it returns @minus{}1.  @var{rop} is initialized even if
3132an error occurs.  (I.e., you have to call @code{mpz_clear} for it.)
3133@end deftypefun
3134
3135
3136@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
3137@comment  node-name,  next,  previous,  up
3138@section Conversion Functions
3139@cindex Integer conversion functions
3140@cindex Conversion functions
3141
3142This section describes functions for converting GMP integers to standard C
3143types.  Functions for converting @emph{to} GMP integers are described in
3144@ref{Assigning Integers} and @ref{I/O of Integers}.
3145
3146@deftypefun {unsigned long int} mpz_get_ui (mpz_t @var{op})
3147Return the value of @var{op} as an @code{unsigned long}.
3148
3149If @var{op} is too big to fit an @code{unsigned long} then just the least
3150significant bits that do fit are returned.  The sign of @var{op} is ignored,
3151only the absolute value is used.
3152@end deftypefun
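
The treatment of sign and oversize values by @code{mpz_get_ui} can be
surprising, so here is a scaled-down model in plain C.  It is only a sketch:
a 64-bit integer plays the part of the @code{mpz_t}, a 16-bit unsigned type
plays the part of @code{unsigned long}, and @code{model_get_ui} and
@code{model_fits_ui} are hypothetical names (the latter corresponding to the
kind of test @code{mpz_fits_ulong_p} makes).

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if op fits the narrow unsigned type; negatives never fit. */
int model_fits_ui(int64_t op)
{
  return op >= 0 && op <= UINT16_MAX;
}

/* Low 16 bits of the absolute value of op; the sign is discarded. */
uint16_t model_get_ui(int64_t op)
{
  uint64_t mag = op < 0 ? (uint64_t) 0 - (uint64_t) op : (uint64_t) op;
  uint16_t low = (uint16_t) (mag & 0xFFFFu);
  /* When the value fits, nothing is lost. */
  assert(!model_fits_ui(op) || (int64_t) low == op);
  return low;
}
```

An application that cares about the difference should make the fits test
first and only then fetch the value.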
3153
3154@deftypefun {signed long int} mpz_get_si (mpz_t @var{op})
3155If @var{op} fits into a @code{signed long int} return the value of @var{op}.
3156Otherwise return the least significant part of @var{op}, with the same sign
3157as @var{op}.
3158
3159If @var{op} is too big to fit in a @code{signed long int}, the returned
3160result is probably not very useful.  To find out if the value will fit, use
3161the function @code{mpz_fits_slong_p}.
3162@end deftypefun
3163
3164@deftypefun double mpz_get_d (mpz_t @var{op})
3165Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding
3166towards zero).
3167
3168If the exponent from the conversion is too big, the result is system
3169dependent.  An infinity is returned where available.  A hardware overflow trap
3170may or may not occur.
3171@end deftypefun
3172
3173@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, mpz_t @var{op})
3174Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding
3175towards zero), and returning the exponent separately.
3176
3177The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
3178exponent is stored to @code{*@var{exp}}.  @m{@var{d} * 2^{exp}, @var{d} *
31792^@var{exp}} is the (truncated) @var{op} value.  If @var{op} is zero, the
3180return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
3181
3182@cindex @code{frexp}
3183This is similar to the standard C @code{frexp} function (@pxref{Normalization
3184Functions,,, libc, The GNU C Library Reference Manual}).
3185@end deftypefun
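
The normalization contract of @code{mpz_get_d_2exp} (a fraction with absolute
value in the range 0.5 up to but not including 1, plus a separate exponent)
can be seen in the following plain C sketch.  It works on a @code{double}
rather than an @code{mpz_t}, so no truncation is involved, and
@code{normalize_2exp} is a hypothetical name; the loop does by hand what
@code{frexp} does.

```c
#include <assert.h>

/* Split v into d and *exp_out such that v == d * 2^(*exp_out) and
   0.5 <= |d| < 1.  For v == 0 the result is 0.0 with exponent 0. */
double normalize_2exp(long *exp_out, double v)
{
  double d = v < 0 ? -v : v;
  long e = 0;

  assert(v == v && v - v == 0.0);     /* finite input only */
  if (d != 0.0) {
    while (d >= 1.0) { d /= 2.0; e++; }
    while (d < 0.5)  { d *= 2.0; e--; }
  }
  *exp_out = e;
  return v < 0 ? -d : d;
}
```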
3186
3187@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, mpz_t @var{op})
3188Convert @var{op} to a string of digits in base @var{base}.  The base argument
3189may vary from 2 to 62 or from @minus{}2 to @minus{}36.
3190
3191For @var{base} in the range 2..36, digits and lower-case letters are used; for
3192@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
3193digits, upper-case letters, and lower-case letters (in that significance order)
3194are used.
3195
If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}).  The block will be
@code{strlen(@var{result})+1} bytes, where @var{result} is the returned
string, that being exactly enough for the string and null-terminator.
3200
3201If @var{str} is not @code{NULL}, it should point to a block of storage large
3202enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
3203+ 2}.  The two extra bytes are for a possible minus sign, and the
3204null-terminator.
3205
3206A pointer to the result string is returned, being either the allocated block,
3207or the given @var{str}.
3208@end deftypefun
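
The buffer sizing rule and the letter case convention of @code{mpz_get_str}
can be modelled on a plain @code{long} as follows.  This is a sketch under
stated assumptions: @code{digits_in_base} and @code{to_base} are hypothetical
names, only bases up to 36 in absolute value are handled, and unlike
@code{mpz_sizeinbase} the digit count here is always exact.

```c
#include <assert.h>
#include <string.h>

/* Number of digits of |v| in the given base (1 for zero). */
size_t digits_in_base(long v, int base)
{
  unsigned long mag = v < 0 ? 0UL - (unsigned long) v : (unsigned long) v;
  size_t n = 1;
  assert(base >= 2 && base <= 36);
  while (mag >= (unsigned long) base) { mag /= (unsigned long) base; n++; }
  return n;
}

/* Write v into buf: base 2..36 gives lower-case letters, -2..-36 gives
   upper-case.  buf must hold digits_in_base(v, |base|) + 2 bytes, the two
   extra bytes being for a possible minus sign and the null-terminator. */
char *to_base(char *buf, int base, long v)
{
  const char *digits = base < 0 ? "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                                : "0123456789abcdefghijklmnopqrstuvwxyz";
  int absbase = base < 0 ? -base : base;
  unsigned long mag = v < 0 ? 0UL - (unsigned long) v : (unsigned long) v;
  char *p = buf;
  size_t n, i;

  assert(absbase >= 2 && absbase <= 36);
  n = digits_in_base(v, absbase);
  if (v < 0)
    *p++ = '-';
  p[n] = '\0';
  for (i = n; i > 0; i--) {           /* fill digits from the right */
    p[i - 1] = digits[mag % (unsigned long) absbase];
    mag /= (unsigned long) absbase;
  }
  assert(strlen(buf) + 1 <= n + 2);   /* fits the documented size */
  return buf;
}
```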
3209
3210
3211@need 2000
3212@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
3213@comment  node-name,  next,  previous,  up
3214@section Arithmetic Functions
3215@cindex Integer arithmetic functions
3216@cindex Arithmetic functions
3217
3218@deftypefun void mpz_add (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3219@deftypefunx void mpz_add_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
3220Set @var{rop} to @math{@var{op1} + @var{op2}}.
3221@end deftypefun
3222
3223@deftypefun void mpz_sub (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3224@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
3225@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, mpz_t @var{op2})
3226Set @var{rop} to @var{op1} @minus{} @var{op2}.
3227@end deftypefun
3228
3229@deftypefun void mpz_mul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3230@deftypefunx void mpz_mul_si (mpz_t @var{rop}, mpz_t @var{op1}, long int @var{op2})
3231@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
3232Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
3233@end deftypefun
3234
3235@deftypefun void mpz_addmul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3236@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
3237Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
3238@end deftypefun
3239
3240@deftypefun void mpz_submul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3241@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
3242Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
3243@end deftypefun
3244
3245@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, mpz_t @var{op1}, mp_bitcnt_t @var{op2})
3246@cindex Bit shift left
3247Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
3248@var{op2}}.  This operation can also be defined as a left shift by @var{op2}
3249bits.
3250@end deftypefun
3251
3252@deftypefun void mpz_neg (mpz_t @var{rop}, mpz_t @var{op})
3253Set @var{rop} to @minus{}@var{op}.
3254@end deftypefun
3255
3256@deftypefun void mpz_abs (mpz_t @var{rop}, mpz_t @var{op})
3257Set @var{rop} to the absolute value of @var{op}.
3258@end deftypefun
3259
3260
3261@need 2000
3262@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
3263@section Division Functions
3264@cindex Integer division functions
3265@cindex Division functions
3266
Division is undefined if the divisor is zero.  Passing a zero divisor to the
division or modulo functions (including the modular powering functions
@code{mpz_powm} and @code{mpz_powm_ui}) will cause an intentional division by
zero.  This lets a program handle arithmetic exceptions in these functions the
same way as for normal C @code{int} arithmetic.
3272
3273@c  Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
3274@c  between each, and seem to let tex do a better job of page breaks than an
3275@c  @sp 1 in the middle of one big set.
3276
3277@deftypefun void mpz_cdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
3278@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
3279@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
3280@maybepagebreak
3281@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
3282@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
3283@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
3284@deftypefunx {unsigned long int} mpz_cdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
3285@maybepagebreak
3286@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3287@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3288@end deftypefun
3289
3290@deftypefun void mpz_fdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
3291@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
3292@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
3293@maybepagebreak
3294@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
3295@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
3296@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
3297@deftypefunx {unsigned long int} mpz_fdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
3298@maybepagebreak
3299@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3300@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3301@end deftypefun
3302
3303@deftypefun void mpz_tdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
3304@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
3305@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
3306@maybepagebreak
3307@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
3308@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
3309@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
3310@deftypefunx {unsigned long int} mpz_tdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
3311@maybepagebreak
3312@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3313@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3314@cindex Bit shift right
3315
3316@sp 1
3317Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
3318@var{r}.  For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
3319The rounding is in three styles, each suiting different applications.
3320
3321@itemize @bullet
3322@item
3323@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
3324have the opposite sign to @var{d}.  The @code{c} stands for ``ceil''.
3325
3326@item
3327@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
3328@var{r} will have the same sign as @var{d}.  The @code{f} stands for
3329``floor''.
3330
3331@item
3332@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
3333as @var{n}.  The @code{t} stands for ``truncate''.
3334@end itemize
3335
3336In all cases @var{q} and @var{r} will satisfy
3337@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
3338@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.
3339
3340The @code{q} functions calculate only the quotient, the @code{r} functions
3341only the remainder, and the @code{qr} functions calculate both.  Note that for
3342@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
3343results will be unpredictable.
3344
3345For the @code{ui} variants the return value is the remainder, and in fact
3346returning the remainder is all the @code{div_ui} functions do.  For
3347@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
3348return value is the absolute value of the remainder.
3349
3350For the @code{2exp} variants the divisor is @m{2^b,2^@var{b}}.  These
3351functions are implemented as right shifts and bit masks, but of course they
3352round the same as the other functions.
3353
For positive @var{n} both @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp}
are simple bitwise right shifts.  For negative @var{n}, @code{mpz_fdiv_q_2exp}
is effectively an arithmetic right shift treating @var{n} as two's complement,
the same as the bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp}
effectively treats @var{n} as sign and magnitude.
3359@end deftypefun
3360
3361@deftypefun void mpz_mod (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
3362@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
3363Set @var{r} to @var{n} @code{mod} @var{d}.  The sign of the divisor is
3364ignored; the result is always non-negative.
3365
3366@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
3367remainder as well as setting @var{r}.  See @code{mpz_fdiv_ui} above if only
3368the return value is wanted.
3369@end deftypefun
3370
3371@deftypefun void mpz_divexact (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
3372@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, mpz_t @var{n}, unsigned long @var{d})
3373@cindex Exact division functions
3374Set @var{q} to @var{n}/@var{d}.  These functions produce correct results only
3375when it is known in advance that @var{d} divides @var{n}.
3376
3377These routines are much faster than the other division functions, and are the
3378best choice when exact division is known to occur, for example reducing a
3379rational to lowest terms.
3380@end deftypefun
3381
3382@deftypefun int mpz_divisible_p (mpz_t @var{n}, mpz_t @var{d})
3383@deftypefunx int mpz_divisible_ui_p (mpz_t @var{n}, unsigned long int @var{d})
3384@deftypefunx int mpz_divisible_2exp_p (mpz_t @var{n}, mp_bitcnt_t @var{b})
3385@cindex Divisibility functions
3386Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of
3387@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}.
3388
3389@var{n} is divisible by @var{d} if there exists an integer @var{q} satisfying
3390@math{@var{n} = @var{q}@GMPmultiply{}@var{d}}.  Unlike the other division
3391functions, @math{@var{d}=0} is accepted and following the rule it can be seen
3392that only 0 is considered divisible by 0.
3393@end deftypefun
3394
3395@deftypefun int mpz_congruent_p (mpz_t @var{n}, mpz_t @var{c}, mpz_t @var{d})
3396@deftypefunx int mpz_congruent_ui_p (mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d})
3397@deftypefunx int mpz_congruent_2exp_p (mpz_t @var{n}, mpz_t @var{c}, mp_bitcnt_t @var{b})
3398@cindex Divisibility functions
3399@cindex Congruence functions
3400Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the
3401case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}.
3402
3403@var{n} is congruent to @var{c} mod @var{d} if there exists an integer @var{q}
3404satisfying @math{@var{n} = @var{c} + @var{q}@GMPmultiply{}@var{d}}.  Unlike
3405the other division functions, @math{@var{d}=0} is accepted and following the
3406rule it can be seen that @var{n} and @var{c} are considered congruent mod 0
3407only when exactly equal.
3408@end deftypefun
3409
3410
3411@need 2000
3412@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions
3413@section Exponentiation Functions
3414@cindex Integer exponentiation functions
3415@cindex Exponentiation functions
3416@cindex Powering functions
3417
3418@deftypefun void mpz_powm (mpz_t @var{rop}, mpz_t @var{base}, mpz_t @var{exp}, mpz_t @var{mod})
3419@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp}, mpz_t @var{mod})
3420Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
3421modulo @var{mod}}.
3422
3423Negative @var{exp} is supported if an inverse @math{@var{base}^@W{-1} @bmod
3424@var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}).
3425If an inverse doesn't exist then a divide by zero is raised.
3426@end deftypefun
3427
3428@deftypefun void mpz_powm_sec (mpz_t @var{rop}, mpz_t @var{base}, mpz_t @var{exp}, mpz_t @var{mod})
3429Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
3430modulo @var{mod}}.
3431
3432It is required that @math{@var{exp} > 0} and that @var{mod} is odd.
3433
3434This function is designed to take the same time and have the same cache access
3435patterns for any two same-size arguments, assuming that function arguments are
3436placed at the same position and that the machine state is identical upon
3437function entry.  This function is intended for cryptographic purposes, where
3438resilience to side-channel attacks is desired.
3439@end deftypefun
3440
3441@deftypefun void mpz_pow_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp})
3442@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp})
3443Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}.  The case
3444@math{0^0} yields 1.
3445@end deftypefun
3446
3447
3448@need 2000
3449@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions
3450@section Root Extraction Functions
3451@cindex Integer root functions
3452@cindex Root extraction functions
3453
3454@deftypefun int mpz_root (mpz_t @var{rop}, mpz_t @var{op}, unsigned long int @var{n})
3455Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer
3456part of the @var{n}th root of @var{op}.  Return non-zero if the computation
3457was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power.
3458@end deftypefun
3459
3460@deftypefun void mpz_rootrem (mpz_t @var{root}, mpz_t @var{rem}, mpz_t @var{u}, unsigned long int @var{n})
3461Set @var{root} to @m{\lfloor\root n \of {u}\rfloor@C{},} the truncated
3462integer part of the @var{n}th root of @var{u}.  Set @var{rem} to the
3463remainder, @m{(@var{u} - @var{root}^n),
3464@var{u}@minus{}@var{root}**@var{n}}.
3465@end deftypefun
3466
3467@deftypefun void mpz_sqrt (mpz_t @var{rop}, mpz_t @var{op})
3468Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated
3469integer part of the square root of @var{op}.
3470@end deftypefun
3471
3472@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, mpz_t @var{op})
3473Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
3474of the square root of @var{op}}, like @code{mpz_sqrt}.  Set @var{rop2} to the
3475remainder @m{(@var{op} - @var{rop1}^2),
3476@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a
3477perfect square.
3478
3479If @var{rop1} and @var{rop2} are the same variable, the results are
3480undefined.
3481@end deftypefun
3482
3483@deftypefun int mpz_perfect_power_p (mpz_t @var{op})
3484@cindex Perfect power functions
3485@cindex Root testing functions
3486Return non-zero if @var{op} is a perfect power, i.e., if there exist integers
3487@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that
3488@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}.
3489
3490Under this definition both 0 and 1 are considered to be perfect powers.
3491Negative values of @var{op} are accepted, but of course can only be odd
3492perfect powers.
3493@end deftypefun
3494
3495@deftypefun int mpz_perfect_square_p (mpz_t @var{op})
3496@cindex Perfect square functions
3497@cindex Root testing functions
3498Return non-zero if @var{op} is a perfect square, i.e., if the square root of
3499@var{op} is an integer.  Under this definition both 0 and 1 are considered to
3500be perfect squares.
3501@end deftypefun
3502
3503
3504@need 2000
3505@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions
3506@section Number Theoretic Functions
3507@cindex Number theoretic functions
3508
3509@deftypefun int mpz_probab_prime_p (mpz_t @var{n}, int @var{reps})
3510@cindex Prime testing functions
3511@cindex Probable prime testing functions
3512Determine whether @var{n} is prime.  Return 2 if @var{n} is definitely prime,
3513return 1 if @var{n} is probably prime (without being certain), or return 0 if
3514@var{n} is definitely composite.
3515
This function does some trial divisions, then some Miller-Rabin probabilistic
primality tests.  @var{reps} controls how many such tests are done; 5 to 10 is
a reasonable number, and more will reduce the chances of a composite being
returned as ``probably prime''.
3520
3521Miller-Rabin and similar tests can be more properly called compositeness
3522tests.  Numbers which fail are known to be composite but those which pass
3523might be prime or might be composite.  Only a few composites pass, hence those
3524which pass are considered probably prime.
3525@end deftypefun
3526
3527@deftypefun void mpz_nextprime (mpz_t @var{rop}, mpz_t @var{op})
3528@cindex Next prime function
3529Set @var{rop} to the next prime greater than @var{op}.
3530
This function uses a probabilistic algorithm to identify primes.  For
practical purposes it's adequate, since the chance of a composite passing will
be extremely small.
3534@end deftypefun
3535
3536@c mpz_prime_p not implemented as of gmp 3.0.
3537
3538@c @deftypefun int mpz_prime_p (mpz_t @var{n})
3539@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
3540@c This function is far slower than @code{mpz_probab_prime_p}, but then it
3541@c never returns non-zero for composite numbers.
3542
3543@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
3544@c The likelihood of a programming error or hardware malfunction is orders
3545@c of magnitudes greater than the likelihood for a composite to pass as a
3546@c prime, if the @var{reps} argument is in the suggested range.)
3547@c @end deftypefun
3548
3549@deftypefun void mpz_gcd (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3550@cindex Greatest common divisor functions
3551@cindex GCD functions
3552Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}.
3553The result is always positive even if one or both input operands
3554are negative.
3555@end deftypefun
3556
3557@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
3558Compute the greatest common divisor of @var{op1} and @var{op2}.  If
3559@var{rop} is not @code{NULL}, store the result there.
3560
3561If the result is small enough to fit in an @code{unsigned long int}, it is
3562returned.  If the result does not fit, 0 is returned, and the result is equal
3563to the argument @var{op1}.  Note that the result will always fit if @var{op2}
3564is non-zero.
3565@end deftypefun
3566
3567@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, mpz_t @var{a}, mpz_t @var{b})
3568@cindex Extended GCD
3569@cindex GCD extended
3570Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in
3571addition set @var{s} and @var{t} to coefficients satisfying
3572@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}.
3573The value in @var{g} is always positive, even if one or both of @var{a} and
3574@var{b} are negative.  The values in @var{s} and @var{t} are chosen such that
3575@math{@GMPabs{@var{s}} @le{} @GMPabs{@var{b}}} and @math{@GMPabs{@var{t}}
3576@le{} @GMPabs{@var{a}}}.
3577
3578If @var{t} is @code{NULL} then that value is not computed.
3579@end deftypefun
3580
3581@deftypefun void mpz_lcm (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3582@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long @var{op2})
3583@cindex Least common multiple functions
3584@cindex LCM functions
3585Set @var{rop} to the least common multiple of @var{op1} and @var{op2}.
3586@var{rop} is always positive, irrespective of the signs of @var{op1} and
3587@var{op2}.  @var{rop} will be zero if either @var{op1} or @var{op2} is zero.
3588@end deftypefun
3589
3590@deftypefun int mpz_invert (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3591@cindex Modular inverse functions
3592@cindex Inverse modulo functions
3593Compute the inverse of @var{op1} modulo @var{op2} and put the result in
3594@var{rop}.  If the inverse exists, the return value is non-zero and @var{rop}
3595will satisfy @math{0 @le{} @var{rop} < @var{op2}}.  If an inverse doesn't exist
3596the return value is zero and @var{rop} is undefined.
3597@end deftypefun
3598
3599@deftypefun int mpz_jacobi (mpz_t @var{a}, mpz_t @var{b})
3600@cindex Jacobi symbol functions
3601Calculate the Jacobi symbol @m{\left(a \over b\right),
3602(@var{a}/@var{b})}.  This is defined only for @var{b} odd.
3603@end deftypefun
3604
3605@deftypefun int mpz_legendre (mpz_t @var{a}, mpz_t @var{p})
3606@cindex Legendre symbol functions
3607Calculate the Legendre symbol @m{\left(a \over p\right),
3608(@var{a}/@var{p})}.  This is defined only for @var{p} an odd positive
3609prime, and for such @var{p} it's identical to the Jacobi symbol.
3610@end deftypefun
3611
3612@deftypefun int mpz_kronecker (mpz_t @var{a}, mpz_t @var{b})
3613@deftypefunx int mpz_kronecker_si (mpz_t @var{a}, long @var{b})
3614@deftypefunx int mpz_kronecker_ui (mpz_t @var{a}, unsigned long @var{b})
3615@deftypefunx int mpz_si_kronecker (long @var{a}, mpz_t @var{b})
3616@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, mpz_t @var{b})
3617@cindex Kronecker symbol functions
Calculate the Jacobi symbol @m{\left(a \over b\right),
(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} is odd, or
@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} is even.
3622
3623When @var{b} is odd the Jacobi symbol and Kronecker symbol are
3624identical, so @code{mpz_kronecker_ui} etc can be used for mixed
3625precision Jacobi symbols too.
3626
3627For more information see Henri Cohen section 1.4.2 (@pxref{References}),
3628or any number theory textbook.  See also the example program
3629@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
3630@end deftypefun
3631
3632@deftypefun {mp_bitcnt_t} mpz_remove (mpz_t @var{rop}, mpz_t @var{op}, mpz_t @var{f})
3633@cindex Remove factor functions
3634@cindex Factor removal functions
3635Remove all occurrences of the factor @var{f} from @var{op} and store the
3636result in @var{rop}.  The return value is how many such occurrences were
3637removed.
3638@end deftypefun
3639
3640@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{op})
3641@cindex Factorial functions
3642Set @var{rop} to @var{op}!, the factorial of @var{op}.
3643@end deftypefun
3644
3645@deftypefun void mpz_bin_ui (mpz_t @var{rop}, mpz_t @var{n}, unsigned long int @var{k})
3646@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
3647@cindex Binomial coefficient functions
3648Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
3649@var{k}} and store the result in @var{rop}.  Negative values of @var{n} are
3650supported by @code{mpz_bin_ui}, using the identity
3651@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
3652bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6
3653part G.
3654@end deftypefun
3655
3656@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
3657@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
3658@cindex Fibonacci sequence functions
@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci
number.  @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
@m{F_{n-1},F[n-1]}.
3662
3663These functions are designed for calculating isolated Fibonacci numbers.  When
3664a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
3665iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
3666similar.
3667@end deftypefun
3668
3669@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
3670@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
3671@cindex Lucas number functions
@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas
number.  @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
to @m{L_{n-1},L[n-1]}.
3675
3676These functions are designed for calculating isolated Lucas numbers.  When a
3677sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and
3678iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or
3679similar.
3680
The Fibonacci numbers and Lucas numbers are related sequences, so it's never
necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}.  The
formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers
Algorithm}, and the reverse is straightforward too.
3685@end deftypefun
3686
3687
3688@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions
3689@comment  node-name,  next,  previous,  up
3690@section Comparison Functions
3691@cindex Integer comparison functions
3692@cindex Comparison functions
3693
3694@deftypefn Function int mpz_cmp (mpz_t @var{op1}, mpz_t @var{op2})
3695@deftypefnx Function int mpz_cmp_d (mpz_t @var{op1}, double @var{op2})
3696@deftypefnx Macro int mpz_cmp_si (mpz_t @var{op1}, signed long int @var{op2})
3697@deftypefnx Macro int mpz_cmp_ui (mpz_t @var{op1}, unsigned long int @var{op2})
3698Compare @var{op1} and @var{op2}.  Return a positive value if @math{@var{op1} >
3699@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if
3700@math{@var{op1} < @var{op2}}.
3701
3702@code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate their
3703arguments more than once.  @code{mpz_cmp_d} can be called with an infinity,
3704but results are undefined for a NaN.
3705@end deftypefn
3706
3707@deftypefn Function int mpz_cmpabs (mpz_t @var{op1}, mpz_t @var{op2})
3708@deftypefnx Function int mpz_cmpabs_d (mpz_t @var{op1}, double @var{op2})
3709@deftypefnx Function int mpz_cmpabs_ui (mpz_t @var{op1}, unsigned long int @var{op2})
3710Compare the absolute values of @var{op1} and @var{op2}.  Return a positive
3711value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if
3712@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if
3713@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}.
3714
3715@code{mpz_cmpabs_d} can be called with an infinity, but results are undefined
3716for a NaN.
3717@end deftypefn
3718
3719@deftypefn Macro int mpz_sgn (mpz_t @var{op})
3720@cindex Sign tests
3721@cindex Integer sign tests
3722Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
3723@math{-1} if @math{@var{op} < 0}.
3724
3725This function is actually implemented as a macro.  It evaluates its argument
3726multiple times.
3727@end deftypefn
3728
3729
3730@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions
3731@comment  node-name,  next,  previous,  up
3732@section Logical and Bit Manipulation Functions
3733@cindex Logical functions
3734@cindex Bit manipulation functions
3735@cindex Integer logical functions
3736@cindex Integer bit manipulation functions
3737
These functions behave as if two's complement arithmetic were used (although
sign-magnitude is the actual implementation).  The least significant bit is
number 0.
3741
3742@deftypefun void mpz_and (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3743Set @var{rop} to @var{op1} bitwise-and @var{op2}.
3744@end deftypefun
3745
3746@deftypefun void mpz_ior (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3747Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}.
3748@end deftypefun
3749
3750@deftypefun void mpz_xor (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
3751Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}.
3752@end deftypefun
3753
3754@deftypefun void mpz_com (mpz_t @var{rop}, mpz_t @var{op})
3755Set @var{rop} to the one's complement of @var{op}.
3756@end deftypefun
3757
3758@deftypefun {mp_bitcnt_t} mpz_popcount (mpz_t @var{op})
3759If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is the
3760number of 1 bits in the binary representation.  If @math{@var{op}<0}, the
3761number of 1s is infinite, and the return value is the largest possible
3762@code{mp_bitcnt_t}.
3763@end deftypefun
3764
3765@deftypefun {mp_bitcnt_t} mpz_hamdist (mpz_t @var{op1}, mpz_t @var{op2})
If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return the
Hamming distance between the two operands, which is the number of bit positions
where @var{op1} and @var{op2} have different bit values.  If one operand is
@math{@ge{}0} and the other @math{<0} then the number of bits different is
infinite, and the return value is the largest possible @code{mp_bitcnt_t}.
3771@end deftypefun
3772
3773@deftypefun {mp_bitcnt_t} mpz_scan0 (mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
3774@deftypefunx {mp_bitcnt_t} mpz_scan1 (mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
3775@cindex Bit scanning functions
3776@cindex Scan bit functions
3777Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
3778bits, until the first 0 or 1 bit (respectively) is found.  Return the index of
3779the found bit.
3780
3781If the bit at @var{starting_bit} is already what's sought, then
3782@var{starting_bit} is returned.
3783
3784If there's no bit found, then the largest possible @code{mp_bitcnt_t} is
3785returned.  This will happen in @code{mpz_scan0} past the end of a negative
3786number, or @code{mpz_scan1} past the end of a nonnegative number.
3787@end deftypefun
3788
3789@deftypefun void mpz_setbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
3790Set bit @var{bit_index} in @var{rop}.
3791@end deftypefun
3792
3793@deftypefun void mpz_clrbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
3794Clear bit @var{bit_index} in @var{rop}.
3795@end deftypefun
3796
3797@deftypefun void mpz_combit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
3798Complement bit @var{bit_index} in @var{rop}.
3799@end deftypefun
3800
3801@deftypefun int mpz_tstbit (mpz_t @var{op}, mp_bitcnt_t @var{bit_index})
3802Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly.
3803@end deftypefun
3804
3805@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions
3806@comment  node-name,  next,  previous,  up
3807@section Input and Output Functions
3808@cindex Integer input and output functions
3809@cindex Input functions
3810@cindex Output functions
3811@cindex I/O functions
3812
These functions read @code{mpz} numbers from a stdio stream or write them to
one.  Passing a @code{NULL} pointer for a @var{stream} argument makes the input
functions read from @code{stdin} and the output functions write to
@code{stdout}.
3817
3818When using any of these functions, it is a good idea to include @file{stdio.h}
3819before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
3820for these functions.
3821
3822See also @ref{Formatted Output} and @ref{Formatted Input}.
3823
3824@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, mpz_t @var{op})
3825Output @var{op} on stdio stream @var{stream}, as a string of digits in base
3826@var{base}.  The base argument may vary from 2 to 62 or from @minus{}2 to
3827@minus{}36.
3828
3829For @var{base} in the range 2..36, digits and lower-case letters are used; for
3830@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
3831digits, upper-case letters, and lower-case letters (in that significance order)
3832are used.
3833
3834Return the number of bytes written, or if an error occurred, return 0.
3835@end deftypefun
3836
3837@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
3838Input a possibly white-space preceded string in base @var{base} from stdio
3839stream @var{stream}, and put the read integer in @var{rop}.
3840
3841The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
3842characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
3843@code{0B} for binary, @code{0} for octal, or decimal otherwise.
3844
For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value.  For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.
3848
3849Return the number of bytes read, or if an error occurred, return 0.
3850@end deftypefun
3851
3852@deftypefun size_t mpz_out_raw (FILE *@var{stream}, mpz_t @var{op})
3853Output @var{op} on stdio stream @var{stream}, in raw binary format.  The
3854integer is written in a portable format, with 4 bytes of size information, and
3855that many bytes of limbs.  Both the size and the limbs are written in
3856decreasing significance order (i.e., in big-endian).
3857
3858The output can be read with @code{mpz_inp_raw}.
3859
3860Return the number of bytes written, or if an error occurred, return 0.
3861
The output of this cannot be read by @code{mpz_inp_raw} from GMP 1, because
of changes necessary for compatibility between 32-bit and 64-bit machines.
3864@end deftypefun
3865
3866@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
3867Input from stdio stream @var{stream} in the format written by
3868@code{mpz_out_raw}, and put the result in @var{rop}.  Return the number of
3869bytes read, or if an error occurred, return 0.
3870
This routine can also read the output of @code{mpz_out_raw} from GMP 1, in
spite of the changes necessary for compatibility between 32-bit and 64-bit
machines.
3874@end deftypefun
3875
3876
3877@need 2000
3878@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
3879@comment  node-name,  next,  previous,  up
3880@section Random Number Functions
3881@cindex Integer random number functions
3882@cindex Random number functions
3883
The random number functions of GMP come in two groups: older functions
that rely on a global state, and newer functions that accept a state
parameter that is read and modified.  Please see @ref{Random Number
Functions} for more information on how to use and not to use random
number functions.
3889
3890@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
3891Generate a uniformly distributed random integer in the range 0 to @m{2^n-1,
38922^@var{n}@minus{}1}, inclusive.
3893
3894The variable @var{state} must be initialized by calling one of the
3895@code{gmp_randinit} functions (@ref{Random State Initialization}) before
3896invoking this function.
3897@end deftypefun
3898
3899@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, mpz_t @var{n})
3900Generate a uniform random integer in the range 0 to @math{@var{n}-1},
3901inclusive.
3902
3903The variable @var{state} must be initialized by calling one of the
3904@code{gmp_randinit} functions (@ref{Random State Initialization})
3905before invoking this function.
3906@end deftypefun
3907
3908@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
3909Generate a random integer with long strings of zeros and ones in the
binary representation.  Useful for testing functions and algorithms,
since this kind of random number has proven to be more likely to
trigger corner-case bugs.  The random number will be in the range
39130 to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive.
3914
3915The variable @var{state} must be initialized by calling one of the
3916@code{gmp_randinit} functions (@ref{Random State Initialization})
3917before invoking this function.
3918@end deftypefun
3919
3920@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
3921Generate a random integer of at most @var{max_size} limbs.  The generated
3922random number doesn't satisfy any particular requirements of randomness.
3923Negative random numbers are generated when @var{max_size} is negative.
3924
3925This function is obsolete.  Use @code{mpz_urandomb} or
3926@code{mpz_urandomm} instead.
3927@end deftypefun
3928
3929@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
3930Generate a random integer of at most @var{max_size} limbs, with long strings
3931of zeros and ones in the binary representation.  Useful for testing functions
and algorithms, since this kind of random number has proven to be more
3933likely to trigger corner-case bugs.  Negative random numbers are generated
3934when @var{max_size} is negative.
3935
3936This function is obsolete.  Use @code{mpz_rrandomb} instead.
3937@end deftypefun
3938
3939
3940@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions
3941@section Integer Import and Export
3942
3943@code{mpz_t} variables can be converted to and from arbitrary words of binary
3944data with the following functions.
3945
3946@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op})
3947@cindex Integer import
3948@cindex Import
3949Set @var{rop} from an array of word data at @var{op}.
3950
3951The parameters specify the format of the data.  @var{count} many words are
3952read, each @var{size} bytes.  @var{order} can be 1 for most significant word
3953first or -1 for least significant first.  Within each word @var{endian} can be
39541 for most significant byte first, -1 for least significant first, or 0 for
3955the native endianness of the host CPU@.  The most significant @var{nails} bits
3956of each word are skipped, this can be 0 to use the full words.
3957
3958There is no sign taken from the data, @var{rop} will simply be a positive
3959integer.  An application can handle any sign itself, and apply it for instance
3960with @code{mpz_neg}.
3961
3962There are no data alignment restrictions on @var{op}, any address is allowed.
3963
3964Here's an example converting an array of @code{unsigned long} data, most
3965significant element first, and host byte order within each value.
3966
3967@example
3968unsigned long  a[20];
3969/* Initialize @var{z} and @var{a} */
3970mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a);
3971@end example
3972
3973This example assumes the full @code{sizeof} bytes are used for data in the
3974given type, which is usually true, and certainly true for @code{unsigned long}
3975everywhere we know of.  However on Cray vector systems it may be noted that
3976@code{short} and @code{int} are always stored in 8 bytes (and with
3977@code{sizeof} indicating that) but use only 32 or 46 bits.  The @var{nails}
3978feature can account for this, by passing for instance
3979@code{8*sizeof(int)-INT_BIT}.
3980@end deftypefun
3981
3982@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, mpz_t @var{op})
3983@cindex Integer export
3984@cindex Export
3985Fill @var{rop} with word data from @var{op}.
3986
3987The parameters specify the format of the data produced.  Each word will be
3988@var{size} bytes and @var{order} can be 1 for most significant word first or
3989-1 for least significant first.  Within each word @var{endian} can be 1 for
3990most significant byte first, -1 for least significant first, or 0 for the
3991native endianness of the host CPU@.  The most significant @var{nails} bits of
3992each word are unused and set to zero, this can be 0 to produce full words.
3993
3994The number of words produced is written to @code{*@var{countp}}, or
3995@var{countp} can be @code{NULL} to discard the count.  @var{rop} must have
3996enough space for the data, or if @var{rop} is @code{NULL} then a result array
3997of the necessary size is allocated using the current GMP allocation function
3998(@pxref{Custom Allocation}).  In either case the return value is the
3999destination used, either @var{rop} or the allocated block.
4000
4001If @var{op} is non-zero then the most significant word produced will be
4002non-zero.  If @var{op} is zero then the count returned will be zero and
4003nothing written to @var{rop}.  If @var{rop} is @code{NULL} in this case, no
4004block is allocated, just @code{NULL} is returned.
4005
4006The sign of @var{op} is ignored, just the absolute value is exported.  An
4007application can use @code{mpz_sgn} to get the sign and handle it as desired.
4008(@pxref{Integer Comparisons})
4009
4010There are no data alignment restrictions on @var{rop}, any address is allowed.
4011
4012When an application is allocating space itself the required size can be
4013determined with a calculation like the following.  Since @code{mpz_sizeinbase}
4014always returns at least 1, @code{count} here will be at least one, which
4015avoids any portability problems with @code{malloc(0)}, though if @code{z} is
4016zero no space at all is actually needed (or written).
4017
4018@example
numb = 8*size - nails;
4020count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
4021p = malloc (count * size);
4022@end example
4023@end deftypefun
4024
4025
4026@need 2000
4027@node Miscellaneous Integer Functions, Integer Special Functions, Integer Import and Export, Integer Functions
4028@comment  node-name,  next,  previous,  up
4029@section Miscellaneous Functions
4030@cindex Miscellaneous integer functions
4031@cindex Integer miscellaneous functions
4032
4033@deftypefun int mpz_fits_ulong_p (mpz_t @var{op})
4034@deftypefunx int mpz_fits_slong_p (mpz_t @var{op})
4035@deftypefunx int mpz_fits_uint_p (mpz_t @var{op})
4036@deftypefunx int mpz_fits_sint_p (mpz_t @var{op})
4037@deftypefunx int mpz_fits_ushort_p (mpz_t @var{op})
4038@deftypefunx int mpz_fits_sshort_p (mpz_t @var{op})
4039Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
4040@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
4041short int}, or @code{signed short int}, respectively.  Otherwise, return zero.
4042@end deftypefun
4043
4044@deftypefn Macro int mpz_odd_p (mpz_t @var{op})
4045@deftypefnx Macro int mpz_even_p (mpz_t @var{op})
4046Determine whether @var{op} is odd or even, respectively.  Return non-zero if
4047yes, zero if no.  These macros evaluate their argument more than once.
4048@end deftypefn
4049
4050@deftypefun size_t mpz_sizeinbase (mpz_t @var{op}, int @var{base})
4051@cindex Size in digits
4052@cindex Digits in an integer
4053Return the size of @var{op} measured in number of digits in the given
4054@var{base}.  @var{base} can vary from 2 to 62.  The sign of @var{op} is
4055ignored, just the absolute value is used.  The result will be either exact or
40561 too big.  If @var{base} is a power of 2, the result is always exact.  If
4057@var{op} is zero the return value is always 1.
4058
4059This function can be used to determine the space required when converting
4060@var{op} to a string.  The right amount of allocation is normally two more
4061than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign
4062and one for the null-terminator.
4063
4064@cindex Most significant bit
4065It will be noted that @code{mpz_sizeinbase(@var{op},2)} can be used to locate
the most significant 1 bit in @var{op}, counting from 1.  (Unlike the bitwise
functions which start from 0, see @ref{Integer Logic and Bit Fiddling,, Logical
and Bit Manipulation Functions}.)
4069@end deftypefun
4070
4071
4072@node Integer Special Functions,  , Miscellaneous Integer Functions, Integer Functions
4073@section Special Functions
4074@cindex Special integer functions
4075@cindex Integer special functions
4076
4077The functions in this section are for various special purposes.  Most
4078applications will not need them.
4079
4080@deftypefun void mpz_array_init (mpz_t @var{integer_array}, mp_size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
4081This is a special type of initialization.  @strong{Fixed} space of
4082@var{fixed_num_bits} is allocated to each of the @var{array_size} integers in
4083@var{integer_array}.  There is no way to free the storage allocated by this
4084function.  Don't call @code{mpz_clear}!
4085
4086The @var{integer_array} parameter is the first @code{mpz_t} in the array.  For
4087example,
4088
4089@example
4090mpz_t  arr[20000];
4091mpz_array_init (arr[0], 20000, 512);
4092@end example
4093
4094@c  In case anyone's wondering, yes this parameter style is a bit anomalous,
4095@c  it'd probably be nicer if it was "arr" instead of "arr[0]".  Obviously the
4096@c  two differ only in the declaration, not the pointer value, but changing is
4097@c  not possible since it'd provoke warnings or errors in existing sources.
4098
4099This function is only intended for programs that create a large number
4100of integers and need to reduce memory usage by avoiding the overheads of
4101allocating and reallocating lots of small blocks.  In normal programs this
4102function is not recommended.
4103
4104The space allocated to each integer by this function will not be automatically
4105increased, unlike the normal @code{mpz_init}, so an application must ensure it
4106is sufficient for any value stored.  The following space requirements apply to
4107various routines,
4108
4109@itemize @bullet
4110@item
4111@code{mpz_abs}, @code{mpz_neg}, @code{mpz_set}, @code{mpz_set_si} and
4112@code{mpz_set_ui} need room for the value they store.
4113
4114@item
4115@code{mpz_add}, @code{mpz_add_ui}, @code{mpz_sub} and @code{mpz_sub_ui} need
4116room for the larger of the two operands, plus an extra
4117@code{mp_bits_per_limb}.
4118
4119@item
4120@code{mpz_mul}, @code{mpz_mul_ui} and @code{mpz_mul_si} need room for the sum
4121of the number of bits in their operands, but each rounded up to a multiple of
4122@code{mp_bits_per_limb}.
4123
4124@item
4125@code{mpz_swap} can be used between two array variables, but not between an
4126array and a normal variable.
4127@end itemize
4128
4129For other functions, or if in doubt, the suggestion is to calculate in a
4130regular @code{mpz_init} variable and copy the result to an array variable with
4131@code{mpz_set}.
4132@end deftypefun
4133
4134@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
4135Change the space for @var{integer} to @var{new_alloc} limbs.  The value in
4136@var{integer} is preserved if it fits, or is set to 0 if not.  The return
4137value is not useful to applications and should be ignored.
4138
4139@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
4140this.  @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
4141@code{_mpz_realloc} takes its size in limbs.
4142@end deftypefun
4143
4144@deftypefun mp_limb_t mpz_getlimbn (mpz_t @var{op}, mp_size_t @var{n})
4145Return limb number @var{n} from @var{op}.  The sign of @var{op} is ignored,
4146just the absolute value is used.  The least significant limb is number 0.
4147
4148@code{mpz_size} can be used to find how many limbs make up @var{op}.
4149@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
4150@code{mpz_size(@var{op})-1}.
4151@end deftypefun
4152
4153@deftypefun size_t mpz_size (mpz_t @var{op})
4154Return the size of @var{op} measured in number of limbs.  If @var{op} is zero,
4155the returned value will be zero.
4156@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
4157@end deftypefun
4158
4159
4160
4161@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
4162@comment  node-name,  next,  previous,  up
4163@chapter Rational Number Functions
4164@cindex Rational number functions
4165
4166This chapter describes the GMP functions for performing arithmetic on rational
4167numbers.  These functions start with the prefix @code{mpq_}.
4168
4169Rational numbers are stored in objects of type @code{mpq_t}.
4170
4171All rational arithmetic functions assume operands have a canonical form, and
canonicalize their result.  The canonical form means that the denominator and
4173the numerator have no common factors, and that the denominator is positive.
4174Zero has the unique representation 0/1.
4175
4176Pure assignment functions do not canonicalize the assigned variable.  It is
4177the responsibility of the user to canonicalize the assigned variable before
4178any arithmetic operations are performed on that variable.
4179
4180@deftypefun void mpq_canonicalize (mpq_t @var{op})
4181Remove any factors that are common to the numerator and denominator of
4182@var{op}, and make the denominator positive.
4183@end deftypefun
4184
4185@menu
4186* Initializing Rationals::
4187* Rational Conversions::
4188* Rational Arithmetic::
4189* Comparing Rationals::
4190* Applying Integer Functions::
4191* I/O of Rationals::
4192@end menu
4193
4194@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
4195@comment  node-name,  next,  previous,  up
4196@section Initialization and Assignment Functions
4197@cindex Rational assignment functions
4198@cindex Assignment functions
4199@cindex Rational initialization functions
4200@cindex Initialization functions
4201
4202@deftypefun void mpq_init (mpq_t @var{x})
4203Initialize @var{x} and set it to 0/1.  Each variable should normally only be
4204initialized once, or at least cleared out (using the function @code{mpq_clear})
4205between each initialization.
4206@end deftypefun
4207
4208@deftypefun void mpq_inits (mpq_t @var{x}, ...)
4209Initialize a NULL-terminated list of @code{mpq_t} variables, and set their
4210values to 0/1.
4211@end deftypefun
4212
4213@deftypefun void mpq_clear (mpq_t @var{x})
4214Free the space occupied by @var{x}.  Make sure to call this function for all
4215@code{mpq_t} variables when you are done with them.
4216@end deftypefun
4217
4218@deftypefun void mpq_clears (mpq_t @var{x}, ...)
4219Free the space occupied by a NULL-terminated list of @code{mpq_t} variables.
4220@end deftypefun
4221
4222@deftypefun void mpq_set (mpq_t @var{rop}, mpq_t @var{op})
4223@deftypefunx void mpq_set_z (mpq_t @var{rop}, mpz_t @var{op})
4224Assign @var{rop} from @var{op}.
4225@end deftypefun
4226
4227@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
4228@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
4229Set the value of @var{rop} to @var{op1}/@var{op2}.  Note that if @var{op1} and
4230@var{op2} have common factors, @var{rop} has to be passed to
4231@code{mpq_canonicalize} before any operations are performed on @var{rop}.
4232@end deftypefun
4233
4234@deftypefun int mpq_set_str (mpq_t @var{rop}, char *@var{str}, int @var{base})
4235Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.
4236
4237The string can be an integer like ``41'' or a fraction like ``41/152''.  The
4238fraction must be in canonical form (@pxref{Rational Number Functions}), or if
4239not then @code{mpq_canonicalize} must be called.
4240
4241The numerator and optional denominator are parsed the same as in
4242@code{mpz_set_str} (@pxref{Assigning Integers}).  White space is allowed in
4243the string, and is simply ignored.  The @var{base} can vary from 2 to 62, or
4244if @var{base} is 0 then the leading characters are used: @code{0x} or @code{0X} for hex,
4245@code{0b} or @code{0B} for binary,
4246@code{0} for octal, or decimal otherwise.  Note that this is done separately
4247for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
4248whereas @code{0xEF/0x100} is 239/256.
4249
4250The return value is 0 if the entire string is a valid number, or @minus{}1 if
4251not.
4252@end deftypefun
4253
4254@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
4255Swap the values @var{rop1} and @var{rop2} efficiently.
4256@end deftypefun
4257
4258
4259@need 2000
4260@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
4261@comment  node-name,  next,  previous,  up
4262@section Conversion Functions
4263@cindex Rational conversion functions
4264@cindex Conversion functions
4265
4266@deftypefun double mpq_get_d (mpq_t @var{op})
4267Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding
4268towards zero).
4269
If the exponent from the conversion is too big or too small to fit a
@code{double} then the result is system dependent.  If it is too big, an
infinity is returned when available.  If it is too small, @math{0.0} is
normally returned.  Hardware overflow, underflow and denorm traps may or may
not occur.
4274@end deftypefun
4275
4276@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
4277@deftypefunx void mpq_set_f (mpq_t @var{rop}, mpf_t @var{op})
4278Set @var{rop} to the value of @var{op}.  There is no rounding, this conversion
4279is exact.
4280@end deftypefun
4281
4282@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, mpq_t @var{op})
4283Convert @var{op} to a string of digits in base @var{base}.  The base may vary
4284from 2 to 36.  The string will be of the form @samp{num/den}, or if the
4285denominator is 1 then just @samp{num}.
4286
4287If @var{str} is @code{NULL}, the result string is allocated using the current
4288allocation function (@pxref{Custom Allocation}).  The block will be
4289@code{strlen(str)+1} bytes, that being exactly enough for the string and
4290null-terminator.
4291
4292If @var{str} is not @code{NULL}, it should point to a block of storage large
4293enough for the result, that being
4294
4295@example
4296mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
4297+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
4298@end example
4299
4300The three extra bytes are for a possible minus sign, possible slash, and the
4301null-terminator.
4302
4303A pointer to the result string is returned, being either the allocated block,
4304or the given @var{str}.
4305@end deftypefun
4306
4307
4308@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
4309@comment  node-name,  next,  previous,  up
4310@section Arithmetic Functions
4311@cindex Rational arithmetic functions
4312@cindex Arithmetic functions
4313
4314@deftypefun void mpq_add (mpq_t @var{sum}, mpq_t @var{addend1}, mpq_t @var{addend2})
4315Set @var{sum} to @var{addend1} + @var{addend2}.
4316@end deftypefun
4317
4318@deftypefun void mpq_sub (mpq_t @var{difference}, mpq_t @var{minuend}, mpq_t @var{subtrahend})
4319Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
4320@end deftypefun
4321
4322@deftypefun void mpq_mul (mpq_t @var{product}, mpq_t @var{multiplier}, mpq_t @var{multiplicand})
4323Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
4324@end deftypefun
4325
4326@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, mpq_t @var{op1}, mp_bitcnt_t @var{op2})
4327Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
4328@var{op2}}.
4329@end deftypefun
4330
4331@deftypefun void mpq_div (mpq_t @var{quotient}, mpq_t @var{dividend}, mpq_t @var{divisor})
4332@cindex Division functions
4333Set @var{quotient} to @var{dividend}/@var{divisor}.
4334@end deftypefun
4335
4336@deftypefun void mpq_div_2exp (mpq_t @var{rop}, mpq_t @var{op1}, mp_bitcnt_t @var{op2})
4337Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
4338@var{op2}}.
4339@end deftypefun
4340
4341@deftypefun void mpq_neg (mpq_t @var{negated_operand}, mpq_t @var{operand})
4342Set @var{negated_operand} to @minus{}@var{operand}.
4343@end deftypefun
4344
4345@deftypefun void mpq_abs (mpq_t @var{rop}, mpq_t @var{op})
4346Set @var{rop} to the absolute value of @var{op}.
4347@end deftypefun
4348
4349@deftypefun void mpq_inv (mpq_t @var{inverted_number}, mpq_t @var{number})
4350Set @var{inverted_number} to 1/@var{number}.  If the new denominator is
4351zero, this routine will divide by zero.
4352@end deftypefun
4353
4354@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
4355@comment  node-name,  next,  previous,  up
4356@section Comparison Functions
4357@cindex Rational comparison functions
4358@cindex Comparison functions
4359
4360@deftypefun int mpq_cmp (mpq_t @var{op1}, mpq_t @var{op2})
4361Compare @var{op1} and @var{op2}.  Return a positive value if @math{@var{op1} >
4362@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
4363@math{@var{op1} < @var{op2}}.
4364
4365To determine if two rationals are equal, @code{mpq_equal} is faster than
4366@code{mpq_cmp}.
4367@end deftypefun
4368
4369@deftypefn Macro int mpq_cmp_ui (mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
4370@deftypefnx Macro int mpq_cmp_si (mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
4371Compare @var{op1} and @var{num2}/@var{den2}.  Return a positive value if
4372@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
4373@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
4374@var{num2}/@var{den2}}.
4375
4376@var{num2} and @var{den2} are allowed to have common factors.
4377
These functions are implemented as macros and evaluate their arguments
multiple times.
4380@end deftypefn
4381
4382@deftypefn Macro int mpq_sgn (mpq_t @var{op})
4383@cindex Sign tests
4384@cindex Rational sign tests
4385Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
4386@math{-1} if @math{@var{op} < 0}.
4387
This function is actually implemented as a macro.  It evaluates its
argument multiple times.
4390@end deftypefn
4391
4392@deftypefun int mpq_equal (mpq_t @var{op1}, mpq_t @var{op2})
4393Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
4394non-equal.  Although @code{mpq_cmp} can be used for the same purpose, this
4395function is much faster.
4396@end deftypefun
4397
4398@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
4399@comment  node-name,  next,  previous,  up
4400@section Applying Integer Functions to Rationals
4401@cindex Rational numerator and denominator
4402@cindex Numerator and denominator
4403
4404The set of @code{mpq} functions is quite small.  In particular, there are few
4405functions for either input or output.  The following functions give direct
4406access to the numerator and denominator of an @code{mpq_t}.
4407
4408Note that if an assignment to the numerator and/or denominator could take an
4409@code{mpq_t} out of the canonical form described at the start of this chapter
4410(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
4411called before any other @code{mpq} functions are applied to that @code{mpq_t}.
4412
4413@deftypefn Macro mpz_t mpq_numref (mpq_t @var{op})
4414@deftypefnx Macro mpz_t mpq_denref (mpq_t @var{op})
4415Return a reference to the numerator and denominator of @var{op}, respectively.
4416The @code{mpz} functions can be used on the result of these macros.
4417@end deftypefn
4418
4419@deftypefun void mpq_get_num (mpz_t @var{numerator}, mpq_t @var{rational})
4420@deftypefunx void mpq_get_den (mpz_t @var{denominator}, mpq_t @var{rational})
4421@deftypefunx void mpq_set_num (mpq_t @var{rational}, mpz_t @var{numerator})
4422@deftypefunx void mpq_set_den (mpq_t @var{rational}, mpz_t @var{denominator})
4423Get or set the numerator or denominator of a rational.  These functions are
4424equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
4425@code{mpq_denref}.  Direct use of @code{mpq_numref} or @code{mpq_denref} is
4426recommended instead of these functions.
4427@end deftypefun
4428
4429
4430@need 2000
4431@node I/O of Rationals,  , Applying Integer Functions, Rational Number Functions
4432@comment  node-name,  next,  previous,  up
4433@section Input and Output Functions
4434@cindex Rational input and output functions
4435@cindex Input functions
4436@cindex Output functions
4437@cindex I/O functions
4438
4439Functions that perform input from a stdio stream, and functions that output to
4440a stdio stream, of @code{mpq} numbers.  Passing a @code{NULL} pointer for a
4441@var{stream} argument to any of these functions will make them read from
4442@code{stdin} and write to @code{stdout}, respectively.
4443
4444When using any of these functions, it is a good idea to include @file{stdio.h}
4445before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
4446for these functions.
4447
4448See also @ref{Formatted Output} and @ref{Formatted Input}.
4449
4450@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, mpq_t @var{op})
4451Output @var{op} on stdio stream @var{stream}, as a string of digits in base
4452@var{base}.  The base may vary from 2 to 36.  Output is in the form
4453@samp{num/den} or if the denominator is 1 then just @samp{num}.
4454
4455Return the number of bytes written, or if an error occurred, return 0.
4456@end deftypefun
4457
4458@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
4459Read a string of digits from @var{stream} and convert them to a rational in
4460@var{rop}.  Any initial white-space characters are read and discarded.  Return
4461the number of characters read (including white space), or 0 if a rational
4462could not be read.
4463
4464The input can be a fraction like @samp{17/63} or just an integer like
4465@samp{123}.  Reading stops at the first character not in this form, and white
4466space is not permitted within the string.  If the input might not be in
4467canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
4468Number Functions}).
4469
4470The @var{base} can be between 2 and 36, or can be 0 in which case the leading
4471characters of the string determine the base, @samp{0x} or @samp{0X} for
4472hexadecimal, @samp{0} for octal, or decimal otherwise.  The leading characters
4473are examined separately for the numerator and denominator of a fraction, so
4474for instance @samp{0x10/11} is @math{16/11}, whereas @samp{0x10/0x11} is
4475@math{16/17}.
4476@end deftypefun
4477
4478
4479@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
4480@comment  node-name,  next,  previous,  up
4481@chapter Floating-point Functions
4482@cindex Floating-point functions
4483@cindex Float functions
4484@cindex User-defined precision
4485@cindex Precision of floats
4486
4487GMP floating point numbers are stored in objects of type @code{mpf_t} and
4488functions operating on them have an @code{mpf_} prefix.
4489
4490The mantissa of each float has a user-selectable precision, limited only by
4491available memory.  Each variable has its own precision, and that can be
4492increased or decreased at any time.
4493
4494The exponent of each float is a fixed precision, one machine word on most
4495systems.  In the current implementation the exponent is a count of limbs, so
4496for example on a 32-bit system this means a range of roughly
4497@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
4498this will be greater.  Note however @code{mpf_get_str} can only return an
4499exponent which fits an @code{mp_exp_t} and currently @code{mpf_set_str}
4500doesn't accept exponents bigger than a @code{long}.
4501
4502Each variable keeps a size for the mantissa data actually in use.  This means
4503that if a float is exactly represented in only a few bits then only those bits
4504will be used in a calculation, even if the selected precision is high.
4505
4506All calculations are performed to the precision of the destination variable.
4507Each function is defined to calculate with ``infinite precision'' followed by
4508a truncation to the destination precision, but of course the work done is only
4509what's needed to determine a result under that definition.
4510
4511The precision selected for a variable is a minimum value, GMP may increase it
4512a little to facilitate efficient calculation.  Currently this means rounding
4513up to a whole limb, and then sometimes having a further partial limb,
4514depending on the high limb of the mantissa.  But applications shouldn't be
4515concerned by such details.
4516
The mantissa is stored in binary, as might be imagined from the fact that
precisions are expressed in bits.  One consequence of this is that decimal
4519fractions like @math{0.1} cannot be represented exactly.  The same is true of
4520plain IEEE @code{double} floats.  This makes both highly unsuitable for
4521calculations involving money or other values that should be exact decimal
4522fractions.  (Suitably scaled integers, or perhaps rationals, are better
4523choices.)
4524
4525@code{mpf} functions and variables have no special notion of infinity or
4526not-a-number, and applications must take care not to overflow the exponent or
4527results will be unpredictable.  This might change in a future release.
4528
4529Note that the @code{mpf} functions are @emph{not} intended as a smooth
4530extension to IEEE P754 arithmetic.  In particular results obtained on one
4531computer often differ from the results on a computer with a different word
4532size.
4533
4534@menu
4535* Initializing Floats::
4536* Assigning Floats::
4537* Simultaneous Float Init & Assign::
4538* Converting Floats::
4539* Float Arithmetic::
4540* Float Comparison::
4541* I/O of Floats::
4542* Miscellaneous Float Functions::
4543@end menu
4544
4545@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
4546@comment  node-name,  next,  previous,  up
4547@section Initialization Functions
4548@cindex Float initialization functions
4549@cindex Initialization functions
4550
4551@deftypefun void mpf_set_default_prec (mp_bitcnt_t @var{prec})
4552Set the default precision to be @strong{at least} @var{prec} bits.  All
4553subsequent calls to @code{mpf_init} will use this precision, but previously
4554initialized variables are unaffected.
4555@end deftypefun
4556
4557@deftypefun {mp_bitcnt_t} mpf_get_default_prec (void)
4558Return the default precision actually used.
4559@end deftypefun
4560
4561An @code{mpf_t} object must be initialized before storing the first value in
4562it.  The functions @code{mpf_init} and @code{mpf_init2} are used for that
4563purpose.
4564
4565@deftypefun void mpf_init (mpf_t @var{x})
4566Initialize @var{x} to 0.  Normally, a variable should be initialized once only
4567or at least be cleared, using @code{mpf_clear}, between initializations.  The
4568precision of @var{x} is undefined unless a default precision has already been
4569established by a call to @code{mpf_set_default_prec}.
4570@end deftypefun
4571
4572@deftypefun void mpf_init2 (mpf_t @var{x}, mp_bitcnt_t @var{prec})
4573Initialize @var{x} to 0 and set its precision to be @strong{at least}
4574@var{prec} bits.  Normally, a variable should be initialized once only or at
4575least be cleared, using @code{mpf_clear}, between initializations.
4576@end deftypefun
4577
4578@deftypefun void mpf_inits (mpf_t @var{x}, ...)
4579Initialize a NULL-terminated list of @code{mpf_t} variables, and set their
4580values to 0.  The precision of the initialized variables is undefined unless a
4581default precision has already been established by a call to
4582@code{mpf_set_default_prec}.
4583@end deftypefun
4584
4585@deftypefun void mpf_clear (mpf_t @var{x})
4586Free the space occupied by @var{x}.  Make sure to call this function for all
4587@code{mpf_t} variables when you are done with them.
4588@end deftypefun
4589
4590@deftypefun void mpf_clears (mpf_t @var{x}, ...)
4591Free the space occupied by a NULL-terminated list of @code{mpf_t} variables.
4592@end deftypefun
4593
4594@need 2000
Here is an example of how to initialize floating-point variables:
4596@example
4597@{
4598  mpf_t x, y;
4599  mpf_init (x);           /* use default precision */
4600  mpf_init2 (y, 256);     /* precision @emph{at least} 256 bits */
4601  @dots{}
4602  /* Unless the program is about to exit, do ... */
4603  mpf_clear (x);
4604  mpf_clear (y);
4605@}
4606@end example
4607
4608The following three functions are useful for changing the precision during a
4609calculation.  A typical use would be for adjusting the precision gradually in
4610iterative algorithms like Newton-Raphson, making the computation precision
4611closely match the actual accurate part of the numbers.
4612
4613@deftypefun {mp_bitcnt_t} mpf_get_prec (mpf_t @var{op})
4614Return the current precision of @var{op}, in bits.
4615@end deftypefun
4616
4617@deftypefun void mpf_set_prec (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
4618Set the precision of @var{rop} to be @strong{at least} @var{prec} bits.  The
4619value in @var{rop} will be truncated to the new precision.
4620
4621This function requires a call to @code{realloc}, and so should not be used in
4622a tight loop.
4623@end deftypefun
4624
4625@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
4626Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
4627without changing the memory allocated.
4628
4629@var{prec} must be no more than the allocated precision for @var{rop}, that
4630being the precision when @var{rop} was initialized, or in the most recent
4631@code{mpf_set_prec}.
4632
4633The value in @var{rop} is unchanged, and in particular if it had a higher
4634precision than @var{prec} it will retain that higher precision.  New values
4635written to @var{rop} will use the new @var{prec}.
4636
4637Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
4638@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
4639allocated precision.  Failing to do so will have unpredictable results.
4640
4641@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
4642original allocated precision.  After @code{mpf_set_prec_raw} it reflects the
4643@var{prec} value set.
4644
4645@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
4646different precisions during a calculation, perhaps to gradually increase
4647precision in an iteration, or just to use various different precisions for
4648different purposes during a calculation.
4649@end deftypefun
4650
4651
4652@need 2000
4653@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
4654@comment  node-name,  next,  previous,  up
4655@section Assignment Functions
4656@cindex Float assignment functions
4657@cindex Assignment functions
4658
4659These functions assign new values to already initialized floats
4660(@pxref{Initializing Floats}).
4661
4662@deftypefun void mpf_set (mpf_t @var{rop}, mpf_t @var{op})
4663@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4664@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
4665@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
4666@deftypefunx void mpf_set_z (mpf_t @var{rop}, mpz_t @var{op})
4667@deftypefunx void mpf_set_q (mpf_t @var{rop}, mpq_t @var{op})
4668Set the value of @var{rop} from @var{op}.
4669@end deftypefun
4670
4671@deftypefun int mpf_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
4672Set the value of @var{rop} from the string in @var{str}.  The string is of the
4673form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
4674@samp{M} is the mantissa and @samp{N} is the exponent.  The mantissa is always
4675in the specified base.  The exponent is either in the specified base or, if
4676@var{base} is negative, in decimal.  The decimal point expected is taken from
4677the current locale, on systems providing @code{localeconv}.
4678
4679The argument @var{base} may be in the ranges 2 to 62, or @minus{}62 to
4680@minus{}2.  Negative values are used to specify that the exponent is in
4681decimal.
4682
4683For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value; for bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.
4686
4687Unlike the corresponding @code{mpz} function, the base will not be determined
4688from the leading characters of the string if @var{base} is 0.  This is so that
4689numbers like @samp{0.23} are not interpreted as octal.
4690
4691White space is allowed in the string, and is simply ignored.  [This is not
4692really true; white-space is ignored in the beginning of the string and within
4693the mantissa, but not in other places, such as after a minus sign or in the
4694exponent.  We are considering changing the definition of this function, making
4695it fail when there is any white-space in the input, since that makes a lot of
4696sense.  Please tell us your opinion about this change.  Do you really want it
4697to accept @nicode{"3 14"} as meaning 314 as it does now?]
4698
4699This function returns 0 if the entire string is a valid number in base
4700@var{base}.  Otherwise it returns @minus{}1.
4701@end deftypefun
4702
4703@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
4704Swap @var{rop1} and @var{rop2} efficiently.  Both the values and the
4705precisions of the two variables are swapped.
4706@end deftypefun
4707
4708
4709@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
4710@comment  node-name,  next,  previous,  up
4711@section Combined Initialization and Assignment Functions
4712@cindex Float assignment functions
4713@cindex Assignment functions
4714@cindex Float initialization functions
4715@cindex Initialization functions
4716
4717For convenience, GMP provides a parallel series of initialize-and-set functions
4718which initialize the output and then store the value there.  These functions'
4719names have the form @code{mpf_init_set@dots{}}
4720
4721Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
4722functions, it can be used as the source or destination operand for the ordinary
4723float functions.  Don't use an initialize-and-set function on a variable
4724already initialized!
4725
4726@deftypefun void mpf_init_set (mpf_t @var{rop}, mpf_t @var{op})
4727@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4728@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
4729@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
4730Initialize @var{rop} and set its value from @var{op}.
4731
4732The precision of @var{rop} will be taken from the active default precision, as
4733set by @code{mpf_set_default_prec}.
4734@end deftypefun
4735
4736@deftypefun int mpf_init_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
4737Initialize @var{rop} and set its value from the string in @var{str}.  See
4738@code{mpf_set_str} above for details on the assignment operation.
4739
4740Note that @var{rop} is initialized even if an error occurs.  (I.e., you have to
4741call @code{mpf_clear} for it.)
4742
4743The precision of @var{rop} will be taken from the active default precision, as
4744set by @code{mpf_set_default_prec}.
4745@end deftypefun
4746
4747
4748@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
4749@comment  node-name,  next,  previous,  up
4750@section Conversion Functions
4751@cindex Float conversion functions
4752@cindex Conversion functions
4753
4754@deftypefun double mpf_get_d (mpf_t @var{op})
4755Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding
4756towards zero).
4757
4758If the exponent in @var{op} is too big or too small to fit a @code{double}
4759then the result is system dependent.  For too big an infinity is returned when
4760available.  For too small @math{0.0} is normally returned.  Hardware overflow,
4761underflow and denorm traps may or may not occur.
4762@end deftypefun
4763
4764@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, mpf_t @var{op})
4765Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding
4766towards zero), and with an exponent returned separately.
4767
4768The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
4769exponent is stored to @code{*@var{exp}}.  @m{@var{d} * 2^{exp}, @var{d} *
47702^@var{exp}} is the (truncated) @var{op} value.  If @var{op} is zero, the
4771return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
4772
4773@cindex @code{frexp}
4774This is similar to the standard C @code{frexp} function (@pxref{Normalization
4775Functions,,, libc, The GNU C Library Reference Manual}).
4776@end deftypefun
4777
4778@deftypefun long mpf_get_si (mpf_t @var{op})
4779@deftypefunx {unsigned long} mpf_get_ui (mpf_t @var{op})
4780Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
4781fraction part.  If @var{op} is too big for the return type, the result is
4782undefined.
4783
4784See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
4785(@pxref{Miscellaneous Float Functions}).
4786@end deftypefun
4787
4788@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
4789Convert @var{op} to a string of digits in base @var{base}.  The base argument
4790may vary from 2 to 62 or from @minus{}2 to @minus{}36.  Up to @var{n_digits}
4791digits will be generated.  Trailing zeros are not returned.  No more digits
4792than can be accurately represented by @var{op} are ever generated.  If
4793@var{n_digits} is 0 then that accurate maximum number of digits are generated.
4794
4795For @var{base} in the range 2..36, digits and lower-case letters are used; for
4796@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
4797digits, upper-case letters, and lower-case letters (in that significance order)
4798are used.
4799
4800If @var{str} is @code{NULL}, the result string is allocated using the current
4801allocation function (@pxref{Custom Allocation}).  The block will be
4802@code{strlen(str)+1} bytes, that being exactly enough for the string and
4803null-terminator.
4804
4805If @var{str} is not @code{NULL}, it should point to a block of
4806@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
4807possible minus sign, and a null-terminator.  When @var{n_digits} is 0 to get
4808all significant digits, an application won't be able to know the space
4809required, and @var{str} should be @code{NULL} in that case.
4810
4811The generated string is a fraction, with an implicit radix point immediately
4812to the left of the first digit.  The applicable exponent is written through
4813the @var{expptr} pointer.  For example, the number 3.1416 would be returned as
4814string @nicode{"31416"} and exponent 1.
4815
4816When @var{op} is zero, an empty string is produced and the exponent returned
4817is 0.
4818
4819A pointer to the result string is returned, being either the allocated block
4820or the given @var{str}.
4821@end deftypefun
4822
4823
4824@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
4825@comment  node-name,  next,  previous,  up
4826@section Arithmetic Functions
4827@cindex Float arithmetic functions
4828@cindex Arithmetic functions
4829
4830@deftypefun void mpf_add (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4831@deftypefunx void mpf_add_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4832Set @var{rop} to @math{@var{op1} + @var{op2}}.
4833@end deftypefun
4834
4835@deftypefun void mpf_sub (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4836@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
4837@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4838Set @var{rop} to @var{op1} @minus{} @var{op2}.
4839@end deftypefun
4840
4841@deftypefun void mpf_mul (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4842@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4843Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
4844@end deftypefun
4845
4846Division is undefined if the divisor is zero, and passing a zero divisor to the
4847divide functions will make these functions intentionally divide by zero.  This
4848lets the user handle arithmetic exceptions in these functions in the same
4849manner as other arithmetic exceptions.
4850
4851@deftypefun void mpf_div (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4852@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
4853@deftypefunx void mpf_div_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4854@cindex Division functions
4855Set @var{rop} to @var{op1}/@var{op2}.
4856@end deftypefun
4857
4858@deftypefun void mpf_sqrt (mpf_t @var{rop}, mpf_t @var{op})
4859@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
4860@cindex Root extraction functions
4861Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
4862@end deftypefun
4863
4864@deftypefun void mpf_pow_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
4865@cindex Exponentiation functions
4866@cindex Powering functions
4867Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
4868@end deftypefun
4869
4870@deftypefun void mpf_neg (mpf_t @var{rop}, mpf_t @var{op})
4871Set @var{rop} to @minus{}@var{op}.
4872@end deftypefun
4873
4874@deftypefun void mpf_abs (mpf_t @var{rop}, mpf_t @var{op})
4875Set @var{rop} to the absolute value of @var{op}.
4876@end deftypefun
4877
4878@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, mpf_t @var{op1}, mp_bitcnt_t @var{op2})
4879Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
4880@var{op2}}.
4881@end deftypefun
4882
4883@deftypefun void mpf_div_2exp (mpf_t @var{rop}, mpf_t @var{op1}, mp_bitcnt_t @var{op2})
4884Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
4885@var{op2}}.
4886@end deftypefun
4887
4888@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
4889@comment  node-name,  next,  previous,  up
4890@section Comparison Functions
4891@cindex Float comparison functions
4892@cindex Comparison functions
4893
4894@deftypefun int mpf_cmp (mpf_t @var{op1}, mpf_t @var{op2})
4895@deftypefunx int mpf_cmp_d (mpf_t @var{op1}, double @var{op2})
4896@deftypefunx int mpf_cmp_ui (mpf_t @var{op1}, unsigned long int @var{op2})
4897@deftypefunx int mpf_cmp_si (mpf_t @var{op1}, signed long int @var{op2})
4898Compare @var{op1} and @var{op2}.  Return a positive value if @math{@var{op1} >
4899@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
4900@math{@var{op1} < @var{op2}}.
4901
4902@code{mpf_cmp_d} can be called with an infinity, but results are undefined for
4903a NaN.
4904@end deftypefun
4905
4906@deftypefun int mpf_eq (mpf_t @var{op1}, mpf_t @var{op2}, mp_bitcnt_t op3)
4907Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
4908equal, zero otherwise.  I.e., test if @var{op1} and @var{op2} are approximately
4909equal.
4910
Caution 1: All versions of GMP up to version 4.2.4 compared just whole limbs,
4912meaning sometimes more than @var{op3} bits, sometimes fewer.
4913
4914Caution 2: This function will consider XXX11...111 and XX100...000 different,
4915even if ... is replaced by a semi-infinite number of bits.  Such numbers are
4916really just one ulp off, and should be considered equal.
4917@end deftypefun
4918
4919@deftypefun void mpf_reldiff (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
4920Compute the relative difference between @var{op1} and @var{op2} and store the
4921result in @var{rop}.  This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
4922@end deftypefun
4923
4924@deftypefn Macro int mpf_sgn (mpf_t @var{op})
4925@cindex Sign tests
4926@cindex Float sign tests
4927Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
4928@math{-1} if @math{@var{op} < 0}.
4929
This function is actually implemented as a macro.  It evaluates its argument
multiple times.
4932@end deftypefn
4933
4934@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
4935@comment  node-name,  next,  previous,  up
4936@section Input and Output Functions
4937@cindex Float input and output functions
4938@cindex Input functions
4939@cindex Output functions
4940@cindex I/O functions
4941
The functions in this section read @code{mpf} numbers from a stdio stream or
write them to one.  Passing a @code{NULL} pointer for a @var{stream} argument
makes the input functions read from @code{stdin} and the output functions
write to @code{stdout}.
4946
4947When using any of these functions, it is a good idea to include @file{stdio.h}
4948before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
4949for these functions.
4950
4951See also @ref{Formatted Output} and @ref{Formatted Input}.
4952
4953@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
4954Print @var{op} to @var{stream}, as a string of digits.  Return the number of
4955bytes written, or if an error occurred, return 0.
4956
4957The mantissa is prefixed with an @samp{0.} and is in the given @var{base},
4958which may vary from 2 to 62 or from @minus{}2 to @minus{}36.  An exponent is
4959then printed, separated by an @samp{e}, or if the base is greater than 10 then
4960by an @samp{@@}.  The exponent is always in decimal.  The decimal point follows
4961the current locale, on systems providing @code{localeconv}.
4962
4963For @var{base} in the range 2..36, digits and lower-case letters are used; for
4964@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
4965digits, upper-case letters, and lower-case letters (in that significance order)
4966are used.
4967
4968Up to @var{n_digits} will be printed from the mantissa, except that no more
4969digits than are accurately representable by @var{op} will be printed.
4970@var{n_digits} can be 0 to select that accurate maximum.
4971@end deftypefun
4972
4973@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
4974Read a string in base @var{base} from @var{stream}, and put the read float in
4975@var{rop}.  The string is of the form @samp{M@@N} or, if the base is 10 or
4976less, alternatively @samp{MeN}.  @samp{M} is the mantissa and @samp{N} is the
4977exponent.  The mantissa is always in the specified base.  The exponent is
4978either in the specified base or, if @var{base} is negative, in decimal.  The
4979decimal point expected is taken from the current locale, on systems providing
4980@code{localeconv}.
4981
4982The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
4983@minus{}2.  Negative values are used to specify that the exponent is in
4984decimal.
4985
4986Unlike the corresponding @code{mpz} function, the base will not be determined
4987from the leading characters of the string if @var{base} is 0.  This is so that
4988numbers like @samp{0.23} are not interpreted as octal.
4989
4990Return the number of bytes read, or if an error occurred, return 0.
4991@end deftypefun
4992
4993@c @deftypefun void mpf_out_raw (FILE *@var{stream}, mpf_t @var{float})
4994@c Output @var{float} on stdio stream @var{stream}, in raw binary
4995@c format.  The float is written in a portable format, with 4 bytes of
4996@c size information, and that many bytes of limbs.  Both the size and the
4997@c limbs are written in decreasing significance order.
4998@c @end deftypefun
4999
5000@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
5001@c Input from stdio stream @var{stream} in the format written by
5002@c @code{mpf_out_raw}, and put the result in @var{float}.
5003@c @end deftypefun
5004
5005
5006@node Miscellaneous Float Functions,  , I/O of Floats, Floating-point Functions
5007@comment  node-name,  next,  previous,  up
5008@section Miscellaneous Functions
5009@cindex Miscellaneous float functions
5010@cindex Float miscellaneous functions
5011
5012@deftypefun void mpf_ceil (mpf_t @var{rop}, mpf_t @var{op})
5013@deftypefunx void mpf_floor (mpf_t @var{rop}, mpf_t @var{op})
5014@deftypefunx void mpf_trunc (mpf_t @var{rop}, mpf_t @var{op})
5015@cindex Rounding functions
5016@cindex Float rounding functions
5017Set @var{rop} to @var{op} rounded to an integer.  @code{mpf_ceil} rounds to the
5018next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
5019to the integer towards zero.
5020@end deftypefun
5021
5022@deftypefun int mpf_integer_p (mpf_t @var{op})
5023Return non-zero if @var{op} is an integer.
5024@end deftypefun
5025
5026@deftypefun int mpf_fits_ulong_p (mpf_t @var{op})
5027@deftypefunx int mpf_fits_slong_p (mpf_t @var{op})
5028@deftypefunx int mpf_fits_uint_p (mpf_t @var{op})
5029@deftypefunx int mpf_fits_sint_p (mpf_t @var{op})
5030@deftypefunx int mpf_fits_ushort_p (mpf_t @var{op})
5031@deftypefunx int mpf_fits_sshort_p (mpf_t @var{op})
5032Return non-zero if @var{op} would fit in the respective C data type, when
5033truncated to an integer.
5034@end deftypefun
5035
5036@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{nbits})
5037@cindex Random number functions
5038@cindex Float random number functions
5039Generate a uniformly distributed random float in @var{rop}, such that @math{0
5040@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa or
5041less if the precision of @var{rop} is smaller.
5042
5043The variable @var{state} must be initialized by calling one of the
5044@code{gmp_randinit} functions (@ref{Random State Initialization}) before
5045invoking this function.
5046@end deftypefun
5047
5048@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
5049Generate a random float of at most @var{max_size} limbs, with long strings of
5050zeros and ones in the binary representation.  The exponent of the number is in
5051the interval @minus{}@var{exp} to @var{exp} (in limbs).  This function is
useful for testing functions and algorithms, since these kinds of random
5053numbers have proven to be more likely to trigger corner-case bugs.  Negative
5054random numbers are generated when @var{max_size} is negative.
5055@end deftypefun
5056
5057@c @deftypefun size_t mpf_size (mpf_t @var{op})
5058@c Return the size of @var{op} measured in number of limbs.  If @var{op} is
5059@c zero, the returned value will be zero.  (@xref{Nomenclature}, for an
5060@c explanation of the concept @dfn{limb}.)
5061@c
5062@c @strong{This function is obsolete.  It will disappear from future GMP
5063@c releases.}
5064@c @end deftypefun
5065
5066
5067@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
5068@comment  node-name,  next,  previous,  up
5069@chapter Low-level Functions
5070@cindex Low-level functions
5071
5072This chapter describes low-level GMP functions, used to implement the
5073high-level GMP functions, but also intended for time-critical user code.
5074
5075These functions start with the prefix @code{mpn_}.
5076
5077@c 1. Some of these function clobber input operands.
5078@c
5079
5080The @code{mpn} functions are designed to be as fast as possible, @strong{not}
5081to provide a coherent calling interface.  The different functions have somewhat
5082similar interfaces, but there are variations that make them hard to use.  These
5083functions do as little as possible apart from the real multiple precision
5084computation, so that no time is spent on things that not all callers need.
5085
5086A source operand is specified by a pointer to the least significant limb and a
5087limb count.  A destination operand is specified by just a pointer.  It is the
5088responsibility of the caller to ensure that the destination has enough space
5089for storing the result.
5090
5091With this way of specifying operands, it is possible to perform computations on
5092subranges of an argument, and store the result into a subrange of a
5093destination.
5094
5095A common requirement for all functions is that each source area needs at least
5096one limb.  No size argument may be zero.  Unless otherwise stated, in-place
5097operations are allowed where source and destination are the same, but not where
5098they only partly overlap.
5099
5100The @code{mpn} functions are the base for the implementation of the
5101@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.
5102
5103This example adds the number beginning at @var{s1p} and the number beginning at
5104@var{s2p} and writes the sum at @var{destp}.  All areas have @var{n} limbs.
5105
5106@example
cy = mpn_add_n (destp, s1p, s2p, n);
5108@end example
5109
5110It should be noted that the @code{mpn} functions make no attempt to identify
5111high or low zero limbs on their operands, or other special forms.  On random
5112data such cases will be unlikely and it'd be wasteful for every function to
5113check every time.  An application knowing something about its data can take
5114steps to trim or perhaps split its calculations.
5115@c
5116@c  For reference, within gmp mpz_t operands never have high zero limbs, and
5117@c  we rate low zero limbs as unlikely too (or something an application should
5118@c  handle).  This is a prime motivation for not stripping zero limbs in say
5119@c  mpn_mul_n etc.
5120@c
5121@c  Other applications doing variable-length calculations will quite likely do
5122@c  something similar to mpz.  And even if not then it's highly likely zero
5123@c  limb stripping can be done at just a few judicious points, which will be
5124@c  more efficient than having lots of mpn functions checking every time.
5125
5126@sp 1
5127@noindent
5128In the notation used below, a source operand is identified by the pointer to
5129the least significant limb, and the limb count in braces.  For example,
5130@{@var{s1p}, @var{s1n}@}.
5131
5132@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5133Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
5134least significant limbs of the result to @var{rp}.  Return carry, either 0 or
51351.
5136
5137This is the lowest-level function for addition.  It is the preferred function
5138for addition, since it is written in assembly for most CPUs.  For addition of
5139a variable to itself (i.e., @var{s1p} equals @var{s2p}) use @code{mpn_lshift}
5140with a count of 1 for optimal speed.
5141@end deftypefun
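
To make the operand convention and the carry return concrete, here is a
portable reference model of the same operation.  It is a sketch of the
semantics only, not GMP's implementation: @code{limb_t} is a hypothetical
32-bit stand-in for @code{mp_limb_t}, and @code{ref_add_n} is an invented
name.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Reference model: add the n-limb numbers at s1p and s2p, least
   significant limb first, write the n low limbs of the sum to rp and
   return the carry out of the top limb (0 or 1).  Each limb is read
   before it is written, so rp may be the same as s1p or s2p.  */
limb_t
ref_add_n (limb_t *rp, const limb_t *s1p, const limb_t *s2p, size_t n)
{
  limb_t cy = 0;
  for (size_t i = 0; i < n; i++)
    {
      uint64_t sum = (uint64_t) s1p[i] + s2p[i] + cy;
      rp[i] = (limb_t) sum;          /* low 32 bits of the limb sum */
      cy = (limb_t) (sum >> 32);     /* carry into the next limb */
    }
  return cy;
}
```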
5142
5143@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5144Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
5145significant limbs of the result to @var{rp}.  Return carry, either 0 or 1.
5146@end deftypefun
5147
5148@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
5149Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
5150@var{s1n} least significant limbs of the result to @var{rp}.  Return carry,
5151either 0 or 1.
5152
5153This function requires that @var{s1n} is greater than or equal to @var{s2n}.
5154@end deftypefun
5155
5156@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5157Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
5158@var{n} least significant limbs of the result to @var{rp}.  Return borrow,
5159either 0 or 1.
5160
5161This is the lowest-level function for subtraction.  It is the preferred
5162function for subtraction, since it is written in assembly for most CPUs.
5163@end deftypefun
5164
5165@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5166Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
5167significant limbs of the result to @var{rp}.  Return borrow, either 0 or 1.
5168@end deftypefun
5169
5170@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
5171Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
5172@var{s1n} least significant limbs of the result to @var{rp}.  Return borrow,
5173either 0 or 1.
5174
5175This function requires that @var{s1n} is greater than or equal to
5176@var{s2n}.
5177@end deftypefun
5178
@deftypefun mp_limb_t mpn_neg (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
5180Perform the negation of @{@var{sp}, @var{n}@}, and write the result to
5181@{@var{rp}, @var{n}@}.  Return carry-out.
5182@end deftypefun
5183
5184@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5185Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
51862*@var{n}-limb result to @var{rp}.
5187
5188The destination has to have space for 2*@var{n} limbs, even if the product's
5189most significant limb is zero.  No overlap is permitted between the
5190destination and either source.
5191
5192If the two input operands are the same, use @code{mpn_sqr}.
5193@end deftypefun
5194
5195@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
5196Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
5197(@var{s1n}+@var{s2n})-limb result to @var{rp}.  Return the most significant
5198limb of the result.
5199
5200The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
5201product's most significant limb is zero.  No overlap is permitted between the
5202destination and either source.
5203
5204This function requires that @var{s1n} is greater than or equal to @var{s2n}.
5205@end deftypefun
5206
5207@deftypefun void mpn_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5208Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
5209result to @var{rp}.
5210
5211The destination has to have space for 2*@var{n} limbs, even if the result's
5212most significant limb is zero.  No overlap is permitted between the
5213destination and the source.
5214@end deftypefun
5215
5216@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5217Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
5218significant limbs of the product to @var{rp}.  Return the most significant
5219limb of the product.  @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
5220allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
5221
5222This is a low-level function that is a building block for general
5223multiplication as well as other operations in GMP@.  It is written in assembly
5224for most CPUs.
5225
5226Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
with a count equal to the base-2 logarithm of @var{s2limb} instead, for optimal speed.
5228@end deftypefun
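
Deciding between @code{mpn_mul_1} and @code{mpn_lshift} needs a power-of-2 test
and a shift count.  The helper below is a sketch of one way to get both;
@code{model_pow2_shift_count} is a hypothetical name and the 64-bit limb width
is an assumption.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Return k such that limb == 2^k, or -1 if limb is not a power of 2.
   Hypothetical helper, assuming a limb fits in 64 bits. */
int model_pow2_shift_count(uint64_t limb)
{
    if (limb == 0 || (limb & (limb - 1)) != 0)
        return -1;                  /* zero, or more than one bit set */
    int k = 0;
    while ((limb >> k) != 1)
        k++;
    assert(limb == (uint64_t)1 << k);
    return k;
}
```
@end verbatim

A result of 0 means the multiplier is 1, which is outside the count range
accepted by @code{mpn_lshift}, so only results of 1 or more select the shift.
Note also that the two functions document different overlap rules, so an
overlapping call cannot always be switched from one to the other.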
5229
5230@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5231Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
5232significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
5233to @var{rp}.  Return the most significant limb of the product, plus carry-out
5234from the addition.
5235
5236This is a low-level function that is a building block for general
5237multiplication as well as other operations in GMP@.  It is written in assembly
5238for most CPUs.
5239@end deftypefun
5240
5241@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5242Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
5243least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
5244result to @var{rp}.  Return the most significant limb of the product, plus
5245borrow-out from the subtraction.
5246
5247This is a low-level function that is a building block for general
5248multiplication and division as well as other operations in GMP@.  It is written
5249in assembly for most CPUs.
5250@end deftypefun
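
To illustrate how @code{mpn_addmul_1} serves as a building block for general
multiplication, here is a portable model of a schoolbook product.  It is a
sketch only: the @code{model_} names are hypothetical, limbs are assumed to be
32 bits wide, and GMP's real code uses faster algorithms for large operands.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the mpn_addmul_1 contract: rp[] += s1p[] * s2limb.  Returns the
   high limb of the product plus the carry-out from the addition. */
uint32_t model_addmul_1(uint32_t *rp, const uint32_t *s1p, size_t n,
                        uint32_t s2limb)
{
    assert(n >= 1);
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* (2^32-1)^2 + 2*(2^32-1) == 2^64-1, so t cannot overflow. */
        uint64_t t = (uint64_t)s1p[i] * s2limb + rp[i] + carry;
        rp[i] = (uint32_t)t;
        carry = (uint32_t)(t >> 32);
    }
    return carry;
}

/* Schoolbook product: one addmul_1 pass per limb of the second operand.
   rp needs s1n + s2n limbs and must not overlap either input. */
void model_mul_basecase(uint32_t *rp, const uint32_t *s1p, size_t s1n,
                        const uint32_t *s2p, size_t s2n)
{
    assert(s1n >= s2n && s2n >= 1);
    for (size_t i = 0; i < s1n + s2n; i++)
        rp[i] = 0;
    for (size_t j = 0; j < s2n; j++)
        rp[j + s1n] = model_addmul_1(rp + j, s1p, s1n, s2p[j]);
}
```
@end verbatim

Each pass adds one shifted partial product and stores the returned limb just
above the region it touched, which is why the destination always needs the
full @var{s1n} + @var{s2n} limbs.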
5251
5252@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
5253Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
5254at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
5255@var{dn}@}.  The quotient is rounded towards 0.
5256
No overlap is permitted between arguments, except that @var{np} may equal
5258@var{rp}.  The dividend size @var{nn} must be greater than or equal to divisor
5259size @var{dn}.  The most significant limb of the divisor must be non-zero.  The
5260@var{qxn} operand must be zero.
5261@end deftypefun
5262
5263@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
5264[This function is obsolete.  Please call @code{mpn_tdiv_qr} instead for best
5265performance.]
5266
5267Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
5268quotient at @var{r1p}, with the exception of the most significant limb, which
5269is returned.  The remainder replaces the dividend at @var{rs2p}; it will be
5270@var{s3n} limbs long (i.e., as many limbs as the divisor).
5271
5272In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
5273stored after the integral limbs.  For most usages, @var{qxn} will be zero.
5274
5275It is required that @var{rs2n} is greater than or equal to @var{s3n}.  It is
5276required that the most significant bit of the divisor is set.
5277
5278If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}.  Aside
5279from that special case, no overlap between arguments is permitted.
5280
5281Return the most significant limb of the quotient, either 0 or 1.
5282
5283The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
5284limbs large.
5285@end deftypefun
5286
5287@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
5288@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
5289Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
5290@var{r1p}.  Return the remainder.
5291
5292The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
5293addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
5294@var{qxn}@}.  Either or both @var{s2n} and @var{qxn} can be zero.  For most
5295usages, @var{qxn} will be zero.
5296
5297@code{mpn_divmod_1} exists for upward source compatibility and is simply a
5298macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.
5299
5300The areas at @var{r1p} and @var{s2p} have to be identical or completely
5301separate, not partially overlapping.
5302@end deftypefn
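
The integer part of this operation (a @var{qxn} of 0) can be modelled in
portable C as ordinary long division from the high limb down.  This is a
sketch under stated assumptions, not GMP's implementation:
@code{model_divrem_1} is a hypothetical name and limbs are assumed to be 32
bits wide.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of mpn_divrem_1 with qxn == 0: divide {sp, n} by d, write the
   n-limb quotient to qp and return the remainder.  qp may equal sp. */
uint32_t model_divrem_1(uint32_t *qp, const uint32_t *sp, size_t n,
                        uint32_t d)
{
    assert(d != 0);
    uint32_t rem = 0;
    for (size_t i = n; i-- > 0; ) {
        /* rem < d, so cur / d always fits in one limb. */
        uint64_t cur = ((uint64_t)rem << 32) | sp[i];
        qp[i] = (uint32_t)(cur / d);
        rem = (uint32_t)(cur % d);
    }
    return rem;
}
```
@end verbatim

Each source limb is read before the quotient limb at the same index is
written, which is why identical source and destination areas are safe in this
model.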
5303
5304@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
5305[This function is obsolete.  Please call @code{mpn_tdiv_qr} instead for best
5306performance.]
5307@end deftypefun
5308
5309@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
5310@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
5311Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
5312the result to @{@var{rp}, @var{n}@}.  If 3 divides exactly, the return value is
5313zero and the result is the quotient.  If not, the return value is non-zero and
5314the result won't be anything useful.
5315
5316@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
5317return value from a previous call, so a large calculation can be done piece by
5318piece from low to high.  @code{mpn_divexact_by3} is simply a macro calling
5319@code{mpn_divexact_by3c} with a 0 carry parameter.
5320
5321These routines use a multiply-by-inverse and will be faster than
5322@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.
5323
5324The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
5325and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
5326@m{b=2\GMPraise{@code{GMP\_NUMB\_BITS}}, b=2^GMP_NUMB_BITS}.  The
5327return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
5328be 0, 1 or 2 (these are both borrows really).  When @math{c=0} clearly
5329@math{q=(a-i)/3}.  When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
53303} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
5331@code{mp_bits_per_limb} is even, which is always so currently).
5332@end deftypefn
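
The relation between source, carry and result can be checked with a small
model of the multiply-by-inverse technique.  The following is a sketch, not
GMP's code: the @code{model_} and @code{MODEL_} names are hypothetical, and
limbs are assumed to be 32 bits wide, so that @math{b} is 2^32.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INVERSE_3  0xAAAAAAABu  /* 3 * 0xAAAAAAAB == 1 mod 2^32 */
#define MODEL_ONE_THIRD  0x55555556u  /* ceil(2^32 / 3) */
#define MODEL_TWO_THIRDS 0xAAAAAAABu  /* ceil(2 * 2^32 / 3) */

/* Model of the mpn_divexact_by3c contract with 32-bit limbs.  On return,
   c*b^n + a - i == 3*q, where c is the return value and i the carry in. */
uint32_t model_divexact_by3c(uint32_t *rp, const uint32_t *sp, size_t n,
                             uint32_t carry)
{
    assert(n >= 1 && carry <= 2);
    for (size_t i = 0; i < n; i++) {
        uint32_t s = sp[i];
        uint32_t l = s - carry;            /* wraps if s < carry */
        uint32_t c = (s < carry);          /* borrow taken from above */
        uint32_t q = l * MODEL_INVERSE_3;  /* 3*q == l modulo 2^32 */
        rp[i] = q;
        c += (q >= MODEL_ONE_THIRD);       /* 3*q >= 2^32 */
        c += (q >= MODEL_TWO_THIRDS);      /* 3*q >= 2 * 2^32 */
        carry = c;
    }
    return carry;                          /* 0, 1 or 2 */
}
```
@end verbatim

For a single limb holding 10 the model returns 2, and @math{3-2 = 1} is indeed
10 modulo 3.  For 11 it returns 1, giving a remainder of 2.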
5333
5334@deftypefun mp_limb_t mpn_mod_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
5335Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
5336@var{s1n} can be zero.
5337@end deftypefun
5338
5339@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
5340Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
5341@{@var{rp}, @var{n}@}.  The bits shifted out at the left are returned in the
5342least significant @var{count} bits of the return value (the rest of the return
5343value is zero).
5344
5345@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1.  The
5346regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
5347@math{@var{rp} @ge{} @var{sp}}.
5348
5349This function is written in assembly for most CPUs.
5350@end deftypefun
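
The return value and the overlap rule can both be seen in a portable model of
the left shift.  This is a sketch only: @code{model_lshift} is a hypothetical
name, and limbs are assumed to be 32 bits wide.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the mpn_lshift contract with 32-bit limbs.  Limbs are processed
   from the most significant end, so rp may overlap sp when rp >= sp. */
uint32_t model_lshift(uint32_t *rp, const uint32_t *sp, size_t n,
                      unsigned count)
{
    assert(n >= 1 && count >= 1 && count <= 31);
    unsigned tnc = 32 - count;
    uint32_t high = sp[n - 1];
    uint32_t out = high >> tnc;          /* bits shifted out at the left */
    for (size_t i = n - 1; i > 0; i--) {
        uint32_t low = sp[i - 1];        /* read before rp[i] is written */
        rp[i] = (high << count) | (low >> tnc);
        high = low;
    }
    rp[0] = high << count;
    return out;
}
```
@end verbatim

Because each source limb is read before any destination limb at or above it
is written, the destination may sit at the same address as the source or
higher, matching the overlap condition stated above.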
5351
5352@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
5353Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
5354@{@var{rp}, @var{n}@}.  The bits shifted out at the right are returned in the
5355most significant @var{count} bits of the return value (the rest of the return
5356value is zero).
5357
5358@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1.  The
5359regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
5360@math{@var{rp} @le{} @var{sp}}.
5361
5362This function is written in assembly for most CPUs.
5363@end deftypefun
5364
5365@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5366Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
5367positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a
5368negative value if @math{@var{s1} < @var{s2}}.
5369@end deftypefun
5370
5371@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn})
5372Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{xp},
@var{xn}@} and @{@var{yp}, @var{yn}@}.  The result can be up to @var{yn} limbs;
5374the return value is the actual number produced.  Both source operands are
5375destroyed.
5376
5377@{@var{xp}, @var{xn}@} must have at least as many bits as @{@var{yp},
5378@var{yn}@}.  @{@var{yp}, @var{yn}@} must be odd.  Both operands must have
5379non-zero most significant limbs.  No overlap is permitted between @{@var{xp},
5380@var{xn}@} and @{@var{yp}, @var{yn}@}.
5381@end deftypefun
5382
5383@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t @var{ylimb})
5384Return the greatest common divisor of @{@var{xp}, @var{xn}@} and @var{ylimb}.
5385Both operands must be non-zero.
5386@end deftypefun
5387
5388@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{gp}, mp_limb_t *@var{sp}, mp_size_t *@var{sn}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn})
5389Let @m{U,@var{U}} be defined by @{@var{xp}, @var{xn}@} and let @m{V,@var{V}} be
5390defined by @{@var{yp}, @var{yn}@}.
5391
5392Compute the greatest common divisor @math{G} of @math{U} and @math{V}.  Compute
5393a cofactor @math{S} such that @math{G = US + VT}.  The second cofactor @var{T}
5394is not computed but can easily be obtained from @m{(G - US) / V, (@var{G} -
5395@var{U}*@var{S}) / @var{V}} (the division will be exact).  It is required that
@math{U @ge{} V > 0}.
5397
@math{S} satisfies @math{S = 1} or @math{@GMPabs{S} < V / (2 G)}.  @math{S =
0} if and only if @math{V} divides @math{U} (i.e., @math{G = V}).
5400
5401Store @math{G} at @var{gp} and let the return value define its limb count.
5402Store @math{S} at @var{sp} and let |*@var{sn}| define its limb count.  @math{S}
5403can be negative; when this happens *@var{sn} will be negative.  The areas at
5404@var{gp} and @var{sp} should each have room for @math{@var{xn}+1} limbs.
5405
5406The areas @{@var{xp}, @math{@var{xn}+1}@} and @{@var{yp}, @math{@var{yn}+1}@}
5407are destroyed (i.e.@: the input operands plus an extra limb past the end of
5408each).
5409
5410Compatibility note: GMP 4.3.0 and 4.3.1 defined @math{S} less strictly.
5411Earlier as well as later GMP releases define @math{S} as described here.
5412@end deftypefun
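
Recovering the second cofactor is plain arithmetic once @math{G} and @math{S}
are known.  The sketch below works on small values held in 64-bit integers
rather than limb vectors, and @code{model_second_cofactor} is a hypothetical
name, so it illustrates the formula only.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Compute T = (G - U*S) / V for small values.  The division is exact
   whenever G = U*S + V*T holds for some integer T. */
int64_t model_second_cofactor(int64_t g, int64_t u, int64_t s, int64_t v)
{
    assert(v > 0);
    int64_t num = g - u * s;
    assert(num % v == 0);
    return num / v;
}
```
@end verbatim

For example, with @math{U = 240} and @math{V = 46} the greatest common divisor
is 2, and @math{S = -9} satisfies the bound given above since 9 is less than
46/4.  The formula then gives @math{T = (2 + 2160) / 46 = 47}.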
5413
5414@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
5415Compute the square root of @{@var{sp}, @var{n}@} and put the result at
5416@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
5417@var{retval}@}.  @var{r2p} needs space for @var{n} limbs, but the return value
5418indicates how many are produced.
5419
5420The most significant limb of @{@var{sp}, @var{n}@} must be non-zero.  The
5421areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
5422be completely separate.  The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
5423@var{n}@} must be either identical or completely separate.
5424
5425If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
5426case the return value is zero or non-zero according to whether the remainder
5427would have been zero or non-zero.
5428
5429A return value of zero indicates a perfect square.  See also
5430@code{mpz_perfect_square_p}.
5431@end deftypefun
5432
@deftypefun size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
5434Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
5435base @var{base}, and return the number of characters produced.  There may be
5436leading zeros in the string.  The string is not in ASCII; to convert it to
5437printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
5438the base and range.  @var{base} can vary from 2 to 256.
5439
5440The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
5441non-zero.  The input @{@var{s1p}, @var{s1n}@} is clobbered, except when
5442@var{base} is a power of 2, in which case it's unchanged.
5443
5444The area at @var{str} has to have space for the largest possible number
represented by an @var{s1n}-limb array, plus one extra character.
5446@end deftypefun
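
Turning the raw digit values into printable text is a simple per-byte mapping.
The helper below is a sketch: @code{model_digits_to_ascii} is a hypothetical
name, it assumes an ASCII character set, and it covers only bases up to 36,
where digits and upper case letters suffice.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Convert len raw digit values (each in 0..base-1) to ASCII text.
   out needs room for len characters plus a terminating NUL. */
void model_digits_to_ascii(char *out, const unsigned char *raw, size_t len,
                           int base)
{
    assert(base >= 2 && base <= 36);
    for (size_t i = 0; i < len; i++) {
        assert(raw[i] < base);
        out[i] = (char)(raw[i] < 10 ? '0' + raw[i] : 'A' + (raw[i] - 10));
    }
    out[len] = '\0';
}
```
@end verbatim

Any leading zero digits produced by the conversion are kept by this helper;
a caller wanting a canonical string would skip them first.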
5447
5448@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base})
5449Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at
5450@var{rp}.
5451
5452@math{@var{str}[0]} is the most significant byte and
5453@math{@var{str}[@var{strsize}-1]} is the least significant.  Each byte should
5454be a value in the range 0 to @math{@var{base}-1}, not an ASCII character.
5455@var{base} can vary from 2 to 256.
5456
5457The return value is the number of limbs written to @var{rp}.  If the most
5458significant input byte is non-zero then the high limb at @var{rp} will be
5459non-zero, and only that exact number of limbs will be required there.
5460
5461If the most significant input byte is zero then there may be high zero limbs
5462written to @var{rp} and included in the return value.
5463
5464@var{strsize} must be at least 1, and no overlap is permitted between
5465@{@var{str},@var{strsize}@} and the result at @var{rp}.
5466@end deftypefun
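
Since the input bytes are digit values rather than characters, printable text
has to be mapped first.  The following sketch shows one way to do that for
bases up to 36; @code{model_ascii_to_digits} is a hypothetical name and an
ASCII character set is assumed.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Map len ASCII digit characters to raw values in 0..base-1.
   Returns 1 on success, 0 if a character is not a digit in this base. */
int model_ascii_to_digits(unsigned char *raw, const char *text, size_t len,
                          int base)
{
    assert(base >= 2 && base <= 36);
    for (size_t i = 0; i < len; i++) {
        char ch = text[i];
        int d;
        if (ch >= '0' && ch <= '9')
            d = ch - '0';
        else if (ch >= 'A' && ch <= 'Z')
            d = ch - 'A' + 10;
        else if (ch >= 'a' && ch <= 'z')
            d = ch - 'a' + 10;
        else
            return 0;
        if (d >= base)
            return 0;
        raw[i] = (unsigned char)d;
    }
    return 1;
}
```
@end verbatim

The resulting byte array, most significant digit first, is in the form
described above for @var{str}.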
5467
5468@deftypefun {mp_bitcnt_t} mpn_scan0 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
5469Scan @var{s1p} from bit position @var{bit} for the next clear bit.
5470
5471It is required that there be a clear bit within the area at @var{s1p} at or
5472beyond bit position @var{bit}, so that the function has something to return.
5473@end deftypefun
5474
5475@deftypefun {mp_bitcnt_t} mpn_scan1 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
5476Scan @var{s1p} from bit position @var{bit} for the next set bit.
5477
5478It is required that there be a set bit within the area at @var{s1p} at or
5479beyond bit position @var{bit}, so that the function has something to return.
5480@end deftypefun
5481
5482@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
5483@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
5484Generate a random number of length @var{r1n} and store it at @var{r1p}.  The
5485most significant limb is always non-zero.  @code{mpn_random} generates
uniformly distributed limb data; @code{mpn_random2} generates long strings of
5487zeros and ones in the binary representation.
5488
5489@code{mpn_random2} is intended for testing the correctness of the @code{mpn}
5490routines.
5491@end deftypefun
5492
5493@deftypefun {mp_bitcnt_t} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5494Count the number of set bits in @{@var{s1p}, @var{n}@}.
5495@end deftypefun
5496
5497@deftypefun {mp_bitcnt_t} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5498Compute the hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p},
5499@var{n}@}, which is the number of bit positions where the two operands have
5500different bit values.
5501@end deftypefun
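
The two counting functions are closely related: the hamming distance is the
population count of the exclusive or of the operands.  The portable model
below illustrates this with hypothetical @code{model_} names and assumed
32-bit limbs; it makes no claim about how GMP computes either value.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one limb by clearing the lowest set bit
   until none remain. */
static unsigned model_popcount_limb(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Number of set bits in {s1p, n}. */
unsigned long model_popcount(const uint32_t *s1p, size_t n)
{
    assert(n >= 1);
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += model_popcount_limb(s1p[i]);
    return total;
}

/* Number of bit positions where {s1p, n} and {s2p, n} differ. */
unsigned long model_hamdist(const uint32_t *s1p, const uint32_t *s2p,
                            size_t n)
{
    assert(n >= 1);
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += model_popcount_limb(s1p[i] ^ s2p[i]);
    return total;
}
```
@end verbatim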
5502
5503@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5504Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
5505The most significant limb of the input @{@var{s1p}, @var{n}@} must be
5506non-zero.
5507@end deftypefun
5508
5509@deftypefun void mpn_and_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5510Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
5511@var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5512@end deftypefun
5513
5514@deftypefun void mpn_ior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5515Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
5516@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5517@end deftypefun
5518
5519@deftypefun void mpn_xor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5520Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
5521@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5522@end deftypefun
5523
5524@deftypefun void mpn_andn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5525Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and the bitwise
5526complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5527@end deftypefun
5528
5529@deftypefun void mpn_iorn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5530Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and the bitwise
5531complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5532@end deftypefun
5533
5534@deftypefun void mpn_nand_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5535Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
5536@var{n}@}, and write the bitwise complement of the result to @{@var{rp}, @var{n}@}.
5537@end deftypefun
5538
5539@deftypefun void mpn_nior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5540Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
5541@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
5542@{@var{rp}, @var{n}@}.
5543@end deftypefun
5544
5545@deftypefun void mpn_xnor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5546Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
5547@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
5548@{@var{rp}, @var{n}@}.
5549@end deftypefun
5550
5551@deftypefun void mpn_com (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
5552Perform the bitwise complement of @{@var{sp}, @var{n}@}, and write the result
5553to @{@var{rp}, @var{n}@}.
5554@end deftypefun
5555
5556@deftypefun void mpn_copyi (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5557Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, increasingly.
5558@end deftypefun
5559
5560@deftypefun void mpn_copyd (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5561Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, decreasingly.
5562@end deftypefun
5563
5564@deftypefun void mpn_zero (mp_limb_t *@var{rp}, mp_size_t @var{n})
5565Zero @{@var{rp}, @var{n}@}.
5566@end deftypefun
5567
5568@sp 1
5569@section Nails
5570@cindex Nails
5571
5572@strong{Everything in this section is highly experimental and may disappear or
5573be subject to incompatible changes in a future version of GMP.}
5574
5575Nails are an experimental feature whereby a few bits are left unused at the
5576top of each @code{mp_limb_t}.  This can significantly improve carry handling
5577on some processors.
5578
5579All the @code{mpn} functions accepting limb data will expect the nail bits to
5580be zero on entry, and will return data with the nails similarly all zero.
5581This applies both to limb vectors and to single limb arguments.
5582
5583Nails can be enabled by configuring with @samp{--enable-nails}.  By default
5584the number of bits will be chosen according to what suits the host processor,
5585but a particular number can be selected with @samp{--enable-nails=N}.
5586
5587At the mpn level, a nail build is neither source nor binary compatible with a
5588non-nail build, strictly speaking.  But programs acting on limbs only through
5589the mpn functions are likely to work equally well with either build, and
5590judicious use of the definitions below should make any program compatible with
5591either build, at the source level.
5592
5593For the higher level routines, meaning @code{mpz} etc, a nail build should be
5594fully source and binary compatible with a non-nail build.
5595
5596@defmac GMP_NAIL_BITS
5597@defmacx GMP_NUMB_BITS
5598@defmacx GMP_LIMB_BITS
5599@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in
5600use.  @code{GMP_NUMB_BITS} is the number of data bits in a limb.
5601@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}.  In
5602all cases
5603
5604@example
5605GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS
5606@end example
5607@end defmac
5608
5609@defmac GMP_NAIL_MASK
5610@defmacx GMP_NUMB_MASK
5611Bit masks for the nail and number parts of a limb.  @code{GMP_NAIL_MASK} is 0
5612when nails are not in use.
5613
5614@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained
5615with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which
5616can help various RISC chips.
5617@end defmac
5618
5619@defmac GMP_NUMB_MAX
5620The maximum value that can be stored in the number part of a limb.  This is
5621the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing
5622comparisons rather than bit-wise operations.
5623@end defmac
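
The way these definitions fit together can be shown with a model of a nail
build.  Everything here is an assumption made for illustration: a 32-bit limb
with 2 nail bits, and @code{MODEL_} macros that merely mirror the
@code{GMP_} ones.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical nail build: 32-bit limbs, top 2 bits unused. */
#define MODEL_LIMB_BITS 32
#define MODEL_NAIL_BITS 2
#define MODEL_NUMB_BITS (MODEL_LIMB_BITS - MODEL_NAIL_BITS)
#define MODEL_NUMB_MASK (UINT32_MAX >> MODEL_NAIL_BITS)
#define MODEL_NUMB_MAX  MODEL_NUMB_MASK

/* Add two limbs whose nails are zero.  The sum cannot wrap, and the carry
   lands in the nail bits, where a shift extracts it. */
uint32_t model_nail_add(uint32_t *sum, uint32_t a, uint32_t b)
{
    assert((a >> MODEL_NUMB_BITS) == 0 && (b >> MODEL_NUMB_BITS) == 0);
    uint32_t t = a + b;              /* at most 2^31 - 2 */
    *sum = t & MODEL_NUMB_MASK;      /* number part, nail cleared again */
    return t >> MODEL_NUMB_BITS;     /* nail part: the carry, 0 or 1 */
}
```
@end verbatim

The carry is read with a shift by the number of data bits rather than with a
nail mask, as suggested above.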
5624
5625The term ``nails'' comes from finger or toe nails, which are at the ends of a
5626limb (arm or leg).  ``numb'' is short for number, but is also how the
5627developers felt after trying for a long time to come up with sensible names
5628for these things.
5629
5630In the future (the distant future most likely) a non-zero nail might be
5631permitted, giving non-unique representations for numbers in a limb vector.
5632This would help vector processors since carries would only ever need to
5633propagate one or two limbs.
5634
5635
5636@node Random Number Functions, Formatted Output, Low-level Functions, Top
5637@chapter Random Number Functions
5638@cindex Random number functions
5639
5640Sequences of pseudo-random numbers in GMP are generated using a variable of
5641type @code{gmp_randstate_t}, which holds an algorithm selection and a current
5642state.  Such a variable must be initialized by a call to one of the
5643@code{gmp_randinit} functions, and can be seeded with one of the
5644@code{gmp_randseed} functions.
5645
5646The functions actually generating random numbers are described in @ref{Integer
5647Random Numbers}, and @ref{Miscellaneous Float Functions}.
5648
5649The older style random number functions don't accept a @code{gmp_randstate_t}
5650parameter but instead share a global variable of that type.  They use a
5651default algorithm and are currently not seeded (though perhaps that will
5652change in the future).  The new functions accepting a @code{gmp_randstate_t}
5653are recommended for applications that care about randomness.
5654
5655@menu
5656* Random State Initialization::
5657* Random State Seeding::
5658* Random State Miscellaneous::
5659@end menu
5660
5661@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
5662@section Random State Initialization
5663@cindex Random number state
5664@cindex Initialization functions
5665
5666@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
5667Initialize @var{state} with a default algorithm.  This will be a compromise
5668between speed and randomness, and is recommended for applications with no
5669special requirements.  Currently this is @code{gmp_randinit_mt}.
5670@end deftypefun
5671
5672@deftypefun void gmp_randinit_mt (gmp_randstate_t @var{state})
5673@cindex Mersenne twister random numbers
5674Initialize @var{state} for a Mersenne Twister algorithm.  This algorithm is
5675fast and has good randomness properties.
5676@end deftypefun
5677
5678@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, mpz_t @var{a}, @w{unsigned long @var{c}}, @w{mp_bitcnt_t @var{m2exp}})
5679@cindex Linear congruential random numbers
5680Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
5681@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.
5682
5683The low bits of @math{X} in this algorithm are not very random.  The least
5684significant bit will have a period no more than 2, and the second bit no more
5685than 4, etc.  For this reason only the high half of each @math{X} is actually
5686used.
5687
5688When a random number of more than @math{@var{m2exp}/2} bits is to be
5689generated, multiple iterations of the recurrence are used and the results
5690concatenated.
5691@end deftypefun
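
The weakness of the low bits is easy to observe in a small model of the
recurrence.  The sketch below is not GMP's generator: the @code{model_} names
are hypothetical, the state is limited to 32 bits, and the multiplier and
increment used in the note after the code are arbitrary illustrative values,
not entries from GMP's table.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* One step of X = (a*X + c) mod 2^m2exp, for m2exp up to 32.  Unsigned
   wraparound modulo 2^64 is harmless since 2^m2exp divides 2^64. */
uint64_t model_lc_next(uint64_t x, uint64_t a, uint64_t c, unsigned m2exp)
{
    assert(m2exp >= 2 && m2exp <= 32);
    uint64_t mask = ((uint64_t)1 << m2exp) - 1;
    return (a * x + c) & mask;
}

/* Keep only the high m2exp/2 bits of a state value. */
uint64_t model_lc_high_half(uint64_t x, unsigned m2exp)
{
    assert(m2exp >= 2 && m2exp <= 32);
    return x >> (m2exp - m2exp / 2);
}
```
@end verbatim

With an odd multiplier and an odd increment, such as 1664525 and 1013904223,
the lowest bit of successive states simply alternates, which is the period of
2 mentioned above.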
5692
5693@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, mp_bitcnt_t @var{size})
5694@cindex Linear congruential random numbers
5695Initialize @var{state} for a linear congruential algorithm as per
5696@code{gmp_randinit_lc_2exp}.  @var{a}, @var{c} and @var{m2exp} are selected
5697from a table, chosen so that @var{size} bits (or more) of each @math{X} will
be used, i.e.@: @math{@var{m2exp}/2 @ge{} @var{size}}.
5699
5700If successful the return value is non-zero.  If @var{size} is bigger than the
5701table data provides then the return value is zero.  The maximum @var{size}
5702currently supported is 128.
5703@end deftypefun
5704
5705@deftypefun void gmp_randinit_set (gmp_randstate_t @var{rop}, gmp_randstate_t @var{op})
5706Initialize @var{rop} with a copy of the algorithm and state from @var{op}.
5707@end deftypefun
5708
5709@c  Although gmp_randinit, gmp_errno and related constants are obsolete, we
5710@c  still put @findex entries for them, since they're still documented and
5711@c  someone might be looking them up when perusing old application code.
5712
5713@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, @dots{})
5714@strong{This function is obsolete.}
5715
5716@findex GMP_RAND_ALG_LC
5717@findex GMP_RAND_ALG_DEFAULT
5718Initialize @var{state} with an algorithm selected by @var{alg}.  The only
5719choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}
described above.  A third parameter of type @code{unsigned long} is required;
5721this is the @var{size} for that function.  @code{GMP_RAND_ALG_DEFAULT} or 0
5722are the same as @code{GMP_RAND_ALG_LC}.
5723
@c  For reference, this is the only place gmp_errno has been documented, and
@c  because it is not thread safe we won't be adding to its uses.
5726@findex gmp_errno
5727@findex GMP_ERROR_UNSUPPORTED_ARGUMENT
5728@findex GMP_ERROR_INVALID_ARGUMENT
5729@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to
5730indicate an error.  @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is
5731unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter
5732is too big.  It may be noted this error reporting is not thread safe (a good
5733reason to use @code{gmp_randinit_lc_2exp_size} instead).
5734@end deftypefun
5735
5736@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
5737Free all memory occupied by @var{state}.
5738@end deftypefun
5739
5740
5741@node Random State Seeding, Random State Miscellaneous, Random State Initialization, Random Number Functions
5742@section Random State Seeding
5743@cindex Random number seeding
5744@cindex Seeding random numbers
5745
5746@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, mpz_t @var{seed})
5747@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
5748Set an initial seed value into @var{state}.
5749
The size of a seed determines how many different sequences of random numbers
can be generated.  The ``quality'' of the seed is the randomness
5752of a given seed compared to the previous seed used, and this affects the
5753randomness of separate number sequences.  The method for choosing a seed is
5754critical if the generated numbers are to be used for important applications,
5755such as generating cryptographic keys.
5756
5757Traditionally the system time has been used to seed, but care needs to be
5758taken with this.  If an application seeds often and the resolution of the
5759system clock is low, then the same sequence of numbers might be repeated.
5760Also, the system time is quite easy to guess, so if unpredictability is
5761required then it should definitely not be the only source for the seed value.
5762On some systems there's a special device @file{/dev/random} which provides
5763random data better suited for use as a seed.
5764@end deftypefun
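
Reading seed material from such a device might look like the following
sketch.  @code{model_read_seed_bytes} is a hypothetical helper, the device
path is supplied by the caller, and whether a given device exists or blocks
depends on the system.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Fill buf with len bytes read from the file or device at path.
   Returns 1 on success, 0 if the source cannot be opened or runs short. */
int model_read_seed_bytes(const char *path, unsigned char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    size_t got = fread(buf, 1, len, fp);
    fclose(fp);
    return got == len;
}
```
@end verbatim

The bytes obtained this way could then be converted to an @code{mpz_t}, for
instance with @code{mpz_import}, and passed to @code{gmp_randseed}.  A failed
read should be treated as an error rather than silently falling back to a
guessable seed.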
5765
5766
5767@node Random State Miscellaneous,  , Random State Seeding, Random Number Functions
5768@section Random State Miscellaneous
5769
5770@deftypefun {unsigned long} gmp_urandomb_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
Return a uniformly distributed random number of @var{n} bits, i.e.@: in the
5772range 0 to @m{2^n-1,2^@var{n}-1} inclusive.  @var{n} must be less than or
5773equal to the number of bits in an @code{unsigned long}.
5774@end deftypefun
5775
5776@deftypefun {unsigned long} gmp_urandomm_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
5777Return a uniformly distributed random number in the range 0 to
5778@math{@var{n}-1}, inclusive.
5779@end deftypefun
5780
5781
5782@node Formatted Output, Formatted Input, Random Number Functions, Top
5783@chapter Formatted Output
5784@cindex Formatted output
5785@cindex @code{printf} formatted output
5786
5787@menu
5788* Formatted Output Strings::
5789* Formatted Output Functions::
5790* C++ Formatted Output::
5791@end menu
5792
5793@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
5794@section Format Strings
5795
5796@code{gmp_printf} and friends accept format strings similar to the standard C
5797@code{printf} (@pxref{Formatted Output,, Formatted Output, libc, The GNU C
5798Library Reference Manual}).  A format specification is of the form
5799
5800@example
5801% [flags] [width] [.[precision]] [type] conv
5802@end example
5803
5804GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
5805and @code{mpf_t} respectively, @samp{M} for @code{mp_limb_t}, and @samp{N} for
5806an @code{mp_limb_t} array.  @samp{Z}, @samp{Q}, @samp{M} and @samp{N} behave
5807like integers.  @samp{Q} will print a @samp{/} and a denominator, if needed.
5808@samp{F} behaves like a float.  For example,
5809
@example
mpz_t z;
gmp_printf ("%s is an mpz %Zd\n", "here", z);

mpq_t q;
gmp_printf ("a hex rational: %#40Qx\n", q);

mpf_t f;
int   n;
gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);

mp_limb_t l;
gmp_printf ("limb %Mu\n", l);

const mp_limb_t *ptr;
mp_size_t       size;
gmp_printf ("limb array %Nx\n", ptr, size);
@end example
5828
5829For @samp{N} the limbs are expected least significant first, as per the
5830@code{mpn} functions (@pxref{Low-level Functions}).  A negative size can be
5831given to print the value as a negative.
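
As a rough sketch of the @samp{N} type with a negative size, the following C++
helper formats a limb array in hex.  The name @code{limbs_to_hex} and the
fixed 256 byte buffer are assumptions made for this illustration only, they
are not part of GMP.

@example
#include <gmp.h>
#include <string>

// Hypothetical helper: format a limb array (least significant limb
// first) in hex.  A negative size prints the value as a negative.
// The result is assumed to fit in 256 bytes; longer output is truncated.
std::string
limbs_to_hex (const mp_limb_t *ptr, mp_size_t size)
@{
  char buf[256];
  gmp_snprintf (buf, sizeof buf, "%Nx", ptr, size);
  return buf;
@}
@end example

For a single limb holding 2, a size of 1 should give @samp{2} and a size of
@math{-1} should give @samp{-2}.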
5832
5833All the standard C @code{printf} types behave the same as the C library
5834@code{printf}, and can be freely intermixed with the GMP extensions.  In the
5835current implementation the standard parts of the format string are simply
5836handed to @code{printf} and only the GMP extensions handled directly.
5837
5838The flags accepted are as follows.  GLIBC style @nisamp{'} is only for the
5839standard C types (not the GMP types), and only if the C library supports it.
5840
5841@quotation
5842@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5843@item @nicode{0} @tab pad with zeros (rather than spaces)
5844@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
5845@item @nicode{+} @tab always show a sign
5846@item (space)    @tab show a space or a @samp{-} sign
5847@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
5848@end multitable
5849@end quotation
5850
5851The optional width and precision can be given as a number within the format
5852string, or as a @samp{*} to take an extra parameter of type @code{int}, the
5853same as the standard @code{printf}.
5854
5855The standard types accepted are as follows.  @samp{h} and @samp{l} are
5856portable, the rest will depend on the compiler (or include files) for the type
5857and the C library for the output.
5858
5859@quotation
5860@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5861@item @nicode{h}  @tab @nicode{short}
5862@item @nicode{hh} @tab @nicode{char}
5863@item @nicode{j}  @tab @nicode{intmax_t} or @nicode{uintmax_t}
5864@item @nicode{l}  @tab @nicode{long} or @nicode{wchar_t}
5865@item @nicode{ll} @tab @nicode{long long}
5866@item @nicode{L}  @tab @nicode{long double}
5867@item @nicode{q}  @tab @nicode{quad_t} or @nicode{u_quad_t}
5868@item @nicode{t}  @tab @nicode{ptrdiff_t}
5869@item @nicode{z}  @tab @nicode{size_t}
5870@end multitable
5871@end quotation
5872
5873@noindent
5874The GMP types are
5875
5876@quotation
5877@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5878@item @nicode{F}  @tab @nicode{mpf_t}, float conversions
5879@item @nicode{Q}  @tab @nicode{mpq_t}, integer conversions
5880@item @nicode{M}  @tab @nicode{mp_limb_t}, integer conversions
5881@item @nicode{N}  @tab @nicode{mp_limb_t} array, integer conversions
5882@item @nicode{Z}  @tab @nicode{mpz_t}, integer conversions
5883@end multitable
5884@end quotation
5885
5886The conversions accepted are as follows.  @samp{a} and @samp{A} are always
5887supported for @code{mpf_t} but depend on the C library for standard C float
5888types.  @samp{m} and @samp{p} depend on the C library.
5889
5890@quotation
5891@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
5892@item @nicode{a} @nicode{A} @tab hex floats, C99 style
5893@item @nicode{c}            @tab character
5894@item @nicode{d}            @tab decimal integer
5895@item @nicode{e} @nicode{E} @tab scientific format float
5896@item @nicode{f}            @tab fixed point float
5897@item @nicode{i}            @tab same as @nicode{d}
5898@item @nicode{g} @nicode{G} @tab fixed or scientific float
5899@item @nicode{m}            @tab @code{strerror} string, GLIBC style
5900@item @nicode{n}            @tab store characters written so far
5901@item @nicode{o}            @tab octal integer
5902@item @nicode{p}            @tab pointer
5903@item @nicode{s}            @tab string
5904@item @nicode{u}            @tab unsigned integer
5905@item @nicode{x} @nicode{X} @tab hex integer
5906@end multitable
5907@end quotation
5908
5909@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
5910types @samp{Z}, @samp{Q} and @samp{N} they are signed.  @samp{u} is not
5911meaningful for @samp{Z}, @samp{Q} and @samp{N}.
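
As a minimal sketch of the signed behaviour described above, the following C++
helper formats a small value with @samp{%#Zx}.  The name
@code{small_mpz_to_hex} and the 64 byte buffer are assumptions made for this
illustration only.

@example
#include <gmp.h>
#include <string>

// Hypothetical helper: format a long as a signed hex string via %#Zx.
// 64 bytes is assumed to be ample for any long (sign, "0x", digits
// and terminating null).
std::string
small_mpz_to_hex (long n)
@{
  mpz_t z;
  mpz_init_set_si (z, n);

  char buf[64];
  gmp_snprintf (buf, sizeof buf, "%#Zx", z);

  mpz_clear (z);
  return buf;
@}
@end example

With this sketch @code{small_mpz_to_hex (-255)} should give @samp{-0xff},
whereas a standard @samp{%x} conversion of an @code{int} holding @math{-255}
shows the twos complement bit pattern.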
5912
5913@samp{M} is a proxy for the C library @samp{l} or @samp{L}, according to the
5914size of @code{mp_limb_t}.  Unsigned conversions will be usual, but a signed
5915conversion can be used and will interpret the value as a twos complement
5916negative.
5917
5918@samp{n} can be used with any type, even the GMP types.
5919
Other types or conversions that might be accepted by the C library
@code{printf} cannot be used through @code{gmp_printf}; this includes, for
instance, extensions registered with GLIBC @code{register_printf_function}.
Also, there is currently no support for POSIX @samp{$} style numbered
arguments (perhaps this will be added in the future).
5925
The precision field has its usual meaning for integer @samp{Z} and float
@samp{F} types, but is currently undefined for @samp{Q} and should not be used
with that type.
5929
5930@code{mpf_t} conversions only ever generate as many digits as can be
5931accurately represented by the operand, the same as @code{mpf_get_str} does.
5932Zeros will be used if necessary to pad to the requested precision.  This
5933happens even for an @samp{f} conversion of an @code{mpf_t} which is an
5934integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits
5935precision will only produce about 40 digits, then pad with zeros to the
5936decimal point.  An empty precision field like @samp{%.Fe} or @samp{%.Ff} can
5937be used to specifically request just the significant digits.
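
The empty precision form might be used as in the following C++ sketch.  The
helper name @code{mpf_significant_digits}, the use of a @code{double} as the
source value and the 128 byte buffer are all assumptions for illustration, and
a value of large magnitude would simply be truncated by @code{gmp_snprintf}.

@example
#include <gmp.h>
#include <string>

// Hypothetical helper: show just the significant digits of a value
// held in an mpf_t of default precision, using an empty precision.
std::string
mpf_significant_digits (double d)
@{
  mpf_t f;
  mpf_init_set_d (f, d);

  char buf[128];
  gmp_snprintf (buf, sizeof buf, "%.Ff", f);

  mpf_clear (f);
  return buf;
@}
@end example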
5938
5939The decimal point character (or string) is taken from the current locale
5940settings on systems which provide @code{localeconv} (@pxref{Locales,, Locales
5941and Internationalization, libc, The GNU C Library Reference Manual}).  The C
5942library will normally do the same for standard float output.
5943
The format string is only interpreted as plain @code{char}s; multibyte
characters are not recognised.  Perhaps this will change in the future.
5946
5947
5948@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
5949@section Functions
5950@cindex Output functions
5951
5952Each of the following functions is similar to the corresponding C library
5953function.  The basic @code{printf} forms take a variable argument list.  The
5954@code{vprintf} forms take an argument pointer, see @ref{Variadic Functions,,
5955Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
5956va_start}.
5957
5958It should be emphasised that if a format string is invalid, or the arguments
5959don't match what the format specifies, then the behaviour of any of these
5960functions will be unpredictable.  GCC format string checking is not available,
5961since it doesn't recognise the GMP extensions.
5962
5963The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
5964@math{-1} to indicate a write error.  Output is not ``atomic'', so partial
5965output may be produced if a write error occurs.  All the functions can return
5966@math{-1} if the C library @code{printf} variant in use returns @math{-1}, but
5967this shouldn't normally occur.
5968
5969@deftypefun int gmp_printf (const char *@var{fmt}, @dots{})
5970@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
5971Print to the standard output @code{stdout}.  Return the number of characters
5972written, or @math{-1} if an error occurred.
5973@end deftypefun
5974
5975@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
5976@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
5977Print to the stream @var{fp}.  Return the number of characters written, or
5978@math{-1} if an error occurred.
5979@end deftypefun
5980
5981@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, @dots{})
5982@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
5983Form a null-terminated string in @var{buf}.  Return the number of characters
5984written, excluding the terminating null.
5985
5986No overlap is permitted between the space at @var{buf} and the string
5987@var{fmt}.
5988
5989These functions are not recommended, since there's no protection against
5990exceeding the space available at @var{buf}.
5991@end deftypefun
5992
5993@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, @dots{})
5994@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
5995Form a null-terminated string in @var{buf}.  No more than @var{size} bytes
5996will be written.  To get the full output, @var{size} must be enough for the
5997string and null-terminator.
5998
5999The return value is the total number of characters which ought to have been
6000produced, excluding the terminating null.  If @math{@var{retval} @ge{}
6001@var{size}} then the actual output has been truncated to the first
6002@math{@var{size}-1} characters, and a null appended.
6003
6004No overlap is permitted between the region @{@var{buf},@var{size}@} and the
6005@var{fmt} string.
6006
6007Notice the return value is in ISO C99 @code{snprintf} style.  This is so even
6008if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
6009@end deftypefun
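
The C99 style return value allows a two-pass approach, sketched below in C++:
try a small buffer first, and if the return value shows truncation, allocate
exactly enough space and format again.  The helper name
@code{mpz_to_decimal} and the 16 byte first attempt are assumptions for this
illustration only.

@example
#include <gmp.h>
#include <string>
#include <vector>

// Hypothetical helper: convert an mpz_t to a decimal std::string.
std::string
mpz_to_decimal (const mpz_t z)
@{
  char small[16];
  int len = gmp_snprintf (small, sizeof small, "%Zd", z);
  if (len < 0)
    return std::string ();          // error from the C library

  // len < size means the whole string and its null fitted.
  if ((size_t) len < sizeof small)
    return std::string (small, len);

  // Otherwise the output was truncated; len is the full length needed.
  std::vector<char> big (len + 1);
  gmp_snprintf (&big[0], big.size (), "%Zd", z);
  return std::string (&big[0], len);
@}
@end example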
6010
6011@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, @dots{})
6012@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in a block of memory obtained from the current
memory allocation function (@pxref{Custom Allocation}).  The block will be the
size of the string and null-terminator.  The address of the block is stored to
*@var{pp}.  The return value is the number of characters produced, excluding
the null-terminator.
6018
Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
@math{-1} if there's no more memory available; instead it lets the current
allocation function handle that.
6022@end deftypefun
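
Since the block comes from the current allocation function, it should be
released with the matching free function.  The following C++ sketch assumes
@code{mp_get_memory_functions} (@pxref{Custom Allocation}) is used to obtain
that function; the helper name @code{format_mpq} is hypothetical.

@example
#include <gmp.h>
#include <string>

// Hypothetical helper: format an mpq_t into a std::string, then
// release the block obtained by gmp_asprintf.
std::string
format_mpq (const mpq_t q)
@{
  char *s = NULL;
  int len = gmp_asprintf (&s, "%Qd", q);
  std::string result (s, len);

  // The block is the size of the string plus its null-terminator.
  void (*freefunc) (void *, size_t);
  mp_get_memory_functions (NULL, NULL, &freefunc);
  freefunc (s, len + 1);

  return result;
@}
@end example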
6023
6024@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, @dots{})
6025@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
6026@cindex @code{obstack} output
6027Append to the current object in @var{ob}.  The return value is the number of
6028characters written.  A null-terminator is not written.
6029
6030@var{fmt} cannot be within the current object in @var{ob}, since that object
6031might move as it grows.
6032
6033These functions are available only when the C library provides the obstack
6034feature, which probably means only on GNU systems, see @ref{Obstacks,,
6035Obstacks, libc, The GNU C Library Reference Manual}.
6036@end deftypefun
6037
6038
6039@node C++ Formatted Output,  , Formatted Output Functions, Formatted Output
6040@section C++ Formatted Output
6041@cindex C++ @code{ostream} output
6042@cindex @code{ostream} output
6043
6044The following functions are provided in @file{libgmpxx} (@pxref{Headers and
6045Libraries}), which is built if C++ support is enabled (@pxref{Build Options}).
6046Prototypes are available from @code{<gmp.h>}.
6047
6048@deftypefun ostream& operator<< (ostream& @var{stream}, mpz_t @var{op})
6049Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
6050@code{ios::width} is reset to 0 after output, the same as the standard
6051@code{ostream operator<<} routines do.
6052
6053In hex or octal, @var{op} is printed as a signed number, the same as for
6054decimal.  This is unlike the standard @code{operator<<} routines on @code{int}
6055etc, which instead give twos complement.
6056@end deftypefun
6057
6058@deftypefun ostream& operator<< (ostream& @var{stream}, mpq_t @var{op})
6059Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
6060@code{ios::width} is reset to 0 after output, the same as the standard
6061@code{ostream operator<<} routines do.
6062
6063Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
6064just a plain integer like @samp{123}.
6065
6066In hex or octal, @var{op} is printed as a signed value, the same as for
6067decimal.  If @code{ios::showbase} is set then a base indicator is shown on
6068both the numerator and denominator (if the denominator is required).
6069@end deftypefun
6070
6071@deftypefun ostream& operator<< (ostream& @var{stream}, mpf_t @var{op})
6072Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
6073@code{ios::width} is reset to 0 after output, the same as the standard
6074@code{ostream operator<<} routines do.
6075
6076The decimal point follows the standard library float @code{operator<<}, which
6077on recent systems means the @code{std::locale} imbued on @var{stream}.
6078
6079Hex and octal are supported, unlike the standard @code{operator<<} on
6080@code{double}.  The mantissa will be in hex or octal, the exponent will be in
6081decimal.  For hex the exponent delimiter is an @samp{@@}.  This is as per
6082@code{mpf_out_str}.
6083
6084@code{ios::showbase} is supported, and will put a base on the mantissa, for
6085example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
6086This last form is slightly strange, but at least differentiates itself from
6087decimal.
6088@end deftypefun
6089
6090These operators mean that GMP types can be printed in the usual C++ way, for
6091example,
6092
@example
mpz_t  z;
int    n;
...
cout << "iteration " << n << " value " << z << "\n";
@end example
6099
6100But note that @code{ostream} output (and @code{istream} input, @pxref{C++
6101Formatted Input}) is the only overloading available for the GMP types and that
6102for instance using @code{+} with an @code{mpz_t} will have unpredictable
6103results.  For classes with overloading, see @ref{C++ Class Interface}.
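
As a small sketch of the signed hex output described for @code{mpz_t}, the
following C++ helper writes to a @code{std::ostringstream} with
@code{std::hex} and @code{std::showbase} set.  The helper name
@code{show_hex} is an assumption for this illustration, and the program is
assumed to be linked with both @file{libgmpxx} and @file{libgmp}.

@example
#include <iostream>
#include <sstream>
#include <string>
#include <gmp.h>

// Hypothetical helper: format an mpz_t through operator<< in hex,
// with a base indicator.
std::string
show_hex (const mpz_t z)
@{
  std::ostringstream out;
  out << std::hex << std::showbase << z;
  return out.str ();
@}
@end example

For an @code{mpz_t} holding @math{-255} this should give @samp{-0xff}, a
signed value rather than a twos complement bit pattern.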
6104
6105
6106@node Formatted Input, C++ Class Interface, Formatted Output, Top
6107@chapter Formatted Input
6108@cindex Formatted input
6109@cindex @code{scanf} formatted input
6110
6111@menu
6112* Formatted Input Strings::
6113* Formatted Input Functions::
6114* C++ Formatted Input::
6115@end menu
6116
6117
6118@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
6119@section Formatted Input Strings
6120
6121@code{gmp_scanf} and friends accept format strings similar to the standard C
6122@code{scanf} (@pxref{Formatted Input,, Formatted Input, libc, The GNU C
6123Library Reference Manual}).  A format specification is of the form
6124
@example
% [flags] [width] [type] conv
@end example
6128
6129GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
6130and @code{mpf_t} respectively.  @samp{Z} and @samp{Q} behave like integers.
6131@samp{Q} will read a @samp{/} and a denominator, if present.  @samp{F} behaves
6132like a float.
6133
6134GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
6135they're already ``call-by-reference''.  For example,
6136
@example
/* to read say "a(5) = 1234" */
int   n;
mpz_t z;
gmp_scanf ("a(%d) = %Zd\n", &n, z);

mpq_t q1, q2;
gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);

/* to read say "topleft (1.55,-2.66)" */
mpf_t x, y;
char  buf[32];
gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
@end example
6151
6152All the standard C @code{scanf} types behave the same as in the C library
6153@code{scanf}, and can be freely intermixed with the GMP extensions.  In the
6154current implementation the standard parts of the format string are simply
6155handed to @code{scanf} and only the GMP extensions handled directly.
6156
6157The flags accepted are as follows.  @samp{a} and @samp{'} will depend on
6158support from the C library, and @samp{'} cannot be used with GMP types.
6159
6160@quotation
6161@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6162@item @nicode{*} @tab read but don't store
6163@item @nicode{a} @tab allocate a buffer (string conversions)
6164@item @nicode{'} @tab grouped digits, GLIBC style (not GMP types)
6165@end multitable
6166@end quotation
6167
6168The standard types accepted are as follows.  @samp{h} and @samp{l} are
6169portable, the rest will depend on the compiler (or include files) for the type
6170and the C library for the input.
6171
6172@quotation
6173@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6174@item @nicode{h}  @tab @nicode{short}
6175@item @nicode{hh} @tab @nicode{char}
6176@item @nicode{j}  @tab @nicode{intmax_t} or @nicode{uintmax_t}
6177@item @nicode{l}  @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
6178@item @nicode{ll} @tab @nicode{long long}
6179@item @nicode{L}  @tab @nicode{long double}
6180@item @nicode{q}  @tab @nicode{quad_t} or @nicode{u_quad_t}
6181@item @nicode{t}  @tab @nicode{ptrdiff_t}
6182@item @nicode{z}  @tab @nicode{size_t}
6183@end multitable
6184@end quotation
6185
6186@noindent
6187The GMP types are
6188
6189@quotation
6190@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6191@item @nicode{F}  @tab @nicode{mpf_t}, float conversions
6192@item @nicode{Q}  @tab @nicode{mpq_t}, integer conversions
6193@item @nicode{Z}  @tab @nicode{mpz_t}, integer conversions
6194@end multitable
6195@end quotation
6196
6197The conversions accepted are as follows.  @samp{p} and @samp{[} will depend on
6198support from the C library, the rest are standard.
6199
6200@quotation
6201@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6202@item @nicode{c}            @tab character or characters
6203@item @nicode{d}            @tab decimal integer
6204@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G}
6205                            @tab float
6206@item @nicode{i}            @tab integer with base indicator
6207@item @nicode{n}            @tab characters read so far
6208@item @nicode{o}            @tab octal integer
6209@item @nicode{p}            @tab pointer
6210@item @nicode{s}            @tab string of non-whitespace characters
6211@item @nicode{u}            @tab decimal integer
6212@item @nicode{x} @nicode{X} @tab hex integer
6213@item @nicode{[}            @tab string of characters in a set
6214@end multitable
6215@end quotation
6216
6217@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
6218read either fixed point or scientific format, and either upper or lower case
6219@samp{e} for the exponent in scientific format.
6220
6221C99 style hex float format (@code{printf %a}, @pxref{Formatted Output
6222Strings}) is always accepted for @code{mpf_t}, but for the standard float
6223types it will depend on the C library.
6224
6225@samp{x} and @samp{X} are identical, both accept both upper and lower case
6226hexadecimal.
6227
@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
values.  For the standard C types these are described as ``unsigned''
conversions, but that merely affects certain overflow handling; negatives are
still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of
Integers, libc, The GNU C Library Reference Manual}).  For GMP types there are
no overflows, so @samp{d} and @samp{u} are identical.
6234
6235@samp{Q} type reads the numerator and (optional) denominator as given.  If the
6236value might not be in canonical form then @code{mpq_canonicalize} must be
6237called before using it in any calculations (@pxref{Rational Number
6238Functions}).
6239
6240@samp{Qi} will read a base specification separately for the numerator and
6241denominator.  For example @samp{0x10/11} would be 16/11, whereas
6242@samp{0x10/0x11} would be 16/17.
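
Reading a rational and then canonicalizing it might look like the following
C++ sketch.  The helper name @code{parse_rational} is hypothetical, and the
explicit rejection of a zero denominator is an extra precaution assumed here,
not something @code{gmp_sscanf} does itself.

@example
#include <gmp.h>

// Hypothetical helper: read a rational with per-part base indicators
// into q (already initialized by the caller), and put it in canonical
// form.  Returns true on success.
bool
parse_rational (const char *text, mpq_t q)
@{
  if (gmp_sscanf (text, "%Qi", q) != 1)
    return false;

  // Assumed precaution: a zero denominator cannot be canonicalized.
  if (mpz_sgn (mpq_denref (q)) == 0)
    return false;

  mpq_canonicalize (q);
  return true;
@}
@end example

With this sketch an input of @samp{0x10/0x20} is read as 16/32 and then
reduced to 1/2.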
6243
6244@samp{n} can be used with any of the types above, even the GMP types.
6245@samp{*} to suppress assignment is allowed, though in that case it would do
6246nothing at all.
6247
6248Other conversions or types that might be accepted by the C library
6249@code{scanf} cannot be used through @code{gmp_scanf}.
6250
6251Whitespace is read and discarded before a field, except for @samp{c} and
6252@samp{[} conversions.
6253
6254For float conversions, the decimal point character (or string) expected is
6255taken from the current locale settings on systems which provide
6256@code{localeconv} (@pxref{Locales,, Locales and Internationalization, libc,
6257The GNU C Library Reference Manual}).  The C library will normally do the same
6258for standard float input.
6259
The format string is only interpreted as plain @code{char}s; multibyte
characters are not recognised.  Perhaps this will change in the future.
6262
6263
6264@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
6265@section Formatted Input Functions
6266@cindex Input functions
6267
6268Each of the following functions is similar to the corresponding C library
6269function.  The plain @code{scanf} forms take a variable argument list.  The
6270@code{vscanf} forms take an argument pointer, see @ref{Variadic Functions,,
6271Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
6272va_start}.
6273
6274It should be emphasised that if a format string is invalid, or the arguments
6275don't match what the format specifies, then the behaviour of any of these
6276functions will be unpredictable.  GCC format string checking is not available,
6277since it doesn't recognise the GMP extensions.
6278
6279No overlap is permitted between the @var{fmt} string and any of the results
6280produced.
6281
6282@deftypefun int gmp_scanf (const char *@var{fmt}, @dots{})
6283@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
6284Read from the standard input @code{stdin}.
6285@end deftypefun
6286
6287@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
6288@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
6289Read from the stream @var{fp}.
6290@end deftypefun
6291
6292@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, @dots{})
6293@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
6294Read from a null-terminated string @var{s}.
6295@end deftypefun
6296
6297The return value from each of these functions is the same as the standard C99
6298@code{scanf}, namely the number of fields successfully parsed and stored.
6299@samp{%n} fields and fields read but suppressed by @samp{*} don't count
6300towards the return value.
6301
If end of input (or a file error) is reached before a character for a field or
a literal, and if no previous non-suppressed fields have matched, then the
return value is @code{EOF} instead of 0.  A whitespace character in the format
string is only an optional match and doesn't induce an @code{EOF} in this
fashion.  Leading whitespace read and discarded for a field doesn't count as
characters for that field.
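
A caller would normally compare the return value against the number of fields
expected.  The following C++ sketch does so for input of the form
@samp{a(5) = 1234}; the helper name @code{parse_entry} is an assumption for
this illustration only.

@example
#include <gmp.h>

// Hypothetical helper: parse "a(INDEX) = VALUE" from a string.
// value must already be initialized.  Returns the number of fields
// stored, so 2 means both the index and the value were read.
int
parse_entry (const char *line, int *index, mpz_t value)
@{
  return gmp_sscanf (line, "a(%d) = %Zd", index, value);
@}
@end example

A return of 1 here would mean the index was stored but no valid integer
followed the @samp{=}.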
6308
6309For the GMP types, input parsing follows C99 rules, namely one character of
6310lookahead is used and characters are read while they continue to meet the
6311format requirements.  If this doesn't provide a complete number then the
6312function terminates, with that field not stored nor counted towards the return
6313value.  For instance with @code{mpf_t} an input @samp{1.23e-XYZ} would be read
6314up to the @samp{X} and that character pushed back since it's not a digit.  The
6315string @samp{1.23e-} would then be considered invalid since an @samp{e} must
6316be followed by at least one digit.
6317
6318For the standard C types, in the current implementation GMP calls the C
6319library @code{scanf} functions, which might have looser rules about what
6320constitutes a valid input.
6321
6322Note that @code{gmp_sscanf} is the same as @code{gmp_fscanf} and only does one
6323character of lookahead when parsing.  Although clearly it could look at its
6324entire input, it is deliberately made identical to @code{gmp_fscanf}, the same
6325way C99 @code{sscanf} is the same as @code{fscanf}.
6326
6327
6328@node C++ Formatted Input,  , Formatted Input Functions, Formatted Input
6329@section C++ Formatted Input
6330@cindex C++ @code{istream} input
6331@cindex @code{istream} input
6332
6333The following functions are provided in @file{libgmpxx} (@pxref{Headers and
6334Libraries}), which is built only if C++ support is enabled (@pxref{Build
6335Options}).  Prototypes are available from @code{<gmp.h>}.
6336
6337@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
6338Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
6339@end deftypefun
6340
6341@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
6342An integer like @samp{123} will be read, or a fraction like @samp{5/9}.  No
6343whitespace is allowed around the @samp{/}.  If the fraction is not in
6344canonical form then @code{mpq_canonicalize} must be called (@pxref{Rational
6345Number Functions}) before operating on it.
6346
As per integer input, a @samp{0} or @samp{0x} base indicator is read when
none of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} are set.  This is
done separately for numerator and denominator, so that for instance
@samp{0x10/11} is @math{16/11} and @samp{0x10/0x11} is @math{16/17}.
6351@end deftypefun
6352
6353@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
6354Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
6355
6356Hex or octal floats are not supported, but might be in the future, or perhaps
6357it's best to accept only what the standard float @code{operator>>} does.
6358@end deftypefun
6359
6360Note that digit grouping specified by the @code{istream} locale is currently
6361not accepted.  Perhaps this will change in the future.
6362
6363@sp 1
6364These operators mean that GMP types can be read in the usual C++ way, for
6365example,
6366
@example
mpz_t  z;
...
cin >> z;
@end example
6372
6373But note that @code{istream} input (and @code{ostream} output, @pxref{C++
6374Formatted Output}) is the only overloading available for the GMP types and
6375that for instance using @code{+} with an @code{mpz_t} will have unpredictable
6376results.  For classes with overloading, see @ref{C++ Class Interface}.
6377
6378
6379
6380@node C++ Class Interface, BSD Compatible Functions, Formatted Input, Top
6381@chapter C++ Class Interface
6382@cindex C++ interface
6383
6384This chapter describes the C++ class based interface to GMP.
6385
6386All GMP C language types and functions can be used in C++ programs, since
6387@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
6388overloaded functions and operators which may be more convenient.
6389
6390Due to the implementation of this interface, a reasonably recent C++ compiler
6391is required, one supporting namespaces, partial specialization of templates
6392and member templates.  For GCC this means version 2.91 or later.
6393
6394@strong{Everything described in this chapter is to be considered preliminary
6395and might be subject to incompatible changes if some unforeseen difficulty
6396reveals itself.}
6397
6398@menu
6399* C++ Interface General::
6400* C++ Interface Integers::
6401* C++ Interface Rationals::
6402* C++ Interface Floats::
6403* C++ Interface Random Numbers::
6404* C++ Interface Limitations::
6405@end menu
6406
6407
6408@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
6409@section C++ Interface General
6410
6411@noindent
6412All the C++ classes and functions are available with
6413
6414@cindex @code{gmpxx.h}
@example
#include <gmpxx.h>
@end example
6418
6419Programs should be linked with the @file{libgmpxx} and @file{libgmp}
6420libraries.  For example,
6421
@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example
6425
6426@noindent
6427The classes defined are
6428
6429@deftp Class mpz_class
6430@deftpx Class mpq_class
6431@deftpx Class mpf_class
6432@end deftp
6433
6434The standard operators and various standard functions are overloaded to allow
6435arithmetic with these classes.  For example,
6436
@example
#include <iostream>
#include <gmpxx.h>

using namespace std;

int
main (void)
@{
  mpz_class a, b, c;

  a = 1234;
  b = "-5678";
  c = a+b;
  cout << "sum is " << c << "\n";
  cout << "absolute value is " << abs(c) << "\n";

  return 0;
@}
@end example
6452
6453An important feature of the implementation is that an expression like
6454@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
6455without using a temporary for the @code{b+c} part.  Expressions which by their
6456nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
6457though.
6458
6459The classes can be freely intermixed in expressions, as can the classes and
6460the standard types @code{long}, @code{unsigned long} and @code{double}.
6461Smaller types like @code{int} or @code{float} can also be intermixed, since
6462C++ will promote them.
6463
6464Note that @code{bool} is not accepted directly, but must be explicitly cast to
6465an @code{int} first.  This is because C++ will automatically convert any
6466pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
6467sorts of invalid class and pointer combinations compile but almost certainly
6468not do anything sensible.
6469
6470Conversions back from the classes to standard C++ types aren't done
6471automatically, instead member functions like @code{get_si} are provided (see
6472the following sections for details).
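
Since the conversion is explicit, a range check can be made first.  The
following sketch combines @code{fits_slong_p} with @code{get_si}; the helper
name @code{to_long_checked} and the choice of @code{std::range_error} are
assumptions for this illustration only.

@example
#include <gmpxx.h>
#include <stdexcept>

// Hypothetical helper: convert to long, refusing values out of range.
long
to_long_checked (const mpz_class &n)
@{
  if (! n.fits_slong_p ())
    throw std::range_error ("value does not fit in a long");
  return n.get_si ();
@}
@end example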
6473
6474Also there are no automatic conversions from the classes to the corresponding
6475GMP C types, instead a reference to the underlying C object can be obtained
6476with the following functions,
6477
6478@deftypefun mpz_t mpz_class::get_mpz_t ()
6479@deftypefunx mpq_t mpq_class::get_mpq_t ()
6480@deftypefunx mpf_t mpf_class::get_mpf_t ()
6481@end deftypefun
6482
6483These can be used to call a C function which doesn't have a C++ class
6484interface.  For example to set @code{a} to the GCD of @code{b} and @code{c},
6485
@example
mpz_class a, b, c;
...
mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
@end example
6491
6492In the other direction, a class can be initialized from the corresponding GMP
6493C type, or assigned to if an explicit constructor is used.  In both cases this
6494makes a copy of the value, it doesn't create any sort of association.  For
6495example,
6496
@example
mpz_t z;
// ... init and calculate z ...
mpz_class x(z);
mpz_class y;
y = mpz_class (z);
@end example
6504
6505There are no namespace setups in @file{gmpxx.h}, all types and functions are
6506simply put into the global namespace.  This is what @file{gmp.h} has done in
6507the past, and continues to do for compatibility.  The extras provided by
6508@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
6509anything.
6510
6511
6512@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
6513@section C++ Interface Integers
6514
6515@deftypefun {} mpz_class::mpz_class (type @var{n})
6516Construct an @code{mpz_class}.  All the standard C++ types may be used, except
6517@code{long long} and @code{long double}, and all the GMP C++ classes can be
6518used.  Any necessary conversion follows the corresponding C function, for
6519example @code{double} follows @code{mpz_set_d} (@pxref{Assigning Integers}).
6520@end deftypefun
6521
6522@deftypefun explicit mpz_class::mpz_class (mpz_t @var{z})
6523Construct an @code{mpz_class} from an @code{mpz_t}.  The value in @var{z} is
6524copied into the new @code{mpz_class}, there won't be any permanent association
6525between it and @var{z}.
6526@end deftypefun
6527
6528@deftypefun explicit mpz_class::mpz_class (const char *@var{s}, int @var{base} = 0)
6529@deftypefunx explicit mpz_class::mpz_class (const string& @var{s}, int @var{base} = 0)
6530Construct an @code{mpz_class} converted from a string using @code{mpz_set_str}
6531(@pxref{Assigning Integers}).
6532
6533If the string is not a valid integer, an @code{std::invalid_argument}
6534exception is thrown.  The same applies to @code{operator=}.
6535@end deftypefun
6536
6537@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
6538@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
6539Divisions involving @code{mpz_class} round towards zero, as per the
6540@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
6541This is the same as the C99 @code{/} and @code{%} operators.
6542
6543The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called
6544directly if desired.  For example,
6545
6546@example
6547mpz_class q, a, d;
6548...
6549mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
6550@end example
6551@end deftypefun
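
The difference between the two rounding rules can be seen without GMP at all.
The following is a minimal sketch using built-in @code{long} arithmetic.  The
names @code{QuotRem}, @code{trunc_div} and @code{floor_div} are invented for
illustration and are not part of GMP, and a C++11 compiler is assumed, where
@code{/} and @code{%} truncate as in C99.

@verbatim
```cpp
#include <cassert>

// Quotient and remainder pair (hypothetical helper type, not a GMP type).
struct QuotRem { long q; long r; };

// Round the quotient towards zero, as the built-in / and % operators do.
QuotRem trunc_div (long a, long d)
{
  assert (d != 0);
  return QuotRem{ a / d, a % d };
}

// Round the quotient towards minus infinity; the remainder then takes
// the sign of the divisor.
QuotRem floor_div (long a, long d)
{
  QuotRem t = trunc_div (a, d);
  // Adjust only when the division is inexact and the truncated
  // remainder has the opposite sign to the divisor.
  if (t.r != 0 && ((t.r < 0) != (d < 0)))
    {
      t.q -= 1;
      t.r += d;
    }
  return t;
}
```
@end verbatim

For @minus{}7 divided by 2, @code{trunc_div} gives quotient @minus{}3 and
remainder @minus{}1, matching the rounding of @code{operator/} and
@code{operator%} on @code{mpz_class}.  @code{floor_div} gives quotient
@minus{}4 and remainder 1, matching the rounding of @code{mpz_fdiv_q} and
@code{mpz_fdiv_r}.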
6552
6553@deftypefun mpz_class abs (mpz_class @var{op1})
6554@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
6555@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
6556@maybepagebreak
6557@deftypefunx bool mpz_class::fits_sint_p (void)
6558@deftypefunx bool mpz_class::fits_slong_p (void)
6559@deftypefunx bool mpz_class::fits_sshort_p (void)
6560@maybepagebreak
6561@deftypefunx bool mpz_class::fits_uint_p (void)
6562@deftypefunx bool mpz_class::fits_ulong_p (void)
6563@deftypefunx bool mpz_class::fits_ushort_p (void)
6564@maybepagebreak
6565@deftypefunx double mpz_class::get_d (void)
6566@deftypefunx long mpz_class::get_si (void)
6567@deftypefunx string mpz_class::get_str (int @var{base} = 10)
6568@deftypefunx {unsigned long} mpz_class::get_ui (void)
6569@maybepagebreak
6570@deftypefunx int mpz_class::set_str (const char *@var{str}, int @var{base})
6571@deftypefunx int mpz_class::set_str (const string& @var{str}, int @var{base})
6572@deftypefunx int sgn (mpz_class @var{op})
6573@deftypefunx mpz_class sqrt (mpz_class @var{op})
6574These functions provide a C++ class interface to the corresponding GMP C
6575routines.
6576
6577@code{cmp} can be used with any of the classes or the standard C++ types,
6578except @code{long long} and @code{long double}.
6579@end deftypefun
6580
6581@sp 1
6582Overloaded operators for combinations of @code{mpz_class} and @code{double}
6583are provided for completeness, but it should be noted that if the given
6584@code{double} is not an integer then the way any rounding is done is currently
6585unspecified.  The rounding might take place at the start, in the middle, or at
6586the end of the operation, and it might change in the future.
6587
6588Conversions between @code{mpz_class} and @code{double}, however, are defined
6589to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
6590And comparisons are always made exactly, as per @code{mpz_cmp_d}.
6591
6592
6593@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
6594@section C++ Interface Rationals
6595
In all the following constructors, if a fraction is given then it should be in
canonical form; if it is not, @code{mpq_class::canonicalize} must be called
before further operations.
6598
6599@deftypefun {} mpq_class::mpq_class (type @var{op})
6600@deftypefunx {} mpq_class::mpq_class (integer @var{num}, integer @var{den})
6601Construct an @code{mpq_class}.  The initial value can be a single value of any
6602type, or a pair of integers (@code{mpz_class} or standard C++ integer types)
6603representing a fraction, except that @code{long long} and @code{long double}
6604are not supported.  For example,
6605
6606@example
6607mpq_class q (99);
6608mpq_class q (1.75);
6609mpq_class q (1, 3);
6610@end example
6611@end deftypefun
6612
6613@deftypefun explicit mpq_class::mpq_class (mpq_t @var{q})
6614Construct an @code{mpq_class} from an @code{mpq_t}.  The value in @var{q} is
copied into the new @code{mpq_class}; there won't be any permanent association
6616between it and @var{q}.
6617@end deftypefun
6618
6619@deftypefun explicit mpq_class::mpq_class (const char *@var{s}, int @var{base} = 0)
6620@deftypefunx explicit mpq_class::mpq_class (const string& @var{s}, int @var{base} = 0)
6621Construct an @code{mpq_class} converted from a string using @code{mpq_set_str}
6622(@pxref{Initializing Rationals}).
6623
6624If the string is not a valid rational, an @code{std::invalid_argument}
6625exception is thrown.  The same applies to @code{operator=}.
6626@end deftypefun
6627
6628@deftypefun void mpq_class::canonicalize ()
6629Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
6630Functions}.  All arithmetic operators require their operands in canonical
6631form, and will return results in canonical form.
6632@end deftypefun
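
As a rough model of what canonical form means, the sketch below reduces a
fraction held in two @code{long} values.  @code{Fraction}, @code{gcd_abs} and
the free function @code{canonicalize} are hypothetical names used only here.
This is not how GMP implements @code{mpq_class::canonicalize}; it merely
illustrates the rules that canonical form imposes (no common factors, a
positive denominator, and zero represented as 0/1).

@verbatim
```cpp
#include <cassert>

// Toy rational number (hypothetical, stands in for mpq_class).
struct Fraction { long num; long den; };

// Greatest common divisor of |a| and |b| by Euclid's algorithm.
// Overflow for the most negative long is ignored in this sketch.
long gcd_abs (long a, long b)
{
  if (a < 0) a = -a;
  if (b < 0) b = -b;
  while (b != 0)
    {
      long t = a % b;
      a = b;
      b = t;
    }
  return a;
}

// Return f with common factors removed and a positive denominator.
Fraction canonicalize (Fraction f)
{
  assert (f.den != 0);
  long g = gcd_abs (f.num, f.den);   // g >= 1 because den is non-zero
  f.num /= g;
  f.den /= g;
  if (f.den < 0)
    {
      f.num = -f.num;
      f.den = -f.den;
    }
  return f;
}
```
@end verbatim

With this model, 6/@minus{}4 becomes @minus{}3/2 and 0/5 becomes 0/1.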
6633
6634@deftypefun mpq_class abs (mpq_class @var{op})
6635@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
6636@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
6637@maybepagebreak
6638@deftypefunx double mpq_class::get_d (void)
6639@deftypefunx string mpq_class::get_str (int @var{base} = 10)
6640@maybepagebreak
6641@deftypefunx int mpq_class::set_str (const char *@var{str}, int @var{base})
6642@deftypefunx int mpq_class::set_str (const string& @var{str}, int @var{base})
6643@deftypefunx int sgn (mpq_class @var{op})
6644These functions provide a C++ class interface to the corresponding GMP C
6645routines.
6646
6647@code{cmp} can be used with any of the classes or the standard C++ types,
6648except @code{long long} and @code{long double}.
6649@end deftypefun
6650
6651@deftypefun {mpz_class&} mpq_class::get_num ()
6652@deftypefunx {mpz_class&} mpq_class::get_den ()
6653Get a reference to an @code{mpz_class} which is the numerator or denominator
6654of an @code{mpq_class}.  This can be used both for read and write access.  If
6655the object returned is modified, it modifies the original @code{mpq_class}.
6656
6657If direct manipulation might produce a non-canonical value, then
6658@code{mpq_class::canonicalize} must be called before further operations.
6659@end deftypefun
6660
6661@deftypefun mpz_t mpq_class::get_num_mpz_t ()
6662@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
6663Get a reference to the underlying @code{mpz_t} numerator or denominator of an
6664@code{mpq_class}.  This can be passed to C functions expecting an
6665@code{mpz_t}.  Any modifications made to the @code{mpz_t} will modify the
6666original @code{mpq_class}.
6667
6668If direct manipulation might produce a non-canonical value, then
6669@code{mpq_class::canonicalize} must be called before further operations.
6670@end deftypefun
6671
@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop})
6673Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
6674the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).
6675
6676If the @var{rop} read might not be in canonical form then
6677@code{mpq_class::canonicalize} must be called.
6678@end deftypefun
6679
6680
6681@node C++ Interface Floats, C++ Interface Random Numbers, C++ Interface Rationals, C++ Class Interface
6682@section C++ Interface Floats
6683
6684When an expression requires the use of temporary intermediate @code{mpf_class}
6685values, like @code{f=g*h+x*y}, those temporaries will have the same precision
6686as the destination @code{f}.  Explicit constructors can be used if this
6687doesn't suit.
6688
6689@deftypefun {} mpf_class::mpf_class (type @var{op})
6690@deftypefunx {} mpf_class::mpf_class (type @var{op}, unsigned long @var{prec})
6691Construct an @code{mpf_class}.  Any standard C++ type can be used, except
6692@code{long long} and @code{long double}, and any of the GMP C++ classes can be
6693used.
6694
6695If @var{prec} is given, the initial precision is that value, in bits.  If
6696@var{prec} is not given, then the initial precision is determined by the type
6697of @var{op} given.  An @code{mpz_class}, @code{mpq_class}, or C++
6698builtin type will give the default @code{mpf} precision (@pxref{Initializing
6699Floats}).  An @code{mpf_class} or expression will give the precision of that
6700value.  The precision of a binary expression is the higher of the two
6701operands.
6702
6703@example
6704mpf_class f(1.5);        // default precision
6705mpf_class f(1.5, 500);   // 500 bits (at least)
6706mpf_class f(x);          // precision of x
6707mpf_class f(abs(x));     // precision of x
6708mpf_class f(-g, 1000);   // 1000 bits (at least)
6709mpf_class f(x+y);        // greater of precisions of x and y
6710@end example
6711@end deftypefun
6712
6713@deftypefun explicit mpf_class::mpf_class (mpf_t @var{f})
6714@deftypefunx {} mpf_class::mpf_class (mpf_t @var{f}, unsigned long @var{prec})
6715Construct an @code{mpf_class} from an @code{mpf_t}.  The value in @var{f} is
copied into the new @code{mpf_class}; there won't be any permanent association
6717between it and @var{f}.
6718
6719If @var{prec} is given, the initial precision is that value, in bits.  If
6720@var{prec} is not given, then the initial precision is that of @var{f}.
6721@end deftypefun
6722
6723@deftypefun explicit mpf_class::mpf_class (const char *@var{s})
6724@deftypefunx {} mpf_class::mpf_class (const char *@var{s}, unsigned long @var{prec}, int @var{base} = 0)
6725@deftypefunx explicit mpf_class::mpf_class (const string& @var{s})
6726@deftypefunx {} mpf_class::mpf_class (const string& @var{s}, unsigned long @var{prec}, int @var{base} = 0)
6727Construct an @code{mpf_class} converted from a string using @code{mpf_set_str}
6728(@pxref{Assigning Floats}).  If @var{prec} is given, the initial precision is
6729that value, in bits.  If not, the default @code{mpf} precision
6730(@pxref{Initializing Floats}) is used.
6731
6732If the string is not a valid float, an @code{std::invalid_argument} exception
6733is thrown.  The same applies to @code{operator=}.
6734@end deftypefun
6735
6736@deftypefun {mpf_class&} mpf_class::operator= (type @var{op})
6737Convert and store the given @var{op} value to an @code{mpf_class} object.  The
6738same types are accepted as for the constructors above.
6739
Note that @code{operator=} only stores a new value.  It doesn't copy or change
the precision of the destination; instead the value is truncated if necessary,
the same as @code{mpf_set} etc.  In particular, this means that for
@code{mpf_class} a copy constructor is not the same as a default constructor
plus assignment.
6745
6746@example
6747mpf_class x (y);   // x created with precision of y
6748
6749mpf_class x;       // x created with default precision
6750x = y;             // value truncated to that precision
6751@end example
6752
6753Applications using templated code may need to be careful about the assumptions
6754the code makes in this area, when working with @code{mpf_class} values of
6755various different or non-default precisions.  For instance implementations of
6756the standard @code{complex} template have been seen in both styles above,
6757though of course @code{complex} is normally only actually specified for use
6758with the builtin float types.
6759@end deftypefun
6760
6761@deftypefun mpf_class abs (mpf_class @var{op})
6762@deftypefunx mpf_class ceil (mpf_class @var{op})
6763@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2})
6764@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2})
6765@maybepagebreak
6766@deftypefunx bool mpf_class::fits_sint_p (void)
6767@deftypefunx bool mpf_class::fits_slong_p (void)
6768@deftypefunx bool mpf_class::fits_sshort_p (void)
6769@maybepagebreak
6770@deftypefunx bool mpf_class::fits_uint_p (void)
6771@deftypefunx bool mpf_class::fits_ulong_p (void)
6772@deftypefunx bool mpf_class::fits_ushort_p (void)
6773@maybepagebreak
6774@deftypefunx mpf_class floor (mpf_class @var{op})
6775@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2})
6776@maybepagebreak
6777@deftypefunx double mpf_class::get_d (void)
6778@deftypefunx long mpf_class::get_si (void)
6779@deftypefunx string mpf_class::get_str (mp_exp_t& @var{exp}, int @var{base} = 10, size_t @var{digits} = 0)
6780@deftypefunx {unsigned long} mpf_class::get_ui (void)
6781@maybepagebreak
6782@deftypefunx int mpf_class::set_str (const char *@var{str}, int @var{base})
6783@deftypefunx int mpf_class::set_str (const string& @var{str}, int @var{base})
6784@deftypefunx int sgn (mpf_class @var{op})
6785@deftypefunx mpf_class sqrt (mpf_class @var{op})
6786@deftypefunx mpf_class trunc (mpf_class @var{op})
6787These functions provide a C++ class interface to the corresponding GMP C
6788routines.
6789
6790@code{cmp} can be used with any of the classes or the standard C++ types,
6791except @code{long long} and @code{long double}.
6792
6793The accuracy provided by @code{hypot} is not currently guaranteed.
6794@end deftypefun
6795
6796@deftypefun {mp_bitcnt_t} mpf_class::get_prec ()
6797@deftypefunx void mpf_class::set_prec (mp_bitcnt_t @var{prec})
6798@deftypefunx void mpf_class::set_prec_raw (mp_bitcnt_t @var{prec})
6799Get or set the current precision of an @code{mpf_class}.
6800
6801The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
6802Floats}) apply to @code{mpf_class::set_prec_raw}.  Note in particular that the
@code{mpf_class} must be restored to its allocated precision before being
destroyed.  This must be done by application code; there's no automatic
mechanism for it.
6806@end deftypefun
6807
6808
6809@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface Floats, C++ Class Interface
6810@section C++ Interface Random Numbers
6811
6812@deftp Class gmp_randclass
6813The C++ class interface to the GMP random number functions uses
6814@code{gmp_randclass} to hold an algorithm selection and current state, as per
6815@code{gmp_randstate_t}.
6816@end deftp
6817
6818@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, @dots{}), @dots{})
6819Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
6820function (@pxref{Random State Initialization}).  The arguments expected are
6821the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
6822For example,
6823
6824@example
6825gmp_randclass r1 (gmp_randinit_default);
6826gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
6827gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
6828gmp_randclass r4 (gmp_randinit_mt);
6829@end example
6830
@code{gmp_randinit_lc_2exp_size} will fail if the size requested is too big;
6832an @code{std::length_error} exception is thrown in that case.
6833@end deftypefun
6834
6835@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, @dots{})
6836Construct a @code{gmp_randclass} using the same parameters as
6837@code{gmp_randinit} (@pxref{Random State Initialization}).  This function is
6838obsolete and the above @var{randinit} style should be preferred.
6839@end deftypefun
6840
6841@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
6842@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
Seed a random number generator.  @xref{Random Number Functions}, for how
to choose a good seed.
6845@end deftypefun
6846
6847@deftypefun mpz_class gmp_randclass::get_z_bits (unsigned long @var{bits})
6848@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
6849Generate a random integer with a specified number of bits.
6850@end deftypefun
6851
6852@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
6853Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
6854@end deftypefun
6855
6856@deftypefun mpf_class gmp_randclass::get_f ()
6857@deftypefunx mpf_class gmp_randclass::get_f (unsigned long @var{prec})
Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}.  @var{f}
will have @var{prec} bits of precision, or if @var{prec} is not given then
the precision of the destination.  For example,
6861
6862@example
gmp_randclass  r (gmp_randinit_default);
6864...
6865mpf_class  f (0, 512);   // 512 bits precision
6866f = r.get_f();           // random number, 512 bits
6867@end example
6868@end deftypefun
6869
6870
6871
6872@node C++ Interface Limitations,  , C++ Interface Random Numbers, C++ Class Interface
6873@section C++ Interface Limitations
6874
6875@table @asis
6876@item @code{mpq_class} and Templated Reading
6877A generic piece of template code probably won't know that @code{mpq_class}
6878requires a @code{canonicalize} call if inputs read with @code{operator>>}
6879might be non-canonical.  This can lead to incorrect results.
6880
6881@code{operator>>} behaves as it does for reasons of efficiency.  A
6882canonicalize can be quite time consuming on large operands, and is best
6883avoided if it's not necessary.
6884
6885But this potential difficulty reduces the usefulness of @code{mpq_class}.
6886Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
6887the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
6888pressed into service.  Or maybe, at the risk of inconsistency, the
6889@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
6890@code{operator>>} not doing so, for use on those occasions when that's
6891acceptable.  Send feedback or alternate ideas to @email{gmp-bugs@@gmplib.org}.
6892
6893@item Subclassing
6894Subclassing the GMP C++ classes works, but is not currently recommended.
6895
6896Expressions involving subclasses resolve correctly (or seem to), but in normal
6897C++ fashion the subclass doesn't inherit constructors and assignments.
There are many of those in the GMP classes, and a good way to reestablish them
6899in a subclass is not yet provided.
6900
6901@item Templated Expressions
6902A subtle difficulty exists when using expressions together with
6903application-defined template functions.  Consider the following, with @code{T}
6904intended to be some numeric type,
6905
6906@example
6907template <class T>
6908T fun (const T &, const T &);
6909@end example
6910
6911@noindent
6912When used with, say, plain @code{mpz_class} variables, it works fine: @code{T}
6913is resolved as @code{mpz_class}.
6914
6915@example
6916mpz_class f(1), g(2);
6917fun (f, g);    // Good
6918@end example
6919
6920@noindent
6921But when one of the arguments is an expression, it doesn't work.
6922
6923@example
6924mpz_class f(1), g(2), h(3);
6925fun (f, g+h);  // Bad
6926@end example
6927
6928This is because @code{g+h} ends up being a certain expression template type
6929internal to @code{gmpxx.h}, which the C++ template resolution rules are unable
6930to automatically convert to @code{mpz_class}.  The workaround is simply to add
6931an explicit cast.
6932
6933@example
6934mpz_class f(1), g(2), h(3);
6935fun (f, mpz_class(g+h));  // Good
6936@end example
6937
6938Similarly, within @code{fun} it may be necessary to cast an expression to type
6939@code{T} when calling a templated @code{fun2}.
6940
6941@example
6942template <class T>
6943void fun (T f, T g)
6944@{
6945  fun2 (f, f+g);     // Bad
6946@}
6947
6948template <class T>
6949void fun (T f, T g)
6950@{
6951  fun2 (f, T(f+g));  // Good
6952@}
6953@end example
6954@end table
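
The deduction failure described under ``Templated Expressions'' does not
depend on GMP itself.  The following sketch reproduces it with two toy types,
@code{Num} and @code{SumExpr}.  Both are stand-ins invented for this example
and are far simpler than the real expression templates in @file{gmpxx.h}; the
function @code{example} is likewise hypothetical.

@verbatim
```cpp
#include <cassert>

struct Num;

// Unevaluated sum, playing the role of the internal type of g+h.
struct SumExpr
{
  const Num &lhs;
  const Num &rhs;
};

// Toy number type, playing the role of mpz_class.
struct Num
{
  long value;
  explicit Num (long v) : value (v) {}
  Num (const SumExpr &e);            // evaluates the expression
};

Num::Num (const SumExpr &e) : value (e.lhs.value + e.rhs.value) {}

// Addition builds an expression object instead of a Num.
SumExpr operator+ (const Num &a, const Num &b)
{
  return SumExpr{ a, b };
}

// Same shape as fun in the text: both parameters must deduce the same T.
template <class T>
T fun (const T &a, const T &b)
{
  return (a.value < b.value) ? b : a;   // returns the larger argument
}

long example ()
{
  Num f (1), g (2), h (3);
  // fun (f, g + h);   // rejected: T is deduced as both Num and SumExpr
  Num r = fun (f, Num (g + h));   // explicit conversion, so T is Num
  return r.value;
}
```
@end verbatim

The commented-out call fails because template argument deduction does not
consider the conversion from @code{SumExpr} to @code{Num}, even though that
conversion exists.  Converting explicitly at the call site gives both
arguments the same type.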
6955
6956
6957@node BSD Compatible Functions, Custom Allocation, C++ Class Interface, Top
6958@comment  node-name,  next,  previous,  up
6959@chapter Berkeley MP Compatible Functions
6960@cindex Berkeley MP compatible functions
6961@cindex BSD MP compatible functions
6962
6963These functions are intended to be fully compatible with the Berkeley MP
6964library which is available on many BSD derived U*ix systems.  The
6965@samp{--enable-mpbsd} option must be used when building GNU MP to make these
6966available (@pxref{Installing GMP}).
6967
6968The original Berkeley MP library has a usage restriction: you cannot use the
6969same variable as both source and destination in a single function call.  The
6970compatible functions in GNU MP do not share this restriction---inputs and
6971outputs may overlap.
6972
It is not recommended that new programs be written using these functions.
6974Apart from the incomplete set of functions, the interface for initializing
6975@code{MINT} objects is more error prone, and the @code{pow} function collides
6976with @code{pow} in @file{libm.a}.
6977
6978@cindex @code{mp.h}
6979@tindex MINT
6980Include the header @file{mp.h} to get the definition of the necessary types and
6981functions.  If you are on a BSD derived system, make sure to include GNU
6982@file{mp.h} if you are going to link the GNU @file{libmp.a} to your program.
6983This means that you probably need to give the @samp{-I<dir>} option to the
6984compiler, where @samp{<dir>} is the directory where you have GNU @file{mp.h}.
6985
6986@deftypefun {MINT *} itom (signed short int @var{initial_value})
6987Allocate an integer consisting of a @code{MINT} object and dynamic limb space.
6988Initialize the integer to @var{initial_value}.  Return a pointer to the
6989@code{MINT} object.
6990@end deftypefun
6991
6992@deftypefun {MINT *} xtom (char *@var{initial_value})
6993Allocate an integer consisting of a @code{MINT} object and dynamic limb space.
6994Initialize the integer from @var{initial_value}, a hexadecimal,
6995null-terminated C string.  Return a pointer to the @code{MINT} object.
6996@end deftypefun
6997
6998@deftypefun void move (MINT *@var{src}, MINT *@var{dest})
6999Set @var{dest} to @var{src} by copying.  Both variables must be previously
7000initialized.
7001@end deftypefun
7002
7003@deftypefun void madd (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
7004Add @var{src_1} and @var{src_2} and put the sum in @var{destination}.
7005@end deftypefun
7006
7007@deftypefun void msub (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
7008Subtract @var{src_2} from @var{src_1} and put the difference in
7009@var{destination}.
7010@end deftypefun
7011
7012@deftypefun void mult (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination})
7013Multiply @var{src_1} and @var{src_2} and put the product in @var{destination}.
7014@end deftypefun
7015
7016@deftypefun void mdiv (MINT *@var{dividend}, MINT *@var{divisor}, MINT *@var{quotient}, MINT *@var{remainder})
7017@deftypefunx void sdiv (MINT *@var{dividend}, signed short int @var{divisor}, MINT *@var{quotient}, signed short int *@var{remainder})
7018Set @var{quotient} to @var{dividend}/@var{divisor}, and @var{remainder} to
7019@var{dividend} mod @var{divisor}.  The quotient is rounded towards zero; the
7020remainder has the same sign as the dividend unless it is zero.
7021
7022Some implementations of these functions work differently---or not at all---for
7023negative arguments.
7024@end deftypefun
7025
7026@deftypefun void msqrt (MINT *@var{op}, MINT *@var{root}, MINT *@var{remainder})
7027Set @var{root} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
7028of the square root of @var{op}}, like @code{mpz_sqrt}.  Set @var{remainder} to
7029@m{(@var{op} - @var{root}^2), @var{op}@minus{}@var{root}*@var{root}}, i.e.
7030zero if @var{op} is a perfect square.
7031
7032If @var{root} and @var{remainder} are the same variable, the results are
7033undefined.
7034@end deftypefun
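
The relationship between @var{op}, @var{root} and @var{remainder} can be
checked on ordinary machine integers.  This sketch uses @code{unsigned long}
rather than @code{MINT}.  @code{SqrtRem} and @code{sqrt_rem} are hypothetical
names, and the binary search is chosen for clarity, not because GMP computes
square roots this way.

@verbatim
```cpp
#include <cassert>

// Result pair (hypothetical helper type).
struct SqrtRem { unsigned long root; unsigned long rem; };

// Return the floor of the square root of op, and op minus its square.
SqrtRem sqrt_rem (unsigned long op)
{
  unsigned long lo = 0, hi = op;   // the root always lies in [lo, hi]
  while (lo < hi)
    {
      unsigned long mid = hi - (hi - lo) / 2;   // upper midpoint, mid >= 1
      if (mid <= op / mid)    // same test as mid*mid <= op, without overflow
        lo = mid;
      else
        hi = mid - 1;
    }
  SqrtRem result = { lo, op - lo * lo };
  assert (result.rem <= 2 * result.root);   // holds for a floor square root
  return result;
}
```
@end verbatim

For instance 17 gives a root of 4 and a remainder of 1, while the perfect
square 16 gives a root of 4 and a remainder of zero.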
7035
7036@deftypefun void pow (MINT *@var{base}, MINT *@var{exp}, MINT *@var{mod}, MINT *@var{dest})
7037Set @var{dest} to (@var{base} raised to @var{exp}) modulo @var{mod}.
7038
7039Note that the name @code{pow} clashes with @code{pow} from the standard C math
7040library (@pxref{Exponents and Logarithms,, Exponentiation and Logarithms,
7041libc, The GNU C Library Reference Manual}).  An application will only be able
7042to use one or the other.
7043@end deftypefun
7044
7045@deftypefun void rpow (MINT *@var{base}, signed short int @var{exp}, MINT *@var{dest})
7046Set @var{dest} to @var{base} raised to @var{exp}.
7047@end deftypefun
7048
7049@deftypefun void gcd (MINT *@var{op1}, MINT *@var{op2}, MINT *@var{res})
7050Set @var{res} to the greatest common divisor of @var{op1} and @var{op2}.
7051@end deftypefun
7052
7053@deftypefun int mcmp (MINT *@var{op1}, MINT *@var{op2})
7054Compare @var{op1} and @var{op2}.  Return a positive value if @var{op1} >
7055@var{op2}, zero if @var{op1} = @var{op2}, and a negative value if @var{op1} <
7056@var{op2}.
7057@end deftypefun
7058
7059@deftypefun void min (MINT *@var{dest})
7060Input a decimal string from @code{stdin}, and put the read integer in
7061@var{dest}.  SPC and TAB are allowed in the number string, and are ignored.
7062@end deftypefun
7063
7064@deftypefun void mout (MINT *@var{src})
7065Output @var{src} to @code{stdout}, as a decimal string.  Also output a newline.
7066@end deftypefun
7067
7068@deftypefun {char *} mtox (MINT *@var{op})
7069Convert @var{op} to a hexadecimal string, and return a pointer to the string.
7070The returned string is allocated using the default memory allocation function,
7071@code{malloc} by default.  It will be @code{strlen(str)+1} bytes, that being
7072exactly enough for the string and null-terminator.
7073@end deftypefun
7074
7075@deftypefun void mfree (MINT *@var{op})
De-allocate the space used by @var{op}.  @strong{This function should only be
7077passed a value returned by @code{itom} or @code{xtom}.}
7078@end deftypefun
7079
7080
7081@node Custom Allocation, Language Bindings, BSD Compatible Functions, Top
7082@comment  node-name,  next,  previous,  up
7083@chapter Custom Allocation
7084@cindex Custom allocation
7085@cindex Memory allocation
7086@cindex Allocation of memory
7087
7088By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory
7089allocation, and if they fail GMP prints a message to the standard error output
7090and terminates the program.
7091
7092Alternate functions can be specified, to allocate memory in a different way or
7093to have a different error action on running out of memory.
7094
7095This feature is available in the Berkeley compatibility library (@pxref{BSD
7096Compatible Functions}) as well as the main GMP library.
7097
7098@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t))
Replace the current allocation functions with those given by the arguments.  If an argument
7100is @code{NULL}, the corresponding default function is used.
7101
7102These functions will be used for all memory allocation done by GMP, apart from
7103temporary space from @code{alloca} if that function is available and GMP is
7104configured to use it (@pxref{Build Options}).
7105
7106@strong{Be sure to call @code{mp_set_memory_functions} only when there are no
7107active GMP objects allocated using the previous memory functions!  Usually
7108that means calling it before any other GMP function.}
7109@end deftypefun
7110
7111The functions supplied should fit the following declarations:
7112
7113@deftypevr Function {void *} allocate_function (size_t @var{alloc_size})
7114Return a pointer to newly allocated space with at least @var{alloc_size}
7115bytes.
7116@end deftypevr
7117
7118@deftypevr Function {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size})
7119Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be
7120@var{new_size} bytes.
7121
7122The block may be moved if necessary or if desired, and in that case the
7123smaller of @var{old_size} and @var{new_size} bytes must be copied to the new
7124location.  The return value is a pointer to the resized block, that being the
7125new location if moved or just @var{ptr} if not.
7126
@var{ptr} is never @code{NULL}; it's always a previously allocated block.
7128@var{new_size} may be bigger or smaller than @var{old_size}.
7129@end deftypevr
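
To illustrate the copying rule, here is a sketch of a
@var{reallocate_function} that always moves the block.  The name
@code{move_realloc} is hypothetical, and building it on @code{malloc} and
@code{free} is only an assumption for the example; an allocator with its own
pool would substitute its own calls.  This is one case where @var{old_size} is
genuinely needed, since the function must know how many bytes to copy.

@verbatim
```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Resize by allocating a fresh block, copying, and releasing the old one.
void *move_realloc (void *ptr, std::size_t old_size, std::size_t new_size)
{
  assert (ptr != NULL);   // GMP never passes a NULL block

  // malloc (0) may return NULL legitimately, so ask for at least one byte.
  void *fresh = std::malloc (new_size != 0 ? new_size : 1);
  if (fresh == NULL)
    std::abort ();        // no error return is allowed

  // Copy the smaller of the two sizes, as the interface requires.
  std::size_t keep = old_size < new_size ? old_size : new_size;
  std::memcpy (fresh, ptr, keep);

  std::free (ptr);
  return fresh;
}
```
@end verbatim

Whether the block shrinks or grows, the leading bytes that fit in both sizes
are preserved, and the pointer returned is the new location.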
7130
7131@deftypevr Function void free_function (void *@var{ptr}, size_t @var{size})
7132De-allocate the space pointed to by @var{ptr}.
7133
@var{ptr} is never @code{NULL}; it's always a previously allocated block of
7135@var{size} bytes.
7136@end deftypevr
7137
7138A @dfn{byte} here means the unit used by the @code{sizeof} operator.
7139
The @var{old_size} parameter to @var{reallocate_function} and the @var{size}
parameter to @var{free_function} are passed for convenience, but can be
ignored if not needed.  The default functions using @code{malloc} and friends,
for instance, don't use them.
7144
No error return is allowed from any of these functions; if they return then
7146they must have performed the specified operation.  In particular note that
7147@var{allocate_function} or @var{reallocate_function} mustn't return
7148@code{NULL}.
7149
7150Getting a different fatal error action is a good use for custom allocation
7151functions, for example giving a graphical dialog rather than the default print
7152to @code{stderr}.  How much is possible when genuinely out of memory is
7153another question though.
7154
7155There's currently no defined way for the allocation functions to recover from
an error such as out of memory; they must terminate program execution.  A
7157@code{longjmp} or throwing a C++ exception will have undefined results.  This
7158may change in the future.
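
As a sketch of a custom fatal error action, the functions below wrap
@code{malloc}, @code{realloc} and @code{free} and report failure through a
hook before terminating.  The names @code{my_alloc}, @code{my_realloc},
@code{my_free} and @code{out_of_memory} are hypothetical, and the hook just
prints a message, standing in for whatever an application would really do.
The registration call is shown only as a comment so that the block compiles
without GMP.

@verbatim
```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical fatal error hook.  It must not return to the caller.
static void out_of_memory (std::size_t size)
{
  std::fprintf (stderr, "fatal: cannot allocate %lu bytes\n",
                (unsigned long) size);
  std::abort ();
}

void *my_alloc (std::size_t alloc_size)
{
  // malloc (0) may return NULL legitimately, so ask for at least one byte.
  void *p = std::malloc (alloc_size != 0 ? alloc_size : 1);
  if (p == NULL)
    out_of_memory (alloc_size);
  return p;
}

void *my_realloc (void *ptr, std::size_t old_size, std::size_t new_size)
{
  assert (ptr != NULL);   // GMP never passes a NULL block
  (void) old_size;        // not needed when realloc does the work
  void *p = std::realloc (ptr, new_size != 0 ? new_size : 1);
  if (p == NULL)
    out_of_memory (new_size);
  return p;
}

void my_free (void *ptr, std::size_t size)
{
  (void) size;            // not needed by free
  std::free (ptr);
}

// With GMP available, install these before any other GMP call:
//   mp_set_memory_functions (my_alloc, my_realloc, my_free);
```
@end verbatim

In keeping with the rule above, the hook never returns.  It calls
@code{abort} rather than attempting a @code{longjmp} or throwing an exception.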
7159
7160GMP may use allocated blocks to hold pointers to other allocated blocks.  This
7161will limit the assumptions a conservative garbage collection scheme can make.
7162
7163Since the default GMP allocation uses @code{malloc} and friends, those
7164functions will be linked in even if the first thing a program does is an
7165@code{mp_set_memory_functions}.  It's necessary to change the GMP sources if
7166this is a problem.
7167
7168@sp 1
7169@deftypefun void mp_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (**@var{free_func_ptr}) (void *, size_t))
7170Get the current allocation functions, storing function pointers to the
7171locations given by the arguments.  If an argument is @code{NULL}, that
7172function pointer is not stored.
7173
7174@need 1000
7175For example, to get just the current free function,
7176
7177@example
7178void (*freefunc) (void *, size_t);
7179
7180mp_get_memory_functions (NULL, NULL, &freefunc);
7181@end example
7182@end deftypefun
7183
7184@node Language Bindings, Algorithms, Custom Allocation, Top
7185@chapter Language Bindings
7186@cindex Language bindings
7187@cindex Other languages
7188
7189The following packages and projects offer access to GMP from languages other
7190than C, though perhaps with varying levels of functionality and efficiency.
7191
7192@c  @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces
7193@c  in tex, just to separate the URL from the preceding text a bit.
7194@iftex
7195@macro spaceuref {U}
7196@ @ @uref{\U\}
7197@end macro
7198@end iftex
7199@ifnottex
7200@macro spaceuref {U}
7201@uref{\U\}
7202@end macro
7203@end ifnottex
7204
7205@sp 1
7206@table @asis
7207@item C++
7208@itemize @bullet
7209@item
7210GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward
7211interface, expression templates to eliminate temporaries.
7212@item
7213ALP @spaceuref{http://www-sop.inria.fr/saga/logiciels/ALP/} @* Linear algebra and
7214polynomials using templates.
7215@item
7216Arithmos @spaceuref{http://www.win.ua.ac.be/~cant/arithmos/} @* Rationals
7217with infinities and square roots.
7218@item
7219CLN @spaceuref{http://www.ginac.de/CLN/} @* High level classes for arithmetic.
7220@item
7221LiDIA @spaceuref{http://www.cdc.informatik.tu-darmstadt.de/TI/LiDIA/} @* A C++
7222library for computational number theory.
7223@item
7224Linbox @spaceuref{http://www.linalg.org/} @* Sparse vectors and matrices.
7225@item
7226NTL @spaceuref{http://www.shoup.net/ntl/} @* A C++ number theory library.
7227@end itemize
7228
7229@c @item D
7230@c @itemize @bullet
7231@c @item
7232@c gmp-d @spaceuref{http://home.comcast.net/~benhinkle/gmp-d/}
7233@c @end itemize
7234
7235@item Eiffel
7236@itemize @bullet
7237@item
7238Eiffelroom @spaceuref{http://www.eiffelroom.org/node/442}
7239@end itemize
7240
7241@item Fortran
7242@itemize @bullet
7243@item
7244Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary
7245precision floats.
7246@end itemize
7247
7248@item Haskell
7249@itemize @bullet
7250@item
7251Glasgow Haskell Compiler @spaceuref{http://www.haskell.org/ghc/}
7252@end itemize
7253
7254@item Java
7255@itemize @bullet
7256@item
7257Kaffe @spaceuref{http://www.kaffe.org/}
7258@item
7259Kissme @spaceuref{http://kissme.sourceforge.net/}
7260@end itemize
7261
7262@item Lisp
7263@itemize @bullet
7264@item
7265GNU Common Lisp @spaceuref{http://www.gnu.org/software/gcl/gcl.html}
7266@item
7267Librep @spaceuref{http://librep.sourceforge.net/}
7268@item
7269@c  FIXME: When there's a stable release with gmp support, just refer to it
7270@c  rather than bothering to talk about betas.
7271XEmacs (21.5.18 beta and up) @spaceuref{http://www.xemacs.org} @* Optional
7272big integers, rationals and floats using GMP.
7273@end itemize
7274
7275@item M4
7276@itemize @bullet
7277@item
7278@c  FIXME: When there's a stable release with gmp support, just refer to it
7279@c  rather than bothering to talk about betas.
7280GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu/} @* Optionally provides
7281an arbitrary precision @code{mpeval}.
7282@end itemize
7283
7284@item ML
7285@itemize @bullet
7286@item
7287MLton compiler @spaceuref{http://mlton.org/}
7288@end itemize
7289
7290@item Objective Caml
7291@itemize @bullet
7292@item
7293MLGMP @spaceuref{http://www.di.ens.fr/~monniaux/programmes.html.en}
7294@item
7295Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using
7296GMP.
7297@end itemize
7298
7299@item Oz
7300@itemize @bullet
7301@item
7302Mozart @spaceuref{http://www.mozart-oz.org/}
7303@end itemize
7304
7305@item Pascal
7306@itemize @bullet
7307@item
7308GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de/} @* GMP unit.
7309@item
7310Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal,
7311optionally using GMP.
7312@end itemize
7313
7314@item Perl
7315@itemize @bullet
7316@item
7317GMP module, see @file{demos/perl} in the GMP sources (@pxref{Demonstration
7318Programs}).
7319@item
7320Math::GMP @spaceuref{http://www.cpan.org/} @* Compatible with Math::BigInt, but
7321not as many functions as the GMP module above.
7322@item
7323Math::BigInt::GMP @spaceuref{http://www.cpan.org/} @* Plug Math::GMP into
7324normal Math::BigInt operations.
7325@end itemize
7326
7327@need 1000
7328@item Pike
7329@itemize @bullet
7330@item
7331mpz module in the standard distribution, @uref{http://pike.ida.liu.se/}
7332@end itemize
7333
7334@need 500
7335@item Prolog
7336@itemize @bullet
7337@item
7338SWI Prolog @spaceuref{http://www.swi-prolog.org/} @*
7339Arbitrary precision floats.
7340@end itemize
7341
7342@item Python
7343@itemize @bullet
7344@item
7345GMPY @uref{http://code.google.com/p/gmpy/}
7346@end itemize
7347
7348@item Ruby
7349@itemize @bullet
7350@item
@uref{http://rubygems.org/gems/gmp}
7352@end itemize
7353
7354@item Scheme
7355@itemize @bullet
7356@item
7357GNU Guile (upcoming 1.8) @spaceuref{http://www.gnu.org/software/guile/guile.html}
7358@item
7359RScheme @spaceuref{http://www.rscheme.org/}
7360@item
7361STklos @spaceuref{http://www.stklos.org/}
7362@c
7363@c  For reference, MzScheme uses some of gmp, but (as of version 205) it only
7364@c  has copies of some of the generic C code, and we don't consider that a
7365@c  language binding to gmp.
7366@c
7367@end itemize
7368
7369@item Smalltalk
7370@itemize @bullet
7371@item
7372GNU Smalltalk @spaceuref{http://www.smalltalk.org/versions/GNUSmalltalk.html}
7373@end itemize
7374
7375@item Other
7376@itemize @bullet
7377@item
7378Axiom @uref{http://savannah.nongnu.org/projects/axiom} @* Computer algebra
7379using GCL.
7380@item
7381DrGenius @spaceuref{http://drgenius.seul.org/} @* Geometry system and
7382mathematical programming language.
7383@item
7384GiNaC @spaceuref{http://www.ginac.de/} @* C++ computer algebra using CLN.
7385@item
7386GOO @spaceuref{http://www.googoogaga.org/} @* Dynamic object oriented
7387language.
7388@item
7389Maxima @uref{http://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
7390computer algebra using GCL.
7391@item
7392Q @spaceuref{http://q-lang.sourceforge.net/} @* Equational programming system.
7393@item
7394Regina @spaceuref{http://regina.sourceforge.net/} @* Topological calculator.
7395@item
7396Yacas @spaceuref{http://www.xs4all.nl/~apinkus/yacas.html} @* Yet another
7397computer algebra system.
7398@end itemize
7399
7400@end table
7401
7402
7403@node Algorithms, Internals, Language Bindings, Top
7404@chapter Algorithms
7405@cindex Algorithms
7406
7407This chapter is an introduction to some of the algorithms used for various GMP
7408operations.  The code is likely to be hard to understand without knowing
7409something about the algorithms.
7410
7411Some GMP internals are mentioned, but applications that expect to be
7412compatible with future GMP releases should take care to use only the
7413documented functions.
7414
7415@menu
7416* Multiplication Algorithms::
7417* Division Algorithms::
7418* Greatest Common Divisor Algorithms::
7419* Powering Algorithms::
7420* Root Extraction Algorithms::
7421* Radix Conversion Algorithms::
7422* Other Algorithms::
7423* Assembly Coding::
7424@end menu
7425
7426
7427@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms
7428@section Multiplication
7429@cindex Multiplication algorithms
7430
7431N@cross{}N limb multiplications and squares are done using one of five
7432algorithms, as the size N increases.
7433
7434@quotation
7435@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
7436@item Algorithm @tab Threshold
7437@item Basecase  @tab (none)
7438@item Karatsuba @tab @code{MUL_TOOM22_THRESHOLD}
7439@item Toom-3    @tab @code{MUL_TOOM33_THRESHOLD}
7440@item Toom-4    @tab @code{MUL_TOOM44_THRESHOLD}
7441@item FFT       @tab @code{MUL_FFT_THRESHOLD}
7442@end multitable
7443@end quotation
7444
7445Similarly for squaring, with the @code{SQR} thresholds.
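
As a rough sketch of how such a threshold table selects an algorithm, the
following C fragment maps an operand size in limbs to an algorithm.  The
function @code{demo_choose_mul}, the enumeration and all the threshold values
are hypothetical placeholders for illustration, not GMP code and not GMP's
tuned values, which are CPU dependent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thresholds, in limbs.  Real values are CPU dependent. */
enum
{
  DEMO_TOOM22_THRESHOLD = 20,
  DEMO_TOOM33_THRESHOLD = 80,
  DEMO_TOOM44_THRESHOLD = 200,
  DEMO_FFT_THRESHOLD = 5000
};

enum demo_alg
{
  DEMO_BASECASE, DEMO_KARATSUBA, DEMO_TOOM3, DEMO_TOOM4, DEMO_FFT
};

/* Pick the algorithm for an N x N limb multiply: each algorithm takes
   over once N reaches its threshold. */
enum demo_alg
demo_choose_mul (size_t n)
{
  assert (n >= 1);

  if (n < DEMO_TOOM22_THRESHOLD)
    return DEMO_BASECASE;
  if (n < DEMO_TOOM33_THRESHOLD)
    return DEMO_KARATSUBA;
  if (n < DEMO_TOOM44_THRESHOLD)
    return DEMO_TOOM3;
  if (n < DEMO_FFT_THRESHOLD)
    return DEMO_TOOM4;
  return DEMO_FFT;
}
```

With these placeholder values a 100 limb product would be classed as Toom-3.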
7446
7447N@cross{}M multiplications of operands with different sizes above
7448@code{MUL_TOOM22_THRESHOLD} are currently done by special Toom-inspired
7449algorithms or directly with FFT, depending on operand size (@pxref{Unbalanced
7450Multiplication}).
7451
7452@menu
7453* Basecase Multiplication::
7454* Karatsuba Multiplication::
7455* Toom 3-Way Multiplication::
7456* Toom 4-Way Multiplication::
7457* FFT Multiplication::
7458* Other Multiplication::
7459* Unbalanced Multiplication::
7460@end menu
7461
7462
7463@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms
7464@subsection Basecase Multiplication
7465
7466Basecase N@cross{}M multiplication is a straightforward rectangular set of
7467cross-products, the same as long multiplication done by hand and for that
7468reason sometimes known as the schoolbook or grammar school method.  This is an
7469@m{O(NM),O(N*M)} algorithm.  See Knuth section 4.3.1 algorithm M
7470(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code.
7471
7472Assembly implementations of @code{mpn_mul_basecase} are essentially the same
7473as the generic C code, but have all the usual assembly tricks and
7474obscurities introduced for speed.
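
The rectangular set of cross-products can be sketched in a few lines of C.
This is a minimal illustration, assuming 32-bit limbs stored least significant
first so that a 64-bit type can hold a cross-product plus carries.  The name
@code{demo_mul_basecase} is hypothetical, and this is not the code in
@file{mpn/generic/mul_basecase.c}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Schoolbook multiply: rp[0..un+vn-1] = up[0..un-1] * vp[0..vn-1].
   Limbs are 32 bits, least significant first.  rp must not overlap
   either input. */
void
demo_mul_basecase (uint32_t *rp, const uint32_t *up, size_t un,
                   const uint32_t *vp, size_t vn)
{
  assert (un >= 1 && vn >= 1);

  for (size_t i = 0; i < un + vn; i++)
    rp[i] = 0;

  /* One row of cross-products per limb of v, accumulated into rp. */
  for (size_t j = 0; j < vn; j++)
    {
      uint64_t carry = 0;
      for (size_t i = 0; i < un; i++)
        {
          /* Product plus two 32-bit addends cannot overflow 64 bits. */
          uint64_t t = (uint64_t) up[i] * vp[j] + rp[i + j] + carry;
          rp[i + j] = (uint32_t) t;
          carry = t >> 32;
        }
      rp[j + un] = (uint32_t) carry;
    }
}
```

The two nested loops visit each of the N*M cross-products exactly once, which
is where the @m{O(NM),O(N*M)} cost comes from.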
7475
7476A square can be done in roughly half the time of a multiply, by using the fact
7477that the cross products above and below the diagonal are the same.  A triangle
7478of products below the diagonal is formed, doubled (left shift by one bit), and
7479then the products on the diagonal added.  This can be seen in
7480@file{mpn/generic/sqr_basecase.c}.  Again the assembly implementations take
7481essentially the same approach.
7482
7483@tex
7484\def\GMPline#1#2#3#4#5#6{%
7485  \hbox {%
7486    \vrule height 2.5ex depth 1ex
7487           \hbox to 2em {\hfil{#2}\hfil}%
7488    \vrule \hbox to 2em {\hfil{#3}\hfil}%
7489    \vrule \hbox to 2em {\hfil{#4}\hfil}%
7490    \vrule \hbox to 2em {\hfil{#5}\hfil}%
7491    \vrule \hbox to 2em {\hfil{#6}\hfil}%
7492    \vrule}}
7493\GMPdisplay{
7494  \hbox{%
7495    \vbox{%
7496      \hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}%
7497      \hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}%
7498      \hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}%
7499      \hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}%
7500      \hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}%
7501      \hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}%
7502      \vfill}%
7503    \vbox{%
7504      \hbox{%
7505        \hbox to 2em {\hfil u0\hfil}%
7506        \hbox to 2em {\hfil u1\hfil}%
7507        \hbox to 2em {\hfil u2\hfil}%
7508        \hbox to 2em {\hfil u3\hfil}%
7509        \hbox to 2em {\hfil u4\hfil}}%
7510      \vskip 0.7ex
7511      \hrule
7512      \GMPline{u0}{d}{}{}{}{}%
7513      \hrule
7514      \GMPline{u1}{}{d}{}{}{}%
7515      \hrule
7516      \GMPline{u2}{}{}{d}{}{}%
7517      \hrule
7518      \GMPline{u3}{}{}{}{d}{}%
7519      \hrule
7520      \GMPline{u4}{}{}{}{}{d}%
7521      \hrule}}}
7522@end tex
7523@ifnottex
7524@example
7525@group
7526     u0  u1  u2  u3  u4
7527   +---+---+---+---+---+
7528u0 | d |   |   |   |   |
7529   +---+---+---+---+---+
7530u1 |   | d |   |   |   |
7531   +---+---+---+---+---+
7532u2 |   |   | d |   |   |
7533   +---+---+---+---+---+
7534u3 |   |   |   | d |   |
7535   +---+---+---+---+---+
7536u4 |   |   |   |   | d |
7537   +---+---+---+---+---+
7538@end group
7539@end example
7540@end ifnottex
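
The triangle, doubling and diagonal steps can be sketched as follows, again
assuming 32-bit limbs stored least significant first.  The name
@code{demo_sqr_basecase} is hypothetical and the code is only an illustration
of the idea, not the contents of @file{mpn/generic/sqr_basecase.c}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Square by the triangle method: rp[0..2n-1] = up[0..n-1]^2.
   Limbs are 32 bits, least significant first.  rp must not overlap up. */
void
demo_sqr_basecase (uint32_t *rp, const uint32_t *up, size_t n)
{
  assert (n >= 1);

  for (size_t i = 0; i < 2 * n; i++)
    rp[i] = 0;

  /* Step 1: off-diagonal products u[i]*u[j] for i < j, each formed once. */
  for (size_t j = 1; j < n; j++)
    {
      uint64_t carry = 0;
      for (size_t i = 0; i < j; i++)
        {
          uint64_t t = (uint64_t) up[i] * up[j] + rp[i + j] + carry;
          rp[i + j] = (uint32_t) t;
          carry = t >> 32;
        }
      rp[2 * j] = (uint32_t) carry;
    }

  /* Step 2: double the triangle with a one bit left shift. */
  uint32_t top = 0;
  for (size_t i = 0; i < 2 * n; i++)
    {
      uint32_t limb = rp[i];
      rp[i] = (limb << 1) | top;
      top = limb >> 31;
    }

  /* Step 3: add the diagonal squares u[i]^2 at limb offset 2*i. */
  uint64_t carry = 0;
  for (size_t i = 0; i < n; i++)
    {
      uint64_t sq = (uint64_t) up[i] * up[i];
      uint64_t lo = (uint64_t) rp[2 * i] + (uint32_t) sq + carry;
      rp[2 * i] = (uint32_t) lo;
      uint64_t hi = (uint64_t) rp[2 * i + 1] + (sq >> 32) + (lo >> 32);
      rp[2 * i + 1] = (uint32_t) hi;
      carry = hi >> 32;
    }
}
```

Step 1 forms N*(N-1)/2 products and step 3 a further N, against N*N for a
general multiply of the same size.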
7541
In practice squaring isn't a full 2@cross{} faster than multiplying; it's
usually around 1.5@cross{}.  A speedup of less than 1.5@cross{} probably
indicates @code{mpn_sqr_basecase} wants improving on that CPU.
7545
On some CPUs @code{mpn_mul_basecase} can be faster than the generic C
@code{mpn_sqr_basecase} on some small sizes.  @code{SQR_BASECASE_THRESHOLD} is
the size at which to use @code{mpn_sqr_basecase}; it will be zero if that
routine should always be used.
7550
7551
7552@node Karatsuba Multiplication, Toom 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms
7553@subsection Karatsuba Multiplication
7554@cindex Karatsuba multiplication
7555
7556The Karatsuba multiplication algorithm is described in Knuth section 4.3.3
7557part A, and various other textbooks.  A brief description is given here.
7558
7559The inputs @math{x} and @math{y} are treated as each split into two parts of
7560equal length (or the most significant part one limb shorter if N is odd).
7561
7562@tex
7563% GMPboxwidth used for all the multiplication pictures
7564\global\newdimen\GMPboxwidth \global\GMPboxwidth=5em
7565% GMPboxdepth and GMPboxheight are also used for the float pictures
7566\global\newdimen\GMPboxdepth  \global\GMPboxdepth=1ex
7567\global\newdimen\GMPboxheight \global\GMPboxheight=2ex
7568\gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth}
7569\def\GMPbox#1#2{%
7570  \vbox {%
7571    \hrule
7572    \hbox to 2\GMPboxwidth{%
7573      \GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}%
7574    \hrule}}
7575\GMPdisplay{%
7576\vbox{%
7577  \hbox to 2\GMPboxwidth {high \hfil low}
7578  \vskip 0.7ex
7579  \GMPbox{x_1}{x_0}
7580  \vskip 0.5ex
7581  \GMPbox{y_1}{y_0}
7582}}
7583@end tex
7584@ifnottex
7585@example
7586@group
7587 high              low
7588+----------+----------+
7589|    x1    |    x0    |
7590+----------+----------+
7591
7592+----------+----------+
7593|    y1    |    y0    |
7594+----------+----------+
7595@end group
7596@end example
7597@end ifnottex
7598
7599Let @math{b} be the power of 2 where the split occurs, ie.@: if @ms{x,0} is
7600@math{k} limbs (@ms{y,0} the same) then
7601@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
7602With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the
7603following holds,
7604
7605@display
7606@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0,
7607  x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0}
7608@end display
7609
7610This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs,
7611whereas a basecase multiply of N@cross{}N limbs is equivalent to four
7612multiplies of (N/2)@cross{}(N/2).  The factors @math{(b^2+b)} etc represent
7613the positions where the three products must be added.
7614
7615@tex
7616\def\GMPboxA#1#2{%
7617  \vbox{%
7618    \hrule
7619    \hbox{%
7620      \GMPvrule
7621      \hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}%
7622      \vrule
7623      \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
7624      \vrule}
7625    \hrule}}
7626\def\GMPboxB#1#2{%
7627  \hbox{%
7628    \raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}%
7629    \vbox{%
7630      \hrule
7631      \hbox{%
7632        \GMPvrule
7633        \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
7634        \vrule}%
7635      \hrule}}}
7636\GMPdisplay{%
7637\vbox{%
7638  \hbox to 4\GMPboxwidth {high \hfil low}
7639  \vskip 0.7ex
7640  \GMPboxA{x_1y_1}{x_0y_0}
7641  \vskip 0.5ex
7642  \GMPboxB{$+$}{x_1y_1}
7643  \vskip 0.5ex
7644  \GMPboxB{$+$}{x_0y_0}
7645  \vskip 0.5ex
7646  \GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)}
7647}}
7648@end tex
7649@ifnottex
7650@example
7651@group
7652 high                              low
7653+--------+--------+ +--------+--------+
7654|      x1*y1      | |      x0*y0      |
7655+--------+--------+ +--------+--------+
7656          +--------+--------+
7657      add |      x1*y1      |
7658          +--------+--------+
7659          +--------+--------+
7660      add |      x0*y0      |
7661          +--------+--------+
7662          +--------+--------+
7663      sub | (x1-x0)*(y1-y0) |
7664          +--------+--------+
7665@end group
7666@end example
7667@end ifnottex
7668
7669The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an
7670absolute value, and the sign used to choose to add or subtract.  Notice the
7671sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1),
7672high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb
7673additions, rather than @m{6k,6*k}, but in GMP extra function call overheads
7674outweigh the saving.
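
A single level of the formula can be tried out on machine words.  The sketch
below assumes 32-bit inputs split into 16-bit halves, so @math{b=2^16}, with
ordinary integer multiplies standing in for the three half-size products.  The
name @code{demo_karatsuba32} is hypothetical.  The middle term is formed from
absolute values with the sign tracked separately, as described above.

```c
#include <assert.h>
#include <stdint.h>

/* One level of Karatsuba on 32-bit inputs split into 16-bit halves,
   so b = 2^16.  Returns the full 64-bit product x*y. */
uint64_t
demo_karatsuba32 (uint32_t x, uint32_t y)
{
  uint32_t x1 = x >> 16, x0 = x & 0xFFFF;
  uint32_t y1 = y >> 16, y0 = y & 0xFFFF;

  /* Two of the three half-size products. */
  uint64_t hi = (uint64_t) x1 * y1;
  uint64_t lo = (uint64_t) x0 * y0;

  /* Third product: |x1-x0| * |y1-y0|, with the sign of
     (x1-x0)*(y1-y0) kept separately. */
  uint32_t dx = x1 >= x0 ? x1 - x0 : x0 - x1;
  uint32_t dy = y1 >= y0 ? y1 - y0 : y0 - y1;
  int negative = (x1 >= x0) != (y1 >= y0);
  uint64_t mid = (uint64_t) dx * dy;

  /* x*y = b^2*hi + b*(hi + lo - (x1-x0)*(y1-y0)) + lo */
  uint64_t cross = hi + lo;
  if (negative)
    cross += mid;
  else
    cross -= mid;               /* never underflows: cross = x1*y0 + x0*y1 */

  return (hi << 32) + (cross << 16) + lo;
}
```

A full implementation would apply the same step recursively to multi-limb
halves until the operands drop below the Karatsuba threshold.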
7675
7676Squaring is similar to multiplying, but with @math{x=y} the formula reduces to
7677an equivalent with three squares,
7678
7679@display
7680@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2,
7681   x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2}
7682@end display
7683
The final result is accumulated from those three squares the same way as for
the three multiplies above.  The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now
never negative, so no sign needs tracking.
7687
7688A similar formula for both multiplying and squaring can be constructed with a
7689middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}.  But those sums can exceed
7690@math{k} limbs, leading to more carry handling and additions than the form
7691above.
7692
7693Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm,
7694the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies
7695each @math{1/2} the size of the inputs.  This is a big improvement over the
7696basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra
7697additions Karatsuba performs.  @code{MUL_TOOM22_THRESHOLD} can be as little
7698as 10 limbs.  The @code{SQR} threshold is usually about twice the @code{MUL}.
7699
7700The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c,
7701M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN +
7702e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 +
7703{3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}.  The
7704factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the
7705basecase code will increase the threshold since they benefit @math{M(N)} more
7706than @math{K(N)}.  And conversely the @m{3\over2, 3/2} for @math{b} means
linear style speedups of @math{b} will reduce the threshold since they
7708benefit @math{K(N)} more than @math{M(N)}.  The latter can be seen for
7709instance when adding an optimized @code{mpn_sqr_diagonal} to
7710@code{mpn_sqr_basecase}.  Of course all speedups reduce total time, and in
7711that sense the algorithm thresholds are merely of academic interest.
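
The effect of the coefficients on the crossover can be explored numerically.
The following sketch searches for the smallest @math{N} at which the modelled
@math{K(N)} drops below the modelled @math{M(N)}.  The function name
@code{demo_crossover} and any coefficient values passed to it are hypothetical;
they are not measurements of any real CPU.

```c
#include <assert.h>

/* Smallest N >= 1 at which the modelled Karatsuba time K(N) is below the
   modelled basecase time M(N), or 0 if there is none up to limit.
   Both sides are scaled by 4 to stay in integers:
     4*M(N) = 4aN^2 + 4bN + 4c
     4*K(N) = 3aN^2 + 6bN + 12c + 4dN + 4e  */
long
demo_crossover (long a, long b, long c, long d, long e, long limit)
{
  assert (limit >= 1);

  for (long n = 1; n <= limit; n++)
    {
      long m4 = 4 * a * n * n + 4 * b * n + 4 * c;
      long k4 = 3 * a * n * n + 6 * b * n + 12 * c + 4 * d * n + 4 * e;
      if (k4 < m4)
        return n;
    }
  return 0;
}
```

With placeholder coefficients @math{a=4}, @math{b=10}, @math{c=20},
@math{d=30}, @math{e=40} the crossover is at @math{N=38}.  Halving @math{a}
moves it up to 73, while halving @math{b} instead moves it down to 35.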
7712
7713
7714@node Toom 3-Way Multiplication, Toom 4-Way Multiplication, Karatsuba Multiplication, Multiplication Algorithms
7715@subsection Toom 3-Way Multiplication
7716@cindex Toom multiplication
7717
7718The Karatsuba formula is the simplest case of a general approach to splitting
7719inputs that leads to both Toom and FFT algorithms.  A description of
7720Toom can be found in Knuth section 4.3.3, with an example 3-way
7721calculation after Theorem A@.  The 3-way form used in GMP is described here.
7722
7723The operands are each considered split into 3 pieces of equal length (or the
7724most significant part 1 or 2 limbs shorter than the other two).
7725
7726@tex
7727\def\GMPbox#1#2#3{%
7728  \vbox{%
7729    \hrule \vfil
7730    \hbox to 3\GMPboxwidth {%
7731      \GMPvrule
7732      \hfil$#1$\hfil
7733      \vrule
7734      \hfil$#2$\hfil
7735      \vrule
7736      \hfil$#3$\hfil
7737      \vrule}%
7738    \vfil \hrule
7739}}
7740\GMPdisplay{%
7741\vbox{%
7742  \hbox to 3\GMPboxwidth {high \hfil low}
7743  \vskip 0.7ex
7744  \GMPbox{x_2}{x_1}{x_0}
7745  \vskip 0.5ex
7746  \GMPbox{y_2}{y_1}{y_0}
7747  \vskip 0.5ex
7748}}
7749@end tex
7750@ifnottex
7751@example
7752@group
7753 high                         low
7754+----------+----------+----------+
7755|    x2    |    x1    |    x0    |
7756+----------+----------+----------+
7757
7758+----------+----------+----------+
7759|    y2    |    y1    |    y0    |
7760+----------+----------+----------+
7761@end group
7762@end example
7763@end ifnottex
7764
7765@noindent
7766These parts are treated as the coefficients of two polynomials
7767
7768@display
7769@group
7770@m{X(t) = x_2t^2 + x_1t + x_0,
7771   X(t) = x2*t^2 + x1*t + x0}
7772@m{Y(t) = y_2t^2 + y_1t + y_0,
7773   Y(t) = y2*t^2 + y1*t + y0}
7774@end group
7775@end display
7776
7777Let @math{b} equal the power of 2 which is the size of the @ms{x,0}, @ms{x,1},
7778@ms{y,0} and @ms{y,1} pieces, ie.@: if they're @math{k} limbs each then
7779@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
7780With this @math{x=X(b)} and @math{y=Y(b)}.
7781
7782Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients
7783are
7784
7785@display
7786@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0,
7787   W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0}
7788@end display
7789
7790The @m{w_i,w[i]} are going to be determined, and when they are they'll give
7791the final result using @math{w=W(b)}, since
7792@m{xy=X(b)Y(b),x*y=X(b)*Y(b)=W(b)}.  The coefficients will be roughly
7793@math{b^2} each, and the final @math{W(b)} will be an addition like,
7794
7795@tex
7796\def\GMPbox#1#2{%
7797  \moveright #1\GMPboxwidth
7798  \vbox{%
7799    \hrule
7800    \hbox{%
7801      \GMPvrule
7802      \hbox to 2\GMPboxwidth {\hfil$#2$\hfil}%
7803      \vrule}%
7804    \hrule
7805}}
7806\GMPdisplay{%
7807\vbox{%
7808  \hbox to 6\GMPboxwidth {high \hfil low}%
7809  \vskip 0.7ex
7810  \GMPbox{0}{w_4}
7811  \vskip 0.5ex
7812  \GMPbox{1}{w_3}
7813  \vskip 0.5ex
7814  \GMPbox{2}{w_2}
7815  \vskip 0.5ex
7816  \GMPbox{3}{w_1}
7817  \vskip 0.5ex
7818  \GMPbox{4}{w_0}
7819}}
7820@end tex
7821@ifnottex
7822@example
7823@group
7824 high                                        low
7825+-------+-------+
7826|       w4      |
7827+-------+-------+
7828       +--------+-------+
7829       |        w3      |
7830       +--------+-------+
7831               +--------+-------+
7832               |        w2      |
7833               +--------+-------+
7834                       +--------+-------+
7835                       |        w1      |
7836                       +--------+-------+
7837                                +-------+-------+
7838                                |       w0      |
7839                                +-------+-------+
7840@end group
7841@end example
7842@end ifnottex
7843
7844The @m{w_i,w[i]} coefficients could be formed by a simple set of cross
7845products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2},
7846@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all
7847nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely
7848to a basecase multiply.  Instead the following approach is used.
7849
7850@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving
7851values of @math{W(t)} at those points.  In GMP the following points are used,
7852
7853@quotation
7854@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
7855@item Point                 @tab Value
7856@item @math{t=0}            @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately
7857@item @math{t=1}            @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0) * (y2+y1+y0)}
7858@item @math{t=-1}           @tab @m{(x_2-x_1+x_0)(y_2-y_1+y_0),(x2-x1+x0) * (y2-y1+y0)}
7859@item @math{t=2}            @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0) * (4*y2+2*y1+y0)}
7860@item @m{t=\infty,t=inf}    @tab @m{x_2y_2,x2 * y2}, which gives @ms{w,4} immediately
7861@end multitable
7862@end quotation
7863
7864At @math{t=-1} the values can be negative and that's handled using the
7865absolute values and tracking the sign separately.  At @m{t=\infty,t=inf} the
7866value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over t^4}, X(t)*Y(t)/t^4 in
7867the limit as t approaches infinity}, but it's much easier to think of as
7868simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately (much like
7869@m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately).
7870
7871Each of the points substituted into
7872@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination
7873of the @m{w_i,w[i]} coefficients, and the value of those combinations has just
7874been calculated.
7875
7876@tex
7877\GMPdisplay{%
7878$\matrix{%
7879W(0)      & = &       &   &      &   &      &   &      &   & w_0 \cr
7880W(1)      & = &   w_4 & + &  w_3 & + &  w_2 & + &  w_1 & + & w_0 \cr
7881W(-1)     & = &   w_4 & - &  w_3 & + &  w_2 & - &  w_1 & + & w_0 \cr
7882W(2)      & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr
7883W(\infty) & = &   w_4 \cr
7884}$}
7885@end tex
7886@ifnottex
7887@example
7888@group
7889W(0)   =                              w0
7890W(1)   =    w4 +   w3 +   w2 +   w1 + w0
7891W(-1)  =    w4 -   w3 +   w2 -   w1 + w0
7892W(2)   = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0
7893W(inf) =    w4
7894@end group
7895@end example
7896@end ifnottex
7897
7898This is a set of five equations in five unknowns, and some elementary linear
7899algebra quickly isolates each @m{w_i,w[i]}.  This involves adding or
7900subtracting one @math{W(t)} value from another, and a couple of divisions by
7901powers of 2 and one division by 3, the latter using the special
7902@code{mpn_divexact_by3} (@pxref{Exact Division}).
7903
7904The conversion of @math{W(t)} values to the coefficients is interpolation.  A
7905polynomial of degree 4 like @math{W(t)} is uniquely determined by values known
7906at 5 different points.  The points are arbitrary and can be chosen to make the
7907linear equations come out with a convenient set of steps for quickly isolating
7908the @m{w_i,w[i]}.
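
The evaluation and interpolation can be sketched on machine words.  The code
below assumes inputs below @math{2^30} split into three 10-bit pieces, so
@math{b=2^10}, and uses plain integer multiplies in place of the five recursive
products.  Signed arithmetic is used at @math{t=-1} for brevity rather than
absolute values with a separate sign.  The name @code{demo_toom3} is
hypothetical, and the particular sequence of interpolation steps is just one
possible choice, not necessarily the one GMP uses.

```c
#include <assert.h>
#include <stdint.h>

/* Toom-3 on inputs below 2^30, split into three 10-bit pieces, so
   b = 2^10.  Returns the full product x*y. */
uint64_t
demo_toom3 (uint32_t x, uint32_t y)
{
  assert (x < (1u << 30) && y < (1u << 30));

  int64_t x0 = x & 0x3FF, x1 = (x >> 10) & 0x3FF, x2 = x >> 20;
  int64_t y0 = y & 0x3FF, y1 = (y >> 10) & 0x3FF, y2 = y >> 20;

  /* Evaluate X and Y at the five points and multiply pointwise. */
  int64_t v0 = x0 * y0;                                    /* W(0)   */
  int64_t v1 = (x2 + x1 + x0) * (y2 + y1 + y0);            /* W(1)   */
  int64_t vm1 = (x2 - x1 + x0) * (y2 - y1 + y0);           /* W(-1)  */
  int64_t v2 = (4 * x2 + 2 * x1 + x0) * (4 * y2 + 2 * y1 + y0); /* W(2) */
  int64_t vinf = x2 * y2;                                  /* W(inf) */

  /* Interpolate: three exact divisions by 2 and one by 3. */
  int64_t w0 = v0;
  int64_t w4 = vinf;
  int64_t w2 = (v1 + vm1) / 2 - w0 - w4;
  int64_t t = (v1 - vm1) / 2;                    /* w3 + w1   */
  int64_t u = (v2 - w0 - 4 * w2 - 16 * w4) / 2;  /* 4*w3 + w1 */
  int64_t w3 = (u - t) / 3;
  int64_t w1 = t - w3;

  /* Recombine as W(b); every w[i] is non-negative. */
  return (uint64_t) w0 + ((uint64_t) w1 << 10) + ((uint64_t) w2 << 20)
         + ((uint64_t) w3 << 30) + ((uint64_t) w4 << 40);
}
```

All the divisions are exact, so the truncating integer division of C gives the
right answer even when an intermediate value is negative.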
7909
7910Squaring follows the same procedure as multiplication, but there's only one
7911@math{X(t)} and it's evaluated at the 5 points, and those values squared to
7912give values of @math{W(t)}.  The interpolation is then identical, and in fact
7913the same @code{toom3_interpolate} subroutine is used for both squaring and
7914multiplying.
7915
7916Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being
7917@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the
7918original size each.  This is an improvement over Karatsuba at
7919@math{O(N^@W{1.585})}, though Toom does more work in the evaluation and
7920interpolation and so it only realizes its advantage above a certain size.
7921
7922Near the crossover between Toom-3 and Karatsuba there's generally a range of
7923sizes where the difference between the two is small.
7924@code{MUL_TOOM33_THRESHOLD} is a somewhat arbitrary point in that range and
7925successive runs of the tune program can give different values due to small
7926variations in measuring.  A graph of time versus size for the two shows the
7927effect, see @file{tune/README}.
7928
7929At the fairly small sizes where the Toom-3 thresholds occur it's worth
7930remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be
7931expected to make accurate predictions, due of course to the big influence of
7932all sorts of overheads, and the fact that only a few recursions of each are
7933being performed.  Even at large sizes there's a good chance machine dependent
7934effects like cache architecture will mean actual performance deviates from
7935what might be predicted.
7936
7937The formula given for the Karatsuba algorithm (@pxref{Karatsuba
7938Multiplication}) has an equivalent for Toom-3 involving only five multiplies,
7939but this would be complicated and unenlightening.
7940
7941An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using
7942a vector to represent the @math{x} and @math{y} splits and a matrix
7943multiplication for the evaluation and interpolation stages.  The matrix
7944inverses are not meant to be actually used, and they have elements with values
7945much greater than in fact arise in the interpolation steps.  The diagram shown
7946for the 3-way is attractive, but again doesn't have to be implemented that way
7947and for example with a bit of rearrangement just one division by 6 can be
7948done.
7949
7950
7951@node Toom 4-Way Multiplication, FFT Multiplication, Toom 3-Way Multiplication, Multiplication Algorithms
7952@subsection Toom 4-Way Multiplication
7953@cindex Toom multiplication
7954
7955Karatsuba and Toom-3 split the operands into 2 and 3 coefficients,
7956respectively.  Toom-4 analogously splits the operands into 4 coefficients.
7957Using the notation from the section on Toom-3 multiplication, we form two
7958polynomials:
7959
7960@display
7961@group
7962@m{X(t) = x_3t^3 + x_2t^2 + x_1t + x_0,
7963   X(t) = x3*t^3 + x2*t^2 + x1*t + x0}
7964@m{Y(t) = y_3t^3 + y_2t^2 + y_1t + y_0,
7965   Y(t) = y3*t^3 + y2*t^2 + y1*t + y0}
7966@end group
7967@end display
7968
7969@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 7 points, giving
7970values of @math{W(t)} at those points.  In GMP the following points are used,
7971
7972@quotation
7973@multitable {@m{t=-1/2,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
7974@item Point              @tab Value
7975@item @math{t=0}         @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately
7976@item @math{t=1/2}       @tab @m{(x_3+2x_2+4x_1+8x_0)(y_3+2y_2+4y_1+8y_0),(x3+2*x2+4*x1+8*x0) * (y3+2*y2+4*y1+8*y0)}
7977@item @math{t=-1/2}      @tab @m{(-x_3+2x_2-4x_1+8x_0)(-y_3+2y_2-4y_1+8y_0),(-x3+2*x2-4*x1+8*x0) * (-y3+2*y2-4*y1+8*y0)}
7978@item @math{t=1}         @tab @m{(x_3+x_2+x_1+x_0)(y_3+y_2+y_1+y_0),(x3+x2+x1+x0) * (y3+y2+y1+y0)}
7979@item @math{t=-1}        @tab @m{(-x_3+x_2-x_1+x_0)(-y_3+y_2-y_1+y_0),(-x3+x2-x1+x0) * (-y3+y2-y1+y0)}
7980@item @math{t=2}         @tab @m{(8x_3+4x_2+2x_1+x_0)(8y_3+4y_2+2y_1+y_0),(8*x3+4*x2+2*x1+x0) * (8*y3+4*y2+2*y1+y0)}
7981@item @m{t=\infty,t=inf} @tab @m{x_3y_3,x3 * y3}, which gives @ms{w,6} immediately
7982@end multitable
7983@end quotation
7984
The number of additions and subtractions for Toom-4 is much larger than for
Toom-3.  But several subexpressions occur multiple times, for example
@m{x_2+x_0,x2+x0} occurs for both @math{t=1} and @math{t=-1}.
7988
7989Toom-4 is asymptotically @math{O(N^@W{1.404})}, the exponent being
7990@m{\log7/\log4,log(7)/log(4)}, representing 7 recursive multiplies of 1/4 the
7991original size each.
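
The sharing of subexpressions between evaluation points can be sketched as
follows.  Each pair of points @math{t} and @math{-t} is built from one sum of
even-index terms and one sum of odd-index terms.  The name
@code{demo_toom4_eval} and the layout of the output array are hypothetical, and
small integers stand in for the multi-limb pieces.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluate X(t) = x[3]*t^3 + x[2]*t^2 + x[1]*t + x[0] at four of the
   Toom-4 points, sharing the even and odd partial sums within each pair:
     out[0] = X(1)        out[1] = X(-1)
     out[2] = 8*X(1/2)    out[3] = 8*X(-1/2)  */
void
demo_toom4_eval (int64_t out[4], const int64_t x[4])
{
  assert (out != x);

  /* Pair t = 1, -1. */
  int64_t even = x[2] + x[0];
  int64_t odd = x[3] + x[1];
  out[0] = even + odd;
  out[1] = even - odd;

  /* Pair t = 1/2, -1/2, scaled by 8 to stay in integers. */
  int64_t even_h = 2 * x[2] + 8 * x[0];
  int64_t odd_h = x[3] + 4 * x[1];
  out[2] = even_h + odd_h;
  out[3] = even_h - odd_h;
}
```

Each pair costs two additions for the partial sums plus one addition and one
subtraction, rather than three additions or subtractions per point.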
7992
7993
7994@node FFT Multiplication, Other Multiplication, Toom 4-Way Multiplication, Multiplication Algorithms
7995@subsection FFT Multiplication
7996@cindex FFT multiplication
7997@cindex Fast Fourier Transform
7998
7999At large to very large sizes a Fermat style FFT multiplication is used,
8000following Sch@"onhage and Strassen (@pxref{References}).  Descriptions of FFTs
8001in various forms can be found in many textbooks, for instance Knuth section
80024.3.3 part C or Lipson chapter IX@.  A brief description of the form used in
8003GMP is given here.
8004
8005The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given
8006@math{N}.  A full product @m{xy,x*y} is obtained by choosing @m{N \ge
8007\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding
8008@math{x} and @math{y} with high zero limbs.  The modular product is the native
8009form for the algorithm, so padding to get a full product is unavoidable.
8010
8011The algorithm follows a split, evaluate, pointwise multiply, interpolate and
8012combine similar to that described above for Karatsuba and Toom-3.  A @math{k}
8013parameter controls the split, with an FFT-@math{k} splitting into @math{2^k}
8014pieces of @math{M=N/2^k} bits each.  @math{N} must be a multiple of
8015@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so
8016the split falls on limb boundaries, avoiding bit shifts in the split and
8017combine stages.
8018
The evaluations, pointwise multiplications and interpolation are all done
8020modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a
8021multiple of @math{2^k} and of @code{mp_bits_per_limb}.  The results of
8022interpolation will be the following negacyclic convolution of the input
8023pieces, and the choice of @math{N'} ensures these sums aren't truncated.
8024@tex
8025$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$
8026@end tex
8027@ifnottex
8028
8029@example
8030           ---
8031           \         b
8032w[n] =     /     (-1) * x[i] * y[j]
8033           ---
8034       i+j==b*2^k+n
8035          b=0,1
8036@end example
8037
8038@end ifnottex
8039The points used for the evaluation are @math{g^i} for @math{i=0} to
8040@math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}.  @math{g} is a
8041@m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary
8042cancellations at the interpolation stage, and it's also a power of 2 so the
8043fast Fourier transforms used for the evaluation and interpolation do only
8044shifts, adds and negations.
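
The root of unity property is easy to check numerically for small parameters.
The sketch below assumes @math{N'} is at most 31 so that everything fits in
64-bit arithmetic; the names @code{demo_powmod} and @code{demo_check_root} are
hypothetical.  It confirms @math{g^(2^k)=1} and also @math{g^(2^(k-1))=-1},
the latter following from @math{2^N'=-1} modulo @math{2^N'+1}.

```c
#include <assert.h>
#include <stdint.h>

/* Return base^exp mod m by repeated squaring, for m below 2^32. */
uint64_t
demo_powmod (uint64_t base, uint64_t exp, uint64_t m)
{
  assert (m >= 2 && m < ((uint64_t) 1 << 32));

  uint64_t result = 1;
  base %= m;
  while (exp > 0)
    {
      if (exp & 1)
        result = result * base % m;
      base = base * base % m;
      exp >>= 1;
    }
  return result;
}

/* Return 1 if g = 2^(2*nprime/2^k) satisfies g^(2^k) = 1 and
   g^(2^(k-1)) = -1 modulo 2^nprime + 1, else 0.  nprime must be a
   multiple of 2^k and at most 31. */
int
demo_check_root (unsigned nprime, unsigned k)
{
  assert (k >= 1 && nprime <= 31 && nprime % (1u << k) == 0);

  uint64_t m = ((uint64_t) 1 << nprime) + 1;
  uint64_t g = (uint64_t) 1 << ((2 * nprime) >> k);
  uint64_t order = (uint64_t) 1 << k;

  return demo_powmod (g, order, m) == 1
         && demo_powmod (g, order / 2, m) == m - 1;
}
```

For example @math{N'=16} and @math{k=3} give modulus 65537 and @math{g=2^4},
for which @math{g^8=2^32} is 1 modulo 65537.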
8045
8046The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either
8047recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or
8048basecase), whichever is optimal at the size @math{N'}.  The interpolation is
8049an inverse fast Fourier transform.  The resulting set of sums of @m{x_iy_j,
8050x[i]*y[j]} are added at appropriate offsets to give the final result.
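
The negacyclic convolution that the transforms produce can also be computed
directly from its definition, which is useful as a reference when checking a
fast implementation.  The sketch below does so in @math{O(n^2)} steps on small
signed integers, where @code{n} plays the part of @math{2^k}.  The name
@code{demo_negacyclic} is hypothetical and this is not how GMP computes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Negacyclic convolution of two length-n sequences, straight from the
   definition: products whose index sum wraps past n-1 are subtracted
   instead of added.  w must not overlap x or y. */
void
demo_negacyclic (int64_t *w, const int64_t *x, const int64_t *y, size_t n)
{
  assert (n >= 1);

  for (size_t i = 0; i < n; i++)
    w[i] = 0;

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        if (i + j < n)
          w[i + j] += x[i] * y[j];      /* b = 0 */
        else
          w[i + j - n] -= x[i] * y[j];  /* b = 1 */
      }
}
```

This is the same as multiplying the two polynomials and reducing modulo
@math{t^n+1}.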
8051
8052Squaring is the same, but @math{x} is the only input so it's one transform at
8053the evaluate stage and the pointwise multiplies are squares.  The
8054interpolation is the same.
8055
8056For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}),
8057O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed
8058modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original.
8059Each successive @math{k} is an asymptotic improvement, but overheads mean each
8060is only faster at bigger and bigger sizes.  In the code, @code{MUL_FFT_TABLE}
8061and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used.  Each
8062new @math{k} effectively swaps some multiplying for some shifts, adds and
8063overheads.
8064
8065A mod @math{2^N+1} product can be formed with a normal
8066@math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT
8067and Toom-3 etc can be compared directly.  A @math{k=4} FFT at
8068@math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at
8069@math{O(N^@W{1.465})}.  In practice this is what's found, with
8070@code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between
8071300 and 1000 limbs, depending on the CPU@.  So far it's been found that only
8072very large FFTs recurse into pointwise multiplies above these sizes.
8073
8074When an FFT is to give a full product, the change of @math{N} to @math{2N}
8075doesn't alter the theoretical complexity for a given @math{k}, but for the
8076purposes of considering where an FFT might be first used it can be assumed
8077that the FFT is recursing into a normal multiply and that on that basis it's
8078doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of
8079the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}.  This would mean
8080@math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3.
8081In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been
8082found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs.
8083
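The exponent comparisons in the preceding paragraphs can be reproduced with a
few lines of C.  This is a sketch for checking the arithmetic only: the
function @code{first_faster_k} and its @code{shrink} parameter are names
invented here, and the Toom-3 exponent is passed in as the rounded value
1.465 quoted above.  With @code{shrink} set to 1 the exponent is
@math{k/(k-1)} as for a mod @math{2^N+1} product, and with 2 it is
@math{k/(k-2)} as for a full product.

@verbatim
```c
/* Smallest k for which the exponent k/(k - shrink) is below limit,
   or -1 if none is found in the range searched (hypothetical helper). */
int first_faster_k(int shrink, double limit)
{
    for (int k = shrink + 1; k < 64; k++) {
        double exponent = (double)k / (double)(k - shrink);
        if (exponent < limit)
            return k;
    }
    return -1;
}
```
@end verbatim
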
8084The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is
8085rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that
8086when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a
8087multiple of @m{2^{2k-1},2^(2k-1)} bits.  The @math{+k+3} means some values of
8088@math{N} just under such a multiple will be rounded to the next.  The
8089complexity calculations above assume that a favourable size is used, meaning
8090one which isn't padded through rounding, and it's also assumed that the extra
8091@math{+k+3} bits are negligible at typical FFT sizes.
8092
8093The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a
8094step-effect into measured speeds.  For example @math{k=8} will round @math{N}
8095up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb
8096groups of sizes for which @code{mpn_mul_n} runs at the same speed.  Or for
8097@math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc.  In
8098practice it's been found each @math{k} is used at quite small multiples of its
8099size constraint and so the step effect is quite noticeable in a time versus
8100size graph.
8101
8102The threshold determinations currently measure at the mid-points of size
8103steps, but this is sub-optimal since at the start of a new step it can happen
8104that it's better to go back to the previous @math{k} for a while.  Something
8105more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be
8106needed.
8107
8108
8109@node Other Multiplication, Unbalanced Multiplication, FFT Multiplication, Multiplication Algorithms
8110@subsection Other Multiplication
8111@cindex Toom multiplication
8112
8113The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary
8115number of pieces, as per Knuth section 4.3.3 algorithm C@.  This is not
8116currently used.  The notes here are merely for interest.
8117
8118In general a split into @math{r+1} pieces is made, and evaluations and
8119pointwise multiplications done at @m{2r+1,2*r+1} points.  A 4-way split does 7
8120pointwise multiplies, 5-way does 9, etc.  Asymptotically an @math{(r+1)}-way
algorithm is @m{O(N^{log(2r+1)/log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}.  Only
8122the pointwise multiplications count towards big-@math{O} complexity, but the
8123time spent in the evaluate and interpolate stages grows with @math{r} and has
8124a significant practical impact, with the asymptotic advantage of each @math{r}
8125realized only at bigger and bigger sizes.  The overheads grow as
8126@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log
8127r), O(N*log(r))}.
8128
8129Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
8130uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small
8131multiplies in the evaluate stage (or rather trades them for additions), and
8132has a further saving of nearly half the interpolate steps.  The idea is to
8133separate odd and even final coefficients and then perform algorithm C steps C7
8134and C8 on them separately.  The divisors at step C7 become @math{j^2} and the
8135multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.
8136
8137Splitting odd and even parts through positive and negative points can be
8138thought of as using @math{-1} as a square root of unity.  If a 4th root of
8139unity was available then a further split and speedup would be possible, but no
8140such root exists for plain integers.  Going to complex integers with
8141@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian
8142form it takes three real multiplies to do a complex multiply.  The existence
8143of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
8144Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.
8145
8146Floating point FFTs use complex numbers approximating Nth roots of unity.
8147Some processors have special support for such FFTs.  But these are not used in
8148GMP since it's very difficult to guarantee an exact result (to some number of
8149bits).  An occasional difference of 1 in the last bit might not matter to a
8150typical signal processing algorithm, but is of course of vital importance to
8151GMP.
8152
8153
8154@node Unbalanced Multiplication,  , Other Multiplication, Multiplication Algorithms
8155@subsection Unbalanced Multiplication
8156@cindex Unbalanced multiplication
8157
8158Multiplication of operands with different sizes, both below
@code{MUL_TOOM22_THRESHOLD}, is done with plain schoolbook multiplication
8160(@pxref{Basecase Multiplication}).
8161
8162For really large operands, we invoke FFT directly.
8163
8164For operands between these sizes, we use Toom inspired algorithms suggested by
8165Alberto Zanoni and Marco Bodrato.  The idea is to split the operands into
8166polynomials of different degree.  GMP currently splits the smaller operand
into 2 coefficients, i.e., a polynomial of degree 1, but the larger operand
8168can be split into 2, 3, or 4 coefficients, i.e., a polynomial of degree 1 to
81693.
8170
@c FIXME: This is mighty ugly, but a cleaner @need triggers texinfo bugs that
@c screw up layout here and there in the rest of the manual.
8173@c @tex
8174@c \goodbreak
8175@c @end tex
8176@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
8177@section Division Algorithms
8178@cindex Division algorithms
8179
8180@menu
8181* Single Limb Division::
8182* Basecase Division::
8183* Divide and Conquer Division::
8184* Block-Wise Barrett Division::
8185* Exact Division::
8186* Exact Remainder::
8187* Small Quotient Division::
8188@end menu
8189
8190
8191@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
8192@subsection Single Limb Division
8193
8194N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
8195high to low, either with a hardware divide instruction or a multiplication by
8196inverse, whichever is best on a given CPU.
8197
8198The multiply by inverse follows ``Improved division by invariant integers'' by
8199M@"oller and Granlund (@pxref{References}) and is implemented as
8200@code{udiv_qrnnd_preinv} in @file{gmp-impl.h}.  The idea is to have a
8201fixed-point approximation to @math{1/d} (see @code{invert_limb}) and then
8202multiply by the high limb (plus one bit) of the dividend to get a quotient
8203@math{q}.  With @math{d} normalized (high bit set), @math{q} is no more than 1
8204too small.  Subtracting @m{qd,q*d} from the dividend gives a remainder, and
8205reveals whether @math{q} or @math{q-1} is correct.
8206
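As an illustration of the multiply by inverse, here is a sketch of a
2@cross{}1 division step in the style of the M@"oller and Granlund paper.  It
is not GMP's @code{udiv_qrnnd_preinv} macro: the names @code{invert_limb32}
and @code{div_2by1_preinv} are invented for the example, and a 32-bit limb is
assumed so that a double limb fits in a @code{uint64_t}.  The divisor must be
normalized and the high dividend limb must be less than the divisor.  The
candidate quotient is corrected by at most one downward and one upward
adjustment.

@verbatim
```c
#include <stdint.h>

/* Fixed-point inverse floor((B^2 - 1) / d) - B for B = 2^32.
   d must be normalized, meaning its high bit is set. */
uint32_t invert_limb32(uint32_t d)
{
    return (uint32_t)(UINT64_MAX / d - ((uint64_t)1 << 32));
}

/* Divide nh*B + nl by d using the precomputed inverse dinv.
   Requires nh < d.  Returns the quotient and stores the remainder. */
uint32_t div_2by1_preinv(uint32_t *rem, uint32_t nh, uint32_t nl,
                         uint32_t d, uint32_t dinv)
{
    /* Candidate quotient; the addition wraps modulo B^2 by design. */
    uint64_t q = (uint64_t)nh * dinv;
    q += ((uint64_t)(nh + 1) << 32) | nl;

    uint32_t qh = (uint32_t)(q >> 32);
    uint32_t ql = (uint32_t)q;

    /* Candidate remainder, computed modulo B. */
    uint32_t r = nl - qh * d;

    if (r > ql) {       /* candidate quotient was one too big */
        qh--;
        r += d;
    }
    if (r >= d) {       /* candidate quotient was one too small */
        qh++;
        r -= d;
    }
    *rem = r;
    return qh;
}
```
@end verbatim
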
8207The result is a division done with two multiplications and four or five
8208arithmetic operations.  On CPUs with low latency multipliers this can be much
8209faster than a hardware divide, though the cost of calculating the inverse at
8210the start may mean it's only better on inputs bigger than say 4 or 5 limbs.
8211
8212When a divisor must be normalized, either for the generic C
8213@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is
8214actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and
8215@math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set.
8216The bit shifts for the dividend are usually accomplished ``on the fly''
8217meaning by extracting the appropriate bits at each step.  Done this way the
8218quotient limbs come out aligned ready to store.  When only the remainder is
8219wanted, an alternative is to take the dividend limbs unshifted and calculate
8220@m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k
8221\bmod d2^k, r*2^k mod d*2^k}.  This can help on CPUs with poor bit shifts or
8222few registers.
8223
8224The multiply by inverse can be done two limbs at a time.  The calculation is
8225basically the same, but the inverse is two limbs and the divisor treated as if
8226padded with a low zero limb.  This means more work, since the inverse will
8227need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are
8228independent and can therefore be done partly or wholly in parallel.  Likewise
8229for a 2@cross{}1 calculating @m{qd,q*d}.  The net effect is to process two
8230limbs with roughly the same two multiplies worth of latency that one limb at a
8231time gives.  This extends to 3 or 4 limbs at a time, though the extra work to
8232apply the inverse will almost certainly soon reach the limits of multiplier
8233throughput.
8234
8235A similar approach in reverse can be taken to process just half a limb at a
8236time if the divisor is only a half limb.  In this case the 1@cross{}1 multiply
8237for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each
8238limb, which can be a saving on CPUs with a fast half limb multiply, or in fact
8239if the only multiply is a half limb, and especially if it's not pipelined.
8240
8241
8242@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms
8243@subsection Basecase Division
8244
8245Basecase N@cross{}M division is like long division done by hand, but in base
8246@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}.  See Knuth
8247section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}.
8248
8249Briefly stated, while the dividend remains larger than the divisor, a high
8250quotient limb is formed and the N@cross{}1 product @m{qd,q*d} subtracted at
8251the top end of the dividend.  With a normalized divisor (most significant bit
8252set), each quotient limb can be formed with a 2@cross{}1 division and a
82531@cross{}1 multiplication plus some subtractions.  The 2@cross{}1 division is
8254by the high limb of the divisor and is done either with a hardware divide or a
8255multiply by inverse (the same as in @ref{Single Limb Division}) whichever is
8256faster.  Such a quotient is sometimes one too big, requiring an addback of the
8257divisor, but that happens rarely.
8258
8259With Q=N@minus{}M being the number of quotient limbs, this is an
8260@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase
8261Q@cross{}M multiplication, differing in fact only in the extra multiply and
8262divide for each of the Q quotient limbs.
8263
8264
8265@node Divide and Conquer Division, Block-Wise Barrett Division, Basecase Division, Division Algorithms
8266@subsection Divide and Conquer Division
8267
8268For divisors larger than @code{DC_DIV_QR_THRESHOLD}, division is done by dividing.
8269Or to be precise by a recursive divide and conquer algorithm based on work by
8270Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}).
8271
8272The algorithm consists essentially of recognising that a 2N@cross{}N division
8273can be done with the basecase division algorithm (@pxref{Basecase Division}),
8274but using N/2 limbs as a base, not just a single limb.  This way the
8275multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of
8276Karatsuba and higher multiplication algorithms (@pxref{Multiplication
8277Algorithms}).  The two ``digits'' of the quotient are formed by recursive
8278N@cross{}(N/2) divisions.
8279
8280If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication
8281then the work is about the same as a basecase division, but with more function
8282call overheads and with some subtractions separated from the multiplies.
8283These overheads mean that it's only when N/2 is above
8284@code{MUL_TOOM22_THRESHOLD} that divide and conquer is of use.
8285
8286@code{DC_DIV_QR_THRESHOLD} is based on the divisor size N, so it will be somewhere
8287above twice @code{MUL_TOOM22_THRESHOLD}, but how much above depends on the
8288CPU@.  An optimized @code{mpn_mul_basecase} can lower @code{DC_DIV_QR_THRESHOLD} a
8289little by offering a ready-made advantage over repeated @code{mpn_submul_1}
8290calls.
8291
8292Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where
8293@math{M(N)} is the time for an N@cross{}N multiplication done with FFTs.  The
8294actual time is a sum over multiplications of the recursed sizes, as can be
8295seen near the end of section 2.2 of Burnikel and Ziegler.  For example, within
8296the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}.  With higher
8297algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log
8298N, log(N)}.  In practice, at moderate to large sizes, a 2N@cross{}N division
8299is about 2 to 4 times slower than an N@cross{}N multiplication.
8300
8301
8302@node Block-Wise Barrett Division, Exact Division, Divide and Conquer Division, Division Algorithms
8303@subsection Block-Wise Barrett Division
8304
8305For the largest divisions, a block-wise Barrett division algorithm is used.
8306Here, the divisor is inverted to a precision determined by the relative size of
8307the dividend and divisor.  Blocks of quotient limbs are then generated by
8308multiplying blocks from the dividend by the inverse.
8309
8310Our block-wise algorithm computes a smaller inverse than in the plain Barrett
8311algorithm.  For a @math{2n/n} division, the inverse will be just @m{\lceil n/2
8312\rceil, ceil(n/2)} limbs.
8313
8314
8315@node Exact Division, Exact Remainder, Block-Wise Barrett Division, Division Algorithms
8316@subsection Exact Division
8317
8318
8319A so-called exact division is when the dividend is known to be an exact
8320multiple of the divisor.  Jebelean's exact division algorithm uses this
8321knowledge to make some significant optimizations (@pxref{References}).
8322
8323The idea can be illustrated in decimal for example with 368154 divided by
8324543.  Because the low digit of the dividend is 4, the low digit of the
8325quotient must be 8.  This is arrived at from @m{4 \mathord{\times} 7 \bmod 10,
83264*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of
8327the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7
8328@equiv{} 1 mod 10}.  So @m{8\mathord{\times}543 = 4344,8*543=4344} can be
8329subtracted from the dividend leaving 363810.  Notice the low digit has become
8330zero.
8331
8332The procedure is repeated at the second digit, with the next quotient digit 7
8333(@m{1 \mathord{\times} 7 \bmod 10, 7 @equiv{} 1*7 mod 10}), subtracting
8334@m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800.  And finally at
8335the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7
8336mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0.
8337So the quotient is 678.
8338
8339Notice however that the multiplies and subtractions don't need to extend past
8340the low three digits of the dividend, since that's enough to determine the
8341three quotient digits.  For the last quotient digit no subtraction is needed
8342at all.  On a 2N@cross{}N division like this one, only about half the work of
8343a normal basecase division is necessary.
8344
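The decimal example can be followed step by step with a small C sketch.  The
function name @code{exact_div_base10} is invented for the illustration, the
inverse is found by a simple search, and a full subtraction is done at each
digit, so the saving from truncating the subtractions is not shown.  The
dividend must be an exact multiple of the divisor, and the divisor must have
no factor of 2 or 5.  Called with 368154 and 543 it produces the quotient
digits 8, 7 and 6 in turn.

@verbatim
```c
#include <stdint.h>

/* Exact division in base 10, forming quotient digits from the low end.
   Requires d coprime to 10 and a an exact multiple of d. */
uint64_t exact_div_base10(uint64_t a, uint64_t d)
{
    /* Inverse of the low digit of d modulo 10, found by search. */
    unsigned inv = 0;
    for (unsigned t = 1; t < 10; t++)
        if ((d % 10) * t % 10 == 1)
            inv = t;

    uint64_t q = 0, place = 1;
    while (a != 0) {
        /* Quotient digit from the low digit of what remains. */
        unsigned digit = (unsigned)((a % 10) * inv % 10);
        a -= digit * d;     /* low digit of a becomes zero */
        a /= 10;            /* move up to the next digit */
        q += digit * place;
        place *= 10;
    }
    return q;
}
```
@end verbatim
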
8345For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
8346saving over a normal basecase division is in two parts.  Firstly, each of the
8347Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
8348multiply.  Secondly, the crossproducts are reduced when @math{Q>M} to
8349@m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2,
8350Q*(Q-1)/2}.  Notice the savings are complementary.  If Q is big then many
8351divisions are saved, or if Q is small then the crossproducts reduce to a small
8352number.
8353
8354The modular inverse used is calculated efficiently by @code{binvert_limb} in
8355@file{gmp-impl.h}.  This does four multiplies for a 32-bit limb, or six for a
835664-bit limb.  @file{tune/modlinv.c} has some alternate implementations that
8357might suit processors better at bit twiddling than multiplying.
8358
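A modular inverse of this kind can be computed by Newton iteration, each
step doubling the number of correct low bits.  The following is a sketch and
not the @code{binvert_limb} macro itself: the name @code{binvert_u64} is
invented here, and the iteration starts from @code{d}, which is correct to 3
bits for any odd @code{d}, so it needs more multiplies than the counts given
above.

@verbatim
```c
#include <stdint.h>

/* Inverse of an odd d modulo 2^64, so that d * result == 1 (mod 2^64). */
uint64_t binvert_u64(uint64_t d)
{
    uint64_t inv = d;   /* d*d == 1 (mod 8), so inv is correct to 3 bits */

    /* Correct bits go 3 -> 6 -> 12 -> 24 -> 48 -> 96. */
    for (int i = 0; i < 5; i++)
        inv *= 2 - d * inv;

    return inv;
}
```
@end verbatim
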
8359The sub-quadratic exact division described by Jebelean in ``Exact Division
8360with Karatsuba Complexity'' is not currently implemented.  It uses a
8361rearrangement similar to the divide and conquer for normal division
8362(@pxref{Divide and Conquer Division}), but operating from low to high.  A
8363further possibility not currently implemented is ``Bidirectional Exact Integer
8364Division'' by Krandick and Jebelean which forms quotient limbs from both the
8365high and low ends of the dividend, and can halve once more the number of
8366crossproducts needed in a 2N@cross{}N division.
8367
8368A special case exact division by 3 exists in @code{mpn_divexact_by3},
8369supporting Toom-3 multiplication and @code{mpq} canonicalizations.  It forms
8370quotient digits with a multiply by the modular inverse of 3 (which is
8371@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next
8372limb.  The multiplications don't need to be on the dependent chain, as long as
8373the effect of the borrows is applied, which can help chips with pipelined
8374multipliers.
8375
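The division by 3 can be sketched as follows, assuming a 32-bit limb and
little-endian limb order.  The name @code{divexact_by3_sketch} is invented
for the example and the code is not taken from GMP.  The two comparison
constants are 2^32/3 and 2^33/3 rounded up, so together the comparisons give
the high limb of 3 times the quotient limb, which is the borrow to pass on.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>

#define INVERSE_3 0xAAAAAAABu   /* 3 * INVERSE_3 == 1 (mod 2^32) */

/* dst = src / 3 over n limbs.  Returns the final borrow, which is 0
   when src is an exact multiple of 3. */
uint32_t divexact_by3_sketch(uint32_t *dst, const uint32_t *src, size_t n)
{
    uint32_t c = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t s = src[i];
        uint32_t l = s - c;         /* apply borrow from the limb below */
        c = (s < c);                /* borrow out of that subtraction */

        uint32_t q = l * INVERSE_3; /* quotient limb, modulo 2^32 */
        dst[i] = q;

        /* High limb of 3*q, found with two comparisons. */
        c += (q >= 0x55555556u);
        c += (q >= 0xAAAAAAABu);
    }
    return c;
}
```
@end verbatim
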
8376
8377@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms
8378@subsection Exact Remainder
8379@cindex Exact remainder
8380
8381If the exact division algorithm is done with a full subtraction at each stage
8382and the dividend isn't a multiple of the divisor, then low zero limbs are
8383produced but with a remainder in the high limbs.  For dividend @math{a},
8384divisor @math{d}, quotient @math{q}, and @m{b = 2
8385\GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this remainder
8386@math{r} is of the form
8387@tex
8388$$ a = qd + r b^n $$
8389@end tex
8390@ifnottex
8391
8392@example
8393a = q*d + r*b^n
8394@end example
8395
8396@end ifnottex
8397@math{n} represents the number of zero limbs produced by the subtractions,
8398that being the number of limbs produced for @math{q}.  @math{r} will be in the
8399range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by
8400a factor of @math{b^n}.
8401
8402Carrying out full subtractions at each stage means the same number of cross
8403products must be done as a normal division, but there's still some single limb
8404divisions saved.  When @math{d} is a single limb some simplifications arise,
8405providing good speedups on a number of processors.
8406
8407@code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the @code{mpn_redc_X}
8408functions differ subtly in how they return @math{r}, leading to some negations
8409in the above formula, but all are essentially the same.
8410
8411@cindex Divisibility algorithm
8412@cindex Congruence algorithm
8413Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this
8414leads to divisibility or congruence tests which are potentially more efficient
8415than a normal division.
8416
8417The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is
8418odd, hence the use of @code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and
8419@code{mpz_kronecker_ui} etc (@pxref{Greatest Common Divisor Algorithms}).
8420
Montgomery's REDC method for modular multiplications uses operands of the form
@m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n})
8423(yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in the exact
8424remainder to reach a product in the same form @m{(xy)b^{-n}, (x*y)*b^-n}
8425(@pxref{Modular Powering Algorithm}).
8426
8427Notice that @math{r} generally gives no useful information about the ordinary
8428remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything.  If
8429however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the
8430ordinary remainder.  This occurs whenever @math{d} is a factor of
8431@math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}.  For a 32 or
843264 bit limb other such factors include 5, 17 and 257, but no particular use
8433has been found for this.
8434
8435
8436@node Small Quotient Division,  , Exact Remainder, Division Algorithms
8437@subsection Small Quotient Division
8438
8439An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is
8440small can be optimized somewhat.
8441
8442An ordinary basecase division normalizes the divisor by shifting it to make
8443the high bit set, shifting the dividend accordingly, and shifting the
8444remainder back down at the end of the calculation.  This is wasteful if only a
8445few quotient limbs are to be formed.  Instead a division of just the top
8446@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be
8447used to form a trial quotient.  This requires only those limbs normalized, not
8448the whole of the divisor and dividend.
8449
8450A multiply and subtract then applies the trial quotient to the M@minus{}Q
8451unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q
8452limbs remaining from the trial quotient division).  The starting trial
8453quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1
8454too big are detected by first comparing the most significant limbs that will
8455arise from the subtraction.  An addback is done if the quotient still turns
8456out to be 1 too big.
8457
8458This whole procedure is essentially the same as one step of the basecase
8459algorithm done in a Q limb base, though with the trial quotient test done only
8460with the high limbs, not an entire Q limb ``digit'' product.  The correctness
8461of this weaker test can be established by following the argument of Knuth
8462section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r
8463+ u_2, v2*q>b*r+u2} condition appropriately relaxed.
8464
8465
8466@need 1000
8467@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
8468@section Greatest Common Divisor
8469@cindex Greatest common divisor algorithms
8470@cindex GCD algorithms
8471
8472@menu
8473* Binary GCD::
8474* Lehmer's Algorithm::
8475* Subquadratic GCD::
8476* Extended GCD::
8477* Jacobi Symbol::
8478@end menu
8479
8480
8481@node Binary GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
8482@subsection Binary GCD
8483
8484At small sizes GMP uses an @math{O(N^2)} binary style GCD@.  This is described
8485in many textbooks, for example Knuth section 4.5.2 algorithm B@.  It simply
8486consists of successively reducing odd operands @math{a} and @math{b} using
8487
8488@quotation
8489@math{a,b = @abs{}(a-b),@min{}(a,b)} @*
8490strip factors of 2 from @math{a}
8491@end quotation
8492
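The reduction step above can be sketched for single-word operands as
follows.  The name @code{binary_gcd_odd} is invented for the illustration and
both operands are assumed to be odd and nonzero, as in the description; this
is not the code used in @code{mpn_gcd_1}.

@verbatim
```c
#include <stdint.h>

/* GCD of two odd, nonzero operands by the binary method. */
uint64_t binary_gcd_odd(uint64_t a, uint64_t b)
{
    while (a != b) {
        uint64_t diff = a > b ? a - b : b - a;
        b = a < b ? a : b;      /* b = min(a,b) */
        a = diff;               /* a = abs(a-b), even and nonzero */

        while ((a & 1) == 0)    /* strip factors of 2 from a */
            a >>= 1;
    }
    return a;
}
```
@end verbatim
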
8493The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly
8494computes the quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and replaces
@math{a,b} by @math{b, a - q b}. The binary algorithm has so far been found to
8496be faster than the Euclidean algorithm everywhere.  One reason the binary
8497method does well is that the implied quotient at each step is usually small,
8498so often only one or two subtractions are needed to get the same effect as a
8499division.  Quotients 1, 2 and 3 for example occur 67.7% of the time, see Knuth
8500section 4.5.3 Theorem E.
8501
8502When the implied quotient is large, meaning @math{b} is much smaller than
8503@math{a}, then a division is worthwhile.  This is the basis for the initial
8504@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
8505for both N@cross{}1 and 1@cross{}1 cases).  But after that initial reduction,
8506big quotients occur too rarely to make it worth checking for them.
8507
8508@sp 1
8509The final @math{1@cross{}1} GCD in @code{mpn_gcd_1} is done in the generic C
8510code as described above.  For two N-bit operands, the algorithm takes about
85110.68 iterations per bit.  For optimum performance some attention needs to be
8512paid to the way the factors of 2 are stripped from @math{a}.
8513
8514Firstly it may be noted that in twos complement the number of low zero bits on
8515@math{a-b} is the same as @math{b-a}, so counting or testing can begin on
8516@math{a-b} without waiting for @math{@abs{}(a-b)} to be determined.
8517
8518A loop stripping low zero bits tends not to branch predict well, since the
8519condition is data dependent.  But on average there's only a few low zeros, so
8520an option is to strip one or two bits arithmetically then loop for more (as
8521done for AMD K6).  Or use a lookup table to get a count for several bits then
8522loop for more (as done for AMD K7).  An alternative approach is to keep just
8523one of @math{a} or @math{b} odd and iterate
8524
8525@quotation
8526@math{a,b = @abs{}(a-b), @min{}(a,b)} @*
8527@math{a = a/2} if even @*
8528@math{b = b/2} if even
8529@end quotation
8530
8531This requires about 1.25 iterations per bit, but stripping of a single bit at
8532each step avoids any branching.  Repeating the bit strip reduces to about 0.9
8533iterations per bit, which may be a worthwhile tradeoff.
8534
8535Generally with the above approaches a speed of perhaps 6 cycles per bit can be
achieved, which is still not terribly fast, with for instance a 64-bit GCD
8537taking nearly 400 cycles.  It's this sort of time which means it's not usually
8538advantageous to combine a set of divisibility tests into a GCD.
8539
8540Currently, the binary algorithm is used for GCD only when @math{N < 3}.
8541
8542@node Lehmer's Algorithm, Subquadratic GCD, Binary GCD, Greatest Common Divisor Algorithms
8543@comment  node-name,  next,  previous,  up
@subsection Lehmer's Algorithm
8545
8546Lehmer's improvement of the Euclidean algorithms is based on the observation
8547that the initial part of the quotient sequence depends only on the most
8548significant parts of the inputs. The variant of Lehmer's algorithm used in GMP
8549splits off the most significant two limbs, as suggested, e.g., in ``A
8550Double-Digit Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The
8551quotients of two double-limb inputs are collected as a 2 by 2 matrix with
8552single-limb elements. This is done by the function @code{mpn_hgcd2}. The
8553resulting matrix is applied to the inputs using @code{mpn_mul_1} and
8554@code{mpn_submul_1}. Each iteration usually reduces the inputs by almost one
8555limb. In the rare case of a large quotient, no progress can be made by
8556examining just the most significant two limbs, and the quotient is computed
8557using plain division.
8558
8559The resulting algorithm is asymptotically @math{O(N^2)}, just as the Euclidean
algorithm and the binary algorithm. The quadratic part of the work is
8561the calls to @code{mpn_mul_1} and @code{mpn_submul_1}. For small sizes, the
8562linear work is also significant. There are roughly @math{N} calls to the
8563@code{mpn_hgcd2} function. This function uses a couple of important
8564optimizations:
8565
8566@itemize
8567@item
8568It uses the same relaxed notion of correctness as @code{mpn_hgcd} (see next
8569section). This means that when called with the most significant two limbs of
8570two large numbers, the returned matrix does not always correspond exactly to
8571the initial quotient sequence for the two large numbers; the final quotient
8572may sometimes be one off.
8573
8574@item
8575It takes advantage of the fact the quotients are usually small. The division
8576operator is not used, since the corresponding assembler instruction is very
slow on most architectures. (This code could probably be improved further; it
uses many branches that are unfriendly to prediction.)
8579
8580@item
8581It switches from double-limb calculations to single-limb calculations half-way
8582through, when the input numbers have been reduced in size from two limbs to
8583one and a half.
8584
8585@end itemize
8586
8587@node Subquadratic GCD, Extended GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms
8588@subsection Subquadratic GCD
8589
8590For inputs larger than @code{GCD_DC_THRESHOLD}, GCD is computed via the HGCD
(Half GCD) function, as a generalization of Lehmer's algorithm.
8592
8593Let the inputs @math{a,b} be of size @math{N} limbs each. Put @m{S=\lfloor N/2
8594\rfloor + 1, S = floor(N/2) + 1}. Then HGCD(a,b) returns a transformation
8595matrix @math{T} with non-negative elements, and reduced numbers @math{(c;d) =
8596T^{-1} (a;b)}. The reduced numbers @math{c,d} must be larger than @math{S}
limbs, while their difference @math{@abs{}(c-d)} must fit in @math{S} limbs. The
8598matrix elements will also be of size roughly @math{N/2}.
8599
8600The HGCD base case uses Lehmer's algorithm, but with the above stop condition
8601that returns reduced numbers and the corresponding transformation matrix
8602half-way through. For inputs larger than @code{HGCD_THRESHOLD}, HGCD is
8603computed recursively, using the divide and conquer algorithm in ``On
8604Sch@"onhage's algorithm and subquadratic integer GCD computation'' by M@"oller
8605(@pxref{References}). The recursive algorithm consists of these main
8606steps.
8607
8608@itemize
8609
8610@item
8611Call HGCD recursively, on the most significant @math{N/2} limbs. Apply the
8612resulting matrix @math{T_1} to the full numbers, reducing them to a size just
8613above @math{3N/2}.
8614
8615@item
8616Perform a small number of division or subtraction steps to reduce the numbers
8617to size below @math{3N/2}. This is essential mainly for the unlikely case of
8618large quotients.
8619
8620@item
8621Call HGCD recursively, on the most significant @math{N/2} limbs of the reduced
8622numbers. Apply the resulting matrix @math{T_2} to the full numbers, reducing
8623them to a size just above @math{N/2}.
8624
8625@item
8626Compute @math{T = T_1 T_2}.
8627
8628@item
8629Perform a small number of division and subtraction steps to satisfy the
8630requirements, and return.
8631@end itemize
8632
8633GCD is then implemented as a loop around HGCD, similarly to Lehmer's
8634algorithm. Where Lehmer repeatedly chops off the top two limbs, calls
8635@code{mpn_hgcd2}, and applies the resulting matrix to the full numbers, the
8636subquadratic GCD chops off the most significant third of the limbs (the
8637proportion is a tuning parameter, and @math{1/3} seems to be more efficient
8638than, e.g, @math{1/2}), calls @code{mpn_hgcd}, and applies the resulting
8639matrix. Once the input numbers are reduced to size below
8640@code{GCD_DC_THRESHOLD}, Lehmer's algorithm is used for the rest of the work.
8641
8642The asymptotic running time of both HGCD and GCD is @m{O(M(N)\log N),O(M(N)*log(N))},
8643where @math{M(N)} is the time for multiplying two @math{N}-limb numbers.
8644
8645@comment  node-name,  next,  previous,  up
8646
8647@node Extended GCD, Jacobi Symbol, Subquadratic GCD, Greatest Common Divisor Algorithms
8648@subsection Extended GCD
8649
8650The extended GCD function, or GCDEXT, calculates @math{@gcd{}(a,b)} and also
8651cofactors @math{x} and @math{y} satisfying @m{ax+by=\gcd(a@C{}b),
8652a*x+b*y=gcd(a@C{}b)}. All the algorithms used for plain GCD are extended to
8653handle this case. The binary algorithm is used only for single-limb GCDEXT.
8654Lehmer's algorithm is used for sizes up to @code{GCDEXT_DC_THRESHOLD}. Above
8655this threshold, GCDEXT is implemented as a loop around HGCD, but with more
8656book-keeping to keep track of the cofactors. This gives the same asymptotic
running time as for GCD and HGCD, @m{O(M(N)\log N),O(M(N)*log(N))}.
8658
One difference from plain GCD is that while the inputs @math{a} and @math{b} are
8660reduced as the algorithm proceeds, the cofactors @math{x} and @math{y} grow in
8661size. This makes the tuning of the chopping-point more difficult. The current
8662code chops off the most significant half of the inputs for the call to HGCD in
8663the first iteration, and the most significant two thirds for the remaining
8664calls. This strategy could surely be improved. Also the stop condition for the
8665loop, where Lehmer's algorithm is invoked once the inputs are reduced below
8666@code{GCDEXT_DC_THRESHOLD}, could maybe be improved by taking into account the
8667current size of the cofactors.
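
The cofactor book-keeping can be illustrated with the plain Euclidean
algorithm.  The following is only a sketch in Python, whose built-in integers
stand in for GMP's; the name @code{gcdext} is hypothetical, and neither
Lehmer's algorithm nor HGCD is used.

```python
def gcdext(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for a, b >= 0."""
    # Invariants: r0 == a*x0 + b*y0 and r1 == a*x1 + b*y1.
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1  # the remainders shrink ...
        x0, x1 = x1, x0 - q * x1  # ... while the cofactors grow
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0
```

Each quotient step shrinks the remainders and enlarges the cofactors by a
similar amount, which is the growth referred to above.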
8668
8669@node Jacobi Symbol,  , Extended GCD, Greatest Common Divisor Algorithms
8670@subsection Jacobi Symbol
8671@cindex Jacobi symbol algorithm
8672
8673@code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a
8674simple binary algorithm similar to that described for the GCDs (@pxref{Binary
8675GCD}).  They're not very fast when both inputs are large.  Lehmer's multi-step
8676improvement or a binary based multi-step algorithm is likely to be better.
8677
8678When one operand fits a single limb, and that includes @code{mpz_kronecker_ui}
8679and friends, an initial reduction is done with either @code{mpn_mod_1} or
8680@code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb.
8681The binary algorithm is well suited to a single limb, and the whole
8682calculation in this case is quite efficient.
8683
8684In all the routines sign changes for the result are accumulated using some bit
8685twiddling, avoiding table lookups or conditional jumps.
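
A binary Jacobi symbol calculation with the sign kept in a single bit might
look as follows.  This is a sketch in Python under the assumption of an odd
positive @math{n}; the function name and the exact bit tricks are illustrative
and are not taken from the GMP sources.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, by shifts and subtractions."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    bit = 0  # the result is (-1)**bit
    while a != 0:
        while a & 1 == 0:
            a >>= 1
            # (2/n) is -1 exactly when n is 3 or 5 mod 8
            bit ^= ((n >> 1) ^ (n >> 2)) & 1
        if a < n:
            # reciprocity: the sign flips when a and n are both 3 mod 4
            bit ^= ((a & n) >> 1) & 1
            a, n = n, a
        a -= n  # (a/n) == ((a-n)/n), and a-n is even or zero
    return 1 - 2 * bit if n == 1 else 0
```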
8686
8687
8688@need 1000
8689@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
8690@section Powering Algorithms
8691@cindex Powering algorithms
8692
8693@menu
8694* Normal Powering Algorithm::
8695* Modular Powering Algorithm::
8696@end menu
8697
8698
8699@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
8700@subsection Normal Powering
8701
8702Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
8703successively squaring and then multiplying by the base when a 1 bit is seen in
8704the exponent, as per Knuth section 4.6.3.  The ``left to right''
8705variant described there is used rather than algorithm A, since it's just as
8706easy and can be done with somewhat less temporary memory.
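
The left to right method can be sketched in a few lines.  This is Python for
illustration only, with the hypothetical name @code{power}; it is not the GMP
code.

```python
def power(base, exp):
    """Left-to-right binary powering for an integer exp >= 0."""
    result = 1
    for bit in bin(exp)[2:]:  # exponent bits, most significant first
        result *= result      # square for every bit
        if bit == "1":
            result *= base    # multiply by the base on a 1 bit
    return result
```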
8707
8708
8709@node Modular Powering Algorithm,  , Normal Powering Algorithm, Powering Algorithms
8710@subsection Modular Powering
8711
8712Modular powering is implemented using a @math{2^k}-ary sliding window
8713algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85
8714(@pxref{References}).  @math{k} is chosen according to the size of the
8715exponent.  Larger exponents use larger values of @math{k}, the choice being
8716made to minimize the average number of multiplications that must supplement
8717the squaring.
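
A sliding window exponentiation can be sketched as below.  This is an
illustrative Python version with a fixed default window size @code{k}; the
name @code{powm_sliding} is hypothetical, and plain @code{%} reductions are
used where GMP would use a division or REDC.

```python
def powm_sliding(base, exp, mod, k=4):
    """Return base**exp % mod using a 2^k-ary sliding window (exp >= 0)."""
    if mod == 1:
        return 0
    base %= mod
    # Table of odd powers: odd[j] == base**(2*j + 1) % mod.
    square = base * base % mod
    odd = [base]
    for _ in range((1 << (k - 1)) - 1):
        odd.append(odd[-1] * square % mod)
    result = 1
    i = exp.bit_length() - 1
    while i >= 0:
        if not (exp >> i) & 1:
            result = result * result % mod  # a 0 bit costs one squaring
            i -= 1
            continue
        # Longest window of at most k bits that starts at bit i and ends in a 1.
        low = max(i - k + 1, 0)
        while not (exp >> low) & 1:
            low += 1
        width = i - low + 1
        window = (exp >> low) & ((1 << width) - 1)
        for _ in range(width):
            result = result * result % mod
        result = result * odd[window >> 1] % mod
        i = low - 1
    return result
```

Only odd powers are stored because a window always ends in a 1 bit, which
halves the table size for a given @math{k}.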
8718
8719The modular multiplies and squares use either a simple division or the REDC
8720method by Montgomery (@pxref{References}).  REDC is a little faster,
8721essentially saving N single limb divisions in a fashion similar to an exact
8722remainder (@pxref{Exact Remainder}).
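
The REDC step itself is short.  The sketch below is Python working on whole
integers rather than limb by limb, so it shows the arithmetic but not the
limb-level savings; the function names are hypothetical and the modulus is
assumed to be odd.

```python
def redc(t, n, n_neg_inv, r_bits):
    """Return t / R mod n, where R = 2**r_bits, n is odd and 0 <= t < n * R."""
    mask = (1 << r_bits) - 1
    m = ((t & mask) * n_neg_inv) & mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> r_bits            # the exact division by R is just a shift
    return u - n if u >= n else u


def mulmod_redc(a, b, n, r_bits):
    """Return a*b mod n for odd n < 2**r_bits, going through Montgomery form."""
    r = 1 << r_bits
    n_neg_inv = pow(-n, -1, r)           # -1/n mod R (needs Python 3.8 or later)
    a_m, b_m = a * r % n, b * r % n      # convert to Montgomery form
    ab_m = redc(a_m * b_m, n, n_neg_inv, r_bits)
    return redc(ab_m, n, n_neg_inv, r_bits)  # convert back
```

In a powering loop the conversions to and from Montgomery form are done once,
so each multiply or square costs one @code{redc} and no division by @math{n}.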
8723
8724
8725@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
8726@section Root Extraction Algorithms
8727@cindex Root extraction algorithms
8728
8729@menu
8730* Square Root Algorithm::
8731* Nth Root Algorithm::
8732* Perfect Square Algorithm::
8733* Perfect Power Algorithm::
8734@end menu
8735
8736
8737@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
8738@subsection Square Root
8739@cindex Square root algorithm
8740@cindex Karatsuba square root algorithm
8741
8742Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
8743Zimmermann (@pxref{References}).
8744
8745An input @math{n} is split into four parts of @math{k} bits each, so with
8746@math{b=2^k} we have @m{n = a_3b^3 + a_2b^2 + a_1b + a_0, n = a3*b^3 + a2*b^2
8747+ a1*b + a0}.  Part @ms{a,3} must be ``normalized'' so that either the high or
8748second highest bit is set.  In GMP, @math{k} is kept on a limb boundary and
8749the input is left shifted (by an even number of bits) to normalize.
8750
8751The square root of the high two parts is taken, by recursive application of
8752the algorithm (bottoming out in a one-limb Newton's method),
8753@tex
8754$$ s',r' = \mathop{\rm sqrtrem} \> (a_3b + a_2) $$
8755@end tex
8756@ifnottex
8757
8758@example
8759s1,r1 = sqrtrem (a3*b + a2)
8760@end example
8761
8762@end ifnottex
8763This is an approximation to the desired root and is extended by a division to
8764give @math{s},@math{r},
8765@tex
8766$$\eqalign{
8767q,u &= \mathop{\rm divrem} \> (r'b + a_1, 2s') \cr
8768s &= s'b + q \cr
8769r &= ub + a_0 - q^2
8770}$$
8771@end tex
8772@ifnottex
8773
8774@example
8775q,u = divrem (r1*b + a1, 2*s1)
8776s = s1*b + q
8777r = u*b + a0 - q^2
8778@end example
8779
8780@end ifnottex
8781The normalization requirement on @ms{a,3} means at this point @math{s} is
8782either correct or 1 too big.  @math{r} is negative in the latter case, so
8783@tex
8784$$\eqalign{
8785\mathop{\rm if} \; r &< 0 \; \mathop{\rm then} \cr
8786r &\leftarrow r + 2s - 1 \cr
8787s &\leftarrow s - 1
8788}$$
8789@end tex
8790@ifnottex
8791
8792@example
8793if r < 0 then
8794  r = r + 2*s - 1
8795  s = s - 1
8796@end example
8797
8798@end ifnottex
8799The algorithm is expressed in a divide and conquer form, but as noted in the
8800paper it can also be viewed as a discrete variant of Newton's method, or as a
8801variation on the schoolboy method (no longer taught) for square roots two
8802digits at a time.
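
The steps above can be sketched as follows.  This is an illustrative Python
version: parts are split on bit rather than limb boundaries, the base case is
@code{math.isqrt} on values below @math{2^64} rather than a one-limb Newton's
method, and the remainder is simply recomputed after undoing the
normalization.  The name @code{sqrtrem} is hypothetical.

```python
import math


def sqrtrem(n):
    """Return (s, r) with s == floor(sqrt(n)) and r == n - s*s, for n >= 0."""
    if n < (1 << 64):
        s = math.isqrt(n)
        return s, n - s * s
    # Normalize: shift left by an even count so the top part a3 has its
    # high or second highest bit set, with four parts of k bits each.
    length = n.bit_length()
    k = (length + 3) // 4
    shift = (4 * k - length) & ~1
    m = n << shift
    mask = (1 << k) - 1
    a0 = m & mask
    a1 = (m >> k) & mask
    s1, r1 = sqrtrem(m >> (2 * k))         # root of the high two parts
    q, u = divmod((r1 << k) + a1, 2 * s1)  # extend by a division
    s = (s1 << k) + q
    r = (u << k) + a0 - q * q
    if r < 0:                              # s was 1 too big
        r += 2 * s - 1
        s -= 1
    s >>= shift // 2                       # undo the normalization
    return s, n - s * s
```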
8803
8804If the remainder @math{r} is not required then usually only a few high limbs
8805of @math{r} and @math{u} need to be calculated to determine whether an
8806adjustment to @math{s} is required.  This optimization is not currently
8807implemented.
8808
8809In the Karatsuba multiplication range this algorithm is @m{O({3\over2}
8810M(N/2)),O(1.5*M(N/2))}, where @math{M(n)} is the time to multiply two numbers
8811of @math{n} limbs.  In the FFT multiplication range this grows to a bound of
8812@m{O(6 M(N/2)),O(6*M(N/2))}.  In practice a factor of about 1.5 to 1.8 is
8813found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.
8814
8815The algorithm does all its calculations in integers and the resulting
8816@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
8817The extended precision given by @code{mpf_sqrt_ui} is obtained by
8818padding with zero limbs.
8819
8820
8821@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
8822@subsection Nth Root
8823@cindex Root extraction algorithm
8824@cindex Nth root algorithm
8825
8826Integer Nth roots are taken using Newton's method with the following
8827iteration, where @math{A} is the input and @math{n} is the root to be taken.
8828@tex
8829$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
8830@end tex
8831@ifnottex
8832
8833@example
8834         1         A
8835a[i+1] = - * ( --------- + (n-1)*a[i] )
8836         n     a[i]^(n-1)
8837@end example
8838
8839@end ifnottex
8840The initial approximation @m{a_1,a[1]} is generated bitwise by successively
8841powering a trial root with or without new 1 bits, aiming to be just above the
8842true root.  The iteration converges quadratically when started from a good
8843approximation.  When @math{n} is large more initial bits are needed to get
8844good convergence.  The current implementation is not particularly well
8845optimized.
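
The iteration can be sketched with integer arithmetic as below.  This Python
version is an illustration only: it starts from a power of 2 known to be above
the true root rather than from the bitwise initial approximation described,
and the name @code{iroot} is hypothetical.

```python
def iroot(A, n):
    """Return floor(A ** (1/n)) for integers A >= 0 and n >= 1."""
    if A < 2:
        return A
    # Start from a power of 2 that is not below the true root.
    a = 1 << -(-A.bit_length() // n)
    while True:
        nxt = ((n - 1) * a + A // a ** (n - 1)) // n
        if nxt >= a:  # no further decrease: a is the floor of the root
            return a
        a = nxt
```

Starting above the root makes the integer iterates decrease monotonically, so
the first step that fails to decrease identifies the result.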
8846
8847
8848@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
8849@subsection Perfect Square
8850@cindex Perfect square algorithm
8851
8852A significant fraction of non-squares can be quickly identified by checking
8853whether the input is a quadratic residue modulo small integers.
8854
8855@code{mpz_perfect_square_p} first tests the input mod 256, which means just
8856examining the low byte.  Only 44 different values occur for squares mod 256,
8857so 82.8% of inputs can be immediately identified as non-squares.
8858
8859On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total
886099.25% of inputs identified as non-squares.  On a 64-bit system 97 is tested
8861too, for a total 99.62%.
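
These percentages can be reproduced by counting square residues.  The
following Python sketch uses hypothetical helper names and relies on the
moduli being pairwise coprime, so that the fractions of residues multiply.

```python
from fractions import Fraction


def square_residue_count(m):
    """Number of distinct values of x*x mod m."""
    return len(set(x * x % m for x in range(m)))


def fraction_passing(moduli):
    """Fraction of inputs that look like squares modulo every given modulus."""
    f = Fraction(1)
    for m in moduli:
        f *= Fraction(square_residue_count(m), m)
    return f


rejected_32 = 1 - fraction_passing([256, 9, 5, 7, 13, 17])
rejected_64 = 1 - fraction_passing([256, 9, 5, 7, 13, 17, 97])
```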
8862
8863These moduli are chosen because they're factors of @math{2^@W{24}-1} (or
8864@math{2^@W{48}-1} for 64-bits), and such a remainder can be quickly taken just
8865using additions (see @code{mpn_mod_34lsub1}).
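
The idea is that @math{2^@W{24}} is congruent to 1 modulo @math{2^@W{24}-1},
so 24-bit pieces of the input can simply be added together.  The Python sketch
below shows only that idea, with a hypothetical function name; the real
@code{mpn_mod_34lsub1} works on limbs and is organised differently.

```python
def mod_by_adds(n):
    """Return a value congruent to n modulo 2**24 - 1, for n >= 0.

    The result is not fully reduced; it is only meant to be reduced
    further by the small moduli that divide 2**24 - 1.
    """
    total = 0
    while n:
        total += n & 0xFFFFFF  # add up the 24-bit pieces
        n >>= 24
    return total
```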
8866
8867When nails are in use moduli are instead selected by the @file{gen-psqr.c}
8868program and applied with an @code{mpn_mod_1}.  The same @math{2^@W{24}-1} or
8869@math{2^@W{48}-1} could be done with nails using some extra bit shifts, but
8870this is not currently implemented.
8871
8872In any case each modulus is applied to the @code{mpn_mod_34lsub1} or
8873@code{mpn_mod_1} remainder and a table lookup identifies non-squares.  By
8874using a ``modexact'' style calculation, and suitably permuted tables, just one
8875multiply each is required, see the code for details.  Moduli are also combined
8876to save operations, so long as the lookup tables don't become too big.
8877@file{gen-psqr.c} does all the pre-calculations.
8878
8879A square root must still be taken for any value that passes these tests, to
8880verify it's really a square and not one of the small fraction of non-squares
8881that get through (ie.@: a pseudo-square to all the tested bases).
8882
Clearly more residue tests could be done; @code{mpz_perfect_square_p} only
uses a compact and efficient set.  Big inputs would probably benefit from more
8885residue testing, small inputs might be better off with less.  The assumed
8886distribution of squares versus non-squares in the input would affect such
8887considerations.
8888
8889
8890@node Perfect Power Algorithm,  , Perfect Square Algorithm, Root Extraction Algorithms
8891@subsection Perfect Power
8892@cindex Perfect power algorithm
8893
8894Detecting perfect powers is required by some factorization algorithms.
8895Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
8896extractions, though naturally only prime roots need to be considered.
8897(@xref{Nth Root Algorithm}.)
8898
8899If a prime divisor @math{p} with multiplicity @math{e} can be found, then only
8900roots which are divisors of @math{e} need to be considered, much reducing the
8901work necessary.  To this end divisibility by a set of small primes is checked.
8902
8903
8904@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
8905@section Radix Conversion
8906@cindex Radix conversion algorithms
8907
8908Radix conversions are less important than other algorithms.  A program
8909dominated by conversions should probably use a different data representation.
8910
8911@menu
8912* Binary to Radix::
8913* Radix to Binary::
8914@end menu
8915
8916
8917@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
8918@subsection Binary to Radix
8919
8920Conversions from binary to a power-of-2 radix use a simple and fast
8921@math{O(N)} bit extraction algorithm.
8922
8923Conversions from binary to other radices use one of two algorithms.  Sizes
8924below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.
8925Repeated divisions by @math{b^n} are made, where @math{b} is the radix and
8926@math{n} is the biggest power that fits in a limb.  But instead of simply
8927using the remainder @math{r} from such divisions, an extra divide step is done
8928to give a fractional limb representing @math{r/b^n}.  The digits of @math{r}
8929can then be extracted using multiplications by @math{b} rather than divisions.
8930Special case code is provided for decimal, allowing multiplications by 10 to
8931optimize to shifts and adds.
8932
8933Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
8934For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are
8935calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is
8936reached.  @math{t} is then divided by that largest power, giving a quotient
8937which is the digits above that power, and a remainder which is those below.
8938These two parts are in turn divided by the second highest power, and so on
8939recursively.  When a piece has been divided down to less than
8940@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is
8941used.
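
The recursion can be sketched as follows.  This Python version is a
simplification: it uses powers of the form @math{b^(2^i)} and recurses all the
way down to single digits, whereas the description above uses limb-sized
@math{b^n} and a basecase threshold.  The names @code{to_radix} and
@code{DIGITS} are hypothetical.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_radix(t, base=10):
    """Convert t >= 0 to a digit string by repeated halving divisions."""
    if t == 0:
        return "0"
    # powers[i] == base**(2**i); stop once the square of the last exceeds t.
    powers = [base]
    while powers[-1] * powers[-1] <= t:
        powers.append(powers[-1] * powers[-1])

    def convert(x, i):
        # x < powers[i]**2; returns exactly 2**(i+1) digits, zero padded.
        if i < 0:
            return DIGITS[x]
        q, r = divmod(x, powers[i])  # digits above and below that power
        return convert(q, i - 1) + convert(r, i - 1)

    return convert(t, len(powers) - 1).lstrip("0")
```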
8942
8943The advantage of this algorithm is that big divisions can make use of the
8944sub-quadratic divide and conquer division (@pxref{Divide and Conquer
Division}), and big divisions tend to have lower overheads than lots of
8946separate single limb divisions anyway.  But in any case the cost of
8947calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.
8948
8949@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
8950the same basic thing, the point where it becomes worth doing a big division to
8951cut the input in half.  @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
8952of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
8953assumes that's already available, which is the case when recursing.
8954
Since the base case produces digits from least to most significant but they
must be stored from most to least, it's necessary to calculate in advance
8957how many digits there will be, or at least be sure not to underestimate that.
8958For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
8959from @code{mp_bases}, rounding up.  The result is either correct or one too
8960big.
8961
8962Examining some of the high bits of the input could increase the chance of
8963getting the exact number of digits, but an exact result every time would not
8964be practical, since in general the difference between numbers 100@dots{} and
896599@dots{} is only in the last few bits and the work to identify 99@dots{}
8966might well be almost as much as a full conversion.
8967
@code{mpf_get_str} doesn't currently use the algorithm described here.  It
multiplies or divides by a power of @math{b} to move the radix point to
just above the highest non-zero digit (or at worst one above that location),
8971then multiplies by @math{b^n} to bring out digits.  This is @math{O(N^2)} and
8972is certainly not optimal.
8973
8974The @math{r/b^n} scheme described above for using multiplications to bring out
8975digits might be useful for more than a single limb.  Some brief experiments
8976with it on the base case when recursing didn't give a noticeable improvement,
8977but perhaps that was only due to the implementation.  Something similar would
8978work for the sub-quadratic divisions too, though there would be the cost of
8979calculating a bigger radix power.
8980
8981Another possible improvement for the sub-quadratic part would be to arrange
8982for radix powers that balanced the sizes of quotient and remainder produced,
8983ie.@: the highest power would be an @m{b^{nk},b^(n*k)} approximately equal to
8984@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor.  That ought to
8985smooth out a graph of times against sizes, but may or may not be a net
8986speedup.
8987
8988
8989@node Radix to Binary,  , Binary to Radix, Radix Conversion Algorithms
8990@subsection Radix to Binary
8991
8992@strong{This section needs to be rewritten, it currently describes the
8993algorithms used before GMP 4.3.}
8994
8995Conversions from a power-of-2 radix into binary use a simple and fast
8996@math{O(N)} bitwise concatenation algorithm.
8997
8998Conversions from other radices use one of two algorithms.  Sizes below
8999@code{SET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.  Groups
9000of @math{n} digits are converted to limbs, where @math{n} is the biggest
9001power of the base @math{b} which will fit in a limb, then those groups are
9002accumulated into the result by multiplying by @math{b^n} and adding.  This
9003saves multi-precision operations, as per Knuth section 4.4 part E
9004(@pxref{References}).  Some special case code is provided for decimal, giving
9005the compiler a chance to optimize multiplications by 10.
9006
9007Above @code{SET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
9008First groups of @math{n} digits are converted into limbs.  Then adjacent
9009limbs are combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x}
9010and @math{y} are the limbs.  Adjacent limb pairs are combined into quads
9011similarly with @m{xb^{2n}+y,x*b^(2n)+y}.  This continues until a single block
9012remains, that being the result.
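
The pairwise combining can be sketched as below.  This Python version is
illustrative only: it starts from single digits rather than limb-sized groups
of @math{n} digits, and the name @code{from_radix} is hypothetical.

```python
def from_radix(text, base=10):
    """Convert a non-empty digit string to an integer by pairwise combining."""
    blocks = [int(ch, base) for ch in reversed(text)]  # least significant first
    power = base  # base**(number of digits held in each block)
    while len(blocks) > 1:
        if len(blocks) % 2:
            blocks.append(0)  # pad the most significant end
        # Combine neighbours as x*power + y, with x the more significant block.
        blocks = [blocks[i] + blocks[i + 1] * power
                  for i in range(0, len(blocks), 2)]
        power *= power
    return blocks[0]
```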
9013
9014The advantage of this method is that the multiplications for each @math{x} are
9015big blocks, allowing Karatsuba and higher algorithms to be used.  But the cost
9016of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome.
9017@code{SET_STR_PRECOMPUTE_THRESHOLD} usually ends up quite big, around 5000 digits, and on
9018some processors much bigger still.
9019
9020@code{SET_STR_PRECOMPUTE_THRESHOLD} is based on the input digits (and tuned
9021for decimal), though it might be better based on a limb count, so as to be
9022independent of the base.  But that sort of count isn't used by the base case
9023and so would need some sort of initial calculation or estimate.
9024
9025The main reason @code{SET_STR_PRECOMPUTE_THRESHOLD} is so much bigger than the
9026corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is
9027much faster than @code{mpn_divrem_1} (often by a factor of 5, or more).
9028
9029
9030@need 1000
9031@node Other Algorithms, Assembly Coding, Radix Conversion Algorithms, Algorithms
9032@section Other Algorithms
9033
9034@menu
9035* Prime Testing Algorithm::
9036* Factorial Algorithm::
9037* Binomial Coefficients Algorithm::
9038* Fibonacci Numbers Algorithm::
9039* Lucas Numbers Algorithm::
9040* Random Number Algorithms::
9041@end menu
9042
9043
9044@node Prime Testing Algorithm, Factorial Algorithm, Other Algorithms, Other Algorithms
9045@subsection Prime Testing
9046@cindex Prime testing algorithms
9047
9048The primality testing in @code{mpz_probab_prime_p} (@pxref{Number Theoretic
9049Functions}) first does some trial division by small factors and then uses the
9050Miller-Rabin probabilistic primality testing algorithm, as described in Knuth
9051section 4.5.4 algorithm P (@pxref{References}).
9052
9053For an odd input @math{n}, and with @math{n = q@GMPmultiply{}2^k+1} where
9054@math{q} is odd, this algorithm selects a random base @math{x} and tests
whether @math{x^q @bmod{} n} is 1 or @math{-1}, or an @m{x^{q2^j} \bmod n,
x^(q*2^j) mod n} is @math{-1}, for @math{1@le{}j<k}.  If so then @math{n}
is probably prime; if not then @math{n} is definitely composite.
9058
9059Any prime @math{n} will pass the test, but some composites do too.  Such
9060composites are known as strong pseudoprimes to base @math{x}.  No @math{n} is
9061a strong pseudoprime to more than @math{1/4} of all bases (see Knuth exercise
906222), hence with @math{x} chosen at random there's no more than a @math{1/4}
9063chance a ``probable prime'' will in fact be composite.
9064
9065In fact strong pseudoprimes are quite rare, making the test much more
9066powerful than this analysis would suggest, but @math{1/4} is all that's proven
9067for an arbitrary @math{n}.
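
A strong probable prime test along the lines of Knuth's algorithm P can be
sketched as below: the base passes if @math{x^q} is 1 or @math{-1}, or if one
of the subsequent squarings short of @math{x^(n-1)} gives @math{-1}.  This is
illustrative Python; the function names, the list of trial divisors and the
default repetition count are assumptions, not GMP's.

```python
import random


def strong_probable_prime(n, x):
    """Strong test of an odd n > 2 to the base x."""
    q, k = n - 1, 0
    while q % 2 == 0:
        q //= 2
        k += 1
    y = pow(x, q, n)
    if y == 1 or y == n - 1:
        return True
    for _ in range(k - 1):
        y = y * y % n
        if y == n - 1:
            return True
    return False


def probab_prime(n, reps=25, rng=random):
    """Trial division by a few small primes, then reps random strong tests."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    return all(strong_probable_prime(n, rng.randrange(2, n - 1))
               for _ in range(reps))
```

For example 2047 = 23*89 passes the strong test to base 2 but fails to
base 3, which is why several bases are tried.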
9068
9069
9070@node Factorial Algorithm, Binomial Coefficients Algorithm, Prime Testing Algorithm, Other Algorithms
9071@subsection Factorial
9072@cindex Factorial algorithm
9073
9074Factorials are calculated by a combination of removal of twos, powering, and
9075binary splitting.  The procedure can be best illustrated with an example,
9076
9077@quotation
9078@math{23! = 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16.17.18.19.20.21.22.23}
9079@end quotation
9080
9081@noindent
9082has factors of two removed,
9083
9084@quotation
9085@math{23! = 2^{19}.1.1.3.1.5.3.7.1.9.5.11.3.13.7.15.1.17.9.19.5.21.11.23}
9086@end quotation
9087
9088@noindent
9089and the resulting terms collected up according to their multiplicity,
9090
9091@quotation
9092@math{23! = 2^{19}.(3.5)^3.(7.9.11)^2.(13.15.17.19.21.23)}
9093@end quotation
9094
9095Each sequence such as @math{13.15.17.19.21.23} is evaluated by splitting into
9096every second term, as for instance @math{(13.17.21).(15.19.23)}, and the same
9097recursively on each half.  This is implemented iteratively using some bit
9098twiddling.
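
The whole procedure can be sketched as follows.  This Python version is
recursive rather than iterative and makes no attempt at the bit twiddling
mentioned; the names @code{product} and @code{factorial} are hypothetical.
It uses the fact that the odd numbers in the range @math{(n/2^(j+1), n/2^j]}
each occur with multiplicity @math{j+1}.

```python
def product(terms):
    """Multiply a list of integers by splitting into every second term."""
    if not terms:
        return 1
    if len(terms) == 1:
        return terms[0]
    return product(terms[0::2]) * product(terms[1::2])


def factorial(n):
    """n! by removal of twos, powering and binary splitting, for n >= 0."""
    result = 1
    twos = 0
    j = 0
    while n >> j:
        lo, hi = n >> (j + 1), n >> j
        # Odd numbers in (lo, hi] appear j+1 times among the odd parts.
        odds = list(range(lo + 1 + (lo % 2), hi + 1, 2))
        result *= product(odds) ** (j + 1)
        twos += lo  # accumulates floor(n/2) + floor(n/4) + ...
        j += 1
    return result << twos
```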
9099
9100Such splitting is more efficient than repeated N@cross{}1 multiplies since it
9101forms big multiplies, allowing Karatsuba and higher algorithms to be used.
9102And even below the Karatsuba threshold a big block of work can be more
9103efficient for the basecase algorithm.
9104
9105Splitting into subsequences of every second term keeps the resulting products
9106more nearly equal in size than would the simpler approach of say taking the
9107first half and second half of the sequence.  Nearly equal products are more
9108efficient for the current multiply implementation.
9109
9110
9111@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms
9112@subsection Binomial Coefficients
9113@cindex Binomial coefficient algorithm
9114
9115Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated
9116by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) =
9117\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then
9118evaluating the following product simply from @math{i=2} to @math{i=k}.
9119@tex
9120$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
9121@end tex
9122@ifnottex
9123
9124@example
9125                      k  (n-k+i)
9126C(n,k) =  (n-k+1) * prod -------
9127                     i=2    i
9128@end example
9129
9130@end ifnottex
9131It's easy to show that each denominator @math{i} will divide the product so
9132far, so the exact division algorithm is used (@pxref{Exact Division}).
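
The product can be evaluated as in the following Python sketch, where each
@code{//} is an exact division.  The function name is hypothetical, and the
accumulation of several factors into a limb is not shown.

```python
def binomial(n, k):
    """Binomial coefficient C(n, k) for integers n >= 0."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)  # arrange k <= n/2
    if k == 0:
        return 1
    result = n - k + 1
    for i in range(2, k + 1):
        # result * (n-k+i) is i times C(n-k+i, i), so the division is exact
        result = result * (n - k + i) // i
    return result
```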
9133
The numerators @math{n-k+i} and denominators @math{i} are first accumulated
into as many as fit in a limb, to save multi-precision operations, though for
9136@code{mpz_bin_ui} this applies only to the divisors, since @math{n} is an
9137@code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all.
9138
9139
9140@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
9141@subsection Fibonacci Numbers
9142@cindex Fibonacci number algorithm
9143
9144The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
9145for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
9146values efficiently.
9147
9148For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is
9149used.  On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
9150up to @m{F_{93},F[93]}.  For convenience the table starts at @m{F_{-1},F[-1]}.
9151
9152Beyond the table, values are generated with a binary powering algorithm,
9153calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
9154low across the bits of @math{n}.  The formulas used are
9155@tex
9156$$\eqalign{
9157  F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
9158  F_{2k-1} &=  F_k^2 + F_{k-1}^2           \cr
9159  F_{2k}   &= F_{2k+1} - F_{2k-1}
9160}$$
9161@end tex
9162@ifnottex
9163
9164@example
9165F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
9166F[2k-1] =   F[k]^2 + F[k-1]^2
9167
9168F[2k] = F[2k+1] - F[2k-1]
9169@end example
9170
9171@end ifnottex
9172At each step, @math{k} is the high @math{b} bits of @math{n}.  If the next bit
9173of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if
9174it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process
9175repeated until all bits of @math{n} are incorporated.  Notice these formulas
9176require just two squares per bit of @math{n}.
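
The powering loop can be sketched as below.  This is illustrative Python with
a hypothetical function name; it starts from @math{k=1} rather than from a
table entry.

```python
def fib2(n):
    """Return the pair (F[n], F[n-1]) for n >= 0."""
    if n == 0:
        return 0, 1  # F[0] and F[-1]
    fk, fk1, k = 1, 0, 1  # F[1], F[0]
    for bit in bin(n)[3:]:  # bits of n below the leading 1, high to low
        sq_k, sq_k1 = fk * fk, fk1 * fk1  # the two squares for this bit
        sign = 2 if k % 2 == 0 else -2    # 2*(-1)^k
        f2k_plus1 = 4 * sq_k - sq_k1 + sign
        f2k_minus1 = sq_k + sq_k1
        f2k = f2k_plus1 - f2k_minus1
        if bit == "1":
            fk, fk1, k = f2k_plus1, f2k, 2 * k + 1
        else:
            fk, fk1, k = f2k, f2k_minus1, 2 * k
    return fk, fk1
```

The same function confirms the table limits quoted above, since
@code{fib2(47)[0]} is below @math{2^32} and @code{fib2(93)[0]} is below
@math{2^64}, while the next values are not.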
9177
9178It'd be possible to handle the first few @math{n} above the single limb table
9179with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} =
9180F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually
9181turns out to be faster for only about 10 or 20 values of @math{n}, and
9182including a block of code for just those doesn't seem worthwhile.  If they
9183really mattered it'd be better to extend the data table.
9184
9185Using a table avoids lots of calculations on small numbers, and makes small
9186@math{n} go fast.  A bigger table would make more small @math{n} go fast, it's
9187just a question of balancing size against desired speed.  For GMP the code is
9188kept compact, with the emphasis primarily on a good powering algorithm.
9189
9190@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but
9191@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}.  In this case the last
9192step of the algorithm can become one multiply instead of two squares.  One of
9193the following two formulas is used, according as @math{n} is odd or even.
9194@tex
9195$$\eqalign{
9196  F_{2k}   &= F_k (F_k + 2F_{k-1}) \cr
9197  F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k
9198}$$
9199@end tex
9200@ifnottex
9201
9202@example
F[2k]   = F[k]*(F[k]+2*F[k-1])

F[2k+1] = (2*F[k]+F[k-1])*(2*F[k]-F[k-1]) + 2*(-1)^k
9206@end example
9207
9208@end ifnottex
9209@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a
9210multiply.  For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above
9211can be applied just to the low limb of the calculation, without a carry or
9212borrow into further limbs, which saves some code size.  See comments with
9213@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done.
9214
9215
9216@node Lucas Numbers Algorithm, Random Number Algorithms, Fibonacci Numbers Algorithm, Other Algorithms
9217@subsection Lucas Numbers
9218@cindex Lucas number algorithm
9219
9220@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci
9221numbers with the following simple formulas.
9222@tex
9223$$\eqalign{
9224  L_k     &=  F_k + 2F_{k-1} \cr
9225  L_{k-1} &= 2F_k -  F_{k-1}
9226}$$
9227@end tex
9228@ifnottex
9229
9230@example
9231L[k]   =   F[k] + 2*F[k-1]
9232L[k-1] = 2*F[k] -   F[k-1]
9233@end example
9234
9235@end ifnottex
9236@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be
9237saved.  Trailing zero bits on @math{n} can be handled with a single square
9238each.
9239@tex
9240$$ L_{2k} = L_k^2 - 2(-1)^k $$
9241@end tex
9242@ifnottex
9243
9244@example
9245L[2k] = L[k]^2 - 2*(-1)^k
9246@end example
9247
9248@end ifnottex
9249And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci
9250numbers, similar to what @code{mpz_fib_ui} does.
9251@tex
9252$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$
9253@end tex
9254@ifnottex
9255
9256@example
9257L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k
9258@end example
9259
9260@end ifnottex
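
These formulas can be combined as in the following Python sketch.  The helper
@code{fib_pair} is a deliberately simple stand-in that uses repeated addition
instead of the powering algorithm, and all the names are hypothetical.

```python
def fib_pair(k):
    """Stand-in helper: return (F[k], F[k-1]) by repeated addition."""
    a, b = 0, 1  # F[0], F[-1]
    for _ in range(k):
        a, b = a + b, a
    return a, b


def lucnum2(k):
    """Return (L[k], L[k-1]) from a pair of Fibonacci numbers."""
    fk, fk1 = fib_pair(k)
    return fk + 2 * fk1, 2 * fk - fk1


def lucnum(n):
    """Return L[n] for n >= 0."""
    if n == 0:
        return 2
    zeros = 0
    while n % 2 == 0:  # strip the trailing zero bits
        n //= 2
        zeros += 1
    k = n // 2         # now n == 2k+1, the lowest 1 bit
    fk, fk1 = fib_pair(k)
    value = 5 * fk1 * (2 * fk + fk1) - (4 if k % 2 == 0 else -4)
    m = n
    for _ in range(zeros):  # one square per trailing zero bit
        value = value * value - (2 if m % 2 == 0 else -2)
        m *= 2
    return value
```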
9261
9262
9263@node Random Number Algorithms,  , Lucas Numbers Algorithm, Other Algorithms
9264@subsection Random Numbers
9265@cindex Random number algorithms
9266
9267For the @code{urandomb} functions, random numbers are generated simply by
9268concatenating bits produced by the generator.  As long as the generator has
9269good randomness properties this will produce well-distributed @math{N} bit
9270numbers.
9271
9272For the @code{urandomm} functions, random numbers in a range @math{0@le{}R<N}
9273are generated by taking values @math{R} of @m{\lceil \log_2 N \rceil,
9274ceil(log2(N))} bits each until one satisfies @math{R<N}.  This will normally
9275require only one or two attempts, but the attempts are limited in case the
9276generator is somehow degenerate and produces only 1 bits or similar.
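
The retry loop can be sketched as below.  This is illustrative Python using
the standard @code{random} module as the bit source; the attempt limit of 80
and the subtraction used as a last resort are assumptions made for the sketch,
not details taken from this section.

```python
import random


def urandomm(n, rng=random, max_attempts=80):
    """Return a value r in the range 0 <= r < n, for n >= 1."""
    bits = (n - 1).bit_length()  # ceil(log2(n))
    r = 0
    for _ in range(max_attempts):
        r = rng.getrandbits(bits) if bits else 0
        if r < n:
            return r
    # Degenerate generator: r < 2**bits < 2*n, so r - n is in range.
    return r - n
```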
9277
9278@cindex Mersenne twister algorithm
9279The Mersenne Twister generator is by Matsumoto and Nishimura
9280(@pxref{References}).  It has a non-repeating period of @math{2^@W{19937}-1},
9281which is a Mersenne prime, hence the name of the generator.  The state is 624
9282words of 32-bits each, which is iterated with one XOR and shift for each
928332-bit word generated, making the algorithm very fast.  Randomness properties
9284are also very good and this is the default algorithm used by GMP.
9285
9286@cindex Linear congruential algorithm
9287Linear congruential generators are described in many text books, for instance
9288Knuth volume 2 (@pxref{References}).  With a modulus @math{M} and parameters
@math{A} and @math{C}, an integer state @math{S} is iterated by the formula
9290@math{S @leftarrow{} A@GMPmultiply{}S+C @bmod{} M}.  At each step the new
9291state is a linear function of the previous, mod @math{M}, hence the name of
9292the generator.
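
With a modulus of the form @math{2^N} the reduction is just a mask, as in the
following Python sketch.  The function name and the small parameters in the
usage note are illustrative only.

```python
def lcg(a, c, n_bits, seed):
    """Yield successive states S <- (a*S + c) mod 2**n_bits."""
    mask = (1 << n_bits) - 1
    s = seed & mask
    while True:
        s = (a * s + c) & mask  # mod 2**n_bits is a bitwise AND
        yield s
```

For instance @code{lcg(5, 3, 4, 0)} begins 3, 2, 13, 4 and visits all 16
possible states before repeating.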
9293
9294In GMP only moduli of the form @math{2^N} are supported, and the current
9295implementation is not as well optimized as it could be.  Overheads are
9296significant when @math{N} is small, and when @math{N} is large clearly the
9297multiply at each step will become slow.  This is not a big concern, since the
9298Mersenne Twister generator is better in every respect and is therefore
9299recommended for all normal applications.
9300
9301For both generators the current state can be deduced by observing enough
9302output and applying some linear algebra (over GF(2) in the case of the
9303Mersenne Twister).  This generally means raw output is unsuitable for
9304cryptographic applications without further hashing or the like.
9305
9306
9307@node Assembly Coding,  , Other Algorithms, Algorithms
9308@section Assembly Coding
9309@cindex Assembly coding
9310
9311The assembly subroutines in GMP are the most significant source of speed at
9312small to moderate sizes.  At larger sizes algorithm selection becomes more
9313important, but of course speedups in low level routines will still speed up
9314everything proportionally.
9315
9316Carry handling and widening multiplies that are important for GMP can't be
9317easily expressed in C@.  GCC @code{asm} blocks help a lot and are provided in
9318@file{longlong.h}, but hand coding low level routines invariably offers a
9319speedup over generic C by a factor of anything from 2 to 10.
9320
9321@menu
9322* Assembly Code Organisation::
9323* Assembly Basics::
9324* Assembly Carry Propagation::
9325* Assembly Cache Handling::
9326* Assembly Functional Units::
9327* Assembly Floating Point::
9328* Assembly SIMD Instructions::
9329* Assembly Software Pipelining::
9330* Assembly Loop Unrolling::
9331* Assembly Writing Guide::
9332@end menu
9333
9334
9335@node Assembly Code Organisation, Assembly Basics, Assembly Coding, Assembly Coding
9336@subsection Code Organisation
9337@cindex Assembly code organisation
9338@cindex Code organisation
9339
9340The various @file{mpn} subdirectories contain machine-dependent code, written
9341in C or assembly.  The @file{mpn/generic} subdirectory contains default code,
9342used when there's no machine-specific version of a particular file.
9343
Each @file{mpn} subdirectory is for an ISA family.  Generally 32-bit and
64-bit variants in a family cannot share code and have separate directories.
Within a family further subdirectories may exist for CPU variants.
9347
9348In each directory a @file{nails} subdirectory may exist, holding code with
9349nails support for that CPU variant.  A @code{NAILS_SUPPORT} directive in each
9350file indicates the nails values the code handles.  Nails code only exists
9351where it's faster, or promises to be faster, than plain code.  There's no
9352effort put into nails if they're not going to enhance a given CPU.
9353
9354
9355@node Assembly Basics, Assembly Carry Propagation, Assembly Code Organisation, Assembly Coding
9356@subsection Assembly Basics
9357
9358@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines
9359for overall GMP performance.  All multiplications and divisions come down to
9360repeated calls to these.  @code{mpn_add_n}, @code{mpn_sub_n},
9361@code{mpn_lshift} and @code{mpn_rshift} are next most important.
9362
9363On some CPUs assembly versions of the internal functions
9364@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups,
9365mainly through avoiding function call overheads.  They can also potentially
9366make better use of a wide superscalar processor, as can bigger primitives like
9367@code{mpn_addmul_2} or @code{mpn_addmul_4}.
9368
9369The restrictions on overlaps between sources and destinations
9370(@pxref{Low-level Functions}) are designed to facilitate a variety of
9371implementations.  For example, knowing @code{mpn_add_n} won't have partly
9372overlapping sources and destination means reading can be done far ahead of
9373writing on superscalar processors, and loops can be vectorized on a vector
9374processor, depending on the carry handling.
9375
9376
9377@node Assembly Carry Propagation, Assembly Cache Handling, Assembly Basics, Assembly Coding
9378@subsection Carry Propagation
9379@cindex Assembly carry propagation
9380
9381The problem that presents most challenges in GMP is propagating carries from
9382one limb to the next.  In functions like @code{mpn_addmul_1} and
9383@code{mpn_add_n}, carries are the only dependencies between limb operations.
9384
9385On processors with carry flags, a straightforward CISC style @code{adc} is
9386generally best.  AMD K6 @code{mpn_addmul_1} however is an example of an
9387unusual set of circumstances where a branch works out better.
9388
9389On RISC processors generally an add and compare for overflow is used.  This
9390sort of thing can be seen in @file{mpn/generic/aors_n.c}.  Some carry
9391propagation schemes require 4 instructions, meaning at least 4 cycles per
9392limb, but other schemes may use just 1 or 2.  On wide superscalar processors
9393performance may be completely determined by the number of dependent
9394instructions between carry-in and carry-out for each limb.
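
The add and compare idiom can be sketched in C as below.  This is an
illustration only, not the code in @file{mpn/generic/aors_n.c}.  The names
@code{limb_t} and @code{sketch_add_n} are hypothetical and a 64-bit limb is
assumed.  In this form the dependent chain from carry-in to carry-out is an
add, a compare and an OR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;        /* hypothetical stand-in for mp_limb_t */

/* Add {up,n} and {vp,n}, least significant limb first, store the sum
   at {rp,n} and return the carry out, 0 or 1.  */
limb_t
sketch_add_n (limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
  limb_t cy = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      limb_t s = up[i] + vp[i];   /* wraps modulo 2^64 */
      limb_t c1 = s < up[i];      /* overflow from adding the limbs */
      limb_t r = s + cy;
      limb_t c2 = r < s;          /* overflow from adding the carry-in */
      rp[i] = r;
      cy = c1 | c2;               /* at most one of the two is set */
    }
  return cy;
}
```

The two compares cannot both be true, since a limb addition that overflowed
leaves a sum of at most @code{0xFF@dots{}FE}, which a carry-in of 1 cannot
overflow again.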
9395
On vector processors good use can be made of the fact that a carry bit only
very rarely propagates more than one limb.  When adding a single bit to a
limb, there's only a carry out if that limb was @code{0xFF@dots{}FF} which on
random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}},
2^mp_bits_per_limb}.  @file{mpn/cray/add_n.c} is an example of this: it adds
all limbs in parallel, adds one set of carry bits in parallel and then only
rarely needs to fall through to a loop propagating further carries.
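
The three phases just described might look roughly as follows in plain C,
with ordinary loops standing in for vector operations.  This is a sketch
under stated assumptions and is not the Cray code: @code{limb_t},
@code{SKETCH_MAX} and @code{sketch_add_n_parallel} are hypothetical names, a
64-bit limb is assumed, and fixed size scratch arrays are used for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;        /* hypothetical stand-in for mp_limb_t */

#define SKETCH_MAX  64          /* arbitrary limit on n for the scratch */

limb_t
sketch_add_n_parallel (limb_t *rp, const limb_t *up, const limb_t *vp,
                       size_t n)
{
  limb_t c[SKETCH_MAX + 1];     /* c[i]: carry into limb i from pass 1 */
  limb_t d[SKETCH_MAX + 1];     /* d[i]: carry into limb i from pass 2 */
  limb_t any = 0, k = 0;
  size_t i;

  assert (n >= 1 && n <= SKETCH_MAX);

  /* Pass 1: add all limbs, each iteration independent of the others.  */
  c[0] = 0;
  for (i = 0; i < n; i++)
    {
      limb_t s = up[i] + vp[i];
      c[i + 1] = s < up[i];
      rp[i] = s;
    }

  /* Pass 2: add one set of carry bits, again independently.  A limb
     overflows here only if it was all ones and had a carry in.  */
  d[0] = 0;
  for (i = 0; i < n; i++)
    {
      limb_t r = rp[i] + c[i];
      d[i + 1] = r < c[i];
      any |= d[i + 1];
      rp[i] = r;
    }

  /* Rare case: propagate the carries that pass 2 generated.  */
  if (any)
    for (i = 1; i < n; i++)
      {
        limb_t r;
        k |= d[i];
        r = rp[i] + k;
        k = r < k;
        rp[i] = r;
      }

  return c[n] | d[n] | k;
}
```

Only the final loop has a dependency from one limb to the next, and on random
data it is almost never entered.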
9403
9404On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code
9405for the RISC style idioms that are necessary to handle carry bits in
9406C@.  Often conditional jumps are generated where @code{adc} or @code{sbb} forms
9407would be better.  And so unfortunately almost any loop involving carry bits
9408needs to be coded in assembly for best results.
9409
9410
9411@node Assembly Cache Handling, Assembly Functional Units, Assembly Carry Propagation, Assembly Coding
9412@subsection Cache Handling
9413@cindex Assembly cache handling
9414
9415GMP aims to perform well both on operands that fit entirely in L1 cache and
9416those which don't.
9417
9418Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on
9419large operands, so L2 and main memory performance is important for them.
9420@code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and
9421square basecases, so L1 performance matters most for them, unless assembly
9422versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in
9423which case the remaining uses are mostly for larger operands.
9424
9425For L2 or main memory operands, memory access times will almost certainly be
9426more than the calculation time.  The aim therefore is to maximize memory
9427throughput, by starting a load of the next cache line while processing the
9428contents of the previous one.  Clearly this is only possible if the chip has a
9429lock-up free cache or some sort of prefetch instruction.  Most current chips
9430have both these features.
9431
9432Prefetching sources combines well with loop unrolling, since a prefetch can be
9433initiated once per unrolled loop (or more than once if the loop covers more
9434than one cache line).
9435
On CPUs without write-allocate caches, prefetching destinations will ensure
individual stores don't go further down the cache hierarchy, which would limit
bandwidth.  Of course for calculations which are slow anyway, like
@code{mpn_divrem_1}, write-throughs might be fine.
9440
9441The distance ahead to prefetch will be determined by memory latency versus
9442throughput.  The aim of course is to have data arriving continuously, at peak
9443throughput.  Some CPUs have limits on the number of fetches or prefetches in
9444progress.
9445
9446If a special prefetch instruction doesn't exist then a plain load can be used,
9447but in that case care must be taken not to attempt to read past the end of an
9448operand, since that might produce a segmentation violation.
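
As a rough C illustration of one source prefetch per cache line, the sketch
below uses the GCC builtin @code{__builtin_prefetch}.  The line size,
prefetch distance and all @code{sketch_} and @code{SKETCH_} names are
assumptions made for the example, not values or names used by GMP.  The
distance is clamped to stay inside the operands, which would be essential if
a plain load were used in place of the prefetch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;        /* hypothetical stand-in for mp_limb_t */

#define SKETCH_LINE   8         /* limbs per cache line, assuming 64 bytes */
#define SKETCH_AHEAD  32        /* prefetch distance in limbs, a guess */

#if defined (__GNUC__)
#define SKETCH_PREFETCH(p)  __builtin_prefetch (p)
#else
#define SKETCH_PREFETCH(p)  ((void) (p))
#endif

/* One limb of an addition with carry, returning the carry out.  */
static limb_t
sketch_add_limb (limb_t *rp, limb_t u, limb_t v, limb_t cy)
{
  limb_t s = u + v;
  limb_t r = s + cy;
  *rp = r;
  return (s < u) | (r < s);
}

limb_t
sketch_add_n_prefetch (limb_t *rp, const limb_t *up, const limb_t *vp,
                       size_t n)
{
  limb_t cy = 0;
  size_t i = 0, j;

  assert (n >= 1);
  while (n - i >= SKETCH_LINE)
    {
      /* One prefetch per source per cache line.  The distance is
         clamped so no address beyond the operands is ever formed.  */
      if (n - i > SKETCH_AHEAD)
        {
          SKETCH_PREFETCH (up + i + SKETCH_AHEAD);
          SKETCH_PREFETCH (vp + i + SKETCH_AHEAD);
        }
      for (j = 0; j < SKETCH_LINE; j++, i++)
        cy = sketch_add_limb (rp + i, up[i], vp[i], cy);
    }
  for (; i < n; i++)            /* leftover limbs, fewer than a line */
    cy = sketch_add_limb (rp + i, up[i], vp[i], cy);
  return cy;
}
```

A suitable distance depends on memory latency and throughput as discussed
above, and would have to be found by measurement on each CPU.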
9449
9450Some CPUs or systems have hardware that detects sequential memory accesses and
9451initiates suitable cache movements automatically, making life easy.
9452
9453
9454@node Assembly Functional Units, Assembly Floating Point, Assembly Cache Handling, Assembly Coding
9455@subsection Functional Units
9456
9457When choosing an approach for an assembly loop, consideration is given to
9458what operations can execute simultaneously and what throughput can thereby be
9459achieved.  In some cases an algorithm can be tweaked to accommodate available
9460resources.
9461
9462Loop control will generally require a counter and pointer updates, costing as
9463much as 5 instructions, plus any delays a branch introduces.  CPU addressing
9464modes might reduce pointer updates, perhaps by allowing just one updating
9465pointer and others expressed as offsets from it, or on CISC chips with all
9466addressing done with the loop counter as a scaled index.
9467
The final loop control cost can be amortised by processing several limbs in
each iteration (@pxref{Assembly Loop Unrolling}).  This at least ensures loop
control isn't a big fraction of the work done.

Memory throughput is always a limit.  If perhaps only one load or one store
can be done per cycle then 3 cycles/limb will be the top speed for ``binary''
operations like @code{mpn_add_n} (two loads and one store per limb), and any
code achieving that is optimal.
9475
9476Integer resources can be freed up by having the loop counter in a float
9477register, or by pressing the float units into use for some multiplying,
9478perhaps doing every second limb on the float side (@pxref{Assembly Floating
9479Point}).
9480
9481Float resources can be freed up by doing carry propagation on the integer
9482side, or even by doing integer to float conversions in integers using bit
9483twiddling.
9484
9485
9486@node Assembly Floating Point, Assembly SIMD Instructions, Assembly Functional Units, Assembly Coding
9487@subsection Floating Point
@cindex Assembly floating point
9489
9490Floating point arithmetic is used in GMP for multiplications on CPUs with poor
9491integer multipliers.  It's mostly useful for @code{mpn_mul_1},
9492@code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and
9493@code{mpn_mul_basecase} on both 32-bit and 64-bit machines.
9494
9495With IEEE 53-bit double precision floats, integer multiplications producing up
9496to 53 bits will give exact results.  Breaking a 64@cross{}64 multiplication
9497into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient.  With
9498some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be
9499used, if one of the lower two 21-bit pieces also uses the sign bit.
9500
For the @code{mpn_mul_1} family of functions on a 64-bit machine, the
invariant single limb is split at the start, into 3 or 4 pieces.  Inside the
loop, the bignum operand is split into 32-bit pieces.  Fast conversion of
these unsigned 32-bit pieces to floating point is highly machine-dependent.
In some cases, reading the data into the integer unit, zero-extending to
64 bits, then transferring to the floating point unit via memory is the
only option.
9508
9509Converting partial products back to 64-bit limbs is usually best done as a
9510signed conversion.  Since all values are smaller than @m{2^{53},2^53}, signed
9511and unsigned are the same, but most processors lack unsigned conversions.
9512
9513@sp 2
9514
9515Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or
9516@code{mpn_addmul_1} with a 64-bit limb.  The single limb operand V is split
9517into four 16-bit parts.  The multi-limb operand U is split in the loop into
9518two 32-bit parts.
9519
9520@tex
9521\global\newdimen\GMPbits      \global\GMPbits=0.18em
9522\def\GMPbox#1#2#3{%
9523  \hbox{%
9524    \hbox to 128\GMPbits{\hfil
9525      \vbox{%
9526        \hrule
9527        \hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
9528        \hrule}%
9529      \hskip #1\GMPbits}%
9530    \raise \GMPboxdepth \hbox{\hskip 2em #3}}}
9531%
9532\GMPdisplay{%
9533  \vbox{%
9534    \hbox{%
9535      \hbox to 128\GMPbits {\hfil
9536        \vbox{%
9537          \hrule
9538          \hbox to 64\GMPbits{%
9539            \GMPvrule \hfil$v48$\hfil
9540            \vrule    \hfil$v32$\hfil
9541            \vrule    \hfil$v16$\hfil
9542            \vrule    \hfil$v00$\hfil
9543            \vrule}
9544          \hrule}}%
9545       \raise \GMPboxdepth \hbox{\hskip 2em V Operand}}
9546    \vskip 0.5ex
9547    \hbox{%
9548      \hbox to 128\GMPbits {\hfil
9549        \raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}%
9550        \vbox{%
9551          \hrule
9552          \hbox to 64\GMPbits {%
9553            \GMPvrule \hfil$u32$\hfil
9554            \vrule \hfil$u00$\hfil
9555            \vrule}%
9556          \hrule}}%
9557       \raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}%
9558    \vskip 0.5ex
9559    \hbox{\vbox to 2ex{\hrule width 128\GMPbits}}%
9560    \GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}%
9561    \vskip 0.5ex
9562    \GMPbox{16}{u00 \times v16}{$p16$}
9563    \vskip 0.5ex
9564    \GMPbox{32}{u00 \times v32}{$p32$}
9565    \vskip 0.5ex
9566    \GMPbox{48}{u00 \times v48}{$p48$}
9567    \vskip 0.5ex
9568    \GMPbox{32}{u32 \times v00}{$r32$}
9569    \vskip 0.5ex
9570    \GMPbox{48}{u32 \times v16}{$r48$}
9571    \vskip 0.5ex
9572    \GMPbox{64}{u32 \times v32}{$r64$}
9573    \vskip 0.5ex
9574    \GMPbox{80}{u32 \times v48}{$r80$}
9575}}
9576@end tex
9577@ifnottex
9578@example
9579@group
                +---+---+---+---+
                |v48|v32|v16|v00|    V operand
                +---+---+---+---+

                +-------+-------+
            x   |  u32  |  u00  |    U operand (one limb)
                +-------+-------+

---------------------------------

                    +-----------+
                    | u00 x v00 |    p00    48-bit products
                    +-----------+
                +-----------+
                | u00 x v16 |        p16
                +-----------+
            +-----------+
            | u00 x v32 |            p32
            +-----------+
        +-----------+
        | u00 x v48 |                p48
        +-----------+
            +-----------+
            | u32 x v00 |            r32
            +-----------+
        +-----------+
        | u32 x v16 |                r48
        +-----------+
    +-----------+
    | u32 x v32 |                    r64
    +-----------+
+-----------+
| u32 x v48 |                        r80
+-----------+
9614@end group
9615@end example
9616@end ifnottex
9617
9618@math{p32} and @math{r32} can be summed using floating-point addition, and
9619likewise @math{p48} and @math{r48}.  @math{p00} and @math{p16} can be summed
9620with @math{r64} and @math{r80} from the previous iteration.
9621
9622For each loop then, four 49-bit quantities are transferred to the integer unit,
9623aligned as follows,
9624
9625@tex
9626% GMPbox here should be 49 bits wide, but use 51 to better show p16+r80'
9627% crossing into the upper 64 bits.
9628\def\GMPbox#1#2#3{%
9629  \hbox{%
9630    \hbox to 128\GMPbits {%
9631      \hfil
9632      \vbox{%
9633        \hrule
9634        \hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
9635        \hrule}%
9636      \hskip #1\GMPbits}%
9637    \raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}%
9638}}
9639\newbox\b \setbox\b\hbox{64 bits}%
9640\newdimen\bw \bw=\wd\b \advance\bw by 2em
9641\newdimen\x \x=128\GMPbits
9642\advance\x by -2\bw
9643\divide\x by4
9644\GMPdisplay{%
9645  \vbox{%
9646    \hbox to 128\GMPbits {%
9647      \GMPvrule
9648      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9649      \hfil 64 bits\hfil
9650      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9651      \vrule
9652      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9653      \hfil 64 bits\hfil
9654      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9655      \vrule}%
9656    \vskip 0.7ex
9657    \GMPbox{0}{p00+r64'}{i00}
9658    \vskip 0.5ex
9659    \GMPbox{16}{p16+r80'}{i16}
9660    \vskip 0.5ex
9661    \GMPbox{32}{p32+r32}{i32}
9662    \vskip 0.5ex
9663    \GMPbox{48}{p48+r48}{i48}
9664}}
9665@end tex
9666@ifnottex
9667@example
9668@group
9669|-----64bits----|-----64bits----|
9670                   +------------+
9671                   | p00 + r64' |    i00
9672                   +------------+
9673               +------------+
9674               | p16 + r80' |        i16
9675               +------------+
9676           +------------+
9677           | p32 + r32  |            i32
9678           +------------+
9679       +------------+
9680       | p48 + r48  |                i48
9681       +------------+
9682@end group
9683@end example
9684@end ifnottex
9685
9686The challenge then is to sum these efficiently and add in a carry limb,
9687generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48}
9688extends 33 bits into the high half).
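
The arrangement in the two diagrams can be modelled in portable C as below.
This is only a sketch of the arithmetic and makes no attempt at the
scheduling an assembly loop would need.  IEEE doubles and a 64-bit limb are
assumed, and @code{limb_t}, @code{sketch_acc} and @code{sketch_mul_1_fp} are
hypothetical names.  Each product is below @math{2^48} and each sum of two
products is below @math{2^49}, so every floating point operation is exact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;        /* hypothetical stand-in for mp_limb_t */

/* Add x * 2^shift into the 128-bit accumulator hi:lo, shift < 64.  */
static void
sketch_acc (limb_t *hi, limb_t *lo, limb_t x, unsigned shift)
{
  limb_t xl = x << shift;
  limb_t xh = shift ? x >> (64 - shift) : 0;
  *lo += xl;
  *hi += xh + (*lo < xl);
}

/* {rp,n} = {up,n} * v, returning the high (carry) limb.  */
limb_t
sketch_mul_1_fp (limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
  /* The invariant limb V is split once, outside the loop.  */
  const double v00 = (double) (v & 0xFFFF);
  const double v16 = (double) ((v >> 16) & 0xFFFF);
  const double v32 = (double) ((v >> 32) & 0xFFFF);
  const double v48 = (double) (v >> 48);
  double r64 = 0.0, r80 = 0.0;  /* high products of the previous limb */
  limb_t cy = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      const double u00 = (double) (up[i] & 0xFFFFFFFF);
      const double u32 = (double) (up[i] >> 32);
      const double i00 = u00 * v00 + r64;        /* p00 + r64' */
      const double i16 = u00 * v16 + r80;        /* p16 + r80' */
      const double i32 = u00 * v32 + u32 * v00;  /* p32 + r32  */
      const double i48 = u00 * v48 + u32 * v16;  /* p48 + r48  */
      limb_t hi = 0, lo = cy;

      /* Signed conversions, then sum at offsets 0, 16, 32 and 48.  */
      sketch_acc (&hi, &lo, (limb_t) (int64_t) i00, 0);
      sketch_acc (&hi, &lo, (limb_t) (int64_t) i16, 16);
      sketch_acc (&hi, &lo, (limb_t) (int64_t) i32, 32);
      sketch_acc (&hi, &lo, (limb_t) (int64_t) i48, 48);
      rp[i] = lo;
      cy = hi;
      r64 = u32 * v32;
      r80 = u32 * v48;
    }
  /* The last limb's high products have no following iteration.  */
  return cy + (limb_t) (int64_t) r64 + ((limb_t) (int64_t) r80 << 16);
}
```

For example a single limb @code{0xFF@dots{}FF} multiplied by
@code{0xFF@dots{}FF} gives a low limb of 1 and a returned high limb of
@code{0xFF@dots{}FE}.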
9689
9690
9691@node Assembly SIMD Instructions, Assembly Software Pipelining, Assembly Floating Point, Assembly Coding
9692@subsection SIMD Instructions
9693@cindex Assembly SIMD
9694
9695The single-instruction multiple-data support in current microprocessors is
9696aimed at signal processing algorithms where each data point can be treated
9697more or less independently.  There's generally not much support for
9698propagating the sort of carries that arise in GMP.
9699
9700SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much
9701work as one 32@cross{}32 from GMP's point of view, and need some shifts and
9702adds besides.  But of course if say the SIMD form is fully pipelined and uses
9703less instruction decoding then it may still be worthwhile.
9704
9705On the x86 chips, MMX has so far found a use in @code{mpn_rshift} and
9706@code{mpn_lshift}, and is used in a special case for 16-bit multipliers in the
9707P55 @code{mpn_mul_1}.  SSE2 is used for Pentium 4 @code{mpn_mul_1},
9708@code{mpn_addmul_1}, and @code{mpn_submul_1}.
9709
9710
9711@node Assembly Software Pipelining, Assembly Loop Unrolling, Assembly SIMD Instructions, Assembly Coding
9712@subsection Software Pipelining
9713@cindex Assembly software pipelining
9714
9715Software pipelining consists of scheduling instructions around the branch
9716point in a loop.  For example a loop might issue a load not for use in the
9717present iteration but the next, thereby allowing extra cycles for the data to
9718arrive from memory.
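
The shape of such a loop can be shown in C, though a compiler is free to
reschedule it and real gains come only in assembly.  In this hypothetical
sketch (the names @code{limb_t} and @code{sketch_scale_pipelined} are
invented, a 64-bit limb is assumed, and only the low half of each product is
kept) the load for the next limb is issued before the work on the current one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;        /* hypothetical stand-in for mp_limb_t */

/* rp[i] = low limb of up[i] * m, for i = 0 .. n-1.  */
void
sketch_scale_pipelined (limb_t *rp, const limb_t *up, size_t n, limb_t m)
{
  limb_t cur, next;
  size_t i;

  assert (n >= 1);
  cur = up[0];                  /* feed-in: the first load */
  for (i = 0; i + 1 < n; i++)
    {
      next = up[i + 1];         /* load for the next iteration */
      rp[i] = cur * m;          /* work on the value loaded earlier */
      cur = next;
    }
  rp[n - 1] = cur * m;          /* wind-down: no further load */
}
```

The feed-in and wind-down parts correspond to the first load and the last
multiply, which have no partner inside the loop.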
9719
9720Naturally this is wanted only when doing things like loads or multiplies that
9721take several cycles to complete, and only where a CPU has multiple functional
9722units so that other work can be done in the meantime.
9723
9724A pipeline with several stages will have a data value in progress at each
9725stage and each loop iteration moves them along one stage.  This is like
9726juggling.
9727
If the latency of some instruction is greater than the loop time then it will
be necessary to unroll, so one register has a result ready to use while
another (or multiple others) are still in progress (@pxref{Assembly Loop
Unrolling}).
9732
9733
9734@node Assembly Loop Unrolling, Assembly Writing Guide, Assembly Software Pipelining, Assembly Coding
9735@subsection Loop Unrolling
9736@cindex Assembly loop unrolling
9737
9738Loop unrolling consists of replicating code so that several limbs are
9739processed in each loop.  At a minimum this reduces loop overheads by a
9740corresponding factor, but it can also allow better register usage, for example
9741alternately using one register combination and then another.  Judicious use of
9742@command{m4} macros can help avoid lots of duplication in the source code.
9743
9744Any amount of unrolling can be handled with a loop counter that's decremented
9745by @math{N} each time, stopping when the remaining count is less than the
9746further @math{N} the loop will process.  Or by subtracting @math{N} at the
9747start, the termination condition becomes when the counter @math{C} is less
9748than 0 (and the count of remaining limbs is @math{C+N}).
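
The second form of counter might be sketched in C as follows, for a 4-limb
unrolling of a simple ones complement operation.  The names
@code{limb_t}, @code{SKETCH_UNROLL} and @code{sketch_com_unrolled} are
hypothetical, and a simple loop is assumed for the leftover limbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;        /* hypothetical stand-in for mp_limb_t */

#define SKETCH_UNROLL  4        /* N, the number of limbs per iteration */

/* rp[i] = ~up[i], for i = 0 .. n-1.  */
void
sketch_com_unrolled (limb_t *rp, const limb_t *up, size_t n)
{
  /* Subtract N at the start, so the loop runs while C >= 0.  */
  ptrdiff_t c = (ptrdiff_t) n - SKETCH_UNROLL;

  assert (n >= 1);
  while (c >= 0)
    {
      rp[0] = ~up[0];
      rp[1] = ~up[1];
      rp[2] = ~up[2];
      rp[3] = ~up[3];
      rp += SKETCH_UNROLL;
      up += SKETCH_UNROLL;
      c -= SKETCH_UNROLL;
    }

  c += SKETCH_UNROLL;           /* C+N limbs remain, from 0 to N-1 */
  while (c-- > 0)
    *rp++ = ~*up++;
}
```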
9749
Alternatively, for a power of 2 unroll, the loop count and remainder can be
established with a shift and mask.  This is convenient if also making a
computed jump into the middle of a large loop.

Limbs left over when the size is not a multiple of the unrolling can be
handled in various ways, for example
9756
9757@itemize @bullet
9758@item
9759A simple loop at the end (or the start) to process the excess.  Care will be
9760wanted that it isn't too much slower than the unrolled part.
9761
9762@item
9763A set of binary tests, for example after an 8-limb unrolling, test for 4 more
9764limbs to process, then a further 2 more or not, and finally 1 more or not.
9765This will probably take more code space than a simple loop.
9766
9767@item
9768A @code{switch} statement, providing separate code for each possible excess,
9769for example an 8-limb unrolling would have separate code for 0 remaining, 1
9770remaining, etc, up to 7 remaining.  This might take a lot of code, but may be
9771the best way to optimize all cases in combination with a deep pipelined loop.
9772
9773@item
9774A computed jump into the middle of the loop, thus making the first iteration
9775handle the excess.  This should make times smoothly increase with size, which
9776is attractive, but setups for the jump and adjustments for pointers can be
9777tricky and could become quite difficult in combination with deep pipelining.
9778@end itemize
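
As an illustration of the binary tests approach, combined with a shift and
mask to set up the counts, here is a sketch in C of a limb copy with an
8-limb unrolling.  The names @code{limb_t} and @code{sketch_copy_unroll8} are
hypothetical, and the other approaches listed would replace only the part
after the main loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;        /* hypothetical stand-in for mp_limb_t */

/* Copy {up,n} to {rp,n}, the two not overlapping.  */
void
sketch_copy_unroll8 (limb_t *rp, const limb_t *up, size_t n)
{
  size_t loops = n >> 3;        /* n / 8, by a shift */
  size_t rem = n & 7;           /* n mod 8, by a mask */

  assert (rp != up);
  while (loops-- > 0)
    {
      rp[0] = up[0];  rp[1] = up[1];  rp[2] = up[2];  rp[3] = up[3];
      rp[4] = up[4];  rp[5] = up[5];  rp[6] = up[6];  rp[7] = up[7];
      rp += 8;
      up += 8;
    }

  if (rem & 4)                  /* 4 more limbs or not */
    {
      rp[0] = up[0];  rp[1] = up[1];  rp[2] = up[2];  rp[3] = up[3];
      rp += 4;
      up += 4;
    }
  if (rem & 2)                  /* then 2 more or not */
    {
      rp[0] = up[0];  rp[1] = up[1];
      rp += 2;
      up += 2;
    }
  if (rem & 1)                  /* and finally 1 more or not */
    rp[0] = up[0];
}
```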
9779
9780
9781@node Assembly Writing Guide,  , Assembly Loop Unrolling, Assembly Coding
9782@subsection Writing Guide
9783@cindex Assembly writing guide
9784
9785This is a guide to writing software pipelined loops for processing limb
9786vectors in assembly.
9787
First determine the algorithm and which instructions are needed.  Code it
without unrolling or scheduling, to make sure it works.  On a 3-operand CPU
try to write each new value to a new register; this will greatly simplify later
steps.

Then note for each instruction the functional unit and/or issue port
requirements.  If an instruction can use either of two units, like U0 or U1,
then make a category ``U0/U1''.  Count the total using each unit (or combined
unit), and count all instructions.
9797
9798Figure out from those counts the best possible loop time.  The goal will be to
9799find a perfect schedule where instruction latencies are completely hidden.
9800The total instruction count might be the limiting factor, or perhaps a
9801particular functional unit.  It might be possible to tweak the instructions to
9802help the limiting factor.
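
One simple way to turn the counts into a lower bound is sketched below in C.
It assumes a model in which each unit (or combined unit) can start a fixed
number of instructions per cycle and the CPU can issue a fixed total per
cycle.  That model, and the names @code{sketch_unit} and
@code{sketch_loop_bound}, are assumptions for illustration.  Real CPUs have
further restrictions which only measurement will reveal.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_unit
{
  unsigned insns;     /* instructions per iteration needing this unit */
  unsigned copies;    /* how many of them the unit can start per cycle */
};

static unsigned
sketch_ceil_div (unsigned a, unsigned b)
{
  return (a + b - 1) / b;
}

/* Best possible cycles per iteration under the model described.  */
unsigned
sketch_loop_bound (const struct sketch_unit *units, size_t nunits,
                   unsigned total_insns, unsigned issue_width)
{
  unsigned best;
  size_t i;

  assert (issue_width >= 1);
  best = sketch_ceil_div (total_insns, issue_width);
  for (i = 0; i < nunits; i++)
    {
      unsigned t;
      assert (units[i].copies >= 1);
      t = sketch_ceil_div (units[i].insns, units[i].copies);
      if (t > best)
        best = t;
    }
  return best;
}
```

For instance a loop with 3 memory operations on a single load/store unit, 2
integer operations on two integer units and 5 instructions in total on a
4-issue CPU gives a bound of 3 cycles, the load/store unit being the limiting
factor.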
9803
9804Suppose the loop time is @math{N}, then make @math{N} issue buckets, with the
9805final loop branch at the end of the last.  Now fill the buckets with dummy
9806instructions using the functional units desired.  Run this to make sure the
9807intended speed is reached.
9808
9809Now replace the dummy instructions with the real instructions from the slow
9810but correct loop you started with.  The first will typically be a load
9811instruction.  Then the instruction using that value is placed in a bucket an
9812appropriate distance down.  Run the loop again, to check it still runs at
9813target speed.
9814
9815Keep placing instructions, frequently measuring the loop.  After a few you
9816will need to wrap around from the last bucket back to the top of the loop.  If
9817you used the new-register for new-value strategy above then there will be no
9818register conflicts.  If not then take care not to clobber something already in
9819use.  Changing registers at this time is very error prone.
9820
9821The loop will overlap two or more of the original loop iterations, and the
9822computation of one vector element result will be started in one iteration of
9823the new loop, and completed one or several iterations later.
9824
9825The final step is to create feed-in and wind-down code for the loop.  A good
9826way to do this is to make a copy (or copies) of the loop at the start and
9827delete those instructions which don't have valid antecedents, and at the end
9828replicate and delete those whose results are unwanted (including any further
9829loads).
9830
9831The loop will have a minimum number of limbs loaded and processed, so the
9832feed-in code must test if the request size is smaller and skip either to a
9833suitable part of the wind-down or to special code for small sizes.
9834
9835
9836@node Internals, Contributors, Algorithms, Top
9837@chapter Internals
9838@cindex Internals
9839
9840@strong{This chapter is provided only for informational purposes and the
9841various internals described here may change in future GMP releases.
9842Applications expecting to be compatible with future releases should use only
9843the documented interfaces described in previous chapters.}
9844
9845@menu
9846* Integer Internals::
9847* Rational Internals::
9848* Float Internals::
9849* Raw Output Internals::
9850* C++ Interface Internals::
9851@end menu
9852
9853@node Integer Internals, Rational Internals, Internals, Internals
9854@section Integer Internals
9855@cindex Integer internals
9856
9857@code{mpz_t} variables represent integers using sign and magnitude, in space
9858dynamically allocated and reallocated.  The fields are as follows.
9859
9860@table @asis
9861@item @code{_mp_size}
9862The number of limbs, or the negative of that when representing a negative
9863integer.  Zero is represented by @code{_mp_size} set to zero, in which case
9864the @code{_mp_d} data is unused.
9865
9866@item @code{_mp_d}
9867A pointer to an array of limbs which is the magnitude.  These are stored
9868``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the
9869least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most
9870significant.  Whenever @code{_mp_size} is non-zero, the most significant limb
9871is non-zero.
9872
9873Currently there's always at least one limb allocated, so for instance
9874@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
9875@code{_mp_d[0]} unconditionally (though its value is then only wanted if
9876@code{_mp_size} is non-zero).
9877
9878@item @code{_mp_alloc}
9879@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
9880and naturally @code{_mp_alloc >= ABS(_mp_size)}.  When an @code{mpz} routine
9881is about to (or might be about to) increase @code{_mp_size}, it checks
9882@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
9883@code{MPZ_REALLOC} is generally used for this.
9884@end table
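
The representation can be modelled with a small C structure.  The sketch
below is not GMP's definition: the type and function names are hypothetical,
a 64-bit limb is assumed and only the three fields described above are
mirrored.  It shows a value being stored as sign and magnitude, and the size
being normalized so the most significant limb in use is non-zero.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t limb_t;        /* hypothetical stand-in for mp_limb_t */

struct sketch_mpz
{
  int alloc;                    /* limbs allocated at d, like _mp_alloc */
  int size;                     /* limbs in use, negated if negative */
  limb_t *d;                    /* magnitude, least significant first */
};

#define SKETCH_ABS(x)  ((x) >= 0 ? (x) : -(x))

/* Store a machine integer: sign in size, magnitude in d[0].  */
void
sketch_mpz_set_si (struct sketch_mpz *z, int64_t v)
{
  assert (z->alloc >= 1);
  z->d[0] = v < 0 ? (limb_t) 0 - (limb_t) v : (limb_t) v;
  z->size = (v > 0) - (v < 0);
}

/* Strip high zero limbs so the most significant limb in use is
   non-zero, keeping the sign.  */
void
sketch_mpz_normalize (struct sketch_mpz *z)
{
  int n = SKETCH_ABS (z->size);

  while (n > 0 && z->d[n - 1] == 0)
    n--;
  z->size = z->size < 0 ? -n : n;
}
```

Storing @math{-5} for example gives a size of @math{-1} and a single limb of
5, and zero gives a size of 0 with the limb data then ignored.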
9885
9886The various bitwise logical functions like @code{mpz_and} behave as if
9887negative values were twos complement.  But sign and magnitude is always used
9888internally, and necessary adjustments are made during the calculations.
9889Sometimes this isn't pretty, but sign and magnitude are best for other
9890routines.
9891
Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
allocation functions.  Care is taken to ensure that these are big enough that
no reallocation is necessary (since it would have unpredictable consequences).
9896
@code{_mp_size} and @code{_mp_alloc} are @code{int}, although @code{mp_size_t}
is usually a @code{long}.  This is done to make the fields just 32 bits on
some 64-bit systems, thereby saving a few bytes of data space but still
providing plenty of range.
9901
9902
9903@node Rational Internals, Float Internals, Integer Internals, Internals
9904@section Rational Internals
9905@cindex Rational internals
9906
9907@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
9908denominator (@pxref{Integer Internals}).
9909
The canonical form adopted is denominator positive (and non-zero), no common
factors between numerator and denominator, and zero uniquely represented as
0/1.
9913
9914It's believed that casting out common factors at each stage of a calculation
9915is best in general.  A GCD is an @math{O(N^2)} operation so it's better to do
9916a few small ones immediately than to delay and have to do a big one later.
9917Knowing the numerator and denominator have no common factors can be used for
9918example in @code{mpq_mul} to make only two cross GCDs necessary, not four.
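
The cross GCD idea can be shown with fixed size integers.  The following C
sketch is not the @code{mpq_mul} code: it uses @code{int64_t} in place of
@code{mpz_t}, ignores overflow, and the names @code{sketch_q},
@code{sketch_gcd} and @code{sketch_q_mul} are hypothetical.  Both inputs are
assumed to be in canonical form, and the result then is too.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_q
{
  int64_t num;                  /* numerator, carries the sign */
  int64_t den;                  /* denominator, always positive */
};

static int64_t
sketch_gcd (int64_t a, int64_t b)
{
  if (a < 0)
    a = -a;
  if (b < 0)
    b = -b;
  while (b != 0)
    {
      int64_t t = a % b;
      a = b;
      b = t;
    }
  return a;
}

struct sketch_q
sketch_q_mul (struct sketch_q a, struct sketch_q b)
{
  /* a.num and a.den have no common factor, nor do b.num and b.den,
     so only the two cross pairs need a GCD.  */
  int64_t g1 = sketch_gcd (a.num, b.den);
  int64_t g2 = sketch_gcd (b.num, a.den);
  struct sketch_q r;

  assert (a.den > 0 && b.den > 0);
  r.num = (a.num / g1) * (b.num / g2);
  r.den = (a.den / g2) * (b.den / g1);
  return r;
}
```

For example 2/3 times 9/4 has cross GCDs of 2 and 3, giving 3/2 directly with
no GCD needed on the full size products 18 and 12.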
9919
9920This general approach to common factors is badly sub-optimal in the presence
9921of simple factorizations or little prospect for cancellation, but GMP has no
9922way to know when this will occur.  As per @ref{Efficiency}, that's left to
9923applications.  The @code{mpq_t} framework might still suit, with
9924@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and
9925denominator, or of course @code{mpz_t} variables can be used directly.
9926
9927
9928@node Float Internals, Raw Output Internals, Rational Internals, Internals
9929@section Float Internals
9930@cindex Float internals
9931
9932Efficient calculation is the primary aim of GMP floats and the use of whole
9933limbs and simple rounding facilitates this.
9934
9935@code{mpf_t} floats have a variable precision mantissa and a single machine
9936word signed exponent.  The mantissa is represented using sign and magnitude.
9937
9938@c FIXME: The arrow heads don't join to the lines exactly.
9939@tex
9940\global\newdimen\GMPboxwidth \GMPboxwidth=5em
9941\global\newdimen\GMPboxheight \GMPboxheight=3ex
9942\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
9943\GMPdisplay{%
9944\vbox{%
9945  \hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb}
9946  \vskip 0.7ex
9947  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
9948  \hbox {
9949    \hbox to 3\GMPboxwidth {%
9950      \setbox 0 = \hbox{@code{\_mp\_exp}}%
9951      \dimen0=3\GMPboxwidth
9952      \advance\dimen0 by -\wd0
9953      \divide\dimen0 by 2
9954      \advance\dimen0 by -1em
9955      \setbox1 = \hbox{$\rightarrow$}%
9956      \dimen1=\dimen0
9957      \advance\dimen1 by -\wd1
9958      \GMPcentreline{\dimen0}%
9959      \hfil
9960      \box0%
9961      \hfil
9962      \GMPcentreline{\dimen1{}}%
9963      \box1}
9964    \hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}}
9965  \vskip 0.5ex
9966  \vbox {%
9967    \hrule
9968    \hbox{%
9969      \vrule height 2ex depth 1ex
9970      \hbox to \GMPboxwidth {}%
9971      \vrule
9972      \hbox to \GMPboxwidth {}%
9973      \vrule
9974      \hbox to \GMPboxwidth {}%
9975      \vrule
9976      \hbox to \GMPboxwidth {}%
9977      \vrule
9978      \hbox to \GMPboxwidth {}%
9979      \vrule}
9980    \hrule
9981  }
9982  \hbox {%
9983    \hbox to 0.8 pt {}
9984    \hbox to 3\GMPboxwidth {%
9985      \hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}}
9986  \hbox to 5\GMPboxwidth{%
9987    \setbox 0 = \hbox{@code{\_mp\_size}}%
9988    \dimen0 = 5\GMPboxwidth
9989    \advance\dimen0 by -\wd0
9990    \divide\dimen0 by 2
9991    \advance\dimen0 by -1em
9992    \dimen1 = \dimen0
9993    \setbox1 = \hbox{$\leftarrow$}%
9994    \setbox2 = \hbox{$\rightarrow$}%
9995    \advance\dimen0 by -\wd1
9996    \advance\dimen1 by -\wd2
9997    \hbox to 0.3 em {}%
9998    \box1
9999    \GMPcentreline{\dimen0}%
10000    \hfil
10001    \box0
10002    \hfil
10003    \GMPcentreline{\dimen1}%
10004    \box2}
10005}}
10006@end tex
10007@ifnottex
10008@example
10009   most                   least
10010significant            significant
10011   limb                   limb
10012
10013                            _mp_d
10014 |---- _mp_exp --->           |
10015  _____ _____ _____ _____ _____
10016 |_____|_____|_____|_____|_____|
10017                   . <------------ radix point
10018
10019  <-------- _mp_size --------->
10020@sp 1
10021@end example
10022@end ifnottex
10023
10024@noindent
10025The fields are as follows.
10026
10027@table @asis
10028@item @code{_mp_size}
10029The number of limbs currently in use, or the negative of that when
10030representing a negative value.  Zero is represented by @code{_mp_size} and
10031@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is
10032unused.  (In the future @code{_mp_exp} might be undefined when representing
10033zero.)
10034
10035@item @code{_mp_prec}
10036The precision of the mantissa, in limbs.  In any calculation the aim is to
10037produce @code{_mp_prec} limbs of result (the most significant being non-zero).
10038
10039@item @code{_mp_d}
10040A pointer to the array of limbs which is the absolute value of the mantissa.
10041These are stored ``little endian'' as per the @code{mpn} functions, so
10042@code{_mp_d[0]} is the least significant limb and
10043@code{_mp_d[ABS(_mp_size)-1]} the most significant.
10044
10045The most significant limb is always non-zero, but there are no other
10046restrictions on its value, in particular the highest 1 bit can be anywhere
10047within the limb.
10048
10049@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being
10050for convenience (see below).  There are no reallocations during a calculation,
10051only in a change of precision with @code{mpf_set_prec}.
10052
10053@item @code{_mp_exp}
10054The exponent, in limbs, determining the location of the implied radix point.
10055Zero means the radix point is just above the most significant limb.  Positive
10056values mean a radix point offset towards the lower limbs and hence a value
10057@math{@ge{} 1}, as for example in the diagram above.  Negative exponents mean
10058a radix point further above the highest limb.
10059
Naturally the exponent can be any value.  It doesn't have to fall within the
limbs as the diagram shows; it can be a long way above or a long way below.
Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
are treated as zero.
10064@end table
10065
10066The @code{_mp_size} and @code{_mp_prec} fields are @code{int}, although the
10067@code{mp_size_t} type is usually a @code{long}.  The @code{_mp_exp} field is
10068usually @code{long}.  This makes some fields just 32 bits on some 64-bit
10069systems, thereby saving a few bytes of data space but still providing
10070plenty of precision and a very large range.
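
As an illustration of how the fields combine, the following is a minimal
sketch, not GMP code.  @code{ToyFloat} and @code{toy_to_double} are
hypothetical names, 32-bit limbs are assumed, and the result is only a
@code{double} approximation of the stored value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the mpf_t fields; 32-bit limbs are assumed.
struct ToyFloat {
    int size;                      // like _mp_size: limbs in use, negated if negative
    long exp;                      // like _mp_exp: exponent, counted in limbs
    std::vector<std::uint32_t> d;  // like _mp_d: limbs, least significant first
};

// Approximate the value as a double: the sum of d[i] * 2^(32 * (exp - n + i)),
// where n = abs(size).  An exponent of zero puts the radix point just above
// the most significant limb, so the value is then below 1.
double toy_to_double(const ToyFloat &f) {
    const int n = std::abs(f.size);
    assert(f.d.size() >= static_cast<std::size_t>(n));
    double value = 0.0;
    for (int i = 0; i < n; ++i) {
        const long limb_pos = f.exp - n + i;  // limb position relative to the radix point
        value += std::ldexp(static_cast<double>(f.d[i]),
                            static_cast<int>(32 * limb_pos));
    }
    return f.size < 0 ? -value : value;       // a size of zero yields 0.0
}
```

With @code{d[0]} equal to 0x80000000, @code{d[1]} equal to 1, a size of 2 and
an exponent of 1, the high limb is the integer part and the low limb the
fraction, so the sketch returns 1.5.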
10071
10072
10073@sp 1
10074@noindent
10075The following various points should be noted.
10076
10077@table @asis
10078@item Low Zeros
10079The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
10080zeros can always be ignored.  Routines likely to produce low zeros check and
10081avoid them to save time in subsequent calculations, but for most routines
10082they're quite unlikely and aren't checked.
10083
10084@item Mantissa Size Range
10085The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
10086the value can be represented in fewer.  This means low precision values or
10087small integers stored in a high precision @code{mpf_t} can still be operated
10088on efficiently.
10089
10090@code{_mp_size} can also be greater than @code{_mp_prec}.  Firstly a value is
10091allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
10092and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
10093@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
10094@code{_mp_prec}.
10095
10096@item Rounding
10097All rounding is done on limb boundaries.  Calculating @code{_mp_prec} limbs
10098with the high non-zero will ensure the application requested minimum precision
10099is obtained.
10100
10101The use of simple ``trunc'' rounding towards zero is efficient, since there's
10102no need to examine extra limbs and increment or decrement.
10103
10104@item Bit Shifts
10105Since the exponent is in limbs, there are no bit shifts in basic operations
10106like @code{mpf_add} and @code{mpf_mul}.  When differing exponents are
10107encountered all that's needed is to adjust pointers to line up the relevant
10108limbs.
10109
10110Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts,
10111but the choice is between an exponent in limbs which requires shifts there, or
10112one in bits which requires them almost everywhere else.
10113
10114@item Use of @code{_mp_prec+1} Limbs
10115The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just
10116@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its
10117operation.  @code{mpf_add} for instance will do an @code{mpn_add} of
10118@code{_mp_prec} limbs.  If there's no carry then that's the result, but if
10119there is a carry then it's stored in the extra limb of space and
10120@code{_mp_size} becomes @code{_mp_prec+1}.
10121
10122Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not
10123needed for the intended precision, only the @code{_mp_prec} high limbs.  But
10124zeroing it out or moving the rest down is unnecessary.  Subsequent routines
10125reading the value will simply take the high limbs they need, and this will be
10126@code{_mp_prec} if their target has that same precision.  This is no more than
10127a pointer adjustment, and must be checked anyway since the destination
10128precision can be different from the sources.
10129
10130Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs
10131if available.  This ensures that a variable which has @code{_mp_size} equal to
10132@code{_mp_prec+1} will get its full exact value copied.  Strictly speaking
10133this is unnecessary since only @code{_mp_prec} limbs are needed for the
10134application's requested precision, but it's considered that an @code{mpf_set}
10135from one variable into another of the same precision ought to produce an exact
10136copy.
10137
10138@item Application Precisions
10139@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
10140@code{_mp_prec}.  The value in bits is rounded up to a whole limb then an
10141extra limb is added since the most significant limb of @code{_mp_d} is only
10142known to be non-zero and therefore might contain only one bit.
10143
10144@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
10145limb from @code{_mp_prec} before converting to bits.  The net effect of
10146reading back with @code{mpf_get_prec} is simply the precision rounded up to a
10147multiple of @code{mp_bits_per_limb}.
10148
10149Note that the extra limb added here, because the high limb is only known to be
10150non-zero, is in addition to the extra limb allocated to @code{_mp_d}.  For
10151example with a 32-bit limb, an application request for 250 bits will be rounded
10152up to 8 limbs, then one more is added for the high limb, giving an
10153@code{_mp_prec} of 9.  @code{_mp_d} then gets 10 limbs allocated.  Reading
10154back with @code{mpf_get_prec} will take @code{_mp_prec}, subtract 1 limb and
10155multiply by 32, giving 256 bits.
10156
10157Strictly speaking, the fact the high limb has at least one bit means that a
10158float with, say, 3 limbs of 32 bits each will be holding at least 65 bits, but
10159for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice
10160multiple of the limb size.
10161@end table
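
The precision arithmetic described under ``Application Precisions'' can be
sketched as follows.  The function names are hypothetical and the limb width
is passed as a parameter; the real @code{__GMPF_BITS_TO_PREC} and
@code{__GMPF_PREC_TO_BITS} macros in @file{gmp.h} may apply further
adjustments (such as a minimum precision) that are omitted here.

```cpp
#include <cassert>

// Hypothetical helper: application bits -> _mp_prec in limbs.
// Round up to whole limbs, then add one limb because the most significant
// limb is only known to be non-zero and might hold a single bit.
long toy_bits_to_prec(long bits, long limb_bits) {
    assert(bits > 0 && limb_bits > 0);
    const long whole_limbs = (bits + limb_bits - 1) / limb_bits;
    return whole_limbs + 1;
}

// Hypothetical helper: _mp_prec in limbs -> bits reported to the application.
// The extra limb is removed again before converting.
long toy_prec_to_bits(long prec, long limb_bits) {
    assert(prec >= 1 && limb_bits > 0);
    return (prec - 1) * limb_bits;
}

// Limbs allocated at _mp_d: one more than _mp_prec, for a possible carry.
long toy_alloc_limbs(long prec) {
    return prec + 1;
}
```

For the 250-bit request with 32-bit limbs used in the text, these give a
precision of 9 limbs, an allocation of 10 limbs, and 256 bits when read back.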
10162
10163
10164@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
10165@section Raw Output Internals
10166@cindex Raw output internals
10167
10168@noindent
10169@code{mpz_out_raw} uses the following format.
10170
10171@tex
10172\global\newdimen\GMPboxwidth \GMPboxwidth=5em
10173\global\newdimen\GMPboxheight \GMPboxheight=3ex
10174\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
10175\GMPdisplay{%
10176\vbox{%
10177  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
10178  \vbox {%
10179    \hrule
10180    \hbox{%
10181      \vrule height 2.5ex depth 1.5ex
10182      \hbox to \GMPboxwidth {\hfil size\hfil}%
10183      \vrule
10184      \hbox to 3\GMPboxwidth {\hfil data bytes\hfil}%
10185      \vrule}
10186    \hrule}
10187}}
10188@end tex
10189@ifnottex
10190@example
10191+------+------------------------+
10192| size |       data bytes       |
10193+------+------------------------+
10194@end example
10195@end ifnottex
10196
10197The size is 4 bytes written most significant byte first, being the number of
10198subsequent data bytes, or the two's complement negative of that when a negative
10199integer is represented.  The data bytes are the absolute value of the integer,
10200written most significant byte first.
10201
10202The most significant data byte is always non-zero, so the output is the same
10203on all systems, irrespective of limb size.
10204
10205In GMP 1, leading zero bytes were written to pad the data bytes to a multiple
10206of the limb size.  @code{mpz_inp_raw} will still accept this, for
10207compatibility.
10208
10209The use of ``big endian'' for both the size and data fields is deliberate; it
10210makes the data easy to read in a hex dump of a file.  Unfortunately it also
10211means that the limb data must be reversed when reading or writing, so neither
10212a big endian nor little endian system can just read and write @code{_mp_d}.
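
The byte layout can be illustrated with a small sketch that encodes a 64-bit
integer rather than an @code{mpz_t}.  @code{toy_out_raw} is a hypothetical
name and this is not the GMP implementation.  Writing zero as a size of 0 with
no data bytes is an inference from the rule that the first data byte is
non-zero.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical encoder following the layout described above:
// a 4-byte big-endian size, then the absolute value, most significant byte first.
std::vector<std::uint8_t> toy_out_raw(std::int64_t value) {
    // Absolute value in unsigned arithmetic, so INT64_MIN does not overflow.
    const std::uint64_t mag = value < 0 ? 0 - static_cast<std::uint64_t>(value)
                                        : static_cast<std::uint64_t>(value);

    // Data bytes, most significant first, skipping leading zero bytes.
    std::vector<std::uint8_t> data;
    for (int shift = 56; shift >= 0; shift -= 8) {
        const std::uint8_t byte = static_cast<std::uint8_t>(mag >> shift);
        if (!data.empty() || byte != 0) data.push_back(byte);
    }
    assert(data.empty() || data.front() != 0);  // high data byte is non-zero

    // Size field: the byte count, negated for a negative value.  Converting to
    // an unsigned type gives the two's complement bit pattern.
    std::int32_t size = static_cast<std::int32_t>(data.size());
    if (value < 0) size = -size;
    const std::uint32_t size_bits = static_cast<std::uint32_t>(size);

    std::vector<std::uint8_t> out;
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(size_bits >> shift));
    out.insert(out.end(), data.begin(), data.end());
    return out;
}
```

In this sketch 258 (hex 0102) becomes the bytes 00 00 00 02 01 02, and
@minus{}1 becomes FF FF FF FF 01, which is how the values would appear in a
hex dump.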
10213
10214
10215@node C++ Interface Internals,  , Raw Output Internals, Internals
10216@section C++ Interface Internals
10217@cindex C++ interface internals
10218
10219A system of expression templates is used to ensure something like @code{a=b+c}
10220turns into a simple call to @code{mpz_add} etc.  For @code{mpf_class}
10221the scheme also ensures the precision of the final
10222destination is used for any temporaries within a statement like
10223@code{f=w*x+y*z}.  These are important features which a naive implementation
10224cannot provide.
10225
10226A simplified description of the scheme follows.  The true scheme is
10227complicated by the fact that expressions have different return types.  For
10228detailed information, refer to the source code.
10229
10230To perform an operation, say, addition, we first define a ``function object''
10231evaluating it,
10232
10233@example
10234struct __gmp_binary_plus
10235@{
10236  static void eval(mpf_t f, mpf_t g, mpf_t h) @{ mpf_add(f, g, h); @}
10237@};
10238@end example
10239
10240@noindent
10241And an ``additive expression'' object,
10242
10243@example
10244__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
10245operator+(const mpf_class &f, const mpf_class &g)
10246@{
10247  return __gmp_expr
10248    <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
10249@}
10250@end example
10251
10252The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<@dots{}>>} is used to
10253encapsulate any possible kind of expression into a single template type.  In
10254fact even @code{mpf_class} etc are @code{typedef} specializations of
10255@code{__gmp_expr}.
10256
10257Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.
10258
10259@example
10260template <class T>
10261mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
10262@{
10263  expr.eval(this->get_mpf_t(), this->precision());
10264  return *this;
10265@}
10266
10267template <class Op>
10268void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
10269(mpf_t f, mp_bitcnt_t precision)
10270@{
10271  Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
10272@}
10273@end example
10274
10275where @code{expr.val1} and @code{expr.val2} are references to the expression's
10276operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
10277@code{__gmp_expr}).
10278
10279This way, the expression is actually evaluated only at the time of assignment,
10280when the required precision (that of @code{f}) is known.  Furthermore the
10281target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
10282with @code{f} as the output argument.
10283
10284Compound expressions are handled by defining operators taking subexpressions
10285as their arguments, like this:
10286
10287@example
10288template <class T, class U>
10289__gmp_expr
10290<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
10291operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
10292@{
10293  return __gmp_expr
10294    <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
10295    (expr1, expr2);
10296@}
10297@end example
10298
10299And the corresponding specializations of @code{__gmp_expr::eval}:
10300
10301@example
10302template <class T, class U, class Op>
10303void __gmp_expr
10304<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
10305(mpf_t f, mp_bitcnt_t precision)
10306@{
10307  // declare two temporaries
10308  mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
10309  Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
10310@}
10311@end example
10312
10313The expression is thus recursively evaluated to any level of complexity and
10314all subexpressions are evaluated to the precision of @code{f}.
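
The fragments above are not compilable on their own.  The following
self-contained toy shows the same shape using @code{double} in place of
@code{mpf_t}.  All the @code{Toy} names are hypothetical, the precision
argument is left out, and the real @file{gmpxx.h} is considerably more
elaborate.

```cpp
#include <cassert>

// Function objects, in the role of __gmp_binary_plus.
struct ToyPlus  { static void eval(double &out, double a, double b) { out = a + b; } };
struct ToyTimes { static void eval(double &out, double a, double b) { out = a * b; } };

// Tag naming an operation and its operand types, in the role of __gmp_binary_expr.
template <class L, class R, class Op> struct ToyBinary {};

// One wrapper template for every kind of expression, in the role of __gmp_expr.
template <class T> struct ToyExpr;

// Leaf specialization: a plain number.
template <> struct ToyExpr<double> {
    double v;
    ToyExpr(double x = 0.0) : v(x) {}
    void eval(double &out) const { out = v; }

    // Assignment is the point where an expression tree is actually evaluated.
    template <class T> ToyExpr &operator=(const ToyExpr<T> &e) {
        e.eval(v);
        return *this;
    }
};
using ToyNum = ToyExpr<double>;

// Binary specialization: holds references to its operands and evaluates
// them only on request.  The references are valid only within the full
// expression that created them, so such objects should not be stored.
template <class L, class R, class Op>
struct ToyExpr<ToyBinary<L, R, Op>> {
    const L &lhs;
    const R &rhs;
    ToyExpr(const L &l, const R &r) : lhs(l), rhs(r) {}
    void eval(double &out) const {
        double t1 = 0.0, t2 = 0.0;  // temporaries for the two subexpressions
        lhs.eval(t1);
        rhs.eval(t2);
        Op::eval(out, t1, t2);
    }
};

// Operators only build expression objects; they do no arithmetic.
template <class T, class U>
ToyExpr<ToyBinary<ToyExpr<T>, ToyExpr<U>, ToyPlus>>
operator+(const ToyExpr<T> &a, const ToyExpr<U> &b) { return {a, b}; }

template <class T, class U>
ToyExpr<ToyBinary<ToyExpr<T>, ToyExpr<U>, ToyTimes>>
operator*(const ToyExpr<T> &a, const ToyExpr<U> &b) { return {a, b}; }

// Usage: the arithmetic runs inside operator=, writing straight into f.
inline double toy_demo() {
    ToyNum w(2.0), x(3.0), y(4.0), z(5.0), f;
    f = w * x + y * z;
    assert(f.v == 26.0);
    return f.v;
}
```

Unlike the scheme described above, this toy uses temporaries even for leaf
operands.  It is meant only to show how building the expression is separated
from evaluating it.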
10315
10316
10317@node Contributors, References, Internals, Top
10318@comment  node-name,  next,  previous,  up
10319@appendix Contributors
10320@cindex Contributors
10321
10322Torbj@"orn Granlund wrote the original GMP library and is still the main
10323developer.  Code not explicitly attributed to others was contributed by
10324Torbj@"orn.  Several other individuals and organizations have contributed to
10325GMP.  Here is a list in chronological order of first contribution:
10326
10327Gunnar Sj@"odin and Hans Riesel helped with mathematical problems in early
10328versions of the library.
10329
10330Richard Stallman helped with the interface design and revised the first
10331version of this manual.
10332
10333Brian Beuning and Doug Lea helped with testing of early versions of the
10334library and made creative suggestions.
10335
10336John Amanatides of York University in Canada contributed the function
10337@code{mpz_probab_prime_p}.
10338
10339Paul Zimmermann wrote the REDC-based mpz_powm code, the Sch@"onhage-Strassen
10340FFT multiply code, and the Karatsuba square root code.  He also improved the
10341Toom3 code for GMP 4.2.  Paul sparked the development of GMP 2, with his
10342comparisons between bignum packages.  The ECMNET project Paul is organizing
10343was a driving force behind many of the optimizations in GMP 3.  Paul also
10344wrote the new GMP 4.3 nth root code (with Torbj@"orn).
10345
10346Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
10347contributed now defunct versions of @code{mpz_gcd}, @code{mpz_divexact},
10348@code{mpn_gcd}, and @code{mpn_bdivmod}, partially supported by CNPq (Brazil)
10349grant 301314194-2.
10350
10351Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
10352He has also made valuable suggestions and tested numerous intermediary
10353releases.
10354
10355Joachim Hollman was involved in the design of the @code{mpf} interface, and in
10356the @code{mpz} design revisions for version 2.
10357
10358Bennet Yee contributed the initial versions of @code{mpz_jacobi} and
10359@code{mpz_legendre}.
10360
10361Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and
10362@file{mpn/m68k/rshift.S} (now in @file{.asm} form).
10363
10364Robert Harley of Inria, France and David Seal of ARM, England, suggested clever
10365improvements for population count.  Robert also wrote highly optimized
10366Karatsuba and 3-way Toom multiplication functions for GMP 3, and contributed
10367the ARM assembly code.
10368
10369Torsten Ekedahl of the Mathematical department of Stockholm University provided
10370significant inspiration during several phases of the GMP development.  His
10371mathematical expertise helped improve several algorithms.
10372
10373Linus Nordberg wrote the new configure system based on autoconf and
10374implemented the new random functions.
10375
10376Kevin Ryde worked on a large number of things: optimized x86 code, m4 asm
10377macros, parameter tuning, speed measuring, the configure system, function
10378inlining, divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas
10379number functions, printf and scanf functions, perl interface, demo expression
10380parser, the algorithms chapter in the manual, @file{gmpasm-mode.el}, and
10381various miscellaneous improvements elsewhere.
10382
10383Kent Boortz made the Mac OS 9 port.
10384
10385Steve Root helped write the optimized alpha 21264 assembly code.
10386
10387Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++
10388@code{istream} input routines.
10389
10390Jason Moxham rewrote @code{mpz_fac_ui}.
10391
10392Pedro Gimeno implemented the Mersenne Twister and made other random number
10393improvements.
10394
10395Niels M@"oller wrote the sub-quadratic GCD and extended GCD code, the
10396quadratic Hensel division code, and (with Torbj@"orn) the new divide and
10397conquer division code for GMP 4.3.  Niels also helped implement the new Toom
10398multiply code for GMP 4.3 and implemented helper functions to simplify Toom
10399evaluations for GMP 5.0.  He wrote the original version of mpn_mulmod_bnm1.
10400
10401Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply strategy,
10402and found the optimal strategies for evaluation and interpolation in Toom
10403multiplication.
10404
10405Marco Bodrato helped implement the new Toom multiply code for GMP 4.3 and
10406implemented most of the new Toom multiply and squaring code for 5.0.
10407He is the main author of the current mpn_mulmod_bnm1 and mpn_mullo_n.  Marco
10408also wrote the functions mpn_invert and mpn_invertappr.
10409
10410David Harvey suggested the internal function @code{mpn_bdiv_dbm1}, implementing
10411division relevant to Toom multiplication.  He also worked on fast assembly
10412sequences, in particular on a fast AMD64 @code{mpn_mul_basecase}.
10413
10414Martin Boij wrote @code{mpn_perfect_power_p}.
10415
10416(This list is chronological, not ordered by significance.  If you have
10417contributed to GMP but are not listed above, please tell
10418@email{gmp-devel@@gmplib.org} about the omission!)
10419
10420The development of the floating point functions of GNU MP 2 was supported in
10421part by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO
10422(POlynomial System SOlving).
10423
10424The development of GMP 2, 3, and 4 was supported in part by the IDA Center for
10425Computing Sciences.
10426
10427Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
10428environment.
10429
10430@node References, GNU Free Documentation License, Contributors, Top
10431@comment  node-name,  next,  previous,  up
10432@appendix References
10433@cindex References
10434
10435@c  FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
10436@c  but being long words they upset paragraph formatting (the preceding line
10437@c  can get badly stretched).  Would like a conditional @* style line break
10438@c  if the uref is too long to fit on the last line of the paragraph, but it's
10439@c  not clear how to do that.  For now explicit @texlinebreak{}s are used on
10440@c  paragraphs that come out bad.
10441
10442@section Books
10443
10444@itemize @bullet
10445@item
10446Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in
10447Analytic Number Theory and Computational Complexity'', Wiley, 1998.
10448
10449@item
10450Richard Crandall and Carl Pomerance, ``Prime Numbers: A Computational
10451Perspective'', 2nd edition, Springer-Verlag, 2005.
10452@texlinebreak{} @uref{http://math.dartmouth.edu/~carlp/}
10453
10454@item
10455Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate
10456Texts in Mathematics number 138, Springer-Verlag, 1993.
10457@texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen/}
10458
10459@item
10460Donald E. Knuth, ``The Art of Computer Programming'', volume 2,
10461``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998.
10462@texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html}
10463
10464@item
10465John D. Lipson, ``Elements of Algebra and Algebraic Computing'',
10466The Benjamin Cummings Publishing Company Inc, 1981.
10467
10468@item
10469Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of
10470Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/}
10471
10472@item
10473Richard M. Stallman and the GCC Developer Community, ``Using the GNU Compiler
10474Collection'', Free Software Foundation, 2008, available online
10475@uref{http://gcc.gnu.org/onlinedocs/}, and in the GCC package
10476@uref{ftp://ftp.gnu.org/gnu/gcc/}
10477@end itemize
10478
10479@section Papers
10480
10481@itemize @bullet
10482@item
10483Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square
10484Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252.  Also
10485available online as INRIA Research Report 4475, June 2001,
10486@uref{http://www.inria.fr/rrrt/rr-4475.html}
10487
10488@item
10489Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'',
10490Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022,
10491@texlinebreak{} @uref{http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022}
10492
10493@item
10494Torbj@"orn Granlund and Peter L. Montgomery, ``Division by Invariant Integers
10495using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June
104961994.  Also available @uref{ftp://ftp.cwi.nl/pub/pmontgom/divcnst.psa4.gz}
10497(and .psl.gz).
10498
10499@item
10500Niels M@"oller and Torbj@"orn Granlund, ``Improved division by invariant
10501integers'', IEEE Transactions on Computers, 11 June 2010.
10502@uref{http://gmplib.org/~tege/division-paper.pdf}
10503
10504@item
10505Torbj@"orn Granlund and Niels M@"oller, ``Division of integers large and
10506small'', to appear.
10507
10508@item
10509Tudor Jebelean,
10510``An algorithm for exact division'',
10511Journal of Symbolic Computation,
10512volume 15, 1993, pp.@: 169-180.
10513Research report version available @texlinebreak{}
10514@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}
10515
10516@item
10517Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
10518Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
10519@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}
10520
10521@item
10522Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
10523ISSAC 97, pp.@: 339-341.  Technical report available @texlinebreak{}
10524@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}
10525
10526@item
10527Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
10528pp.@: 111-116.  Technical report version available @texlinebreak{}
10529@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}
10530
10531@item
10532Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
10533of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
10534pp.@: 145-157.  Technical report version also available @texlinebreak{}
10535@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}
10536
10537@item
10538Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
10539Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455.  Early
10540technical report version also available
10541@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}
10542
10543@item
10544Makoto Matsumoto and Takuji Nishimura, ``Mersenne Twister: A 623-dimensionally
10545equidistributed uniform pseudorandom number generator'', ACM Transactions on
10546Modelling and Computer Simulation, volume 8, January 1998, pp.@: 3-30.
10547Available online @texlinebreak{}
10548@uref{http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.ps.gz} (or .pdf)
10549
10550@item
10551R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'',
10552Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
10553Theory, October 1972, pp.@: 90-96.  Reprinted as ``Fast Modular Transforms'',
10554Journal of Computer and System Sciences, volume 8, number 3, June 1974,
10555pp.@: 366-386.
10556
10557@item
10558Niels M@"oller, ``On Sch@"onhage's algorithm and subquadratic integer GCD
10559  computation'', in Mathematics of Computation, volume 77, January 2008, pp.@:
10560  589-607.
10561
10562@item
10563Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
10564Mathematics of Computation, volume 44, number 170, April 1985.
10565
10566@item
10567Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
10568Zahlen'', Computing 7, 1971, pp.@: 281-292.
10569
10570@item
10571Kenneth Weber, ``The accelerated integer GCD algorithm'',
10572ACM Transactions on Mathematical Software,
10573volume 21, number 1, March 1995, pp.@: 111-122.
10574
10575@item
10576Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
10577November 1999, @uref{http://www.inria.fr/rrrt/rr-3805.html}
10578
10579@item
10580Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
10581Implementations'', @texlinebreak{}
10582@uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz}
10583
10584@item
10585Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
10586Symposium on Computer Arithmetic, 1993, pp.@: 260-271.  Reprinted as ``More
10587on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
10588volume 43, number 8, August 1994, pp.@: 899-908.
10589@end itemize
10590
10591
10592@node GNU Free Documentation License, Concept Index, References, Top
10593@appendix GNU Free Documentation License
10594@cindex GNU Free Documentation License
10595@cindex Free Documentation License
10596@cindex Documentation license
10597@include fdl-1.3.texi
10598
10599
10600@node Concept Index, Function Index, GNU Free Documentation License, Top
10601@comment  node-name,  next,  previous,  up
10602@unnumbered Concept Index
10603@printindex cp
10604
10605@node Function Index,  , Concept Index, Top
10606@comment  node-name,  next,  previous,  up
10607@unnumbered Function and Type Index
10608@printindex fn
10609
10610@bye
10611
10612@c Local variables:
10613@c fill-column: 78
10614@c compile-command: "make gmp.info"
10615@c End:
10616