xref: /netbsd-src/external/lgpl3/gmp/dist/doc/gmp.texi (revision 1580a27b92f58fcdcb23fdfbc04a7c2b54a0b7c8)
1\input texinfo    @c -*-texinfo-*-
2@c %**start of header
3@setfilename gmp.info
4@documentencoding ISO-8859-1
5@include version.texi
6@settitle GNU MP @value{VERSION}
7@synindex tp fn
8@iftex
9@afourpaper
10@end iftex
11@comment %**end of header
12
13@copying
14This manual describes how to install and use the GNU multiple precision
15arithmetic library, version @value{VERSION}.
16
17Copyright 1991, 1993-2016 Free Software Foundation, Inc.
18
19Permission is granted to copy, distribute and/or modify this document under
20the terms of the GNU Free Documentation License, Version 1.3 or any later
21version published by the Free Software Foundation; with no Invariant Sections,
22with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
23Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
24software''.  A copy of the license is included in
25@ref{GNU Free Documentation License}.
26@end copying
27@c  Note the @ref above must be on one line, a line break in an @ref within
28@c  @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes
29@c  with texinfo 4.7), with messages about missing @endcsname.
30
31
32@c  Texinfo version 4.2 or up will be needed to process this file.
33@c
34@c  The version number and edition number are taken from version.texi provided
35@c  by automake (note that it's regenerated only if you configure with
36@c  --enable-maintainer-mode).
37@c
38@c  Notes discussing the present version number of GMP in relation to previous
39@c  ones (for instance in the "Compatibility" section) must be updated at
40@c  manually though.
41@c
42@c  @cindex entries have been made for function categories and programming
43@c  topics.  The "mpn" section is not included in this, because a beginner
44@c  looking for "GCD" or something is only going to be confused by pointers to
45@c  low level routines.
46@c
47@c  @cindex entries are present for processors and systems when there's
48@c  particular notes concerning them, but not just for everything GMP
49@c  supports.
50@c
51@c  Index entries for files use @code rather than @file, @samp or @option,
52@c  since the latter come out with quotes in TeX, which are nice in the text
53@c  but don't look so good in index columns.
54@c
55@c  Tex:
56@c
57@c  A suitable texinfo.tex is supplied, a newer one should work equally well.
58@c
59@c  HTML:
60@c
61@c  Nothing special is done for links to external manuals, they just come out
62@c  in the usual makeinfo style, eg. "../libc/Locales.html".  If you have
63@c  local copies of such manuals then this is a good thing, if not then you
64@c  may want to search-and-replace to some online source.
65@c
66
67@dircategory GNU libraries
68@direntry
69* gmp: (gmp).                   GNU Multiple Precision Arithmetic Library.
70@end direntry
71
72@c  html <meta name="description" content="...">
73@documentdescription
74How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}.
75@end documentdescription
76
77@c smallbook
78@finalout
79@setchapternewpage on
80
81@ifnottex
82@node Top, Copying, (dir), (dir)
83@top GNU MP
84@end ifnottex
85
86@iftex
87@titlepage
88@title GNU MP
89@subtitle The GNU Multiple Precision Arithmetic Library
90@subtitle Edition @value{EDITION}
91@subtitle @value{UPDATED}
92
93@author by Torbj@"orn Granlund and the GMP development team
94@c @email{tg@@gmplib.org}
95
96@c Include the Distribution inside the titlepage so
97@c that headings are turned off.
98
99@tex
100\global\parindent=0pt
101\global\parskip=8pt
102\global\baselineskip=13pt
103@end tex
104
105@page
106@vskip 0pt plus 1filll
107@end iftex
108
109@insertcopying
110@ifnottex
111@sp 1
112@end ifnottex
113
114@iftex
115@end titlepage
116@headings double
117@end iftex
118
119@c  Don't bother with contents for html, the menus seem adequate.
120@ifnothtml
121@contents
122@end ifnothtml
123
124@menu
125* Copying::                    GMP Copying Conditions (LGPL).
126* Introduction to GMP::        Brief introduction to GNU MP.
127* Installing GMP::             How to configure and compile the GMP library.
128* GMP Basics::                 What every GMP user should know.
129* Reporting Bugs::             How to usefully report bugs.
130* Integer Functions::          Functions for arithmetic on signed integers.
131* Rational Number Functions::  Functions for arithmetic on rational numbers.
132* Floating-point Functions::   Functions for arithmetic on floats.
133* Low-level Functions::        Fast functions for natural numbers.
134* Random Number Functions::    Functions for generating random numbers.
135* Formatted Output::           @code{printf} style output.
136* Formatted Input::            @code{scanf} style input.
137* C++ Class Interface::        Class wrappers around GMP types.
138* Custom Allocation::          How to customize the internal allocation.
139* Language Bindings::          Using GMP from other languages.
140* Algorithms::                 What happens behind the scenes.
141* Internals::                  How values are represented behind the scenes.
142
143* Contributors::               Who brings you this library?
144* References::                 Some useful papers and books to read.
145* GNU Free Documentation License::
146* Concept Index::
147* Function Index::
148@end menu
149
150
151@c  @m{T,N} is $T$ in tex or @math{N} otherwise.  This is an easy way to give
152@c  different forms for math in tex and info.  Commas in N or T don't work,
153@c  but @C{} can be used instead.  \, works in info but not in tex.
154@iftex
155@macro m {T,N}
156@tex$\T\$@end tex
157@end macro
158@end iftex
159@ifnottex
160@macro m {T,N}
161@math{\N\}
162@end macro
163@end ifnottex
164
165@macro C {}
166,
167@end macro
168
169@c  @ms{V,N} is $V_N$ in tex or just vn otherwise.  This suits simple
170@c  subscripts like @ms{x,0}.
171@iftex
172@macro ms {V,N}
173@tex$\V\_{\N\}$@end tex
174@end macro
175@end iftex
176@ifnottex
177@macro ms {V,N}
178\V\\N\
179@end macro
180@end ifnottex
181
182@c  @nicode{S} is plain S in info, or @code{S} elsewhere.  This can be used
183@c  when the quotes that @code{} gives in info aren't wanted, but the
184@c  fontification in tex or html is wanted.  Doesn't work as @nicode{'\\0'}
185@c  though (gives two backslashes in tex).
186@ifinfo
187@macro nicode {S}
188\S\
189@end macro
190@end ifinfo
191@ifnotinfo
192@macro nicode {S}
193@code{\S\}
194@end macro
195@end ifnotinfo
196
197@c  @nisamp{S} is plain S in info, or @samp{S} elsewhere.  This can be used
198@c  when the quotes that @samp{} gives in info aren't wanted, but the
199@c  fontification in tex or html is wanted.
200@ifinfo
201@macro nisamp {S}
202\S\
203@end macro
204@end ifinfo
205@ifnotinfo
206@macro nisamp {S}
207@samp{\S\}
208@end macro
209@end ifnotinfo
210
211@c  Usage: @GMPtimes{}
212@c  Give either \times or the word "times".
213@tex
214\gdef\GMPtimes{\times}
215@end tex
216@ifnottex
217@macro GMPtimes
218times
219@end macro
220@end ifnottex
221
222@c  Usage: @GMPmultiply{}
223@c  Give * in info, or nothing in tex.
224@tex
225\gdef\GMPmultiply{}
226@end tex
227@ifnottex
228@macro GMPmultiply
229*
230@end macro
231@end ifnottex
232
233@c  Usage: @GMPabs{x}
234@c  Give either |x| in tex, or abs(x) in info or html.
235@tex
236\gdef\GMPabs#1{|#1|}
237@end tex
238@ifnottex
239@macro GMPabs {X}
240@abs{}(\X\)
241@end macro
242@end ifnottex
243
244@c  Usage: @GMPfloor{x}
245@c  Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
246@tex
247\gdef\GMPfloor#1{\lfloor #1\rfloor}
248@end tex
249@ifnottex
250@macro GMPfloor {X}
251floor(\X\)
252@end macro
253@end ifnottex
254
255@c  Usage: @GMPceil{x}
256@c  Give either \lceil x\rceil in tex, or ceil(x) in info or html.
257@tex
258\gdef\GMPceil#1{\lceil #1 \rceil}
259@end tex
260@ifnottex
261@macro GMPceil {X}
262ceil(\X\)
263@end macro
264@end ifnottex
265
266@c  Math operators already available in tex, made available in info too.
267@c  For example @bmod{} can be used in both tex and info.
268@ifnottex
269@macro bmod
270mod
271@end macro
272@macro gcd
273gcd
274@end macro
275@macro ge
276>=
277@end macro
278@macro le
279<=
280@end macro
281@macro log
282log
283@end macro
284@macro min
285min
286@end macro
287@macro leftarrow
288<-
289@end macro
290@macro rightarrow
291->
292@end macro
293@end ifnottex
294
295@c  New math operators.
296@c  @abs{} can be used in both tex and info, or just \abs in tex.
297@tex
298\gdef\abs{\mathop{\rm abs}}
299@end tex
300@ifnottex
301@macro abs
302abs
303@end macro
304@end ifnottex
305
306@c  @cross{} is a \times symbol in tex, or an "x" in info.  In tex it works
307@c  inside or outside $ $.
308@tex
309\gdef\cross{\ifmmode\times\else$\times$\fi}
310@end tex
311@ifnottex
312@macro cross
313x
314@end macro
315@end ifnottex
316
317@c  @times{} made available as a "*" in info and html (already works in tex).
318@ifnottex
319@macro times
320*
321@end macro
322@end ifnottex
323
324@c  Usage: @W{text}
325@c  Like @w{} but working in math mode too.
326@tex
327\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}
328@end tex
329@ifnottex
330@macro W {S}
331@w{\S\}
332@end macro
333@end ifnottex
334
335@c  Usage: \GMPdisplay{text}
336@c  Put the given text in an @display style indent, but without turning off
337@c  paragraph reflow etc.
338@tex
339\gdef\GMPdisplay#1{%
340\noindent
341\advance\leftskip by \lispnarrowing
342#1\par}
343@end tex
344
345@c  Usage: \GMPhat
346@c  A new \hat that will work in math mode, unlike the texinfo redefined
347@c  version.
348@tex
349\gdef\GMPhat{\mathaccent"705E}
350@end tex
351
352@c  Usage: \GMPraise{text}
353@c  For use in a $ $ math expression as an alternative to "^".  This is good
354@c  for @code{} in an exponent, since there seems to be no superscript font
355@c  for that.
356@tex
357\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}
358@end tex
359
360@c  Usage: @texlinebreak{}
361@c  A line break as per @*, but only in tex.
362@iftex
363@macro texlinebreak
364@*
365@end macro
366@end iftex
367@ifnottex
368@macro texlinebreak
369@end macro
370@end ifnottex
371
372@c  Usage: @maybepagebreak
373@c  Allow tex to insert a page break, if it feels the urge.
374@c  Normally blocks of @deftypefun/funx are kept together, which can lead to
375@c  some poor page break positioning if it's a big block, like the sets of
376@c  division functions etc.
377@tex
378\gdef\maybepagebreak{\penalty0}
379@end tex
380@ifnottex
381@macro maybepagebreak
382@end macro
383@end ifnottex
384
385@c  Usage: @GMPreftop{info,title}
386@c  Usage: @GMPpxreftop{info,title}
387@c
388@c  Like @ref{} and @pxref{}, but designed for a reference to the top of a
389@c  document, not a particular section.  The TeX output for plain @ref insists
390@c  on printing a particular section, GMPreftop gives just the title.
391@c
392@c  The texinfo manual recommends putting a likely section name in references
393@c  like this, eg. "Introduction", but it seems better to just give the title.
394@c
395@iftex
396@macro GMPreftop{info,title}
397@i{\title\}
398@end macro
399@macro GMPpxreftop{info,title}
400see @i{\title\}
401@end macro
402@end iftex
403@c
404@ifnottex
405@macro GMPreftop{info,title}
406@ref{Top,\title\,\title\,\info\,\title\}
407@end macro
408@macro GMPpxreftop{info,title}
409@pxref{Top,\title\,\title\,\info\,\title\}
410@end macro
411@end ifnottex
412
413
414@node Copying, Introduction to GMP, Top, Top
415@comment  node-name, next, previous,  up
416@unnumbered GNU MP Copying Conditions
417@cindex Copying conditions
418@cindex Conditions for copying GNU MP
419@cindex License conditions
420
421This library is @dfn{free}; this means that everyone is free to use it and
422free to redistribute it on a free basis.  The library is not in the public
423domain; it is copyrighted and there are restrictions on its distribution, but
424these restrictions are designed to permit everything that a good cooperating
425citizen would want to do.  What is not allowed is to try to prevent others
426from further sharing any version of this library that they might get from
427you.@refill
428
429Specifically, we want to make sure that you have the right to give away copies
430of the library, that you receive source code or else can get it if you want
431it, that you can change this library or use pieces of it in new free programs,
432and that you know you can do these things.@refill
433
434To make sure that everyone has such rights, we have to forbid you to deprive
435anyone else of these rights.  For example, if you distribute copies of the GNU
436MP library, you must give the recipients all the rights that you have.  You
437must make sure that they, too, receive or can get the source code.  And you
438must tell them their rights.@refill
439
440Also, for our own protection, we must make certain that everyone finds out
441that there is no warranty for the GNU MP library.  If it is modified by
442someone else and passed on, we want their recipients to know that what they
443have is not what we distributed, so that any problems introduced by others
444will not reflect on our reputation.@refill
445
446More precisely, the GNU MP library is dual licensed, under the conditions of
447the GNU Lesser General Public License version 3 (see
448@file{COPYING.LESSERv3}), or the GNU General Public License version 2 (see
449@file{COPYINGv2}). This is the recipient's choice, and the recipient also has
450the additional option of applying later versions of these licenses. (The
451reason for this dual licensing is to make it possible to use the library with
452programs which are licensed under GPL version 2, but which for historical or
453other reasons do not allow use under later versions of the GPL).
454
455Programs which are not part of the library itself, such as demonstration
456programs and the GMP testsuite, are licensed under the terms of the GNU
457General Public License version 3 (see @file{COPYINGv3}), or any later
458version.
459
460
461@node Introduction to GMP, Installing GMP, Copying, Top
462@comment  node-name,  next,  previous,  up
463@chapter Introduction to GNU MP
464@cindex Introduction
465
466GNU MP is a portable library written in C for arbitrary precision arithmetic
467on integers, rational numbers, and floating-point numbers.  It aims to provide
468the fastest possible arithmetic for all applications that need higher
469precision than is directly supported by the basic C types.
470
471Many applications use just a few hundred bits of precision; but some
472applications may need thousands or even millions of bits.  GMP is designed to
473give good performance for both, by choosing algorithms based on the sizes of
474the operands, and by carefully keeping the overhead at a minimum.
475
476The speed of GMP is achieved by using fullwords as the basic arithmetic type,
477by using sophisticated algorithms, by including carefully optimized assembly
478code for the most common inner loops for many different CPUs, and by a general
479emphasis on speed (as opposed to simplicity or elegance).
480
481There is assembly code for these CPUs:
482@cindex CPU types
483ARM Cortex-A9, Cortex-A15, and generic ARM,
484DEC Alpha 21064, 21164, and 21264,
485AMD K8 and K10 (sold under many brands, e.g. Athlon64, Phenom, Opteron)
486Bulldozer, and Bobcat,
487Intel Pentium, Pentium Pro/II/III, Pentium 4, Core2, Nehalem, Sandy bridge, Haswell, generic x86,
488Intel IA-64,
489Motorola/IBM PowerPC 32 and 64 such as POWER970, POWER5, POWER6, and POWER7,
490MIPS 32-bit and 64-bit,
491SPARC 32-bit ad 64-bit with special support for all UltraSPARC models.
492There is also assembly code for many obsolete CPUs.
493
494
495@cindex Home page
496@cindex Web page
497@noindent
498For up-to-date information on GMP, please see the GMP web pages at
499
500@display
501@uref{https://gmplib.org/}
502@end display
503
504@cindex Latest version of GMP
505@cindex Anonymous FTP of latest version
506@cindex FTP of latest version
507@noindent
508The latest version of the library is available at
509
510@display
511@uref{https://ftp.gnu.org/gnu/gmp/}
512@end display
513
514Many sites around the world mirror @samp{ftp.gnu.org}, please use a mirror
515near you, see @uref{https://www.gnu.org/order/ftp.html} for a full list.
516
517@cindex Mailing lists
518There are three public mailing lists of interest.  One for release
519announcements, one for general questions and discussions about usage of the GMP
520library and one for bug reports.  For more information, see
521
522@display
523@uref{https://gmplib.org/mailman/listinfo/}.
524@end display
525
526The proper place for bug reports is @email{gmp-bugs@@gmplib.org}.  See
527@ref{Reporting Bugs} for information about reporting bugs.
528
529@sp 1
530@section How to use this Manual
531@cindex About this manual
532
533Everyone should read @ref{GMP Basics}.  If you need to install the library
534yourself, then read @ref{Installing GMP}.  If you have a system with multiple
535ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
536on applications.
537
538The rest of the manual can be used for later reference, although it is
539probably a good idea to glance through it.
540
541
542@node Installing GMP, GMP Basics, Introduction to GMP, Top
543@comment  node-name,  next,  previous,  up
544@chapter Installing GMP
545@cindex Installing GMP
546@cindex Configuring GMP
547@cindex Building GMP
548
549GMP has an autoconf/automake/libtool based configuration system.  On a
550Unix-like system a basic build can be done with
551
552@example
553./configure
554make
555@end example
556
557@noindent
558Some self-tests can be run with
559
560@example
561make check
562@end example
563
564@noindent
565And you can install (under @file{/usr/local} by default) with
566
567@example
568make install
569@end example
570
571If you experience problems, please report them to @email{gmp-bugs@@gmplib.org}.
572See @ref{Reporting Bugs}, for information on what to include in useful bug
573reports.
574
575@menu
576* Build Options::
577* ABI and ISA::
578* Notes for Package Builds::
579* Notes for Particular Systems::
580* Known Build Problems::
581* Performance optimization::
582@end menu
583
584
585@node Build Options, ABI and ISA, Installing GMP, Installing GMP
586@section Build Options
587@cindex Build options
588
589All the usual autoconf configure options are available, run @samp{./configure
590--help} for a summary.  The file @file{INSTALL.autoconf} has some generic
591installation information too.
592
593@table @asis
594@item Tools
595@cindex Non-Unix systems
596@samp{configure} requires various Unix-like tools.  See @ref{Notes for
597Particular Systems}, for some options on non-Unix systems.
598
599It might be possible to build without the help of @samp{configure}, certainly
600all the code is there, but unfortunately you'll be on your own.
601
602@item Build Directory
603@cindex Build directory
604To compile in a separate build directory, @command{cd} to that directory, and
605prefix the configure command with the path to the GMP source directory.  For
606example
607
608@example
609cd /my/build/dir
610/my/sources/gmp-@value{VERSION}/configure
611@end example
612
613Not all @samp{make} programs have the necessary features (@code{VPATH}) to
614support this.  In particular, SunOS and Slowaris @command{make} have bugs that
615make them unable to build in a separate directory.  Use GNU @command{make}
616instead.
617
618@item @option{--prefix} and @option{--exec-prefix}
619@cindex Prefix
620@cindex Exec prefix
621@cindex Install prefix
622@cindex @code{--prefix}
623@cindex @code{--exec-prefix}
624The @option{--prefix} option can be used in the normal way to direct GMP to
625install under a particular tree.  The default is @samp{/usr/local}.
626
627@option{--exec-prefix} can be used to direct architecture-dependent files like
628@file{libgmp.a} to a different location.  This can be used to share
629architecture-independent parts like the documentation, but separate the
630dependent parts.  Note however that @file{gmp.h} is
631architecture-dependent since it encodes certain aspects of @file{libgmp}, so
632it will be necessary to ensure both @file{$prefix/include} and
633@file{$exec_prefix/include} are available to the compiler.
634
635@item @option{--disable-shared}, @option{--disable-static}
636@cindex @code{--disable-shared}
637@cindex @code{--disable-static}
638By default both shared and static libraries are built (where possible), but
639one or other can be disabled.  Shared libraries result in smaller executables
640and permit code sharing between separate running processes, but on some CPUs
641are slightly slower, having a small cost on each function call.
642
643@item Native Compilation, @option{--build=CPU-VENDOR-OS}
644@cindex Native compilation
645@cindex Build system
646@cindex @code{--build}
647For normal native compilation, the system can be specified with
648@samp{--build}.  By default @samp{./configure} uses the output from running
649@samp{./config.guess}.  On some systems @samp{./config.guess} can determine
650the exact CPU type, on others it will be necessary to give it explicitly.  For
651example,
652
653@example
654./configure --build=ultrasparc-sun-solaris2.7
655@end example
656
657In all cases the @samp{OS} part is important, since it controls how libtool
658generates shared libraries.  Running @samp{./config.guess} is the simplest way
659to see what it should be, if you don't know already.
660
661@item Cross Compilation, @option{--host=CPU-VENDOR-OS}
662@cindex Cross compiling
663@cindex Host system
664@cindex @code{--host}
665When cross-compiling, the system used for compiling is given by @samp{--build}
666and the system where the library will run is given by @samp{--host}.  For
667example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,
668
669@example
670./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
671@end example
672
673Compiler tools are sought first with the host system type as a prefix.  For
674example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
675@command{ranlib}.  This makes it possible for a set of cross-compiling tools
676to co-exist with native tools.  The prefix is the argument to @samp{--host},
677and this can be an alias, such as @samp{m68k-linux}.  But note that tools
678don't have to be setup this way, it's enough to just have a @env{PATH} with a
679suitable cross-compiling @command{cc} etc.
680
681Compiling for a different CPU in the same family as the build system is a form
682of cross-compilation, though very possibly this would merely be special
683options on a native compiler.  In any case @samp{./configure} avoids depending
684on being able to run code on the build system, which is important when
685creating binaries for a newer CPU since they very possibly won't run on the
686build system.
687
688In all cases the compiler must be able to produce an executable (of whatever
689format) from a standard C @code{main}.  Although only object files will go to
690make up @file{libgmp}, @samp{./configure} uses linking tests for various
691purposes, such as determining what functions are available on the host system.
692
693Currently a warning is given unless an explicit @samp{--build} is used when
694cross-compiling, because it may not be possible to correctly guess the build
695system type if the @env{PATH} has only a cross-compiling @command{cc}.
696
697Note that the @samp{--target} option is not appropriate for GMP@.  It's for use
698when building compiler tools, with @samp{--host} being where they will run,
699and @samp{--target} what they'll produce code for.  Ordinary programs or
700libraries like GMP are only interested in the @samp{--host} part, being where
701they'll run.  (Some past versions of GMP used @samp{--target} incorrectly.)
702
703@item CPU types
704@cindex CPU types
705In general, if you want a library that runs as fast as possible, you should
706configure GMP for the exact CPU type your system uses.  However, this may mean
707the binaries won't run on older members of the family, and might run slower on
708other members, older or newer.  The best idea is always to build GMP for the
709exact machine type you intend to run it on.
710
711The following CPUs have specific support.  See @file{configure.ac} for details
712of what code and compiler options they select.
713
714@itemize @bullet
715
716@c Keep this formatting, it's easy to read and it can be grepped to
717@c automatically test that CPUs listed get through ./config.sub
718
719@item
720Alpha:
721@nisamp{alpha},
722@nisamp{alphaev5},
723@nisamp{alphaev56},
724@nisamp{alphapca56},
725@nisamp{alphapca57},
726@nisamp{alphaev6},
727@nisamp{alphaev67},
728@nisamp{alphaev68}
729@nisamp{alphaev7}
730
731@item
732Cray:
733@nisamp{c90},
734@nisamp{j90},
735@nisamp{t90},
736@nisamp{sv1}
737
738@item
739HPPA:
740@nisamp{hppa1.0},
741@nisamp{hppa1.1},
742@nisamp{hppa2.0},
743@nisamp{hppa2.0n},
744@nisamp{hppa2.0w},
745@nisamp{hppa64}
746
747@item
748IA-64:
749@nisamp{ia64},
750@nisamp{itanium},
751@nisamp{itanium2}
752
753@item
754MIPS:
755@nisamp{mips},
756@nisamp{mips3},
757@nisamp{mips64}
758
759@item
760Motorola:
761@nisamp{m68k},
762@nisamp{m68000},
763@nisamp{m68010},
764@nisamp{m68020},
765@nisamp{m68030},
766@nisamp{m68040},
767@nisamp{m68060},
768@nisamp{m68302},
769@nisamp{m68360},
770@nisamp{m88k},
771@nisamp{m88110}
772
773@item
774POWER:
775@nisamp{power},
776@nisamp{power1},
777@nisamp{power2},
778@nisamp{power2sc}
779
780@item
781PowerPC:
782@nisamp{powerpc},
783@nisamp{powerpc64},
784@nisamp{powerpc401},
785@nisamp{powerpc403},
786@nisamp{powerpc405},
787@nisamp{powerpc505},
788@nisamp{powerpc601},
789@nisamp{powerpc602},
790@nisamp{powerpc603},
791@nisamp{powerpc603e},
792@nisamp{powerpc604},
793@nisamp{powerpc604e},
794@nisamp{powerpc620},
795@nisamp{powerpc630},
796@nisamp{powerpc740},
797@nisamp{powerpc7400},
798@nisamp{powerpc7450},
799@nisamp{powerpc750},
800@nisamp{powerpc801},
801@nisamp{powerpc821},
802@nisamp{powerpc823},
803@nisamp{powerpc860},
804@nisamp{powerpc970}
805
806@item
807SPARC:
808@nisamp{sparc},
809@nisamp{sparcv8},
810@nisamp{microsparc},
811@nisamp{supersparc},
812@nisamp{sparcv9},
813@nisamp{ultrasparc},
814@nisamp{ultrasparc2},
815@nisamp{ultrasparc2i},
816@nisamp{ultrasparc3},
817@nisamp{sparc64}
818
819@item
820x86 family:
821@nisamp{i386},
822@nisamp{i486},
823@nisamp{i586},
824@nisamp{pentium},
825@nisamp{pentiummmx},
826@nisamp{pentiumpro},
827@nisamp{pentium2},
828@nisamp{pentium3},
829@nisamp{pentium4},
830@nisamp{k6},
831@nisamp{k62},
832@nisamp{k63},
833@nisamp{athlon},
834@nisamp{amd64},
835@nisamp{viac3},
836@nisamp{viac32}
837
838@item
839Other:
840@nisamp{arm},
841@nisamp{sh},
842@nisamp{sh2},
843@nisamp{vax},
844@end itemize
845
846CPUs not listed will use generic C code.
847
848@item Generic C Build
849@cindex Generic C
850If some of the assembly code causes problems, or if otherwise desired, the
851generic C code can be selected with the configure @option{--disable-assembly}.
852
853Note that this will run quite slowly, but it should be portable and should at
854least make it possible to get something running if all else fails.
855
856@item Fat binary, @option{--enable-fat}
857@cindex Fat binary
858@cindex @code{--enable-fat}
859Using @option{--enable-fat} selects a ``fat binary'' build on x86, where
860optimized low level subroutines are chosen at runtime according to the CPU
861detected.  This means more code, but gives good performance on all x86 chips.
862(This option might become available for more architectures in the future.)
863
864@item @option{ABI}
865@cindex ABI
866On some systems GMP supports multiple ABIs (application binary interfaces),
867meaning data type sizes and calling conventions.  By default GMP chooses the
868best ABI available, but a particular ABI can be selected.  For example
869
870@example
871./configure --host=mips64-sgi-irix6 ABI=n32
872@end example
873
874See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
875applications need to do.
876
877@item @option{CC}, @option{CFLAGS}
878@cindex C compiler
879@cindex @code{CC}
880@cindex @code{CFLAGS}
881By default the C compiler used is chosen from among some likely candidates,
882with @command{gcc} normally preferred if it's present.  The usual
883@samp{CC=whatever} can be passed to @samp{./configure} to choose something
884different.
885
886For various systems, default compiler flags are set based on the CPU and
887compiler.  The usual @samp{CFLAGS="-whatever"} can be passed to
888@samp{./configure} to use something different or to set good flags for systems
889GMP doesn't otherwise know.
890
891The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
892and can be found in each generated @file{Makefile}.  This is the easiest way
893to check the defaults when considering changing or adding something.
894
895Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
896supporting multiple ABIs it's important to give an explicit
897@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
898won't be able to select the correct assembly code.
899
900If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
901compiler will be used (if GMP recognises it).  For example @samp{CC=gcc} can
902be used to force the use of GCC, with default flags (and default ABI).
903
904@item @option{CPPFLAGS}
905@cindex @code{CPPFLAGS}
906Any flags like @samp{-D} defines or @samp{-I} includes required by the
907preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
908Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
909preprocessing uses just @samp{CPPFLAGS}.  This distinction is because most
910preprocessors won't accept all the flags the compiler does.  Preprocessing is
911done separately in some configure tests.
912
913@item @option{CC_FOR_BUILD}
914@cindex @code{CC_FOR_BUILD}
915Some build-time programs are compiled and run to generate host-specific data
916tables.  @samp{CC_FOR_BUILD} is the compiler used for this.  It doesn't need
917to be in any particular ABI or mode, it merely needs to generate executables
918that can run.  The default is to try the selected @samp{CC} and some likely
919candidates such as @samp{cc} and @samp{gcc}, looking for something that works.
920
921No flags are used with @samp{CC_FOR_BUILD} because a simple invocation like
922@samp{cc foo.c} should be enough.  If some particular options are required
923they can be included as for instance @samp{CC_FOR_BUILD="cc -whatever"}.
924
925@item C++ Support, @option{--enable-cxx}
926@cindex C++ support
927@cindex @code{--enable-cxx}
928C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
929C++ compiler will be required.  As a convenience @samp{--enable-cxx=detect}
930can be used to enable C++ support only if a compiler can be found.  The C++
931support consists of a library @file{libgmpxx.la} and header file
932@file{gmpxx.h} (@pxref{Headers and Libraries}).
933
934A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
935within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
936bloated by a dependency on the C++ standard library, and to avoid any chance
937that the C++ compiler could be required when linking plain C programs.
938
939@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
940only be expected to work with @file{libgmp.la} from the same GMP version.
941Future changes to the relevant internals will be accompanied by renaming, so a
942mismatch will cause unresolved symbols rather than perhaps mysterious
943misbehaviour.
944
945In general @file{libgmpxx.la} will be usable only with the C++ compiler that
946built it, since name mangling and runtime support are usually incompatible
947between different compilers.
948
949@item @option{CXX}, @option{CXXFLAGS}
950@cindex C++ compiler
951@cindex @code{CXX}
952@cindex @code{CXXFLAGS}
953When C++ support is enabled, the C++ compiler and its flags can be set with
954variables @samp{CXX} and @samp{CXXFLAGS} in the usual way.  The default for
955@samp{CXX} is the first compiler that works from a list of likely candidates,
956with @command{g++} normally preferred when available.  The default for
957@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then
958for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers
959@samp{-g} or nothing.  Trying @samp{CFLAGS} this way is convenient when using
960@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will
961usually suit @samp{g++}.
962
963It's important that the C and C++ compilers match, meaning their startup and
964runtime support routines are compatible and that they generate code in the
965same ABI (if there's a choice of ABIs on the system).  @samp{./configure}
966isn't currently able to check these things very well itself, so for that
967reason @samp{--disable-cxx} is the default, to avoid a build failure due to a
968compiler mismatch.  Perhaps this will change in the future.
969
970Incidentally, it's normally not good enough to set @samp{CXX} to the same as
971@samp{CC}.  Although @command{gcc} for instance recognises @file{foo.cc} as
972C++ code, only @command{g++} will invoke the linker the right way when
973building an executable or shared library from C++ object files.
974
975@item Temporary Memory, @option{--enable-alloca=<choice>}
976@cindex Temporary memory
977@cindex Stack overflow
978@cindex @code{alloca}
979@cindex @code{--enable-alloca}
980GMP allocates temporary workspace using one of the following three methods,
981which can be selected with for instance
982@samp{--enable-alloca=malloc-reentrant}.
983
984@itemize @bullet
985@item
986@samp{alloca} - C library or compiler builtin.
987@item
988@samp{malloc-reentrant} - the heap, in a re-entrant fashion.
989@item
990@samp{malloc-notreentrant} - the heap, with global variables.
991@end itemize
992
993For convenience, the following choices are also available.
994@samp{--disable-alloca} is the same as @samp{no}.
995
996@itemize @bullet
997@item
998@samp{yes} - a synonym for @samp{alloca}.
999@item
1000@samp{no} - a synonym for @samp{malloc-reentrant}.
1001@item
1002@samp{reentrant} - @code{alloca} if available, otherwise
1003@samp{malloc-reentrant}.  This is the default.
1004@item
1005@samp{notreentrant} - @code{alloca} if available, otherwise
1006@samp{malloc-notreentrant}.
1007@end itemize
1008
1009@code{alloca} is reentrant and fast, and is recommended.  It actually allocates
1010just small blocks on the stack; larger ones use malloc-reentrant.
1011
1012@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
1013but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
1014not required.
1015
1016The two malloc methods in fact use the memory allocation functions selected by
1017@code{mp_set_memory_functions}, these being @code{malloc} and friends by
1018default.  @xref{Custom Allocation}.
1019
1020An additional choice @samp{--enable-alloca=debug} is available, to help when
1021debugging memory related problems (@pxref{Debugging}).
1022
1023@item FFT Multiplication, @option{--disable-fft}
1024@cindex FFT multiplication
1025@cindex @code{--disable-fft}
1026By default multiplications are done using Karatsuba, 3-way Toom, higher degree
1027Toom, and Fermat FFT@.  The FFT is only used on large to very large operands
1028and can be disabled to save code size if desired.
1029
1030@item Assertion Checking, @option{--enable-assert}
1031@cindex Assertion checking
1032@cindex @code{--enable-assert}
1033This option enables some consistency checking within the library.  This can be
1034of use while debugging, @pxref{Debugging}.
1035
1036@item Execution Profiling, @option{--enable-profiling=prof/gprof/instrument}
1037@cindex Execution profiling
1038@cindex @code{--enable-profiling}
1039Enable profiling support, in one of various styles, @pxref{Profiling}.
1040
1041@item @option{MPN_PATH}
1042@cindex @code{MPN_PATH}
1043Various assembly versions of each mpn subroutines are provided.  For a given
1044CPU, a search is made though a path to choose a version of each.  For example
1045@samp{sparcv8} has
1046
1047@example
1048MPN_PATH="sparc32/v8 sparc32 generic"
1049@end example
1050
1051which means look first for v8 code, then plain sparc32 (which is v7), and
1052finally fall back on generic C@.  Knowledgeable users with special requirements
1053can specify a different path.  Normally this is completely unnecessary.
1054
1055@item Documentation
1056@cindex Documentation formats
1057@cindex Texinfo
1058The source for the document you're now reading is @file{doc/gmp.texi}, in
1059Texinfo format, see @GMPreftop{texinfo, Texinfo}.
1060
1061@cindex Postscript
1062@cindex DVI
1063@cindex PDF
1064Info format @samp{doc/gmp.info} is included in the distribution.  The usual
1065automake targets are available to make PostScript, DVI, PDF and HTML (these
1066will require various @TeX{} and Texinfo tools).
1067
1068@cindex DocBook
1069@cindex XML
1070DocBook and XML can be generated by the Texinfo @command{makeinfo} program
1071too, see @ref{makeinfo options,, Options for @command{makeinfo}, texinfo,
1072Texinfo}.
1073
1074Some supplementary notes can also be found in the @file{doc} subdirectory.
1075
1076@end table
1077
1078
1079@need 2000
1080@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
1081@section ABI and ISA
1082@cindex ABI
1083@cindex Application Binary Interface
1084@cindex ISA
1085@cindex Instruction Set Architecture
1086
1087ABI (Application Binary Interface) refers to the calling conventions between
1088functions, meaning what registers are used and what sizes the various C data
1089types are.  ISA (Instruction Set Architecture) refers to the instructions and
1090registers a CPU has available.
1091
1092Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
1093latter for compatibility with older CPUs in the family.  GMP supports some
1094CPUs like this in both ABIs.  In fact within GMP @samp{ABI} means a
1095combination of chip ABI, plus how GMP chooses to use it.  For example in some
109632-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
1097@code{long long}.
1098
1099By default GMP chooses the best ABI available for a given system, and this
1100generally gives significantly greater speed.  But an ABI can be chosen
1101explicitly to make GMP compatible with other libraries, or particular
1102application requirements.  For example,
1103
1104@example
1105./configure ABI=32
1106@end example
1107
1108In all cases it's vital that all object code used in a given program is
1109compiled for the same ABI.
1110
1111Usually a limb is implemented as a @code{long}.  When a @code{long long} limb
1112is used this is encoded in the generated @file{gmp.h}.  This is convenient for
1113applications, but it does mean that @file{gmp.h} will vary, and can't be just
1114copied around.  @file{gmp.h} remains compiler independent though, since all
1115compilers for a particular ABI will be expected to use the same limb type.
1116
1117Currently no attempt is made to follow whatever conventions a system has for
1118installing library or header files built for a particular ABI@.  This will
1119probably only matter when installing multiple builds of GMP, and it might be
1120as simple as configuring with a special @samp{libdir}, or it might require
1121more than that.  Note that builds for different ABIs need to done separately,
1122with a fresh @command{./configure} and @command{make} each.
1123
1124@sp 1
1125@table @asis
1126@need 1000
1127@item AMD64 (@samp{x86_64})
1128@cindex AMD64
1129On AMD64 systems supporting both 32-bit and 64-bit modes for applications, the
1130following ABI choices are available.
1131
1132@table @asis
1133@item @samp{ABI=64}
1134The 64-bit ABI uses 64-bit limbs and pointers and makes full use of the chip
1135architecture.  This is the default.  Applications will usually not need
1136special compiler flags, but for reference the option is
1137
1138@example
1139gcc  -m64
1140@end example
1141
1142@item @samp{ABI=32}
1143The 32-bit ABI is the usual i386 conventions.  This will be slower, and is not
1144recommended except for inter-operating with other code not yet 64-bit capable.
1145Applications must be compiled with
1146
1147@example
1148gcc  -m32
1149@end example
1150
1151(In GCC 2.95 and earlier there's no @samp{-m32} option, it's the only mode.)
1152
1153@item @samp{ABI=x32}
1154The x32 ABI uses 64-bit limbs but 32-bit pointers.  Like the 64-bit ABI, it
1155makes full use of the chip's arithmetic capabilities.  This ABI is not
1156supported by all operating systems.
1157
1158@example
1159gcc  -mx32
1160@end example
1161
1162@end table
1163
1164@sp 1
1165@need 1000
1166@item HPPA 2.0 (@samp{hppa2.0*}, @samp{hppa64})
1167@cindex HPPA
1168@cindex HP-UX
1169@table @asis
1170@item @samp{ABI=2.0w}
1171The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or
1172up.  Applications must be compiled with
1173
1174@example
1175gcc [built for 2.0w]
1176cc  +DD64
1177@end example
1178
1179@item @samp{ABI=2.0n}
1180The 2.0n ABI means the 32-bit HPPA 1.0 ABI and all its normal calling
1181conventions, but with 64-bit instructions permitted within functions.  GMP
1182uses a 64-bit @code{long long} for a limb.  This ABI is available on hppa64
1183GNU/Linux and on HP-UX 10 or higher.  Applications must be compiled with
1184
1185@example
1186gcc [built for 2.0n]
1187cc  +DA2.0 +e
1188@end example
1189
1190Note that current versions of GCC (eg.@: 3.2) don't generate 64-bit
1191instructions for @code{long long} operations and so may be slower than for
11922.0w.  (The GMP assembly code is the same though.)
1193
1194@item @samp{ABI=1.0}
1195HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI@.
1196No special compiler options are needed for applications.
1197@end table
1198
1199All three ABIs are available for CPU types @samp{hppa2.0w}, @samp{hppa2.0} and
1200@samp{hppa64}, but for CPU type @samp{hppa2.0n} only 2.0n or 1.0 are
1201considered.
1202
1203Note that GCC on HP-UX has no options to choose between 2.0n and 2.0w modes,
1204unlike HP @command{cc}.  Instead it must be built for one or the other ABI@.
1205GMP will detect how it was built, and skip to the corresponding @samp{ABI}.
1206
1207@sp 1
1208@need 1500
1209@item IA-64 under HP-UX (@samp{ia64*-*-hpux*}, @samp{itanium*-*-hpux*})
1210@cindex IA-64
1211@cindex HP-UX
1212HP-UX supports two ABIs for IA-64.  GMP performance is the same in both.
1213
1214@table @asis
1215@item @samp{ABI=32}
1216In the 32-bit ABI, pointers, @code{int}s and @code{long}s are 32 bits and GMP
1217uses a 64 bit @code{long long} for a limb.  Applications can be compiled
1218without any special flags since this ABI is the default in both HP C and GCC,
1219but for reference the flags are
1220
1221@example
1222gcc  -milp32
1223cc   +DD32
1224@end example
1225
1226@item @samp{ABI=64}
1227In the 64-bit ABI, @code{long}s and pointers are 64 bits and GMP uses a
1228@code{long} for a limb.  Applications must be compiled with
1229
1230@example
1231gcc  -mlp64
1232cc   +DD64
1233@end example
1234@end table
1235
1236On other IA-64 systems, GNU/Linux for instance, @samp{ABI=64} is the only
1237choice.
1238
1239@sp 1
1240@need 1000
1241@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]})
1242@cindex MIPS
1243@cindex IRIX
1244IRIX 6 always has a 64-bit MIPS 3 or better CPU, and supports ABIs o32, n32,
1245and 64.  n32 or 64 are recommended, and GMP performance will be the same in
1246each.  The default is n32.
1247
1248@table @asis
1249@item @samp{ABI=o32}
1250The o32 ABI is 32-bit pointers and integers, and no 64-bit operations.  GMP
1251will be slower than in n32 or 64, this option only exists to support old
1252compilers, eg.@: GCC 2.7.2.  Applications can be compiled with no special
1253flags on an old compiler, or on a newer compiler with
1254
1255@example
1256gcc  -mabi=32
1257cc   -32
1258@end example
1259
1260@item @samp{ABI=n32}
1261The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
1262@code{long long}.  Applications must be compiled with
1263
1264@example
1265gcc  -mabi=n32
1266cc   -n32
1267@end example
1268
1269@item @samp{ABI=64}
1270The 64-bit ABI is 64-bit pointers and integers.  Applications must be compiled
1271with
1272
1273@example
1274gcc  -mabi=64
1275cc   -64
1276@end example
1277@end table
1278
1279Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
1280support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.
1281
1282@sp 1
1283@need 1000
1284@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630}, @samp{powerpc970}, @samp{power4}, @samp{power5})
1285@cindex PowerPC
1286@table @asis
1287@item @samp{ABI=mode64}
1288@cindex AIX
1289The AIX 64 ABI uses 64-bit limbs and pointers and is the default on PowerPC 64
1290@samp{*-*-aix*} systems.  Applications must be compiled with
1291
1292@example
1293gcc  -maix64
1294xlc  -q64
1295@end example
1296
1297On 64-bit GNU/Linux, BSD, and Mac OS X/Darwin systems, the applications must
1298be compiled with
1299
1300@example
1301gcc  -m64
1302@end example
1303
1304@item @samp{ABI=mode32}
1305The @samp{mode32} ABI uses a 64-bit @code{long long} limb but with the chip
1306still in 32-bit mode and using 32-bit calling conventions.  This is the default
1307for systems where the true 64-bit ABI is unavailable.  No special compiler
1308options are typically needed for applications.  This ABI is not available under
1309AIX.
1310
1311@item @samp{ABI=32}
1312This is the basic 32-bit PowerPC ABI, with a 32-bit limb.  No special compiler
1313options are needed for applications.
1314@end table
1315
1316GMP's speed is greatest for the @samp{mode64} ABI, the @samp{mode32} ABI is 2nd
1317best.  In @samp{ABI=32} only the 32-bit ISA is used and this doesn't make full
1318use of a 64-bit chip.
1319
1320@sp 1
1321@need 1000
1322@item Sparc V9 (@samp{sparc64}, @samp{sparcv9}, @samp{ultrasparc*})
1323@cindex Sparc V9
1324@cindex Solaris
1325@cindex Sun
1326@table @asis
1327@item @samp{ABI=64}
1328The 64-bit V9 ABI is available on the various BSD sparc64 ports, recent
1329versions of Sparc64 GNU/Linux, and Solaris 2.7 and up (when the kernel is in
133064-bit mode).  GCC 3.2 or higher, or Sun @command{cc} is required.  On
1331GNU/Linux, depending on the default @command{gcc} mode, applications must be
1332compiled with
1333
1334@example
1335gcc  -m64
1336@end example
1337
1338On Solaris applications must be compiled with
1339
1340@example
1341gcc  -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
1342cc   -xarch=v9
1343@end example
1344
1345On the BSD sparc64 systems no special options are required, since 64-bits is
1346the only ABI available.
1347
1348@item @samp{ABI=32}
1349For the basic 32-bit ABI, GMP still uses as much of the V9 ISA as it can.  In
1350the Sun documentation this combination is known as ``v8plus''.  On GNU/Linux,
1351depending on the default @command{gcc} mode, applications may need to be
1352compiled with
1353
1354@example
1355gcc  -m32
1356@end example
1357
1358On Solaris, no special compiler options are required for applications, though
1359using something like the following is recommended.  (@command{gcc} 2.8 and
1360earlier only support @samp{-mv8} though.)
1361
1362@example
1363gcc  -mv8plus
1364cc   -xarch=v8plus
1365@end example
1366@end table
1367
1368GMP speed is greatest in @samp{ABI=64}, so it's the default where available.
1369The speed is partly because there are extra registers available and partly
1370because 64-bits is considered the more important case and has therefore had
1371better code written for it.
1372
1373Don't be confused by the names of the @samp{-m} and @samp{-x} compiler
1374options, they're called @samp{arch} but effectively control both ABI and ISA@.
1375
1376On Solaris 2.6 and earlier, only @samp{ABI=32} is available since the kernel
1377doesn't save all registers.
1378
1379On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
1380reject @samp{ABI=64} because the resulting executables won't run.
1381@samp{ABI=64} can still be built if desired by making it look like a
1382cross-compile, for example
1383
1384@example
1385./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
1386@end example
1387@end table
1388
1389
1390@need 2000
1391@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
1392@section Notes for Package Builds
1393@cindex Build notes for binary packaging
1394@cindex Packaged builds
1395
1396GMP should present no great difficulties for packaging in a binary
1397distribution.
1398
1399@cindex Libtool versioning
1400@cindex Shared library versioning
1401Libtool is used to build the library and @samp{-version-info} is set
1402appropriately, having started from @samp{3:0:0} in GMP 3.0 (@pxref{Versioning,
1403Library interface versions, Library interface versions, libtool, GNU
1404Libtool}).
1405
1406The GMP 4 series will be upwardly binary compatible in each release and will
1407be upwardly binary compatible with all of the GMP 3 series.  Additional
1408function interfaces may be added in each release, so on systems where libtool
1409versioning is not fully checked by the loader an auxiliary mechanism may be
1410needed to express that a dynamic linked application depends on a new enough
1411GMP.
1412
1413An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
1414(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
1415from the same GMP version, since this is not done by the libtool versioning,
1416nor otherwise.  A mismatch will result in unresolved symbols from the linker,
1417or perhaps the loader.
1418
1419When building a package for a CPU family, care should be taken to use
1420@samp{--host} (or @samp{--build}) to choose the least common denominator among
1421the CPUs which might use the package.  For example this might mean plain
1422@samp{sparc} (meaning V7) for SPARCs.
1423
1424For x86s, @option{--enable-fat} sets things up for a fat binary build, making a
1425runtime selection of optimized low level routines.  This is a good choice for
1426packaging to run on a range of x86 chips.
1427
1428Users who care about speed will want GMP built for their exact CPU type, to
1429make best use of the available optimizations.  Providing a way to suitably
1430rebuild a package may be useful.  This could be as simple as making it
1431possible for a user to omit @samp{--build} (and @samp{--host}) so
1432@samp{./config.guess} will detect the CPU@.  But a way to manually specify a
1433@samp{--build} will be wanted for systems where @samp{./config.guess} is
1434inexact.
1435
1436On systems with multiple ABIs, a packaged build will need to decide which
1437among the choices is to be provided, see @ref{ABI and ISA}.  A given run of
1438@samp{./configure} etc will only build one ABI@.  If a second ABI is also
1439required then a second run of @samp{./configure} etc must be made, starting
1440from a clean directory tree (@samp{make distclean}).
1441
1442As noted under ``ABI and ISA'', currently no attempt is made to follow system
1443conventions for install locations that vary with ABI, such as
1444@file{/usr/lib/sparcv9} for @samp{ABI=64} as opposed to @file{/usr/lib} for
1445@samp{ABI=32}.  A package build can override @samp{libdir} and other standard
1446variables as necessary.
1447
1448Note that @file{gmp.h} is a generated file, and will be architecture and ABI
1449dependent.  When attempting to install two ABIs simultaneously it will be
1450important that an application compile gets the correct @file{gmp.h} for its
1451desired ABI@.  If compiler include paths don't vary with ABI options then it
1452might be necessary to create a @file{/usr/include/gmp.h} which tests
1453preprocessor symbols and chooses the correct actual @file{gmp.h}.
1454
1455
1456@need 2000
1457@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
1458@section Notes for Particular Systems
1459@cindex Build notes for particular systems
1460@cindex Particular systems
1461@cindex Systems
1462@table @asis
1463
1464@c This section is more or less meant for notes about performance or about
1465@c build problems that have been worked around but might leave a user
1466@c scratching their head.  Fun with different ABIs on a system belongs in the
1467@c above section.
1468
1469@item AIX 3 and 4
1470@cindex AIX
1471On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since
1472some versions of the native @command{ar} fail on the convenience libraries
1473used.  A shared build can be attempted with
1474
1475@example
1476./configure --enable-shared --disable-static
1477@end example
1478
1479Note that the @samp{--disable-static} is necessary because in a shared build
1480libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
1481the benefit of old versions of @command{ld} which only recognise @file{.a},
1482but unfortunately this is done even if a fully functional @command{ld} is
1483available.
1484
1485@item ARM
1486@cindex ARM
1487On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a
1488bug in unsigned division, giving wrong results for some operands.  GMP
1489@samp{./configure} will demand GCC 2.95.4 or later.
1490
1491@item Compaq C++
1492@cindex Compaq C++
1493Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and
1494an old pre-standard one (see @samp{man iostream_intro}).  GMP can only use the
1495standard one, which unfortunately is not the default but must be selected by
1496defining @code{__USE_STD_IOSTREAM}.  Configure with for instance
1497
1498@example
1499./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM
1500@end example
1501
1502@item Floating Point Mode
1503@cindex Floating point mode
1504@cindex Hardware floating point mode
1505@cindex Precision of hardware floating point
1506@cindex x87
1507On some systems, the hardware floating point has a control mode which can set
1508all operations to be done in a particular precision, for instance single,
1509double or extended on x86 systems (x87 floating point).  The GMP functions
1510involving a @code{double} cannot be expected to operate to their full
1511precision when the hardware is in single precision mode.  Of course this
1512affects all code, including application code, not just GMP.
1513
1514@item FreeBSD 7.x, 8.x, 9.0, 9.1, 9.2
1515@cindex FreeBSD
1516@command{m4} in these releases of FreeBSD has an eval function which ignores
1517its 2nd and 3rd arguments, which makes it unsuitable for @file{.asm} file
1518processing.  @samp{./configure} will detect the problem and either abort or
1519choose another m4 in the @env{PATH}.  The bug is fixed in FreeBSD 9.3 and 10.0,
1520so either upgrade or use GNU m4.  Note that the FreeBSD package system installs
1521GNU m4 under the name @samp{gm4}, which GMP cannot guess.
1522
1523@item FreeBSD 7.x, 8.x, 9.x
1524@cindex FreeBSD
1525GMP releases starting with 6.0 do not support @samp{ABI=32} on FreeBSD/amd64
1526prior to release 10.0 of the system.  The cause is a broken @code{limits.h},
1527which GMP no longer works around.
1528
1529@item MS-DOS and MS Windows
1530@cindex MS-DOS
1531@cindex MS Windows
1532@cindex Windows
1533@cindex Cygwin
1534@cindex DJGPP
1535@cindex MINGW
1536On an MS-DOS system DJGPP can be used to build GMP, and on an MS Windows
1537system Cygwin, DJGPP and MINGW can be used.  All three are excellent ports of
1538GCC and the various GNU tools.
1539
1540@display
1541@uref{http://www.cygwin.com/}
1542@uref{http://www.delorie.com/djgpp/}
1543@uref{http://www.mingw.org/}
1544@end display
1545
1546@cindex Interix
1547@cindex Services for Unix
1548Microsoft also publishes an Interix ``Services for Unix'' which can be used to
1549build GMP on Windows (with a normal @samp{./configure}), but it's not free
1550software.
1551
1552@item MS Windows DLLs
1553@cindex DLLs
1554@cindex MS Windows
1555@cindex Windows
1556On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by
1557default GMP builds only a static library, but a DLL can be built instead using
1558
1559@example
1560./configure --disable-static --enable-shared
1561@end example
1562
1563Static and DLL libraries can't both be built, since certain export directives
1564in @file{gmp.h} must be different.
1565
1566A MINGW DLL build of GMP can be used with Microsoft C@.  Libtool doesn't
1567install a @file{.lib} format import library, but it can be created with MS
1568@command{lib} as follows, and copied to the install directory.  Similarly for
1569@file{libmp} and @file{libgmpxx}.
1570
1571@example
1572cd .libs
1573lib /def:libgmp-3.dll.def /out:libgmp-3.lib
1574@end example
1575
1576MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications
1577wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do
1578the same.  If one of the other C runtime library choices provided by MS C is
1579desired then the suggestion is to use the GMP string functions and confine I/O
1580to the application.
1581
1582@item Motorola 68k CPU Types
1583@cindex 68000
1584@samp{m68k} is taken to mean 68000.  @samp{m68020} or higher will give a
1585performance boost on applicable CPUs.  @samp{m68360} can be used for CPU32
1586series chips.  @samp{m68302} can be used for ``Dragonball'' series chips,
1587though this is merely a synonym for @samp{m68000}.
1588
1589@item NetBSD 5.x
1590@cindex NetBSD
1591@command{m4} in these releases of NetBSD has an eval function which ignores its
15922nd and 3rd arguments, which makes it unsuitable for @file{.asm} file
1593processing.  @samp{./configure} will detect the problem and either abort or
1594choose another m4 in the @env{PATH}.  The bug is fixed in NetBSD 6, so either
1595upgrade or use GNU m4.  Note that the NetBSD package system installs GNU m4
1596under the name @samp{gm4}, which GMP cannot guess.
1597
1598@item OpenBSD 2.6
1599@cindex OpenBSD
1600@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
1601unsuitable for @file{.asm} file processing.  @samp{./configure} will detect
1602the problem and either abort or choose another m4 in the @env{PATH}.  The bug
1603is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.
1604
1605@item Power CPU Types
1606@cindex Power/PowerPC
1607In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions
1608not available on the other, so it's important to choose the right one for the
1609CPU that will be used.  Currently GMP has no assembly code support for using
1610just the common instruction subset.  To get executables that run on both, the
1611current suggestion is to use the generic C code (@option{--disable-assembly}),
1612possibly with appropriate compiler options (like @samp{-mcpu=common} for
1613@command{gcc}).  CPU @samp{rs6000} (which is not a CPU but a family of
1614workstations) is accepted by @file{config.sub}, but is currently equivalent to
1615@option{--disable-assembly}.
1616
1617@item Sparc CPU Types
1618@cindex Sparc
1619@samp{sparcv8} or @samp{supersparc} on relevant systems will give a
1620significant performance increase over the V7 code selected by plain
1621@samp{sparc}.
1622
1623@item Sparc App Regs
1624@cindex Sparc
1625The GMP assembly code for both 32-bit and 64-bit Sparc clobbers the
1626``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way
1627that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,, SPARC
1628Options, gcc, Using the GNU Compiler Collection (GCC)}).
1629
1630This makes that code unsuitable for use with the special V9
1631@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer), and
1632for applications wanting to use those registers for special purposes.  In these
1633cases the only suggestion currently is to build GMP with
1634@option{--disable-assembly} to avoid the assembly code.
1635
1636@item SunOS 4
1637@cindex SunOS
1638@command{/usr/bin/m4} lacks various features needed to process @file{.asm}
1639files, and instead @samp{./configure} will automatically use
1640@command{/usr/5bin/m4}, which we believe is always available (if not then use
1641GNU m4).
1642
1643@item x86 CPU Types
1644@cindex x86
1645@cindex 80x86
1646@cindex i386
1647@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for its intended
1648P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
1649P-III)@.  @samp{i386} is a better choice when making binaries that must run on
1650both.
1651
1652@item x86 MMX and SSE2 Code
1653@cindex MMX
1654@cindex SSE2
1655If the CPU selected has MMX code but the assembler doesn't support it, a
1656warning is given and non-MMX code is used instead.  This will be an inferior
1657build, since the MMX code that's present is there because it's faster than the
1658corresponding plain integer code.  The same applies to SSE2.
1659
1660Old versions of @samp{gas} don't support MMX instructions, in particular
1661version 1.92.3 that comes with FreeBSD 2.2.8 or the more recent OpenBSD 3.1
1662doesn't.
1663
1664Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
1665to register @code{movq} instructions, and so can't be used for MMX code.
1666Install a recent @command{gas} if MMX code is wanted on these systems.
1667@end table
1668
1669
1670@need 2000
1671@node Known Build Problems, Performance optimization, Notes for Particular Systems, Installing GMP
1672@section Known Build Problems
1673@cindex Build problems known
1674
1675@c This section is more or less meant for known build problems that are not
1676@c otherwise worked around and require some sort of manual intervention.
1677
1678You might find more up-to-date information at @uref{https://gmplib.org/}.
1679
1680@table @asis
1681@item Compiler link options
1682The version of libtool currently in use rather aggressively strips compiler
1683options when linking a shared library.  This will hopefully be relaxed in the
1684future, but for now if this is a problem the suggestion is to create a little
1685script to hide them, and for instance configure with
1686
1687@example
1688./configure CC=gcc-with-my-options
1689@end example
1690
1691@item DJGPP (@samp{*-*-msdosdjgpp*})
1692@cindex DJGPP
1693The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
1694script, it exits silently, having died writing a preamble to
1695@file{config.log}.  Use @command{bash} 2.04 or higher.
1696
1697@samp{make all} was found to run out of memory during the final
1698@file{libgmp.la} link on one system tested, despite having 64Mb available.
1699Running @samp{make libgmp.la} directly helped, perhaps recursing into the
1700various subdirectories uses up memory.
1701
1702@item GNU binutils @command{strip} prior to 2.12
1703@cindex Stripped libraries
1704@cindex Binutils @command{strip}
1705@cindex GNU @command{strip}
1706@command{strip} from GNU binutils 2.11 and earlier should not be used on the
1707static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all
1708but the last of multiple archive members with the same name, like the three
1709versions of @file{init.o} in @file{libgmp.a}.  Binutils 2.12 or higher can be
1710used successfully.
1711
1712The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by
1713this and any version of @command{strip} can be used on them.
1714
1715@item @command{make} syntax error
1716@cindex SCO
1717@cindex IRIX
1718On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make}
1719is unable to handle the long dependencies list for @file{libgmp.la}.  The
1720symptom is a ``syntax error'' on the following line of the top-level
1721@file{Makefile}.
1722
1723@example
1724libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES)
1725@end example
1726
1727Either use GNU Make, or as a workaround remove
1728@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial
1729build work, but if any recompiling is done @file{libgmp.la} might not be
1730rebuilt).
1731
1732@item MacOS X (@samp{*-*-darwin*})
1733@cindex MacOS X
1734@cindex Darwin
1735Libtool currently only knows how to create shared libraries on MacOS X using
1736the native @command{cc} (which is a modified GCC), not a plain GCC@.  A
1737static-only build should work though (@samp{--disable-shared}).
1738
1739@item NeXT prior to 3.3
1740@cindex NeXT
1741The system compiler on old versions of NeXT was a massacred and old GCC, even
1742if it called itself @file{cc}.  This compiler cannot be used to build GMP, you
1743need to get a real GCC, and install that.  (NeXT may have fixed this in
1744release 3.3 of their system.)
1745
1746@item POWER and PowerPC
1747@cindex Power/PowerPC
1748Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
1749PowerPC@.  If you want to use GCC for these machines, get GCC 2.7.2.1 (or
1750later).
1751
1752@item Sequent Symmetry
1753@cindex Sequent Symmetry
1754Use the GNU assembler instead of the system assembler, since the latter has
1755serious bugs.
1756
1757@item Solaris 2.6
1758@cindex Solaris
1759The system @command{sed} prints an error ``Output line too long'' when libtool
1760builds @file{libgmp.la}.  This doesn't seem to cause any obvious ill effects,
1761but GNU @command{sed} is recommended, to avoid any doubt.
1762
1763@item Sparc Solaris 2.7 with gcc 2.95.2 in @samp{ABI=32}
1764@cindex Solaris
1765A shared library build of GMP seems to fail in this combination, it builds but
1766then fails the tests, apparently due to some incorrect data relocations within
1767@code{gmp_randinit_lc_2exp_size}.  The exact cause is unknown,
1768@samp{--disable-shared} is recommended.
1769@end table
1770
1771
1772@need 2000
1773@node Performance optimization, , Known Build Problems, Installing GMP
1774@section Performance optimization
1775@cindex Optimizing performance
1776
1777@c At some point, this should perhaps move to a separate chapter on optimizing
1778@c performance.
1779
1780For optimal performance, build GMP for the exact CPU type of the target
1781computer, see @ref{Build Options}.
1782
1783Unlike what is the case for most other programs, the compiler typically
1784doesn't matter much, since GMP uses assembly language for the most critical
1785operation.
1786
1787In particular for long-running GMP applications, and applications demanding
1788extremely large numbers, building and running the @code{tuneup} program in the
1789@file{tune} subdirectory, can be important.  For example,
1790
1791@example
1792cd tune
1793make tuneup
1794./tuneup
1795@end example
1796
1797will generate better contents for the @file{gmp-mparam.h} parameter file.
1798
1799To use the results, put the output in the file indicated in the
1800@samp{Parameters for ...} header.  Then recompile from scratch.
1801
1802The @code{tuneup} program takes one useful parameter, @samp{-f NNN}, which
1803instructs the program how long to check FFT multiply parameters.  If you're
1804going to use GMP for extremely large numbers, you may want to run @code{tuneup}
1805with a large NNN value.
1806
1807
1808@node GMP Basics, Reporting Bugs, Installing GMP, Top
1809@comment  node-name,  next,  previous,  up
1810@chapter GMP Basics
1811@cindex Basics
1812
1813@strong{Using functions, macros, data types, etc.@: not documented in this
1814manual is strongly discouraged.  If you do so your application is guaranteed
1815to be incompatible with future versions of GMP.}
1816
1817@menu
1818* Headers and Libraries::
1819* Nomenclature and Types::
1820* Function Classes::
1821* Variable Conventions::
1822* Parameter Conventions::
1823* Memory Management::
1824* Reentrancy::
1825* Useful Macros and Constants::
1826* Compatibility with older versions::
1827* Demonstration Programs::
1828* Efficiency::
1829* Debugging::
1830* Profiling::
1831* Autoconf::
1832* Emacs::
1833@end menu
1834
1835@node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics
1836@section Headers and Libraries
1837@cindex Headers
1838
1839@cindex @file{gmp.h}
1840@cindex Include files
1841@cindex @code{#include}
1842All declarations needed to use GMP are collected in the include file
1843@file{gmp.h}.  It is designed to work with both C and C++ compilers.
1844
1845@example
1846#include <gmp.h>
1847@end example
1848
1849@cindex @code{stdio.h}
1850Note however that prototypes for GMP functions with @code{FILE *} parameters
1851are only provided if @code{<stdio.h>} is included too.
1852
1853@example
1854#include <stdio.h>
1855#include <gmp.h>
1856@end example
1857
1858@cindex @code{stdarg.h}
1859Likewise @code{<stdarg.h>} is required for prototypes with @code{va_list}
1860parameters, such as @code{gmp_vprintf}.  And @code{<obstack.h>} for prototypes
1861with @code{struct obstack} parameters, such as @code{gmp_obstack_printf}, when
1862available.
1863
1864@cindex Libraries
1865@cindex Linking
1866@cindex @code{libgmp}
1867All programs using GMP must link against the @file{libgmp} library.  On a
1868typical Unix-like system this can be done with @samp{-lgmp}, for example
1869
1870@example
1871gcc myprogram.c -lgmp
1872@end example
1873
1874@cindex @code{libgmpxx}
1875GMP C++ functions are in a separate @file{libgmpxx} library.  This is built
1876and installed if C++ support has been enabled (@pxref{Build Options}).  For
1877example,
1878
1879@example
1880g++ mycxxprog.cc -lgmpxx -lgmp
1881@end example
1882
1883@cindex Libtool
1884GMP is built using Libtool and an application can use that to link if desired,
1885@GMPpxreftop{libtool, GNU Libtool}.
1886
1887If GMP has been installed to a non-standard location then it may be necessary
1888to use @samp{-I} and @samp{-L} compiler options to point to the right
1889directories, and some sort of run-time path for a shared library.
1890
1891
1892@node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics
1893@section Nomenclature and Types
1894@cindex Nomenclature
1895@cindex Types
1896
1897@cindex Integer
1898@tindex @code{mpz_t}
1899In this manual, @dfn{integer} usually means a multiple precision integer, as
1900defined by the GMP library.  The C data type for such integers is @code{mpz_t}.
1901Here are some examples of how to declare such integers:
1902
1903@example
1904mpz_t sum;
1905
1906struct foo @{ mpz_t x, y; @};
1907
1908mpz_t vec[20];
1909@end example
1910
1911@cindex Rational number
1912@tindex @code{mpq_t}
1913@dfn{Rational number} means a multiple precision fraction.  The C data type
1914for these fractions is @code{mpq_t}.  For example:
1915
1916@example
1917mpq_t quotient;
1918@end example
1919
1920@cindex Floating-point number
1921@tindex @code{mpf_t}
1922@dfn{Floating point number} or @dfn{Float} for short, is an arbitrary precision
1923mantissa with a limited precision exponent.  The C data type for such objects
1924is @code{mpf_t}.  For example:
1925
1926@example
1927mpf_t fp;
1928@end example
1929
1930@tindex @code{mp_exp_t}
1931The floating point functions accept and return exponents in the C type
1932@code{mp_exp_t}.  Currently this is usually a @code{long}, but on some systems
1933it's an @code{int} for efficiency.
1934
1935@cindex Limb
1936@tindex @code{mp_limb_t}
1937A @dfn{limb} means the part of a multi-precision number that fits in a single
1938machine word.  (We chose this word because a limb of the human body is
1939analogous to a digit, only larger, and containing several digits.)  Normally a
1940limb is 32 or 64 bits.  The C data type for a limb is @code{mp_limb_t}.
1941
1942@tindex @code{mp_size_t}
1943Counts of limbs of a multi-precision number represented in the C type
1944@code{mp_size_t}.  Currently this is normally a @code{long}, but on some
1945systems it's an @code{int} for efficiency, and on some systems it will be
1946@code{long long} in the future.
1947
1948@tindex @code{mp_bitcnt_t}
1949Counts of bits of a multi-precision number are represented in the C type
1950@code{mp_bitcnt_t}.  Currently this is always an @code{unsigned long}, but on
1951some systems it will be an @code{unsigned long long} in the future.
1952
1953@cindex Random state
1954@tindex @code{gmp_randstate_t}
1955@dfn{Random state} means an algorithm selection and current state data.  The C
1956data type for such objects is @code{gmp_randstate_t}.  For example:
1957
1958@example
1959gmp_randstate_t rstate;
1960@end example
1961
1962Also, in general @code{mp_bitcnt_t} is used for bit counts and ranges, and
1963@code{size_t} is used for byte or character counts.
1964
1965
1966@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
1967@section Function Classes
1968@cindex Function classes
1969
1970There are six classes of functions in the GMP library:
1971
1972@enumerate
1973@item
1974Functions for signed integer arithmetic, with names beginning with
1975@code{mpz_}.  The associated type is @code{mpz_t}.  There are about 150
1976functions in this class.  (@pxref{Integer Functions})
1977
1978@item
1979Functions for rational number arithmetic, with names beginning with
1980@code{mpq_}.  The associated type is @code{mpq_t}.  There are about 35
1981functions in this class, but the integer functions can be used for arithmetic
1982on the numerator and denominator separately.  (@pxref{Rational Number
1983Functions})
1984
1985@item
1986Functions for floating-point arithmetic, with names beginning with
1987@code{mpf_}.  The associated type is @code{mpf_t}.  There are about 70
1988functions is this class.  (@pxref{Floating-point Functions})
1989
1990@item
1991Fast low-level functions that operate on natural numbers.  These are used by
1992the functions in the preceding groups, and you can also call them directly
1993from very time-critical user programs.  These functions' names begin with
1994@code{mpn_}.  The associated type is array of @code{mp_limb_t}.  There are
1995about 60 (hard-to-use) functions in this class.  (@pxref{Low-level Functions})
1996
1997@item
1998Miscellaneous functions.  Functions for setting up custom allocation and
1999functions for generating random numbers.  (@pxref{Custom Allocation}, and
2000@pxref{Random Number Functions})
2001@end enumerate
2002
2003
2004@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
2005@section Variable Conventions
2006@cindex Variable conventions
2007@cindex Conventions for variables
2008
2009GMP functions generally have output arguments before input arguments.  This
2010notation is by analogy with the assignment operator.  The BSD MP compatibility
2011functions are exceptions, having the output arguments last.
2012
2013GMP lets you use the same variable for both input and output in one call.  For
2014example, the main function for integer multiplication, @code{mpz_mul}, can be
2015used to square @code{x} and put the result back in @code{x} with
2016
2017@example
2018mpz_mul (x, x, x);
2019@end example
2020
2021Before you can assign to a GMP variable, you need to initialize it by calling
2022one of the special initialization functions.  When you're done with a
2023variable, you need to clear it out, using one of the functions for that
2024purpose.  Which function to use depends on the type of variable.  See the
2025chapters on integer functions, rational number functions, and floating-point
2026functions for details.
2027
2028A variable should only be initialized once, or at least cleared between each
2029initialization.  After a variable has been initialized, it may be assigned to
2030any number of times.
2031
2032For efficiency reasons, avoid excessive initializing and clearing.  In
2033general, initialize near the start of a function and clear near the end.  For
2034example,
2035
2036@example
2037void
2038foo (void)
2039@{
2040  mpz_t  n;
2041  int    i;
2042  mpz_init (n);
2043  for (i = 1; i < 100; i++)
2044    @{
2045      mpz_mul (n, @dots{});
2046      mpz_fdiv_q (n, @dots{});
2047      @dots{}
2048    @}
2049  mpz_clear (n);
2050@}
2051@end example
2052
2053
2054@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
2055@section Parameter Conventions
2056@cindex Parameter conventions
2057@cindex Conventions for parameters
2058
2059When a GMP variable is used as a function parameter, it's effectively a
2060call-by-reference, meaning if the function stores a value there it will change
2061the original in the caller.  Parameters which are input-only can be designated
2062@code{const} to provoke a compiler error or warning on attempting to modify
2063them.
2064
2065When a function is going to return a GMP result, it should designate a
2066parameter that it sets, like the library functions do.  More than one value
2067can be returned by having more than one output parameter, again like the
2068library functions.  A @code{return} of an @code{mpz_t} etc doesn't return the
2069object, only a pointer, and this is almost certainly not what's wanted.
2070
2071Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
2072and storing the result to the indicated parameter.
2073
2074@example
2075void
2076foo (mpz_t result, const mpz_t param, unsigned long n)
2077@{
2078  unsigned long  i;
2079  mpz_mul_ui (result, param, n);
2080  for (i = 1; i < n; i++)
2081    mpz_add_ui (result, result, i*7);
2082@}
2083
2084int
2085main (void)
2086@{
2087  mpz_t  r, n;
2088  mpz_init (r);
2089  mpz_init_set_str (n, "123456", 0);
2090  foo (r, n, 20L);
2091  gmp_printf ("%Zd\n", r);
2092  return 0;
2093@}
2094@end example
2095
2096@code{foo} works even if the mainline passes the same variable for
2097@code{param} and @code{result}, just like the library functions.  But
2098sometimes it's tricky to make that work, and an application might not want to
2099bother supporting that sort of thing.
2100
2101For interest, the GMP types @code{mpz_t} etc are implemented as one-element
2102arrays of certain structures.  This is why declaring a variable creates an
2103object with the fields GMP needs, but then using it as a parameter passes a
2104pointer to the object.  Note that the actual fields in each @code{mpz_t} etc
2105are for internal use only and should not be accessed directly by code that
2106expects to be compatible with future GMP releases.
2107
2108
2109@need 1000
2110@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
2111@section Memory Management
2112@cindex Memory management
2113
2114The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
2115and pointers to allocated data.  Once a variable is initialized, GMP takes
2116care of all space allocation.  Additional space is allocated whenever a
2117variable doesn't have enough.
2118
2119@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
2120Normally this is the best policy, since it avoids frequent reallocation.
2121Applications that need to return memory to the heap at some particular point
2122can use @code{mpz_realloc2}, or clear variables no longer needed.
2123
2124@code{mpf_t} variables, in the current implementation, use a fixed amount of
2125space, determined by the chosen precision and allocated at initialization, so
2126their size doesn't change.
2127
2128All memory is allocated using @code{malloc} and friends by default, but this
2129can be changed, see @ref{Custom Allocation}.  Temporary memory on the stack is
2130also used (via @code{alloca}), but this can be changed at build-time if
2131desired, see @ref{Build Options}.
2132
2133
2134@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
2135@section Reentrancy
2136@cindex Reentrancy
2137@cindex Thread safety
2138@cindex Multi-threading
2139
2140@noindent
2141GMP is reentrant and thread-safe, with some exceptions:
2142
2143@itemize @bullet
2144@item
2145If configured with @option{--enable-alloca=malloc-notreentrant} (or with
2146@option{--enable-alloca=notreentrant} when @code{alloca} is not available),
2147then naturally GMP is not reentrant.
2148
2149@item
2150@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
2151selected precision.  @code{mpf_init2} can be used instead, and in the C++
2152interface an explicit precision to the @code{mpf_class} constructor.
2153
2154@item
2155@code{mpz_random} and the other old random number functions use a global
2156random state and are hence not reentrant.  The newer random number functions
2157that accept a @code{gmp_randstate_t} parameter can be used instead.
2158
2159@item
2160@code{gmp_randinit} (obsolete) returns an error indication through a global
2161variable, which is not thread safe.  Applications are advised to use
2162@code{gmp_randinit_default} or @code{gmp_randinit_lc_2exp} instead.
2163
2164@item
2165@code{mp_set_memory_functions} uses global variables to store the selected
2166memory allocation functions.
2167
2168@item
2169If the memory allocation functions set by a call to
2170@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
2171not reentrant, then GMP will not be reentrant either.
2172
2173@item
2174If the standard I/O functions such as @code{fwrite} are not reentrant then the
2175GMP I/O functions using them will not be reentrant either.
2176
2177@item
2178It's safe for two threads to read from the same GMP variable simultaneously,
2179but it's not safe for one to read while another might be writing, nor for
2180two threads to write simultaneously.  It's not safe for two threads to
2181generate a random number from the same @code{gmp_randstate_t} simultaneously,
2182since this involves an update of that variable.
2183@end itemize
2184
2185
2186@need 2000
2187@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
2188@section Useful Macros and Constants
2189@cindex Useful macros and constants
2190@cindex Constants
2191
2192@deftypevr {Global Constant} {const int} mp_bits_per_limb
2193@findex mp_bits_per_limb
2194@cindex Bits per limb
2195@cindex Limb size
2196The number of bits per limb.
2197@end deftypevr
2198
2199@defmac __GNU_MP_VERSION
2200@defmacx __GNU_MP_VERSION_MINOR
2201@defmacx __GNU_MP_VERSION_PATCHLEVEL
2202@cindex Version number
2203@cindex GMP version number
2204The major and minor GMP version, and patch level, respectively, as integers.
2205For GMP i.j, these numbers will be i, j, and 0, respectively.
2206For GMP i.j.k, these numbers will be i, j, and k, respectively.
2207@end defmac
2208
2209@deftypevr {Global Constant} {const char * const} gmp_version
2210@findex gmp_version
2211The GMP version number, as a null-terminated string, in the form ``i.j.k''.
2212This release is @nicode{"@value{VERSION}"}.  Note that the format ``i.j'' was
2213used, before version 4.3.0, when k was zero.
2214@end deftypevr
2215
2216@defmac __GMP_CC
2217@defmacx __GMP_CFLAGS
2218The compiler and compiler flags, respectively, used when compiling GMP, as
2219strings.
2220@end defmac
2221
2222
2223@node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics
2224@section Compatibility with older versions
2225@cindex Compatibility with older versions
2226@cindex Past GMP versions
2227@cindex Upward compatibility
2228
2229This version of GMP is upwardly binary compatible with all 5.x, 4.x, and 3.x
2230versions, and upwardly compatible at the source level with all 2.x versions,
2231with the following exceptions.
2232
2233@itemize @bullet
2234@item
2235@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
2236with other @code{mpn} functions.
2237
2238@item
2239@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
22403.0.1, but in 3.1 reverted to the 2.x style.
2241
2242@item
2243@code{mpn_bdivmod}, documented as preliminary in GMP 4, has been removed.
2244@end itemize
2245
2246There are a number of compatibility issues between GMP 1 and GMP 2 that of
2247course also apply when porting applications from GMP 1 to GMP 5.  Please
2248see the GMP 2 manual for details.
2249
2250@c @item Integer division functions round the result differently.  The obsolete
2251@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
2252@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
2253@c quotient towards
2254@c @ifinfo
2255@c @minus{}infinity).
2256@c @end ifinfo
2257@c @iftex
2258@c @tex
2259@c $-\infty$).
2260@c @end tex
2261@c @end iftex
2262@c There are a lot of functions for integer division, giving the user better
2263@c control over the rounding.
2264
2265@c @item The function @code{mpz_mod} now compute the true @strong{mod} function.
2266
2267@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
2268@c @strong{mod} for reduction.
2269
2270@c @item The assignment functions for rational numbers do no longer canonicalize
2271@c their results.  In the case a non-canonical result could arise from an
2272@c assignment, the user need to insert an explicit call to
2273@c @code{mpq_canonicalize}.  This change was made for efficiency.
2274
2275@c @item Output generated by @code{mpz_out_raw} in this release cannot be read
2276@c by @code{mpz_inp_raw} in previous releases.  This change was made for making
2277@c the file format truly portable between machines with different word sizes.
2278
2279@c @item Several @code{mpn} functions have changed.  But they were intentionally
2280@c undocumented in previous releases.
2281
2282@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
2283@c are now implemented as macros, and thereby sometimes evaluate their
2284@c arguments multiple times.
2285
2286@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
2287@c for 0^0.  (In version 1, they yielded 0.)
2288
2289@c In version 1 of the library, @code{mpq_set_den} handled negative
2290@c denominators by copying the sign to the numerator.  That is no longer done.
2291
2292@c Pure assignment functions do not canonicalize the assigned variable.  It is
2293@c the responsibility of the user to canonicalize the assigned variable before
2294@c any arithmetic operations are performed on that variable.
2295@c Note that this is an incompatible change from version 1 of the library.
2296
2297@c @end enumerate
2298
2299
2300@need 1000
2301@node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics
2302@section Demonstration programs
2303@cindex Demonstration programs
2304@cindex Example programs
2305@cindex Sample programs
2306The @file{demos} subdirectory has some sample programs using GMP@.  These
2307aren't built or installed, but there's a @file{Makefile} with rules for them.
2308For instance,
2309
2310@example
2311make pexpr
2312./pexpr 68^975+10
2313@end example
2314
2315@noindent
2316The following programs are provided
2317
2318@itemize @bullet
2319@item
2320@cindex Expression parsing demo
2321@cindex Parsing expressions demo
2322@samp{pexpr} is an expression evaluator, the program used on the GMP web page.
2323@item
2324@cindex Expression parsing demo
2325@cindex Parsing expressions demo
2326The @samp{calc} subdirectory has a similar but simpler evaluator using
2327@command{lex} and @command{yacc}.
2328@item
2329@cindex Expression parsing demo
2330@cindex Parsing expressions demo
2331The @samp{expr} subdirectory is yet another expression evaluator, a library
2332designed for ease of use within a C program.  See @file{demos/expr/README} for
2333more information.
2334@item
2335@cindex Factorization demo
2336@samp{factorize} is a Pollard-Rho factorization program.
2337@item
2338@samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p}
2339function.
2340@item
2341@samp{primes} counts or lists primes in an interval, using a sieve.
2342@item
2343@samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic
2344class numbers.
2345@item
2346@cindex @code{perl}
2347@cindex GMP Perl module
2348@cindex Perl module
2349The @samp{perl} subdirectory is a comprehensive perl interface to GMP@.  See
2350@file{demos/perl/INSTALL} for more information.  Documentation is in POD
2351format in @file{demos/perl/GMP.pm}.
2352@end itemize
2353
2354As an aside, consideration has been given at various times to some sort of
2355expression evaluation within the main GMP library.  Going beyond something
2356minimal quickly leads to matters like user-defined functions, looping, fixnums
2357for control variables, etc, which are considered outside the scope of GMP
2358(much closer to language interpreters or compilers, @xref{Language Bindings}.)
2359Something simple for program input convenience may yet be a possibility, a
2360combination of the @file{expr} demo and the @file{pexpr} tree back-end
2361perhaps.  But for now the above evaluators are offered as illustrations.
2362
2363
2364@need 1000
2365@node Efficiency, Debugging, Demonstration Programs, GMP Basics
2366@section Efficiency
2367@cindex Efficiency
2368
2369@table @asis
2370@item Small Operands
2371@cindex Small operands
2372On small operands, the time for function call overheads and memory allocation
2373can be significant in comparison to actual calculation.  This is unavoidable
2374in a general purpose variable precision library, although GMP attempts to be
2375as efficient as it can on both large and small operands.
2376
2377@item Static Linking
2378@cindex Static linking
2379On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
2380used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
2381have a small overhead on each function call and global data address.  For many
2382programs this will be insignificant, but for long calculations there's a gain
2383to be had.
2384
2385@item Initializing and Clearing
2386@cindex Initializing and clearing
2387Avoid excessive initializing and clearing of variables, since this can be
2388quite time consuming, especially in comparison to otherwise fast operations
2389like addition.
2390
2391A language interpreter might want to keep a free list or stack of
2392initialized variables ready for use.  It should be possible to integrate
2393something like that with a garbage collector too.
2394
2395@item Reallocations
2396@cindex Reallocations
2397An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
2398values will have its memory repeatedly @code{realloc}ed, which could be quite
2399slow or could fragment memory, depending on the C library.  If an application
2400can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
2401be called to allocate the necessary space from the beginning
2402(@pxref{Initializing Integers}).
2403
2404It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
2405is too small, since all functions will do a further reallocation if necessary.
2406Badly overestimating memory required will waste space though.
2407
2408@item @code{2exp} Functions
2409@cindex @code{2exp} functions
2410It's up to an application to call functions like @code{mpz_mul_2exp} when
2411appropriate.  General purpose functions like @code{mpz_mul} make no attempt to
2412identify powers of two or other special forms, because such inputs will
2413usually be very rare and testing every time would be wasteful.
2414
2415@item @code{ui} and @code{si} Functions
2416@cindex @code{ui} and @code{si} functions
2417The @code{ui} functions and the small number of @code{si} functions exist for
2418convenience and should be used where applicable.  But if for example an
2419@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
2420need extract it and call a @code{ui} function, just use the regular @code{mpz}
2421function.
2422
2423@item In-Place Operations
2424@cindex In-place operations
2425@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
2426and @code{mpf_neg} are fast when used for in-place operations like
2427@code{mpz_abs(x,x)}, since in the current implementation only a single field
2428of @code{x} needs changing.  On suitable compilers (GCC for instance) this is
2429inlined too.
2430
2431@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
2432benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
2433usually only one or two limbs of @code{x} will need to be changed.  The same
2434applies to the full precision @code{mpz_add} etc if @code{y} is small.  If
2435@code{y} is big then cache locality may be helped, but that's all.
2436
2437@code{mpz_mul} is currently the opposite, a separate destination is slightly
2438better.  A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
2439limb, make a temporary copy of @code{x} before forming the result.  Normally
2440that copying will only be a tiny fraction of the time for the multiply, so
2441this is not a particularly important consideration.
2442
2443@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
2444no attempt to recognise a copy of something to itself, so a call like
2445@code{mpz_set(x,x)} will be wasteful.  Naturally that would never be written
2446deliberately, but if it might arise from two pointers to the same object then
2447a test to avoid it might be desirable.
2448
2449@example
2450if (x != y)
2451  mpz_set (x, y);
2452@end example
2453
2454Note that it's never worth introducing extra @code{mpz_set} calls just to get
2455in-place operations.  If a result should go to a particular variable then just
2456direct it there and let GMP take care of data movement.
2457
2458@item Divisibility Testing (Small Integers)
2459@cindex Divisibility testing
2460@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
2461for testing whether an @code{mpz_t} is divisible by an individual small
2462integer.  They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
2463which gives no useful information about the actual remainder, only whether
2464it's zero (or a particular value).
2465
2466However when testing divisibility by several small integers, it's best to take
2467a remainder modulo their product, to save multi-precision operations.  For
2468instance to test whether a number is divisible by any of 23, 29 or 31 take a
2469remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that.
2470
2471The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
2472as a remainder are generally a little slower than the remainder-only functions
2473like @code{mpz_tdiv_ui}.  If the quotient is only rarely wanted then it's
2474probably best to just take a remainder and then go back and calculate the
2475quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
2476remainder is zero).
2477
2478@item Rational Arithmetic
2479@cindex Rational arithmetic
2480The @code{mpq} functions operate on @code{mpq_t} values with no common factors
2481in the numerator and denominator.  Common factors are checked-for and cast out
2482as necessary.  In general, cancelling factors every time is the best approach
2483since it minimizes the sizes for subsequent operations.
2484
2485However, applications that know something about the factorization of the
2486values they're working with might be able to avoid some of the GCDs used for
2487canonicalization, or swap them for divisions.  For example when multiplying by
2488a prime it's enough to check for factors of it in the denominator instead of
2489doing a full GCD@.  Or when forming a big product it might be known that very
2490little cancellation will be possible, and so canonicalization can be left to
2491the end.
2492
2493The @code{mpq_numref} and @code{mpq_denref} macros give access to the
2494numerator and denominator to do things outside the scope of the supplied
2495@code{mpq} functions.  @xref{Applying Integer Functions}.
2496
2497The canonical form for rationals allows mixed-type @code{mpq_t} and integer
2498additions or subtractions to be done directly with multiples of the
2499denominator.  This will be somewhat faster than @code{mpq_add}.  For example,
2500
2501@example
2502/* mpq increment */
2503mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));
2504
2505/* mpq += unsigned long */
2506mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);
2507
2508/* mpq -= mpz */
2509mpz_submul (mpq_numref(q), mpq_denref(q), z);
2510@end example
2511
2512@item Number Sequences
2513@cindex Number sequences
2514Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
2515are designed for calculating isolated values.  If a range of values is wanted
2516it's probably best to call to get a starting point and iterate from there.
2517
2518@item Text Input/Output
2519@cindex Text input/output
2520Hexadecimal or octal are suggested for input or output in text form.
2521Power-of-2 bases like these can be converted much more efficiently than other
2522bases, like decimal.  For big numbers there's usually nothing of particular
2523interest to be seen in the digits, so the base doesn't matter much.
2524
2525Maybe we can hope octal will one day become the normal base for everyday use,
2526as proposed by King Charles XII of Sweden and later reformers.
2527@c Reference: Knuth volume 2 section 4.1, page 184 of second edition.  :-)
2528@end table
2529
2530
2531@node Debugging, Profiling, Efficiency, GMP Basics
2532@section Debugging
2533@cindex Debugging
2534
2535@table @asis
2536@item Stack Overflow
2537@cindex Stack overflow
2538@cindex Segmentation violation
2539@cindex Bus error
2540Depending on the system, a segmentation violation or bus error might be the
2541only indication of stack overflow.  See @samp{--enable-alloca} choices in
2542@ref{Build Options}, for how to address this.
2543
2544In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
2545overflow is recognised by the system before too much damage is done, or
2546@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
2547add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
2548Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
2549These options must be added to the @samp{CFLAGS} used in the GMP build
2550(@pxref{Build Options}), adding them just to an application will have no
2551effect.  Note also they're a slowdown, adding overhead to each function call
2552and each stack allocation.
2553
2554@item Heap Problems
2555@cindex Heap problems
2556@cindex Malloc problems
2557The most likely cause of application problems with GMP is heap corruption.
2558Failing to @code{init} GMP variables will have unpredictable effects, and
2559corruption arising elsewhere in a program may well affect GMP@.  Initializing
2560GMP variables more than once or failing to clear them will cause memory leaks.
2561
2562@cindex Malloc debugger
2563In all such cases a @code{malloc} debugger is recommended.  On a GNU or BSD
2564system the standard C library @code{malloc} has some diagnostic facilities,
2565see @ref{Allocation Debugging,, Allocation Debugging, libc, The GNU C Library
2566Reference Manual}, or @samp{man 3 malloc}.  Other possibilities, in no
2567particular order, include
2568
2569@display
2570@uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc/}
2571@uref{http://dmalloc.com/}
2572@uref{http://www.perens.com/FreeSoftware/} @ (electric fence)
2573@uref{http://packages.debian.org/stable/devel/fda}
2574@uref{http://www.gnupdate.org/components/leakbug/}
2575@uref{http://people.redhat.com/~otaylor/memprof/}
2576@uref{http://www.cbmamiga.demon.co.uk/mpatrol/}
2577@end display
2578
2579The GMP default allocation routines in @file{memory.c} also have a simple
2580sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
2581This is mainly designed for detecting buffer overruns during GMP development,
2582but might find other uses.
2583
2584@item Stack Backtraces
2585@cindex Stack backtrace
2586On some systems the compiler options GMP uses by default can interfere with
2587debugging.  In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
2588is used and this generally inhibits stack backtracing.  Recompiling without
2589such options may help while debugging, though the usual caveats about it
2590potentially moving a memory problem or hiding a compiler bug will apply.
2591
2592@item GDB, the GNU Debugger
2593@cindex GDB
2594@cindex GNU Debugger
2595A sample @file{.gdbinit} is included in the distribution, showing how to call
2596some undocumented dump functions to print GMP variables from within GDB@.  Note
2597that these functions shouldn't be used in final application code since they're
2598undocumented and may be subject to incompatible changes in future versions of
2599GMP.
2600
2601@item Source File Paths
2602GMP has multiple source files with the same name, in different directories.
2603For example @file{mpz}, @file{mpq} and @file{mpf} each have an
2604@file{init.c}.  If the debugger can't already determine the right one it may
2605help to build with absolute paths on each C file.  One way to do that is to
2606use a separate object directory with an absolute path to the source directory.
2607
2608@example
2609cd /my/build/dir
2610/my/source/dir/gmp-@value{VERSION}/configure
2611@end example
2612
2613This works via @code{VPATH}, and might require GNU @command{make}.
2614Alternately it might be possible to change the @code{.c.lo} rules
2615appropriately.
2616
2617@item Assertion Checking
2618@cindex Assertion checking
2619The build option @option{--enable-assert} is available to add some consistency
2620checks to the library (see @ref{Build Options}).  These are likely to be of
2621limited value to most applications.  Assertion failures are just as likely to
2622indicate memory corruption as a library or compiler bug.
2623
2624Applications using the low-level @code{mpn} functions, however, will benefit
2625from @option{--enable-assert} since it adds checks on the parameters of most
2626such functions, many of which have subtle restrictions on their usage.  Note
2627however that only the generic C code has checks, not the assembly code, so
2628@option{--disable-assembly} should be used for maximum checking.
2629
2630@item Temporary Memory Checking
2631The build option @option{--enable-alloca=debug} arranges that each block of
2632temporary memory in GMP is allocated with a separate call to @code{malloc} (or
2633the allocation function set with @code{mp_set_memory_functions}).
2634
2635This can help a malloc debugger detect accesses outside the intended bounds,
2636or detect memory not released.  In a normal build, on the other hand,
2637temporary memory is allocated in blocks which GMP divides up for its own use,
2638or may be allocated with a compiler builtin @code{alloca} which will go
2639nowhere near any malloc debugger hooks.
2640
2641@item Maximum Debuggability
2642To summarize the above, a GMP build for maximum debuggability would be
2643
2644@example
2645./configure --disable-shared --enable-assert \
2646  --enable-alloca=debug --disable-assembly CFLAGS=-g
2647@end example
2648
2649For C++, add @samp{--enable-cxx CXXFLAGS=-g}.
2650
2651@item Checker
2652@cindex Checker
2653@cindex GCC Checker
2654The GCC checker (@uref{https://savannah.nongnu.org/projects/checker/}) can be
2655used with GMP@.  It contains a stub library which means GMP applications
2656compiled with checker can use a normal GMP build.
2657
2658A build of GMP with checking within GMP itself can be made.  This will run
2659very very slowly.  On GNU/Linux for example,
2660
2661@cindex @command{checkergcc}
2662@example
2663./configure --disable-assembly CC=checkergcc
2664@end example
2665
2666@option{--disable-assembly} must be used, since the GMP assembly code doesn't
2667support the checking scheme.  The GMP C++ features cannot be used, since
2668current versions of checker (0.9.9.1) don't yet support the standard C++
2669library.
2670
2671@item Valgrind
2672@cindex Valgrind
2673Valgrind (@uref{http://valgrind.org/}) is a memory checker for x86, ARM, MIPS,
2674PowerPC, and S/390.  It translates and emulates machine instructions to do
2675strong checks for uninitialized data (at the level of individual bits), memory
2676accesses through bad pointers, and memory leaks.
2677
2678Valgrind does not always support every possible instruction, in particular
2679ones recently added to an ISA.  Valgrind might therefore be incompatible with
2680a recent GMP or even a less recent GMP which is compiled using a recent GCC.
2681
2682GMP's assembly code sometimes promotes a read of the limbs to some larger size,
2683for efficiency.  GMP will do this even at the start and end of a multilimb
2684operand, using naturally aligned operations on the larger type.  This may lead
2685to benign reads outside of allocated areas, triggering complaints from
2686Valgrind.  Valgrind's option @samp{--partial-loads-ok=yes} should help.
2687
2688@item Other Problems
2689Any suspected bug in GMP itself should be isolated to make sure it's not an
2690application problem, see @ref{Reporting Bugs}.
2691@end table
2692
2693
2694@node Profiling, Autoconf, Debugging, GMP Basics
2695@section Profiling
2696@cindex Profiling
2697@cindex Execution profiling
2698@cindex @code{--enable-profiling}
2699
2700Running a program under a profiler is a good way to find where it's spending
2701most time and where improvements can be best sought.  The profiling choices
2702for a GMP build are as follows.
2703
2704@table @asis
2705@item @samp{--disable-profiling}
2706The default is to add nothing special for profiling.
2707
2708It should be possible to just compile the mainline of a program with @code{-p}
2709and use @command{prof} to get a profile consisting of timer-based sampling of
2710the program counter.  Most of the GMP assembly code has the necessary symbol
2711information.
2712
2713This approach has the advantage of minimizing interference with normal program
2714operation, but on most systems the resolution of the sampling is quite low (10
2715milliseconds for instance), requiring long runs to get accurate information.
2716
2717@item @samp{--enable-profiling=prof}
2718@cindex @code{prof}
2719Build with support for the system @command{prof}, which means @samp{-p} added
2720to the @samp{CFLAGS}.
2721
2722This provides call counting in addition to program counter sampling, which
2723allows the most frequently called routines to be identified, and an average
2724time spent in each routine to be determined.
2725
2726The x86 assembly code has support for this option, but on other processors
2727the assembly routines will be as if compiled without @samp{-p} and therefore
2728won't appear in the call counts.
2729
2730On some systems, such as GNU/Linux, @samp{-p} in fact means @samp{-pg} and in
2731this case @samp{--enable-profiling=gprof} described below should be used
2732instead.
2733
2734@item @samp{--enable-profiling=gprof}
2735@cindex @code{gprof}
2736Build with support for @command{gprof}, which means @samp{-pg} added to the
2737@samp{CFLAGS}.
2738
2739This provides call graph construction in addition to call counting and program
2740counter sampling, which makes it possible to count calls coming from different
2741locations.  For example the number of calls to @code{mpn_mul} from
2742@code{mpz_mul} versus the number from @code{mpf_mul}.  The program counter
2743sampling is still flat though, so only a total time in @code{mpn_mul} would be
2744accumulated, not a separate amount for each call site.
2745
2746The x86 assembly code has support for this option, but on other processors
2747the assembly routines will be as if compiled without @samp{-pg} and therefore
2748not be included in the call counts.
2749
2750On x86 and m68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
2751incompatible, so the latter is omitted from the default flags in that case,
2752which might result in poorer code generation.
2753
2754Incidentally, it should be possible to use the @command{gprof} program with a
2755plain @samp{--enable-profiling=prof} build.  But in that case only the
2756@samp{gprof -p} flat profile and call counts can be expected to be valid, not
2757the @samp{gprof -q} call graph.
2758
2759@item @samp{--enable-profiling=instrument}
2760@cindex @code{-finstrument-functions}
2761@cindex @code{instrument-functions}
2762Build with the GCC option @samp{-finstrument-functions} added to the
2763@samp{CFLAGS} (@pxref{Code Gen Options,, Options for Code Generation, gcc,
2764Using the GNU Compiler Collection (GCC)}).
2765
2766This inserts special instrumenting calls at the start and end of each
2767function, allowing exact timing and full call graph construction.
2768
2769This instrumenting is not normally a standard system feature and will require
2770support from an external library, such as
2771
2772@cindex FunctionCheck
2773@cindex fnccheck
2774@display
2775@uref{http://sourceforge.net/projects/fnccheck/}
2776@end display
2777
2778This should be included in @samp{LIBS} during the GMP configure so that test
2779programs will link.  For example,
2780
2781@example
2782./configure --enable-profiling=instrument LIBS=-lfc
2783@end example
2784
2785On a GNU system the C library provides dummy instrumenting functions, so
2786programs compiled with this option will link.  In this case it's only
2787necessary to ensure the correct library is added when linking an application.
2788
2789The x86 assembly code supports this option, but on other processors the
2790assembly routines will be as if compiled without
2791@samp{-finstrument-functions} meaning time spent in them will effectively be
2792attributed to their caller.
2793@end table
2794
2795
2796@node Autoconf, Emacs, Profiling, GMP Basics
2797@section Autoconf
2798@cindex Autoconf
2799
2800Autoconf based applications can easily check whether GMP is installed.  The
2801only thing to be noted is that GMP library symbols from version 3 onwards have
2802prefixes like @code{__gmpz}.  The following therefore would be a simple test,
2803
2804@cindex @code{AC_CHECK_LIB}
2805@example
2806AC_CHECK_LIB(gmp, __gmpz_init)
2807@end example
2808
2809This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
2810but an application that must have GMP would want to generate an error if not
2811found.  For example,
2812
2813@example
2814AC_CHECK_LIB(gmp, __gmpz_init, ,
2815  [AC_MSG_ERROR([GNU MP not found, see https://gmplib.org/])])
2816@end example
2817
2818If functions added in some particular version of GMP are required, then one of
2819those can be used when checking.  For example @code{mpz_mul_si} was added in
2820GMP 3.1,
2821
2822@example
2823AC_CHECK_LIB(gmp, __gmpz_mul_si, ,
2824  [AC_MSG_ERROR(
2825  [GNU MP not found, or not 3.1 or up, see https://gmplib.org/])])
2826@end example
2827
2828An alternative would be to test the version number in @file{gmp.h} using say
2829@code{AC_EGREP_CPP}.  That would make it possible to test the exact version,
2830if some particular sub-minor release is known to be necessary.
2831
2832In general it's recommended that applications should simply demand a new
2833enough GMP rather than trying to provide supplements for features not
2834available in past versions.
2835
2836Occasionally an application will need or want to know the size of a type at
2837configuration or preprocessing time, not just with @code{sizeof} in the code.
2838This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
2839up is best for this, since prior versions needed certain @samp{-D} defines on
2840systems using a @code{long long} limb.  The following would suit Autoconf 2.50
2841or up,
2842
2843@example
2844AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
2845@end example
2846
2847
2848@node Emacs,  , Autoconf, GMP Basics
2849@section Emacs
2850@cindex Emacs
2851@cindex @code{info-lookup-symbol}
2852
2853@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
2854on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
2855emacs, The Emacs Editor}).
2856
2857The GMP manual can be included in such lookups by putting the following in
2858your @file{.emacs},
2859
2860@c  This isn't pretty, but there doesn't seem to be a better way (in emacs
2861@c  21.2 at least).  info-lookup->mode-value could be used for the "assoc"s,
2862@c  but that function isn't documented, whereas info-lookup-alist is.
2863@c
2864@example
2865(eval-after-load "info-look"
2866  '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
2867     (setcar (nthcdr 3 mode-value)
2868             (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
2869                   (nth 3 mode-value)))))
2870@end example
2871
2872
2873@node Reporting Bugs, Integer Functions, GMP Basics, Top
2874@comment  node-name,  next,  previous,  up
2875@chapter Reporting Bugs
2876@cindex Reporting bugs
2877@cindex Bug reporting
2878
2879If you think you have found a bug in the GMP library, please investigate it
2880and report it.  We have made this library available to you, and it is not too
2881much to ask you to report the bugs you find.
2882
2883Before you report a bug, check it's not already addressed in @ref{Known Build
2884Problems}, or perhaps @ref{Notes for Particular Systems}.  You may also want
2885to check @uref{https://gmplib.org/} for patches for this release.
2886
2887Please include the following in any report,
2888
2889@itemize @bullet
2890@item
2891The GMP version number, and if pre-packaged or patched then say so.
2892
2893@item
2894A test program that makes it possible for us to reproduce the bug.  Include
2895instructions on how to run the program.
2896
2897@item
2898A description of what is wrong.  If the results are incorrect, in what way.
2899If you get a crash, say so.
2900
2901@item
2902If you get a crash, include a stack backtrace from the debugger if it's
2903informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).
2904
2905@item
2906Please do not send core dumps, executables or @command{strace}s.
2907
2908@item
2909The @samp{configure} options you used when building GMP, if any.
2910
2911@item
2912The output from @samp{configure}, as printed to stdout, with any options used.
2913
2914@item
2915The name of the compiler and its version.  For @command{gcc}, get the version
2916with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.
2917
2918@item
2919The output from running @samp{uname -a}.
2920
2921@item
2922The output from running @samp{./config.guess}, and from running
2923@samp{./configfsf.guess} (might be the same).
2924
2925@item
2926If the bug is related to @samp{configure}, then the compressed contents of
2927@file{config.log}.
2928
2929@item
2930If the bug is related to an @file{asm} file not assembling, then the contents
2931of @file{config.m4} and the offending line or lines from the temporary
2932@file{mpn/tmp-<file>.s}.
2933@end itemize
2934
2935Please make an effort to produce a self-contained report, with something
2936definite that can be tested or debugged.  Vague queries or piecemeal messages
2937are difficult to act on and don't help the development effort.
2938
2939It is not uncommon that an observed problem is actually due to a bug in the
2940compiler; the GMP code tends to explore interesting corners in compilers.
2941
2942If your bug report is good, we will do our best to help you get a corrected
2943version of the library; if the bug report is poor, we won't do anything about
2944it (except maybe ask you to send a better report).
2945
2946Send your report to: @email{gmp-bugs@@gmplib.org}.
2947
2948If you think something in this manual is unclear, or downright incorrect, or if
2949the language needs to be improved, please send a note to the same address.
2950
2951
2952@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
2953@comment  node-name,  next,  previous,  up
2954@chapter Integer Functions
2955@cindex Integer functions
2956
2957This chapter describes the GMP functions for performing integer arithmetic.
2958These functions start with the prefix @code{mpz_}.
2959
2960GMP integers are stored in objects of type @code{mpz_t}.
2961
2962@menu
2963* Initializing Integers::
2964* Assigning Integers::
2965* Simultaneous Integer Init & Assign::
2966* Converting Integers::
2967* Integer Arithmetic::
2968* Integer Division::
2969* Integer Exponentiation::
2970* Integer Roots::
2971* Number Theoretic Functions::
2972* Integer Comparisons::
2973* Integer Logic and Bit Fiddling::
2974* I/O of Integers::
2975* Integer Random Numbers::
2976* Integer Import and Export::
2977* Miscellaneous Integer Functions::
2978* Integer Special Functions::
2979@end menu
2980
2981@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
2982@comment  node-name,  next,  previous,  up
2983@section Initialization Functions
2984@cindex Integer initialization functions
2985@cindex Initialization functions
2986
2987The functions for integer arithmetic assume that all integer objects are
2988initialized.  You do that by calling the function @code{mpz_init}.  For
2989example,
2990
2991@example
2992@{
2993  mpz_t integ;
2994  mpz_init (integ);
2995  @dots{}
2996  mpz_add (integ, @dots{});
2997  @dots{}
2998  mpz_sub (integ, @dots{});
2999
3000  /* Unless the program is about to exit, do ... */
3001  mpz_clear (integ);
3002@}
3003@end example
3004
3005As you can see, you can store new values any number of times, once an
3006object is initialized.
3007
3008@deftypefun void mpz_init (mpz_t @var{x})
3009Initialize @var{x}, and set its value to 0.
3010@end deftypefun
3011
3012@deftypefun void mpz_inits (mpz_t @var{x}, ...)
3013Initialize a NULL-terminated list of @code{mpz_t} variables, and set their
3014values to 0.
3015@end deftypefun
3016
3017@deftypefun void mpz_init2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
3018Initialize @var{x}, with space for @var{n}-bit numbers, and set its value to 0.
3019Calling this function instead of @code{mpz_init} or @code{mpz_inits} is never
3020necessary; reallocation is handled automatically by GMP when needed.
3021
3022While @var{n} defines the initial space, @var{x} will grow automatically in the
3023normal way, if necessary, for subsequent values stored.  @code{mpz_init2} makes
3024it possible to avoid such reallocations if a maximum size is known in advance.
3025
3026In preparation for an operation, GMP often allocates one limb more than
3027ultimately needed.  To make sure GMP will not perform reallocation for
3028@var{x}, you need to add the number of bits in @code{mp_limb_t} to @var{n}.
3029@end deftypefun
3030
3031@deftypefun void mpz_clear (mpz_t @var{x})
3032Free the space occupied by @var{x}.  Call this function for all @code{mpz_t}
3033variables when you are done with them.
3034@end deftypefun
3035
3036@deftypefun void mpz_clears (mpz_t @var{x}, ...)
3037Free the space occupied by a NULL-terminated list of @code{mpz_t} variables.
3038@end deftypefun
3039
3040@deftypefun void mpz_realloc2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
3041Change the space allocated for @var{x} to @var{n} bits.  The value in @var{x}
3042is preserved if it fits, or is set to 0 if not.
3043
3044Calling this function is never necessary; reallocation is handled automatically
3045by GMP when needed.  But this function can be used to increase the space for a
3046variable in order to avoid repeated automatic reallocations, or to decrease it
3047to give memory back to the heap.
3048@end deftypefun
3049
3050
3051@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
3052@comment  node-name,  next,  previous,  up
3053@section Assignment Functions
3054@cindex Integer assignment functions
3055@cindex Assignment functions
3056
3057These functions assign new values to already initialized integers
3058(@pxref{Initializing Integers}).
3059
3060@deftypefun void mpz_set (mpz_t @var{rop}, const mpz_t @var{op})
3061@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
3062@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
3063@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
3064@deftypefunx void mpz_set_q (mpz_t @var{rop}, const mpq_t @var{op})
3065@deftypefunx void mpz_set_f (mpz_t @var{rop}, const mpf_t @var{op})
3066Set the value of @var{rop} from @var{op}.
3067
3068@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
3069make it an integer.
3070@end deftypefun
3071
3072@deftypefun int mpz_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
3073Set the value of @var{rop} from @var{str}, a null-terminated C string in base
3074@var{base}.  White space is allowed in the string, and is simply ignored.
3075
3076The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
3077characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
3078@code{0B} for binary, @code{0} for octal, or decimal otherwise.
3079
3080For bases up to 36, case is ignored; upper-case and lower-case letters have
3081the same value.  For bases 37 to 62, upper-case letter represent the usual
308210..35 while lower-case letter represent 36..61.
3083
3084This function returns 0 if the entire string is a valid number in base
3085@var{base}.  Otherwise it returns @minus{}1.
3086@c
3087@c  It turns out that it is not entirely true that this function ignores
3088@c  white-space.  It does ignore it between digits, but not after a minus sign
3089@c  or within or after ``0x''.  Some thought was given to disallowing all
3090@c  whitespace, but that would be an incompatible change, whitespace has been
3091@c  documented as ignored ever since GMP 1.
3092@c
3093@end deftypefun
3094
3095@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
3096Swap the values @var{rop1} and @var{rop2} efficiently.
3097@end deftypefun
3098
3099
3100@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
3101@comment  node-name,  next,  previous,  up
3102@section Combined Initialization and Assignment Functions
3103@cindex Integer assignment functions
3104@cindex Assignment functions
3105@cindex Integer initialization functions
3106@cindex Initialization functions
3107
3108For convenience, GMP provides a parallel series of initialize-and-set functions
3109which initialize the output and then store the value there.  These functions'
3110names have the form @code{mpz_init_set@dots{}}
3111
3112Here is an example of using one:
3113
3114@example
3115@{
3116  mpz_t pie;
3117  mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
3118  @dots{}
3119  mpz_sub (pie, @dots{});
3120  @dots{}
3121  mpz_clear (pie);
3122@}
3123@end example
3124
3125@noindent
3126Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
3127functions, it can be used as the source or destination operand for the ordinary
3128integer functions.  Don't use an initialize-and-set function on a variable
3129already initialized!
3130
3131@deftypefun void mpz_init_set (mpz_t @var{rop}, const mpz_t @var{op})
3132@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
3133@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
3134@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
3135Initialize @var{rop} with limb space and set the initial numeric value from
3136@var{op}.
3137@end deftypefun
3138
3139@deftypefun int mpz_init_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
3140Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
3141documentation above for details).
3142
3143If the string is a correct base @var{base} number, the function returns 0;
3144if an error occurs it returns @minus{}1.  @var{rop} is initialized even if
3145an error occurs.  (I.e., you have to call @code{mpz_clear} for it.)
3146@end deftypefun
3147
3148
3149@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
3150@comment  node-name,  next,  previous,  up
3151@section Conversion Functions
3152@cindex Integer conversion functions
3153@cindex Conversion functions
3154
3155This section describes functions for converting GMP integers to standard C
3156types.  Functions for converting @emph{to} GMP integers are described in
3157@ref{Assigning Integers} and @ref{I/O of Integers}.
3158
3159@deftypefun {unsigned long int} mpz_get_ui (const mpz_t @var{op})
3160Return the value of @var{op} as an @code{unsigned long}.
3161
3162If @var{op} is too big to fit an @code{unsigned long} then just the least
3163significant bits that do fit are returned.  The sign of @var{op} is ignored,
3164only the absolute value is used.
3165@end deftypefun
3166
3167@deftypefun {signed long int} mpz_get_si (const mpz_t @var{op})
3168If @var{op} fits into a @code{signed long int} return the value of @var{op}.
3169Otherwise return the least significant part of @var{op}, with the same sign
3170as @var{op}.
3171
3172If @var{op} is too big to fit in a @code{signed long int}, the returned
3173result is probably not very useful.  To find out if the value will fit, use
3174the function @code{mpz_fits_slong_p}.
3175@end deftypefun
3176
3177@deftypefun double mpz_get_d (const mpz_t @var{op})
3178Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
3179towards zero).
3180
3181If the exponent from the conversion is too big, the result is system
3182dependent.  An infinity is returned where available.  A hardware overflow trap
3183may or may not occur.
3184@end deftypefun
3185
3186@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, const mpz_t @var{op})
3187Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
3188towards zero), and returning the exponent separately.
3189
3190The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
3191exponent is stored to @code{*@var{exp}}.  @m{@var{d} * 2^{exp}, @var{d} *
31922^@var{exp}} is the (truncated) @var{op} value.  If @var{op} is zero, the
3193return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
3194
3195@cindex @code{frexp}
3196This is similar to the standard C @code{frexp} function (@pxref{Normalization
3197Functions,,, libc, The GNU C Library Reference Manual}).
3198@end deftypefun
3199
3200@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, const mpz_t @var{op})
3201Convert @var{op} to a string of digits in base @var{base}.  The base argument
3202may vary from 2 to 62 or from @minus{}2 to @minus{}36.
3203
3204For @var{base} in the range 2..36, digits and lower-case letters are used; for
3205@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
3206digits, upper-case letters, and lower-case letters (in that significance order)
3207are used.
3208
3209If @var{str} is @code{NULL}, the result string is allocated using the current
3210allocation function (@pxref{Custom Allocation}).  The block will be
3211@code{strlen(str)+1} bytes, that being exactly enough for the string and
3212null-terminator.
3213
3214If @var{str} is not @code{NULL}, it should point to a block of storage large
3215enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
3216+ 2}.  The two extra bytes are for a possible minus sign, and the
3217null-terminator.
3218
3219A pointer to the result string is returned, being either the allocated block,
3220or the given @var{str}.
3221@end deftypefun
3222
3223
3224@need 2000
3225@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
3226@comment  node-name,  next,  previous,  up
3227@section Arithmetic Functions
3228@cindex Integer arithmetic functions
3229@cindex Arithmetic functions
3230
3231@deftypefun void mpz_add (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3232@deftypefunx void mpz_add_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3233Set @var{rop} to @math{@var{op1} + @var{op2}}.
3234@end deftypefun
3235
3236@deftypefun void mpz_sub (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3237@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3238@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, const mpz_t @var{op2})
3239Set @var{rop} to @var{op1} @minus{} @var{op2}.
3240@end deftypefun
3241
3242@deftypefun void mpz_mul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3243@deftypefunx void mpz_mul_si (mpz_t @var{rop}, const mpz_t @var{op1}, long int @var{op2})
3244@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3245Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
3246@end deftypefun
3247
3248@deftypefun void mpz_addmul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3249@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3250Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
3251@end deftypefun
3252
3253@deftypefun void mpz_submul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3254@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3255Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
3256@end deftypefun
3257
3258@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, const mpz_t @var{op1}, mp_bitcnt_t @var{op2})
3259@cindex Bit shift left
3260Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
3261@var{op2}}.  This operation can also be defined as a left shift by @var{op2}
3262bits.
3263@end deftypefun
3264
3265@deftypefun void mpz_neg (mpz_t @var{rop}, const mpz_t @var{op})
3266Set @var{rop} to @minus{}@var{op}.
3267@end deftypefun
3268
3269@deftypefun void mpz_abs (mpz_t @var{rop}, const mpz_t @var{op})
3270Set @var{rop} to the absolute value of @var{op}.
3271@end deftypefun
3272
3273
3274@need 2000
3275@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
3276@section Division Functions
3277@cindex Integer division functions
3278@cindex Division functions
3279
3280Division is undefined if the divisor is zero.  Passing a zero divisor to the
3281division or modulo functions (including the modular powering functions
3282@code{mpz_powm} and @code{mpz_powm_ui}), will cause an intentional division by
3283zero.  This lets a program handle arithmetic exceptions in these functions the
3284same way as for normal C @code{int} arithmetic.
3285
3286@c  Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
3287@c  between each, and seem to let tex do a better job of page breaks than an
3288@c  @sp 1 in the middle of one big set.
3289
3290@deftypefun void mpz_cdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
3291@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3292@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3293@maybepagebreak
3294@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3295@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3296@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
3297@deftypefunx {unsigned long int} mpz_cdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
3298@maybepagebreak
3299@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3300@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3301@end deftypefun
3302
3303@deftypefun void mpz_fdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
3304@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3305@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3306@maybepagebreak
3307@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3308@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3309@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
3310@deftypefunx {unsigned long int} mpz_fdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
3311@maybepagebreak
3312@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3313@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3314@end deftypefun
3315
3316@deftypefun void mpz_tdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
3317@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3318@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3319@maybepagebreak
3320@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3321@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3322@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
3323@deftypefunx {unsigned long int} mpz_tdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
3324@maybepagebreak
3325@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3326@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
3327@cindex Bit shift right
3328
3329@sp 1
3330Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
3331@var{r}.  For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
3332The rounding is in three styles, each suiting different applications.
3333
3334@itemize @bullet
3335@item
3336@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
3337have the opposite sign to @var{d}.  The @code{c} stands for ``ceil''.
3338
3339@item
3340@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
3341@var{r} will have the same sign as @var{d}.  The @code{f} stands for
3342``floor''.
3343
3344@item
3345@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
3346as @var{n}.  The @code{t} stands for ``truncate''.
3347@end itemize
3348
3349In all cases @var{q} and @var{r} will satisfy
3350@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
3351@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.
3352
3353The @code{q} functions calculate only the quotient, the @code{r} functions
3354only the remainder, and the @code{qr} functions calculate both.  Note that for
3355@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
3356results will be unpredictable.
3357
3358For the @code{ui} variants the return value is the remainder, and in fact
3359returning the remainder is all the @code{div_ui} functions do.  For
3360@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
3361return value is the absolute value of the remainder.
3362
3363For the @code{2exp} variants the divisor is @m{2^b,2^@var{b}}.  These
3364functions are implemented as right shifts and bit masks, but of course they
3365round the same as the other functions.
3366
3367For positive @var{n} both @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp}
3368are simple bitwise right shifts.  For negative @var{n}, @code{mpz_fdiv_q_2exp}
3369is effectively an arithmetic right shift treating @var{n} as twos complement
3370the same as the bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp}
3371effectively treats @var{n} as sign and magnitude.
3372@end deftypefun
3373
3374@deftypefun void mpz_mod (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
3375@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
3376Set @var{r} to @var{n} @code{mod} @var{d}.  The sign of the divisor is
3377ignored; the result is always non-negative.
3378
3379@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
3380remainder as well as setting @var{r}.  See @code{mpz_fdiv_ui} above if only
3381the return value is wanted.
3382@end deftypefun
3383
3384@deftypefun void mpz_divexact (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
3385@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, const mpz_t @var{n}, unsigned long @var{d})
3386@cindex Exact division functions
3387Set @var{q} to @var{n}/@var{d}.  These functions produce correct results only
3388when it is known in advance that @var{d} divides @var{n}.
3389
3390These routines are much faster than the other division functions, and are the
3391best choice when exact division is known to occur, for example reducing a
3392rational to lowest terms.
3393@end deftypefun
3394
3395@deftypefun int mpz_divisible_p (const mpz_t @var{n}, const mpz_t @var{d})
3396@deftypefunx int mpz_divisible_ui_p (const mpz_t @var{n}, unsigned long int @var{d})
3397@deftypefunx int mpz_divisible_2exp_p (const mpz_t @var{n}, mp_bitcnt_t @var{b})
3398@cindex Divisibility functions
3399Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of
3400@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}.
3401
3402@var{n} is divisible by @var{d} if there exists an integer @var{q} satisfying
3403@math{@var{n} = @var{q}@GMPmultiply{}@var{d}}.  Unlike the other division
3404functions, @math{@var{d}=0} is accepted and following the rule it can be seen
3405that only 0 is considered divisible by 0.
3406@end deftypefun
3407
3408@deftypefun int mpz_congruent_p (const mpz_t @var{n}, const mpz_t @var{c}, const mpz_t @var{d})
3409@deftypefunx int mpz_congruent_ui_p (const mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d})
3410@deftypefunx int mpz_congruent_2exp_p (const mpz_t @var{n}, const mpz_t @var{c}, mp_bitcnt_t @var{b})
3411@cindex Divisibility functions
3412@cindex Congruence functions
3413Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the
3414case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}.
3415
3416@var{n} is congruent to @var{c} mod @var{d} if there exists an integer @var{q}
3417satisfying @math{@var{n} = @var{c} + @var{q}@GMPmultiply{}@var{d}}.  Unlike
3418the other division functions, @math{@var{d}=0} is accepted and following the
3419rule it can be seen that @var{n} and @var{c} are considered congruent mod 0
3420only when exactly equal.
3421@end deftypefun
3422
3423
3424@need 2000
3425@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions
3426@section Exponentiation Functions
3427@cindex Integer exponentiation functions
3428@cindex Exponentiation functions
3429@cindex Powering functions
3430
3431@deftypefun void mpz_powm (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod})
3432@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp}, const mpz_t @var{mod})
3433Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp})
3434modulo @var{mod}}.
3435
3436Negative @var{exp} is supported if an inverse @math{@var{base}^@W{-1} @bmod
3437@var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}).
3438If an inverse doesn't exist then a divide by zero is raised.
3439@end deftypefun
3440
3441@deftypefun void mpz_powm_sec (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod})
3442Set @var{rop} to @m{base^{exp} \bmod @var{mod}, (@var{base} raised to @var{exp})
3443modulo @var{mod}}.
3444
3445It is required that @math{@var{exp} > 0} and that @var{mod} is odd.
3446
3447This function is designed to take the same time and have the same cache access
3448patterns for any two same-size arguments, assuming that function arguments are
3449placed at the same position and that the machine state is identical upon
3450function entry.  This function is intended for cryptographic purposes, where
3451resilience to side-channel attacks is desired.
3452@end deftypefun
3453
3454@deftypefun void mpz_pow_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp})
3455@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp})
3456Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}.  The case
3457@math{0^0} yields 1.
3458@end deftypefun
3459
3460
3461@need 2000
3462@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions
3463@section Root Extraction Functions
3464@cindex Integer root functions
3465@cindex Root extraction functions
3466
3467@deftypefun int mpz_root (mpz_t @var{rop}, const mpz_t @var{op}, unsigned long int @var{n})
3468Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer
3469part of the @var{n}th root of @var{op}.  Return non-zero if the computation
3470was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power.
3471@end deftypefun
3472
3473@deftypefun void mpz_rootrem (mpz_t @var{root}, mpz_t @var{rem}, const mpz_t @var{u}, unsigned long int @var{n})
3474Set @var{root} to @m{\lfloor\root n \of {u}\rfloor@C{},} the truncated
3475integer part of the @var{n}th root of @var{u}.  Set @var{rem} to the
3476remainder, @m{(@var{u} - @var{root}^n),
3477@var{u}@minus{}@var{root}**@var{n}}.
3478@end deftypefun
3479
3480@deftypefun void mpz_sqrt (mpz_t @var{rop}, const mpz_t @var{op})
3481Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated
3482integer part of the square root of @var{op}.
3483@end deftypefun
3484
3485@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, const mpz_t @var{op})
3486Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part
3487of the square root of @var{op}}, like @code{mpz_sqrt}.  Set @var{rop2} to the
3488remainder @m{(@var{op} - @var{rop1}^2),
3489@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a
3490perfect square.
3491
3492If @var{rop1} and @var{rop2} are the same variable, the results are
3493undefined.
3494@end deftypefun
3495
3496@deftypefun int mpz_perfect_power_p (const mpz_t @var{op})
3497@cindex Perfect power functions
3498@cindex Root testing functions
3499Return non-zero if @var{op} is a perfect power, i.e., if there exist integers
3500@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that
3501@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}.
3502
3503Under this definition both 0 and 1 are considered to be perfect powers.
3504Negative values of @var{op} are accepted, but of course can only be odd
3505perfect powers.
3506@end deftypefun
3507
3508@deftypefun int mpz_perfect_square_p (const mpz_t @var{op})
3509@cindex Perfect square functions
3510@cindex Root testing functions
3511Return non-zero if @var{op} is a perfect square, i.e., if the square root of
3512@var{op} is an integer.  Under this definition both 0 and 1 are considered to
3513be perfect squares.
3514@end deftypefun
3515
3516
3517@need 2000
3518@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions
3519@section Number Theoretic Functions
3520@cindex Number theoretic functions
3521
3522@deftypefun int mpz_probab_prime_p (const mpz_t @var{n}, int @var{reps})
3523@cindex Prime testing functions
3524@cindex Probable prime testing functions
3525Determine whether @var{n} is prime.  Return 2 if @var{n} is definitely prime,
3526return 1 if @var{n} is probably prime (without being certain), or return 0 if
3527@var{n} is definitely non-prime.
3528
3529This function performs some trial divisions, then @var{reps} Miller-Rabin
3530probabilistic primality tests.  A higher @var{reps} value will reduce the
3531chances of a non-prime being identified as ``probably prime''.  A composite
3532number will be identified as a prime with a probability of less than
3533@m{4^{-reps},4^(-@var{reps})}.  Reasonable values of @var{reps} are between 15
3534and 50.
3535@end deftypefun
3536
3537@deftypefun void mpz_nextprime (mpz_t @var{rop}, const mpz_t @var{op})
3538@cindex Next prime function
3539Set @var{rop} to the next prime greater than @var{op}.
3540
3541This function uses a probabilistic algorithm to identify primes.  For
3542practical purposes it's adequate, the chance of a composite passing will be
3543extremely small.
3544@end deftypefun
3545
3546@c mpz_prime_p not implemented as of gmp 3.0.
3547
3548@c @deftypefun int mpz_prime_p (const mpz_t @var{n})
3549@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
3550@c This function is far slower than @code{mpz_probab_prime_p}, but then it
3551@c never returns non-zero for composite numbers.
3552
3553@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
3554@c The likelihood of a programming error or hardware malfunction is orders
3555@c of magnitudes greater than the likelihood for a composite to pass as a
3556@c prime, if the @var{reps} argument is in the suggested range.)
3557@c @end deftypefun
3558
3559@deftypefun void mpz_gcd (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3560@cindex Greatest common divisor functions
3561@cindex GCD functions
3562Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}.  The
3563result is always positive even if one or both input operands are negative.
3564Except if both inputs are zero; then this function defines @math{gcd(0,0) = 0}.
3565@end deftypefun
3566
3567@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
3568Compute the greatest common divisor of @var{op1} and @var{op2}.  If
3569@var{rop} is not @code{NULL}, store the result there.
3570
3571If the result is small enough to fit in an @code{unsigned long int}, it is
3572returned.  If the result does not fit, 0 is returned, and the result is equal
3573to the argument @var{op1}.  Note that the result will always fit if @var{op2}
3574is non-zero.
3575@end deftypefun
3576
3577@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, const mpz_t @var{a}, const mpz_t @var{b})
3578@cindex Extended GCD
3579@cindex GCD extended
3580Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in
3581addition set @var{s} and @var{t} to coefficients satisfying
3582@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}.
3583The value in @var{g} is always positive, even if one or both of @var{a} and
3584@var{b} are negative (or zero if both inputs are zero).  The values in @var{s}
3585and @var{t} are chosen such that normally, @math{@GMPabs{@var{s}} <
3586@GMPabs{@var{b}} / (2 @var{g})} and @math{@GMPabs{@var{t}} < @GMPabs{@var{a}}
3587/ (2 @var{g})}, and these relations define @var{s} and @var{t} uniquely.  There
3588are a few exceptional cases:
3589
3590If @math{@GMPabs{@var{a}} = @GMPabs{@var{b}}}, then @math{@var{s} = 0},
3591@math{@var{t} = sgn(@var{b})}.
3592
3593Otherwise, @math{@var{s} = sgn(@var{a})} if @math{@var{b} = 0} or
3594@math{@GMPabs{@var{b}} = 2 @var{g}}, and @math{@var{t} = sgn(@var{b})} if
3595@math{@var{a} = 0} or @math{@GMPabs{@var{a}} = 2 @var{g}}.
3596
3597In all cases, @math{@var{s} = 0} if and only if @math{@var{g} =
3598@GMPabs{@var{b}}}, i.e., if @var{b} divides @var{a} or @math{@var{a} = @var{b}
3599= 0}.
3600
3601If @var{t} is @code{NULL} then that value is not computed.
3602@end deftypefun
3603
3604@deftypefun void mpz_lcm (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3605@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long @var{op2})
3606@cindex Least common multiple functions
3607@cindex LCM functions
3608Set @var{rop} to the least common multiple of @var{op1} and @var{op2}.
3609@var{rop} is always positive, irrespective of the signs of @var{op1} and
3610@var{op2}.  @var{rop} will be zero if either @var{op1} or @var{op2} is zero.
3611@end deftypefun
3612
3613@deftypefun int mpz_invert (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3614@cindex Modular inverse functions
3615@cindex Inverse modulo functions
3616Compute the inverse of @var{op1} modulo @var{op2} and put the result in
3617@var{rop}.  If the inverse exists, the return value is non-zero and @var{rop}
3618will satisfy @math{0 @le{} @var{rop} < @GMPabs{@var{op2}}} (with @math{@var{rop}
3619= 0} possible only when @math{@GMPabs{@var{op2}} = 1}, i.e., in the
3620somewhat degenerate zero ring).  If an inverse doesn't
3621exist the return value is zero and @var{rop} is undefined.  The behaviour of
3622this function is undefined when @var{op2} is zero.
3623@end deftypefun
3624
3625@deftypefun int mpz_jacobi (const mpz_t @var{a}, const mpz_t @var{b})
3626@cindex Jacobi symbol functions
3627Calculate the Jacobi symbol @m{\left(a \over b\right),
3628(@var{a}/@var{b})}.  This is defined only for @var{b} odd.
3629@end deftypefun
3630
3631@deftypefun int mpz_legendre (const mpz_t @var{a}, const mpz_t @var{p})
3632@cindex Legendre symbol functions
3633Calculate the Legendre symbol @m{\left(a \over p\right),
3634(@var{a}/@var{p})}.  This is defined only for @var{p} an odd positive
3635prime, and for such @var{p} it's identical to the Jacobi symbol.
3636@end deftypefun
3637
3638@deftypefun int mpz_kronecker (const mpz_t @var{a}, const mpz_t @var{b})
3639@deftypefunx int mpz_kronecker_si (const mpz_t @var{a}, long @var{b})
3640@deftypefunx int mpz_kronecker_ui (const mpz_t @var{a}, unsigned long @var{b})
3641@deftypefunx int mpz_si_kronecker (long @var{a}, const mpz_t @var{b})
3642@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, const mpz_t @var{b})
3643@cindex Kronecker symbol functions
3644Calculate the Jacobi symbol @m{\left(a \over b\right),
3645(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
36462\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} odd, or
3647@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} even.
3648
3649When @var{b} is odd the Jacobi symbol and Kronecker symbol are
3650identical, so @code{mpz_kronecker_ui} etc can be used for mixed
3651precision Jacobi symbols too.
3652
3653For more information see Henri Cohen section 1.4.2 (@pxref{References}),
3654or any number theory textbook.  See also the example program
3655@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
3656@end deftypefun
3657
3658@deftypefun {mp_bitcnt_t} mpz_remove (mpz_t @var{rop}, const mpz_t @var{op}, const mpz_t @var{f})
3659@cindex Remove factor functions
3660@cindex Factor removal functions
3661Remove all occurrences of the factor @var{f} from @var{op} and store the
3662result in @var{rop}.  The return value is how many such occurrences were
3663removed.
3664@end deftypefun
3665
3666@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
3667@deftypefunx void mpz_2fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
3668@deftypefunx void mpz_mfac_uiui (mpz_t @var{rop}, unsigned long int @var{n}, unsigned long int @var{m})
3669@cindex Factorial functions
3670Set @var{rop} to the factorial of @var{n}: @code{mpz_fac_ui} computes the plain factorial @var{n}!,
3671@code{mpz_2fac_ui} computes the double-factorial @var{n}!!, and @code{mpz_mfac_uiui} the
3672@var{m}-multi-factorial @m{n!^{(m)}, @var{n}!^(@var{m})}.
3673@end deftypefun
3674
3675@deftypefun void mpz_primorial_ui (mpz_t @var{rop}, unsigned long int @var{n})
3676@cindex Primorial functions
3677Set @var{rop} to the primorial of @var{n}, i.e. the product of all positive
3678prime numbers @math{@le{}@var{n}}.
3679@end deftypefun
3680
3681@deftypefun void mpz_bin_ui (mpz_t @var{rop}, const mpz_t @var{n}, unsigned long int @var{k})
3682@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
3683@cindex Binomial coefficient functions
3684Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
3685@var{k}} and store the result in @var{rop}.  Negative values of @var{n} are
3686supported by @code{mpz_bin_ui}, using the identity
3687@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
3688bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6
3689part G.
3690@end deftypefun
3691
3692@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
3693@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
3694@cindex Fibonacci sequence functions
3695@code{mpz_fib_ui} sets @var{fn} to to @m{F_n,F[n]}, the @var{n}'th Fibonacci
3696number.  @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
3697@m{F_{n-1},F[n-1]}.
3698
3699These functions are designed for calculating isolated Fibonacci numbers.  When
3700a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
3701iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
3702similar.
3703@end deftypefun
3704
3705@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
3706@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
3707@cindex Lucas number functions
3708@code{mpz_lucnum_ui} sets @var{ln} to to @m{L_n,L[n]}, the @var{n}'th Lucas
3709number.  @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
3710to @m{L_{n-1},L[n-1]}.
3711
3712These functions are designed for calculating isolated Lucas numbers.  When a
3713sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and
3714iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or
3715similar.
3716
3717The Fibonacci numbers and Lucas numbers are related sequences, so it's never
3718necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}.  The
3719formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers
3720Algorithm}, the reverse is straightforward too.
3721@end deftypefun
3722
3723
3724@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions
3725@comment  node-name,  next,  previous,  up
3726@section Comparison Functions
3727@cindex Integer comparison functions
3728@cindex Comparison functions
3729
3730@deftypefn Function int mpz_cmp (const mpz_t @var{op1}, const mpz_t @var{op2})
3731@deftypefnx Function int mpz_cmp_d (const mpz_t @var{op1}, double @var{op2})
3732@deftypefnx Macro int mpz_cmp_si (const mpz_t @var{op1}, signed long int @var{op2})
3733@deftypefnx Macro int mpz_cmp_ui (const mpz_t @var{op1}, unsigned long int @var{op2})
3734Compare @var{op1} and @var{op2}.  Return a positive value if @math{@var{op1} >
3735@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if
3736@math{@var{op1} < @var{op2}}.
3737
3738@code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate their
3739arguments more than once.  @code{mpz_cmp_d} can be called with an infinity,
3740but results are undefined for a NaN.
3741@end deftypefn
3742
3743@deftypefn Function int mpz_cmpabs (const mpz_t @var{op1}, const mpz_t @var{op2})
3744@deftypefnx Function int mpz_cmpabs_d (const mpz_t @var{op1}, double @var{op2})
3745@deftypefnx Function int mpz_cmpabs_ui (const mpz_t @var{op1}, unsigned long int @var{op2})
3746Compare the absolute values of @var{op1} and @var{op2}.  Return a positive
3747value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if
3748@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if
3749@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}.
3750
3751@code{mpz_cmpabs_d} can be called with an infinity, but results are undefined
3752for a NaN.
3753@end deftypefn
3754
3755@deftypefn Macro int mpz_sgn (const mpz_t @var{op})
3756@cindex Sign tests
3757@cindex Integer sign tests
3758Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
3759@math{-1} if @math{@var{op} < 0}.
3760
3761This function is actually implemented as a macro.  It evaluates its argument
3762multiple times.
3763@end deftypefn
3764
3765
3766@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions
3767@comment  node-name,  next,  previous,  up
3768@section Logical and Bit Manipulation Functions
3769@cindex Logical functions
3770@cindex Bit manipulation functions
3771@cindex Integer logical functions
3772@cindex Integer bit manipulation functions
3773
3774These functions behave as if twos complement arithmetic were used (although
3775sign-magnitude is the actual implementation).  The least significant bit is
3776number 0.
3777
3778@deftypefun void mpz_and (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3779Set @var{rop} to @var{op1} bitwise-and @var{op2}.
3780@end deftypefun
3781
3782@deftypefun void mpz_ior (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3783Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}.
3784@end deftypefun
3785
3786@deftypefun void mpz_xor (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
3787Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}.
3788@end deftypefun
3789
3790@deftypefun void mpz_com (mpz_t @var{rop}, const mpz_t @var{op})
3791Set @var{rop} to the one's complement of @var{op}.
3792@end deftypefun
3793
3794@deftypefun {mp_bitcnt_t} mpz_popcount (const mpz_t @var{op})
3795If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is the
3796number of 1 bits in the binary representation.  If @math{@var{op}<0}, the
3797number of 1s is infinite, and the return value is the largest possible
3798@code{mp_bitcnt_t}.
3799@end deftypefun
3800
3801@deftypefun {mp_bitcnt_t} mpz_hamdist (const mpz_t @var{op1}, const mpz_t @var{op2})
3802If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return the
3803hamming distance between the two operands, which is the number of bit positions
3804where @var{op1} and @var{op2} have different bit values.  If one operand is
3805@math{@ge{}0} and the other @math{<0} then the number of bits different is
3806infinite, and the return value is the largest possible @code{mp_bitcnt_t}.
3807@end deftypefun
3808
3809@deftypefun {mp_bitcnt_t} mpz_scan0 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
3810@deftypefunx {mp_bitcnt_t} mpz_scan1 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
3811@cindex Bit scanning functions
3812@cindex Scan bit functions
3813Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
3814bits, until the first 0 or 1 bit (respectively) is found.  Return the index of
3815the found bit.
3816
3817If the bit at @var{starting_bit} is already what's sought, then
3818@var{starting_bit} is returned.
3819
3820If there's no bit found, then the largest possible @code{mp_bitcnt_t} is
3821returned.  This will happen in @code{mpz_scan0} past the end of a negative
3822number, or @code{mpz_scan1} past the end of a nonnegative number.
3823@end deftypefun
3824
3825@deftypefun void mpz_setbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
3826Set bit @var{bit_index} in @var{rop}.
3827@end deftypefun
3828
3829@deftypefun void mpz_clrbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
3830Clear bit @var{bit_index} in @var{rop}.
3831@end deftypefun
3832
3833@deftypefun void mpz_combit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
3834Complement bit @var{bit_index} in @var{rop}.
3835@end deftypefun
3836
3837@deftypefun int mpz_tstbit (const mpz_t @var{op}, mp_bitcnt_t @var{bit_index})
3838Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly.
3839@end deftypefun
3840
3841@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions
3842@comment  node-name,  next,  previous,  up
3843@section Input and Output Functions
3844@cindex Integer input and output functions
3845@cindex Input functions
3846@cindex Output functions
3847@cindex I/O functions
3848
3849Functions that perform input from a stdio stream, and functions that output to
3850a stdio stream, of @code{mpz} numbers.  Passing a @code{NULL} pointer for a
3851@var{stream} argument to any of these functions will make them read from
3852@code{stdin} and write to @code{stdout}, respectively.
3853
3854When using any of these functions, it is a good idea to include @file{stdio.h}
3855before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
3856for these functions.
3857
3858See also @ref{Formatted Output} and @ref{Formatted Input}.
3859
3860@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, const mpz_t @var{op})
3861Output @var{op} on stdio stream @var{stream}, as a string of digits in base
3862@var{base}.  The base argument may vary from 2 to 62 or from @minus{}2 to
3863@minus{}36.
3864
3865For @var{base} in the range 2..36, digits and lower-case letters are used; for
3866@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
3867digits, upper-case letters, and lower-case letters (in that significance order)
3868are used.
3869
3870Return the number of bytes written, or if an error occurred, return 0.
3871@end deftypefun
3872
3873@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
3874Input a possibly white-space preceded string in base @var{base} from stdio
3875stream @var{stream}, and put the read integer in @var{rop}.
3876
3877The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
3878characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
3879@code{0B} for binary, @code{0} for octal, or decimal otherwise.
3880
3881For bases up to 36, case is ignored; upper-case and lower-case letters have
3882the same value.  For bases 37 to 62, upper-case letter represent the usual
388310..35 while lower-case letter represent 36..61.
3884
3885Return the number of bytes read, or if an error occurred, return 0.
3886@end deftypefun
3887
3888@deftypefun size_t mpz_out_raw (FILE *@var{stream}, const mpz_t @var{op})
3889Output @var{op} on stdio stream @var{stream}, in raw binary format.  The
3890integer is written in a portable format, with 4 bytes of size information, and
3891that many bytes of limbs.  Both the size and the limbs are written in
3892decreasing significance order (i.e., in big-endian).
3893
3894The output can be read with @code{mpz_inp_raw}.
3895
3896Return the number of bytes written, or if an error occurred, return 0.
3897
3898The output of this can not be read by @code{mpz_inp_raw} from GMP 1, because
3899of changes necessary for compatibility between 32-bit and 64-bit machines.
3900@end deftypefun
3901
3902@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
3903Input from stdio stream @var{stream} in the format written by
3904@code{mpz_out_raw}, and put the result in @var{rop}.  Return the number of
3905bytes read, or if an error occurred, return 0.
3906
3907This routine can read the output from @code{mpz_out_raw} also from GMP 1, in
3908spite of changes necessary for compatibility between 32-bit and 64-bit
3909machines.
3910@end deftypefun
3911
3912
3913@need 2000
3914@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
3915@comment  node-name,  next,  previous,  up
3916@section Random Number Functions
3917@cindex Integer random number functions
3918@cindex Random number functions
3919
3920The random number functions of GMP come in two groups; older function
3921that rely on a global state, and newer functions that accept a state
3922parameter that is read and modified.  Please see the @ref{Random Number
3923Functions} for more information on how to use and not to use random
3924number functions.
3925
3926@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
3927Generate a uniformly distributed random integer in the range 0 to @m{2^n-1,
39282^@var{n}@minus{}1}, inclusive.
3929
3930The variable @var{state} must be initialized by calling one of the
3931@code{gmp_randinit} functions (@ref{Random State Initialization}) before
3932invoking this function.
3933@end deftypefun
3934
3935@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, const mpz_t @var{n})
3936Generate a uniform random integer in the range 0 to @math{@var{n}-1},
3937inclusive.
3938
3939The variable @var{state} must be initialized by calling one of the
3940@code{gmp_randinit} functions (@ref{Random State Initialization})
3941before invoking this function.
3942@end deftypefun
3943
3944@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
3945Generate a random integer with long strings of zeros and ones in the
3946binary representation.  Useful for testing functions and algorithms,
3947since this kind of random numbers have proven to be more likely to
3948trigger corner-case bugs.  The random number will be in the range
3949@m{2^{n-1}, 2^@var{n@minus{}1}} to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive.
3950
3951The variable @var{state} must be initialized by calling one of the
3952@code{gmp_randinit} functions (@ref{Random State Initialization})
3953before invoking this function.
3954@end deftypefun
3955
3956@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
3957Generate a random integer of at most @var{max_size} limbs.  The generated
3958random number doesn't satisfy any particular requirements of randomness.
3959Negative random numbers are generated when @var{max_size} is negative.
3960
3961This function is obsolete.  Use @code{mpz_urandomb} or
3962@code{mpz_urandomm} instead.
3963@end deftypefun
3964
3965@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
3966Generate a random integer of at most @var{max_size} limbs, with long strings
3967of zeros and ones in the binary representation.  Useful for testing functions
3968and algorithms, since this kind of random numbers have proven to be more
3969likely to trigger corner-case bugs.  Negative random numbers are generated
3970when @var{max_size} is negative.
3971
3972This function is obsolete.  Use @code{mpz_rrandomb} instead.
3973@end deftypefun
3974
3975
3976@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions
3977@section Integer Import and Export
3978
3979@code{mpz_t} variables can be converted to and from arbitrary words of binary
3980data with the following functions.
3981
3982@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op})
3983@cindex Integer import
3984@cindex Import
3985Set @var{rop} from an array of word data at @var{op}.
3986
3987The parameters specify the format of the data.  @var{count} many words are
3988read, each @var{size} bytes.  @var{order} can be 1 for most significant word
3989first or -1 for least significant first.  Within each word @var{endian} can be
39901 for most significant byte first, -1 for least significant first, or 0 for
3991the native endianness of the host CPU@.  The most significant @var{nails} bits
3992of each word are skipped, this can be 0 to use the full words.
3993
3994There is no sign taken from the data, @var{rop} will simply be a positive
3995integer.  An application can handle any sign itself, and apply it for instance
3996with @code{mpz_neg}.
3997
3998There are no data alignment restrictions on @var{op}, any address is allowed.
3999
4000Here's an example converting an array of @code{unsigned long} data, most
4001significant element first, and host byte order within each value.
4002
4003@example
4004unsigned long  a[20];
4005/* Initialize @var{z} and @var{a} */
4006mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a);
4007@end example
4008
4009This example assumes the full @code{sizeof} bytes are used for data in the
4010given type, which is usually true, and certainly true for @code{unsigned long}
4011everywhere we know of.  However on Cray vector systems it may be noted that
4012@code{short} and @code{int} are always stored in 8 bytes (and with
4013@code{sizeof} indicating that) but use only 32 or 46 bits.  The @var{nails}
4014feature can account for this, by passing for instance
4015@code{8*sizeof(int)-INT_BIT}.
4016@end deftypefun
4017
4018@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const mpz_t @var{op})
4019@cindex Integer export
4020@cindex Export
4021Fill @var{rop} with word data from @var{op}.
4022
4023The parameters specify the format of the data produced.  Each word will be
4024@var{size} bytes and @var{order} can be 1 for most significant word first or
4025-1 for least significant first.  Within each word @var{endian} can be 1 for
4026most significant byte first, -1 for least significant first, or 0 for the
4027native endianness of the host CPU@.  The most significant @var{nails} bits of
4028each word are unused and set to zero, this can be 0 to produce full words.
4029
4030The number of words produced is written to @code{*@var{countp}}, or
4031@var{countp} can be @code{NULL} to discard the count.  @var{rop} must have
4032enough space for the data, or if @var{rop} is @code{NULL} then a result array
4033of the necessary size is allocated using the current GMP allocation function
4034(@pxref{Custom Allocation}).  In either case the return value is the
4035destination used, either @var{rop} or the allocated block.
4036
4037If @var{op} is non-zero then the most significant word produced will be
4038non-zero.  If @var{op} is zero then the count returned will be zero and
4039nothing written to @var{rop}.  If @var{rop} is @code{NULL} in this case, no
4040block is allocated, just @code{NULL} is returned.
4041
4042The sign of @var{op} is ignored, just the absolute value is exported.  An
4043application can use @code{mpz_sgn} to get the sign and handle it as desired.
4044(@pxref{Integer Comparisons})
4045
4046There are no data alignment restrictions on @var{rop}, any address is allowed.
4047
4048When an application is allocating space itself the required size can be
4049determined with a calculation like the following.  Since @code{mpz_sizeinbase}
4050always returns at least 1, @code{count} here will be at least one, which
4051avoids any portability problems with @code{malloc(0)}, though if @code{z} is
4052zero no space at all is actually needed (or written).
4053
4054@example
4055numb = 8*size - nail;
4056count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
4057p = malloc (count * size);
4058@end example
4059@end deftypefun
4060
4061
4062@need 2000
4063@node Miscellaneous Integer Functions, Integer Special Functions, Integer Import and Export, Integer Functions
4064@comment  node-name,  next,  previous,  up
4065@section Miscellaneous Functions
4066@cindex Miscellaneous integer functions
4067@cindex Integer miscellaneous functions
4068
4069@deftypefun int mpz_fits_ulong_p (const mpz_t @var{op})
4070@deftypefunx int mpz_fits_slong_p (const mpz_t @var{op})
4071@deftypefunx int mpz_fits_uint_p (const mpz_t @var{op})
4072@deftypefunx int mpz_fits_sint_p (const mpz_t @var{op})
4073@deftypefunx int mpz_fits_ushort_p (const mpz_t @var{op})
4074@deftypefunx int mpz_fits_sshort_p (const mpz_t @var{op})
4075Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
4076@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
4077short int}, or @code{signed short int}, respectively.  Otherwise, return zero.
4078@end deftypefun
4079
4080@deftypefn Macro int mpz_odd_p (const mpz_t @var{op})
4081@deftypefnx Macro int mpz_even_p (const mpz_t @var{op})
4082Determine whether @var{op} is odd or even, respectively.  Return non-zero if
4083yes, zero if no.  These macros evaluate their argument more than once.
4084@end deftypefn
4085
4086@deftypefun size_t mpz_sizeinbase (const mpz_t @var{op}, int @var{base})
4087@cindex Size in digits
4088@cindex Digits in an integer
4089Return the size of @var{op} measured in number of digits in the given
4090@var{base}.  @var{base} can vary from 2 to 62.  The sign of @var{op} is
4091ignored, just the absolute value is used.  The result will be either exact or
40921 too big.  If @var{base} is a power of 2, the result is always exact.  If
4093@var{op} is zero the return value is always 1.
4094
4095This function can be used to determine the space required when converting
4096@var{op} to a string.  The right amount of allocation is normally two more
4097than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign
4098and one for the null-terminator.
4099
4100@cindex Most significant bit
4101It will be noted that @code{mpz_sizeinbase(@var{op},2)} can be used to locate
4102the most significant 1 bit in @var{op}, counting from 1.  (Unlike the bitwise
4103functions which start from 0, @xref{Integer Logic and Bit Fiddling,, Logical
4104and Bit Manipulation Functions}.)
4105@end deftypefun
4106
4107
4108@node Integer Special Functions,  , Miscellaneous Integer Functions, Integer Functions
4109@section Special Functions
4110@cindex Special integer functions
4111@cindex Integer special functions
4112
4113The functions in this section are for various special purposes.  Most
4114applications will not need them.
4115
4116@deftypefun void mpz_array_init (mpz_t @var{integer_array}, mp_size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
4117@strong{This is an obsolete function.  Do not use it.}
4118@end deftypefun
4119
4120@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
4121Change the space for @var{integer} to @var{new_alloc} limbs.  The value in
4122@var{integer} is preserved if it fits, or is set to 0 if not.  The return
4123value is not useful to applications and should be ignored.
4124
4125@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
4126this.  @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
4127@code{_mpz_realloc} takes its size in limbs.
4128@end deftypefun
4129
4130@deftypefun mp_limb_t mpz_getlimbn (const mpz_t @var{op}, mp_size_t @var{n})
4131Return limb number @var{n} from @var{op}.  The sign of @var{op} is ignored,
4132just the absolute value is used.  The least significant limb is number 0.
4133
4134@code{mpz_size} can be used to find how many limbs make up @var{op}.
4135@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
4136@code{mpz_size(@var{op})-1}.
4137@end deftypefun
4138
4139@deftypefun size_t mpz_size (const mpz_t @var{op})
4140Return the size of @var{op} measured in number of limbs.  If @var{op} is zero,
4141the returned value will be zero.
4142@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
4143@end deftypefun
4144
4145@deftypefun {const mp_limb_t *} mpz_limbs_read (const mpz_t @var{x})
4146Return a pointer to the limb array representing the absolute value of @var{x}.
4147The size of the array is @code{mpz_size(@var{x})}. Intended for read access
4148only.
4149@end deftypefun
4150
4151@deftypefun {mp_limb_t *} mpz_limbs_write (mpz_t @var{x}, mp_size_t @var{n})
4152@deftypefunx {mp_limb_t *} mpz_limbs_modify (mpz_t @var{x}, mp_size_t @var{n})
4153Return a pointer to the limb array, intended for write access. The array is
4154reallocated as needed, to make room for @var{n} limbs. Requires @math{@var{n}
4155> 0}. The @code{mpz_limbs_modify} function returns an array that holds the old
4156absolute value of @var{x}, while @code{mpz_limbs_write} may destroy the old
4157value and return an array with unspecified contents.
4158@end deftypefun
4159
4160@deftypefun void mpz_limbs_finish (mpz_t @var{x}, mp_size_t @var{s})
4161Updates the internal size field of @var{x}. Used after writing to the limb
4162array pointer returned by @code{mpz_limbs_write} or @code{mpz_limbs_modify} is
4163completed. The array should contain @math{@GMPabs{@var{s}}} valid limbs,
4164representing the new absolute value for @var{x}, and the sign of @var{x} is
4165taken from the sign of @var{s}. This function never reallocates @var{x}, so
4166the limb pointer remains valid.
4167@end deftypefun
4168
4169@c FIXME: Some more useful and less silly example?
4170@example
4171void foo (mpz_t x)
4172@{
4173  mp_size_t n, i;
4174  mp_limb_t *xp;
4175
4176  n = mpz_size (x);
4177  xp = mpz_limbs_modify (x, 2*n);
4178  for (i = 0; i < n; i++)
4179    xp[n+i] = xp[n-1-i];
4180  mpz_limbs_finish (x, mpz_sgn (x) < 0 ? - 2*n : 2*n);
4181@}
4182@end example
4183
4184@deftypefun mpz_srcptr mpz_roinit_n (mpz_t @var{x}, const mp_limb_t *@var{xp}, mp_size_t @var{xs})
4185Special initialization of @var{x}, using the given limb array and size.
4186@var{x} should be treated as read-only: it can be passed safely as input to
4187any mpz function, but not as an output. The array @var{xp} must point to at
4188least a readable limb, its size is
4189@math{@GMPabs{@var{xs}}}, and the sign of @var{x} is the sign of @var{xs}. For
4190convenience, the function returns @var{x}, but cast to a const pointer type.
4191@end deftypefun
4192
4193@example
4194void foo (mpz_t x)
4195@{
4196  static const mp_limb_t y[3] = @{ 0x1, 0x2, 0x3 @};
4197  mpz_t tmp;
4198  mpz_add (x, x, mpz_roinit_n (tmp, y, 3));
4199@}
4200@end example
4201
4202@deftypefn Macro mpz_t MPZ_ROINIT_N (mp_limb_t *@var{xp}, mp_size_t @var{xs})
4203This macro expands to an initializer which can be assigned to an mpz_t
4204variable. The limb array @var{xp} must point to at least a readable limb,
4205moreover, unlike the @code{mpz_roinit_n} function, the array must be
4206normalized: if @var{xs} is non-zero, then
4207@code{@var{xp}[@math{@GMPabs{@var{xs}}-1}]} must be non-zero. Intended
4208primarily for constant values. Using it for non-constant values requires a C
4209compiler supporting C99.
4210@end deftypefn
4211
4212@example
4213void foo (mpz_t x)
4214@{
4215  static const mp_limb_t ya[3] = @{ 0x1, 0x2, 0x3 @};
4216  static const mpz_t y = MPZ_ROINIT_N ((mp_limb_t *) ya, 3);
4217
4218  mpz_add (x, x, y);
4219@}
4220@end example
4221
4222
4223@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
4224@comment  node-name,  next,  previous,  up
4225@chapter Rational Number Functions
4226@cindex Rational number functions
4227
4228This chapter describes the GMP functions for performing arithmetic on rational
4229numbers.  These functions start with the prefix @code{mpq_}.
4230
4231Rational numbers are stored in objects of type @code{mpq_t}.
4232
4233All rational arithmetic functions assume operands have a canonical form, and
4234canonicalize their result.  The canonical form means that the denominator and
4235the numerator have no common factors, and that the denominator is positive.
4236Zero has the unique representation 0/1.
4237
4238Pure assignment functions do not canonicalize the assigned variable.  It is
4239the responsibility of the user to canonicalize the assigned variable before
4240any arithmetic operations are performed on that variable.
4241
4242@deftypefun void mpq_canonicalize (mpq_t @var{op})
4243Remove any factors that are common to the numerator and denominator of
4244@var{op}, and make the denominator positive.
4245@end deftypefun
4246
4247@menu
4248* Initializing Rationals::
4249* Rational Conversions::
4250* Rational Arithmetic::
4251* Comparing Rationals::
4252* Applying Integer Functions::
4253* I/O of Rationals::
4254@end menu
4255
4256@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
4257@comment  node-name,  next,  previous,  up
4258@section Initialization and Assignment Functions
4259@cindex Rational assignment functions
4260@cindex Assignment functions
4261@cindex Rational initialization functions
4262@cindex Initialization functions
4263
4264@deftypefun void mpq_init (mpq_t @var{x})
4265Initialize @var{x} and set it to 0/1.  Each variable should normally only be
4266initialized once, or at least cleared out (using the function @code{mpq_clear})
4267between each initialization.
4268@end deftypefun
4269
4270@deftypefun void mpq_inits (mpq_t @var{x}, ...)
4271Initialize a NULL-terminated list of @code{mpq_t} variables, and set their
4272values to 0/1.
4273@end deftypefun
4274
4275@deftypefun void mpq_clear (mpq_t @var{x})
4276Free the space occupied by @var{x}.  Make sure to call this function for all
4277@code{mpq_t} variables when you are done with them.
4278@end deftypefun
4279
4280@deftypefun void mpq_clears (mpq_t @var{x}, ...)
4281Free the space occupied by a NULL-terminated list of @code{mpq_t} variables.
4282@end deftypefun
4283
4284@deftypefun void mpq_set (mpq_t @var{rop}, const mpq_t @var{op})
4285@deftypefunx void mpq_set_z (mpq_t @var{rop}, const mpz_t @var{op})
4286Assign @var{rop} from @var{op}.
4287@end deftypefun
4288
4289@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
4290@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
4291Set the value of @var{rop} to @var{op1}/@var{op2}.  Note that if @var{op1} and
4292@var{op2} have common factors, @var{rop} has to be passed to
4293@code{mpq_canonicalize} before any operations are performed on @var{rop}.
4294@end deftypefun
4295
4296@deftypefun int mpq_set_str (mpq_t @var{rop}, const char *@var{str}, int @var{base})
4297Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.
4298
4299The string can be an integer like ``41'' or a fraction like ``41/152''.  The
4300fraction must be in canonical form (@pxref{Rational Number Functions}), or if
4301not then @code{mpq_canonicalize} must be called.
4302
4303The numerator and optional denominator are parsed the same as in
4304@code{mpz_set_str} (@pxref{Assigning Integers}).  White space is allowed in
4305the string, and is simply ignored.  The @var{base} can vary from 2 to 62, or
4306if @var{base} is 0 then the leading characters are used: @code{0x} or @code{0X} for hex,
4307@code{0b} or @code{0B} for binary,
4308@code{0} for octal, or decimal otherwise.  Note that this is done separately
4309for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
4310whereas @code{0xEF/0x100} is 239/256.
4311
4312The return value is 0 if the entire string is a valid number, or @minus{}1 if
4313not.
4314@end deftypefun
4315
4316@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
4317Swap the values @var{rop1} and @var{rop2} efficiently.
4318@end deftypefun
4319
4320
4321@need 2000
4322@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
4323@comment  node-name,  next,  previous,  up
4324@section Conversion Functions
4325@cindex Rational conversion functions
4326@cindex Conversion functions
4327
4328@deftypefun double mpq_get_d (const mpq_t @var{op})
4329Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
4330towards zero).
4331
4332If the exponent from the conversion is too big or too small to fit a
4333@code{double} then the result is system dependent.  For too big an infinity is
4334returned when available.  For too small @math{0.0} is normally returned.
4335Hardware overflow, underflow and denorm traps may or may not occur.
4336@end deftypefun
4337
4338@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
4339@deftypefunx void mpq_set_f (mpq_t @var{rop}, const mpf_t @var{op})
4340Set @var{rop} to the value of @var{op}.  There is no rounding, this conversion
4341is exact.
4342@end deftypefun
4343
4344@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, const mpq_t @var{op})
4345Convert @var{op} to a string of digits in base @var{base}.  The base may vary
4346from 2 to 36.  The string will be of the form @samp{num/den}, or if the
4347denominator is 1 then just @samp{num}.
4348
4349If @var{str} is @code{NULL}, the result string is allocated using the current
4350allocation function (@pxref{Custom Allocation}).  The block will be
4351@code{strlen(str)+1} bytes, that being exactly enough for the string and
4352null-terminator.
4353
4354If @var{str} is not @code{NULL}, it should point to a block of storage large
4355enough for the result, that being
4356
4357@example
4358mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
4359+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
4360@end example
4361
4362The three extra bytes are for a possible minus sign, possible slash, and the
4363null-terminator.
4364
4365A pointer to the result string is returned, being either the allocated block,
4366or the given @var{str}.
4367@end deftypefun
4368
4369
4370@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
4371@comment  node-name,  next,  previous,  up
4372@section Arithmetic Functions
4373@cindex Rational arithmetic functions
4374@cindex Arithmetic functions
4375
4376@deftypefun void mpq_add (mpq_t @var{sum}, const mpq_t @var{addend1}, const mpq_t @var{addend2})
4377Set @var{sum} to @var{addend1} + @var{addend2}.
4378@end deftypefun
4379
4380@deftypefun void mpq_sub (mpq_t @var{difference}, const mpq_t @var{minuend}, const mpq_t @var{subtrahend})
4381Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
4382@end deftypefun
4383
4384@deftypefun void mpq_mul (mpq_t @var{product}, const mpq_t @var{multiplier}, const mpq_t @var{multiplicand})
4385Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
4386@end deftypefun
4387
4388@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
4389Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
4390@var{op2}}.
4391@end deftypefun
4392
4393@deftypefun void mpq_div (mpq_t @var{quotient}, const mpq_t @var{dividend}, const mpq_t @var{divisor})
4394@cindex Division functions
4395Set @var{quotient} to @var{dividend}/@var{divisor}.
4396@end deftypefun
4397
4398@deftypefun void mpq_div_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
4399Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
4400@var{op2}}.
4401@end deftypefun
4402
4403@deftypefun void mpq_neg (mpq_t @var{negated_operand}, const mpq_t @var{operand})
4404Set @var{negated_operand} to @minus{}@var{operand}.
4405@end deftypefun
4406
4407@deftypefun void mpq_abs (mpq_t @var{rop}, const mpq_t @var{op})
4408Set @var{rop} to the absolute value of @var{op}.
4409@end deftypefun
4410
4411@deftypefun void mpq_inv (mpq_t @var{inverted_number}, const mpq_t @var{number})
4412Set @var{inverted_number} to 1/@var{number}.  If the new denominator is
4413zero, this routine will divide by zero.
4414@end deftypefun
4415
4416@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
4417@comment  node-name,  next,  previous,  up
4418@section Comparison Functions
4419@cindex Rational comparison functions
4420@cindex Comparison functions
4421
4422@deftypefun int mpq_cmp (const mpq_t @var{op1}, const mpq_t @var{op2})
4423@deftypefunx int mpq_cmp_z (const mpq_t @var{op1}, const mpz_t @var{op2})
4424Compare @var{op1} and @var{op2}.  Return a positive value if @math{@var{op1} >
4425@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
4426@math{@var{op1} < @var{op2}}.
4427
4428To determine if two rationals are equal, @code{mpq_equal} is faster than
4429@code{mpq_cmp}.
4430@end deftypefun
4431
4432@deftypefn Macro int mpq_cmp_ui (const mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
4433@deftypefnx Macro int mpq_cmp_si (const mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
4434Compare @var{op1} and @var{num2}/@var{den2}.  Return a positive value if
4435@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
4436@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
4437@var{num2}/@var{den2}}.
4438
4439@var{num2} and @var{den2} are allowed to have common factors.
4440
4441These functions are implemented as a macros and evaluate their arguments
4442multiple times.
4443@end deftypefn
4444
4445@deftypefn Macro int mpq_sgn (const mpq_t @var{op})
4446@cindex Sign tests
4447@cindex Rational sign tests
4448Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
4449@math{-1} if @math{@var{op} < 0}.
4450
4451This function is actually implemented as a macro.  It evaluates its
4452argument multiple times.
4453@end deftypefn
4454
4455@deftypefun int mpq_equal (const mpq_t @var{op1}, const mpq_t @var{op2})
4456Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
4457non-equal.  Although @code{mpq_cmp} can be used for the same purpose, this
4458function is much faster.
4459@end deftypefun
4460
4461@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
4462@comment  node-name,  next,  previous,  up
4463@section Applying Integer Functions to Rationals
4464@cindex Rational numerator and denominator
4465@cindex Numerator and denominator
4466
4467The set of @code{mpq} functions is quite small.  In particular, there are few
4468functions for either input or output.  The following functions give direct
4469access to the numerator and denominator of an @code{mpq_t}.
4470
4471Note that if an assignment to the numerator and/or denominator could take an
4472@code{mpq_t} out of the canonical form described at the start of this chapter
4473(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
4474called before any other @code{mpq} functions are applied to that @code{mpq_t}.
4475
4476@deftypefn Macro mpz_t mpq_numref (const mpq_t @var{op})
4477@deftypefnx Macro mpz_t mpq_denref (const mpq_t @var{op})
4478Return a reference to the numerator and denominator of @var{op}, respectively.
4479The @code{mpz} functions can be used on the result of these macros.
4480@end deftypefn
4481
4482@deftypefun void mpq_get_num (mpz_t @var{numerator}, const mpq_t @var{rational})
4483@deftypefunx void mpq_get_den (mpz_t @var{denominator}, const mpq_t @var{rational})
4484@deftypefunx void mpq_set_num (mpq_t @var{rational}, const mpz_t @var{numerator})
4485@deftypefunx void mpq_set_den (mpq_t @var{rational}, const mpz_t @var{denominator})
4486Get or set the numerator or denominator of a rational.  These functions are
4487equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
4488@code{mpq_denref}.  Direct use of @code{mpq_numref} or @code{mpq_denref} is
4489recommended instead of these functions.
4490@end deftypefun
4491
4492
4493@need 2000
4494@node I/O of Rationals,  , Applying Integer Functions, Rational Number Functions
4495@comment  node-name,  next,  previous,  up
4496@section Input and Output Functions
4497@cindex Rational input and output functions
4498@cindex Input functions
4499@cindex Output functions
4500@cindex I/O functions
4501
4502Functions that perform input from a stdio stream, and functions that output to
4503a stdio stream, of @code{mpq} numbers.  Passing a @code{NULL} pointer for a
4504@var{stream} argument to any of these functions will make them read from
4505@code{stdin} and write to @code{stdout}, respectively.
4506
4507When using any of these functions, it is a good idea to include @file{stdio.h}
4508before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
4509for these functions.
4510
4511See also @ref{Formatted Output} and @ref{Formatted Input}.
4512
4513@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, const mpq_t @var{op})
4514Output @var{op} on stdio stream @var{stream}, as a string of digits in base
4515@var{base}.  The base may vary from 2 to 36.  Output is in the form
4516@samp{num/den} or if the denominator is 1 then just @samp{num}.
4517
4518Return the number of bytes written, or if an error occurred, return 0.
4519@end deftypefun
4520
4521@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
4522Read a string of digits from @var{stream} and convert them to a rational in
4523@var{rop}.  Any initial white-space characters are read and discarded.  Return
4524the number of characters read (including white space), or 0 if a rational
4525could not be read.
4526
4527The input can be a fraction like @samp{17/63} or just an integer like
4528@samp{123}.  Reading stops at the first character not in this form, and white
4529space is not permitted within the string.  If the input might not be in
4530canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
4531Number Functions}).
4532
4533The @var{base} can be between 2 and 36, or can be 0 in which case the leading
4534characters of the string determine the base, @samp{0x} or @samp{0X} for
4535hexadecimal, @samp{0} for octal, or decimal otherwise.  The leading characters
4536are examined separately for the numerator and denominator of a fraction, so
4537for instance @samp{0x10/11} is @math{16/11}, whereas @samp{0x10/0x11} is
4538@math{16/17}.
4539@end deftypefun
4540
4541
4542@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
4543@comment  node-name,  next,  previous,  up
4544@chapter Floating-point Functions
4545@cindex Floating-point functions
4546@cindex Float functions
4547@cindex User-defined precision
4548@cindex Precision of floats
4549
4550GMP floating point numbers are stored in objects of type @code{mpf_t} and
4551functions operating on them have an @code{mpf_} prefix.
4552
4553The mantissa of each float has a user-selectable precision, in practice only
4554limited by available memory.  Each variable has its own precision, and that can
4555be increased or decreased at any time.  This selectable precision is a minimum
4556value, GMP rounds it up to a whole limb.
4557
4558The accuracy of a calculation is determined by the priorly set precision of the
4559destination variable and the numeric values of the input variables.  Input
4560variables' set precisions do not affect calculations (except indirectly as
4561their values might have been affected when they were assigned).
4562
4563The exponent of each float has fixed precision, one machine word on most
4564systems.  In the current implementation the exponent is a count of limbs, so
4565for example on a 32-bit system this means a range of roughly
4566@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
4567this will be much greater.  Note however that @code{mpf_get_str} can only
4568return an exponent which fits an @code{mp_exp_t} and currently
4569@code{mpf_set_str} doesn't accept exponents bigger than a @code{long}.
4570
4571Each variable keeps track of the mantissa data actually in use.  This means
4572that if a float is exactly represented in only a few bits then only those bits
4573will be used in a calculation, even if the variable's selected precision is
4574high.  This is a performance optimization; it does not affect the numeric
4575results.
4576
4577Internally, GMP sometimes calculates with higher precision than that of the
4578destination variable in order to limit errors.  Final results are always
4579truncated to the destination variable's precision.
4580
4581The mantissa is stored in binary.  One consequence of this is that decimal
4582fractions like @math{0.1} cannot be represented exactly.  The same is true of
4583plain IEEE @code{double} floats.  This makes both highly unsuitable for
4584calculations involving money or other values that should be exact decimal
4585fractions.  (Suitably scaled integers, or perhaps rationals, are better
4586choices.)
4587
4588The @code{mpf} functions and variables have no special notion of infinity or
4589not-a-number, and applications must take care not to overflow the exponent or
4590results will be unpredictable.
4591
4592Note that the @code{mpf} functions are @emph{not} intended as a smooth
4593extension to IEEE P754 arithmetic.  In particular results obtained on one
4594computer often differ from the results on a computer with a different word
4595size.
4596
4597New projects should consider using the GMP extension library MPFR
4598(@url{http://mpfr.org}) instead.  MPFR provides well-defined precision and
4599accurate rounding, and thereby naturally extends IEEE P754.
4600
4601@menu
4602* Initializing Floats::
4603* Assigning Floats::
4604* Simultaneous Float Init & Assign::
4605* Converting Floats::
4606* Float Arithmetic::
4607* Float Comparison::
4608* I/O of Floats::
4609* Miscellaneous Float Functions::
4610@end menu
4611
4612@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
4613@comment  node-name,  next,  previous,  up
4614@section Initialization Functions
4615@cindex Float initialization functions
4616@cindex Initialization functions
4617
4618@deftypefun void mpf_set_default_prec (mp_bitcnt_t @var{prec})
4619Set the default precision to be @strong{at least} @var{prec} bits.  All
4620subsequent calls to @code{mpf_init} will use this precision, but previously
4621initialized variables are unaffected.
4622@end deftypefun
4623
4624@deftypefun {mp_bitcnt_t} mpf_get_default_prec (void)
4625Return the default precision actually used.
4626@end deftypefun
4627
4628An @code{mpf_t} object must be initialized before storing the first value in
4629it.  The functions @code{mpf_init} and @code{mpf_init2} are used for that
4630purpose.
4631
4632@deftypefun void mpf_init (mpf_t @var{x})
4633Initialize @var{x} to 0.  Normally, a variable should be initialized once only
4634or at least be cleared, using @code{mpf_clear}, between initializations.  The
4635precision of @var{x} is undefined unless a default precision has already been
4636established by a call to @code{mpf_set_default_prec}.
4637@end deftypefun
4638
4639@deftypefun void mpf_init2 (mpf_t @var{x}, mp_bitcnt_t @var{prec})
4640Initialize @var{x} to 0 and set its precision to be @strong{at least}
4641@var{prec} bits.  Normally, a variable should be initialized once only or at
4642least be cleared, using @code{mpf_clear}, between initializations.
4643@end deftypefun
4644
4645@deftypefun void mpf_inits (mpf_t @var{x}, ...)
4646Initialize a NULL-terminated list of @code{mpf_t} variables, and set their
4647values to 0.  The precision of the initialized variables is undefined unless a
4648default precision has already been established by a call to
4649@code{mpf_set_default_prec}.
4650@end deftypefun
4651
4652@deftypefun void mpf_clear (mpf_t @var{x})
4653Free the space occupied by @var{x}.  Make sure to call this function for all
4654@code{mpf_t} variables when you are done with them.
4655@end deftypefun
4656
4657@deftypefun void mpf_clears (mpf_t @var{x}, ...)
4658Free the space occupied by a NULL-terminated list of @code{mpf_t} variables.
4659@end deftypefun
4660
4661@need 2000
4662Here is an example on how to initialize floating-point variables:
4663@example
4664@{
4665  mpf_t x, y;
4666  mpf_init (x);           /* use default precision */
4667  mpf_init2 (y, 256);     /* precision @emph{at least} 256 bits */
4668  @dots{}
4669  /* Unless the program is about to exit, do ... */
4670  mpf_clear (x);
4671  mpf_clear (y);
4672@}
4673@end example
4674
4675The following three functions are useful for changing the precision during a
4676calculation.  A typical use would be for adjusting the precision gradually in
4677iterative algorithms like Newton-Raphson, making the computation precision
4678closely match the actual accurate part of the numbers.
4679
4680@deftypefun {mp_bitcnt_t} mpf_get_prec (const mpf_t @var{op})
4681Return the current precision of @var{op}, in bits.
4682@end deftypefun
4683
4684@deftypefun void mpf_set_prec (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
4685Set the precision of @var{rop} to be @strong{at least} @var{prec} bits.  The
4686value in @var{rop} will be truncated to the new precision.
4687
4688This function requires a call to @code{realloc}, and so should not be used in
4689a tight loop.
4690@end deftypefun
4691
4692@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
4693Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
4694without changing the memory allocated.
4695
4696@var{prec} must be no more than the allocated precision for @var{rop}, that
4697being the precision when @var{rop} was initialized, or in the most recent
4698@code{mpf_set_prec}.
4699
4700The value in @var{rop} is unchanged, and in particular if it had a higher
4701precision than @var{prec} it will retain that higher precision.  New values
4702written to @var{rop} will use the new @var{prec}.
4703
4704Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
4705@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
4706allocated precision.  Failing to do so will have unpredictable results.
4707
4708@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
4709original allocated precision.  After @code{mpf_set_prec_raw} it reflects the
4710@var{prec} value set.
4711
4712@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
4713different precisions during a calculation, perhaps to gradually increase
4714precision in an iteration, or just to use various different precisions for
4715different purposes during a calculation.
4716@end deftypefun
4717
4718
4719@need 2000
4720@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
4721@comment  node-name,  next,  previous,  up
4722@section Assignment Functions
4723@cindex Float assignment functions
4724@cindex Assignment functions
4725
4726These functions assign new values to already initialized floats
4727(@pxref{Initializing Floats}).
4728
4729@deftypefun void mpf_set (mpf_t @var{rop}, const mpf_t @var{op})
4730@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4731@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
4732@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
4733@deftypefunx void mpf_set_z (mpf_t @var{rop}, const mpz_t @var{op})
4734@deftypefunx void mpf_set_q (mpf_t @var{rop}, const mpq_t @var{op})
4735Set the value of @var{rop} from @var{op}.
4736@end deftypefun
4737
4738@deftypefun int mpf_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
4739Set the value of @var{rop} from the string in @var{str}.  The string is of the
4740form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
4741@samp{M} is the mantissa and @samp{N} is the exponent.  The mantissa is always
4742in the specified base.  The exponent is either in the specified base or, if
4743@var{base} is negative, in decimal.  The decimal point expected is taken from
4744the current locale, on systems providing @code{localeconv}.
4745
4746The argument @var{base} may be in the ranges 2 to 62, or @minus{}62 to
4747@minus{}2.  Negative values are used to specify that the exponent is in
4748decimal.
4749
4750For bases up to 36, case is ignored; upper-case and lower-case letters have
4751the same value; for bases 37 to 62, upper-case letter represent the usual
475210..35 while lower-case letter represent 36..61.
4753
4754Unlike the corresponding @code{mpz} function, the base will not be determined
4755from the leading characters of the string if @var{base} is 0.  This is so that
4756numbers like @samp{0.23} are not interpreted as octal.
4757
4758White space is allowed in the string, and is simply ignored.  [This is not
4759really true; white-space is ignored in the beginning of the string and within
4760the mantissa, but not in other places, such as after a minus sign or in the
4761exponent.  We are considering changing the definition of this function, making
4762it fail when there is any white-space in the input, since that makes a lot of
4763sense.  Please tell us your opinion about this change.  Do you really want it
4764to accept @nicode{"3 14"} as meaning 314 as it does now?]
4765
4766This function returns 0 if the entire string is a valid number in base
4767@var{base}.  Otherwise it returns @minus{}1.
4768@end deftypefun
4769
4770@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
4771Swap @var{rop1} and @var{rop2} efficiently.  Both the values and the
4772precisions of the two variables are swapped.
4773@end deftypefun
4774
4775
4776@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
4777@comment  node-name,  next,  previous,  up
4778@section Combined Initialization and Assignment Functions
4779@cindex Float assignment functions
4780@cindex Assignment functions
4781@cindex Float initialization functions
4782@cindex Initialization functions
4783
4784For convenience, GMP provides a parallel series of initialize-and-set functions
4785which initialize the output and then store the value there.  These functions'
4786names have the form @code{mpf_init_set@dots{}}
4787
4788Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
4789functions, it can be used as the source or destination operand for the ordinary
4790float functions.  Don't use an initialize-and-set function on a variable
4791already initialized!
4792
4793@deftypefun void mpf_init_set (mpf_t @var{rop}, const mpf_t @var{op})
4794@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
4795@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
4796@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
4797Initialize @var{rop} and set its value from @var{op}.
4798
4799The precision of @var{rop} will be taken from the active default precision, as
4800set by @code{mpf_set_default_prec}.
4801@end deftypefun
4802
4803@deftypefun int mpf_init_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
4804Initialize @var{rop} and set its value from the string in @var{str}.  See
4805@code{mpf_set_str} above for details on the assignment operation.
4806
4807Note that @var{rop} is initialized even if an error occurs.  (I.e., you have to
4808call @code{mpf_clear} for it.)
4809
4810The precision of @var{rop} will be taken from the active default precision, as
4811set by @code{mpf_set_default_prec}.
4812@end deftypefun
4813
4814
4815@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
4816@comment  node-name,  next,  previous,  up
4817@section Conversion Functions
4818@cindex Float conversion functions
4819@cindex Conversion functions
4820
4821@deftypefun double mpf_get_d (const mpf_t @var{op})
4822Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
4823towards zero).
4824
4825If the exponent in @var{op} is too big or too small to fit a @code{double}
4826then the result is system dependent.  For too big an infinity is returned when
4827available.  For too small @math{0.0} is normally returned.  Hardware overflow,
4828underflow and denorm traps may or may not occur.
4829@end deftypefun
4830
4831@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, const mpf_t @var{op})
4832Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
4833towards zero), and with an exponent returned separately.
4834
4835The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
4836exponent is stored to @code{*@var{exp}}.  @m{@var{d} \times 2^{exp},
4837@var{d} * 2^@var{exp}} is the (truncated) @var{op} value.  If @var{op} is zero,
4838the return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
4839
4840@cindex @code{frexp}
4841This is similar to the standard C @code{frexp} function (@pxref{Normalization
4842Functions,,, libc, The GNU C Library Reference Manual}).
4843@end deftypefun
4844
4845@deftypefun long mpf_get_si (const mpf_t @var{op})
4846@deftypefunx {unsigned long} mpf_get_ui (const mpf_t @var{op})
4847Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
4848fraction part.  If @var{op} is too big for the return type, the result is
4849undefined.
4850
4851See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
4852(@pxref{Miscellaneous Float Functions}).
4853@end deftypefun
4854
4855@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
4856Convert @var{op} to a string of digits in base @var{base}.  The base argument
4857may vary from 2 to 62 or from @minus{}2 to @minus{}36.  Up to @var{n_digits}
4858digits will be generated.  Trailing zeros are not returned.  No more digits
4859than can be accurately represented by @var{op} are ever generated.  If
4860@var{n_digits} is 0 then that accurate maximum number of digits are generated.
4861
4862For @var{base} in the range 2..36, digits and lower-case letters are used; for
4863@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
4864digits, upper-case letters, and lower-case letters (in that significance order)
4865are used.
4866
4867If @var{str} is @code{NULL}, the result string is allocated using the current
4868allocation function (@pxref{Custom Allocation}).  The block will be
4869@code{strlen(str)+1} bytes, that being exactly enough for the string and
4870null-terminator.
4871
4872If @var{str} is not @code{NULL}, it should point to a block of
4873@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
4874possible minus sign, and a null-terminator.  When @var{n_digits} is 0 to get
4875all significant digits, an application won't be able to know the space
4876required, and @var{str} should be @code{NULL} in that case.
4877
4878The generated string is a fraction, with an implicit radix point immediately
4879to the left of the first digit.  The applicable exponent is written through
4880the @var{expptr} pointer.  For example, the number 3.1416 would be returned as
4881string @nicode{"31416"} and exponent 1.
4882
4883When @var{op} is zero, an empty string is produced and the exponent returned
4884is 0.
4885
4886A pointer to the result string is returned, being either the allocated block
4887or the given @var{str}.
4888@end deftypefun
4889
4890
4891@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
4892@comment  node-name,  next,  previous,  up
4893@section Arithmetic Functions
4894@cindex Float arithmetic functions
4895@cindex Arithmetic functions
4896
4897@deftypefun void mpf_add (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4898@deftypefunx void mpf_add_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4899Set @var{rop} to @math{@var{op1} + @var{op2}}.
4900@end deftypefun
4901
4902@deftypefun void mpf_sub (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4903@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
4904@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4905Set @var{rop} to @var{op1} @minus{} @var{op2}.
4906@end deftypefun
4907
4908@deftypefun void mpf_mul (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4909@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4910Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
4911@end deftypefun
4912
4913Division is undefined if the divisor is zero, and passing a zero divisor to the
4914divide functions will make these functions intentionally divide by zero.  This
4915lets the user handle arithmetic exceptions in these functions in the same
4916manner as other arithmetic exceptions.
4917
4918@deftypefun void mpf_div (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4919@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
4920@deftypefunx void mpf_div_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4921@cindex Division functions
4922Set @var{rop} to @var{op1}/@var{op2}.
4923@end deftypefun
4924
4925@deftypefun void mpf_sqrt (mpf_t @var{rop}, const mpf_t @var{op})
4926@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
4927@cindex Root extraction functions
4928Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
4929@end deftypefun
4930
4931@deftypefun void mpf_pow_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
4932@cindex Exponentiation functions
4933@cindex Powering functions
4934Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
4935@end deftypefun
4936
4937@deftypefun void mpf_neg (mpf_t @var{rop}, const mpf_t @var{op})
4938Set @var{rop} to @minus{}@var{op}.
4939@end deftypefun
4940
4941@deftypefun void mpf_abs (mpf_t @var{rop}, const mpf_t @var{op})
4942Set @var{rop} to the absolute value of @var{op}.
4943@end deftypefun
4944
4945@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
4946Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
4947@var{op2}}.
4948@end deftypefun
4949
4950@deftypefun void mpf_div_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
4951Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
4952@var{op2}}.
4953@end deftypefun
4954
4955@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
4956@comment  node-name,  next,  previous,  up
4957@section Comparison Functions
4958@cindex Float comparison functions
4959@cindex Comparison functions
4960
4961@deftypefun int mpf_cmp (const mpf_t @var{op1}, const mpf_t @var{op2})
4962@deftypefunx int mpf_cmp_z (const mpf_t @var{op1}, const mpz_t @var{op2})
4963@deftypefunx int mpf_cmp_d (const mpf_t @var{op1}, double @var{op2})
4964@deftypefunx int mpf_cmp_ui (const mpf_t @var{op1}, unsigned long int @var{op2})
4965@deftypefunx int mpf_cmp_si (const mpf_t @var{op1}, signed long int @var{op2})
4966Compare @var{op1} and @var{op2}.  Return a positive value if @math{@var{op1} >
4967@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
4968@math{@var{op1} < @var{op2}}.
4969
4970@code{mpf_cmp_d} can be called with an infinity, but results are undefined for
4971a NaN.
4972@end deftypefun
4973
4974@deftypefun int mpf_eq (const mpf_t @var{op1}, const mpf_t @var{op2}, mp_bitcnt_t op3)
4975@strong{This function is mathematically ill-defined and should not be used.}
4976
4977Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
4978equal, zero otherwise.  Note that numbers like e.g., 256 (binary 100000000) and
4979255 (binary 11111111) will never be equal by this function's measure, and
4980furthermore that 0 will only be equal to itself.
4981@end deftypefun
4982
4983@deftypefun void mpf_reldiff (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
4984Compute the relative difference between @var{op1} and @var{op2} and store the
4985result in @var{rop}.  This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
4986@end deftypefun
4987
4988@deftypefn Macro int mpf_sgn (const mpf_t @var{op})
4989@cindex Sign tests
4990@cindex Float sign tests
4991Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
4992@math{-1} if @math{@var{op} < 0}.
4993
4994This function is actually implemented as a macro.  It evaluates its argument
4995multiple times.
4996@end deftypefn
4997
4998@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
4999@comment  node-name,  next,  previous,  up
5000@section Input and Output Functions
5001@cindex Float input and output functions
5002@cindex Input functions
5003@cindex Output functions
5004@cindex I/O functions
5005
5006Functions that perform input from a stdio stream, and functions that output to
5007a stdio stream, of @code{mpf} numbers.  Passing a @code{NULL} pointer for a
5008@var{stream} argument to any of these functions will make them read from
5009@code{stdin} and write to @code{stdout}, respectively.
5010
5011When using any of these functions, it is a good idea to include @file{stdio.h}
5012before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
5013for these functions.
5014
5015See also @ref{Formatted Output} and @ref{Formatted Input}.
5016
5017@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
5018Print @var{op} to @var{stream}, as a string of digits.  Return the number of
5019bytes written, or if an error occurred, return 0.
5020
5021The mantissa is prefixed with an @samp{0.} and is in the given @var{base},
5022which may vary from 2 to 62 or from @minus{}2 to @minus{}36.  An exponent is
5023then printed, separated by an @samp{e}, or if the base is greater than 10 then
5024by an @samp{@@}.  The exponent is always in decimal.  The decimal point follows
5025the current locale, on systems providing @code{localeconv}.
5026
5027For @var{base} in the range 2..36, digits and lower-case letters are used; for
5028@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
5029digits, upper-case letters, and lower-case letters (in that significance order)
5030are used.
5031
5032Up to @var{n_digits} will be printed from the mantissa, except that no more
5033digits than are accurately representable by @var{op} will be printed.
5034@var{n_digits} can be 0 to select that accurate maximum.
5035@end deftypefun
5036
5037@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
5038Read a string in base @var{base} from @var{stream}, and put the read float in
5039@var{rop}.  The string is of the form @samp{M@@N} or, if the base is 10 or
5040less, alternatively @samp{MeN}.  @samp{M} is the mantissa and @samp{N} is the
5041exponent.  The mantissa is always in the specified base.  The exponent is
5042either in the specified base or, if @var{base} is negative, in decimal.  The
5043decimal point expected is taken from the current locale, on systems providing
5044@code{localeconv}.
5045
5046The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
5047@minus{}2.  Negative values are used to specify that the exponent is in
5048decimal.
5049
5050Unlike the corresponding @code{mpz} function, the base will not be determined
5051from the leading characters of the string if @var{base} is 0.  This is so that
5052numbers like @samp{0.23} are not interpreted as octal.
5053
5054Return the number of bytes read, or if an error occurred, return 0.
5055@end deftypefun
5056
5057@c @deftypefun void mpf_out_raw (FILE *@var{stream}, const mpf_t @var{float})
5058@c Output @var{float} on stdio stream @var{stream}, in raw binary
5059@c format.  The float is written in a portable format, with 4 bytes of
5060@c size information, and that many bytes of limbs.  Both the size and the
5061@c limbs are written in decreasing significance order.
5062@c @end deftypefun
5063
5064@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
5065@c Input from stdio stream @var{stream} in the format written by
5066@c @code{mpf_out_raw}, and put the result in @var{float}.
5067@c @end deftypefun
5068
5069
5070@node Miscellaneous Float Functions,  , I/O of Floats, Floating-point Functions
5071@comment  node-name,  next,  previous,  up
5072@section Miscellaneous Functions
5073@cindex Miscellaneous float functions
5074@cindex Float miscellaneous functions
5075
5076@deftypefun void mpf_ceil (mpf_t @var{rop}, const mpf_t @var{op})
5077@deftypefunx void mpf_floor (mpf_t @var{rop}, const mpf_t @var{op})
5078@deftypefunx void mpf_trunc (mpf_t @var{rop}, const mpf_t @var{op})
5079@cindex Rounding functions
5080@cindex Float rounding functions
5081Set @var{rop} to @var{op} rounded to an integer.  @code{mpf_ceil} rounds to the
5082next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
5083to the integer towards zero.
5084@end deftypefun
5085
5086@deftypefun int mpf_integer_p (const mpf_t @var{op})
5087Return non-zero if @var{op} is an integer.
5088@end deftypefun
5089
5090@deftypefun int mpf_fits_ulong_p (const mpf_t @var{op})
5091@deftypefunx int mpf_fits_slong_p (const mpf_t @var{op})
5092@deftypefunx int mpf_fits_uint_p (const mpf_t @var{op})
5093@deftypefunx int mpf_fits_sint_p (const mpf_t @var{op})
5094@deftypefunx int mpf_fits_ushort_p (const mpf_t @var{op})
5095@deftypefunx int mpf_fits_sshort_p (const mpf_t @var{op})
5096Return non-zero if @var{op} would fit in the respective C data type, when
5097truncated to an integer.
5098@end deftypefun
5099
5100@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{nbits})
5101@cindex Random number functions
5102@cindex Float random number functions
5103Generate a uniformly distributed random float in @var{rop}, such that @math{0
5104@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa or
5105less if the precision of @var{rop} is smaller.
5106
5107The variable @var{state} must be initialized by calling one of the
5108@code{gmp_randinit} functions (@ref{Random State Initialization}) before
5109invoking this function.
5110@end deftypefun
5111
5112@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
5113Generate a random float of at most @var{max_size} limbs, with long strings of
5114zeros and ones in the binary representation.  The exponent of the number is in
5115the interval @minus{}@var{exp} to @var{exp} (in limbs).  This function is
5116useful for testing functions and algorithms, since these kind of random
5117numbers have proven to be more likely to trigger corner-case bugs.  Negative
5118random numbers are generated when @var{max_size} is negative.
5119@end deftypefun
5120
5121@c @deftypefun size_t mpf_size (const mpf_t @var{op})
5122@c Return the size of @var{op} measured in number of limbs.  If @var{op} is
5123@c zero, the returned value will be zero.  (@xref{Nomenclature}, for an
5124@c explanation of the concept @dfn{limb}.)
5125@c
5126@c @strong{This function is obsolete.  It will disappear from future GMP
5127@c releases.}
5128@c @end deftypefun
5129
5130
5131@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
5132@comment  node-name,  next,  previous,  up
5133@chapter Low-level Functions
5134@cindex Low-level functions
5135
5136This chapter describes low-level GMP functions, used to implement the
5137high-level GMP functions, but also intended for time-critical user code.
5138
5139These functions start with the prefix @code{mpn_}.
5140
5141@c 1. Some of these function clobber input operands.
5142@c
5143
5144The @code{mpn} functions are designed to be as fast as possible, @strong{not}
5145to provide a coherent calling interface.  The different functions have somewhat
5146similar interfaces, but there are variations that make them hard to use.  These
5147functions do as little as possible apart from the real multiple precision
5148computation, so that no time is spent on things that not all callers need.
5149
5150A source operand is specified by a pointer to the least significant limb and a
5151limb count.  A destination operand is specified by just a pointer.  It is the
5152responsibility of the caller to ensure that the destination has enough space
5153for storing the result.
5154
5155With this way of specifying operands, it is possible to perform computations on
5156subranges of an argument, and store the result into a subrange of a
5157destination.
5158
5159A common requirement for all functions is that each source area needs at least
5160one limb.  No size argument may be zero.  Unless otherwise stated, in-place
5161operations are allowed where source and destination are the same, but not where
5162they only partly overlap.
5163
5164The @code{mpn} functions are the base for the implementation of the
5165@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.
5166
5167This example adds the number beginning at @var{s1p} and the number beginning at
5168@var{s2p} and writes the sum at @var{destp}.  All areas have @var{n} limbs.
5169
5170@example
5171cy = mpn_add_n (destp, s1p, s2p, n)
5172@end example
5173
5174It should be noted that the @code{mpn} functions make no attempt to identify
5175high or low zero limbs on their operands, or other special forms.  On random
5176data such cases will be unlikely and it'd be wasteful for every function to
5177check every time.  An application knowing something about its data can take
5178steps to trim or perhaps split its calculations.
5179@c
5180@c  For reference, within gmp mpz_t operands never have high zero limbs, and
5181@c  we rate low zero limbs as unlikely too (or something an application should
5182@c  handle).  This is a prime motivation for not stripping zero limbs in say
5183@c  mpn_mul_n etc.
5184@c
5185@c  Other applications doing variable-length calculations will quite likely do
5186@c  something similar to mpz.  And even if not then it's highly likely zero
5187@c  limb stripping can be done at just a few judicious points, which will be
5188@c  more efficient than having lots of mpn functions checking every time.
5189
5190@sp 1
5191@noindent
5192In the notation used below, a source operand is identified by the pointer to
5193the least significant limb, and the limb count in braces.  For example,
5194@{@var{s1p}, @var{s1n}@}.
5195
5196@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5197Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
5198least significant limbs of the result to @var{rp}.  Return carry, either 0 or
51991.
5200
5201This is the lowest-level function for addition.  It is the preferred function
5202for addition, since it is written in assembly for most CPUs.  For addition of
5203a variable to itself (i.e., @var{s1p} equals @var{s2p}) use @code{mpn_lshift}
5204with a count of 1 for optimal speed.
5205@end deftypefun
5206
5207@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5208Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
5209significant limbs of the result to @var{rp}.  Return carry, either 0 or 1.
5210@end deftypefun
5211
5212@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
5213Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
5214@var{s1n} least significant limbs of the result to @var{rp}.  Return carry,
5215either 0 or 1.
5216
5217This function requires that @var{s1n} is greater than or equal to @var{s2n}.
5218@end deftypefun
5219
5220@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5221Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
5222@var{n} least significant limbs of the result to @var{rp}.  Return borrow,
5223either 0 or 1.
5224
5225This is the lowest-level function for subtraction.  It is the preferred
5226function for subtraction, since it is written in assembly for most CPUs.
5227@end deftypefun
5228
5229@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5230Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
5231significant limbs of the result to @var{rp}.  Return borrow, either 0 or 1.
5232@end deftypefun
5233
5234@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
5235Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
5236@var{s1n} least significant limbs of the result to @var{rp}.  Return borrow,
5237either 0 or 1.
5238
5239This function requires that @var{s1n} is greater than or equal to
5240@var{s2n}.
5241@end deftypefun
5242
5243@deftypefun mp_limb_t mpn_neg (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
5244Perform the negation of @{@var{sp}, @var{n}@}, and write the result to
5245@{@var{rp}, @var{n}@}.  This is equivalent to calling @code{mpn_sub_n} with a
5246@var{n}-limb zero minuend and passing @{@var{sp}, @var{n}@} as subtrahend.
5247Return borrow, either 0 or 1.
5248@end deftypefun
5249
5250@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5251Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
52522*@var{n}-limb result to @var{rp}.
5253
5254The destination has to have space for 2*@var{n} limbs, even if the product's
5255most significant limb is zero.  No overlap is permitted between the
5256destination and either source.
5257
5258If the two input operands are the same, use @code{mpn_sqr}.
5259@end deftypefun
5260
5261@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
5262Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
5263(@var{s1n}+@var{s2n})-limb result to @var{rp}.  Return the most significant
5264limb of the result.
5265
5266The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
5267product's most significant limb is zero.  No overlap is permitted between the
5268destination and either source.
5269
5270This function requires that @var{s1n} is greater than or equal to @var{s2n}.
5271@end deftypefun
5272
5273@deftypefun void mpn_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5274Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
5275result to @var{rp}.
5276
5277The destination has to have space for 2@var{n} limbs, even if the result's
5278most significant limb is zero.  No overlap is permitted between the
5279destination and the source.
5280@end deftypefun
5281
5282@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5283Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
5284significant limbs of the product to @var{rp}.  Return the most significant
5285limb of the product.  @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
5286allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
5287
5288This is a low-level function that is a building block for general
5289multiplication as well as other operations in GMP@.  It is written in assembly
5290for most CPUs.
5291
5292Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
5293with a count equal to the logarithm of @var{s2limb} instead, for optimal speed.
5294@end deftypefun
5295
5296@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5297Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
5298significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
5299to @var{rp}.  Return the most significant limb of the product, plus carry-out
5300from the addition.  @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
5301allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
5302
5303This is a low-level function that is a building block for general
5304multiplication as well as other operations in GMP@.  It is written in assembly
5305for most CPUs.
5306@end deftypefun
5307
5308@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
5309Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
5310least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
5311result to @var{rp}.  Return the most significant limb of the product, plus
5312borrow-out from the subtraction.  @{@var{s1p}, @var{n}@} and @{@var{rp},
5313@var{n}@} are allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.
5314
5315This is a low-level function that is a building block for general
5316multiplication and division as well as other operations in GMP@.  It is written
5317in assembly for most CPUs.
5318@end deftypefun
5319
5320@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
5321Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
5322at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
5323@var{dn}@}.  The quotient is rounded towards 0.
5324
5325No overlap is permitted between arguments, except that @var{np} might equal
5326@var{rp}.  The dividend size @var{nn} must be greater than or equal to divisor
5327size @var{dn}.  The most significant limb of the divisor must be non-zero.  The
5328@var{qxn} operand must be zero.
5329@end deftypefun
5330
5331@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
5332[This function is obsolete.  Please call @code{mpn_tdiv_qr} instead for best
5333performance.]
5334
5335Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
5336quotient at @var{r1p}, with the exception of the most significant limb, which
5337is returned.  The remainder replaces the dividend at @var{rs2p}; it will be
5338@var{s3n} limbs long (i.e., as many limbs as the divisor).
5339
5340In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
5341stored after the integral limbs.  For most usages, @var{qxn} will be zero.
5342
5343It is required that @var{rs2n} is greater than or equal to @var{s3n}.  It is
5344required that the most significant bit of the divisor is set.
5345
5346If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}.  Aside
5347from that special case, no overlap between arguments is permitted.
5348
5349Return the most significant limb of the quotient, either 0 or 1.
5350
5351The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
5352limbs large.
5353@end deftypefun
5354
5355@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
5356@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
5357Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
5358@var{r1p}.  Return the remainder.
5359
5360The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
5361addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
5362@var{qxn}@}.  Either or both @var{s2n} and @var{qxn} can be zero.  For most
5363usages, @var{qxn} will be zero.
5364
5365@code{mpn_divmod_1} exists for upward source compatibility and is simply a
5366macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.
5367
5368The areas at @var{r1p} and @var{s2p} have to be identical or completely
5369separate, not partially overlapping.
5370@end deftypefn
5371
5372@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
5373[This function is obsolete.  Please call @code{mpn_tdiv_qr} instead for best
5374performance.]
5375@end deftypefun
5376
5377@deftypefun void mpn_divexact_1 (mp_limb_t * @var{rp}, const mp_limb_t * @var{sp}, mp_size_t @var{n}, mp_limb_t @var{d})
5378Divide @{@var{sp}, @var{n}@} by @var{d}, expecting it to divide exactly, and
5379writing the result to @{@var{rp}, @var{n}@}. If @var{d} doesn't divide
5380exactly, the value written to @{@var{rp}, @var{n}@} is undefined. The areas at
5381@var{rp} and @var{sp} have to be identical or completely separate, not
5382partially overlapping.
5383@end deftypefun
5384
5385@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
5386@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
5387Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
5388the result to @{@var{rp}, @var{n}@}.  If 3 divides exactly, the return value is
5389zero and the result is the quotient.  If not, the return value is non-zero and
5390the result won't be anything useful.
5391
5392@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
5393return value from a previous call, so a large calculation can be done piece by
5394piece from low to high.  @code{mpn_divexact_by3} is simply a macro calling
5395@code{mpn_divexact_by3c} with a 0 carry parameter.
5396
5397These routines use a multiply-by-inverse and will be faster than
5398@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.
5399
5400The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
5401and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
5402@m{b=2\GMPraise{@code{GMP\_NUMB\_BITS}}, b=2^GMP_NUMB_BITS}.  The
5403return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
5404be 0, 1 or 2 (these are both borrows really).  When @math{c=0} clearly
5405@math{q=(a-i)/3}.  When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
54063} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
5407@code{mp_bits_per_limb} is even, which is always so currently).
5408@end deftypefn
5409
5410@deftypefun mp_limb_t mpn_mod_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
5411Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
5412@var{s1n} can be zero.
5413@end deftypefun
5414
5415@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
5416Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
5417@{@var{rp}, @var{n}@}.  The bits shifted out at the left are returned in the
5418least significant @var{count} bits of the return value (the rest of the return
5419value is zero).
5420
5421@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1.  The
5422regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
5423@math{@var{rp} @ge{} @var{sp}}.
5424
5425This function is written in assembly for most CPUs.
5426@end deftypefun
5427
5428@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
5429Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
5430@{@var{rp}, @var{n}@}.  The bits shifted out at the right are returned in the
5431most significant @var{count} bits of the return value (the rest of the return
5432value is zero).
5433
5434@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1.  The
5435regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
5436@math{@var{rp} @le{} @var{sp}}.
5437
5438This function is written in assembly for most CPUs.
5439@end deftypefun
5440
5441@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5442Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
5443positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a
5444negative value if @math{@var{s1} < @var{s2}}.
5445@end deftypefun
5446
5447@deftypefun int mpn_zero_p (const mp_limb_t *@var{sp}, mp_size_t @var{n})
5448Test @{@var{sp}, @var{n}@} and return 1 if the operand is zero, 0 otherwise.
5449@end deftypefun
5450
5451@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn})
5452Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{xp},
5453@var{xn}@} and @{@var{yp}, @var{yn}@}.  The result can be up to @var{yn} limbs,
5454the return value is the actual number produced.  Both source operands are
5455destroyed.
5456
5457It is required that @math{@var{xn} @ge @var{yn} > 0}, and the most significant
5458limb of @{@var{yp}, @var{yn}@} must be non-zero.  No overlap is permitted
5459between @{@var{xp}, @var{xn}@} and @{@var{yp}, @var{yn}@}.
5460@end deftypefun
5461
5462@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t @var{ylimb})
5463Return the greatest common divisor of @{@var{xp}, @var{xn}@} and @var{ylimb}.
5464Both operands must be non-zero.
5465@end deftypefun
5466
5467@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{gp}, mp_limb_t *@var{sp}, mp_size_t *@var{sn}, mp_limb_t *@var{up}, mp_size_t @var{un}, mp_limb_t *@var{vp}, mp_size_t @var{vn})
5468Let @m{U,@var{U}} be defined by @{@var{up}, @var{un}@} and let @m{V,@var{V}} be
5469defined by @{@var{vp}, @var{vn}@}.
5470
5471Compute the greatest common divisor @math{G} of @math{U} and @math{V}.  Compute
5472a cofactor @math{S} such that @math{G = US + VT}.  The second cofactor @var{T}
5473is not computed but can easily be obtained from @m{(G - US) / V, (@var{G} -
5474@var{U}*@var{S}) / @var{V}} (the division will be exact).  It is required that
5475@math{@var{un} @ge @var{vn} > 0}, and the most significant
5476limb of @{@var{vp}, @var{vn}@} must be non-zero.
5477
5478@math{S} satisfies @math{S = 1} or @math{@GMPabs{S} < V / (2 G)}. @math{S =
54790} if and only if @math{V} divides @math{U} (i.e., @math{G = V}).
5480
5481Store @math{G} at @var{gp} and let the return value define its limb count.
5482Store @math{S} at @var{sp} and let |*@var{sn}| define its limb count.  @math{S}
5483can be negative; when this happens *@var{sn} will be negative.  The area at
5484@var{gp} should have room for @var{vn} limbs and the area at @var{sp} should
5485have room for @math{@var{vn}+1} limbs.
5486
5487Both source operands are destroyed.
5488
5489Compatibility notes: GMP 4.3.0 and 4.3.1 defined @math{S} less strictly.
5490Earlier as well as later GMP releases define @math{S} as described here.
5491GMP releases before GMP 4.3.0 required additional space for both input and output
5492areas. More precisely, the areas @{@var{up}, @math{@var{un}+1}@} and
5493@{@var{vp}, @math{@var{vn}+1}@} were destroyed (i.e.@: the operands plus an
5494extra limb past the end of each), and the areas pointed to by @var{gp} and
5495@var{sp} should each have room for @math{@var{un}+1} limbs.
5496@end deftypefun
5497
5498@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
5499Compute the square root of @{@var{sp}, @var{n}@} and put the result at
5500@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
5501@var{retval}@}.  @var{r2p} needs space for @var{n} limbs, but the return value
5502indicates how many are produced.
5503
5504The most significant limb of @{@var{sp}, @var{n}@} must be non-zero.  The
5505areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
5506be completely separate.  The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
5507@var{n}@} must be either identical or completely separate.
5508
5509If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
5510case the return value is zero or non-zero according to whether the remainder
5511would have been zero or non-zero.
5512
5513A return value of zero indicates a perfect square.  See also
5514@code{mpn_perfect_square_p}.
5515@end deftypefun
5516
5517@deftypefun size_t mpn_sizeinbase (const mp_limb_t *@var{xp}, mp_size_t @var{n}, int @var{base})
5518Return the size of @{@var{xp},@var{n}@} measured in number of digits in the
5519given @var{base}.  @var{base} can vary from 2 to 62.  Requires @math{@var{n} > 0}
5520and @math{@var{xp}[@var{n}-1] > 0}.  The result will be either exact or
55211 too big.  If @var{base} is a power of 2, the result is always exact.
5522@end deftypefun
5523
5524@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
5525Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
5526base @var{base}, and return the number of characters produced.  There may be
5527leading zeros in the string.  The string is not in ASCII; to convert it to
5528printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
5529the base and range.  @var{base} can vary from 2 to 256.
5530
5531The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
5532non-zero.  The input @{@var{s1p}, @var{s1n}@} is clobbered, except when
5533@var{base} is a power of 2, in which case it's unchanged.
5534
5535The area at @var{str} has to have space for the largest possible number
5536represented by a @var{s1n} long limb array, plus one extra character.
5537@end deftypefun
5538
5539@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base})
5540Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at
5541@var{rp}.
5542
5543@math{@var{str}[0]} is the most significant input byte and
5544@math{@var{str}[@var{strsize}-1]} is the least significant input byte.  Each
5545byte should be a value in the range 0 to @math{@var{base}-1}, not an ASCII
5546character.  @var{base} can vary from 2 to 256.
5547
5548The converted value is @{@var{rp},@var{rn}@} where @var{rn} is the return
5549value.  If the most significant input byte @math{@var{str}[0]} is non-zero,
5550then @math{@var{rp}[@var{rn}-1]} will be non-zero, else
5551@math{@var{rp}[@var{rn}-1]} and some number of subsequent limbs may be zero.
5552
5553The area at @var{rp} has to have space for the largest possible number with
5554@var{strsize} digits in the chosen base, plus one extra limb.
5555
5556The input must have at least one byte, and no overlap is permitted between
5557@{@var{str},@var{strsize}@} and the result at @var{rp}.
5558@end deftypefun
5559
5560@deftypefun {mp_bitcnt_t} mpn_scan0 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
5561Scan @var{s1p} from bit position @var{bit} for the next clear bit.
5562
5563It is required that there be a clear bit within the area at @var{s1p} at or
5564beyond bit position @var{bit}, so that the function has something to return.
5565@end deftypefun
5566
5567@deftypefun {mp_bitcnt_t} mpn_scan1 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
5568Scan @var{s1p} from bit position @var{bit} for the next set bit.
5569
5570It is required that there be a set bit within the area at @var{s1p} at or
5571beyond bit position @var{bit}, so that the function has something to return.
5572@end deftypefun
5573
5574@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
5575@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
5576Generate a random number of length @var{r1n} and store it at @var{r1p}.  The
5577most significant limb is always non-zero.  @code{mpn_random} generates
5578uniformly distributed limb data, @code{mpn_random2} generates long strings of
5579zeros and ones in the binary representation.
5580
5581@code{mpn_random2} is intended for testing the correctness of the @code{mpn}
5582routines.
5583@end deftypefun
5584
5585@deftypefun {mp_bitcnt_t} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5586Count the number of set bits in @{@var{s1p}, @var{n}@}.
5587@end deftypefun
5588
5589@deftypefun {mp_bitcnt_t} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5590Compute the hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p},
5591@var{n}@}, which is the number of bit positions where the two operands have
5592different bit values.
5593@end deftypefun
5594
5595@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5596Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
5597The most significant limb of the input @{@var{s1p}, @var{n}@} must be
5598non-zero.
5599@end deftypefun
5600
5601@deftypefun void mpn_and_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5602Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
5603@var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5604@end deftypefun
5605
5606@deftypefun void mpn_ior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5607Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
5608@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5609@end deftypefun
5610
5611@deftypefun void mpn_xor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5612Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
5613@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5614@end deftypefun
5615
5616@deftypefun void mpn_andn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5617Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and the bitwise
5618complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5619@end deftypefun
5620
5621@deftypefun void mpn_iorn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5622Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and the bitwise
5623complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
5624@end deftypefun
5625
5626@deftypefun void mpn_nand_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5627Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
5628@var{n}@}, and write the bitwise complement of the result to @{@var{rp}, @var{n}@}.
5629@end deftypefun
5630
5631@deftypefun void mpn_nior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5632Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
5633@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
5634@{@var{rp}, @var{n}@}.
5635@end deftypefun
5636
5637@deftypefun void mpn_xnor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5638Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
5639@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
5640@{@var{rp}, @var{n}@}.
5641@end deftypefun
5642
5643@deftypefun void mpn_com (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
5644Perform the bitwise complement of @{@var{sp}, @var{n}@}, and write the result
5645to @{@var{rp}, @var{n}@}.
5646@end deftypefun
5647
5648@deftypefun void mpn_copyi (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5649Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, increasingly.
5650@end deftypefun
5651
5652@deftypefun void mpn_copyd (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
5653Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, decreasingly.
5654@end deftypefun
5655
5656@deftypefun void mpn_zero (mp_limb_t *@var{rp}, mp_size_t @var{n})
5657Zero @{@var{rp}, @var{n}@}.
5658@end deftypefun
5659
5660@sp 1
5661@section Low-level functions for cryptography
5662@cindex Low-level functions for cryptography
5663@cindex Cryptography functions, low-level
5664
5665The functions prefixed with @code{mpn_sec_} and @code{mpn_cnd_} are designed to
5666perform the exact same low-level operations and have the same cache access
5667patterns for any two same-size arguments, assuming that function arguments are
5668placed at the same position and that the machine state is identical upon
5669function entry.  These functions are intended for cryptographic purposes, where
5670resilience to side-channel attacks is desired.
5671
5672These functions are less efficient than their ``leaky'' counterparts; their
5673performance for operands of the sizes typically used for cryptographic
5674applications is between 15% and 100% worse.  For larger operands, these
5675functions might be inadequate, since they rely on asymptotically elementary
5676algorithms.
5677
5678These functions do not make any explicit allocations.  Those of these functions
5679that need scratch space accept a scratch space operand.  This convention allows
5680callers to keep sensitive data in designated memory areas.  Note however that
5681compilers may choose to spill scalar values used within these functions to
5682their stack frame and that such scalars may contain sensitive data.
5683
5684In addition to these specially crafted functions, the following @code{mpn}
5685functions are naturally side-channel resistant: @code{mpn_add_n},
5686@code{mpn_sub_n}, @code{mpn_lshift}, @code{mpn_rshift}, @code{mpn_zero},
5687@code{mpn_copyi}, @code{mpn_copyd}, @code{mpn_com}, and the logical function
5688(@code{mpn_and_n}, etc).
5689
5690There are some exceptions from the side-channel resilience: (1) Some assembly
5691implementations of @code{mpn_lshift} identify shift-by-one as a special case.
5692This is a problem iff the shift count is a function of sensitive data.  (2)
5693Alpha ev6 and Pentium4 using 64-bit limbs have leaky @code{mpn_add_n} and
5694@code{mpn_sub_n}.  (3) Alpha ev6 has a leaky @code{mpn_mul_1} which also makes
5695@code{mpn_sec_mul} on those systems unsafe.
5696
5697@deftypefun mp_limb_t mpn_cnd_add_n (mp_limb_t @var{cnd}, mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5698@deftypefunx mp_limb_t mpn_cnd_sub_n (mp_limb_t @var{cnd}, mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
5699These functions do conditional addition and subtraction.  If @var{cnd} is
5700non-zero, they produce the same result as a regular @code{mpn_add_n} or
5701@code{mpn_sub_n}, and if @var{cnd} is zero, they copy @{@var{s1p},@var{n}@} to
5702the result area and return zero.  The functions are designed to have timing and
5703memory access patterns depending only on size and location of the data areas,
5704but independent of the condition @var{cnd}.  Like for @code{mpn_add_n} and
5705@code{mpn_sub_n}, on most machines, the timing will also be independent of the
5706actual limb values.
5707@end deftypefun
5708
5709@deftypefun mp_limb_t mpn_sec_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{n}, mp_limb_t @var{b}, mp_limb_t *@var{tp})
5710@deftypefunx mp_limb_t mpn_sec_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{n}, mp_limb_t @var{b}, mp_limb_t *@var{tp})
5711Set @var{R} to @var{A} + @var{b} or @var{A} - @var{b}, respectively, where
5712@var{R} = @{@var{rp},@var{n}@}, @var{A} = @{@var{ap},@var{n}@}, and @var{b} is
5713a single limb. Returns carry.
5714
5715These functions take @math{O(N)} time, unlike the leaky functions
5716@code{mpn_add_1} which are @math{O(1)} on average. They require scratch space
5717of @code{mpn_sec_add_1_itch(@var{n})} and @code{mpn_sec_sub_1_itch(@var{n})}
5718limbs, respectively, to be passed in the @var{tp} parameter. The scratch space
5719requirements are guaranteed to be at most @var{n} limbs, and increase
5720monotonously in the operand size.
5721@end deftypefun
5722
5723@deftypefun void mpn_cnd_swap (mp_limb_t @var{cnd}, volatile mp_limb_t *@var{ap}, volatile mp_limb_t *@var{bp}, mp_size_t @var{n})
5724If @var{cnd} is non-zero, swaps the contents of the areas @{@var{ap},@var{n}@}
5725and @{@var{bp},@var{n}@}. Otherwise, the areas are left unmodified.
5726Implemented using logical operations on the limbs, with the same memory
5727accesses independent of the value of @var{cnd}.
5728@end deftypefun
5729
5730@deftypefun void mpn_sec_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{an}, const mp_limb_t *@var{bp}, mp_size_t @var{bn}, mp_limb_t *@var{tp})
5731@deftypefunx mp_size_t mpn_sec_mul_itch (mp_size_t @var{an}, mp_size_t @var{bn})
5732Set @var{R} to @math{A @times{} B}, where @var{A} = @{@var{ap},@var{an}@},
5733@var{B} = @{@var{bp},@var{bn}@}, and @var{R} =
5734@{@var{rp},@math{@var{an}+@var{bn}}@}.
5735
5736It is required that @math{@var{an} @ge @var{bn} > 0}.
5737
5738No overlapping between @var{R} and the input operands is allowed.  For
5739@math{@var{A} = @var{B}}, use @code{mpn_sec_sqr} for optimal performance.
5740
5741This function requires scratch space of @code{mpn_sec_mul_itch(@var{an},
5742@var{bn})} limbs to be passed in the @var{tp} parameter.  The scratch space
5743requirements are guaranteed to increase monotonously in the operand sizes.
5744@end deftypefun
5745
5746
5747@deftypefun void mpn_sec_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{an}, mp_limb_t *@var{tp})
5748@deftypefunx mp_size_t mpn_sec_sqr_itch (mp_size_t @var{an})
5749Set @var{R} to @math{A^2}, where @var{A} = @{@var{ap},@var{an}@}, and @var{R} =
5750@{@var{rp},@math{2@var{an}}@}.
5751
5752It is required that @math{@var{an} > 0}.
5753
5754No overlapping between @var{R} and the input operands is allowed.
5755
5756This function requires scratch space of @code{mpn_sec_sqr_itch(@var{an})} limbs
5757to be passed in the @var{tp} parameter.  The scratch space requirements are
5758guaranteed to increase monotonously in the operand size.
5759@end deftypefun
5760
5761
5762@deftypefun void mpn_sec_powm (mp_limb_t *@var{rp}, const mp_limb_t *@var{bp}, mp_size_t @var{bn}, const mp_limb_t *@var{ep}, mp_bitcnt_t @var{enb},  const mp_limb_t *@var{mp}, mp_size_t @var{n}, mp_limb_t *@var{tp})
5763@deftypefunx mp_size_t mpn_sec_powm_itch (mp_size_t @var{bn}, mp_bitcnt_t @var{enb}, size_t @var{n})
5764Set @var{R} to @m{B^E \bmod @var{M}, (@var{B} raised to @var{E}) modulo
5765@var{M}}, where @var{R} = @{@var{rp},@var{n}@}, @var{M} = @{@var{mp},@var{n}@},
5766and @var{E} = @{@var{ep},@math{@GMPceil{@var{enb} /
5767@code{GMP\_NUMB\_BITS}}}@}.
5768
5769It is required that @math{@var{B} > 0}, that @math{@var{M} > 0} is odd, and
5770that @m{@var{E} < 2@GMPraise{@var{enb}}, @var{E} < 2^@var{enb}}.
5771
5772No overlapping between @var{R} and the input operands is allowed.
5773
5774This function requires scratch space of @code{mpn_sec_powm_itch(@var{bn},
5775@var{enb}, @var{n})} limbs to be passed in the @var{tp} parameter.  The scratch
5776space requirements are guaranteed to increase monotonously in the operand
5777sizes.
5778@end deftypefun
5779
5780@deftypefun void mpn_sec_tabselect (mp_limb_t *@var{rp}, const mp_limb_t *@var{tab}, mp_size_t @var{n}, mp_size_t @var{nents}, mp_size_t @var{which})
5781Select entry @var{which} from table @var{tab}, which has @var{nents} entries, each @var{n}
5782limbs.  Store the selected entry at @var{rp}.
5783
5784This function reads the entire table to avoid side-channel information leaks.
5785@end deftypefun
5786
5787@deftypefun mp_limb_t mpn_sec_div_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn}, mp_limb_t *@var{tp})
5788@deftypefunx mp_size_t mpn_sec_div_qr_itch (mp_size_t @var{nn}, mp_size_t @var{dn})
5789
5790Set @var{Q} to @m{\lfloor @var{N} / @var{D}\rfloor, the truncated quotient
5791@var{N} / @var{D}} and @var{R} to @m{@var{N} \bmod @var{D}, @var{N} modulo
5792@var{D}}, where @var{N} = @{@var{np},@var{nn}@}, @var{D} =
5793@{@var{dp},@var{dn}@}, @var{Q}'s most significant limb is the function return
5794value and the remaining limbs are @{@var{qp},@var{nn-dn}@}, and @var{R} =
5795@{@var{np},@var{dn}@}.
5796
5797It is required that @math{@var{nn} @ge @var{dn} @ge 1}, and that
5798@m{@var{dp}[@var{dn}-1] @neq 0, @var{dp}[@var{dn}-1] != 0}.  This does not
5799imply that @math{@var{N} @ge @var{D}} since @var{N} might be zero-padded.
5800
5801Note the overlapping between @var{N} and @var{R}.  No other operand overlapping
5802is allowed.  The entire space occupied by @var{N} is overwritten.
5803
5804This function requires scratch space of @code{mpn_sec_div_qr_itch(@var{nn},
5805@var{dn})} limbs to be passed in the @var{tp} parameter.
5806@end deftypefun
5807
5808@deftypefun void mpn_sec_div_r (mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn}, mp_limb_t *@var{tp})
5809@deftypefunx mp_size_t mpn_sec_div_r_itch (mp_size_t @var{nn}, mp_size_t @var{dn})
5810
5811Set @var{R} to @m{@var{N} \bmod @var{D}, @var{N} modulo @var{D}}, where @var{N}
5812= @{@var{np},@var{nn}@}, @var{D} = @{@var{dp},@var{dn}@}, and @var{R} =
5813@{@var{np},@var{dn}@}.
5814
5815It is required that @math{@var{nn} @ge @var{dn} @ge 1}, and that
5816@m{@var{dp}[@var{dn}-1] @neq 0, @var{dp}[@var{dn}-1] != 0}.  This does not
5817imply that @math{@var{N} @ge @var{D}} since @var{N} might be zero-padded.
5818
5819Note the overlapping between @var{N} and @var{R}.  No other operand overlapping
5820is allowed.  The entire space occupied by @var{N} is overwritten.
5821
5822This function requires scratch space of @code{mpn_sec_div_r_itch(@var{nn},
5823@var{dn})} limbs to be passed in the @var{tp} parameter.
5824@end deftypefun
5825
5826@deftypefun int mpn_sec_invert (mp_limb_t *@var{rp}, mp_limb_t *@var{ap}, const mp_limb_t *@var{mp}, mp_size_t @var{n}, mp_bitcnt_t @var{nbcnt}, mp_limb_t *@var{tp})
5827@deftypefunx mp_size_t mpn_sec_invert_itch (mp_size_t @var{n})
5828Set @var{R} to @m{@var{A}^{-1} \bmod @var{M}, the inverse of @var{A} modulo
5829@var{M}}, where @var{R} = @{@var{rp},@var{n}@}, @var{A} = @{@var{ap},@var{n}@},
5830and @var{M} = @{@var{mp},@var{n}@}.  @strong{This function's interface is
5831preliminary.}
5832
5833If an inverse exists, return 1, otherwise return 0 and leave @var{R}
5834undefined. In either case, the input @var{A} is destroyed.
5835
5836It is required that @var{M} is odd, and that @math{@var{nbcnt} @ge
5837@GMPceil{\log(@var{A}+1)} + @GMPceil{\log(@var{M}+1)}}.  A safe choice is
5838@m{@var{nbcnt} = 2@var{n} @times{} @code{GMP\_NUMB\_BITS}, @var{nbcnt} = 2
5839@times{} @var{n} @times{} GMP_NUMB_BITS}, but a smaller value might improve
5840performance if @var{M} or @var{A} are known to have leading zero bits.
5841
5842This function requires scratch space of @code{mpn_sec_invert_itch(@var{n})}
5843limbs to be passed in the @var{tp} parameter.
5844@end deftypefun
5845
5846
5847@sp 1
5848@section Nails
5849@cindex Nails
5850
5851@strong{Everything in this section is highly experimental and may disappear or
5852be subject to incompatible changes in a future version of GMP.}
5853
5854Nails are an experimental feature whereby a few bits are left unused at the
5855top of each @code{mp_limb_t}.  This can significantly improve carry handling
5856on some processors.
5857
5858All the @code{mpn} functions accepting limb data will expect the nail bits to
5859be zero on entry, and will return data with the nails similarly all zero.
5860This applies both to limb vectors and to single limb arguments.
5861
5862Nails can be enabled by configuring with @samp{--enable-nails}.  By default
5863the number of bits will be chosen according to what suits the host processor,
5864but a particular number can be selected with @samp{--enable-nails=N}.
5865
5866At the mpn level, a nail build is neither source nor binary compatible with a
5867non-nail build, strictly speaking.  But programs acting on limbs only through
5868the mpn functions are likely to work equally well with either build, and
5869judicious use of the definitions below should make any program compatible with
5870either build, at the source level.
5871
5872For the higher level routines, meaning @code{mpz} etc, a nail build should be
5873fully source and binary compatible with a non-nail build.
5874
5875@defmac GMP_NAIL_BITS
5876@defmacx GMP_NUMB_BITS
5877@defmacx GMP_LIMB_BITS
5878@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in
5879use.  @code{GMP_NUMB_BITS} is the number of data bits in a limb.
5880@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}.  In
5881all cases
5882
5883@example
5884GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS
5885@end example
5886@end defmac
5887
5888@defmac GMP_NAIL_MASK
5889@defmacx GMP_NUMB_MASK
5890Bit masks for the nail and number parts of a limb.  @code{GMP_NAIL_MASK} is 0
5891when nails are not in use.
5892
5893@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained
5894with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which
5895can help various RISC chips.
5896@end defmac
5897
5898@defmac GMP_NUMB_MAX
5899The maximum value that can be stored in the number part of a limb.  This is
5900the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing
5901comparisons rather than bit-wise operations.
5902@end defmac
5903
5904The term ``nails'' comes from finger or toe nails, which are at the ends of a
5905limb (arm or leg).  ``numb'' is short for number, but is also how the
5906developers felt after trying for a long time to come up with sensible names
5907for these things.
5908
5909In the future (the distant future most likely) a non-zero nail might be
5910permitted, giving non-unique representations for numbers in a limb vector.
5911This would help vector processors since carries would only ever need to
5912propagate one or two limbs.
5913
5914
5915@node Random Number Functions, Formatted Output, Low-level Functions, Top
5916@chapter Random Number Functions
5917@cindex Random number functions
5918
5919Sequences of pseudo-random numbers in GMP are generated using a variable of
5920type @code{gmp_randstate_t}, which holds an algorithm selection and a current
5921state.  Such a variable must be initialized by a call to one of the
5922@code{gmp_randinit} functions, and can be seeded with one of the
5923@code{gmp_randseed} functions.
5924
5925The functions actually generating random numbers are described in @ref{Integer
5926Random Numbers}, and @ref{Miscellaneous Float Functions}.
5927
5928The older style random number functions don't accept a @code{gmp_randstate_t}
5929parameter but instead share a global variable of that type.  They use a
5930default algorithm and are currently not seeded (though perhaps that will
5931change in the future).  The new functions accepting a @code{gmp_randstate_t}
5932are recommended for applications that care about randomness.
5933
5934@menu
5935* Random State Initialization::
5936* Random State Seeding::
5937* Random State Miscellaneous::
5938@end menu
5939
5940@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
5941@section Random State Initialization
5942@cindex Random number state
5943@cindex Initialization functions
5944
5945@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
5946Initialize @var{state} with a default algorithm.  This will be a compromise
5947between speed and randomness, and is recommended for applications with no
5948special requirements.  Currently this is @code{gmp_randinit_mt}.
5949@end deftypefun
5950
5951@deftypefun void gmp_randinit_mt (gmp_randstate_t @var{state})
5952@cindex Mersenne twister random numbers
5953Initialize @var{state} for a Mersenne Twister algorithm.  This algorithm is
5954fast and has good randomness properties.
5955@end deftypefun
5956
5957@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, const mpz_t @var{a}, @w{unsigned long @var{c}}, @w{mp_bitcnt_t @var{m2exp}})
5958@cindex Linear congruential random numbers
5959Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
5960@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.
5961
5962The low bits of @math{X} in this algorithm are not very random.  The least
5963significant bit will have a period no more than 2, and the second bit no more
5964than 4, etc.  For this reason only the high half of each @math{X} is actually
5965used.
5966
5967When a random number of more than @math{@var{m2exp}/2} bits is to be
5968generated, multiple iterations of the recurrence are used and the results
5969concatenated.
5970@end deftypefun
5971
5972@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, mp_bitcnt_t @var{size})
5973@cindex Linear congruential random numbers
5974Initialize @var{state} for a linear congruential algorithm as per
5975@code{gmp_randinit_lc_2exp}.  @var{a}, @var{c} and @var{m2exp} are selected
5976from a table, chosen so that @var{size} bits (or more) of each @math{X} will
5977be used, i.e.@: @math{@var{m2exp}/2 @ge{} @var{size}}.
5978
5979If successful the return value is non-zero.  If @var{size} is bigger than the
5980table data provides then the return value is zero.  The maximum @var{size}
5981currently supported is 128.
5982@end deftypefun
5983
5984@deftypefun void gmp_randinit_set (gmp_randstate_t @var{rop}, gmp_randstate_t @var{op})
5985Initialize @var{rop} with a copy of the algorithm and state from @var{op}.
5986@end deftypefun
5987
5988@c  Although gmp_randinit, gmp_errno and related constants are obsolete, we
5989@c  still put @findex entries for them, since they're still documented and
5990@c  someone might be looking them up when perusing old application code.
5991
5992@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, @dots{})
5993@strong{This function is obsolete.}
5994
5995@findex GMP_RAND_ALG_LC
5996@findex GMP_RAND_ALG_DEFAULT
5997Initialize @var{state} with an algorithm selected by @var{alg}.  The only
5998choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}
5999described above.  A third parameter of type @code{unsigned long} is required,
6000this is the @var{size} for that function.  @code{GMP_RAND_ALG_DEFAULT} or 0
6001are the same as @code{GMP_RAND_ALG_LC}.
6002
6003@c  For reference, this is the only place gmp_errno has been documented, and
6004@c  due to being non thread safe we won't be adding to it's uses.
6005@findex gmp_errno
6006@findex GMP_ERROR_UNSUPPORTED_ARGUMENT
6007@findex GMP_ERROR_INVALID_ARGUMENT
6008@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to
6009indicate an error.  @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is
6010unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter
6011is too big.  It may be noted this error reporting is not thread safe (a good
6012reason to use @code{gmp_randinit_lc_2exp_size} instead).
6013@end deftypefun
6014
6015@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
6016Free all memory occupied by @var{state}.
6017@end deftypefun
6018
6019
6020@node Random State Seeding, Random State Miscellaneous, Random State Initialization, Random Number Functions
6021@section Random State Seeding
6022@cindex Random number seeding
6023@cindex Seeding random numbers
6024
6025@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, const mpz_t @var{seed})
6026@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
6027Set an initial seed value into @var{state}.
6028
6029The size of a seed determines how many different sequences of random numbers
6030that it's possible to generate.  The ``quality'' of the seed is the randomness
6031of a given seed compared to the previous seed used, and this affects the
6032randomness of separate number sequences.  The method for choosing a seed is
6033critical if the generated numbers are to be used for important applications,
6034such as generating cryptographic keys.
6035
6036Traditionally the system time has been used to seed, but care needs to be
6037taken with this.  If an application seeds often and the resolution of the
6038system clock is low, then the same sequence of numbers might be repeated.
6039Also, the system time is quite easy to guess, so if unpredictability is
6040required then it should definitely not be the only source for the seed value.
6041On some systems there's a special device @file{/dev/random} which provides
6042random data better suited for use as a seed.
6043@end deftypefun
6044
6045
6046@node Random State Miscellaneous,  , Random State Seeding, Random Number Functions
6047@section Random State Miscellaneous
6048
6049@deftypefun {unsigned long} gmp_urandomb_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
6050Return a uniformly distributed random number of @var{n} bits, i.e.@: in the
6051range 0 to @m{2^n-1,2^@var{n}-1} inclusive.  @var{n} must be less than or
6052equal to the number of bits in an @code{unsigned long}.
6053@end deftypefun
6054
6055@deftypefun {unsigned long} gmp_urandomm_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
6056Return a uniformly distributed random number in the range 0 to
6057@math{@var{n}-1}, inclusive.
6058@end deftypefun
6059
6060
6061@node Formatted Output, Formatted Input, Random Number Functions, Top
6062@chapter Formatted Output
6063@cindex Formatted output
6064@cindex @code{printf} formatted output
6065
6066@menu
6067* Formatted Output Strings::
6068* Formatted Output Functions::
6069* C++ Formatted Output::
6070@end menu
6071
6072@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
6073@section Format Strings
6074
6075@code{gmp_printf} and friends accept format strings similar to the standard C
6076@code{printf} (@pxref{Formatted Output,, Formatted Output, libc, The GNU C
6077Library Reference Manual}).  A format specification is of the form
6078
6079@example
6080% [flags] [width] [.[precision]] [type] conv
6081@end example
6082
6083GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
6084and @code{mpf_t} respectively, @samp{M} for @code{mp_limb_t}, and @samp{N} for
6085an @code{mp_limb_t} array.  @samp{Z}, @samp{Q}, @samp{M} and @samp{N} behave
6086like integers.  @samp{Q} will print a @samp{/} and a denominator, if needed.
6087@samp{F} behaves like a float.  For example,
6088
6089@example
6090mpz_t z;
6091gmp_printf ("%s is an mpz %Zd\n", "here", z);
6092
6093mpq_t q;
6094gmp_printf ("a hex rational: %#40Qx\n", q);
6095
6096mpf_t f;
6097int   n;
6098gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);
6099
6100mp_limb_t l;
6101gmp_printf ("limb %Mu\n", l);
6102
6103const mp_limb_t *ptr;
6104mp_size_t       size;
6105gmp_printf ("limb array %Nx\n", ptr, size);
6106@end example
6107
6108For @samp{N} the limbs are expected least significant first, as per the
6109@code{mpn} functions (@pxref{Low-level Functions}).  A negative size can be
6110given to print the value as a negative.
6111
6112All the standard C @code{printf} types behave the same as the C library
6113@code{printf}, and can be freely intermixed with the GMP extensions.  In the
6114current implementation the standard parts of the format string are simply
6115handed to @code{printf} and only the GMP extensions handled directly.
6116
6117The flags accepted are as follows.  GLIBC style @nisamp{'} is only for the
6118standard C types (not the GMP types), and only if the C library supports it.
6119
6120@quotation
6121@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6122@item @nicode{0} @tab pad with zeros (rather than spaces)
6123@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
6124@item @nicode{+} @tab always show a sign
6125@item (space)    @tab show a space or a @samp{-} sign
6126@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
6127@end multitable
6128@end quotation
6129
6130The optional width and precision can be given as a number within the format
6131string, or as a @samp{*} to take an extra parameter of type @code{int}, the
6132same as the standard @code{printf}.
6133
6134The standard types accepted are as follows.  @samp{h} and @samp{l} are
6135portable, the rest will depend on the compiler (or include files) for the type
6136and the C library for the output.
6137
6138@quotation
6139@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6140@item @nicode{h}  @tab @nicode{short}
6141@item @nicode{hh} @tab @nicode{char}
6142@item @nicode{j}  @tab @nicode{intmax_t} or @nicode{uintmax_t}
6143@item @nicode{l}  @tab @nicode{long} or @nicode{wchar_t}
6144@item @nicode{ll} @tab @nicode{long long}
6145@item @nicode{L}  @tab @nicode{long double}
6146@item @nicode{q}  @tab @nicode{quad_t} or @nicode{u_quad_t}
6147@item @nicode{t}  @tab @nicode{ptrdiff_t}
6148@item @nicode{z}  @tab @nicode{size_t}
6149@end multitable
6150@end quotation
6151
6152@noindent
6153The GMP types are
6154
6155@quotation
6156@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6157@item @nicode{F}  @tab @nicode{mpf_t}, float conversions
6158@item @nicode{Q}  @tab @nicode{mpq_t}, integer conversions
6159@item @nicode{M}  @tab @nicode{mp_limb_t}, integer conversions
6160@item @nicode{N}  @tab @nicode{mp_limb_t} array, integer conversions
6161@item @nicode{Z}  @tab @nicode{mpz_t}, integer conversions
6162@end multitable
6163@end quotation
6164
6165The conversions accepted are as follows.  @samp{a} and @samp{A} are always
6166supported for @code{mpf_t} but depend on the C library for standard C float
6167types.  @samp{m} and @samp{p} depend on the C library.
6168
6169@quotation
6170@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6171@item @nicode{a} @nicode{A} @tab hex floats, C99 style
6172@item @nicode{c}            @tab character
6173@item @nicode{d}            @tab decimal integer
6174@item @nicode{e} @nicode{E} @tab scientific format float
6175@item @nicode{f}            @tab fixed point float
6176@item @nicode{i}            @tab same as @nicode{d}
6177@item @nicode{g} @nicode{G} @tab fixed or scientific float
6178@item @nicode{m}            @tab @code{strerror} string, GLIBC style
6179@item @nicode{n}            @tab store characters written so far
6180@item @nicode{o}            @tab octal integer
6181@item @nicode{p}            @tab pointer
6182@item @nicode{s}            @tab string
6183@item @nicode{u}            @tab unsigned integer
6184@item @nicode{x} @nicode{X} @tab hex integer
6185@end multitable
6186@end quotation
6187
6188@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
6189types @samp{Z}, @samp{Q} and @samp{N} they are signed.  @samp{u} is not
6190meaningful for @samp{Z}, @samp{Q} and @samp{N}.
6191
6192@samp{M} is a proxy for the C library @samp{l} or @samp{L}, according to the
6193size of @code{mp_limb_t}.  Unsigned conversions will be usual, but a signed
6194conversion can be used and will interpret the value as a twos complement
6195negative.
6196
6197@samp{n} can be used with any type, even the GMP types.
6198
6199Other types or conversions that might be accepted by the C library
6200@code{printf} cannot be used through @code{gmp_printf}, this includes for
6201instance extensions registered with GLIBC @code{register_printf_function}.
6202Also currently there's no support for POSIX @samp{$} style numbered arguments
6203(perhaps this will be added in the future).
6204
6205The precision field has its usual meaning for integer @samp{Z} and float
6206@samp{F} types, but is currently undefined for @samp{Q} and should not be used
6207with that.
6208
6209@code{mpf_t} conversions only ever generate as many digits as can be
6210accurately represented by the operand, the same as @code{mpf_get_str} does.
6211Zeros will be used if necessary to pad to the requested precision.  This
6212happens even for an @samp{f} conversion of an @code{mpf_t} which is an
6213integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits
6214precision will only produce about 40 digits, then pad with zeros to the
6215decimal point.  An empty precision field like @samp{%.Fe} or @samp{%.Ff} can
6216be used to specifically request just the significant digits.  Without any dot
6217and thus no precision field, a precision value of 6 will be used.  Note that
6218these rules mean that @samp{%Ff}, @samp{%.Ff}, and @samp{%.0Ff} will all be
6219different.
6220
6221The decimal point character (or string) is taken from the current locale
6222settings on systems which provide @code{localeconv} (@pxref{Locales,, Locales
6223and Internationalization, libc, The GNU C Library Reference Manual}).  The C
6224library will normally do the same for standard float output.
6225
6226The format string is only interpreted as plain @code{char}s, multibyte
6227characters are not recognised.  Perhaps this will change in the future.
6228
6229
6230@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
6231@section Functions
6232@cindex Output functions
6233
6234Each of the following functions is similar to the corresponding C library
6235function.  The basic @code{printf} forms take a variable argument list.  The
6236@code{vprintf} forms take an argument pointer, see @ref{Variadic Functions,,
6237Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
6238va_start}.
6239
6240It should be emphasised that if a format string is invalid, or the arguments
6241don't match what the format specifies, then the behaviour of any of these
6242functions will be unpredictable.  GCC format string checking is not available,
6243since it doesn't recognise the GMP extensions.
6244
6245The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
6246@math{-1} to indicate a write error.  Output is not ``atomic'', so partial
6247output may be produced if a write error occurs.  All the functions can return
6248@math{-1} if the C library @code{printf} variant in use returns @math{-1}, but
6249this shouldn't normally occur.
6250
6251@deftypefun int gmp_printf (const char *@var{fmt}, @dots{})
6252@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
6253Print to the standard output @code{stdout}.  Return the number of characters
6254written, or @math{-1} if an error occurred.
6255@end deftypefun
6256
6257@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
6258@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
6259Print to the stream @var{fp}.  Return the number of characters written, or
6260@math{-1} if an error occurred.
6261@end deftypefun
6262
6263@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, @dots{})
6264@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
6265Form a null-terminated string in @var{buf}.  Return the number of characters
6266written, excluding the terminating null.
6267
6268No overlap is permitted between the space at @var{buf} and the string
6269@var{fmt}.
6270
6271These functions are not recommended, since there's no protection against
6272exceeding the space available at @var{buf}.
6273@end deftypefun
6274
6275@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, @dots{})
6276@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
6277Form a null-terminated string in @var{buf}.  No more than @var{size} bytes
6278will be written.  To get the full output, @var{size} must be enough for the
6279string and null-terminator.
6280
6281The return value is the total number of characters which ought to have been
6282produced, excluding the terminating null.  If @math{@var{retval} @ge{}
6283@var{size}} then the actual output has been truncated to the first
6284@math{@var{size}-1} characters, and a null appended.
6285
6286No overlap is permitted between the region @{@var{buf},@var{size}@} and the
6287@var{fmt} string.
6288
6289Notice the return value is in ISO C99 @code{snprintf} style.  This is so even
6290if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
6291@end deftypefun
6292
6293@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, @dots{})
6294@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
6295Form a null-terminated string in a block of memory obtained from the current
6296memory allocation function (@pxref{Custom Allocation}).  The block will be the
6297size of the string and null-terminator.  The address of the block in stored to
6298*@var{pp}.  The return value is the number of characters produced, excluding
6299the null-terminator.
6300
6301Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
6302@math{-1} if there's no more memory available, it lets the current allocation
6303function handle that.
6304@end deftypefun
6305
6306@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, @dots{})
6307@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
6308@cindex @code{obstack} output
6309Append to the current object in @var{ob}.  The return value is the number of
6310characters written.  A null-terminator is not written.
6311
6312@var{fmt} cannot be within the current object in @var{ob}, since that object
6313might move as it grows.
6314
6315These functions are available only when the C library provides the obstack
6316feature, which probably means only on GNU systems, see @ref{Obstacks,,
6317Obstacks, libc, The GNU C Library Reference Manual}.
6318@end deftypefun
6319
6320
6321@node C++ Formatted Output,  , Formatted Output Functions, Formatted Output
6322@section C++ Formatted Output
6323@cindex C++ @code{ostream} output
6324@cindex @code{ostream} output
6325
6326The following functions are provided in @file{libgmpxx} (@pxref{Headers and
6327Libraries}), which is built if C++ support is enabled (@pxref{Build Options}).
6328Prototypes are available from @code{<gmp.h>}.
6329
6330@deftypefun ostream& operator<< (ostream& @var{stream}, const mpz_t @var{op})
6331Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
6332@code{ios::width} is reset to 0 after output, the same as the standard
6333@code{ostream operator<<} routines do.
6334
6335In hex or octal, @var{op} is printed as a signed number, the same as for
6336decimal.  This is unlike the standard @code{operator<<} routines on @code{int}
6337etc, which instead give twos complement.
6338@end deftypefun
6339
6340@deftypefun ostream& operator<< (ostream& @var{stream}, const mpq_t @var{op})
6341Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
6342@code{ios::width} is reset to 0 after output, the same as the standard
6343@code{ostream operator<<} routines do.
6344
6345Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
6346just a plain integer like @samp{123}.
6347
6348In hex or octal, @var{op} is printed as a signed value, the same as for
6349decimal.  If @code{ios::showbase} is set then a base indicator is shown on
6350both the numerator and denominator (if the denominator is required).
6351@end deftypefun
6352
6353@deftypefun ostream& operator<< (ostream& @var{stream}, const mpf_t @var{op})
6354Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
6355@code{ios::width} is reset to 0 after output, the same as the standard
6356@code{ostream operator<<} routines do.
6357
6358The decimal point follows the standard library float @code{operator<<}, which
6359on recent systems means the @code{std::locale} imbued on @var{stream}.
6360
6361Hex and octal are supported, unlike the standard @code{operator<<} on
6362@code{double}.  The mantissa will be in hex or octal, the exponent will be in
6363decimal.  For hex the exponent delimiter is an @samp{@@}.  This is as per
6364@code{mpf_out_str}.
6365
6366@code{ios::showbase} is supported, and will put a base on the mantissa, for
6367example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
6368This last form is slightly strange, but at least differentiates itself from
6369decimal.
6370@end deftypefun
6371
6372These operators mean that GMP types can be printed in the usual C++ way, for
6373example,
6374
6375@example
6376mpz_t  z;
6377int    n;
6378...
6379cout << "iteration " << n << " value " << z << "\n";
6380@end example
6381
6382But note that @code{ostream} output (and @code{istream} input, @pxref{C++
6383Formatted Input}) is the only overloading available for the GMP types and that
6384for instance using @code{+} with an @code{mpz_t} will have unpredictable
6385results.  For classes with overloading, see @ref{C++ Class Interface}.
6386
6387
6388@node Formatted Input, C++ Class Interface, Formatted Output, Top
6389@chapter Formatted Input
6390@cindex Formatted input
6391@cindex @code{scanf} formatted input
6392
6393@menu
6394* Formatted Input Strings::
6395* Formatted Input Functions::
6396* C++ Formatted Input::
6397@end menu
6398
6399
6400@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
6401@section Formatted Input Strings
6402
6403@code{gmp_scanf} and friends accept format strings similar to the standard C
6404@code{scanf} (@pxref{Formatted Input,, Formatted Input, libc, The GNU C
6405Library Reference Manual}).  A format specification is of the form
6406
6407@example
6408% [flags] [width] [type] conv
6409@end example
6410
6411GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
6412and @code{mpf_t} respectively.  @samp{Z} and @samp{Q} behave like integers.
6413@samp{Q} will read a @samp{/} and a denominator, if present.  @samp{F} behaves
6414like a float.
6415
6416GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
6417they're already ``call-by-reference''.  For example,
6418
6419@example
6420/* to read say "a(5) = 1234" */
6421int   n;
6422mpz_t z;
6423gmp_scanf ("a(%d) = %Zd\n", &n, z);
6424
6425mpq_t q1, q2;
6426gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);
6427
6428/* to read say "topleft (1.55,-2.66)" */
6429mpf_t x, y;
6430char  buf[32];
6431gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
6432@end example
6433
6434All the standard C @code{scanf} types behave the same as in the C library
6435@code{scanf}, and can be freely intermixed with the GMP extensions.  In the
6436current implementation the standard parts of the format string are simply
6437handed to @code{scanf} and only the GMP extensions handled directly.
6438
6439The flags accepted are as follows.  @samp{a} and @samp{'} will depend on
6440support from the C library, and @samp{'} cannot be used with GMP types.
6441
6442@quotation
6443@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6444@item @nicode{*} @tab read but don't store
6445@item @nicode{a} @tab allocate a buffer (string conversions)
6446@item @nicode{'} @tab grouped digits, GLIBC style (not GMP types)
6447@end multitable
6448@end quotation
6449
6450The standard types accepted are as follows.  @samp{h} and @samp{l} are
6451portable, the rest will depend on the compiler (or include files) for the type
6452and the C library for the input.
6453
6454@quotation
6455@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6456@item @nicode{h}  @tab @nicode{short}
6457@item @nicode{hh} @tab @nicode{char}
6458@item @nicode{j}  @tab @nicode{intmax_t} or @nicode{uintmax_t}
6459@item @nicode{l}  @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
6460@item @nicode{ll} @tab @nicode{long long}
6461@item @nicode{L}  @tab @nicode{long double}
6462@item @nicode{q}  @tab @nicode{quad_t} or @nicode{u_quad_t}
6463@item @nicode{t}  @tab @nicode{ptrdiff_t}
6464@item @nicode{z}  @tab @nicode{size_t}
6465@end multitable
6466@end quotation
6467
6468@noindent
6469The GMP types are
6470
6471@quotation
6472@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6473@item @nicode{F}  @tab @nicode{mpf_t}, float conversions
6474@item @nicode{Q}  @tab @nicode{mpq_t}, integer conversions
6475@item @nicode{Z}  @tab @nicode{mpz_t}, integer conversions
6476@end multitable
6477@end quotation
6478
6479The conversions accepted are as follows.  @samp{p} and @samp{[} will depend on
6480support from the C library, the rest are standard.
6481
6482@quotation
6483@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
6484@item @nicode{c}            @tab character or characters
6485@item @nicode{d}            @tab decimal integer
6486@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G}
6487                            @tab float
6488@item @nicode{i}            @tab integer with base indicator
6489@item @nicode{n}            @tab characters read so far
6490@item @nicode{o}            @tab octal integer
6491@item @nicode{p}            @tab pointer
6492@item @nicode{s}            @tab string of non-whitespace characters
6493@item @nicode{u}            @tab decimal integer
6494@item @nicode{x} @nicode{X} @tab hex integer
6495@item @nicode{[}            @tab string of characters in a set
6496@end multitable
6497@end quotation
6498
6499@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
6500read either fixed point or scientific format, and either upper or lower case
6501@samp{e} for the exponent in scientific format.
6502
6503C99 style hex float format (@code{printf %a}, @pxref{Formatted Output
6504Strings}) is always accepted for @code{mpf_t}, but for the standard float
6505types it will depend on the C library.
6506
6507@samp{x} and @samp{X} are identical, both accept both upper and lower case
6508hexadecimal.
6509
6510@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
6511values.  For the standard C types these are described as ``unsigned''
6512conversions, but that merely affects certain overflow handling, negatives are
6513still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of
6514Integers, libc, The GNU C Library Reference Manual}).  For GMP types there are
6515no overflows, so @samp{d} and @samp{u} are identical.
6516
6517@samp{Q} type reads the numerator and (optional) denominator as given.  If the
6518value might not be in canonical form then @code{mpq_canonicalize} must be
6519called before using it in any calculations (@pxref{Rational Number
6520Functions}).
6521
6522@samp{Qi} will read a base specification separately for the numerator and
6523denominator.  For example @samp{0x10/11} would be 16/11, whereas
6524@samp{0x10/0x11} would be 16/17.
6525
6526@samp{n} can be used with any of the types above, even the GMP types.
6527@samp{*} to suppress assignment is allowed, though in that case it would do
6528nothing at all.
6529
6530Other conversions or types that might be accepted by the C library
6531@code{scanf} cannot be used through @code{gmp_scanf}.
6532
6533Whitespace is read and discarded before a field, except for @samp{c} and
6534@samp{[} conversions.
6535
6536For float conversions, the decimal point character (or string) expected is
6537taken from the current locale settings on systems which provide
6538@code{localeconv} (@pxref{Locales,, Locales and Internationalization, libc,
6539The GNU C Library Reference Manual}).  The C library will normally do the same
6540for standard float input.
6541
6542The format string is only interpreted as plain @code{char}s, multibyte
6543characters are not recognised.  Perhaps this will change in the future.
6544
6545
6546@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
6547@section Formatted Input Functions
6548@cindex Input functions
6549
6550Each of the following functions is similar to the corresponding C library
6551function.  The plain @code{scanf} forms take a variable argument list.  The
6552@code{vscanf} forms take an argument pointer, see @ref{Variadic Functions,,
6553Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
6554va_start}.
6555
6556It should be emphasised that if a format string is invalid, or the arguments
6557don't match what the format specifies, then the behaviour of any of these
6558functions will be unpredictable.  GCC format string checking is not available,
6559since it doesn't recognise the GMP extensions.
6560
6561No overlap is permitted between the @var{fmt} string and any of the results
6562produced.
6563
6564@deftypefun int gmp_scanf (const char *@var{fmt}, @dots{})
6565@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
6566Read from the standard input @code{stdin}.
6567@end deftypefun
6568
6569@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
6570@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
6571Read from the stream @var{fp}.
6572@end deftypefun
6573
6574@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, @dots{})
6575@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
6576Read from a null-terminated string @var{s}.
6577@end deftypefun
6578
6579The return value from each of these functions is the same as the standard C99
6580@code{scanf}, namely the number of fields successfully parsed and stored.
6581@samp{%n} fields and fields read but suppressed by @samp{*} don't count
6582towards the return value.
6583
6584If end of input (or a file error) is reached before a character for a field or
6585a literal, and if no previous non-suppressed fields have matched, then the
6586return value is @code{EOF} instead of 0.  A whitespace character in the format
6587string is only an optional match and doesn't induce an @code{EOF} in this
6588fashion.  Leading whitespace read and discarded for a field don't count as
6589characters for that field.
6590
6591For the GMP types, input parsing follows C99 rules, namely one character of
6592lookahead is used and characters are read while they continue to meet the
6593format requirements.  If this doesn't provide a complete number then the
6594function terminates, with that field not stored nor counted towards the return
6595value.  For instance with @code{mpf_t} an input @samp{1.23e-XYZ} would be read
6596up to the @samp{X} and that character pushed back since it's not a digit.  The
6597string @samp{1.23e-} would then be considered invalid since an @samp{e} must
6598be followed by at least one digit.
6599
6600For the standard C types, in the current implementation GMP calls the C
6601library @code{scanf} functions, which might have looser rules about what
6602constitutes a valid input.
6603
6604Note that @code{gmp_sscanf} is the same as @code{gmp_fscanf} and only does one
6605character of lookahead when parsing.  Although clearly it could look at its
6606entire input, it is deliberately made identical to @code{gmp_fscanf}, the same
6607way C99 @code{sscanf} is the same as @code{fscanf}.
6608
6609
6610@node C++ Formatted Input,  , Formatted Input Functions, Formatted Input
6611@section C++ Formatted Input
6612@cindex C++ @code{istream} input
6613@cindex @code{istream} input
6614
6615The following functions are provided in @file{libgmpxx} (@pxref{Headers and
6616Libraries}), which is built only if C++ support is enabled (@pxref{Build
6617Options}).  Prototypes are available from @code{<gmp.h>}.
6618
6619@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
6620Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
6621@end deftypefun
6622
6623@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
6624An integer like @samp{123} will be read, or a fraction like @samp{5/9}.  No
6625whitespace is allowed around the @samp{/}.  If the fraction is not in
6626canonical form then @code{mpq_canonicalize} must be called (@pxref{Rational
6627Number Functions}) before operating on it.
6628
6629As per integer input, an @samp{0} or @samp{0x} base indicator is read when
6630none of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} are set.  This is
6631done separately for numerator and denominator, so that for instance
6632@samp{0x10/11} is @math{16/11} and @samp{0x10/0x11} is @math{16/17}.
6633@end deftypefun
6634
6635@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
6636Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
6637
6638Hex or octal floats are not supported, but might be in the future, or perhaps
6639it's best to accept only what the standard float @code{operator>>} does.
6640@end deftypefun
6641
6642Note that digit grouping specified by the @code{istream} locale is currently
6643not accepted.  Perhaps this will change in the future.
6644
6645@sp 1
6646These operators mean that GMP types can be read in the usual C++ way, for
6647example,
6648
6649@example
6650mpz_t  z;
6651...
6652cin >> z;
6653@end example
6654
6655But note that @code{istream} input (and @code{ostream} output, @pxref{C++
6656Formatted Output}) is the only overloading available for the GMP types and
6657that for instance using @code{+} with an @code{mpz_t} will have unpredictable
6658results.  For classes with overloading, see @ref{C++ Class Interface}.
6659
6660
6661
6662@node C++ Class Interface, Custom Allocation, Formatted Input, Top
6663@chapter C++ Class Interface
6664@cindex C++ interface
6665
6666This chapter describes the C++ class based interface to GMP.
6667
6668All GMP C language types and functions can be used in C++ programs, since
6669@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
6670overloaded functions and operators which may be more convenient.
6671
6672Due to the implementation of this interface, a reasonably recent C++ compiler
6673is required, one supporting namespaces, partial specialization of templates
6674and member templates.
6675
6676@strong{Everything described in this chapter is to be considered preliminary
6677and might be subject to incompatible changes if some unforeseen difficulty
6678reveals itself.}
6679
6680@menu
6681* C++ Interface General::
6682* C++ Interface Integers::
6683* C++ Interface Rationals::
6684* C++ Interface Floats::
6685* C++ Interface Random Numbers::
6686* C++ Interface Limitations::
6687@end menu
6688
6689
6690@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
6691@section C++ Interface General
6692
6693@noindent
6694All the C++ classes and functions are available with
6695
6696@cindex @code{gmpxx.h}
6697@example
6698#include <gmpxx.h>
6699@end example
6700
6701Programs should be linked with the @file{libgmpxx} and @file{libgmp}
6702libraries.  For example,
6703
6704@example
6705g++ mycxxprog.cc -lgmpxx -lgmp
6706@end example
6707
6708@noindent
6709The classes defined are
6710
6711@deftp Class mpz_class
6712@deftpx Class mpq_class
6713@deftpx Class mpf_class
6714@end deftp
6715
6716The standard operators and various standard functions are overloaded to allow
6717arithmetic with these classes.  For example,
6718
6719@example
6720int
6721main (void)
6722@{
6723  mpz_class a, b, c;
6724
6725  a = 1234;
6726  b = "-5678";
6727  c = a+b;
6728  cout << "sum is " << c << "\n";
6729  cout << "absolute value is " << abs(c) << "\n";
6730
6731  return 0;
6732@}
6733@end example
6734
6735An important feature of the implementation is that an expression like
6736@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
6737without using a temporary for the @code{b+c} part.  Expressions which by their
6738nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
6739though.
6740
6741The classes can be freely intermixed in expressions, as can the classes and
6742the standard types @code{long}, @code{unsigned long} and @code{double}.
6743Smaller types like @code{int} or @code{float} can also be intermixed, since
6744C++ will promote them.
6745
6746Note that @code{bool} is not accepted directly, but must be explicitly cast to
6747an @code{int} first.  This is because C++ will automatically convert any
6748pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
6749sorts of invalid class and pointer combinations compile but almost certainly
6750not do anything sensible.
6751
6752Conversions back from the classes to standard C++ types aren't done
6753automatically, instead member functions like @code{get_si} are provided (see
6754the following sections for details).
6755
6756Also there are no automatic conversions from the classes to the corresponding
6757GMP C types, instead a reference to the underlying C object can be obtained
6758with the following functions,
6759
6760@deftypefun mpz_t mpz_class::get_mpz_t ()
6761@deftypefunx mpq_t mpq_class::get_mpq_t ()
6762@deftypefunx mpf_t mpf_class::get_mpf_t ()
6763@end deftypefun
6764
6765These can be used to call a C function which doesn't have a C++ class
6766interface.  For example to set @code{a} to the GCD of @code{b} and @code{c},
6767
6768@example
6769mpz_class a, b, c;
6770...
6771mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
6772@end example
6773
6774In the other direction, a class can be initialized from the corresponding GMP
6775C type, or assigned to if an explicit constructor is used.  In both cases this
6776makes a copy of the value, it doesn't create any sort of association.  For
6777example,
6778
6779@example
6780mpz_t z;
6781// ... init and calculate z ...
6782mpz_class x(z);
6783mpz_class y;
6784y = mpz_class (z);
6785@end example
6786
6787There are no namespace setups in @file{gmpxx.h}, all types and functions are
6788simply put into the global namespace.  This is what @file{gmp.h} has done in
6789the past, and continues to do for compatibility.  The extras provided by
6790@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
6791anything.
6792
6793
6794@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
6795@section C++ Interface Integers
6796
6797@deftypefun {} mpz_class::mpz_class (type @var{n})
6798Construct an @code{mpz_class}.  All the standard C++ types may be used, except
6799@code{long long} and @code{long double}, and all the GMP C++ classes can be
6800used, although conversions from @code{mpq_class} and @code{mpf_class} are
6801@code{explicit}.  Any necessary conversion follows the corresponding C
6802function, for example @code{double} follows @code{mpz_set_d}
6803(@pxref{Assigning Integers}).
6804@end deftypefun
6805
6806@deftypefun explicit mpz_class::mpz_class (const mpz_t @var{z})
6807Construct an @code{mpz_class} from an @code{mpz_t}.  The value in @var{z} is
6808copied into the new @code{mpz_class}, there won't be any permanent association
6809between it and @var{z}.
6810@end deftypefun
6811
6812@deftypefun explicit mpz_class::mpz_class (const char *@var{s}, int @var{base} = 0)
6813@deftypefunx explicit mpz_class::mpz_class (const string& @var{s}, int @var{base} = 0)
6814Construct an @code{mpz_class} converted from a string using @code{mpz_set_str}
6815(@pxref{Assigning Integers}).
6816
6817If the string is not a valid integer, an @code{std::invalid_argument}
6818exception is thrown.  The same applies to @code{operator=}.
6819@end deftypefun
6820
6821@deftypefun mpz_class operator"" _mpz (const char *@var{str})
6822With C++11 compilers, integers can be constructed with the syntax
6823@code{123_mpz} which is equivalent to @code{mpz_class("123")}.
6824@end deftypefun
6825
6826@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
6827@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
6828Divisions involving @code{mpz_class} round towards zero, as per the
6829@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
6830This is the same as the C99 @code{/} and @code{%} operators.
6831
6832The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called
6833directly if desired.  For example,
6834
6835@example
6836mpz_class q, a, d;
6837...
6838mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
6839@end example
6840@end deftypefun
6841
6842@deftypefun mpz_class abs (mpz_class @var{op})
6843@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
6844@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
6845@maybepagebreak
6846@deftypefunx bool mpz_class::fits_sint_p (void)
6847@deftypefunx bool mpz_class::fits_slong_p (void)
6848@deftypefunx bool mpz_class::fits_sshort_p (void)
6849@maybepagebreak
6850@deftypefunx bool mpz_class::fits_uint_p (void)
6851@deftypefunx bool mpz_class::fits_ulong_p (void)
6852@deftypefunx bool mpz_class::fits_ushort_p (void)
6853@maybepagebreak
6854@deftypefunx double mpz_class::get_d (void)
6855@deftypefunx long mpz_class::get_si (void)
6856@deftypefunx string mpz_class::get_str (int @var{base} = 10)
6857@deftypefunx {unsigned long} mpz_class::get_ui (void)
6858@maybepagebreak
6859@deftypefunx int mpz_class::set_str (const char *@var{str}, int @var{base})
6860@deftypefunx int mpz_class::set_str (const string& @var{str}, int @var{base})
6861@deftypefunx int sgn (mpz_class @var{op})
6862@deftypefunx mpz_class sqrt (mpz_class @var{op})
6863@maybepagebreak
6864@deftypefunx mpz_class gcd (mpz_class @var{op1}, mpz_class @var{op2})
6865@deftypefunx mpz_class lcm (mpz_class @var{op1}, mpz_class @var{op2})
6866@maybepagebreak
6867@deftypefunx void mpz_class::swap (mpz_class& @var{op})
6868@deftypefunx void swap (mpz_class& @var{op1}, mpz_class& @var{op2})
6869These functions provide a C++ class interface to the corresponding GMP C
6870routines.
6871
6872@code{cmp} can be used with any of the classes or the standard C++ types,
6873except @code{long long} and @code{long double}.
6874@end deftypefun
6875
6876@sp 1
6877Overloaded operators for combinations of @code{mpz_class} and @code{double}
6878are provided for completeness, but it should be noted that if the given
6879@code{double} is not an integer then the way any rounding is done is currently
6880unspecified.  The rounding might take place at the start, in the middle, or at
6881the end of the operation, and it might change in the future.
6882
6883Conversions between @code{mpz_class} and @code{double}, however, are defined
6884to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
6885And comparisons are always made exactly, as per @code{mpz_cmp_d}.
6886
6887
6888@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
6889@section C++ Interface Rationals
6890
6891In all the following constructors, if a fraction is given then it should be in
6892canonical form, or if not then @code{mpq_class::canonicalize} called.
6893
6894@deftypefun {} mpq_class::mpq_class (type @var{op})
6895@deftypefunx {} mpq_class::mpq_class (integer @var{num}, integer @var{den})
6896Construct an @code{mpq_class}.  The initial value can be a single value of any
6897type (conversion from @code{mpf_class} is @code{explicit}), or a pair of
6898integers (@code{mpz_class} or standard C++ integer types) representing a
6899fraction, except that @code{long long} and @code{long double} are not
6900supported.  For example,
6901
6902@example
6903mpq_class q (99);
6904mpq_class q (1.75);
6905mpq_class q (1, 3);
6906@end example
6907@end deftypefun
6908
6909@deftypefun explicit mpq_class::mpq_class (const mpq_t @var{q})
6910Construct an @code{mpq_class} from an @code{mpq_t}.  The value in @var{q} is
6911copied into the new @code{mpq_class}, there won't be any permanent association
6912between it and @var{q}.
6913@end deftypefun
6914
6915@deftypefun explicit mpq_class::mpq_class (const char *@var{s}, int @var{base} = 0)
6916@deftypefunx explicit mpq_class::mpq_class (const string& @var{s}, int @var{base} = 0)
6917Construct an @code{mpq_class} converted from a string using @code{mpq_set_str}
6918(@pxref{Initializing Rationals}).
6919
6920If the string is not a valid rational, an @code{std::invalid_argument}
6921exception is thrown.  The same applies to @code{operator=}.
6922@end deftypefun
6923
6924@deftypefun mpq_class operator"" _mpq (const char *@var{str})
6925With C++11 compilers, integral rationals can be constructed with the syntax
6926@code{123_mpq} which is equivalent to @code{mpq_class(123_mpz)}. Other
6927rationals can be built as @code{-1_mpq/2} or @code{0xb_mpq/123456_mpz}.
6928@end deftypefun
6929
6930@deftypefun void mpq_class::canonicalize ()
6931Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
6932Functions}.  All arithmetic operators require their operands in canonical
6933form, and will return results in canonical form.
6934@end deftypefun
6935
6936@deftypefun mpq_class abs (mpq_class @var{op})
6937@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
6938@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
6939@maybepagebreak
6940@deftypefunx double mpq_class::get_d (void)
6941@deftypefunx string mpq_class::get_str (int @var{base} = 10)
6942@maybepagebreak
6943@deftypefunx int mpq_class::set_str (const char *@var{str}, int @var{base})
6944@deftypefunx int mpq_class::set_str (const string& @var{str}, int @var{base})
6945@deftypefunx int sgn (mpq_class @var{op})
6946@maybepagebreak
6947@deftypefunx void mpq_class::swap (mpq_class& @var{op})
6948@deftypefunx void swap (mpq_class& @var{op1}, mpq_class& @var{op2})
6949These functions provide a C++ class interface to the corresponding GMP C
6950routines.
6951
6952@code{cmp} can be used with any of the classes or the standard C++ types,
6953except @code{long long} and @code{long double}.
6954@end deftypefun
6955
6956@deftypefun {mpz_class&} mpq_class::get_num ()
6957@deftypefunx {mpz_class&} mpq_class::get_den ()
6958Get a reference to an @code{mpz_class} which is the numerator or denominator
6959of an @code{mpq_class}.  This can be used both for read and write access.  If
6960the object returned is modified, it modifies the original @code{mpq_class}.
6961
6962If direct manipulation might produce a non-canonical value, then
6963@code{mpq_class::canonicalize} must be called before further operations.
6964@end deftypefun
6965
6966@deftypefun mpz_t mpq_class::get_num_mpz_t ()
6967@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
6968Get a reference to the underlying @code{mpz_t} numerator or denominator of an
6969@code{mpq_class}.  This can be passed to C functions expecting an
6970@code{mpz_t}.  Any modifications made to the @code{mpz_t} will modify the
6971original @code{mpq_class}.
6972
6973If direct manipulation might produce a non-canonical value, then
6974@code{mpq_class::canonicalize} must be called before further operations.
6975@end deftypefun
6976
6977@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop});
6978Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
6979the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).
6980
6981If the @var{rop} read might not be in canonical form then
6982@code{mpq_class::canonicalize} must be called.
6983@end deftypefun
6984
6985
6986@node C++ Interface Floats, C++ Interface Random Numbers, C++ Interface Rationals, C++ Class Interface
6987@section C++ Interface Floats
6988
6989When an expression requires the use of temporary intermediate @code{mpf_class}
6990values, like @code{f=g*h+x*y}, those temporaries will have the same precision
6991as the destination @code{f}.  Explicit constructors can be used if this
6992doesn't suit.
6993
6994@deftypefun {} mpf_class::mpf_class (type @var{op})
6995@deftypefunx {} mpf_class::mpf_class (type @var{op}, mp_bitcnt_t @var{prec})
6996Construct an @code{mpf_class}.  Any standard C++ type can be used, except
6997@code{long long} and @code{long double}, and any of the GMP C++ classes can be
6998used.
6999
7000If @var{prec} is given, the initial precision is that value, in bits.  If
7001@var{prec} is not given, then the initial precision is determined by the type
7002of @var{op} given.  An @code{mpz_class}, @code{mpq_class}, or C++
7003builtin type will give the default @code{mpf} precision (@pxref{Initializing
7004Floats}).  An @code{mpf_class} or expression will give the precision of that
7005value.  The precision of a binary expression is the higher of the two
7006operands.
7007
7008@example
7009mpf_class f(1.5);        // default precision
7010mpf_class f(1.5, 500);   // 500 bits (at least)
7011mpf_class f(x);          // precision of x
7012mpf_class f(abs(x));     // precision of x
7013mpf_class f(-g, 1000);   // 1000 bits (at least)
7014mpf_class f(x+y);        // greater of precisions of x and y
7015@end example
7016@end deftypefun
7017
7018@deftypefun explicit mpf_class::mpf_class (const mpf_t @var{f})
7019@deftypefunx {} mpf_class::mpf_class (const mpf_t @var{f}, mp_bitcnt_t @var{prec})
7020Construct an @code{mpf_class} from an @code{mpf_t}.  The value in @var{f} is
7021copied into the new @code{mpf_class}, there won't be any permanent association
7022between it and @var{f}.
7023
7024If @var{prec} is given, the initial precision is that value, in bits.  If
7025@var{prec} is not given, then the initial precision is that of @var{f}.
7026@end deftypefun
7027
7028@deftypefun explicit mpf_class::mpf_class (const char *@var{s})
7029@deftypefunx {} mpf_class::mpf_class (const char *@var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
7030@deftypefunx explicit mpf_class::mpf_class (const string& @var{s})
7031@deftypefunx {} mpf_class::mpf_class (const string& @var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
7032Construct an @code{mpf_class} converted from a string using @code{mpf_set_str}
7033(@pxref{Assigning Floats}).  If @var{prec} is given, the initial precision is
7034that value, in bits.  If not, the default @code{mpf} precision
7035(@pxref{Initializing Floats}) is used.
7036
7037If the string is not a valid float, an @code{std::invalid_argument} exception
7038is thrown.  The same applies to @code{operator=}.
7039@end deftypefun
7040
7041@deftypefun mpf_class operator"" _mpf (const char *@var{str})
7042With C++11 compilers, floats can be constructed with the syntax
7043@code{1.23e-1_mpf} which is equivalent to @code{mpf_class("1.23e-1")}.
7044@end deftypefun
7045
7046@deftypefun {mpf_class&} mpf_class::operator= (type @var{op})
7047Convert and store the given @var{op} value to an @code{mpf_class} object.  The
7048same types are accepted as for the constructors above.
7049
7050Note that @code{operator=} only stores a new value, it doesn't copy or change
7051the precision of the destination, instead the value is truncated if necessary.
7052This is the same as @code{mpf_set} etc.  Note in particular this means for
7053@code{mpf_class} a copy constructor is not the same as a default constructor
7054plus assignment.
7055
7056@example
7057mpf_class x (y);   // x created with precision of y
7058
7059mpf_class x;       // x created with default precision
7060x = y;             // value truncated to that precision
7061@end example
7062
7063Applications using templated code may need to be careful about the assumptions
7064the code makes in this area, when working with @code{mpf_class} values of
7065various different or non-default precisions.  For instance implementations of
7066the standard @code{complex} template have been seen in both styles above,
7067though of course @code{complex} is normally only actually specified for use
7068with the builtin float types.
7069@end deftypefun
7070
7071@deftypefun mpf_class abs (mpf_class @var{op})
7072@deftypefunx mpf_class ceil (mpf_class @var{op})
7073@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2})
7074@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2})
7075@maybepagebreak
7076@deftypefunx bool mpf_class::fits_sint_p (void)
7077@deftypefunx bool mpf_class::fits_slong_p (void)
7078@deftypefunx bool mpf_class::fits_sshort_p (void)
7079@maybepagebreak
7080@deftypefunx bool mpf_class::fits_uint_p (void)
7081@deftypefunx bool mpf_class::fits_ulong_p (void)
7082@deftypefunx bool mpf_class::fits_ushort_p (void)
7083@maybepagebreak
7084@deftypefunx mpf_class floor (mpf_class @var{op})
7085@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2})
7086@maybepagebreak
7087@deftypefunx double mpf_class::get_d (void)
7088@deftypefunx long mpf_class::get_si (void)
7089@deftypefunx string mpf_class::get_str (mp_exp_t& @var{exp}, int @var{base} = 10, size_t @var{digits} = 0)
7090@deftypefunx {unsigned long} mpf_class::get_ui (void)
7091@maybepagebreak
7092@deftypefunx int mpf_class::set_str (const char *@var{str}, int @var{base})
7093@deftypefunx int mpf_class::set_str (const string& @var{str}, int @var{base})
7094@deftypefunx int sgn (mpf_class @var{op})
7095@deftypefunx mpf_class sqrt (mpf_class @var{op})
7096@maybepagebreak
7097@deftypefunx void mpf_class::swap (mpf_class& @var{op})
7098@deftypefunx void swap (mpf_class& @var{op1}, mpf_class& @var{op2})
7099@deftypefunx mpf_class trunc (mpf_class @var{op})
7100These functions provide a C++ class interface to the corresponding GMP C
7101routines.
7102
7103@code{cmp} can be used with any of the classes or the standard C++ types,
7104except @code{long long} and @code{long double}.
7105
7106The accuracy provided by @code{hypot} is not currently guaranteed.
7107@end deftypefun
7108
7109@deftypefun {mp_bitcnt_t} mpf_class::get_prec ()
7110@deftypefunx void mpf_class::set_prec (mp_bitcnt_t @var{prec})
7111@deftypefunx void mpf_class::set_prec_raw (mp_bitcnt_t @var{prec})
7112Get or set the current precision of an @code{mpf_class}.
7113
7114The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
7115Floats}) apply to @code{mpf_class::set_prec_raw}.  Note in particular that the
7116@code{mpf_class} must be restored to it's allocated precision before being
7117destroyed.  This must be done by application code, there's no automatic
7118mechanism for it.
7119@end deftypefun
7120
7121
7122@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface Floats, C++ Class Interface
7123@section C++ Interface Random Numbers
7124
7125@deftp Class gmp_randclass
7126The C++ class interface to the GMP random number functions uses
7127@code{gmp_randclass} to hold an algorithm selection and current state, as per
7128@code{gmp_randstate_t}.
7129@end deftp
7130
7131@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, @dots{}), @dots{})
7132Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
7133function (@pxref{Random State Initialization}).  The arguments expected are
7134the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
7135For example,
7136
7137@example
7138gmp_randclass r1 (gmp_randinit_default);
7139gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
7140gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
7141gmp_randclass r4 (gmp_randinit_mt);
7142@end example
7143
7144@code{gmp_randinit_lc_2exp_size} will fail if the size requested is too big,
7145an @code{std::length_error} exception is thrown in that case.
7146@end deftypefun
7147
7148@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, @dots{})
7149Construct a @code{gmp_randclass} using the same parameters as
7150@code{gmp_randinit} (@pxref{Random State Initialization}).  This function is
7151obsolete and the above @var{randinit} style should be preferred.
7152@end deftypefun
7153
7154@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
7155@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
7156Seed a random number generator.  See @pxref{Random Number Functions}, for how
7157to choose a good seed.
7158@end deftypefun
7159
7160@deftypefun mpz_class gmp_randclass::get_z_bits (mp_bitcnt_t @var{bits})
7161@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
7162Generate a random integer with a specified number of bits.
7163@end deftypefun
7164
7165@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
7166Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
7167@end deftypefun
7168
7169@deftypefun mpf_class gmp_randclass::get_f ()
7170@deftypefunx mpf_class gmp_randclass::get_f (mp_bitcnt_t @var{prec})
7171Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}.  @var{f}
7172will be to @var{prec} bits precision, or if @var{prec} is not given then to
7173the precision of the destination.  For example,
7174
7175@example
7176gmp_randclass  r;
7177...
7178mpf_class  f (0, 512);   // 512 bits precision
7179f = r.get_f();           // random number, 512 bits
7180@end example
7181@end deftypefun
7182
7183
7184
7185@node C++ Interface Limitations,  , C++ Interface Random Numbers, C++ Class Interface
7186@section C++ Interface Limitations
7187
7188@table @asis
7189@item @code{mpq_class} and Templated Reading
7190A generic piece of template code probably won't know that @code{mpq_class}
7191requires a @code{canonicalize} call if inputs read with @code{operator>>}
7192might be non-canonical.  This can lead to incorrect results.
7193
7194@code{operator>>} behaves as it does for reasons of efficiency.  A
7195canonicalize can be quite time consuming on large operands, and is best
7196avoided if it's not necessary.
7197
7198But this potential difficulty reduces the usefulness of @code{mpq_class}.
7199Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
7200the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
7201pressed into service.  Or maybe, at the risk of inconsistency, the
7202@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
7203@code{operator>>} not doing so, for use on those occasions when that's
7204acceptable.  Send feedback or alternate ideas to @email{gmp-bugs@@gmplib.org}.
7205
7206@item Subclassing
7207Subclassing the GMP C++ classes works, but is not currently recommended.
7208
7209Expressions involving subclasses resolve correctly (or seem to), but in normal
7210C++ fashion the subclass doesn't inherit constructors and assignments.
7211There's many of those in the GMP classes, and a good way to reestablish them
7212in a subclass is not yet provided.
7213
7214@item Templated Expressions
7215A subtle difficulty exists when using expressions together with
7216application-defined template functions.  Consider the following, with @code{T}
7217intended to be some numeric type,
7218
7219@example
7220template <class T>
7221T fun (const T &, const T &);
7222@end example
7223
7224@noindent
7225When used with, say, plain @code{mpz_class} variables, it works fine: @code{T}
7226is resolved as @code{mpz_class}.
7227
7228@example
7229mpz_class f(1), g(2);
7230fun (f, g);    // Good
7231@end example
7232
7233@noindent
7234But when one of the arguments is an expression, it doesn't work.
7235
7236@example
7237mpz_class f(1), g(2), h(3);
7238fun (f, g+h);  // Bad
7239@end example
7240
7241This is because @code{g+h} ends up being a certain expression template type
7242internal to @code{gmpxx.h}, which the C++ template resolution rules are unable
7243to automatically convert to @code{mpz_class}.  The workaround is simply to add
7244an explicit cast.
7245
7246@example
7247mpz_class f(1), g(2), h(3);
7248fun (f, mpz_class(g+h));  // Good
7249@end example
7250
7251Similarly, within @code{fun} it may be necessary to cast an expression to type
7252@code{T} when calling a templated @code{fun2}.
7253
7254@example
7255template <class T>
7256void fun (T f, T g)
7257@{
7258  fun2 (f, f+g);     // Bad
7259@}
7260
7261template <class T>
7262void fun (T f, T g)
7263@{
7264  fun2 (f, T(f+g));  // Good
7265@}
7266@end example
7267
7268@item C++11
7269C++11 provides several new ways in which types can be inferred: @code{auto},
7270@code{decltype}, etc. While they can be very convenient, they don't mix well
7271with expression templates. In this example, the addition is performed twice,
7272as if we had defined @code{sum} as a macro.
7273
7274@example
7275mpz_class z = 33;
7276auto sum = z + z;
7277mpz_class prod = sum * sum;
7278@end example
7279
7280This other example may crash, though some compilers might make it look like
7281it is working, because the expression @code{z+z} goes out of scope before it
7282is evaluated.
7283
7284@example
7285mpz_class z = 33;
7286auto sum = z + z + z;
7287mpz_class prod = sum * 2;
7288@end example
7289
7290It is thus strongly recommended to avoid @code{auto} anywhere a GMP C++
7291expression may appear.
7292@end table
7293
7294
7295@node Custom Allocation, Language Bindings, C++ Class Interface, Top
7296@comment  node-name,  next,  previous,  up
7297@chapter Custom Allocation
7298@cindex Custom allocation
7299@cindex Memory allocation
7300@cindex Allocation of memory
7301
7302By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory
7303allocation, and if they fail GMP prints a message to the standard error output
7304and terminates the program.
7305
7306Alternate functions can be specified, to allocate memory in a different way or
7307to have a different error action on running out of memory.
7308
7309@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t))
7310Replace the current allocation functions from the arguments.  If an argument
7311is @code{NULL}, the corresponding default function is used.
7312
7313These functions will be used for all memory allocation done by GMP, apart from
7314temporary space from @code{alloca} if that function is available and GMP is
7315configured to use it (@pxref{Build Options}).
7316
7317@strong{Be sure to call @code{mp_set_memory_functions} only when there are no
7318active GMP objects allocated using the previous memory functions!  Usually
7319that means calling it before any other GMP function.}
7320@end deftypefun
7321
7322The functions supplied should fit the following declarations:
7323
7324@deftypevr Function {void *} allocate_function (size_t @var{alloc_size})
7325Return a pointer to newly allocated space with at least @var{alloc_size}
7326bytes.
7327@end deftypevr
7328
7329@deftypevr Function {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size})
7330Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be
7331@var{new_size} bytes.
7332
7333The block may be moved if necessary or if desired, and in that case the
7334smaller of @var{old_size} and @var{new_size} bytes must be copied to the new
7335location.  The return value is a pointer to the resized block, that being the
7336new location if moved or just @var{ptr} if not.
7337
7338@var{ptr} is never @code{NULL}, it's always a previously allocated block.
7339@var{new_size} may be bigger or smaller than @var{old_size}.
7340@end deftypevr
7341
7342@deftypevr Function void free_function (void *@var{ptr}, size_t @var{size})
7343De-allocate the space pointed to by @var{ptr}.
7344
7345@var{ptr} is never @code{NULL}, it's always a previously allocated block of
7346@var{size} bytes.
7347@end deftypevr
7348
7349A @dfn{byte} here means the unit used by the @code{sizeof} operator.
7350
7351The @var{reallocate_function} parameter @var{old_size} and the
7352@var{free_function} parameter @var{size} are passed for convenience, but of
7353course they can be ignored if not needed by an implementation.  The default
7354functions using @code{malloc} and friends for instance don't use them.
7355
7356No error return is allowed from any of these functions, if they return then
7357they must have performed the specified operation.  In particular note that
7358@var{allocate_function} or @var{reallocate_function} mustn't return
7359@code{NULL}.
7360
7361Getting a different fatal error action is a good use for custom allocation
7362functions, for example giving a graphical dialog rather than the default print
7363to @code{stderr}.  How much is possible when genuinely out of memory is
7364another question though.
7365
7366There's currently no defined way for the allocation functions to recover from
7367an error such as out of memory, they must terminate program execution.  A
7368@code{longjmp} or throwing a C++ exception will have undefined results.  This
7369may change in the future.
7370
7371GMP may use allocated blocks to hold pointers to other allocated blocks.  This
7372will limit the assumptions a conservative garbage collection scheme can make.
7373
7374Since the default GMP allocation uses @code{malloc} and friends, those
7375functions will be linked in even if the first thing a program does is an
7376@code{mp_set_memory_functions}.  It's necessary to change the GMP sources if
7377this is a problem.
7378
7379@sp 1
7380@deftypefun void mp_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (**@var{free_func_ptr}) (void *, size_t))
7381Get the current allocation functions, storing function pointers to the
7382locations given by the arguments.  If an argument is @code{NULL}, that
7383function pointer is not stored.
7384
7385@need 1000
7386For example, to get just the current free function,
7387
7388@example
7389void (*freefunc) (void *, size_t);
7390
7391mp_get_memory_functions (NULL, NULL, &freefunc);
7392@end example
7393@end deftypefun
7394
7395@node Language Bindings, Algorithms, Custom Allocation, Top
7396@chapter Language Bindings
7397@cindex Language bindings
7398@cindex Other languages
7399
7400The following packages and projects offer access to GMP from languages other
7401than C, though perhaps with varying levels of functionality and efficiency.
7402
7403@c  @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces
7404@c  in tex, just to separate the URL from the preceding text a bit.
7405@iftex
7406@macro spaceuref {U}
7407@ @ @uref{\U\}
7408@end macro
7409@end iftex
7410@ifnottex
7411@macro spaceuref {U}
7412@uref{\U\}
7413@end macro
7414@end ifnottex
7415
7416@sp 1
7417@table @asis
7418@item C++
7419@itemize @bullet
7420@item
7421GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward
7422interface, expression templates to eliminate temporaries.
7423@item
7424ALP @spaceuref{https://www-sop.inria.fr/saga/logiciels/ALP/} @* Linear algebra and
7425polynomials using templates.
7426@item
7427Arithmos @spaceuref{http://cant.ua.ac.be/old/arithmos/} @* Rationals
7428with infinities and square roots.
7429@item
7430CLN @spaceuref{http://www.ginac.de/CLN/} @* High level classes for arithmetic.
7431@item
7432Linbox @spaceuref{http://www.linalg.org/} @* Sparse vectors and matrices.
7433@item
7434NTL @spaceuref{http://www.shoup.net/ntl/} @* A C++ number theory library.
7435@end itemize
7436
7437@c @item D
7438@c @itemize @bullet
7439@c @item
7440@c gmp-d @spaceuref{http://home.comcast.net/~benhinkle/gmp-d/}
7441@c @end itemize
7442
7443@item Eiffel
7444@itemize @bullet
7445@item
7446Eiffelroom @spaceuref{http://www.eiffelroom.org/node/442}
7447@end itemize
7448
7449@c @item Fortran
7450@c @itemize @bullet
7451@c @item
7452@c Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary
7453@c precision floats.
7454@c @end itemize
7455
7456@item Haskell
7457@itemize @bullet
7458@item
7459Glasgow Haskell Compiler @spaceuref{https://www.haskell.org/ghc/}
7460@end itemize
7461
7462@item Java
7463@itemize @bullet
7464@item
7465Kaffe @spaceuref{https://github.com/kaffe/kaffe}
7466@end itemize
7467
7468@item Lisp
7469@itemize @bullet
7470@item
7471GNU Common Lisp @spaceuref{https://www.gnu.org/software/gcl/gcl.html}
7472@item
7473Librep @spaceuref{http://librep.sourceforge.net/}
7474@item
7475@c  FIXME: When there's a stable release with gmp support, just refer to it
7476@c  rather than bothering to talk about betas.
7477XEmacs (21.5.18 beta and up) @spaceuref{http://www.xemacs.org} @* Optional
7478big integers, rationals and floats using GMP.
7479@end itemize
7480
7481@item M4
7482@itemize @bullet
7483@item
7484@c  FIXME: When there's a stable release with gmp support, just refer to it
7485@c  rather than bothering to talk about betas.
7486GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu/} @* Optionally provides
7487an arbitrary precision @code{mpeval}.
7488@end itemize
7489
7490@item ML
7491@itemize @bullet
7492@item
7493MLton compiler @spaceuref{http://mlton.org/}
7494@end itemize
7495
7496@item Objective Caml
7497@itemize @bullet
7498@item
7499MLGMP @spaceuref{http://opam.ocamlpro.com/pkg/mlgmp.20120224.html}
7500@item
7501Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using
7502GMP.
7503@end itemize
7504
7505@item Oz
7506@itemize @bullet
7507@item
7508Mozart @spaceuref{http://mozart.github.io/}
7509@end itemize
7510
7511@item Pascal
7512@itemize @bullet
7513@item
7514GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de/} @* GMP unit.
7515@item
7516Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal,
7517optionally using GMP.
7518@end itemize
7519
7520@item Perl
7521@itemize @bullet
7522@item
7523GMP module, see @file{demos/perl} in the GMP sources (@pxref{Demonstration
7524Programs}).
7525@item
7526Math::GMP @spaceuref{http://www.cpan.org/} @* Compatible with Math::BigInt, but
7527not as many functions as the GMP module above.
7528@item
7529Math::BigInt::GMP @spaceuref{http://www.cpan.org/} @* Plug Math::GMP into
7530normal Math::BigInt operations.
7531@end itemize
7532
7533@need 1000
7534@item Pike
7535@itemize @bullet
7536@item
7537mpz module in the standard distribution, @uref{http://pike.ida.liu.se/}
7538@end itemize
7539
7540@need 500
7541@item Prolog
7542@itemize @bullet
7543@item
7544SWI Prolog @spaceuref{http://www.swi-prolog.org/} @*
7545Arbitrary precision floats.
7546@end itemize
7547
7548@item Python
7549@itemize @bullet
7550@item
7551GMPY @uref{https://code.google.com/p/gmpy/}
7552@end itemize
7553
7554@item Ruby
7555@itemize @bullet
7556@item
7557http://rubygems.org/gems/gmp
7558@end itemize
7559
7560@item Scheme
7561@itemize @bullet
7562@item
7563GNU Guile @spaceuref{https://www.gnu.org/software/guile/guile.html}
7564@item
7565RScheme @spaceuref{http://www.rscheme.org/}
7566@item
7567STklos @spaceuref{http://www.stklos.net/}
7568@c
7569@c  For reference, MzScheme uses some of gmp, but (as of version 205) it only
7570@c  has copies of some of the generic C code, and we don't consider that a
7571@c  language binding to gmp.
7572@c
7573@end itemize
7574
7575@item Smalltalk
7576@itemize @bullet
7577@item
7578GNU Smalltalk @spaceuref{http://www.smalltalk.org/versions/GNUSmalltalk.html}
7579@end itemize
7580
7581@item Other
7582@itemize @bullet
7583@item
7584Axiom @uref{https://savannah.nongnu.org/projects/axiom} @* Computer algebra
7585using GCL.
7586@item
7587DrGenius @spaceuref{http://drgenius.seul.org/} @* Geometry system and
7588mathematical programming language.
7589@item
7590GiNaC @spaceuref{http://www.ginac.de/} @* C++ computer algebra using CLN.
7591@item
7592GOO @spaceuref{https://www.eecs.berkeley.edu/~jrb/goo/} @* Dynamic object oriented
7593language.
7594@item
7595Maxima @uref{https://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
7596computer algebra using GCL.
7597@c @item
7598@c Q @spaceuref{http://q-lang.sourceforge.net/} @* Equational programming system.
7599@item
7600Regina @spaceuref{http://regina.sourceforge.net/} @* Topological calculator.
7601@item
7602Yacas @spaceuref{http://yacas.sourceforge.net} @* Yet another computer algebra system.
7603@end itemize
7604
7605@end table
7606
7607
7608@node Algorithms, Internals, Language Bindings, Top
7609@chapter Algorithms
7610@cindex Algorithms
7611
7612This chapter is an introduction to some of the algorithms used for various GMP
7613operations.  The code is likely to be hard to understand without knowing
7614something about the algorithms.
7615
7616Some GMP internals are mentioned, but applications that expect to be
7617compatible with future GMP releases should take care to use only the
7618documented functions.
7619
7620@menu
7621* Multiplication Algorithms::
7622* Division Algorithms::
7623* Greatest Common Divisor Algorithms::
7624* Powering Algorithms::
7625* Root Extraction Algorithms::
7626* Radix Conversion Algorithms::
7627* Other Algorithms::
7628* Assembly Coding::
7629@end menu
7630
7631
7632@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms
7633@section Multiplication
7634@cindex Multiplication algorithms
7635
7636N@cross{}N limb multiplications and squares are done using one of seven
7637algorithms, as the size N increases.
7638
7639@quotation
7640@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
7641@item Algorithm @tab Threshold
7642@item Basecase  @tab (none)
7643@item Karatsuba @tab @code{MUL_TOOM22_THRESHOLD}
7644@item Toom-3    @tab @code{MUL_TOOM33_THRESHOLD}
7645@item Toom-4    @tab @code{MUL_TOOM44_THRESHOLD}
7646@item Toom-6.5  @tab @code{MUL_TOOM6H_THRESHOLD}
7647@item Toom-8.5  @tab @code{MUL_TOOM8H_THRESHOLD}
7648@item FFT       @tab @code{MUL_FFT_THRESHOLD}
7649@end multitable
7650@end quotation
7651
7652Similarly for squaring, with the @code{SQR} thresholds.
7653
7654N@cross{}M multiplications of operands with different sizes above
7655@code{MUL_TOOM22_THRESHOLD} are currently done by special Toom-inspired
7656algorithms or directly with FFT, depending on operand size (@pxref{Unbalanced
7657Multiplication}).
7658
7659@menu
7660* Basecase Multiplication::
7661* Karatsuba Multiplication::
7662* Toom 3-Way Multiplication::
7663* Toom 4-Way Multiplication::
7664* Higher degree Toom'n'half::
7665* FFT Multiplication::
7666* Other Multiplication::
7667* Unbalanced Multiplication::
7668@end menu
7669
7670
7671@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms
7672@subsection Basecase Multiplication
7673
7674Basecase N@cross{}M multiplication is a straightforward rectangular set of
7675cross-products, the same as long multiplication done by hand and for that
7676reason sometimes known as the schoolbook or grammar school method.  This is an
7677@m{O(NM),O(N*M)} algorithm.  See Knuth section 4.3.1 algorithm M
7678(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code.
7679
7680Assembly implementations of @code{mpn_mul_basecase} are essentially the same
7681as the generic C code, but have all the usual assembly tricks and
7682obscurities introduced for speed.
7683
7684A square can be done in roughly half the time of a multiply, by using the fact
7685that the cross products above and below the diagonal are the same.  A triangle
7686of products below the diagonal is formed, doubled (left shift by one bit), and
7687then the products on the diagonal added.  This can be seen in
7688@file{mpn/generic/sqr_basecase.c}.  Again the assembly implementations take
7689essentially the same approach.
7690
7691@tex
7692\def\GMPline#1#2#3#4#5#6{%
7693  \hbox {%
7694    \vrule height 2.5ex depth 1ex
7695           \hbox to 2em {\hfil{#2}\hfil}%
7696    \vrule \hbox to 2em {\hfil{#3}\hfil}%
7697    \vrule \hbox to 2em {\hfil{#4}\hfil}%
7698    \vrule \hbox to 2em {\hfil{#5}\hfil}%
7699    \vrule \hbox to 2em {\hfil{#6}\hfil}%
7700    \vrule}}
7701\GMPdisplay{
7702  \hbox{%
7703    \vbox{%
7704      \hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}%
7705      \hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}%
7706      \hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}%
7707      \hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}%
7708      \hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}%
7709      \hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}%
7710      \vfill}%
7711    \vbox{%
7712      \hbox{%
7713        \hbox to 2em {\hfil u0\hfil}%
7714        \hbox to 2em {\hfil u1\hfil}%
7715        \hbox to 2em {\hfil u2\hfil}%
7716        \hbox to 2em {\hfil u3\hfil}%
7717        \hbox to 2em {\hfil u4\hfil}}%
7718      \vskip 0.7ex
7719      \hrule
7720      \GMPline{u0}{d}{}{}{}{}%
7721      \hrule
7722      \GMPline{u1}{}{d}{}{}{}%
7723      \hrule
7724      \GMPline{u2}{}{}{d}{}{}%
7725      \hrule
7726      \GMPline{u3}{}{}{}{d}{}%
7727      \hrule
7728      \GMPline{u4}{}{}{}{}{d}%
7729      \hrule}}}
7730@end tex
7731@ifnottex
7732@example
7733@group
7734     u0  u1  u2  u3  u4
7735   +---+---+---+---+---+
7736u0 | d |   |   |   |   |
7737   +---+---+---+---+---+
7738u1 |   | d |   |   |   |
7739   +---+---+---+---+---+
7740u2 |   |   | d |   |   |
7741   +---+---+---+---+---+
7742u3 |   |   |   | d |   |
7743   +---+---+---+---+---+
7744u4 |   |   |   |   | d |
7745   +---+---+---+---+---+
7746@end group
7747@end example
7748@end ifnottex
7749
7750In practice squaring isn't a full 2@cross{} faster than multiplying, it's
7751usually around 1.5@cross{}.  Less than 1.5@cross{} probably indicates
7752@code{mpn_sqr_basecase} wants improving on that CPU.
7753
7754On some CPUs @code{mpn_mul_basecase} can be faster than the generic C
7755@code{mpn_sqr_basecase} on some small sizes.  @code{SQR_BASECASE_THRESHOLD} is
7756the size at which to use @code{mpn_sqr_basecase}, this will be zero if that
7757routine should be used always.
7758
7759
7760@node Karatsuba Multiplication, Toom 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms
7761@subsection Karatsuba Multiplication
7762@cindex Karatsuba multiplication
7763
7764The Karatsuba multiplication algorithm is described in Knuth section 4.3.3
7765part A, and various other textbooks.  A brief description is given here.
7766
7767The inputs @math{x} and @math{y} are treated as each split into two parts of
7768equal length (or the most significant part one limb shorter if N is odd).
7769
7770@tex
7771% GMPboxwidth used for all the multiplication pictures
7772\global\newdimen\GMPboxwidth \global\GMPboxwidth=5em
7773% GMPboxdepth and GMPboxheight are also used for the float pictures
7774\global\newdimen\GMPboxdepth  \global\GMPboxdepth=1ex
7775\global\newdimen\GMPboxheight \global\GMPboxheight=2ex
7776\gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth}
7777\def\GMPbox#1#2{%
7778  \vbox {%
7779    \hrule
7780    \hbox to 2\GMPboxwidth{%
7781      \GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}%
7782    \hrule}}
7783\GMPdisplay{%
7784\vbox{%
7785  \hbox to 2\GMPboxwidth {high \hfil low}
7786  \vskip 0.7ex
7787  \GMPbox{x_1}{x_0}
7788  \vskip 0.5ex
7789  \GMPbox{y_1}{y_0}
7790}}
7791@end tex
7792@ifnottex
7793@example
7794@group
7795 high              low
7796+----------+----------+
7797|    x1    |    x0    |
7798+----------+----------+
7799
7800+----------+----------+
7801|    y1    |    y0    |
7802+----------+----------+
7803@end group
7804@end example
7805@end ifnottex
7806
7807Let @math{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is
7808@math{k} limbs (@ms{y,0} the same) then
7809@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
7810With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the
7811following holds,
7812
7813@display
7814@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0,
7815  x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0}
7816@end display
7817
7818This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs,
7819whereas a basecase multiply of N@cross{}N limbs is equivalent to four
7820multiplies of (N/2)@cross{}(N/2).  The factors @math{(b^2+b)} etc represent
7821the positions where the three products must be added.
7822
7823@tex
7824\def\GMPboxA#1#2{%
7825  \vbox{%
7826    \hrule
7827    \hbox{%
7828      \GMPvrule
7829      \hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}%
7830      \vrule
7831      \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
7832      \vrule}
7833    \hrule}}
7834\def\GMPboxB#1#2{%
7835  \hbox{%
7836    \raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}%
7837    \vbox{%
7838      \hrule
7839      \hbox{%
7840        \GMPvrule
7841        \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}%
7842        \vrule}%
7843      \hrule}}}
7844\GMPdisplay{%
7845\vbox{%
7846  \hbox to 4\GMPboxwidth {high \hfil low}
7847  \vskip 0.7ex
7848  \GMPboxA{x_1y_1}{x_0y_0}
7849  \vskip 0.5ex
7850  \GMPboxB{$+$}{x_1y_1}
7851  \vskip 0.5ex
7852  \GMPboxB{$+$}{x_0y_0}
7853  \vskip 0.5ex
7854  \GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)}
7855}}
7856@end tex
7857@ifnottex
7858@example
7859@group
7860 high                              low
7861+--------+--------+ +--------+--------+
7862|      x1*y1      | |      x0*y0      |
7863+--------+--------+ +--------+--------+
7864          +--------+--------+
7865      add |      x1*y1      |
7866          +--------+--------+
7867          +--------+--------+
7868      add |      x0*y0      |
7869          +--------+--------+
7870          +--------+--------+
7871      sub | (x1-x0)*(y1-y0) |
7872          +--------+--------+
7873@end group
7874@end example
7875@end ifnottex
7876
7877The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an
7878absolute value, and the sign used to choose to add or subtract.  Notice the
7879sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1),
7880high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb
7881additions, rather than @m{6k,6*k}, but in GMP extra function call overheads
7882outweigh the saving.
7883
7884Squaring is similar to multiplying, but with @math{x=y} the formula reduces to
7885an equivalent with three squares,
7886
7887@display
7888@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2,
7889   x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2}
7890@end display
7891
7892The final result is accumulated from those three squares the same way as for
7893the three multiplies above.  The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now
7894always positive.
7895
7896A similar formula for both multiplying and squaring can be constructed with a
7897middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}.  But those sums can exceed
7898@math{k} limbs, leading to more carry handling and additions than the form
7899above.
7900
7901Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm,
7902the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies
7903each @math{1/2} the size of the inputs.  This is a big improvement over the
7904basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra
7905additions Karatsuba performs.  @code{MUL_TOOM22_THRESHOLD} can be as little
7906as 10 limbs.  The @code{SQR} threshold is usually about twice the @code{MUL}.
7907
7908The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c,
7909M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN +
7910e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 +
7911{3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}.  The
7912factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the
7913basecase code will increase the threshold since they benefit @math{M(N)} more
7914than @math{K(N)}.  And conversely the @m{3\over2, 3/2} for @math{b} means
7915linear style speedups of @math{b} will increase the threshold since they
7916benefit @math{K(N)} more than @math{M(N)}.  The latter can be seen for
7917instance when adding an optimized @code{mpn_sqr_diagonal} to
7918@code{mpn_sqr_basecase}.  Of course all speedups reduce total time, and in
7919that sense the algorithm thresholds are merely of academic interest.
7920
7921
7922@node Toom 3-Way Multiplication, Toom 4-Way Multiplication, Karatsuba Multiplication, Multiplication Algorithms
7923@subsection Toom 3-Way Multiplication
7924@cindex Toom multiplication
7925
7926The Karatsuba formula is the simplest case of a general approach to splitting
7927inputs that leads to both Toom and FFT algorithms.  A description of
7928Toom can be found in Knuth section 4.3.3, with an example 3-way
7929calculation after Theorem A@.  The 3-way form used in GMP is described here.
7930
7931The operands are each considered split into 3 pieces of equal length (or the
7932most significant part 1 or 2 limbs shorter than the other two).
7933
7934@tex
7935\def\GMPbox#1#2#3{%
7936  \vbox{%
7937    \hrule \vfil
7938    \hbox to 3\GMPboxwidth {%
7939      \GMPvrule
7940      \hfil$#1$\hfil
7941      \vrule
7942      \hfil$#2$\hfil
7943      \vrule
7944      \hfil$#3$\hfil
7945      \vrule}%
7946    \vfil \hrule
7947}}
7948\GMPdisplay{%
7949\vbox{%
7950  \hbox to 3\GMPboxwidth {high \hfil low}
7951  \vskip 0.7ex
7952  \GMPbox{x_2}{x_1}{x_0}
7953  \vskip 0.5ex
7954  \GMPbox{y_2}{y_1}{y_0}
7955  \vskip 0.5ex
7956}}
7957@end tex
7958@ifnottex
7959@example
7960@group
7961 high                         low
7962+----------+----------+----------+
7963|    x2    |    x1    |    x0    |
7964+----------+----------+----------+
7965
7966+----------+----------+----------+
7967|    y2    |    y1    |    y0    |
7968+----------+----------+----------+
7969@end group
7970@end example
7971@end ifnottex
7972
7973@noindent
7974These parts are treated as the coefficients of two polynomials
7975
7976@display
7977@group
7978@m{X(t) = x_2t^2 + x_1t + x_0,
7979   X(t) = x2*t^2 + x1*t + x0}
7980@m{Y(t) = y_2t^2 + y_1t + y_0,
7981   Y(t) = y2*t^2 + y1*t + y0}
7982@end group
7983@end display
7984
7985Let @math{b} equal the power of 2 which is the size of the @ms{x,0}, @ms{x,1},
7986@ms{y,0} and @ms{y,1} pieces, i.e.@: if they're @math{k} limbs each then
7987@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}.
7988With this @math{x=X(b)} and @math{y=Y(b)}.
7989
7990Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients
7991are
7992
7993@display
7994@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0,
7995   W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0}
7996@end display
7997
7998The @m{w_i,w[i]} are going to be determined, and when they are they'll give
7999the final result using @math{w=W(b)}, since
8000@m{xy=X(b)Y(b),x*y=X(b)*Y(b)=W(b)}.  The coefficients will be roughly
8001@math{b^2} each, and the final @math{W(b)} will be an addition like,
8002
8003@tex
8004\def\GMPbox#1#2{%
8005  \moveright #1\GMPboxwidth
8006  \vbox{%
8007    \hrule
8008    \hbox{%
8009      \GMPvrule
8010      \hbox to 2\GMPboxwidth {\hfil$#2$\hfil}%
8011      \vrule}%
8012    \hrule
8013}}
8014\GMPdisplay{%
8015\vbox{%
8016  \hbox to 6\GMPboxwidth {high \hfil low}%
8017  \vskip 0.7ex
8018  \GMPbox{0}{w_4}
8019  \vskip 0.5ex
8020  \GMPbox{1}{w_3}
8021  \vskip 0.5ex
8022  \GMPbox{2}{w_2}
8023  \vskip 0.5ex
8024  \GMPbox{3}{w_1}
8025  \vskip 0.5ex
8026  \GMPbox{4}{w_0}
8027}}
8028@end tex
8029@ifnottex
8030@example
8031@group
8032 high                                        low
8033+-------+-------+
8034|       w4      |
8035+-------+-------+
8036       +--------+-------+
8037       |        w3      |
8038       +--------+-------+
8039               +--------+-------+
8040               |        w2      |
8041               +--------+-------+
8042                       +--------+-------+
8043                       |        w1      |
8044                       +--------+-------+
8045                                +-------+-------+
8046                                |       w0      |
8047                                +-------+-------+
8048@end group
8049@end example
8050@end ifnottex
8051
8052The @m{w_i,w[i]} coefficients could be formed by a simple set of cross
8053products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2},
8054@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all
8055nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely
8056to a basecase multiply.  Instead the following approach is used.
8057
8058@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving
8059values of @math{W(t)} at those points.  In GMP the following points are used,
8060
8061@quotation
8062@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
8063@item Point                 @tab Value
8064@item @math{t=0}            @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately
8065@item @math{t=1}            @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0) * (y2+y1+y0)}
8066@item @math{t=-1}           @tab @m{(x_2-x_1+x_0)(y_2-y_1+y_0),(x2-x1+x0) * (y2-y1+y0)}
8067@item @math{t=2}            @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0) * (4*y2+2*y1+y0)}
8068@item @m{t=\infty,t=inf}    @tab @m{x_2y_2,x2 * y2}, which gives @ms{w,4} immediately
8069@end multitable
8070@end quotation
8071
8072At @math{t=-1} the values can be negative and that's handled using the
8073absolute values and tracking the sign separately.  At @m{t=\infty,t=inf} the
8074value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over t^4}, X(t)*Y(t)/t^4 in
8075the limit as t approaches infinity}, but it's much easier to think of as
8076simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately (much like
8077@m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately).
8078
8079Each of the points substituted into
8080@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination
8081of the @m{w_i,w[i]} coefficients, and the value of those combinations has just
8082been calculated.
8083
8084@tex
8085\GMPdisplay{%
8086$\matrix{%
8087W(0)      & = &       &   &      &   &      &   &      &   & w_0 \cr
8088W(1)      & = &   w_4 & + &  w_3 & + &  w_2 & + &  w_1 & + & w_0 \cr
8089W(-1)     & = &   w_4 & - &  w_3 & + &  w_2 & - &  w_1 & + & w_0 \cr
8090W(2)      & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr
8091W(\infty) & = &   w_4 \cr
8092}$}
8093@end tex
8094@ifnottex
8095@example
8096@group
8097W(0)   =                              w0
8098W(1)   =    w4 +   w3 +   w2 +   w1 + w0
8099W(-1)  =    w4 -   w3 +   w2 -   w1 + w0
8100W(2)   = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0
8101W(inf) =    w4
8102@end group
8103@end example
8104@end ifnottex
8105
8106This is a set of five equations in five unknowns, and some elementary linear
8107algebra quickly isolates each @m{w_i,w[i]}.  This involves adding or
8108subtracting one @math{W(t)} value from another, and a couple of divisions by
8109powers of 2 and one division by 3, the latter using the special
8110@code{mpn_divexact_by3} (@pxref{Exact Division}).
8111
8112The conversion of @math{W(t)} values to the coefficients is interpolation.  A
8113polynomial of degree 4 like @math{W(t)} is uniquely determined by values known
8114at 5 different points.  The points are arbitrary and can be chosen to make the
8115linear equations come out with a convenient set of steps for quickly isolating
8116the @m{w_i,w[i]}.
8117
8118Squaring follows the same procedure as multiplication, but there's only one
8119@math{X(t)} and it's evaluated at the 5 points, and those values squared to
8120give values of @math{W(t)}.  The interpolation is then identical, and in fact
8121the same @code{toom_interpolate_5pts} subroutine is used for both squaring and
8122multiplying.
8123
8124Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being
8125@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the
8126original size each.  This is an improvement over Karatsuba at
8127@math{O(N^@W{1.585})}, though Toom does more work in the evaluation and
8128interpolation and so it only realizes its advantage above a certain size.
8129
8130Near the crossover between Toom-3 and Karatsuba there's generally a range of
8131sizes where the difference between the two is small.
8132@code{MUL_TOOM33_THRESHOLD} is a somewhat arbitrary point in that range and
8133successive runs of the tune program can give different values due to small
8134variations in measuring.  A graph of time versus size for the two shows the
8135effect, see @file{tune/README}.
8136
8137At the fairly small sizes where the Toom-3 thresholds occur it's worth
8138remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be
8139expected to make accurate predictions, due of course to the big influence of
8140all sorts of overheads, and the fact that only a few recursions of each are
8141being performed.  Even at large sizes there's a good chance machine dependent
8142effects like cache architecture will mean actual performance deviates from
8143what might be predicted.
8144
8145The formula given for the Karatsuba algorithm (@pxref{Karatsuba
8146Multiplication}) has an equivalent for Toom-3 involving only five multiplies,
8147but this would be complicated and unenlightening.
8148
8149An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using
8150a vector to represent the @math{x} and @math{y} splits and a matrix
8151multiplication for the evaluation and interpolation stages.  The matrix
8152inverses are not meant to be actually used, and they have elements with values
8153much greater than in fact arise in the interpolation steps.  The diagram shown
8154for the 3-way is attractive, but again doesn't have to be implemented that way
8155and for example with a bit of rearrangement just one division by 6 can be
8156done.
8157
8158
8159@node Toom 4-Way Multiplication, Higher degree Toom'n'half, Toom 3-Way Multiplication, Multiplication Algorithms
8160@subsection Toom 4-Way Multiplication
8161@cindex Toom multiplication
8162
8163Karatsuba and Toom-3 split the operands into 2 and 3 coefficients,
8164respectively.  Toom-4 analogously splits the operands into 4 coefficients.
8165Using the notation from the section on Toom-3 multiplication, we form two
8166polynomials:
8167
8168@display
8169@group
8170@m{X(t) = x_3t^3 + x_2t^2 + x_1t + x_0,
8171   X(t) = x3*t^3 + x2*t^2 + x1*t + x0}
8172@m{Y(t) = y_3t^3 + y_2t^2 + y_1t + y_0,
8173   Y(t) = y3*t^3 + y2*t^2 + y1*t + y0}
8174@end group
8175@end display
8176
8177@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 7 points, giving
8178values of @math{W(t)} at those points.  In GMP the following points are used,
8179
8180@quotation
8181@multitable {@m{t=-1/2,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
8182@item Point              @tab Value
8183@item @math{t=0}         @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately
8184@item @math{t=1/2}       @tab @m{(x_3+2x_2+4x_1+8x_0)(y_3+2y_2+4y_1+8y_0),(x3+2*x2+4*x1+8*x0) * (y3+2*y2+4*y1+8*y0)}
8185@item @math{t=-1/2}      @tab @m{(-x_3+2x_2-4x_1+8x_0)(-y_3+2y_2-4y_1+8y_0),(-x3+2*x2-4*x1+8*x0) * (-y3+2*y2-4*y1+8*y0)}
8186@item @math{t=1}         @tab @m{(x_3+x_2+x_1+x_0)(y_3+y_2+y_1+y_0),(x3+x2+x1+x0) * (y3+y2+y1+y0)}
8187@item @math{t=-1}        @tab @m{(-x_3+x_2-x_1+x_0)(-y_3+y_2-y_1+y_0),(-x3+x2-x1+x0) * (-y3+y2-y1+y0)}
8188@item @math{t=2}         @tab @m{(8x_3+4x_2+2x_1+x_0)(8y_3+4y_2+2y_1+y_0),(8*x3+4*x2+2*x1+x0) * (8*y3+4*y2+2*y1+y0)}
8189@item @m{t=\infty,t=inf} @tab @m{x_3y_3,x3 * y3}, which gives @ms{w,6} immediately
8190@end multitable
8191@end quotation
8192
8193The number of additions and subtractions for Toom-4 is much larger than for Toom-3.
8194But several subexpressions occur multiple times, for example @m{x_2+x_0,x2+x0}, occurs
8195for both @math{t=1} and @math{t=-1}.
8196
8197Toom-4 is asymptotically @math{O(N^@W{1.404})}, the exponent being
8198@m{\log7/\log4,log(7)/log(4)}, representing 7 recursive multiplies of 1/4 the
8199original size each.
8200
8201
8202@node Higher degree Toom'n'half, FFT Multiplication, Toom 4-Way Multiplication, Multiplication Algorithms
8203@subsection Higher degree Toom'n'half
8204@cindex Toom multiplication
8205
8206The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
8207@pxref{Toom 4-Way Multiplication}) generalizes to split into an arbitrary
8208number of pieces. In general a split of two equally long operands into
8209@math{r} pieces leads to evaluations and pointwise multiplications done at
8210@m{2r-1,2*r-1} points. To fully exploit symmetries it would be better to have
8211a multiple of 4 points, that's why for higher degree Toom'n'half is used.
8212
8213Toom'n'half means that the existence of one more piece is considered for a
8214single operand. It can be virtual, i.e. zero, or real, when the two operand
8215are not exactly balanced. By choosing an even @math{r},
8216Toom-@m{r{1\over2},r+1/2} requires @math{2r} points, a multiple of four.
8217
8218The four-plets of points include 0, @m{\infty,inf}, +1, -1 and
8219@m{\pm2^i,+-2^i}, @m{\pm2^{-i},+-2^-i} . Each of them giving shortcuts for the
8220evaluation phase and for some steps in the interpolation phase. Further tricks
8221are used to reduce the memory footprint of the whole multiplication algorithm
8222to a memory buffer equanl in size to the result of the product.
8223
8224Current GMP uses both Toom-6'n'half and Toom-8'n'half.
8225
8226
8227@node FFT Multiplication, Other Multiplication, Higher degree Toom'n'half, Multiplication Algorithms
8228@subsection FFT Multiplication
8229@cindex FFT multiplication
8230@cindex Fast Fourier Transform
8231
8232At large to very large sizes a Fermat style FFT multiplication is used,
8233following Sch@"onhage and Strassen (@pxref{References}).  Descriptions of FFTs
8234in various forms can be found in many textbooks, for instance Knuth section
82354.3.3 part C or Lipson chapter IX@.  A brief description of the form used in
8236GMP is given here.
8237
8238The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given
8239@math{N}.  A full product @m{xy,x*y} is obtained by choosing @m{N \ge
8240\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding
8241@math{x} and @math{y} with high zero limbs.  The modular product is the native
8242form for the algorithm, so padding to get a full product is unavoidable.
8243
8244The algorithm follows a split, evaluate, pointwise multiply, interpolate and
8245combine similar to that described above for Karatsuba and Toom-3.  A @math{k}
8246parameter controls the split, with an FFT-@math{k} splitting into @math{2^k}
8247pieces of @math{M=N/2^k} bits each.  @math{N} must be a multiple of
8248@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so
8249the split falls on limb boundaries, avoiding bit shifts in the split and
8250combine stages.
8251
8252The evaluations, pointwise multiplications, and interpolation, are all done
8253modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a
8254multiple of @math{2^k} and of @code{mp_bits_per_limb}.  The results of
8255interpolation will be the following negacyclic convolution of the input
8256pieces, and the choice of @math{N'} ensures these sums aren't truncated.
8257@tex
8258$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$
8259@end tex
8260@ifnottex
8261
8262@example
8263           ---
8264           \         b
8265w[n] =     /     (-1) * x[i] * y[j]
8266           ---
8267       i+j==b*2^k+n
8268          b=0,1
8269@end example
8270
8271@end ifnottex
8272The points used for the evaluation are @math{g^i} for @math{i=0} to
8273@math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}.  @math{g} is a
8274@m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary
8275cancellations at the interpolation stage, and it's also a power of 2 so the
8276fast Fourier transforms used for the evaluation and interpolation do only
8277shifts, adds and negations.
8278
8279The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either
8280recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or
8281basecase), whichever is optimal at the size @math{N'}.  The interpolation is
8282an inverse fast Fourier transform.  The resulting set of sums of @m{x_iy_j,
8283x[i]*y[j]} are added at appropriate offsets to give the final result.
8284
8285Squaring is the same, but @math{x} is the only input so it's one transform at
8286the evaluate stage and the pointwise multiplies are squares.  The
8287interpolation is the same.
8288
8289For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}),
8290O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed
8291modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original.
8292Each successive @math{k} is an asymptotic improvement, but overheads mean each
8293is only faster at bigger and bigger sizes.  In the code, @code{MUL_FFT_TABLE}
8294and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used.  Each
8295new @math{k} effectively swaps some multiplying for some shifts, adds and
8296overheads.
8297
8298A mod @math{2^N+1} product can be formed with a normal
8299@math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT
8300and Toom-3 etc can be compared directly.  A @math{k=4} FFT at
8301@math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at
8302@math{O(N^@W{1.465})}.  In practice this is what's found, with
8303@code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between
8304300 and 1000 limbs, depending on the CPU@.  So far it's been found that only
8305very large FFTs recurse into pointwise multiplies above these sizes.
8306
8307When an FFT is to give a full product, the change of @math{N} to @math{2N}
8308doesn't alter the theoretical complexity for a given @math{k}, but for the
8309purposes of considering where an FFT might be first used it can be assumed
8310that the FFT is recursing into a normal multiply and that on that basis it's
8311doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of
8312the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}.  This would mean
8313@math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3.
8314In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been
8315found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs.
8316
8317The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is
8318rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that
8319when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a
8320multiple of @m{2^{2k-1},2^(2k-1)} bits.  The @math{+k+3} means some values of
8321@math{N} just under such a multiple will be rounded to the next.  The
8322complexity calculations above assume that a favourable size is used, meaning
8323one which isn't padded through rounding, and it's also assumed that the extra
8324@math{+k+3} bits are negligible at typical FFT sizes.
8325
8326The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a
8327step-effect into measured speeds.  For example @math{k=8} will round @math{N}
8328up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb
8329groups of sizes for which @code{mpn_mul_n} runs at the same speed.  Or for
8330@math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc.  In
8331practice it's been found each @math{k} is used at quite small multiples of its
8332size constraint and so the step effect is quite noticeable in a time versus
8333size graph.
8334
8335The threshold determinations currently measure at the mid-points of size
8336steps, but this is sub-optimal since at the start of a new step it can happen
8337that it's better to go back to the previous @math{k} for a while.  Something
8338more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be
8339needed.
8340
8341
8342@node Other Multiplication, Unbalanced Multiplication, FFT Multiplication, Multiplication Algorithms
8343@subsection Other Multiplication
8344@cindex Toom multiplication
8345
8346The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
8347@pxref{Toom 4-Way Multiplication}) generalizes to split into an arbitrary
8348number of pieces, as per Knuth section 4.3.3 algorithm C@.  This is not
8349currently used.  The notes here are merely for interest.
8350
8351In general a split into @math{r+1} pieces is made, and evaluations and
8352pointwise multiplications done at @m{2r+1,2*r+1} points.  A 4-way split does 7
8353pointwise multiplies, 5-way does 9, etc.  Asymptotically an @math{(r+1)}-way
8354algorithm is @m{O(N^{log(2r+1)/log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}.  Only
8355the pointwise multiplications count towards big-@math{O} complexity, but the
8356time spent in the evaluate and interpolate stages grows with @math{r} and has
8357a significant practical impact, with the asymptotic advantage of each @math{r}
8358realized only at bigger and bigger sizes.  The overheads grow as
8359@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log
8360r), O(N*log(r))}.
8361
8362Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
8363uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small
8364multiplies in the evaluate stage (or rather trades them for additions), and
8365has a further saving of nearly half the interpolate steps.  The idea is to
8366separate odd and even final coefficients and then perform algorithm C steps C7
8367and C8 on them separately.  The divisors at step C7 become @math{j^2} and the
8368multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.
8369
8370Splitting odd and even parts through positive and negative points can be
8371thought of as using @math{-1} as a square root of unity.  If a 4th root of
8372unity was available then a further split and speedup would be possible, but no
8373such root exists for plain integers.  Going to complex integers with
8374@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian
8375form it takes three real multiplies to do a complex multiply.  The existence
8376of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
8377Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.
8378
8379Floating point FFTs use complex numbers approximating Nth roots of unity.
8380Some processors have special support for such FFTs.  But these are not used in
8381GMP since it's very difficult to guarantee an exact result (to some number of
8382bits).  An occasional difference of 1 in the last bit might not matter to a
8383typical signal processing algorithm, but is of course of vital importance to
8384GMP.
8385
8386
8387@node Unbalanced Multiplication,  , Other Multiplication, Multiplication Algorithms
8388@subsection Unbalanced Multiplication
8389@cindex Unbalanced multiplication
8390
8391Multiplication of operands with different sizes, both below
8392@code{MUL_TOOM22_THRESHOLD} are done with plain schoolbook multiplication
8393(@pxref{Basecase Multiplication}).
8394
8395For really large operands, we invoke FFT directly.
8396
8397For operands between these sizes, we use Toom inspired algorithms suggested by
8398Alberto Zanoni and Marco Bodrato.  The idea is to split the operands into
8399polynomials of different degree.  GMP currently splits the smaller operand
8400onto 2 coefficients, i.e., a polynomial of degree 1, but the larger operand
8401can be split into 2, 3, or 4 coefficients, i.e., a polynomial of degree 1 to
84023.
8403
8404@c FIXME: This is mighty ugly, but a cleaner @need triggers texinfo bugs that
8405@c screws up layout here and there in the rest of the manual.
8406@c @tex
8407@c \goodbreak
8408@c @end tex
8409@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
8410@section Division Algorithms
8411@cindex Division algorithms
8412
8413@menu
8414* Single Limb Division::
8415* Basecase Division::
8416* Divide and Conquer Division::
8417* Block-Wise Barrett Division::
8418* Exact Division::
8419* Exact Remainder::
8420* Small Quotient Division::
8421@end menu
8422
8423
8424@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
8425@subsection Single Limb Division
8426
8427N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
8428high to low, either with a hardware divide instruction or a multiplication by
8429inverse, whichever is best on a given CPU.
8430
8431The multiply by inverse follows ``Improved division by invariant integers'' by
8432M@"oller and Granlund (@pxref{References}) and is implemented as
8433@code{udiv_qrnnd_preinv} in @file{gmp-impl.h}.  The idea is to have a
8434fixed-point approximation to @math{1/d} (see @code{invert_limb}) and then
8435multiply by the high limb (plus one bit) of the dividend to get a quotient
8436@math{q}.  With @math{d} normalized (high bit set), @math{q} is no more than 1
8437too small.  Subtracting @m{qd,q*d} from the dividend gives a remainder, and
8438reveals whether @math{q} or @math{q-1} is correct.
8439
8440The result is a division done with two multiplications and four or five
8441arithmetic operations.  On CPUs with low latency multipliers this can be much
8442faster than a hardware divide, though the cost of calculating the inverse at
8443the start may mean it's only better on inputs bigger than say 4 or 5 limbs.
8444
8445When a divisor must be normalized, either for the generic C
8446@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is
8447actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and
8448@math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set.
8449The bit shifts for the dividend are usually accomplished ``on the fly''
8450meaning by extracting the appropriate bits at each step.  Done this way the
8451quotient limbs come out aligned ready to store.  When only the remainder is
8452wanted, an alternative is to take the dividend limbs unshifted and calculate
8453@m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k
8454\bmod d2^k, r*2^k mod d*2^k}.  This can help on CPUs with poor bit shifts or
8455few registers.
8456
8457The multiply by inverse can be done two limbs at a time.  The calculation is
8458basically the same, but the inverse is two limbs and the divisor treated as if
8459padded with a low zero limb.  This means more work, since the inverse will
8460need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are
8461independent and can therefore be done partly or wholly in parallel.  Likewise
8462for a 2@cross{}1 calculating @m{qd,q*d}.  The net effect is to process two
8463limbs with roughly the same two multiplies worth of latency that one limb at a
8464time gives.  This extends to 3 or 4 limbs at a time, though the extra work to
8465apply the inverse will almost certainly soon reach the limits of multiplier
8466throughput.
8467
8468A similar approach in reverse can be taken to process just half a limb at a
8469time if the divisor is only a half limb.  In this case the 1@cross{}1 multiply
8470for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each
8471limb, which can be a saving on CPUs with a fast half limb multiply, or in fact
8472if the only multiply is a half limb, and especially if it's not pipelined.
8473
8474
8475@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms
8476@subsection Basecase Division
8477
8478Basecase N@cross{}M division is like long division done by hand, but in base
8479@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}.  See Knuth
8480section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}.
8481
8482Briefly stated, while the dividend remains larger than the divisor, a high
8483quotient limb is formed and the N@cross{}1 product @m{qd,q*d} subtracted at
8484the top end of the dividend.  With a normalized divisor (most significant bit
8485set), each quotient limb can be formed with a 2@cross{}1 division and a
84861@cross{}1 multiplication plus some subtractions.  The 2@cross{}1 division is
8487by the high limb of the divisor and is done either with a hardware divide or a
8488multiply by inverse (the same as in @ref{Single Limb Division}) whichever is
8489faster.  Such a quotient is sometimes one too big, requiring an addback of the
8490divisor, but that happens rarely.
8491
8492With Q=N@minus{}M being the number of quotient limbs, this is an
8493@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase
8494Q@cross{}M multiplication, differing in fact only in the extra multiply and
8495divide for each of the Q quotient limbs.
8496
8497
8498@node Divide and Conquer Division, Block-Wise Barrett Division, Basecase Division, Division Algorithms
8499@subsection Divide and Conquer Division
8500
8501For divisors larger than @code{DC_DIV_QR_THRESHOLD}, division is done by dividing.
8502Or to be precise by a recursive divide and conquer algorithm based on work by
8503Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}).
8504
8505The algorithm consists essentially of recognising that a 2N@cross{}N division
8506can be done with the basecase division algorithm (@pxref{Basecase Division}),
8507but using N/2 limbs as a base, not just a single limb.  This way the
8508multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of
8509Karatsuba and higher multiplication algorithms (@pxref{Multiplication
8510Algorithms}).  The two ``digits'' of the quotient are formed by recursive
8511N@cross{}(N/2) divisions.
8512
8513If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication
8514then the work is about the same as a basecase division, but with more function
8515call overheads and with some subtractions separated from the multiplies.
8516These overheads mean that it's only when N/2 is above
8517@code{MUL_TOOM22_THRESHOLD} that divide and conquer is of use.
8518
8519@code{DC_DIV_QR_THRESHOLD} is based on the divisor size N, so it will be somewhere
8520above twice @code{MUL_TOOM22_THRESHOLD}, but how much above depends on the
8521CPU@.  An optimized @code{mpn_mul_basecase} can lower @code{DC_DIV_QR_THRESHOLD} a
8522little by offering a ready-made advantage over repeated @code{mpn_submul_1}
8523calls.
8524
8525Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where
8526@math{M(N)} is the time for an N@cross{}N multiplication done with FFTs.  The
8527actual time is a sum over multiplications of the recursed sizes, as can be
8528seen near the end of section 2.2 of Burnikel and Ziegler.  For example, within
8529the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}.  With higher
8530algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log
8531N, log(N)}.  In practice, at moderate to large sizes, a 2N@cross{}N division
8532is about 2 to 4 times slower than an N@cross{}N multiplication.
8533
8534
8535@node Block-Wise Barrett Division, Exact Division, Divide and Conquer Division, Division Algorithms
8536@subsection Block-Wise Barrett Division
8537
8538For the largest divisions, a block-wise Barrett division algorithm is used.
8539Here, the divisor is inverted to a precision determined by the relative size of
8540the dividend and divisor.  Blocks of quotient limbs are then generated by
8541multiplying blocks from the dividend by the inverse.
8542
8543Our block-wise algorithm computes a smaller inverse than in the plain Barrett
8544algorithm.  For a @math{2n/n} division, the inverse will be just @m{\lceil n/2
8545\rceil, ceil(n/2)} limbs.
8546
8547
8548@node Exact Division, Exact Remainder, Block-Wise Barrett Division, Division Algorithms
8549@subsection Exact Division
8550
8551
8552A so-called exact division is when the dividend is known to be an exact
8553multiple of the divisor.  Jebelean's exact division algorithm uses this
8554knowledge to make some significant optimizations (@pxref{References}).
8555
8556The idea can be illustrated in decimal for example with 368154 divided by
8557543.  Because the low digit of the dividend is 4, the low digit of the
8558quotient must be 8.  This is arrived at from @m{4 \mathord{\times} 7 \bmod 10,
85594*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of
8560the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7
8561@equiv{} 1 mod 10}.  So @m{8\mathord{\times}543 = 4344,8*543=4344} can be
8562subtracted from the dividend leaving 363810.  Notice the low digit has become
8563zero.
8564
8565The procedure is repeated at the second digit, with the next quotient digit 7
8566(@m{1 \mathord{\times} 7 \bmod 10, 7 @equiv{} 1*7 mod 10}), subtracting
8567@m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800.  And finally at
8568the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7
8569mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0.
8570So the quotient is 678.
8571
8572Notice however that the multiplies and subtractions don't need to extend past
8573the low three digits of the dividend, since that's enough to determine the
8574three quotient digits.  For the last quotient digit no subtraction is needed
8575at all.  On a 2N@cross{}N division like this one, only about half the work of
8576a normal basecase division is necessary.
8577
8578For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
8579saving over a normal basecase division is in two parts.  Firstly, each of the
8580Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
8581multiply.  Secondly, the crossproducts are reduced when @math{Q>M} to
8582@m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2,
8583Q*(Q-1)/2}.  Notice the savings are complementary.  If Q is big then many
8584divisions are saved, or if Q is small then the crossproducts reduce to a small
8585number.
8586
8587The modular inverse used is calculated efficiently by @code{binvert_limb} in
8588@file{gmp-impl.h}.  This does four multiplies for a 32-bit limb, or six for a
858964-bit limb.  @file{tune/modlinv.c} has some alternate implementations that
8590might suit processors better at bit twiddling than multiplying.
8591
8592The sub-quadratic exact division described by Jebelean in ``Exact Division
8593with Karatsuba Complexity'' is not currently implemented.  It uses a
8594rearrangement similar to the divide and conquer for normal division
8595(@pxref{Divide and Conquer Division}), but operating from low to high.  A
8596further possibility not currently implemented is ``Bidirectional Exact Integer
8597Division'' by Krandick and Jebelean which forms quotient limbs from both the
8598high and low ends of the dividend, and can halve once more the number of
8599crossproducts needed in a 2N@cross{}N division.
8600
8601A special case exact division by 3 exists in @code{mpn_divexact_by3},
8602supporting Toom-3 multiplication and @code{mpq} canonicalizations.  It forms
8603quotient digits with a multiply by the modular inverse of 3 (which is
8604@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next
8605limb.  The multiplications don't need to be on the dependent chain, as long as
8606the effect of the borrows is applied, which can help chips with pipelined
8607multipliers.
8608
8609
8610@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms
8611@subsection Exact Remainder
8612@cindex Exact remainder
8613
8614If the exact division algorithm is done with a full subtraction at each stage
8615and the dividend isn't a multiple of the divisor, then low zero limbs are
8616produced but with a remainder in the high limbs.  For dividend @math{a},
8617divisor @math{d}, quotient @math{q}, and @m{b = 2
8618\GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this remainder
8619@math{r} is of the form
8620@tex
8621$$ a = qd + r b^n $$
8622@end tex
8623@ifnottex
8624
8625@example
8626a = q*d + r*b^n
8627@end example
8628
8629@end ifnottex
8630@math{n} represents the number of zero limbs produced by the subtractions,
8631that being the number of limbs produced for @math{q}.  @math{r} will be in the
8632range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by
8633a factor of @math{b^n}.
8634
8635Carrying out full subtractions at each stage means the same number of cross
8636products must be done as a normal division, but there's still some single limb
8637divisions saved.  When @math{d} is a single limb some simplifications arise,
8638providing good speedups on a number of processors.
8639
8640The functions @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the
8641internal @code{mpn_redc_X} functions differ subtly in how they return @math{r},
8642leading to some negations in the above formula, but all are essentially the
8643same.
8644
8645@cindex Divisibility algorithm
8646@cindex Congruence algorithm
8647Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this
8648leads to divisibility or congruence tests which are potentially more efficient
8649than a normal division.
8650
8651The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is
8652odd, hence the use of @code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and
8653@code{mpz_kronecker_ui} etc (@pxref{Greatest Common Divisor Algorithms}).
8654
8655Montgomery's REDC method for modular multiplications uses operands of the form
8656of @m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n})
8657(yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in the exact
8658remainder to reach a product in the same form @m{(xy)b^{-n}, (x*y)*b^-n}
8659(@pxref{Modular Powering Algorithm}).
8660
8661Notice that @math{r} generally gives no useful information about the ordinary
8662remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything.  If
8663however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the
8664ordinary remainder.  This occurs whenever @math{d} is a factor of
8665@math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}.  For a 32 or
866664 bit limb other such factors include 5, 17 and 257, but no particular use
8667has been found for this.
8668
8669
8670@node Small Quotient Division,  , Exact Remainder, Division Algorithms
8671@subsection Small Quotient Division
8672
8673An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is
8674small can be optimized somewhat.
8675
8676An ordinary basecase division normalizes the divisor by shifting it to make
8677the high bit set, shifting the dividend accordingly, and shifting the
8678remainder back down at the end of the calculation.  This is wasteful if only a
8679few quotient limbs are to be formed.  Instead a division of just the top
8680@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be
8681used to form a trial quotient.  This requires only those limbs normalized, not
8682the whole of the divisor and dividend.
8683
8684A multiply and subtract then applies the trial quotient to the M@minus{}Q
8685unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q
8686limbs remaining from the trial quotient division).  The starting trial
8687quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1
8688too big are detected by first comparing the most significant limbs that will
8689arise from the subtraction.  An addback is done if the quotient still turns
8690out to be 1 too big.
8691
8692This whole procedure is essentially the same as one step of the basecase
8693algorithm done in a Q limb base, though with the trial quotient test done only
8694with the high limbs, not an entire Q limb ``digit'' product.  The correctness
8695of this weaker test can be established by following the argument of Knuth
8696section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r
8697+ u_2, v2*q>b*r+u2} condition appropriately relaxed.
8698
8699
8700@need 1000
8701@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
8702@section Greatest Common Divisor
8703@cindex Greatest common divisor algorithms
8704@cindex GCD algorithms
8705
8706@menu
8707* Binary GCD::
8708* Lehmer's Algorithm::
8709* Subquadratic GCD::
8710* Extended GCD::
8711* Jacobi Symbol::
8712@end menu
8713
8714
8715@node Binary GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
8716@subsection Binary GCD
8717
8718At small sizes GMP uses an @math{O(N^2)} binary style GCD@.  This is described
8719in many textbooks, for example Knuth section 4.5.2 algorithm B@.  It simply
8720consists of successively reducing odd operands @math{a} and @math{b} using
8721
8722@quotation
8723@math{a,b = @abs{}(a-b),@min{}(a,b)} @*
8724strip factors of 2 from @math{a}
8725@end quotation
8726
8727The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly
8728computes the quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and replaces
8729@math{a,b} by @math{v, u - q v}. The binary algorithm has so far been found to
8730be faster than the Euclidean algorithm everywhere.  One reason the binary
8731method does well is that the implied quotient at each step is usually small,
8732so often only one or two subtractions are needed to get the same effect as a
8733division.  Quotients 1, 2 and 3 for example occur 67.7% of the time, see Knuth
8734section 4.5.3 Theorem E.
8735
8736When the implied quotient is large, meaning @math{b} is much smaller than
8737@math{a}, then a division is worthwhile.  This is the basis for the initial
8738@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
8739for both N@cross{}1 and 1@cross{}1 cases).  But after that initial reduction,
8740big quotients occur too rarely to make it worth checking for them.
8741
8742@sp 1
8743The final @math{1@cross{}1} GCD in @code{mpn_gcd_1} is done in the generic C
8744code as described above.  For two N-bit operands, the algorithm takes about
87450.68 iterations per bit.  For optimum performance some attention needs to be
8746paid to the way the factors of 2 are stripped from @math{a}.
8747
8748Firstly it may be noted that in twos complement the number of low zero bits on
8749@math{a-b} is the same as @math{b-a}, so counting or testing can begin on
8750@math{a-b} without waiting for @math{@abs{}(a-b)} to be determined.
8751
8752A loop stripping low zero bits tends not to branch predict well, since the
8753condition is data dependent.  But on average there's only a few low zeros, so
8754an option is to strip one or two bits arithmetically then loop for more (as
8755done for AMD K6).  Or use a lookup table to get a count for several bits then
8756loop for more (as done for AMD K7).  An alternative approach is to keep just
8757one of @math{a} or @math{b} odd and iterate
8758
8759@quotation
8760@math{a,b = @abs{}(a-b), @min{}(a,b)} @*
8761@math{a = a/2} if even @*
8762@math{b = b/2} if even
8763@end quotation
8764
8765This requires about 1.25 iterations per bit, but stripping of a single bit at
8766each step avoids any branching.  Repeating the bit strip reduces to about 0.9
8767iterations per bit, which may be a worthwhile tradeoff.
8768
8769Generally with the above approaches a speed of perhaps 6 cycles per bit can be
8770achieved, which is still not terribly fast with for instance a 64-bit GCD
8771taking nearly 400 cycles.  It's this sort of time which means it's not usually
8772advantageous to combine a set of divisibility tests into a GCD.
8773
8774Currently, the binary algorithm is used for GCD only when @math{N < 3}.
8775
8776@node Lehmer's Algorithm, Subquadratic GCD, Binary GCD, Greatest Common Divisor Algorithms
8777@comment  node-name,  next,  previous,  up
8778@subsection Lehmer's algorithm
8779
8780Lehmer's improvement of the Euclidean algorithms is based on the observation
8781that the initial part of the quotient sequence depends only on the most
8782significant parts of the inputs. The variant of Lehmer's algorithm used in GMP
8783splits off the most significant two limbs, as suggested, e.g., in ``A
8784Double-Digit Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The
8785quotients of two double-limb inputs are collected as a 2 by 2 matrix with
8786single-limb elements. This is done by the function @code{mpn_hgcd2}. The
8787resulting matrix is applied to the inputs using @code{mpn_mul_1} and
8788@code{mpn_submul_1}. Each iteration usually reduces the inputs by almost one
8789limb. In the rare case of a large quotient, no progress can be made by
8790examining just the most significant two limbs, and the quotient is computed
8791using plain division.
8792
8793The resulting algorithm is asymptotically @math{O(N^2)}, just as the Euclidean
8794algorithm and the binary algorithm. The quadratic part of the work are
8795the calls to @code{mpn_mul_1} and @code{mpn_submul_1}. For small sizes, the
8796linear work is also significant. There are roughly @math{N} calls to the
8797@code{mpn_hgcd2} function. This function uses a couple of important
8798optimizations:
8799
8800@itemize
8801@item
8802It uses the same relaxed notion of correctness as @code{mpn_hgcd} (see next
8803section). This means that when called with the most significant two limbs of
8804two large numbers, the returned matrix does not always correspond exactly to
8805the initial quotient sequence for the two large numbers; the final quotient
8806may sometimes be one off.
8807
8808@item
8809It takes advantage of the fact the quotients are usually small. The division
8810operator is not used, since the corresponding assembler instruction is very
8811slow on most architectures. (This code could probably be improved further, it
8812uses many branches that are unfriendly to prediction).
8813
8814@item
8815It switches from double-limb calculations to single-limb calculations half-way
8816through, when the input numbers have been reduced in size from two limbs to
8817one and a half.
8818
8819@end itemize
8820
8821@node Subquadratic GCD, Extended GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms
8822@subsection Subquadratic GCD
8823
8824For inputs larger than @code{GCD_DC_THRESHOLD}, GCD is computed via the HGCD
8825(Half GCD) function, as a generalization to Lehmer's algorithm.
8826
8827Let the inputs @math{a,b} be of size @math{N} limbs each. Put @m{S=\lfloor N/2
8828\rfloor + 1, S = floor(N/2) + 1}. Then HGCD(a,b) returns a transformation
8829matrix @math{T} with non-negative elements, and reduced numbers @math{(c;d) =
8830T^{-1} (a;b)}. The reduced numbers @math{c,d} must be larger than @math{S}
8831limbs, while their difference @math{abs(c-d)} must fit in @math{S} limbs. The
8832matrix elements will also be of size roughly @math{N/2}.
8833
8834The HGCD base case uses Lehmer's algorithm, but with the above stop condition
8835that returns reduced numbers and the corresponding transformation matrix
8836half-way through. For inputs larger than @code{HGCD_THRESHOLD}, HGCD is
8837computed recursively, using the divide and conquer algorithm in ``On
8838Sch@"onhage's algorithm and subquadratic integer GCD computation'' by M@"oller
8839(@pxref{References}). The recursive algorithm consists of these main
8840steps.
8841
8842@itemize
8843
8844@item
8845Call HGCD recursively, on the most significant @math{N/2} limbs. Apply the
8846resulting matrix @math{T_1} to the full numbers, reducing them to a size just
8847above @math{3N/2}.
8848
8849@item
8850Perform a small number of division or subtraction steps to reduce the numbers
8851to size below @math{3N/2}. This is essential mainly for the unlikely case of
8852large quotients.
8853
8854@item
8855Call HGCD recursively, on the most significant @math{N/2} limbs of the reduced
8856numbers. Apply the resulting matrix @math{T_2} to the full numbers, reducing
8857them to a size just above @math{N/2}.
8858
8859@item
8860Compute @math{T = T_1 T_2}.
8861
8862@item
8863Perform a small number of division and subtraction steps to satisfy the
8864requirements, and return.
8865@end itemize
8866
8867GCD is then implemented as a loop around HGCD, similarly to Lehmer's
8868algorithm. Where Lehmer repeatedly chops off the top two limbs, calls
8869@code{mpn_hgcd2}, and applies the resulting matrix to the full numbers, the
8870sub-quadratic GCD chops off the most significant third of the limbs (the
8871proportion is a tuning parameter, and @math{1/3} seems to be more efficient
8872than, e.g, @math{1/2}), calls @code{mpn_hgcd}, and applies the resulting
8873matrix. Once the input numbers are reduced to size below
8874@code{GCD_DC_THRESHOLD}, Lehmer's algorithm is used for the rest of the work.
8875
8876The asymptotic running time of both HGCD and GCD is @m{O(M(N)\log N),O(M(N)*log(N))},
8877where @math{M(N)} is the time for multiplying two @math{N}-limb numbers.
8878
8879@comment  node-name,  next,  previous,  up
8880
8881@node Extended GCD, Jacobi Symbol, Subquadratic GCD, Greatest Common Divisor Algorithms
8882@subsection Extended GCD
8883
8884The extended GCD function, or GCDEXT, calculates @math{@gcd{}(a,b)} and also
8885cofactors @math{x} and @math{y} satisfying @m{ax+by=\gcd(a@C{}b),
8886a*x+b*y=gcd(a@C{}b)}. All the algorithms used for plain GCD are extended to
8887handle this case. The binary algorithm is used only for single-limb GCDEXT.
8888Lehmer's algorithm is used for sizes up to @code{GCDEXT_DC_THRESHOLD}. Above
8889this threshold, GCDEXT is implemented as a loop around HGCD, but with more
8890book-keeping to keep track of the cofactors. This gives the same asymptotic
8891running time as for GCD and HGCD, @m{O(M(N)\log N),O(M(N)*log(N))}
8892
8893One difference to plain GCD is that while the inputs @math{a} and @math{b} are
8894reduced as the algorithm proceeds, the cofactors @math{x} and @math{y} grow in
8895size. This makes the tuning of the chopping-point more difficult. The current
8896code chops off the most significant half of the inputs for the call to HGCD in
8897the first iteration, and the most significant two thirds for the remaining
8898calls. This strategy could surely be improved. Also the stop condition for the
8899loop, where Lehmer's algorithm is invoked once the inputs are reduced below
8900@code{GCDEXT_DC_THRESHOLD}, could maybe be improved by taking into account the
8901current size of the cofactors.
8902
8903@node Jacobi Symbol,  , Extended GCD, Greatest Common Divisor Algorithms
8904@subsection Jacobi Symbol
8905@cindex Jacobi symbol algorithm
8906
8907[This section is obsolete.  The current Jacobi code actually uses a very
8908efficient algorithm.]
8909
8910@code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a
8911simple binary algorithm similar to that described for the GCDs (@pxref{Binary
8912GCD}).  They're not very fast when both inputs are large.  Lehmer's multi-step
8913improvement or a binary based multi-step algorithm is likely to be better.
8914
8915When one operand fits a single limb, and that includes @code{mpz_kronecker_ui}
8916and friends, an initial reduction is done with either @code{mpn_mod_1} or
8917@code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb.
8918The binary algorithm is well suited to a single limb, and the whole
8919calculation in this case is quite efficient.
8920
8921In all the routines sign changes for the result are accumulated using some bit
8922twiddling, avoiding table lookups or conditional jumps.
8923
8924
8925@need 1000
8926@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
8927@section Powering Algorithms
8928@cindex Powering algorithms
8929
8930@menu
8931* Normal Powering Algorithm::
8932* Modular Powering Algorithm::
8933@end menu
8934
8935
8936@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
8937@subsection Normal Powering
8938
8939Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
8940successively squaring and then multiplying by the base when a 1 bit is seen in
8941the exponent, as per Knuth section 4.6.3.  The ``left to right''
8942variant described there is used rather than algorithm A, since it's just as
8943easy and can be done with somewhat less temporary memory.
8944
8945
8946@node Modular Powering Algorithm,  , Normal Powering Algorithm, Powering Algorithms
8947@subsection Modular Powering
8948
8949Modular powering is implemented using a @math{2^k}-ary sliding window
8950algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85
8951(@pxref{References}).  @math{k} is chosen according to the size of the
8952exponent.  Larger exponents use larger values of @math{k}, the choice being
8953made to minimize the average number of multiplications that must supplement
8954the squaring.
8955
8956The modular multiplies and squarings use either a simple division or the REDC
8957method by Montgomery (@pxref{References}).  REDC is a little faster,
8958essentially saving N single limb divisions in a fashion similar to an exact
8959remainder (@pxref{Exact Remainder}).
8960
8961
8962@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
8963@section Root Extraction Algorithms
8964@cindex Root extraction algorithms
8965
8966@menu
8967* Square Root Algorithm::
8968* Nth Root Algorithm::
8969* Perfect Square Algorithm::
8970* Perfect Power Algorithm::
8971@end menu
8972
8973
8974@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
8975@subsection Square Root
8976@cindex Square root algorithm
8977@cindex Karatsuba square root algorithm
8978
8979Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
8980Zimmermann (@pxref{References}).
8981
8982An input @math{n} is split into four parts of @math{k} bits each, so with
8983@math{b=2^k} we have @m{n = a_3b^3 + a_2b^2 + a_1b + a_0, n = a3*b^3 + a2*b^2
8984+ a1*b + a0}.  Part @ms{a,3} must be ``normalized'' so that either the high or
8985second highest bit is set.  In GMP, @math{k} is kept on a limb boundary and
8986the input is left shifted (by an even number of bits) to normalize.
8987
8988The square root of the high two parts is taken, by recursive application of
8989the algorithm (bottoming out in a one-limb Newton's method),
8990@tex
8991$$ s',r' = \mathop{\rm sqrtrem} \> (a_3b + a_2) $$
8992@end tex
8993@ifnottex
8994
8995@example
8996s1,r1 = sqrtrem (a3*b + a2)
8997@end example
8998
8999@end ifnottex
9000This is an approximation to the desired root and is extended by a division to
9001give @math{s},@math{r},
9002@tex
9003$$\eqalign{
9004q,u &= \mathop{\rm divrem} \> (r'b + a_1, 2s') \cr
9005s &= s'b + q \cr
9006r &= ub + a_0 - q^2
9007}$$
9008@end tex
9009@ifnottex
9010
9011@example
9012q,u = divrem (r1*b + a1, 2*s1)
9013s = s1*b + q
9014r = u*b + a0 - q^2
9015@end example
9016
9017@end ifnottex
9018The normalization requirement on @ms{a,3} means at this point @math{s} is
9019either correct or 1 too big.  @math{r} is negative in the latter case, so
9020@tex
9021$$\eqalign{
9022\mathop{\rm if} \; r &< 0 \; \mathop{\rm then} \cr
9023r &\leftarrow r + 2s - 1 \cr
9024s &\leftarrow s - 1
9025}$$
9026@end tex
9027@ifnottex
9028
9029@example
9030if r < 0 then
9031  r = r + 2*s - 1
9032  s = s - 1
9033@end example
9034
9035@end ifnottex
9036The algorithm is expressed in a divide and conquer form, but as noted in the
9037paper it can also be viewed as a discrete variant of Newton's method, or as a
9038variation on the schoolboy method (no longer taught) for square roots two
9039digits at a time.
9040
9041If the remainder @math{r} is not required then usually only a few high limbs
9042of @math{r} and @math{u} need to be calculated to determine whether an
9043adjustment to @math{s} is required.  This optimization is not currently
9044implemented.
9045
9046In the Karatsuba multiplication range this algorithm is @m{O({3\over2}
9047M(N/2)),O(1.5*M(N/2))}, where @math{M(n)} is the time to multiply two numbers
9048of @math{n} limbs.  In the FFT multiplication range this grows to a bound of
9049@m{O(6 M(N/2)),O(6*M(N/2))}.  In practice a factor of about 1.5 to 1.8 is
9050found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.
9051
9052The algorithm does all its calculations in integers and the resulting
9053@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
9054The extended precision given by @code{mpf_sqrt_ui} is obtained by
9055padding with zero limbs.
9056
9057
9058@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
9059@subsection Nth Root
9060@cindex Root extraction algorithm
9061@cindex Nth root algorithm
9062
9063Integer Nth roots are taken using Newton's method with the following
9064iteration, where @math{A} is the input and @math{n} is the root to be taken.
9065@tex
9066$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
9067@end tex
9068@ifnottex
9069
9070@example
9071         1         A
9072a[i+1] = - * ( --------- + (n-1)*a[i] )
9073         n     a[i]^(n-1)
9074@end example
9075
9076@end ifnottex
9077The initial approximation @m{a_1,a[1]} is generated bitwise by successively
9078powering a trial root with or without new 1 bits, aiming to be just above the
9079true root.  The iteration converges quadratically when started from a good
9080approximation.  When @math{n} is large more initial bits are needed to get
9081good convergence.  The current implementation is not particularly well
9082optimized.
9083
9084
9085@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
9086@subsection Perfect Square
9087@cindex Perfect square algorithm
9088
9089A significant fraction of non-squares can be quickly identified by checking
9090whether the input is a quadratic residue modulo small integers.
9091
9092@code{mpz_perfect_square_p} first tests the input mod 256, which means just
9093examining the low byte.  Only 44 different values occur for squares mod 256,
9094so 82.8% of inputs can be immediately identified as non-squares.
9095
9096On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total
909799.25% of inputs identified as non-squares.  On a 64-bit system 97 is tested
9098too, for a total 99.62%.
9099
9100These moduli are chosen because they're factors of @math{2^@W{24}-1} (or
9101@math{2^@W{48}-1} for 64-bits), and such a remainder can be quickly taken just
9102using additions (see @code{mpn_mod_34lsub1}).
9103
9104When nails are in use moduli are instead selected by the @file{gen-psqr.c}
9105program and applied with an @code{mpn_mod_1}.  The same @math{2^@W{24}-1} or
9106@math{2^@W{48}-1} could be done with nails using some extra bit shifts, but
9107this is not currently implemented.
9108
9109In any case each modulus is applied to the @code{mpn_mod_34lsub1} or
9110@code{mpn_mod_1} remainder and a table lookup identifies non-squares.  By
9111using a ``modexact'' style calculation, and suitably permuted tables, just one
9112multiply each is required, see the code for details.  Moduli are also combined
9113to save operations, so long as the lookup tables don't become too big.
9114@file{gen-psqr.c} does all the pre-calculations.
9115
9116A square root must still be taken for any value that passes these tests, to
9117verify it's really a square and not one of the small fraction of non-squares
9118that get through (i.e.@: a pseudo-square to all the tested bases).
9119
9120Clearly more residue tests could be done, @code{mpz_perfect_square_p} only
9121uses a compact and efficient set.  Big inputs would probably benefit from more
9122residue testing, small inputs might be better off with less.  The assumed
9123distribution of squares versus non-squares in the input would affect such
9124considerations.
9125
9126
9127@node Perfect Power Algorithm,  , Perfect Square Algorithm, Root Extraction Algorithms
9128@subsection Perfect Power
9129@cindex Perfect power algorithm
9130
9131Detecting perfect powers is required by some factorization algorithms.
9132Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
9133extractions, though naturally only prime roots need to be considered.
9134(@xref{Nth Root Algorithm}.)
9135
9136If a prime divisor @math{p} with multiplicity @math{e} can be found, then only
9137roots which are divisors of @math{e} need to be considered, much reducing the
9138work necessary.  To this end divisibility by a set of small primes is checked.
9139
9140
9141@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
9142@section Radix Conversion
9143@cindex Radix conversion algorithms
9144
9145Radix conversions are less important than other algorithms.  A program
9146dominated by conversions should probably use a different data representation.
9147
9148@menu
9149* Binary to Radix::
9150* Radix to Binary::
9151@end menu
9152
9153
9154@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
9155@subsection Binary to Radix
9156
9157Conversions from binary to a power-of-2 radix use a simple and fast
9158@math{O(N)} bit extraction algorithm.
9159
9160Conversions from binary to other radices use one of two algorithms.  Sizes
9161below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.
9162Repeated divisions by @math{b^n} are made, where @math{b} is the radix and
9163@math{n} is the biggest power that fits in a limb.  But instead of simply
9164using the remainder @math{r} from such divisions, an extra divide step is done
9165to give a fractional limb representing @math{r/b^n}.  The digits of @math{r}
9166can then be extracted using multiplications by @math{b} rather than divisions.
9167Special case code is provided for decimal, allowing multiplications by 10 to
9168optimize to shifts and adds.
9169
9170Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
9171For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are
9172calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is
9173reached.  @math{t} is then divided by that largest power, giving a quotient
9174which is the digits above that power, and a remainder which is those below.
9175These two parts are in turn divided by the second highest power, and so on
9176recursively.  When a piece has been divided down to less than
9177@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is
9178used.
9179
9180The advantage of this algorithm is that big divisions can make use of the
9181sub-quadratic divide and conquer division (@pxref{Divide and Conquer
9182Division}), and big divisions tend to have less overheads than lots of
9183separate single limb divisions anyway.  But in any case the cost of
9184calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.
9185
9186@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
9187the same basic thing, the point where it becomes worth doing a big division to
9188cut the input in half.  @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
9189of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
9190assumes that's already available, which is the case when recursing.
9191
9192Since the base case produces digits from least to most significant but they
9193want to be stored from most to least, it's necessary to calculate in advance
9194how many digits there will be, or at least be sure not to underestimate that.
9195For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
9196from @code{mp_bases}, rounding up.  The result is either correct or one too
9197big.
9198
9199Examining some of the high bits of the input could increase the chance of
9200getting the exact number of digits, but an exact result every time would not
9201be practical, since in general the difference between numbers 100@dots{} and
920299@dots{} is only in the last few bits and the work to identify 99@dots{}
9203might well be almost as much as a full conversion.
9204
9205The @math{r/b^n} scheme described above for using multiplications to bring out
9206digits might be useful for more than a single limb.  Some brief experiments
9207with it on the base case when recursing didn't give a noticeable improvement,
9208but perhaps that was only due to the implementation.  Something similar would
9209work for the sub-quadratic divisions too, though there would be the cost of
9210calculating a bigger radix power.
9211
9212Another possible improvement for the sub-quadratic part would be to arrange
9213for radix powers that balanced the sizes of quotient and remainder produced,
9214i.e.@: the highest power would be an @m{b^{nk},b^(n*k)} approximately equal to
9215@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor.  That ought to
9216smooth out a graph of times against sizes, but may or may not be a net
9217speedup.
9218
9219
9220@node Radix to Binary,  , Binary to Radix, Radix Conversion Algorithms
9221@subsection Radix to Binary
9222
9223@strong{This section needs to be rewritten, it currently describes the
9224algorithms used before GMP 4.3.}
9225
9226Conversions from a power-of-2 radix into binary use a simple and fast
9227@math{O(N)} bitwise concatenation algorithm.
9228
9229Conversions from other radices use one of two algorithms.  Sizes below
9230@code{SET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.  Groups
9231of @math{n} digits are converted to limbs, where @math{n} is the biggest
9232power of the base @math{b} which will fit in a limb, then those groups are
9233accumulated into the result by multiplying by @math{b^n} and adding.  This
9234saves multi-precision operations, as per Knuth section 4.4 part E
9235(@pxref{References}).  Some special case code is provided for decimal, giving
9236the compiler a chance to optimize multiplications by 10.
9237
9238Above @code{SET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
9239First groups of @math{n} digits are converted into limbs.  Then adjacent
9240limbs are combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x}
9241and @math{y} are the limbs.  Adjacent limb pairs are combined into quads
9242similarly with @m{xb^{2n}+y,x*b^(2n)+y}.  This continues until a single block
9243remains, that being the result.
9244
9245The advantage of this method is that the multiplications for each @math{x} are
9246big blocks, allowing Karatsuba and higher algorithms to be used.  But the cost
9247of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome.
9248@code{SET_STR_PRECOMPUTE_THRESHOLD} usually ends up quite big, around 5000 digits, and on
9249some processors much bigger still.
9250
9251@code{SET_STR_PRECOMPUTE_THRESHOLD} is based on the input digits (and tuned
9252for decimal), though it might be better based on a limb count, so as to be
9253independent of the base.  But that sort of count isn't used by the base case
9254and so would need some sort of initial calculation or estimate.
9255
9256The main reason @code{SET_STR_PRECOMPUTE_THRESHOLD} is so much bigger than the
9257corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is
9258much faster than @code{mpn_divrem_1} (often by a factor of 5, or more).
9259
9260
9261@need 1000
9262@node Other Algorithms, Assembly Coding, Radix Conversion Algorithms, Algorithms
9263@section Other Algorithms
9264
9265@menu
9266* Prime Testing Algorithm::
9267* Factorial Algorithm::
9268* Binomial Coefficients Algorithm::
9269* Fibonacci Numbers Algorithm::
9270* Lucas Numbers Algorithm::
9271* Random Number Algorithms::
9272@end menu
9273
9274
9275@node Prime Testing Algorithm, Factorial Algorithm, Other Algorithms, Other Algorithms
9276@subsection Prime Testing
9277@cindex Prime testing algorithms
9278
9279The primality testing in @code{mpz_probab_prime_p} (@pxref{Number Theoretic
9280Functions}) first does some trial division by small factors and then uses the
9281Miller-Rabin probabilistic primality testing algorithm, as described in Knuth
9282section 4.5.4 algorithm P (@pxref{References}).
9283
9284For an odd input @math{n}, and with @math{n = q@GMPmultiply{}2^k+1} where
9285@math{q} is odd, this algorithm selects a random base @math{x} and tests
9286whether @math{x^q @bmod{} n} is 1 or @math{-1}, or an @m{x^{q2^j} \bmod n,
9287x^(q*2^j) mod n} is @math{1}, for @math{1@le{}j@le{}k}.  If so then @math{n}
9288is probably prime, if not then @math{n} is definitely composite.
9289
9290Any prime @math{n} will pass the test, but some composites do too.  Such
9291composites are known as strong pseudoprimes to base @math{x}.  No @math{n} is
9292a strong pseudoprime to more than @math{1/4} of all bases (see Knuth exercise
929322), hence with @math{x} chosen at random there's no more than a @math{1/4}
9294chance a ``probable prime'' will in fact be composite.
9295
9296In fact strong pseudoprimes are quite rare, making the test much more
9297powerful than this analysis would suggest, but @math{1/4} is all that's proven
9298for an arbitrary @math{n}.
9299
9300
9301@node Factorial Algorithm, Binomial Coefficients Algorithm, Prime Testing Algorithm, Other Algorithms
9302@subsection Factorial
9303@cindex Factorial algorithm
9304
9305Factorials are calculated by a combination of two algorithms. An idea is
9306shared among them: to compute the odd part of the factorial; a final step
9307takes account of the power of @math{2} term, by shifting.
9308
9309For small @math{n}, the odd factor of @math{n!} is computed with the simple
9310observation that it is equal to the product of all positive odd numbers
9311smaller than @math{n} times the odd factor of @m{\lfloor n/2\rfloor!, [n/2]!},
9312where @m{\lfloor x\rfloor, [x]} is the integer part of @math{x}, and so on
9313recursively. The procedure can be best illustrated with an example,
9314
9315@quotation
9316@math{23! = (23.21.19.17.15.13.11.9.7.5.3)(11.9.7.5.3)(5.3)2^{19}}
9317@end quotation
9318
9319Current code collects all the factors in a single list, with a loop and no
9320recursion, and compute the product, with no special care for repeated chunks.
9321
9322When @math{n} is larger, computation pass trough prime sieving. An helper
9323function is used, as suggested by Peter Luschny:
9324@tex
9325$$\mathop{\rm msf}(n) = {n!\over\lfloor n/2\rfloor!^2\cdot2^k} = \prod_{p=3}^{n}
9326p^{\mathop{\rm L}(p,n)} $$
9327@end tex
9328@ifnottex
9329
9330@example
9331                            n
9332                          -----
9333               n!          | |   L(p,n)
9334msf(n) = -------------- =  | |  p
9335          [n/2]!^2.2^k     p=3
9336@end example
9337@end ifnottex
9338
9339Where @math{p} ranges on odd prime numbers. The exponent @math{k} is chosen to
9340obtain an odd integer number: @math{k} is the number of 1 bits in the binary
9341representation of @m{\lfloor n/2\rfloor, [n/2]}. The function L@math{(p,n)}
9342can be defined as zero when @math{p} is composite, and, for any prime
9343@math{p}, it is computed with:
9344@tex
9345$$\mathop{\rm L}(p,n) = \sum_{i>0}\left\lfloor{n\over p^i}\right\rfloor\bmod2
9346\leq\log_p(n)$$
9347@end tex
9348@ifnottex
9349
9350@example
9351          ---
9352           \    n
9353L(p,n) =   /  [---] mod 2   <=  log (n) .
9354          ---  p^i                p
9355          i>0
9356@end example
9357@end ifnottex
9358
9359With this helper function, we are able to compute the odd part of @math{n!}
9360using the recursion implied by @m{n!=\lfloor n/2\rfloor!^2\cdot\mathop{\rm
9361msf}(n)\cdot2^k , n!=[n/2]!^2*msf(n)*2^k}. The recursion stops using the
9362small-@math{n} algorithm on some @m{\lfloor n/2^i\rfloor, [n/2^i]}.
9363
9364Both the above algorithms use binary splitting to compute the product of many
9365small factors. At first as many products as possible are accumulated in a
9366single register, generating a list of factors that fit in a machine word. This
9367list is then split into halves, and the product is computed recursively.
9368
9369Such splitting is more efficient than repeated N@cross{}1 multiplies since it
9370forms big multiplies, allowing Karatsuba and higher algorithms to be used.
9371And even below the Karatsuba threshold a big block of work can be more
9372efficient for the basecase algorithm.
9373
9374
9375@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms
9376@subsection Binomial Coefficients
9377@cindex Binomial coefficient algorithm
9378
9379Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated
9380by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) =
9381\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then
9382evaluating the following product simply from @math{i=2} to @math{i=k}.
9383@tex
9384$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
9385@end tex
9386@ifnottex
9387
9388@example
9389                      k  (n-k+i)
9390C(n,k) =  (n-k+1) * prod -------
9391                     i=2    i
9392@end example
9393
9394@end ifnottex
9395It's easy to show that each denominator @math{i} will divide the product so
9396far, so the exact division algorithm is used (@pxref{Exact Division}).
9397
9398The numerators @math{n-k+i} and denominators @math{i} are first accumulated
9399into as many fit a limb, to save multi-precision operations, though for
9400@code{mpz_bin_ui} this applies only to the divisors, since @math{n} is an
9401@code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all.
9402
9403
9404@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
9405@subsection Fibonacci Numbers
9406@cindex Fibonacci number algorithm
9407
9408The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
9409for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
9410values efficiently.
9411
9412For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is
9413used.  On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
9414up to @m{F_{93},F[93]}.  For convenience the table starts at @m{F_{-1},F[-1]}.
9415
9416Beyond the table, values are generated with a binary powering algorithm,
9417calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
9418low across the bits of @math{n}.  The formulas used are
9419@tex
9420$$\eqalign{
9421  F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
9422  F_{2k-1} &=  F_k^2 + F_{k-1}^2           \cr
9423  F_{2k}   &= F_{2k+1} - F_{2k-1}
9424}$$
9425@end tex
9426@ifnottex
9427
9428@example
9429F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
9430F[2k-1] =   F[k]^2 + F[k-1]^2
9431
9432F[2k] = F[2k+1] - F[2k-1]
9433@end example
9434
9435@end ifnottex
9436At each step, @math{k} is the high @math{b} bits of @math{n}.  If the next bit
9437of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if
9438it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process
9439repeated until all bits of @math{n} are incorporated.  Notice these formulas
9440require just two squares per bit of @math{n}.
9441
9442It'd be possible to handle the first few @math{n} above the single limb table
9443with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} =
9444F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually
9445turns out to be faster for only about 10 or 20 values of @math{n}, and
9446including a block of code for just those doesn't seem worthwhile.  If they
9447really mattered it'd be better to extend the data table.
9448
9449Using a table avoids lots of calculations on small numbers, and makes small
9450@math{n} go fast.  A bigger table would make more small @math{n} go fast, it's
9451just a question of balancing size against desired speed.  For GMP the code is
9452kept compact, with the emphasis primarily on a good powering algorithm.
9453
9454@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but
9455@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}.  In this case the last
9456step of the algorithm can become one multiply instead of two squares.  One of
9457the following two formulas is used, according as @math{n} is odd or even.
9458@tex
9459$$\eqalign{
9460  F_{2k}   &= F_k (F_k + 2F_{k-1}) \cr
9461  F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k
9462}$$
9463@end tex
9464@ifnottex
9465
9466@example
9467F[2k]   = F[k]*(F[k]+2F[k-1])
9468
9469F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k
9470@end example
9471
9472@end ifnottex
9473@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a
9474multiply.  For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above
9475can be applied just to the low limb of the calculation, without a carry or
9476borrow into further limbs, which saves some code size.  See comments with
9477@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done.
9478
9479
9480@node Lucas Numbers Algorithm, Random Number Algorithms, Fibonacci Numbers Algorithm, Other Algorithms
9481@subsection Lucas Numbers
9482@cindex Lucas number algorithm
9483
9484@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci
9485numbers with the following simple formulas.
9486@tex
9487$$\eqalign{
9488  L_k     &=  F_k + 2F_{k-1} \cr
9489  L_{k-1} &= 2F_k -  F_{k-1}
9490}$$
9491@end tex
9492@ifnottex
9493
9494@example
9495L[k]   =   F[k] + 2*F[k-1]
9496L[k-1] = 2*F[k] -   F[k-1]
9497@end example
9498
9499@end ifnottex
9500@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be
9501saved.  Trailing zero bits on @math{n} can be handled with a single square
9502each.
9503@tex
9504$$ L_{2k} = L_k^2 - 2(-1)^k $$
9505@end tex
9506@ifnottex
9507
9508@example
9509L[2k] = L[k]^2 - 2*(-1)^k
9510@end example
9511
9512@end ifnottex
9513And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci
9514numbers, similar to what @code{mpz_fib_ui} does.
9515@tex
9516$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$
9517@end tex
9518@ifnottex
9519
9520@example
9521L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k
9522@end example
9523
9524@end ifnottex
9525
9526
9527@node Random Number Algorithms,  , Lucas Numbers Algorithm, Other Algorithms
9528@subsection Random Numbers
9529@cindex Random number algorithms
9530
9531For the @code{urandomb} functions, random numbers are generated simply by
9532concatenating bits produced by the generator.  As long as the generator has
9533good randomness properties this will produce well-distributed @math{N} bit
9534numbers.
9535
9536For the @code{urandomm} functions, random numbers in a range @math{0@le{}R<N}
9537are generated by taking values @math{R} of @m{\lceil \log_2 N \rceil,
9538ceil(log2(N))} bits each until one satisfies @math{R<N}.  This will normally
9539require only one or two attempts, but the attempts are limited in case the
9540generator is somehow degenerate and produces only 1 bits or similar.
9541
9542@cindex Mersenne twister algorithm
9543The Mersenne Twister generator is by Matsumoto and Nishimura
9544(@pxref{References}).  It has a non-repeating period of @math{2^@W{19937}-1},
9545which is a Mersenne prime, hence the name of the generator.  The state is 624
9546words of 32-bits each, which is iterated with one XOR and shift for each
954732-bit word generated, making the algorithm very fast.  Randomness properties
9548are also very good and this is the default algorithm used by GMP.
9549
9550@cindex Linear congruential algorithm
9551Linear congruential generators are described in many text books, for instance
9552Knuth volume 2 (@pxref{References}).  With a modulus @math{M} and parameters
9553@math{A} and @math{C}, an integer state @math{S} is iterated by the formula
9554@math{S @leftarrow{} A@GMPmultiply{}S+C @bmod{} M}.  At each step the new
9555state is a linear function of the previous, mod @math{M}, hence the name of
9556the generator.
9557
9558In GMP only moduli of the form @math{2^N} are supported, and the current
9559implementation is not as well optimized as it could be.  Overheads are
9560significant when @math{N} is small, and when @math{N} is large clearly the
9561multiply at each step will become slow.  This is not a big concern, since the
9562Mersenne Twister generator is better in every respect and is therefore
9563recommended for all normal applications.
9564
9565For both generators the current state can be deduced by observing enough
9566output and applying some linear algebra (over GF(2) in the case of the
9567Mersenne Twister).  This generally means raw output is unsuitable for
9568cryptographic applications without further hashing or the like.
9569
9570
9571@node Assembly Coding,  , Other Algorithms, Algorithms
9572@section Assembly Coding
9573@cindex Assembly coding
9574
9575The assembly subroutines in GMP are the most significant source of speed at
9576small to moderate sizes.  At larger sizes algorithm selection becomes more
9577important, but of course speedups in low level routines will still speed up
9578everything proportionally.
9579
9580Carry handling and widening multiplies that are important for GMP can't be
9581easily expressed in C@.  GCC @code{asm} blocks help a lot and are provided in
9582@file{longlong.h}, but hand coding low level routines invariably offers a
9583speedup over generic C by a factor of anything from 2 to 10.
9584
9585@menu
9586* Assembly Code Organisation::
9587* Assembly Basics::
9588* Assembly Carry Propagation::
9589* Assembly Cache Handling::
9590* Assembly Functional Units::
9591* Assembly Floating Point::
9592* Assembly SIMD Instructions::
9593* Assembly Software Pipelining::
9594* Assembly Loop Unrolling::
9595* Assembly Writing Guide::
9596@end menu
9597
9598
9599@node Assembly Code Organisation, Assembly Basics, Assembly Coding, Assembly Coding
9600@subsection Code Organisation
9601@cindex Assembly code organisation
9602@cindex Code organisation
9603
9604The various @file{mpn} subdirectories contain machine-dependent code, written
9605in C or assembly.  The @file{mpn/generic} subdirectory contains default code,
9606used when there's no machine-specific version of a particular file.
9607
9608Each @file{mpn} subdirectory is for an ISA family.  Generally 32-bit and
960964-bit variants in a family cannot share code and have separate directories.
9610Within a family further subdirectories may exist for CPU variants.
9611
9612In each directory a @file{nails} subdirectory may exist, holding code with
9613nails support for that CPU variant.  A @code{NAILS_SUPPORT} directive in each
9614file indicates the nails values the code handles.  Nails code only exists
9615where it's faster, or promises to be faster, than plain code.  There's no
9616effort put into nails if they're not going to enhance a given CPU.
9617
9618
9619@node Assembly Basics, Assembly Carry Propagation, Assembly Code Organisation, Assembly Coding
9620@subsection Assembly Basics
9621
9622@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines
9623for overall GMP performance.  All multiplications and divisions come down to
9624repeated calls to these.  @code{mpn_add_n}, @code{mpn_sub_n},
9625@code{mpn_lshift} and @code{mpn_rshift} are next most important.
9626
9627On some CPUs assembly versions of the internal functions
9628@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups,
9629mainly through avoiding function call overheads.  They can also potentially
9630make better use of a wide superscalar processor, as can bigger primitives like
9631@code{mpn_addmul_2} or @code{mpn_addmul_4}.
9632
9633The restrictions on overlaps between sources and destinations
9634(@pxref{Low-level Functions}) are designed to facilitate a variety of
9635implementations.  For example, knowing @code{mpn_add_n} won't have partly
9636overlapping sources and destination means reading can be done far ahead of
9637writing on superscalar processors, and loops can be vectorized on a vector
9638processor, depending on the carry handling.
9639
9640
9641@node Assembly Carry Propagation, Assembly Cache Handling, Assembly Basics, Assembly Coding
9642@subsection Carry Propagation
9643@cindex Assembly carry propagation
9644
9645The problem that presents most challenges in GMP is propagating carries from
9646one limb to the next.  In functions like @code{mpn_addmul_1} and
9647@code{mpn_add_n}, carries are the only dependencies between limb operations.
9648
9649On processors with carry flags, a straightforward CISC style @code{adc} is
9650generally best.  AMD K6 @code{mpn_addmul_1} however is an example of an
9651unusual set of circumstances where a branch works out better.
9652
9653On RISC processors generally an add and compare for overflow is used.  This
9654sort of thing can be seen in @file{mpn/generic/aors_n.c}.  Some carry
9655propagation schemes require 4 instructions, meaning at least 4 cycles per
9656limb, but other schemes may use just 1 or 2.  On wide superscalar processors
9657performance may be completely determined by the number of dependent
9658instructions between carry-in and carry-out for each limb.
9659
9660On vector processors good use can be made of the fact that a carry bit only
9661very rarely propagates more than one limb.  When adding a single bit to a
9662limb, there's only a carry out if that limb was @code{0xFF@dots{}FF} which on
9663random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}},
96642^mp_bits_per_limb}.  @file{mpn/cray/add_n.c} is an example of this, it adds
9665all limbs in parallel, adds one set of carry bits in parallel and then only
9666rarely needs to fall through to a loop propagating further carries.
9667
9668On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code
9669for the RISC style idioms that are necessary to handle carry bits in
9670C@.  Often conditional jumps are generated where @code{adc} or @code{sbb} forms
9671would be better.  And so unfortunately almost any loop involving carry bits
9672needs to be coded in assembly for best results.
9673
9674
9675@node Assembly Cache Handling, Assembly Functional Units, Assembly Carry Propagation, Assembly Coding
9676@subsection Cache Handling
9677@cindex Assembly cache handling
9678
9679GMP aims to perform well both on operands that fit entirely in L1 cache and
9680those which don't.
9681
9682Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on
9683large operands, so L2 and main memory performance is important for them.
9684@code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and
9685square basecases, so L1 performance matters most for them, unless assembly
9686versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in
9687which case the remaining uses are mostly for larger operands.
9688
9689For L2 or main memory operands, memory access times will almost certainly be
9690more than the calculation time.  The aim therefore is to maximize memory
9691throughput, by starting a load of the next cache line while processing the
9692contents of the previous one.  Clearly this is only possible if the chip has a
9693lock-up free cache or some sort of prefetch instruction.  Most current chips
9694have both these features.
9695
9696Prefetching sources combines well with loop unrolling, since a prefetch can be
9697initiated once per unrolled loop (or more than once if the loop covers more
9698than one cache line).
9699
9700On CPUs without write-allocate caches, prefetching destinations will ensure
9701individual stores don't go further down the cache hierarchy, limiting
9702bandwidth.  Of course for calculations which are slow anyway, like
9703@code{mpn_divrem_1}, write-throughs might be fine.
9704
9705The distance ahead to prefetch will be determined by memory latency versus
9706throughput.  The aim of course is to have data arriving continuously, at peak
9707throughput.  Some CPUs have limits on the number of fetches or prefetches in
9708progress.
9709
9710If a special prefetch instruction doesn't exist then a plain load can be used,
9711but in that case care must be taken not to attempt to read past the end of an
9712operand, since that might produce a segmentation violation.
9713
9714Some CPUs or systems have hardware that detects sequential memory accesses and
9715initiates suitable cache movements automatically, making life easy.
9716
9717
9718@node Assembly Functional Units, Assembly Floating Point, Assembly Cache Handling, Assembly Coding
9719@subsection Functional Units
9720
9721When choosing an approach for an assembly loop, consideration is given to
9722what operations can execute simultaneously and what throughput can thereby be
9723achieved.  In some cases an algorithm can be tweaked to accommodate available
9724resources.
9725
9726Loop control will generally require a counter and pointer updates, costing as
9727much as 5 instructions, plus any delays a branch introduces.  CPU addressing
9728modes might reduce pointer updates, perhaps by allowing just one updating
9729pointer and others expressed as offsets from it, or on CISC chips with all
9730addressing done with the loop counter as a scaled index.
9731
9732The final loop control cost can be amortised by processing several limbs in
9733each iteration (@pxref{Assembly Loop Unrolling}).  This at least ensures loop
9734control isn't a big fraction the work done.
9735
9736Memory throughput is always a limit.  If perhaps only one load or one store
9737can be done per cycle then 3 cycles/limb will the top speed for ``binary''
9738operations like @code{mpn_add_n}, and any code achieving that is optimal.
9739
9740Integer resources can be freed up by having the loop counter in a float
9741register, or by pressing the float units into use for some multiplying,
9742perhaps doing every second limb on the float side (@pxref{Assembly Floating
9743Point}).
9744
9745Float resources can be freed up by doing carry propagation on the integer
9746side, or even by doing integer to float conversions in integers using bit
9747twiddling.
9748
9749
9750@node Assembly Floating Point, Assembly SIMD Instructions, Assembly Functional Units, Assembly Coding
9751@subsection Floating Point
9752@cindex Assembly floating Point
9753
9754Floating point arithmetic is used in GMP for multiplications on CPUs with poor
9755integer multipliers.  It's mostly useful for @code{mpn_mul_1},
9756@code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and
9757@code{mpn_mul_basecase} on both 32-bit and 64-bit machines.
9758
9759With IEEE 53-bit double precision floats, integer multiplications producing up
9760to 53 bits will give exact results.  Breaking a 64@cross{}64 multiplication
9761into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient.  With
9762some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be
9763used, if one of the lower two 21-bit pieces also uses the sign bit.
9764
9765For the @code{mpn_mul_1} family of functions on a 64-bit machine, the
9766invariant single limb is split at the start, into 3 or 4 pieces.  Inside the
9767loop, the bignum operand is split into 32-bit pieces.  Fast conversion of
9768these unsigned 32-bit pieces to floating point is highly machine-dependent.
9769In some cases, reading the data into the integer unit, zero-extending to
977064-bits, then transferring to the floating point unit back via memory is the
9771only option.
9772
9773Converting partial products back to 64-bit limbs is usually best done as a
9774signed conversion.  Since all values are smaller than @m{2^{53},2^53}, signed
9775and unsigned are the same, but most processors lack unsigned conversions.
9776
9777@sp 2
9778
9779Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or
9780@code{mpn_addmul_1} with a 64-bit limb.  The single limb operand V is split
9781into four 16-bit parts.  The multi-limb operand U is split in the loop into
9782two 32-bit parts.
9783
9784@tex
9785\global\newdimen\GMPbits      \global\GMPbits=0.18em
9786\def\GMPbox#1#2#3{%
9787  \hbox{%
9788    \hbox to 128\GMPbits{\hfil
9789      \vbox{%
9790        \hrule
9791        \hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
9792        \hrule}%
9793      \hskip #1\GMPbits}%
9794    \raise \GMPboxdepth \hbox{\hskip 2em #3}}}
9795%
9796\GMPdisplay{%
9797  \vbox{%
9798    \hbox{%
9799      \hbox to 128\GMPbits {\hfil
9800        \vbox{%
9801          \hrule
9802          \hbox to 64\GMPbits{%
9803            \GMPvrule \hfil$v48$\hfil
9804            \vrule    \hfil$v32$\hfil
9805            \vrule    \hfil$v16$\hfil
9806            \vrule    \hfil$v00$\hfil
9807            \vrule}
9808          \hrule}}%
9809       \raise \GMPboxdepth \hbox{\hskip 2em V Operand}}
9810    \vskip 0.5ex
9811    \hbox{%
9812      \hbox to 128\GMPbits {\hfil
9813        \raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}%
9814        \vbox{%
9815          \hrule
9816          \hbox to 64\GMPbits {%
9817            \GMPvrule \hfil$u32$\hfil
9818            \vrule \hfil$u00$\hfil
9819            \vrule}%
9820          \hrule}}%
9821       \raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}%
9822    \vskip 0.5ex
9823    \hbox{\vbox to 2ex{\hrule width 128\GMPbits}}%
9824    \GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}%
9825    \vskip 0.5ex
9826    \GMPbox{16}{u00 \times v16}{$p16$}
9827    \vskip 0.5ex
9828    \GMPbox{32}{u00 \times v32}{$p32$}
9829    \vskip 0.5ex
9830    \GMPbox{48}{u00 \times v48}{$p48$}
9831    \vskip 0.5ex
9832    \GMPbox{32}{u32 \times v00}{$r32$}
9833    \vskip 0.5ex
9834    \GMPbox{48}{u32 \times v16}{$r48$}
9835    \vskip 0.5ex
9836    \GMPbox{64}{u32 \times v32}{$r64$}
9837    \vskip 0.5ex
9838    \GMPbox{80}{u32 \times v48}{$r80$}
9839}}
9840@end tex
9841@ifnottex
9842@example
9843@group
9844                +---+---+---+---+
9845                |v48|v32|v16|v00|    V operand
9846                +---+---+---+---+
9847
9848                +-------+---+---+
9849            x   |  u32  |  u00  |    U operand (one limb)
9850                +---------------+
9851
9852---------------------------------
9853
9854                    +-----------+
9855                    | u00 x v00 |    p00    48-bit products
9856                    +-----------+
9857                +-----------+
9858                | u00 x v16 |        p16
9859                +-----------+
9860            +-----------+
9861            | u00 x v32 |            p32
9862            +-----------+
9863        +-----------+
9864        | u00 x v48 |                p48
9865        +-----------+
9866            +-----------+
9867            | u32 x v00 |            r32
9868            +-----------+
9869        +-----------+
9870        | u32 x v16 |                r48
9871        +-----------+
9872    +-----------+
9873    | u32 x v32 |                    r64
9874    +-----------+
9875+-----------+
9876| u32 x v48 |                        r80
9877+-----------+
9878@end group
9879@end example
9880@end ifnottex
9881
9882@math{p32} and @math{r32} can be summed using floating-point addition, and
9883likewise @math{p48} and @math{r48}.  @math{p00} and @math{p16} can be summed
9884with @math{r64} and @math{r80} from the previous iteration.
9885
9886For each loop then, four 49-bit quantities are transferred to the integer unit,
9887aligned as follows,
9888
9889@tex
9890% GMPbox here should be 49 bits wide, but use 51 to better show p16+r80'
9891% crossing into the upper 64 bits.
9892\def\GMPbox#1#2#3{%
9893  \hbox{%
9894    \hbox to 128\GMPbits {%
9895      \hfil
9896      \vbox{%
9897        \hrule
9898        \hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
9899        \hrule}%
9900      \hskip #1\GMPbits}%
9901    \raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}%
9902}}
9903\newbox\b \setbox\b\hbox{64 bits}%
9904\newdimen\bw \bw=\wd\b \advance\bw by 2em
9905\newdimen\x \x=128\GMPbits
9906\advance\x by -2\bw
9907\divide\x by4
9908\GMPdisplay{%
9909  \vbox{%
9910    \hbox to 128\GMPbits {%
9911      \GMPvrule
9912      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9913      \hfil 64 bits\hfil
9914      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9915      \vrule
9916      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9917      \hfil 64 bits\hfil
9918      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
9919      \vrule}%
9920    \vskip 0.7ex
9921    \GMPbox{0}{p00+r64'}{i00}
9922    \vskip 0.5ex
9923    \GMPbox{16}{p16+r80'}{i16}
9924    \vskip 0.5ex
9925    \GMPbox{32}{p32+r32}{i32}
9926    \vskip 0.5ex
9927    \GMPbox{48}{p48+r48}{i48}
9928}}
9929@end tex
9930@ifnottex
9931@example
9932@group
9933|-----64bits----|-----64bits----|
9934                   +------------+
9935                   | p00 + r64' |    i00
9936                   +------------+
9937               +------------+
9938               | p16 + r80' |        i16
9939               +------------+
9940           +------------+
9941           | p32 + r32  |            i32
9942           +------------+
9943       +------------+
9944       | p48 + r48  |                i48
9945       +------------+
9946@end group
9947@end example
9948@end ifnottex
9949
9950The challenge then is to sum these efficiently and add in a carry limb,
9951generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48}
9952extends 33 bits into the high half).
9953
9954
9955@node Assembly SIMD Instructions, Assembly Software Pipelining, Assembly Floating Point, Assembly Coding
9956@subsection SIMD Instructions
9957@cindex Assembly SIMD
9958
9959The single-instruction multiple-data support in current microprocessors is
9960aimed at signal processing algorithms where each data point can be treated
9961more or less independently.  There's generally not much support for
9962propagating the sort of carries that arise in GMP.
9963
9964SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much
9965work as one 32@cross{}32 from GMP's point of view, and need some shifts and
9966adds besides.  But of course if say the SIMD form is fully pipelined and uses
9967less instruction decoding then it may still be worthwhile.
9968
9969On the x86 chips, MMX has so far found a use in @code{mpn_rshift} and
9970@code{mpn_lshift}, and is used in a special case for 16-bit multipliers in the
9971P55 @code{mpn_mul_1}.  SSE2 is used for Pentium 4 @code{mpn_mul_1},
9972@code{mpn_addmul_1}, and @code{mpn_submul_1}.
9973
9974
9975@node Assembly Software Pipelining, Assembly Loop Unrolling, Assembly SIMD Instructions, Assembly Coding
9976@subsection Software Pipelining
9977@cindex Assembly software pipelining
9978
9979Software pipelining consists of scheduling instructions around the branch
9980point in a loop.  For example a loop might issue a load not for use in the
9981present iteration but the next, thereby allowing extra cycles for the data to
9982arrive from memory.
9983
9984Naturally this is wanted only when doing things like loads or multiplies that
9985take several cycles to complete, and only where a CPU has multiple functional
9986units so that other work can be done in the meantime.
9987
9988A pipeline with several stages will have a data value in progress at each
9989stage and each loop iteration moves them along one stage.  This is like
9990juggling.
9991
9992If the latency of some instruction is greater than the loop time then it will
9993be necessary to unroll, so one register has a result ready to use while
9994another (or multiple others) are still in progress.  (@pxref{Assembly Loop
9995Unrolling}).
9996
9997
9998@node Assembly Loop Unrolling, Assembly Writing Guide, Assembly Software Pipelining, Assembly Coding
9999@subsection Loop Unrolling
10000@cindex Assembly loop unrolling
10001
10002Loop unrolling consists of replicating code so that several limbs are
10003processed in each loop.  At a minimum this reduces loop overheads by a
10004corresponding factor, but it can also allow better register usage, for example
10005alternately using one register combination and then another.  Judicious use of
10006@command{m4} macros can help avoid lots of duplication in the source code.
10007
10008Any amount of unrolling can be handled with a loop counter that's decremented
10009by @math{N} each time, stopping when the remaining count is less than the
10010further @math{N} the loop will process.  Or by subtracting @math{N} at the
10011start, the termination condition becomes when the counter @math{C} is less
10012than 0 (and the count of remaining limbs is @math{C+N}).
10013
10014Alternately for a power of 2 unroll the loop count and remainder can be
10015established with a shift and mask.  This is convenient if also making a
10016computed jump into the middle of a large loop.
10017
10018The limbs not a multiple of the unrolling can be handled in various ways, for
10019example
10020
10021@itemize @bullet
10022@item
10023A simple loop at the end (or the start) to process the excess.  Care will be
10024wanted that it isn't too much slower than the unrolled part.
10025
10026@item
10027A set of binary tests, for example after an 8-limb unrolling, test for 4 more
10028limbs to process, then a further 2 more or not, and finally 1 more or not.
10029This will probably take more code space than a simple loop.
10030
10031@item
10032A @code{switch} statement, providing separate code for each possible excess,
10033for example an 8-limb unrolling would have separate code for 0 remaining, 1
10034remaining, etc, up to 7 remaining.  This might take a lot of code, but may be
10035the best way to optimize all cases in combination with a deep pipelined loop.
10036
10037@item
10038A computed jump into the middle of the loop, thus making the first iteration
10039handle the excess.  This should make times smoothly increase with size, which
10040is attractive, but setups for the jump and adjustments for pointers can be
10041tricky and could become quite difficult in combination with deep pipelining.
10042@end itemize
10043
10044
10045@node Assembly Writing Guide,  , Assembly Loop Unrolling, Assembly Coding
10046@subsection Writing Guide
10047@cindex Assembly writing guide
10048
10049This is a guide to writing software pipelined loops for processing limb
10050vectors in assembly.
10051
10052First determine the algorithm and which instructions are needed.  Code it
10053without unrolling or scheduling, to make sure it works.  On a 3-operand CPU
10054try to write each new value to a new register, this will greatly simplify later
10055steps.
10056
10057Then note for each instruction the functional unit and/or issue port
10058requirements.  If an instruction can use either of two units, like U0 or U1
10059then make a category ``U0/U1''.  Count the total using each unit (or combined
10060unit), and count all instructions.
10061
10062Figure out from those counts the best possible loop time.  The goal will be to
10063find a perfect schedule where instruction latencies are completely hidden.
10064The total instruction count might be the limiting factor, or perhaps a
10065particular functional unit.  It might be possible to tweak the instructions to
10066help the limiting factor.
10067
10068Suppose the loop time is @math{N}, then make @math{N} issue buckets, with the
10069final loop branch at the end of the last.  Now fill the buckets with dummy
10070instructions using the functional units desired.  Run this to make sure the
10071intended speed is reached.
10072
10073Now replace the dummy instructions with the real instructions from the slow
10074but correct loop you started with.  The first will typically be a load
10075instruction.  Then the instruction using that value is placed in a bucket an
10076appropriate distance down.  Run the loop again, to check it still runs at
10077target speed.
10078
10079Keep placing instructions, frequently measuring the loop.  After a few you
10080will need to wrap around from the last bucket back to the top of the loop.  If
10081you used the new-register for new-value strategy above then there will be no
10082register conflicts.  If not then take care not to clobber something already in
10083use.  Changing registers at this time is very error prone.
10084
10085The loop will overlap two or more of the original loop iterations, and the
10086computation of one vector element result will be started in one iteration of
10087the new loop, and completed one or several iterations later.
10088
10089The final step is to create feed-in and wind-down code for the loop.  A good
10090way to do this is to make a copy (or copies) of the loop at the start and
10091delete those instructions which don't have valid antecedents, and at the end
10092replicate and delete those whose results are unwanted (including any further
10093loads).
10094
10095The loop will have a minimum number of limbs loaded and processed, so the
10096feed-in code must test if the request size is smaller and skip either to a
10097suitable part of the wind-down or to special code for small sizes.
10098
10099
10100@node Internals, Contributors, Algorithms, Top
10101@chapter Internals
10102@cindex Internals
10103
10104@strong{This chapter is provided only for informational purposes and the
10105various internals described here may change in future GMP releases.
10106Applications expecting to be compatible with future releases should use only
10107the documented interfaces described in previous chapters.}
10108
10109@menu
10110* Integer Internals::
10111* Rational Internals::
10112* Float Internals::
10113* Raw Output Internals::
10114* C++ Interface Internals::
10115@end menu
10116
10117@node Integer Internals, Rational Internals, Internals, Internals
10118@section Integer Internals
10119@cindex Integer internals
10120
10121@code{mpz_t} variables represent integers using sign and magnitude, in space
10122dynamically allocated and reallocated.  The fields are as follows.
10123
10124@table @asis
10125@item @code{_mp_size}
10126The number of limbs, or the negative of that when representing a negative
10127integer.  Zero is represented by @code{_mp_size} set to zero, in which case
10128the @code{_mp_d} data is unused.
10129
10130@item @code{_mp_d}
10131A pointer to an array of limbs which is the magnitude.  These are stored
10132``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the
10133least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most
10134significant.  Whenever @code{_mp_size} is non-zero, the most significant limb
10135is non-zero.
10136
10137Currently there's always at least one limb allocated, so for instance
10138@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
10139@code{_mp_d[0]} unconditionally (though its value is then only wanted if
10140@code{_mp_size} is non-zero).
10141
10142@item @code{_mp_alloc}
10143@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
10144and naturally @code{_mp_alloc >= ABS(_mp_size)}.  When an @code{mpz} routine
10145is about to (or might be about to) increase @code{_mp_size}, it checks
10146@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
10147@code{MPZ_REALLOC} is generally used for this.
10148@end table
10149
10150The various bitwise logical functions like @code{mpz_and} behave as if
10151negative values were twos complement.  But sign and magnitude is always used
10152internally, and necessary adjustments are made during the calculations.
10153Sometimes this isn't pretty, but sign and magnitude are best for other
10154routines.
10155
10156Some internal temporary variables are setup with @code{MPZ_TMP_INIT} and these
10157have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
10158allocation functions.  Care is taken to ensure that these are big enough that
10159no reallocation is necessary (since it would have unpredictable consequences).
10160
10161@code{_mp_size} and @code{_mp_alloc} are @code{int}, although @code{mp_size_t}
10162is usually a @code{long}.  This is done to make the fields just 32 bits on
10163some 64 bits systems, thereby saving a few bytes of data space but still
10164providing plenty of range.
10165
10166
10167@node Rational Internals, Float Internals, Integer Internals, Internals
10168@section Rational Internals
10169@cindex Rational internals
10170
10171@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
10172denominator (@pxref{Integer Internals}).
10173
10174The canonical form adopted is denominator positive (and non-zero), no common
10175factors between numerator and denominator, and zero uniquely represented as
101760/1.
10177
10178It's believed that casting out common factors at each stage of a calculation
10179is best in general.  A GCD is an @math{O(N^2)} operation so it's better to do
10180a few small ones immediately than to delay and have to do a big one later.
10181Knowing the numerator and denominator have no common factors can be used for
10182example in @code{mpq_mul} to make only two cross GCDs necessary, not four.
10183
10184This general approach to common factors is badly sub-optimal in the presence
10185of simple factorizations or little prospect for cancellation, but GMP has no
10186way to know when this will occur.  As per @ref{Efficiency}, that's left to
10187applications.  The @code{mpq_t} framework might still suit, with
10188@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and
10189denominator, or of course @code{mpz_t} variables can be used directly.
10190
10191
10192@node Float Internals, Raw Output Internals, Rational Internals, Internals
10193@section Float Internals
10194@cindex Float internals
10195
10196Efficient calculation is the primary aim of GMP floats and the use of whole
10197limbs and simple rounding facilitates this.
10198
10199@code{mpf_t} floats have a variable precision mantissa and a single machine
10200word signed exponent.  The mantissa is represented using sign and magnitude.
10201
10202@c FIXME: The arrow heads don't join to the lines exactly.
10203@tex
10204\global\newdimen\GMPboxwidth \GMPboxwidth=5em
10205\global\newdimen\GMPboxheight \GMPboxheight=3ex
10206\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
10207\GMPdisplay{%
10208\vbox{%
10209  \hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb}
10210  \vskip 0.7ex
10211  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
10212  \hbox {
10213    \hbox to 3\GMPboxwidth {%
10214      \setbox 0 = \hbox{@code{\_mp\_exp}}%
10215      \dimen0=3\GMPboxwidth
10216      \advance\dimen0 by -\wd0
10217      \divide\dimen0 by 2
10218      \advance\dimen0 by -1em
10219      \setbox1 = \hbox{$\rightarrow$}%
10220      \dimen1=\dimen0
10221      \advance\dimen1 by -\wd1
10222      \GMPcentreline{\dimen0}%
10223      \hfil
10224      \box0%
10225      \hfil
10226      \GMPcentreline{\dimen1{}}%
10227      \box1}
10228    \hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}}
10229  \vskip 0.5ex
10230  \vbox {%
10231    \hrule
10232    \hbox{%
10233      \vrule height 2ex depth 1ex
10234      \hbox to \GMPboxwidth {}%
10235      \vrule
10236      \hbox to \GMPboxwidth {}%
10237      \vrule
10238      \hbox to \GMPboxwidth {}%
10239      \vrule
10240      \hbox to \GMPboxwidth {}%
10241      \vrule
10242      \hbox to \GMPboxwidth {}%
10243      \vrule}
10244    \hrule
10245  }
10246  \hbox {%
10247    \hbox to 0.8 pt {}
10248    \hbox to 3\GMPboxwidth {%
10249      \hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}}
10250  \hbox to 5\GMPboxwidth{%
10251    \setbox 0 = \hbox{@code{\_mp\_size}}%
10252    \dimen0 = 5\GMPboxwidth
10253    \advance\dimen0 by -\wd0
10254    \divide\dimen0 by 2
10255    \advance\dimen0 by -1em
10256    \dimen1 = \dimen0
10257    \setbox1 = \hbox{$\leftarrow$}%
10258    \setbox2 = \hbox{$\rightarrow$}%
10259    \advance\dimen0 by -\wd1
10260    \advance\dimen1 by -\wd2
10261    \hbox to 0.3 em {}%
10262    \box1
10263    \GMPcentreline{\dimen0}%
10264    \hfil
10265    \box0
10266    \hfil
10267    \GMPcentreline{\dimen1}%
10268    \box2}
10269}}
10270@end tex
10271@ifnottex
10272@example
10273   most                   least
10274significant            significant
10275   limb                   limb
10276
10277                            _mp_d
10278 |---- _mp_exp --->           |
10279  _____ _____ _____ _____ _____
10280 |_____|_____|_____|_____|_____|
10281                   . <------------ radix point
10282
10283  <-------- _mp_size --------->
10284@sp 1
10285@end example
10286@end ifnottex
10287
10288@noindent
10289The fields are as follows.
10290
10291@table @asis
10292@item @code{_mp_size}
10293The number of limbs currently in use, or the negative of that when
10294representing a negative value.  Zero is represented by @code{_mp_size} and
10295@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is
10296unused.  (In the future @code{_mp_exp} might be undefined when representing
10297zero.)
10298
10299@item @code{_mp_prec}
10300The precision of the mantissa, in limbs.  In any calculation the aim is to
10301produce @code{_mp_prec} limbs of result (the most significant being non-zero).
10302
10303@item @code{_mp_d}
10304A pointer to the array of limbs which is the absolute value of the mantissa.
10305These are stored ``little endian'' as per the @code{mpn} functions, so
10306@code{_mp_d[0]} is the least significant limb and
10307@code{_mp_d[ABS(_mp_size)-1]} the most significant.
10308
10309The most significant limb is always non-zero, but there are no other
10310restrictions on its value, in particular the highest 1 bit can be anywhere
10311within the limb.
10312
10313@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being
10314for convenience (see below).  There are no reallocations during a calculation,
10315only in a change of precision with @code{mpf_set_prec}.
10316
10317@item @code{_mp_exp}
10318The exponent, in limbs, determining the location of the implied radix point.
10319Zero means the radix point is just above the most significant limb.  Positive
10320values mean a radix point offset towards the lower limbs and hence a value
10321@math{@ge{} 1}, as for example in the diagram above.  Negative exponents mean
10322a radix point further above the highest limb.
10323
10324Naturally the exponent can be any value, it doesn't have to fall within the
10325limbs as the diagram shows, it can be a long way above or a long way below.
10326Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
10327are treated as zero.
10328@end table
10329
10330The @code{_mp_size} and @code{_mp_prec} fields are @code{int}, although the
10331@code{mp_size_t} type is usually a @code{long}.  The @code{_mp_exp} field is
10332usually @code{long}.  This is done to make some fields just 32 bits on some 64
10333bits systems, thereby saving a few bytes of data space but still providing
10334plenty of precision and a very large range.
10335
10336
10337@sp 1
10338@noindent
10339The following various points should be noted.
10340
10341@table @asis
10342@item Low Zeros
10343The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
10344zeros can always be ignored.  Routines likely to produce low zeros check and
10345avoid them to save time in subsequent calculations, but for most routines
10346they're quite unlikely and aren't checked.
10347
10348@item Mantissa Size Range
10349The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
10350the value can be represented in less.  This means low precision values or
10351small integers stored in a high precision @code{mpf_t} can still be operated
10352on efficiently.
10353
10354@code{_mp_size} can also be greater than @code{_mp_prec}.  Firstly a value is
10355allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
10356and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
10357@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
10358@code{_mp_prec}.
10359
10360@item Rounding
10361All rounding is done on limb boundaries.  Calculating @code{_mp_prec} limbs
10362with the high non-zero will ensure the application requested minimum precision
10363is obtained.
10364
10365The use of simple ``trunc'' rounding towards zero is efficient, since there's
10366no need to examine extra limbs and increment or decrement.
10367
10368@item Bit Shifts
10369Since the exponent is in limbs, there are no bit shifts in basic operations
10370like @code{mpf_add} and @code{mpf_mul}.  When differing exponents are
10371encountered all that's needed is to adjust pointers to line up the relevant
10372limbs.
10373
10374Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts,
10375but the choice is between an exponent in limbs which requires shifts there, or
10376one in bits which requires them almost everywhere else.
10377
10378@item Use of @code{_mp_prec+1} Limbs
10379The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just
10380@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its
10381operation.  @code{mpf_add} for instance will do an @code{mpn_add} of
10382@code{_mp_prec} limbs.  If there's no carry then that's the result, but if
10383there is a carry then it's stored in the extra limb of space and
10384@code{_mp_size} becomes @code{_mp_prec+1}.
10385
10386Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not
10387needed for the intended precision, only the @code{_mp_prec} high limbs.  But
10388zeroing it out or moving the rest down is unnecessary.  Subsequent routines
10389reading the value will simply take the high limbs they need, and this will be
10390@code{_mp_prec} if their target has that same precision.  This is no more than
10391a pointer adjustment, and must be checked anyway since the destination
10392precision can be different from the sources.
10393
10394Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs
10395if available.  This ensures that a variable which has @code{_mp_size} equal to
10396@code{_mp_prec+1} will get its full exact value copied.  Strictly speaking
10397this is unnecessary since only @code{_mp_prec} limbs are needed for the
10398application's requested precision, but it's considered that an @code{mpf_set}
10399from one variable into another of the same precision ought to produce an exact
10400copy.
10401
10402@item Application Precisions
10403@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
10404@code{_mp_prec}.  The value in bits is rounded up to a whole limb then an
10405extra limb is added since the most significant limb of @code{_mp_d} is only
10406non-zero and therefore might contain only one bit.
10407
10408@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
10409limb from @code{_mp_prec} before converting to bits.  The net effect of
10410reading back with @code{mpf_get_prec} is simply the precision rounded up to a
10411multiple of @code{mp_bits_per_limb}.
10412
10413Note that the extra limb added here for the high only being non-zero is in
10414addition to the extra limb allocated to @code{_mp_d}.  For example with a
1041532-bit limb, an application request for 250 bits will be rounded up to 8
10416limbs, then an extra added for the high being only non-zero, giving an
10417@code{_mp_prec} of 9.  @code{_mp_d} then gets 10 limbs allocated.  Reading
10418back with @code{mpf_get_prec} will take @code{_mp_prec} subtract 1 limb and
10419multiply by 32, giving 256 bits.
10420
10421Strictly speaking, the fact the high limb has at least one bit means that a
10422float with, say, 3 limbs of 32-bits each will be holding at least 65 bits, but
10423for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice
10424multiple of the limb size.
10425@end table
10426
10427
10428@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
10429@section Raw Output Internals
10430@cindex Raw output internals
10431
10432@noindent
10433@code{mpz_out_raw} uses the following format.
10434
10435@tex
10436\global\newdimen\GMPboxwidth \GMPboxwidth=5em
10437\global\newdimen\GMPboxheight \GMPboxheight=3ex
10438\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
10439\GMPdisplay{%
10440\vbox{%
10441  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
10442  \vbox {%
10443    \hrule
10444    \hbox{%
10445      \vrule height 2.5ex depth 1.5ex
10446      \hbox to \GMPboxwidth {\hfil size\hfil}%
10447      \vrule
10448      \hbox to 3\GMPboxwidth {\hfil data bytes\hfil}%
10449      \vrule}
10450    \hrule}
10451}}
10452@end tex
10453@ifnottex
10454@example
10455+------+------------------------+
10456| size |       data bytes       |
10457+------+------------------------+
10458@end example
10459@end ifnottex
10460
10461The size is 4 bytes written most significant byte first, being the number of
10462subsequent data bytes, or the twos complement negative of that when a negative
10463integer is represented.  The data bytes are the absolute value of the integer,
10464written most significant byte first.
10465
10466The most significant data byte is always non-zero, so the output is the same
10467on all systems, irrespective of limb size.
10468
10469In GMP 1, leading zero bytes were written to pad the data bytes to a multiple
10470of the limb size.  @code{mpz_inp_raw} will still accept this, for
10471compatibility.
10472
10473The use of ``big endian'' for both the size and data fields is deliberate, it
10474makes the data easy to read in a hex dump of a file.  Unfortunately it also
10475means that the limb data must be reversed when reading or writing, so neither
10476a big endian nor little endian system can just read and write @code{_mp_d}.
10477
10478
10479@node C++ Interface Internals,  , Raw Output Internals, Internals
10480@section C++ Interface Internals
10481@cindex C++ interface internals
10482
10483A system of expression templates is used to ensure something like @code{a=b+c}
10484turns into a simple call to @code{mpz_add} etc.  For @code{mpf_class}
10485the scheme also ensures the precision of the final
10486destination is used for any temporaries within a statement like
10487@code{f=w*x+y*z}.  These are important features which a naive implementation
10488cannot provide.
10489
10490A simplified description of the scheme follows.  The true scheme is
10491complicated by the fact that expressions have different return types.  For
10492detailed information, refer to the source code.
10493
10494To perform an operation, say, addition, we first define a ``function object''
10495evaluating it,
10496
10497@example
10498struct __gmp_binary_plus
10499@{
10500  static void eval(mpf_t f, const mpf_t g, const mpf_t h)
10501  @{
10502    mpf_add(f, g, h);
10503  @}
10504@};
10505@end example
10506
10507@noindent
10508And an ``additive expression'' object,
10509
10510@example
10511__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
10512operator+(const mpf_class &f, const mpf_class &g)
10513@{
10514  return __gmp_expr
10515    <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
10516@}
10517@end example
10518
10519The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<@dots{}>>} is used to
10520encapsulate any possible kind of expression into a single template type.  In
10521fact even @code{mpf_class} etc are @code{typedef} specializations of
10522@code{__gmp_expr}.
10523
10524Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.
10525
10526@example
10527template <class T>
10528mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
10529@{
10530  expr.eval(this->get_mpf_t(), this->precision());
10531  return *this;
10532@}
10533
10534template <class Op>
10535void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
10536(mpf_t f, mp_bitcnt_t precision)
10537@{
10538  Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
10539@}
10540@end example
10541
10542where @code{expr.val1} and @code{expr.val2} are references to the expression's
10543operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
10544@code{__gmp_expr}).
10545
10546This way, the expression is actually evaluated only at the time of assignment,
10547when the required precision (that of @code{f}) is known.  Furthermore the
10548target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
10549with @code{f} as the output argument.
10550
10551Compound expressions are handled by defining operators taking subexpressions
10552as their arguments, like this:
10553
10554@example
10555template <class T, class U>
10556__gmp_expr
10557<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
10558operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
10559@{
10560  return __gmp_expr
10561    <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
10562    (expr1, expr2);
10563@}
10564@end example
10565
10566And the corresponding specializations of @code{__gmp_expr::eval}:
10567
10568@example
10569template <class T, class U, class Op>
10570void __gmp_expr
10571<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
10572(mpf_t f, mp_bitcnt_t precision)
10573@{
10574  // declare two temporaries
10575  mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
10576  Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
10577@}
10578@end example
10579
10580The expression is thus recursively evaluated to any level of complexity and
10581all subexpressions are evaluated to the precision of @code{f}.
10582
10583
10584@node Contributors, References, Internals, Top
10585@comment  node-name,  next,  previous,  up
10586@appendix Contributors
10587@cindex Contributors
10588
10589Torbj@"orn Granlund wrote the original GMP library and is still the main
10590developer.  Code not explicitly attributed to others, was contributed by
10591Torbj@"orn.  Several other individuals and organizations have contributed
10592GMP.  Here is a list in chronological order on first contribution:
10593
10594Gunnar Sj@"odin and Hans Riesel helped with mathematical problems in early
10595versions of the library.
10596
10597Richard Stallman helped with the interface design and revised the first
10598version of this manual.
10599
10600Brian Beuning and Doug Lea helped with testing of early versions of the
10601library and made creative suggestions.
10602
10603John Amanatides of York University in Canada contributed the function
10604@code{mpz_probab_prime_p}.
10605
10606Paul Zimmermann wrote the REDC-based mpz_powm code, the Sch@"onhage-Strassen
10607FFT multiply code, and the Karatsuba square root code.  He also improved the
10608Toom3 code for GMP 4.2.  Paul sparked the development of GMP 2, with his
10609comparisons between bignum packages.  The ECMNET project Paul is organizing
10610was a driving force behind many of the optimizations in GMP 3.  Paul also
10611wrote the new GMP 4.3 nth root code (with Torbj@"orn).
10612
10613Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
10614contributed now defunct versions of @code{mpz_gcd}, @code{mpz_divexact},
10615@code{mpn_gcd}, and @code{mpn_bdivmod}, partially supported by CNPq (Brazil)
10616grant 301314194-2.
10617
10618Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
10619He has also made valuable suggestions and tested numerous intermediary
10620releases.
10621
10622Joachim Hollman was involved in the design of the @code{mpf} interface, and in
10623the @code{mpz} design revisions for version 2.
10624
10625Bennet Yee contributed the initial versions of @code{mpz_jacobi} and
10626@code{mpz_legendre}.
10627
10628Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and
10629@file{mpn/m68k/rshift.S} (now in @file{.asm} form).
10630
10631Robert Harley of Inria, France and David Seal of ARM, England, suggested clever
10632improvements for population count.  Robert also wrote highly optimized
10633Karatsuba and 3-way Toom multiplication functions for GMP 3, and contributed
10634the ARM assembly code.
10635
10636Torsten Ekedahl of the Mathematical department of Stockholm University provided
10637significant inspiration during several phases of the GMP development.  His
10638mathematical expertise helped improve several algorithms.
10639
10640Linus Nordberg wrote the new configure system based on autoconf and
10641implemented the new random functions.
10642
10643Kevin Ryde worked on a large number of things: optimized x86 code, m4 asm
10644macros, parameter tuning, speed measuring, the configure system, function
10645inlining, divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas
10646number functions, printf and scanf functions, perl interface, demo expression
10647parser, the algorithms chapter in the manual, @file{gmpasm-mode.el}, and
10648various miscellaneous improvements elsewhere.
10649
10650Kent Boortz made the Mac OS 9 port.
10651
10652Steve Root helped write the optimized alpha 21264 assembly code.
10653
10654Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++
10655@code{istream} input routines.
10656
10657Jason Moxham rewrote @code{mpz_fac_ui}.
10658
10659Pedro Gimeno implemented the Mersenne Twister and made other random number
10660improvements.
10661
10662Niels M@"oller wrote the sub-quadratic GCD, extended GCD and jacobi code, the
10663quadratic Hensel division code, and (with Torbj@"orn) the new divide and
10664conquer division code for GMP 4.3.  Niels also helped implement the new Toom
10665multiply code for GMP 4.3 and implemented helper functions to simplify Toom
10666evaluations for GMP 5.0.  He wrote the original version of mpn_mulmod_bnm1, and
10667he is the main author of the mini-gmp package used for gmp bootstrapping.
10668
10669Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply strategy,
10670and found the optimal strategies for evaluation and interpolation in Toom
10671multiplication.
10672
10673Marco Bodrato helped implement the new Toom multiply code for GMP 4.3 and
10674implemented most of the new Toom multiply and squaring code for 5.0.
10675He is the main author of the current mpn_mulmod_bnm1, mpn_mullo_n, and
10676mpn_sqrlo.  Marco also wrote the functions mpn_invert and mpn_invertappr,
10677and improved the speed of integer root extraction.  He is the author of
10678the current combinatorial functions: binomial, factorial, multifactorial,
10679primorial.
10680
10681David Harvey suggested the internal function @code{mpn_bdiv_dbm1}, implementing
10682division relevant to Toom multiplication.  He also worked on fast assembly
10683sequences, in particular on a fast AMD64 @code{mpn_mul_basecase}. He wrote
10684the internal middle product functions @code{mpn_mulmid_basecase},
10685@code{mpn_toom42_mulmid}, @code{mpn_mulmid_n} and related helper routines.
10686
10687Martin Boij wrote @code{mpn_perfect_power_p}.
10688
10689Marc Glisse improved @file{gmpxx.h}: use fewer temporaries (faster),
10690specializations of @code{numeric_limits} and @code{common_type}, C++11
10691features (move constructors, explicit bool conversion, UDL), make the
10692conversion from @code{mpq_class} to @code{mpz_class} explicit, optimize
10693operations where one argument is a small compile-time constant, replace
10694some heap allocations by stack allocations.  He also fixed the eofbit
10695handling of C++ streams, and removed one division from @file{mpq/aors.c}.
10696
10697David S Miller wrote assembly code for SPARC T3 and T4.
10698
10699Mark Sofroniou cleaned up the types of mul_fft.c, letting it work for huge
10700operands.
10701
10702Ulrich Weigand ported GMP to the powerpc64le ABI.
10703
10704(This list is chronological, not ordered after significance.  If you have
10705contributed to GMP but are not listed above, please tell
10706@email{gmp-devel@@gmplib.org} about the omission!)
10707
10708The development of floating point functions of GNU MP 2, were supported in part
10709by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO (POlynomial
10710System SOlving).
10711
10712The development of GMP 2, 3, and 4.0 was supported in part by the IDA Center
10713for Computing Sciences.
10714
10715The development of GMP 4.3, 5.0, and 5.1 was supported in part by the Swedish
10716Foundation for Strategic Research.
10717
10718Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
10719environment.
10720
10721@node References, GNU Free Documentation License, Contributors, Top
10722@comment  node-name,  next,  previous,  up
10723@appendix References
10724@cindex References
10725
10726@c  FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
10727@c  but being long words they upset paragraph formatting (the preceding line
10728@c  can get badly stretched).  Would like an conditional @* style line break
10729@c  if the uref is too long to fit on the last line of the paragraph, but it's
10730@c  not clear how to do that.  For now explicit @texlinebreak{}s are used on
10731@c  paragraphs that come out bad.
10732
10733@section Books
10734
10735@itemize @bullet
10736@item
10737Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in
10738Analytic Number Theory and Computational Complexity'', Wiley, 1998.
10739
10740@item
10741Richard Crandall and Carl Pomerance, ``Prime Numbers: A Computational
10742Perspective'', 2nd edition, Springer-Verlag, 2005.
10743@texlinebreak{} @uref{http://www.math.dartmouth.edu/~carlp/}
10744
10745@item
10746Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate
10747Texts in Mathematics number 138, Springer-Verlag, 1993.
10748@texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen/}
10749
10750@item
10751Donald E. Knuth, ``The Art of Computer Programming'', volume 2,
10752``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998.
10753@texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html}
10754
10755@item
10756John D. Lipson, ``Elements of Algebra and Algebraic Computing'',
10757The Benjamin Cummings Publishing Company Inc, 1981.
10758
10759@item
10760Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of
10761Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/}
10762
10763@item
10764Richard M. Stallman and the GCC Developer Community, ``Using the GNU Compiler
10765Collection'', Free Software Foundation, 2008, available online
10766@uref{https://gcc.gnu.org/onlinedocs/}, and in the GCC package
10767@uref{https://ftp.gnu.org/gnu/gcc/}
10768@end itemize
10769
10770@section Papers
10771
10772@itemize @bullet
10773@item
10774Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square
10775Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252.  Also
10776available online as INRIA Research Report 4475, June 2002,
10777@uref{http://hal.inria.fr/docs/00/07/21/13/PDF/RR-4475.pdf}
10778
10779@item
10780Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'',
10781Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022,
10782@texlinebreak{} @uref{http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022}
10783
10784@item
10785Torbj@"orn Granlund and Peter L. Montgomery, ``Division by Invariant Integers
10786using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June
107871994.  Also available @uref{https://gmplib.org/~tege/divcnst-pldi94.pdf}.
10788
10789@item
10790Niels M@"oller and Torbj@"orn Granlund, ``Improved division by invariant
10791integers'', IEEE Transactions on Computers, 11 June 2010.
10792@uref{https://gmplib.org/~tege/division-paper.pdf}
10793
10794@item
10795Torbj@"orn Granlund and Niels M@"oller, ``Division of integers large and
10796small'', to appear.
10797
10798@item
10799Tudor Jebelean,
10800``An algorithm for exact division'',
10801Journal of Symbolic Computation,
10802volume 15, 1993, pp.@: 169-180.
10803Research report version available @texlinebreak{}
10804@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}
10805
10806@item
10807Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
10808Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
10809@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}
10810
10811@item
10812Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
10813ISSAC 97, pp.@: 339-341.  Technical report available @texlinebreak{}
10814@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}
10815
10816@item
10817Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
10818pp.@: 111-116.  Technical report version available @texlinebreak{}
10819@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}
10820
10821@item
10822Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
10823of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
10824pp.@: 145-157.  Technical report version also available @texlinebreak{}
10825@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}
10826
10827@item
10828Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
10829Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455.  Early
10830technical report version also available
10831@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}
10832
10833@item
10834Makoto Matsumoto and Takuji Nishimura, ``Mersenne Twister: A 623-dimensionally
10835equidistributed uniform pseudorandom number generator'', ACM Transactions on
10836Modelling and Computer Simulation, volume 8, January 1998, pp.@: 3-30.
10837Available online @texlinebreak{}
10838@uref{http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.ps.gz} (or .pdf)
10839
10840@item
10841R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'',
10842Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
10843Theory, October 1972, pp.@: 90-96.  Reprinted as ``Fast Modular Transforms'',
10844Journal of Computer and System Sciences, volume 8, number 3, June 1974,
10845pp.@: 366-386.
10846
10847@item
10848Niels M@"oller, ``On Sch@"onhage's algorithm and subquadratic integer GCD
10849  computation'', in Mathematics of Computation, volume 77, January 2008, pp.@:
10850  589-607.
10851
10852@item
10853Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
10854Mathematics of Computation, volume 44, number 170, April 1985.
10855
10856@item
10857Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
10858Zahlen'', Computing 7, 1971, pp.@: 281-292.
10859
10860@item
10861Kenneth Weber, ``The accelerated integer GCD algorithm'',
10862ACM Transactions on Mathematical Software,
10863volume 21, number 1, March 1995, pp.@: 111-122.
10864
10865@item
10866Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
10867November 1999, @uref{http://hal.inria.fr/inria-00072854/PDF/RR-3805.pdf}
10868
10869@item
10870Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
10871Implementations'', @texlinebreak{}
10872@uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz}
10873
10874@item
10875Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
10876Symposium on Computer Arithmetic, 1993, pp.@: 260 to 271.  Reprinted as ``More
10877on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
10878volume 43, number 8, August 1994, pp.@: 899-908.
10879@end itemize
10880
10881
10882@node GNU Free Documentation License, Concept Index, References, Top
10883@appendix GNU Free Documentation License
10884@cindex GNU Free Documentation License
10885@cindex Free Documentation License
10886@cindex Documentation license
10887@include fdl-1.3.texi
10888
10889
10890@node Concept Index, Function Index, GNU Free Documentation License, Top
10891@comment  node-name,  next,  previous,  up
10892@unnumbered Concept Index
10893@printindex cp
10894
10895@node Function Index,  , Concept Index, Top
10896@comment  node-name,  next,  previous,  up
10897@unnumbered Function and Type Index
10898@printindex fn
10899
10900@bye
10901
10902@c Local variables:
10903@c fill-column: 78
10904@c compile-command: "make gmp.info"
10905@c End:
10906