\input texinfo    @c -*-texinfo-*-
@c %**start of header
@setfilename gmp.info
@documentencoding ISO-8859-1
@include version.texi
@settitle GNU MP @value{VERSION}
@synindex tp fn
@iftex
@afourpaper
@end iftex
@comment %**end of header

@copying
This manual describes how to install and use the GNU multiple precision
arithmetic library, version @value{VERSION}.

Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013 Free Software
Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
software''. A copy of the license is included in
@ref{GNU Free Documentation License}.
@end copying
@c Note the @ref above must be on one line, a line break in an @ref within
@c @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes
@c with texinfo 4.7), with messages about missing @endcsname.


@c Texinfo version 4.2 or up will be needed to process this file.
@c
@c The version number and edition number are taken from version.texi provided
@c by automake (note that it's regenerated only if you configure with
@c --enable-maintainer-mode).
@c
@c Notes discussing the present version number of GMP in relation to previous
@c ones (for instance in the "Compatibility" section) must be updated
@c manually though.
@c
@c @cindex entries have been made for function categories and programming
@c topics. The "mpn" section is not included in this, because a beginner
@c looking for "GCD" or something is only going to be confused by pointers to
@c low level routines.
@c
@c @cindex entries are present for processors and systems when there's
@c particular notes concerning them, but not just for everything GMP
@c supports.
@c
@c Index entries for files use @code rather than @file, @samp or @option,
@c since the latter come out with quotes in TeX, which are nice in the text
@c but don't look so good in index columns.
@c
@c Tex:
@c
@c A suitable texinfo.tex is supplied, a newer one should work equally well.
@c
@c HTML:
@c
@c Nothing special is done for links to external manuals, they just come out
@c in the usual makeinfo style, eg. "../libc/Locales.html". If you have
@c local copies of such manuals then this is a good thing, if not then you
@c may want to search-and-replace to some online source.
@c

@dircategory GNU libraries
@direntry
* gmp: (gmp).                   GNU Multiple Precision Arithmetic Library.
@end direntry

@c html <meta name="description" content="...">
@documentdescription
How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}.
@end documentdescription

@c smallbook
@finalout
@setchapternewpage on

@ifnottex
@node Top, Copying, (dir), (dir)
@top GNU MP
@end ifnottex

@iftex
@titlepage
@title GNU MP
@subtitle The GNU Multiple Precision Arithmetic Library
@subtitle Edition @value{EDITION}
@subtitle @value{UPDATED}

@author by Torbj@"orn Granlund and the GMP development team
@c @email{tg@@gmplib.org}

@c Include the Distribution inside the titlepage so
@c that headings are turned off.

@tex
\global\parindent=0pt
\global\parskip=8pt
\global\baselineskip=13pt
@end tex

@page
@vskip 0pt plus 1filll
@end iftex

@insertcopying
@ifnottex
@sp 1
@end ifnottex

@iftex
@end titlepage
@headings double
@end iftex

@c Don't bother with contents for html, the menus seem adequate.
@ifnothtml
@contents
@end ifnothtml

@menu
* Copying::                    GMP Copying Conditions (LGPL).
* Introduction to GMP::        Brief introduction to GNU MP.
* Installing GMP::             How to configure and compile the GMP library.
* GMP Basics::                 What every GMP user should know.
* Reporting Bugs::             How to usefully report bugs.
* Integer Functions::          Functions for arithmetic on signed integers.
* Rational Number Functions::  Functions for arithmetic on rational numbers.
* Floating-point Functions::   Functions for arithmetic on floats.
* Low-level Functions::        Fast functions for natural numbers.
* Random Number Functions::    Functions for generating random numbers.
* Formatted Output::           @code{printf} style output.
* Formatted Input::            @code{scanf} style input.
* C++ Class Interface::        Class wrappers around GMP types.
* Custom Allocation::          How to customize the internal allocation.
* Language Bindings::          Using GMP from other languages.
* Algorithms::                 What happens behind the scenes.
* Internals::                  How values are represented behind the scenes.

* Contributors::               Who brings you this library?
* References::                 Some useful papers and books to read.
* GNU Free Documentation License::
* Concept Index::
* Function Index::
@end menu


@c @m{T,N} is $T$ in tex or @math{N} otherwise. This is an easy way to give
@c different forms for math in tex and info. Commas in N or T don't work,
@c but @C{} can be used instead. \, works in info but not in tex.
@iftex
@macro m {T,N}
@tex$\T\$@end tex
@end macro
@end iftex
@ifnottex
@macro m {T,N}
@math{\N\}
@end macro
@end ifnottex

@macro C {}
,
@end macro

@c @ms{V,N} is $V_N$ in tex or just vn otherwise. This suits simple
@c subscripts like @ms{x,0}.
@iftex
@macro ms {V,N}
@tex$\V\_{\N\}$@end tex
@end macro
@end iftex
@ifnottex
@macro ms {V,N}
\V\\N\
@end macro
@end ifnottex

@c @nicode{S} is plain S in info, or @code{S} elsewhere. This can be used
@c when the quotes that @code{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted. Doesn't work as @nicode{'\\0'}
@c though (gives two backslashes in tex).
@ifinfo
@macro nicode {S}
\S\
@end macro
@end ifinfo
@ifnotinfo
@macro nicode {S}
@code{\S\}
@end macro
@end ifnotinfo

@c @nisamp{S} is plain S in info, or @samp{S} elsewhere. This can be used
@c when the quotes that @samp{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted.
@ifinfo
@macro nisamp {S}
\S\
@end macro
@end ifinfo
@ifnotinfo
@macro nisamp {S}
@samp{\S\}
@end macro
@end ifnotinfo

@c Usage: @GMPtimes{}
@c Give either \times or the word "times".
@tex
\gdef\GMPtimes{\times}
@end tex
@ifnottex
@macro GMPtimes
times
@end macro
@end ifnottex

@c Usage: @GMPmultiply{}
@c Give * in info, or nothing in tex.
@tex
\gdef\GMPmultiply{}
@end tex
@ifnottex
@macro GMPmultiply
*
@end macro
@end ifnottex

@c Usage: @GMPabs{x}
@c Give either |x| in tex, or abs(x) in info or html.
@tex
\gdef\GMPabs#1{|#1|}
@end tex
@ifnottex
@macro GMPabs {X}
@abs{}(\X\)
@end macro
@end ifnottex

@c Usage: @GMPfloor{x}
@c Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
@tex
\gdef\GMPfloor#1{\lfloor #1\rfloor}
@end tex
@ifnottex
@macro GMPfloor {X}
floor(\X\)
@end macro
@end ifnottex

@c Usage: @GMPceil{x}
@c Give either \lceil x\rceil in tex, or ceil(x) in info or html.
@tex
\gdef\GMPceil#1{\lceil #1 \rceil}
@end tex
@ifnottex
@macro GMPceil {X}
ceil(\X\)
@end macro
@end ifnottex

@c Math operators already available in tex, made available in info too.
@c For example @bmod{} can be used in both tex and info.
@ifnottex
@macro bmod
mod
@end macro
@macro gcd
gcd
@end macro
@macro ge
>=
@end macro
@macro le
<=
@end macro
@macro log
log
@end macro
@macro min
min
@end macro
@macro leftarrow
<-
@end macro
@macro rightarrow
->
@end macro
@end ifnottex

@c New math operators.
@c @abs{} can be used in both tex and info, or just \abs in tex.
@tex
\gdef\abs{\mathop{\rm abs}}
@end tex
@ifnottex
@macro abs
abs
@end macro
@end ifnottex

@c @cross{} is a \times symbol in tex, or an "x" in info. In tex it works
@c inside or outside $ $.
@tex
\gdef\cross{\ifmmode\times\else$\times$\fi}
@end tex
@ifnottex
@macro cross
x
@end macro
@end ifnottex

@c @times{} made available as a "*" in info and html (already works in tex).
@ifnottex
@macro times
*
@end macro
@end ifnottex

@c Usage: @W{text}
@c Like @w{} but working in math mode too.
@tex
\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}
@end tex
@ifnottex
@macro W {S}
@w{\S\}
@end macro
@end ifnottex

@c Usage: \GMPdisplay{text}
@c Put the given text in an @display style indent, but without turning off
@c paragraph reflow etc.
@tex
\gdef\GMPdisplay#1{%
\noindent
\advance\leftskip by \lispnarrowing
#1\par}
@end tex

@c Usage: \GMPhat
@c A new \hat that will work in math mode, unlike the texinfo redefined
@c version.
@tex
\gdef\GMPhat{\mathaccent"705E}
@end tex

@c Usage: \GMPraise{text}
@c For use in a $ $ math expression as an alternative to "^".
@c This is good
@c for @code{} in an exponent, since there seems to be no superscript font
@c for that.
@tex
\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}
@end tex

@c Usage: @texlinebreak{}
@c A line break as per @*, but only in tex.
@iftex
@macro texlinebreak
@*
@end macro
@end iftex
@ifnottex
@macro texlinebreak
@end macro
@end ifnottex

@c Usage: @maybepagebreak
@c Allow tex to insert a page break, if it feels the urge.
@c Normally blocks of @deftypefun/funx are kept together, which can lead to
@c some poor page break positioning if it's a big block, like the sets of
@c division functions etc.
@tex
\gdef\maybepagebreak{\penalty0}
@end tex
@ifnottex
@macro maybepagebreak
@end macro
@end ifnottex

@c Usage: @GMPreftop{info,title}
@c Usage: @GMPpxreftop{info,title}
@c
@c Like @ref{} and @pxref{}, but designed for a reference to the top of a
@c document, not a particular section. The TeX output for plain @ref insists
@c on printing a particular section, GMPreftop gives just the title.
@c
@c The texinfo manual recommends putting a likely section name in references
@c like this, eg. "Introduction", but it seems better to just give the title.
@c
@iftex
@macro GMPreftop{info,title}
@i{\title\}
@end macro
@macro GMPpxreftop{info,title}
see @i{\title\}
@end macro
@end iftex
@c
@ifnottex
@macro GMPreftop{info,title}
@ref{Top,\title\,\title\,\info\,\title\}
@end macro
@macro GMPpxreftop{info,title}
@pxref{Top,\title\,\title\,\info\,\title\}
@end macro
@end ifnottex


@node Copying, Introduction to GMP, Top, Top
@comment node-name, next, previous, up
@unnumbered GNU MP Copying Conditions
@cindex Copying conditions
@cindex Conditions for copying GNU MP
@cindex License conditions

This library is @dfn{free}; this means that everyone is free to use it and
free to redistribute it on a free basis. The library is not in the public
domain; it is copyrighted and there are restrictions on its distribution, but
these restrictions are designed to permit everything that a good cooperating
citizen would want to do. What is not allowed is to try to prevent others
from further sharing any version of this library that they might get from
you.@refill

Specifically, we want to make sure that you have the right to give away copies
of the library, that you receive source code or else can get it if you want
it, that you can change this library or use pieces of it in new free programs,
and that you know you can do these things.@refill

To make sure that everyone has such rights, we have to forbid you to deprive
anyone else of these rights. For example, if you distribute copies of the GNU
MP library, you must give the recipients all the rights that you have. You
must make sure that they, too, receive or can get the source code. And you
must tell them their rights.@refill

Also, for our own protection, we must make certain that everyone finds out
that there is no warranty for the GNU MP library.
If it is modified by
someone else and passed on, we want their recipients to know that what they
have is not what we distributed, so that any problems introduced by others
will not reflect on our reputation.@refill

The precise conditions of the license for the GNU MP library are found in the
Lesser General Public License version 3 that accompanies the source code,
see @file{COPYING.LIB}. Certain demonstration programs are provided under the
terms of the plain General Public License version 3, see @file{COPYING}.


@node Introduction to GMP, Installing GMP, Copying, Top
@comment node-name, next, previous, up
@chapter Introduction to GNU MP
@cindex Introduction

GNU MP is a portable library written in C for arbitrary precision arithmetic
on integers, rational numbers, and floating-point numbers. It aims to provide
the fastest possible arithmetic for all applications that need higher
precision than is directly supported by the basic C types.

Many applications use just a few hundred bits of precision; but some
applications may need thousands or even millions of bits. GMP is designed to
give good performance for both, by choosing algorithms based on the sizes of
the operands, and by carefully keeping the overhead at a minimum.

The speed of GMP is achieved by using fullwords as the basic arithmetic type,
by using sophisticated algorithms, by including carefully optimized assembly
code for the most common inner loops for many different CPUs, and by a general
emphasis on speed (as opposed to simplicity or elegance).

There is assembly code for these CPUs:
@cindex CPU types
ARM,
DEC Alpha 21064, 21164, and 21264,
AMD 29000,
AMD K6, K6-2, Athlon, and Athlon64,
Hitachi SuperH and SH-2,
HPPA 1.0, 1.1 and 2.0,
Intel Pentium, Pentium Pro/II/III, Pentium 4, generic x86,
Intel IA-64, i960,
Motorola MC68000, MC68020, MC88100, and MC88110,
Motorola/IBM PowerPC 32 and 64,
National NS32000,
IBM POWER,
MIPS R3000, R4000,
SPARCv7, SuperSPARC, generic SPARCv8, UltraSPARC,
DEC VAX,
and
Zilog Z8000.
There are also some optimizations for
Cray vector systems,
Clipper,
IBM ROMP (RT),
and
Pyramid AP/XP.

@cindex Home page
@cindex Web page
@noindent
For up-to-date information on GMP, please see the GMP web pages at

@display
@uref{http://gmplib.org/}
@end display

@cindex Latest version of GMP
@cindex Anonymous FTP of latest version
@cindex FTP of latest version
@noindent
The latest version of the library is available at

@display
@uref{ftp://ftp.gnu.org/gnu/gmp/}
@end display

Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror
near you. See @uref{http://www.gnu.org/order/ftp.html} for a full list.

@cindex Mailing lists
There are three public mailing lists of interest: one for release
announcements, one for general questions and discussions about usage of the
GMP library, and one for bug reports. For more information, see

@display
@uref{http://gmplib.org/mailman/listinfo/}.
@end display

The proper place for bug reports is @email{gmp-bugs@@gmplib.org}. See
@ref{Reporting Bugs} for information about reporting bugs.

@sp 1
@section How to use this Manual
@cindex About this manual

Everyone should read @ref{GMP Basics}. If you need to install the library
yourself, then read @ref{Installing GMP}.
If you have a system with multiple
ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
on applications.

The rest of the manual can be used for later reference, although it is
probably a good idea to glance through it.


@node Installing GMP, GMP Basics, Introduction to GMP, Top
@comment node-name, next, previous, up
@chapter Installing GMP
@cindex Installing GMP
@cindex Configuring GMP
@cindex Building GMP

GMP has an autoconf/automake/libtool based configuration system. On a
Unix-like system a basic build can be done with

@example
./configure
make
@end example

@noindent
Some self-tests can be run with

@example
make check
@end example

@noindent
And you can install (under @file{/usr/local} by default) with

@example
make install
@end example

If you experience problems, please report them to @email{gmp-bugs@@gmplib.org}.
See @ref{Reporting Bugs}, for information on what to include in useful bug
reports.

@menu
* Build Options::
* ABI and ISA::
* Notes for Package Builds::
* Notes for Particular Systems::
* Known Build Problems::
* Performance optimization::
@end menu


@node Build Options, ABI and ISA, Installing GMP, Installing GMP
@section Build Options
@cindex Build options

All the usual autoconf configure options are available, run @samp{./configure
--help} for a summary. The file @file{INSTALL.autoconf} has some generic
installation information too.

@table @asis
@item Tools
@cindex Non-Unix systems
@samp{configure} requires various Unix-like tools. See @ref{Notes for
Particular Systems}, for some options on non-Unix systems.

It might be possible to build without the help of @samp{configure}, certainly
all the code is there, but unfortunately you'll be on your own.

@item Build Directory
@cindex Build directory
To compile in a separate build directory, @command{cd} to that directory, and
prefix the configure command with the path to the GMP source directory. For
example

@example
cd /my/build/dir
/my/sources/gmp-@value{VERSION}/configure
@end example

Not all @samp{make} programs have the necessary features (@code{VPATH}) to
support this. In particular, SunOS and Solaris @command{make} have bugs that
make them unable to build in a separate directory. Use GNU @command{make}
instead.

@item @option{--prefix} and @option{--exec-prefix}
@cindex Prefix
@cindex Exec prefix
@cindex Install prefix
@cindex @code{--prefix}
@cindex @code{--exec-prefix}
The @option{--prefix} option can be used in the normal way to direct GMP to
install under a particular tree. The default is @samp{/usr/local}.

@option{--exec-prefix} can be used to direct architecture-dependent files like
@file{libgmp.a} to a different location. This can be used to share
architecture-independent parts like the documentation, but separate the
dependent parts. Note however that @file{gmp.h} and @file{mp.h} are
architecture-dependent since they encode certain aspects of @file{libgmp}, so
it will be necessary to ensure both @file{$prefix/include} and
@file{$exec_prefix/include} are available to the compiler.

@item @option{--disable-shared}, @option{--disable-static}
@cindex @code{--disable-shared}
@cindex @code{--disable-static}
By default both shared and static libraries are built (where possible), but
one or the other can be disabled. Shared libraries result in smaller
executables and permit code sharing between separate running processes, but on
some CPUs are slightly slower, having a small cost on each function call.
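
As a sketch of how the options above combine, a static-only build installed
under a private tree might be configured as follows. The path
@file{/opt/gmp} is purely illustrative and not something GMP requires.

@example
./configure --prefix=/opt/gmp --disable-shared
make
make check
make install
@end example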

@item Native Compilation, @option{--build=CPU-VENDOR-OS}
@cindex Native compilation
@cindex Build system
@cindex @code{--build}
For normal native compilation, the system can be specified with
@samp{--build}. By default @samp{./configure} uses the output from running
@samp{./config.guess}. On some systems @samp{./config.guess} can determine
the exact CPU type, on others it will be necessary to give it explicitly. For
example,

@example
./configure --build=ultrasparc-sun-solaris2.7
@end example

In all cases the @samp{OS} part is important, since it controls how libtool
generates shared libraries. Running @samp{./config.guess} is the simplest way
to see what it should be, if you don't know already.

@item Cross Compilation, @option{--host=CPU-VENDOR-OS}
@cindex Cross compiling
@cindex Host system
@cindex @code{--host}
When cross-compiling, the system used for compiling is given by @samp{--build}
and the system where the library will run is given by @samp{--host}. For
example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,

@example
./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
@end example

Compiler tools are sought first with the host system type as a prefix. For
example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
@command{ranlib}. This makes it possible for a set of cross-compiling tools
to co-exist with native tools. The prefix is the argument to @samp{--host},
and this can be an alias, such as @samp{m68k-linux}. But note that tools
don't have to be set up this way, it's enough to just have a @env{PATH} with a
suitable cross-compiling @command{cc} etc.

Compiling for a different CPU in the same family as the build system is a form
of cross-compilation, though very possibly this would merely be special
options on a native compiler.
In any case @samp{./configure} avoids depending
on being able to run code on the build system, which is important when
creating binaries for a newer CPU since they very possibly won't run on the
build system.

In all cases the compiler must be able to produce an executable (of whatever
format) from a standard C @code{main}. Although only object files will go to
make up @file{libgmp}, @samp{./configure} uses linking tests for various
purposes, such as determining what functions are available on the host system.

Currently a warning is given unless an explicit @samp{--build} is used when
cross-compiling, because it may not be possible to correctly guess the build
system type if the @env{PATH} has only a cross-compiling @command{cc}.

Note that the @samp{--target} option is not appropriate for GMP@. It's for use
when building compiler tools, with @samp{--host} being where they will run,
and @samp{--target} what they'll produce code for. Ordinary programs or
libraries like GMP are only interested in the @samp{--host} part, being where
they'll run. (Some past versions of GMP used @samp{--target} incorrectly.)

@item CPU types
@cindex CPU types
In general, if you want a library that runs as fast as possible, you should
configure GMP for the exact CPU type your system uses. However, this may mean
the binaries won't run on older members of the family, and might run slower on
other members, older or newer. The best idea is always to build GMP for the
exact machine type you intend to run it on.

The following CPUs have specific support. See @file{configure.ac} for details
of what code and compiler options they select.

@itemize @bullet

@c Keep this formatting, it's easy to read and it can be grepped to
@c automatically test that CPUs listed get through ./config.sub

@item
Alpha:
@nisamp{alpha},
@nisamp{alphaev5},
@nisamp{alphaev56},
@nisamp{alphapca56},
@nisamp{alphapca57},
@nisamp{alphaev6},
@nisamp{alphaev67},
@nisamp{alphaev68},
@nisamp{alphaev7}

@item
Cray:
@nisamp{c90},
@nisamp{j90},
@nisamp{t90},
@nisamp{sv1}

@item
HPPA:
@nisamp{hppa1.0},
@nisamp{hppa1.1},
@nisamp{hppa2.0},
@nisamp{hppa2.0n},
@nisamp{hppa2.0w},
@nisamp{hppa64}

@item
IA-64:
@nisamp{ia64},
@nisamp{itanium},
@nisamp{itanium2}

@item
MIPS:
@nisamp{mips},
@nisamp{mips3},
@nisamp{mips64}

@item
Motorola:
@nisamp{m68k},
@nisamp{m68000},
@nisamp{m68010},
@nisamp{m68020},
@nisamp{m68030},
@nisamp{m68040},
@nisamp{m68060},
@nisamp{m68302},
@nisamp{m68360},
@nisamp{m88k},
@nisamp{m88110}

@item
POWER:
@nisamp{power},
@nisamp{power1},
@nisamp{power2},
@nisamp{power2sc}

@item
PowerPC:
@nisamp{powerpc},
@nisamp{powerpc64},
@nisamp{powerpc401},
@nisamp{powerpc403},
@nisamp{powerpc405},
@nisamp{powerpc505},
@nisamp{powerpc601},
@nisamp{powerpc602},
@nisamp{powerpc603},
@nisamp{powerpc603e},
@nisamp{powerpc604},
@nisamp{powerpc604e},
@nisamp{powerpc620},
@nisamp{powerpc630},
@nisamp{powerpc740},
@nisamp{powerpc7400},
@nisamp{powerpc7450},
@nisamp{powerpc750},
@nisamp{powerpc801},
@nisamp{powerpc821},
@nisamp{powerpc823},
@nisamp{powerpc860},
@nisamp{powerpc970}

@item
SPARC:
@nisamp{sparc},
@nisamp{sparcv8},
@nisamp{microsparc},
@nisamp{supersparc},
@nisamp{sparcv9},
@nisamp{ultrasparc},
@nisamp{ultrasparc2},
@nisamp{ultrasparc2i},
@nisamp{ultrasparc3},
@nisamp{sparc64}

@item
x86 family:
@nisamp{i386},
@nisamp{i486},
@nisamp{i586},
@nisamp{pentium},
@nisamp{pentiummmx},
@nisamp{pentiumpro},
@nisamp{pentium2},
@nisamp{pentium3},
@nisamp{pentium4},
@nisamp{k6},
@nisamp{k62},
@nisamp{k63},
@nisamp{athlon},
@nisamp{amd64},
@nisamp{viac3},
@nisamp{viac32}

@item
Other:
@nisamp{a29k},
@nisamp{arm},
@nisamp{clipper},
@nisamp{i960},
@nisamp{ns32k},
@nisamp{pyramid},
@nisamp{sh},
@nisamp{sh2},
@nisamp{vax},
@nisamp{z8k}
@end itemize

CPUs not listed will use generic C code.

@item Generic C Build
@cindex Generic C
If some of the assembly code causes problems, or if otherwise desired, the
generic C code can be selected with the configure option
@option{--disable-assembly}.

Note that this will run quite slowly, but it should be portable and should at
least make it possible to get something running if all else fails.

@item Fat binary, @option{--enable-fat}
@cindex Fat binary
@cindex @code{--enable-fat}
Using @option{--enable-fat} selects a ``fat binary'' build on x86, where
optimized low level subroutines are chosen at runtime according to the CPU
detected. This means more code, but gives good performance on all x86 chips.
(This option might become available for more architectures in the future.)

@item @option{ABI}
@cindex ABI
On some systems GMP supports multiple ABIs (application binary interfaces),
meaning data type sizes and calling conventions. By default GMP chooses the
best ABI available, but a particular ABI can be selected. For example

@example
./configure --host=mips64-sgi-irix6 ABI=n32
@end example

See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
applications need to do.

@item @option{CC}, @option{CFLAGS}
@cindex C compiler
@cindex @code{CC}
@cindex @code{CFLAGS}
By default the C compiler used is chosen from among some likely candidates,
with @command{gcc} normally preferred if it's present. The usual
@samp{CC=whatever} can be passed to @samp{./configure} to choose something
different.

For various systems, default compiler flags are set based on the CPU and
compiler. The usual @samp{CFLAGS="-whatever"} can be passed to
@samp{./configure} to use something different or to set good flags for systems
GMP doesn't otherwise know.

The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
and can be found in each generated @file{Makefile}. This is the easiest way
to check the defaults when considering changing or adding something.

Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
supporting multiple ABIs it's important to give an explicit
@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
won't be able to select the correct assembly code.

If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
compiler will be used (if GMP recognises it). For example @samp{CC=gcc} can
be used to force the use of GCC, with default flags (and default ABI).

@item @option{CPPFLAGS}
@cindex @code{CPPFLAGS}
Any flags like @samp{-D} defines or @samp{-I} includes required by the
preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
preprocessing uses just @samp{CPPFLAGS}. This distinction is because most
preprocessors won't accept all the flags the compiler does. Preprocessing is
done separately in some configure tests.

@item @option{CC_FOR_BUILD}
@cindex @code{CC_FOR_BUILD}
Some build-time programs are compiled and run to generate host-specific data
tables.
@samp{CC_FOR_BUILD} is the compiler used for this. It doesn't need
to be in any particular ABI or mode, it merely needs to generate executables
that can run. The default is to try the selected @samp{CC} and some likely
candidates such as @samp{cc} and @samp{gcc}, looking for something that works.

No flags are used with @samp{CC_FOR_BUILD} because a simple invocation like
@samp{cc foo.c} should be enough. If some particular options are required
they can be included as for instance @samp{CC_FOR_BUILD="cc -whatever"}.

@item C++ Support, @option{--enable-cxx}
@cindex C++ support
@cindex @code{--enable-cxx}
C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
C++ compiler will be required. As a convenience @samp{--enable-cxx=detect}
can be used to enable C++ support only if a compiler can be found. The C++
support consists of a library @file{libgmpxx.la} and header file
@file{gmpxx.h} (@pxref{Headers and Libraries}).

A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
bloated by a dependency on the C++ standard library, and to avoid any chance
that the C++ compiler could be required when linking plain C programs.

@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
only be expected to work with @file{libgmp.la} from the same GMP version.
Future changes to the relevant internals will be accompanied by renaming, so a
mismatch will cause unresolved symbols rather than perhaps mysterious
misbehaviour.

In general @file{libgmpxx.la} will be usable only with the C++ compiler that
built it, since name mangling and runtime support are usually incompatible
between different compilers.
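
As a rough sketch, a build with C++ support, followed by linking a program
against the two installed libraries, might look like the following. The file
name @file{myprog.cc} is only a placeholder, and @command{g++} is assumed to
be the same compiler that built @file{libgmpxx.la}.

@example
./configure --enable-cxx
make
make check
make install
g++ myprog.cc -lgmpxx -lgmp
@end example

@noindent
Here @samp{-lgmpxx} is given before @samp{-lgmp} because the C++ library
depends on the C library.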

@item @option{CXX}, @option{CXXFLAGS}
@cindex C++ compiler
@cindex @code{CXX}
@cindex @code{CXXFLAGS}
When C++ support is enabled, the C++ compiler and its flags can be set with
variables @samp{CXX} and @samp{CXXFLAGS} in the usual way. The default for
@samp{CXX} is the first compiler that works from a list of likely candidates,
with @command{g++} normally preferred when available. The default for
@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then
for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers
@samp{-g} or nothing. Trying @samp{CFLAGS} this way is convenient when using
@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will
usually suit @samp{g++}.

It's important that the C and C++ compilers match, meaning their startup and
runtime support routines are compatible and that they generate code in the
same ABI (if there's a choice of ABIs on the system). @samp{./configure}
isn't currently able to check these things very well itself, so for that
reason @samp{--disable-cxx} is the default, to avoid a build failure due to a
compiler mismatch. Perhaps this will change in the future.

Incidentally, it's normally not good enough to set @samp{CXX} to the same as
@samp{CC}. Although @command{gcc} for instance recognises @file{foo.cc} as
C++ code, only @command{g++} will invoke the linker the right way when
building an executable or shared library from C++ object files.

@item Temporary Memory, @option{--enable-alloca=<choice>}
@cindex Temporary memory
@cindex Stack overflow
@cindex @code{alloca}
@cindex @code{--enable-alloca}
GMP allocates temporary workspace using one of the following three methods,
which can be selected with for instance
@samp{--enable-alloca=malloc-reentrant}.

@itemize @bullet
@item
@samp{alloca} - C library or compiler builtin.
@item
@samp{malloc-reentrant} - the heap, in a re-entrant fashion.
@item
@samp{malloc-notreentrant} - the heap, with global variables.
@end itemize

For convenience, the following choices are also available.
@samp{--disable-alloca} is the same as @samp{no}.

@itemize @bullet
@item
@samp{yes} - a synonym for @samp{alloca}.
@item
@samp{no} - a synonym for @samp{malloc-reentrant}.
@item
@samp{reentrant} - @code{alloca} if available, otherwise
@samp{malloc-reentrant}. This is the default.
@item
@samp{notreentrant} - @code{alloca} if available, otherwise
@samp{malloc-notreentrant}.
@end itemize

@code{alloca} is reentrant and fast, and is recommended. It actually allocates
just small blocks on the stack; larger ones use malloc-reentrant.

@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
not required.

The two malloc methods in fact use the memory allocation functions selected by
@code{mp_set_memory_functions}, these being @code{malloc} and friends by
default. @xref{Custom Allocation}.

An additional choice @samp{--enable-alloca=debug} is available, to help when
debugging memory related problems (@pxref{Debugging}).

@item FFT Multiplication, @option{--disable-fft}
@cindex FFT multiplication
@cindex @code{--disable-fft}
By default multiplications are done using Karatsuba, 3-way Toom, higher degree
Toom, and Fermat FFT@. The FFT is only used on large to very large operands
and can be disabled to save code size if desired.

@item Assertion Checking, @option{--enable-assert}
@cindex Assertion checking
@cindex @code{--enable-assert}
This option enables some consistency checking within the library. This can be
of use while debugging, @pxref{Debugging}.
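
As an illustration only, the two debugging-related choices described above can
be given together on one @samp{./configure} command line. Whether this
particular combination is the most useful one for a given problem is an
assumption, not a recommendation.

@example
./configure --enable-assert --enable-alloca=debug
@end example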

@item Execution Profiling, @option{--enable-profiling=prof/gprof/instrument}
@cindex Execution profiling
@cindex @code{--enable-profiling}
Enable profiling support, in one of various styles, @pxref{Profiling}.

@item @option{MPN_PATH}
@cindex @code{MPN_PATH}
Various assembly versions of each mpn subroutine are provided. For a given
CPU, a search is made through a path to choose a version of each. For example
@samp{sparcv8} has

@example
MPN_PATH="sparc32/v8 sparc32 generic"
@end example

which means look first for v8 code, then plain sparc32 (which is v7), and
finally fall back on generic C@. Knowledgeable users with special requirements
can specify a different path. Normally this is completely unnecessary.

@item Documentation
@cindex Documentation formats
@cindex Texinfo
The source for the document you're now reading is @file{doc/gmp.texi}, in
Texinfo format, see @GMPreftop{texinfo, Texinfo}.

@cindex Postscript
@cindex DVI
@cindex PDF
Info format @samp{doc/gmp.info} is included in the distribution. The usual
automake targets are available to make PostScript, DVI, PDF and HTML (these
will require various @TeX{} and Texinfo tools).

@cindex DocBook
@cindex XML
DocBook and XML can be generated by the Texinfo @command{makeinfo} program
too, see @ref{makeinfo options,, Options for @command{makeinfo}, texinfo,
Texinfo}.

Some supplementary notes can also be found in the @file{doc} subdirectory.
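
A sketch of how the formats mentioned above might be generated, assuming the
commands are run in the @file{doc} directory of a configured source tree and
that the necessary @TeX{} and Texinfo tools are installed. The target names
are the standard automake ones and the @command{makeinfo} options are the
standard Texinfo ones; neither is specific to GMP.

@example
make pdf
make html
makeinfo --docbook gmp.texi
makeinfo --xml gmp.texi
@end example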

@end table


@need 2000
@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
@section ABI and ISA
@cindex ABI
@cindex Application Binary Interface
@cindex ISA
@cindex Instruction Set Architecture

ABI (Application Binary Interface) refers to the calling conventions between
functions, meaning what registers are used and what sizes the various C data
types are. ISA (Instruction Set Architecture) refers to the instructions and
registers a CPU has available.

Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
latter for compatibility with older CPUs in the family. GMP supports some
CPUs like this in both ABIs. In fact within GMP @samp{ABI} means a
combination of chip ABI, plus how GMP chooses to use it. For example in some
32-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
@code{long long}.

By default GMP chooses the best ABI available for a given system, and this
generally gives significantly greater speed. But an ABI can be chosen
explicitly to make GMP compatible with other libraries, or particular
application requirements. For example,

@example
./configure ABI=32
@end example

In all cases it's vital that all object code used in a given program is
compiled for the same ABI.

Usually a limb is implemented as a @code{long}. When a @code{long long} limb
is used this is encoded in the generated @file{gmp.h}. This is convenient for
applications, but it does mean that @file{gmp.h} will vary, and can't be just
copied around. @file{gmp.h} remains compiler independent though, since all
compilers for a particular ABI will be expected to use the same limb type.

Currently no attempt is made to follow whatever conventions a system has for
installing library or header files built for a particular ABI@.
This will
probably only matter when installing multiple builds of GMP, and it might be
as simple as configuring with a special @samp{libdir}, or it might require
more than that. Note that builds for different ABIs need to be done separately,
with a fresh @command{./configure} and @command{make} each.

@sp 1
@table @asis
@need 1000
@item AMD64 (@samp{x86_64})
@cindex AMD64
On AMD64 systems supporting both 32-bit and 64-bit modes for applications, the
following ABI choices are available.

@table @asis
@item @samp{ABI=64}
The 64-bit ABI uses 64-bit limbs and pointers and makes full use of the chip
architecture. This is the default. Applications will usually not need
special compiler flags, but for reference the option is

@example
gcc -m64
@end example

@item @samp{ABI=32}
The 32-bit ABI is the usual i386 conventions. This will be slower, and is not
recommended except for inter-operating with other code not yet 64-bit capable.
Applications must be compiled with

@example
gcc -m32
@end example

(In GCC 2.95 and earlier there's no @samp{-m32} option, it's the only mode.)
@end table

@sp 1
@need 1000
@item HPPA 2.0 (@samp{hppa2.0*}, @samp{hppa64})
@cindex HPPA
@cindex HP-UX
@table @asis
@item @samp{ABI=2.0w}
The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or
up. Applications must be compiled with

@example
gcc [built for 2.0w]
cc +DD64
@end example

@item @samp{ABI=2.0n}
The 2.0n ABI means the 32-bit HPPA 1.0 ABI and all its normal calling
conventions, but with 64-bit instructions permitted within functions. GMP
uses a 64-bit @code{long long} for a limb. This ABI is available on hppa64
GNU/Linux and on HP-UX 10 or higher.
Applications must be compiled with

@example
gcc [built for 2.0n]
cc +DA2.0 +e
@end example

Note that current versions of GCC (eg.@: 3.2) don't generate 64-bit
instructions for @code{long long} operations and so may be slower than for
2.0w. (The GMP assembly code is the same though.)

@item @samp{ABI=1.0}
HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI@.
No special compiler options are needed for applications.
@end table

All three ABIs are available for CPU types @samp{hppa2.0w}, @samp{hppa2.0} and
@samp{hppa64}, but for CPU type @samp{hppa2.0n} only 2.0n or 1.0 are
considered.

Note that GCC on HP-UX has no options to choose between 2.0n and 2.0w modes,
unlike HP @command{cc}. Instead it must be built for one or the other ABI@.
GMP will detect how it was built, and skip to the corresponding @samp{ABI}.

@sp 1
@need 1500
@item IA-64 under HP-UX (@samp{ia64*-*-hpux*}, @samp{itanium*-*-hpux*})
@cindex IA-64
@cindex HP-UX
HP-UX supports two ABIs for IA-64. GMP performance is the same in both.

@table @asis
@item @samp{ABI=32}
In the 32-bit ABI, pointers, @code{int}s and @code{long}s are 32 bits and GMP
uses a 64 bit @code{long long} for a limb. Applications can be compiled
without any special flags since this ABI is the default in both HP C and GCC,
but for reference the flags are

@example
gcc -milp32
cc +DD32
@end example

@item @samp{ABI=64}
In the 64-bit ABI, @code{long}s and pointers are 64 bits and GMP uses a
@code{long} for a limb. Applications must be compiled with

@example
gcc -mlp64
cc +DD64
@end example
@end table

On other IA-64 systems, GNU/Linux for instance, @samp{ABI=64} is the only
choice.

@sp 1
@need 1000
@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]})
@cindex MIPS
@cindex IRIX
IRIX 6 always has a 64-bit MIPS 3 or better CPU, and supports ABIs o32, n32,
and 64. n32 or 64 are recommended, and GMP performance will be the same in
each. The default is n32.

@table @asis
@item @samp{ABI=o32}
The o32 ABI is 32-bit pointers and integers, and no 64-bit operations. GMP
will be slower than in n32 or 64, this option only exists to support old
compilers, eg.@: GCC 2.7.2. Applications can be compiled with no special
flags on an old compiler, or on a newer compiler with

@example
gcc -mabi=32
cc -32
@end example

@item @samp{ABI=n32}
The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
@code{long long}. Applications must be compiled with

@example
gcc -mabi=n32
cc -n32
@end example

@item @samp{ABI=64}
The 64-bit ABI is 64-bit pointers and integers. Applications must be compiled
with

@example
gcc -mabi=64
cc -64
@end example
@end table

Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.

@sp 1
@need 1000
@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630}, @samp{powerpc970}, @samp{power4}, @samp{power5})
@cindex PowerPC
@table @asis
@item @samp{ABI=mode64}
@cindex AIX
The AIX 64 ABI uses 64-bit limbs and pointers and is the default on PowerPC 64
@samp{*-*-aix*} systems.
Applications must be compiled with

@example
gcc -maix64
xlc -q64
@end example

On 64-bit GNU/Linux, BSD, and Mac OS X/Darwin systems, the applications must
be compiled with

@example
gcc -m64
@end example

@item @samp{ABI=mode32}
The @samp{mode32} ABI uses a 64-bit @code{long long} limb but with the chip
still in 32-bit mode and using 32-bit calling conventions. This is the default
for systems where the true 64-bit ABI is unavailable. No special compiler
options are typically needed for applications. This ABI is not available under
AIX.

@item @samp{ABI=32}
This is the basic 32-bit PowerPC ABI, with a 32-bit limb. No special compiler
options are needed for applications.
@end table

GMP's speed is greatest for the @samp{mode64} ABI, the @samp{mode32} ABI is 2nd
best. In @samp{ABI=32} only the 32-bit ISA is used and this doesn't make full
use of a 64-bit chip.

@sp 1
@need 1000
@item Sparc V9 (@samp{sparc64}, @samp{sparcv9}, @samp{ultrasparc*})
@cindex Sparc V9
@cindex Solaris
@cindex Sun
@table @asis
@item @samp{ABI=64}
The 64-bit V9 ABI is available on the various BSD sparc64 ports, recent
versions of Sparc64 GNU/Linux, and Solaris 2.7 and up (when the kernel is in
64-bit mode). GCC 3.2 or higher, or Sun @command{cc} is required. On
GNU/Linux, depending on the default @command{gcc} mode, applications must be
compiled with

@example
gcc -m64
@end example

On Solaris applications must be compiled with

@example
gcc -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
cc -xarch=v9
@end example

On the BSD sparc64 systems no special options are required, since 64-bits is
the only ABI available.

@item @samp{ABI=32}
For the basic 32-bit ABI, GMP still uses as much of the V9 ISA as it can.
In
the Sun documentation this combination is known as ``v8plus''. On GNU/Linux,
depending on the default @command{gcc} mode, applications may need to be
compiled with

@example
gcc -m32
@end example

On Solaris, no special compiler options are required for applications, though
using something like the following is recommended. (@command{gcc} 2.8 and
earlier only support @samp{-mv8} though.)

@example
gcc -mv8plus
cc -xarch=v8plus
@end example
@end table

GMP speed is greatest in @samp{ABI=64}, so it's the default where available.
The speed is partly because there are extra registers available and partly
because 64-bits is considered the more important case and has therefore had
better code written for it.

Don't be confused by the names of the @samp{-m} and @samp{-x} compiler
options, they're called @samp{arch} but effectively control both ABI and ISA@.

On Solaris 2.6 and earlier, only @samp{ABI=32} is available since the kernel
doesn't save all registers.

On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
reject @samp{ABI=64} because the resulting executables won't run.
@samp{ABI=64} can still be built if desired by making it look like a
cross-compile, for example

@example
./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
@end example
@end table


@need 2000
@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
@section Notes for Package Builds
@cindex Build notes for binary packaging
@cindex Packaged builds

GMP should present no great difficulties for packaging in a binary
distribution.

@cindex Libtool versioning
@cindex Shared library versioning
Libtool is used to build the library and @samp{-version-info} is set
appropriately, having started from @samp{3:0:0} in GMP 3.0 (@pxref{Versioning,
Library interface versions, Library interface versions, libtool, GNU
Libtool}).

The GMP 4 series will be upwardly binary compatible in each release and will
be upwardly binary compatible with all of the GMP 3 series. Additional
function interfaces may be added in each release, so on systems where libtool
versioning is not fully checked by the loader an auxiliary mechanism may be
needed to express that a dynamic linked application depends on a new enough
GMP.

An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
from the same GMP version, since this is not done by the libtool versioning,
nor otherwise. A mismatch will result in unresolved symbols from the linker,
or perhaps the loader.

When building a package for a CPU family, care should be taken to use
@samp{--host} (or @samp{--build}) to choose the least common denominator among
the CPUs which might use the package. For example this might mean plain
@samp{sparc} (meaning V7) for SPARCs.

For x86s, @option{--enable-fat} sets things up for a fat binary build, making a
runtime selection of optimized low level routines. This is a good choice for
packaging to run on a range of x86 chips.

Users who care about speed will want GMP built for their exact CPU type, to
make best use of the available optimizations. Providing a way to suitably
rebuild a package may be useful. This could be as simple as making it
possible for a user to omit @samp{--build} (and @samp{--host}) so
@samp{./config.guess} will detect the CPU@.
But a way to manually specify a
@samp{--build} will be wanted for systems where @samp{./config.guess} is
inexact.

On systems with multiple ABIs, a packaged build will need to decide which
among the choices is to be provided, see @ref{ABI and ISA}. A given run of
@samp{./configure} etc will only build one ABI@. If a second ABI is also
required then a second run of @samp{./configure} etc must be made, starting
from a clean directory tree (@samp{make distclean}).

As noted under ``ABI and ISA'', currently no attempt is made to follow system
conventions for install locations that vary with ABI, such as
@file{/usr/lib/sparcv9} for @samp{ABI=64} as opposed to @file{/usr/lib} for
@samp{ABI=32}. A package build can override @samp{libdir} and other standard
variables as necessary.

Note that @file{gmp.h} is a generated file, and will be architecture and ABI
dependent. When attempting to install two ABIs simultaneously it will be
important that an application compile gets the correct @file{gmp.h} for its
desired ABI@. If compiler include paths don't vary with ABI options then it
might be necessary to create a @file{/usr/include/gmp.h} which tests
preprocessor symbols and chooses the correct actual @file{gmp.h}.


@need 2000
@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
@section Notes for Particular Systems
@cindex Build notes for particular systems
@cindex Particular systems
@cindex Systems
@table @asis

@c This section is more or less meant for notes about performance or about
@c build problems that have been worked around but might leave a user
@c scratching their head. Fun with different ABIs on a system belongs in the
@c above section.

@item AIX 3 and 4
@cindex AIX
On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since
some versions of the native @command{ar} fail on the convenience libraries
used. A shared build can be attempted with

@example
./configure --enable-shared --disable-static
@end example

Note that the @samp{--disable-static} is necessary because in a shared build
libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
the benefit of old versions of @command{ld} which only recognise @file{.a},
but unfortunately this is done even if a fully functional @command{ld} is
available.

@item ARM
@cindex ARM
On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a
bug in unsigned division, giving wrong results for some operands. GMP
@samp{./configure} will demand GCC 2.95.4 or later.

@item Compaq C++
@cindex Compaq C++
Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and
an old pre-standard one (see @samp{man iostream_intro}). GMP can only use the
standard one, which unfortunately is not the default but must be selected by
defining @code{__USE_STD_IOSTREAM}. Configure with for instance

@example
./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM
@end example

@item Floating Point Mode
@cindex Floating point mode
@cindex Hardware floating point mode
@cindex Precision of hardware floating point
@cindex x87
On some systems, the hardware floating point has a control mode which can set
all operations to be done in a particular precision, for instance single,
double or extended on x86 systems (x87 floating point). The GMP functions
involving a @code{double} cannot be expected to operate to their full
precision when the hardware is in single precision mode.
Of course this
affects all code, including application code, not just GMP.

@item MS-DOS and MS Windows
@cindex MS-DOS
@cindex MS Windows
@cindex Windows
@cindex Cygwin
@cindex DJGPP
@cindex MINGW
On an MS-DOS system DJGPP can be used to build GMP, and on an MS Windows
system Cygwin, DJGPP and MINGW can be used. All three are excellent ports of
GCC and the various GNU tools.

@display
@uref{http://www.cygwin.com/}
@uref{http://www.delorie.com/djgpp/}
@uref{http://www.mingw.org/}
@end display

@cindex Interix
@cindex Services for Unix
Microsoft also publishes an Interix ``Services for Unix'' which can be used to
build GMP on Windows (with a normal @samp{./configure}), but it's not free
software.

@item MS Windows DLLs
@cindex DLLs
@cindex MS Windows
@cindex Windows
On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by
default GMP builds only a static library, but a DLL can be built instead using

@example
./configure --disable-static --enable-shared
@end example

Static and DLL libraries can't both be built, since certain export directives
in @file{gmp.h} must be different.

A MINGW DLL build of GMP can be used with Microsoft C@. Libtool doesn't
install a @file{.lib} format import library, but it can be created with MS
@command{lib} as follows, and copied to the install directory. Similarly for
@file{libmp} and @file{libgmpxx}.

@example
cd .libs
lib /def:libgmp-3.dll.def /out:libgmp-3.lib
@end example

MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications
wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do
the same. If one of the other C runtime library choices provided by MS C is
desired then the suggestion is to use the GMP string functions and confine I/O
to the application.

@item Motorola 68k CPU Types
@cindex 68000
@samp{m68k} is taken to mean 68000. @samp{m68020} or higher will give a
performance boost on applicable CPUs. @samp{m68360} can be used for CPU32
series chips. @samp{m68302} can be used for ``Dragonball'' series chips,
though this is merely a synonym for @samp{m68000}.

@item OpenBSD 2.6
@cindex OpenBSD
@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
unsuitable for @file{.asm} file processing. @samp{./configure} will detect
the problem and either abort or choose another m4 in the @env{PATH}. The bug
is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.

@item Power CPU Types
@cindex Power/PowerPC
In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions
not available on the other, so it's important to choose the right one for the
CPU that will be used. Currently GMP has no assembly code support for using
just the common instruction subset. To get executables that run on both, the
current suggestion is to use the generic C code (@option{--disable-assembly}),
possibly with appropriate compiler options (like @samp{-mcpu=common} for
@command{gcc}). CPU @samp{rs6000} (which is not a CPU but a family of
workstations) is accepted by @file{config.sub}, but is currently equivalent to
@option{--disable-assembly}.

@item Sparc CPU Types
@cindex Sparc
@samp{sparcv8} or @samp{supersparc} on relevant systems will give a
significant performance increase over the V7 code selected by plain
@samp{sparc}.

@item Sparc App Regs
@cindex Sparc
The GMP assembly code for both 32-bit and 64-bit Sparc clobbers the
``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way
that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,, SPARC
Options, gcc, Using the GNU Compiler Collection (GCC)}).

This makes that code unsuitable for use with the special V9
@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer), and
for applications wanting to use those registers for special purposes. In these
cases the only suggestion currently is to build GMP with
@option{--disable-assembly} to avoid the assembly code.

@item SunOS 4
@cindex SunOS
@command{/usr/bin/m4} lacks various features needed to process @file{.asm}
files, and instead @samp{./configure} will automatically use
@command{/usr/5bin/m4}, which we believe is always available (if not then use
GNU m4).

@item x86 CPU Types
@cindex x86
@cindex 80x86
@cindex i386
@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for its intended
P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
P-III)@. @samp{i386} is a better choice when making binaries that must run on
both.

@item x86 MMX and SSE2 Code
@cindex MMX
@cindex SSE2
If the CPU selected has MMX code but the assembler doesn't support it, a
warning is given and non-MMX code is used instead. This will be an inferior
build, since the MMX code that's present is there because it's faster than the
corresponding plain integer code. The same applies to SSE2.

Old versions of @samp{gas} don't support MMX instructions, in particular
version 1.92.3 that comes with FreeBSD 2.2.8 or the more recent OpenBSD 3.1
doesn't.

Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
to register @code{movq} instructions, and so can't be used for MMX code.
Install a recent @command{gas} if MMX code is wanted on these systems.
@end table


@need 2000
@node Known Build Problems, Performance optimization, Notes for Particular Systems, Installing GMP
@section Known Build Problems
@cindex Build problems known

@c This section is more or less meant for known build problems that are not
@c otherwise worked around and require some sort of manual intervention.

You might find more up-to-date information at @uref{http://gmplib.org/}.

@table @asis
@item Compiler link options
The version of libtool currently in use rather aggressively strips compiler
options when linking a shared library. This will hopefully be relaxed in the
future, but for now if this is a problem the suggestion is to create a little
script to hide them, and for instance configure with

@example
./configure CC=gcc-with-my-options
@end example

@item DJGPP (@samp{*-*-msdosdjgpp*})
@cindex DJGPP
The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
script, it exits silently, having died writing a preamble to
@file{config.log}. Use @command{bash} 2.04 or higher.

@samp{make all} was found to run out of memory during the final
@file{libgmp.la} link on one system tested, despite having 64Mb available.
Running @samp{make libgmp.la} directly helped, perhaps recursing into the
various subdirectories uses up memory.

@item GNU binutils @command{strip} prior to 2.12
@cindex Stripped libraries
@cindex Binutils @command{strip}
@cindex GNU @command{strip}
@command{strip} from GNU binutils 2.11 and earlier should not be used on the
static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all
but the last of multiple archive members with the same name, like the three
versions of @file{init.o} in @file{libgmp.a}. Binutils 2.12 or higher can be
used successfully.

The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by
this and any version of @command{strip} can be used on them.

@item @command{make} syntax error
@cindex SCO
@cindex IRIX
On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make}
is unable to handle the long dependencies list for @file{libgmp.la}. The
symptom is a ``syntax error'' on the following line of the top-level
@file{Makefile}.

@example
libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES)
@end example

Either use GNU Make, or as a workaround remove
@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial
build work, but if any recompiling is done @file{libgmp.la} might not be
rebuilt).

@item MacOS X (@samp{*-*-darwin*})
@cindex MacOS X
@cindex Darwin
Libtool currently only knows how to create shared libraries on MacOS X using
the native @command{cc} (which is a modified GCC), not a plain GCC@. A
static-only build should work though (@samp{--disable-shared}).

@item NeXT prior to 3.3
@cindex NeXT
The system compiler on old versions of NeXT was a massacred and old GCC, even
if it called itself @file{cc}. This compiler cannot be used to build GMP, you
need to get a real GCC, and install that. (NeXT may have fixed this in
release 3.3 of their system.)

@item POWER and PowerPC
@cindex Power/PowerPC
Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
PowerPC@. If you want to use GCC for these machines, get GCC 2.7.2.1 (or
later).

@item Sequent Symmetry
@cindex Sequent Symmetry
Use the GNU assembler instead of the system assembler, since the latter has
serious bugs.

@item Solaris 2.6
@cindex Solaris
The system @command{sed} prints an error ``Output line too long'' when libtool
builds @file{libgmp.la}.
This doesn't seem to cause any obvious ill effects,
but GNU @command{sed} is recommended, to avoid any doubt.

@item Sparc Solaris 2.7 with gcc 2.95.2 in @samp{ABI=32}
@cindex Solaris
A shared library build of GMP seems to fail in this combination, it builds but
then fails the tests, apparently due to some incorrect data relocations within
@code{gmp_randinit_lc_2exp_size}. The exact cause is unknown,
@samp{--disable-shared} is recommended.
@end table


@need 2000
@node Performance optimization, , Known Build Problems, Installing GMP
@section Performance optimization
@cindex Optimizing performance

@c At some point, this should perhaps move to a separate chapter on optimizing
@c performance.

For optimal performance, build GMP for the exact CPU type of the target
computer, see @ref{Build Options}.

Unlike what is the case for most other programs, the compiler typically
doesn't matter much, since GMP uses assembly language for the most critical
operations.

In particular for long-running GMP applications, and applications demanding
extremely large numbers, building and running the @code{tuneup} program in the
@file{tune} subdirectory can be important. For example,

@example
cd tune
make tuneup
./tuneup
@end example

will generate better contents for the @file{gmp-mparam.h} parameter file.

To use the results, put the output in the file indicated in the
@samp{Parameters for ...} header. Then recompile from scratch.

The @code{tuneup} program takes one useful parameter, @samp{-f NNN}, which
instructs the program how long to check FFT multiply parameters. If you're
going to use GMP for extremely large numbers, you may want to run @code{tuneup}
with a large NNN value.
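
For instance, a run passing a large @samp{-f} value might look like the
following sketch. The value @samp{10000000} is purely illustrative, and
redirecting the output to a file named @file{tuneup.out} is merely one
convenient way to capture it; neither is a recommendation.

@example
cd tune
make tuneup
./tuneup -f 10000000 >tuneup.out
@end example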


@node GMP Basics, Reporting Bugs, Installing GMP, Top
@comment node-name, next, previous, up
@chapter GMP Basics
@cindex Basics

@strong{Using functions, macros, data types, etc.@: not documented in this
manual is strongly discouraged. If you do so your application is guaranteed
to be incompatible with future versions of GMP.}

@menu
* Headers and Libraries::
* Nomenclature and Types::
* Function Classes::
* Variable Conventions::
* Parameter Conventions::
* Memory Management::
* Reentrancy::
* Useful Macros and Constants::
* Compatibility with older versions::
* Demonstration Programs::
* Efficiency::
* Debugging::
* Profiling::
* Autoconf::
* Emacs::
@end menu

@node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics
@section Headers and Libraries
@cindex Headers

@cindex @file{gmp.h}
@cindex Include files
@cindex @code{#include}
All declarations needed to use GMP are collected in the include file
@file{gmp.h}. It is designed to work with both C and C++ compilers.

@example
#include <gmp.h>
@end example

@cindex @code{stdio.h}
Note however that prototypes for GMP functions with @code{FILE *} parameters
are only provided if @code{<stdio.h>} is included too.

@example
#include <stdio.h>
#include <gmp.h>
@end example

@cindex @code{stdarg.h}
Likewise @code{<stdarg.h>} (or @code{<varargs.h>}) is required for prototypes
with @code{va_list} parameters, such as @code{gmp_vprintf}. And
@code{<obstack.h>} for prototypes with @code{struct obstack} parameters, such
as @code{gmp_obstack_printf}, when available.

@cindex Libraries
@cindex Linking
@cindex @code{libgmp}
All programs using GMP must link against the @file{libgmp} library.
On a
typical Unix-like system this can be done with @samp{-lgmp}, for example

@example
gcc myprogram.c -lgmp
@end example

@cindex @code{libgmpxx}
GMP C++ functions are in a separate @file{libgmpxx} library.  This is built
and installed if C++ support has been enabled (@pxref{Build Options}).  For
example,

@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example

@cindex Libtool
GMP is built using Libtool and an application can use that to link if desired,
@GMPpxreftop{libtool, GNU Libtool}.

If GMP has been installed to a non-standard location then it may be necessary
to use @samp{-I} and @samp{-L} compiler options to point to the right
directories, and some sort of run-time path for a shared library.


@node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics
@section Nomenclature and Types
@cindex Nomenclature
@cindex Types

@cindex Integer
@tindex @code{mpz_t}
In this manual, @dfn{integer} usually means a multiple precision integer, as
defined by the GMP library.  The C data type for such integers is @code{mpz_t}.
Here are some examples of how to declare such integers:

@example
mpz_t sum;

struct foo @{ mpz_t x, y; @};

mpz_t vec[20];
@end example

@cindex Rational number
@tindex @code{mpq_t}
@dfn{Rational number} means a multiple precision fraction.  The C data type
for these fractions is @code{mpq_t}.  For example:

@example
mpq_t quotient;
@end example

@cindex Floating-point number
@tindex @code{mpf_t}
@dfn{Floating point number} or @dfn{Float} for short, is an arbitrary precision
mantissa with a limited precision exponent.  The C data type for such objects
is @code{mpf_t}.
For example:

@example
mpf_t fp;
@end example

@tindex @code{mp_exp_t}
The floating point functions accept and return exponents in the C type
@code{mp_exp_t}.  Currently this is usually a @code{long}, but on some systems
it's an @code{int} for efficiency.

@cindex Limb
@tindex @code{mp_limb_t}
A @dfn{limb} means the part of a multi-precision number that fits in a single
machine word.  (We chose this word because a limb of the human body is
analogous to a digit, only larger, and containing several digits.)  Normally a
limb is 32 or 64 bits.  The C data type for a limb is @code{mp_limb_t}.

@tindex @code{mp_size_t}
Counts of limbs of a multi-precision number are represented in the C type
@code{mp_size_t}.  Currently this is normally a @code{long}, but on some
systems it's an @code{int} for efficiency, and on some systems it will be
@code{long long} in the future.

@tindex @code{mp_bitcnt_t}
Counts of bits of a multi-precision number are represented in the C type
@code{mp_bitcnt_t}.  Currently this is always an @code{unsigned long}, but on
some systems it will be an @code{unsigned long long} in the future.

@cindex Random state
@tindex @code{gmp_randstate_t}
@dfn{Random state} means an algorithm selection and current state data.  The C
data type for such objects is @code{gmp_randstate_t}.  For example:

@example
gmp_randstate_t rstate;
@end example

Also, in general @code{mp_bitcnt_t} is used for bit counts and ranges, and
@code{size_t} is used for byte or character counts.


@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
@section Function Classes
@cindex Function classes

There are five classes of functions in the GMP library:

@enumerate
@item
Functions for signed integer arithmetic, with names beginning with
@code{mpz_}.
The associated type is @code{mpz_t}.  There are about 150
functions in this class.  (@pxref{Integer Functions})

@item
Functions for rational number arithmetic, with names beginning with
@code{mpq_}.  The associated type is @code{mpq_t}.  There are about 40
functions in this class, but the integer functions can be used for arithmetic
on the numerator and denominator separately.
(@pxref{Rational Number Functions})

@item
Functions for floating-point arithmetic, with names beginning with
@code{mpf_}.  The associated type is @code{mpf_t}.  There are about 60
functions in this class.  (@pxref{Floating-point Functions})

@item
Fast low-level functions that operate on natural numbers.  These are used by
the functions in the preceding groups, and you can also call them directly
from very time-critical user programs.  These functions' names begin with
@code{mpn_}.  The associated type is an array of @code{mp_limb_t}.  There are
about 30 (hard-to-use) functions in this class.  (@pxref{Low-level Functions})

@item
Miscellaneous functions.  Functions for setting up custom allocation and
functions for generating random numbers.  (@pxref{Custom Allocation}, and
@pxref{Random Number Functions})
@end enumerate


@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
@section Variable Conventions
@cindex Variable conventions
@cindex Conventions for variables

GMP functions generally have output arguments before input arguments.  This
notation is by analogy with the assignment operator.  The BSD MP compatibility
functions are exceptions, having the output arguments last.

GMP lets you use the same variable for both input and output in one call.
For
example, the main function for integer multiplication, @code{mpz_mul}, can be
used to square @code{x} and put the result back in @code{x} with

@example
mpz_mul (x, x, x);
@end example

Before you can assign to a GMP variable, you need to initialize it by calling
one of the special initialization functions.  When you're done with a
variable, you need to clear it out, using one of the functions for that
purpose.  Which function to use depends on the type of variable.  See the
chapters on integer functions, rational number functions, and floating-point
functions for details.

A variable should only be initialized once, or at least cleared between each
initialization.  After a variable has been initialized, it may be assigned to
any number of times.

For efficiency reasons, avoid excessive initializing and clearing.  In
general, initialize near the start of a function and clear near the end.  For
example,

@example
void
foo (void)
@{
  mpz_t  n;
  int    i;
  mpz_init (n);
  for (i = 1; i < 100; i++)
    @{
      mpz_mul (n, @dots{});
      mpz_fdiv_q (n, @dots{});
      @dots{}
    @}
  mpz_clear (n);
@}
@end example


@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
@section Parameter Conventions
@cindex Parameter conventions
@cindex Conventions for parameters

When a GMP variable is used as a function parameter, it's effectively a
call-by-reference, meaning if the function stores a value there it will change
the original in the caller.  Parameters which are input-only can be designated
@code{const} to provoke a compiler error or warning on attempting to modify
them.

When a function is going to return a GMP result, it should designate a
parameter that it sets, like the library functions do.
More than one value
can be returned by having more than one output parameter, again like the
library functions.  A @code{return} of an @code{mpz_t} etc doesn't return the
object, only a pointer, and this is almost certainly not what's wanted.

Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
and storing the result to the indicated parameter.

@example
#include <gmp.h>

void
foo (mpz_t result, const mpz_t param, unsigned long n)
@{
  unsigned long  i;
  mpz_mul_ui (result, param, n);
  for (i = 1; i < n; i++)
    mpz_add_ui (result, result, i*7);
@}

int
main (void)
@{
  mpz_t  r, n;
  mpz_init (r);
  mpz_init_set_str (n, "123456", 0);
  foo (r, n, 20L);
  gmp_printf ("%Zd\n", r);
  mpz_clear (r);
  mpz_clear (n);
  return 0;
@}
@end example

@code{foo} works even if the mainline passes the same variable for
@code{param} and @code{result}, just like the library functions.  But
sometimes it's tricky to make that work, and an application might not want to
bother supporting that sort of thing.

For interest, the GMP types @code{mpz_t} etc are implemented as one-element
arrays of certain structures.  This is why declaring a variable creates an
object with the fields GMP needs, but then using it as a parameter passes a
pointer to the object.  Note that the actual fields in each @code{mpz_t} etc
are for internal use only and should not be accessed directly by code that
expects to be compatible with future GMP releases.


@need 1000
@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
@section Memory Management
@cindex Memory management

The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
and pointers to allocated data.  Once a variable is initialized, GMP takes
care of all space allocation.  Additional space is allocated whenever a
variable doesn't have enough.
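
A minimal sketch of this behaviour follows.  The helper name
@code{bits_of_power} is hypothetical, chosen only for illustration; the
variable is initialized without any size hint, and GMP allocates whatever the
result needs.

@example
#include <stddef.h>
#include <gmp.h>

/* Hypothetical helper: number of bits needed to hold base^exp.  */
size_t
bits_of_power (unsigned long base, unsigned long exp)
@{
  mpz_t   x;
  size_t  bits;

  mpz_init (x);                   /* no size is given here */
  mpz_ui_pow_ui (x, base, exp);   /* space grows as the result requires */
  bits = mpz_sizeinbase (x, 2);   /* exact, since 2 is a power of 2 */
  mpz_clear (x);                  /* release the allocated space */
  return bits;
@}
@end example

For instance @code{bits_of_power (2, 1000)} gives 1001, since 2 to the power
1000 is a one bit followed by 1000 zero bits.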

@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
Normally this is the best policy, since it avoids frequent reallocation.
Applications that need to return memory to the heap at some particular point
can use @code{mpz_realloc2}, or clear variables no longer needed.

@code{mpf_t} variables, in the current implementation, use a fixed amount of
space, determined by the chosen precision and allocated at initialization, so
their size doesn't change.

All memory is allocated using @code{malloc} and friends by default, but this
can be changed, see @ref{Custom Allocation}.  Temporary memory on the stack is
also used (via @code{alloca}), but this can be changed at build-time if
desired, see @ref{Build Options}.


@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
@section Reentrancy
@cindex Reentrancy
@cindex Thread safety
@cindex Multi-threading

@noindent
GMP is reentrant and thread-safe, with some exceptions:

@itemize @bullet
@item
If configured with @option{--enable-alloca=malloc-notreentrant} (or with
@option{--enable-alloca=notreentrant} when @code{alloca} is not available),
then naturally GMP is not reentrant.

@item
@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
selected precision.  @code{mpf_init2} can be used instead, and in the C++
interface an explicit precision to the @code{mpf_class} constructor.

@item
@code{mpz_random} and the other old random number functions use a global
random state and are hence not reentrant.  The newer random number functions
that accept a @code{gmp_randstate_t} parameter can be used instead.

@item
@code{gmp_randinit} (obsolete) returns an error indication through a global
variable, which is not thread safe.
Applications are advised to use
@code{gmp_randinit_default} or @code{gmp_randinit_lc_2exp} instead.

@item
@code{mp_set_memory_functions} uses global variables to store the selected
memory allocation functions.

@item
If the memory allocation functions set by a call to
@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
not reentrant, then GMP will not be reentrant either.

@item
If the standard I/O functions such as @code{fwrite} are not reentrant then the
GMP I/O functions using them will not be reentrant either.

@item
It's safe for two threads to read from the same GMP variable simultaneously,
but it's not safe for one to read while another might be writing, nor for
two threads to write simultaneously.  It's not safe for two threads to
generate a random number from the same @code{gmp_randstate_t} simultaneously,
since this involves an update of that variable.
@end itemize


@need 2000
@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
@section Useful Macros and Constants
@cindex Useful macros and constants
@cindex Constants

@deftypevr {Global Constant} {const int} mp_bits_per_limb
@findex mp_bits_per_limb
@cindex Bits per limb
@cindex Limb size
The number of bits per limb.
@end deftypevr

@defmac __GNU_MP_VERSION
@defmacx __GNU_MP_VERSION_MINOR
@defmacx __GNU_MP_VERSION_PATCHLEVEL
@cindex Version number
@cindex GMP version number
The major and minor GMP version, and patch level, respectively, as integers.
For GMP i.j, these numbers will be i, j, and 0, respectively.
For GMP i.j.k, these numbers will be i, j, and k, respectively.
@end defmac

@deftypevr {Global Constant} {const char * const} gmp_version
@findex gmp_version
The GMP version number, as a null-terminated string, in the form ``i.j.k''.
This release is @nicode{"@value{VERSION}"}.  Note that before version 4.3.0
the format ``i.j'' was used when k was zero.
@end deftypevr

@defmac __GMP_CC
@defmacx __GMP_CFLAGS
The compiler and compiler flags, respectively, used when compiling GMP, as
strings.
@end defmac


@node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics
@section Compatibility with older versions
@cindex Compatibility with older versions
@cindex Past GMP versions
@cindex Upward compatibility

This version of GMP is upwardly binary compatible with all 5.x, 4.x, and 3.x
versions, and upwardly compatible at the source level with all 2.x versions,
with the following exceptions.

@itemize @bullet
@item
@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
with other @code{mpn} functions.

@item
@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
3.0.1, but in 3.1 reverted to the 2.x style.

@item
@code{mpn_bdivmod}, documented as preliminary in GMP 4, has been removed.
@end itemize

There are a number of compatibility issues between GMP 1 and GMP 2 that of
course also apply when porting applications from GMP 1 to GMP 5.  Please
see the GMP 2 manual for details.

@c @item Integer division functions round the result differently.  The obsolete
@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
@c quotient towards
@c @ifinfo
@c @minus{}infinity).
@c @end ifinfo
@c @iftex
@c @tex
@c $-\infty$).
@c @end tex
@c @end iftex
@c There are a lot of functions for integer division, giving the user better
@c control over the rounding.

@c @item The function @code{mpz_mod} now compute the true @strong{mod} function.

@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
@c @strong{mod} for reduction.

@c @item The assignment functions for rational numbers do no longer canonicalize
@c their results.  In the case a non-canonical result could arise from an
@c assignment, the user need to insert an explicit call to
@c @code{mpq_canonicalize}.  This change was made for efficiency.

@c @item Output generated by @code{mpz_out_raw} in this release cannot be read
@c by @code{mpz_inp_raw} in previous releases.  This change was made for making
@c the file format truly portable between machines with different word sizes.

@c @item Several @code{mpn} functions have changed.  But they were intentionally
@c undocumented in previous releases.

@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
@c are now implemented as macros, and thereby sometimes evaluate their
@c arguments multiple times.

@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
@c for 0^0.  (In version 1, they yielded 0.)

@c In version 1 of the library, @code{mpq_set_den} handled negative
@c denominators by copying the sign to the numerator.  That is no longer done.

@c Pure assignment functions do not canonicalize the assigned variable.  It is
@c the responsibility of the user to canonicalize the assigned variable before
@c any arithmetic operations are performed on that variable.
@c Note that this is an incompatible change from version 1 of the library.

@c @end enumerate


@need 1000
@node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics
@section Demonstration programs
@cindex Demonstration programs
@cindex Example programs
@cindex Sample programs
The @file{demos} subdirectory has some sample programs using GMP@.  These
aren't built or installed, but there's a @file{Makefile} with rules for them.
For instance,

@example
make pexpr
./pexpr 68^975+10
@end example

@noindent
The following programs are provided

@itemize @bullet
@item
@cindex Expression parsing demo
@cindex Parsing expressions demo
@samp{pexpr} is an expression evaluator, the program used on the GMP web page.
@item
@cindex Expression parsing demo
@cindex Parsing expressions demo
The @samp{calc} subdirectory has a similar but simpler evaluator using
@command{lex} and @command{yacc}.
@item
@cindex Expression parsing demo
@cindex Parsing expressions demo
The @samp{expr} subdirectory is yet another expression evaluator, a library
designed for ease of use within a C program.  See @file{demos/expr/README} for
more information.
@item
@cindex Factorization demo
@samp{factorize} is a Pollard-Rho factorization program.
@item
@samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p}
function.
@item
@samp{primes} counts or lists primes in an interval, using a sieve.
@item
@samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic
class numbers.
@item
@cindex @code{perl}
@cindex GMP Perl module
@cindex Perl module
The @samp{perl} subdirectory is a comprehensive perl interface to GMP@.  See
@file{demos/perl/INSTALL} for more information.  Documentation is in POD
format in @file{demos/perl/GMP.pm}.
@end itemize

As an aside, consideration has been given at various times to some sort of
expression evaluation within the main GMP library.  Going beyond something
minimal quickly leads to matters like user-defined functions, looping, fixnums
for control variables, etc, which are considered outside the scope of GMP
(much closer to language interpreters or compilers, @pxref{Language Bindings}).
Something simple for program input convenience may yet be a possibility, a
combination of the @file{expr} demo and the @file{pexpr} tree back-end
perhaps.  But for now the above evaluators are offered as illustrations.


@need 1000
@node Efficiency, Debugging, Demonstration Programs, GMP Basics
@section Efficiency
@cindex Efficiency

@table @asis
@item Small Operands
@cindex Small operands
On small operands, the time for function call overheads and memory allocation
can be significant in comparison to actual calculation.  This is unavoidable
in a general purpose variable precision library, although GMP attempts to be
as efficient as it can on both large and small operands.

@item Static Linking
@cindex Static linking
On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
have a small overhead on each function call and global data address.  For many
programs this will be insignificant, but for long calculations there's a gain
to be had.

@item Initializing and Clearing
@cindex Initializing and clearing
Avoid excessive initializing and clearing of variables, since this can be
quite time consuming, especially in comparison to otherwise fast operations
like addition.

A language interpreter might want to keep a free list or stack of
initialized variables ready for use.
It should be possible to integrate
something like that with a garbage collector too.

@item Reallocations
@cindex Reallocations
An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
values will have its memory repeatedly @code{realloc}ed, which could be quite
slow or could fragment memory, depending on the C library.  If an application
can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
be called to allocate the necessary space from the beginning
(@pxref{Initializing Integers}).

It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
is too small, since all functions will do a further reallocation if necessary.
Badly overestimating memory required will waste space though.

@item @code{2exp} Functions
@cindex @code{2exp} functions
It's up to an application to call functions like @code{mpz_mul_2exp} when
appropriate.  General purpose functions like @code{mpz_mul} make no attempt to
identify powers of two or other special forms, because such inputs will
usually be very rare and testing every time would be wasteful.

@item @code{ui} and @code{si} Functions
@cindex @code{ui} and @code{si} functions
The @code{ui} functions and the small number of @code{si} functions exist for
convenience and should be used where applicable.  But if for example an
@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
need to extract it and call a @code{ui} function; just use the regular
@code{mpz} function.

@item In-Place Operations
@cindex In-place operations
@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
and @code{mpf_neg} are fast when used for in-place operations like
@code{mpz_abs(x,x)}, since in the current implementation only a single field
of @code{x} needs changing.
On suitable compilers (GCC for instance) this is
inlined too.

@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
usually only one or two limbs of @code{x} will need to be changed.  The same
applies to the full precision @code{mpz_add} etc if @code{y} is small.  If
@code{y} is big then cache locality may be helped, but that's all.

@code{mpz_mul} is currently the opposite: a separate destination is slightly
better.  A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
limb, make a temporary copy of @code{x} before forming the result.  Normally
that copying will only be a tiny fraction of the time for the multiply, so
this is not a particularly important consideration.

@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
no attempt to recognise a copy of something to itself, so a call like
@code{mpz_set(x,x)} will be wasteful.  Naturally that would never be written
deliberately, but if it might arise from two pointers to the same object then
a test to avoid it might be desirable.

@example
if (x != y)
  mpz_set (x, y);
@end example

Note that it's never worth introducing extra @code{mpz_set} calls just to get
in-place operations.  If a result should go to a particular variable then just
direct it there and let GMP take care of data movement.

@item Divisibility Testing (Small Integers)
@cindex Divisibility testing
@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
for testing whether an @code{mpz_t} is divisible by an individual small
integer.  They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
which gives no useful information about the actual remainder, only whether
it's zero (or a particular value).
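
As a brief sketch of how these two functions are called (the helper
@code{classify_mod7} and its return convention are hypothetical, for
illustration only):

@example
#include <gmp.h>

/* Hypothetical helper: returns 0 if 7 divides n, 1 if n is congruent
   to 1 modulo 7, and -1 otherwise.  No remainder is ever computed.  */
int
classify_mod7 (const mpz_t n)
@{
  if (mpz_divisible_ui_p (n, 7UL))
    return 0;
  if (mpz_congruent_ui_p (n, 1UL, 7UL))
    return 1;
  return -1;
@}
@end example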

However when testing divisibility by several small integers, it's best to take
a remainder modulo their product, to save multi-precision operations.  For
instance to test whether a number is divisible by any of 23, 29 or 31 take a
remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that.

The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
as a remainder are generally a little slower than the remainder-only functions
like @code{mpz_tdiv_ui}.  If the quotient is only rarely wanted then it's
probably best to just take a remainder and then go back and calculate the
quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
remainder is zero).

@item Rational Arithmetic
@cindex Rational arithmetic
The @code{mpq} functions operate on @code{mpq_t} values with no common factors
in the numerator and denominator.  Common factors are checked-for and cast out
as necessary.  In general, cancelling factors every time is the best approach
since it minimizes the sizes for subsequent operations.

However, applications that know something about the factorization of the
values they're working with might be able to avoid some of the GCDs used for
canonicalization, or swap them for divisions.  For example when multiplying by
a prime it's enough to check for factors of it in the denominator instead of
doing a full GCD@.  Or when forming a big product it might be known that very
little cancellation will be possible, and so canonicalization can be left to
the end.

The @code{mpq_numref} and @code{mpq_denref} macros give access to the
numerator and denominator to do things outside the scope of the supplied
@code{mpq} functions.  @xref{Applying Integer Functions}.
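
The prime multiplier case mentioned above might be sketched as follows.  The
helper @code{mul_by_prime} is hypothetical, not part of GMP, and it assumes
@code{q} is already in canonical form and @code{p} really is prime; neither
is checked.

@example
#include <gmp.h>

/* Hypothetical helper: q = q * p, for canonical q and prime p.  */
void
mul_by_prime (mpq_t q, unsigned long p)
@{
  if (mpz_divisible_ui_p (mpq_denref (q), p))
    /* p cancels against the denominator, numerator unchanged */
    mpz_divexact_ui (mpq_denref (q), mpq_denref (q), p);
  else
    /* nothing to cancel, p goes into the numerator */
    mpz_mul_ui (mpq_numref (q), mpq_numref (q), p);
@}
@end example

Since the numerator and denominator of @code{q} have no common factor, the
only factor that could need cancelling is @code{p} itself, so a single
divisibility test takes the place of the GCD.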

The canonical form for rationals allows mixed-type @code{mpq_t} and integer
additions or subtractions to be done directly with multiples of the
denominator.  This will be somewhat faster than @code{mpq_add}.  For example,

@example
/* mpq increment */
mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));

/* mpq += unsigned long */
mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);

/* mpq -= mpz */
mpz_submul (mpq_numref(q), mpq_denref(q), z);
@end example

@item Number Sequences
@cindex Number sequences
Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
are designed for calculating isolated values.  If a range of values is wanted
it's probably best to call one of them once to get a starting point and
iterate from there.

@item Text Input/Output
@cindex Text input/output
Hexadecimal or octal are suggested for input or output in text form.
Power-of-2 bases like these can be converted much more efficiently than other
bases, like decimal.  For big numbers there's usually nothing of particular
interest to be seen in the digits, so the base doesn't matter much.

Maybe we can hope octal will one day become the normal base for everyday use,
as proposed by King Charles XII of Sweden and later reformers.
@c Reference: Knuth volume 2 section 4.1, page 184 of second edition.  :-)
@end table


@node Debugging, Profiling, Efficiency, GMP Basics
@section Debugging
@cindex Debugging

@table @asis
@item Stack Overflow
@cindex Stack overflow
@cindex Segmentation violation
@cindex Bus error
Depending on the system, a segmentation violation or bus error might be the
only indication of stack overflow.  See @samp{--enable-alloca} choices in
@ref{Build Options}, for how to address this.
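
As one illustration, assuming a normal build from the top of the source tree,
the @samp{malloc-reentrant} choice takes temporary memory from the heap
instead of the stack:

@example
./configure --enable-alloca=malloc-reentrant
make
make check
@end example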

In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
overflow is recognised by the system before too much damage is done, or
@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
These options must be added to the @samp{CFLAGS} used in the GMP build
(@pxref{Build Options}); adding them just to an application will have no
effect.  Note also they're a slowdown, adding overhead to each function call
and each stack allocation.

@item Heap Problems
@cindex Heap problems
@cindex Malloc problems
The most likely cause of application problems with GMP is heap corruption.
Failing to @code{init} GMP variables will have unpredictable effects, and
corruption arising elsewhere in a program may well affect GMP@.  Initializing
GMP variables more than once or failing to clear them will cause memory leaks.

@cindex Malloc debugger
In all such cases a @code{malloc} debugger is recommended.  On a GNU or BSD
system the standard C library @code{malloc} has some diagnostic facilities,
see @ref{Allocation Debugging,, Allocation Debugging, libc, The GNU C Library
Reference Manual}, or @samp{man 3 malloc}.
Other possibilities, in no
particular order, include

@display
@uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc/}
@uref{http://dmalloc.com/}
@uref{http://www.perens.com/FreeSoftware/} @ (electric fence)
@uref{http://packages.debian.org/stable/devel/fda}
@uref{http://www.gnupdate.org/components/leakbug/}
@uref{http://people.redhat.com/~otaylor/memprof/}
@uref{http://www.cbmamiga.demon.co.uk/mpatrol/}
@end display

The GMP default allocation routines in @file{memory.c} also have a simple
sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
This is mainly designed for detecting buffer overruns during GMP development,
but might find other uses.

@item Stack Backtraces
@cindex Stack backtrace
On some systems the compiler options GMP uses by default can interfere with
debugging.  In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
is used and this generally inhibits stack backtracing.  Recompiling without
such options may help while debugging, though the usual caveats about it
potentially moving a memory problem or hiding a compiler bug will apply.

@item GDB, the GNU Debugger
@cindex GDB
@cindex GNU Debugger
A sample @file{.gdbinit} is included in the distribution, showing how to call
some undocumented dump functions to print GMP variables from within GDB@.  Note
that these functions shouldn't be used in final application code since they're
undocumented and may be subject to incompatible changes in future versions of
GMP.

@item Source File Paths
GMP has multiple source files with the same name, in different directories.
For example @file{mpz}, @file{mpq} and @file{mpf} each have an
@file{init.c}.  If the debugger can't already determine the right one it may
help to build with absolute paths on each C file.
One way to do that is to
use a separate object directory with an absolute path to the source directory.

@example
cd /my/build/dir
/my/source/dir/gmp-@value{VERSION}/configure
@end example

This works via @code{VPATH}, and might require GNU @command{make}.
Alternately it might be possible to change the @code{.c.lo} rules
appropriately.

@item Assertion Checking
@cindex Assertion checking
The build option @option{--enable-assert} is available to add some consistency
checks to the library (see @ref{Build Options}).  These are likely to be of
limited value to most applications.  Assertion failures are just as likely to
indicate memory corruption as a library or compiler bug.

Applications using the low-level @code{mpn} functions, however, will benefit
from @option{--enable-assert} since it adds checks on the parameters of most
such functions, many of which have subtle restrictions on their usage.  Note
however that only the generic C code has checks, not the assembly code, so
@option{--disable-assembly} should be used for maximum checking.

@item Temporary Memory Checking
The build option @option{--enable-alloca=debug} arranges that each block of
temporary memory in GMP is allocated with a separate call to @code{malloc} (or
the allocation function set with @code{mp_set_memory_functions}).

This can help a malloc debugger detect accesses outside the intended bounds,
or detect memory not released.  In a normal build, on the other hand,
temporary memory is allocated in blocks which GMP divides up for its own use,
or may be allocated with a compiler builtin @code{alloca} which will go
nowhere near any malloc debugger hooks.

@item Maximum Debuggability
To summarize the above, a GMP build for maximum debuggability would be

@example
./configure --disable-shared --enable-assert \
  --enable-alloca=debug --disable-assembly CFLAGS=-g
@end example

For C++, add @samp{--enable-cxx CXXFLAGS=-g}.

@item Checker
@cindex Checker
@cindex GCC Checker
The GCC checker (@uref{http://savannah.nongnu.org/projects/checker/}) can be
used with GMP@. It contains a stub library which means GMP applications
compiled with checker can use a normal GMP build.

A build of GMP with checking within GMP itself can be made. This will run
very slowly. On GNU/Linux for example,

@cindex @command{checkergcc}
@example
./configure --disable-assembly CC=checkergcc
@end example

@option{--disable-assembly} must be used, since the GMP assembly code doesn't
support the checking scheme. The GMP C++ features cannot be used, since
current versions of checker (0.9.9.1) don't yet support the standard C++
library.

@item Valgrind
@cindex Valgrind
Valgrind (@uref{http://valgrind.org/}) is a memory checker for x86, ARM, MIPS,
PowerPC, and S/390. It translates and emulates machine instructions to do
strong checks for uninitialized data (at the level of individual bits), memory
accesses through bad pointers, and memory leaks.

Valgrind does not always support every possible instruction, in particular
ones recently added to an ISA. Valgrind might therefore be incompatible with
a recent GMP or even a less recent GMP which is compiled using a recent GCC.

GMP's assembly code sometimes promotes a read of the limbs to some larger size,
for efficiency. GMP will do this even at the start and end of a multilimb
operand, using naturally aligned operations on the larger type.
This may lead
to benign reads outside of allocated areas, triggering complaints from
Valgrind. Valgrind's option @samp{--partial-loads-ok=yes} should help.

@item Other Problems
Any suspected bug in GMP itself should be isolated to make sure it's not an
application problem, see @ref{Reporting Bugs}.
@end table


@node Profiling, Autoconf, Debugging, GMP Basics
@section Profiling
@cindex Profiling
@cindex Execution profiling
@cindex @code{--enable-profiling}

Running a program under a profiler is a good way to find where it's spending
most time and where improvements can be best sought. The profiling choices
for a GMP build are as follows.

@table @asis
@item @samp{--disable-profiling}
The default is to add nothing special for profiling.

It should be possible to just compile the mainline of a program with @code{-p}
and use @command{prof} to get a profile consisting of timer-based sampling of
the program counter. Most of the GMP assembly code has the necessary symbol
information.

This approach has the advantage of minimizing interference with normal program
operation, but on most systems the resolution of the sampling is quite low (10
milliseconds for instance), requiring long runs to get accurate information.

@item @samp{--enable-profiling=prof}
@cindex @code{prof}
Build with support for the system @command{prof}, which means @samp{-p} added
to the @samp{CFLAGS}.

This provides call counting in addition to program counter sampling, which
allows the most frequently called routines to be identified, and an average
time spent in each routine to be determined.

The x86 assembly code has support for this option, but on other processors
the assembly routines will be as if compiled without @samp{-p} and therefore
won't appear in the call counts.

On some systems, such as GNU/Linux, @samp{-p} in fact means @samp{-pg} and in
this case @samp{--enable-profiling=gprof} described below should be used
instead.

@item @samp{--enable-profiling=gprof}
@cindex @code{gprof}
Build with support for @command{gprof}, which means @samp{-pg} added to the
@samp{CFLAGS}.

This provides call graph construction in addition to call counting and program
counter sampling, which makes it possible to count calls coming from different
locations. For example the number of calls to @code{mpn_mul} from
@code{mpz_mul} versus the number from @code{mpf_mul}. The program counter
sampling is still flat though, so only a total time in @code{mpn_mul} would be
accumulated, not a separate amount for each call site.

The x86 assembly code has support for this option, but on other processors
the assembly routines will be as if compiled without @samp{-pg} and therefore
not be included in the call counts.

On x86 and m68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
incompatible, so the latter is omitted from the default flags in that case,
which might result in poorer code generation.

Incidentally, it should be possible to use the @command{gprof} program with a
plain @samp{--enable-profiling=prof} build. But in that case only the
@samp{gprof -p} flat profile and call counts can be expected to be valid, not
the @samp{gprof -q} call graph.

@item @samp{--enable-profiling=instrument}
@cindex @code{-finstrument-functions}
@cindex @code{instrument-functions}
Build with the GCC option @samp{-finstrument-functions} added to the
@samp{CFLAGS} (@pxref{Code Gen Options,, Options for Code Generation, gcc,
Using the GNU Compiler Collection (GCC)}).

This inserts special instrumenting calls at the start and end of each
function, allowing exact timing and full call graph construction.

This instrumenting is not normally a standard system feature and will require
support from an external library, such as

@cindex FunctionCheck
@cindex fnccheck
@display
@uref{http://sourceforge.net/projects/fnccheck/}
@end display

This should be included in @samp{LIBS} during the GMP configure so that test
programs will link. For example,

@example
./configure --enable-profiling=instrument LIBS=-lfc
@end example

On a GNU system the C library provides dummy instrumenting functions, so
programs compiled with this option will link. In this case it's only
necessary to ensure the correct library is added when linking an application.

The x86 assembly code supports this option, but on other processors the
assembly routines will be as if compiled without
@samp{-finstrument-functions} meaning time spent in them will effectively be
attributed to their caller.
@end table


@node Autoconf, Emacs, Profiling, GMP Basics
@section Autoconf
@cindex Autoconf

Autoconf-based applications can easily check whether GMP is installed. The
only thing to be noted is that GMP library symbols from version 3 onwards have
prefixes like @code{__gmpz}. The following therefore would be a simple test,

@cindex @code{AC_CHECK_LIB}
@example
AC_CHECK_LIB(gmp, __gmpz_init)
@end example

This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
but an application that must have GMP would want to generate an error if not
found. For example,

@example
AC_CHECK_LIB(gmp, __gmpz_init, ,
  [AC_MSG_ERROR([GNU MP not found, see http://gmplib.org/])])
@end example

If functions added in some particular version of GMP are required, then one of
those can be used when checking.
For example @code{mpz_mul_si} was added in
GMP 3.1,

@example
AC_CHECK_LIB(gmp, __gmpz_mul_si, ,
  [AC_MSG_ERROR(
  [GNU MP not found, or not 3.1 or up, see http://gmplib.org/])])
@end example

An alternative would be to test the version number in @file{gmp.h} using say
@code{AC_EGREP_CPP}. That would make it possible to test the exact version,
if some particular sub-minor release is known to be necessary.

In general it's recommended that applications should simply demand a new
enough GMP rather than trying to provide supplements for features not
available in past versions.

Occasionally an application will need or want to know the size of a type at
configuration or preprocessing time, not just with @code{sizeof} in the code.
This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
up is best for this, since prior versions needed certain @samp{-D} defines on
systems using a @code{long long} limb. The following would suit Autoconf 2.50
or up,

@example
AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
@end example


@node Emacs, , Autoconf, GMP Basics
@section Emacs
@cindex Emacs
@cindex @code{info-lookup-symbol}

@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
emacs, The Emacs Editor}).

The GMP manual can be included in such lookups by putting the following in
your @file{.emacs},

@c This isn't pretty, but there doesn't seem to be a better way (in emacs
@c 21.2 at least). info-lookup->mode-value could be used for the "assoc"s,
@c but that function isn't documented, whereas info-lookup-alist is.
@c
@example
(eval-after-load "info-look"
  '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
     (setcar (nthcdr 3 mode-value)
             (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
                   (nth 3 mode-value)))))
@end example


@node Reporting Bugs, Integer Functions, GMP Basics, Top
@comment node-name, next, previous, up
@chapter Reporting Bugs
@cindex Reporting bugs
@cindex Bug reporting

If you think you have found a bug in the GMP library, please investigate it
and report it. We have made this library available to you, and it is not too
much to ask you to report the bugs you find.

Before you report a bug, check it's not already addressed in @ref{Known Build
Problems}, or perhaps @ref{Notes for Particular Systems}. You may also want
to check @uref{http://gmplib.org/} for patches for this release.

Please include the following in any report,

@itemize @bullet
@item
The GMP version number, and if pre-packaged or patched then say so.

@item
A test program that makes it possible for us to reproduce the bug. Include
instructions on how to run the program.

@item
A description of what is wrong. If the results are incorrect, in what way.
If you get a crash, say so.

@item
If you get a crash, include a stack backtrace from the debugger if it's
informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).

@item
Please do not send core dumps, executables or @command{strace}s.

@item
The @samp{configure} options you used when building GMP, if any.

@item
The output from @samp{configure}, as printed to stdout, with any options used.

@item
The name of the compiler and its version. For @command{gcc}, get the version
with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.

@item
The output from running @samp{uname -a}.

@item
The output from running @samp{./config.guess}, and from running
@samp{./configfsf.guess} (might be the same).

@item
If the bug is related to @samp{configure}, then the compressed contents of
@file{config.log}.

@item
If the bug is related to an @file{asm} file not assembling, then the contents
of @file{config.m4} and the offending line or lines from the temporary
@file{mpn/tmp-<file>.s}.
@end itemize

Please make an effort to produce a self-contained report, with something
definite that can be tested or debugged. Vague queries or piecemeal messages
are difficult to act on and don't help the development effort.

It is not uncommon that an observed problem is actually due to a bug in the
compiler; the GMP code tends to explore interesting corners in compilers.

If your bug report is good, we will do our best to help you get a corrected
version of the library; if the bug report is poor, we won't do anything about
it (except maybe ask you to send a better report).

Send your report to: @email{gmp-bugs@@gmplib.org}.

If you think something in this manual is unclear, or downright incorrect, or if
the language needs to be improved, please send a note to the same address.


@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
@comment node-name, next, previous, up
@chapter Integer Functions
@cindex Integer functions

This chapter describes the GMP functions for performing integer arithmetic.
These functions start with the prefix @code{mpz_}.

GMP integers are stored in objects of type @code{mpz_t}.

@menu
* Initializing Integers::
* Assigning Integers::
* Simultaneous Integer Init & Assign::
* Converting Integers::
* Integer Arithmetic::
* Integer Division::
* Integer Exponentiation::
* Integer Roots::
* Number Theoretic Functions::
* Integer Comparisons::
* Integer Logic and Bit Fiddling::
* I/O of Integers::
* Integer Random Numbers::
* Integer Import and Export::
* Miscellaneous Integer Functions::
* Integer Special Functions::
@end menu

@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Integer initialization functions
@cindex Initialization functions

The functions for integer arithmetic assume that all integer objects are
initialized. You do that by calling the function @code{mpz_init}. For
example,

@example
@{
  mpz_t integ;
  mpz_init (integ);
  @dots{}
  mpz_add (integ, @dots{});
  @dots{}
  mpz_sub (integ, @dots{});

  /* Unless the program is about to exit, do ... */
  mpz_clear (integ);
@}
@end example

As you can see, you can store new values any number of times, once an
object is initialized.

@deftypefun void mpz_init (mpz_t @var{x})
Initialize @var{x}, and set its value to 0.
@end deftypefun

@deftypefun void mpz_inits (mpz_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpz_t} variables, and set their
values to 0.
@end deftypefun

@deftypefun void mpz_init2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
Initialize @var{x}, with space for @var{n}-bit numbers, and set its value to 0.
Calling this function instead of @code{mpz_init} or @code{mpz_inits} is never
necessary; reallocation is handled automatically by GMP when needed.

While @var{n} defines the initial space, @var{x} will grow automatically in the
normal way, if necessary, for subsequent values stored. @code{mpz_init2} makes
it possible to avoid such reallocations if a maximum size is known in advance.

In preparation for an operation, GMP often allocates one limb more than
ultimately needed. To make sure GMP will not perform reallocation for
@var{x}, you need to add the number of bits in @code{mp_limb_t} to @var{n}.
@end deftypefun

@deftypefun void mpz_clear (mpz_t @var{x})
Free the space occupied by @var{x}. Call this function for all @code{mpz_t}
variables when you are done with them.
@end deftypefun

@deftypefun void mpz_clears (mpz_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpz_t} variables.
@end deftypefun

@deftypefun void mpz_realloc2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
Change the space allocated for @var{x} to @var{n} bits. The value in @var{x}
is preserved if it fits, or is set to 0 if not.

Calling this function is never necessary; reallocation is handled automatically
by GMP when needed. But this function can be used to increase the space for a
variable in order to avoid repeated automatic reallocations, or to decrease it
to give memory back to the heap.
@end deftypefun


@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
@comment node-name, next, previous, up
@section Assignment Functions
@cindex Integer assignment functions
@cindex Assignment functions

These functions assign new values to already initialized integers
(@pxref{Initializing Integers}).

@deftypefun void mpz_set (mpz_t @var{rop}, const mpz_t @var{op})
@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
@deftypefunx void mpz_set_q (mpz_t @var{rop}, const mpq_t @var{op})
@deftypefunx void mpz_set_f (mpz_t @var{rop}, const mpf_t @var{op})
Set the value of @var{rop} from @var{op}.

@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
make it an integer.
@end deftypefun

@deftypefun int mpz_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
Set the value of @var{rop} from @var{str}, a null-terminated C string in base
@var{base}. White space is allowed in the string, and is simply ignored.

The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
@code{0B} for binary, @code{0} for octal, or decimal otherwise.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value. For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

This function returns 0 if the entire string is a valid number in base
@var{base}. Otherwise it returns @minus{}1.
@c
@c It turns out that it is not entirely true that this function ignores
@c white-space. It does ignore it between digits, but not after a minus sign
@c or within or after ``0x''. Some thought was given to disallowing all
@c whitespace, but that would be an incompatible change, whitespace has been
@c documented as ignored ever since GMP 1.
@c
@end deftypefun

@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
Swap the values @var{rop1} and @var{rop2} efficiently.
@end deftypefun


@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
@comment node-name, next, previous, up
@section Combined Initialization and Assignment Functions
@cindex Integer assignment functions
@cindex Assignment functions
@cindex Integer initialization functions
@cindex Initialization functions

For convenience, GMP provides a parallel series of initialize-and-set functions
which initialize the output and then store the value there. These functions'
names have the form @code{mpz_init_set@dots{}}

Here is an example of using one:

@example
@{
  mpz_t pie;
  mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
  @dots{}
  mpz_sub (pie, @dots{});
  @dots{}
  mpz_clear (pie);
@}
@end example

@noindent
Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
functions, it can be used as the source or destination operand for the ordinary
integer functions. Don't use an initialize-and-set function on a variable
already initialized!

@deftypefun void mpz_init_set (mpz_t @var{rop}, const mpz_t @var{op})
@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
Initialize @var{rop} with limb space and set the initial numeric value from
@var{op}.
@end deftypefun

@deftypefun int mpz_init_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
documentation above for details).

If the string is a correct base @var{base} number, the function returns 0;
if an error occurs it returns @minus{}1. @var{rop} is initialized even if
an error occurs.
(I.e., you have to call @code{mpz_clear} for it.)
@end deftypefun


@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Integer conversion functions
@cindex Conversion functions

This section describes functions for converting GMP integers to standard C
types. Functions for converting @emph{to} GMP integers are described in
@ref{Assigning Integers} and @ref{I/O of Integers}.

@deftypefun {unsigned long int} mpz_get_ui (const mpz_t @var{op})
Return the value of @var{op} as an @code{unsigned long}.

If @var{op} is too big to fit an @code{unsigned long} then just the least
significant bits that do fit are returned. The sign of @var{op} is ignored,
only the absolute value is used.
@end deftypefun

@deftypefun {signed long int} mpz_get_si (const mpz_t @var{op})
If @var{op} fits into a @code{signed long int} return the value of @var{op}.
Otherwise return the least significant part of @var{op}, with the same sign
as @var{op}.

If @var{op} is too big to fit in a @code{signed long int}, the returned
result is probably not very useful. To find out if the value will fit, use
the function @code{mpz_fits_slong_p}.
@end deftypefun

@deftypefun double mpz_get_d (const mpz_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent from the conversion is too big, the result is system
dependent. An infinity is returned where available. A hardware overflow trap
may or may not occur.
@end deftypefun

@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, const mpz_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero), and returning the exponent separately.

The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
exponent is stored to @code{*@var{exp}}. @m{@var{d} * 2^{exp}, @var{d} *
2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero, the
return is @math{0.0} and 0 is stored to @code{*@var{exp}}.

@cindex @code{frexp}
This is similar to the standard C @code{frexp} function (@pxref{Normalization
Functions,,, libc, The GNU C Library Reference Manual}).
@end deftypefun

@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, const mpz_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base argument
may vary from 2 to 62 or from @minus{}2 to @minus{}36.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of storage large
enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
+ 2}. The two extra bytes are for a possible minus sign, and the
null-terminator.

A pointer to the result string is returned, being either the allocated block,
or the given @var{str}.
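
As a rough illustration of the digit conventions and the @code{+ 2} buffer
rule described above, the following sketch models the conversion for a plain
@code{long} instead of an @code{mpz_t}. The @code{model_} names are
hypothetical helpers, not GMP functions, and the letter arithmetic assumes an
ASCII-compatible character set. Note also that @code{model_sizeinbase} is
always exact, whereas @code{mpz_sizeinbase} may return one more than the exact
digit count.

```c
#include <stddef.h>

/* Hypothetical model, not GMP code.  Digit character for VALUE:
   BASE 2..36    -> digits, then lower-case letters
   BASE -2..-36  -> digits, then upper-case letters
   BASE 37..62   -> digits, upper-case (10..35), lower-case (36..61)
   Assumes an ASCII-compatible character set. */
char model_digit_char(int value, int base)
{
  if (value < 10)
    return (char) ('0' + value);
  if (base > 36)
    return value < 36 ? (char) ('A' + value - 10)
                      : (char) ('a' + value - 36);
  return (char) ((base < 0 ? 'A' : 'a') + value - 10);
}

/* Number of digits of |op| in base |base|; zero counts as one digit. */
size_t model_sizeinbase(long op, int base)
{
  unsigned long b = (unsigned long) (base < 0 ? -base : base);
  unsigned long mag = op < 0 ? 0UL - (unsigned long) op : (unsigned long) op;
  size_t n = 1;
  while (mag >= b)
    {
      mag /= b;
      n++;
    }
  return n;
}

/* Write op into str, which must hold model_sizeinbase (op, base) + 2
   bytes: the digits, a possible minus sign, and the null-terminator. */
char *model_get_str(char *str, int base, long op)
{
  unsigned long b = (unsigned long) (base < 0 ? -base : base);
  unsigned long mag = op < 0 ? 0UL - (unsigned long) op : (unsigned long) op;
  size_t n = model_sizeinbase(op, base);
  char *p = str;

  if (op < 0)
    *p++ = '-';
  p[n] = '\0';
  while (n > 0)
    {
      p[--n] = model_digit_char((int) (mag % b), base);
      mag /= b;
    }
  return str;
}
```

For instance, converting @minus{}255 in base 16 needs 2 digits plus 2 extra
bytes, and the buffer then holds @code{-ff}; with base @minus{}16 it would
hold @code{-FF}.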
@end deftypefun


@need 2000
@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Integer arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpz_add (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_add_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} + @var{op2}}.
@end deftypefun

@deftypefun void mpz_sub (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, const mpz_t @var{op2})
Set @var{rop} to @var{op1} @minus{} @var{op2}.
@end deftypefun

@deftypefun void mpz_mul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_mul_si (mpz_t @var{rop}, const mpz_t @var{op1}, long int @var{op2})
@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_addmul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_submul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, const mpz_t @var{op1}, mp_bitcnt_t @var{op2})
@cindex Bit shift left
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}. This operation can also be defined as a left shift by @var{op2}
bits.
@end deftypefun

@deftypefun void mpz_neg (mpz_t @var{rop}, const mpz_t @var{op})
Set @var{rop} to @minus{}@var{op}.
@end deftypefun

@deftypefun void mpz_abs (mpz_t @var{rop}, const mpz_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun


@need 2000
@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
@section Division Functions
@cindex Integer division functions
@cindex Division functions

Division is undefined if the divisor is zero. Passing a zero divisor to the
division or modulo functions (including the modular powering functions
@code{mpz_powm} and @code{mpz_powm_ui}), will cause an intentional division by
zero. This lets a program handle arithmetic exceptions in these functions the
same way as for normal C @code{int} arithmetic.

@c Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
@c between each, and seem to let tex do a better job of page breaks than an
@c @sp 1 in the middle of one big set.
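
Because a zero divisor leads to a deliberate division by zero, an application
that cannot rule one out may prefer to test the divisor before calling a
division function. The sketch below shows that pattern with a plain
@code{long} standing in for @code{mpz_t}; @code{checked_tdiv} is a
hypothetical helper, not part of GMP. With real GMP operands the equivalent
test would be @code{mpz_sgn (d) == 0}.

```c
#include <limits.h>

/* Hypothetical helper, not a GMP function: truncating division that
   reports a zero divisor instead of trapping.  Returns 0 on success
   and -1 if the division was not attempted. */
int checked_tdiv(long *q, long *r, long n, long d)
{
  if (d == 0)
    return -1;          /* would be a division by zero */
  if (n == LONG_MIN && d == -1)
    return -1;          /* overflow; only a concern for fixed-size long */
  *q = n / d;           /* C99 and later: quotient rounds towards zero */
  *r = n % d;           /* remainder is zero or has the sign of n */
  return 0;
}
```

The overflow test has no counterpart for @code{mpz_t} operands, which grow as
needed; it is included only so that the @code{long} model stays well defined.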

@deftypefun void mpz_cdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@maybepagebreak
@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_cdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
@maybepagebreak
@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@end deftypefun

@deftypefun void mpz_fdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@maybepagebreak
@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_fdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
@maybepagebreak
@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@end deftypefun

@deftypefun void mpz_tdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@maybepagebreak
@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_tdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}})
@maybepagebreak
@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@cindex Bit shift right

@sp 1
Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
@var{r}. For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
The rounding is in three styles, each suiting different applications.

@itemize @bullet
@item
@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
have the opposite sign to @var{d}. The @code{c} stands for ``ceil''.

@item
@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
@var{r} will have the same sign as @var{d}. The @code{f} stands for
``floor''.

@item
@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
as @var{n}. The @code{t} stands for ``truncate''.
@end itemize

In all cases @var{q} and @var{r} will satisfy
@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.

The @code{q} functions calculate only the quotient, the @code{r} functions
only the remainder, and the @code{qr} functions calculate both. Note that for
@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
results will be unpredictable.

For the @code{ui} variants the return value is the remainder, and in fact
returning the remainder is all the @code{div_ui} functions do. For
@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
return value is the absolute value of the remainder.

For the @code{2exp} variants the divisor is @m{2^b,2^@var{b}}. These
functions are implemented as right shifts and bit masks, but of course they
round the same as the other functions.

For positive @var{n} both @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp}
are simple bitwise right shifts. For negative @var{n}, @code{mpz_fdiv_q_2exp}
is effectively an arithmetic right shift treating @var{n} as two's complement
the same as the bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp}
effectively treats @var{n} as sign and magnitude.
@end deftypefun

@deftypefun void mpz_mod (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d})
@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}})
Set @var{r} to @var{n} @code{mod} @var{d}. The sign of the divisor is
ignored; the result is always non-negative.

@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
remainder as well as setting @var{r}. See @code{mpz_fdiv_ui} above if only
the return value is wanted.
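
The three rounding styles of the division functions, and the non-negative
result of @code{mpz_mod}, can be modelled with ordinary C integers. The
following sketch uses @code{long} in place of @code{mpz_t}, with hypothetical
@code{model_} helper names that are not GMP functions. It assumes C99 or
later, where @code{/} rounds towards zero, and a non-zero divisor. Each
quotient and remainder pair satisfies
@math{@var{n} = @var{q}*@var{d} + @var{r}}.

```c
#include <stdlib.h>   /* labs */

/* Hypothetical model types and helpers, not GMP code. */
typedef struct { long q; long r; } model_qr;

/* tdiv style: q rounds towards zero, r has the sign of n. */
model_qr model_tdiv(long n, long d)
{
  model_qr x;
  x.q = n / d;
  x.r = n % d;
  return x;
}

/* fdiv style: q rounds towards -infinity, r has the sign of d. */
model_qr model_fdiv(long n, long d)
{
  model_qr x = model_tdiv(n, d);
  if (x.r != 0 && (x.r < 0) != (d < 0))
    {
      x.q -= 1;
      x.r += d;
    }
  return x;
}

/* cdiv style: q rounds towards +infinity, r has the opposite sign to d. */
model_qr model_cdiv(long n, long d)
{
  model_qr x = model_tdiv(n, d);
  if (x.r != 0 && (x.r < 0) == (d < 0))
    {
      x.q += 1;
      x.r -= d;
    }
  return x;
}

/* mpz_mod style: the sign of d is ignored, the result is in 0 .. |d|-1. */
long model_mod(long n, long d)
{
  long r = n % d;
  return r < 0 ? r + labs(d) : r;
}
```

For @math{@var{n} = -7} and @math{@var{d} = 2} the quotient and remainder are
@minus{}3 and @minus{}1 in the @code{tdiv} and @code{cdiv} styles, and
@minus{}4 and 1 in the @code{fdiv} style, while @code{model_mod} gives 1.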
3359@end deftypefun 3360 3361@deftypefun void mpz_divexact (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d}) 3362@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, const mpz_t @var{n}, unsigned long @var{d}) 3363@cindex Exact division functions 3364Set @var{q} to @var{n}/@var{d}. These functions produce correct results only 3365when it is known in advance that @var{d} divides @var{n}. 3366 3367These routines are much faster than the other division functions, and are the 3368best choice when exact division is known to occur, for example reducing a 3369rational to lowest terms. 3370@end deftypefun 3371 3372@deftypefun int mpz_divisible_p (const mpz_t @var{n}, const mpz_t @var{d}) 3373@deftypefunx int mpz_divisible_ui_p (const mpz_t @var{n}, unsigned long int @var{d}) 3374@deftypefunx int mpz_divisible_2exp_p (const mpz_t @var{n}, mp_bitcnt_t @var{b}) 3375@cindex Divisibility functions 3376Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of 3377@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}. 3378 3379@var{n} is divisible by @var{d} if there exists an integer @var{q} satisfying 3380@math{@var{n} = @var{q}@GMPmultiply{}@var{d}}. Unlike the other division 3381functions, @math{@var{d}=0} is accepted and following the rule it can be seen 3382that only 0 is considered divisible by 0. 3383@end deftypefun 3384 3385@deftypefun int mpz_congruent_p (const mpz_t @var{n}, const mpz_t @var{c}, const mpz_t @var{d}) 3386@deftypefunx int mpz_congruent_ui_p (const mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d}) 3387@deftypefunx int mpz_congruent_2exp_p (const mpz_t @var{n}, const mpz_t @var{c}, mp_bitcnt_t @var{b}) 3388@cindex Divisibility functions 3389@cindex Congruence functions 3390Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the 3391case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}. 
3392 3393@var{n} is congruent to @var{c} mod @var{d} if there exists an integer @var{q} 3394satisfying @math{@var{n} = @var{c} + @var{q}@GMPmultiply{}@var{d}}. Unlike 3395the other division functions, @math{@var{d}=0} is accepted and following the 3396rule it can be seen that @var{n} and @var{c} are considered congruent mod 0 3397only when exactly equal. 3398@end deftypefun 3399 3400 3401@need 2000 3402@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions 3403@section Exponentiation Functions 3404@cindex Integer exponentiation functions 3405@cindex Exponentiation functions 3406@cindex Powering functions 3407 3408@deftypefun void mpz_powm (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod}) 3409@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp}, const mpz_t @var{mod}) 3410Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp}) 3411modulo @var{mod}}. 3412 3413Negative @var{exp} is supported if an inverse @math{@var{base}^@W{-1} @bmod 3414@var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}). 3415If an inverse doesn't exist then a divide by zero is raised. 3416@end deftypefun 3417 3418@deftypefun void mpz_powm_sec (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod}) 3419Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp}) 3420modulo @var{mod}}. 3421 3422It is required that @math{@var{exp} > 0} and that @var{mod} is odd. 3423 3424This function is designed to take the same time and have the same cache access 3425patterns for any two same-size arguments, assuming that function arguments are 3426placed at the same position and that the machine state is identical upon 3427function entry. This function is intended for cryptographic purposes, where 3428resilience to side-channel attacks is desired. 
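
A minimal sketch of a call, assuming operands small enough to fit an
@code{unsigned long}.  The helper name @code{powm_sec_small} is hypothetical,
and the caller is assumed to respect the requirements above (@var{exp} positive
and @var{mod} odd).

@example
#include <stddef.h>
#include <gmp.h>

/* Hypothetical helper: return base^exp mod m using mpz_powm_sec.
   Requires exp > 0 and m odd.  */
unsigned long
powm_sec_small (unsigned long base, unsigned long exp, unsigned long m)
@{
  mpz_t b, e, mod, r;
  unsigned long result;

  mpz_init_set_ui (b, base);
  mpz_init_set_ui (e, exp);
  mpz_init_set_ui (mod, m);
  mpz_init (r);

  mpz_powm_sec (r, b, e, mod);
  result = mpz_get_ui (r);

  mpz_clears (b, e, mod, r, NULL);
  return result;
@}
@end example

For instance @code{powm_sec_small (4, 13, 497)} gives 445, the same value
@code{mpz_powm} produces for these operands.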
3429@end deftypefun 3430 3431@deftypefun void mpz_pow_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp}) 3432@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp}) 3433Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}. The case 3434@math{0^0} yields 1. 3435@end deftypefun 3436 3437 3438@need 2000 3439@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions 3440@section Root Extraction Functions 3441@cindex Integer root functions 3442@cindex Root extraction functions 3443 3444@deftypefun int mpz_root (mpz_t @var{rop}, const mpz_t @var{op}, unsigned long int @var{n}) 3445Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer 3446part of the @var{n}th root of @var{op}. Return non-zero if the computation 3447was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power. 3448@end deftypefun 3449 3450@deftypefun void mpz_rootrem (mpz_t @var{root}, mpz_t @var{rem}, const mpz_t @var{u}, unsigned long int @var{n}) 3451Set @var{root} to @m{\lfloor\root n \of {u}\rfloor@C{},} the truncated 3452integer part of the @var{n}th root of @var{u}. Set @var{rem} to the 3453remainder, @m{(@var{u} - @var{root}^n), 3454@var{u}@minus{}@var{root}**@var{n}}. 3455@end deftypefun 3456 3457@deftypefun void mpz_sqrt (mpz_t @var{rop}, const mpz_t @var{op}) 3458Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated 3459integer part of the square root of @var{op}. 3460@end deftypefun 3461 3462@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, const mpz_t @var{op}) 3463Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part 3464of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{rop2} to the 3465remainder @m{(@var{op} - @var{rop1}^2), 3466@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a 3467perfect square. 
3468 3469If @var{rop1} and @var{rop2} are the same variable, the results are 3470undefined. 3471@end deftypefun 3472 3473@deftypefun int mpz_perfect_power_p (const mpz_t @var{op}) 3474@cindex Perfect power functions 3475@cindex Root testing functions 3476Return non-zero if @var{op} is a perfect power, i.e., if there exist integers 3477@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that 3478@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}. 3479 3480Under this definition both 0 and 1 are considered to be perfect powers. 3481Negative values of @var{op} are accepted, but of course can only be odd 3482perfect powers. 3483@end deftypefun 3484 3485@deftypefun int mpz_perfect_square_p (const mpz_t @var{op}) 3486@cindex Perfect square functions 3487@cindex Root testing functions 3488Return non-zero if @var{op} is a perfect square, i.e., if the square root of 3489@var{op} is an integer. Under this definition both 0 and 1 are considered to 3490be perfect squares. 3491@end deftypefun 3492 3493 3494@need 2000 3495@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions 3496@section Number Theoretic Functions 3497@cindex Number theoretic functions 3498 3499@deftypefun int mpz_probab_prime_p (const mpz_t @var{n}, int @var{reps}) 3500@cindex Prime testing functions 3501@cindex Probable prime testing functions 3502Determine whether @var{n} is prime. Return 2 if @var{n} is definitely prime, 3503return 1 if @var{n} is probably prime (without being certain), or return 0 if 3504@var{n} is definitely composite. 3505 3506This function does some trial divisions, then some Miller-Rabin probabilistic 3507primality tests. The argument @var{reps} controls how many such tests are 3508done; a higher value will reduce the chances of a composite being returned as 3509``probably prime''. 25 is a reasonable number; a composite number will then be 3510identified as a prime with a probability of less than @m{2^{-50},2^(-50)}. 

Miller-Rabin and similar tests can be more properly called compositeness
tests.  Numbers which fail are known to be composite but those which pass
might be prime or might be composite.  Only a few composites pass, hence those
which pass are considered probably prime.
@end deftypefun

@deftypefun void mpz_nextprime (mpz_t @var{rop}, const mpz_t @var{op})
@cindex Next prime function
Set @var{rop} to the next prime greater than @var{op}.

This function uses a probabilistic algorithm to identify primes.  For
practical purposes it's adequate; the chance of a composite passing is
extremely small.
@end deftypefun

@c mpz_prime_p not implemented as of gmp 3.0.

@c @deftypefun int mpz_prime_p (const mpz_t @var{n})
@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
@c This function is far slower than @code{mpz_probab_prime_p}, but then it
@c never returns non-zero for composite numbers.

@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
@c The likelihood of a programming error or hardware malfunction is orders
@c of magnitude greater than the likelihood for a composite to pass as a
@c prime, if the @var{reps} argument is in the suggested range.)
@c @end deftypefun

@deftypefun void mpz_gcd (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@cindex Greatest common divisor functions
@cindex GCD functions
Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}.  The
result is always positive, even if one or both input operands are negative,
except when both inputs are zero, in which case this function defines
@math{gcd(0,0) = 0}.
@end deftypefun

@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Compute the greatest common divisor of @var{op1} and @var{op2}.
If 3550@var{rop} is not @code{NULL}, store the result there. 3551 3552If the result is small enough to fit in an @code{unsigned long int}, it is 3553returned. If the result does not fit, 0 is returned, and the result is equal 3554to the argument @var{op1}. Note that the result will always fit if @var{op2} 3555is non-zero. 3556@end deftypefun 3557 3558@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, const mpz_t @var{a}, const mpz_t @var{b}) 3559@cindex Extended GCD 3560@cindex GCD extended 3561Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in 3562addition set @var{s} and @var{t} to coefficients satisfying 3563@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}. 3564The value in @var{g} is always positive, even if one or both of @var{a} and 3565@var{b} are negative (or zero if both inputs are zero). The values in @var{s} 3566and @var{t} are chosen such that normally, @math{@GMPabs{@var{s}} < 3567@GMPabs{@var{b}} / (2 @var{g})} and @math{@GMPabs{@var{t}} < @GMPabs{@var{a}} 3568/ (2 @var{g})}, and these relations define @var{s} and @var{t} uniquely. There 3569are a few exceptional cases: 3570 3571If @math{@GMPabs{@var{a}} = @GMPabs{@var{b}}}, then @math{@var{s} = 0}, 3572@math{@var{t} = sgn(@var{b})}. 3573 3574Otherwise, @math{@var{s} = sgn(@var{a})} if @math{@var{b} = 0} or 3575@math{@GMPabs{@var{b}} = 2 @var{g}}, and @math{@var{t} = sgn(@var{b})} if 3576@math{@var{a} = 0} or @math{@GMPabs{@var{a}} = 2 @var{g}}. 3577 3578In all cases, @math{@var{s} = 0} if and only if @math{@var{g} = 3579@GMPabs{@var{b}}}, i.e., if @var{b} divides @var{a} or @math{@var{a} = @var{b} 3580= 0}. 3581 3582If @var{t} is @code{NULL} then that value is not computed. 
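
The defining relation for the cofactors can be checked directly.  The
following is only a sketch for operands that fit a @code{long}; the helper
name @code{check_bezout} is hypothetical.

@example
#include <stddef.h>
#include <gmp.h>

/* Hypothetical helper: return non-zero if the cofactors produced by
   mpz_gcdext satisfy a*s + b*t = g.  */
int
check_bezout (long a_val, long b_val)
@{
  mpz_t a, b, g, s, t, lhs;
  int ok;

  mpz_init_set_si (a, a_val);
  mpz_init_set_si (b, b_val);
  mpz_inits (g, s, t, lhs, NULL);

  mpz_gcdext (g, s, t, a, b);

  mpz_mul (lhs, a, s);          /* lhs = a*s */
  mpz_addmul (lhs, b, t);       /* lhs = a*s + b*t */
  ok = (mpz_cmp (lhs, g) == 0);

  mpz_clears (a, b, g, s, t, lhs, NULL);
  return ok;
@}
@end example

For example, with @var{a} = 240 and @var{b} = 46 the greatest common divisor
is 2, and the pair @var{s} = @minus{}9, @var{t} = 47 satisfies both the
relation and the size bounds given above.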
3583@end deftypefun 3584 3585@deftypefun void mpz_lcm (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2}) 3586@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long @var{op2}) 3587@cindex Least common multiple functions 3588@cindex LCM functions 3589Set @var{rop} to the least common multiple of @var{op1} and @var{op2}. 3590@var{rop} is always positive, irrespective of the signs of @var{op1} and 3591@var{op2}. @var{rop} will be zero if either @var{op1} or @var{op2} is zero. 3592@end deftypefun 3593 3594@deftypefun int mpz_invert (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2}) 3595@cindex Modular inverse functions 3596@cindex Inverse modulo functions 3597Compute the inverse of @var{op1} modulo @var{op2} and put the result in 3598@var{rop}. If the inverse exists, the return value is non-zero and @var{rop} 3599will satisfy @math{0 < @var{rop} < @GMPabs{@var{op2}}}. If an inverse doesn't 3600exist the return value is zero and @var{rop} is undefined. The behaviour of 3601this function is undefined when @var{op2} is zero. 3602@end deftypefun 3603 3604@deftypefun int mpz_jacobi (const mpz_t @var{a}, const mpz_t @var{b}) 3605@cindex Jacobi symbol functions 3606Calculate the Jacobi symbol @m{\left(a \over b\right), 3607(@var{a}/@var{b})}. This is defined only for @var{b} odd. 3608@end deftypefun 3609 3610@deftypefun int mpz_legendre (const mpz_t @var{a}, const mpz_t @var{p}) 3611@cindex Legendre symbol functions 3612Calculate the Legendre symbol @m{\left(a \over p\right), 3613(@var{a}/@var{p})}. This is defined only for @var{p} an odd positive 3614prime, and for such @var{p} it's identical to the Jacobi symbol. 
@end deftypefun

@deftypefun int mpz_kronecker (const mpz_t @var{a}, const mpz_t @var{b})
@deftypefunx int mpz_kronecker_si (const mpz_t @var{a}, long @var{b})
@deftypefunx int mpz_kronecker_ui (const mpz_t @var{a}, unsigned long @var{b})
@deftypefunx int mpz_si_kronecker (long @var{a}, const mpz_t @var{b})
@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, const mpz_t @var{b})
@cindex Kronecker symbol functions
Calculate the Jacobi symbol @m{\left(a \over b\right),
(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} is odd, or
@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} is even.

When @var{b} is odd the Jacobi symbol and Kronecker symbol are
identical, so @code{mpz_kronecker_ui} etc can be used for mixed
precision Jacobi symbols too.

For more information see Henri Cohen section 1.4.2 (@pxref{References}),
or any number theory textbook.  See also the example program
@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_remove (mpz_t @var{rop}, const mpz_t @var{op}, const mpz_t @var{f})
@cindex Remove factor functions
@cindex Factor removal functions
Remove all occurrences of the factor @var{f} from @var{op} and store the
result in @var{rop}.  The return value is how many such occurrences were
removed.
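
As an illustration, the sketch below strips a factor from a small value.  The
helper name @code{strip_factor} is hypothetical, and it assumes @var{n_val} is
non-zero and @var{f_val} is at least 2.

@example
#include <stddef.h>
#include <gmp.h>

/* Hypothetical helper: remove every factor f_val from n_val, store
   what is left in *cofactor and return how many were removed.  */
unsigned long
strip_factor (unsigned long *cofactor,
              unsigned long n_val, unsigned long f_val)
@{
  mpz_t n, f, rest;
  mp_bitcnt_t count;

  mpz_init_set_ui (n, n_val);
  mpz_init_set_ui (f, f_val);
  mpz_init (rest);

  count = mpz_remove (rest, n, f);
  *cofactor = mpz_get_ui (rest);

  mpz_clears (n, f, rest, NULL);
  return (unsigned long) count;
@}
@end example

Since 96 = 2^5 * 3, calling @code{strip_factor (&c, 96, 2)} returns 5 and
leaves 3 in @code{c}.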
@end deftypefun

@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
@deftypefunx void mpz_2fac_ui (mpz_t @var{rop}, unsigned long int @var{n})
@deftypefunx void mpz_mfac_uiui (mpz_t @var{rop}, unsigned long int @var{n}, unsigned long int @var{m})
@cindex Factorial functions
Set @var{rop} to the factorial of @var{n}: @code{mpz_fac_ui} computes the plain factorial @var{n}!,
@code{mpz_2fac_ui} computes the double-factorial @var{n}!!, and @code{mpz_mfac_uiui} the
@var{m}-multi-factorial @m{n!^{(m)}, @var{n}!^(@var{m})}.
@end deftypefun

@deftypefun void mpz_primorial_ui (mpz_t @var{rop}, unsigned long int @var{n})
@cindex Primorial functions
Set @var{rop} to the primorial of @var{n}, i.e., the product of all positive
prime numbers @math{@le{}@var{n}}.
@end deftypefun

@deftypefun void mpz_bin_ui (mpz_t @var{rop}, const mpz_t @var{n}, unsigned long int @var{k})
@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
@cindex Binomial coefficient functions
Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
@var{k}} and store the result in @var{rop}.  Negative values of @var{n} are
supported by @code{mpz_bin_ui}, using the identity
@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6
part G.
@end deftypefun

@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
@cindex Fibonacci sequence functions
@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci
number.  @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
@m{F_{n-1},F[n-1]}.

These functions are designed for calculating isolated Fibonacci numbers.  When
a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
iterate the defining recurrence @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
similar.
@end deftypefun

@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
@cindex Lucas number functions
@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas
number.  @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
to @m{L_{n-1},L[n-1]}.

These functions are designed for calculating isolated Lucas numbers.  When a
sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and
iterate the defining recurrence @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or
similar.

The Fibonacci numbers and Lucas numbers are related sequences, so it's never
necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}.  The
formulas for going from Fibonacci to Lucas can be found in
@ref{Lucas Numbers Algorithm}; the reverse is straightforward too.
@end deftypefun


@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions
@comment  node-name,  next,  previous,  up
@section Comparison Functions
@cindex Integer comparison functions
@cindex Comparison functions

@deftypefn Function int mpz_cmp (const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefnx Function int mpz_cmp_d (const mpz_t @var{op1}, double @var{op2})
@deftypefnx Macro int mpz_cmp_si (const mpz_t @var{op1}, signed long int @var{op2})
@deftypefnx Macro int mpz_cmp_ui (const mpz_t @var{op1}, unsigned long int @var{op2})
Compare @var{op1} and @var{op2}.
Return a positive value if @math{@var{op1} > 3714@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if 3715@math{@var{op1} < @var{op2}}. 3716 3717@code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate their 3718arguments more than once. @code{mpz_cmp_d} can be called with an infinity, 3719but results are undefined for a NaN. 3720@end deftypefn 3721 3722@deftypefn Function int mpz_cmpabs (const mpz_t @var{op1}, const mpz_t @var{op2}) 3723@deftypefnx Function int mpz_cmpabs_d (const mpz_t @var{op1}, double @var{op2}) 3724@deftypefnx Function int mpz_cmpabs_ui (const mpz_t @var{op1}, unsigned long int @var{op2}) 3725Compare the absolute values of @var{op1} and @var{op2}. Return a positive 3726value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if 3727@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if 3728@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}. 3729 3730@code{mpz_cmpabs_d} can be called with an infinity, but results are undefined 3731for a NaN. 3732@end deftypefn 3733 3734@deftypefn Macro int mpz_sgn (const mpz_t @var{op}) 3735@cindex Sign tests 3736@cindex Integer sign tests 3737Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and 3738@math{-1} if @math{@var{op} < 0}. 3739 3740This function is actually implemented as a macro. It evaluates its argument 3741multiple times. 3742@end deftypefn 3743 3744 3745@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions 3746@comment node-name, next, previous, up 3747@section Logical and Bit Manipulation Functions 3748@cindex Logical functions 3749@cindex Bit manipulation functions 3750@cindex Integer logical functions 3751@cindex Integer bit manipulation functions 3752 3753These functions behave as if twos complement arithmetic were used (although 3754sign-magnitude is the actual implementation). The least significant bit is 3755number 0. 
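
As a small illustration of the twos complement view, the sketch below applies
three of the functions from this section to a value that fits a @code{long}.
The helper name @code{twos_complement_demo} and its output layout are
hypothetical.

@example
#include <stddef.h>
#include <gmp.h>

/* Hypothetical helper: out[0] = x AND 0xFF, out[1] = complement of
   x, out[2] = bit 100 of x, all in the twos complement view.  */
void
twos_complement_demo (long out[3], long x_val)
@{
  mpz_t x, mask, r;

  mpz_init_set_si (x, x_val);
  mpz_init_set_ui (mask, 0xFF);
  mpz_init (r);

  mpz_and (r, x, mask);           /* keeps the low 8 bits */
  out[0] = mpz_get_si (r);

  mpz_com (r, x);                 /* equals -x-1 */
  out[1] = mpz_get_si (r);

  out[2] = mpz_tstbit (x, 100);   /* 1 for negative x_val that fits a long */

  mpz_clears (x, mask, r, NULL);
@}
@end example

For @minus{}1 the three results are 255, 0 and 1, since a negative value
behaves as if it had infinitely many leading 1 bits.  For 5 they are 5,
@minus{}6 and 0.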

@deftypefun void mpz_and (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
Set @var{rop} to @var{op1} bitwise-and @var{op2}.
@end deftypefun

@deftypefun void mpz_ior (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}.
@end deftypefun

@deftypefun void mpz_xor (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}.
@end deftypefun

@deftypefun void mpz_com (mpz_t @var{rop}, const mpz_t @var{op})
Set @var{rop} to the one's complement of @var{op}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_popcount (const mpz_t @var{op})
If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is the
number of 1 bits in the binary representation.  If @math{@var{op}<0}, the
number of 1s is infinite, and the return value is the largest possible
@code{mp_bitcnt_t}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_hamdist (const mpz_t @var{op1}, const mpz_t @var{op2})
If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return the
Hamming distance between the two operands, which is the number of bit positions
where @var{op1} and @var{op2} have different bit values.  If one operand is
@math{@ge{}0} and the other @math{<0} then the number of differing bits is
infinite, and the return value is the largest possible @code{mp_bitcnt_t}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_scan0 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
@deftypefunx {mp_bitcnt_t} mpz_scan1 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
@cindex Bit scanning functions
@cindex Scan bit functions
Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
bits, until the first 0 or 1 bit (respectively) is found.  Return the index of
the found bit.
3795 3796If the bit at @var{starting_bit} is already what's sought, then 3797@var{starting_bit} is returned. 3798 3799If there's no bit found, then the largest possible @code{mp_bitcnt_t} is 3800returned. This will happen in @code{mpz_scan0} past the end of a negative 3801number, or @code{mpz_scan1} past the end of a nonnegative number. 3802@end deftypefun 3803 3804@deftypefun void mpz_setbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index}) 3805Set bit @var{bit_index} in @var{rop}. 3806@end deftypefun 3807 3808@deftypefun void mpz_clrbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index}) 3809Clear bit @var{bit_index} in @var{rop}. 3810@end deftypefun 3811 3812@deftypefun void mpz_combit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index}) 3813Complement bit @var{bit_index} in @var{rop}. 3814@end deftypefun 3815 3816@deftypefun int mpz_tstbit (const mpz_t @var{op}, mp_bitcnt_t @var{bit_index}) 3817Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly. 3818@end deftypefun 3819 3820@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions 3821@comment node-name, next, previous, up 3822@section Input and Output Functions 3823@cindex Integer input and output functions 3824@cindex Input functions 3825@cindex Output functions 3826@cindex I/O functions 3827 3828Functions that perform input from a stdio stream, and functions that output to 3829a stdio stream, of @code{mpz} numbers. Passing a @code{NULL} pointer for a 3830@var{stream} argument to any of these functions will make them read from 3831@code{stdin} and write to @code{stdout}, respectively. 3832 3833When using any of these functions, it is a good idea to include @file{stdio.h} 3834before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes 3835for these functions. 3836 3837See also @ref{Formatted Output} and @ref{Formatted Input}. 

@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, const mpz_t @var{op})
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
@var{base}.  The base argument may vary from 2 to 62 or from @minus{}2 to
@minus{}36.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

Return the number of bytes written, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
Input a string, possibly preceded by white space, in base @var{base} from
stdio stream @var{stream}, and put the read integer in @var{rop}.

The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
characters determine the base: @code{0x} and @code{0X} for hexadecimal,
@code{0b} and @code{0B} for binary, @code{0} for octal, or decimal otherwise.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value.  For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

Return the number of bytes read, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpz_out_raw (FILE *@var{stream}, const mpz_t @var{op})
Output @var{op} on stdio stream @var{stream}, in raw binary format.  The
integer is written in a portable format, with 4 bytes of size information, and
that many bytes of limbs.  Both the size and the limbs are written in
decreasing significance order (i.e., in big-endian).

The output can be read with @code{mpz_inp_raw}.

Return the number of bytes written, or if an error occurred, return 0.

The output of this function cannot be read by @code{mpz_inp_raw} from GMP 1,
because of changes necessary for compatibility between 32-bit and 64-bit
machines.
@end deftypefun

@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
Input from stdio stream @var{stream} in the format written by
@code{mpz_out_raw}, and put the result in @var{rop}.  Return the number of
bytes read, or if an error occurred, return 0.

This routine can also read the output of @code{mpz_out_raw} from GMP 1, in
spite of changes necessary for compatibility between 32-bit and 64-bit
machines.
@end deftypefun


@need 2000
@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
@comment  node-name,  next,  previous,  up
@section Random Number Functions
@cindex Integer random number functions
@cindex Random number functions

The random number functions of GMP come in two groups: older functions
that rely on a global state, and newer functions that accept a state
parameter that is read and modified.  @xref{Random Number Functions}, for
more information on how to use and not to use random number functions.

@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
Generate a uniformly distributed random integer in the range 0 to @m{2^n-1,
2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@pxref{Random State Initialization}) before
invoking this function.
@end deftypefun

@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, const mpz_t @var{n})
Generate a uniform random integer in the range 0 to @math{@var{n}-1},
inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@pxref{Random State Initialization})
before invoking this function.
@end deftypefun

@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
Generate a random integer with long strings of zeros and ones in the
binary representation.  Useful for testing functions and algorithms,
since this kind of random number has proven to be more likely to
trigger corner-case bugs.  The random number will be in the range
0 to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@pxref{Random State Initialization})
before invoking this function.
@end deftypefun

@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs.  The generated
random number doesn't satisfy any particular requirements of randomness.
Negative random numbers are generated when @var{max_size} is negative.

This function is obsolete.  Use @code{mpz_urandomb} or
@code{mpz_urandomm} instead.
@end deftypefun

@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs, with long strings
of zeros and ones in the binary representation.  Useful for testing functions
and algorithms, since this kind of random number has proven to be more
likely to trigger corner-case bugs.  Negative random numbers are generated
when @var{max_size} is negative.

This function is obsolete.  Use @code{mpz_rrandomb} instead.
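
A minimal sketch of the state-based replacement follows.  The helper name
@code{rrandom_bits} is hypothetical, and the fixed seed is an assumption made
only to keep the illustration repeatable.

@example
#include <stddef.h>
#include <gmp.h>

/* Hypothetical helper: generate one value of at most n bits with
   mpz_rrandomb and return its size in bits.  */
size_t
rrandom_bits (mp_bitcnt_t n, unsigned long seed)
@{
  gmp_randstate_t state;
  mpz_t r;
  size_t bits;

  gmp_randinit_default (state);   /* explicit state, no globals */
  gmp_randseed_ui (state, seed);
  mpz_init (r);

  mpz_rrandomb (r, state, n);     /* 0 <= r <= 2^n - 1 */
  bits = mpz_sizeinbase (r, 2);

  mpz_clear (r);
  gmp_randclear (state);
  return bits;
@}
@end example

Unlike the obsolete functions, the size is given in bits rather than limbs, and
the result is never negative; an application wanting negative test values can
apply @code{mpz_neg} itself.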
3952@end deftypefun 3953 3954 3955@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions 3956@section Integer Import and Export 3957 3958@code{mpz_t} variables can be converted to and from arbitrary words of binary 3959data with the following functions. 3960 3961@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op}) 3962@cindex Integer import 3963@cindex Import 3964Set @var{rop} from an array of word data at @var{op}. 3965 3966The parameters specify the format of the data. @var{count} many words are 3967read, each @var{size} bytes. @var{order} can be 1 for most significant word 3968first or -1 for least significant first. Within each word @var{endian} can be 39691 for most significant byte first, -1 for least significant first, or 0 for 3970the native endianness of the host CPU@. The most significant @var{nails} bits 3971of each word are skipped, this can be 0 to use the full words. 3972 3973There is no sign taken from the data, @var{rop} will simply be a positive 3974integer. An application can handle any sign itself, and apply it for instance 3975with @code{mpz_neg}. 3976 3977There are no data alignment restrictions on @var{op}, any address is allowed. 3978 3979Here's an example converting an array of @code{unsigned long} data, most 3980significant element first, and host byte order within each value. 3981 3982@example 3983unsigned long a[20]; 3984/* Initialize @var{z} and @var{a} */ 3985mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a); 3986@end example 3987 3988This example assumes the full @code{sizeof} bytes are used for data in the 3989given type, which is usually true, and certainly true for @code{unsigned long} 3990everywhere we know of. 
However on Cray vector systems it may be noted that 3991@code{short} and @code{int} are always stored in 8 bytes (and with 3992@code{sizeof} indicating that) but use only 32 or 46 bits. The @var{nails} 3993feature can account for this, by passing for instance 3994@code{8*sizeof(int)-INT_BIT}. 3995@end deftypefun 3996 3997@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const mpz_t @var{op}) 3998@cindex Integer export 3999@cindex Export 4000Fill @var{rop} with word data from @var{op}. 4001 4002The parameters specify the format of the data produced. Each word will be 4003@var{size} bytes and @var{order} can be 1 for most significant word first or 4004-1 for least significant first. Within each word @var{endian} can be 1 for 4005most significant byte first, -1 for least significant first, or 0 for the 4006native endianness of the host CPU@. The most significant @var{nails} bits of 4007each word are unused and set to zero, this can be 0 to produce full words. 4008 4009The number of words produced is written to @code{*@var{countp}}, or 4010@var{countp} can be @code{NULL} to discard the count. @var{rop} must have 4011enough space for the data, or if @var{rop} is @code{NULL} then a result array 4012of the necessary size is allocated using the current GMP allocation function 4013(@pxref{Custom Allocation}). In either case the return value is the 4014destination used, either @var{rop} or the allocated block. 4015 4016If @var{op} is non-zero then the most significant word produced will be 4017non-zero. If @var{op} is zero then the count returned will be zero and 4018nothing written to @var{rop}. If @var{rop} is @code{NULL} in this case, no 4019block is allocated, just @code{NULL} is returned. 4020 4021The sign of @var{op} is ignored, just the absolute value is exported. An 4022application can use @code{mpz_sgn} to get the sign and handle it as desired. 
(@pxref{Integer Comparisons})

There are no data alignment restrictions on @var{rop}, any address is allowed.

When an application is allocating space itself the required size can be
determined with a calculation like the following. Since @code{mpz_sizeinbase}
always returns at least 1, @code{count} here will be at least one, which
avoids any portability problems with @code{malloc(0)}, though if @code{z} is
zero no space at all is actually needed (or written).

@example
numb = 8*size - nail;
count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
p = malloc (count * size);
@end example
@end deftypefun


@need 2000
@node Miscellaneous Integer Functions, Integer Special Functions, Integer Import and Export, Integer Functions
@comment node-name, next, previous, up
@section Miscellaneous Functions
@cindex Miscellaneous integer functions
@cindex Integer miscellaneous functions

@deftypefun int mpz_fits_ulong_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_slong_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_uint_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_sint_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_ushort_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_sshort_p (const mpz_t @var{op})
Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
short int}, or @code{signed short int}, respectively. Otherwise, return zero.
@end deftypefun

@deftypefn Macro int mpz_odd_p (const mpz_t @var{op})
@deftypefnx Macro int mpz_even_p (const mpz_t @var{op})
Determine whether @var{op} is odd or even, respectively. Return non-zero if
yes, zero if no. These macros evaluate their argument more than once.
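
As a rough sketch of what the repeated evaluation means in practice, the
argument passed to these macros should be free of side effects. The helper
name @code{count_odd} below is made up for illustration and is not part of
GMP.

@example
#include <stddef.h>
#include <gmp.h>

/* Hypothetical helper: count the odd entries of v[0..n-1].  The index is
   advanced in the loop header, never inside the macro argument, so
   evaluating the argument more than once is harmless.  */
size_t
count_odd (mpz_t *v, size_t n)
@{
  size_t i, odd = 0;
  for (i = 0; i < n; i++)
    if (mpz_odd_p (v[i]))
      odd++;
  return odd;
@}
@end example

Writing @code{mpz_odd_p (v[i++])} instead would risk advancing @code{i} more
than once per test.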
@end deftypefn

@deftypefun size_t mpz_sizeinbase (const mpz_t @var{op}, int @var{base})
@cindex Size in digits
@cindex Digits in an integer
Return the size of @var{op} measured in number of digits in the given
@var{base}. @var{base} can vary from 2 to 62. The sign of @var{op} is
ignored, just the absolute value is used. The result will be either exact or
1 too big. If @var{base} is a power of 2, the result is always exact. If
@var{op} is zero the return value is always 1.

This function can be used to determine the space required when converting
@var{op} to a string. The right amount of allocation is normally two more
than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign
and one for the null-terminator.

@cindex Most significant bit
It will be noted that @code{mpz_sizeinbase(@var{op},2)} can be used to locate
the most significant 1 bit in @var{op}, counting from 1. (Unlike the bitwise
functions which start from 0, see @ref{Integer Logic and Bit Fiddling,, Logical
and Bit Manipulation Functions}.)
@end deftypefun


@node Integer Special Functions, , Miscellaneous Integer Functions, Integer Functions
@section Special Functions
@cindex Special integer functions
@cindex Integer special functions

The functions in this section are for various special purposes. Most
applications will not need them.

@deftypefun void mpz_array_init (mpz_t @var{integer_array}, mp_size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
This is a special type of initialization. @strong{Fixed} space of
@var{fixed_num_bits} is allocated to each of the @var{array_size} integers in
@var{integer_array}. There is no way to free the storage allocated by this
function. Don't call @code{mpz_clear}!

The @var{integer_array} parameter is the first @code{mpz_t} in the array.
For
example,

@example
mpz_t arr[20000];
mpz_array_init (arr[0], 20000, 512);
@end example

@c In case anyone's wondering, yes this parameter style is a bit anomalous,
@c it'd probably be nicer if it was "arr" instead of "arr[0]". Obviously the
@c two differ only in the declaration, not the pointer value, but changing is
@c not possible since it'd provoke warnings or errors in existing sources.

This function is only intended for programs that create a large number
of integers and need to reduce memory usage by avoiding the overheads of
allocating and reallocating lots of small blocks. In normal programs this
function is not recommended.

The space allocated to each integer by this function will not be automatically
increased, unlike the normal @code{mpz_init}, so an application must ensure it
is sufficient for any value stored. The following space requirements apply to
various routines,

@itemize @bullet
@item
@code{mpz_abs}, @code{mpz_neg}, @code{mpz_set}, @code{mpz_set_si} and
@code{mpz_set_ui} need room for the value they store.

@item
@code{mpz_add}, @code{mpz_add_ui}, @code{mpz_sub} and @code{mpz_sub_ui} need
room for the larger of the two operands, plus an extra
@code{mp_bits_per_limb}.

@item
@code{mpz_mul}, @code{mpz_mul_ui} and @code{mpz_mul_si} need room for the sum
of the number of bits in their operands, but each rounded up to a multiple of
@code{mp_bits_per_limb}.

@item
@code{mpz_swap} can be used between two array variables, but not between an
array and a normal variable.
@end itemize

For other functions, or if in doubt, the suggestion is to calculate in a
regular @code{mpz_init} variable and copy the result to an array variable with
@code{mpz_set}.
@end deftypefun

@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
Change the space for @var{integer} to @var{new_alloc} limbs. The value in
@var{integer} is preserved if it fits, or is set to 0 if not. The return
value is not useful to applications and should be ignored.

@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
this. @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
@code{_mpz_realloc} takes its size in limbs.
@end deftypefun

@deftypefun mp_limb_t mpz_getlimbn (const mpz_t @var{op}, mp_size_t @var{n})
Return limb number @var{n} from @var{op}. The sign of @var{op} is ignored,
just the absolute value is used. The least significant limb is number 0.

@code{mpz_size} can be used to find how many limbs make up @var{op}.
@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
@code{mpz_size(@var{op})-1}.
@end deftypefun

@deftypefun size_t mpz_size (const mpz_t @var{op})
Return the size of @var{op} measured in number of limbs. If @var{op} is zero,
the returned value will be zero.
@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
@end deftypefun



@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
@comment node-name, next, previous, up
@chapter Rational Number Functions
@cindex Rational number functions

This chapter describes the GMP functions for performing arithmetic on rational
numbers. These functions start with the prefix @code{mpq_}.

Rational numbers are stored in objects of type @code{mpq_t}.

All rational arithmetic functions assume operands have a canonical form, and
canonicalize their result. The canonical form means that the denominator and
the numerator have no common factors, and that the denominator is positive.
Zero has the unique representation 0/1.

Pure assignment functions do not canonicalize the assigned variable. It is
the responsibility of the user to canonicalize the assigned variable before
any arithmetic operations are performed on that variable.

@deftypefun void mpq_canonicalize (mpq_t @var{op})
Remove any factors that are common to the numerator and denominator of
@var{op}, and make the denominator positive.
@end deftypefun

@menu
* Initializing Rationals::
* Rational Conversions::
* Rational Arithmetic::
* Comparing Rationals::
* Applying Integer Functions::
* I/O of Rationals::
@end menu

@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Initialization and Assignment Functions
@cindex Rational assignment functions
@cindex Assignment functions
@cindex Rational initialization functions
@cindex Initialization functions

@deftypefun void mpq_init (mpq_t @var{x})
Initialize @var{x} and set it to 0/1. Each variable should normally only be
initialized once, or at least cleared out (using the function @code{mpq_clear})
between each initialization.
@end deftypefun

@deftypefun void mpq_inits (mpq_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpq_t} variables, and set their
values to 0/1.
@end deftypefun

@deftypefun void mpq_clear (mpq_t @var{x})
Free the space occupied by @var{x}. Make sure to call this function for all
@code{mpq_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpq_clears (mpq_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpq_t} variables.
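
The usual lifecycle of initialization, use and clearing might be sketched as
follows. This is only an illustration, and the function name
@code{sum_is_five_sixths} is made up here.

@example
#include <stddef.h>
#include <gmp.h>

/* Hypothetical helper: return non-zero if 1/2 + 2/6 equals 5/6.  */
int
sum_is_five_sixths (void)
@{
  mpq_t a, b, sum;
  int ok;

  mpq_inits (a, b, sum, NULL);      /* all three start as 0/1 */

  mpq_set_ui (a, 1, 2);             /* already canonical */
  mpq_set_ui (b, 2, 6);             /* 2/6 has a common factor */
  mpq_canonicalize (b);             /* now 1/3 */

  mpq_add (sum, a, b);
  ok = (mpq_cmp_ui (sum, 5, 6) == 0);

  mpq_clears (a, b, sum, NULL);
  return ok;
@}
@end example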
@end deftypefun

@deftypefun void mpq_set (mpq_t @var{rop}, const mpq_t @var{op})
@deftypefunx void mpq_set_z (mpq_t @var{rop}, const mpz_t @var{op})
Assign @var{rop} from @var{op}.
@end deftypefun

@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
Set the value of @var{rop} to @var{op1}/@var{op2}. Note that if @var{op1} and
@var{op2} have common factors, @var{rop} has to be passed to
@code{mpq_canonicalize} before any operations are performed on @var{rop}.
@end deftypefun

@deftypefun int mpq_set_str (mpq_t @var{rop}, const char *@var{str}, int @var{base})
Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.

The string can be an integer like ``41'' or a fraction like ``41/152''. The
fraction must be in canonical form (@pxref{Rational Number Functions}), or if
not then @code{mpq_canonicalize} must be called.

The numerator and optional denominator are parsed the same as in
@code{mpz_set_str} (@pxref{Assigning Integers}). White space is allowed in
the string, and is simply ignored. The @var{base} can vary from 2 to 62, or
if @var{base} is 0 then the leading characters are used: @code{0x} or @code{0X} for hex,
@code{0b} or @code{0B} for binary,
@code{0} for octal, or decimal otherwise. Note that this is done separately
for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
whereas @code{0xEF/0x100} is 239/256.

The return value is 0 if the entire string is a valid number, or @minus{}1 if
not.
@end deftypefun

@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
Swap the values @var{rop1} and @var{rop2} efficiently.
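
Drawing on the assignment functions in this section, the following sketch
parses a rational from a string, canonicalizes it, and uses @code{mpq_swap} to
put two values in order. The names @code{parse_rational} and
@code{order_pair} are made up for illustration, and rejecting a zero
denominator is a choice made for this sketch, not something GMP does.

@example
#include <gmp.h>

/* Hypothetical helper: set q from s, with the base taken from the leading
   characters.  Return 0 on success, -1 on failure.  On failure the value
   left in q should not be relied upon.  */
int
parse_rational (mpq_t q, const char *s)
@{
  if (mpq_set_str (q, s, 0) != 0)
    return -1;                        /* not a valid number */
  if (mpz_sgn (mpq_denref (q)) == 0)
    return -1;                        /* refuse a zero denominator */
  mpq_canonicalize (q);               /* input might be like 6/4 */
  return 0;
@}

/* Hypothetical helper: arrange for lo <= hi, exchanging if necessary.  */
void
order_pair (mpq_t lo, mpq_t hi)
@{
  if (mpq_cmp (lo, hi) > 0)
    mpq_swap (lo, hi);
@}
@end example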
@end deftypefun


@need 2000
@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Rational conversion functions
@cindex Conversion functions

@deftypefun double mpq_get_d (const mpq_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent from the conversion is too big or too small to fit a
@code{double} then the result is system dependent. For too big an infinity is
returned when available. For too small @math{0.0} is normally returned.
Hardware overflow, underflow and denorm traps may or may not occur.
@end deftypefun

@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
@deftypefunx void mpq_set_f (mpq_t @var{rop}, const mpf_t @var{op})
Set @var{rop} to the value of @var{op}. There is no rounding, this conversion
is exact.
@end deftypefun

@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, const mpq_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base may vary
from 2 to 36. The string will be of the form @samp{num/den}, or if the
denominator is 1 then just @samp{num}.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of storage large
enough for the result, that being

@example
mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
@end example

The three extra bytes are for a possible minus sign, possible slash, and the
null-terminator.
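
An application supplying its own buffer might apply that size calculation
along the following lines. This is a sketch only; the function name
@code{rational_to_string} and the use of @code{malloc} are assumptions made
for illustration.

@example
#include <stdlib.h>
#include <gmp.h>

/* Hypothetical helper: convert q to a malloc'ed string in the given base
   (2 to 36).  Return NULL if memory cannot be obtained.  The caller
   releases the result with free().  */
char *
rational_to_string (const mpq_t q, int base)
@{
  size_t len = mpz_sizeinbase (mpq_numref (q), base)
               + mpz_sizeinbase (mpq_denref (q), base) + 3;
  char *buf = malloc (len);
  if (buf == NULL)
    return NULL;
  return mpq_get_str (buf, base, q);  /* returns buf */
@}
@end example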

A pointer to the result string is returned, being either the allocated block,
or the given @var{str}.
@end deftypefun


@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Rational arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpq_add (mpq_t @var{sum}, const mpq_t @var{addend1}, const mpq_t @var{addend2})
Set @var{sum} to @var{addend1} + @var{addend2}.
@end deftypefun

@deftypefun void mpq_sub (mpq_t @var{difference}, const mpq_t @var{minuend}, const mpq_t @var{subtrahend})
Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
@end deftypefun

@deftypefun void mpq_mul (mpq_t @var{product}, const mpq_t @var{multiplier}, const mpq_t @var{multiplicand})
Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
@end deftypefun

@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_div (mpq_t @var{quotient}, const mpq_t @var{dividend}, const mpq_t @var{divisor})
@cindex Division functions
Set @var{quotient} to @var{dividend}/@var{divisor}.
@end deftypefun

@deftypefun void mpq_div_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_neg (mpq_t @var{negated_operand}, const mpq_t @var{operand})
Set @var{negated_operand} to @minus{}@var{operand}.
@end deftypefun

@deftypefun void mpq_abs (mpq_t @var{rop}, const mpq_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun

@deftypefun void mpq_inv (mpq_t @var{inverted_number}, const mpq_t @var{number})
Set @var{inverted_number} to 1/@var{number}. If the new denominator is
zero, this routine will divide by zero.
@end deftypefun

@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Rational comparison functions
@cindex Comparison functions

@deftypefun int mpq_cmp (const mpq_t @var{op1}, const mpq_t @var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
@math{@var{op1} < @var{op2}}.

To determine if two rationals are equal, @code{mpq_equal} is faster than
@code{mpq_cmp}.
@end deftypefun

@deftypefn Macro int mpq_cmp_ui (const mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
@deftypefnx Macro int mpq_cmp_si (const mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
Compare @var{op1} and @var{num2}/@var{den2}. Return a positive value if
@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
@var{num2}/@var{den2}}.

@var{num2} and @var{den2} are allowed to have common factors.

These functions are implemented as macros and evaluate their arguments
multiple times.
@end deftypefn

@deftypefn Macro int mpq_sgn (const mpq_t @var{op})
@cindex Sign tests
@cindex Rational sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its
argument multiple times.
@end deftypefn

@deftypefun int mpq_equal (const mpq_t @var{op1}, const mpq_t @var{op2})
Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
non-equal. Although @code{mpq_cmp} can be used for the same purpose, this
function is much faster.
@end deftypefun

@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Applying Integer Functions to Rationals
@cindex Rational numerator and denominator
@cindex Numerator and denominator

The set of @code{mpq} functions is quite small. In particular, there are few
functions for either input or output. The following functions give direct
access to the numerator and denominator of an @code{mpq_t}.

Note that if an assignment to the numerator and/or denominator could take an
@code{mpq_t} out of the canonical form described at the start of this chapter
(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
called before any other @code{mpq} functions are applied to that @code{mpq_t}.

@deftypefn Macro mpz_t mpq_numref (const mpq_t @var{op})
@deftypefnx Macro mpz_t mpq_denref (const mpq_t @var{op})
Return a reference to the numerator and denominator of @var{op}, respectively.
The @code{mpz} functions can be used on the result of these macros.
@end deftypefn

@deftypefun void mpq_get_num (mpz_t @var{numerator}, const mpq_t @var{rational})
@deftypefunx void mpq_get_den (mpz_t @var{denominator}, const mpq_t @var{rational})
@deftypefunx void mpq_set_num (mpq_t @var{rational}, const mpz_t @var{numerator})
@deftypefunx void mpq_set_den (mpq_t @var{rational}, const mpz_t @var{denominator})
Get or set the numerator or denominator of a rational. These functions are
equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
@code{mpq_denref}.
Direct use of @code{mpq_numref} or @code{mpq_denref} is
recommended instead of these functions.
@end deftypefun


@need 2000
@node I/O of Rationals, , Applying Integer Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Rational input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

Functions that perform input from a stdio stream, and functions that output to
a stdio stream, of @code{mpq} numbers. Passing a @code{NULL} pointer for a
@var{stream} argument to any of these functions will make them read from
@code{stdin} and write to @code{stdout}, respectively.

When using any of these functions, it is a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

See also @ref{Formatted Output} and @ref{Formatted Input}.

@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, const mpq_t @var{op})
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
@var{base}. The base may vary from 2 to 36. Output is in the form
@samp{num/den} or if the denominator is 1 then just @samp{num}.

Return the number of bytes written, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
Read a string of digits from @var{stream} and convert them to a rational in
@var{rop}. Any initial white-space characters are read and discarded. Return
the number of characters read (including white space), or 0 if a rational
could not be read.

The input can be a fraction like @samp{17/63} or just an integer like
@samp{123}. Reading stops at the first character not in this form, and white
space is not permitted within the string.
If the input might not be in
canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
Number Functions}).

The @var{base} can be between 2 and 36, or can be 0 in which case the leading
characters of the string determine the base, @samp{0x} or @samp{0X} for
hexadecimal, @samp{0} for octal, or decimal otherwise. The leading characters
are examined separately for the numerator and denominator of a fraction, so
for instance @samp{0x10/11} is @math{16/11}, whereas @samp{0x10/0x11} is
@math{16/17}.
@end deftypefun


@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
@comment node-name, next, previous, up
@chapter Floating-point Functions
@cindex Floating-point functions
@cindex Float functions
@cindex User-defined precision
@cindex Precision of floats

GMP floating point numbers are stored in objects of type @code{mpf_t} and
functions operating on them have an @code{mpf_} prefix.

The mantissa of each float has a user-selectable precision, limited only by
available memory. Each variable has its own precision, and that can be
increased or decreased at any time.

The exponent of each float is a fixed precision, one machine word on most
systems. In the current implementation the exponent is a count of limbs, so
for example on a 32-bit system this means a range of roughly
@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
this will be greater. Note however @code{mpf_get_str} can only return an
exponent which fits an @code{mp_exp_t} and currently @code{mpf_set_str}
doesn't accept exponents bigger than a @code{long}.

Each variable keeps a size for the mantissa data actually in use. This means
that if a float is exactly represented in only a few bits then only those bits
will be used in a calculation, even if the selected precision is high.

All calculations are performed to the precision of the destination variable.
Each function is defined to calculate with ``infinite precision'' followed by
a truncation to the destination precision, but of course the work done is only
what's needed to determine a result under that definition.

The precision selected for a variable is a minimum value, GMP may increase it
a little to facilitate efficient calculation. Currently this means rounding
up to a whole limb, and then sometimes having a further partial limb,
depending on the high limb of the mantissa. But applications shouldn't be
concerned by such details.

The mantissa is stored in binary, as might be imagined from the fact
precisions are expressed in bits. One consequence of this is that decimal
fractions like @math{0.1} cannot be represented exactly. The same is true of
plain IEEE @code{double} floats. This makes both highly unsuitable for
calculations involving money or other values that should be exact decimal
fractions. (Suitably scaled integers, or perhaps rationals, are better
choices.)

@code{mpf} functions and variables have no special notion of infinity or
not-a-number, and applications must take care not to overflow the exponent or
results will be unpredictable. This might change in a future release.

Note that the @code{mpf} functions are @emph{not} intended as a smooth
extension to IEEE P754 arithmetic. In particular results obtained on one
computer often differ from the results on a computer with a different word
size.
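
The point about decimal fractions can be checked with a small sketch using
rationals. It assumes @code{double} is a binary format such as IEEE, and the
function names @code{double_tenth_is_exact} and @code{ten_tenths_is_one} are
made up for illustration.

@example
#include <gmp.h>

/* Hypothetical helper: return non-zero if the double 0.1 converts to
   exactly 1/10.  With binary doubles this is expected to return zero,
   since mpq_set_d is exact and 1/10 has no finite binary expansion.  */
int
double_tenth_is_exact (void)
@{
  mpq_t from_double, tenth;
  int same;

  mpq_init (from_double);
  mpq_init (tenth);
  mpq_set_d (from_double, 0.1);     /* exact value of the double */
  mpq_set_ui (tenth, 1, 10);        /* exact decimal fraction */
  same = mpq_equal (from_double, tenth);
  mpq_clear (from_double);
  mpq_clear (tenth);
  return same;
@}

/* Hypothetical helper: return non-zero if ten additions of the rational
   1/10 give exactly 1.  */
int
ten_tenths_is_one (void)
@{
  mpq_t tenth, sum;
  int i, ok;

  mpq_init (tenth);
  mpq_init (sum);                   /* starts as 0/1 */
  mpq_set_ui (tenth, 1, 10);
  for (i = 0; i < 10; i++)
    mpq_add (sum, sum, tenth);
  ok = (mpq_cmp_ui (sum, 1, 1) == 0);
  mpq_clear (tenth);
  mpq_clear (sum);
  return ok;
@}
@end example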

@menu
* Initializing Floats::
* Assigning Floats::
* Simultaneous Float Init & Assign::
* Converting Floats::
* Float Arithmetic::
* Float Comparison::
* I/O of Floats::
* Miscellaneous Float Functions::
@end menu

@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Float initialization functions
@cindex Initialization functions

@deftypefun void mpf_set_default_prec (mp_bitcnt_t @var{prec})
Set the default precision to be @strong{at least} @var{prec} bits. All
subsequent calls to @code{mpf_init} will use this precision, but previously
initialized variables are unaffected.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpf_get_default_prec (void)
Return the default precision actually used.
@end deftypefun

An @code{mpf_t} object must be initialized before storing the first value in
it. The functions @code{mpf_init} and @code{mpf_init2} are used for that
purpose.

@deftypefun void mpf_init (mpf_t @var{x})
Initialize @var{x} to 0. Normally, a variable should be initialized once only
or at least be cleared, using @code{mpf_clear}, between initializations. The
precision of @var{x} is undefined unless a default precision has already been
established by a call to @code{mpf_set_default_prec}.
@end deftypefun

@deftypefun void mpf_init2 (mpf_t @var{x}, mp_bitcnt_t @var{prec})
Initialize @var{x} to 0 and set its precision to be @strong{at least}
@var{prec} bits. Normally, a variable should be initialized once only or at
least be cleared, using @code{mpf_clear}, between initializations.
@end deftypefun

@deftypefun void mpf_inits (mpf_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpf_t} variables, and set their
values to 0.
The precision of the initialized variables is undefined unless a
default precision has already been established by a call to
@code{mpf_set_default_prec}.
@end deftypefun

@deftypefun void mpf_clear (mpf_t @var{x})
Free the space occupied by @var{x}. Make sure to call this function for all
@code{mpf_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpf_clears (mpf_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpf_t} variables.
@end deftypefun

@need 2000
Here is an example on how to initialize floating-point variables:
@example
@{
  mpf_t x, y;
  mpf_init (x); /* use default precision */
  mpf_init2 (y, 256); /* precision @emph{at least} 256 bits */
  @dots{}
  /* Unless the program is about to exit, do ... */
  mpf_clear (x);
  mpf_clear (y);
@}
@end example

The following three functions are useful for changing the precision during a
calculation. A typical use would be for adjusting the precision gradually in
iterative algorithms like Newton-Raphson, making the computation precision
closely match the actual accurate part of the numbers.

@deftypefun {mp_bitcnt_t} mpf_get_prec (const mpf_t @var{op})
Return the current precision of @var{op}, in bits.
@end deftypefun

@deftypefun void mpf_set_prec (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits. The
value in @var{rop} will be truncated to the new precision.

This function requires a call to @code{realloc}, and so should not be used in
a tight loop.
@end deftypefun

@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
without changing the memory allocated.
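
A typical pattern is to record the allocated precision, work at a lower
precision for a while, and restore the recorded precision before clearing.
The following is a rough sketch of that pattern only; the function name
@code{third_at_low_precision} and the precisions 512 and 64 are arbitrary
choices made for illustration.

@example
#include <gmp.h>

/* Hypothetical helper: compute 1/3 in a variable allocated for 512 bits
   but temporarily used at 64 bits, and return it as a double.  */
double
third_at_low_precision (void)
@{
  mpf_t x;
  mp_bitcnt_t full;
  double d;

  mpf_init2 (x, 512);
  full = mpf_get_prec (x);        /* allocated precision, at least 512 */

  mpf_set_prec_raw (x, 64);       /* lower precision, memory unchanged */
  mpf_set_ui (x, 1);
  mpf_div_ui (x, x, 3);
  d = mpf_get_d (x);

  mpf_set_prec_raw (x, full);     /* restore before mpf_clear */
  mpf_clear (x);
  return d;
@}
@end example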

@var{prec} must be no more than the allocated precision for @var{rop}, that
being the precision when @var{rop} was initialized, or in the most recent
@code{mpf_set_prec}.

The value in @var{rop} is unchanged, and in particular if it had a higher
precision than @var{prec} it will retain that higher precision. New values
written to @var{rop} will use the new @var{prec}.

Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
allocated precision. Failing to do so will have unpredictable results.

@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
original allocated precision. After @code{mpf_set_prec_raw} it reflects the
@var{prec} value set.

@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
different precisions during a calculation, perhaps to gradually increase
precision in an iteration, or just to use various different precisions for
different purposes during a calculation.
@end deftypefun


@need 2000
@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Assignment Functions
@cindex Float assignment functions
@cindex Assignment functions

These functions assign new values to already initialized floats
(@pxref{Initializing Floats}).

@deftypefun void mpf_set (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
@deftypefunx void mpf_set_z (mpf_t @var{rop}, const mpz_t @var{op})
@deftypefunx void mpf_set_q (mpf_t @var{rop}, const mpq_t @var{op})
Set the value of @var{rop} from @var{op}.
@end deftypefun

@deftypefun int mpf_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
Set the value of @var{rop} from the string in @var{str}. The string is of the
form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
@samp{M} is the mantissa and @samp{N} is the exponent. The mantissa is always
in the specified base. The exponent is either in the specified base or, if
@var{base} is negative, in decimal. The decimal point expected is taken from
the current locale, on systems providing @code{localeconv}.

The argument @var{base} may be in the ranges 2 to 62, or @minus{}62 to
@minus{}2. Negative values are used to specify that the exponent is in
decimal.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value; for bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

Unlike the corresponding @code{mpz} function, the base will not be determined
from the leading characters of the string if @var{base} is 0. This is so that
numbers like @samp{0.23} are not interpreted as octal.

White space is allowed in the string, and is simply ignored. [This is not
really true; white-space is ignored in the beginning of the string and within
the mantissa, but not in other places, such as after a minus sign or in the
exponent.
We are considering changing the definition of this function, making
it fail when there is any white-space in the input, since that makes a lot of
sense. Please tell us your opinion about this change. Do you really want it
to accept @nicode{"3 14"} as meaning 314 as it does now?]

This function returns 0 if the entire string is a valid number in base
@var{base}. Otherwise it returns @minus{}1.
@end deftypefun

@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
Swap @var{rop1} and @var{rop2} efficiently. Both the values and the
precisions of the two variables are swapped.
@end deftypefun


@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Combined Initialization and Assignment Functions
@cindex Float assignment functions
@cindex Assignment functions
@cindex Float initialization functions
@cindex Initialization functions

For convenience, GMP provides a parallel series of initialize-and-set functions
which initialize the output and then store the value there. These functions'
names have the form @code{mpf_init_set@dots{}}.

Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
functions, it can be used as the source or destination operand for the ordinary
float functions. Don't use an initialize-and-set function on a variable
already initialized!

@deftypefun void mpf_init_set (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
Initialize @var{rop} and set its value from @var{op}.

The precision of @var{rop} will be taken from the active default precision, as
set by @code{mpf_set_default_prec}.
@end deftypefun

@deftypefun int mpf_init_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
Initialize @var{rop} and set its value from the string in @var{str}. See
@code{mpf_set_str} above for details on the assignment operation.

Note that @var{rop} is initialized even if an error occurs. (I.e., you have to
call @code{mpf_clear} for it.)

The precision of @var{rop} will be taken from the active default precision, as
set by @code{mpf_set_default_prec}.
@end deftypefun


@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Float conversion functions
@cindex Conversion functions

@deftypefun double mpf_get_d (const mpf_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent in @var{op} is too big or too small to fit a @code{double}
then the result is system dependent. If it is too big, an infinity is returned
when available. If it is too small, @math{0.0} is normally returned. Hardware
overflow, underflow and denorm traps may or may not occur.
@end deftypefun

@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, const mpf_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero), and with an exponent returned separately.

The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
exponent is stored to @code{*@var{exp}}. @m{@var{d} * 2^{exp}, @var{d} *
2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero, the
return is @math{0.0} and 0 is stored to @code{*@var{exp}}.
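
The normalization just described can be tried out on ordinary @code{double}
values with the standard C library alone. The following is only a sketch: the
helper names @code{split_like_get_d_2exp} and @code{rebuild} are hypothetical,
and no truncation is involved here because the input is already a
@code{double}.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: split x into d and *exp such that
   0.5 <= |d| < 1 and x == d * 2^(*exp).  For x == 0 the standard
   frexp returns 0.0 and stores 0, matching the convention above.  */
double
split_like_get_d_2exp (long *exp, double x)
{
  int e;
  double d = frexp (x, &e);
  *exp = e;
  return d;
}

/* Rebuild the original value from the two parts.  */
double
rebuild (double d, long exp)
{
  return ldexp (d, (int) exp);
}
```

For instance 8.0 splits into 0.5 and an exponent of 4, and @minus{}3.0 into
@minus{}0.75 and an exponent of 2.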

@cindex @code{frexp}
This is similar to the standard C @code{frexp} function (@pxref{Normalization
Functions,,, libc, The GNU C Library Reference Manual}).
@end deftypefun

@deftypefun long mpf_get_si (const mpf_t @var{op})
@deftypefunx {unsigned long} mpf_get_ui (const mpf_t @var{op})
Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
fraction part. If @var{op} is too big for the return type, the result is
undefined.

See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
(@pxref{Miscellaneous Float Functions}).
@end deftypefun

@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base argument
may vary from 2 to 62 or from @minus{}2 to @minus{}36. Up to @var{n_digits}
digits will be generated. Trailing zeros are not returned. No more digits
than can be accurately represented by @var{op} are ever generated. If
@var{n_digits} is 0 then that accurate maximum number of digits is generated.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of
@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
possible minus sign, and a null-terminator.
When @var{n_digits} is 0 to get
all significant digits, an application won't be able to know the space
required, and @var{str} should be @code{NULL} in that case.

The generated string is a fraction, with an implicit radix point immediately
to the left of the first digit. The applicable exponent is written through
the @var{expptr} pointer. For example, the number 3.1416 would be returned as
string @nicode{"31416"} and exponent 1.

When @var{op} is zero, an empty string is produced and the exponent returned
is 0.

A pointer to the result string is returned, being either the allocated block
or the given @var{str}.
@end deftypefun


@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Float arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpf_add (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_add_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} + @var{op2}}.
@end deftypefun

@deftypefun void mpf_sub (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @var{op1} @minus{} @var{op2}.
@end deftypefun

@deftypefun void mpf_mul (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

Division is undefined if the divisor is zero, and passing a zero divisor to the
divide functions will make these functions intentionally divide by zero. This
lets the user handle arithmetic exceptions in these functions in the same
manner as other arithmetic exceptions.

@deftypefun void mpf_div (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_div_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
@cindex Division functions
Set @var{rop} to @var{op1}/@var{op2}.
@end deftypefun

@deftypefun void mpf_sqrt (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
@cindex Root extraction functions
Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
@end deftypefun

@deftypefun void mpf_pow_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
@cindex Exponentiation functions
@cindex Powering functions
Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
@end deftypefun

@deftypefun void mpf_neg (mpf_t @var{rop}, const mpf_t @var{op})
Set @var{rop} to @minus{}@var{op}.
@end deftypefun

@deftypefun void mpf_abs (mpf_t @var{rop}, const mpf_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun

@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpf_div_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
@end deftypefun

@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Float comparison functions
@cindex Comparison functions

@deftypefun int mpf_cmp (const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx int mpf_cmp_d (const mpf_t @var{op1}, double @var{op2})
@deftypefunx int mpf_cmp_ui (const mpf_t @var{op1}, unsigned long int @var{op2})
@deftypefunx int mpf_cmp_si (const mpf_t @var{op1}, signed long int @var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
@math{@var{op1} < @var{op2}}.

@code{mpf_cmp_d} can be called with an infinity, but results are undefined for
a NaN.
@end deftypefun

@deftypefun int mpf_eq (const mpf_t @var{op1}, const mpf_t @var{op2}, mp_bitcnt_t @var{op3})
Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
equal, zero otherwise. I.e., test if @var{op1} and @var{op2} are approximately
equal.

Caution 1: All versions of GMP up to version 4.2.4 compared just whole limbs,
meaning sometimes more than @var{op3} bits, sometimes fewer.

Caution 2: This function will consider XXX11...111 and XX100...000 different,
even if ... is replaced by a semi-infinite number of bits. Such numbers are
really just one ulp off, and should be considered equal.
@end deftypefun

@deftypefun void mpf_reldiff (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
Compute the relative difference between @var{op1} and @var{op2} and store the
result in @var{rop}. This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
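
As a small illustration of the formula, here is the same expression on plain
@code{double} values. The name @code{reldiff_double} is hypothetical and the
code only mirrors the formula as written above; it says nothing about how
@code{mpf_reldiff} treats a zero @var{op1}.

```c
#include <assert.h>

/* Hypothetical double-precision analogue of |a - b| / a.
   The divisor is a itself, not |a|, so read literally the result
   takes the sign of a.  This sketch requires a to be non-zero.  */
double
reldiff_double (double a, double b)
{
  double diff = a - b;

  assert (a != 0.0);
  if (diff < 0.0)
    diff = -diff;
  return diff / a;
}
```

With @math{a = 4} and @math{b = 3} this gives 0.25; with @math{a = -4} and
@math{b = -3} it gives @minus{}0.25.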
@end deftypefun

@deftypefn Macro int mpf_sgn (const mpf_t @var{op})
@cindex Sign tests
@cindex Float sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its argument
multiple times.
@end deftypefn

@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Float input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

Functions that perform input from a stdio stream, and functions that output to
a stdio stream, of @code{mpf} numbers. Passing a @code{NULL} pointer for a
@var{stream} argument to any of these functions will make them read from
@code{stdin} and write to @code{stdout}, respectively.

When using any of these functions, it is a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

See also @ref{Formatted Output} and @ref{Formatted Input}.

@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
Print @var{op} to @var{stream}, as a string of digits. Return the number of
bytes written, or if an error occurred, return 0.

The mantissa is prefixed with a @samp{0.} and is in the given @var{base},
which may vary from 2 to 62 or from @minus{}2 to @minus{}36. An exponent is
then printed, separated by an @samp{e}, or if the base is greater than 10 then
by an @samp{@@}. The exponent is always in decimal. The decimal point follows
the current locale, on systems providing @code{localeconv}.
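
The layout described above can be mimicked with a few lines of standard C.
This is a sketch for illustration, not the library's code: the helper name
@code{format_like_out_str} is hypothetical, only positive bases and
non-negative values are handled, and the decimal point is fixed to @samp{.}
instead of following the locale.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "0." followed by the mantissa digits,
   then 'e' (base up to 10) or '@' (larger bases), then the exponent
   in decimal.  Returns the number of characters written, not counting
   the null-terminator, or 0 if buf is too small.  */
size_t
format_like_out_str (char *buf, size_t buflen, const char *digits,
                     long exp, int base)
{
  char sep;
  int len;

  assert (base >= 2 && base <= 62);
  assert (digits != NULL && strlen (digits) > 0);

  sep = (base > 10) ? '@' : 'e';
  len = snprintf (buf, buflen, "0.%s%c%ld", digits, sep, exp);
  if (len < 0 || (size_t) len >= buflen)
    return 0;
  return (size_t) len;
}
```

Given the digits @nicode{"31416"}, exponent 1 and base 10, the helper produces
@nicode{"0.31416e1"}.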

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

Up to @var{n_digits} digits will be printed from the mantissa, except that no
more digits than are accurately representable by @var{op} will be printed.
@var{n_digits} can be 0 to select that accurate maximum.
@end deftypefun

@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
Read a string in base @var{base} from @var{stream}, and put the read float in
@var{rop}. The string is of the form @samp{M@@N} or, if the base is 10 or
less, alternatively @samp{MeN}. @samp{M} is the mantissa and @samp{N} is the
exponent. The mantissa is always in the specified base. The exponent is
either in the specified base or, if @var{base} is negative, in decimal. The
decimal point expected is taken from the current locale, on systems providing
@code{localeconv}.

The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
@minus{}2. Negative values are used to specify that the exponent is in
decimal.

Unlike the corresponding @code{mpz} function, the base will not be determined
from the leading characters of the string if @var{base} is 0. This is so that
numbers like @samp{0.23} are not interpreted as octal.

Return the number of bytes read, or if an error occurred, return 0.
@end deftypefun

@c @deftypefun void mpf_out_raw (FILE *@var{stream}, const mpf_t @var{float})
@c Output @var{float} on stdio stream @var{stream}, in raw binary
@c format. The float is written in a portable format, with 4 bytes of
@c size information, and that many bytes of limbs. Both the size and the
@c limbs are written in decreasing significance order.
@c @end deftypefun

@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
@c Input from stdio stream @var{stream} in the format written by
@c @code{mpf_out_raw}, and put the result in @var{float}.
@c @end deftypefun


@node Miscellaneous Float Functions, , I/O of Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Miscellaneous Functions
@cindex Miscellaneous float functions
@cindex Float miscellaneous functions

@deftypefun void mpf_ceil (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_floor (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_trunc (mpf_t @var{rop}, const mpf_t @var{op})
@cindex Rounding functions
@cindex Float rounding functions
Set @var{rop} to @var{op} rounded to an integer. @code{mpf_ceil} rounds to the
next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
to the integer towards zero.
@end deftypefun

@deftypefun int mpf_integer_p (const mpf_t @var{op})
Return non-zero if @var{op} is an integer.
@end deftypefun

@deftypefun int mpf_fits_ulong_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_slong_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_uint_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_sint_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_ushort_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_sshort_p (const mpf_t @var{op})
Return non-zero if @var{op} would fit in the respective C data type, when
truncated to an integer.
@end deftypefun

@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{nbits})
@cindex Random number functions
@cindex Float random number functions
Generate a uniformly distributed random float in @var{rop}, such that @math{0
@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa or
less if the precision of @var{rop} is smaller.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@pxref{Random State Initialization}) before
invoking this function.
@end deftypefun

@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
Generate a random float of at most @var{max_size} limbs, with long strings of
zeros and ones in the binary representation. The exponent of the number is in
the interval @minus{}@var{exp} to @var{exp} (in limbs). This function is
useful for testing functions and algorithms, since this kind of random
number has proven to be more likely to trigger corner-case bugs. Negative
random numbers are generated when @var{max_size} is negative.
@end deftypefun

@c @deftypefun size_t mpf_size (const mpf_t @var{op})
@c Return the size of @var{op} measured in number of limbs. If @var{op} is
@c zero, the returned value will be zero. (@xref{Nomenclature}, for an
@c explanation of the concept @dfn{limb}.)
@c
@c @strong{This function is obsolete. It will disappear from future GMP
@c releases.}
@c @end deftypefun


@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
@comment node-name, next, previous, up
@chapter Low-level Functions
@cindex Low-level functions

This chapter describes low-level GMP functions, used to implement the
high-level GMP functions, but also intended for time-critical user code.

These functions start with the prefix @code{mpn_}.

@c 1. Some of these functions clobber input operands.
@c

The @code{mpn} functions are designed to be as fast as possible, @strong{not}
to provide a coherent calling interface. The different functions have somewhat
similar interfaces, but there are variations that make them hard to use. These
functions do as little as possible apart from the real multiple precision
computation, so that no time is spent on things that not all callers need.

A source operand is specified by a pointer to the least significant limb and a
limb count. A destination operand is specified by just a pointer. It is the
responsibility of the caller to ensure that the destination has enough space
for storing the result.

With this way of specifying operands, it is possible to perform computations on
subranges of an argument, and store the result into a subrange of a
destination.

A common requirement for all functions is that each source area needs at least
one limb. No size argument may be zero. Unless otherwise stated, in-place
operations are allowed where source and destination are the same, but not where
they only partly overlap.

The @code{mpn} functions are the base for the implementation of the
@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.

This example adds the number beginning at @var{s1p} and the number beginning at
@var{s2p} and writes the sum at @var{destp}. All areas have @var{n} limbs.

@example
cy = mpn_add_n (destp, s1p, s2p, n);
@end example

It should be noted that the @code{mpn} functions make no attempt to identify
high or low zero limbs on their operands, or other special forms. On random
data such cases will be unlikely and it'd be wasteful for every function to
check every time.
An application knowing something about its data can take
steps to trim or perhaps split its calculations.
@c
@c For reference, within gmp mpz_t operands never have high zero limbs, and
@c we rate low zero limbs as unlikely too (or something an application should
@c handle). This is a prime motivation for not stripping zero limbs in say
@c mpn_mul_n etc.
@c
@c Other applications doing variable-length calculations will quite likely do
@c something similar to mpz. And even if not then it's highly likely zero
@c limb stripping can be done at just a few judicious points, which will be
@c more efficient than having lots of mpn functions checking every time.

@sp 1
@noindent
In the notation used below, a source operand is identified by the pointer to
the least significant limb, and the limb count in braces. For example,
@{@var{s1p}, @var{s1n}@}.

@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
least significant limbs of the result to @var{rp}. Return carry, either 0 or
1.

This is the lowest-level function for addition. It is the preferred function
for addition, since it is written in assembly for most CPUs. For addition of
a variable to itself (i.e., @var{s1p} equals @var{s2p}) use @code{mpn_lshift}
with a count of 1 for optimal speed.
@end deftypefun

@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
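
The carry propagation performed by these addition functions can be modelled in
portable C. The following is a sketch for illustration only, not GMP's
implementation: the type @code{limb_t} (a 32-bit stand-in for
@code{mp_limb_t}) and the names @code{model_add_n} and @code{model_add_1} are
assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Add the n-limb numbers at s1p and s2p (least significant limb
   first), store the n low limbs of the sum at rp, and return the
   carry out, 0 or 1.  rp may be the same as s1p or s2p.  */
limb_t
model_add_n (limb_t *rp, const limb_t *s1p, const limb_t *s2p, size_t n)
{
  limb_t carry = 0;
  size_t i;

  assert (n >= 1);   /* no size argument may be zero */
  for (i = 0; i < n; i++)
    {
      uint64_t sum = (uint64_t) s1p[i] + s2p[i] + carry;
      rp[i] = (limb_t) sum;
      carry = (limb_t) (sum >> 32);
    }
  return carry;
}

/* Same idea with a single limb as the second operand: the limb is
   fed in as the initial carry.  */
limb_t
model_add_1 (limb_t *rp, const limb_t *s1p, size_t n, limb_t s2limb)
{
  limb_t carry = s2limb;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      uint64_t sum = (uint64_t) s1p[i] + carry;
      rp[i] = (limb_t) sum;
      carry = (limb_t) (sum >> 32);
    }
  return carry;
}
```

A wider intermediate type is used here only to keep the carry logic obvious.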
@end deftypefun

@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
@var{s1n} least significant limbs of the result to @var{rp}. Return carry,
either 0 or 1.

This function requires that @var{s1n} is greater than or equal to @var{s2n}.
@end deftypefun

@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
@var{n} least significant limbs of the result to @var{rp}. Return borrow,
either 0 or 1.

This is the lowest-level function for subtraction. It is the preferred
function for subtraction, since it is written in assembly for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
@end deftypefun

@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
@var{s1n} least significant limbs of the result to @var{rp}. Return borrow,
either 0 or 1.

This function requires that @var{s1n} is greater than or equal to
@var{s2n}.
@end deftypefun

@deftypefun mp_limb_t mpn_neg (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Perform the negation of @{@var{sp}, @var{n}@}, and write the result to
@{@var{rp}, @var{n}@}. Return carry-out.
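
Subtraction and negation can be modelled the same way as addition, with a
borrow in place of a carry. This is a sketch under stated assumptions, not
GMP's code: @code{limb_t} is a hypothetical 32-bit stand-in for
@code{mp_limb_t}, the function names are made up, and the carry-out of the
negation is taken to be the borrow of @math{0 - s}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Subtract the n-limb number at s2p from the one at s1p, store the
   n low limbs of the difference at rp, return the borrow, 0 or 1.  */
limb_t
model_sub_n (limb_t *rp, const limb_t *s1p, const limb_t *s2p, size_t n)
{
  limb_t borrow = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      /* Wraps modulo 2^64 when negative; bit 32 then holds the borrow.  */
      uint64_t diff = (uint64_t) s1p[i] - s2p[i] - borrow;
      rp[i] = (limb_t) diff;
      borrow = (limb_t) ((diff >> 32) & 1);
    }
  return borrow;
}

/* Negate the n-limb number at sp as 0 - s, modulo 2^(32*n).  The
   returned borrow is 1 exactly when the input is non-zero.  */
limb_t
model_neg (limb_t *rp, const limb_t *sp, size_t n)
{
  limb_t borrow = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      uint64_t diff = (uint64_t) 0 - sp[i] - borrow;
      rp[i] = (limb_t) diff;
      borrow = (limb_t) ((diff >> 32) & 1);
    }
  return borrow;
}
```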
@end deftypefun

@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
2*@var{n}-limb result to @var{rp}.

The destination has to have space for 2*@var{n} limbs, even if the product's
most significant limb is zero. No overlap is permitted between the
destination and either source.

If the two input operands are the same, use @code{mpn_sqr}.
@end deftypefun

@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
(@var{s1n}+@var{s2n})-limb result to @var{rp}. Return the most significant
limb of the result.

The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
product's most significant limb is zero. No overlap is permitted between the
destination and either source.

This function requires that @var{s1n} is greater than or equal to @var{s2n}.
@end deftypefun

@deftypefun void mpn_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
result to @var{rp}.

The destination has to have space for 2*@var{n} limbs, even if the result's
most significant limb is zero. No overlap is permitted between the
destination and the source.
@end deftypefun

@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
significant limbs of the product to @var{rp}. Return the most significant
limb of the product.
@{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.

This is a low-level function that is a building block for general
multiplication as well as other operations in GMP@. It is written in assembly
for most CPUs.

Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
with a count equal to the logarithm of @var{s2limb} instead, for optimal speed.
@end deftypefun

@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
to @var{rp}. Return the most significant limb of the product, plus carry-out
from the addition.

This is a low-level function that is a building block for general
multiplication as well as other operations in GMP@. It is written in assembly
for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
result to @var{rp}. Return the most significant limb of the product, plus
borrow-out from the subtraction.

This is a low-level function that is a building block for general
multiplication and division as well as other operations in GMP@. It is written
in assembly for most CPUs.
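
The three single-limb multiply primitives differ only in what is done with
each product limb. The following portable model may make the return values
concrete. It is a sketch, not GMP's implementation: @code{limb_t} is a
hypothetical 32-bit stand-in for @code{mp_limb_t} and the @code{model_} names
are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* rp = s1 * s2limb (low n limbs); return the high limb.  */
limb_t
model_mul_1 (limb_t *rp, const limb_t *s1p, size_t n, limb_t s2limb)
{
  limb_t carry = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      uint64_t prod = (uint64_t) s1p[i] * s2limb + carry;
      rp[i] = (limb_t) prod;
      carry = (limb_t) (prod >> 32);
    }
  return carry;
}

/* rp += s1 * s2limb; return high product limb plus carry-out.  */
limb_t
model_addmul_1 (limb_t *rp, const limb_t *s1p, size_t n, limb_t s2limb)
{
  limb_t carry = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) == 2^64 - 1.  */
      uint64_t t = (uint64_t) s1p[i] * s2limb + rp[i] + carry;
      rp[i] = (limb_t) t;
      carry = (limb_t) (t >> 32);
    }
  return carry;
}

/* rp -= s1 * s2limb; return high product limb plus borrow-out.  */
limb_t
model_submul_1 (limb_t *rp, const limb_t *s1p, size_t n, limb_t s2limb)
{
  limb_t borrow = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      uint64_t prod = (uint64_t) s1p[i] * s2limb + borrow;
      limb_t lo = (limb_t) prod;
      borrow = (limb_t) (prod >> 32) + (rp[i] < lo);
      rp[i] -= lo;
    }
  return borrow;
}
```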
@end deftypefun

@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
@var{dn}@}. The quotient is rounded towards 0.

No overlap is permitted between arguments, except that @var{np} may equal
@var{rp}. The dividend size @var{nn} must be greater than or equal to divisor
size @var{dn}. The most significant limb of the divisor must be non-zero. The
@var{qxn} operand must be zero.
@end deftypefun

@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
performance.]

Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
quotient at @var{r1p}, with the exception of the most significant limb, which
is returned. The remainder replaces the dividend at @var{rs2p}; it will be
@var{s3n} limbs long (i.e., as many limbs as the divisor).

In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
stored after the integral limbs. For most usages, @var{qxn} will be zero.

It is required that @var{rs2n} is greater than or equal to @var{s3n}. It is
required that the most significant bit of the divisor is set.

If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}. Aside
from that special case, no overlap between arguments is permitted.

Return the most significant limb of the quotient, either 0 or 1.

The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
limbs large.
@end deftypefun

@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
@var{r1p}. Return the remainder.

The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
@var{qxn}@}. Either or both of @var{s2n} and @var{qxn} can be zero. For most
usages, @var{qxn} will be zero.

@code{mpn_divmod_1} exists for upward source compatibility and is simply a
macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.

The areas at @var{r1p} and @var{s2p} have to be identical or completely
separate, not partially overlapping.
@end deftypefn

@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
performance.]
@end deftypefun

@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
the result to @{@var{rp}, @var{n}@}. If 3 divides exactly, the return value is
zero and the result is the quotient. If not, the return value is non-zero and
the result won't be anything useful.
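
One way an exact division by 3 can be carried out limb by limb, from low to
high, is to multiply by the inverse of 3 modulo the limb base. The following
is a portable sketch of that technique, not GMP's source: @code{limb_t} is a
hypothetical 32-bit stand-in for @code{mp_limb_t}, and the function name and
the three constants are assumptions tied to that 32-bit width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical stand-in for mp_limb_t */

#define INVERSE_3   0xAAAAAAABu   /* 3 * INVERSE_3 == 1 modulo 2^32 */
#define ONE_THIRD   0x55555555u   /* (2^32 - 1) / 3 */
#define TWO_THIRDS  0xAAAAAAAAu   /* 2 * ((2^32 - 1) / 3) */

/* Divide the n-limb number at sp by 3, writing n quotient limbs to
   rp.  carry is an incoming borrow of 0, 1 or 2; the return value is
   the outgoing borrow, which is 0 when the division was exact.  */
limb_t
model_divexact_by3c (limb_t *rp, const limb_t *sp, size_t n, limb_t carry)
{
  size_t i;

  assert (n >= 1);
  assert (carry <= 2);
  for (i = 0; i < n; i++)
    {
      limb_t s = sp[i];
      limb_t l = s - carry;            /* may wrap around */
      carry = (l > s);                 /* borrow from that subtraction */
      l = (limb_t) (l * INVERSE_3);    /* quotient limb modulo 2^32 */
      rp[i] = l;
      /* 3*l exceeded 2^32 once or twice: borrow that from above.  */
      carry += (l > ONE_THIRD);
      carry += (l > TWO_THIRDS);
    }
  return carry;
}
```

Because the borrow is passed in and out, a long operand can be processed in
pieces by feeding the return value of one call into the next.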

@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
return value from a previous call, so a large calculation can be done piece by
piece from low to high. @code{mpn_divexact_by3} is simply a macro calling
@code{mpn_divexact_by3c} with a 0 carry parameter.

These routines use a multiply-by-inverse and will be faster than
@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.

The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
@m{b=2\GMPraise{@code{GMP\_NUMB\_BITS}}, b=2^GMP_NUMB_BITS}. The
return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
be 0, 1 or 2 (these are both borrows really). When @math{c=0} clearly
@math{q=(a-i)/3}. When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
3} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
@code{mp_bits_per_limb} is even, which is always so currently).
@end deftypefn

@deftypefun mp_limb_t mpn_mod_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
@var{s1n} can be zero.
@end deftypefun

@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
@{@var{rp}, @var{n}@}. The bits shifted out at the left are returned in the
least significant @var{count} bits of the return value (the rest of the return
value is zero).

@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
@math{@var{rp} @ge{} @var{sp}}.

This function is written in assembly for most CPUs.
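
The shift and its return value can be modelled in portable C as below. This is
a sketch, not GMP's implementation: @code{limb_t} is a hypothetical 32-bit
stand-in for @code{mp_limb_t} and @code{model_lshift} is a made-up name. The
loop runs from the most significant limb downwards, which is what makes an
overlap with @math{@var{rp} @ge{} @var{sp}} safe in this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* Shift the n-limb number at sp left by count bits (1 to 31), store
   the n low limbs at rp, and return the bits shifted out at the top
   in the low count bits of the return value.  */
limb_t
model_lshift (limb_t *rp, const limb_t *sp, size_t n, unsigned int count)
{
  limb_t out;
  size_t i;

  assert (n >= 1);
  assert (count >= 1 && count <= 31);

  out = sp[n - 1] >> (32 - count);
  for (i = n - 1; i > 0; i--)
    rp[i] = (sp[i] << count) | (sp[i - 1] >> (32 - count));
  rp[0] = sp[0] << count;
  return out;
}
```

Shifting by 1 with @code{rp} equal to @code{sp} doubles a number in place, the
returned bit being the carry out of the top limb.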
5365@end deftypefun 5366 5367@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count}) 5368Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to 5369@{@var{rp}, @var{n}@}. The bits shifted out at the right are returned in the 5370most significant @var{count} bits of the return value (the rest of the return 5371value is zero). 5372 5373@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The 5374regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided 5375@math{@var{rp} @le{} @var{sp}}. 5376 5377This function is written in assembly for most CPUs. 5378@end deftypefun 5379 5380@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5381Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a 5382positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a 5383negative value if @math{@var{s1} < @var{s2}}. 5384@end deftypefun 5385 5386@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn}) 5387Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{xp}, 5388@var{xn}@} and @{@var{yp}, @var{yn}@}. The result can be up to @var{yn} limbs, 5389the return value is the actual number produced. Both source operands are 5390destroyed. 5391 5392It is required that @math{@var{xn} @ge @var{yn} > 0}, and the most significant 5393limb of @{@var{yp}, @var{yn}@} must be non-zero. No overlap is permitted 5394between @{@var{xp}, @var{xn}@} and @{@var{yp}, @var{yn}@}. 5395@end deftypefun 5396 5397@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t @var{ylimb}) 5398Return the greatest common divisor of @{@var{xp}, @var{xn}@} and @var{ylimb}. 5399Both operands must be non-zero. 
5400@end deftypefun 5401 5402@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{gp}, mp_limb_t *@var{sp}, mp_size_t *@var{sn}, mp_limb_t *@var{up}, mp_size_t @var{un}, mp_limb_t *@var{vp}, mp_size_t @var{vn}) 5403Let @m{U,@var{U}} be defined by @{@var{up}, @var{un}@} and let @m{V,@var{V}} be 5404defined by @{@var{vp}, @var{vn}@}. 5405 5406Compute the greatest common divisor @math{G} of @math{U} and @math{V}. Compute 5407a cofactor @math{S} such that @math{G = US + VT}. The second cofactor @var{T} 5408is not computed but can easily be obtained from @m{(G - US) / V, (@var{G} - 5409@var{U}*@var{S}) / @var{V}} (the division will be exact). It is required that 5410@math{@var{un} @ge @var{vn} > 0}, and the most significant 5411limb of @{@var{vp}, @var{vn}@} must be non-zero. 5412 5413@math{S} satisfies @math{S = 1} or @math{@GMPabs{S} < V / (2 G)}. @math{S = 54140} if and only if @math{V} divides @math{U} (i.e., @math{G = V}). 5415 5416Store @math{G} at @var{gp} and let the return value define its limb count. 5417Store @math{S} at @var{sp} and let |*@var{sn}| define its limb count. @math{S} 5418can be negative; when this happens *@var{sn} will be negative. The area at 5419@var{gp} should have room for @var{vn} limbs and the area at @var{sp} should 5420have room for @math{@var{vn}+1} limbs. 5421 5422Both source operands are destroyed. 5423 5424Compatibility notes: GMP 4.3.0 and 4.3.1 defined @math{S} less strictly. 5425Earlier as well as later GMP releases define @math{S} as described here. 5426GMP releases before GMP 4.3.0 required additional space for both input and output 5427areas. More precisely, the areas @{@var{up}, @math{@var{un}+1}@} and 5428@{@var{vp}, @math{@var{vn}+1}@} were destroyed (i.e.@: the operands plus an 5429extra limb past the end of each), and the areas pointed to by @var{gp} and 5430@var{sp} should each have room for @math{@var{un}+1} limbs. 
5431@end deftypefun 5432 5433@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n}) 5434Compute the square root of @{@var{sp}, @var{n}@} and put the result at 5435@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p}, 5436@var{retval}@}. @var{r2p} needs space for @var{n} limbs, but the return value 5437indicates how many are produced. 5438 5439The most significant limb of @{@var{sp}, @var{n}@} must be non-zero. The 5440areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must 5441be completely separate. The areas @{@var{r2p}, @var{n}@} and @{@var{sp}, 5442@var{n}@} must be either identical or completely separate. 5443 5444If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this 5445case the return value is zero or non-zero according to whether the remainder 5446would have been zero or non-zero. 5447 5448A return value of zero indicates a perfect square. See also 5449@code{mpn_perfect_square_p}. 5450@end deftypefun 5451 5452@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}) 5453Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in 5454base @var{base}, and return the number of characters produced. There may be 5455leading zeros in the string. The string is not in ASCII; to convert it to 5456printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on 5457the base and range. @var{base} can vary from 2 to 256. 5458 5459The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be 5460non-zero. The input @{@var{s1p}, @var{s1n}@} is clobbered, except when 5461@var{base} is a power of 2, in which case it's unchanged. 5462 5463The area at @var{str} has to have space for the largest possible number 5464represented by a @var{s1n} long limb array, plus one extra character. 
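
The following sketch converts a one-limb value to printable decimal.  The
buffer bound used here (one byte per input bit, plus one) is a deliberately
generous assumption, not a size prescribed by GMP, and the variable names are
illustrative.

@example
#include <limits.h>
#include <stdio.h>
#include <gmp.h>

int
main (void)
@{
  /* One-limb input; its most significant limb is non-zero.  */
  mp_limb_t x[1] = @{ 12345 @};
  /* Generous bound: one byte per bit of input, plus the extra byte.  */
  unsigned char digits[sizeof (mp_limb_t) * CHAR_BIT + 1];
  size_t n, i;

  /* Base 10 is not a power of 2, so x is clobbered by this call.  */
  n = mpn_get_str (digits, 10, x, 1);

  /* Raw digit values 0 to 9 become printable by adding '0'.  */
  for (i = 0; i < n; i++)
    putchar ('0' + digits[i]);
  putchar ('\n');
  return 0;
@}
@end example

For bases above 10 the same loop would need to map values of 10 and up to
letters, as described above.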
5465@end deftypefun 5466 5467@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base}) 5468Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at 5469@var{rp}. 5470 5471@math{@var{str}[0]} is the most significant byte and 5472@math{@var{str}[@var{strsize}-1]} is the least significant. Each byte should 5473be a value in the range 0 to @math{@var{base}-1}, not an ASCII character. 5474@var{base} can vary from 2 to 256. 5475 5476The return value is the number of limbs written to @var{rp}. If the most 5477significant input byte is non-zero then the high limb at @var{rp} will be 5478non-zero, and only that exact number of limbs will be required there. 5479 5480If the most significant input byte is zero then there may be high zero limbs 5481written to @var{rp} and included in the return value. 5482 5483@var{strsize} must be at least 1, and no overlap is permitted between 5484@{@var{str},@var{strsize}@} and the result at @var{rp}. 5485@end deftypefun 5486 5487@deftypefun {mp_bitcnt_t} mpn_scan0 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit}) 5488Scan @var{s1p} from bit position @var{bit} for the next clear bit. 5489 5490It is required that there be a clear bit within the area at @var{s1p} at or 5491beyond bit position @var{bit}, so that the function has something to return. 5492@end deftypefun 5493 5494@deftypefun {mp_bitcnt_t} mpn_scan1 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit}) 5495Scan @var{s1p} from bit position @var{bit} for the next set bit. 5496 5497It is required that there be a set bit within the area at @var{s1p} at or 5498beyond bit position @var{bit}, so that the function has something to return. 5499@end deftypefun 5500 5501@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n}) 5502@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n}) 5503Generate a random number of length @var{r1n} and store it at @var{r1p}. 
The 5504most significant limb is always non-zero. @code{mpn_random} generates 5505uniformly distributed limb data, @code{mpn_random2} generates long strings of 5506zeros and ones in the binary representation. 5507 5508@code{mpn_random2} is intended for testing the correctness of the @code{mpn} 5509routines. 5510@end deftypefun 5511 5512@deftypefun {mp_bitcnt_t} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n}) 5513Count the number of set bits in @{@var{s1p}, @var{n}@}. 5514@end deftypefun 5515 5516@deftypefun {mp_bitcnt_t} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5517Compute the hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p}, 5518@var{n}@}, which is the number of bit positions where the two operands have 5519different bit values. 5520@end deftypefun 5521 5522@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n}) 5523Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square. 5524The most significant limb of the input @{@var{s1p}, @var{n}@} must be 5525non-zero. 5526@end deftypefun 5527 5528@deftypefun void mpn_and_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5529Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p}, 5530@var{n}@}, and write the result to @{@var{rp}, @var{n}@}. 5531@end deftypefun 5532 5533@deftypefun void mpn_ior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5534Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and 5535@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}. 5536@end deftypefun 5537 5538@deftypefun void mpn_xor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5539Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and 5540@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}. 
5541@end deftypefun 5542 5543@deftypefun void mpn_andn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5544Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and the bitwise 5545complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}. 5546@end deftypefun 5547 5548@deftypefun void mpn_iorn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5549Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and the bitwise 5550complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}. 5551@end deftypefun 5552 5553@deftypefun void mpn_nand_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5554Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p}, 5555@var{n}@}, and write the bitwise complement of the result to @{@var{rp}, @var{n}@}. 5556@end deftypefun 5557 5558@deftypefun void mpn_nior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5559Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and 5560@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to 5561@{@var{rp}, @var{n}@}. 5562@end deftypefun 5563 5564@deftypefun void mpn_xnor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) 5565Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and 5566@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to 5567@{@var{rp}, @var{n}@}. 5568@end deftypefun 5569 5570@deftypefun void mpn_com (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}) 5571Perform the bitwise complement of @{@var{sp}, @var{n}@}, and write the result 5572to @{@var{rp}, @var{n}@}. 
5573@end deftypefun 5574 5575@deftypefun void mpn_copyi (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}) 5576Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, increasingly. 5577@end deftypefun 5578 5579@deftypefun void mpn_copyd (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}) 5580Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, decreasingly. 5581@end deftypefun 5582 5583@deftypefun void mpn_zero (mp_limb_t *@var{rp}, mp_size_t @var{n}) 5584Zero @{@var{rp}, @var{n}@}. 5585@end deftypefun 5586 5587@sp 1 5588@section Nails 5589@cindex Nails 5590 5591@strong{Everything in this section is highly experimental and may disappear or 5592be subject to incompatible changes in a future version of GMP.} 5593 5594Nails are an experimental feature whereby a few bits are left unused at the 5595top of each @code{mp_limb_t}. This can significantly improve carry handling 5596on some processors. 5597 5598All the @code{mpn} functions accepting limb data will expect the nail bits to 5599be zero on entry, and will return data with the nails similarly all zero. 5600This applies both to limb vectors and to single limb arguments. 5601 5602Nails can be enabled by configuring with @samp{--enable-nails}. By default 5603the number of bits will be chosen according to what suits the host processor, 5604but a particular number can be selected with @samp{--enable-nails=N}. 5605 5606At the mpn level, a nail build is neither source nor binary compatible with a 5607non-nail build, strictly speaking. But programs acting on limbs only through 5608the mpn functions are likely to work equally well with either build, and 5609judicious use of the definitions below should make any program compatible with 5610either build, at the source level. 5611 5612For the higher level routines, meaning @code{mpz} etc, a nail build should be 5613fully source and binary compatible with a non-nail build. 
5614 5615@defmac GMP_NAIL_BITS 5616@defmacx GMP_NUMB_BITS 5617@defmacx GMP_LIMB_BITS 5618@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in 5619use. @code{GMP_NUMB_BITS} is the number of data bits in a limb. 5620@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}. In 5621all cases 5622 5623@example 5624GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS 5625@end example 5626@end defmac 5627 5628@defmac GMP_NAIL_MASK 5629@defmacx GMP_NUMB_MASK 5630Bit masks for the nail and number parts of a limb. @code{GMP_NAIL_MASK} is 0 5631when nails are not in use. 5632 5633@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained 5634with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which 5635can help various RISC chips. 5636@end defmac 5637 5638@defmac GMP_NUMB_MAX 5639The maximum value that can be stored in the number part of a limb. This is 5640the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing 5641comparisons rather than bit-wise operations. 5642@end defmac 5643 5644The term ``nails'' comes from finger or toe nails, which are at the ends of a 5645limb (arm or leg). ``numb'' is short for number, but is also how the 5646developers felt after trying for a long time to come up with sensible names 5647for these things. 5648 5649In the future (the distant future most likely) a non-zero nail might be 5650permitted, giving non-unique representations for numbers in a limb vector. 5651This would help vector processors since carries would only ever need to 5652propagate one or two limbs. 5653 5654 5655@node Random Number Functions, Formatted Output, Low-level Functions, Top 5656@chapter Random Number Functions 5657@cindex Random number functions 5658 5659Sequences of pseudo-random numbers in GMP are generated using a variable of 5660type @code{gmp_randstate_t}, which holds an algorithm selection and a current 5661state. 
Such a variable must be initialized by a call to one of the 5662@code{gmp_randinit} functions, and can be seeded with one of the 5663@code{gmp_randseed} functions. 5664 5665The functions actually generating random numbers are described in @ref{Integer 5666Random Numbers}, and @ref{Miscellaneous Float Functions}. 5667 5668The older style random number functions don't accept a @code{gmp_randstate_t} 5669parameter but instead share a global variable of that type. They use a 5670default algorithm and are currently not seeded (though perhaps that will 5671change in the future). The new functions accepting a @code{gmp_randstate_t} 5672are recommended for applications that care about randomness. 5673 5674@menu 5675* Random State Initialization:: 5676* Random State Seeding:: 5677* Random State Miscellaneous:: 5678@end menu 5679 5680@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions 5681@section Random State Initialization 5682@cindex Random number state 5683@cindex Initialization functions 5684 5685@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state}) 5686Initialize @var{state} with a default algorithm. This will be a compromise 5687between speed and randomness, and is recommended for applications with no 5688special requirements. Currently this is @code{gmp_randinit_mt}. 5689@end deftypefun 5690 5691@deftypefun void gmp_randinit_mt (gmp_randstate_t @var{state}) 5692@cindex Mersenne twister random numbers 5693Initialize @var{state} for a Mersenne Twister algorithm. This algorithm is 5694fast and has good randomness properties. 5695@end deftypefun 5696 5697@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, const mpz_t @var{a}, @w{unsigned long @var{c}}, @w{mp_bitcnt_t @var{m2exp}}) 5698@cindex Linear congruential random numbers 5699Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X + 5700@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}. 
5701 5702The low bits of @math{X} in this algorithm are not very random. The least 5703significant bit will have a period no more than 2, and the second bit no more 5704than 4, etc. For this reason only the high half of each @math{X} is actually 5705used. 5706 5707When a random number of more than @math{@var{m2exp}/2} bits is to be 5708generated, multiple iterations of the recurrence are used and the results 5709concatenated. 5710@end deftypefun 5711 5712@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, mp_bitcnt_t @var{size}) 5713@cindex Linear congruential random numbers 5714Initialize @var{state} for a linear congruential algorithm as per 5715@code{gmp_randinit_lc_2exp}. @var{a}, @var{c} and @var{m2exp} are selected 5716from a table, chosen so that @var{size} bits (or more) of each @math{X} will 5717be used, i.e.@: @math{@var{m2exp}/2 @ge{} @var{size}}. 5718 5719If successful the return value is non-zero. If @var{size} is bigger than the 5720table data provides then the return value is zero. The maximum @var{size} 5721currently supported is 128. 5722@end deftypefun 5723 5724@deftypefun void gmp_randinit_set (gmp_randstate_t @var{rop}, gmp_randstate_t @var{op}) 5725Initialize @var{rop} with a copy of the algorithm and state from @var{op}. 5726@end deftypefun 5727 5728@c Although gmp_randinit, gmp_errno and related constants are obsolete, we 5729@c still put @findex entries for them, since they're still documented and 5730@c someone might be looking them up when perusing old application code. 5731 5732@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, @dots{}) 5733@strong{This function is obsolete.} 5734 5735@findex GMP_RAND_ALG_LC 5736@findex GMP_RAND_ALG_DEFAULT 5737Initialize @var{state} with an algorithm selected by @var{alg}. The only 5738choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size} 5739described above. 
A third parameter of type @code{unsigned long} is required;
this is the @var{size} for that function.  @code{GMP_RAND_ALG_DEFAULT} or 0
are the same as @code{GMP_RAND_ALG_LC}.

@c  For reference, this is the only place gmp_errno has been documented, and
@c  due to being non thread safe we won't be adding to its uses.
@findex gmp_errno
@findex GMP_ERROR_UNSUPPORTED_ARGUMENT
@findex GMP_ERROR_INVALID_ARGUMENT
@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to
indicate an error: @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is
unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter
is too big.  Note that this error reporting is not thread safe (a good
reason to use @code{gmp_randinit_lc_2exp_size} instead).
@end deftypefun

@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
Free all memory occupied by @var{state}.
@end deftypefun


@node Random State Seeding, Random State Miscellaneous, Random State Initialization, Random Number Functions
@section Random State Seeding
@cindex Random number seeding
@cindex Seeding random numbers

@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, const mpz_t @var{seed})
@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
Set an initial seed value into @var{state}.

The size of a seed determines how many different sequences of random numbers
it is possible to generate.  The ``quality'' of the seed is the randomness
of a given seed compared to the previous seed used, and this affects the
randomness of separate number sequences.  The method for choosing a seed is
critical if the generated numbers are to be used for important applications,
such as generating cryptographic keys.

Traditionally the system time has been used to seed, but care needs to be
taken with this.
If an application seeds often and the resolution of the 5778system clock is low, then the same sequence of numbers might be repeated. 5779Also, the system time is quite easy to guess, so if unpredictability is 5780required then it should definitely not be the only source for the seed value. 5781On some systems there's a special device @file{/dev/random} which provides 5782random data better suited for use as a seed. 5783@end deftypefun 5784 5785 5786@node Random State Miscellaneous, , Random State Seeding, Random Number Functions 5787@section Random State Miscellaneous 5788 5789@deftypefun {unsigned long} gmp_urandomb_ui (gmp_randstate_t @var{state}, unsigned long @var{n}) 5790Return a uniformly distributed random number of @var{n} bits, i.e.@: in the 5791range 0 to @m{2^n-1,2^@var{n}-1} inclusive. @var{n} must be less than or 5792equal to the number of bits in an @code{unsigned long}. 5793@end deftypefun 5794 5795@deftypefun {unsigned long} gmp_urandomm_ui (gmp_randstate_t @var{state}, unsigned long @var{n}) 5796Return a uniformly distributed random number in the range 0 to 5797@math{@var{n}-1}, inclusive. 5798@end deftypefun 5799 5800 5801@node Formatted Output, Formatted Input, Random Number Functions, Top 5802@chapter Formatted Output 5803@cindex Formatted output 5804@cindex @code{printf} formatted output 5805 5806@menu 5807* Formatted Output Strings:: 5808* Formatted Output Functions:: 5809* C++ Formatted Output:: 5810@end menu 5811 5812@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output 5813@section Format Strings 5814 5815@code{gmp_printf} and friends accept format strings similar to the standard C 5816@code{printf} (@pxref{Formatted Output,, Formatted Output, libc, The GNU C 5817Library Reference Manual}). 
A format specification is of the form 5818 5819@example 5820% [flags] [width] [.[precision]] [type] conv 5821@end example 5822 5823GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t} 5824and @code{mpf_t} respectively, @samp{M} for @code{mp_limb_t}, and @samp{N} for 5825an @code{mp_limb_t} array. @samp{Z}, @samp{Q}, @samp{M} and @samp{N} behave 5826like integers. @samp{Q} will print a @samp{/} and a denominator, if needed. 5827@samp{F} behaves like a float. For example, 5828 5829@example 5830mpz_t z; 5831gmp_printf ("%s is an mpz %Zd\n", "here", z); 5832 5833mpq_t q; 5834gmp_printf ("a hex rational: %#40Qx\n", q); 5835 5836mpf_t f; 5837int n; 5838gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n); 5839 5840mp_limb_t l; 5841gmp_printf ("limb %Mu\n", l); 5842 5843const mp_limb_t *ptr; 5844mp_size_t size; 5845gmp_printf ("limb array %Nx\n", ptr, size); 5846@end example 5847 5848For @samp{N} the limbs are expected least significant first, as per the 5849@code{mpn} functions (@pxref{Low-level Functions}). A negative size can be 5850given to print the value as a negative. 5851 5852All the standard C @code{printf} types behave the same as the C library 5853@code{printf}, and can be freely intermixed with the GMP extensions. In the 5854current implementation the standard parts of the format string are simply 5855handed to @code{printf} and only the GMP extensions handled directly. 5856 5857The flags accepted are as follows. GLIBC style @nisamp{'} is only for the 5858standard C types (not the GMP types), and only if the C library supports it. 
5859 5860@quotation 5861@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 5862@item @nicode{0} @tab pad with zeros (rather than spaces) 5863@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0} 5864@item @nicode{+} @tab always show a sign 5865@item (space) @tab show a space or a @samp{-} sign 5866@item @nicode{'} @tab group digits, GLIBC style (not GMP types) 5867@end multitable 5868@end quotation 5869 5870The optional width and precision can be given as a number within the format 5871string, or as a @samp{*} to take an extra parameter of type @code{int}, the 5872same as the standard @code{printf}. 5873 5874The standard types accepted are as follows. @samp{h} and @samp{l} are 5875portable, the rest will depend on the compiler (or include files) for the type 5876and the C library for the output. 5877 5878@quotation 5879@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 5880@item @nicode{h} @tab @nicode{short} 5881@item @nicode{hh} @tab @nicode{char} 5882@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t} 5883@item @nicode{l} @tab @nicode{long} or @nicode{wchar_t} 5884@item @nicode{ll} @tab @nicode{long long} 5885@item @nicode{L} @tab @nicode{long double} 5886@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t} 5887@item @nicode{t} @tab @nicode{ptrdiff_t} 5888@item @nicode{z} @tab @nicode{size_t} 5889@end multitable 5890@end quotation 5891 5892@noindent 5893The GMP types are 5894 5895@quotation 5896@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 5897@item @nicode{F} @tab @nicode{mpf_t}, float conversions 5898@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions 5899@item @nicode{M} @tab @nicode{mp_limb_t}, integer conversions 5900@item @nicode{N} @tab @nicode{mp_limb_t} array, integer conversions 5901@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions 5902@end multitable 5903@end quotation 5904 5905The conversions accepted are as follows. 
@samp{a} and @samp{A} are always 5906supported for @code{mpf_t} but depend on the C library for standard C float 5907types. @samp{m} and @samp{p} depend on the C library. 5908 5909@quotation 5910@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 5911@item @nicode{a} @nicode{A} @tab hex floats, C99 style 5912@item @nicode{c} @tab character 5913@item @nicode{d} @tab decimal integer 5914@item @nicode{e} @nicode{E} @tab scientific format float 5915@item @nicode{f} @tab fixed point float 5916@item @nicode{i} @tab same as @nicode{d} 5917@item @nicode{g} @nicode{G} @tab fixed or scientific float 5918@item @nicode{m} @tab @code{strerror} string, GLIBC style 5919@item @nicode{n} @tab store characters written so far 5920@item @nicode{o} @tab octal integer 5921@item @nicode{p} @tab pointer 5922@item @nicode{s} @tab string 5923@item @nicode{u} @tab unsigned integer 5924@item @nicode{x} @nicode{X} @tab hex integer 5925@end multitable 5926@end quotation 5927 5928@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for 5929types @samp{Z}, @samp{Q} and @samp{N} they are signed. @samp{u} is not 5930meaningful for @samp{Z}, @samp{Q} and @samp{N}. 5931 5932@samp{M} is a proxy for the C library @samp{l} or @samp{L}, according to the 5933size of @code{mp_limb_t}. Unsigned conversions will be usual, but a signed 5934conversion can be used and will interpret the value as a twos complement 5935negative. 5936 5937@samp{n} can be used with any type, even the GMP types. 5938 5939Other types or conversions that might be accepted by the C library 5940@code{printf} cannot be used through @code{gmp_printf}, this includes for 5941instance extensions registered with GLIBC @code{register_printf_function}. 5942Also currently there's no support for POSIX @samp{$} style numbered arguments 5943(perhaps this will be added in the future). 
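
As a small sketch of the signed treatment of @samp{x} for the GMP types, the
following prints the same negative @code{mpz_t} in decimal and in hex, with
and without the @samp{#} flag.  The value and variable name are illustrative
assumptions.

@example
#include <stdio.h>
#include <gmp.h>

int
main (void)
@{
  mpz_t z;

  mpz_init_set_si (z, -255);

  /* For an mpz_t, hex output is a sign followed by the magnitude.  */
  gmp_printf ("decimal %Zd, hex %Zx, hex with base %#Zx\n", z, z, z);

  /* Standard C types can be mixed in, and stay unsigned for hex.  */
  gmp_printf ("plain unsigned hex %x\n", 255u);

  mpz_clear (z);
  return 0;
@}
@end example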
5944 5945The precision field has its usual meaning for integer @samp{Z} and float 5946@samp{F} types, but is currently undefined for @samp{Q} and should not be used 5947with that. 5948 5949@code{mpf_t} conversions only ever generate as many digits as can be 5950accurately represented by the operand, the same as @code{mpf_get_str} does. 5951Zeros will be used if necessary to pad to the requested precision. This 5952happens even for an @samp{f} conversion of an @code{mpf_t} which is an 5953integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits 5954precision will only produce about 40 digits, then pad with zeros to the 5955decimal point. An empty precision field like @samp{%.Fe} or @samp{%.Ff} can 5956be used to specifically request just the significant digits. Without any dot 5957and thus no precision field, a precision value of 6 will be used. Note that 5958these rules mean that @samp{%Ff}, @samp{%.Ff}, and @samp{%.0Ff} will all be 5959different. 5960 5961The decimal point character (or string) is taken from the current locale 5962settings on systems which provide @code{localeconv} (@pxref{Locales,, Locales 5963and Internationalization, libc, The GNU C Library Reference Manual}). The C 5964library will normally do the same for standard float output. 5965 5966The format string is only interpreted as plain @code{char}s, multibyte 5967characters are not recognised. Perhaps this will change in the future. 5968 5969 5970@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output 5971@section Functions 5972@cindex Output functions 5973 5974Each of the following functions is similar to the corresponding C library 5975function. The basic @code{printf} forms take a variable argument list. The 5976@code{vprintf} forms take an argument pointer, see @ref{Variadic Functions,, 5977Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3 5978va_start}. 
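
As an illustration of the argument pointer forms, here is a sketch of a
variadic wrapper built on @code{gmp_vfprintf}.  The helper name
@code{log_msg} and its @samp{log: } prefix are hypothetical, chosen only for
this example.  Note that @code{<stdarg.h>} and @code{<stdio.h>} are included
before @code{<gmp.h>} so that the @code{va_list} and @code{FILE} prototypes
are available.

@example
#include <stdarg.h>
#include <stdio.h>
#include <gmp.h>

/* Hypothetical helper: print a prefixed message to stderr.  */
static int
log_msg (const char *fmt, ...)
@{
  va_list ap;
  int ret;

  fputs ("log: ", stderr);
  va_start (ap, fmt);
  ret = gmp_vfprintf (stderr, fmt, ap);
  va_end (ap);
  return ret;
@}

int
main (void)
@{
  mpz_t z;

  mpz_init_set_ui (z, 42);
  log_msg ("value is %Zd\n", z);
  mpz_clear (z);
  return 0;
@}
@end example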

It should be emphasised that if a format string is invalid, or the arguments
don't match what the format specifies, then the behaviour of any of these
functions will be unpredictable.  GCC format string checking is not available,
since it doesn't recognise the GMP extensions.

The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
@math{-1} to indicate a write error.  Output is not ``atomic'', so partial
output may be produced if a write error occurs.  All the functions can return
@math{-1} if the C library @code{printf} variant in use returns @math{-1}, but
this shouldn't normally occur.

@deftypefun int gmp_printf (const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
Print to the standard output @code{stdout}.  Return the number of characters
written, or @math{-1} if an error occurred.
@end deftypefun

@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
Print to the stream @var{fp}.  Return the number of characters written, or
@math{-1} if an error occurred.
@end deftypefun

@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}.  Return the number of characters
written, excluding the terminating null.

No overlap is permitted between the space at @var{buf} and the string
@var{fmt}.

These functions are not recommended, since there's no protection against
exceeding the space available at @var{buf}.
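
One way to avoid the overflow risk is the bounded @code{gmp_snprintf}.  The
sketch below deliberately uses a buffer that is too small, to show how the
return value reveals truncation.  The buffer size and value are illustrative
assumptions.

@example
#include <stdio.h>
#include <gmp.h>

int
main (void)
@{
  char buf[16];
  mpz_t z;
  int n;

  mpz_init (z);
  mpz_ui_pow_ui (z, 2, 100);    /* 31 decimal digits, too long for buf */

  /* At most sizeof buf bytes are written, including the null.  */
  n = gmp_snprintf (buf, sizeof buf, "%Zd", z);
  if (n >= (int) sizeof buf)
    printf ("truncated to \"%s\", %d characters needed\n", buf, n);
  else
    printf ("%s\n", buf);

  mpz_clear (z);
  return 0;
@}
@end example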
@end deftypefun

@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}. No more than @var{size} bytes
will be written. To get the full output, @var{size} must be enough for the
string and null-terminator.

The return value is the total number of characters which ought to have been
produced, excluding the terminating null. If @math{@var{retval} @ge{}
@var{size}} then the actual output has been truncated to the first
@math{@var{size}-1} characters, and a null appended.

No overlap is permitted between the region @{@var{buf},@var{size}@} and the
@var{fmt} string.

Notice the return value is in ISO C99 @code{snprintf} style. This is so even
if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
@end deftypefun

@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in a block of memory obtained from the current
memory allocation function (@pxref{Custom Allocation}). The block will be the
size of the string and null-terminator. The address of the block is stored to
*@var{pp}. The return value is the number of characters produced, excluding
the null-terminator.

Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
@math{-1} if there's no more memory available, it lets the current allocation
function handle that.
@end deftypefun

@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
@cindex @code{obstack} output
Append to the current object in @var{ob}. The return value is the number of
characters written. A null-terminator is not written.

@var{fmt} cannot be within the current object in @var{ob}, since that object
might move as it grows.

These functions are available only when the C library provides the obstack
feature, which probably means only on GNU systems, see @ref{Obstacks,,
Obstacks, libc, The GNU C Library Reference Manual}.
@end deftypefun


@node C++ Formatted Output, , Formatted Output Functions, Formatted Output
@section C++ Formatted Output
@cindex C++ @code{ostream} output
@cindex @code{ostream} output

The following functions are provided in @file{libgmpxx} (@pxref{Headers and
Libraries}), which is built if C++ support is enabled (@pxref{Build Options}).
Prototypes are available from @code{<gmp.h>}.

@deftypefun ostream& operator<< (ostream& @var{stream}, mpz_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

In hex or octal, @var{op} is printed as a signed number, the same as for
decimal. This is unlike the standard @code{operator<<} routines on @code{int}
etc, which instead give two's complement.
@end deftypefun

@deftypefun ostream& operator<< (ostream& @var{stream}, mpq_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
just a plain integer like @samp{123}.

In hex or octal, @var{op} is printed as a signed value, the same as for
decimal. If @code{ios::showbase} is set then a base indicator is shown on
both the numerator and denominator (if the denominator is required).
@end deftypefun

@deftypefun ostream& operator<< (ostream& @var{stream}, mpf_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

The decimal point follows the standard library float @code{operator<<}, which
on recent systems means the @code{std::locale} imbued on @var{stream}.

Hex and octal are supported, unlike the standard @code{operator<<} on
@code{double}. The mantissa will be in hex or octal, the exponent will be in
decimal. For hex the exponent delimiter is an @samp{@@}. This is as per
@code{mpf_out_str}.

@code{ios::showbase} is supported, and will put a base on the mantissa, for
example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
This last form is slightly strange, but at least differentiates itself from
decimal.
@end deftypefun

These operators mean that GMP types can be printed in the usual C++ way, for
example,

@example
mpz_t z;
int n;
...
cout << "iteration " << n << " value " << z << "\n";
@end example

But note that @code{ostream} output (and @code{istream} input, @pxref{C++
Formatted Input}) is the only overloading available for the GMP types and that
for instance using @code{+} with an @code{mpz_t} will have unpredictable
results. For classes with overloading, see @ref{C++ Class Interface}.


@node Formatted Input, C++ Class Interface, Formatted Output, Top
@chapter Formatted Input
@cindex Formatted input
@cindex @code{scanf} formatted input

@menu
* Formatted Input Strings::
* Formatted Input Functions::
* C++ Formatted Input::
@end menu


@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
@section Formatted Input Strings

@code{gmp_scanf} and friends accept format strings similar to the standard C
@code{scanf} (@pxref{Formatted Input,, Formatted Input, libc, The GNU C
Library Reference Manual}). A format specification is of the form

@example
% [flags] [width] [type] conv
@end example

GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers.
@samp{Q} will read a @samp{/} and a denominator, if present. @samp{F} behaves
like a float.

GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
they're already ``call-by-reference''. For example,

@example
/* to read say "a(5) = 1234" */
int n;
mpz_t z;
gmp_scanf ("a(%d) = %Zd\n", &n, z);

mpq_t q1, q2;
gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);

/* to read say "topleft (1.55,-2.66)" */
mpf_t x, y;
char buf[32];
gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
@end example

All the standard C @code{scanf} types behave the same as in the C library
@code{scanf}, and can be freely intermixed with the GMP extensions. In the
current implementation the standard parts of the format string are simply
handed to @code{scanf} and only the GMP extensions handled directly.

The flags accepted are as follows. @samp{a} and @samp{'} will depend on
support from the C library, and @samp{'} cannot be used with GMP types.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{*} @tab read but don't store
@item @nicode{a} @tab allocate a buffer (string conversions)
@item @nicode{'} @tab grouped digits, GLIBC style (not GMP types)
@end multitable
@end quotation

The standard types accepted are as follows. @samp{h} and @samp{l} are
portable, the rest will depend on the compiler (or include files) for the type
and the C library for the input.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{h} @tab @nicode{short}
@item @nicode{hh} @tab @nicode{char}
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
@item @nicode{l} @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
@item @nicode{ll} @tab @nicode{long long}
@item @nicode{L} @tab @nicode{long double}
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
@item @nicode{t} @tab @nicode{ptrdiff_t}
@item @nicode{z} @tab @nicode{size_t}
@end multitable
@end quotation

@noindent
The GMP types are

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
@end multitable
@end quotation

The conversions accepted are as follows. @samp{p} and @samp{[} will depend on
support from the C library, the rest are standard.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{c} @tab character or characters
@item @nicode{d} @tab decimal integer
@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G}
  @tab float
@item @nicode{i} @tab integer with base indicator
@item @nicode{n} @tab characters read so far
@item @nicode{o} @tab octal integer
@item @nicode{p} @tab pointer
@item @nicode{s} @tab string of non-whitespace characters
@item @nicode{u} @tab decimal integer
@item @nicode{x} @nicode{X} @tab hex integer
@item @nicode{[} @tab string of characters in a set
@end multitable
@end quotation

@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
read either fixed point or scientific format, and either upper or lower case
@samp{e} for the exponent in scientific format.

C99 style hex float format (@code{printf %a}, @pxref{Formatted Output
Strings}) is always accepted for @code{mpf_t}, but for the standard float
types it will depend on the C library.

@samp{x} and @samp{X} are identical, both accept both upper and lower case
hexadecimal.

@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
values. For the standard C types these are described as ``unsigned''
conversions, but that merely affects certain overflow handling, negatives are
still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of
Integers, libc, The GNU C Library Reference Manual}). For GMP types there are
no overflows, so @samp{d} and @samp{u} are identical.

@samp{Q} type reads the numerator and (optional) denominator as given. If the
value might not be in canonical form then @code{mpq_canonicalize} must be
called before using it in any calculations (@pxref{Rational Number
Functions}).

@samp{Qi} will read a base specification separately for the numerator and
denominator. For example @samp{0x10/11} would be 16/11, whereas
@samp{0x10/0x11} would be 16/17.

@samp{n} can be used with any of the types above, even the GMP types.
@samp{*} to suppress assignment is allowed, though in that case it would do
nothing at all.

Other conversions or types that might be accepted by the C library
@code{scanf} cannot be used through @code{gmp_scanf}.

Whitespace is read and discarded before a field, except for @samp{c} and
@samp{[} conversions.

For float conversions, the decimal point character (or string) expected is
taken from the current locale settings on systems which provide
@code{localeconv} (@pxref{Locales,, Locales and Internationalization, libc,
The GNU C Library Reference Manual}). The C library will normally do the same
for standard float input.

The format string is only interpreted as plain @code{char}s, multibyte
characters are not recognised. Perhaps this will change in the future.


@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
@section Formatted Input Functions
@cindex Input functions

Each of the following functions is similar to the corresponding C library
function. The plain @code{scanf} forms take a variable argument list. The
@code{vscanf} forms take an argument pointer, see @ref{Variadic Functions,,
Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
va_start}.

It should be emphasised that if a format string is invalid, or the arguments
don't match what the format specifies, then the behaviour of any of these
functions will be unpredictable. GCC format string checking is not available,
since it doesn't recognise the GMP extensions.

No overlap is permitted between the @var{fmt} string and any of the results
produced.

@deftypefun int gmp_scanf (const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
Read from the standard input @code{stdin}.
@end deftypefun

@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
Read from the stream @var{fp}.
@end deftypefun

@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
Read from a null-terminated string @var{s}.
@end deftypefun

The return value from each of these functions is the same as the standard C99
@code{scanf}, namely the number of fields successfully parsed and stored.
@samp{%n} fields and fields read but suppressed by @samp{*} don't count
towards the return value.

If end of input (or a file error) is reached before a character for a field or
a literal, and if no previous non-suppressed fields have matched, then the
return value is @code{EOF} instead of 0. A whitespace character in the format
string is only an optional match and doesn't induce an @code{EOF} in this
fashion. Leading whitespace read and discarded for a field doesn't count as
characters for that field.

For the GMP types, input parsing follows C99 rules, namely one character of
lookahead is used and characters are read while they continue to meet the
format requirements. If this doesn't provide a complete number then the
function terminates, with that field not stored nor counted towards the return
value. For instance with @code{mpf_t} an input @samp{1.23e-XYZ} would be read
up to the @samp{X} and that character pushed back since it's not a digit.
The string @samp{1.23e-} would then be considered invalid since an @samp{e} must
be followed by at least one digit.

For the standard C types, in the current implementation GMP calls the C
library @code{scanf} functions, which might have looser rules about what
constitutes a valid input.

Note that @code{gmp_sscanf} is the same as @code{gmp_fscanf} and only does one
character of lookahead when parsing. Although clearly it could look at its
entire input, it is deliberately made identical to @code{gmp_fscanf}, the same
way C99 @code{sscanf} is the same as @code{fscanf}.


@node C++ Formatted Input, , Formatted Input Functions, Formatted Input
@section C++ Formatted Input
@cindex C++ @code{istream} input
@cindex @code{istream} input

The following functions are provided in @file{libgmpxx} (@pxref{Headers and
Libraries}), which is built only if C++ support is enabled (@pxref{Build
Options}). Prototypes are available from @code{<gmp.h>}.

@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
An integer like @samp{123} will be read, or a fraction like @samp{5/9}. No
whitespace is allowed around the @samp{/}. If the fraction is not in
canonical form then @code{mpq_canonicalize} must be called (@pxref{Rational
Number Functions}) before operating on it.

As per integer input, a @samp{0} or @samp{0x} base indicator is read when
none of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} are set. This is
done separately for numerator and denominator, so that for instance
@samp{0x10/11} is @math{16/11} and @samp{0x10/0x11} is @math{16/17}.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.

Hex or octal floats are not supported, but might be in the future, or perhaps
it's best to accept only what the standard float @code{operator>>} does.
@end deftypefun

Note that digit grouping specified by the @code{istream} locale is currently
not accepted. Perhaps this will change in the future.

@sp 1
These operators mean that GMP types can be read in the usual C++ way, for
example,

@example
mpz_t z;
...
cin >> z;
@end example

But note that @code{istream} input (and @code{ostream} output, @pxref{C++
Formatted Output}) is the only overloading available for the GMP types and
that for instance using @code{+} with an @code{mpz_t} will have unpredictable
results. For classes with overloading, see @ref{C++ Class Interface}.



@node C++ Class Interface, Custom Allocation, Formatted Input, Top
@chapter C++ Class Interface
@cindex C++ interface

This chapter describes the C++ class based interface to GMP.

All GMP C language types and functions can be used in C++ programs, since
@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
overloaded functions and operators which may be more convenient.

Due to the implementation of this interface, a reasonably recent C++ compiler
is required, one supporting namespaces, partial specialization of templates
and member templates. For GCC this means version 2.91 or later.

@strong{Everything described in this chapter is to be considered preliminary
and might be subject to incompatible changes if some unforeseen difficulty
reveals itself.}

@menu
* C++ Interface General::
* C++ Interface Integers::
* C++ Interface Rationals::
* C++ Interface Floats::
* C++ Interface Random Numbers::
* C++ Interface Limitations::
@end menu


@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
@section C++ Interface General

@noindent
All the C++ classes and functions are available with

@cindex @code{gmpxx.h}
@example
#include <gmpxx.h>
@end example

Programs should be linked with the @file{libgmpxx} and @file{libgmp}
libraries. For example,

@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example

@noindent
The classes defined are

@deftp Class mpz_class
@deftpx Class mpq_class
@deftpx Class mpf_class
@end deftp

The standard operators and various standard functions are overloaded to allow
arithmetic with these classes. For example,

@example
int
main (void)
@{
  mpz_class a, b, c;

  a = 1234;
  b = "-5678";
  c = a+b;
  cout << "sum is " << c << "\n";
  cout << "absolute value is " << abs(c) << "\n";

  return 0;
@}
@end example

An important feature of the implementation is that an expression like
@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
without using a temporary for the @code{b+c} part. Expressions which by their
nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
though.

The classes can be freely intermixed in expressions, as can the classes and
the standard types @code{long}, @code{unsigned long} and @code{double}.
Smaller types like @code{int} or @code{float} can also be intermixed, since
C++ will promote them.

Note that @code{bool} is not accepted directly, but must be explicitly cast to
an @code{int} first. This is because C++ will automatically convert any
pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
sorts of invalid class and pointer combinations compile but almost certainly
not do anything sensible.

Conversions back from the classes to standard C++ types aren't done
automatically, instead member functions like @code{get_si} are provided (see
the following sections for details).

Also there are no automatic conversions from the classes to the corresponding
GMP C types, instead a reference to the underlying C object can be obtained
with the following functions,

@deftypefun mpz_t mpz_class::get_mpz_t ()
@deftypefunx mpq_t mpq_class::get_mpq_t ()
@deftypefunx mpf_t mpf_class::get_mpf_t ()
@end deftypefun

These can be used to call a C function which doesn't have a C++ class
interface. For example to set @code{a} to the GCD of @code{b} and @code{c},

@example
mpz_class a, b, c;
...
mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
@end example

In the other direction, a class can be initialized from the corresponding GMP
C type, or assigned to if an explicit constructor is used. In both cases this
makes a copy of the value, it doesn't create any sort of association. For
example,

@example
mpz_t z;
// ... init and calculate z ...
mpz_class x(z);
mpz_class y;
y = mpz_class (z);
@end example

There are no namespace setups in @file{gmpxx.h}, all types and functions are
simply put into the global namespace. This is what @file{gmp.h} has done in
the past, and continues to do for compatibility.
The extras provided by
@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
anything.


@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
@section C++ Interface Integers

@deftypefun {} mpz_class::mpz_class (type @var{n})
Construct an @code{mpz_class}. All the standard C++ types may be used, except
@code{long long} and @code{long double}, and all the GMP C++ classes can be
used, although conversions from @code{mpq_class} and @code{mpf_class} are
@code{explicit}. Any necessary conversion follows the corresponding C
function, for example @code{double} follows @code{mpz_set_d}
(@pxref{Assigning Integers}).
@end deftypefun

@deftypefun explicit mpz_class::mpz_class (mpz_t @var{z})
Construct an @code{mpz_class} from an @code{mpz_t}. The value in @var{z} is
copied into the new @code{mpz_class}, there won't be any permanent association
between it and @var{z}.
@end deftypefun

@deftypefun explicit mpz_class::mpz_class (const char *@var{s}, int @var{base} = 0)
@deftypefunx explicit mpz_class::mpz_class (const string& @var{s}, int @var{base} = 0)
Construct an @code{mpz_class} converted from a string using @code{mpz_set_str}
(@pxref{Assigning Integers}).

If the string is not a valid integer, an @code{std::invalid_argument}
exception is thrown. The same applies to @code{operator=}.
@end deftypefun

@deftypefun mpz_class operator"" _mpz (const char *@var{str})
With C++11 compilers, integers can be constructed with the syntax
@code{123_mpz} which is equivalent to @code{mpz_class("123")}.
@end deftypefun

@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
Divisions involving @code{mpz_class} round towards zero, as per the
@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
This is the same as the C99 @code{/} and @code{%} operators.

The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called
directly if desired. For example,

@example
mpz_class q, a, d;
...
mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
@end example
@end deftypefun

@deftypefun mpz_class abs (mpz_class @var{op})
@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
@maybepagebreak
@deftypefunx bool mpz_class::fits_sint_p (void)
@deftypefunx bool mpz_class::fits_slong_p (void)
@deftypefunx bool mpz_class::fits_sshort_p (void)
@maybepagebreak
@deftypefunx bool mpz_class::fits_uint_p (void)
@deftypefunx bool mpz_class::fits_ulong_p (void)
@deftypefunx bool mpz_class::fits_ushort_p (void)
@maybepagebreak
@deftypefunx double mpz_class::get_d (void)
@deftypefunx long mpz_class::get_si (void)
@deftypefunx string mpz_class::get_str (int @var{base} = 10)
@deftypefunx {unsigned long} mpz_class::get_ui (void)
@maybepagebreak
@deftypefunx int mpz_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpz_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpz_class @var{op})
@deftypefunx mpz_class sqrt (mpz_class @var{op})
@maybepagebreak
@deftypefunx void mpz_class::swap (mpz_class& @var{op})
@deftypefunx void swap (mpz_class& @var{op1}, mpz_class& @var{op2})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.
@end deftypefun

@sp 1
Overloaded operators for combinations of @code{mpz_class} and @code{double}
are provided for completeness, but it should be noted that if the given
@code{double} is not an integer then the way any rounding is done is currently
unspecified. The rounding might take place at the start, in the middle, or at
the end of the operation, and it might change in the future.

Conversions between @code{mpz_class} and @code{double}, however, are defined
to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
And comparisons are always made exactly, as per @code{mpz_cmp_d}.


@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
@section C++ Interface Rationals

In all the following constructors, if a fraction is given then it should be in
canonical form, or if not then @code{mpq_class::canonicalize} called.

@deftypefun {} mpq_class::mpq_class (type @var{op})
@deftypefunx {} mpq_class::mpq_class (integer @var{num}, integer @var{den})
Construct an @code{mpq_class}. The initial value can be a single value of any
type (conversion from @code{mpf_class} is @code{explicit}), or a pair of
integers (@code{mpz_class} or standard C++ integer types) representing a
fraction, except that @code{long long} and @code{long double} are not
supported. For example,

@example
mpq_class q (99);
mpq_class q (1.75);
mpq_class q (1, 3);
@end example
@end deftypefun

@deftypefun explicit mpq_class::mpq_class (mpq_t @var{q})
Construct an @code{mpq_class} from an @code{mpq_t}. The value in @var{q} is
copied into the new @code{mpq_class}, there won't be any permanent association
between it and @var{q}.
@end deftypefun

@deftypefun explicit mpq_class::mpq_class (const char *@var{s}, int @var{base} = 0)
@deftypefunx explicit mpq_class::mpq_class (const string& @var{s}, int @var{base} = 0)
Construct an @code{mpq_class} converted from a string using @code{mpq_set_str}
(@pxref{Initializing Rationals}).

If the string is not a valid rational, an @code{std::invalid_argument}
exception is thrown. The same applies to @code{operator=}.
@end deftypefun

@deftypefun mpq_class operator"" _mpq (const char *@var{str})
With C++11 compilers, integral rationals can be constructed with the syntax
@code{123_mpq} which is equivalent to @code{mpq_class(123_mpz)}. Other
rationals can be built as @code{-1_mpq/2} or @code{0xb_mpq/123456_mpz}.
@end deftypefun

@deftypefun void mpq_class::canonicalize ()
Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
Functions}. All arithmetic operators require their operands in canonical
form, and will return results in canonical form.
@end deftypefun

@deftypefun mpq_class abs (mpq_class @var{op})
@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
@maybepagebreak
@deftypefunx double mpq_class::get_d (void)
@deftypefunx string mpq_class::get_str (int @var{base} = 10)
@maybepagebreak
@deftypefunx int mpq_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpq_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpq_class @var{op})
@maybepagebreak
@deftypefunx void mpq_class::swap (mpq_class& @var{op})
@deftypefunx void swap (mpq_class& @var{op1}, mpq_class& @var{op2})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.
@end deftypefun

@deftypefun {mpz_class&} mpq_class::get_num ()
@deftypefunx {mpz_class&} mpq_class::get_den ()
Get a reference to an @code{mpz_class} which is the numerator or denominator
of an @code{mpq_class}. This can be used both for read and write access. If
the object returned is modified, it modifies the original @code{mpq_class}.

If direct manipulation might produce a non-canonical value, then
@code{mpq_class::canonicalize} must be called before further operations.
@end deftypefun

@deftypefun mpz_t mpq_class::get_num_mpz_t ()
@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
Get a reference to the underlying @code{mpz_t} numerator or denominator of an
@code{mpq_class}. This can be passed to C functions expecting an
@code{mpz_t}. Any modifications made to the @code{mpz_t} will modify the
original @code{mpq_class}.

If direct manipulation might produce a non-canonical value, then
@code{mpq_class::canonicalize} must be called before further operations.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop});
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).

If the @var{rop} read might not be in canonical form then
@code{mpq_class::canonicalize} must be called.
@end deftypefun


@node C++ Interface Floats, C++ Interface Random Numbers, C++ Interface Rationals, C++ Class Interface
@section C++ Interface Floats

When an expression requires the use of temporary intermediate @code{mpf_class}
values, like @code{f=g*h+x*y}, those temporaries will have the same precision
as the destination @code{f}.
Explicit constructors can be used if this
doesn't suit.

@deftypefun {} mpf_class::mpf_class (type @var{op})
@deftypefunx {} mpf_class::mpf_class (type @var{op}, mp_bitcnt_t @var{prec})
Construct an @code{mpf_class}. Any standard C++ type can be used, except
@code{long long} and @code{long double}, and any of the GMP C++ classes can be
used.

If @var{prec} is given, the initial precision is that value, in bits. If
@var{prec} is not given, then the initial precision is determined by the type
of @var{op} given. An @code{mpz_class}, @code{mpq_class}, or C++
builtin type will give the default @code{mpf} precision (@pxref{Initializing
Floats}). An @code{mpf_class} or expression will give the precision of that
value. The precision of a binary expression is the higher of the two
operands.

@example
mpf_class f(1.5);        // default precision
mpf_class f(1.5, 500);   // 500 bits (at least)
mpf_class f(x);          // precision of x
mpf_class f(abs(x));     // precision of x
mpf_class f(-g, 1000);   // 1000 bits (at least)
mpf_class f(x+y);        // greater of precisions of x and y
@end example
@end deftypefun

@deftypefun explicit mpf_class::mpf_class (mpf_t @var{f})
@deftypefunx {} mpf_class::mpf_class (mpf_t @var{f}, mp_bitcnt_t @var{prec})
Construct an @code{mpf_class} from an @code{mpf_t}. The value in @var{f} is
copied into the new @code{mpf_class}, there won't be any permanent association
between it and @var{f}.

If @var{prec} is given, the initial precision is that value, in bits. If
@var{prec} is not given, then the initial precision is that of @var{f}.
6763@end deftypefun 6764 6765@deftypefun explicit mpf_class::mpf_class (const char *@var{s}) 6766@deftypefunx {} mpf_class::mpf_class (const char *@var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0) 6767@deftypefunx explicit mpf_class::mpf_class (const string& @var{s}) 6768@deftypefunx {} mpf_class::mpf_class (const string& @var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0) 6769Construct an @code{mpf_class} converted from a string using @code{mpf_set_str} 6770(@pxref{Assigning Floats}). If @var{prec} is given, the initial precision is 6771that value, in bits. If not, the default @code{mpf} precision 6772(@pxref{Initializing Floats}) is used. 6773 6774If the string is not a valid float, an @code{std::invalid_argument} exception 6775is thrown. The same applies to @code{operator=}. 6776@end deftypefun 6777 6778@deftypefun mpf_class operator"" _mpf (const char *@var{str}) 6779With C++11 compilers, floats can be constructed with the syntax 6780@code{1.23e-1_mpf} which is equivalent to @code{mpf_class("1.23e-1")}. 6781@end deftypefun 6782 6783@deftypefun {mpf_class&} mpf_class::operator= (type @var{op}) 6784Convert and store the given @var{op} value to an @code{mpf_class} object. The 6785same types are accepted as for the constructors above. 6786 6787Note that @code{operator=} only stores a new value, it doesn't copy or change 6788the precision of the destination, instead the value is truncated if necessary. 6789This is the same as @code{mpf_set} etc. Note in particular this means for 6790@code{mpf_class} a copy constructor is not the same as a default constructor 6791plus assignment. 
6792 6793@example 6794mpf_class x (y); // x created with precision of y 6795 6796mpf_class x; // x created with default precision 6797x = y; // value truncated to that precision 6798@end example 6799 6800Applications using templated code may need to be careful about the assumptions 6801the code makes in this area, when working with @code{mpf_class} values of 6802various different or non-default precisions. For instance implementations of 6803the standard @code{complex} template have been seen in both styles above, 6804though of course @code{complex} is normally only actually specified for use 6805with the builtin float types. 6806@end deftypefun 6807 6808@deftypefun mpf_class abs (mpf_class @var{op}) 6809@deftypefunx mpf_class ceil (mpf_class @var{op}) 6810@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2}) 6811@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2}) 6812@maybepagebreak 6813@deftypefunx bool mpf_class::fits_sint_p (void) 6814@deftypefunx bool mpf_class::fits_slong_p (void) 6815@deftypefunx bool mpf_class::fits_sshort_p (void) 6816@maybepagebreak 6817@deftypefunx bool mpf_class::fits_uint_p (void) 6818@deftypefunx bool mpf_class::fits_ulong_p (void) 6819@deftypefunx bool mpf_class::fits_ushort_p (void) 6820@maybepagebreak 6821@deftypefunx mpf_class floor (mpf_class @var{op}) 6822@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2}) 6823@maybepagebreak 6824@deftypefunx double mpf_class::get_d (void) 6825@deftypefunx long mpf_class::get_si (void) 6826@deftypefunx string mpf_class::get_str (mp_exp_t& @var{exp}, int @var{base} = 10, size_t @var{digits} = 0) 6827@deftypefunx {unsigned long} mpf_class::get_ui (void) 6828@maybepagebreak 6829@deftypefunx int mpf_class::set_str (const char *@var{str}, int @var{base}) 6830@deftypefunx int mpf_class::set_str (const string& @var{str}, int @var{base}) 6831@deftypefunx int sgn (mpf_class @var{op}) 6832@deftypefunx mpf_class sqrt (mpf_class @var{op}) 6833@maybepagebreak 6834@deftypefunx 
void mpf_class::swap (mpf_class& @var{op})
@deftypefunx void swap (mpf_class& @var{op1}, mpf_class& @var{op2})
@deftypefunx mpf_class trunc (mpf_class @var{op})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.

The accuracy provided by @code{hypot} is not currently guaranteed.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpf_class::get_prec ()
@deftypefunx void mpf_class::set_prec (mp_bitcnt_t @var{prec})
@deftypefunx void mpf_class::set_prec_raw (mp_bitcnt_t @var{prec})
Get or set the current precision of an @code{mpf_class}.

The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
Floats}) apply to @code{mpf_class::set_prec_raw}.  Note in particular that the
@code{mpf_class} must be restored to its allocated precision before being
destroyed.  This must be done by application code; there is no automatic
mechanism for it.
@end deftypefun


@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface Floats, C++ Class Interface
@section C++ Interface Random Numbers

@deftp Class gmp_randclass
The C++ class interface to the GMP random number functions uses
@code{gmp_randclass} to hold an algorithm selection and current state, as per
@code{gmp_randstate_t}.
@end deftp

@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, @dots{}), @dots{})
Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
function (@pxref{Random State Initialization}).  The arguments expected are
the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
For example,

@example
gmp_randclass r1 (gmp_randinit_default);
gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
gmp_randclass r4 (gmp_randinit_mt);
@end example

@code{gmp_randinit_lc_2exp_size} fails if the size requested is too big; an
@code{std::length_error} exception is thrown in that case.
@end deftypefun

@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, @dots{})
Construct a @code{gmp_randclass} using the same parameters as
@code{gmp_randinit} (@pxref{Random State Initialization}).  This function is
obsolete and the above @var{randinit} style should be preferred.
@end deftypefun

@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
Seed a random number generator.  @xref{Random Number Functions}, for how
to choose a good seed.
@end deftypefun

@deftypefun mpz_class gmp_randclass::get_z_bits (mp_bitcnt_t @var{bits})
@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
Generate a random integer with a specified number of bits.
@end deftypefun

@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
@end deftypefun

@deftypefun mpf_class gmp_randclass::get_f ()
@deftypefunx mpf_class gmp_randclass::get_f (mp_bitcnt_t @var{prec})
Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}.  @var{f}
will be to @var{prec} bits precision, or if @var{prec} is not given then to
the precision of the destination.  For example,

@example
gmp_randclass r;
...
mpf_class f (0, 512);   // 512 bits precision
f = r.get_f();          // random number, 512 bits
@end example
@end deftypefun



@node C++ Interface Limitations,  , C++ Interface Random Numbers, C++ Class Interface
@section C++ Interface Limitations

@table @asis
@item @code{mpq_class} and Templated Reading
A generic piece of template code probably won't know that @code{mpq_class}
requires a @code{canonicalize} call if inputs read with @code{operator>>}
might be non-canonical.  This can lead to incorrect results.

@code{operator>>} behaves as it does for reasons of efficiency.  A
canonicalize can be quite time consuming on large operands, and is best
avoided if it's not necessary.

But this potential difficulty reduces the usefulness of @code{mpq_class}.
Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
pressed into service.  Or maybe, at the risk of inconsistency, the
@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
@code{operator>>} not doing so, for use on those occasions when that's
acceptable.  Send feedback or alternate ideas to @email{gmp-bugs@@gmplib.org}.

@item Subclassing
Subclassing the GMP C++ classes works, but is not currently recommended.

Expressions involving subclasses resolve correctly (or seem to), but in normal
C++ fashion the subclass doesn't inherit constructors and assignments.
There are many of those in the GMP classes, and a good way to reestablish them
in a subclass is not yet provided.

@item Templated Expressions
A subtle difficulty exists when using expressions together with
application-defined template functions.
Consider the following, with @code{T} 6954intended to be some numeric type, 6955 6956@example 6957template <class T> 6958T fun (const T &, const T &); 6959@end example 6960 6961@noindent 6962When used with, say, plain @code{mpz_class} variables, it works fine: @code{T} 6963is resolved as @code{mpz_class}. 6964 6965@example 6966mpz_class f(1), g(2); 6967fun (f, g); // Good 6968@end example 6969 6970@noindent 6971But when one of the arguments is an expression, it doesn't work. 6972 6973@example 6974mpz_class f(1), g(2), h(3); 6975fun (f, g+h); // Bad 6976@end example 6977 6978This is because @code{g+h} ends up being a certain expression template type 6979internal to @code{gmpxx.h}, which the C++ template resolution rules are unable 6980to automatically convert to @code{mpz_class}. The workaround is simply to add 6981an explicit cast. 6982 6983@example 6984mpz_class f(1), g(2), h(3); 6985fun (f, mpz_class(g+h)); // Good 6986@end example 6987 6988Similarly, within @code{fun} it may be necessary to cast an expression to type 6989@code{T} when calling a templated @code{fun2}. 6990 6991@example 6992template <class T> 6993void fun (T f, T g) 6994@{ 6995 fun2 (f, f+g); // Bad 6996@} 6997 6998template <class T> 6999void fun (T f, T g) 7000@{ 7001 fun2 (f, T(f+g)); // Good 7002@} 7003@end example 7004@end table 7005 7006 7007@node Custom Allocation, Language Bindings, C++ Class Interface, Top 7008@comment node-name, next, previous, up 7009@chapter Custom Allocation 7010@cindex Custom allocation 7011@cindex Memory allocation 7012@cindex Allocation of memory 7013 7014By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory 7015allocation, and if they fail GMP prints a message to the standard error output 7016and terminates the program. 7017 7018Alternate functions can be specified, to allocate memory in a different way or 7019to have a different error action on running out of memory. 
7020 7021@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t)) 7022Replace the current allocation functions from the arguments. If an argument 7023is @code{NULL}, the corresponding default function is used. 7024 7025These functions will be used for all memory allocation done by GMP, apart from 7026temporary space from @code{alloca} if that function is available and GMP is 7027configured to use it (@pxref{Build Options}). 7028 7029@strong{Be sure to call @code{mp_set_memory_functions} only when there are no 7030active GMP objects allocated using the previous memory functions! Usually 7031that means calling it before any other GMP function.} 7032@end deftypefun 7033 7034The functions supplied should fit the following declarations: 7035 7036@deftypevr Function {void *} allocate_function (size_t @var{alloc_size}) 7037Return a pointer to newly allocated space with at least @var{alloc_size} 7038bytes. 7039@end deftypevr 7040 7041@deftypevr Function {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size}) 7042Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be 7043@var{new_size} bytes. 7044 7045The block may be moved if necessary or if desired, and in that case the 7046smaller of @var{old_size} and @var{new_size} bytes must be copied to the new 7047location. The return value is a pointer to the resized block, that being the 7048new location if moved or just @var{ptr} if not. 7049 7050@var{ptr} is never @code{NULL}, it's always a previously allocated block. 7051@var{new_size} may be bigger or smaller than @var{old_size}. 7052@end deftypevr 7053 7054@deftypevr Function void free_function (void *@var{ptr}, size_t @var{size}) 7055De-allocate the space pointed to by @var{ptr}. 
7056 7057@var{ptr} is never @code{NULL}, it's always a previously allocated block of 7058@var{size} bytes. 7059@end deftypevr 7060 7061A @dfn{byte} here means the unit used by the @code{sizeof} operator. 7062 7063The @var{reallocate_function} parameter @var{old_size} and the 7064@var{free_function} parameter @var{size} are passed for convenience, but of 7065course they can be ignored if not needed by an implementation. The default 7066functions using @code{malloc} and friends for instance don't use them. 7067 7068No error return is allowed from any of these functions, if they return then 7069they must have performed the specified operation. In particular note that 7070@var{allocate_function} or @var{reallocate_function} mustn't return 7071@code{NULL}. 7072 7073Getting a different fatal error action is a good use for custom allocation 7074functions, for example giving a graphical dialog rather than the default print 7075to @code{stderr}. How much is possible when genuinely out of memory is 7076another question though. 7077 7078There's currently no defined way for the allocation functions to recover from 7079an error such as out of memory, they must terminate program execution. A 7080@code{longjmp} or throwing a C++ exception will have undefined results. This 7081may change in the future. 7082 7083GMP may use allocated blocks to hold pointers to other allocated blocks. This 7084will limit the assumptions a conservative garbage collection scheme can make. 7085 7086Since the default GMP allocation uses @code{malloc} and friends, those 7087functions will be linked in even if the first thing a program does is an 7088@code{mp_set_memory_functions}. It's necessary to change the GMP sources if 7089this is a problem. 
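
As a rough sketch of the requirements above, the following C functions wrap
@code{malloc}, @code{realloc} and @code{free}, terminate the program rather
than return on failure, and use the size arguments to track the number of
outstanding bytes.  The names @code{my_alloc}, @code{my_realloc},
@code{my_free}, @code{die_out_of_memory} and @code{live_bytes} are
hypothetical, chosen only for this illustration.  The code is written to
compile without GMP, so the registration call appears only in a comment.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter for illustration: bytes currently allocated. */
size_t live_bytes = 0;

/* Hypothetical helper: report the failure and terminate.  The allocation
   functions are not allowed to return on error. */
void die_out_of_memory(size_t size)
{
    fprintf(stderr, "out of memory (wanted %lu bytes)\n",
            (unsigned long) size);
    abort();
}

/* Matches the allocate_function declaration. */
void *my_alloc(size_t alloc_size)
{
    /* Request at least one byte so a NULL return always means failure. */
    void *p = malloc(alloc_size ? alloc_size : 1);
    if (p == NULL)
        die_out_of_memory(alloc_size);
    live_bytes += alloc_size;
    return p;
}

/* Matches the reallocate_function declaration. */
void *my_realloc(void *ptr, size_t old_size, size_t new_size)
{
    void *p;
    assert(ptr != NULL);        /* always a previously allocated block */
    p = realloc(ptr, new_size ? new_size : 1);
    if (p == NULL)
        die_out_of_memory(new_size);
    live_bytes = live_bytes - old_size + new_size;
    return p;
}

/* Matches the free_function declaration. */
void my_free(void *ptr, size_t size)
{
    assert(ptr != NULL);        /* always a previously allocated block */
    live_bytes -= size;
    free(ptr);
}

/* In a program that includes gmp.h and links with GMP, these would be
   installed before any other GMP call with:
       mp_set_memory_functions (my_alloc, my_realloc, my_free);  */
```

Tracking sizes like this is just one possible use of the @var{old_size} and
@var{size} parameters; an implementation that doesn't need them can ignore
them.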
7090 7091@sp 1 7092@deftypefun void mp_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (**@var{free_func_ptr}) (void *, size_t)) 7093Get the current allocation functions, storing function pointers to the 7094locations given by the arguments. If an argument is @code{NULL}, that 7095function pointer is not stored. 7096 7097@need 1000 7098For example, to get just the current free function, 7099 7100@example 7101void (*freefunc) (void *, size_t); 7102 7103mp_get_memory_functions (NULL, NULL, &freefunc); 7104@end example 7105@end deftypefun 7106 7107@node Language Bindings, Algorithms, Custom Allocation, Top 7108@chapter Language Bindings 7109@cindex Language bindings 7110@cindex Other languages 7111 7112The following packages and projects offer access to GMP from languages other 7113than C, though perhaps with varying levels of functionality and efficiency. 7114 7115@c @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces 7116@c in tex, just to separate the URL from the preceding text a bit. 7117@iftex 7118@macro spaceuref {U} 7119@ @ @uref{\U\} 7120@end macro 7121@end iftex 7122@ifnottex 7123@macro spaceuref {U} 7124@uref{\U\} 7125@end macro 7126@end ifnottex 7127 7128@sp 1 7129@table @asis 7130@item C++ 7131@itemize @bullet 7132@item 7133GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward 7134interface, expression templates to eliminate temporaries. 7135@item 7136ALP @spaceuref{http://www-sop.inria.fr/saga/logiciels/ALP/} @* Linear algebra and 7137polynomials using templates. 7138@item 7139Arithmos @spaceuref{http://cant.ua.ac.be/old/arithmos/} @* Rationals 7140with infinities and square roots. 7141@item 7142CLN @spaceuref{http://www.ginac.de/CLN/} @* High level classes for arithmetic. 7143@item 7144LiDIA @spaceuref{http://www.cdc.informatik.tu-darmstadt.de/TI/LiDIA/} @* A C++ 7145library for computational number theory. 
7146@item 7147Linbox @spaceuref{http://www.linalg.org/} @* Sparse vectors and matrices. 7148@item 7149NTL @spaceuref{http://www.shoup.net/ntl/} @* A C++ number theory library. 7150@end itemize 7151 7152@c @item D 7153@c @itemize @bullet 7154@c @item 7155@c gmp-d @spaceuref{http://home.comcast.net/~benhinkle/gmp-d/} 7156@c @end itemize 7157 7158@item Eiffel 7159@itemize @bullet 7160@item 7161Eiffelroom @spaceuref{http://www.eiffelroom.org/node/442} 7162@end itemize 7163 7164@item Fortran 7165@itemize @bullet 7166@item 7167Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary 7168precision floats. 7169@end itemize 7170 7171@item Haskell 7172@itemize @bullet 7173@item 7174Glasgow Haskell Compiler @spaceuref{http://www.haskell.org/ghc/} 7175@end itemize 7176 7177@item Java 7178@itemize @bullet 7179@item 7180Kaffe @spaceuref{http://www.kaffe.org/} 7181@item 7182Kissme @spaceuref{http://kissme.sourceforge.net/} 7183@end itemize 7184 7185@item Lisp 7186@itemize @bullet 7187@item 7188GNU Common Lisp @spaceuref{http://www.gnu.org/software/gcl/gcl.html} 7189@item 7190Librep @spaceuref{http://librep.sourceforge.net/} 7191@item 7192@c FIXME: When there's a stable release with gmp support, just refer to it 7193@c rather than bothering to talk about betas. 7194XEmacs (21.5.18 beta and up) @spaceuref{http://www.xemacs.org} @* Optional 7195big integers, rationals and floats using GMP. 7196@end itemize 7197 7198@item M4 7199@itemize @bullet 7200@item 7201@c FIXME: When there's a stable release with gmp support, just refer to it 7202@c rather than bothering to talk about betas. 7203GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu/} @* Optionally provides 7204an arbitrary precision @code{mpeval}. 
7205@end itemize 7206 7207@item ML 7208@itemize @bullet 7209@item 7210MLton compiler @spaceuref{http://mlton.org/} 7211@end itemize 7212 7213@item Objective Caml 7214@itemize @bullet 7215@item 7216MLGMP @spaceuref{http://www.di.ens.fr/~monniaux/programmes.html.en} 7217@item 7218Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using 7219GMP. 7220@end itemize 7221 7222@item Oz 7223@itemize @bullet 7224@item 7225Mozart @spaceuref{http://www.mozart-oz.org/} 7226@end itemize 7227 7228@item Pascal 7229@itemize @bullet 7230@item 7231GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de/} @* GMP unit. 7232@item 7233Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal, 7234optionally using GMP. 7235@end itemize 7236 7237@item Perl 7238@itemize @bullet 7239@item 7240GMP module, see @file{demos/perl} in the GMP sources (@pxref{Demonstration 7241Programs}). 7242@item 7243Math::GMP @spaceuref{http://www.cpan.org/} @* Compatible with Math::BigInt, but 7244not as many functions as the GMP module above. 7245@item 7246Math::BigInt::GMP @spaceuref{http://www.cpan.org/} @* Plug Math::GMP into 7247normal Math::BigInt operations. 7248@end itemize 7249 7250@need 1000 7251@item Pike 7252@itemize @bullet 7253@item 7254mpz module in the standard distribution, @uref{http://pike.ida.liu.se/} 7255@end itemize 7256 7257@need 500 7258@item Prolog 7259@itemize @bullet 7260@item 7261SWI Prolog @spaceuref{http://www.swi-prolog.org/} @* 7262Arbitrary precision floats. 
@end itemize

@item Python
@itemize @bullet
@item
GMPY @uref{http://code.google.com/p/gmpy/}
@end itemize

@item Ruby
@itemize @bullet
@item
@uref{http://rubygems.org/gems/gmp}
@end itemize

@item Scheme
@itemize @bullet
@item
GNU Guile (upcoming 1.8) @spaceuref{http://www.gnu.org/software/guile/guile.html}
@item
RScheme @spaceuref{http://www.rscheme.org/}
@item
STklos @spaceuref{http://www.stklos.org/}
@c
@c  For reference, MzScheme uses some of gmp, but (as of version 205) it only
@c  has copies of some of the generic C code, and we don't consider that a
@c  language binding to gmp.
@c
@end itemize

@item Smalltalk
@itemize @bullet
@item
GNU Smalltalk @spaceuref{http://www.smalltalk.org/versions/GNUSmalltalk.html}
@end itemize

@item Other
@itemize @bullet
@item
Axiom @uref{http://savannah.nongnu.org/projects/axiom} @* Computer algebra
using GCL.
@item
DrGenius @spaceuref{http://drgenius.seul.org/} @* Geometry system and
mathematical programming language.
@item
GiNaC @spaceuref{http://www.ginac.de/} @* C++ computer algebra using CLN.
@item
GOO @spaceuref{http://www.googoogaga.org/} @* Dynamic object oriented
language.
@item
Maxima @uref{http://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
computer algebra using GCL.
@item
Q @spaceuref{http://q-lang.sourceforge.net/} @* Equational programming system.
@item
Regina @spaceuref{http://regina.sourceforge.net/} @* Topological calculator.
@item
Yacas @spaceuref{yacas.sourceforge.net} @* Yet another computer algebra system.
@end itemize

@end table


@node Algorithms, Internals, Language Bindings, Top
@chapter Algorithms
@cindex Algorithms

This chapter is an introduction to some of the algorithms used for various GMP
operations.
The code is likely to be hard to understand without knowing 7331something about the algorithms. 7332 7333Some GMP internals are mentioned, but applications that expect to be 7334compatible with future GMP releases should take care to use only the 7335documented functions. 7336 7337@menu 7338* Multiplication Algorithms:: 7339* Division Algorithms:: 7340* Greatest Common Divisor Algorithms:: 7341* Powering Algorithms:: 7342* Root Extraction Algorithms:: 7343* Radix Conversion Algorithms:: 7344* Other Algorithms:: 7345* Assembly Coding:: 7346@end menu 7347 7348 7349@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms 7350@section Multiplication 7351@cindex Multiplication algorithms 7352 7353N@cross{}N limb multiplications and squares are done using one of seven 7354algorithms, as the size N increases. 7355 7356@quotation 7357@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 7358@item Algorithm @tab Threshold 7359@item Basecase @tab (none) 7360@item Karatsuba @tab @code{MUL_TOOM22_THRESHOLD} 7361@item Toom-3 @tab @code{MUL_TOOM33_THRESHOLD} 7362@item Toom-4 @tab @code{MUL_TOOM44_THRESHOLD} 7363@item Toom-6.5 @tab @code{MUL_TOOM6H_THRESHOLD} 7364@item Toom-8.5 @tab @code{MUL_TOOM8H_THRESHOLD} 7365@item FFT @tab @code{MUL_FFT_THRESHOLD} 7366@end multitable 7367@end quotation 7368 7369Similarly for squaring, with the @code{SQR} thresholds. 7370 7371N@cross{}M multiplications of operands with different sizes above 7372@code{MUL_TOOM22_THRESHOLD} are currently done by special Toom-inspired 7373algorithms or directly with FFT, depending on operand size (@pxref{Unbalanced 7374Multiplication}). 
7375 7376@menu 7377* Basecase Multiplication:: 7378* Karatsuba Multiplication:: 7379* Toom 3-Way Multiplication:: 7380* Toom 4-Way Multiplication:: 7381* Higher degree Toom'n'half:: 7382* FFT Multiplication:: 7383* Other Multiplication:: 7384* Unbalanced Multiplication:: 7385@end menu 7386 7387 7388@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms 7389@subsection Basecase Multiplication 7390 7391Basecase N@cross{}M multiplication is a straightforward rectangular set of 7392cross-products, the same as long multiplication done by hand and for that 7393reason sometimes known as the schoolbook or grammar school method. This is an 7394@m{O(NM),O(N*M)} algorithm. See Knuth section 4.3.1 algorithm M 7395(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code. 7396 7397Assembly implementations of @code{mpn_mul_basecase} are essentially the same 7398as the generic C code, but have all the usual assembly tricks and 7399obscurities introduced for speed. 7400 7401A square can be done in roughly half the time of a multiply, by using the fact 7402that the cross products above and below the diagonal are the same. A triangle 7403of products below the diagonal is formed, doubled (left shift by one bit), and 7404then the products on the diagonal added. This can be seen in 7405@file{mpn/generic/sqr_basecase.c}. Again the assembly implementations take 7406essentially the same approach. 
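
The two schemes can be sketched in portable C as follows.  This is only an
illustration under simplifying assumptions (32-bit limbs with 64-bit
intermediates, and the hypothetical names @code{limb_t}, @code{mul_basecase}
and @code{sqr_basecase}); it is not the code found in @file{mpn/generic}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* illustrative 32-bit limb, not mp_limb_t */

/* rp[0..un+vn-1] = up[0..un-1] * vp[0..vn-1], least significant limb
   first.  rp must not overlap the inputs. */
void mul_basecase(limb_t *rp, const limb_t *up, size_t un,
                  const limb_t *vp, size_t vn)
{
    size_t i, j;
    assert(un >= 1 && vn >= 1);
    for (i = 0; i < un + vn; i++)
        rp[i] = 0;
    for (j = 0; j < vn; j++) {          /* one row of cross-products */
        uint64_t carry = 0;
        for (i = 0; i < un; i++) {
            uint64_t t = (uint64_t) up[i] * vp[j] + rp[i + j] + carry;
            rp[i + j] = (limb_t) t;
            carry = t >> 32;
        }
        rp[j + un] = (limb_t) carry;
    }
}

/* rp[0..2n-1] = up[0..n-1]^2, using the triangle-and-diagonal scheme. */
void sqr_basecase(limb_t *rp, const limb_t *up, size_t n)
{
    size_t i, j;
    uint64_t carry;
    limb_t top = 0;

    assert(n >= 1);
    for (i = 0; i < 2 * n; i++)
        rp[i] = 0;

    /* Triangle of products up[i]*up[j] with i < j. */
    for (j = 1; j < n; j++) {
        carry = 0;
        for (i = 0; i < j; i++) {
            uint64_t t = (uint64_t) up[i] * up[j] + rp[i + j] + carry;
            rp[i + j] = (limb_t) t;
            carry = t >> 32;
        }
        rp[2 * j] = (limb_t) carry;
    }

    /* Double the triangle: left shift by one bit across all limbs. */
    for (i = 0; i < 2 * n; i++) {
        limb_t w = rp[i];
        rp[i] = (limb_t) (w << 1) | top;
        top = w >> 31;
    }

    /* Add the diagonal products up[i]^2 at limb position 2*i. */
    carry = 0;
    for (i = 0; i < n; i++) {
        uint64_t d = (uint64_t) up[i] * up[i];
        uint64_t lo = (uint64_t) rp[2 * i] + (limb_t) d + carry;
        uint64_t hi = (uint64_t) rp[2 * i + 1] + (d >> 32) + (lo >> 32);
        rp[2 * i] = (limb_t) lo;
        rp[2 * i + 1] = (limb_t) hi;
        carry = hi >> 32;
    }
}
```

For an @math{N} limb operand the triangle has @math{N(N-1)/2} products and the
diagonal @math{N}, against @math{N^2} for the general multiply.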
7407 7408@tex 7409\def\GMPline#1#2#3#4#5#6{% 7410 \hbox {% 7411 \vrule height 2.5ex depth 1ex 7412 \hbox to 2em {\hfil{#2}\hfil}% 7413 \vrule \hbox to 2em {\hfil{#3}\hfil}% 7414 \vrule \hbox to 2em {\hfil{#4}\hfil}% 7415 \vrule \hbox to 2em {\hfil{#5}\hfil}% 7416 \vrule \hbox to 2em {\hfil{#6}\hfil}% 7417 \vrule}} 7418\GMPdisplay{ 7419 \hbox{% 7420 \vbox{% 7421 \hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}% 7422 \hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}% 7423 \hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}% 7424 \hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}% 7425 \hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}% 7426 \hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}% 7427 \vfill}% 7428 \vbox{% 7429 \hbox{% 7430 \hbox to 2em {\hfil u0\hfil}% 7431 \hbox to 2em {\hfil u1\hfil}% 7432 \hbox to 2em {\hfil u2\hfil}% 7433 \hbox to 2em {\hfil u3\hfil}% 7434 \hbox to 2em {\hfil u4\hfil}}% 7435 \vskip 0.7ex 7436 \hrule 7437 \GMPline{u0}{d}{}{}{}{}% 7438 \hrule 7439 \GMPline{u1}{}{d}{}{}{}% 7440 \hrule 7441 \GMPline{u2}{}{}{d}{}{}% 7442 \hrule 7443 \GMPline{u3}{}{}{}{d}{}% 7444 \hrule 7445 \GMPline{u4}{}{}{}{}{d}% 7446 \hrule}}} 7447@end tex 7448@ifnottex 7449@example 7450@group 7451 u0 u1 u2 u3 u4 7452 +---+---+---+---+---+ 7453u0 | d | | | | | 7454 +---+---+---+---+---+ 7455u1 | | d | | | | 7456 +---+---+---+---+---+ 7457u2 | | | d | | | 7458 +---+---+---+---+---+ 7459u3 | | | | d | | 7460 +---+---+---+---+---+ 7461u4 | | | | | d | 7462 +---+---+---+---+---+ 7463@end group 7464@end example 7465@end ifnottex 7466 7467In practice squaring isn't a full 2@cross{} faster than multiplying, it's 7468usually around 1.5@cross{}. Less than 1.5@cross{} probably indicates 7469@code{mpn_sqr_basecase} wants improving on that CPU. 7470 7471On some CPUs @code{mpn_mul_basecase} can be faster than the generic C 7472@code{mpn_sqr_basecase} on some small sizes. 
@code{SQR_BASECASE_THRESHOLD} is 7473the size at which to use @code{mpn_sqr_basecase}, this will be zero if that 7474routine should be used always. 7475 7476 7477@node Karatsuba Multiplication, Toom 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms 7478@subsection Karatsuba Multiplication 7479@cindex Karatsuba multiplication 7480 7481The Karatsuba multiplication algorithm is described in Knuth section 4.3.3 7482part A, and various other textbooks. A brief description is given here. 7483 7484The inputs @math{x} and @math{y} are treated as each split into two parts of 7485equal length (or the most significant part one limb shorter if N is odd). 7486 7487@tex 7488% GMPboxwidth used for all the multiplication pictures 7489\global\newdimen\GMPboxwidth \global\GMPboxwidth=5em 7490% GMPboxdepth and GMPboxheight are also used for the float pictures 7491\global\newdimen\GMPboxdepth \global\GMPboxdepth=1ex 7492\global\newdimen\GMPboxheight \global\GMPboxheight=2ex 7493\gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth} 7494\def\GMPbox#1#2{% 7495 \vbox {% 7496 \hrule 7497 \hbox to 2\GMPboxwidth{% 7498 \GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}% 7499 \hrule}} 7500\GMPdisplay{% 7501\vbox{% 7502 \hbox to 2\GMPboxwidth {high \hfil low} 7503 \vskip 0.7ex 7504 \GMPbox{x_1}{x_0} 7505 \vskip 0.5ex 7506 \GMPbox{y_1}{y_0} 7507}} 7508@end tex 7509@ifnottex 7510@example 7511@group 7512 high low 7513+----------+----------+ 7514| x1 | x0 | 7515+----------+----------+ 7516 7517+----------+----------+ 7518| y1 | y0 | 7519+----------+----------+ 7520@end group 7521@end example 7522@end ifnottex 7523 7524Let @math{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is 7525@math{k} limbs (@ms{y,0} the same) then 7526@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. 
7527With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the 7528following holds, 7529 7530@display 7531@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0, 7532 x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0} 7533@end display 7534 7535This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs, 7536whereas a basecase multiply of N@cross{}N limbs is equivalent to four 7537multiplies of (N/2)@cross{}(N/2). The factors @math{(b^2+b)} etc represent 7538the positions where the three products must be added. 7539 7540@tex 7541\def\GMPboxA#1#2{% 7542 \vbox{% 7543 \hrule 7544 \hbox{% 7545 \GMPvrule 7546 \hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}% 7547 \vrule 7548 \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% 7549 \vrule} 7550 \hrule}} 7551\def\GMPboxB#1#2{% 7552 \hbox{% 7553 \raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}% 7554 \vbox{% 7555 \hrule 7556 \hbox{% 7557 \GMPvrule 7558 \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% 7559 \vrule}% 7560 \hrule}}} 7561\GMPdisplay{% 7562\vbox{% 7563 \hbox to 4\GMPboxwidth {high \hfil low} 7564 \vskip 0.7ex 7565 \GMPboxA{x_1y_1}{x_0y_0} 7566 \vskip 0.5ex 7567 \GMPboxB{$+$}{x_1y_1} 7568 \vskip 0.5ex 7569 \GMPboxB{$+$}{x_0y_0} 7570 \vskip 0.5ex 7571 \GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)} 7572}} 7573@end tex 7574@ifnottex 7575@example 7576@group 7577 high low 7578+--------+--------+ +--------+--------+ 7579| x1*y1 | | x0*y0 | 7580+--------+--------+ +--------+--------+ 7581 +--------+--------+ 7582 add | x1*y1 | 7583 +--------+--------+ 7584 +--------+--------+ 7585 add | x0*y0 | 7586 +--------+--------+ 7587 +--------+--------+ 7588 sub | (x1-x0)*(y1-y0) | 7589 +--------+--------+ 7590@end group 7591@end example 7592@end ifnottex 7593 7594The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an 7595absolute value, and the sign used to choose to add or subtract. 
Notice the 7596sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1), 7597high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb 7598additions, rather than @m{6k,6*k}, but in GMP extra function call overheads 7599outweigh the saving. 7600 7601Squaring is similar to multiplying, but with @math{x=y} the formula reduces to 7602an equivalent with three squares, 7603 7604@display 7605@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2, 7606 x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2} 7607@end display 7608 7609The final result is accumulated from those three squares the same way as for 7610the three multiplies above. The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now 7611always positive. 7612 7613A similar formula for both multiplying and squaring can be constructed with a 7614middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}. But those sums can exceed 7615@math{k} limbs, leading to more carry handling and additions than the form 7616above. 7617 7618Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm, 7619the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies 7620each @math{1/2} the size of the inputs. This is a big improvement over the 7621basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra 7622additions Karatsuba performs. @code{MUL_TOOM22_THRESHOLD} can be as little 7623as 10 limbs. The @code{SQR} threshold is usually about twice the @code{MUL}. 7624 7625The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c, 7626M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN + 7627e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 + 7628{3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}. The 7629factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the 7630basecase code will increase the threshold since they benefit @math{M(N)} more 7631than @math{K(N)}. 
And conversely the @m{3\over2, 3/2} for @math{b} means
linear style speedups of @math{b} will decrease the threshold since they
benefit @math{K(N)} more than @math{M(N)}.  The latter can be seen for
instance when adding an optimized @code{mpn_sqr_diagonal} to
@code{mpn_sqr_basecase}.  Of course all speedups reduce total time, and in
that sense the algorithm thresholds are merely of academic interest.


@node Toom 3-Way Multiplication, Toom 4-Way Multiplication, Karatsuba Multiplication, Multiplication Algorithms
@subsection Toom 3-Way Multiplication
@cindex Toom multiplication

The Karatsuba formula is the simplest case of a general approach to splitting
inputs that leads to both Toom and FFT algorithms.  A description of
Toom can be found in Knuth section 4.3.3, with an example 3-way
calculation after Theorem A@.  The 3-way form used in GMP is described here.

The operands are each considered split into 3 pieces of equal length (or the
most significant part 1 or 2 limbs shorter than the other two).
7650 7651@tex 7652\def\GMPbox#1#2#3{% 7653 \vbox{% 7654 \hrule \vfil 7655 \hbox to 3\GMPboxwidth {% 7656 \GMPvrule 7657 \hfil$#1$\hfil 7658 \vrule 7659 \hfil$#2$\hfil 7660 \vrule 7661 \hfil$#3$\hfil 7662 \vrule}% 7663 \vfil \hrule 7664}} 7665\GMPdisplay{% 7666\vbox{% 7667 \hbox to 3\GMPboxwidth {high \hfil low} 7668 \vskip 0.7ex 7669 \GMPbox{x_2}{x_1}{x_0} 7670 \vskip 0.5ex 7671 \GMPbox{y_2}{y_1}{y_0} 7672 \vskip 0.5ex 7673}} 7674@end tex 7675@ifnottex 7676@example 7677@group 7678 high low 7679+----------+----------+----------+ 7680| x2 | x1 | x0 | 7681+----------+----------+----------+ 7682 7683+----------+----------+----------+ 7684| y2 | y1 | y0 | 7685+----------+----------+----------+ 7686@end group 7687@end example 7688@end ifnottex 7689 7690@noindent 7691These parts are treated as the coefficients of two polynomials 7692 7693@display 7694@group 7695@m{X(t) = x_2t^2 + x_1t + x_0, 7696 X(t) = x2*t^2 + x1*t + x0} 7697@m{Y(t) = y_2t^2 + y_1t + y_0, 7698 Y(t) = y2*t^2 + y1*t + y0} 7699@end group 7700@end display 7701 7702Let @math{b} equal the power of 2 which is the size of the @ms{x,0}, @ms{x,1}, 7703@ms{y,0} and @ms{y,1} pieces, i.e.@: if they're @math{k} limbs each then 7704@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. 7705With this @math{x=X(b)} and @math{y=Y(b)}. 7706 7707Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients 7708are 7709 7710@display 7711@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0, 7712 W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0} 7713@end display 7714 7715The @m{w_i,w[i]} are going to be determined, and when they are they'll give 7716the final result using @math{w=W(b)}, since 7717@m{xy=X(b)Y(b),x*y=X(b)*Y(b)=W(b)}. 
The coefficients will be roughly 7718@math{b^2} each, and the final @math{W(b)} will be an addition like, 7719 7720@tex 7721\def\GMPbox#1#2{% 7722 \moveright #1\GMPboxwidth 7723 \vbox{% 7724 \hrule 7725 \hbox{% 7726 \GMPvrule 7727 \hbox to 2\GMPboxwidth {\hfil$#2$\hfil}% 7728 \vrule}% 7729 \hrule 7730}} 7731\GMPdisplay{% 7732\vbox{% 7733 \hbox to 6\GMPboxwidth {high \hfil low}% 7734 \vskip 0.7ex 7735 \GMPbox{0}{w_4} 7736 \vskip 0.5ex 7737 \GMPbox{1}{w_3} 7738 \vskip 0.5ex 7739 \GMPbox{2}{w_2} 7740 \vskip 0.5ex 7741 \GMPbox{3}{w_1} 7742 \vskip 0.5ex 7743 \GMPbox{4}{w_0} 7744}} 7745@end tex 7746@ifnottex 7747@example 7748@group 7749 high low 7750+-------+-------+ 7751| w4 | 7752+-------+-------+ 7753 +--------+-------+ 7754 | w3 | 7755 +--------+-------+ 7756 +--------+-------+ 7757 | w2 | 7758 +--------+-------+ 7759 +--------+-------+ 7760 | w1 | 7761 +--------+-------+ 7762 +-------+-------+ 7763 | w0 | 7764 +-------+-------+ 7765@end group 7766@end example 7767@end ifnottex 7768 7769The @m{w_i,w[i]} coefficients could be formed by a simple set of cross 7770products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2}, 7771@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all 7772nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely 7773to a basecase multiply. Instead the following approach is used. 7774 7775@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving 7776values of @math{W(t)} at those points. 
In GMP the following points are used, 7777 7778@quotation 7779@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 7780@item Point @tab Value 7781@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately 7782@item @math{t=1} @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0) * (y2+y1+y0)} 7783@item @math{t=-1} @tab @m{(x_2-x_1+x_0)(y_2-y_1+y_0),(x2-x1+x0) * (y2-y1+y0)} 7784@item @math{t=2} @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0) * (4*y2+2*y1+y0)} 7785@item @m{t=\infty,t=inf} @tab @m{x_2y_2,x2 * y2}, which gives @ms{w,4} immediately 7786@end multitable 7787@end quotation 7788 7789At @math{t=-1} the values can be negative and that's handled using the 7790absolute values and tracking the sign separately. At @m{t=\infty,t=inf} the 7791value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over t^4}, X(t)*Y(t)/t^4 in 7792the limit as t approaches infinity}, but it's much easier to think of as 7793simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately (much like 7794@m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately). 7795 7796Each of the points substituted into 7797@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination 7798of the @m{w_i,w[i]} coefficients, and the value of those combinations has just 7799been calculated. 
7800 7801@tex 7802\GMPdisplay{% 7803$\matrix{% 7804W(0) & = & & & & & & & & & w_0 \cr 7805W(1) & = & w_4 & + & w_3 & + & w_2 & + & w_1 & + & w_0 \cr 7806W(-1) & = & w_4 & - & w_3 & + & w_2 & - & w_1 & + & w_0 \cr 7807W(2) & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr 7808W(\infty) & = & w_4 \cr 7809}$} 7810@end tex 7811@ifnottex 7812@example 7813@group 7814W(0) = w0 7815W(1) = w4 + w3 + w2 + w1 + w0 7816W(-1) = w4 - w3 + w2 - w1 + w0 7817W(2) = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0 7818W(inf) = w4 7819@end group 7820@end example 7821@end ifnottex 7822 7823This is a set of five equations in five unknowns, and some elementary linear 7824algebra quickly isolates each @m{w_i,w[i]}. This involves adding or 7825subtracting one @math{W(t)} value from another, and a couple of divisions by 7826powers of 2 and one division by 3, the latter using the special 7827@code{mpn_divexact_by3} (@pxref{Exact Division}). 7828 7829The conversion of @math{W(t)} values to the coefficients is interpolation. A 7830polynomial of degree 4 like @math{W(t)} is uniquely determined by values known 7831at 5 different points. The points are arbitrary and can be chosen to make the 7832linear equations come out with a convenient set of steps for quickly isolating 7833the @m{w_i,w[i]}. 7834 7835Squaring follows the same procedure as multiplication, but there's only one 7836@math{X(t)} and it's evaluated at the 5 points, and those values squared to 7837give values of @math{W(t)}. The interpolation is then identical, and in fact 7838the same @code{toom_interpolate_5pts} subroutine is used for both squaring and 7839multiplying. 7840 7841Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being 7842@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the 7843original size each. This is an improvement over Karatsuba at 7844@math{O(N^@W{1.585})}, though Toom does more work in the evaluation and 7845interpolation and so it only realizes its advantage above a certain size. 
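
As a rough illustration of the evaluate, multiply and interpolate steps
described above, the following Python sketch carries out a single level of
Toom-3 on ordinary integers.  It is not GMP code: the name @code{toom3_mul},
the piece size @code{k} (given in bits rather than limbs) and the particular
order of the interpolation steps are assumptions made for this example, and
the five pointwise products simply use Python's built-in multiply where a
real implementation would recurse.

```python
def toom3_mul(x, y, k):
    """Multiply non-negative integers x and y using one level of Toom-3.

    The operands are split into three pieces of k bits each (the top piece
    takes whatever bits remain), so b = 2**k in the notation of the text.
    """
    mask = (1 << k) - 1
    x0, x1, x2 = x & mask, (x >> k) & mask, x >> (2 * k)
    y0, y1, y2 = y & mask, (y >> k) & mask, y >> (2 * k)

    # Evaluate X(t) and Y(t) at t = 0, 1, -1, 2, inf and multiply pointwise.
    r0 = x0 * y0
    r1 = (x2 + x1 + x0) * (y2 + y1 + y0)
    rm1 = (x2 - x1 + x0) * (y2 - y1 + y0)   # may be negative
    r2 = (4 * x2 + 2 * x1 + x0) * (4 * y2 + 2 * y1 + y0)
    rinf = x2 * y2

    # Interpolate.  Every division below is exact.
    w0 = r0
    w4 = rinf
    w2 = (r1 + rm1) // 2 - w0 - w4          # from the even coefficients
    t = (r1 - rm1) // 2                     # equals w3 + w1
    u = (r2 - w0 - 4 * w2 - 16 * w4) // 2   # equals 4*w3 + w1
    w3 = (u - t) // 3                       # the single division by 3
    w1 = t - w3

    # Recombine W(b) = w4*b^4 + w3*b^3 + w2*b^2 + w1*b + w0 by Horner's rule.
    result = w4
    for w in (w3, w2, w1, w0):
        result = (result << k) + w
    return result
```

For instance @code{toom3_mul(123456789, 987654321, 10)} should agree with the
ordinary product of the two operands.  The value at @math{t=-1} is kept as a
signed Python integer here, whereas the text above describes tracking the
sign separately from an absolute value.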
7846 7847Near the crossover between Toom-3 and Karatsuba there's generally a range of 7848sizes where the difference between the two is small. 7849@code{MUL_TOOM33_THRESHOLD} is a somewhat arbitrary point in that range and 7850successive runs of the tune program can give different values due to small 7851variations in measuring. A graph of time versus size for the two shows the 7852effect, see @file{tune/README}. 7853 7854At the fairly small sizes where the Toom-3 thresholds occur it's worth 7855remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be 7856expected to make accurate predictions, due of course to the big influence of 7857all sorts of overheads, and the fact that only a few recursions of each are 7858being performed. Even at large sizes there's a good chance machine dependent 7859effects like cache architecture will mean actual performance deviates from 7860what might be predicted. 7861 7862The formula given for the Karatsuba algorithm (@pxref{Karatsuba 7863Multiplication}) has an equivalent for Toom-3 involving only five multiplies, 7864but this would be complicated and unenlightening. 7865 7866An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using 7867a vector to represent the @math{x} and @math{y} splits and a matrix 7868multiplication for the evaluation and interpolation stages. The matrix 7869inverses are not meant to be actually used, and they have elements with values 7870much greater than in fact arise in the interpolation steps. The diagram shown 7871for the 3-way is attractive, but again doesn't have to be implemented that way 7872and for example with a bit of rearrangement just one division by 6 can be 7873done. 


@node Toom 4-Way Multiplication, Higher degree Toom'n'half, Toom 3-Way Multiplication, Multiplication Algorithms
@subsection Toom 4-Way Multiplication
@cindex Toom multiplication

Karatsuba and Toom-3 split the operands into 2 and 3 coefficients,
respectively.  Toom-4 analogously splits the operands into 4 coefficients.
Using the notation from the section on Toom-3 multiplication, we form two
polynomials:

@display
@group
@m{X(t) = x_3t^3 + x_2t^2 + x_1t + x_0,
   X(t) = x3*t^3 + x2*t^2 + x1*t + x0}
@m{Y(t) = y_3t^3 + y_2t^2 + y_1t + y_0,
   Y(t) = y3*t^3 + y2*t^2 + y1*t + y0}
@end group
@end display

@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 7 points, giving
values of @math{W(t)} at those points.  In GMP the following points are used,

@quotation
@multitable {@m{t=-1/2,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item Point @tab Value
@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately
@item @math{t=1/2} @tab @m{(x_3+2x_2+4x_1+8x_0)(y_3+2y_2+4y_1+8y_0),(x3+2*x2+4*x1+8*x0) * (y3+2*y2+4*y1+8*y0)}
@item @math{t=-1/2} @tab @m{(-x_3+2x_2-4x_1+8x_0)(-y_3+2y_2-4y_1+8y_0),(-x3+2*x2-4*x1+8*x0) * (-y3+2*y2-4*y1+8*y0)}
@item @math{t=1} @tab @m{(x_3+x_2+x_1+x_0)(y_3+y_2+y_1+y_0),(x3+x2+x1+x0) * (y3+y2+y1+y0)}
@item @math{t=-1} @tab @m{(-x_3+x_2-x_1+x_0)(-y_3+y_2-y_1+y_0),(-x3+x2-x1+x0) * (-y3+y2-y1+y0)}
@item @math{t=2} @tab @m{(8x_3+4x_2+2x_1+x_0)(8y_3+4y_2+2y_1+y_0),(8*x3+4*x2+2*x1+x0) * (8*y3+4*y2+2*y1+y0)}
@item @m{t=\infty,t=inf} @tab @m{x_3y_3,x3 * y3}, which gives @ms{w,6} immediately
@end multitable
@end quotation

The number of additions and subtractions for Toom-4 is much larger than for
Toom-3.  But several subexpressions occur multiple times, for example
@m{x_2+x_0,x2+x0} occurs for both @math{t=1} and @math{t=-1}.
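
The sharing of subexpressions between paired points can be sketched as
follows.  This Python fragment is only an illustration with hypothetical
function names, not code from the GMP sources.  It forms the values at
@math{t=1} and @math{t=-1}, and at the scaled points @math{t=1/2} and
@math{t=-1/2} as they appear in the table, from one even part and one odd
part each.

```python
def toom4_eval_pm1(x0, x1, x2, x3):
    """Return (X(1), X(-1)) for X(t) = x3*t^3 + x2*t^2 + x1*t + x0."""
    even = x2 + x0          # shared by both points
    odd = x3 + x1           # shared by both points
    return even + odd, even - odd


def toom4_eval_pmhalf(x0, x1, x2, x3):
    """Return (8*X(1/2), 8*X(-1/2)), the scaled values used in the table."""
    even = 2 * x2 + 8 * x0  # shared by both points
    odd = x3 + 4 * x1       # shared by both points
    return even + odd, even - odd
```

Each pair of points then costs two additions for the shared parts plus one
addition and one subtraction, rather than three additions or subtractions
per point.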

Toom-4 is asymptotically @math{O(N^@W{1.404})}, the exponent being
@m{\log7/\log4,log(7)/log(4)}, representing 7 recursive multiplies of 1/4 the
original size each.


@node Higher degree Toom'n'half, FFT Multiplication, Toom 4-Way Multiplication, Multiplication Algorithms
@subsection Higher degree Toom'n'half
@cindex Toom multiplication

The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary
number of pieces.  In general a split of two equally long operands into
@math{r} pieces leads to evaluations and pointwise multiplications done at
@m{2r-1,2*r-1} points.  To fully exploit symmetries it would be better to have
a multiple of 4 points, which is why Toom'n'half is used for higher degrees.

Toom'n'half means that the existence of one more piece is considered for a
single operand.  It can be virtual, i.e.@: zero, or real, when the two operands
are not exactly balanced.  By choosing an even @math{r},
Toom-@m{r{1\over2},r+1/2} requires @math{2r} points, a multiple of four.

The four-plets of points include 0, @m{\infty,inf}, +1, -1 and
@m{\pm2^i,+-2^i}, @m{\pm2^{-i},+-2^-i}.  Each of them gives shortcuts for the
evaluation phase and for some steps in the interpolation phase.  Further tricks
are used to reduce the memory footprint of the whole multiplication algorithm
to a memory buffer equal in size to the result of the product.

Current GMP uses both Toom-6'n'half and Toom-8'n'half.


@node FFT Multiplication, Other Multiplication, Higher degree Toom'n'half, Multiplication Algorithms
@subsection FFT Multiplication
@cindex FFT multiplication
@cindex Fast Fourier Transform

At large to very large sizes a Fermat style FFT multiplication is used,
following Sch@"onhage and Strassen (@pxref{References}).
Descriptions of FFTs 7951in various forms can be found in many textbooks, for instance Knuth section 79524.3.3 part C or Lipson chapter IX@. A brief description of the form used in 7953GMP is given here. 7954 7955The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given 7956@math{N}. A full product @m{xy,x*y} is obtained by choosing @m{N \ge 7957\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding 7958@math{x} and @math{y} with high zero limbs. The modular product is the native 7959form for the algorithm, so padding to get a full product is unavoidable. 7960 7961The algorithm follows a split, evaluate, pointwise multiply, interpolate and 7962combine similar to that described above for Karatsuba and Toom-3. A @math{k} 7963parameter controls the split, with an FFT-@math{k} splitting into @math{2^k} 7964pieces of @math{M=N/2^k} bits each. @math{N} must be a multiple of 7965@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so 7966the split falls on limb boundaries, avoiding bit shifts in the split and 7967combine stages. 7968 7969The evaluations, pointwise multiplications, and interpolation, are all done 7970modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a 7971multiple of @math{2^k} and of @code{mp_bits_per_limb}. The results of 7972interpolation will be the following negacyclic convolution of the input 7973pieces, and the choice of @math{N'} ensures these sums aren't truncated. 7974@tex 7975$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$ 7976@end tex 7977@ifnottex 7978 7979@example 7980 --- 7981 \ b 7982w[n] = / (-1) * x[i] * y[j] 7983 --- 7984 i+j==b*2^k+n 7985 b=0,1 7986@end example 7987 7988@end ifnottex 7989The points used for the evaluation are @math{g^i} for @math{i=0} to 7990@math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}. 
@math{g} is a 7991@m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary 7992cancellations at the interpolation stage, and it's also a power of 2 so the 7993fast Fourier transforms used for the evaluation and interpolation do only 7994shifts, adds and negations. 7995 7996The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either 7997recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or 7998basecase), whichever is optimal at the size @math{N'}. The interpolation is 7999an inverse fast Fourier transform. The resulting set of sums of @m{x_iy_j, 8000x[i]*y[j]} are added at appropriate offsets to give the final result. 8001 8002Squaring is the same, but @math{x} is the only input so it's one transform at 8003the evaluate stage and the pointwise multiplies are squares. The 8004interpolation is the same. 8005 8006For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}), 8007O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed 8008modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original. 8009Each successive @math{k} is an asymptotic improvement, but overheads mean each 8010is only faster at bigger and bigger sizes. In the code, @code{MUL_FFT_TABLE} 8011and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used. Each 8012new @math{k} effectively swaps some multiplying for some shifts, adds and 8013overheads. 8014 8015A mod @math{2^N+1} product can be formed with a normal 8016@math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT 8017and Toom-3 etc can be compared directly. A @math{k=4} FFT at 8018@math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at 8019@math{O(N^@W{1.465})}. In practice this is what's found, with 8020@code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between 8021300 and 1000 limbs, depending on the CPU@. 
So far it's been found that only 8022very large FFTs recurse into pointwise multiplies above these sizes. 8023 8024When an FFT is to give a full product, the change of @math{N} to @math{2N} 8025doesn't alter the theoretical complexity for a given @math{k}, but for the 8026purposes of considering where an FFT might be first used it can be assumed 8027that the FFT is recursing into a normal multiply and that on that basis it's 8028doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of 8029the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}. This would mean 8030@math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3. 8031In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been 8032found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs. 8033 8034The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is 8035rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that 8036when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a 8037multiple of @m{2^{2k-1},2^(2k-1)} bits. The @math{+k+3} means some values of 8038@math{N} just under such a multiple will be rounded to the next. The 8039complexity calculations above assume that a favourable size is used, meaning 8040one which isn't padded through rounding, and it's also assumed that the extra 8041@math{+k+3} bits are negligible at typical FFT sizes. 8042 8043The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a 8044step-effect into measured speeds. For example @math{k=8} will round @math{N} 8045up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb 8046groups of sizes for which @code{mpn_mul_n} runs at the same speed. Or for 8047@math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc. 
In practice it's been found each @math{k} is used at quite small multiples of
its size constraint and so the step effect is quite noticeable in a time
versus size graph.

The threshold determinations currently measure at the mid-points of size
steps, but this is sub-optimal since at the start of a new step it can happen
that it's better to go back to the previous @math{k} for a while.  Something
more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be
needed.


@node Other Multiplication, Unbalanced Multiplication, FFT Multiplication, Multiplication Algorithms
@subsection Other Multiplication
@cindex Toom multiplication

The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary
number of pieces, as per Knuth section 4.3.3 algorithm C@.  This is not
currently used.  The notes here are merely for interest.

In general a split into @math{r+1} pieces is made, and evaluations and
pointwise multiplications done at @m{2r+1,2*r+1} points.  A 4-way split does 7
pointwise multiplies, 5-way does 9, etc.  Asymptotically an @math{(r+1)}-way
algorithm is @m{O(N^{\log(2r+1)/\log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}.  Only
the pointwise multiplications count towards big-@math{O} complexity, but the
time spent in the evaluate and interpolate stages grows with @math{r} and has
a significant practical impact, with the asymptotic advantage of each @math{r}
realized only at bigger and bigger sizes.  The overheads grow as
@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log
r), O(N*log(r))}.
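
The exponents quoted for the various splits can be checked numerically.  The
short Python sketch below, using a hypothetical function name, just evaluates
log(2r+1)/log(r+1) for a few values of @math{r}.  It says nothing about the
evaluation and interpolation overheads, which are what decide where each
split becomes worthwhile in practice.

```python
import math


def toom_exponent(r):
    """Exponent of N for an (r+1)-way Toom split with 2*r+1 pointwise products."""
    return math.log(2 * r + 1) / math.log(r + 1)


# r = 1 is Karatsuba, r = 2 is Toom-3, r = 3 is Toom-4.
for r in range(1, 8):
    print(r + 1, "way:", round(toom_exponent(r), 3))
```

The first three values round to 1.585, 1.465 and 1.404, matching the figures
given for Karatsuba, Toom-3 and Toom-4 in the preceding sections.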

Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small
multiplies in the evaluate stage (or rather trades them for additions), and
has a further saving of nearly half the interpolate steps.  The idea is to
separate odd and even final coefficients and then perform algorithm C steps C7
and C8 on them separately.  The divisors at step C7 become @math{j^2} and the
multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.

Splitting odd and even parts through positive and negative points can be
thought of as using @math{-1} as a square root of unity.  If a 4th root of
unity were available then a further split and speedup would be possible, but no
such root exists for plain integers.  Going to complex integers with
@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian
form it takes three real multiplies to do a complex multiply.  The existence
of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.

Floating point FFTs use complex numbers approximating Nth roots of unity.
Some processors have special support for such FFTs.  But these are not used in
GMP since it's very difficult to guarantee an exact result (to some number of
bits).  An occasional difference of 1 in the last bit might not matter to a
typical signal processing algorithm, but is of course of vital importance to
GMP.


@node Unbalanced Multiplication, , Other Multiplication, Multiplication Algorithms
@subsection Unbalanced Multiplication
@cindex Unbalanced multiplication

Multiplication of operands with different sizes, both below
@code{MUL_TOOM22_THRESHOLD}, is done with plain schoolbook multiplication
(@pxref{Basecase Multiplication}).

For really large operands, we invoke FFT directly.

For operands between these sizes, we use Toom-inspired algorithms suggested by
Alberto Zanoni and Marco Bodrato.  The idea is to split the operands into
polynomials of different degree.  GMP currently splits the smaller operand
into 2 coefficients, i.e., a polynomial of degree 1, but the larger operand
can be split into 2, 3, or 4 coefficients, i.e., a polynomial of degree 1 to
3.

@c FIXME: This is mighty ugly, but a cleaner @need triggers texinfo bugs that
@c screws up layout here and there in the rest of the manual.
@c @tex
@c \goodbreak
@c @end tex
@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
@section Division Algorithms
@cindex Division algorithms

@menu
* Single Limb Division::
* Basecase Division::
* Divide and Conquer Division::
* Block-Wise Barrett Division::
* Exact Division::
* Exact Remainder::
* Small Quotient Division::
@end menu


@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
@subsection Single Limb Division

N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
high to low, either with a hardware divide instruction or a multiplication by
inverse, whichever is best on a given CPU@.

The multiply by inverse follows ``Improved division by invariant integers'' by
M@"oller and Granlund (@pxref{References}) and is implemented as
@code{udiv_qrnnd_preinv} in @file{gmp-impl.h}.  The idea is to have a
fixed-point approximation to @math{1/d} (see @code{invert_limb}) and then
multiply by the high limb (plus one bit) of the dividend to get a quotient
@math{q}.  With @math{d} normalized (high bit set), @math{q} is no more than 1
too small.
Subtracting @m{qd,q*d} from the dividend gives a remainder, and 8155reveals whether @math{q} or @math{q-1} is correct. 8156 8157The result is a division done with two multiplications and four or five 8158arithmetic operations. On CPUs with low latency multipliers this can be much 8159faster than a hardware divide, though the cost of calculating the inverse at 8160the start may mean it's only better on inputs bigger than say 4 or 5 limbs. 8161 8162When a divisor must be normalized, either for the generic C 8163@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is 8164actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and 8165@math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set. 8166The bit shifts for the dividend are usually accomplished ``on the fly'' 8167meaning by extracting the appropriate bits at each step. Done this way the 8168quotient limbs come out aligned ready to store. When only the remainder is 8169wanted, an alternative is to take the dividend limbs unshifted and calculate 8170@m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k 8171\bmod d2^k, r*2^k mod d*2^k}. This can help on CPUs with poor bit shifts or 8172few registers. 8173 8174The multiply by inverse can be done two limbs at a time. The calculation is 8175basically the same, but the inverse is two limbs and the divisor treated as if 8176padded with a low zero limb. This means more work, since the inverse will 8177need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are 8178independent and can therefore be done partly or wholly in parallel. Likewise 8179for a 2@cross{}1 calculating @m{qd,q*d}. The net effect is to process two 8180limbs with roughly the same two multiplies worth of latency that one limb at a 8181time gives. This extends to 3 or 4 limbs at a time, though the extra work to 8182apply the inverse will almost certainly soon reach the limits of multiplier 8183throughput. 
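
The following Python sketch illustrates an N@cross{}1 division built from
repeated 2@cross{}1 steps with a precomputed inverse, one limb at a time.  It
follows the general shape of the published multiply-by-inverse method as
understood here and is not the actual @code{udiv_qrnnd_preinv} macro.  The
function names, the @code{bits} parameter and the use of Python integers for
limbs are all assumptions of the example.  The divisor is assumed to be
normalized already, so the shifting discussed above is left out.

```python
def invert_limb_sketch(d, bits):
    """Fixed-point inverse floor((B*B - 1) / d) - B for normalized d, B = 2**bits."""
    B = 1 << bits
    assert B // 2 <= d < B, "divisor must have its high bit set"
    return (B * B - 1) // d - B


def div_2by1_preinv(u1, u0, d, v, bits):
    """Divide the two-limb value u1*B + u0 by d, given u1 < d and inverse v."""
    mask = (1 << bits) - 1
    # First multiply: candidate quotient from the inverse and the high limb.
    t = v * u1 + (u1 << bits) + u0
    q1, q0 = (t >> bits) & mask, t & mask
    q1 = (q1 + 1) & mask
    # Second multiply: candidate remainder, computed modulo B.
    r = (u0 - q1 * d) & mask
    if r > q0:              # candidate quotient was one too big
        q1 = (q1 - 1) & mask
        r = (r + d) & mask
    if r >= d:              # rare: candidate quotient was one too small
        q1 += 1
        r -= d
    return q1, r


def divrem_1(limbs, d, bits):
    """Divide a number given as limbs (most significant first) by d.

    Returns the quotient limbs, most significant first, and the remainder.
    """
    v = invert_limb_sketch(d, bits)
    r = 0
    quotient = []
    for limb in limbs:      # high to low; r < d holds before each step
        q, r = div_2by1_preinv(r, limb, d, v, bits)
        quotient.append(q)
    return quotient, r
```

The inverse is computed once in @code{divrem_1} and reused for every limb,
which is where the saving over a hardware divide per limb would come from.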
8184 8185A similar approach in reverse can be taken to process just half a limb at a 8186time if the divisor is only a half limb. In this case the 1@cross{}1 multiply 8187for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each 8188limb, which can be a saving on CPUs with a fast half limb multiply, or in fact 8189if the only multiply is a half limb, and especially if it's not pipelined. 8190 8191 8192@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms 8193@subsection Basecase Division 8194 8195Basecase N@cross{}M division is like long division done by hand, but in base 8196@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}. See Knuth 8197section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}. 8198 8199Briefly stated, while the dividend remains larger than the divisor, a high 8200quotient limb is formed and the N@cross{}1 product @m{qd,q*d} subtracted at 8201the top end of the dividend. With a normalized divisor (most significant bit 8202set), each quotient limb can be formed with a 2@cross{}1 division and a 82031@cross{}1 multiplication plus some subtractions. The 2@cross{}1 division is 8204by the high limb of the divisor and is done either with a hardware divide or a 8205multiply by inverse (the same as in @ref{Single Limb Division}) whichever is 8206faster. Such a quotient is sometimes one too big, requiring an addback of the 8207divisor, but that happens rarely. 8208 8209With Q=N@minus{}M being the number of quotient limbs, this is an 8210@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase 8211Q@cross{}M multiplication, differing in fact only in the extra multiply and 8212divide for each of the Q quotient limbs. 
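
A much simplified model of the schoolbook procedure is sketched below in
Python.  It is an assumption-laden illustration rather than the GMP routine:
the names are hypothetical, the running remainder is held in a single Python
integer instead of a limb array, and each quotient limb is estimated from a
plain 2@cross{}1 division by the high limb of the divisor and then corrected
downwards, which may take more than one step with this cruder estimate.

```python
def limb_count(x, bits):
    """Number of limbs of the given size needed to hold x (0 for x == 0)."""
    return (x.bit_length() + bits - 1) // bits


def basecase_divrem(a, d, bits):
    """Schoolbook division of a by a normalized divisor d, one limb at a time."""
    b = 1 << bits
    m = limb_count(d, bits)
    assert m > 0 and d >> (m * bits - 1) == 1, "high bit of divisor must be set"
    d_high = d >> ((m - 1) * bits)
    n = max(limb_count(a, bits), m)

    q = 0
    r = a
    for i in range(n - m, -1, -1):          # quotient limbs, high to low
        # Estimate from the top limbs of r and the high limb of d.
        top = r >> ((i + m - 1) * bits)
        qi = min(top // d_high, b - 1)
        prod = (qi * d) << (i * bits)
        while prod > r:                     # estimate too big: correct it
            qi -= 1
            prod -= d << (i * bits)
        r -= prod
        q += qi << (i * bits)
    return q, r
```

The estimate is never too small because the high limb of the divisor
underestimates the divisor, so only downward corrections are needed.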
8213 8214 8215@node Divide and Conquer Division, Block-Wise Barrett Division, Basecase Division, Division Algorithms 8216@subsection Divide and Conquer Division 8217 8218For divisors larger than @code{DC_DIV_QR_THRESHOLD}, division is done by dividing. 8219Or to be precise by a recursive divide and conquer algorithm based on work by 8220Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}). 8221 8222The algorithm consists essentially of recognising that a 2N@cross{}N division 8223can be done with the basecase division algorithm (@pxref{Basecase Division}), 8224but using N/2 limbs as a base, not just a single limb. This way the 8225multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of 8226Karatsuba and higher multiplication algorithms (@pxref{Multiplication 8227Algorithms}). The two ``digits'' of the quotient are formed by recursive 8228N@cross{}(N/2) divisions. 8229 8230If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication 8231then the work is about the same as a basecase division, but with more function 8232call overheads and with some subtractions separated from the multiplies. 8233These overheads mean that it's only when N/2 is above 8234@code{MUL_TOOM22_THRESHOLD} that divide and conquer is of use. 8235 8236@code{DC_DIV_QR_THRESHOLD} is based on the divisor size N, so it will be somewhere 8237above twice @code{MUL_TOOM22_THRESHOLD}, but how much above depends on the 8238CPU@. An optimized @code{mpn_mul_basecase} can lower @code{DC_DIV_QR_THRESHOLD} a 8239little by offering a ready-made advantage over repeated @code{mpn_submul_1} 8240calls. 8241 8242Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where 8243@math{M(N)} is the time for an N@cross{}N multiplication done with FFTs. The 8244actual time is a sum over multiplications of the recursed sizes, as can be 8245seen near the end of section 2.2 of Burnikel and Ziegler. 
For example, within 8246the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}. With higher 8247algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log 8248N, log(N)}. In practice, at moderate to large sizes, a 2N@cross{}N division 8249is about 2 to 4 times slower than an N@cross{}N multiplication. 8250 8251 8252@node Block-Wise Barrett Division, Exact Division, Divide and Conquer Division, Division Algorithms 8253@subsection Block-Wise Barrett Division 8254 8255For the largest divisions, a block-wise Barrett division algorithm is used. 8256Here, the divisor is inverted to a precision determined by the relative size of 8257the dividend and divisor. Blocks of quotient limbs are then generated by 8258multiplying blocks from the dividend by the inverse. 8259 8260Our block-wise algorithm computes a smaller inverse than in the plain Barrett 8261algorithm. For a @math{2n/n} division, the inverse will be just @m{\lceil n/2 8262\rceil, ceil(n/2)} limbs. 8263 8264 8265@node Exact Division, Exact Remainder, Block-Wise Barrett Division, Division Algorithms 8266@subsection Exact Division 8267 8268 8269A so-called exact division is when the dividend is known to be an exact 8270multiple of the divisor. Jebelean's exact division algorithm uses this 8271knowledge to make some significant optimizations (@pxref{References}). 8272 8273The idea can be illustrated in decimal for example with 368154 divided by 8274543. Because the low digit of the dividend is 4, the low digit of the 8275quotient must be 8. This is arrived at from @m{4 \mathord{\times} 7 \bmod 10, 82764*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of 8277the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7 8278@equiv{} 1 mod 10}. So @m{8\mathord{\times}543 = 4344,8*543=4344} can be 8279subtracted from the dividend leaving 363810. Notice the low digit has become 8280zero. 

The procedure is repeated at the second digit, with the next quotient digit 7
(@m{1 \mathord{\times} 7 \bmod 10, 1*7 mod 10}), subtracting
@m{7\mathord{\times}543 = 3801,7*543=3801} (shifted up one digit), leaving
325800. And finally at the third digit with quotient digit 6
(@m{8 \mathord{\times} 7 \bmod 10, 8*7 mod 10}), subtracting
@m{6\mathord{\times}543 = 3258,6*543=3258} (shifted up two digits) leaving 0.
So the quotient is 678.

Notice however that the multiplies and subtractions don't need to extend past
the low three digits of the dividend, since that's enough to determine the
three quotient digits. For the last quotient digit no subtraction is needed
at all. On a 2N@cross{}N division like this one, only about half the work of
a normal basecase division is necessary.

For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
saving over a normal basecase division is in two parts. Firstly, each of the
Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
multiply. Secondly, the crossproducts are reduced when @math{Q>M} to
@m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2,
Q*(Q-1)/2}. Notice the savings are complementary. If Q is big then many
divisions are saved, or if Q is small then the crossproducts reduce to a small
number.

The modular inverse used is calculated efficiently by @code{binvert_limb} in
@file{gmp-impl.h}. This does four multiplies for a 32-bit limb, or six for a
64-bit limb. @file{tune/modlinv.c} has some alternate implementations that
might suit processors better at bit twiddling than multiplying.

The sub-quadratic exact division described by Jebelean in ``Exact Division
with Karatsuba Complexity'' is not currently implemented. It uses a
rearrangement similar to the divide and conquer for normal division
(@pxref{Divide and Conquer Division}), but operating from low to high.
A 8313further possibility not currently implemented is ``Bidirectional Exact Integer 8314Division'' by Krandick and Jebelean which forms quotient limbs from both the 8315high and low ends of the dividend, and can halve once more the number of 8316crossproducts needed in a 2N@cross{}N division. 8317 8318A special case exact division by 3 exists in @code{mpn_divexact_by3}, 8319supporting Toom-3 multiplication and @code{mpq} canonicalizations. It forms 8320quotient digits with a multiply by the modular inverse of 3 (which is 8321@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next 8322limb. The multiplications don't need to be on the dependent chain, as long as 8323the effect of the borrows is applied, which can help chips with pipelined 8324multipliers. 8325 8326 8327@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms 8328@subsection Exact Remainder 8329@cindex Exact remainder 8330 8331If the exact division algorithm is done with a full subtraction at each stage 8332and the dividend isn't a multiple of the divisor, then low zero limbs are 8333produced but with a remainder in the high limbs. For dividend @math{a}, 8334divisor @math{d}, quotient @math{q}, and @m{b = 2 8335\GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this remainder 8336@math{r} is of the form 8337@tex 8338$$ a = qd + r b^n $$ 8339@end tex 8340@ifnottex 8341 8342@example 8343a = q*d + r*b^n 8344@end example 8345 8346@end ifnottex 8347@math{n} represents the number of zero limbs produced by the subtractions, 8348that being the number of limbs produced for @math{q}. @math{r} will be in the 8349range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by 8350a factor of @math{b^n}. 8351 8352Carrying out full subtractions at each stage means the same number of cross 8353products must be done as a normal division, but there's still some single limb 8354divisions saved. 
When @math{d} is a single limb some simplifications arise, 8355providing good speedups on a number of processors. 8356 8357The functions @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the 8358internal @code{mpn_redc_X} functions differ subtly in how they return @math{r}, 8359leading to some negations in the above formula, but all are essentially the 8360same. 8361 8362@cindex Divisibility algorithm 8363@cindex Congruence algorithm 8364Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this 8365leads to divisibility or congruence tests which are potentially more efficient 8366than a normal division. 8367 8368The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is 8369odd, hence the use of @code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and 8370@code{mpz_kronecker_ui} etc (@pxref{Greatest Common Divisor Algorithms}). 8371 8372Montgomery's REDC method for modular multiplications uses operands of the form 8373of @m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n}) 8374(yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in the exact 8375remainder to reach a product in the same form @m{(xy)b^{-n}, (x*y)*b^-n} 8376(@pxref{Modular Powering Algorithm}). 8377 8378Notice that @math{r} generally gives no useful information about the ordinary 8379remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything. If 8380however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the 8381ordinary remainder. This occurs whenever @math{d} is a factor of 8382@math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}. For a 32 or 838364 bit limb other such factors include 5, 17 and 257, but no particular use 8384has been found for this. 8385 8386 8387@node Small Quotient Division, , Exact Remainder, Division Algorithms 8388@subsection Small Quotient Division 8389 8390An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is 8391small can be optimized somewhat. 
8392 8393An ordinary basecase division normalizes the divisor by shifting it to make 8394the high bit set, shifting the dividend accordingly, and shifting the 8395remainder back down at the end of the calculation. This is wasteful if only a 8396few quotient limbs are to be formed. Instead a division of just the top 8397@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be 8398used to form a trial quotient. This requires only those limbs normalized, not 8399the whole of the divisor and dividend. 8400 8401A multiply and subtract then applies the trial quotient to the M@minus{}Q 8402unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q 8403limbs remaining from the trial quotient division). The starting trial 8404quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1 8405too big are detected by first comparing the most significant limbs that will 8406arise from the subtraction. An addback is done if the quotient still turns 8407out to be 1 too big. 8408 8409This whole procedure is essentially the same as one step of the basecase 8410algorithm done in a Q limb base, though with the trial quotient test done only 8411with the high limbs, not an entire Q limb ``digit'' product. The correctness 8412of this weaker test can be established by following the argument of Knuth 8413section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r 8414+ u_2, v2*q>b*r+u2} condition appropriately relaxed. 


@need 1000
@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
@section Greatest Common Divisor
@cindex Greatest common divisor algorithms
@cindex GCD algorithms

@menu
* Binary GCD::
* Lehmer's Algorithm::
* Subquadratic GCD::
* Extended GCD::
* Jacobi Symbol::
@end menu


@node Binary GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
@subsection Binary GCD

At small sizes GMP uses an @math{O(N^2)} binary style GCD@. This is described
in many textbooks, for example Knuth section 4.5.2 algorithm B@. It simply
consists of successively reducing odd operands @math{a} and @math{b} using

@quotation
@math{a,b = @abs{}(a-b),@min{}(a,b)} @*
strip factors of 2 from @math{a}
@end quotation

The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly
computes the quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and replaces
@math{a,b} by @math{b, a - q b}. The binary algorithm has so far been found to
be faster than the Euclidean algorithm everywhere. One reason the binary
method does well is that the implied quotient at each step is usually small,
so often only one or two subtractions are needed to get the same effect as a
division. Quotients 1, 2 and 3 for example occur 67.7% of the time, see Knuth
section 4.5.3 Theorem E.

When the implied quotient is large, meaning @math{b} is much smaller than
@math{a}, then a division is worthwhile. This is the basis for the initial
@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
for both N@cross{}1 and 1@cross{}1 cases). But after that initial reduction,
big quotients occur too rarely to make it worth checking for them.
8458 8459@sp 1 8460The final @math{1@cross{}1} GCD in @code{mpn_gcd_1} is done in the generic C 8461code as described above. For two N-bit operands, the algorithm takes about 84620.68 iterations per bit. For optimum performance some attention needs to be 8463paid to the way the factors of 2 are stripped from @math{a}. 8464 8465Firstly it may be noted that in twos complement the number of low zero bits on 8466@math{a-b} is the same as @math{b-a}, so counting or testing can begin on 8467@math{a-b} without waiting for @math{@abs{}(a-b)} to be determined. 8468 8469A loop stripping low zero bits tends not to branch predict well, since the 8470condition is data dependent. But on average there's only a few low zeros, so 8471an option is to strip one or two bits arithmetically then loop for more (as 8472done for AMD K6). Or use a lookup table to get a count for several bits then 8473loop for more (as done for AMD K7). An alternative approach is to keep just 8474one of @math{a} or @math{b} odd and iterate 8475 8476@quotation 8477@math{a,b = @abs{}(a-b), @min{}(a,b)} @* 8478@math{a = a/2} if even @* 8479@math{b = b/2} if even 8480@end quotation 8481 8482This requires about 1.25 iterations per bit, but stripping of a single bit at 8483each step avoids any branching. Repeating the bit strip reduces to about 0.9 8484iterations per bit, which may be a worthwhile tradeoff. 8485 8486Generally with the above approaches a speed of perhaps 6 cycles per bit can be 8487achieved, which is still not terribly fast with for instance a 64-bit GCD 8488taking nearly 400 cycles. It's this sort of time which means it's not usually 8489advantageous to combine a set of divisibility tests into a GCD. 8490 8491Currently, the binary algorithm is used for GCD only when @math{N < 3}. 
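
The basic reduction step of the binary method can be sketched in Python as
follows. This is an illustration only, assuming two positive odd operands;
the name @code{binary_gcd} is hypothetical, and the bit count uses Python
integer operations rather than any of the stripping strategies discussed
above.

```python
def binary_gcd(a, b):
    """GCD of two positive odd integers by the binary method (sketch)."""
    assert a > 0 and b > 0 and a % 2 == 1 and b % 2 == 1
    while a != b:
        a, b = abs(a - b), min(a, b)
        # a is now even and non-zero, so strip its factors of 2.
        # (a & -a) isolates the lowest set bit; its bit_length minus
        # one is the number of low zero bits.
        a >>= (a & -a).bit_length() - 1
    return a
```

Each pass leaves both operands odd again, so the loop ends when they are
equal. For example @code{binary_gcd(21, 35)} returns 7.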

@node Lehmer's Algorithm, Subquadratic GCD, Binary GCD, Greatest Common Divisor Algorithms
@comment node-name, next, previous, up
@subsection Lehmer's Algorithm

Lehmer's improvement of the Euclidean algorithm is based on the observation
that the initial part of the quotient sequence depends only on the most
significant parts of the inputs. The variant of Lehmer's algorithm used in GMP
splits off the most significant two limbs, as suggested, e.g., in ``A
Double-Digit Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The
quotients of two double-limb inputs are collected as a 2 by 2 matrix with
single-limb elements. This is done by the function @code{mpn_hgcd2}. The
resulting matrix is applied to the inputs using @code{mpn_mul_1} and
@code{mpn_submul_1}. Each iteration usually reduces the inputs by almost one
limb. In the rare case of a large quotient, no progress can be made by
examining just the most significant two limbs, and the quotient is computed
using plain division.

The resulting algorithm is asymptotically @math{O(N^2)}, just like the Euclidean
algorithm and the binary algorithm. The quadratic part of the work is
the calls to @code{mpn_mul_1} and @code{mpn_submul_1}. For small sizes, the
linear work is also significant. There are roughly @math{N} calls to the
@code{mpn_hgcd2} function. This function uses several important
optimizations:

@itemize
@item
It uses the same relaxed notion of correctness as @code{mpn_hgcd} (see next
section). This means that when called with the most significant two limbs of
two large numbers, the returned matrix does not always correspond exactly to
the initial quotient sequence for the two large numbers; the final quotient
may sometimes be one off.

@item
It takes advantage of the fact the quotients are usually small.
The division
operator is not used, since the corresponding assembler instruction is very
slow on most architectures. (This code could probably be improved further; it
uses many branches that are unfriendly to prediction.)

@item
It switches from double-limb calculations to single-limb calculations half-way
through, when the input numbers have been reduced in size from two limbs to
one and a half.

@end itemize

@node Subquadratic GCD, Extended GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms
@subsection Subquadratic GCD

For inputs larger than @code{GCD_DC_THRESHOLD}, GCD is computed via the HGCD
(Half GCD) function, as a generalization of Lehmer's algorithm.

Let the inputs @math{a,b} be of size @math{N} limbs each. Put @m{S=\lfloor N/2
\rfloor + 1, S = floor(N/2) + 1}. Then HGCD(a,b) returns a transformation
matrix @math{T} with non-negative elements, and reduced numbers @math{(c;d) =
T^{-1} (a;b)}. The reduced numbers @math{c,d} must be larger than @math{S}
limbs, while their difference @math{abs(c-d)} must fit in @math{S} limbs. The
matrix elements will also be of size roughly @math{N/2}.

The HGCD base case uses Lehmer's algorithm, but with the above stop condition
that returns reduced numbers and the corresponding transformation matrix
half-way through. For inputs larger than @code{HGCD_THRESHOLD}, HGCD is
computed recursively, using the divide and conquer algorithm in ``On
Sch@"onhage's algorithm and subquadratic integer GCD computation'' by M@"oller
(@pxref{References}). The recursive algorithm consists of these main
steps.

@itemize

@item
Call HGCD recursively, on the most significant @math{N/2} limbs. Apply the
resulting matrix @math{T_1} to the full numbers, reducing them to a size just
above @math{3N/4}.

@item
Perform a small number of division or subtraction steps to reduce the numbers
to size below @math{3N/4}. This is essential mainly for the unlikely case of
large quotients.

@item
Call HGCD recursively, on the most significant @math{N/2} limbs of the reduced
numbers. Apply the resulting matrix @math{T_2} to the full numbers, reducing
them to a size just above @math{N/2}.

@item
Compute @math{T = T_1 T_2}.

@item
Perform a small number of division and subtraction steps to satisfy the
requirements, and return.
@end itemize

GCD is then implemented as a loop around HGCD, similarly to Lehmer's
algorithm. Where Lehmer repeatedly chops off the top two limbs, calls
@code{mpn_hgcd2}, and applies the resulting matrix to the full numbers, the
subquadratic GCD chops off the most significant third of the limbs (the
proportion is a tuning parameter, and @math{1/3} seems to be more efficient
than, e.g., @math{1/2}), calls @code{mpn_hgcd}, and applies the resulting
matrix. Once the input numbers are reduced to size below
@code{GCD_DC_THRESHOLD}, Lehmer's algorithm is used for the rest of the work.

The asymptotic running time of both HGCD and GCD is @m{O(M(N)\log N),O(M(N)*log(N))},
where @math{M(N)} is the time for multiplying two @math{N}-limb numbers.

@comment node-name, next, previous, up

@node Extended GCD, Jacobi Symbol, Subquadratic GCD, Greatest Common Divisor Algorithms
@subsection Extended GCD

The extended GCD function, or GCDEXT, calculates @math{@gcd{}(a,b)} and also
cofactors @math{x} and @math{y} satisfying @m{ax+by=\gcd(a@C{}b),
a*x+b*y=gcd(a@C{}b)}. All the algorithms used for plain GCD are extended to
handle this case. The binary algorithm is used only for single-limb GCDEXT.
Lehmer's algorithm is used for sizes up to @code{GCDEXT_DC_THRESHOLD}.
Above
this threshold, GCDEXT is implemented as a loop around HGCD, but with more
book-keeping to keep track of the cofactors. This gives the same asymptotic
running time as for GCD and HGCD, @m{O(M(N)\log N),O(M(N)*log(N))}.

One difference from plain GCD is that while the inputs @math{a} and @math{b} are
reduced as the algorithm proceeds, the cofactors @math{x} and @math{y} grow in
size. This makes the tuning of the chopping-point more difficult. The current
code chops off the most significant half of the inputs for the call to HGCD in
the first iteration, and the most significant two thirds for the remaining
calls. This strategy could surely be improved. Also the stop condition for the
loop, where Lehmer's algorithm is invoked once the inputs are reduced below
@code{GCDEXT_DC_THRESHOLD}, could maybe be improved by taking into account the
current size of the cofactors.

@node Jacobi Symbol, , Extended GCD, Greatest Common Divisor Algorithms
@subsection Jacobi Symbol
@cindex Jacobi symbol algorithm

[This section is obsolete. The current Jacobi code actually uses a very
efficient algorithm.]

@code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a
simple binary algorithm similar to that described for the GCDs (@pxref{Binary
GCD}). They're not very fast when both inputs are large. Lehmer's multi-step
improvement or a binary based multi-step algorithm is likely to be better.

When one operand fits a single limb, and that includes @code{mpz_kronecker_ui}
and friends, an initial reduction is done with either @code{mpn_mod_1} or
@code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb.
The binary algorithm is well suited to a single limb, and the whole
calculation in this case is quite efficient.
8637 8638In all the routines sign changes for the result are accumulated using some bit 8639twiddling, avoiding table lookups or conditional jumps. 8640 8641 8642@need 1000 8643@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms 8644@section Powering Algorithms 8645@cindex Powering algorithms 8646 8647@menu 8648* Normal Powering Algorithm:: 8649* Modular Powering Algorithm:: 8650@end menu 8651 8652 8653@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms 8654@subsection Normal Powering 8655 8656Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm, 8657successively squaring and then multiplying by the base when a 1 bit is seen in 8658the exponent, as per Knuth section 4.6.3. The ``left to right'' 8659variant described there is used rather than algorithm A, since it's just as 8660easy and can be done with somewhat less temporary memory. 8661 8662 8663@node Modular Powering Algorithm, , Normal Powering Algorithm, Powering Algorithms 8664@subsection Modular Powering 8665 8666Modular powering is implemented using a @math{2^k}-ary sliding window 8667algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85 8668(@pxref{References}). @math{k} is chosen according to the size of the 8669exponent. Larger exponents use larger values of @math{k}, the choice being 8670made to minimize the average number of multiplications that must supplement 8671the squaring. 8672 8673The modular multiplies and squarings use either a simple division or the REDC 8674method by Montgomery (@pxref{References}). REDC is a little faster, 8675essentially saving N single limb divisions in a fashion similar to an exact 8676remainder (@pxref{Exact Remainder}). 
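
The two powering schemes can be sketched in Python as follows. This is an
illustration under stated assumptions, not GMP's code: the function names and
the default window size @code{k=4} are hypothetical, and each modular
reduction is a plain @code{%} rather than REDC.

```python
def pow_left_to_right(base, exp):
    """Binary powering, scanning exponent bits from the most significant end."""
    result = 1
    for bit in bin(exp)[2:]:
        result *= result            # square for every exponent bit
        if bit == "1":
            result *= base          # multiply when a 1 bit is seen
    return result


def powm_sliding_window(base, exp, mod, k=4):
    """Modular powering with a 2^k-ary sliding window (odd window values)."""
    if mod == 1:
        return 0
    base %= mod
    # Precompute the odd powers base^1, base^3, ..., base^(2^k - 1).
    sq = base * base % mod
    odd_powers = [base]
    for _ in range((1 << (k - 1)) - 1):
        odd_powers.append(odd_powers[-1] * sq % mod)

    bits = bin(exp)[2:]
    result, i = 1, 0
    while i < len(bits):
        if bits[i] == "0":
            result = result * result % mod
            i += 1
            continue
        # Longest window of at most k bits that starts here and ends in a 1.
        j = min(i + k, len(bits))
        while bits[j - 1] == "0":
            j -= 1
        for _ in range(j - i):
            result = result * result % mod
        # The window value is odd, so value >> 1 indexes the odd-power table.
        result = result * odd_powers[int(bits[i:j], 2) >> 1] % mod
        i = j
    return result
```

A larger @code{k} means fewer multiplications by table entries during the
scan but a bigger table to precompute, which is why the window size is
chosen from the exponent size.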
8677 8678 8679@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms 8680@section Root Extraction Algorithms 8681@cindex Root extraction algorithms 8682 8683@menu 8684* Square Root Algorithm:: 8685* Nth Root Algorithm:: 8686* Perfect Square Algorithm:: 8687* Perfect Power Algorithm:: 8688@end menu 8689 8690 8691@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms 8692@subsection Square Root 8693@cindex Square root algorithm 8694@cindex Karatsuba square root algorithm 8695 8696Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul 8697Zimmermann (@pxref{References}). 8698 8699An input @math{n} is split into four parts of @math{k} bits each, so with 8700@math{b=2^k} we have @m{n = a_3b^3 + a_2b^2 + a_1b + a_0, n = a3*b^3 + a2*b^2 8701+ a1*b + a0}. Part @ms{a,3} must be ``normalized'' so that either the high or 8702second highest bit is set. In GMP, @math{k} is kept on a limb boundary and 8703the input is left shifted (by an even number of bits) to normalize. 8704 8705The square root of the high two parts is taken, by recursive application of 8706the algorithm (bottoming out in a one-limb Newton's method), 8707@tex 8708$$ s',r' = \mathop{\rm sqrtrem} \> (a_3b + a_2) $$ 8709@end tex 8710@ifnottex 8711 8712@example 8713s1,r1 = sqrtrem (a3*b + a2) 8714@end example 8715 8716@end ifnottex 8717This is an approximation to the desired root and is extended by a division to 8718give @math{s},@math{r}, 8719@tex 8720$$\eqalign{ 8721q,u &= \mathop{\rm divrem} \> (r'b + a_1, 2s') \cr 8722s &= s'b + q \cr 8723r &= ub + a_0 - q^2 8724}$$ 8725@end tex 8726@ifnottex 8727 8728@example 8729q,u = divrem (r1*b + a1, 2*s1) 8730s = s1*b + q 8731r = u*b + a0 - q^2 8732@end example 8733 8734@end ifnottex 8735The normalization requirement on @ms{a,3} means at this point @math{s} is 8736either correct or 1 too big. 
@math{r} is negative in the latter case, so 8737@tex 8738$$\eqalign{ 8739\mathop{\rm if} \; r &< 0 \; \mathop{\rm then} \cr 8740r &\leftarrow r + 2s - 1 \cr 8741s &\leftarrow s - 1 8742}$$ 8743@end tex 8744@ifnottex 8745 8746@example 8747if r < 0 then 8748 r = r + 2*s - 1 8749 s = s - 1 8750@end example 8751 8752@end ifnottex 8753The algorithm is expressed in a divide and conquer form, but as noted in the 8754paper it can also be viewed as a discrete variant of Newton's method, or as a 8755variation on the schoolboy method (no longer taught) for square roots two 8756digits at a time. 8757 8758If the remainder @math{r} is not required then usually only a few high limbs 8759of @math{r} and @math{u} need to be calculated to determine whether an 8760adjustment to @math{s} is required. This optimization is not currently 8761implemented. 8762 8763In the Karatsuba multiplication range this algorithm is @m{O({3\over2} 8764M(N/2)),O(1.5*M(N/2))}, where @math{M(n)} is the time to multiply two numbers 8765of @math{n} limbs. In the FFT multiplication range this grows to a bound of 8766@m{O(6 M(N/2)),O(6*M(N/2))}. In practice a factor of about 1.5 to 1.8 is 8767found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range. 8768 8769The algorithm does all its calculations in integers and the resulting 8770@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}. 8771The extended precision given by @code{mpf_sqrt_ui} is obtained by 8772padding with zero limbs. 8773 8774 8775@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms 8776@subsection Nth Root 8777@cindex Root extraction algorithm 8778@cindex Nth root algorithm 8779 8780Integer Nth roots are taken using Newton's method with the following 8781iteration, where @math{A} is the input and @math{n} is the root to be taken. 
8782@tex 8783$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$ 8784@end tex 8785@ifnottex 8786 8787@example 8788 1 A 8789a[i+1] = - * ( --------- + (n-1)*a[i] ) 8790 n a[i]^(n-1) 8791@end example 8792 8793@end ifnottex 8794The initial approximation @m{a_1,a[1]} is generated bitwise by successively 8795powering a trial root with or without new 1 bits, aiming to be just above the 8796true root. The iteration converges quadratically when started from a good 8797approximation. When @math{n} is large more initial bits are needed to get 8798good convergence. The current implementation is not particularly well 8799optimized. 8800 8801 8802@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms 8803@subsection Perfect Square 8804@cindex Perfect square algorithm 8805 8806A significant fraction of non-squares can be quickly identified by checking 8807whether the input is a quadratic residue modulo small integers. 8808 8809@code{mpz_perfect_square_p} first tests the input mod 256, which means just 8810examining the low byte. Only 44 different values occur for squares mod 256, 8811so 82.8% of inputs can be immediately identified as non-squares. 8812 8813On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total 881499.25% of inputs identified as non-squares. On a 64-bit system 97 is tested 8815too, for a total 99.62%. 8816 8817These moduli are chosen because they're factors of @math{2^@W{24}-1} (or 8818@math{2^@W{48}-1} for 64-bits), and such a remainder can be quickly taken just 8819using additions (see @code{mpn_mod_34lsub1}). 8820 8821When nails are in use moduli are instead selected by the @file{gen-psqr.c} 8822program and applied with an @code{mpn_mod_1}. The same @math{2^@W{24}-1} or 8823@math{2^@W{48}-1} could be done with nails using some extra bit shifts, but 8824this is not currently implemented. 
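
The residue filtering can be sketched in Python as below. This is only an
illustration: the names are hypothetical, the residue tables are built by
brute force rather than by @file{gen-psqr.c}, each modulus is applied with a
plain @code{%}, and the final check on surviving inputs uses Python's
@code{math.isqrt}.

```python
import math

# Residues that squares can take mod 256; there are 44 of them.
SQUARE_RESIDUES_256 = sorted(set(x * x % 256 for x in range(256)))

# The small moduli named in the text for the 32-bit case.
SMALL_MODULI = (9, 5, 7, 13, 17)
SMALL_RESIDUES = [sorted(set(x * x % m for x in range(m)))
                  for m in SMALL_MODULI]


def is_perfect_square(n):
    """Cheap residue filters first, a full square root only for survivors."""
    if n < 0:
        return False
    if n % 256 not in SQUARE_RESIDUES_256:
        return False                 # low byte alone rules this input out
    for m, residues in zip(SMALL_MODULI, SMALL_RESIDUES):
        if n % m not in residues:
            return False
    r = math.isqrt(n)                # survivors still need a real check
    return r * r == n
```

A true square is never rejected by the filters, since its residue modulo any
integer is itself a square residue; the filters only save work on
non-squares.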
8825 8826In any case each modulus is applied to the @code{mpn_mod_34lsub1} or 8827@code{mpn_mod_1} remainder and a table lookup identifies non-squares. By 8828using a ``modexact'' style calculation, and suitably permuted tables, just one 8829multiply each is required, see the code for details. Moduli are also combined 8830to save operations, so long as the lookup tables don't become too big. 8831@file{gen-psqr.c} does all the pre-calculations. 8832 8833A square root must still be taken for any value that passes these tests, to 8834verify it's really a square and not one of the small fraction of non-squares 8835that get through (i.e.@: a pseudo-square to all the tested bases). 8836 8837Clearly more residue tests could be done, @code{mpz_perfect_square_p} only 8838uses a compact and efficient set. Big inputs would probably benefit from more 8839residue testing, small inputs might be better off with less. The assumed 8840distribution of squares versus non-squares in the input would affect such 8841considerations. 8842 8843 8844@node Perfect Power Algorithm, , Perfect Square Algorithm, Root Extraction Algorithms 8845@subsection Perfect Power 8846@cindex Perfect power algorithm 8847 8848Detecting perfect powers is required by some factorization algorithms. 8849Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root 8850extractions, though naturally only prime roots need to be considered. 8851(@xref{Nth Root Algorithm}.) 8852 8853If a prime divisor @math{p} with multiplicity @math{e} can be found, then only 8854roots which are divisors of @math{e} need to be considered, much reducing the 8855work necessary. To this end divisibility by a set of small primes is checked. 8856 8857 8858@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms 8859@section Radix Conversion 8860@cindex Radix conversion algorithms 8861 8862Radix conversions are less important than other algorithms. 
A program 8863dominated by conversions should probably use a different data representation. 8864 8865@menu 8866* Binary to Radix:: 8867* Radix to Binary:: 8868@end menu 8869 8870 8871@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms 8872@subsection Binary to Radix 8873 8874Conversions from binary to a power-of-2 radix use a simple and fast 8875@math{O(N)} bit extraction algorithm. 8876 8877Conversions from binary to other radices use one of two algorithms. Sizes 8878below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method. 8879Repeated divisions by @math{b^n} are made, where @math{b} is the radix and 8880@math{n} is the biggest power that fits in a limb. But instead of simply 8881using the remainder @math{r} from such divisions, an extra divide step is done 8882to give a fractional limb representing @math{r/b^n}. The digits of @math{r} 8883can then be extracted using multiplications by @math{b} rather than divisions. 8884Special case code is provided for decimal, allowing multiplications by 10 to 8885optimize to shifts and adds. 8886 8887Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used. 8888For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are 8889calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is 8890reached. @math{t} is then divided by that largest power, giving a quotient 8891which is the digits above that power, and a remainder which is those below. 8892These two parts are in turn divided by the second highest power, and so on 8893recursively. When a piece has been divided down to less than 8894@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is 8895used. 
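
The recursive splitting can be sketched in Python for radix 10 as follows.
This is a sketch under stated assumptions rather than GMP's implementation:
the name @code{to_decimal_digits} is hypothetical, a chunk of 18 digits
stands in for the largest power fitting a limb, Python's @code{divmod} stands
in for the big divisions, and @code{str} on a small chunk stands in for the
basecase.

```python
def to_decimal_digits(t):
    """Decimal string of t >= 0 via recursive splitting on powers 10^(n*2^i)."""
    n = 18                                   # digits per basecase chunk (assumed)
    # Powers 10^(n*2^i), squared until the last one exceeds sqrt(t).
    powers = [10 ** n]
    while powers[-1] * powers[-1] <= t:
        powers.append(powers[-1] * powers[-1])

    def convert(x, level, width):
        # Return the digits of x, zero padded to `width`, or unpadded when
        # width is None (the most significant piece).
        if level < 0:
            s = str(x)                       # stands in for the basecase
            return s if width is None else s.rjust(width, "0")
        hi, lo = divmod(x, powers[level])    # digits above and below the power
        half = n << level                    # digit count below powers[level]
        if hi == 0 and width is None:
            return convert(lo, level - 1, None)
        hi_width = None if width is None else width - half
        return convert(hi, level - 1, hi_width) + convert(lo, level - 1, half)

    return convert(t, len(powers) - 1, None)
```

Every piece below the top one is padded to a fixed width, since interior
zero digits must be kept, while the top piece is left unpadded so that no
leading zeros appear.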

The advantage of this algorithm is that big divisions can make use of the
sub-quadratic divide and conquer division (@pxref{Divide and Conquer
Division}), and big divisions tend to have lower overheads than lots of
separate single limb divisions anyway. But in any case the cost of
calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.

@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
the same basic thing, the point where it becomes worth doing a big division to
cut the input in half. @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
assumes that's already available, which is the case when recursing.

Since the base case produces digits from least to most significant but they
are to be stored from most to least, it's necessary to calculate in advance
how many digits there will be, or at least be sure not to underestimate that.
For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
from @code{mp_bases}, rounding up. The result is either correct or one too
big.

Examining some of the high bits of the input could increase the chance of
getting the exact number of digits, but an exact result every time would not
be practical, since in general the difference between numbers 100@dots{} and
99@dots{} is only in the last few bits and the work to identify 99@dots{}
might well be almost as much as a full conversion.

@code{mpf_get_str} doesn't currently use the algorithm described here, it
multiplies or divides by a power of @math{b} to move the radix point to
just above the highest non-zero digit (or at worst one above that location),
then multiplies by @math{b^n} to bring out digits. This is @math{O(N^2)} and
is certainly not optimal.

The @math{r/b^n} scheme described above for using multiplications to bring out
digits might be useful for more than a single limb. Some brief experiments
with it on the base case when recursing didn't give a noticeable improvement,
but perhaps that was only due to the implementation. Something similar would
work for the sub-quadratic divisions too, though there would be the cost of
calculating a bigger radix power.

Another possible improvement for the sub-quadratic part would be to arrange
for radix powers that balanced the sizes of quotient and remainder produced,
i.e.@: the highest power would be a @m{b^{nk},b^(n*k)} approximately equal to
@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor. That ought to
smooth out a graph of times against sizes, but may or may not be a net
speedup.


@node Radix to Binary, , Binary to Radix, Radix Conversion Algorithms
@subsection Radix to Binary

@strong{This section needs to be rewritten, it currently describes the
algorithms used before GMP 4.3.}

Conversions from a power-of-2 radix into binary use a simple and fast
@math{O(N)} bitwise concatenation algorithm.

Conversions from other radices use one of two algorithms. Sizes below
@code{SET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method. Groups
of @math{n} digits are converted to limbs, where @math{b^n} is the biggest
power of the base @math{b} which will fit in a limb, then those groups are
accumulated into the result by multiplying by @math{b^n} and adding. This
saves multi-precision operations, as per Knuth section 4.4 part E
(@pxref{References}). Some special case code is provided for decimal, giving
the compiler a chance to optimize multiplications by 10.

Above @code{SET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
First groups of @math{n} digits are converted into limbs.
Then adjacent 8963limbs are combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x} 8964and @math{y} are the limbs. Adjacent limb pairs are combined into quads 8965similarly with @m{xb^{2n}+y,x*b^(2n)+y}. This continues until a single block 8966remains, that being the result. 8967 8968The advantage of this method is that the multiplications for each @math{x} are 8969big blocks, allowing Karatsuba and higher algorithms to be used. But the cost 8970of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome. 8971@code{SET_STR_PRECOMPUTE_THRESHOLD} usually ends up quite big, around 5000 digits, and on 8972some processors much bigger still. 8973 8974@code{SET_STR_PRECOMPUTE_THRESHOLD} is based on the input digits (and tuned 8975for decimal), though it might be better based on a limb count, so as to be 8976independent of the base. But that sort of count isn't used by the base case 8977and so would need some sort of initial calculation or estimate. 8978 8979The main reason @code{SET_STR_PRECOMPUTE_THRESHOLD} is so much bigger than the 8980corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is 8981much faster than @code{mpn_divrem_1} (often by a factor of 5, or more). 
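
The pairwise combination used above the threshold can be sketched in Python.
This is only an illustration under stated assumptions: the name
@code{radix_to_binary} is hypothetical, Python integers stand in for limb
vectors, and no threshold switching is modelled.

```python
def radix_to_binary(digits, base=10, limb_bits=64):
    """Convert a digit string to an integer by combining blocks pairwise."""
    # n = the largest digit count whose value always fits in one limb.
    n = 1
    while base ** (n + 1) <= 2 ** limb_bits:
        n += 1
    # Convert groups of n digits to "limbs", least significant group first.
    blocks = []
    end = len(digits)
    while end > 0:
        start = max(0, end - n)
        blocks.append(int(digits[start:end], base))
        end = start
    power = base ** n                  # b^n, then b^(2n), b^(4n), ...
    while len(blocks) > 1:
        merged = []
        for i in range(0, len(blocks) - 1, 2):
            y, x = blocks[i], blocks[i + 1]   # x is the more significant block
            merged.append(x * power + y)
        if len(blocks) % 2:
            merged.append(blocks[-1])  # an odd block out moves up unchanged
        blocks = merged
        power *= power                 # square the radix power for the next level
    return blocks[0] if blocks else 0
```

Each level halves the number of blocks and doubles their size, so the
multiplications near the top are between large operands.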


@need 1000
@node Other Algorithms, Assembly Coding, Radix Conversion Algorithms, Algorithms
@section Other Algorithms

@menu
* Prime Testing Algorithm::
* Factorial Algorithm::
* Binomial Coefficients Algorithm::
* Fibonacci Numbers Algorithm::
* Lucas Numbers Algorithm::
* Random Number Algorithms::
@end menu


@node Prime Testing Algorithm, Factorial Algorithm, Other Algorithms, Other Algorithms
@subsection Prime Testing
@cindex Prime testing algorithms

The primality testing in @code{mpz_probab_prime_p} (@pxref{Number Theoretic
Functions}) first does some trial division by small factors and then uses the
Miller-Rabin probabilistic primality testing algorithm, as described in Knuth
section 4.5.4 algorithm P (@pxref{References}).

For an odd input @math{n}, and with @math{n = q@GMPmultiply{}2^k+1} where
@math{q} is odd, this algorithm selects a random base @math{x} and tests
whether @math{x^q @bmod{} n} is 1 or @math{-1}, or some @m{x^{q2^j} \bmod n,
x^(q*2^j) mod n} is @math{-1}, for @math{1@le{}j<k}.  If so then @math{n}
is probably prime, if not then @math{n} is definitely composite.

Any prime @math{n} will pass the test, but some composites do too.  Such
composites are known as strong pseudoprimes to base @math{x}.  No @math{n} is
a strong pseudoprime to more than @math{1/4} of all bases (see Knuth exercise
22), hence with @math{x} chosen at random a composite @math{n} has no more
than a @math{1/4} chance of passing a single round of the test.

In fact strong pseudoprimes are quite rare, making the test much more
powerful than this analysis would suggest, but @math{1/4} is all that's proven
for an arbitrary @math{n}.
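
One round of the strong pseudoprime test can be sketched in Python as follows.
This is an illustration in the standard form of the test, not the code of
@code{mpz_probab_prime_p}; the function names, the list of trial divisors and
the default of 25 rounds are assumptions made for the example.

```python
import random

def strong_probable_prime(n, x):
    """One Miller-Rabin round for odd n > 2 and base x."""
    q, k = n - 1, 0
    while q % 2 == 0:          # write n - 1 = q * 2^k with q odd
        q //= 2
        k += 1
    y = pow(x, q, n)
    if y == 1 or y == n - 1:
        return True
    for _ in range(k - 1):     # look for -1 among x^(q*2^j), 1 <= j < k
        y = y * y % n
        if y == n - 1:
            return True
        if y == 1:
            return False       # non-trivial square root of 1: composite
    return False

def probab_prime(n, reps=25):
    """Trial division by a few small primes, then reps random rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    return all(strong_probable_prime(n, random.randrange(2, n - 1))
               for _ in range(reps))
```

For instance 2047 = 23*89 passes a round with base 2 but fails with base 3,
which is why several bases are tried.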


@node Factorial Algorithm, Binomial Coefficients Algorithm, Prime Testing Algorithm, Other Algorithms
@subsection Factorial
@cindex Factorial algorithm

Factorials are calculated by a combination of two algorithms.  Both share one
idea: compute the odd part of the factorial, then account for the power of
@math{2} in a final step, by shifting.

For small @math{n}, the odd factor of @math{n!} is computed with the simple
observation that it is equal to the product of all positive odd numbers
not exceeding @math{n} times the odd factor of @m{\lfloor n/2\rfloor!, [n/2]!},
where @m{\lfloor x\rfloor, [x]} is the integer part of @math{x}, and so on
recursively.  The procedure can be best illustrated with an example,

@quotation
@math{23! = (23.21.19.17.15.13.11.9.7.5.3)(11.9.7.5.3)(5.3)2^{19}}
@end quotation

The current code collects all the factors in a single list, using a loop
rather than recursion, and computes the product with no special treatment of
repeated chunks.

When @math{n} is larger, the computation goes through prime sieving.  A helper
function is used, as suggested by Peter Luschny:
@tex
$$\mathop{\rm msf}(n) = {n!\over\lfloor n/2\rfloor!^2\cdot2^k} = \prod_{p=3}^{n}
p^{\mathop{\rm L}(p,n)} $$
@end tex
@ifnottex

@example
                            n
                          -----
               n!          | |   L(p,n)
msf(n) = -------------- =  | |  p
          [n/2]!^2.2^k     p=3
@end example
@end ifnottex

Here @math{p} ranges over the odd prime numbers.  The exponent @math{k} is
chosen so that the result is an odd integer: @math{k} is the number of 1 bits
in the binary representation of @m{\lfloor n/2\rfloor, [n/2]}.
The function L@math{(p,n)} 9065can be defined as zero when @math{p} is composite, and, for any prime 9066@math{p}, it is computed with: 9067@tex 9068$$\mathop{\rm L}(p,n) = \sum_{i>0}\left\lfloor{n\over p^i}\right\rfloor\bmod2 9069\leq\log_p(n)$$ 9070@end tex 9071@ifnottex 9072 9073@example 9074 --- 9075 \ n 9076L(p,n) = / [---] mod 2 <= log (n) . 9077 --- p^i p 9078 i>0 9079@end example 9080@end ifnottex 9081 9082With this helper function, we are able to compute the odd part of @math{n!} 9083using the recursion implied by @m{n!=\lfloor n/2\rfloor!^2\cdot\mathop{\rm 9084msf}(n)\cdot2^k , n!=[n/2]!^2*msf(n)*2^k}. The recursion stops using the 9085small-@math{n} algorithm on some @m{\lfloor n/2^i\rfloor, [n/2^i]}. 9086 9087Both the above algorithms use binary splitting to compute the product of many 9088small factors. At first as many products as possible are accumulated in a 9089single register, generating a list of factors that fit in a machine word. This 9090list is then split into halves, and the product is computed recursively. 9091 9092Such splitting is more efficient than repeated N@cross{}1 multiplies since it 9093forms big multiplies, allowing Karatsuba and higher algorithms to be used. 9094And even below the Karatsuba threshold a big block of work can be more 9095efficient for the basecase algorithm. 9096 9097 9098@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms 9099@subsection Binomial Coefficients 9100@cindex Binomial coefficient algorithm 9101 9102Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated 9103by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) = 9104\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then 9105evaluating the following product simply from @math{i=2} to @math{i=k}. 
@tex
$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
@end tex
@ifnottex

@example
                          k  (n-k+i)
C(n,k) =  (n-k+1) * prod  -------
                         i=2    i
@end example

@end ifnottex
It's easy to show that each denominator @math{i} will divide the product so
far, so the exact division algorithm is used (@pxref{Exact Division}).

The numerators @math{n-k+i} and denominators @math{i} are first accumulated
in groups of as many as fit in a limb, to save multi-precision operations,
though for @code{mpz_bin_ui} this applies only to the divisors, since @math{n}
is an @code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all.


@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
@subsection Fibonacci Numbers
@cindex Fibonacci number algorithm

The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
values efficiently.

For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is
used.  On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
up to @m{F_{93},F[93]}.  For convenience the table starts at @m{F_{-1},F[-1]}.

Beyond the table, values are generated with a binary powering algorithm,
calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
low across the bits of @math{n}.  The formulas used are
@tex
$$\eqalign{
  F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
  F_{2k-1} &=  F_k^2 + F_{k-1}^2           \cr
  F_{2k}   &= F_{2k+1} - F_{2k-1}
}$$
@end tex
@ifnottex

@example
F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
F[2k-1] =   F[k]^2 + F[k-1]^2

F[2k] = F[2k+1] - F[2k-1]
@end example

@end ifnottex
At each step, @math{k} is the high @math{b} bits of @math{n}.
If the next bit 9160of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if 9161it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process 9162repeated until all bits of @math{n} are incorporated. Notice these formulas 9163require just two squares per bit of @math{n}. 9164 9165It'd be possible to handle the first few @math{n} above the single limb table 9166with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} = 9167F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually 9168turns out to be faster for only about 10 or 20 values of @math{n}, and 9169including a block of code for just those doesn't seem worthwhile. If they 9170really mattered it'd be better to extend the data table. 9171 9172Using a table avoids lots of calculations on small numbers, and makes small 9173@math{n} go fast. A bigger table would make more small @math{n} go fast, it's 9174just a question of balancing size against desired speed. For GMP the code is 9175kept compact, with the emphasis primarily on a good powering algorithm. 9176 9177@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but 9178@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}. In this case the last 9179step of the algorithm can become one multiply instead of two squares. One of 9180the following two formulas is used, according as @math{n} is odd or even. 9181@tex 9182$$\eqalign{ 9183 F_{2k} &= F_k (F_k + 2F_{k-1}) \cr 9184 F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k 9185}$$ 9186@end tex 9187@ifnottex 9188 9189@example 9190F[2k] = F[k]*(F[k]+2F[k-1]) 9191 9192F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k 9193@end example 9194 9195@end ifnottex 9196@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a 9197multiply. 
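
As an illustration of the doubling formulas above, here is a Python sketch.
It is not the code of @code{mpz_fib_ui} or @code{mpz_fib2_ui}: the names
@code{fib_pair} and @code{fib} are hypothetical, no table of small values is
used, and the powering simply starts from @math{k=0}.

```python
def fib_pair(n):
    """Return (F[n], F[n-1]) for n >= 0, working from high to low bits of n."""
    fk, fk1 = 0, 1                  # k = 0: F[0] = 0 and F[-1] = 1
    k = 0
    for bit in bin(n)[2:]:
        sq_k, sq_k1 = fk * fk, fk1 * fk1    # the two squares for this bit
        sign = -2 if k % 2 else 2           # 2*(-1)^k
        f2k_plus = 4 * sq_k - sq_k1 + sign  # F[2k+1]
        f2k_minus = sq_k + sq_k1            # F[2k-1]
        f2k = f2k_plus - f2k_minus          # F[2k]
        if bit == "1":
            fk, fk1, k = f2k_plus, f2k, 2 * k + 1
        else:
            fk, fk1, k = f2k, f2k_minus, 2 * k
    return fk, fk1

def fib(n):
    """Return F[n] alone, using one multiply for the last step."""
    if n == 0:
        return 0
    k = n // 2
    fk, fk1 = fib_pair(k)
    if n % 2 == 0:
        return fk * (fk + 2 * fk1)                  # F[2k]
    sign = -2 if k % 2 else 2
    return (2 * fk + fk1) * (2 * fk - fk1) + sign   # F[2k+1]
```
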
For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above 9198can be applied just to the low limb of the calculation, without a carry or 9199borrow into further limbs, which saves some code size. See comments with 9200@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done. 9201 9202 9203@node Lucas Numbers Algorithm, Random Number Algorithms, Fibonacci Numbers Algorithm, Other Algorithms 9204@subsection Lucas Numbers 9205@cindex Lucas number algorithm 9206 9207@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci 9208numbers with the following simple formulas. 9209@tex 9210$$\eqalign{ 9211 L_k &= F_k + 2F_{k-1} \cr 9212 L_{k-1} &= 2F_k - F_{k-1} 9213}$$ 9214@end tex 9215@ifnottex 9216 9217@example 9218L[k] = F[k] + 2*F[k-1] 9219L[k-1] = 2*F[k] - F[k-1] 9220@end example 9221 9222@end ifnottex 9223@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be 9224saved. Trailing zero bits on @math{n} can be handled with a single square 9225each. 9226@tex 9227$$ L_{2k} = L_k^2 - 2(-1)^k $$ 9228@end tex 9229@ifnottex 9230 9231@example 9232L[2k] = L[k]^2 - 2*(-1)^k 9233@end example 9234 9235@end ifnottex 9236And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci 9237numbers, similar to what @code{mpz_fib_ui} does. 9238@tex 9239$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$ 9240@end tex 9241@ifnottex 9242 9243@example 9244L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k 9245@end example 9246 9247@end ifnottex 9248 9249 9250@node Random Number Algorithms, , Lucas Numbers Algorithm, Other Algorithms 9251@subsection Random Numbers 9252@cindex Random number algorithms 9253 9254For the @code{urandomb} functions, random numbers are generated simply by 9255concatenating bits produced by the generator. As long as the generator has 9256good randomness properties this will produce well-distributed @math{N} bit 9257numbers. 
9258 9259For the @code{urandomm} functions, random numbers in a range @math{0@le{}R<N} 9260are generated by taking values @math{R} of @m{\lceil \log_2 N \rceil, 9261ceil(log2(N))} bits each until one satisfies @math{R<N}. This will normally 9262require only one or two attempts, but the attempts are limited in case the 9263generator is somehow degenerate and produces only 1 bits or similar. 9264 9265@cindex Mersenne twister algorithm 9266The Mersenne Twister generator is by Matsumoto and Nishimura 9267(@pxref{References}). It has a non-repeating period of @math{2^@W{19937}-1}, 9268which is a Mersenne prime, hence the name of the generator. The state is 624 9269words of 32-bits each, which is iterated with one XOR and shift for each 927032-bit word generated, making the algorithm very fast. Randomness properties 9271are also very good and this is the default algorithm used by GMP. 9272 9273@cindex Linear congruential algorithm 9274Linear congruential generators are described in many text books, for instance 9275Knuth volume 2 (@pxref{References}). With a modulus @math{M} and parameters 9276@math{A} and @math{C}, an integer state @math{S} is iterated by the formula 9277@math{S @leftarrow{} A@GMPmultiply{}S+C @bmod{} M}. At each step the new 9278state is a linear function of the previous, mod @math{M}, hence the name of 9279the generator. 9280 9281In GMP only moduli of the form @math{2^N} are supported, and the current 9282implementation is not as well optimized as it could be. Overheads are 9283significant when @math{N} is small, and when @math{N} is large clearly the 9284multiply at each step will become slow. This is not a big concern, since the 9285Mersenne Twister generator is better in every respect and is therefore 9286recommended for all normal applications. 9287 9288For both generators the current state can be deduced by observing enough 9289output and applying some linear algebra (over GF(2) in the case of the 9290Mersenne Twister). 
This generally means raw output is unsuitable for 9291cryptographic applications without further hashing or the like. 9292 9293 9294@node Assembly Coding, , Other Algorithms, Algorithms 9295@section Assembly Coding 9296@cindex Assembly coding 9297 9298The assembly subroutines in GMP are the most significant source of speed at 9299small to moderate sizes. At larger sizes algorithm selection becomes more 9300important, but of course speedups in low level routines will still speed up 9301everything proportionally. 9302 9303Carry handling and widening multiplies that are important for GMP can't be 9304easily expressed in C@. GCC @code{asm} blocks help a lot and are provided in 9305@file{longlong.h}, but hand coding low level routines invariably offers a 9306speedup over generic C by a factor of anything from 2 to 10. 9307 9308@menu 9309* Assembly Code Organisation:: 9310* Assembly Basics:: 9311* Assembly Carry Propagation:: 9312* Assembly Cache Handling:: 9313* Assembly Functional Units:: 9314* Assembly Floating Point:: 9315* Assembly SIMD Instructions:: 9316* Assembly Software Pipelining:: 9317* Assembly Loop Unrolling:: 9318* Assembly Writing Guide:: 9319@end menu 9320 9321 9322@node Assembly Code Organisation, Assembly Basics, Assembly Coding, Assembly Coding 9323@subsection Code Organisation 9324@cindex Assembly code organisation 9325@cindex Code organisation 9326 9327The various @file{mpn} subdirectories contain machine-dependent code, written 9328in C or assembly. The @file{mpn/generic} subdirectory contains default code, 9329used when there's no machine-specific version of a particular file. 9330 9331Each @file{mpn} subdirectory is for an ISA family. Generally 32-bit and 933264-bit variants in a family cannot share code and have separate directories. 9333Within a family further subdirectories may exist for CPU variants. 9334 9335In each directory a @file{nails} subdirectory may exist, holding code with 9336nails support for that CPU variant. 
A @code{NAILS_SUPPORT} directive in each 9337file indicates the nails values the code handles. Nails code only exists 9338where it's faster, or promises to be faster, than plain code. There's no 9339effort put into nails if they're not going to enhance a given CPU. 9340 9341 9342@node Assembly Basics, Assembly Carry Propagation, Assembly Code Organisation, Assembly Coding 9343@subsection Assembly Basics 9344 9345@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines 9346for overall GMP performance. All multiplications and divisions come down to 9347repeated calls to these. @code{mpn_add_n}, @code{mpn_sub_n}, 9348@code{mpn_lshift} and @code{mpn_rshift} are next most important. 9349 9350On some CPUs assembly versions of the internal functions 9351@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups, 9352mainly through avoiding function call overheads. They can also potentially 9353make better use of a wide superscalar processor, as can bigger primitives like 9354@code{mpn_addmul_2} or @code{mpn_addmul_4}. 9355 9356The restrictions on overlaps between sources and destinations 9357(@pxref{Low-level Functions}) are designed to facilitate a variety of 9358implementations. For example, knowing @code{mpn_add_n} won't have partly 9359overlapping sources and destination means reading can be done far ahead of 9360writing on superscalar processors, and loops can be vectorized on a vector 9361processor, depending on the carry handling. 9362 9363 9364@node Assembly Carry Propagation, Assembly Cache Handling, Assembly Basics, Assembly Coding 9365@subsection Carry Propagation 9366@cindex Assembly carry propagation 9367 9368The problem that presents most challenges in GMP is propagating carries from 9369one limb to the next. In functions like @code{mpn_addmul_1} and 9370@code{mpn_add_n}, carries are the only dependencies between limb operations. 
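
To make that dependency concrete, here is a Python sketch of limb-wise addition
in which the carry is detected by comparing each sum against an operand.  The
function name, the 64-bit limb size and the use of Python lists for limb
vectors are assumptions of the sketch, not GMP's implementation.

```python
LIMB_BITS = 64                      # assumed limb size
MASK = (1 << LIMB_BITS) - 1

def add_n(up, vp):
    """Add two equal-length limb lists, least significant limb first.

    Returns (result limbs, carry out).
    """
    rp = []
    carry = 0
    for u, v in zip(up, vp):
        s = (u + v) & MASK          # wrapped sum of the two limbs
        c1 = s < u                  # wrapped around: carry from u + v
        r = (s + carry) & MASK      # add the incoming carry
        c2 = r < s                  # wrapped around: carry from adding carry-in
        rp.append(r)
        carry = int(c1 or c2)       # at most one of c1, c2 can be set
    return rp, carry
```

Apart from @code{carry}, each iteration uses only its own pair of limbs, so the
carry chain is what stops the limbs being processed independently.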
9371 9372On processors with carry flags, a straightforward CISC style @code{adc} is 9373generally best. AMD K6 @code{mpn_addmul_1} however is an example of an 9374unusual set of circumstances where a branch works out better. 9375 9376On RISC processors generally an add and compare for overflow is used. This 9377sort of thing can be seen in @file{mpn/generic/aors_n.c}. Some carry 9378propagation schemes require 4 instructions, meaning at least 4 cycles per 9379limb, but other schemes may use just 1 or 2. On wide superscalar processors 9380performance may be completely determined by the number of dependent 9381instructions between carry-in and carry-out for each limb. 9382 9383On vector processors good use can be made of the fact that a carry bit only 9384very rarely propagates more than one limb. When adding a single bit to a 9385limb, there's only a carry out if that limb was @code{0xFF@dots{}FF} which on 9386random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 93872^mp_bits_per_limb}. @file{mpn/cray/add_n.c} is an example of this, it adds 9388all limbs in parallel, adds one set of carry bits in parallel and then only 9389rarely needs to fall through to a loop propagating further carries. 9390 9391On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code 9392for the RISC style idioms that are necessary to handle carry bits in 9393C@. Often conditional jumps are generated where @code{adc} or @code{sbb} forms 9394would be better. And so unfortunately almost any loop involving carry bits 9395needs to be coded in assembly for best results. 9396 9397 9398@node Assembly Cache Handling, Assembly Functional Units, Assembly Carry Propagation, Assembly Coding 9399@subsection Cache Handling 9400@cindex Assembly cache handling 9401 9402GMP aims to perform well both on operands that fit entirely in L1 cache and 9403those which don't. 
9404 9405Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on 9406large operands, so L2 and main memory performance is important for them. 9407@code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and 9408square basecases, so L1 performance matters most for them, unless assembly 9409versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in 9410which case the remaining uses are mostly for larger operands. 9411 9412For L2 or main memory operands, memory access times will almost certainly be 9413more than the calculation time. The aim therefore is to maximize memory 9414throughput, by starting a load of the next cache line while processing the 9415contents of the previous one. Clearly this is only possible if the chip has a 9416lock-up free cache or some sort of prefetch instruction. Most current chips 9417have both these features. 9418 9419Prefetching sources combines well with loop unrolling, since a prefetch can be 9420initiated once per unrolled loop (or more than once if the loop covers more 9421than one cache line). 9422 9423On CPUs without write-allocate caches, prefetching destinations will ensure 9424individual stores don't go further down the cache hierarchy, limiting 9425bandwidth. Of course for calculations which are slow anyway, like 9426@code{mpn_divrem_1}, write-throughs might be fine. 9427 9428The distance ahead to prefetch will be determined by memory latency versus 9429throughput. The aim of course is to have data arriving continuously, at peak 9430throughput. Some CPUs have limits on the number of fetches or prefetches in 9431progress. 9432 9433If a special prefetch instruction doesn't exist then a plain load can be used, 9434but in that case care must be taken not to attempt to read past the end of an 9435operand, since that might produce a segmentation violation. 

Some CPUs or systems have hardware that detects sequential memory accesses and
initiates suitable cache movements automatically, making life easy.


@node Assembly Functional Units, Assembly Floating Point, Assembly Cache Handling, Assembly Coding
@subsection Functional Units

When choosing an approach for an assembly loop, consideration is given to
what operations can execute simultaneously and what throughput can thereby be
achieved.  In some cases an algorithm can be tweaked to accommodate available
resources.

Loop control will generally require a counter and pointer updates, costing as
much as 5 instructions, plus any delays a branch introduces.  CPU addressing
modes might reduce pointer updates, perhaps by allowing just one updating
pointer and others expressed as offsets from it, or on CISC chips with all
addressing done with the loop counter as a scaled index.

The final loop control cost can be amortised by processing several limbs in
each iteration (@pxref{Assembly Loop Unrolling}).  This at least ensures loop
control isn't a big fraction of the work done.

Memory throughput is always a limit.  If perhaps only one load or one store
can be done per cycle then 3 cycles/limb will be the top speed for ``binary''
operations like @code{mpn_add_n}, and any code achieving that is optimal.

Integer resources can be freed up by having the loop counter in a float
register, or by pressing the float units into use for some multiplying,
perhaps doing every second limb on the float side (@pxref{Assembly Floating
Point}).

Float resources can be freed up by doing carry propagation on the integer
side, or even by doing integer to float conversions in integers using bit
twiddling.


@node Assembly Floating Point, Assembly SIMD Instructions, Assembly Functional Units, Assembly Coding
@subsection Floating Point
@cindex Assembly floating point

Floating point arithmetic is used in GMP for multiplications on CPUs with poor
integer multipliers.  It's mostly useful for @code{mpn_mul_1},
@code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and
@code{mpn_mul_basecase} on both 32-bit and 64-bit machines.

With IEEE 53-bit double precision floats, integer multiplications producing up
to 53 bits will give exact results.  Breaking a 64@cross{}64 multiplication
into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient.  With
some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be
used, if one of the lower two 21-bit pieces also uses the sign bit.

For the @code{mpn_mul_1} family of functions on a 64-bit machine, the
invariant single limb is split at the start, into 3 or 4 pieces.  Inside the
loop, the bignum operand is split into 32-bit pieces.  Fast conversion of
these unsigned 32-bit pieces to floating point is highly machine-dependent.
In some cases, reading the data into the integer unit, zero-extending to
64 bits, then transferring to the floating point unit via memory is the
only option.

Converting partial products back to 64-bit limbs is usually best done as a
signed conversion.  Since all values are smaller than @m{2^{53},2^53}, signed
and unsigned are the same, but most processors lack unsigned conversions.

@sp 2

Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or
@code{mpn_addmul_1} with a 64-bit limb.  The single limb operand V is split
into four 16-bit parts.  The multi-limb operand U is split in the loop into
two 32-bit parts.
9506 9507@tex 9508\global\newdimen\GMPbits \global\GMPbits=0.18em 9509\def\GMPbox#1#2#3{% 9510 \hbox{% 9511 \hbox to 128\GMPbits{\hfil 9512 \vbox{% 9513 \hrule 9514 \hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}% 9515 \hrule}% 9516 \hskip #1\GMPbits}% 9517 \raise \GMPboxdepth \hbox{\hskip 2em #3}}} 9518% 9519\GMPdisplay{% 9520 \vbox{% 9521 \hbox{% 9522 \hbox to 128\GMPbits {\hfil 9523 \vbox{% 9524 \hrule 9525 \hbox to 64\GMPbits{% 9526 \GMPvrule \hfil$v48$\hfil 9527 \vrule \hfil$v32$\hfil 9528 \vrule \hfil$v16$\hfil 9529 \vrule \hfil$v00$\hfil 9530 \vrule} 9531 \hrule}}% 9532 \raise \GMPboxdepth \hbox{\hskip 2em V Operand}} 9533 \vskip 0.5ex 9534 \hbox{% 9535 \hbox to 128\GMPbits {\hfil 9536 \raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}% 9537 \vbox{% 9538 \hrule 9539 \hbox to 64\GMPbits {% 9540 \GMPvrule \hfil$u32$\hfil 9541 \vrule \hfil$u00$\hfil 9542 \vrule}% 9543 \hrule}}% 9544 \raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}% 9545 \vskip 0.5ex 9546 \hbox{\vbox to 2ex{\hrule width 128\GMPbits}}% 9547 \GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}% 9548 \vskip 0.5ex 9549 \GMPbox{16}{u00 \times v16}{$p16$} 9550 \vskip 0.5ex 9551 \GMPbox{32}{u00 \times v32}{$p32$} 9552 \vskip 0.5ex 9553 \GMPbox{48}{u00 \times v48}{$p48$} 9554 \vskip 0.5ex 9555 \GMPbox{32}{u32 \times v00}{$r32$} 9556 \vskip 0.5ex 9557 \GMPbox{48}{u32 \times v16}{$r48$} 9558 \vskip 0.5ex 9559 \GMPbox{64}{u32 \times v32}{$r64$} 9560 \vskip 0.5ex 9561 \GMPbox{80}{u32 \times v48}{$r80$} 9562}} 9563@end tex 9564@ifnottex 9565@example 9566@group 9567 +---+---+---+---+ 9568 |v48|v32|v16|v00| V operand 9569 +---+---+---+---+ 9570 9571 +-------+---+---+ 9572 x | u32 | u00 | U operand (one limb) 9573 +---------------+ 9574 9575--------------------------------- 9576 9577 +-----------+ 9578 | u00 x v00 | p00 48-bit products 9579 +-----------+ 9580 +-----------+ 9581 | u00 x v16 | p16 9582 +-----------+ 9583 +-----------+ 9584 | u00 x v32 | p32 9585 +-----------+ 9586 
+-----------+ 9587 | u00 x v48 | p48 9588 +-----------+ 9589 +-----------+ 9590 | u32 x v00 | r32 9591 +-----------+ 9592 +-----------+ 9593 | u32 x v16 | r48 9594 +-----------+ 9595 +-----------+ 9596 | u32 x v32 | r64 9597 +-----------+ 9598+-----------+ 9599| u32 x v48 | r80 9600+-----------+ 9601@end group 9602@end example 9603@end ifnottex 9604 9605@math{p32} and @math{r32} can be summed using floating-point addition, and 9606likewise @math{p48} and @math{r48}. @math{p00} and @math{p16} can be summed 9607with @math{r64} and @math{r80} from the previous iteration. 9608 9609For each loop then, four 49-bit quantities are transferred to the integer unit, 9610aligned as follows, 9611 9612@tex 9613% GMPbox here should be 49 bits wide, but use 51 to better show p16+r80' 9614% crossing into the upper 64 bits. 9615\def\GMPbox#1#2#3{% 9616 \hbox{% 9617 \hbox to 128\GMPbits {% 9618 \hfil 9619 \vbox{% 9620 \hrule 9621 \hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}% 9622 \hrule}% 9623 \hskip #1\GMPbits}% 9624 \raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}% 9625}} 9626\newbox\b \setbox\b\hbox{64 bits}% 9627\newdimen\bw \bw=\wd\b \advance\bw by 2em 9628\newdimen\x \x=128\GMPbits 9629\advance\x by -2\bw 9630\divide\x by4 9631\GMPdisplay{% 9632 \vbox{% 9633 \hbox to 128\GMPbits {% 9634 \GMPvrule 9635 \raise 0.5ex \vbox{\hrule \hbox to \x {}}% 9636 \hfil 64 bits\hfil 9637 \raise 0.5ex \vbox{\hrule \hbox to \x {}}% 9638 \vrule 9639 \raise 0.5ex \vbox{\hrule \hbox to \x {}}% 9640 \hfil 64 bits\hfil 9641 \raise 0.5ex \vbox{\hrule \hbox to \x {}}% 9642 \vrule}% 9643 \vskip 0.7ex 9644 \GMPbox{0}{p00+r64'}{i00} 9645 \vskip 0.5ex 9646 \GMPbox{16}{p16+r80'}{i16} 9647 \vskip 0.5ex 9648 \GMPbox{32}{p32+r32}{i32} 9649 \vskip 0.5ex 9650 \GMPbox{48}{p48+r48}{i48} 9651}} 9652@end tex 9653@ifnottex 9654@example 9655@group 9656|-----64bits----|-----64bits----| 9657 +------------+ 9658 | p00 + r64' | i00 9659 +------------+ 9660 +------------+ 9661 | p16 + r80' | i16 9662 
+------------+ 9663 +------------+ 9664 | p32 + r32 | i32 9665 +------------+ 9666 +------------+ 9667 | p48 + r48 | i48 9668 +------------+ 9669@end group 9670@end example 9671@end ifnottex 9672 9673The challenge then is to sum these efficiently and add in a carry limb, 9674generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48} 9675extends 33 bits into the high half). 9676 9677 9678@node Assembly SIMD Instructions, Assembly Software Pipelining, Assembly Floating Point, Assembly Coding 9679@subsection SIMD Instructions 9680@cindex Assembly SIMD 9681 9682The single-instruction multiple-data support in current microprocessors is 9683aimed at signal processing algorithms where each data point can be treated 9684more or less independently. There's generally not much support for 9685propagating the sort of carries that arise in GMP. 9686 9687SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much 9688work as one 32@cross{}32 from GMP's point of view, and need some shifts and 9689adds besides. But of course if say the SIMD form is fully pipelined and uses 9690less instruction decoding then it may still be worthwhile. 9691 9692On the x86 chips, MMX has so far found a use in @code{mpn_rshift} and 9693@code{mpn_lshift}, and is used in a special case for 16-bit multipliers in the 9694P55 @code{mpn_mul_1}. SSE2 is used for Pentium 4 @code{mpn_mul_1}, 9695@code{mpn_addmul_1}, and @code{mpn_submul_1}. 9696 9697 9698@node Assembly Software Pipelining, Assembly Loop Unrolling, Assembly SIMD Instructions, Assembly Coding 9699@subsection Software Pipelining 9700@cindex Assembly software pipelining 9701 9702Software pipelining consists of scheduling instructions around the branch 9703point in a loop. For example a loop might issue a load not for use in the 9704present iteration but the next, thereby allowing extra cycles for the data to 9705arrive from memory. 

Naturally this is wanted only when doing things like loads or multiplies that
take several cycles to complete, and only where a CPU has multiple functional
units so that other work can be done in the meantime.

A pipeline with several stages will have a data value in progress at each
stage and each loop iteration moves them along one stage.  This is like
juggling.

If the latency of some instruction is greater than the loop time then it will
be necessary to unroll, so one register has a result ready to use while
another (or multiple others) are still in progress (@pxref{Assembly Loop
Unrolling}).


@node Assembly Loop Unrolling, Assembly Writing Guide, Assembly Software Pipelining, Assembly Coding
@subsection Loop Unrolling
@cindex Assembly loop unrolling

Loop unrolling consists of replicating code so that several limbs are
processed in each loop.  At a minimum this reduces loop overheads by a
corresponding factor, but it can also allow better register usage, for example
alternately using one register combination and then another.  Judicious use of
@command{m4} macros can help avoid lots of duplication in the source code.

Any amount of unrolling can be handled with a loop counter that's decremented
by @math{N} each time, stopping when the remaining count is less than the
further @math{N} the loop will process.  Or by subtracting @math{N} at the
start, the termination condition becomes when the counter @math{C} is less
than 0 (and the count of remaining limbs is @math{C+N}).

Alternatively, for a power of 2 unroll, the loop count and remainder can be
established with a shift and mask.  This is convenient if also making a
computed jump into the middle of a large loop.
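
The shift and mask arrangement can be sketched in Python for a 4-limb
unrolling.  The operation (one's complement of each limb), the function name
and the 64-bit limb size are assumptions chosen to keep the example free of
carries; real unrolled loops would be written in assembly.

```python
MASK = (1 << 64) - 1                # assumed 64-bit limbs

def com_unrolled(src):
    """Return the one's complement of each limb, 4 limbs per main iteration."""
    n = len(src)
    blocks = n >> 2                 # number of 4-limb iterations, by a shift
    excess = n & 3                  # leftover limbs, by a mask
    dst = [0] * n
    i = 0
    for _ in range(excess):         # simple loop for the excess limbs
        dst[i] = ~src[i] & MASK
        i += 1
    for _ in range(blocks):         # unrolled body, one loop test per 4 limbs
        dst[i] = ~src[i] & MASK
        dst[i + 1] = ~src[i + 1] & MASK
        dst[i + 2] = ~src[i + 2] & MASK
        dst[i + 3] = ~src[i + 3] & MASK
        i += 4
    return dst
```
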
9740 9741The limbs not a multiple of the unrolling can be handled in various ways, for 9742example 9743 9744@itemize @bullet 9745@item 9746A simple loop at the end (or the start) to process the excess. Care will be 9747wanted that it isn't too much slower than the unrolled part. 9748 9749@item 9750A set of binary tests, for example after an 8-limb unrolling, test for 4 more 9751limbs to process, then a further 2 more or not, and finally 1 more or not. 9752This will probably take more code space than a simple loop. 9753 9754@item 9755A @code{switch} statement, providing separate code for each possible excess, 9756for example an 8-limb unrolling would have separate code for 0 remaining, 1 9757remaining, etc, up to 7 remaining. This might take a lot of code, but may be 9758the best way to optimize all cases in combination with a deep pipelined loop. 9759 9760@item 9761A computed jump into the middle of the loop, thus making the first iteration 9762handle the excess. This should make times smoothly increase with size, which 9763is attractive, but setups for the jump and adjustments for pointers can be 9764tricky and could become quite difficult in combination with deep pipelining. 9765@end itemize 9766 9767 9768@node Assembly Writing Guide, , Assembly Loop Unrolling, Assembly Coding 9769@subsection Writing Guide 9770@cindex Assembly writing guide 9771 9772This is a guide to writing software pipelined loops for processing limb 9773vectors in assembly. 9774 9775First determine the algorithm and which instructions are needed. Code it 9776without unrolling or scheduling, to make sure it works. On a 3-operand CPU 9777try to write each new value to a new register, this will greatly simplify later 9778steps. 9779 9780Then note for each instruction the functional unit and/or issue port 9781requirements. If an instruction can use either of two units, like U0 or U1 9782then make a category ``U0/U1''. 
Count the total using each unit (or combined 9783unit), and count all instructions. 9784 9785Figure out from those counts the best possible loop time. The goal will be to 9786find a perfect schedule where instruction latencies are completely hidden. 9787The total instruction count might be the limiting factor, or perhaps a 9788particular functional unit. It might be possible to tweak the instructions to 9789help the limiting factor. 9790 9791Suppose the loop time is @math{N}, then make @math{N} issue buckets, with the 9792final loop branch at the end of the last. Now fill the buckets with dummy 9793instructions using the functional units desired. Run this to make sure the 9794intended speed is reached. 9795 9796Now replace the dummy instructions with the real instructions from the slow 9797but correct loop you started with. The first will typically be a load 9798instruction. Then the instruction using that value is placed in a bucket an 9799appropriate distance down. Run the loop again, to check it still runs at 9800target speed. 9801 9802Keep placing instructions, frequently measuring the loop. After a few you 9803will need to wrap around from the last bucket back to the top of the loop. If 9804you used the new-register for new-value strategy above then there will be no 9805register conflicts. If not then take care not to clobber something already in 9806use. Changing registers at this time is very error prone. 9807 9808The loop will overlap two or more of the original loop iterations, and the 9809computation of one vector element result will be started in one iteration of 9810the new loop, and completed one or several iterations later. 9811 9812The final step is to create feed-in and wind-down code for the loop. 
A good 9813way to do this is to make a copy (or copies) of the loop at the start and 9814delete those instructions which don't have valid antecedents, and at the end 9815replicate and delete those whose results are unwanted (including any further 9816loads). 9817 9818The loop will have a minimum number of limbs loaded and processed, so the 9819feed-in code must test if the request size is smaller and skip either to a 9820suitable part of the wind-down or to special code for small sizes. 9821 9822 9823@node Internals, Contributors, Algorithms, Top 9824@chapter Internals 9825@cindex Internals 9826 9827@strong{This chapter is provided only for informational purposes and the 9828various internals described here may change in future GMP releases. 9829Applications expecting to be compatible with future releases should use only 9830the documented interfaces described in previous chapters.} 9831 9832@menu 9833* Integer Internals:: 9834* Rational Internals:: 9835* Float Internals:: 9836* Raw Output Internals:: 9837* C++ Interface Internals:: 9838@end menu 9839 9840@node Integer Internals, Rational Internals, Internals, Internals 9841@section Integer Internals 9842@cindex Integer internals 9843 9844@code{mpz_t} variables represent integers using sign and magnitude, in space 9845dynamically allocated and reallocated. The fields are as follows. 9846 9847@table @asis 9848@item @code{_mp_size} 9849The number of limbs, or the negative of that when representing a negative 9850integer. Zero is represented by @code{_mp_size} set to zero, in which case 9851the @code{_mp_d} data is unused. 9852 9853@item @code{_mp_d} 9854A pointer to an array of limbs which is the magnitude. These are stored 9855``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the 9856least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most 9857significant. Whenever @code{_mp_size} is non-zero, the most significant limb 9858is non-zero. 

Currently there's always at least one limb allocated, so for instance
@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
@code{_mp_d[0]} unconditionally (though its value is then only wanted if
@code{_mp_size} is non-zero).

@item @code{_mp_alloc}
@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
and naturally @code{_mp_alloc >= ABS(_mp_size)}. When an @code{mpz} routine
is about to (or might be about to) increase @code{_mp_size}, it checks
@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
@code{MPZ_REALLOC} is generally used for this.
@end table

The various bitwise logical functions like @code{mpz_and} behave as if
negative values were twos complement. But sign and magnitude is always used
internally, and necessary adjustments are made during the calculations.
Sometimes this isn't pretty, but sign and magnitude are best for other
routines.

Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
allocation functions. Care is taken to ensure that these are big enough that
no reallocation is necessary (since it would have unpredictable consequences).

@code{_mp_size} and @code{_mp_alloc} are @code{int}, although @code{mp_size_t}
is usually a @code{long}. This is done to make the fields just 32 bits on
some 64-bit systems, thereby saving a few bytes of data space but still
providing plenty of range.


@node Rational Internals, Float Internals, Integer Internals, Internals
@section Rational Internals
@cindex Rational internals

@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
denominator (@pxref{Integer Internals}).
9896 9897The canonical form adopted is denominator positive (and non-zero), no common 9898factors between numerator and denominator, and zero uniquely represented as 98990/1. 9900 9901It's believed that casting out common factors at each stage of a calculation 9902is best in general. A GCD is an @math{O(N^2)} operation so it's better to do 9903a few small ones immediately than to delay and have to do a big one later. 9904Knowing the numerator and denominator have no common factors can be used for 9905example in @code{mpq_mul} to make only two cross GCDs necessary, not four. 9906 9907This general approach to common factors is badly sub-optimal in the presence 9908of simple factorizations or little prospect for cancellation, but GMP has no 9909way to know when this will occur. As per @ref{Efficiency}, that's left to 9910applications. The @code{mpq_t} framework might still suit, with 9911@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and 9912denominator, or of course @code{mpz_t} variables can be used directly. 9913 9914 9915@node Float Internals, Raw Output Internals, Rational Internals, Internals 9916@section Float Internals 9917@cindex Float internals 9918 9919Efficient calculation is the primary aim of GMP floats and the use of whole 9920limbs and simple rounding facilitates this. 9921 9922@code{mpf_t} floats have a variable precision mantissa and a single machine 9923word signed exponent. The mantissa is represented using sign and magnitude. 9924 9925@c FIXME: The arrow heads don't join to the lines exactly. 
9926@tex 9927\global\newdimen\GMPboxwidth \GMPboxwidth=5em 9928\global\newdimen\GMPboxheight \GMPboxheight=3ex 9929\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}} 9930\GMPdisplay{% 9931\vbox{% 9932 \hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb} 9933 \vskip 0.7ex 9934 \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}} 9935 \hbox { 9936 \hbox to 3\GMPboxwidth {% 9937 \setbox 0 = \hbox{@code{\_mp\_exp}}% 9938 \dimen0=3\GMPboxwidth 9939 \advance\dimen0 by -\wd0 9940 \divide\dimen0 by 2 9941 \advance\dimen0 by -1em 9942 \setbox1 = \hbox{$\rightarrow$}% 9943 \dimen1=\dimen0 9944 \advance\dimen1 by -\wd1 9945 \GMPcentreline{\dimen0}% 9946 \hfil 9947 \box0% 9948 \hfil 9949 \GMPcentreline{\dimen1{}}% 9950 \box1} 9951 \hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}} 9952 \vskip 0.5ex 9953 \vbox {% 9954 \hrule 9955 \hbox{% 9956 \vrule height 2ex depth 1ex 9957 \hbox to \GMPboxwidth {}% 9958 \vrule 9959 \hbox to \GMPboxwidth {}% 9960 \vrule 9961 \hbox to \GMPboxwidth {}% 9962 \vrule 9963 \hbox to \GMPboxwidth {}% 9964 \vrule 9965 \hbox to \GMPboxwidth {}% 9966 \vrule} 9967 \hrule 9968 } 9969 \hbox {% 9970 \hbox to 0.8 pt {} 9971 \hbox to 3\GMPboxwidth {% 9972 \hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}} 9973 \hbox to 5\GMPboxwidth{% 9974 \setbox 0 = \hbox{@code{\_mp\_size}}% 9975 \dimen0 = 5\GMPboxwidth 9976 \advance\dimen0 by -\wd0 9977 \divide\dimen0 by 2 9978 \advance\dimen0 by -1em 9979 \dimen1 = \dimen0 9980 \setbox1 = \hbox{$\leftarrow$}% 9981 \setbox2 = \hbox{$\rightarrow$}% 9982 \advance\dimen0 by -\wd1 9983 \advance\dimen1 by -\wd2 9984 \hbox to 0.3 em {}% 9985 \box1 9986 \GMPcentreline{\dimen0}% 9987 \hfil 9988 \box0 9989 \hfil 9990 \GMPcentreline{\dimen1}% 9991 \box2} 9992}} 9993@end tex 9994@ifnottex 9995@example 9996 most least 9997significant significant 9998 limb limb 9999 10000 _mp_d 10001 |---- _mp_exp ---> | 10002 _____ _____ _____ _____ _____ 10003 |_____|_____|_____|_____|_____| 10004 
. <------------ radix point 10005 10006 <-------- _mp_size ---------> 10007@sp 1 10008@end example 10009@end ifnottex 10010 10011@noindent 10012The fields are as follows. 10013 10014@table @asis 10015@item @code{_mp_size} 10016The number of limbs currently in use, or the negative of that when 10017representing a negative value. Zero is represented by @code{_mp_size} and 10018@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is 10019unused. (In the future @code{_mp_exp} might be undefined when representing 10020zero.) 10021 10022@item @code{_mp_prec} 10023The precision of the mantissa, in limbs. In any calculation the aim is to 10024produce @code{_mp_prec} limbs of result (the most significant being non-zero). 10025 10026@item @code{_mp_d} 10027A pointer to the array of limbs which is the absolute value of the mantissa. 10028These are stored ``little endian'' as per the @code{mpn} functions, so 10029@code{_mp_d[0]} is the least significant limb and 10030@code{_mp_d[ABS(_mp_size)-1]} the most significant. 10031 10032The most significant limb is always non-zero, but there are no other 10033restrictions on its value, in particular the highest 1 bit can be anywhere 10034within the limb. 10035 10036@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being 10037for convenience (see below). There are no reallocations during a calculation, 10038only in a change of precision with @code{mpf_set_prec}. 10039 10040@item @code{_mp_exp} 10041The exponent, in limbs, determining the location of the implied radix point. 10042Zero means the radix point is just above the most significant limb. Positive 10043values mean a radix point offset towards the lower limbs and hence a value 10044@math{@ge{} 1}, as for example in the diagram above. Negative exponents mean 10045a radix point further above the highest limb. 

Naturally the exponent can be any value; it doesn't have to fall within the
limbs as the diagram shows, and can be a long way above or a long way below.
Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
are treated as zero.
@end table

The @code{_mp_size} and @code{_mp_prec} fields are @code{int}, although the
@code{mp_size_t} type is usually a @code{long}. The @code{_mp_exp} field is
usually @code{long}. This is done to make some fields just 32 bits on some
64-bit systems, thereby saving a few bytes of data space but still providing
plenty of precision and a very large range.


@sp 1
@noindent
The following points should be noted.

@table @asis
@item Low Zeros
The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
zeros can always be ignored. Routines likely to produce low zeros check and
avoid them to save time in subsequent calculations, but for most routines
they're quite unlikely and aren't checked.

@item Mantissa Size Range
The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
the value can be represented in fewer. This means low precision values or
small integers stored in a high precision @code{mpf_t} can still be operated
on efficiently.

@code{_mp_size} can also be greater than @code{_mp_prec}. Firstly a value is
allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
@code{_mp_prec}.

@item Rounding
All rounding is done on limb boundaries. Calculating @code{_mp_prec} limbs
with the high limb non-zero will ensure the application-requested minimum
precision is obtained.
10087 10088The use of simple ``trunc'' rounding towards zero is efficient, since there's 10089no need to examine extra limbs and increment or decrement. 10090 10091@item Bit Shifts 10092Since the exponent is in limbs, there are no bit shifts in basic operations 10093like @code{mpf_add} and @code{mpf_mul}. When differing exponents are 10094encountered all that's needed is to adjust pointers to line up the relevant 10095limbs. 10096 10097Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts, 10098but the choice is between an exponent in limbs which requires shifts there, or 10099one in bits which requires them almost everywhere else. 10100 10101@item Use of @code{_mp_prec+1} Limbs 10102The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just 10103@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its 10104operation. @code{mpf_add} for instance will do an @code{mpn_add} of 10105@code{_mp_prec} limbs. If there's no carry then that's the result, but if 10106there is a carry then it's stored in the extra limb of space and 10107@code{_mp_size} becomes @code{_mp_prec+1}. 10108 10109Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not 10110needed for the intended precision, only the @code{_mp_prec} high limbs. But 10111zeroing it out or moving the rest down is unnecessary. Subsequent routines 10112reading the value will simply take the high limbs they need, and this will be 10113@code{_mp_prec} if their target has that same precision. This is no more than 10114a pointer adjustment, and must be checked anyway since the destination 10115precision can be different from the sources. 10116 10117Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs 10118if available. This ensures that a variable which has @code{_mp_size} equal to 10119@code{_mp_prec+1} will get its full exact value copied. 
Strictly speaking
this is unnecessary since only @code{_mp_prec} limbs are needed for the
application's requested precision, but it's considered that an @code{mpf_set}
from one variable into another of the same precision ought to produce an exact
copy.

@item Application Precisions
@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
@code{_mp_prec}. The value in bits is rounded up to a whole limb, then an
extra limb is added since the most significant limb of @code{_mp_d} is only
guaranteed to be non-zero and therefore might contain only one significant bit.

@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
limb from @code{_mp_prec} before converting to bits. The net effect of
reading back with @code{mpf_get_prec} is simply the precision rounded up to a
multiple of @code{mp_bits_per_limb}.

Note that the extra limb added here for the high only being non-zero is in
addition to the extra limb allocated to @code{_mp_d}. For example with a
32-bit limb, an application request for 250 bits will be rounded up to 8
limbs, then an extra added for the high being only non-zero, giving an
@code{_mp_prec} of 9. @code{_mp_d} then gets 10 limbs allocated. Reading
back with @code{mpf_get_prec} will take @code{_mp_prec}, subtract 1 limb and
multiply by 32, giving 256 bits.

Strictly speaking, the fact the high limb has at least one bit means that a
float with, say, 3 limbs of 32 bits each will be holding at least 65 bits, but
for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice
multiple of the limb size.
@end table


@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
@section Raw Output Internals
@cindex Raw output internals

@noindent
@code{mpz_out_raw} uses the following format.
10157 10158@tex 10159\global\newdimen\GMPboxwidth \GMPboxwidth=5em 10160\global\newdimen\GMPboxheight \GMPboxheight=3ex 10161\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}} 10162\GMPdisplay{% 10163\vbox{% 10164 \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}} 10165 \vbox {% 10166 \hrule 10167 \hbox{% 10168 \vrule height 2.5ex depth 1.5ex 10169 \hbox to \GMPboxwidth {\hfil size\hfil}% 10170 \vrule 10171 \hbox to 3\GMPboxwidth {\hfil data bytes\hfil}% 10172 \vrule} 10173 \hrule} 10174}} 10175@end tex 10176@ifnottex 10177@example 10178+------+------------------------+ 10179| size | data bytes | 10180+------+------------------------+ 10181@end example 10182@end ifnottex 10183 10184The size is 4 bytes written most significant byte first, being the number of 10185subsequent data bytes, or the twos complement negative of that when a negative 10186integer is represented. The data bytes are the absolute value of the integer, 10187written most significant byte first. 10188 10189The most significant data byte is always non-zero, so the output is the same 10190on all systems, irrespective of limb size. 10191 10192In GMP 1, leading zero bytes were written to pad the data bytes to a multiple 10193of the limb size. @code{mpz_inp_raw} will still accept this, for 10194compatibility. 10195 10196The use of ``big endian'' for both the size and data fields is deliberate, it 10197makes the data easy to read in a hex dump of a file. Unfortunately it also 10198means that the limb data must be reversed when reading or writing, so neither 10199a big endian nor little endian system can just read and write @code{_mp_d}. 10200 10201 10202@node C++ Interface Internals, , Raw Output Internals, Internals 10203@section C++ Interface Internals 10204@cindex C++ interface internals 10205 10206A system of expression templates is used to ensure something like @code{a=b+c} 10207turns into a simple call to @code{mpz_add} etc. 
For @code{mpf_class} 10208the scheme also ensures the precision of the final 10209destination is used for any temporaries within a statement like 10210@code{f=w*x+y*z}. These are important features which a naive implementation 10211cannot provide. 10212 10213A simplified description of the scheme follows. The true scheme is 10214complicated by the fact that expressions have different return types. For 10215detailed information, refer to the source code. 10216 10217To perform an operation, say, addition, we first define a ``function object'' 10218evaluating it, 10219 10220@example 10221struct __gmp_binary_plus 10222@{ 10223 static void eval(mpf_t f, mpf_t g, mpf_t h) @{ mpf_add(f, g, h); @} 10224@}; 10225@end example 10226 10227@noindent 10228And an ``additive expression'' object, 10229 10230@example 10231__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> > 10232operator+(const mpf_class &f, const mpf_class &g) 10233@{ 10234 return __gmp_expr 10235 <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g); 10236@} 10237@end example 10238 10239The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<@dots{}>>} is used to 10240encapsulate any possible kind of expression into a single template type. In 10241fact even @code{mpf_class} etc are @code{typedef} specializations of 10242@code{__gmp_expr}. 10243 10244Next we define assignment of @code{__gmp_expr} to @code{mpf_class}. 
10245 10246@example 10247template <class T> 10248mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr) 10249@{ 10250 expr.eval(this->get_mpf_t(), this->precision()); 10251 return *this; 10252@} 10253 10254template <class Op> 10255void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval 10256(mpf_t f, mp_bitcnt_t precision) 10257@{ 10258 Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t()); 10259@} 10260@end example 10261 10262where @code{expr.val1} and @code{expr.val2} are references to the expression's 10263operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the 10264@code{__gmp_expr}). 10265 10266This way, the expression is actually evaluated only at the time of assignment, 10267when the required precision (that of @code{f}) is known. Furthermore the 10268target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly 10269with @code{f} as the output argument. 10270 10271Compound expressions are handled by defining operators taking subexpressions 10272as their arguments, like this: 10273 10274@example 10275template <class T, class U> 10276__gmp_expr 10277<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> > 10278operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2) 10279@{ 10280 return __gmp_expr 10281 <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> > 10282 (expr1, expr2); 10283@} 10284@end example 10285 10286And the corresponding specializations of @code{__gmp_expr::eval}: 10287 10288@example 10289template <class T, class U, class Op> 10290void __gmp_expr 10291<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval 10292(mpf_t f, mp_bitcnt_t precision) 10293@{ 10294 // declare two temporaries 10295 mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision); 10296 Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t()); 10297@} 10298@end example 10299 10300The expression is thus recursively evaluated to any level of complexity and 10301all 
subexpressions are evaluated to the precision of @code{f}.


@node Contributors, References, Internals, Top
@comment node-name, next, previous, up
@appendix Contributors
@cindex Contributors

Torbj@"orn Granlund wrote the original GMP library and is still the main
developer. Code not explicitly attributed to others was contributed by
Torbj@"orn. Several other individuals and organizations have contributed to
GMP. Here is a list in chronological order of first contribution:

Gunnar Sj@"odin and Hans Riesel helped with mathematical problems in early
versions of the library.

Richard Stallman helped with the interface design and revised the first
version of this manual.

Brian Beuning and Doug Lea helped with testing of early versions of the
library and made creative suggestions.

John Amanatides of York University in Canada contributed the function
@code{mpz_probab_prime_p}.

Paul Zimmermann wrote the REDC-based @code{mpz_powm} code, the Sch@"onhage-Strassen
FFT multiply code, and the Karatsuba square root code. He also improved the
Toom3 code for GMP 4.2. Paul sparked the development of GMP 2, with his
comparisons between bignum packages. The ECMNET project Paul is organizing
was a driving force behind many of the optimizations in GMP 3. Paul also
wrote the new GMP 4.3 nth root code (with Torbj@"orn).

Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
contributed now-defunct versions of @code{mpz_gcd}, @code{mpz_divexact},
@code{mpn_gcd}, and @code{mpn_bdivmod}, partially supported by CNPq (Brazil)
grant 301314194-2.

Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
He has also made valuable suggestions and tested numerous intermediary
releases.
10341 10342Joachim Hollman was involved in the design of the @code{mpf} interface, and in 10343the @code{mpz} design revisions for version 2. 10344 10345Bennet Yee contributed the initial versions of @code{mpz_jacobi} and 10346@code{mpz_legendre}. 10347 10348Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and 10349@file{mpn/m68k/rshift.S} (now in @file{.asm} form). 10350 10351Robert Harley of Inria, France and David Seal of ARM, England, suggested clever 10352improvements for population count. Robert also wrote highly optimized 10353Karatsuba and 3-way Toom multiplication functions for GMP 3, and contributed 10354the ARM assembly code. 10355 10356Torsten Ekedahl of the Mathematical department of Stockholm University provided 10357significant inspiration during several phases of the GMP development. His 10358mathematical expertise helped improve several algorithms. 10359 10360Linus Nordberg wrote the new configure system based on autoconf and 10361implemented the new random functions. 10362 10363Kevin Ryde worked on a large number of things: optimized x86 code, m4 asm 10364macros, parameter tuning, speed measuring, the configure system, function 10365inlining, divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas 10366number functions, printf and scanf functions, perl interface, demo expression 10367parser, the algorithms chapter in the manual, @file{gmpasm-mode.el}, and 10368various miscellaneous improvements elsewhere. 10369 10370Kent Boortz made the Mac OS 9 port. 10371 10372Steve Root helped write the optimized alpha 21264 assembly code. 10373 10374Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++ 10375@code{istream} input routines. 10376 10377Jason Moxham rewrote @code{mpz_fac_ui}. 10378 10379Pedro Gimeno implemented the Mersenne Twister and made other random number 10380improvements. 

Niels M@"oller wrote the sub-quadratic GCD, extended GCD and Jacobi code, the
quadratic Hensel division code, and (with Torbj@"orn) the new divide and
conquer division code for GMP 4.3. Niels also helped implement the new Toom
multiply code for GMP 4.3 and implemented helper functions to simplify Toom
evaluations for GMP 5.0. He wrote the original version of
@code{mpn_mulmod_bnm1}, and he is the main author of the mini-gmp package used
for GMP bootstrapping.

Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply strategy,
and found the optimal strategies for evaluation and interpolation in Toom
multiplication.

Marco Bodrato helped implement the new Toom multiply code for GMP 4.3 and
implemented most of the new Toom multiply and squaring code for 5.0.
He is the main author of the current @code{mpn_mulmod_bnm1} and
@code{mpn_mullo_n}. Marco also wrote the functions @code{mpn_invert} and
@code{mpn_invertappr}. He is the author of the current combinatorial
functions: binomial, factorial, multifactorial, primorial.

David Harvey suggested the internal function @code{mpn_bdiv_dbm1}, implementing
division relevant to Toom multiplication. He also worked on fast assembly
sequences, in particular on a fast AMD64 @code{mpn_mul_basecase}. He wrote
the internal middle product functions @code{mpn_mulmid_basecase},
@code{mpn_toom42_mulmid}, @code{mpn_mulmid_n} and related helper routines.

Martin Boij wrote @code{mpn_perfect_power_p}.

Marc Glisse improved @file{gmpxx.h}: fewer temporaries (faster),
specializations of @code{numeric_limits} and @code{common_type}, C++11
features (move constructors, explicit bool conversion, UDL), an explicit
conversion from @code{mpq_class} to @code{mpz_class}, optimized
operations where one argument is a small compile-time constant, and
some heap allocations replaced by stack allocations.
He also fixed the eofbit
handling of C++ streams, and removed one division from @file{mpq/aors.c}.

(This list is chronological, not ordered by significance. If you have
contributed to GMP but are not listed above, please tell
@email{gmp-devel@@gmplib.org} about the omission!)

The development of floating point functions of GNU MP 2 was supported in part
by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO (POlynomial
System SOlving).

The development of GMP 2, 3, and 4 was supported in part by the IDA Center for
Computing Sciences.

Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
environment.

@node References, GNU Free Documentation License, Contributors, Top
@comment node-name, next, previous, up
@appendix References
@cindex References

@c FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
@c but being long words they upset paragraph formatting (the preceding line
@c can get badly stretched). Would like a conditional @* style line break
@c if the uref is too long to fit on the last line of the paragraph, but it's
@c not clear how to do that. For now explicit @texlinebreak{}s are used on
@c paragraphs that come out bad.

@section Books

@itemize @bullet
@item
Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in
Analytic Number Theory and Computational Complexity'', Wiley, 1998.

@item
Richard Crandall and Carl Pomerance, ``Prime Numbers: A Computational
Perspective'', 2nd edition, Springer-Verlag, 2005.
@texlinebreak{} @uref{http://www.math.dartmouth.edu/~carlp/}

@item
Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate
Texts in Mathematics number 138, Springer-Verlag, 1993.
10457@texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen/} 10458 10459@item 10460Donald E. Knuth, ``The Art of Computer Programming'', volume 2, 10461``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998. 10462@texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html} 10463 10464@item 10465John D. Lipson, ``Elements of Algebra and Algebraic Computing'', 10466The Benjamin Cummings Publishing Company Inc, 1981. 10467 10468@item 10469Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of 10470Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/} 10471 10472@item 10473Richard M. Stallman and the GCC Developer Community, ``Using the GNU Compiler 10474Collection'', Free Software Foundation, 2008, available online 10475@uref{http://gcc.gnu.org/onlinedocs/}, and in the GCC package 10476@uref{ftp://ftp.gnu.org/gnu/gcc/} 10477@end itemize 10478 10479@section Papers 10480 10481@itemize @bullet 10482@item 10483Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square 10484Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252. Also 10485available online as INRIA Research Report 4475, June 2002, 10486@uref{http://hal.inria.fr/docs/00/07/21/13/PDF/RR-4475.pdf} 10487 10488@item 10489Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'', 10490Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022, 10491@texlinebreak{} @uref{http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022} 10492 10493@item 10494Torbj@"orn Granlund and Peter L. Montgomery, ``Division by Invariant Integers 10495using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June 104961994. Also available @uref{http://gmplib.org/~tege/divcnst-pldi94.pdf}. 10497 10498@item 10499Niels M@"oller and Torbj@"orn Granlund, ``Improved division by invariant 10500integers'', IEEE Transactions on Computers, 11 June 2010. 
@uref{http://gmplib.org/~tege/division-paper.pdf}

@item
Torbj@"orn Granlund and Niels M@"oller, ``Division of integers large and
small'', to appear.

@item
Tudor Jebelean,
``An algorithm for exact division'',
Journal of Symbolic Computation,
volume 15, 1993, pp.@: 169-180.
Research report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}

@item
Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}

@item
Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
ISSAC 97, pp.@: 339-341. Technical report available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}

@item
Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
pp.@: 111-116. Technical report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}

@item
Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
pp.@: 145-157. Technical report version also available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}

@item
Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455.
Early
technical report version also available
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}

@item
Makoto Matsumoto and Takuji Nishimura, ``Mersenne Twister: A 623-dimensionally
equidistributed uniform pseudorandom number generator'', ACM Transactions on
Modeling and Computer Simulation, volume 8, January 1998, pp.@: 3-30.
Available online @texlinebreak{}
@uref{http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.ps.gz} (or .pdf)

@item
R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'',
Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
Theory, October 1972, pp.@: 90-96. Reprinted as ``Fast Modular Transforms'',
Journal of Computer and System Sciences, volume 8, number 3, June 1974,
pp.@: 366-386.

@item
Niels M@"oller, ``On Sch@"onhage's algorithm and subquadratic integer GCD
computation'', in Mathematics of Computation, volume 77, January 2008, pp.@:
589-607.

@item
Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
Mathematics of Computation, volume 44, number 170, April 1985.

@item
Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
Zahlen'', Computing 7, 1971, pp.@: 281-292.

@item
Kenneth Weber, ``The accelerated integer GCD algorithm'',
ACM Transactions on Mathematical Software,
volume 21, number 1, March 1995, pp.@: 111-122.

@item
Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
November 1999, @uref{http://hal.inria.fr/inria-00072854/PDF/RR-3805.pdf}

@item
Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
Implementations'', @texlinebreak{}
@uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz}

@item
Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
Symposium on Computer Arithmetic, 1993, pp.@: 260-271. Reprinted as ``More
on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
volume 43, number 8, August 1994, pp.@: 899-908.
@end itemize


@node GNU Free Documentation License, Concept Index, References, Top
@appendix GNU Free Documentation License
@cindex GNU Free Documentation License
@cindex Free Documentation License
@cindex Documentation license
@include fdl-1.3.texi


@node Concept Index, Function Index, GNU Free Documentation License, Top
@comment node-name, next, previous, up
@unnumbered Concept Index
@printindex cp

@node Function Index, , Concept Index, Top
@comment node-name, next, previous, up
@unnumbered Function and Type Index
@printindex fn

@bye

@c Local variables:
@c fill-column: 78
@c compile-command: "make gmp.info"
@c End: