\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename gmp.info
@documentencoding ISO-8859-1
@include version.texi
@settitle GNU MP @value{VERSION}
@synindex tp fn
@iftex
@afourpaper
@end iftex
@comment %**end of header

@copying
This manual describes how to install and use the GNU multiple precision
arithmetic library, version @value{VERSION}.

Copyright 1991, 1993-2016, 2018-2020 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
software''. A copy of the license is included in
@ref{GNU Free Documentation License}.
@end copying
@c Note the @ref above must be on one line, a line break in an @ref within
@c @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes
@c with texinfo 4.7), with messages about missing @endcsname.


@c Texinfo version 4.2 or up will be needed to process this file.
@c
@c The version number and edition number are taken from version.texi provided
@c by automake (note that it's regenerated only if you configure with
@c --enable-maintainer-mode).
@c
@c Notes discussing the present version number of GMP in relation to previous
@c ones (for instance in the "Compatibility" section) must be updated
@c manually though.
@c
@c @cindex entries have been made for function categories and programming
@c topics. The "mpn" section is not included in this, because a beginner
@c looking for "GCD" or something is only going to be confused by pointers to
@c low level routines.
@c
@c @cindex entries are present for processors and systems when there are
@c particular notes concerning them, but not just for everything GMP
@c supports.
@c
@c Index entries for files use @code rather than @file, @samp or @option,
@c since the latter come out with quotes in TeX, which are nice in the text
@c but don't look so good in index columns.
@c
@c TeX:
@c
@c A suitable texinfo.tex is supplied, a newer one should work equally well.
@c
@c HTML:
@c
@c Nothing special is done for links to external manuals, they just come out
@c in the usual makeinfo style, eg. "../libc/Locales.html". If you have
@c local copies of such manuals then this is a good thing, if not then you
@c may want to search-and-replace to some online source.
@c

@dircategory GNU libraries
@direntry
* gmp: (gmp).                   GNU Multiple Precision Arithmetic Library.
@end direntry

@c html <meta name="description" content="...">
@documentdescription
How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}.
@end documentdescription

@c smallbook
@finalout
@setchapternewpage on

@ifnottex
@node Top, Copying, (dir), (dir)
@top GNU MP
@end ifnottex

@iftex
@titlepage
@title GNU MP
@subtitle The GNU Multiple Precision Arithmetic Library
@subtitle Edition @value{EDITION}
@subtitle @value{UPDATED}

@author by Torbj@"orn Granlund and the GMP development team
@c @email{tg@@gmplib.org}

@c Include the Distribution inside the titlepage so
@c that headings are turned off.

@tex
\global\parindent=0pt
\global\parskip=8pt
\global\baselineskip=13pt
@end tex

@page
@vskip 0pt plus 1filll
@end iftex

@insertcopying
@ifnottex
@sp 1
@end ifnottex

@iftex
@end titlepage
@headings double
@end iftex

@c Don't bother with contents for html, the menus seem adequate.
@ifnothtml
@contents
@end ifnothtml

@menu
* Copying::                    GMP Copying Conditions (LGPL).
* Introduction to GMP::        Brief introduction to GNU MP.
* Installing GMP::             How to configure and compile the GMP library.
* GMP Basics::                 What every GMP user should know.
* Reporting Bugs::             How to usefully report bugs.
* Integer Functions::          Functions for arithmetic on signed integers.
* Rational Number Functions::  Functions for arithmetic on rational numbers.
* Floating-point Functions::   Functions for arithmetic on floats.
* Low-level Functions::        Fast functions for natural numbers.
* Random Number Functions::    Functions for generating random numbers.
* Formatted Output::           @code{printf} style output.
* Formatted Input::            @code{scanf} style input.
* C++ Class Interface::        Class wrappers around GMP types.
* Custom Allocation::          How to customize the internal allocation.
* Language Bindings::          Using GMP from other languages.
* Algorithms::                 What happens behind the scenes.
* Internals::                  How values are represented behind the scenes.

* Contributors::               Who brings you this library?
* References::                 Some useful papers and books to read.
* GNU Free Documentation License::
* Concept Index::
* Function Index::
@end menu


@c @m{T,N} is $T$ in tex or @math{N} otherwise. Commas in N or T don't work,
@c but @C{} can be used instead.
@iftex
@macro m {T,N}
@tex$\T\$@end tex
@end macro
@end iftex
@ifnottex
@macro m {T,N}
@math{\N\}
@end macro
@end ifnottex

@c @mm{T,N} is $T$ tex and html and @math{N} in info. Commas in N or T don't
@c work, but @C{} can be used instead.
@iftex
@macro mm {T,N}
@tex$\T\$@end tex
@end macro
@end iftex

@ifhtml
@macro mm {T,N}
@math{\T\}
@end macro
@end ifhtml

@ifinfo
@macro mm {T,N}
@math{\N\}
@end macro
@end ifinfo


@macro C {}
,
@end macro

@c @ms{V,N} is $V_N$ in tex or just vn otherwise. This suits simple
@c subscripts like @ms{x,0}.
@iftex
@macro ms {V,N}
@tex$\V\_{\N\}$@end tex
@end macro
@end iftex
@ifnottex
@macro ms {V,N}
\V\\N\
@end macro
@end ifnottex

@c @nicode{S} is plain S in info, or @code{S} elsewhere. This can be used
@c when the quotes that @code{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted. Doesn't work as @nicode{'\\0'}
@c though (gives two backslashes in tex).
@ifinfo
@macro nicode {S}
\S\
@end macro
@end ifinfo
@ifnotinfo
@macro nicode {S}
@code{\S\}
@end macro
@end ifnotinfo

@c @nisamp{S} is plain S in info, or @samp{S} elsewhere. This can be used
@c when the quotes that @samp{} gives in info aren't wanted, but the
@c fontification in tex or html is wanted.
@ifinfo
@macro nisamp {S}
\S\
@end macro
@end ifinfo
@ifnotinfo
@macro nisamp {S}
@samp{\S\}
@end macro
@end ifnotinfo

@c Usage: @GMPtimes{}
@c Give either \times or the word "times".
@tex
\gdef\GMPtimes{\times}
@end tex
@ifnottex
@macro GMPtimes
times
@end macro
@end ifnottex

@c Usage: @GMPmultiply{}
@c Give * in info, or nothing in tex.
@tex
\gdef\GMPmultiply{}
@end tex
@ifnottex
@macro GMPmultiply
*
@end macro
@end ifnottex

@c Usage: @GMPabs{x}
@c Give either |x| in tex, or abs(x) in info or html.
@tex
\gdef\GMPabs#1{|#1|}
@end tex
@ifnottex
@macro GMPabs {X}
@abs{}(\X\)
@end macro
@end ifnottex

@c Usage: @GMPfloor{x}
@c Give either \lfloor x\rfloor in tex, or floor(x) in info or html.
@tex
\gdef\GMPfloor#1{\lfloor #1\rfloor}
@end tex
@ifnottex
@macro GMPfloor {X}
floor(\X\)
@end macro
@end ifnottex

@c Usage: @GMPceil{x}
@c Give either \lceil x\rceil in tex, or ceil(x) in info or html.
@tex
\gdef\GMPceil#1{\lceil #1 \rceil}
@end tex
@ifnottex
@macro GMPceil {X}
ceil(\X\)
@end macro
@end ifnottex

@c Math operators already available in tex, made available in info too.
@c For example @bmod{} can be used in both tex and info.
@ifnottex
@macro bmod
mod
@end macro
@macro gcd
gcd
@end macro
@macro ge
>=
@end macro
@macro le
<=
@end macro
@macro log
log
@end macro
@macro min
min
@end macro
@macro leftarrow
<-
@end macro
@macro rightarrow
->
@end macro
@end ifnottex

@c New math operators.
@c @abs{} can be used in both tex and info, or just \abs in tex.
@tex
\gdef\abs{\mathop{\rm abs}}
@end tex
@ifnottex
@macro abs
abs
@end macro
@end ifnottex

@c @cross{} is a \times symbol in tex, or an "x" in info. In tex it works
@c inside or outside $ $.
@tex
\gdef\cross{\ifmmode\times\else$\times$\fi}
@end tex
@ifnottex
@macro cross
x
@end macro
@end ifnottex

@c @times{} made available as a "*" in info and html (already works in tex).
@ifnottex
@macro times
*
@end macro
@end ifnottex

@c Usage: @W{text}
@c Like @w{} but working in math mode too.
@tex
\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi}
@end tex
@ifnottex
@macro W {S}
@w{\S\}
@end macro
@end ifnottex

@c Usage: \GMPdisplay{text}
@c Put the given text in an @display style indent, but without turning off
@c paragraph reflow etc.
@tex
\gdef\GMPdisplay#1{%
\noindent
\advance\leftskip by \lispnarrowing
#1\par}
@end tex

@c Usage: \GMPhat
@c A new \hat that will work in math mode, unlike the texinfo redefined
@c version.
@tex
\gdef\GMPhat{\mathaccent"705E}
@end tex

@c Usage: \GMPraise{text}
@c For use in a $ $ math expression as an alternative to "^". This is good
@c for @code{} in an exponent, since there seems to be no superscript font
@c for that.
@tex
\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}}
@end tex

@c Usage: @texlinebreak{}
@c A line break as per @*, but only in tex.
@iftex
@macro texlinebreak
@*
@end macro
@end iftex
@ifnottex
@macro texlinebreak
@end macro
@end ifnottex

@c Usage: @maybepagebreak
@c Allow tex to insert a page break, if it feels the urge.
@c Normally blocks of @deftypefun/funx are kept together, which can lead to
@c some poor page break positioning if it's a big block, like the sets of
@c division functions etc.
@tex
\gdef\maybepagebreak{\penalty0}
@end tex
@ifnottex
@macro maybepagebreak
@end macro
@end ifnottex

@c Usage: @GMPreftop{info,title}
@c Usage: @GMPpxreftop{info,title}
@c
@c Like @ref{} and @pxref{}, but designed for a reference to the top of a
@c document, not a particular section. The TeX output for plain @ref insists
@c on printing a particular section, GMPreftop gives just the title.
@c
@c The texinfo manual recommends putting a likely section name in references
@c like this, eg. "Introduction", but it seems better to just give the title.
@c
@iftex
@macro GMPreftop{info,title}
@i{\title\}
@end macro
@macro GMPpxreftop{info,title}
see @i{\title\}
@end macro
@end iftex
@c
@ifnottex
@macro GMPreftop{info,title}
@ref{Top,\title\,\title\,\info\,\title\}
@end macro
@macro GMPpxreftop{info,title}
@pxref{Top,\title\,\title\,\info\,\title\}
@end macro
@end ifnottex


@node Copying, Introduction to GMP, Top, Top
@comment node-name, next, previous, up
@unnumbered GNU MP Copying Conditions
@cindex Copying conditions
@cindex Conditions for copying GNU MP
@cindex License conditions

This library is @dfn{free}; this means that everyone is free to use it and
free to redistribute it on a free basis. The library is not in the public
domain; it is copyrighted and there are restrictions on its distribution, but
these restrictions are designed to permit everything that a good cooperating
citizen would want to do. What is not allowed is to try to prevent others
from further sharing any version of this library that they might get from
you.@refill

Specifically, we want to make sure that you have the right to give away copies
of the library, that you receive source code or else can get it if you want
it, that you can change this library or use pieces of it in new free programs,
and that you know you can do these things.@refill

To make sure that everyone has such rights, we have to forbid you to deprive
anyone else of these rights. For example, if you distribute copies of the GNU
MP library, you must give the recipients all the rights that you have. You
must make sure that they, too, receive or can get the source code. And you
must tell them their rights.@refill

Also, for our own protection, we must make certain that everyone finds out
that there is no warranty for the GNU MP library.
If it is modified by
someone else and passed on, we want their recipients to know that what they
have is not what we distributed, so that any problems introduced by others
will not reflect on our reputation.@refill

More precisely, the GNU MP library is dual licensed, under the conditions of
the GNU Lesser General Public License version 3 (see
@file{COPYING.LESSERv3}), or the GNU General Public License version 2 (see
@file{COPYINGv2}). This is the recipient's choice, and the recipient also has
the additional option of applying later versions of these licenses. (The
reason for this dual licensing is to make it possible to use the library with
programs which are licensed under GPL version 2, but which for historical or
other reasons do not allow use under later versions of the GPL.)

Programs which are not part of the library itself, such as demonstration
programs and the GMP testsuite, are licensed under the terms of the GNU
General Public License version 3 (see @file{COPYINGv3}), or any later
version.


@node Introduction to GMP, Installing GMP, Copying, Top
@comment node-name, next, previous, up
@chapter Introduction to GNU MP
@cindex Introduction

GNU MP is a portable library written in C for arbitrary precision arithmetic
on integers, rational numbers, and floating-point numbers. It aims to provide
the fastest possible arithmetic for all applications that need higher
precision than is directly supported by the basic C types.

Many applications use just a few hundred bits of precision; but some
applications may need thousands or even millions of bits. GMP is designed to
give good performance for both, by choosing algorithms based on the sizes of
the operands, and by carefully keeping the overhead at a minimum.

The speed of GMP is achieved by using fullwords as the basic arithmetic type,
by using sophisticated algorithms, by including carefully optimized assembly
code for the most common inner loops for many different CPUs, and by a general
emphasis on speed (as opposed to simplicity or elegance).

There is assembly code for these CPUs:
@cindex CPU types
ARM Cortex-A9, Cortex-A15, and generic ARM,
DEC Alpha 21064, 21164, and 21264,
AMD K8 and K10 (sold under many brands, e.g. Athlon64, Phenom, Opteron),
Bulldozer, and Bobcat,
Intel Pentium, Pentium Pro/II/III, Pentium 4, Core2, Nehalem, Sandy Bridge, Haswell, generic x86,
Intel IA-64,
Motorola/IBM PowerPC 32 and 64 such as POWER970, POWER5, POWER6, and POWER7,
MIPS 32-bit and 64-bit,
SPARC 32-bit and 64-bit with special support for all UltraSPARC models.
There is also assembly code for many obsolete CPUs.


@cindex Home page
@cindex Web page
@noindent
For up-to-date information on GMP, please see the GMP web pages at

@display
@uref{https://gmplib.org/}
@end display

@cindex Latest version of GMP
@cindex Anonymous FTP of latest version
@cindex FTP of latest version
@noindent
The latest version of the library is available at

@display
@uref{https://ftp.gnu.org/gnu/gmp/}
@end display

Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror
near you. See @uref{https://www.gnu.org/order/ftp.html} for a full list.

@cindex Mailing lists
There are three public mailing lists of interest: one for release
announcements, one for general questions and discussions about usage of the GMP
library, and one for bug reports. For more information, see

@display
@uref{https://gmplib.org/mailman/listinfo/}.
@end display

The proper place for bug reports is @email{gmp-bugs@@gmplib.org}. See
@ref{Reporting Bugs} for information about reporting bugs.

@sp 1
@section How to use this Manual
@cindex About this manual

Everyone should read @ref{GMP Basics}. If you need to install the library
yourself, then read @ref{Installing GMP}. If you have a system with multiple
ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used
on applications.

The rest of the manual can be used for later reference, although it is
probably a good idea to glance through it.


@node Installing GMP, GMP Basics, Introduction to GMP, Top
@comment node-name, next, previous, up
@chapter Installing GMP
@cindex Installing GMP
@cindex Configuring GMP
@cindex Building GMP

GMP has an autoconf/automake/libtool based configuration system. On a
Unix-like system a basic build can be done with

@example
./configure
make
@end example

@noindent
Some self-tests can be run with

@example
make check
@end example

@noindent
And you can install (under @file{/usr/local} by default) with

@example
make install
@end example

If you experience problems, please report them to @email{gmp-bugs@@gmplib.org}.
See @ref{Reporting Bugs}, for information on what to include in useful bug
reports.

@menu
* Build Options::
* ABI and ISA::
* Notes for Package Builds::
* Notes for Particular Systems::
* Known Build Problems::
* Performance optimization::
@end menu


@node Build Options, ABI and ISA, Installing GMP, Installing GMP
@section Build Options
@cindex Build options

All the usual autoconf configure options are available; run @samp{./configure
--help} for a summary. The file @file{INSTALL.autoconf} has some generic
installation information too.

@table @asis
@item Tools
@cindex Non-Unix systems
@samp{configure} requires various Unix-like tools. See @ref{Notes for
Particular Systems}, for some options on non-Unix systems.

It might be possible to build without the help of @samp{configure}, certainly
all the code is there, but unfortunately you'll be on your own.

@item Build Directory
@cindex Build directory
To compile in a separate build directory, @command{cd} to that directory, and
prefix the configure command with the path to the GMP source directory. For
example

@example
cd /my/build/dir
/my/sources/gmp-@value{VERSION}/configure
@end example

Not all @samp{make} programs have the necessary features (@code{VPATH}) to
support this. In particular, SunOS and Solaris @command{make} have bugs that
make them unable to build in a separate directory. Use GNU @command{make}
instead.

@item @option{--prefix} and @option{--exec-prefix}
@cindex Prefix
@cindex Exec prefix
@cindex Install prefix
@cindex @code{--prefix}
@cindex @code{--exec-prefix}
The @option{--prefix} option can be used in the normal way to direct GMP to
install under a particular tree. The default is @samp{/usr/local}.

@option{--exec-prefix} can be used to direct architecture-dependent files like
@file{libgmp.a} to a different location. This can be used to share
architecture-independent parts like the documentation, but separate the
dependent parts. Note however that @file{gmp.h} is
architecture-dependent since it encodes certain aspects of @file{libgmp}, so
it will be necessary to ensure both @file{$prefix/include} and
@file{$exec_prefix/include} are available to the compiler.

@item @option{--disable-shared}, @option{--disable-static}
@cindex @code{--disable-shared}
@cindex @code{--disable-static}
By default both shared and static libraries are built (where possible), but
one or other can be disabled. Shared libraries result in smaller executables
and permit code sharing between separate running processes, but on some CPUs
are slightly slower, having a small cost on each function call.
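
As a minimal sketch, a build that produces only the static library might be
configured as follows. It uses nothing beyond the option named above; whether
dropping the shared library is appropriate for a given system is an
assumption to check.

@example
./configure --disable-shared
@end example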

@item Native Compilation, @option{--build=CPU-VENDOR-OS}
@cindex Native compilation
@cindex Build system
@cindex @code{--build}
For normal native compilation, the system can be specified with
@samp{--build}. By default @samp{./configure} uses the output from running
@samp{./config.guess}. On some systems @samp{./config.guess} can determine
the exact CPU type, on others it will be necessary to give it explicitly. For
example,

@example
./configure --build=ultrasparc-sun-solaris2.7
@end example

In all cases the @samp{OS} part is important, since it controls how libtool
generates shared libraries. Running @samp{./config.guess} is the simplest way
to see what it should be, if you don't know already.

@item Cross Compilation, @option{--host=CPU-VENDOR-OS}
@cindex Cross compiling
@cindex Host system
@cindex @code{--host}
When cross-compiling, the system used for compiling is given by @samp{--build}
and the system where the library will run is given by @samp{--host}. For
example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,

@example
./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
@end example

Compiler tools are sought first with the host system type as a prefix. For
example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
@command{ranlib}. This makes it possible for a set of cross-compiling tools
to co-exist with native tools. The prefix is the argument to @samp{--host},
and this can be an alias, such as @samp{m68k-linux}. But note that tools
don't have to be set up this way, it's enough to just have a @env{PATH} with a
suitable cross-compiling @command{cc} etc.

Compiling for a different CPU in the same family as the build system is a form
of cross-compilation, though very possibly this would merely be special
options on a native compiler.
In any case @samp{./configure} avoids depending
on being able to run code on the build system, which is important when
creating binaries for a newer CPU since they very possibly won't run on the
build system.

In all cases the compiler must be able to produce an executable (of whatever
format) from a standard C @code{main}. Although only object files will go to
make up @file{libgmp}, @samp{./configure} uses linking tests for various
purposes, such as determining what functions are available on the host system.

Currently a warning is given unless an explicit @samp{--build} is used when
cross-compiling, because it may not be possible to correctly guess the build
system type if the @env{PATH} has only a cross-compiling @command{cc}.

Note that the @samp{--target} option is not appropriate for GMP@. It's for use
when building compiler tools, with @samp{--host} being where they will run,
and @samp{--target} what they'll produce code for. Ordinary programs or
libraries like GMP are only interested in the @samp{--host} part, being where
they'll run. (Some past versions of GMP used @samp{--target} incorrectly.)

@item CPU types
@cindex CPU types
In general, if you want a library that runs as fast as possible, you should
configure GMP for the exact CPU type your system uses. However, this may mean
the binaries won't run on older members of the family, and might run slower on
other members, older or newer. The best idea is always to build GMP for the
exact machine type you intend to run it on.

The following CPUs have specific support. See @file{configure.ac} for details
of what code and compiler options they select.

@itemize @bullet

@c Keep this formatting, it's easy to read and it can be grepped to
@c automatically test that CPUs listed get through ./config.sub

@item
Alpha:
@nisamp{alpha},
@nisamp{alphaev5},
@nisamp{alphaev56},
@nisamp{alphapca56},
@nisamp{alphapca57},
@nisamp{alphaev6},
@nisamp{alphaev67},
@nisamp{alphaev68},
@nisamp{alphaev7}

@item
Cray:
@nisamp{c90},
@nisamp{j90},
@nisamp{t90},
@nisamp{sv1}

@item
HPPA:
@nisamp{hppa1.0},
@nisamp{hppa1.1},
@nisamp{hppa2.0},
@nisamp{hppa2.0n},
@nisamp{hppa2.0w},
@nisamp{hppa64}

@item
IA-64:
@nisamp{ia64},
@nisamp{itanium},
@nisamp{itanium2}

@item
MIPS:
@nisamp{mips},
@nisamp{mips3},
@nisamp{mips64}

@item
Motorola:
@nisamp{m68k},
@nisamp{m68000},
@nisamp{m68010},
@nisamp{m68020},
@nisamp{m68030},
@nisamp{m68040},
@nisamp{m68060},
@nisamp{m68302},
@nisamp{m68360},
@nisamp{m88k},
@nisamp{m88110}

@item
POWER:
@nisamp{power},
@nisamp{power1},
@nisamp{power2},
@nisamp{power2sc}

@item
PowerPC:
@nisamp{powerpc},
@nisamp{powerpc64},
@nisamp{powerpc401},
@nisamp{powerpc403},
@nisamp{powerpc405},
@nisamp{powerpc505},
@nisamp{powerpc601},
@nisamp{powerpc602},
@nisamp{powerpc603},
@nisamp{powerpc603e},
@nisamp{powerpc604},
@nisamp{powerpc604e},
@nisamp{powerpc620},
@nisamp{powerpc630},
@nisamp{powerpc740},
@nisamp{powerpc7400},
@nisamp{powerpc7450},
@nisamp{powerpc750},
@nisamp{powerpc801},
@nisamp{powerpc821},
@nisamp{powerpc823},
@nisamp{powerpc860},
@nisamp{powerpc970}

@item
SPARC:
@nisamp{sparc},
@nisamp{sparcv8},
@nisamp{microsparc},
@nisamp{supersparc},
@nisamp{sparcv9},
@nisamp{ultrasparc},
@nisamp{ultrasparc2},
@nisamp{ultrasparc2i},
@nisamp{ultrasparc3},
@nisamp{sparc64}

@item
x86 family:
@nisamp{i386},
@nisamp{i486},
@nisamp{i586},
@nisamp{pentium},
@nisamp{pentiummmx},
@nisamp{pentiumpro},
@nisamp{pentium2},
@nisamp{pentium3},
@nisamp{pentium4},
@nisamp{k6},
@nisamp{k62},
@nisamp{k63},
@nisamp{athlon},
@nisamp{amd64},
@nisamp{viac3},
@nisamp{viac32}

@item
Other:
@nisamp{arm},
@nisamp{sh},
@nisamp{sh2},
@nisamp{vax}
@end itemize

CPUs not listed will use generic C code.

@item Generic C Build
@cindex Generic C
If some of the assembly code causes problems, or if otherwise desired, the
generic C code can be selected with the configure option
@option{--disable-assembly}.

Note that this will run quite slowly, but it should be portable and should at
least make it possible to get something running if all else fails.

@item Fat binary, @option{--enable-fat}
@cindex Fat binary
@cindex @code{--enable-fat}
Using @option{--enable-fat} selects a ``fat binary'' build on x86, where
optimized low level subroutines are chosen at runtime according to the CPU
detected. This means more code, but gives good performance on all x86 chips.
(This option might become available for more architectures in the future.)

@item @option{ABI}
@cindex ABI
On some systems GMP supports multiple ABIs (application binary interfaces),
meaning data type sizes and calling conventions. By default GMP chooses the
best ABI available, but a particular ABI can be selected. For example

@example
./configure --host=mips64-sgi-irix6 ABI=n32
@end example

See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what
applications need to do.

@item @option{CC}, @option{CFLAGS}
@cindex C compiler
@cindex @code{CC}
@cindex @code{CFLAGS}
By default the C compiler used is chosen from among some likely candidates,
with @command{gcc} normally preferred if it's present.
The usual
@samp{CC=whatever} can be passed to @samp{./configure} to choose something
different.

For various systems, default compiler flags are set based on the CPU and
compiler. The usual @samp{CFLAGS="-whatever"} can be passed to
@samp{./configure} to use something different or to set good flags for systems
GMP doesn't otherwise know.

The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure},
and can be found in each generated @file{Makefile}. This is the easiest way
to check the defaults when considering changing or adding something.

Note that when @samp{CC} and @samp{CFLAGS} are specified on a system
supporting multiple ABIs it's important to give an explicit
@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and
won't be able to select the correct assembly code.

If just @samp{CC} is selected then normal default @samp{CFLAGS} for that
compiler will be used (if GMP recognises it). For example @samp{CC=gcc} can
be used to force the use of GCC, with default flags (and default ABI).

@item @option{CPPFLAGS}
@cindex @code{CPPFLAGS}
Any flags like @samp{-D} defines or @samp{-I} includes required by the
preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}.
Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but
preprocessing uses just @samp{CPPFLAGS}. This distinction is because most
preprocessors won't accept all the flags the compiler does. Preprocessing is
done separately in some configure tests.

@item @option{CC_FOR_BUILD}
@cindex @code{CC_FOR_BUILD}
Some build-time programs are compiled and run to generate host-specific data
tables. @samp{CC_FOR_BUILD} is the compiler used for this. It doesn't need
to be in any particular ABI or mode, it merely needs to generate executables
that can run.
The default is to try the selected @samp{CC} and some likely
candidates such as @samp{cc} and @samp{gcc}, looking for something that works.

No flags are used with @samp{CC_FOR_BUILD} because a simple invocation like
@samp{cc foo.c} should be enough. If some particular options are required
they can be included as for instance @samp{CC_FOR_BUILD="cc -whatever"}.

@item C++ Support, @option{--enable-cxx}
@cindex C++ support
@cindex @code{--enable-cxx}
C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a
C++ compiler will be required. As a convenience @samp{--enable-cxx=detect}
can be used to enable C++ support only if a compiler can be found. The C++
support consists of a library @file{libgmpxx.la} and header file
@file{gmpxx.h} (@pxref{Headers and Libraries}).

A separate @file{libgmpxx.la} has been adopted rather than having C++ objects
within @file{libgmp.la} in order to ensure dynamic linked C programs aren't
bloated by a dependency on the C++ standard library, and to avoid any chance
that the C++ compiler could be required when linking plain C programs.

@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can
only be expected to work with @file{libgmp.la} from the same GMP version.
Future changes to the relevant internals will be accompanied by renaming, so a
mismatch will cause unresolved symbols rather than perhaps mysterious
misbehaviour.

In general @file{libgmpxx.la} will be usable only with the C++ compiler that
built it, since name mangling and runtime support are usually incompatible
between different compilers.

@item @option{CXX}, @option{CXXFLAGS}
@cindex C++ compiler
@cindex @code{CXX}
@cindex @code{CXXFLAGS}
When C++ support is enabled, the C++ compiler and its flags can be set with
variables @samp{CXX} and @samp{CXXFLAGS} in the usual way.
The default for 975@samp{CXX} is the first compiler that works from a list of likely candidates, 976with @command{g++} normally preferred when available. The default for 977@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then 978for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers 979@samp{-g} or nothing. Trying @samp{CFLAGS} this way is convenient when using 980@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will 981usually suit @samp{g++}. 982 983It's important that the C and C++ compilers match, meaning their startup and 984runtime support routines are compatible and that they generate code in the 985same ABI (if there's a choice of ABIs on the system). @samp{./configure} 986isn't currently able to check these things very well itself, so for that 987reason @samp{--disable-cxx} is the default, to avoid a build failure due to a 988compiler mismatch. Perhaps this will change in the future. 989 990Incidentally, it's normally not good enough to set @samp{CXX} to the same as 991@samp{CC}. Although @command{gcc} for instance recognises @file{foo.cc} as 992C++ code, only @command{g++} will invoke the linker the right way when 993building an executable or shared library from C++ object files. 994 995@item Temporary Memory, @option{--enable-alloca=<choice>} 996@cindex Temporary memory 997@cindex Stack overflow 998@cindex @code{alloca} 999@cindex @code{--enable-alloca} 1000GMP allocates temporary workspace using one of the following three methods, 1001which can be selected with for instance 1002@samp{--enable-alloca=malloc-reentrant}. 1003 1004@itemize @bullet 1005@item 1006@samp{alloca} - C library or compiler builtin. 1007@item 1008@samp{malloc-reentrant} - the heap, in a re-entrant fashion. 1009@item 1010@samp{malloc-notreentrant} - the heap, with global variables. 1011@end itemize 1012 1013For convenience, the following choices are also available. 1014@samp{--disable-alloca} is the same as @samp{no}. 
1015 1016@itemize @bullet 1017@item 1018@samp{yes} - a synonym for @samp{alloca}. 1019@item 1020@samp{no} - a synonym for @samp{malloc-reentrant}. 1021@item 1022@samp{reentrant} - @code{alloca} if available, otherwise 1023@samp{malloc-reentrant}. This is the default. 1024@item 1025@samp{notreentrant} - @code{alloca} if available, otherwise 1026@samp{malloc-notreentrant}. 1027@end itemize 1028 1029@code{alloca} is reentrant and fast, and is recommended. It actually allocates 1030just small blocks on the stack; larger ones use malloc-reentrant. 1031 1032@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe, 1033but @samp{malloc-notreentrant} is faster and should be used if reentrancy is 1034not required. 1035 1036The two malloc methods in fact use the memory allocation functions selected by 1037@code{mp_set_memory_functions}, these being @code{malloc} and friends by 1038default. @xref{Custom Allocation}. 1039 1040An additional choice @samp{--enable-alloca=debug} is available, to help when 1041debugging memory related problems (@pxref{Debugging}). 1042 1043@item FFT Multiplication, @option{--disable-fft} 1044@cindex FFT multiplication 1045@cindex @code{--disable-fft} 1046By default multiplications are done using Karatsuba, 3-way Toom, higher degree 1047Toom, and Fermat FFT@. The FFT is only used on large to very large operands 1048and can be disabled to save code size if desired. 1049 1050@item Assertion Checking, @option{--enable-assert} 1051@cindex Assertion checking 1052@cindex @code{--enable-assert} 1053This option enables some consistency checking within the library. This can be 1054of use while debugging, @pxref{Debugging}. 1055 1056@item Execution Profiling, @option{--enable-profiling=prof/gprof/instrument} 1057@cindex Execution profiling 1058@cindex @code{--enable-profiling} 1059Enable profiling support, in one of various styles, @pxref{Profiling}. 

@item @option{MPN_PATH}
@cindex @code{MPN_PATH}
Various assembly versions of each mpn subroutine are provided.  For a given
CPU, a search is made through a path to choose a version of each.  For example
@samp{sparcv8} has

@example
MPN_PATH="sparc32/v8 sparc32 generic"
@end example

which means look first for v8 code, then plain sparc32 (which is v7), and
finally fall back on generic C@.  Knowledgeable users with special requirements
can specify a different path.  Normally this is completely unnecessary.

@item Documentation
@cindex Documentation formats
@cindex Texinfo
The source for the document you're now reading is @file{doc/gmp.texi}, in
Texinfo format, see @GMPreftop{texinfo, Texinfo}.

@cindex Postscript
@cindex DVI
@cindex PDF
Info format @samp{doc/gmp.info} is included in the distribution.  The usual
automake targets are available to make PostScript, DVI, PDF and HTML (these
will require various @TeX{} and Texinfo tools).

@cindex DocBook
@cindex XML
DocBook and XML can be generated by the Texinfo @command{makeinfo} program
too, see @ref{makeinfo options,, Options for @command{makeinfo}, texinfo,
Texinfo}.

Some supplementary notes can also be found in the @file{doc} subdirectory.

@end table


@need 2000
@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
@section ABI and ISA
@cindex ABI
@cindex Application Binary Interface
@cindex ISA
@cindex Instruction Set Architecture

ABI (Application Binary Interface) refers to the calling conventions between
functions, meaning what registers are used and what sizes the various C data
types are.  ISA (Instruction Set Architecture) refers to the instructions and
registers a CPU has available.
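
As a rough illustration of the data type sizes an ABI fixes, the following C
sketch prints the widths of a few basic types.  It is not part of GMP, and the
helper name @code{bits_of} is invented for the example.

@example
#include <limits.h>
#include <stdio.h>

/* Width in bits of an object occupying BYTES bytes.  Illustrative
   helper only, not a GMP function.  */
static unsigned
bits_of (size_t bytes)
@{
  return (unsigned) (bytes * CHAR_BIT);
@}

int
main (void)
@{
  /* These widths are decided by the ABI the compiler targets, not by
     the CPU alone.  */
  printf ("int:       %u bits\n", bits_of (sizeof (int)));
  printf ("long:      %u bits\n", bits_of (sizeof (long)));
  printf ("long long: %u bits\n", bits_of (sizeof (long long)));
  printf ("pointer:   %u bits\n", bits_of (sizeof (void *)));
  return 0;
@}
@end example

On a system with more than one ABI, compiling this once per ABI (for instance
with the @samp{-m64} and @samp{-m32} options of @command{gcc} on AMD64) would
normally be expected to show different @code{long} and pointer widths on the
same CPU.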

Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
latter for compatibility with older CPUs in the family.  GMP supports some
CPUs like this in both ABIs.  In fact within GMP @samp{ABI} means a
combination of chip ABI, plus how GMP chooses to use it.  For example in some
32-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
@code{long long}.

By default GMP chooses the best ABI available for a given system, and this
generally gives significantly greater speed.  But an ABI can be chosen
explicitly to make GMP compatible with other libraries, or with particular
application requirements.  For example,

@example
./configure ABI=32
@end example

In all cases it's vital that all object code used in a given program is
compiled for the same ABI.

Usually a limb is implemented as a @code{long}.  When a @code{long long} limb
is used this is encoded in the generated @file{gmp.h}.  This is convenient for
applications, but it does mean that @file{gmp.h} will vary, and can't be just
copied around.  @file{gmp.h} remains compiler independent though, since all
compilers for a particular ABI will be expected to use the same limb type.

Currently no attempt is made to follow whatever conventions a system has for
installing library or header files built for a particular ABI@.  This will
probably only matter when installing multiple builds of GMP, and it might be
as simple as configuring with a special @samp{libdir}, or it might require
more than that.  Note that builds for different ABIs need to be done
separately, with a fresh @command{./configure} and @command{make} each.

@sp 1
@table @asis
@need 1000
@item AMD64 (@samp{x86_64})
@cindex AMD64
On AMD64 systems supporting both 32-bit and 64-bit modes for applications, the
following ABI choices are available.
1151 1152@table @asis 1153@item @samp{ABI=64} 1154The 64-bit ABI uses 64-bit limbs and pointers and makes full use of the chip 1155architecture. This is the default. Applications will usually not need 1156special compiler flags, but for reference the option is 1157 1158@example 1159gcc -m64 1160@end example 1161 1162@item @samp{ABI=32} 1163The 32-bit ABI is the usual i386 conventions. This will be slower, and is not 1164recommended except for inter-operating with other code not yet 64-bit capable. 1165Applications must be compiled with 1166 1167@example 1168gcc -m32 1169@end example 1170 1171(In GCC 2.95 and earlier there's no @samp{-m32} option, it's the only mode.) 1172 1173@item @samp{ABI=x32} 1174The x32 ABI uses 64-bit limbs but 32-bit pointers. Like the 64-bit ABI, it 1175makes full use of the chip's arithmetic capabilities. This ABI is not 1176supported by all operating systems. 1177 1178@example 1179gcc -mx32 1180@end example 1181 1182@end table 1183 1184@sp 1 1185@need 1000 1186@item HPPA 2.0 (@samp{hppa2.0*}, @samp{hppa64}) 1187@cindex HPPA 1188@cindex HP-UX 1189@table @asis 1190@item @samp{ABI=2.0w} 1191The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or 1192up. Applications must be compiled with 1193 1194@example 1195gcc [built for 2.0w] 1196cc +DD64 1197@end example 1198 1199@item @samp{ABI=2.0n} 1200The 2.0n ABI means the 32-bit HPPA 1.0 ABI and all its normal calling 1201conventions, but with 64-bit instructions permitted within functions. GMP 1202uses a 64-bit @code{long long} for a limb. This ABI is available on hppa64 1203GNU/Linux and on HP-UX 10 or higher. Applications must be compiled with 1204 1205@example 1206gcc [built for 2.0n] 1207cc +DA2.0 +e 1208@end example 1209 1210Note that current versions of GCC (eg.@: 3.2) don't generate 64-bit 1211instructions for @code{long long} operations and so may be slower than for 12122.0w. (The GMP assembly code is the same though.) 
1213 1214@item @samp{ABI=1.0} 1215HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI@. 1216No special compiler options are needed for applications. 1217@end table 1218 1219All three ABIs are available for CPU types @samp{hppa2.0w}, @samp{hppa2.0} and 1220@samp{hppa64}, but for CPU type @samp{hppa2.0n} only 2.0n or 1.0 are 1221considered. 1222 1223Note that GCC on HP-UX has no options to choose between 2.0n and 2.0w modes, 1224unlike HP @command{cc}. Instead it must be built for one or the other ABI@. 1225GMP will detect how it was built, and skip to the corresponding @samp{ABI}. 1226 1227@sp 1 1228@need 1500 1229@item IA-64 under HP-UX (@samp{ia64*-*-hpux*}, @samp{itanium*-*-hpux*}) 1230@cindex IA-64 1231@cindex HP-UX 1232HP-UX supports two ABIs for IA-64. GMP performance is the same in both. 1233 1234@table @asis 1235@item @samp{ABI=32} 1236In the 32-bit ABI, pointers, @code{int}s and @code{long}s are 32 bits and GMP 1237uses a 64 bit @code{long long} for a limb. Applications can be compiled 1238without any special flags since this ABI is the default in both HP C and GCC, 1239but for reference the flags are 1240 1241@example 1242gcc -milp32 1243cc +DD32 1244@end example 1245 1246@item @samp{ABI=64} 1247In the 64-bit ABI, @code{long}s and pointers are 64 bits and GMP uses a 1248@code{long} for a limb. Applications must be compiled with 1249 1250@example 1251gcc -mlp64 1252cc +DD64 1253@end example 1254@end table 1255 1256On other IA-64 systems, GNU/Linux for instance, @samp{ABI=64} is the only 1257choice. 1258 1259@sp 1 1260@need 1000 1261@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]}) 1262@cindex MIPS 1263@cindex IRIX 1264IRIX 6 always has a 64-bit MIPS 3 or better CPU, and supports ABIs o32, n32, 1265and 64. n32 or 64 are recommended, and GMP performance will be the same in 1266each. The default is n32. 1267 1268@table @asis 1269@item @samp{ABI=o32} 1270The o32 ABI is 32-bit pointers and integers, and no 64-bit operations. 
GMP 1271will be slower than in n32 or 64, this option only exists to support old 1272compilers, eg.@: GCC 2.7.2. Applications can be compiled with no special 1273flags on an old compiler, or on a newer compiler with 1274 1275@example 1276gcc -mabi=32 1277cc -32 1278@end example 1279 1280@item @samp{ABI=n32} 1281The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a 1282@code{long long}. Applications must be compiled with 1283 1284@example 1285gcc -mabi=n32 1286cc -n32 1287@end example 1288 1289@item @samp{ABI=64} 1290The 64-bit ABI is 64-bit pointers and integers. Applications must be compiled 1291with 1292 1293@example 1294gcc -mabi=64 1295cc -64 1296@end example 1297@end table 1298 1299Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary 1300support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code. 1301 1302@sp 1 1303@need 1000 1304@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630}, @samp{powerpc970}, @samp{power4}, @samp{power5}) 1305@cindex PowerPC 1306@table @asis 1307@item @samp{ABI=mode64} 1308@cindex AIX 1309The AIX 64 ABI uses 64-bit limbs and pointers and is the default on PowerPC 64 1310@samp{*-*-aix*} systems. Applications must be compiled with 1311 1312@example 1313gcc -maix64 1314xlc -q64 1315@end example 1316 1317On 64-bit GNU/Linux, BSD, and Mac OS X/Darwin systems, the applications must 1318be compiled with 1319 1320@example 1321gcc -m64 1322@end example 1323 1324@item @samp{ABI=mode32} 1325The @samp{mode32} ABI uses a 64-bit @code{long long} limb but with the chip 1326still in 32-bit mode and using 32-bit calling conventions. This is the default 1327for systems where the true 64-bit ABI is unavailable. No special compiler 1328options are typically needed for applications. This ABI is not available under 1329AIX. 1330 1331@item @samp{ABI=32} 1332This is the basic 32-bit PowerPC ABI, with a 32-bit limb. No special compiler 1333options are needed for applications. 
1334@end table 1335 1336GMP's speed is greatest for the @samp{mode64} ABI, the @samp{mode32} ABI is 2nd 1337best. In @samp{ABI=32} only the 32-bit ISA is used and this doesn't make full 1338use of a 64-bit chip. 1339 1340@sp 1 1341@need 1000 1342@item Sparc V9 (@samp{sparc64}, @samp{sparcv9}, @samp{ultrasparc*}) 1343@cindex Sparc V9 1344@cindex Solaris 1345@cindex Sun 1346@table @asis 1347@item @samp{ABI=64} 1348The 64-bit V9 ABI is available on the various BSD sparc64 ports, recent 1349versions of Sparc64 GNU/Linux, and Solaris 2.7 and up (when the kernel is in 135064-bit mode). GCC 3.2 or higher, or Sun @command{cc} is required. On 1351GNU/Linux, depending on the default @command{gcc} mode, applications must be 1352compiled with 1353 1354@example 1355gcc -m64 1356@end example 1357 1358On Solaris applications must be compiled with 1359 1360@example 1361gcc -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9 1362cc -xarch=v9 1363@end example 1364 1365On the BSD sparc64 systems no special options are required, since 64-bits is 1366the only ABI available. 1367 1368@item @samp{ABI=32} 1369For the basic 32-bit ABI, GMP still uses as much of the V9 ISA as it can. In 1370the Sun documentation this combination is known as ``v8plus''. On GNU/Linux, 1371depending on the default @command{gcc} mode, applications may need to be 1372compiled with 1373 1374@example 1375gcc -m32 1376@end example 1377 1378On Solaris, no special compiler options are required for applications, though 1379using something like the following is recommended. (@command{gcc} 2.8 and 1380earlier only support @samp{-mv8} though.) 1381 1382@example 1383gcc -mv8plus 1384cc -xarch=v8plus 1385@end example 1386@end table 1387 1388GMP speed is greatest in @samp{ABI=64}, so it's the default where available. 1389The speed is partly because there are extra registers available and partly 1390because 64-bits is considered the more important case and has therefore had 1391better code written for it. 
1392 1393Don't be confused by the names of the @samp{-m} and @samp{-x} compiler 1394options, they're called @samp{arch} but effectively control both ABI and ISA@. 1395 1396On Solaris 2.6 and earlier, only @samp{ABI=32} is available since the kernel 1397doesn't save all registers. 1398 1399On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will 1400reject @samp{ABI=64} because the resulting executables won't run. 1401@samp{ABI=64} can still be built if desired by making it look like a 1402cross-compile, for example 1403 1404@example 1405./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64 1406@end example 1407@end table 1408 1409 1410@need 2000 1411@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP 1412@section Notes for Package Builds 1413@cindex Build notes for binary packaging 1414@cindex Packaged builds 1415 1416GMP should present no great difficulties for packaging in a binary 1417distribution. 1418 1419@cindex Libtool versioning 1420@cindex Shared library versioning 1421Libtool is used to build the library and @samp{-version-info} is set 1422appropriately, having started from @samp{3:0:0} in GMP 3.0 (@pxref{Versioning, 1423Library interface versions, Library interface versions, libtool, GNU 1424Libtool}). 1425 1426The GMP 4 series will be upwardly binary compatible in each release and will 1427be upwardly binary compatible with all of the GMP 3 series. Additional 1428function interfaces may be added in each release, so on systems where libtool 1429versioning is not fully checked by the loader an auxiliary mechanism may be 1430needed to express that a dynamic linked application depends on a new enough 1431GMP. 1432 1433An auxiliary mechanism may also be needed to express that @file{libgmpxx.la} 1434(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la} 1435from the same GMP version, since this is not done by the libtool versioning, 1436nor otherwise. 
A mismatch will result in unresolved symbols from the linker, 1437or perhaps the loader. 1438 1439When building a package for a CPU family, care should be taken to use 1440@samp{--host} (or @samp{--build}) to choose the least common denominator among 1441the CPUs which might use the package. For example this might mean plain 1442@samp{sparc} (meaning V7) for SPARCs. 1443 1444For x86s, @option{--enable-fat} sets things up for a fat binary build, making a 1445runtime selection of optimized low level routines. This is a good choice for 1446packaging to run on a range of x86 chips. 1447 1448Users who care about speed will want GMP built for their exact CPU type, to 1449make best use of the available optimizations. Providing a way to suitably 1450rebuild a package may be useful. This could be as simple as making it 1451possible for a user to omit @samp{--build} (and @samp{--host}) so 1452@samp{./config.guess} will detect the CPU@. But a way to manually specify a 1453@samp{--build} will be wanted for systems where @samp{./config.guess} is 1454inexact. 1455 1456On systems with multiple ABIs, a packaged build will need to decide which 1457among the choices is to be provided, see @ref{ABI and ISA}. A given run of 1458@samp{./configure} etc will only build one ABI@. If a second ABI is also 1459required then a second run of @samp{./configure} etc must be made, starting 1460from a clean directory tree (@samp{make distclean}). 1461 1462As noted under ``ABI and ISA'', currently no attempt is made to follow system 1463conventions for install locations that vary with ABI, such as 1464@file{/usr/lib/sparcv9} for @samp{ABI=64} as opposed to @file{/usr/lib} for 1465@samp{ABI=32}. A package build can override @samp{libdir} and other standard 1466variables as necessary. 1467 1468Note that @file{gmp.h} is a generated file, and will be architecture and ABI 1469dependent. 
When attempting to install two ABIs simultaneously it will be 1470important that an application compile gets the correct @file{gmp.h} for its 1471desired ABI@. If compiler include paths don't vary with ABI options then it 1472might be necessary to create a @file{/usr/include/gmp.h} which tests 1473preprocessor symbols and chooses the correct actual @file{gmp.h}. 1474 1475 1476@need 2000 1477@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP 1478@section Notes for Particular Systems 1479@cindex Build notes for particular systems 1480@cindex Particular systems 1481@cindex Systems 1482@table @asis 1483 1484@c This section is more or less meant for notes about performance or about 1485@c build problems that have been worked around but might leave a user 1486@c scratching their head. Fun with different ABIs on a system belongs in the 1487@c above section. 1488 1489@item AIX 3 and 4 1490@cindex AIX 1491On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since 1492some versions of the native @command{ar} fail on the convenience libraries 1493used. A shared build can be attempted with 1494 1495@example 1496./configure --enable-shared --disable-static 1497@end example 1498 1499Note that the @samp{--disable-static} is necessary because in a shared build 1500libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for 1501the benefit of old versions of @command{ld} which only recognise @file{.a}, 1502but unfortunately this is done even if a fully functional @command{ld} is 1503available. 1504 1505@item ARM 1506@cindex ARM 1507On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a 1508bug in unsigned division, giving wrong results for some operands. GMP 1509@samp{./configure} will demand GCC 2.95.4 or later. 
1510 1511@item Compaq C++ 1512@cindex Compaq C++ 1513Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and 1514an old pre-standard one (see @samp{man iostream_intro}). GMP can only use the 1515standard one, which unfortunately is not the default but must be selected by 1516defining @code{__USE_STD_IOSTREAM}. Configure with for instance 1517 1518@example 1519./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM 1520@end example 1521 1522@item Floating Point Mode 1523@cindex Floating point mode 1524@cindex Hardware floating point mode 1525@cindex Precision of hardware floating point 1526@cindex x87 1527On some systems, the hardware floating point has a control mode which can set 1528all operations to be done in a particular precision, for instance single, 1529double or extended on x86 systems (x87 floating point). The GMP functions 1530involving a @code{double} cannot be expected to operate to their full 1531precision when the hardware is in single precision mode. Of course this 1532affects all code, including application code, not just GMP. 1533 1534@item FreeBSD 7.x, 8.x, 9.0, 9.1, 9.2 1535@cindex FreeBSD 1536@command{m4} in these releases of FreeBSD has an eval function which ignores 1537its 2nd and 3rd arguments, which makes it unsuitable for @file{.asm} file 1538processing. @samp{./configure} will detect the problem and either abort or 1539choose another m4 in the @env{PATH}. The bug is fixed in FreeBSD 9.3 and 10.0, 1540so either upgrade or use GNU m4. Note that the FreeBSD package system installs 1541GNU m4 under the name @samp{gm4}, which GMP cannot guess. 1542 1543@item FreeBSD 7.x, 8.x, 9.x 1544@cindex FreeBSD 1545GMP releases starting with 6.0 do not support @samp{ABI=32} on FreeBSD/amd64 1546prior to release 10.0 of the system. The cause is a broken @code{limits.h}, 1547which GMP no longer works around. 
1548 1549@item MS-DOS and MS Windows 1550@cindex MS-DOS 1551@cindex MS Windows 1552@cindex Windows 1553@cindex Cygwin 1554@cindex DJGPP 1555@cindex MINGW 1556On an MS-DOS system DJGPP can be used to build GMP, and on an MS Windows 1557system Cygwin, DJGPP and MINGW can be used. All three are excellent ports of 1558GCC and the various GNU tools. 1559 1560@display 1561@uref{https://www.cygwin.com/} 1562@uref{http://www.delorie.com/djgpp/} 1563@uref{http://www.mingw.org/} 1564@end display 1565 1566@cindex Interix 1567@cindex Services for Unix 1568Microsoft also publishes an Interix ``Services for Unix'' which can be used to 1569build GMP on Windows (with a normal @samp{./configure}), but it's not free 1570software. 1571 1572@item MS Windows DLLs 1573@cindex DLLs 1574@cindex MS Windows 1575@cindex Windows 1576On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by 1577default GMP builds only a static library, but a DLL can be built instead using 1578 1579@example 1580./configure --disable-static --enable-shared 1581@end example 1582 1583Static and DLL libraries can't both be built, since certain export directives 1584in @file{gmp.h} must be different. 1585 1586A MINGW DLL build of GMP can be used with Microsoft C@. Libtool doesn't 1587install a @file{.lib} format import library, but it can be created with MS 1588@command{lib} as follows, and copied to the install directory. Similarly for 1589@file{libmp} and @file{libgmpxx}. 1590 1591@example 1592cd .libs 1593lib /def:libgmp-3.dll.def /out:libgmp-3.lib 1594@end example 1595 1596MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications 1597wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do 1598the same. If one of the other C runtime library choices provided by MS C is 1599desired then the suggestion is to use the GMP string functions and confine I/O 1600to the application. 
1601 1602@item Motorola 68k CPU Types 1603@cindex 68000 1604@samp{m68k} is taken to mean 68000. @samp{m68020} or higher will give a 1605performance boost on applicable CPUs. @samp{m68360} can be used for CPU32 1606series chips. @samp{m68302} can be used for ``Dragonball'' series chips, 1607though this is merely a synonym for @samp{m68000}. 1608 1609@item NetBSD 5.x 1610@cindex NetBSD 1611@command{m4} in these releases of NetBSD has an eval function which ignores its 16122nd and 3rd arguments, which makes it unsuitable for @file{.asm} file 1613processing. @samp{./configure} will detect the problem and either abort or 1614choose another m4 in the @env{PATH}. The bug is fixed in NetBSD 6, so either 1615upgrade or use GNU m4. Note that the NetBSD package system installs GNU m4 1616under the name @samp{gm4}, which GMP cannot guess. 1617 1618@item OpenBSD 2.6 1619@cindex OpenBSD 1620@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it 1621unsuitable for @file{.asm} file processing. @samp{./configure} will detect 1622the problem and either abort or choose another m4 in the @env{PATH}. The bug 1623is fixed in OpenBSD 2.7, so either upgrade or use GNU m4. 1624 1625@item Power CPU Types 1626@cindex Power/PowerPC 1627In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions 1628not available on the other, so it's important to choose the right one for the 1629CPU that will be used. Currently GMP has no assembly code support for using 1630just the common instruction subset. To get executables that run on both, the 1631current suggestion is to use the generic C code (@option{--disable-assembly}), 1632possibly with appropriate compiler options (like @samp{-mcpu=common} for 1633@command{gcc}). CPU @samp{rs6000} (which is not a CPU but a family of 1634workstations) is accepted by @file{config.sub}, but is currently equivalent to 1635@option{--disable-assembly}. 
1636 1637@item Sparc CPU Types 1638@cindex Sparc 1639@samp{sparcv8} or @samp{supersparc} on relevant systems will give a 1640significant performance increase over the V7 code selected by plain 1641@samp{sparc}. 1642 1643@item Sparc App Regs 1644@cindex Sparc 1645The GMP assembly code for both 32-bit and 64-bit Sparc clobbers the 1646``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way 1647that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,, SPARC 1648Options, gcc, Using the GNU Compiler Collection (GCC)}). 1649 1650This makes that code unsuitable for use with the special V9 1651@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer), and 1652for applications wanting to use those registers for special purposes. In these 1653cases the only suggestion currently is to build GMP with 1654@option{--disable-assembly} to avoid the assembly code. 1655 1656@item SunOS 4 1657@cindex SunOS 1658@command{/usr/bin/m4} lacks various features needed to process @file{.asm} 1659files, and instead @samp{./configure} will automatically use 1660@command{/usr/5bin/m4}, which we believe is always available (if not then use 1661GNU m4). 1662 1663@item x86 CPU Types 1664@cindex x86 1665@cindex 80x86 1666@cindex i386 1667@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for its intended 1668P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II, 1669P-III)@. @samp{i386} is a better choice when making binaries that must run on 1670both. 1671 1672@item x86 MMX and SSE2 Code 1673@cindex MMX 1674@cindex SSE2 1675If the CPU selected has MMX code but the assembler doesn't support it, a 1676warning is given and non-MMX code is used instead. This will be an inferior 1677build, since the MMX code that's present is there because it's faster than the 1678corresponding plain integer code. The same applies to SSE2. 
1679 1680Old versions of @samp{gas} don't support MMX instructions, in particular 1681version 1.92.3 that comes with FreeBSD 2.2.8 or the more recent OpenBSD 3.1 1682doesn't. 1683 1684Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register 1685to register @code{movq} instructions, and so can't be used for MMX code. 1686Install a recent @command{gas} if MMX code is wanted on these systems. 1687@end table 1688 1689 1690@need 2000 1691@node Known Build Problems, Performance optimization, Notes for Particular Systems, Installing GMP 1692@section Known Build Problems 1693@cindex Build problems known 1694 1695@c This section is more or less meant for known build problems that are not 1696@c otherwise worked around and require some sort of manual intervention. 1697 1698You might find more up-to-date information at @uref{https://gmplib.org/}. 1699 1700@table @asis 1701@item Compiler link options 1702The version of libtool currently in use rather aggressively strips compiler 1703options when linking a shared library. This will hopefully be relaxed in the 1704future, but for now if this is a problem the suggestion is to create a little 1705script to hide them, and for instance configure with 1706 1707@example 1708./configure CC=gcc-with-my-options 1709@end example 1710 1711@item DJGPP (@samp{*-*-msdosdjgpp*}) 1712@cindex DJGPP 1713The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure} 1714script, it exits silently, having died writing a preamble to 1715@file{config.log}. Use @command{bash} 2.04 or higher. 1716 1717@samp{make all} was found to run out of memory during the final 1718@file{libgmp.la} link on one system tested, despite having 64Mb available. 1719Running @samp{make libgmp.la} directly helped, perhaps recursing into the 1720various subdirectories uses up memory. 
1721 1722@item GNU binutils @command{strip} prior to 2.12 1723@cindex Stripped libraries 1724@cindex Binutils @command{strip} 1725@cindex GNU @command{strip} 1726@command{strip} from GNU binutils 2.11 and earlier should not be used on the 1727static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all 1728but the last of multiple archive members with the same name, like the three 1729versions of @file{init.o} in @file{libgmp.a}. Binutils 2.12 or higher can be 1730used successfully. 1731 1732The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by 1733this and any version of @command{strip} can be used on them. 1734 1735@item @command{make} syntax error 1736@cindex SCO 1737@cindex IRIX 1738On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make} 1739is unable to handle the long dependencies list for @file{libgmp.la}. The 1740symptom is a ``syntax error'' on the following line of the top-level 1741@file{Makefile}. 1742 1743@example 1744libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES) 1745@end example 1746 1747Either use GNU Make, or as a workaround remove 1748@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial 1749build work, but if any recompiling is done @file{libgmp.la} might not be 1750rebuilt). 1751 1752@item MacOS X (@samp{*-*-darwin*}) 1753@cindex MacOS X 1754@cindex Darwin 1755Libtool currently only knows how to create shared libraries on MacOS X using 1756the native @command{cc} (which is a modified GCC), not a plain GCC@. A 1757static-only build should work though (@samp{--disable-shared}). 1758 1759@item NeXT prior to 3.3 1760@cindex NeXT 1761The system compiler on old versions of NeXT was a massacred and old GCC, even 1762if it called itself @file{cc}. This compiler cannot be used to build GMP, you 1763need to get a real GCC, and install that. (NeXT may have fixed this in 1764release 3.3 of their system.) 

@item POWER and PowerPC
@cindex Power/PowerPC
Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
PowerPC@.  If you want to use GCC for these machines, get GCC 2.7.2.1 (or
later).

@item Sequent Symmetry
@cindex Sequent Symmetry
Use the GNU assembler instead of the system assembler, since the latter has
serious bugs.

@item Solaris 2.6
@cindex Solaris
The system @command{sed} prints an error ``Output line too long'' when libtool
builds @file{libgmp.la}.  This doesn't seem to cause any obvious ill effects,
but GNU @command{sed} is recommended, to avoid any doubt.

@item Sparc Solaris 2.7 with gcc 2.95.2 in @samp{ABI=32}
@cindex Solaris
A shared library build of GMP seems to fail in this combination, it builds but
then fails the tests, apparently due to some incorrect data relocations within
@code{gmp_randinit_lc_2exp_size}.  The exact cause is unknown,
@samp{--disable-shared} is recommended.
@end table


@need 2000
@node Performance optimization, , Known Build Problems, Installing GMP
@section Performance optimization
@cindex Optimizing performance

@c At some point, this should perhaps move to a separate chapter on optimizing
@c performance.

For optimal performance, build GMP for the exact CPU type of the target
computer, see @ref{Build Options}.

Unlike what is the case for most other programs, the compiler typically
doesn't matter much, since GMP uses assembly language for the most critical
operations.

In particular for long-running GMP applications, and applications demanding
extremely large numbers, building and running the @code{tuneup} program in the
@file{tune} subdirectory can be important.
For example,

@example
cd tune
make tuneup
./tuneup
@end example

will generate better contents for the @file{gmp-mparam.h} parameter file.

To use the results, put the output in the file indicated in the
@samp{Parameters for ...} header.  Then recompile from scratch.

The @code{tuneup} program takes one useful parameter, @samp{-f NNN}, which
instructs the program how long to check FFT multiply parameters.  If you're
going to use GMP for extremely large numbers, you may want to run @code{tuneup}
with a large NNN value.


@node GMP Basics, Reporting Bugs, Installing GMP, Top
@comment  node-name,  next,  previous,  up
@chapter GMP Basics
@cindex Basics

@strong{Using functions, macros, data types, etc.@: not documented in this
manual is strongly discouraged.  If you do so your application is guaranteed
to be incompatible with future versions of GMP.}

@menu
* Headers and Libraries::
* Nomenclature and Types::
* Function Classes::
* Variable Conventions::
* Parameter Conventions::
* Memory Management::
* Reentrancy::
* Useful Macros and Constants::
* Compatibility with older versions::
* Demonstration Programs::
* Efficiency::
* Debugging::
* Profiling::
* Autoconf::
* Emacs::
@end menu

@node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics
@section Headers and Libraries
@cindex Headers

@cindex @file{gmp.h}
@cindex Include files
@cindex @code{#include}
All declarations needed to use GMP are collected in the include file
@file{gmp.h}.  It is designed to work with both C and C++ compilers.

@example
#include <gmp.h>
@end example

@cindex @code{stdio.h}
Note however that prototypes for GMP functions with @code{FILE *} parameters
are only provided if @code{<stdio.h>} is included too.

@example
#include <stdio.h>
#include <gmp.h>
@end example

@cindex @code{stdarg.h}
Likewise @code{<stdarg.h>} is required for prototypes with @code{va_list}
parameters, such as @code{gmp_vprintf}.  And @code{<obstack.h>} for prototypes
with @code{struct obstack} parameters, such as @code{gmp_obstack_printf}, when
available.

@cindex Libraries
@cindex Linking
@cindex @code{libgmp}
All programs using GMP must link against the @file{libgmp} library.  On a
typical Unix-like system this can be done with @samp{-lgmp}, for example

@example
gcc myprogram.c -lgmp
@end example

@cindex @code{libgmpxx}
GMP C++ functions are in a separate @file{libgmpxx} library.  This is built
and installed if C++ support has been enabled (@pxref{Build Options}).  For
example,

@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example

@cindex Libtool
GMP is built using Libtool and an application can use that to link if desired,
@GMPpxreftop{libtool, GNU Libtool}.

If GMP has been installed to a non-standard location then it may be necessary
to use @samp{-I} and @samp{-L} compiler options to point to the right
directories, and some sort of run-time path for a shared library.


@node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics
@section Nomenclature and Types
@cindex Nomenclature
@cindex Types

@cindex Integer
@tindex @code{mpz_t}
In this manual, @dfn{integer} usually means a multiple precision integer, as
defined by the GMP library.  The C data type for such integers is @code{mpz_t}.
Here are some examples of how to declare such integers:

@example
mpz_t sum;

struct foo @{ mpz_t x, y; @};

mpz_t vec[20];
@end example

@cindex Rational number
@tindex @code{mpq_t}
@dfn{Rational number} means a multiple precision fraction.
The C data type
for these fractions is @code{mpq_t}.  For example:

@example
mpq_t quotient;
@end example

@cindex Floating-point number
@tindex @code{mpf_t}
@dfn{Floating point number}, or @dfn{Float} for short, is an arbitrary precision
mantissa with a limited precision exponent.  The C data type for such objects
is @code{mpf_t}.  For example:

@example
mpf_t fp;
@end example

@tindex @code{mp_exp_t}
The floating point functions accept and return exponents in the C type
@code{mp_exp_t}.  Currently this is usually a @code{long}, but on some systems
it's an @code{int} for efficiency.

@cindex Limb
@tindex @code{mp_limb_t}
A @dfn{limb} means the part of a multi-precision number that fits in a single
machine word.  (We chose this word because a limb of the human body is
analogous to a digit, only larger, and containing several digits.)  Normally a
limb is 32 or 64 bits.  The C data type for a limb is @code{mp_limb_t}.

@tindex @code{mp_size_t}
Counts of limbs of a multi-precision number are represented in the C type
@code{mp_size_t}.  Currently this is normally a @code{long}, but on some
systems it's an @code{int} for efficiency, and on some systems it will be
@code{long long} in the future.

@tindex @code{mp_bitcnt_t}
Counts of bits of a multi-precision number are represented in the C type
@code{mp_bitcnt_t}.  Currently this is always an @code{unsigned long}, but on
some systems it will be an @code{unsigned long long} in the future.

@cindex Random state
@tindex @code{gmp_randstate_t}
@dfn{Random state} means an algorithm selection and current state data.  The C
data type for such objects is @code{gmp_randstate_t}.
For example:

@example
gmp_randstate_t rstate;
@end example

Also, in general @code{mp_bitcnt_t} is used for bit counts and ranges, and
@code{size_t} is used for byte or character counts.


@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
@section Function Classes
@cindex Function classes

There are five classes of functions in the GMP library:

@enumerate
@item
Functions for signed integer arithmetic, with names beginning with
@code{mpz_}.  The associated type is @code{mpz_t}.  There are about 150
functions in this class.  (@pxref{Integer Functions})

@item
Functions for rational number arithmetic, with names beginning with
@code{mpq_}.  The associated type is @code{mpq_t}.  There are about 35
functions in this class, but the integer functions can be used for arithmetic
on the numerator and denominator separately.  (@pxref{Rational Number
Functions})

@item
Functions for floating-point arithmetic, with names beginning with
@code{mpf_}.  The associated type is @code{mpf_t}.  There are about 70
functions in this class.  (@pxref{Floating-point Functions})

@item
Fast low-level functions that operate on natural numbers.  These are used by
the functions in the preceding groups, and you can also call them directly
from very time-critical user programs.  These functions' names begin with
@code{mpn_}.  The associated type is array of @code{mp_limb_t}.  There are
about 60 (hard-to-use) functions in this class.  (@pxref{Low-level Functions})

@item
Miscellaneous functions.  Functions for setting up custom allocation and
functions for generating random numbers.
(@pxref{Custom Allocation}, and
@pxref{Random Number Functions})
@end enumerate


@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
@section Variable Conventions
@cindex Variable conventions
@cindex Conventions for variables

GMP functions generally have output arguments before input arguments.  This
notation is by analogy with the assignment operator.

GMP lets you use the same variable for both input and output in one call.  For
example, the main function for integer multiplication, @code{mpz_mul}, can be
used to square @code{x} and put the result back in @code{x} with

@example
mpz_mul (x, x, x);
@end example

Before you can assign to a GMP variable, you need to initialize it by calling
one of the special initialization functions.  When you're done with a
variable, you need to clear it out, using one of the functions for that
purpose.  Which function to use depends on the type of variable.  See the
chapters on integer functions, rational number functions, and floating-point
functions for details.

A variable should only be initialized once, or at least cleared between each
initialization.  After a variable has been initialized, it may be assigned to
any number of times.

For efficiency reasons, avoid excessive initializing and clearing.  In
general, initialize near the start of a function and clear near the end.  For
example,

@example
void
foo (void)
@{
  mpz_t n;
  int i;
  mpz_init (n);
  for (i = 1; i < 100; i++)
    @{
      mpz_mul (n, @dots{});
      mpz_fdiv_q (n, @dots{});
      @dots{}
    @}
  mpz_clear (n);
@}
@end example

GMP types like @code{mpz_t} are implemented as one-element arrays of certain
structures.
Declaring a variable creates an object with the fields GMP needs,
but variables are normally manipulated by using the pointer to the object.  For
both behavior and efficiency reasons, making copies of the GMP object itself
(either directly or via aggregate objects containing such GMP objects) is
discouraged.  If copies are done, all of them must be used read-only; using a
copy as the output of some function will invalidate all the other copies.  Note
that the actual fields in each @code{mpz_t} etc are for internal use only and
should not be accessed directly by code that expects to be compatible with
future GMP releases.

@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
@section Parameter Conventions
@cindex Parameter conventions
@cindex Conventions for parameters

When a GMP variable is used as a function parameter, it's effectively a
call-by-reference, meaning that when the function stores a value there it will
change the original in the caller.  Parameters which are input-only can be
designated @code{const} to provoke a compiler error or warning on attempting to
modify them.

When a function is going to return a GMP result, it should designate a
parameter that it sets, like the library functions do.  More than one value
can be returned by having more than one output parameter, again like the
library functions.  A @code{return} of an @code{mpz_t} etc doesn't return the
object, only a pointer, and this is almost certainly not what's wanted.

Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
and storing the result to the indicated parameter.

@example
#include <gmp.h>

void
foo (mpz_t result, const mpz_t param, unsigned long n)
@{
  unsigned long i;
  mpz_mul_ui (result, param, n);
  for (i = 1; i < n; i++)
    mpz_add_ui (result, result, i*7);
@}

int
main (void)
@{
  mpz_t r, n;
  mpz_init (r);
  mpz_init_set_str (n, "123456", 0);
  foo (r, n, 20L);
  gmp_printf ("%Zd\n", r);
  mpz_clear (r);
  mpz_clear (n);
  return 0;
@}
@end example

Our function @code{foo} works even if its caller passes the same variable for
@code{param} and @code{result}, just like the library functions.  But
sometimes it's tricky to make that work, and an application might not want to
bother supporting that sort of thing.

Since GMP types are implemented as one-element arrays, using a GMP variable as
a parameter passes a pointer to the object.  Hence the call-by-reference.


@need 1000
@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
@section Memory Management
@cindex Memory management

The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
and pointers to allocated data.  Once a variable is initialized, GMP takes
care of all space allocation.  Additional space is allocated whenever a
variable doesn't have enough.

@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
Normally this is the best policy, since it avoids frequent reallocation.
Applications that need to return memory to the heap at some particular point
can use @code{mpz_realloc2}, or clear variables no longer needed.

@code{mpf_t} variables, in the current implementation, use a fixed amount of
space, determined by the chosen precision and allocated at initialization, so
their size doesn't change.

All memory is allocated using @code{malloc} and friends by default, but this
can be changed, see @ref{Custom Allocation}.
Temporary memory on the stack is
also used (via @code{alloca}), but this can be changed at build-time if
desired, see @ref{Build Options}.


@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
@section Reentrancy
@cindex Reentrancy
@cindex Thread safety
@cindex Multi-threading

@noindent
GMP is reentrant and thread-safe, with some exceptions:

@itemize @bullet
@item
If configured with @option{--enable-alloca=malloc-notreentrant} (or with
@option{--enable-alloca=notreentrant} when @code{alloca} is not available),
then naturally GMP is not reentrant.

@item
@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
selected precision.  @code{mpf_init2} can be used instead, and in the C++
interface an explicit precision to the @code{mpf_class} constructor.

@item
@code{mpz_random} and the other old random number functions use a global
random state and are hence not reentrant.  The newer random number functions
that accept a @code{gmp_randstate_t} parameter can be used instead.

@item
@code{gmp_randinit} (obsolete) returns an error indication through a global
variable, which is not thread safe.  Applications are advised to use
@code{gmp_randinit_default} or @code{gmp_randinit_lc_2exp} instead.

@item
@code{mp_set_memory_functions} uses global variables to store the selected
memory allocation functions.

@item
If the memory allocation functions set by a call to
@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
not reentrant, then GMP will not be reentrant either.

@item
If the standard I/O functions such as @code{fwrite} are not reentrant then the
GMP I/O functions using them will not be reentrant either.

@item
It's safe for two threads to read from the same GMP variable simultaneously,
but it's not safe for one to read while another might be writing, nor for
two threads to write simultaneously.  It's not safe for two threads to
generate a random number from the same @code{gmp_randstate_t} simultaneously,
since this involves an update of that variable.
@end itemize


@need 2000
@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
@section Useful Macros and Constants
@cindex Useful macros and constants
@cindex Constants

@deftypevr {Global Constant} {const int} mp_bits_per_limb
@findex mp_bits_per_limb
@cindex Bits per limb
@cindex Limb size
The number of bits per limb.
@end deftypevr

@defmac __GNU_MP_VERSION
@defmacx __GNU_MP_VERSION_MINOR
@defmacx __GNU_MP_VERSION_PATCHLEVEL
@cindex Version number
@cindex GMP version number
The major and minor GMP version, and patch level, respectively, as integers.
For GMP i.j, these numbers will be i, j, and 0, respectively.
For GMP i.j.k, these numbers will be i, j, and k, respectively.
@end defmac

@deftypevr {Global Constant} {const char * const} gmp_version
@findex gmp_version
The GMP version number, as a null-terminated string, in the form ``i.j.k''.
This release is @nicode{"@value{VERSION}"}.  Note that the format ``i.j'' was
used, before version 4.3.0, when k was zero.
@end deftypevr

@defmac __GMP_CC
@defmacx __GMP_CFLAGS
The compiler and compiler flags, respectively, used when compiling GMP, as
strings.
@end defmac


@node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics
@section Compatibility with older versions
@cindex Compatibility with older versions
@cindex Past GMP versions
@cindex Upward compatibility

This version of GMP is upwardly binary compatible with all 5.x, 4.x, and 3.x
versions, and upwardly compatible at the source level with all 2.x versions,
with the following exceptions.

@itemize @bullet
@item
@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
with other @code{mpn} functions.

@item
@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
3.0.1, but in 3.1 reverted to the 2.x style.

@item
@code{mpn_bdivmod}, documented as preliminary in GMP 4, has been removed.
@end itemize

There are a number of compatibility issues between GMP 1 and GMP 2 that of
course also apply when porting applications from GMP 1 to GMP 5.  Please
see the GMP 2 manual for details.

@c @item Integer division functions round the result differently.  The obsolete
@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
@c quotient towards
@c @ifinfo
@c @minus{}infinity).
@c @end ifinfo
@c @iftex
@c @tex
@c $-\infty$).
@c @end tex
@c @end iftex
@c There are a lot of functions for integer division, giving the user better
@c control over the rounding.

@c @item The function @code{mpz_mod} now compute the true @strong{mod} function.

@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
@c @strong{mod} for reduction.

@c @item The assignment functions for rational numbers do no longer canonicalize
@c their results.
@c In the case a non-canonical result could arise from an
@c assignment, the user need to insert an explicit call to
@c @code{mpq_canonicalize}.  This change was made for efficiency.

@c @item Output generated by @code{mpz_out_raw} in this release cannot be read
@c by @code{mpz_inp_raw} in previous releases.  This change was made for making
@c the file format truly portable between machines with different word sizes.

@c @item Several @code{mpn} functions have changed.  But they were intentionally
@c undocumented in previous releases.

@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
@c are now implemented as macros, and thereby sometimes evaluate their
@c arguments multiple times.

@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
@c for 0^0.  (In version 1, they yielded 0.)

@c In version 1 of the library, @code{mpq_set_den} handled negative
@c denominators by copying the sign to the numerator.  That is no longer done.

@c Pure assignment functions do not canonicalize the assigned variable.  It is
@c the responsibility of the user to canonicalize the assigned variable before
@c any arithmetic operations are performed on that variable.
@c Note that this is an incompatible change from version 1 of the library.

@c @end enumerate


@need 1000
@node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics
@section Demonstration programs
@cindex Demonstration programs
@cindex Example programs
@cindex Sample programs
The @file{demos} subdirectory has some sample programs using GMP@.  These
aren't built or installed, but there's a @file{Makefile} with rules for them.
For instance,

@example
make pexpr
./pexpr 68^975+10
@end example

@noindent
The following programs are provided

@itemize @bullet
@item
@cindex Expression parsing demo
@cindex Parsing expressions demo
@samp{pexpr} is an expression evaluator, the program used on the GMP web page.
@item
@cindex Expression parsing demo
@cindex Parsing expressions demo
The @samp{calc} subdirectory has a similar but simpler evaluator using
@command{lex} and @command{yacc}.
@item
@cindex Expression parsing demo
@cindex Parsing expressions demo
The @samp{expr} subdirectory is yet another expression evaluator, a library
designed for ease of use within a C program.  See @file{demos/expr/README} for
more information.
@item
@cindex Factorization demo
@samp{factorize} is a Pollard-Rho factorization program.
@item
@samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p}
function.
@item
@samp{primes} counts or lists primes in an interval, using a sieve.
@item
@samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic
class numbers.
@item
@cindex @code{perl}
@cindex GMP Perl module
@cindex Perl module
The @samp{perl} subdirectory is a comprehensive perl interface to GMP@.  See
@file{demos/perl/INSTALL} for more information.  Documentation is in POD
format in @file{demos/perl/GMP.pm}.
@end itemize

As an aside, consideration has been given at various times to some sort of
expression evaluation within the main GMP library.  Going beyond something
minimal quickly leads to matters like user-defined functions, looping, fixnums
for control variables, etc, which are considered outside the scope of GMP
(much closer to language interpreters or compilers, @pxref{Language Bindings}.)
Something simple for program input convenience may yet be a possibility, a
combination of the @file{expr} demo and the @file{pexpr} tree back-end
perhaps.  But for now the above evaluators are offered as illustrations.


@need 1000
@node Efficiency, Debugging, Demonstration Programs, GMP Basics
@section Efficiency
@cindex Efficiency

@table @asis
@item Small Operands
@cindex Small operands
On small operands, the time for function call overheads and memory allocation
can be significant in comparison to actual calculation.  This is unavoidable
in a general purpose variable precision library, although GMP attempts to be
as efficient as it can on both large and small operands.

@item Static Linking
@cindex Static linking
On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
have a small overhead on each function call and global data address.  For many
programs this will be insignificant, but for long calculations there's a gain
to be had.

@item Initializing and Clearing
@cindex Initializing and clearing
Avoid excessive initializing and clearing of variables, since this can be
quite time consuming, especially in comparison to otherwise fast operations
like addition.

A language interpreter might want to keep a free list or stack of
initialized variables ready for use.  It should be possible to integrate
something like that with a garbage collector too.

@item Reallocations
@cindex Reallocations
An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
values will have its memory repeatedly @code{realloc}ed, which could be quite
slow or could fragment memory, depending on the C library.
If an application
can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
be called to allocate the necessary space from the beginning
(@pxref{Initializing Integers}).

It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
is too small, since all functions will do a further reallocation if necessary.
Badly overestimating memory required will waste space though.

@item @code{2exp} Functions
@cindex @code{2exp} functions
It's up to an application to call functions like @code{mpz_mul_2exp} when
appropriate.  General purpose functions like @code{mpz_mul} make no attempt to
identify powers of two or other special forms, because such inputs will
usually be very rare and testing every time would be wasteful.

@item @code{ui} and @code{si} Functions
@cindex @code{ui} and @code{si} functions
The @code{ui} functions and the small number of @code{si} functions exist for
convenience and should be used where applicable.  But if for example an
@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
need to extract it and call a @code{ui} function, just use the regular
@code{mpz} function.

@item In-Place Operations
@cindex In-place operations
@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
and @code{mpf_neg} are fast when used for in-place operations like
@code{mpz_abs(x,x)}, since in the current implementation only a single field
of @code{x} needs changing.  On suitable compilers (GCC for instance) this is
inlined too.

@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
usually only one or two limbs of @code{x} will need to be changed.  The same
applies to the full precision @code{mpz_add} etc if @code{y} is small.
If
@code{y} is big then cache locality may be helped, but that's all.

@code{mpz_mul} is currently the opposite, a separate destination is slightly
better.  A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
limb, make a temporary copy of @code{x} before forming the result.  Normally
that copying will only be a tiny fraction of the time for the multiply, so
this is not a particularly important consideration.

@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
no attempt to recognise a copy of something to itself, so a call like
@code{mpz_set(x,x)} will be wasteful.  Naturally that would never be written
deliberately, but if it might arise from two pointers to the same object then
a test to avoid it might be desirable.

@example
if (x != y)
  mpz_set (x, y);
@end example

Note that it's never worth introducing extra @code{mpz_set} calls just to get
in-place operations.  If a result should go to a particular variable then just
direct it there and let GMP take care of data movement.

@item Divisibility Testing (Small Integers)
@cindex Divisibility testing
@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
for testing whether an @code{mpz_t} is divisible by an individual small
integer.  They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
which gives no useful information about the actual remainder, only whether
it's zero (or a particular value).

However when testing divisibility by several small integers, it's best to take
a remainder modulo their product, to save multi-precision operations.  For
instance to test whether a number is divisible by any of 23, 29 or 31 take a
remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that.

The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
as a remainder are generally a little slower than the remainder-only functions
like @code{mpz_tdiv_ui}.  If the quotient is only rarely wanted then it's
probably best to just take a remainder and then go back and calculate the
quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
remainder is zero).

@item Rational Arithmetic
@cindex Rational arithmetic
The @code{mpq} functions operate on @code{mpq_t} values with no common factors
in the numerator and denominator.  Common factors are checked-for and cast out
as necessary.  In general, cancelling factors every time is the best approach
since it minimizes the sizes for subsequent operations.

However, applications that know something about the factorization of the
values they're working with might be able to avoid some of the GCDs used for
canonicalization, or swap them for divisions.  For example when multiplying by
a prime it's enough to check for factors of it in the denominator instead of
doing a full GCD@.  Or when forming a big product it might be known that very
little cancellation will be possible, and so canonicalization can be left to
the end.

The @code{mpq_numref} and @code{mpq_denref} macros give access to the
numerator and denominator to do things outside the scope of the supplied
@code{mpq} functions.  @xref{Applying Integer Functions}.

The canonical form for rationals allows mixed-type @code{mpq_t} and integer
additions or subtractions to be done directly with multiples of the
denominator.  This will be somewhat faster than @code{mpq_add}.
For example,

@example
/* mpq increment */
mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));

/* mpq += unsigned long */
mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);

/* mpq -= mpz */
mpz_submul (mpq_numref(q), mpq_denref(q), z);
@end example

@item Number Sequences
@cindex Number sequences
Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
are designed for calculating isolated values.  If a range of values is wanted
it's probably best to get a starting point and iterate from there.

@item Text Input/Output
@cindex Text input/output
Hexadecimal or octal are suggested for input or output in text form.
Power-of-2 bases like these can be converted much more efficiently than other
bases, like decimal.  For big numbers there's usually nothing of particular
interest to be seen in the digits, so the base doesn't matter much.

Maybe we can hope octal will one day become the normal base for everyday use,
as proposed by King Charles XII of Sweden and later reformers.
@c Reference: Knuth volume 2 section 4.1, page 184 of second edition.  :-)
@end table


@node Debugging, Profiling, Efficiency, GMP Basics
@section Debugging
@cindex Debugging

@table @asis
@item Stack Overflow
@cindex Stack overflow
@cindex Segmentation violation
@cindex Bus error
Depending on the system, a segmentation violation or bus error might be the
only indication of stack overflow.  See @samp{--enable-alloca} choices in
@ref{Build Options}, for how to address this.

In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
overflow is recognised by the system before too much damage is done, or
@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
These options must be added to the @samp{CFLAGS} used in the GMP build
(@pxref{Build Options}), adding them just to an application will have no
effect. Note also they're a slowdown, adding overhead to each function call
and each stack allocation.

@item Heap Problems
@cindex Heap problems
@cindex Malloc problems
The most likely cause of application problems with GMP is heap corruption.
Failing to @code{init} GMP variables will have unpredictable effects, and
corruption arising elsewhere in a program may well affect GMP@. Initializing
GMP variables more than once or failing to clear them will cause memory leaks.

@cindex Malloc debugger
In all such cases a @code{malloc} debugger is recommended. On a GNU or BSD
system the standard C library @code{malloc} has some diagnostic facilities,
see @ref{Allocation Debugging,, Allocation Debugging, libc, The GNU C Library
Reference Manual}, or @samp{man 3 malloc}. Other possibilities, in no
particular order, include

@display
@uref{http://cs.ecs.baylor.edu/~donahoo/tools/ccmalloc/}
@uref{http://dmalloc.com/}
@uref{https://wiki.gnome.org/Apps/MemProf}
@end display

The GMP default allocation routines in @file{memory.c} also have a simple
sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
This is mainly designed for detecting buffer overruns during GMP development,
but might find other uses.

@item Stack Backtraces
@cindex Stack backtrace
On some systems the compiler options GMP uses by default can interfere with
debugging. In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
is used and this generally inhibits stack backtracing. Recompiling without
such options may help while debugging, though the usual caveats about it
potentially moving a memory problem or hiding a compiler bug will apply.

@item GDB, the GNU Debugger
@cindex GDB
@cindex GNU Debugger
A sample @file{.gdbinit} is included in the distribution, showing how to call
some undocumented dump functions to print GMP variables from within GDB@. Note
that these functions shouldn't be used in final application code since they're
undocumented and may be subject to incompatible changes in future versions of
GMP.

@item Source File Paths
GMP has multiple source files with the same name, in different directories.
For example @file{mpz}, @file{mpq} and @file{mpf} each have an
@file{init.c}. If the debugger can't already determine the right one it may
help to build with absolute paths on each C file. One way to do that is to
use a separate object directory with an absolute path to the source directory.

@example
cd /my/build/dir
/my/source/dir/gmp-@value{VERSION}/configure
@end example

This works via @code{VPATH}, and might require GNU @command{make}.
Alternately it might be possible to change the @code{.c.lo} rules
appropriately.

@item Assertion Checking
@cindex Assertion checking
The build option @option{--enable-assert} is available to add some consistency
checks to the library (see @ref{Build Options}). These are likely to be of
limited value to most applications. Assertion failures are just as likely to
indicate memory corruption as a library or compiler bug.

Applications using the low-level @code{mpn} functions, however, will benefit
from @option{--enable-assert} since it adds checks on the parameters of most
such functions, many of which have subtle restrictions on their usage. Note
however that only the generic C code has checks, not the assembly code, so
@option{--disable-assembly} should be used for maximum checking.

@item Temporary Memory Checking
The build option @option{--enable-alloca=debug} arranges that each block of
temporary memory in GMP is allocated with a separate call to @code{malloc} (or
the allocation function set with @code{mp_set_memory_functions}).

This can help a malloc debugger detect accesses outside the intended bounds,
or detect memory not released. In a normal build, on the other hand,
temporary memory is allocated in blocks which GMP divides up for its own use,
or may be allocated with a compiler builtin @code{alloca} which will go
nowhere near any malloc debugger hooks.

@item Maximum Debuggability
To summarize the above, a GMP build for maximum debuggability would be

@example
./configure --disable-shared --enable-assert \
  --enable-alloca=debug --disable-assembly CFLAGS=-g
@end example

For C++, add @samp{--enable-cxx CXXFLAGS=-g}.

@item Checker
@cindex Checker
@cindex GCC Checker
The GCC checker (@uref{https://savannah.nongnu.org/projects/checker/}) can be
used with GMP@. It contains a stub library which means GMP applications
compiled with checker can use a normal GMP build.

A build of GMP with checking within GMP itself can be made. This will run
very very slowly. On GNU/Linux for example,

@cindex @command{checkergcc}
@example
./configure --disable-assembly CC=checkergcc
@end example

@option{--disable-assembly} must be used, since the GMP assembly code doesn't
support the checking scheme.
The GMP C++ features cannot be used, since
current versions of checker (0.9.9.1) don't yet support the standard C++
library.

@item Valgrind
@cindex Valgrind
Valgrind (@uref{http://valgrind.org/}) is a memory checker for x86, ARM, MIPS,
PowerPC, and S/390. It translates and emulates machine instructions to do
strong checks for uninitialized data (at the level of individual bits), memory
accesses through bad pointers, and memory leaks.

Valgrind does not always support every possible instruction, in particular
ones recently added to an ISA. Valgrind might therefore be incompatible with
a recent GMP or even a less recent GMP which is compiled using a recent GCC.

GMP's assembly code sometimes promotes a read of the limbs to some larger size,
for efficiency. GMP will do this even at the start and end of a multilimb
operand, using naturally aligned operations on the larger type. This may lead
to benign reads outside of allocated areas, triggering complaints from
Valgrind. Valgrind's option @samp{--partial-loads-ok=yes} should help.

@item Other Problems
Any suspected bug in GMP itself should be isolated to make sure it's not an
application problem, see @ref{Reporting Bugs}.
@end table


@node Profiling, Autoconf, Debugging, GMP Basics
@section Profiling
@cindex Profiling
@cindex Execution profiling
@cindex @code{--enable-profiling}

Running a program under a profiler is a good way to find where it's spending
most time and where improvements can be best sought. The profiling choices
for a GMP build are as follows.

@table @asis
@item @samp{--disable-profiling}
The default is to add nothing special for profiling.

It should be possible to just compile the mainline of a program with @code{-p}
and use @command{prof} to get a profile consisting of timer-based sampling of
the program counter.
Most of the GMP assembly code has the necessary symbol
information.

This approach has the advantage of minimizing interference with normal program
operation, but on most systems the resolution of the sampling is quite low (10
milliseconds for instance), requiring long runs to get accurate information.

@item @samp{--enable-profiling=prof}
@cindex @code{prof}
Build with support for the system @command{prof}, which means @samp{-p} added
to the @samp{CFLAGS}.

This provides call counting in addition to program counter sampling, which
allows the most frequently called routines to be identified, and an average
time spent in each routine to be determined.

The x86 assembly code has support for this option, but on other processors
the assembly routines will be as if compiled without @samp{-p} and therefore
won't appear in the call counts.

On some systems, such as GNU/Linux, @samp{-p} in fact means @samp{-pg} and in
this case @samp{--enable-profiling=gprof} described below should be used
instead.

@item @samp{--enable-profiling=gprof}
@cindex @code{gprof}
Build with support for @command{gprof}, which means @samp{-pg} added to the
@samp{CFLAGS}.

This provides call graph construction in addition to call counting and program
counter sampling, which makes it possible to count calls coming from different
locations. For example the number of calls to @code{mpn_mul} from
@code{mpz_mul} versus the number from @code{mpf_mul}. The program counter
sampling is still flat though, so only a total time in @code{mpn_mul} would be
accumulated, not a separate amount for each call site.

The x86 assembly code has support for this option, but on other processors
the assembly routines will be as if compiled without @samp{-pg} and therefore
not be included in the call counts.

On x86 and m68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
incompatible, so the latter is omitted from the default flags in that case,
which might result in poorer code generation.

Incidentally, it should be possible to use the @command{gprof} program with a
plain @samp{--enable-profiling=prof} build. But in that case only the
@samp{gprof -p} flat profile and call counts can be expected to be valid, not
the @samp{gprof -q} call graph.

@item @samp{--enable-profiling=instrument}
@cindex @code{-finstrument-functions}
@cindex @code{instrument-functions}
Build with the GCC option @samp{-finstrument-functions} added to the
@samp{CFLAGS} (@pxref{Code Gen Options,, Options for Code Generation, gcc,
Using the GNU Compiler Collection (GCC)}).

This inserts special instrumenting calls at the start and end of each
function, allowing exact timing and full call graph construction.

This instrumenting is not normally a standard system feature and will require
support from an external library, such as

@cindex FunctionCheck
@cindex fnccheck
@display
@uref{https://sourceforge.net/projects/fnccheck/}
@end display

This should be included in @samp{LIBS} during the GMP configure so that test
programs will link. For example,

@example
./configure --enable-profiling=instrument LIBS=-lfc
@end example

On a GNU system the C library provides dummy instrumenting functions, so
programs compiled with this option will link. In this case it's only
necessary to ensure the correct library is added when linking an application.

The x86 assembly code supports this option, but on other processors the
assembly routines will be as if compiled without
@samp{-finstrument-functions} meaning time spent in them will effectively be
attributed to their caller.
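
To give an idea of what such a supporting library has to supply, the
instrumenting calls GCC emits go to @code{__cyg_profile_func_enter} and
@code{__cyg_profile_func_exit}. The following is only a minimal sketch which
counts function entries and does no timing. The counter and the accessor
@code{instrumented_calls} are hypothetical names used for illustration, and a
real profiling library would record considerably more.

```c
/* Hypothetical counter, for illustration only. */
static unsigned long enter_count = 0;

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter (void *this_fn, void *call_site)
  __attribute__ ((no_instrument_function));
void __cyg_profile_func_exit (void *this_fn, void *call_site)
  __attribute__ ((no_instrument_function));
unsigned long instrumented_calls (void)
  __attribute__ ((no_instrument_function));

/* Called by instrumented code at the start of each function. */
void
__cyg_profile_func_enter (void *this_fn, void *call_site)
{
  (void) this_fn;
  (void) call_site;
  enter_count++;
}

/* Called by instrumented code at the end of each function;
   this sketch records nothing here. */
void
__cyg_profile_func_exit (void *this_fn, void *call_site)
{
  (void) this_fn;
  (void) call_site;
}

/* Number of function entries seen so far. */
unsigned long
instrumented_calls (void)
{
  return enter_count;
}
```

Linking an object like this into the application takes the place of the dummy
functions mentioned above.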
@end table


@node Autoconf, Emacs, Profiling, GMP Basics
@section Autoconf
@cindex Autoconf

Autoconf based applications can easily check whether GMP is installed. The
only thing to be noted is that GMP library symbols from version 3 onwards have
prefixes like @code{__gmpz}. The following therefore would be a simple test,

@cindex @code{AC_CHECK_LIB}
@example
AC_CHECK_LIB(gmp, __gmpz_init)
@end example

This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
but an application that must have GMP would want to generate an error if not
found. For example,

@example
AC_CHECK_LIB(gmp, __gmpz_init, ,
  [AC_MSG_ERROR([GNU MP not found, see https://gmplib.org/])])
@end example

If functions added in some particular version of GMP are required, then one of
those can be used when checking. For example @code{mpz_mul_si} was added in
GMP 3.1,

@example
AC_CHECK_LIB(gmp, __gmpz_mul_si, ,
  [AC_MSG_ERROR(
  [GNU MP not found, or not 3.1 or up, see https://gmplib.org/])])
@end example

An alternative would be to test the version number in @file{gmp.h} using say
@code{AC_EGREP_CPP}. That would make it possible to test the exact version,
if some particular sub-minor release is known to be necessary.

In general it's recommended that applications should simply demand a new
enough GMP rather than trying to provide supplements for features not
available in past versions.

Occasionally an application will need or want to know the size of a type at
configuration or preprocessing time, not just with @code{sizeof} in the code.
This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
up is best for this, since prior versions needed certain @samp{-D} defines on
systems using a @code{long long} limb.
The following would suit Autoconf 2.50
or up,

@example
AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
@end example


@node Emacs, , Autoconf, GMP Basics
@section Emacs
@cindex Emacs
@cindex @code{info-lookup-symbol}

@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
emacs, The Emacs Editor}).

The GMP manual can be included in such lookups by putting the following in
your @file{.emacs},

@c This isn't pretty, but there doesn't seem to be a better way (in emacs
@c 21.2 at least). info-lookup->mode-value could be used for the "assoc"s,
@c but that function isn't documented, whereas info-lookup-alist is.
@c
@example
(eval-after-load "info-look"
  '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
     (setcar (nthcdr 3 mode-value)
             (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
                   (nth 3 mode-value)))))
@end example


@node Reporting Bugs, Integer Functions, GMP Basics, Top
@comment node-name, next, previous, up
@chapter Reporting Bugs
@cindex Reporting bugs
@cindex Bug reporting

If you think you have found a bug in the GMP library, please investigate it
and report it. We have made this library available to you, and it is not too
much to ask you to report the bugs you find.

Before you report a bug, check it's not already addressed in @ref{Known Build
Problems}, or perhaps @ref{Notes for Particular Systems}. You may also want
to check @uref{https://gmplib.org/} for patches for this release.

Please include the following in any report,

@itemize @bullet
@item
The GMP version number, and if pre-packaged or patched then say so.

@item
A test program that makes it possible for us to reproduce the bug.
Include
instructions on how to run the program.

@item
A description of what is wrong. If the results are incorrect, in what way.
If you get a crash, say so.

@item
If you get a crash, include a stack backtrace from the debugger if it's
informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).

@item
Please do not send core dumps, executables or @command{strace}s.

@item
The @samp{configure} options you used when building GMP, if any.

@item
The output from @samp{configure}, as printed to stdout, with any options used.

@item
The name of the compiler and its version. For @command{gcc}, get the version
with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.

@item
The output from running @samp{uname -a}.

@item
The output from running @samp{./config.guess}, and from running
@samp{./configfsf.guess} (might be the same).

@item
If the bug is related to @samp{configure}, then the compressed contents of
@file{config.log}.

@item
If the bug is related to an @file{asm} file not assembling, then the contents
of @file{config.m4} and the offending line or lines from the temporary
@file{mpn/tmp-<file>.s}.
@end itemize

Please make an effort to produce a self-contained report, with something
definite that can be tested or debugged. Vague queries or piecemeal messages
are difficult to act on and don't help the development effort.

It is not uncommon that an observed problem is actually due to a bug in the
compiler; the GMP code tends to explore interesting corners in compilers.

If your bug report is good, we will do our best to help you get a corrected
version of the library; if the bug report is poor, we won't do anything about
it (except maybe ask you to send a better report).

Send your report to: @email{gmp-bugs@@gmplib.org}.

If you think something in this manual is unclear, or downright incorrect, or if
the language needs to be improved, please send a note to the same address.


@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
@comment node-name, next, previous, up
@chapter Integer Functions
@cindex Integer functions

This chapter describes the GMP functions for performing integer arithmetic.
These functions start with the prefix @code{mpz_}.

GMP integers are stored in objects of type @code{mpz_t}.

@menu
* Initializing Integers::
* Assigning Integers::
* Simultaneous Integer Init & Assign::
* Converting Integers::
* Integer Arithmetic::
* Integer Division::
* Integer Exponentiation::
* Integer Roots::
* Number Theoretic Functions::
* Integer Comparisons::
* Integer Logic and Bit Fiddling::
* I/O of Integers::
* Integer Random Numbers::
* Integer Import and Export::
* Miscellaneous Integer Functions::
* Integer Special Functions::
@end menu

@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Integer initialization functions
@cindex Initialization functions

The functions for integer arithmetic assume that all integer objects are
initialized. You do that by calling the function @code{mpz_init}. For
example,

@example
@{
  mpz_t integ;
  mpz_init (integ);
  @dots{}
  mpz_add (integ, @dots{});
  @dots{}
  mpz_sub (integ, @dots{});

  /* Unless the program is about to exit, do ... */
  mpz_clear (integ);
@}
@end example

As you can see, you can store new values any number of times, once an
object is initialized.

@deftypefun void mpz_init (mpz_t @var{x})
Initialize @var{x}, and set its value to 0.
@end deftypefun

@deftypefun void mpz_inits (mpz_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpz_t} variables, and set their
values to 0.
@end deftypefun

@deftypefun void mpz_init2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
Initialize @var{x}, with space for @var{n}-bit numbers, and set its value to 0.
Calling this function instead of @code{mpz_init} or @code{mpz_inits} is never
necessary; reallocation is handled automatically by GMP when needed.

While @var{n} defines the initial space, @var{x} will grow automatically in the
normal way, if necessary, for subsequent values stored. @code{mpz_init2} makes
it possible to avoid such reallocations if a maximum size is known in advance.

In preparation for an operation, GMP often allocates one limb more than
ultimately needed. To make sure GMP will not perform reallocation for
@var{x}, you need to add the number of bits in @code{mp_limb_t} to @var{n}.
@end deftypefun

@deftypefun void mpz_clear (mpz_t @var{x})
Free the space occupied by @var{x}. Call this function for all @code{mpz_t}
variables when you are done with them.
@end deftypefun

@deftypefun void mpz_clears (mpz_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpz_t} variables.
@end deftypefun

@deftypefun void mpz_realloc2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
Change the space allocated for @var{x} to @var{n} bits. The value in @var{x}
is preserved if it fits, or is set to 0 if not.

Calling this function is never necessary; reallocation is handled automatically
by GMP when needed. But this function can be used to increase the space for a
variable in order to avoid repeated automatic reallocations, or to decrease it
to give memory back to the heap.
@end deftypefun


@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
@comment node-name, next, previous, up
@section Assignment Functions
@cindex Integer assignment functions
@cindex Assignment functions

These functions assign new values to already initialized integers
(@pxref{Initializing Integers}).

@deftypefun void mpz_set (mpz_t @var{rop}, const mpz_t @var{op})
@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
@deftypefunx void mpz_set_q (mpz_t @var{rop}, const mpq_t @var{op})
@deftypefunx void mpz_set_f (mpz_t @var{rop}, const mpf_t @var{op})
Set the value of @var{rop} from @var{op}.

@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
make it an integer.
@end deftypefun

@deftypefun int mpz_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
Set the value of @var{rop} from @var{str}, a null-terminated C string in base
@var{base}. White space is allowed in the string, and is simply ignored.

The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
@code{0B} for binary, @code{0} for octal, or decimal otherwise.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value. For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

This function returns 0 if the entire string is a valid number in base
@var{base}. Otherwise it returns @minus{}1.
@c
@c It turns out that it is not entirely true that this function ignores
@c white-space. It does ignore it between digits, but not after a minus sign
@c or within or after ``0x''. Some thought was given to disallowing all
@c whitespace, but that would be an incompatible change, whitespace has been
@c documented as ignored ever since GMP 1.
@c
@end deftypefun

@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
Swap the values @var{rop1} and @var{rop2} efficiently.
@end deftypefun


@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
@comment node-name, next, previous, up
@section Combined Initialization and Assignment Functions
@cindex Integer assignment functions
@cindex Assignment functions
@cindex Integer initialization functions
@cindex Initialization functions

For convenience, GMP provides a parallel series of initialize-and-set functions
which initialize the output and then store the value there. These functions'
names have the form @code{mpz_init_set@dots{}}

Here is an example of using one:

@example
@{
  mpz_t pie;
  mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
  @dots{}
  mpz_sub (pie, @dots{});
  @dots{}
  mpz_clear (pie);
@}
@end example

@noindent
Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
functions, it can be used as the source or destination operand for the ordinary
integer functions. Don't use an initialize-and-set function on a variable
already initialized!

@deftypefun void mpz_init_set (mpz_t @var{rop}, const mpz_t @var{op})
@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
Initialize @var{rop} with limb space and set the initial numeric value from
@var{op}.
@end deftypefun

@deftypefun int mpz_init_set_str (mpz_t @var{rop}, const char *@var{str}, int @var{base})
Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
documentation above for details).

If the string is a correct base @var{base} number, the function returns 0;
if an error occurs it returns @minus{}1. @var{rop} is initialized even if
an error occurs. (I.e., you have to call @code{mpz_clear} for it.)
@end deftypefun


@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Integer conversion functions
@cindex Conversion functions

This section describes functions for converting GMP integers to standard C
types. Functions for converting @emph{to} GMP integers are described in
@ref{Assigning Integers} and @ref{I/O of Integers}.

@deftypefun {unsigned long int} mpz_get_ui (const mpz_t @var{op})
Return the value of @var{op} as an @code{unsigned long}.

If @var{op} is too big to fit an @code{unsigned long} then just the least
significant bits that do fit are returned. The sign of @var{op} is ignored,
only the absolute value is used.
@end deftypefun

@deftypefun {signed long int} mpz_get_si (const mpz_t @var{op})
If @var{op} fits into a @code{signed long int} return the value of @var{op}.
Otherwise return the least significant part of @var{op}, with the same sign
as @var{op}.

If @var{op} is too big to fit in a @code{signed long int}, the returned
result is probably not very useful. To find out if the value will fit, use
the function @code{mpz_fits_slong_p}.
@end deftypefun

@deftypefun double mpz_get_d (const mpz_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent from the conversion is too big, the result is system
dependent. An infinity is returned where available. A hardware overflow trap
may or may not occur.
@end deftypefun

@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, const mpz_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero), and returning the exponent separately.

The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
exponent is stored to @code{*@var{exp}}. @m{@var{d} * 2^{exp}, @var{d} *
2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero, the
return is @math{0.0} and 0 is stored to @code{*@var{exp}}.

@cindex @code{frexp}
This is similar to the standard C @code{frexp} function (@pxref{Normalization
Functions,,, libc, The GNU C Library Reference Manual}).
@end deftypefun

@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, const mpz_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base argument
may vary from 2 to 62 or from @minus{}2 to @minus{}36.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of storage large
enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
+ 2}. The two extra bytes are for a possible minus sign, and the
null-terminator.

A pointer to the result string is returned, being either the allocated block,
or the given @var{str}.
@end deftypefun


@need 2000
@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Integer arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpz_add (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_add_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} + @var{op2}}.
@end deftypefun

@deftypefun void mpz_sub (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, const mpz_t @var{op2})
Set @var{rop} to @var{op1} @minus{} @var{op2}.
@end deftypefun

@deftypefun void mpz_mul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_mul_si (mpz_t @var{rop}, const mpz_t @var{op1}, long int @var{op2})
@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_addmul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_submul (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2})
@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, const mpz_t @var{op1}, mp_bitcnt_t @var{op2})
@cindex Bit shift left
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}. This operation can also be defined as a left shift by @var{op2}
bits.
@end deftypefun

@deftypefun void mpz_neg (mpz_t @var{rop}, const mpz_t @var{op})
Set @var{rop} to @minus{}@var{op}.
@end deftypefun

@deftypefun void mpz_abs (mpz_t @var{rop}, const mpz_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun


@need 2000
@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
@section Division Functions
@cindex Integer division functions
@cindex Division functions

Division is undefined if the divisor is zero. Passing a zero divisor to the
division or modulo functions (including the modular powering functions
@code{mpz_powm} and @code{mpz_powm_ui}), will cause an intentional division by
zero. This lets a program handle arithmetic exceptions in these functions the
same way as for normal C @code{int} arithmetic.

@c Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
@c between each, and seem to let tex do a better job of page breaks than an
@c @sp 1 in the middle of one big set.
3310 3311@deftypefun void mpz_cdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d}) 3312@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d}) 3313@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d}) 3314@maybepagebreak 3315@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3316@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3317@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}}) 3318@deftypefunx {unsigned long int} mpz_cdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3319@maybepagebreak 3320@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}}) 3321@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}}) 3322@end deftypefun 3323 3324@deftypefun void mpz_fdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d}) 3325@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d}) 3326@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d}) 3327@maybepagebreak 3328@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3329@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3330@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}}) 3331@deftypefunx {unsigned long int} mpz_fdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3332@maybepagebreak 3333@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}}) 3334@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, 
const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}}) 3335@end deftypefun 3336 3337@deftypefun void mpz_tdiv_q (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d}) 3338@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d}) 3339@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d}) 3340@maybepagebreak 3341@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3342@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3343@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{const mpz_t @var{n}}, @w{unsigned long int @var{d}}) 3344@deftypefunx {unsigned long int} mpz_tdiv_ui (const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3345@maybepagebreak 3346@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}}) 3347@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, const mpz_t @var{n}, @w{mp_bitcnt_t @var{b}}) 3348@cindex Bit shift right 3349 3350@sp 1 3351Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder 3352@var{r}. For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}. 3353The rounding is in three styles, each suiting different applications. 3354 3355@itemize @bullet 3356@item 3357@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will 3358have the opposite sign to @var{d}. The @code{c} stands for ``ceil''. 3359 3360@item 3361@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and 3362@var{r} will have the same sign as @var{d}. The @code{f} stands for 3363``floor''. 3364 3365@item 3366@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign 3367as @var{n}. The @code{t} stands for ``truncate''. 
3368@end itemize 3369 3370In all cases @var{q} and @var{r} will satisfy 3371@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and 3372@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}. 3373 3374The @code{q} functions calculate only the quotient, the @code{r} functions 3375only the remainder, and the @code{qr} functions calculate both. Note that for 3376@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or 3377results will be unpredictable. 3378 3379For the @code{ui} variants the return value is the remainder, and in fact 3380returning the remainder is all the @code{div_ui} functions do. For 3381@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the 3382return value is the absolute value of the remainder. 3383 3384For the @code{2exp} variants the divisor is @m{2^b,2^@var{b}}. These 3385functions are implemented as right shifts and bit masks, but of course they 3386round the same as the other functions. 3387 3388For positive @var{n} both @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp} 3389are simple bitwise right shifts. For negative @var{n}, @code{mpz_fdiv_q_2exp} 3390is effectively an arithmetic right shift treating @var{n} as twos complement 3391the same as the bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp} 3392effectively treats @var{n} as sign and magnitude. 3393@end deftypefun 3394 3395@deftypefun void mpz_mod (mpz_t @var{r}, const mpz_t @var{n}, const mpz_t @var{d}) 3396@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, const mpz_t @var{n}, @w{unsigned long int @var{d}}) 3397Set @var{r} to @var{n} @code{mod} @var{d}. The sign of the divisor is 3398ignored; the result is always non-negative. 3399 3400@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the 3401remainder as well as setting @var{r}. See @code{mpz_fdiv_ui} above if only 3402the return value is wanted. 
3403@end deftypefun 3404 3405@deftypefun void mpz_divexact (mpz_t @var{q}, const mpz_t @var{n}, const mpz_t @var{d}) 3406@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, const mpz_t @var{n}, unsigned long @var{d}) 3407@cindex Exact division functions 3408Set @var{q} to @var{n}/@var{d}. These functions produce correct results only 3409when it is known in advance that @var{d} divides @var{n}. 3410 3411These routines are much faster than the other division functions, and are the 3412best choice when exact division is known to occur, for example reducing a 3413rational to lowest terms. 3414@end deftypefun 3415 3416@deftypefun int mpz_divisible_p (const mpz_t @var{n}, const mpz_t @var{d}) 3417@deftypefunx int mpz_divisible_ui_p (const mpz_t @var{n}, unsigned long int @var{d}) 3418@deftypefunx int mpz_divisible_2exp_p (const mpz_t @var{n}, mp_bitcnt_t @var{b}) 3419@cindex Divisibility functions 3420Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of 3421@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}. 3422 3423@var{n} is divisible by @var{d} if there exists an integer @var{q} satisfying 3424@math{@var{n} = @var{q}@GMPmultiply{}@var{d}}. Unlike the other division 3425functions, @math{@var{d}=0} is accepted and following the rule it can be seen 3426that only 0 is considered divisible by 0. 3427@end deftypefun 3428 3429@deftypefun int mpz_congruent_p (const mpz_t @var{n}, const mpz_t @var{c}, const mpz_t @var{d}) 3430@deftypefunx int mpz_congruent_ui_p (const mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d}) 3431@deftypefunx int mpz_congruent_2exp_p (const mpz_t @var{n}, const mpz_t @var{c}, mp_bitcnt_t @var{b}) 3432@cindex Divisibility functions 3433@cindex Congruence functions 3434Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the 3435case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}. 
3436 3437@var{n} is congruent to @var{c} mod @var{d} if there exists an integer @var{q} 3438satisfying @math{@var{n} = @var{c} + @var{q}@GMPmultiply{}@var{d}}. Unlike 3439the other division functions, @math{@var{d}=0} is accepted and following the 3440rule it can be seen that @var{n} and @var{c} are considered congruent mod 0 3441only when exactly equal. 3442@end deftypefun 3443 3444 3445@need 2000 3446@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions 3447@section Exponentiation Functions 3448@cindex Integer exponentiation functions 3449@cindex Exponentiation functions 3450@cindex Powering functions 3451 3452@deftypefun void mpz_powm (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod}) 3453@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp}, const mpz_t @var{mod}) 3454Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp}) 3455modulo @var{mod}}. 3456 3457Negative @var{exp} is supported if the inverse @mm{@var{base}@sup{-1} @bmod 3458@var{mod}, @var{base}^(-1) @bmod @var{mod}} exists (see @code{mpz_invert} in 3459@ref{Number Theoretic Functions}). If an inverse doesn't exist then a divide 3460by zero is raised. 3461@end deftypefun 3462 3463@deftypefun void mpz_powm_sec (mpz_t @var{rop}, const mpz_t @var{base}, const mpz_t @var{exp}, const mpz_t @var{mod}) 3464Set @var{rop} to @m{base^{exp} \bmod @var{mod}, (@var{base} raised to @var{exp}) 3465modulo @var{mod}}. 3466 3467It is required that @math{@var{exp} > 0} and that @var{mod} is odd. 3468 3469This function is designed to take the same time and have the same cache access 3470patterns for any two same-size arguments, assuming that function arguments are 3471placed at the same position and that the machine state is identical upon 3472function entry. This function is intended for cryptographic purposes, where 3473resilience to side-channel attacks is desired. 
3474@end deftypefun 3475 3476@deftypefun void mpz_pow_ui (mpz_t @var{rop}, const mpz_t @var{base}, unsigned long int @var{exp}) 3477@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp}) 3478Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}. The case 3479@math{0^0} yields 1. 3480@end deftypefun 3481 3482 3483@need 2000 3484@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions 3485@section Root Extraction Functions 3486@cindex Integer root functions 3487@cindex Root extraction functions 3488 3489@deftypefun int mpz_root (mpz_t @var{rop}, const mpz_t @var{op}, unsigned long int @var{n}) 3490Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer 3491part of the @var{n}th root of @var{op}. Return non-zero if the computation 3492was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power. 3493@end deftypefun 3494 3495@deftypefun void mpz_rootrem (mpz_t @var{root}, mpz_t @var{rem}, const mpz_t @var{u}, unsigned long int @var{n}) 3496Set @var{root} to @m{\lfloor\root n \of {u}\rfloor@C{},} the truncated 3497integer part of the @var{n}th root of @var{u}. Set @var{rem} to the 3498remainder, @m{(@var{u} - @var{root}^n), 3499@var{u}@minus{}@var{root}**@var{n}}. 3500@end deftypefun 3501 3502@deftypefun void mpz_sqrt (mpz_t @var{rop}, const mpz_t @var{op}) 3503Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated 3504integer part of the square root of @var{op}. 3505@end deftypefun 3506 3507@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, const mpz_t @var{op}) 3508Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part 3509of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{rop2} to the 3510remainder @m{(@var{op} - @var{rop1}^2), 3511@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a 3512perfect square. 
3513 3514If @var{rop1} and @var{rop2} are the same variable, the results are 3515undefined. 3516@end deftypefun 3517 3518@deftypefun int mpz_perfect_power_p (const mpz_t @var{op}) 3519@cindex Perfect power functions 3520@cindex Root testing functions 3521Return non-zero if @var{op} is a perfect power, i.e., if there exist integers 3522@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that 3523@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}. 3524 3525Under this definition both 0 and 1 are considered to be perfect powers. 3526Negative values of @var{op} are accepted, but of course can only be odd 3527perfect powers. 3528@end deftypefun 3529 3530@deftypefun int mpz_perfect_square_p (const mpz_t @var{op}) 3531@cindex Perfect square functions 3532@cindex Root testing functions 3533Return non-zero if @var{op} is a perfect square, i.e., if the square root of 3534@var{op} is an integer. Under this definition both 0 and 1 are considered to 3535be perfect squares. 3536@end deftypefun 3537 3538 3539@need 2000 3540@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions 3541@section Number Theoretic Functions 3542@cindex Number theoretic functions 3543 3544@deftypefun int mpz_probab_prime_p (const mpz_t @var{n}, int @var{reps}) 3545@cindex Prime testing functions 3546@cindex Probable prime testing functions 3547Determine whether @var{n} is prime. Return 2 if @var{n} is definitely prime, 3548return 1 if @var{n} is probably prime (without being certain), or return 0 if 3549@var{n} is definitely non-prime. 3550 3551This function performs some trial divisions, a Baillie-PSW probable prime 3552test, then @var{reps-24} Miller-Rabin probabilistic primality tests. A 3553higher @var{reps} value will reduce the chances of a non-prime being 3554identified as ``probably prime''. A composite number will be identified as a 3555prime with an asymptotic probability of less than @m{4^{-reps},4^(-@var{reps})}. 
3556Reasonable values of @var{reps} are between 15 and 50. 3557 3558GMP versions up to and including 6.1.2 did not use the Baillie-PSW 3559primality test. In those older versions of GMP, this function performed 3560@var{reps} Miller-Rabin tests. 3561@end deftypefun 3562 3563@deftypefun void mpz_nextprime (mpz_t @var{rop}, const mpz_t @var{op}) 3564@cindex Next prime function 3565Set @var{rop} to the next prime greater than @var{op}. 3566 3567This function uses a probabilistic algorithm to identify primes. For 3568practical purposes it's adequate, the chance of a composite passing will be 3569extremely small. 3570@end deftypefun 3571 3572@c mpz_prime_p not implemented as of gmp 3.0. 3573 3574@c @deftypefun int mpz_prime_p (const mpz_t @var{n}) 3575@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime. 3576@c This function is far slower than @code{mpz_probab_prime_p}, but then it 3577@c never returns non-zero for composite numbers. 3578 3579@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate. 3580@c The likelihood of a programming error or hardware malfunction is orders 3581@c of magnitudes greater than the likelihood for a composite to pass as a 3582@c prime, if the @var{reps} argument is in the suggested range.) 3583@c @end deftypefun 3584 3585@deftypefun void mpz_gcd (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2}) 3586@cindex Greatest common divisor functions 3587@cindex GCD functions 3588Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}. The 3589result is always positive even if one or both input operands are negative. 3590Except if both inputs are zero; then this function defines @math{gcd(0,0) = 0}. 3591@end deftypefun 3592 3593@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long int @var{op2}) 3594Compute the greatest common divisor of @var{op1} and @var{op2}. If 3595@var{rop} is not @code{NULL}, store the result there. 
3596 3597If the result is small enough to fit in an @code{unsigned long int}, it is 3598returned. If the result does not fit, 0 is returned, and the result is equal 3599to the argument @var{op1}. Note that the result will always fit if @var{op2} 3600is non-zero. 3601@end deftypefun 3602 3603@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, const mpz_t @var{a}, const mpz_t @var{b}) 3604@cindex Extended GCD 3605@cindex GCD extended 3606Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in 3607addition set @var{s} and @var{t} to coefficients satisfying 3608@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}. 3609The value in @var{g} is always positive, even if one or both of @var{a} and 3610@var{b} are negative (or zero if both inputs are zero). The values in @var{s} 3611and @var{t} are chosen such that normally, @math{@GMPabs{@var{s}} < 3612@GMPabs{@var{b}} / (2 @var{g})} and @math{@GMPabs{@var{t}} < @GMPabs{@var{a}} 3613/ (2 @var{g})}, and these relations define @var{s} and @var{t} uniquely. There 3614are a few exceptional cases: 3615 3616If @math{@GMPabs{@var{a}} = @GMPabs{@var{b}}}, then @math{@var{s} = 0}, 3617@math{@var{t} = sgn(@var{b})}. 3618 3619Otherwise, @math{@var{s} = sgn(@var{a})} if @math{@var{b} = 0} or 3620@math{@GMPabs{@var{b}} = 2 @var{g}}, and @math{@var{t} = sgn(@var{b})} if 3621@math{@var{a} = 0} or @math{@GMPabs{@var{a}} = 2 @var{g}}. 3622 3623In all cases, @math{@var{s} = 0} if and only if @math{@var{g} = 3624@GMPabs{@var{b}}}, i.e., if @var{b} divides @var{a} or @math{@var{a} = @var{b} 3625= 0}. 3626 3627If @var{t} or @var{g} is @code{NULL} then that value is not computed. 
3628@end deftypefun 3629 3630@deftypefun void mpz_lcm (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2}) 3631@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, const mpz_t @var{op1}, unsigned long @var{op2}) 3632@cindex Least common multiple functions 3633@cindex LCM functions 3634Set @var{rop} to the least common multiple of @var{op1} and @var{op2}. 3635@var{rop} is always positive, irrespective of the signs of @var{op1} and 3636@var{op2}. @var{rop} will be zero if either @var{op1} or @var{op2} is zero. 3637@end deftypefun 3638 3639@deftypefun int mpz_invert (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2}) 3640@cindex Modular inverse functions 3641@cindex Inverse modulo functions 3642Compute the inverse of @var{op1} modulo @var{op2} and put the result in 3643@var{rop}. If the inverse exists, the return value is non-zero and @var{rop} 3644will satisfy @math{0 @le{} @var{rop} < @GMPabs{@var{op2}}} (with @math{@var{rop} 3645= 0} possible only when @math{@GMPabs{@var{op2}} = 1}, i.e., in the 3646somewhat degenerate zero ring). If an inverse doesn't 3647exist the return value is zero and @var{rop} is undefined. The behaviour of 3648this function is undefined when @var{op2} is zero. 3649@end deftypefun 3650 3651@deftypefun int mpz_jacobi (const mpz_t @var{a}, const mpz_t @var{b}) 3652@cindex Jacobi symbol functions 3653Calculate the Jacobi symbol @m{\left(a \over b\right), 3654(@var{a}/@var{b})}. This is defined only for @var{b} odd. 3655@end deftypefun 3656 3657@deftypefun int mpz_legendre (const mpz_t @var{a}, const mpz_t @var{p}) 3658@cindex Legendre symbol functions 3659Calculate the Legendre symbol @m{\left(a \over p\right), 3660(@var{a}/@var{p})}. This is defined only for @var{p} an odd positive 3661prime, and for such @var{p} it's identical to the Jacobi symbol. 
3662@end deftypefun 3663 3664@deftypefun int mpz_kronecker (const mpz_t @var{a}, const mpz_t @var{b}) 3665@deftypefunx int mpz_kronecker_si (const mpz_t @var{a}, long @var{b}) 3666@deftypefunx int mpz_kronecker_ui (const mpz_t @var{a}, unsigned long @var{b}) 3667@deftypefunx int mpz_si_kronecker (long @var{a}, const mpz_t @var{b}) 3668@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, const mpz_t @var{b}) 3669@cindex Kronecker symbol functions 3670Calculate the Jacobi symbol @m{\left(a \over b\right), 3671(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over 36722\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} odd, or 3673@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} even. 3674 3675When @var{b} is odd the Jacobi symbol and Kronecker symbol are 3676identical, so @code{mpz_kronecker_ui} etc can be used for mixed 3677precision Jacobi symbols too. 3678 3679For more information see Henri Cohen section 1.4.2 (@pxref{References}), 3680or any number theory textbook. See also the example program 3681@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}. 3682@end deftypefun 3683 3684@deftypefun {mp_bitcnt_t} mpz_remove (mpz_t @var{rop}, const mpz_t @var{op}, const mpz_t @var{f}) 3685@cindex Remove factor functions 3686@cindex Factor removal functions 3687Remove all occurrences of the factor @var{f} from @var{op} and store the 3688result in @var{rop}. The return value is how many such occurrences were 3689removed. 
3690@end deftypefun 3691 3692@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{n}) 3693@deftypefunx void mpz_2fac_ui (mpz_t @var{rop}, unsigned long int @var{n}) 3694@deftypefunx void mpz_mfac_uiui (mpz_t @var{rop}, unsigned long int @var{n}, unsigned long int @var{m}) 3695@cindex Factorial functions 3696Set @var{rop} to the factorial of @var{n}: @code{mpz_fac_ui} computes the plain factorial @var{n}!, 3697@code{mpz_2fac_ui} computes the double-factorial @var{n}!!, and @code{mpz_mfac_uiui} the 3698@var{m}-multi-factorial @m{n!^{(m)}, @var{n}!^(@var{m})}. 3699@end deftypefun 3700 3701@deftypefun void mpz_primorial_ui (mpz_t @var{rop}, unsigned long int @var{n}) 3702@cindex Primorial functions 3703Set @var{rop} to the primorial of @var{n}, i.e. the product of all positive 3704prime numbers @math{@le{}@var{n}}. 3705@end deftypefun 3706 3707@deftypefun void mpz_bin_ui (mpz_t @var{rop}, const mpz_t @var{n}, unsigned long int @var{k}) 3708@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}}) 3709@cindex Binomial coefficient functions 3710Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over 3711@var{k}} and store the result in @var{rop}. Negative values of @var{n} are 3712supported by @code{mpz_bin_ui}, using the identity 3713@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right), 3714bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6 3715part G. 3716@end deftypefun 3717 3718@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n}) 3719@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n}) 3720@cindex Fibonacci sequence functions 3721@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci 3722number. @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to 3723@m{F_{n-1},F[n-1]}. 
 3724 3725These functions are designed for calculating isolated Fibonacci numbers. When 3726a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and 3727iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or 3728similar. 3729@end deftypefun 3730 3731@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n}) 3732@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n}) 3733@cindex Lucas number functions 3734@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas 3735number. @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1} 3736to @m{L_{n-1},L[n-1]}. 3737 3738These functions are designed for calculating isolated Lucas numbers. When a 3739sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and 3740iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or 3741similar. 3742 3743The Fibonacci numbers and Lucas numbers are related sequences, so it's never 3744necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}. The 3745formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers 3746Algorithm}; the reverse is straightforward too. 3747@end deftypefun 3748 3749 3750@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions 3751@comment node-name, next, previous, up 3752@section Comparison Functions 3753@cindex Integer comparison functions 3754@cindex Comparison functions 3755 3756@deftypefn Function int mpz_cmp (const mpz_t @var{op1}, const mpz_t @var{op2}) 3757@deftypefnx Function int mpz_cmp_d (const mpz_t @var{op1}, double @var{op2}) 3758@deftypefnx Macro int mpz_cmp_si (const mpz_t @var{op1}, signed long int @var{op2}) 3759@deftypefnx Macro int mpz_cmp_ui (const mpz_t @var{op1}, unsigned long int @var{op2}) 3760Compare @var{op1} and @var{op2}. 
Return a positive value if @math{@var{op1} > 3761@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if 3762@math{@var{op1} < @var{op2}}. 3763 3764@code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate their 3765arguments more than once. @code{mpz_cmp_d} can be called with an infinity, 3766but results are undefined for a NaN. 3767@end deftypefn 3768 3769@deftypefn Function int mpz_cmpabs (const mpz_t @var{op1}, const mpz_t @var{op2}) 3770@deftypefnx Function int mpz_cmpabs_d (const mpz_t @var{op1}, double @var{op2}) 3771@deftypefnx Function int mpz_cmpabs_ui (const mpz_t @var{op1}, unsigned long int @var{op2}) 3772Compare the absolute values of @var{op1} and @var{op2}. Return a positive 3773value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if 3774@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if 3775@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}. 3776 3777@code{mpz_cmpabs_d} can be called with an infinity, but results are undefined 3778for a NaN. 3779@end deftypefn 3780 3781@deftypefn Macro int mpz_sgn (const mpz_t @var{op}) 3782@cindex Sign tests 3783@cindex Integer sign tests 3784Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and 3785@math{-1} if @math{@var{op} < 0}. 3786 3787This function is actually implemented as a macro. It evaluates its argument 3788multiple times. 3789@end deftypefn 3790 3791 3792@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions 3793@comment node-name, next, previous, up 3794@section Logical and Bit Manipulation Functions 3795@cindex Logical functions 3796@cindex Bit manipulation functions 3797@cindex Integer logical functions 3798@cindex Integer bit manipulation functions 3799 3800These functions behave as if twos complement arithmetic were used (although 3801sign-magnitude is the actual implementation). The least significant bit is 3802number 0. 
3803 3804@deftypefun void mpz_and (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2}) 3805Set @var{rop} to @var{op1} bitwise-and @var{op2}. 3806@end deftypefun 3807 3808@deftypefun void mpz_ior (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2}) 3809Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}. 3810@end deftypefun 3811 3812@deftypefun void mpz_xor (mpz_t @var{rop}, const mpz_t @var{op1}, const mpz_t @var{op2}) 3813Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}. 3814@end deftypefun 3815 3816@deftypefun void mpz_com (mpz_t @var{rop}, const mpz_t @var{op}) 3817Set @var{rop} to the one's complement of @var{op}. 3818@end deftypefun 3819 3820@deftypefun {mp_bitcnt_t} mpz_popcount (const mpz_t @var{op}) 3821If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is the 3822number of 1 bits in the binary representation. If @math{@var{op}<0}, the 3823number of 1s is infinite, and the return value is the largest possible 3824@code{mp_bitcnt_t}. 3825@end deftypefun 3826 3827@deftypefun {mp_bitcnt_t} mpz_hamdist (const mpz_t @var{op1}, const mpz_t @var{op2}) 3828If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return the 3829hamming distance between the two operands, which is the number of bit positions 3830where @var{op1} and @var{op2} have different bit values. If one operand is 3831@math{@ge{}0} and the other @math{<0} then the number of bits different is 3832infinite, and the return value is the largest possible @code{mp_bitcnt_t}. 3833@end deftypefun 3834 3835@deftypefun {mp_bitcnt_t} mpz_scan0 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit}) 3836@deftypefunx {mp_bitcnt_t} mpz_scan1 (const mpz_t @var{op}, mp_bitcnt_t @var{starting_bit}) 3837@cindex Bit scanning functions 3838@cindex Scan bit functions 3839Scan @var{op}, starting from bit @var{starting_bit}, towards more significant 3840bits, until the first 0 or 1 bit (respectively) is found. Return the index of 3841the found bit. 
3842 3843If the bit at @var{starting_bit} is already what's sought, then 3844@var{starting_bit} is returned. 3845 3846If there's no bit found, then the largest possible @code{mp_bitcnt_t} is 3847returned. This will happen in @code{mpz_scan0} past the end of a negative 3848number, or @code{mpz_scan1} past the end of a nonnegative number. 3849@end deftypefun 3850 3851@deftypefun void mpz_setbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index}) 3852Set bit @var{bit_index} in @var{rop}. 3853@end deftypefun 3854 3855@deftypefun void mpz_clrbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index}) 3856Clear bit @var{bit_index} in @var{rop}. 3857@end deftypefun 3858 3859@deftypefun void mpz_combit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index}) 3860Complement bit @var{bit_index} in @var{rop}. 3861@end deftypefun 3862 3863@deftypefun int mpz_tstbit (const mpz_t @var{op}, mp_bitcnt_t @var{bit_index}) 3864Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly. 3865@end deftypefun 3866 3867@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions 3868@comment node-name, next, previous, up 3869@section Input and Output Functions 3870@cindex Integer input and output functions 3871@cindex Input functions 3872@cindex Output functions 3873@cindex I/O functions 3874 3875Functions that perform input from a stdio stream, and functions that output to 3876a stdio stream, of @code{mpz} numbers. Passing a @code{NULL} pointer for a 3877@var{stream} argument to any of these functions will make them read from 3878@code{stdin} and write to @code{stdout}, respectively. 3879 3880When using any of these functions, it is a good idea to include @file{stdio.h} 3881before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes 3882for these functions. 3883 3884See also @ref{Formatted Output} and @ref{Formatted Input}. 
 3885 3886@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, const mpz_t @var{op}) 3887Output @var{op} on stdio stream @var{stream}, as a string of digits in base 3888@var{base}. The base argument may vary from 2 to 62 or from @minus{}2 to 3889@minus{}36. 3890 3891For @var{base} in the range 2..36, digits and lower-case letters are used; for 3892@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62, 3893digits, upper-case letters, and lower-case letters (in that significance order) 3894are used. 3895 3896Return the number of bytes written, or if an error occurred, return 0. 3897@end deftypefun 3898 3899@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base}) 3900Input a possibly white-space preceded string in base @var{base} from stdio 3901stream @var{stream}, and put the read integer in @var{rop}. 3902 3903The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading 3904characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and 3905@code{0B} for binary, @code{0} for octal, or decimal otherwise. 3906 3907For bases up to 36, case is ignored; upper-case and lower-case letters have 3908the same value. For bases 37 to 62, upper-case letters represent the usual 390910..35 while lower-case letters represent 36..61. 3910 3911Return the number of bytes read, or if an error occurred, return 0. 3912@end deftypefun 3913 3914@deftypefun size_t mpz_out_raw (FILE *@var{stream}, const mpz_t @var{op}) 3915Output @var{op} on stdio stream @var{stream}, in raw binary format. The 3916integer is written in a portable format, with 4 bytes of size information, and 3917that many bytes of limbs. Both the size and the limbs are written in 3918decreasing significance order (i.e., in big-endian). 3919 3920The output can be read with @code{mpz_inp_raw}. 3921 3922Return the number of bytes written, or if an error occurred, return 0. 
The output of @code{mpz_out_raw} cannot be read by @code{mpz_inp_raw} from
GMP 1, because of changes necessary for compatibility between 32-bit and
64-bit machines.
@end deftypefun

@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
Input from stdio stream @var{stream} in the format written by
@code{mpz_out_raw}, and put the result in @var{rop}. Return the number of
bytes read, or if an error occurred, return 0.

This routine can read the output from @code{mpz_out_raw} also from GMP 1, in
spite of changes necessary for compatibility between 32-bit and 64-bit
machines.
@end deftypefun


@need 2000
@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
@comment node-name, next, previous, up
@section Random Number Functions
@cindex Integer random number functions
@cindex Random number functions

The random number functions of GMP come in two groups: older functions
that rely on a global state, and newer functions that accept a state
parameter that is read and modified. See @ref{Random Number
Functions} for more information on how to use, and how not to use, random
number functions.

@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
Generate a uniformly distributed random integer in the range 0 to
@mm{2@sup{n}-1, 2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization}) before
invoking this function.
@end deftypefun

@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, const mpz_t @var{n})
Generate a uniform random integer in the range 0 to @math{@var{n}-1},
inclusive.
The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization})
before invoking this function.
@end deftypefun

@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
Generate a random integer with long strings of zeros and ones in the
binary representation. Useful for testing functions and algorithms,
since random numbers of this kind have proven to be more likely to
trigger corner-case bugs. The random number will be in the range
@mm{2@sup{n-1}, 2^(@var{n}@minus{}1)} to @mm{2@sup{n}-1,
2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization})
before invoking this function.
@end deftypefun

@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs. The generated
random number doesn't satisfy any particular requirements of randomness.
Negative random numbers are generated when @var{max_size} is negative.

This function is obsolete. Use @code{mpz_urandomb} or
@code{mpz_urandomm} instead.
@end deftypefun

@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs, with long strings
of zeros and ones in the binary representation. Useful for testing functions
and algorithms, since random numbers of this kind have proven to be more
likely to trigger corner-case bugs. Negative random numbers are generated
when @var{max_size} is negative.

This function is obsolete. Use @code{mpz_rrandomb} instead.
@end deftypefun


@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions
@section Integer Import and Export

@code{mpz_t} variables can be converted to and from arbitrary words of binary
data with the following functions.

@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op})
@cindex Integer import
@cindex Import
Set @var{rop} from an array of word data at @var{op}.

The parameters specify the format of the data. @var{count} many words are
read, each @var{size} bytes. @var{order} can be 1 for most significant word
first or -1 for least significant first. Within each word @var{endian} can be
1 for most significant byte first, -1 for least significant first, or 0 for
the native endianness of the host CPU@. The most significant @var{nails} bits
of each word are skipped; this can be 0 to use the full words.

No sign is taken from the data; @var{rop} will simply be a non-negative
integer. An application can handle any sign itself, and apply it for instance
with @code{mpz_neg}.

There are no data alignment restrictions on @var{op}; any address is allowed.

Here's an example converting an array of @code{unsigned long} data, most
significant element first, and host byte order within each value.

@example
unsigned long a[20];
/* Initialize @var{z} and @var{a} */
mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a);
@end example

This example assumes the full @code{sizeof} bytes are used for data in the
given type, which is usually true, and certainly true for @code{unsigned long}
everywhere we know of.
However on Cray vector systems it may be noted that
@code{short} and @code{int} are always stored in 8 bytes (and with
@code{sizeof} indicating that) but use only 32 or 46 bits. The @var{nails}
feature can account for this, by passing for instance
@code{8*sizeof(int)-INT_BIT}.
@end deftypefun

@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const mpz_t @var{op})
@cindex Integer export
@cindex Export
Fill @var{rop} with word data from @var{op}.

The parameters specify the format of the data produced. Each word will be
@var{size} bytes and @var{order} can be 1 for most significant word first or
-1 for least significant first. Within each word @var{endian} can be 1 for
most significant byte first, -1 for least significant first, or 0 for the
native endianness of the host CPU@. The most significant @var{nails} bits of
each word are unused and set to zero, this can be 0 to produce full words.

The number of words produced is written to @code{*@var{countp}}, or
@var{countp} can be @code{NULL} to discard the count. @var{rop} must have
enough space for the data, or if @var{rop} is @code{NULL} then a result array
of the necessary size is allocated using the current GMP allocation function
(@pxref{Custom Allocation}). In either case the return value is the
destination used, either @var{rop} or the allocated block.

If @var{op} is non-zero then the most significant word produced will be
non-zero. If @var{op} is zero then the count returned will be zero and
nothing written to @var{rop}. If @var{rop} is @code{NULL} in this case, no
block is allocated, just @code{NULL} is returned.

The sign of @var{op} is ignored, just the absolute value is exported. An
application can use @code{mpz_sgn} to get the sign and handle it as desired.
(@pxref{Integer Comparisons})

There are no data alignment restrictions on @var{rop}, any address is allowed.

When an application is allocating space itself the required size can be
determined with a calculation like the following. Since @code{mpz_sizeinbase}
always returns at least 1, @code{count} here will be at least one, which
avoids any portability problems with @code{malloc(0)}, though if @code{z} is
zero no space at all is actually needed (or written).

@example
numb = 8*size - nail;
count = (mpz_sizeinbase (z, 2) + numb-1) / numb;
p = malloc (count * size);
@end example
@end deftypefun


@need 2000
@node Miscellaneous Integer Functions, Integer Special Functions, Integer Import and Export, Integer Functions
@comment node-name, next, previous, up
@section Miscellaneous Functions
@cindex Miscellaneous integer functions
@cindex Integer miscellaneous functions

@deftypefun int mpz_fits_ulong_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_slong_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_uint_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_sint_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_ushort_p (const mpz_t @var{op})
@deftypefunx int mpz_fits_sshort_p (const mpz_t @var{op})
Return non-zero iff the value of @var{op} fits in an @code{unsigned long int},
@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned
short int}, or @code{signed short int}, respectively. Otherwise, return zero.
@end deftypefun

@deftypefn Macro int mpz_odd_p (const mpz_t @var{op})
@deftypefnx Macro int mpz_even_p (const mpz_t @var{op})
Determine whether @var{op} is odd or even, respectively. Return non-zero if
yes, zero if no. These macros evaluate their argument more than once.
@end deftypefn

@deftypefun size_t mpz_sizeinbase (const mpz_t @var{op}, int @var{base})
@cindex Size in digits
@cindex Digits in an integer
Return the size of @var{op} measured in number of digits in the given
@var{base}. @var{base} can vary from 2 to 62. The sign of @var{op} is
ignored, just the absolute value is used. The result will be either exact or
1 too big. If @var{base} is a power of 2, the result is always exact. If
@var{op} is zero the return value is always 1.

This function can be used to determine the space required when converting
@var{op} to a string. The right amount of allocation is normally two more
than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign
and one for the null-terminator.

@cindex Most significant bit
Note that @code{mpz_sizeinbase(@var{op},2)} can be used to locate
the most significant 1 bit in @var{op}, counting from 1. (Unlike the bitwise
functions which start from 0, @pxref{Integer Logic and Bit Fiddling,, Logical
and Bit Manipulation Functions}.)
@end deftypefun


@node Integer Special Functions, , Miscellaneous Integer Functions, Integer Functions
@section Special Functions
@cindex Special integer functions
@cindex Integer special functions

The functions in this section are for various special purposes. Most
applications will not need them.

@deftypefun void mpz_array_init (mpz_t @var{integer_array}, mp_size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
@strong{This is an obsolete function. Do not use it.}
@end deftypefun

@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
Change the space for @var{integer} to @var{new_alloc} limbs. The value in
@var{integer} is preserved if it fits, or is set to 0 if not. The return
value is not useful to applications and should be ignored.
@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
this. @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
@code{_mpz_realloc} takes its size in limbs.
@end deftypefun

@deftypefun mp_limb_t mpz_getlimbn (const mpz_t @var{op}, mp_size_t @var{n})
Return limb number @var{n} from @var{op}. The sign of @var{op} is ignored,
just the absolute value is used. The least significant limb is number 0.

@code{mpz_size} can be used to find how many limbs make up @var{op}.
@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
@code{mpz_size(@var{op})-1}.
@end deftypefun

@deftypefun size_t mpz_size (const mpz_t @var{op})
Return the size of @var{op} measured in number of limbs. If @var{op} is zero,
the returned value will be zero.
@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
@end deftypefun

@deftypefun {const mp_limb_t *} mpz_limbs_read (const mpz_t @var{x})
Return a pointer to the limb array representing the absolute value of @var{x}.
The size of the array is @code{mpz_size(@var{x})}. Intended for read access
only.
@end deftypefun

@deftypefun {mp_limb_t *} mpz_limbs_write (mpz_t @var{x}, mp_size_t @var{n})
@deftypefunx {mp_limb_t *} mpz_limbs_modify (mpz_t @var{x}, mp_size_t @var{n})
Return a pointer to the limb array, intended for write access. The array is
reallocated as needed, to make room for @var{n} limbs. Requires @math{@var{n}
> 0}. The @code{mpz_limbs_modify} function returns an array that holds the old
absolute value of @var{x}, while @code{mpz_limbs_write} may destroy the old
value and return an array with unspecified contents.
@end deftypefun

@deftypefun void mpz_limbs_finish (mpz_t @var{x}, mp_size_t @var{s})
Updates the internal size field of @var{x}.
Used after writing to the limb
array pointer returned by @code{mpz_limbs_write} or @code{mpz_limbs_modify} is
completed. The array should contain @math{@GMPabs{@var{s}}} valid limbs,
representing the new absolute value for @var{x}, and the sign of @var{x} is
taken from the sign of @var{s}. This function never reallocates @var{x}, so
the limb pointer remains valid.
@end deftypefun

@c FIXME: Some more useful and less silly example?
@example
/* Assumes x is non-zero, since mpz_limbs_modify requires n > 0. */
void foo (mpz_t x)
@{
  mp_size_t n, i;
  mp_limb_t *xp;

  n = mpz_size (x);
  xp = mpz_limbs_modify (x, 2*n);
  for (i = 0; i < n; i++)
    xp[n+i] = xp[n-1-i];
  mpz_limbs_finish (x, mpz_sgn (x) < 0 ? - 2*n : 2*n);
@}
@end example

@deftypefun mpz_srcptr mpz_roinit_n (mpz_t @var{x}, const mp_limb_t *@var{xp}, mp_size_t @var{xs})
Special initialization of @var{x}, using the given limb array and size.
@var{x} should be treated as read-only: it can be passed safely as input to
any mpz function, but not as an output. The array @var{xp} must point to at
least a readable limb, its size is
@math{@GMPabs{@var{xs}}}, and the sign of @var{x} is the sign of @var{xs}. For
convenience, the function returns @var{x}, but cast to a const pointer type.
@end deftypefun

@example
void foo (mpz_t x)
@{
  static const mp_limb_t y[3] = @{ 0x1, 0x2, 0x3 @};
  mpz_t tmp;
  mpz_add (x, x, mpz_roinit_n (tmp, y, 3));
@}
@end example

@deftypefn Macro mpz_t MPZ_ROINIT_N (mp_limb_t *@var{xp}, mp_size_t @var{xs})
This macro expands to an initializer which can be assigned to an mpz_t
variable. The limb array @var{xp} must point to at least a readable limb,
moreover, unlike the @code{mpz_roinit_n} function, the array must be
normalized: if @var{xs} is non-zero, then
@code{@var{xp}[@math{@GMPabs{@var{xs}}-1}]} must be non-zero. Intended
primarily for constant values.
Using it for non-constant values requires a C
compiler supporting C99.
@end deftypefn

@example
void foo (mpz_t x)
@{
  static const mp_limb_t ya[3] = @{ 0x1, 0x2, 0x3 @};
  static const mpz_t y = MPZ_ROINIT_N ((mp_limb_t *) ya, 3);

  mpz_add (x, x, y);
@}
@end example


@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
@comment node-name, next, previous, up
@chapter Rational Number Functions
@cindex Rational number functions

This chapter describes the GMP functions for performing arithmetic on rational
numbers. These functions start with the prefix @code{mpq_}.

Rational numbers are stored in objects of type @code{mpq_t}.

All rational arithmetic functions assume operands have a canonical form, and
canonicalize their result. The canonical form means that the denominator and
the numerator have no common factors, and that the denominator is positive.
Zero has the unique representation 0/1.

Pure assignment functions do not canonicalize the assigned variable. It is
the responsibility of the user to canonicalize the assigned variable before
any arithmetic operations are performed on that variable.

@deftypefun void mpq_canonicalize (mpq_t @var{op})
Remove any factors that are common to the numerator and denominator of
@var{op}, and make the denominator positive.
@end deftypefun

@menu
* Initializing Rationals::
* Rational Conversions::
* Rational Arithmetic::
* Comparing Rationals::
* Applying Integer Functions::
* I/O of Rationals::
@end menu

@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Initialization and Assignment Functions
@cindex Rational assignment functions
@cindex Assignment functions
@cindex Rational initialization functions
@cindex Initialization functions

@deftypefun void mpq_init (mpq_t @var{x})
Initialize @var{x} and set it to 0/1. Each variable should normally only be
initialized once, or at least cleared out (using the function @code{mpq_clear})
between each initialization.
@end deftypefun

@deftypefun void mpq_inits (mpq_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpq_t} variables, and set their
values to 0/1.
@end deftypefun

@deftypefun void mpq_clear (mpq_t @var{x})
Free the space occupied by @var{x}. Make sure to call this function for all
@code{mpq_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpq_clears (mpq_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpq_t} variables.
@end deftypefun

@deftypefun void mpq_set (mpq_t @var{rop}, const mpq_t @var{op})
@deftypefunx void mpq_set_z (mpq_t @var{rop}, const mpz_t @var{op})
Assign @var{rop} from @var{op}.
@end deftypefun

@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
Set the value of @var{rop} to @var{op1}/@var{op2}.
Note that if @var{op1} and
@var{op2} have common factors, @var{rop} has to be passed to
@code{mpq_canonicalize} before any operations are performed on @var{rop}.
@end deftypefun

@deftypefun int mpq_set_str (mpq_t @var{rop}, const char *@var{str}, int @var{base})
Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.

The string can be an integer like ``41'' or a fraction like ``41/152''. The
fraction must be in canonical form (@pxref{Rational Number Functions}), or if
not then @code{mpq_canonicalize} must be called.

The numerator and optional denominator are parsed the same as in
@code{mpz_set_str} (@pxref{Assigning Integers}). White space is allowed in
the string, and is simply ignored. The @var{base} can vary from 2 to 62, or
if @var{base} is 0 then the leading characters are used: @code{0x} or
@code{0X} for hex, @code{0b} or @code{0B} for binary,
@code{0} for octal, or decimal otherwise. Note that this is done separately
for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
whereas @code{0xEF/0x100} is 239/256.

The return value is 0 if the entire string is a valid number, or @minus{}1 if
not.
@end deftypefun

@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
Swap the values @var{rop1} and @var{rop2} efficiently.
@end deftypefun


@need 2000
@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Rational conversion functions
@cindex Conversion functions

@deftypefun double mpq_get_d (const mpq_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent from the conversion is too big or too small to fit a
@code{double} then the result is system dependent.
If the value is too big, an infinity is
returned when available. If it is too small, @math{0.0} is normally returned.
Hardware overflow, underflow and denorm traps may or may not occur.
@end deftypefun

@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
@deftypefunx void mpq_set_f (mpq_t @var{rop}, const mpf_t @var{op})
Set @var{rop} to the value of @var{op}. There is no rounding; this conversion
is exact.
@end deftypefun

@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, const mpq_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base argument
may vary from 2 to 62 or from @minus{}2 to @minus{}36. The string will be of
the form @samp{num/den}, or if the denominator is 1 then just @samp{num}.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of storage large
enough for the result, that being

@example
mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
@end example

The three extra bytes are for a possible minus sign, possible slash, and the
null-terminator.

A pointer to the result string is returned, being either the allocated block,
or the given @var{str}.
@end deftypefun


@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Rational arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpq_add (mpq_t @var{sum}, const mpq_t @var{addend1}, const mpq_t @var{addend2})
Set @var{sum} to @var{addend1} + @var{addend2}.
@end deftypefun

@deftypefun void mpq_sub (mpq_t @var{difference}, const mpq_t @var{minuend}, const mpq_t @var{subtrahend})
Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
@end deftypefun

@deftypefun void mpq_mul (mpq_t @var{product}, const mpq_t @var{multiplier}, const mpq_t @var{multiplicand})
Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
@end deftypefun

@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_div (mpq_t @var{quotient}, const mpq_t @var{dividend}, const mpq_t @var{divisor})
@cindex Division functions
Set @var{quotient} to @var{dividend}/@var{divisor}.
@end deftypefun

@deftypefun void mpq_div_2exp (mpq_t @var{rop}, const mpq_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_neg (mpq_t @var{negated_operand}, const mpq_t @var{operand})
Set @var{negated_operand} to @minus{}@var{operand}.
@end deftypefun

@deftypefun void mpq_abs (mpq_t @var{rop}, const mpq_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun

@deftypefun void mpq_inv (mpq_t @var{inverted_number}, const mpq_t @var{number})
Set @var{inverted_number} to 1/@var{number}.
If the new denominator is
zero, this routine will divide by zero.
@end deftypefun

@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Rational comparison functions
@cindex Comparison functions

@deftypefun int mpq_cmp (const mpq_t @var{op1}, const mpq_t @var{op2})
@deftypefunx int mpq_cmp_z (const mpq_t @var{op1}, const mpz_t @var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
@math{@var{op1} < @var{op2}}.

To determine if two rationals are equal, @code{mpq_equal} is faster than
@code{mpq_cmp}.
@end deftypefun

@deftypefn Macro int mpq_cmp_ui (const mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
@deftypefnx Macro int mpq_cmp_si (const mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
Compare @var{op1} and @var{num2}/@var{den2}. Return a positive value if
@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
@var{num2}/@var{den2}}.

@var{num2} and @var{den2} are allowed to have common factors.

These functions are implemented as macros and evaluate their arguments
multiple times.
@end deftypefn

@deftypefn Macro int mpq_sgn (const mpq_t @var{op})
@cindex Sign tests
@cindex Rational sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its
argument multiple times.
@end deftypefn

@deftypefun int mpq_equal (const mpq_t @var{op1}, const mpq_t @var{op2})
Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
non-equal. Although @code{mpq_cmp} can be used for the same purpose, this
function is much faster.
@end deftypefun

@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Applying Integer Functions to Rationals
@cindex Rational numerator and denominator
@cindex Numerator and denominator

The set of @code{mpq} functions is quite small. In particular, there are few
functions for either input or output. The following functions give direct
access to the numerator and denominator of an @code{mpq_t}.

Note that if an assignment to the numerator and/or denominator could take an
@code{mpq_t} out of the canonical form described at the start of this chapter
(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
called before any other @code{mpq} functions are applied to that @code{mpq_t}.

@deftypefn Macro mpz_t mpq_numref (const mpq_t @var{op})
@deftypefnx Macro mpz_t mpq_denref (const mpq_t @var{op})
Return a reference to the numerator and denominator of @var{op}, respectively.
The @code{mpz} functions can be used on the result of these macros.
@end deftypefn

@deftypefun void mpq_get_num (mpz_t @var{numerator}, const mpq_t @var{rational})
@deftypefunx void mpq_get_den (mpz_t @var{denominator}, const mpq_t @var{rational})
@deftypefunx void mpq_set_num (mpq_t @var{rational}, const mpz_t @var{numerator})
@deftypefunx void mpq_set_den (mpq_t @var{rational}, const mpz_t @var{denominator})
Get or set the numerator or denominator of a rational. These functions are
equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
@code{mpq_denref}.
Direct use of @code{mpq_numref} or @code{mpq_denref} is
recommended instead of these functions.
@end deftypefun


@need 2000
@node I/O of Rationals, , Applying Integer Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Rational input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

Functions that perform input from a stdio stream, and functions that output to
a stdio stream, of @code{mpq} numbers. Passing a @code{NULL} pointer for a
@var{stream} argument to any of these functions will make them read from
@code{stdin} and write to @code{stdout}, respectively.

When using any of these functions, it is a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

See also @ref{Formatted Output} and @ref{Formatted Input}.

@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, const mpq_t @var{op})
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
@var{base}. The base argument may vary from 2 to 62 or from @minus{}2 to
@minus{}36. Output is in the form
@samp{num/den} or if the denominator is 1 then just @samp{num}.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

Return the number of bytes written, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
Read a string of digits from @var{stream} and convert them to a rational in
@var{rop}. Any initial white-space characters are read and discarded.
Return the number of characters read (including white space), or 0 if a
rational could not be read.

The input can be a fraction like @samp{17/63} or just an integer like
@samp{123}. Reading stops at the first character not in this form, and white
space is not permitted within the string. If the input might not be in
canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
Number Functions}).

The @var{base} can be between 2 and 62, or can be 0 in which case the leading
characters of the string determine the base, @samp{0x} or @samp{0X} for
hexadecimal, @code{0b} and @code{0B} for binary, @samp{0} for octal, or
decimal otherwise. The leading characters
are examined separately for the numerator and denominator of a fraction, so
for instance @samp{0x10/11} is @math{16/11}, whereas @samp{0x10/0x11} is
@math{16/17}.
@end deftypefun


@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
@comment node-name, next, previous, up
@chapter Floating-point Functions
@cindex Floating-point functions
@cindex Float functions
@cindex User-defined precision
@cindex Precision of floats

GMP floating point numbers are stored in objects of type @code{mpf_t} and
functions operating on them have an @code{mpf_} prefix.

The mantissa of each float has a user-selectable precision, in practice only
limited by available memory. Each variable has its own precision, and that can
be increased or decreased at any time. This selectable precision is a minimum
value; GMP rounds it up to a whole limb.

The accuracy of a calculation is determined by the previously set precision of
the destination variable and the numeric values of the input variables. Input
variables' set precisions do not affect calculations (except indirectly as
their values might have been affected when they were assigned).

The exponent of each float has fixed precision, one machine word on most
systems. In the current implementation the exponent is a count of limbs, so
for example on a 32-bit system this means a range of roughly
@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
this will be much greater. Note however that @code{mpf_get_str} can only
return an exponent which fits an @code{mp_exp_t} and currently
@code{mpf_set_str} doesn't accept exponents bigger than a @code{long}.

Each variable keeps track of the mantissa data actually in use. This means
that if a float is exactly represented in only a few bits then only those bits
will be used in a calculation, even if the variable's selected precision is
high. This is a performance optimization; it does not affect the numeric
results.

Internally, GMP sometimes calculates with higher precision than that of the
destination variable in order to limit errors. Final results are always
truncated to the destination variable's precision.

The mantissa is stored in binary. One consequence of this is that decimal
fractions like @math{0.1} cannot be represented exactly. The same is true of
plain IEEE @code{double} floats. This makes both highly unsuitable for
calculations involving money or other values that should be exact decimal
fractions. (Suitably scaled integers, or perhaps rationals, are better
choices.)

The @code{mpf} functions and variables have no special notion of infinity or
not-a-number, and applications must take care not to overflow the exponent or
results will be unpredictable.

Note that the @code{mpf} functions are @emph{not} intended as a smooth
extension to IEEE P754 arithmetic. In particular results obtained on one
computer often differ from the results on a computer with a different word
size.
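The earlier remark that a binary mantissa cannot hold decimal fractions such
as @math{0.1} exactly can be seen without GMP at all. The following plain C
sketch assumes IEEE 754 binary64 @code{double} arithmetic with round-to-nearest;
the function names are hypothetical. It contrasts a binary floating-point sum
with the same sum kept as a scaled integer (a count of hundredths).

```c
#include <assert.h>

/* Add 0.1 to itself n times in binary floating point. */
static double
sum_tenths_double (int n)
{
  volatile double sum = 0.0;  /* volatile: keep every step in double */
  int i;

  assert (n >= 0);
  for (i = 0; i < n; i++)
    sum += 0.1;
  return sum;
}

/* The same sum kept exactly, as an integer number of hundredths. */
static long
sum_tenths_cents (int n)
{
  long cents = 0;
  int i;

  assert (n >= 0);
  for (i = 0; i < n; i++)
    cents += 10;              /* 0.10 expressed as 10 hundredths */
  return cents;
}
```

Under the stated assumption, ten additions of 0.1 do not compare equal to 1.0,
whereas the scaled-integer sum is exactly 100 hundredths.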

New projects should consider using the GMP extension library MPFR
(@url{http://mpfr.org}) instead. MPFR provides well-defined precision and
accurate rounding, and thereby naturally extends IEEE P754.

@menu
* Initializing Floats::
* Assigning Floats::
* Simultaneous Float Init & Assign::
* Converting Floats::
* Float Arithmetic::
* Float Comparison::
* I/O of Floats::
* Miscellaneous Float Functions::
@end menu

@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Float initialization functions
@cindex Initialization functions

@deftypefun void mpf_set_default_prec (mp_bitcnt_t @var{prec})
Set the default precision to be @strong{at least} @var{prec} bits. All
subsequent calls to @code{mpf_init} will use this precision, but previously
initialized variables are unaffected.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpf_get_default_prec (void)
Return the default precision actually used.
@end deftypefun

An @code{mpf_t} object must be initialized before storing the first value in
it. The functions @code{mpf_init} and @code{mpf_init2} are used for that
purpose.

@deftypefun void mpf_init (mpf_t @var{x})
Initialize @var{x} to 0. Normally, a variable should be initialized once only
or at least be cleared, using @code{mpf_clear}, between initializations. The
precision of @var{x} is undefined unless a default precision has already been
established by a call to @code{mpf_set_default_prec}.
@end deftypefun

@deftypefun void mpf_init2 (mpf_t @var{x}, mp_bitcnt_t @var{prec})
Initialize @var{x} to 0 and set its precision to be @strong{at least}
@var{prec} bits.
Normally, a variable should be initialized once only or at
least be cleared, using @code{mpf_clear}, between initializations.
@end deftypefun

@deftypefun void mpf_inits (mpf_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpf_t} variables, and set their
values to 0. The precision of the initialized variables is undefined unless a
default precision has already been established by a call to
@code{mpf_set_default_prec}.
@end deftypefun

@deftypefun void mpf_clear (mpf_t @var{x})
Free the space occupied by @var{x}. Make sure to call this function for all
@code{mpf_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpf_clears (mpf_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpf_t} variables.
@end deftypefun

@need 2000
Here is an example of how to initialize floating-point variables:
@example
@{
  mpf_t x, y;
  mpf_init (x);           /* use default precision */
  mpf_init2 (y, 256);     /* precision @emph{at least} 256 bits */
  @dots{}
  /* Unless the program is about to exit, do ... */
  mpf_clear (x);
  mpf_clear (y);
@}
@end example

The following three functions are useful for changing the precision during a
calculation. A typical use would be for adjusting the precision gradually in
iterative algorithms like Newton-Raphson, making the computation precision
closely match the actual accurate part of the numbers.

@deftypefun {mp_bitcnt_t} mpf_get_prec (const mpf_t @var{op})
Return the current precision of @var{op}, in bits.
@end deftypefun

@deftypefun void mpf_set_prec (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits. The
value in @var{rop} will be truncated to the new precision.

This function requires a call to @code{realloc}, and so should not be used in
a tight loop.
@end deftypefun

@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
without changing the memory allocated.

@var{prec} must be no more than the allocated precision for @var{rop}, that
being the precision when @var{rop} was initialized, or in the most recent
@code{mpf_set_prec}.

The value in @var{rop} is unchanged, and in particular if it had a higher
precision than @var{prec} it will retain that higher precision. New values
written to @var{rop} will use the new @var{prec}.

Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
allocated precision. Failing to do so will have unpredictable results.

@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
original allocated precision. After @code{mpf_set_prec_raw} it reflects the
@var{prec} value set.

@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
different precisions during a calculation, perhaps to gradually increase
precision in an iteration, or just to use various different precisions for
different purposes during a calculation.
@end deftypefun


@need 2000
@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Assignment Functions
@cindex Float assignment functions
@cindex Assignment functions

These functions assign new values to already initialized floats
(@pxref{Initializing Floats}).

@deftypefun void mpf_set (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
@deftypefunx void mpf_set_z (mpf_t @var{rop}, const mpz_t @var{op})
@deftypefunx void mpf_set_q (mpf_t @var{rop}, const mpq_t @var{op})
Set the value of @var{rop} from @var{op}.
@end deftypefun

@deftypefun int mpf_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
Set the value of @var{rop} from the string in @var{str}. The string is of the
form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
@samp{M} is the mantissa and @samp{N} is the exponent. The mantissa is always
in the specified base. The exponent is either in the specified base or, if
@var{base} is negative, in decimal. The decimal point expected is taken from
the current locale, on systems providing @code{localeconv}.

The argument @var{base} may be in the ranges 2 to 62, or @minus{}62 to
@minus{}2. Negative values are used to specify that the exponent is in
decimal.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value; for bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

Unlike the corresponding @code{mpz} function, the base will not be determined
from the leading characters of the string if @var{base} is 0. This is so that
numbers like @samp{0.23} are not interpreted as octal.

White space is allowed in the string, and is simply ignored. [This is not
really true; white-space is ignored in the beginning of the string and within
the mantissa, but not in other places, such as after a minus sign or in the
exponent.
We are considering changing the definition of this function, making
it fail when there is any white-space in the input, since that makes a lot of
sense. Please tell us your opinion about this change. Do you really want it
to accept @nicode{"3 14"} as meaning 314 as it does now?]

This function returns 0 if the entire string is a valid number in base
@var{base}. Otherwise it returns @minus{}1.
@end deftypefun

@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
Swap @var{rop1} and @var{rop2} efficiently. Both the values and the
precisions of the two variables are swapped.
@end deftypefun


@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Combined Initialization and Assignment Functions
@cindex Float assignment functions
@cindex Assignment functions
@cindex Float initialization functions
@cindex Initialization functions

For convenience, GMP provides a parallel series of initialize-and-set functions
which initialize the output and then store the value there. These functions'
names have the form @code{mpf_init_set@dots{}}

Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
functions, it can be used as the source or destination operand for the ordinary
float functions. Don't use an initialize-and-set function on a variable
already initialized!

@deftypefun void mpf_init_set (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
Initialize @var{rop} and set its value from @var{op}.

The precision of @var{rop} will be taken from the active default precision, as
set by @code{mpf_set_default_prec}.
@end deftypefun

@deftypefun int mpf_init_set_str (mpf_t @var{rop}, const char *@var{str}, int @var{base})
Initialize @var{rop} and set its value from the string in @var{str}. See
@code{mpf_set_str} above for details on the assignment operation.

Note that @var{rop} is initialized even if an error occurs. (I.e., you have to
call @code{mpf_clear} for it.)

The precision of @var{rop} will be taken from the active default precision, as
set by @code{mpf_set_default_prec}.
@end deftypefun


@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Float conversion functions
@cindex Conversion functions

@deftypefun double mpf_get_d (const mpf_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent in @var{op} is too big or too small to fit a @code{double}
then the result is system dependent. For too big an infinity is returned when
available. For too small @math{0.0} is normally returned. Hardware overflow,
underflow and denorm traps may or may not occur.
@end deftypefun

@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, const mpf_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero), and with an exponent returned separately.

The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
exponent is stored to @code{*@var{exp}}. @m{@var{d} \times 2^{exp},
@var{d} * 2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero,
the return is @math{0.0} and 0 is stored to @code{*@var{exp}}.

@cindex @code{frexp}
This is similar to the standard C @code{frexp} function (@pxref{Normalization
Functions,,, libc, The GNU C Library Reference Manual}).
@end deftypefun

@deftypefun long mpf_get_si (const mpf_t @var{op})
@deftypefunx {unsigned long} mpf_get_ui (const mpf_t @var{op})
Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
fraction part. If @var{op} is too big for the return type, the result is
undefined.

See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
(@pxref{Miscellaneous Float Functions}).
@end deftypefun

@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base argument
may vary from 2 to 62 or from @minus{}2 to @minus{}36. Up to @var{n_digits}
digits will be generated. Trailing zeros are not returned. No more digits
than can be accurately represented by @var{op} are ever generated. If
@var{n_digits} is 0 then that accurate maximum number of digits are generated.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of
@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
possible minus sign, and a null-terminator.
When @var{n_digits} is 0 to get
all significant digits, an application won't be able to know the space
required, and @var{str} should be @code{NULL} in that case.

The generated string is a fraction, with an implicit radix point immediately
to the left of the first digit. The applicable exponent is written through
the @var{expptr} pointer. For example, the number 3.1416 would be returned as
string @nicode{"31416"} and exponent 1.

When @var{op} is zero, an empty string is produced and the exponent returned
is 0.

A pointer to the result string is returned, being either the allocated block
or the given @var{str}.
@end deftypefun


@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Float arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpf_add (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_add_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} + @var{op2}}.
@end deftypefun

@deftypefun void mpf_sub (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @var{op1} @minus{} @var{op2}.
@end deftypefun

@deftypefun void mpf_mul (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

Division is undefined if the divisor is zero, and passing a zero divisor to the
divide functions will make these functions intentionally divide by zero. This
lets the user handle arithmetic exceptions in these functions in the same
manner as other arithmetic exceptions.

@deftypefun void mpf_div (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, const mpf_t @var{op2})
@deftypefunx void mpf_div_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
@cindex Division functions
Set @var{rop} to @var{op1}/@var{op2}.
@end deftypefun

@deftypefun void mpf_sqrt (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
@cindex Root extraction functions
Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
@end deftypefun

@deftypefun void mpf_pow_ui (mpf_t @var{rop}, const mpf_t @var{op1}, unsigned long int @var{op2})
@cindex Exponentiation functions
@cindex Powering functions
Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
@end deftypefun

@deftypefun void mpf_neg (mpf_t @var{rop}, const mpf_t @var{op})
Set @var{rop} to @minus{}@var{op}.
@end deftypefun

@deftypefun void mpf_abs (mpf_t @var{rop}, const mpf_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun

@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpf_div_2exp (mpf_t @var{rop}, const mpf_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
@end deftypefun

@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Float comparison functions
@cindex Comparison functions

@deftypefun int mpf_cmp (const mpf_t @var{op1}, const mpf_t @var{op2})
@deftypefunx int mpf_cmp_z (const mpf_t @var{op1}, const mpz_t @var{op2})
@deftypefunx int mpf_cmp_d (const mpf_t @var{op1}, double @var{op2})
@deftypefunx int mpf_cmp_ui (const mpf_t @var{op1}, unsigned long int @var{op2})
@deftypefunx int mpf_cmp_si (const mpf_t @var{op1}, signed long int @var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
@math{@var{op1} < @var{op2}}.

@code{mpf_cmp_d} can be called with an infinity, but results are undefined for
a NaN.
@end deftypefun

@deftypefun int mpf_eq (const mpf_t @var{op1}, const mpf_t @var{op2}, mp_bitcnt_t @var{op3})
@strong{This function is mathematically ill-defined and should not be used.}

Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
equal, zero otherwise. Note that numbers such as 256 (binary 100000000) and
255 (binary 11111111) will never be equal by this function's measure, and
furthermore that 0 will only be equal to itself.
@end deftypefun

@deftypefun void mpf_reldiff (mpf_t @var{rop}, const mpf_t @var{op1}, const mpf_t @var{op2})
Compute the relative difference between @var{op1} and @var{op2} and store the
result in @var{rop}. This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
@end deftypefun

@deftypefn Macro int mpf_sgn (const mpf_t @var{op})
@cindex Sign tests
@cindex Float sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its argument
multiple times.
@end deftypefn

@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Float input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

Functions that perform input from a stdio stream, and functions that output to
a stdio stream, of @code{mpf} numbers. Passing a @code{NULL} pointer for a
@var{stream} argument to any of these functions will make them read from
@code{stdin} and write to @code{stdout}, respectively.

When using any of these functions, it is a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

See also @ref{Formatted Output} and @ref{Formatted Input}.

@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, const mpf_t @var{op})
Print @var{op} to @var{stream}, as a string of digits. Return the number of
bytes written, or if an error occurred, return 0.

The mantissa is prefixed with an @samp{0.} and is in the given @var{base},
which may vary from 2 to 62 or from @minus{}2 to @minus{}36. An exponent is
then printed, separated by an @samp{e}, or if the base is greater than 10 then
by an @samp{@@}. The exponent is always in decimal. The decimal point follows
the current locale, on systems providing @code{localeconv}.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

Up to @var{n_digits} will be printed from the mantissa, except that no more
digits than are accurately representable by @var{op} will be printed.
@var{n_digits} can be 0 to select that accurate maximum.
@end deftypefun

@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
Read a string in base @var{base} from @var{stream}, and put the read float in
@var{rop}. The string is of the form @samp{M@@N} or, if the base is 10 or
less, alternatively @samp{MeN}. @samp{M} is the mantissa and @samp{N} is the
exponent. The mantissa is always in the specified base. The exponent is
either in the specified base or, if @var{base} is negative, in decimal. The
decimal point expected is taken from the current locale, on systems providing
@code{localeconv}.

The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
@minus{}2. Negative values are used to specify that the exponent is in
decimal.

Unlike the corresponding @code{mpz} function, the base will not be determined
from the leading characters of the string if @var{base} is 0. This is so that
numbers like @samp{0.23} are not interpreted as octal.

Return the number of bytes read, or if an error occurred, return 0.
@end deftypefun

@c @deftypefun void mpf_out_raw (FILE *@var{stream}, const mpf_t @var{float})
@c Output @var{float} on stdio stream @var{stream}, in raw binary
@c format. The float is written in a portable format, with 4 bytes of
@c size information, and that many bytes of limbs. Both the size and the
@c limbs are written in decreasing significance order.
@c @end deftypefun

@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
@c Input from stdio stream @var{stream} in the format written by
@c @code{mpf_out_raw}, and put the result in @var{float}.
@c @end deftypefun


@node Miscellaneous Float Functions, , I/O of Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Miscellaneous Functions
@cindex Miscellaneous float functions
@cindex Float miscellaneous functions

@deftypefun void mpf_ceil (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_floor (mpf_t @var{rop}, const mpf_t @var{op})
@deftypefunx void mpf_trunc (mpf_t @var{rop}, const mpf_t @var{op})
@cindex Rounding functions
@cindex Float rounding functions
Set @var{rop} to @var{op} rounded to an integer. @code{mpf_ceil} rounds to the
next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
to the integer towards zero.
@end deftypefun

@deftypefun int mpf_integer_p (const mpf_t @var{op})
Return non-zero if @var{op} is an integer.
@end deftypefun

@deftypefun int mpf_fits_ulong_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_slong_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_uint_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_sint_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_ushort_p (const mpf_t @var{op})
@deftypefunx int mpf_fits_sshort_p (const mpf_t @var{op})
Return non-zero if @var{op} would fit in the respective C data type, when
truncated to an integer.
@end deftypefun

@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{nbits})
@cindex Random number functions
@cindex Float random number functions
Generate a uniformly distributed random float in @var{rop}, such that @math{0
@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa or
less if the precision of @var{rop} is smaller.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization}) before
invoking this function.
@end deftypefun

@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
Generate a random float of at most @var{max_size} limbs, with long strings of
zeros and ones in the binary representation. The exponent of the number is in
the interval @minus{}@var{exp} to @var{exp} (in limbs). This function is
useful for testing functions and algorithms, since this kind of random
number has proven to be more likely to trigger corner-case bugs. Negative
random numbers are generated when @var{max_size} is negative.
@end deftypefun

@c @deftypefun size_t mpf_size (const mpf_t @var{op})
@c Return the size of @var{op} measured in number of limbs. If @var{op} is
@c zero, the returned value will be zero. (@xref{Nomenclature}, for an
@c explanation of the concept @dfn{limb}.)
@c
@c @strong{This function is obsolete. It will disappear from future GMP
@c releases.}
@c @end deftypefun


@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
@comment node-name, next, previous, up
@chapter Low-level Functions
@cindex Low-level functions

This chapter describes low-level GMP functions, used to implement the
high-level GMP functions, but also intended for time-critical user code.

These functions start with the prefix @code{mpn_}.

@c 1. Some of these functions clobber input operands.
@c

The @code{mpn} functions are designed to be as fast as possible, @strong{not}
to provide a coherent calling interface. The different functions have somewhat
similar interfaces, but there are variations that make them hard to use. These
functions do as little as possible apart from the real multiple precision
computation, so that no time is spent on things that not all callers need.

A source operand is specified by a pointer to the least significant limb and a
limb count.
A destination operand is specified by just a pointer. It is the
responsibility of the caller to ensure that the destination has enough space
for storing the result.

With this way of specifying operands, it is possible to perform computations on
subranges of an argument, and store the result into a subrange of a
destination.

A common requirement for all functions is that each source area needs at least
one limb. No size argument may be zero. Unless otherwise stated, in-place
operations are allowed where source and destination are the same, but not where
they only partly overlap.

The @code{mpn} functions are the base for the implementation of the
@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.

This example adds the number beginning at @var{s1p} and the number beginning at
@var{s2p} and writes the sum at @var{destp}. All areas have @var{n} limbs.

@example
cy = mpn_add_n (destp, s1p, s2p, n);
@end example

It should be noted that the @code{mpn} functions make no attempt to identify
high or low zero limbs on their operands, or other special forms. On random
data such cases will be unlikely and it'd be wasteful for every function to
check every time. An application knowing something about its data can take
steps to trim or perhaps split its calculations.
@c
@c For reference, within gmp mpz_t operands never have high zero limbs, and
@c we rate low zero limbs as unlikely too (or something an application should
@c handle). This is a prime motivation for not stripping zero limbs in say
@c mpn_mul_n etc.
@c
@c Other applications doing variable-length calculations will quite likely do
@c something similar to mpz. And even if not then it's highly likely zero
@c limb stripping can be done at just a few judicious points, which will be
@c more efficient than having lots of mpn functions checking every time.

@sp 1
@noindent
In the notation used below, a source operand is identified by the pointer to
the least significant limb, and the limb count in braces. For example,
@{@var{s1p}, @var{s1n}@}.

@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
least significant limbs of the result to @var{rp}. Return carry, either 0 or
1.

This is the lowest-level function for addition. It is the preferred function
for addition, since it is written in assembly for most CPUs. For addition of
a variable to itself (i.e., @var{s1p} equals @var{s2p}) use @code{mpn_lshift}
with a count of 1 for optimal speed.
@end deftypefun

@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
@end deftypefun

@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
@var{s1n} least significant limbs of the result to @var{rp}. Return carry,
either 0 or 1.

This function requires that @var{s1n} is greater than or equal to @var{s2n}.
@end deftypefun

@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
@var{n} least significant limbs of the result to @var{rp}. Return borrow,
either 0 or 1.

This is the lowest-level function for subtraction.
It is the preferred
function for subtraction, since it is written in assembly for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
@end deftypefun

@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
@var{s1n} least significant limbs of the result to @var{rp}. Return borrow,
either 0 or 1.

This function requires that @var{s1n} is greater than or equal to
@var{s2n}.
@end deftypefun

@deftypefun mp_limb_t mpn_neg (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Perform the negation of @{@var{sp}, @var{n}@}, and write the result to
@{@var{rp}, @var{n}@}. This is equivalent to calling @code{mpn_sub_n} with an
@var{n}-limb zero minuend and passing @{@var{sp}, @var{n}@} as subtrahend.
Return borrow, either 0 or 1.
@end deftypefun

@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
2*@var{n}-limb result to @var{rp}.

The destination has to have space for 2*@var{n} limbs, even if the product's
most significant limb is zero. No overlap is permitted between the
destination and either source.

If the two input operands are the same, use @code{mpn_sqr}.
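
The 2*@var{n}-limb destination requirement can be seen in a schoolbook model
of the product. The sketch below is for exposition only and is not how GMP
implements @code{mpn_mul_n}. It assumes a hypothetical 32-bit limb so that
each partial product fits in a @code{uint64_t}, and the names
@code{limb32_t}, @code{model_addmul_1} and @code{model_mul_n} are invented
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb32_t;        /* hypothetical 32-bit limb */

/* Model of the mpn_addmul_1 semantics: {rp, n} += {s1p, n} * v.  Returns the
   most significant limb of the product plus the carry out of the addition.  */
static limb32_t
model_addmul_1 (limb32_t *rp, const limb32_t *s1p, size_t n, limb32_t v)
{
  limb32_t cy = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* (2^32-1)^2 + 2*(2^32-1) = 2^64-1, so this cannot overflow.  */
      uint64_t t = (uint64_t) s1p[i] * v + rp[i] + cy;
      rp[i] = (limb32_t) t;
      cy = (limb32_t) (t >> 32);
    }
  return cy;
}

/* Schoolbook model of mpn_mul_n: {rp, 2n} = {s1p, n} * {s2p, n}.
   All 2n destination limbs are written, even when the top one is zero.  */
static void
model_mul_n (limb32_t *rp, const limb32_t *s1p, const limb32_t *s2p, size_t n)
{
  assert (n > 0);
  for (size_t i = 0; i < 2 * n; i++)
    rp[i] = 0;
  for (size_t j = 0; j < n; j++)
    rp[j + n] = model_addmul_1 (rp + j, s1p, n, s2p[j]);
}
```

In this model every one of the 2*@var{n} result limbs is stored, which is why
the destination must have that much space regardless of the operand values.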
@end deftypefun

@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
(@var{s1n}+@var{s2n})-limb result to @var{rp}. Return the most significant
limb of the result.

The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
product's most significant limb is zero. No overlap is permitted between the
destination and either source.

This function requires that @var{s1n} is greater than or equal to @var{s2n}.
@end deftypefun

@deftypefun void mpn_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
result to @var{rp}.

The destination has to have space for 2*@var{n} limbs, even if the result's
most significant limb is zero. No overlap is permitted between the
destination and the source.
@end deftypefun

@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
significant limbs of the product to @var{rp}. Return the most significant
limb of the product. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.

This is a low-level function that is a building block for general
multiplication as well as other operations in GMP@. It is written in assembly
for most CPUs.

Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
with a count equal to the base-2 logarithm of @var{s2limb} instead, for optimal
speed.
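
The equivalence behind that advice can be checked with a small model. The
sketch below is not GMP code. It assumes a hypothetical 32-bit limb, and the
names @code{limb32_t}, @code{model_mul_1} and @code{model_lshift} are
invented for the example. Multiplying by 8 and shifting left by 3 give the
same limbs and the same return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb32_t;        /* hypothetical 32-bit limb */

/* Model of mpn_mul_1: write the n low limbs of {s1p, n} * v to rp and
   return the most significant limb of the product.  */
static limb32_t
model_mul_1 (limb32_t *rp, const limb32_t *s1p, size_t n, limb32_t v)
{
  limb32_t hi = 0;
  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    {
      uint64_t p = (uint64_t) s1p[i] * v + hi;   /* fits in 64 bits */
      rp[i] = (limb32_t) p;
      hi = (limb32_t) (p >> 32);
    }
  return hi;
}

/* Model of mpn_lshift: shift {sp, n} left by count bits (1 to 31 here) and
   return the bits shifted out at the top.  Limbs are processed from high to
   low, so overlapping areas are handled when rp >= sp.  */
static limb32_t
model_lshift (limb32_t *rp, const limb32_t *sp, size_t n, unsigned count)
{
  assert (n > 0 && count >= 1 && count <= 31);
  limb32_t out = sp[n - 1] >> (32 - count);
  for (size_t i = n - 1; i > 0; i--)
    rp[i] = (sp[i] << count) | (sp[i - 1] >> (32 - count));
  rp[0] = sp[0] << count;
  return out;
}
```

The shift needs no multiplication at all, which is the reason it is the
cheaper choice when the multiplier is known to be a power of 2.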
@end deftypefun

@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
to @var{rp}. Return the most significant limb of the product, plus carry-out
from the addition. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.

This is a low-level function that is a building block for general
multiplication as well as other operations in GMP@. It is written in assembly
for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
result to @var{rp}. Return the most significant limb of the product, plus
borrow-out from the subtraction. @{@var{s1p}, @var{n}@} and @{@var{rp},
@var{n}@} are allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.

This is a low-level function that is a building block for general
multiplication and division as well as other operations in GMP@. It is written
in assembly for most CPUs.
@end deftypefun

@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
@var{dn}@}. The quotient is rounded towards 0.

No overlap is permitted between arguments, except that @var{np} may equal
@var{rp}.
The dividend size @var{nn} must be greater than or equal to the divisor
size @var{dn}. The most significant limb of the divisor must be non-zero. The
@var{qxn} operand must be zero.
@end deftypefun

@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
performance.]

Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
quotient at @var{r1p}, with the exception of the most significant limb, which
is returned. The remainder replaces the dividend at @var{rs2p}; it will be
@var{s3n} limbs long (i.e., as many limbs as the divisor).

In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
stored after the integral limbs. For most usages, @var{qxn} will be zero.

It is required that @var{rs2n} is greater than or equal to @var{s3n}. It is
required that the most significant bit of the divisor is set.

If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}. Aside
from that special case, no overlap between arguments is permitted.

Return the most significant limb of the quotient, either 0 or 1.

The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
limbs large.
@end deftypefun

@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
@var{r1p}. Return the remainder.

The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
@var{qxn}@}. Either or both @var{s2n} and @var{qxn} can be zero. For most
usages, @var{qxn} will be zero.

@code{mpn_divmod_1} exists for upward source compatibility and is simply a
macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.

The areas at @var{r1p} and @var{s2p} have to be identical or completely
separate, not partially overlapping.
@end deftypefn

@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
performance.]
@end deftypefun

@deftypefun void mpn_divexact_1 (mp_limb_t * @var{rp}, const mp_limb_t * @var{sp}, mp_size_t @var{n}, mp_limb_t @var{d})
Divide @{@var{sp}, @var{n}@} by @var{d}, expecting it to divide exactly, and
writing the result to @{@var{rp}, @var{n}@}. If @var{d} doesn't divide
exactly, the value written to @{@var{rp}, @var{n}@} is undefined. The areas at
@var{rp} and @var{sp} have to be identical or completely separate, not
partially overlapping.
@end deftypefun

@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
the result to @{@var{rp}, @var{n}@}. If 3 divides exactly, the return value is
zero and the result is the quotient. If not, the return value is non-zero and
the result won't be anything useful.

@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
return value from a previous call, so a large calculation can be done piece by
piece from low to high. @code{mpn_divexact_by3} is simply a macro calling
@code{mpn_divexact_by3c} with a 0 carry parameter.

These routines use a multiply-by-inverse and will be faster than
@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.

The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
@m{b=2\GMPraise{@code{GMP\_NUMB\_BITS}}, b=2^GMP_NUMB_BITS}. The
return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
be 0, 1 or 2 (these are both borrows really). When @math{c=0} clearly
@math{q=(a-i)/3}. When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
3} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
@code{mp_bits_per_limb} is even, which is always so currently).
@end deftypefn

@deftypefun mp_limb_t mpn_mod_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
@var{s1n} can be zero.
@end deftypefun

@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
@{@var{rp}, @var{n}@}. The bits shifted out at the left are returned in the
least significant @var{count} bits of the return value (the rest of the return
value is zero).

@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
@math{@var{rp} @ge{} @var{sp}}.

This function is written in assembly for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
@{@var{rp}, @var{n}@}. The bits shifted out at the right are returned in the
most significant @var{count} bits of the return value (the rest of the return
value is zero).

@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
@math{@var{rp} @le{} @var{sp}}.

This function is written in assembly for most CPUs.
@end deftypefun

@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a
negative value if @math{@var{s1} < @var{s2}}.
@end deftypefun

@deftypefun int mpn_zero_p (const mp_limb_t *@var{sp}, mp_size_t @var{n})
Test @{@var{sp}, @var{n}@} and return 1 if the operand is zero, 0 otherwise.
@end deftypefun

@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn})
Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{xp},
@var{xn}@} and @{@var{yp}, @var{yn}@}. The result can be up to @var{yn} limbs;
the return value is the actual number produced. Both source operands are
destroyed.

It is required that @math{@var{xn} @ge @var{yn} > 0}, the most significant
limb of @{@var{yp}, @var{yn}@} must be non-zero, and at least one of
the two operands must be odd. No overlap is permitted
between @{@var{xp}, @var{xn}@} and @{@var{yp}, @var{yn}@}.
@end deftypefun

@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t @var{ylimb})
Return the greatest common divisor of @{@var{xp}, @var{xn}@} and @var{ylimb}.
Both operands must be non-zero.
@end deftypefun

@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{gp}, mp_limb_t *@var{sp}, mp_size_t *@var{sn}, mp_limb_t *@var{up}, mp_size_t @var{un}, mp_limb_t *@var{vp}, mp_size_t @var{vn})
Let @m{U,@var{U}} be defined by @{@var{up}, @var{un}@} and let @m{V,@var{V}} be
defined by @{@var{vp}, @var{vn}@}.

Compute the greatest common divisor @math{G} of @math{U} and @math{V}. Compute
a cofactor @math{S} such that @math{G = US + VT}. The second cofactor @var{T}
is not computed but can easily be obtained from @m{(G - US) / V, (@var{G} -
@var{U}*@var{S}) / @var{V}} (the division will be exact). It is required that
@math{@var{un} @ge @var{vn} > 0}, and the most significant
limb of @{@var{vp}, @var{vn}@} must be non-zero.

@math{S} satisfies @math{S = 1} or @math{@GMPabs{S} < V / (2 G)}. @math{S =
0} if and only if @math{V} divides @math{U} (i.e., @math{G = V}).

Store @math{G} at @var{gp} and let the return value define its limb count.
Store @math{S} at @var{sp} and let |*@var{sn}| define its limb count. @math{S}
can be negative; when this happens *@var{sn} will be negative. The area at
@var{gp} should have room for @var{vn} limbs and the area at @var{sp} should
have room for @math{@var{vn}+1} limbs.

Both source operands are destroyed.

Compatibility notes: GMP 4.3.0 and 4.3.1 defined @math{S} less strictly.
Earlier as well as later GMP releases define @math{S} as described here.
GMP releases before GMP 4.3.0 required additional space for both input and output
areas.
More precisely, the areas @{@var{up}, @math{@var{un}+1}@} and
@{@var{vp}, @math{@var{vn}+1}@} were destroyed (i.e.@: the operands plus an
extra limb past the end of each), and the areas pointed to by @var{gp} and
@var{sp} should each have room for @math{@var{un}+1} limbs.
@end deftypefun

@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Compute the square root of @{@var{sp}, @var{n}@} and put the result at
@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
@var{retval}@}. @var{r2p} needs space for @var{n} limbs, but the return value
indicates how many are produced.

The most significant limb of @{@var{sp}, @var{n}@} must be non-zero. The
areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
be completely separate. The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
@var{n}@} must be either identical or completely separate.

If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
case the return value is zero or non-zero according to whether the remainder
would have been zero or non-zero.

A return value of zero indicates a perfect square. See also
@code{mpn_perfect_square_p}.
@end deftypefun

@deftypefun size_t mpn_sizeinbase (const mp_limb_t *@var{xp}, mp_size_t @var{n}, int @var{base})
Return the size of @{@var{xp},@var{n}@} measured in number of digits in the
given @var{base}. @var{base} can vary from 2 to 62. Requires @math{@var{n} > 0}
and @math{@var{xp}[@var{n}-1] > 0}. The result will be either exact or
1 too big. If @var{base} is a power of 2, the result is always exact.
@end deftypefun

@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
base @var{base}, and return the number of characters produced. There may be
leading zeros in the string. The string is not in ASCII; to convert it to
printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
the base and range. @var{base} can vary from 2 to 256.

The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
non-zero. The input @{@var{s1p}, @var{s1n}@} is clobbered, except when
@var{base} is a power of 2, in which case it's unchanged.

The area at @var{str} has to have space for the largest possible number
represented by a @var{s1n} long limb array, plus one extra character.
@end deftypefun

@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base})
Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at
@var{rp}.

@math{@var{str}[0]} is the most significant input byte and
@math{@var{str}[@var{strsize}-1]} is the least significant input byte. Each
byte should be a value in the range 0 to @math{@var{base}-1}, not an ASCII
character. @var{base} can vary from 2 to 256.

The converted value is @{@var{rp},@var{rn}@} where @var{rn} is the return
value. If the most significant input byte @math{@var{str}[0]} is non-zero,
then @math{@var{rp}[@var{rn}-1]} will be non-zero, else
@math{@var{rp}[@var{rn}-1]} and some number of subsequent limbs may be zero.

The area at @var{rp} has to have space for the largest possible number with
@var{strsize} digits in the chosen base, plus one extra limb.

The input must have at least one byte, and no overlap is permitted between
@{@var{str},@var{strsize}@} and the result at @var{rp}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_scan0 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
Scan @var{s1p} from bit position @var{bit} for the next clear bit.

It is required that there be a clear bit within the area at @var{s1p} at or
beyond bit position @var{bit}, so that the function has something to return.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_scan1 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
Scan @var{s1p} from bit position @var{bit} for the next set bit.

It is required that there be a set bit within the area at @var{s1p} at or
beyond bit position @var{bit}, so that the function has something to return.
@end deftypefun

@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
Generate a random number of length @var{r1n} and store it at @var{r1p}. The
most significant limb is always non-zero. @code{mpn_random} generates
uniformly distributed limb data, @code{mpn_random2} generates long strings of
zeros and ones in the binary representation.

@code{mpn_random2} is intended for testing the correctness of the @code{mpn}
routines.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Count the number of set bits in @{@var{s1p}, @var{n}@}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Compute the Hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p},
@var{n}@}, which is the number of bit positions where the two operands have
different bit values.
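
Put differently, the distance is the population count of the exclusive or of
the two operands. A minimal portable sketch of that relationship follows. It
is a model, not GMP's implementation. A 64-bit limb is assumed, and the names
@code{limb_t}, @code{model_popcount_limb} and @code{model_hamdist} are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;          /* hypothetical stand-in for mp_limb_t */

/* Count the set bits of one limb by clearing the lowest set bit repeatedly.  */
static unsigned
model_popcount_limb (limb_t x)
{
  unsigned count = 0;
  while (x != 0)
    {
      x &= x - 1;
      count++;
    }
  return count;
}

/* Number of bit positions in which {s1p, n} and {s2p, n} differ.  */
static uint64_t
model_hamdist (const limb_t *s1p, const limb_t *s2p, size_t n)
{
  uint64_t total = 0;
  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    total += model_popcount_limb (s1p[i] ^ s2p[i]);
  return total;
}
```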
@end deftypefun

@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
The most significant limb of the input @{@var{s1p}, @var{n}@} must be
non-zero.
@end deftypefun

@deftypefun void mpn_and_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
@var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_ior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_xor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_andn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and the bitwise
complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_iorn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and the bitwise
complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_nand_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
@var{n}@}, and write the bitwise complement of the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_nior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
@{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_xnor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
@{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_com (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Perform the bitwise complement of @{@var{sp}, @var{n}@}, and write the result
to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_copyi (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, increasingly.
@end deftypefun

@deftypefun void mpn_copyd (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, decreasingly.
@end deftypefun

@deftypefun void mpn_zero (mp_limb_t *@var{rp}, mp_size_t @var{n})
Zero @{@var{rp}, @var{n}@}.
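
The copy direction of @code{mpn_copyi} and @code{mpn_copyd} above matters
when source and destination overlap. The sketch below is a plain-C model of
the two directions and makes no claim about GMP's own code. A 64-bit limb is
assumed, and the names @code{limb_t}, @code{model_copyi} and
@code{model_copyd} are hypothetical. An increasing copy reads each source
limb before it can be overwritten when the destination starts at or below the
source, and a decreasing copy does so when the destination starts at or above
it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;          /* hypothetical stand-in for mp_limb_t */

/* Copy limb 0 first, then 1, and so on up to n-1.  */
static void
model_copyi (limb_t *rp, const limb_t *sp, size_t n)
{
  for (size_t i = 0; i < n; i++)
    rp[i] = sp[i];
}

/* Copy limb n-1 first, then n-2, and so on down to 0.  */
static void
model_copyd (limb_t *rp, const limb_t *sp, size_t n)
{
  for (size_t i = n; i-- > 0;)
    rp[i] = sp[i];
}
```

For instance, moving four limbs one position up within the same array is
correct with @code{model_copyd}, and moving them one position down is correct
with @code{model_copyi}.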
@end deftypefun

@sp 1
@section Low-level functions for cryptography
@cindex Low-level functions for cryptography
@cindex Cryptography functions, low-level

The functions prefixed with @code{mpn_sec_} and @code{mpn_cnd_} are designed to
perform the exact same low-level operations and have the same cache access
patterns for any two same-size arguments, assuming that function arguments are
placed at the same position and that the machine state is identical upon
function entry. These functions are intended for cryptographic purposes, where
resilience to side-channel attacks is desired.

These functions are less efficient than their ``leaky'' counterparts; their
performance for operands of the sizes typically used for cryptographic
applications is between 15% and 100% worse. For larger operands, these
functions might be inadequate, since they rely on asymptotically elementary
algorithms.

These functions do not make any explicit allocations. The functions that need
scratch space take it as an explicit operand. This convention allows
callers to keep sensitive data in designated memory areas. Note however that
compilers may choose to spill scalar values used within these functions to
their stack frame and that such scalars may contain sensitive data.

In addition to these specially crafted functions, the following @code{mpn}
functions are naturally side-channel resistant: @code{mpn_add_n},
@code{mpn_sub_n}, @code{mpn_lshift}, @code{mpn_rshift}, @code{mpn_zero},
@code{mpn_copyi}, @code{mpn_copyd}, @code{mpn_com}, and the logical functions
(@code{mpn_and_n}, etc).

There are some exceptions from the side-channel resilience: (1) Some assembly
implementations of @code{mpn_lshift} identify shift-by-one as a special case.
This is a problem iff the shift count is a function of sensitive data.
(2)
Alpha ev6 and Pentium4 using 64-bit limbs have leaky @code{mpn_add_n} and
@code{mpn_sub_n}. (3) Alpha ev6 has a leaky @code{mpn_mul_1} which also makes
@code{mpn_sec_mul} on those systems unsafe.

@deftypefun mp_limb_t mpn_cnd_add_n (mp_limb_t @var{cnd}, mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
@deftypefunx mp_limb_t mpn_cnd_sub_n (mp_limb_t @var{cnd}, mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
These functions do conditional addition and subtraction. If @var{cnd} is
non-zero, they produce the same result as a regular @code{mpn_add_n} or
@code{mpn_sub_n}, and if @var{cnd} is zero, they copy @{@var{s1p},@var{n}@} to
the result area and return zero. The functions are designed to have timing and
memory access patterns depending only on size and location of the data areas,
but independent of the condition @var{cnd}. As with @code{mpn_add_n} and
@code{mpn_sub_n}, on most machines, the timing will also be independent of the
actual limb values.
@end deftypefun

@deftypefun mp_limb_t mpn_sec_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{n}, mp_limb_t @var{b}, mp_limb_t *@var{tp})
@deftypefunx mp_limb_t mpn_sec_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{n}, mp_limb_t @var{b}, mp_limb_t *@var{tp})
Set @var{R} to @var{A} + @var{b} or @var{A} - @var{b}, respectively, where
@var{R} = @{@var{rp},@var{n}@}, @var{A} = @{@var{ap},@var{n}@}, and @var{b} is
a single limb. Returns carry (or borrow, for the subtraction).

These functions take @math{O(N)} time, unlike the leaky functions
@code{mpn_add_1} and @code{mpn_sub_1}, which are @math{O(1)} on average. They
require scratch space
of @code{mpn_sec_add_1_itch(@var{n})} and @code{mpn_sec_sub_1_itch(@var{n})}
limbs, respectively, to be passed in the @var{tp} parameter.
The scratch space
requirements are guaranteed to be at most @var{n} limbs, and increase
monotonically in the operand size.
@end deftypefun

@deftypefun void mpn_cnd_swap (mp_limb_t @var{cnd}, volatile mp_limb_t *@var{ap}, volatile mp_limb_t *@var{bp}, mp_size_t @var{n})
If @var{cnd} is non-zero, swaps the contents of the areas @{@var{ap},@var{n}@}
and @{@var{bp},@var{n}@}. Otherwise, the areas are left unmodified.
Implemented using logical operations on the limbs, with the same memory
accesses independent of the value of @var{cnd}.
@end deftypefun

@deftypefun void mpn_sec_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{an}, const mp_limb_t *@var{bp}, mp_size_t @var{bn}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_mul_itch (mp_size_t @var{an}, mp_size_t @var{bn})
Set @var{R} to @math{A @times{} B}, where @var{A} = @{@var{ap},@var{an}@},
@var{B} = @{@var{bp},@var{bn}@}, and @var{R} =
@{@var{rp},@math{@var{an}+@var{bn}}@}.

It is required that @math{@var{an} @ge @var{bn} > 0}.

No overlapping between @var{R} and the input operands is allowed. For
@math{@var{A} = @var{B}}, use @code{mpn_sec_sqr} for optimal performance.

This function requires scratch space of @code{mpn_sec_mul_itch(@var{an},
@var{bn})} limbs to be passed in the @var{tp} parameter. The scratch space
requirements are guaranteed to increase monotonically in the operand sizes.
@end deftypefun


@deftypefun void mpn_sec_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{ap}, mp_size_t @var{an}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_sqr_itch (mp_size_t @var{an})
Set @var{R} to @math{A^2}, where @var{A} = @{@var{ap},@var{an}@}, and @var{R} =
@{@var{rp},@math{2@var{an}}@}.

It is required that @math{@var{an} > 0}.

No overlapping between @var{R} and the input operands is allowed.

This function requires scratch space of @code{mpn_sec_sqr_itch(@var{an})} limbs
to be passed in the @var{tp} parameter. The scratch space requirements are
guaranteed to increase monotonically in the operand size.
@end deftypefun


@deftypefun void mpn_sec_powm (mp_limb_t *@var{rp}, const mp_limb_t *@var{bp}, mp_size_t @var{bn}, const mp_limb_t *@var{ep}, mp_bitcnt_t @var{enb}, const mp_limb_t *@var{mp}, mp_size_t @var{n}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_powm_itch (mp_size_t @var{bn}, mp_bitcnt_t @var{enb}, mp_size_t @var{n})
Set @var{R} to @m{B^E \bmod @var{M}, (@var{B} raised to @var{E}) modulo
@var{M}}, where @var{R} = @{@var{rp},@var{n}@}, @var{M} = @{@var{mp},@var{n}@},
and @var{E} = @{@var{ep},@math{@GMPceil{@var{enb} /
@code{GMP\_NUMB\_BITS}}}@}.

It is required that @math{@var{B} > 0}, that @math{@var{M} > 0} is odd, and
that @m{@var{E} < 2@GMPraise{@var{enb}}, @var{E} < 2^@var{enb}}, with @math{@var{enb} > 0}.

No overlapping between @var{R} and the input operands is allowed.

This function requires scratch space of @code{mpn_sec_powm_itch(@var{bn},
@var{enb}, @var{n})} limbs to be passed in the @var{tp} parameter. The scratch
space requirements are guaranteed to increase monotonically in the operand
sizes.
@end deftypefun

@deftypefun void mpn_sec_tabselect (mp_limb_t *@var{rp}, const mp_limb_t *@var{tab}, mp_size_t @var{n}, mp_size_t @var{nents}, mp_size_t @var{which})
Select entry @var{which} from table @var{tab}, which has @var{nents} entries, each @var{n}
limbs. Store the selected entry at @var{rp}.

This function reads the entire table to avoid side-channel information leaks.
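
The idea of reading every entry and keeping only the wanted one can be
sketched with bit masks, as below. This is an illustration of the general
technique only, not GMP's implementation. A 64-bit limb is assumed, and the
names @code{limb_t} and @code{model_tabselect} are hypothetical. A compiler
is free to transform such code, so the sketch by itself gives no
constant-time guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;          /* hypothetical stand-in for mp_limb_t */

/* Copy entry `which' of a table of nents entries, n limbs each, to rp.
   Every entry is read; a mask decides which one contributes.  */
static void
model_tabselect (limb_t *rp, const limb_t *tab, size_t n, size_t nents,
                 size_t which)
{
  assert (n > 0 && nents > 0);
  for (size_t i = 0; i < n; i++)
    rp[i] = 0;
  for (size_t k = 0; k < nents; k++)
    {
      limb_t diff = (limb_t) (k ^ which);
      /* (diff | -diff) has its top bit set exactly when diff != 0, so the
         mask is all ones for k == which and zero otherwise, with no branch.  */
      limb_t mask = ((diff | (limb_t) (0 - diff)) >> 63) - 1;
      for (size_t i = 0; i < n; i++)
        rp[i] |= tab[k * n + i] & mask;
    }
}
```

The sequence of memory reads in this model is the same for every value of
@code{which}, which is the property the description above refers to.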
@end deftypefun

@deftypefun mp_limb_t mpn_sec_div_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_div_qr_itch (mp_size_t @var{nn}, mp_size_t @var{dn})

Set @var{Q} to @m{\lfloor @var{N} / @var{D}\rfloor, the truncated quotient
@var{N} / @var{D}} and @var{R} to @m{@var{N} \bmod @var{D}, @var{N} modulo
@var{D}}, where @var{N} = @{@var{np},@var{nn}@}, @var{D} =
@{@var{dp},@var{dn}@}, @var{Q}'s most significant limb is the function return
value and the remaining limbs are @{@var{qp},@var{nn}@minus{}@var{dn}@}, and @var{R} =
@{@var{np},@var{dn}@}.

It is required that @math{@var{nn} @ge @var{dn} @ge 1}, and that
@m{@var{dp}[@var{dn}-1] @neq 0, @var{dp}[@var{dn}-1] != 0}. This does not
imply that @math{@var{N} @ge @var{D}} since @var{N} might be zero-padded.

Note the overlapping between @var{N} and @var{R}. No other operand overlapping
is allowed. The entire space occupied by @var{N} is overwritten.

This function requires scratch space of @code{mpn_sec_div_qr_itch(@var{nn},
@var{dn})} limbs to be passed in the @var{tp} parameter.
@end deftypefun

@deftypefun void mpn_sec_div_r (mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_div_r_itch (mp_size_t @var{nn}, mp_size_t @var{dn})

Set @var{R} to @m{@var{N} \bmod @var{D}, @var{N} modulo @var{D}}, where @var{N}
= @{@var{np},@var{nn}@}, @var{D} = @{@var{dp},@var{dn}@}, and @var{R} =
@{@var{np},@var{dn}@}.

It is required that @math{@var{nn} @ge @var{dn} @ge 1}, and that
@m{@var{dp}[@var{dn}-1] @neq 0, @var{dp}[@var{dn}-1] != 0}. This does not
imply that @math{@var{N} @ge @var{D}} since @var{N} might be zero-padded.

Note the overlapping between @var{N} and @var{R}.
No other operand overlapping
is allowed. The entire space occupied by @var{N} is overwritten.

This function requires scratch space of @code{mpn_sec_div_r_itch(@var{nn},
@var{dn})} limbs to be passed in the @var{tp} parameter.
@end deftypefun

@deftypefun int mpn_sec_invert (mp_limb_t *@var{rp}, mp_limb_t *@var{ap}, const mp_limb_t *@var{mp}, mp_size_t @var{n}, mp_bitcnt_t @var{nbcnt}, mp_limb_t *@var{tp})
@deftypefunx mp_size_t mpn_sec_invert_itch (mp_size_t @var{n})
Set @var{R} to @m{@var{A}^{-1} \bmod @var{M}, the inverse of @var{A} modulo
@var{M}}, where @var{R} = @{@var{rp},@var{n}@}, @var{A} = @{@var{ap},@var{n}@},
and @var{M} = @{@var{mp},@var{n}@}. @strong{This function's interface is
preliminary.}

If an inverse exists, return 1, otherwise return 0 and leave @var{R}
undefined. In either case, the input @var{A} is destroyed.

It is required that @var{M} is odd, and that @math{@var{nbcnt} @ge
@GMPceil{\log(@var{A}+1)} + @GMPceil{\log(@var{M}+1)}}. A safe choice is
@m{@var{nbcnt} = 2@var{n} @times{} @code{GMP\_NUMB\_BITS}, @var{nbcnt} = 2
@times{} @var{n} @times{} GMP_NUMB_BITS}, but a smaller value might improve
performance if @var{M} or @var{A} are known to have leading zero bits.

This function requires scratch space of @code{mpn_sec_invert_itch(@var{n})}
limbs to be passed in the @var{tp} parameter.
@end deftypefun


@sp 1
@section Nails
@cindex Nails

@strong{Everything in this section is highly experimental and may disappear or
be subject to incompatible changes in a future version of GMP.}

Nails are an experimental feature whereby a few bits are left unused at the
top of each @code{mp_limb_t}. This can significantly improve carry handling
on some processors.

All the @code{mpn} functions accepting limb data will expect the nail bits to
be zero on entry, and will return data with the nails similarly all zero.
This applies both to limb vectors and to single limb arguments.

Nails can be enabled by configuring with @samp{--enable-nails}. By default
the number of bits will be chosen according to what suits the host processor,
but a particular number can be selected with @samp{--enable-nails=N}.

At the mpn level, a nail build is neither source nor binary compatible with a
non-nail build, strictly speaking. But programs acting on limbs only through
the mpn functions are likely to work equally well with either build, and
judicious use of the definitions below should make any program compatible with
either build, at the source level.

For the higher level routines, meaning @code{mpz} etc, a nail build should be
fully source and binary compatible with a non-nail build.

@defmac GMP_NAIL_BITS
@defmacx GMP_NUMB_BITS
@defmacx GMP_LIMB_BITS
@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in
use. @code{GMP_NUMB_BITS} is the number of data bits in a limb.
@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}. In
all cases

@example
GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS
@end example
@end defmac

@defmac GMP_NAIL_MASK
@defmacx GMP_NUMB_MASK
Bit masks for the nail and number parts of a limb. @code{GMP_NAIL_MASK} is 0
when nails are not in use.

@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained
with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which
can help various RISC chips.
@end defmac

@defmac GMP_NUMB_MAX
The maximum value that can be stored in the number part of a limb.
This is
the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing
comparisons rather than bit-wise operations.
@end defmac

The term ``nails'' comes from finger or toe nails, which are at the ends of a
limb (arm or leg). ``numb'' is short for number, but is also how the
developers felt after trying for a long time to come up with sensible names
for these things.

In the future (the distant future most likely) a non-zero nail might be
permitted, giving non-unique representations for numbers in a limb vector.
This would help vector processors since carries would only ever need to
propagate one or two limbs.


@node Random Number Functions, Formatted Output, Low-level Functions, Top
@chapter Random Number Functions
@cindex Random number functions

Sequences of pseudo-random numbers in GMP are generated using a variable of
type @code{gmp_randstate_t}, which holds an algorithm selection and a current
state. Such a variable must be initialized by a call to one of the
@code{gmp_randinit} functions, and can be seeded with one of the
@code{gmp_randseed} functions.

The functions actually generating random numbers are described in @ref{Integer
Random Numbers}, and @ref{Miscellaneous Float Functions}.

The older style random number functions don't accept a @code{gmp_randstate_t}
parameter but instead share a global variable of that type. They use a
default algorithm and are currently not seeded (though perhaps that will
change in the future). The new functions accepting a @code{gmp_randstate_t}
are recommended for applications that care about randomness.
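
The life cycle of a random state (initialize, seed, generate, clear) can be
sketched as follows. The helper name @code{roll} is hypothetical, and a real
application would normally keep one state alive rather than creating a new
one for each number.

@example
#include <gmp.h>

/* Return a pseudo-random value in the range 0 to n-1, drawn
   from a state freshly seeded with `seed'.  */
unsigned long
roll (unsigned long seed, unsigned long n)
@{
  gmp_randstate_t state;
  unsigned long r;

  gmp_randinit_default (state);      /* choose the default algorithm */
  gmp_randseed_ui (state, seed);     /* set the starting point */
  r = gmp_urandomm_ui (state, n);    /* uniform in 0 to n-1 */
  gmp_randclear (state);             /* release the state */
  return r;
@}
@end example

Since the state is seeded with the same value each time, repeated calls with
the same arguments give the same result, which is convenient for testing but
not for unpredictability.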

@menu
* Random State Initialization::
* Random State Seeding::
* Random State Miscellaneous::
@end menu

@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
@section Random State Initialization
@cindex Random number state
@cindex Initialization functions

@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
Initialize @var{state} with a default algorithm. This will be a compromise
between speed and randomness, and is recommended for applications with no
special requirements. Currently this is @code{gmp_randinit_mt}.
@end deftypefun

@deftypefun void gmp_randinit_mt (gmp_randstate_t @var{state})
@cindex Mersenne twister random numbers
Initialize @var{state} for a Mersenne Twister algorithm. This algorithm is
fast and has good randomness properties.
@end deftypefun

@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, const mpz_t @var{a}, @w{unsigned long @var{c}}, @w{mp_bitcnt_t @var{m2exp}})
@cindex Linear congruential random numbers
Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.

The low bits of @math{X} in this algorithm are not very random. The least
significant bit will have a period no more than 2, and the second bit no more
than 4, etc. For this reason only the high half of each @math{X} is actually
used.

When a random number of more than @math{@var{m2exp}/2} bits is to be
generated, multiple iterations of the recurrence are used and the results
concatenated.
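
A call might look like the following sketch. The helper name @code{init_lc}
and the multiplier and increment are purely illustrative values, not a
recommendation, and the sketch assumes that @var{a} is copied into
@var{state} so that it can be cleared afterwards.

@example
#include <gmp.h>

/* Initialize `state' for X = (a*X + c) mod 2^64 with
   illustrative constants; 32 bits of each X are then used.  */
void
init_lc (gmp_randstate_t state)
@{
  mpz_t a;

  mpz_init_set_ui (a, 1103515245UL);             /* example multiplier */
  gmp_randinit_lc_2exp (state, a, 12345UL, 64);  /* c = 12345, m2exp = 64 */
  mpz_clear (a);
@}
@end example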
@end deftypefun

@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, mp_bitcnt_t @var{size})
@cindex Linear congruential random numbers
Initialize @var{state} for a linear congruential algorithm as per
@code{gmp_randinit_lc_2exp}. @var{a}, @var{c} and @var{m2exp} are selected
from a table, chosen so that @var{size} bits (or more) of each @math{X} will
be used, i.e.@: @math{@var{m2exp}/2 @ge{} @var{size}}.

If successful the return value is non-zero. If @var{size} is bigger than the
table data provides then the return value is zero. The maximum @var{size}
currently supported is 128.
@end deftypefun

@deftypefun void gmp_randinit_set (gmp_randstate_t @var{rop}, gmp_randstate_t @var{op})
Initialize @var{rop} with a copy of the algorithm and state from @var{op}.
@end deftypefun

@c Although gmp_randinit, gmp_errno and related constants are obsolete, we
@c still put @findex entries for them, since they're still documented and
@c someone might be looking them up when perusing old application code.

@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, @dots{})
@strong{This function is obsolete.}

@findex GMP_RAND_ALG_LC
@findex GMP_RAND_ALG_DEFAULT
Initialize @var{state} with an algorithm selected by @var{alg}. The only
choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}
described above. A third parameter of type @code{unsigned long} is required,
this is the @var{size} for that function. @code{GMP_RAND_ALG_DEFAULT} or 0
are the same as @code{GMP_RAND_ALG_LC}.

@c For reference, this is the only place gmp_errno has been documented, and
@c due to being non thread safe we won't be adding to its uses.
@findex gmp_errno
@findex GMP_ERROR_UNSUPPORTED_ARGUMENT
@findex GMP_ERROR_INVALID_ARGUMENT
@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to
indicate an error. @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is
unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter
is too big. It may be noted this error reporting is not thread safe (a good
reason to use @code{gmp_randinit_lc_2exp_size} instead).
@end deftypefun

@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
Free all memory occupied by @var{state}.
@end deftypefun


@node Random State Seeding, Random State Miscellaneous, Random State Initialization, Random Number Functions
@section Random State Seeding
@cindex Random number seeding
@cindex Seeding random numbers

@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, const mpz_t @var{seed})
@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
Set an initial seed value into @var{state}.

The size of a seed determines how many different sequences of random numbers
it is possible to generate. The ``quality'' of the seed is the randomness
of a given seed compared to the previous seed used, and this affects the
randomness of separate number sequences. The method for choosing a seed is
critical if the generated numbers are to be used for important applications,
such as generating cryptographic keys.

Traditionally the system time has been used to seed, but care needs to be
taken with this. If an application seeds often and the resolution of the
system clock is low, then the same sequence of numbers might be repeated.
Also, the system time is quite easy to guess, so if unpredictability is
required then it should definitely not be the only source for the seed value.
On some systems there's a special device @file{/dev/random} which provides
random data better suited for use as a seed.
@end deftypefun


@node Random State Miscellaneous, , Random State Seeding, Random Number Functions
@section Random State Miscellaneous

@deftypefun {unsigned long} gmp_urandomb_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
Return a uniformly distributed random number of @var{n} bits, i.e.@: in the
range 0 to @m{2^n-1,2^@var{n}-1} inclusive. @var{n} must be less than or
equal to the number of bits in an @code{unsigned long}.
@end deftypefun

@deftypefun {unsigned long} gmp_urandomm_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
Return a uniformly distributed random number in the range 0 to
@math{@var{n}-1}, inclusive.
@end deftypefun


@node Formatted Output, Formatted Input, Random Number Functions, Top
@chapter Formatted Output
@cindex Formatted output
@cindex @code{printf} formatted output

@menu
* Formatted Output Strings::
* Formatted Output Functions::
* C++ Formatted Output::
@end menu

@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
@section Format Strings

@code{gmp_printf} and friends accept format strings similar to the standard C
@code{printf} (@pxref{Formatted Output,, Formatted Output, libc, The GNU C
Library Reference Manual}). A format specification is of the form

@example
% [flags] [width] [.[precision]] [type] conv
@end example

GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
and @code{mpf_t} respectively, @samp{M} for @code{mp_limb_t}, and @samp{N} for
an @code{mp_limb_t} array. @samp{Z}, @samp{Q}, @samp{M} and @samp{N} behave
like integers. @samp{Q} will print a @samp{/} and a denominator, if needed.
@samp{F} behaves like a float.
For example,

@example
mpz_t z;
gmp_printf ("%s is an mpz %Zd\n", "here", z);

mpq_t q;
gmp_printf ("a hex rational: %#40Qx\n", q);

mpf_t f;
int n;
gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);

mp_limb_t l;
gmp_printf ("limb %Mu\n", l);

const mp_limb_t *ptr;
mp_size_t size;
gmp_printf ("limb array %Nx\n", ptr, size);
@end example

For @samp{N} the limbs are expected least significant first, as per the
@code{mpn} functions (@pxref{Low-level Functions}). A negative size can be
given to print the value as a negative.

All the standard C @code{printf} types behave the same as the C library
@code{printf}, and can be freely intermixed with the GMP extensions. In the
current implementation the standard parts of the format string are simply
handed to @code{printf} and only the GMP extensions handled directly.

The flags accepted are as follows. GLIBC style @nisamp{'} is only for the
standard C types (not the GMP types), and only if the C library supports it.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{0} @tab pad with zeros (rather than spaces)
@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
@item @nicode{+} @tab always show a sign
@item (space) @tab show a space or a @samp{-} sign
@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
@end multitable
@end quotation

The optional width and precision can be given as a number within the format
string, or as a @samp{*} to take an extra parameter of type @code{int}, the
same as the standard @code{printf}.

The standard types accepted are as follows. @samp{h} and @samp{l} are
portable, the rest will depend on the compiler (or include files) for the type
and the C library for the output.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{h} @tab @nicode{short}
@item @nicode{hh} @tab @nicode{char}
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
@item @nicode{l} @tab @nicode{long} or @nicode{wchar_t}
@item @nicode{ll} @tab @nicode{long long}
@item @nicode{L} @tab @nicode{long double}
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
@item @nicode{t} @tab @nicode{ptrdiff_t}
@item @nicode{z} @tab @nicode{size_t}
@end multitable
@end quotation

@noindent
The GMP types are

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
@item @nicode{M} @tab @nicode{mp_limb_t}, integer conversions
@item @nicode{N} @tab @nicode{mp_limb_t} array, integer conversions
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
@end multitable
@end quotation

The conversions accepted are as follows. @samp{a} and @samp{A} are always
supported for @code{mpf_t} but depend on the C library for standard C float
types. @samp{m} and @samp{p} depend on the C library.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{a} @nicode{A} @tab hex floats, C99 style
@item @nicode{c} @tab character
@item @nicode{d} @tab decimal integer
@item @nicode{e} @nicode{E} @tab scientific format float
@item @nicode{f} @tab fixed point float
@item @nicode{i} @tab same as @nicode{d}
@item @nicode{g} @nicode{G} @tab fixed or scientific float
@item @nicode{m} @tab @code{strerror} string, GLIBC style
@item @nicode{n} @tab store characters written so far
@item @nicode{o} @tab octal integer
@item @nicode{p} @tab pointer
@item @nicode{s} @tab string
@item @nicode{u} @tab unsigned integer
@item @nicode{x} @nicode{X} @tab hex integer
@end multitable
@end quotation

@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
types @samp{Z}, @samp{Q} and @samp{N} they are signed. @samp{u} is not
meaningful for @samp{Z}, @samp{Q} and @samp{N}.

@samp{M} is a proxy for the C library @samp{l} or @samp{L}, according to the
size of @code{mp_limb_t}. Unsigned conversions will be usual, but a signed
conversion can be used and will interpret the value as a twos complement
negative.

@samp{n} can be used with any type, even the GMP types.

Other types or conversions that might be accepted by the C library
@code{printf} cannot be used through @code{gmp_printf}, this includes for
instance extensions registered with GLIBC @code{register_printf_function}.
Also currently there's no support for POSIX @samp{$} style numbered arguments
(perhaps this will be added in the future).

The precision field has its usual meaning for integer @samp{Z} and float
@samp{F} types, but is currently undefined for @samp{Q} and should not be used
with that.
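
The signed treatment of @samp{x} for the @samp{Z} type can be seen in a small
sketch such as the following. The helper name @code{format_hex} is
hypothetical, and @code{gmp_snprintf} is used so that the output cannot
overrun the buffer.

@example
#include <gmp.h>

/* Format `value' in hex into buf, returning the length that
   the complete output needs (excluding the null).  */
int
format_hex (char *buf, size_t size, long value)
@{
  mpz_t z;
  int len;

  mpz_init_set_si (z, value);
  len = gmp_snprintf (buf, size, "%Zx", z);  /* sign, then digits */
  mpz_clear (z);
  return len;
@}
@end example

With a value of @math{-255} the result is the string @samp{-ff}, rather than
the twos complement form a standard @samp{%lx} would give.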

@code{mpf_t} conversions only ever generate as many digits as can be
accurately represented by the operand, the same as @code{mpf_get_str} does.
Zeros will be used if necessary to pad to the requested precision. This
happens even for an @samp{f} conversion of an @code{mpf_t} which is an
integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits
precision will only produce about 40 digits, then pad with zeros to the
decimal point. An empty precision field like @samp{%.Fe} or @samp{%.Ff} can
be used to specifically request just the significant digits. Without any dot
and thus no precision field, a precision value of 6 will be used. Note that
these rules mean that @samp{%Ff}, @samp{%.Ff}, and @samp{%.0Ff} will all be
different.

The decimal point character (or string) is taken from the current locale
settings on systems which provide @code{localeconv} (@pxref{Locales,, Locales
and Internationalization, libc, The GNU C Library Reference Manual}). The C
library will normally do the same for standard float output.

The format string is only interpreted as plain @code{char}s, multibyte
characters are not recognised. Perhaps this will change in the future.


@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
@section Functions
@cindex Output functions

Each of the following functions is similar to the corresponding C library
function. The basic @code{printf} forms take a variable argument list. The
@code{vprintf} forms take an argument pointer, see @ref{Variadic Functions,,
Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
va_start}.

It should be emphasised that if a format string is invalid, or the arguments
don't match what the format specifies, then the behaviour of any of these
functions will be unpredictable.
GCC format string checking is not available,
since it doesn't recognise the GMP extensions.

The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
@math{-1} to indicate a write error. Output is not ``atomic'', so partial
output may be produced if a write error occurs. All the functions can return
@math{-1} if the C library @code{printf} variant in use returns @math{-1}, but
this shouldn't normally occur.

@deftypefun int gmp_printf (const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
Print to the standard output @code{stdout}. Return the number of characters
written, or @math{-1} if an error occurred.
@end deftypefun

@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
Print to the stream @var{fp}. Return the number of characters written, or
@math{-1} if an error occurred.
@end deftypefun

@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}. Return the number of characters
written, excluding the terminating null.

No overlap is permitted between the space at @var{buf} and the string
@var{fmt}.

These functions are not recommended, since there's no protection against
exceeding the space available at @var{buf}.
@end deftypefun

@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}. No more than @var{size} bytes
will be written.
To get the full output, @var{size} must be enough for the
string and null-terminator.

The return value is the total number of characters which ought to have been
produced, excluding the terminating null. If @math{@var{retval} @ge{}
@var{size}} then the actual output has been truncated to the first
@math{@var{size}-1} characters, and a null appended.

No overlap is permitted between the region @{@var{buf},@var{size}@} and the
@var{fmt} string.

Notice the return value is in ISO C99 @code{snprintf} style. This is so even
if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
@end deftypefun

@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in a block of memory obtained from the current
memory allocation function (@pxref{Custom Allocation}). The block will be the
size of the string and null-terminator. The address of the block is stored to
*@var{pp}. The return value is the number of characters produced, excluding
the null-terminator.

Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
@math{-1} if there's no more memory available, it lets the current allocation
function handle that.
@end deftypefun

@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
@cindex @code{obstack} output
Append to the current object in @var{ob}. The return value is the number of
characters written. A null-terminator is not written.

@var{fmt} cannot be within the current object in @var{ob}, since that object
might move as it grows.

These functions are available only when the C library provides the obstack
feature, which probably means only on GNU systems, see @ref{Obstacks,,
Obstacks, libc, The GNU C Library Reference Manual}.
@end deftypefun


@node C++ Formatted Output, , Formatted Output Functions, Formatted Output
@section C++ Formatted Output
@cindex C++ @code{ostream} output
@cindex @code{ostream} output

The following functions are provided in @file{libgmpxx} (@pxref{Headers and
Libraries}), which is built if C++ support is enabled (@pxref{Build Options}).
Prototypes are available from @code{<gmp.h>}.

@deftypefun ostream& operator<< (ostream& @var{stream}, const mpz_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

In hex or octal, @var{op} is printed as a signed number, the same as for
decimal. This is unlike the standard @code{operator<<} routines on @code{int}
etc, which instead give twos complement.
@end deftypefun

@deftypefun ostream& operator<< (ostream& @var{stream}, const mpq_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
just a plain integer like @samp{123}.

In hex or octal, @var{op} is printed as a signed value, the same as for
decimal. If @code{ios::showbase} is set then a base indicator is shown on
both the numerator and denominator (if the denominator is required).
@end deftypefun

@deftypefun ostream& operator<< (ostream& @var{stream}, const mpf_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

The decimal point follows the standard library float @code{operator<<}, which
on recent systems means the @code{std::locale} imbued on @var{stream}.

Hex and octal are supported, unlike the standard @code{operator<<} on
@code{double}. The mantissa will be in hex or octal, the exponent will be in
decimal. For hex the exponent delimiter is an @samp{@@}. This is as per
@code{mpf_out_str}.

@code{ios::showbase} is supported, and will put a base on the mantissa, for
example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
This last form is slightly strange, but at least differentiates itself from
decimal.
@end deftypefun

These operators mean that GMP types can be printed in the usual C++ way, for
example,

@example
mpz_t z;
int n;
...
cout << "iteration " << n << " value " << z << "\n";
@end example

But note that @code{ostream} output (and @code{istream} input, @pxref{C++
Formatted Input}) is the only overloading available for the GMP types and that
for instance using @code{+} with an @code{mpz_t} will have unpredictable
results. For classes with overloading, see @ref{C++ Class Interface}.


@node Formatted Input, C++ Class Interface, Formatted Output, Top
@chapter Formatted Input
@cindex Formatted input
@cindex @code{scanf} formatted input

@menu
* Formatted Input Strings::
* Formatted Input Functions::
* C++ Formatted Input::
@end menu


@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
@section Formatted Input Strings

@code{gmp_scanf} and friends accept format strings similar to the standard C
@code{scanf} (@pxref{Formatted Input,, Formatted Input, libc, The GNU C
Library Reference Manual}).
A format specification is of the form

@example
% [flags] [width] [type] conv
@end example

GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers.
@samp{Q} will read a @samp{/} and a denominator, if present. @samp{F} behaves
like a float.

GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
they're already ``call-by-reference''. For example,

@example
/* to read say "a(5) = 1234" */
int n;
mpz_t z;
gmp_scanf ("a(%d) = %Zd\n", &n, z);

mpq_t q1, q2;
gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);

/* to read say "topleft (1.55,-2.66)" */
mpf_t x, y;
char buf[32];
gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
@end example

All the standard C @code{scanf} types behave the same as in the C library
@code{scanf}, and can be freely intermixed with the GMP extensions. In the
current implementation the standard parts of the format string are simply
handed to @code{scanf} and only the GMP extensions handled directly.

The flags accepted are as follows. @samp{a} and @samp{'} will depend on
support from the C library, and @samp{'} cannot be used with GMP types.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{*} @tab read but don't store
@item @nicode{a} @tab allocate a buffer (string conversions)
@item @nicode{'} @tab grouped digits, GLIBC style (not GMP types)
@end multitable
@end quotation

The standard types accepted are as follows. @samp{h} and @samp{l} are
portable, the rest will depend on the compiler (or include files) for the type
and the C library for the input.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{h} @tab @nicode{short}
@item @nicode{hh} @tab @nicode{char}
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
@item @nicode{l} @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
@item @nicode{ll} @tab @nicode{long long}
@item @nicode{L} @tab @nicode{long double}
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
@item @nicode{t} @tab @nicode{ptrdiff_t}
@item @nicode{z} @tab @nicode{size_t}
@end multitable
@end quotation

@noindent
The GMP types are

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
@end multitable
@end quotation

The conversions accepted are as follows. @samp{p} and @samp{[} will depend on
support from the C library, the rest are standard.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{c} @tab character or characters
@item @nicode{d} @tab decimal integer
@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G}
 @tab float
@item @nicode{i} @tab integer with base indicator
@item @nicode{n} @tab characters read so far
@item @nicode{o} @tab octal integer
@item @nicode{p} @tab pointer
@item @nicode{s} @tab string of non-whitespace characters
@item @nicode{u} @tab decimal integer
@item @nicode{x} @nicode{X} @tab hex integer
@item @nicode{[} @tab string of characters in a set
@end multitable
@end quotation

@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
read either fixed point or scientific format, and either upper or lower case
@samp{e} for the exponent in scientific format.

C99 style hex float format (@code{printf %a}, @pxref{Formatted Output
Strings}) is always accepted for @code{mpf_t}, but for the standard float
types it will depend on the C library.

@samp{x} and @samp{X} are identical, both accept both upper and lower case
hexadecimal.

@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
values. For the standard C types these are described as ``unsigned''
conversions, but that merely affects certain overflow handling, negatives are
still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of
Integers, libc, The GNU C Library Reference Manual}). For GMP types there are
no overflows, so @samp{d} and @samp{u} are identical.

@samp{Q} type reads the numerator and (optional) denominator as given. If the
value might not be in canonical form then @code{mpq_canonicalize} must be
called before using it in any calculations (@pxref{Rational Number
Functions}).

@samp{Qi} will read a base specification separately for the numerator and
denominator. For example @samp{0x10/11} would be 16/11, whereas
@samp{0x10/0x11} would be 16/17.

@samp{n} can be used with any of the types above, even the GMP types.
@samp{*} to suppress assignment is allowed, though in that case it would do
nothing at all.

Other conversions or types that might be accepted by the C library
@code{scanf} cannot be used through @code{gmp_scanf}.

Whitespace is read and discarded before a field, except for @samp{c} and
@samp{[} conversions.

For float conversions, the decimal point character (or string) expected is
taken from the current locale settings on systems which provide
@code{localeconv} (@pxref{Locales,, Locales and Internationalization, libc,
The GNU C Library Reference Manual}). The C library will normally do the same
for standard float input.
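
The need to canonicalize after a @samp{Q} read, mentioned above, can be
sketched as follows. The helper name @code{parse_rational} is hypothetical,
and the explicit rejection of a zero denominator is an added precaution of
this sketch rather than something @code{gmp_sscanf} is documented to do.

@example
#include <gmp.h>

/* Read a rational such as "0x10/0x20" from s into q, which must
   already be initialized, and reduce it to canonical form.
   Return 1 on success, 0 on failure.  */
int
parse_rational (mpq_t q, const char *s)
@{
  if (gmp_sscanf (s, "%Qi", q) != 1)
    return 0;                          /* no rational was parsed */
  if (mpz_sgn (mpq_denref (q)) == 0)
    return 0;                          /* cannot canonicalize x/0 */
  mpq_canonicalize (q);                /* remove common factors */
  return 1;
@}
@end example

An input of @samp{0x10/0x20} is read as 16/32 and leaves @code{q} holding
1/2 after the call to @code{mpq_canonicalize}.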

The format string is only interpreted as plain @code{char}s, multibyte
characters are not recognised.  Perhaps this will change in the future.


@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
@section Formatted Input Functions
@cindex Input functions

Each of the following functions is similar to the corresponding C library
function.  The plain @code{scanf} forms take a variable argument list.  The
@code{vscanf} forms take an argument pointer, see @ref{Variadic Functions,,
Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
va_start}.

It should be emphasised that if a format string is invalid, or the arguments
don't match what the format specifies, then the behaviour of any of these
functions will be unpredictable.  GCC format string checking is not available,
since it doesn't recognise the GMP extensions.

No overlap is permitted between the @var{fmt} string and any of the results
produced.

@deftypefun int gmp_scanf (const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
Read from the standard input @code{stdin}.
@end deftypefun

@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
Read from the stream @var{fp}.
@end deftypefun

@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
Read from a null-terminated string @var{s}.
@end deftypefun

The return value from each of these functions is the same as the standard C99
@code{scanf}, namely the number of fields successfully parsed and stored.
@samp{%n} fields and fields read but suppressed by @samp{*} don't count
towards the return value.

If end of input (or a file error) is reached before a character for a field or
a literal, and if no previous non-suppressed fields have matched, then the
return value is @code{EOF} instead of 0.  A whitespace character in the format
string is only an optional match and doesn't induce an @code{EOF} in this
fashion.  Leading whitespace read and discarded for a field doesn't count as
characters for that field.

For the GMP types, input parsing follows C99 rules, namely one character of
lookahead is used and characters are read while they continue to meet the
format requirements.  If this doesn't provide a complete number then the
function terminates, with that field not stored nor counted towards the return
value.  For instance with @code{mpf_t} an input @samp{1.23e-XYZ} would be read
up to the @samp{X} and that character pushed back since it's not a digit.  The
string @samp{1.23e-} would then be considered invalid since an @samp{e} must
be followed by at least one digit.

For the standard C types, in the current implementation GMP calls the C
library @code{scanf} functions, which might have looser rules about what
constitutes a valid input.

Note that @code{gmp_sscanf} is the same as @code{gmp_fscanf} and only does one
character of lookahead when parsing.  Although clearly it could look at its
entire input, it is deliberately made identical to @code{gmp_fscanf}, the same
way C99 @code{sscanf} is the same as @code{fscanf}.
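
Since the return value convention is that of C99 @code{scanf}, it can be
sketched with the standard @code{sscanf} alone.  The helper names
@code{count_two_ints} and @code{count_with_suppressed} are hypothetical; the
same counts would be expected from @code{gmp_sscanf} given the same format
strings, since standard C types are passed through to the C library.

```cpp
#include <cstdio>

// Hypothetical helper: try to read two decimal integers and return the
// sscanf result, i.e. the number of fields stored, or EOF.
int count_two_ints(const char *input) {
    int first = 0, second = 0;
    return std::sscanf(input, "%d %d", &first, &second);
}

// Hypothetical helper: the second integer is suppressed with '*' and the
// characters consumed are recorded with %n.  Neither of those two
// directives counts towards the return value.
int count_with_suppressed(const char *input) {
    int first = 0, consumed = 0;
    return std::sscanf(input, "%d %*d%n", &first, &consumed);
}
```

With this sketch @code{count_two_ints ("12 34")} gives 2, an input of
@samp{12 x} gives 1, an input of @samp{x} gives 0, and an empty or
all-whitespace input gives @code{EOF}, since the input ends before any field
has matched.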


@node C++ Formatted Input,  , Formatted Input Functions, Formatted Input
@section C++ Formatted Input
@cindex C++ @code{istream} input
@cindex @code{istream} input

The following functions are provided in @file{libgmpxx} (@pxref{Headers and
Libraries}), which is built only if C++ support is enabled (@pxref{Build
Options}).  Prototypes are available from @code{<gmp.h>}.

@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
An integer like @samp{123} will be read, or a fraction like @samp{5/9}.  No
whitespace is allowed around the @samp{/}.  If the fraction is not in
canonical form then @code{mpq_canonicalize} must be called (@pxref{Rational
Number Functions}) before operating on it.

As per integer input, an @samp{0} or @samp{0x} base indicator is read when
none of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} are set.  This is
done separately for numerator and denominator, so that for instance
@samp{0x10/11} is @math{16/11} and @samp{0x10/0x11} is @math{16/17}.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.

Hex or octal floats are not supported, but might be in the future, or perhaps
it's best to accept only what the standard float @code{operator>>} does.
@end deftypefun

Note that digit grouping specified by the @code{istream} locale is currently
not accepted.  Perhaps this will change in the future.

@sp 1
These operators mean that GMP types can be read in the usual C++ way, for
example,

@example
mpz_t  z;
...
cin >> z;
@end example

But note that @code{istream} input (and @code{ostream} output, @pxref{C++
Formatted Output}) is the only overloading available for the GMP types and
that for instance using @code{+} with an @code{mpz_t} will have unpredictable
results.  For classes with overloading, see @ref{C++ Class Interface}.



@node C++ Class Interface, Custom Allocation, Formatted Input, Top
@chapter C++ Class Interface
@cindex C++ interface

This chapter describes the C++ class based interface to GMP.

All GMP C language types and functions can be used in C++ programs, since
@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
overloaded functions and operators which may be more convenient.

Due to the implementation of this interface, a reasonably recent C++ compiler
is required, one supporting namespaces, partial specialization of templates
and member templates.

@strong{Everything described in this chapter is to be considered preliminary
and might be subject to incompatible changes if some unforeseen difficulty
reveals itself.}

@menu
* C++ Interface General::
* C++ Interface Integers::
* C++ Interface Rationals::
* C++ Interface Floats::
* C++ Interface Random Numbers::
* C++ Interface Limitations::
@end menu


@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
@section C++ Interface General

@noindent
All the C++ classes and functions are available with

@cindex @code{gmpxx.h}
@example
#include <gmpxx.h>
@end example

Programs should be linked with the @file{libgmpxx} and @file{libgmp}
libraries.
For example,

@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example

@noindent
The classes defined are

@deftp Class mpz_class
@deftpx Class mpq_class
@deftpx Class mpf_class
@end deftp

The standard operators and various standard functions are overloaded to allow
arithmetic with these classes.  For example,

@example
#include <iostream>
#include <gmpxx.h>
using namespace std;

int
main (void)
@{
  mpz_class a, b, c;

  a = 1234;
  b = "-5678";
  c = a+b;
  cout << "sum is " << c << "\n";
  cout << "absolute value is " << abs(c) << "\n";

  return 0;
@}
@end example

An important feature of the implementation is that an expression like
@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
without using a temporary for the @code{b+c} part.  Expressions which by their
nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
though.

The classes can be freely intermixed in expressions, as can the classes and
the standard types @code{long}, @code{unsigned long} and @code{double}.
Smaller types like @code{int} or @code{float} can also be intermixed, since
C++ will promote them.

Note that @code{bool} is not accepted directly, but must be explicitly cast to
an @code{int} first.  This is because C++ will automatically convert any
pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
sorts of invalid class and pointer combinations compile but almost certainly
not do anything sensible.

Conversions back from the classes to standard C++ types aren't done
automatically, instead member functions like @code{get_si} are provided (see
the following sections for details).

Also there are no automatic conversions from the classes to the corresponding
GMP C types, instead a reference to the underlying C object can be obtained
with the following functions,

@deftypefun mpz_t mpz_class::get_mpz_t ()
@deftypefunx mpq_t mpq_class::get_mpq_t ()
@deftypefunx mpf_t mpf_class::get_mpf_t ()
@end deftypefun

These can be used to call a C function which doesn't have a C++ class
interface.  For example to set @code{a} to the GCD of @code{b} and @code{c},

@example
mpz_class  a, b, c;
...
mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
@end example

In the other direction, a class can be initialized from the corresponding GMP
C type, or assigned to if an explicit constructor is used.  In both cases this
makes a copy of the value, it doesn't create any sort of association.  For
example,

@example
mpz_t  z;
// ... init and calculate z ...
mpz_class  x(z);
mpz_class  y;
y = mpz_class (z);
@end example

There are no namespace setups in @file{gmpxx.h}, all types and functions are
simply put into the global namespace.  This is what @file{gmp.h} has done in
the past, and continues to do for compatibility.  The extras provided by
@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
anything.


@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
@section C++ Interface Integers

@deftypefun {} mpz_class::mpz_class (type @var{n})
Construct an @code{mpz_class}.  All the standard C++ types may be used, except
@code{long long} and @code{long double}, and all the GMP C++ classes can be
used, although conversions from @code{mpq_class} and @code{mpf_class} are
@code{explicit}.
Any necessary conversion follows the corresponding C
function, for example @code{double} follows @code{mpz_set_d}
(@pxref{Assigning Integers}).
@end deftypefun

@deftypefun explicit mpz_class::mpz_class (const mpz_t @var{z})
Construct an @code{mpz_class} from an @code{mpz_t}.  The value in @var{z} is
copied into the new @code{mpz_class}, there won't be any permanent association
between it and @var{z}.
@end deftypefun

@deftypefun explicit mpz_class::mpz_class (const char *@var{s}, int @var{base} = 0)
@deftypefunx explicit mpz_class::mpz_class (const string& @var{s}, int @var{base} = 0)
Construct an @code{mpz_class} converted from a string using @code{mpz_set_str}
(@pxref{Assigning Integers}).

If the string is not a valid integer, an @code{std::invalid_argument}
exception is thrown.  The same applies to @code{operator=}.
@end deftypefun

@deftypefun mpz_class operator"" _mpz (const char *@var{str})
With C++11 compilers, integers can be constructed with the syntax
@code{123_mpz} which is equivalent to @code{mpz_class("123")}.
@end deftypefun

@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
Divisions involving @code{mpz_class} round towards zero, as per the
@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
This is the same as the C99 @code{/} and @code{%} operators.

The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called
directly if desired.  For example,

@example
mpz_class  q, a, d;
...
mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
@end example
@end deftypefun

@deftypefun mpz_class abs (mpz_class @var{op})
@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
@maybepagebreak
@deftypefunx bool mpz_class::fits_sint_p (void)
@deftypefunx bool mpz_class::fits_slong_p (void)
@deftypefunx bool mpz_class::fits_sshort_p (void)
@maybepagebreak
@deftypefunx bool mpz_class::fits_uint_p (void)
@deftypefunx bool mpz_class::fits_ulong_p (void)
@deftypefunx bool mpz_class::fits_ushort_p (void)
@maybepagebreak
@deftypefunx double mpz_class::get_d (void)
@deftypefunx long mpz_class::get_si (void)
@deftypefunx string mpz_class::get_str (int @var{base} = 10)
@deftypefunx {unsigned long} mpz_class::get_ui (void)
@maybepagebreak
@deftypefunx int mpz_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpz_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpz_class @var{op})
@deftypefunx mpz_class sqrt (mpz_class @var{op})
@maybepagebreak
@deftypefunx mpz_class gcd (mpz_class @var{op1}, mpz_class @var{op2})
@deftypefunx mpz_class lcm (mpz_class @var{op1}, mpz_class @var{op2})
@deftypefunx mpz_class mpz_class::factorial (type @var{op})
@deftypefunx mpz_class factorial (mpz_class @var{op})
@deftypefunx mpz_class mpz_class::primorial (type @var{op})
@deftypefunx mpz_class primorial (mpz_class @var{op})
@deftypefunx mpz_class mpz_class::fibonacci (type @var{op})
@deftypefunx mpz_class fibonacci (mpz_class @var{op})
@maybepagebreak
@deftypefunx void mpz_class::swap (mpz_class& @var{op})
@deftypefunx void swap (mpz_class& @var{op1}, mpz_class& @var{op2})
These functions provide a C++ class interface to the corresponding GMP C
routines.
Calling @code{factorial} or @code{primorial} on a negative number
is undefined.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.
@end deftypefun

@sp 1
Overloaded operators for combinations of @code{mpz_class} and @code{double}
are provided for completeness, but it should be noted that if the given
@code{double} is not an integer then the way any rounding is done is currently
unspecified.  The rounding might take place at the start, in the middle, or at
the end of the operation, and it might change in the future.

Conversions between @code{mpz_class} and @code{double}, however, are defined
to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
And comparisons are always made exactly, as per @code{mpz_cmp_d}.


@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
@section C++ Interface Rationals

In all the following constructors, if a fraction is given then it should be in
canonical form, or if not then @code{mpq_class::canonicalize} called.

@deftypefun {} mpq_class::mpq_class (type @var{op})
@deftypefunx {} mpq_class::mpq_class (integer @var{num}, integer @var{den})
Construct an @code{mpq_class}.  The initial value can be a single value of any
type (conversion from @code{mpf_class} is @code{explicit}), or a pair of
integers (@code{mpz_class} or standard C++ integer types) representing a
fraction, except that @code{long long} and @code{long double} are not
supported.  For example,

@example
mpq_class q (99);
mpq_class q (1.75);
mpq_class q (1, 3);
@end example
@end deftypefun

@deftypefun explicit mpq_class::mpq_class (const mpq_t @var{q})
Construct an @code{mpq_class} from an @code{mpq_t}.
The value in @var{q} is
copied into the new @code{mpq_class}, there won't be any permanent association
between it and @var{q}.
@end deftypefun

@deftypefun explicit mpq_class::mpq_class (const char *@var{s}, int @var{base} = 0)
@deftypefunx explicit mpq_class::mpq_class (const string& @var{s}, int @var{base} = 0)
Construct an @code{mpq_class} converted from a string using @code{mpq_set_str}
(@pxref{Initializing Rationals}).

If the string is not a valid rational, an @code{std::invalid_argument}
exception is thrown.  The same applies to @code{operator=}.
@end deftypefun

@deftypefun mpq_class operator"" _mpq (const char *@var{str})
With C++11 compilers, integral rationals can be constructed with the syntax
@code{123_mpq} which is equivalent to @code{mpq_class(123_mpz)}.  Other
rationals can be built as @code{-1_mpq/2} or @code{0xb_mpq/123456_mpz}.
@end deftypefun

@deftypefun void mpq_class::canonicalize ()
Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
Functions}.  All arithmetic operators require their operands in canonical
form, and will return results in canonical form.
@end deftypefun

@deftypefun mpq_class abs (mpq_class @var{op})
@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
@maybepagebreak
@deftypefunx double mpq_class::get_d (void)
@deftypefunx string mpq_class::get_str (int @var{base} = 10)
@maybepagebreak
@deftypefunx int mpq_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpq_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpq_class @var{op})
@maybepagebreak
@deftypefunx void mpq_class::swap (mpq_class& @var{op})
@deftypefunx void swap (mpq_class& @var{op1}, mpq_class& @var{op2})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.
@end deftypefun

@deftypefun {mpz_class&} mpq_class::get_num ()
@deftypefunx {mpz_class&} mpq_class::get_den ()
Get a reference to an @code{mpz_class} which is the numerator or denominator
of an @code{mpq_class}.  This can be used both for read and write access.  If
the object returned is modified, it modifies the original @code{mpq_class}.

If direct manipulation might produce a non-canonical value, then
@code{mpq_class::canonicalize} must be called before further operations.
@end deftypefun

@deftypefun mpz_t mpq_class::get_num_mpz_t ()
@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
Get a reference to the underlying @code{mpz_t} numerator or denominator of an
@code{mpq_class}.  This can be passed to C functions expecting an
@code{mpz_t}.  Any modifications made to the @code{mpz_t} will modify the
original @code{mpq_class}.

If direct manipulation might produce a non-canonical value, then
@code{mpq_class::canonicalize} must be called before further operations.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).

If the @var{rop} read might not be in canonical form then
@code{mpq_class::canonicalize} must be called.
@end deftypefun


@node C++ Interface Floats, C++ Interface Random Numbers, C++ Interface Rationals, C++ Class Interface
@section C++ Interface Floats

When an expression requires the use of temporary intermediate @code{mpf_class}
values, like @code{f=g*h+x*y}, those temporaries will have the same precision
as the destination @code{f}.  Explicit constructors can be used if this
doesn't suit.

@deftypefun {} mpf_class::mpf_class (type @var{op})
@deftypefunx {} mpf_class::mpf_class (type @var{op}, mp_bitcnt_t @var{prec})
Construct an @code{mpf_class}.  Any standard C++ type can be used, except
@code{long long} and @code{long double}, and any of the GMP C++ classes can be
used.

If @var{prec} is given, the initial precision is that value, in bits.  If
@var{prec} is not given, then the initial precision is determined by the type
of @var{op} given.  An @code{mpz_class}, @code{mpq_class}, or C++
builtin type will give the default @code{mpf} precision (@pxref{Initializing
Floats}).  An @code{mpf_class} or expression will give the precision of that
value.  The precision of a binary expression is the higher of the two
operands.

@example
mpf_class f(1.5);        // default precision
mpf_class f(1.5, 500);   // 500 bits (at least)
mpf_class f(x);          // precision of x
mpf_class f(abs(x));     // precision of x
mpf_class f(-g, 1000);   // 1000 bits (at least)
mpf_class f(x+y);        // greater of precisions of x and y
@end example
@end deftypefun

@deftypefun explicit mpf_class::mpf_class (const mpf_t @var{f})
@deftypefunx {} mpf_class::mpf_class (const mpf_t @var{f}, mp_bitcnt_t @var{prec})
Construct an @code{mpf_class} from an @code{mpf_t}.  The value in @var{f} is
copied into the new @code{mpf_class}, there won't be any permanent association
between it and @var{f}.

If @var{prec} is given, the initial precision is that value, in bits.  If
@var{prec} is not given, then the initial precision is that of @var{f}.
@end deftypefun

@deftypefun explicit mpf_class::mpf_class (const char *@var{s})
@deftypefunx {} mpf_class::mpf_class (const char *@var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
@deftypefunx explicit mpf_class::mpf_class (const string& @var{s})
@deftypefunx {} mpf_class::mpf_class (const string& @var{s}, mp_bitcnt_t @var{prec}, int @var{base} = 0)
Construct an @code{mpf_class} converted from a string using @code{mpf_set_str}
(@pxref{Assigning Floats}).  If @var{prec} is given, the initial precision is
that value, in bits.  If not, the default @code{mpf} precision
(@pxref{Initializing Floats}) is used.

If the string is not a valid float, an @code{std::invalid_argument} exception
is thrown.  The same applies to @code{operator=}.
@end deftypefun

@deftypefun mpf_class operator"" _mpf (const char *@var{str})
With C++11 compilers, floats can be constructed with the syntax
@code{1.23e-1_mpf} which is equivalent to @code{mpf_class("1.23e-1")}.
@end deftypefun

@deftypefun {mpf_class&} mpf_class::operator= (type @var{op})
Convert and store the given @var{op} value to an @code{mpf_class} object.  The
same types are accepted as for the constructors above.

Note that @code{operator=} only stores a new value, it doesn't copy or change
the precision of the destination, instead the value is truncated if necessary.
This is the same as @code{mpf_set} etc.  Note in particular this means for
@code{mpf_class} a copy constructor is not the same as a default constructor
plus assignment.

@example
mpf_class x (y);   // x created with precision of y

mpf_class x;       // x created with default precision
x = y;             // value truncated to that precision
@end example

Applications using templated code may need to be careful about the assumptions
the code makes in this area, when working with @code{mpf_class} values of
various different or non-default precisions.  For instance implementations of
the standard @code{complex} template have been seen in both styles above,
though of course @code{complex} is normally only actually specified for use
with the builtin float types.
@end deftypefun

@deftypefun mpf_class abs (mpf_class @var{op})
@deftypefunx mpf_class ceil (mpf_class @var{op})
@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2})
@maybepagebreak
@deftypefunx bool mpf_class::fits_sint_p (void)
@deftypefunx bool mpf_class::fits_slong_p (void)
@deftypefunx bool mpf_class::fits_sshort_p (void)
@maybepagebreak
@deftypefunx bool mpf_class::fits_uint_p (void)
@deftypefunx bool mpf_class::fits_ulong_p (void)
@deftypefunx bool mpf_class::fits_ushort_p (void)
@maybepagebreak
@deftypefunx mpf_class floor (mpf_class @var{op})
@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2})
@maybepagebreak
@deftypefunx double mpf_class::get_d (void)
@deftypefunx long mpf_class::get_si (void)
@deftypefunx string mpf_class::get_str (mp_exp_t& @var{exp}, int @var{base} = 10, size_t @var{digits} = 0)
@deftypefunx {unsigned long} mpf_class::get_ui (void)
@maybepagebreak
@deftypefunx int mpf_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpf_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpf_class @var{op})
@deftypefunx mpf_class sqrt (mpf_class @var{op})
@maybepagebreak
@deftypefunx void mpf_class::swap (mpf_class& @var{op})
@deftypefunx void swap (mpf_class& @var{op1}, mpf_class& @var{op2})
@deftypefunx mpf_class trunc (mpf_class @var{op})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.

The accuracy provided by @code{hypot} is not currently guaranteed.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpf_class::get_prec ()
@deftypefunx void mpf_class::set_prec (mp_bitcnt_t @var{prec})
@deftypefunx void mpf_class::set_prec_raw (mp_bitcnt_t @var{prec})
Get or set the current precision of an @code{mpf_class}.

The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
Floats}) apply to @code{mpf_class::set_prec_raw}.  Note in particular that the
@code{mpf_class} must be restored to its allocated precision before being
destroyed.  This must be done by application code, there's no automatic
mechanism for it.
@end deftypefun


@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface Floats, C++ Class Interface
@section C++ Interface Random Numbers

@deftp Class gmp_randclass
The C++ class interface to the GMP random number functions uses
@code{gmp_randclass} to hold an algorithm selection and current state, as per
@code{gmp_randstate_t}.
@end deftp

@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, @dots{}), @dots{})
Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
function (@pxref{Random State Initialization}).  The arguments expected are
the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
For example,

@example
gmp_randclass r1 (gmp_randinit_default);
gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
gmp_randclass r4 (gmp_randinit_mt);
@end example

@code{gmp_randinit_lc_2exp_size} will fail if the size requested is too big,
an @code{std::length_error} exception is thrown in that case.
@end deftypefun

@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, @dots{})
Construct a @code{gmp_randclass} using the same parameters as
@code{gmp_randinit} (@pxref{Random State Initialization}).  This function is
obsolete and the above @var{randinit} style should be preferred.
@end deftypefun

@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
Seed a random number generator.  @xref{Random Number Functions}, for how
to choose a good seed.
@end deftypefun

@deftypefun mpz_class gmp_randclass::get_z_bits (mp_bitcnt_t @var{bits})
@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
Generate a random integer with a specified number of bits.
@end deftypefun

@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
@end deftypefun

@deftypefun mpf_class gmp_randclass::get_f ()
@deftypefunx mpf_class gmp_randclass::get_f (mp_bitcnt_t @var{prec})
Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}.  @var{f}
will be to @var{prec} bits precision, or if @var{prec} is not given then to
the precision of the destination.  For example,

@example
gmp_randclass  r;
...
mpf_class  f (0, 512);   // 512 bits precision
f = r.get_f();           // random number, 512 bits
@end example
@end deftypefun



@node C++ Interface Limitations,  , C++ Interface Random Numbers, C++ Class Interface
@section C++ Interface Limitations

@table @asis
@item @code{mpq_class} and Templated Reading
A generic piece of template code probably won't know that @code{mpq_class}
requires a @code{canonicalize} call if inputs read with @code{operator>>}
might be non-canonical.  This can lead to incorrect results.

@code{operator>>} behaves as it does for reasons of efficiency.  A
canonicalize can be quite time consuming on large operands, and is best
avoided if it's not necessary.

But this potential difficulty reduces the usefulness of @code{mpq_class}.
Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
pressed into service.  Or maybe, at the risk of inconsistency, the
@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
@code{operator>>} not doing so, for use on those occasions when that's
acceptable.  Send feedback or alternate ideas to @email{gmp-bugs@@gmplib.org}.

@item Subclassing
Subclassing the GMP C++ classes works, but is not currently recommended.

Expressions involving subclasses resolve correctly (or seem to), but in normal
C++ fashion the subclass doesn't inherit constructors and assignments.
There are many of those in the GMP classes, and a good way to reestablish them
in a subclass is not yet provided.

@item Templated Expressions
A subtle difficulty exists when using expressions together with
application-defined template functions.  Consider the following, with @code{T}
intended to be some numeric type,

@example
template <class T>
T fun (const T &, const T &);
@end example

@noindent
When used with, say, plain @code{mpz_class} variables, it works fine: @code{T}
is resolved as @code{mpz_class}.

@example
mpz_class f(1), g(2);
fun (f, g);    // Good
@end example

@noindent
But when one of the arguments is an expression, it doesn't work.

@example
mpz_class f(1), g(2), h(3);
fun (f, g+h);  // Bad
@end example

This is because @code{g+h} ends up being a certain expression template type
internal to @code{gmpxx.h}, which the C++ template resolution rules are unable
to automatically convert to @code{mpz_class}.  The workaround is simply to add
an explicit cast.

@example
mpz_class f(1), g(2), h(3);
fun (f, mpz_class(g+h));  // Good
@end example

Similarly, within @code{fun} it may be necessary to cast an expression to type
@code{T} when calling a templated @code{fun2}.

@example
template <class T>
void fun (T f, T g)
@{
  fun2 (f, f+g);     // Bad
@}

template <class T>
void fun (T f, T g)
@{
  fun2 (f, T(f+g));  // Good
@}
@end example

@item C++11
C++11 provides several new ways in which types can be inferred: @code{auto},
@code{decltype}, etc.  While they can be very convenient, they don't mix well
with expression templates.  In this example, the addition is performed twice,
as if we had defined @code{sum} as a macro.

@example
mpz_class z = 33;
auto sum = z + z;
mpz_class prod = sum * sum;
@end example

This other example may crash, though some compilers might make it look like
it is working, because the expression @code{z+z} goes out of scope before it
is evaluated.

@example
mpz_class z = 33;
auto sum = z + z + z;
mpz_class prod = sum * 2;
@end example

It is thus strongly recommended to avoid @code{auto} anywhere a GMP C++
expression may appear.
@end table


@node Custom Allocation, Language Bindings, C++ Class Interface, Top
@comment  node-name,  next,  previous,  up
@chapter Custom Allocation
@cindex Custom allocation
@cindex Memory allocation
@cindex Allocation of memory

By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory
allocation, and if they fail GMP prints a message to the standard error output
and terminates the program.

Alternate functions can be specified, to allocate memory in a different way or
to have a different error action on running out of memory.

@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t))
Replace the current allocation functions with those given by the arguments.
If an argument is @code{NULL}, the corresponding default function is used.

These functions will be used for all memory allocation done by GMP, apart from
temporary space from @code{alloca} if that function is available and GMP is
configured to use it (@pxref{Build Options}).

@strong{Be sure to call @code{mp_set_memory_functions} only when there are no
active GMP objects allocated using the previous memory functions!  Usually
that means calling it before any other GMP function.}
@end deftypefun

The functions supplied should fit the following declarations:

@deftypevr Function {void *} allocate_function (size_t @var{alloc_size})
Return a pointer to newly allocated space with at least @var{alloc_size}
bytes.
@end deftypevr

@deftypevr Function {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size})
Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be
@var{new_size} bytes.
7379 7380The block may be moved if necessary or if desired, and in that case the 7381smaller of @var{old_size} and @var{new_size} bytes must be copied to the new 7382location. The return value is a pointer to the resized block, that being the 7383new location if moved or just @var{ptr} if not. 7384 7385@var{ptr} is never @code{NULL}, it's always a previously allocated block. 7386@var{new_size} may be bigger or smaller than @var{old_size}. 7387@end deftypevr 7388 7389@deftypevr Function void free_function (void *@var{ptr}, size_t @var{size}) 7390De-allocate the space pointed to by @var{ptr}. 7391 7392@var{ptr} is never @code{NULL}, it's always a previously allocated block of 7393@var{size} bytes. 7394@end deftypevr 7395 7396A @dfn{byte} here means the unit used by the @code{sizeof} operator. 7397 7398The @var{reallocate_function} parameter @var{old_size} and the 7399@var{free_function} parameter @var{size} are passed for convenience, but of 7400course they can be ignored if not needed by an implementation. The default 7401functions using @code{malloc} and friends for instance don't use them. 7402 7403No error return is allowed from any of these functions, if they return then 7404they must have performed the specified operation. In particular note that 7405@var{allocate_function} or @var{reallocate_function} mustn't return 7406@code{NULL}. 7407 7408Getting a different fatal error action is a good use for custom allocation 7409functions, for example giving a graphical dialog rather than the default print 7410to @code{stderr}. How much is possible when genuinely out of memory is 7411another question though. 7412 7413There's currently no defined way for the allocation functions to recover from 7414an error such as out of memory, they must terminate program execution. A 7415@code{longjmp} or throwing a C++ exception will have undefined results. This 7416may change in the future. 7417 7418GMP may use allocated blocks to hold pointers to other allocated blocks. 
This 7419will limit the assumptions a conservative garbage collection scheme can make. 7420 7421Since the default GMP allocation uses @code{malloc} and friends, those 7422functions will be linked in even if the first thing a program does is an 7423@code{mp_set_memory_functions}. It's necessary to change the GMP sources if 7424this is a problem. 7425 7426@sp 1 7427@deftypefun void mp_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (**@var{free_func_ptr}) (void *, size_t)) 7428Get the current allocation functions, storing function pointers to the 7429locations given by the arguments. If an argument is @code{NULL}, that 7430function pointer is not stored. 7431 7432@need 1000 7433For example, to get just the current free function, 7434 7435@example 7436void (*freefunc) (void *, size_t); 7437 7438mp_get_memory_functions (NULL, NULL, &freefunc); 7439@end example 7440@end deftypefun 7441 7442@node Language Bindings, Algorithms, Custom Allocation, Top 7443@chapter Language Bindings 7444@cindex Language bindings 7445@cindex Other languages 7446 7447The following packages and projects offer access to GMP from languages other 7448than C, though perhaps with varying levels of functionality and efficiency. 7449 7450@c @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces 7451@c in tex, just to separate the URL from the preceding text a bit. 7452@iftex 7453@macro spaceuref {U} 7454@ @ @uref{\U\} 7455@end macro 7456@end iftex 7457@ifnottex 7458@macro spaceuref {U} 7459@uref{\U\} 7460@end macro 7461@end ifnottex 7462 7463@sp 1 7464@table @asis 7465@item C++ 7466@itemize @bullet 7467@item 7468GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward 7469interface, expression templates to eliminate temporaries. 7470@item 7471ALP @spaceuref{https://www-sop.inria.fr/saga/logiciels/ALP/} @* Linear algebra and 7472polynomials using templates. 
7473@item 7474CLN @spaceuref{https://www.ginac.de/CLN/} @* High level classes for arithmetic. 7475@item 7476Linbox @spaceuref{http://www.linalg.org/} @* Sparse vectors and matrices. 7477@item 7478NTL @spaceuref{http://www.shoup.net/ntl/} @* A C++ number theory library. 7479@end itemize 7480 7481@c @item D 7482@c @itemize @bullet 7483@c @item 7484@c gmp-d @spaceuref{http://home.comcast.net/~benhinkle/gmp-d/} 7485@c @end itemize 7486 7487@item Eiffel 7488@itemize @bullet 7489@item 7490Eiffelroom @spaceuref{http://www.eiffelroom.org/node/442} 7491@end itemize 7492 7493@c @item Fortran 7494@c @itemize @bullet 7495@c @item 7496@c Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary 7497@c precision floats. 7498@c @end itemize 7499 7500@item Haskell 7501@itemize @bullet 7502@item 7503Glasgow Haskell Compiler @spaceuref{https://www.haskell.org/ghc/} 7504@end itemize 7505 7506@item Java 7507@itemize @bullet 7508@item 7509Kaffe @spaceuref{https://github.com/kaffe/kaffe} 7510@end itemize 7511 7512@item Lisp 7513@itemize @bullet 7514@item 7515GNU Common Lisp @spaceuref{https://www.gnu.org/software/gcl/gcl.html} 7516@item 7517Librep @spaceuref{http://librep.sourceforge.net/} 7518@item 7519@c FIXME: When there's a stable release with gmp support, just refer to it 7520@c rather than bothering to talk about betas. 7521XEmacs (21.5.18 beta and up) @spaceuref{https://www.xemacs.org} @* Optional 7522big integers, rationals and floats using GMP. 7523@end itemize 7524 7525@item ML 7526@itemize @bullet 7527@item 7528MLton compiler @spaceuref{http://mlton.org/} 7529@end itemize 7530 7531@item Objective Caml 7532@itemize @bullet 7533@item 7534MLGMP @spaceuref{https://opam.ocaml.org/packages/mlgmp/} 7535@item 7536Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using 7537GMP. 
7538@end itemize 7539 7540@item Oz 7541@itemize @bullet 7542@item 7543Mozart @spaceuref{https://mozart.github.io/} 7544@end itemize 7545 7546@item Pascal 7547@itemize @bullet 7548@item 7549GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de/} @* GMP unit. 7550@item 7551Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal, 7552optionally using GMP. 7553@end itemize 7554 7555@item Perl 7556@itemize @bullet 7557@item 7558GMP module, see @file{demos/perl} in the GMP sources (@pxref{Demonstration 7559Programs}). 7560@item 7561Math::GMP @spaceuref{https://www.cpan.org/} @* Compatible with Math::BigInt, but 7562not as many functions as the GMP module above. 7563@item 7564Math::BigInt::GMP @spaceuref{https://www.cpan.org/} @* Plug Math::GMP into 7565normal Math::BigInt operations. 7566@end itemize 7567 7568@need 1000 7569@item Pike 7570@itemize @bullet 7571@item 7572pikempz module in the standard distribution, @uref{https://pike.lysator.liu.se/} 7573@end itemize 7574 7575@need 500 7576@item Prolog 7577@itemize @bullet 7578@item 7579SWI Prolog @spaceuref{http://www.swi-prolog.org/} @* 7580Arbitrary precision floats. 7581@end itemize 7582 7583@item Python 7584@itemize @bullet 7585@item 7586GMPY @uref{https://code.google.com/p/gmpy/} 7587@end itemize 7588 7589@item Ruby 7590@itemize @bullet 7591@item 7592@uref{https://rubygems.org/gems/gmp} 7593@end itemize 7594 7595@item Scheme 7596@itemize @bullet 7597@item 7598GNU Guile @spaceuref{https://www.gnu.org/software/guile/guile.html} 7599@item 7600RScheme @spaceuref{https://www.rscheme.org/} 7601@item 7602STklos @spaceuref{http://www.stklos.net/} 7603@c 7604@c For reference, MzScheme uses some of gmp, but (as of version 205) it only 7605@c has copies of some of the generic C code, and we don't consider that a 7606@c language binding to gmp. 
@c
@end itemize

@item Smalltalk
@itemize @bullet
@item
GNU Smalltalk @spaceuref{http://smalltalk.gnu.org/}
@end itemize

@item Other
@itemize @bullet
@item
Axiom @uref{https://savannah.nongnu.org/projects/axiom} @* Computer algebra
using GCL.
@item
DrGenius @spaceuref{http://drgenius.seul.org/} @* Geometry system and
mathematical programming language.
@item
GiNaC @spaceuref{https://www.ginac.de/} @* C++ computer algebra using CLN.
@item
GOO @spaceuref{https://www.eecs.berkeley.edu/~jrb/goo/} @* Dynamic object oriented
language.
@item
Maxima @uref{https://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma
computer algebra using GCL.
@c @item
@c Q @spaceuref{http://q-lang.sourceforge.net/} @* Equational programming system.
@item
Regina @spaceuref{http://regina.sourceforge.net/} @* Topological calculator.
@item
Yacas @spaceuref{http://yacas.sourceforge.net} @* Yet another computer algebra system.
@end itemize

@end table


@node Algorithms, Internals, Language Bindings, Top
@chapter Algorithms
@cindex Algorithms

This chapter is an introduction to some of the algorithms used for various GMP
operations.  The code is likely to be hard to understand without knowing
something about the algorithms.

Some GMP internals are mentioned, but applications that expect to be
compatible with future GMP releases should take care to use only the
documented functions.
7654 7655@menu 7656* Multiplication Algorithms:: 7657* Division Algorithms:: 7658* Greatest Common Divisor Algorithms:: 7659* Powering Algorithms:: 7660* Root Extraction Algorithms:: 7661* Radix Conversion Algorithms:: 7662* Other Algorithms:: 7663* Assembly Coding:: 7664@end menu 7665 7666 7667@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms 7668@section Multiplication 7669@cindex Multiplication algorithms 7670 7671N@cross{}N limb multiplications and squares are done using one of seven 7672algorithms, as the size N increases. 7673 7674@quotation 7675@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 7676@item Algorithm @tab Threshold 7677@item Basecase @tab (none) 7678@item Karatsuba @tab @code{MUL_TOOM22_THRESHOLD} 7679@item Toom-3 @tab @code{MUL_TOOM33_THRESHOLD} 7680@item Toom-4 @tab @code{MUL_TOOM44_THRESHOLD} 7681@item Toom-6.5 @tab @code{MUL_TOOM6H_THRESHOLD} 7682@item Toom-8.5 @tab @code{MUL_TOOM8H_THRESHOLD} 7683@item FFT @tab @code{MUL_FFT_THRESHOLD} 7684@end multitable 7685@end quotation 7686 7687Similarly for squaring, with the @code{SQR} thresholds. 7688 7689N@cross{}M multiplications of operands with different sizes above 7690@code{MUL_TOOM22_THRESHOLD} are currently done by special Toom-inspired 7691algorithms or directly with FFT, depending on operand size (@pxref{Unbalanced 7692Multiplication}). 
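
As a rough illustration of how the table is read, the following C sketch picks
an algorithm by comparing the operand size in limbs against each threshold in
turn.  The numeric values, the @code{mul_algorithm} type and the
@code{choose_mul_algorithm} function are invented for this example; real
thresholds are tuned per CPU, and the dispatch code in the library may be
organised differently.

@example
#include <stddef.h>

/* Hypothetical threshold values, for illustration only.  The real
   ones are machine dependent.  */
#define MUL_TOOM22_THRESHOLD     20
#define MUL_TOOM33_THRESHOLD     70
#define MUL_TOOM44_THRESHOLD    200
#define MUL_TOOM6H_THRESHOLD    300
#define MUL_TOOM8H_THRESHOLD    450
#define MUL_FFT_THRESHOLD      5000

enum mul_algorithm
@{
  MUL_BASECASE, MUL_KARATSUBA, MUL_TOOM3, MUL_TOOM4,
  MUL_TOOM6H, MUL_TOOM8H, MUL_FFT
@};

/* Choose the algorithm for an n x n limb product.  Each algorithm is
   used from its own threshold up to the next one.  */
enum mul_algorithm
choose_mul_algorithm (size_t n)
@{
  if (n < MUL_TOOM22_THRESHOLD) return MUL_BASECASE;
  if (n < MUL_TOOM33_THRESHOLD) return MUL_KARATSUBA;
  if (n < MUL_TOOM44_THRESHOLD) return MUL_TOOM3;
  if (n < MUL_TOOM6H_THRESHOLD) return MUL_TOOM4;
  if (n < MUL_TOOM8H_THRESHOLD) return MUL_TOOM6H;
  if (n < MUL_FFT_THRESHOLD)    return MUL_TOOM8H;
  return MUL_FFT;
@}
@end example

With these made-up values a 100 limb product would use Toom-3, and anything of
5000 limbs or more would go to the FFT.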
7693 7694@menu 7695* Basecase Multiplication:: 7696* Karatsuba Multiplication:: 7697* Toom 3-Way Multiplication:: 7698* Toom 4-Way Multiplication:: 7699* Higher degree Toom'n'half:: 7700* FFT Multiplication:: 7701* Other Multiplication:: 7702* Unbalanced Multiplication:: 7703@end menu 7704 7705 7706@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms 7707@subsection Basecase Multiplication 7708 7709Basecase N@cross{}M multiplication is a straightforward rectangular set of 7710cross-products, the same as long multiplication done by hand and for that 7711reason sometimes known as the schoolbook or grammar school method. This is an 7712@m{O(NM),O(N*M)} algorithm. See Knuth section 4.3.1 algorithm M 7713(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code. 7714 7715Assembly implementations of @code{mpn_mul_basecase} are essentially the same 7716as the generic C code, but have all the usual assembly tricks and 7717obscurities introduced for speed. 7718 7719A square can be done in roughly half the time of a multiply, by using the fact 7720that the cross products above and below the diagonal are the same. A triangle 7721of products below the diagonal is formed, doubled (left shift by one bit), and 7722then the products on the diagonal added. This can be seen in 7723@file{mpn/generic/sqr_basecase.c}. Again the assembly implementations take 7724essentially the same approach. 
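
The two procedures just described can be sketched in portable C as below.
This is only an illustration, not the code in
@file{mpn/generic/mul_basecase.c} or @file{mpn/generic/sqr_basecase.c}: the
@code{limb} type, the assumption of 32-bit limbs with a 64-bit intermediate
type, and the function names @code{mul_basecase} and @code{sqr_basecase} are
all choices made for the example.  Limbs are stored least significant first.

@example
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb;   /* hypothetical 32-bit limb */

/* rp[0..un+vn-1] = up[0..un-1] * vp[0..vn-1], one row of
   cross-products per limb of vp.  */
void
mul_basecase (limb *rp, const limb *up, size_t un,
              const limb *vp, size_t vn)
@{
  size_t i, j;

  for (i = 0; i < un + vn; i++)
    rp[i] = 0;
  for (j = 0; j < vn; j++)
    @{
      uint64_t carry = 0;
      for (i = 0; i < un; i++)
        @{
          uint64_t t = (uint64_t) up[i] * vp[j] + rp[i + j] + carry;
          rp[i + j] = (limb) t;
          carry = t >> 32;
        @}
      rp[j + un] = (limb) carry;
    @}
@}

/* rp[0..2n-1] = up[0..n-1]^2, using the triangle of products on one
   side of the diagonal, doubled, plus the diagonal squares.  */
void
sqr_basecase (limb *rp, const limb *up, size_t n)
@{
  size_t i, j;
  uint64_t carry;

  for (i = 0; i < 2 * n; i++)
    rp[i] = 0;

  /* Products up[i]*up[j] for i < j, each formed only once.  */
  for (j = 1; j < n; j++)
    @{
      carry = 0;
      for (i = 0; i < j; i++)
        @{
          uint64_t t = (uint64_t) up[i] * up[j] + rp[i + j] + carry;
          rp[i + j] = (limb) t;
          carry = t >> 32;
        @}
      rp[2 * j] = (limb) carry;
    @}

  /* Double the triangle, a left shift by one bit.  */
  carry = 0;
  for (i = 0; i < 2 * n; i++)
    @{
      limb high_bit = rp[i] >> 31;
      rp[i] = (rp[i] << 1) | (limb) carry;
      carry = high_bit;
    @}

  /* Add the diagonal products up[i]^2 at limb position 2*i.  */
  carry = 0;
  for (i = 0; i < n; i++)
    @{
      uint64_t sq = (uint64_t) up[i] * up[i];
      uint64_t lo = (uint64_t) rp[2 * i] + (limb) sq + carry;
      uint64_t hi = (uint64_t) rp[2 * i + 1] + (sq >> 32) + (lo >> 32);
      rp[2 * i] = (limb) lo;
      rp[2 * i + 1] = (limb) hi;
      carry = hi >> 32;
    @}
@}
@end example

For an @math{n} limb operand the squaring sketch forms @math{n(n-1)/2}
off-diagonal products plus @math{n} diagonal ones, against @math{n^2} for the
general multiply, which is where the saving of roughly half comes from.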
7725 7726@tex 7727\def\GMPline#1#2#3#4#5#6{% 7728 \hbox {% 7729 \vrule height 2.5ex depth 1ex 7730 \hbox to 2em {\hfil{#2}\hfil}% 7731 \vrule \hbox to 2em {\hfil{#3}\hfil}% 7732 \vrule \hbox to 2em {\hfil{#4}\hfil}% 7733 \vrule \hbox to 2em {\hfil{#5}\hfil}% 7734 \vrule \hbox to 2em {\hfil{#6}\hfil}% 7735 \vrule}} 7736\GMPdisplay{ 7737 \hbox{% 7738 \vbox{% 7739 \hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}% 7740 \hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}% 7741 \hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}% 7742 \hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}% 7743 \hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}% 7744 \hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}% 7745 \vfill}% 7746 \vbox{% 7747 \hbox{% 7748 \hbox to 2em {\hfil u0\hfil}% 7749 \hbox to 2em {\hfil u1\hfil}% 7750 \hbox to 2em {\hfil u2\hfil}% 7751 \hbox to 2em {\hfil u3\hfil}% 7752 \hbox to 2em {\hfil u4\hfil}}% 7753 \vskip 0.7ex 7754 \hrule 7755 \GMPline{u0}{d}{}{}{}{}% 7756 \hrule 7757 \GMPline{u1}{}{d}{}{}{}% 7758 \hrule 7759 \GMPline{u2}{}{}{d}{}{}% 7760 \hrule 7761 \GMPline{u3}{}{}{}{d}{}% 7762 \hrule 7763 \GMPline{u4}{}{}{}{}{d}% 7764 \hrule}}} 7765@end tex 7766@ifnottex 7767@example 7768@group 7769 u0 u1 u2 u3 u4 7770 +---+---+---+---+---+ 7771u0 | d | | | | | 7772 +---+---+---+---+---+ 7773u1 | | d | | | | 7774 +---+---+---+---+---+ 7775u2 | | | d | | | 7776 +---+---+---+---+---+ 7777u3 | | | | d | | 7778 +---+---+---+---+---+ 7779u4 | | | | | d | 7780 +---+---+---+---+---+ 7781@end group 7782@end example 7783@end ifnottex 7784 7785In practice squaring isn't a full 2@cross{} faster than multiplying, it's 7786usually around 1.5@cross{}. Less than 1.5@cross{} probably indicates 7787@code{mpn_sqr_basecase} wants improving on that CPU. 7788 7789On some CPUs @code{mpn_mul_basecase} can be faster than the generic C 7790@code{mpn_sqr_basecase} on some small sizes. 
@code{SQR_BASECASE_THRESHOLD} is 7791the size at which to use @code{mpn_sqr_basecase}, this will be zero if that 7792routine should be used always. 7793 7794 7795@node Karatsuba Multiplication, Toom 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms 7796@subsection Karatsuba Multiplication 7797@cindex Karatsuba multiplication 7798 7799The Karatsuba multiplication algorithm is described in Knuth section 4.3.3 7800part A, and various other textbooks. A brief description is given here. 7801 7802The inputs @math{x} and @math{y} are treated as each split into two parts of 7803equal length (or the most significant part one limb shorter if N is odd). 7804 7805@tex 7806% GMPboxwidth used for all the multiplication pictures 7807\global\newdimen\GMPboxwidth \global\GMPboxwidth=5em 7808% GMPboxdepth and GMPboxheight are also used for the float pictures 7809\global\newdimen\GMPboxdepth \global\GMPboxdepth=1ex 7810\global\newdimen\GMPboxheight \global\GMPboxheight=2ex 7811\gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth} 7812\def\GMPbox#1#2{% 7813 \vbox {% 7814 \hrule 7815 \hbox to 2\GMPboxwidth{% 7816 \GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}% 7817 \hrule}} 7818\GMPdisplay{% 7819\vbox{% 7820 \hbox to 2\GMPboxwidth {high \hfil low} 7821 \vskip 0.7ex 7822 \GMPbox{x_1}{x_0} 7823 \vskip 0.5ex 7824 \GMPbox{y_1}{y_0} 7825}} 7826@end tex 7827@ifnottex 7828@example 7829@group 7830 high low 7831+----------+----------+ 7832| x1 | x0 | 7833+----------+----------+ 7834 7835+----------+----------+ 7836| y1 | y0 | 7837+----------+----------+ 7838@end group 7839@end example 7840@end ifnottex 7841 7842Let @math{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is 7843@math{k} limbs (@ms{y,0} the same) then 7844@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. 
7845With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the 7846following holds, 7847 7848@display 7849@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0, 7850 x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0} 7851@end display 7852 7853This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs, 7854whereas a basecase multiply of N@cross{}N limbs is equivalent to four 7855multiplies of (N/2)@cross{}(N/2). The factors @math{(b^2+b)} etc represent 7856the positions where the three products must be added. 7857 7858@tex 7859\def\GMPboxA#1#2{% 7860 \vbox{% 7861 \hrule 7862 \hbox{% 7863 \GMPvrule 7864 \hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}% 7865 \vrule 7866 \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% 7867 \vrule} 7868 \hrule}} 7869\def\GMPboxB#1#2{% 7870 \hbox{% 7871 \raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}% 7872 \vbox{% 7873 \hrule 7874 \hbox{% 7875 \GMPvrule 7876 \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% 7877 \vrule}% 7878 \hrule}}} 7879\GMPdisplay{% 7880\vbox{% 7881 \hbox to 4\GMPboxwidth {high \hfil low} 7882 \vskip 0.7ex 7883 \GMPboxA{x_1y_1}{x_0y_0} 7884 \vskip 0.5ex 7885 \GMPboxB{$+$}{x_1y_1} 7886 \vskip 0.5ex 7887 \GMPboxB{$+$}{x_0y_0} 7888 \vskip 0.5ex 7889 \GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)} 7890}} 7891@end tex 7892@ifnottex 7893@example 7894@group 7895 high low 7896+--------+--------+ +--------+--------+ 7897| x1*y1 | | x0*y0 | 7898+--------+--------+ +--------+--------+ 7899 +--------+--------+ 7900 add | x1*y1 | 7901 +--------+--------+ 7902 +--------+--------+ 7903 add | x0*y0 | 7904 +--------+--------+ 7905 +--------+--------+ 7906 sub | (x1-x0)*(y1-y0) | 7907 +--------+--------+ 7908@end group 7909@end example 7910@end ifnottex 7911 7912The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an 7913absolute value, and the sign used to choose to add or subtract. 
Notice the 7914sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1), 7915high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb 7916additions, rather than @m{6k,6*k}, but in GMP extra function call overheads 7917outweigh the saving. 7918 7919Squaring is similar to multiplying, but with @math{x=y} the formula reduces to 7920an equivalent with three squares, 7921 7922@display 7923@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2, 7924 x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2} 7925@end display 7926 7927The final result is accumulated from those three squares the same way as for 7928the three multiplies above. The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now 7929always positive. 7930 7931A similar formula for both multiplying and squaring can be constructed with a 7932middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}. But those sums can exceed 7933@math{k} limbs, leading to more carry handling and additions than the form 7934above. 7935 7936Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm, 7937the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies 7938each @math{1/2} the size of the inputs. This is a big improvement over the 7939basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra 7940additions Karatsuba performs. @code{MUL_TOOM22_THRESHOLD} can be as little 7941as 10 limbs. The @code{SQR} threshold is usually about twice the @code{MUL}. 7942 7943The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c, 7944M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN + 7945e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 + 7946{3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}. The 7947factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the 7948basecase code will increase the threshold since they benefit @math{M(N)} more 7949than @math{K(N)}. 
And conversely the @m{3\over2, 3/2} for @math{b} means
linear style speedups of @math{b} will decrease the threshold since they
benefit @math{K(N)} more than @math{M(N)}.  The latter can be seen for
instance when adding an optimized @code{mpn_sqr_diagonal} to
@code{mpn_sqr_basecase}.  Of course all speedups reduce total time, and in
that sense the algorithm thresholds are merely of academic interest.


@node Toom 3-Way Multiplication, Toom 4-Way Multiplication, Karatsuba Multiplication, Multiplication Algorithms
@subsection Toom 3-Way Multiplication
@cindex Toom multiplication

The Karatsuba formula is the simplest case of a general approach to splitting
inputs that leads to both Toom and FFT algorithms.  A description of
Toom can be found in Knuth section 4.3.3, with an example 3-way
calculation after Theorem A@.  The 3-way form used in GMP is described here.

The operands are each considered split into 3 pieces of equal length (or the
most significant part 1 or 2 limbs shorter than the other two).
7968 7969@tex 7970\def\GMPbox#1#2#3{% 7971 \vbox{% 7972 \hrule \vfil 7973 \hbox to 3\GMPboxwidth {% 7974 \GMPvrule 7975 \hfil$#1$\hfil 7976 \vrule 7977 \hfil$#2$\hfil 7978 \vrule 7979 \hfil$#3$\hfil 7980 \vrule}% 7981 \vfil \hrule 7982}} 7983\GMPdisplay{% 7984\vbox{% 7985 \hbox to 3\GMPboxwidth {high \hfil low} 7986 \vskip 0.7ex 7987 \GMPbox{x_2}{x_1}{x_0} 7988 \vskip 0.5ex 7989 \GMPbox{y_2}{y_1}{y_0} 7990 \vskip 0.5ex 7991}} 7992@end tex 7993@ifnottex 7994@example 7995@group 7996 high low 7997+----------+----------+----------+ 7998| x2 | x1 | x0 | 7999+----------+----------+----------+ 8000 8001+----------+----------+----------+ 8002| y2 | y1 | y0 | 8003+----------+----------+----------+ 8004@end group 8005@end example 8006@end ifnottex 8007 8008@noindent 8009These parts are treated as the coefficients of two polynomials 8010 8011@display 8012@group 8013@m{X(t) = x_2t^2 + x_1t + x_0, 8014 X(t) = x2*t^2 + x1*t + x0} 8015@m{Y(t) = y_2t^2 + y_1t + y_0, 8016 Y(t) = y2*t^2 + y1*t + y0} 8017@end group 8018@end display 8019 8020Let @math{b} equal the power of 2 which is the size of the @ms{x,0}, @ms{x,1}, 8021@ms{y,0} and @ms{y,1} pieces, i.e.@: if they're @math{k} limbs each then 8022@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. 8023With this @math{x=X(b)} and @math{y=Y(b)}. 8024 8025Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients 8026are 8027 8028@display 8029@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0, 8030 W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0} 8031@end display 8032 8033The @m{w_i,w[i]} are going to be determined, and when they are they'll give 8034the final result using @math{w=W(b)}, since 8035@m{xy=X(b)Y(b),x*y=X(b)*Y(b)=W(b)}. 
The coefficients will be roughly 8036@math{b^2} each, and the final @math{W(b)} will be an addition like, 8037 8038@tex 8039\def\GMPbox#1#2{% 8040 \moveright #1\GMPboxwidth 8041 \vbox{% 8042 \hrule 8043 \hbox{% 8044 \GMPvrule 8045 \hbox to 2\GMPboxwidth {\hfil$#2$\hfil}% 8046 \vrule}% 8047 \hrule 8048}} 8049\GMPdisplay{% 8050\vbox{% 8051 \hbox to 6\GMPboxwidth {high \hfil low}% 8052 \vskip 0.7ex 8053 \GMPbox{0}{w_4} 8054 \vskip 0.5ex 8055 \GMPbox{1}{w_3} 8056 \vskip 0.5ex 8057 \GMPbox{2}{w_2} 8058 \vskip 0.5ex 8059 \GMPbox{3}{w_1} 8060 \vskip 0.5ex 8061 \GMPbox{4}{w_0} 8062}} 8063@end tex 8064@ifnottex 8065@example 8066@group 8067 high low 8068+-------+-------+ 8069| w4 | 8070+-------+-------+ 8071 +--------+-------+ 8072 | w3 | 8073 +--------+-------+ 8074 +--------+-------+ 8075 | w2 | 8076 +--------+-------+ 8077 +--------+-------+ 8078 | w1 | 8079 +--------+-------+ 8080 +-------+-------+ 8081 | w0 | 8082 +-------+-------+ 8083@end group 8084@end example 8085@end ifnottex 8086 8087The @m{w_i,w[i]} coefficients could be formed by a simple set of cross 8088products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2}, 8089@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all 8090nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely 8091to a basecase multiply. Instead the following approach is used. 8092 8093@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving 8094values of @math{W(t)} at those points. 
In GMP the following points are used, 8095 8096@quotation 8097@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 8098@item Point @tab Value 8099@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately 8100@item @math{t=1} @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0) * (y2+y1+y0)} 8101@item @math{t=-1} @tab @m{(x_2-x_1+x_0)(y_2-y_1+y_0),(x2-x1+x0) * (y2-y1+y0)} 8102@item @math{t=2} @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0) * (4*y2+2*y1+y0)} 8103@item @m{t=\infty,t=inf} @tab @m{x_2y_2,x2 * y2}, which gives @ms{w,4} immediately 8104@end multitable 8105@end quotation 8106 8107At @math{t=-1} the values can be negative and that's handled using the 8108absolute values and tracking the sign separately. At @m{t=\infty,t=inf} the 8109value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over t^4}, X(t)*Y(t)/t^4 in 8110the limit as t approaches infinity}, but it's much easier to think of as 8111simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately (much like 8112@m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately). 8113 8114Each of the points substituted into 8115@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination 8116of the @m{w_i,w[i]} coefficients, and the value of those combinations has just 8117been calculated. 
8118 8119@tex 8120\GMPdisplay{% 8121$\matrix{% 8122W(0) & = & & & & & & & & & w_0 \cr 8123W(1) & = & w_4 & + & w_3 & + & w_2 & + & w_1 & + & w_0 \cr 8124W(-1) & = & w_4 & - & w_3 & + & w_2 & - & w_1 & + & w_0 \cr 8125W(2) & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr 8126W(\infty) & = & w_4 \cr 8127}$} 8128@end tex 8129@ifnottex 8130@example 8131@group 8132W(0) = w0 8133W(1) = w4 + w3 + w2 + w1 + w0 8134W(-1) = w4 - w3 + w2 - w1 + w0 8135W(2) = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0 8136W(inf) = w4 8137@end group 8138@end example 8139@end ifnottex 8140 8141This is a set of five equations in five unknowns, and some elementary linear 8142algebra quickly isolates each @m{w_i,w[i]}. This involves adding or 8143subtracting one @math{W(t)} value from another, and a couple of divisions by 8144powers of 2 and one division by 3, the latter using the special 8145@code{mpn_divexact_by3} (@pxref{Exact Division}). 8146 8147The conversion of @math{W(t)} values to the coefficients is interpolation. A 8148polynomial of degree 4 like @math{W(t)} is uniquely determined by values known 8149at 5 different points. The points are arbitrary and can be chosen to make the 8150linear equations come out with a convenient set of steps for quickly isolating 8151the @m{w_i,w[i]}. 8152 8153Squaring follows the same procedure as multiplication, but there's only one 8154@math{X(t)} and it's evaluated at the 5 points, and those values squared to 8155give values of @math{W(t)}. The interpolation is then identical, and in fact 8156the same @code{toom_interpolate_5pts} subroutine is used for both squaring and 8157multiplying. 8158 8159Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being 8160@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the 8161original size each. This is an improvement over Karatsuba at 8162@math{O(N^@W{1.585})}, though Toom does more work in the evaluation and 8163interpolation and so it only realizes its advantage above a certain size. 
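
The evaluate, multiply and interpolate steps can be tried out on small numbers
with the following C sketch.  It is an illustration under simplifying
assumptions rather than the GMP implementation: the pieces are small enough to
fit in an @code{int64_t}, signed arithmetic stands in for the separate sign
tracking at @math{t=-1}, an ordinary division stands in for
@code{mpn_divexact_by3}, and the interpolation sequence is just one possible
ordering of the elementary steps.  The names @code{toom3_coefficients} and
@code{toom3_mul} are invented for the example.

@example
#include <assert.h>
#include <stdint.h>

/* Given X(t) = x[2]*t^2 + x[1]*t + x[0] and Y(t) likewise, store the
   coefficients of W(t) = X(t)*Y(t) in w[0..4], using only five
   multiplications of pieces.  */
void
toom3_coefficients (int64_t w[5], const int64_t x[3], const int64_t y[3])
@{
  /* Evaluation and pointwise multiplication at 0, 1, -1, 2, inf.  */
  int64_t v0   = x[0] * y[0];
  int64_t v1   = (x[2] + x[1] + x[0]) * (y[2] + y[1] + y[0]);
  int64_t vm1  = (x[2] - x[1] + x[0]) * (y[2] - y[1] + y[0]);
  int64_t v2   = (4 * x[2] + 2 * x[1] + x[0])
                 * (4 * y[2] + 2 * y[1] + y[0]);
  int64_t vinf = x[2] * y[2];

  /* Interpolation.  */
  int64_t even = (v1 + vm1) / 2;     /* w4 + w2 + w0 */
  int64_t odd  = (v1 - vm1) / 2;     /* w3 + w1 */
  int64_t t;

  w[0] = v0;
  w[4] = vinf;
  w[2] = even - w[4] - w[0];
  t = (v2 - w[0] - 16 * w[4] - 4 * w[2]) / 2;   /* 4*w3 + w1 */
  assert ((t - odd) % 3 == 0);       /* the division by 3 is exact */
  w[3] = (t - odd) / 3;
  w[1] = odd - w[3];
@}

/* Product of x = X(b) and y = Y(b), recombined as W(b).  */
int64_t
toom3_mul (const int64_t x[3], const int64_t y[3], int64_t b)
@{
  int64_t w[5], r = 0;
  int i;

  toom3_coefficients (w, x, y);
  for (i = 4; i >= 0; i--)
    r = r * b + w[i];
  return r;
@}
@end example

For instance with @math{b=100}, splitting 123456 into the pieces 12, 34, 56
and 654321 into 65, 43, 21 gives the coefficients @math{w_4=780},
@math{w_3=2726}, @math{w_2=5354}, @math{w_1=3122} and @math{w_0=1176}.  Each
coefficient is larger than @math{b}, so the final recombination involves
overlapping additions, as in the diagram of @math{W(b)} above.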
8164 8165Near the crossover between Toom-3 and Karatsuba there's generally a range of 8166sizes where the difference between the two is small. 8167@code{MUL_TOOM33_THRESHOLD} is a somewhat arbitrary point in that range and 8168successive runs of the tune program can give different values due to small 8169variations in measuring. A graph of time versus size for the two shows the 8170effect, see @file{tune/README}. 8171 8172At the fairly small sizes where the Toom-3 thresholds occur it's worth 8173remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be 8174expected to make accurate predictions, due of course to the big influence of 8175all sorts of overheads, and the fact that only a few recursions of each are 8176being performed. Even at large sizes there's a good chance machine dependent 8177effects like cache architecture will mean actual performance deviates from 8178what might be predicted. 8179 8180The formula given for the Karatsuba algorithm (@pxref{Karatsuba 8181Multiplication}) has an equivalent for Toom-3 involving only five multiplies, 8182but this would be complicated and unenlightening. 8183 8184An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using 8185a vector to represent the @math{x} and @math{y} splits and a matrix 8186multiplication for the evaluation and interpolation stages. The matrix 8187inverses are not meant to be actually used, and they have elements with values 8188much greater than in fact arise in the interpolation steps. The diagram shown 8189for the 3-way is attractive, but again doesn't have to be implemented that way 8190and for example with a bit of rearrangement just one division by 6 can be 8191done. 
8192 8193 8194@node Toom 4-Way Multiplication, Higher degree Toom'n'half, Toom 3-Way Multiplication, Multiplication Algorithms 8195@subsection Toom 4-Way Multiplication 8196@cindex Toom multiplication 8197 8198Karatsuba and Toom-3 split the operands into 2 and 3 coefficients, 8199respectively. Toom-4 analogously splits the operands into 4 coefficients. 8200Using the notation from the section on Toom-3 multiplication, we form two 8201polynomials: 8202 8203@display 8204@group 8205@m{X(t) = x_3t^3 + x_2t^2 + x_1t + x_0, 8206 X(t) = x3*t^3 + x2*t^2 + x1*t + x0} 8207@m{Y(t) = y_3t^3 + y_2t^2 + y_1t + y_0, 8208 Y(t) = y3*t^3 + y2*t^2 + y1*t + y0} 8209@end group 8210@end display 8211 8212@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 7 points, giving 8213values of @math{W(t)} at those points. In GMP the following points are used, 8214 8215@quotation 8216@multitable {@m{t=-1/2,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 8217@item Point @tab Value 8218@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately 8219@item @math{t=1/2} @tab @m{(x_3+2x_2+4x_1+8x_0)(y_3+2y_2+4y_1+8y_0),(x3+2*x2+4*x1+8*x0) * (y3+2*y2+4*y1+8*y0)} 8220@item @math{t=-1/2} @tab @m{(-x_3+2x_2-4x_1+8x_0)(-y_3+2y_2-4y_1+8y_0),(-x3+2*x2-4*x1+8*x0) * (-y3+2*y2-4*y1+8*y0)} 8221@item @math{t=1} @tab @m{(x_3+x_2+x_1+x_0)(y_3+y_2+y_1+y_0),(x3+x2+x1+x0) * (y3+y2+y1+y0)} 8222@item @math{t=-1} @tab @m{(-x_3+x_2-x_1+x_0)(-y_3+y_2-y_1+y_0),(-x3+x2-x1+x0) * (-y3+y2-y1+y0)} 8223@item @math{t=2} @tab @m{(8x_3+4x_2+2x_1+x_0)(8y_3+4y_2+2y_1+y_0),(8*x3+4*x2+2*x1+x0) * (8*y3+4*y2+2*y1+y0)} 8224@item @m{t=\infty,t=inf} @tab @m{x_3y_3,x3 * y3}, which gives @ms{w,6} immediately 8225@end multitable 8226@end quotation 8227 8228The number of additions and subtractions for Toom-4 is much larger than for Toom-3. 8229But several subexpressions occur multiple times, for example @m{x_2+x_0,x2+x0}, occurs 8230for both @math{t=1} and @math{t=-1}. 
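
One way to exploit the repeated subexpressions is to form the even-index and
odd-index parts once for each pair of points @math{+t} and @math{-t}, then
take their sum and difference.  The C sketch below does this for the seven
points in the table.  It is an illustration only: the pieces are plain
@code{int64_t} values, the function name @code{toom4_evaluate} is invented for
the example, and the evaluation code in GMP may be arranged differently.

@example
#include <stdint.h>

/* Evaluate X(t) = x[3]*t^3 + x[2]*t^2 + x[1]*t + x[0] at the points
   0, 1/2, -1/2, 1, -1, 2, inf, in that order.  The values at +-1/2
   are scaled by 8 to stay integral, and the value at inf is the
   leading coefficient.  */
void
toom4_evaluate (int64_t v[7], const int64_t x[4])
@{
  int64_t even, odd;

  v[0] = x[0];                      /* X(0) */

  even = 2 * x[2] + 8 * x[0];
  odd  = x[3] + 4 * x[1];
  v[1] = even + odd;                /* 8*X(1/2) */
  v[2] = even - odd;                /* 8*X(-1/2) */

  even = x[2] + x[0];
  odd  = x[3] + x[1];
  v[3] = even + odd;                /* X(1) */
  v[4] = even - odd;                /* X(-1) */

  v[5] = 8 * x[3] + 4 * x[2] + 2 * x[1] + x[0];   /* X(2) */
  v[6] = x[3];                      /* X(inf) */
@}
@end example

Applying the same function to the pieces of @math{y} and multiplying the two
result arrays element by element gives the seven values of @math{W(t)} listed
in the table.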

Toom-4 is asymptotically @math{O(N^@W{1.404})}, the exponent being
@m{\log7/\log4,log(7)/log(4)}, representing 7 recursive multiplies of 1/4 the
original size each.


@node Higher degree Toom'n'half, FFT Multiplication, Toom 4-Way Multiplication, Multiplication Algorithms
@subsection Higher degree Toom'n'half
@cindex Toom multiplication

The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary
number of pieces. In general a split of two equally long operands into
@math{r} pieces leads to evaluations and pointwise multiplications done at
@m{2r-1,2*r-1} points. To fully exploit symmetries it would be better to have
a multiple of 4 points, which is why Toom'n'half is used for higher degrees.

Toom'n'half means that the existence of one more piece is considered for a
single operand. It can be virtual, i.e. zero, or real, when the two operands
are not exactly balanced. By choosing an even @math{r},
Toom-@m{r{1\over2},r+1/2} requires @math{2r} points, a multiple of four.

The quadruplets of points include 0, @m{\infty,inf}, +1, -1 and
@m{\pm2^i,+-2^i}, @m{\pm2^{-i},+-2^-i}. Each of them gives shortcuts for the
evaluation phase and for some steps in the interpolation phase. Further tricks
are used to reduce the memory footprint of the whole multiplication algorithm
to a memory buffer equal in size to the result of the product.

Current GMP uses both Toom-6'n'half and Toom-8'n'half.


@node FFT Multiplication, Other Multiplication, Higher degree Toom'n'half, Multiplication Algorithms
@subsection FFT Multiplication
@cindex FFT multiplication
@cindex Fast Fourier Transform

At large to very large sizes a Fermat style FFT multiplication is used,
following Sch@"onhage and Strassen (@pxref{References}).
Descriptions of FFTs
in various forms can be found in many textbooks, for instance Knuth section
4.3.3 part C or Lipson chapter IX@. A brief description of the form used in
GMP is given here.

The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given
@math{N}. A full product @m{xy,x*y} is obtained by choosing @m{N \ge
\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding
@math{x} and @math{y} with high zero limbs. The modular product is the native
form for the algorithm, so padding to get a full product is unavoidable.

The algorithm follows a split, evaluate, pointwise multiply, interpolate and
combine similar to that described above for Karatsuba and Toom-3. A @math{k}
parameter controls the split, with an FFT-@math{k} splitting into @math{2^k}
pieces of @math{M=N/2^k} bits each. @math{N} must be a multiple of
@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so
the split falls on limb boundaries, avoiding bit shifts in the split and
combine stages.

The evaluations, pointwise multiplications, and interpolation, are all done
modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a
multiple of @math{2^k} and of @code{mp_bits_per_limb}. The results of
interpolation will be the following negacyclic convolution of the input
pieces, and the choice of @math{N'} ensures these sums aren't truncated.
@tex
$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$
@end tex
@ifnottex

@example
           ---
           \         b
w[n] =     /     (-1) * x[i] * y[j]
           ---
       i+j==b*2^k+n
          b=0,1
@end example

@end ifnottex
The points used for the evaluation are @math{g^i} for @math{i=0} to
@math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}.
@math{g} is a
@m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary
cancellations at the interpolation stage, and it's also a power of 2 so the
fast Fourier transforms used for the evaluation and interpolation do only
shifts, adds and negations.

The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either
recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or
basecase), whichever is optimal at the size @math{N'}. The interpolation is
an inverse fast Fourier transform. The resulting set of sums of @m{x_iy_j,
x[i]*y[j]} are added at appropriate offsets to give the final result.

Squaring is the same, but @math{x} is the only input so it's one transform at
the evaluate stage and the pointwise multiplies are squares. The
interpolation is the same.

For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}),
O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed
modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original.
Each successive @math{k} is an asymptotic improvement, but overheads mean each
is only faster at bigger and bigger sizes. In the code, @code{MUL_FFT_TABLE}
and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used. Each
new @math{k} effectively swaps some multiplying for some shifts, adds and
overheads.

A mod @math{2^N+1} product can be formed with a normal
@math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT
and Toom-3 etc can be compared directly. A @math{k=4} FFT at
@math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at
@math{O(N^@W{1.465})}. In practice this is what's found, with
@code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between
300 and 1000 limbs, depending on the CPU@.
So far it's been found that only
very large FFTs recurse into pointwise multiplies above these sizes.

When an FFT is to give a full product, the change of @math{N} to @math{2N}
doesn't alter the theoretical complexity for a given @math{k}, but for the
purposes of considering where an FFT might be first used it can be assumed
that the FFT is recursing into a normal multiply and that on that basis it's
doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of
the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}. This would mean
@math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3.
In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been
found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs.

The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is
rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that
when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a
multiple of @m{2^{2k-1},2^(2k-1)} bits. The @math{+k+3} means some values of
@math{N} just under such a multiple will be rounded to the next. The
complexity calculations above assume that a favourable size is used, meaning
one which isn't padded through rounding, and it's also assumed that the extra
@math{+k+3} bits are negligible at typical FFT sizes.

The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a
step-effect into measured speeds. For example @math{k=8} will round @math{N}
up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb
groups of sizes for which @code{mpn_mul_n} runs at the same speed. Or for
@math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc.
In
practice it's been found each @math{k} is used at quite small multiples of its
size constraint and so the step effect is quite noticeable in a time versus
size graph.

The threshold determinations currently measure at the mid-points of size
steps, but this is sub-optimal since at the start of a new step it can happen
that it's better to go back to the previous @math{k} for a while. Something
more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be
needed.


@node Other Multiplication, Unbalanced Multiplication, FFT Multiplication, Multiplication Algorithms
@subsection Other Multiplication
@cindex Toom multiplication

The Toom algorithms described above (@pxref{Toom 3-Way Multiplication},
@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary
number of pieces, as per Knuth section 4.3.3 algorithm C@. This is not
currently used. The notes here are merely for interest.

In general a split into @math{r+1} pieces is made, and evaluations and
pointwise multiplications done at @m{2r+1,2*r+1} points. A 4-way split does 7
pointwise multiplies, 5-way does 9, etc. Asymptotically an @math{(r+1)}-way
algorithm is @m{O(N^{log(2r+1)/log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}. Only
the pointwise multiplications count towards big-@math{O} complexity, but the
time spent in the evaluate and interpolate stages grows with @math{r} and has
a significant practical impact, with the asymptotic advantage of each @math{r}
realized only at bigger and bigger sizes. The overheads grow as
@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log
r), O(N*log(r))}.
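
The multiply counts quoted above can be checked with a small C sketch. The
function names are hypothetical, and @code{pieces} is the number of parts each
operand is split into, @math{r+1} in the notation above. Only the pointwise
multiplies are counted; the evaluation and interpolation overheads are
ignored, and no overflow checking is done.

```c
#include <assert.h>
#include <stdint.h>

/* Pointwise multiplies for one level of a pieces-way split.  The product
   polynomial has 2*pieces-1 coefficients, so that many points are needed. */
unsigned toom_points(unsigned pieces)
{
    assert(pieces >= 2);
    return 2 * pieces - 1;
}

/* Leaf multiplies after `levels` recursive splits: (2*pieces-1)^levels.
   A schoolbook product of the same pieces needs pieces^(2*levels). */
uint64_t toom_leaf_multiplies(unsigned pieces, unsigned levels)
{
    uint64_t count = 1;
    for (unsigned i = 0; i < levels; i++)
        count *= toom_points(pieces);
    return count;
}
```

For instance two levels of a 4-way split give 49 leaf multiplies, where a
schoolbook product of the same 16 pieces per operand would need 256.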

Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4
uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small
multiplies in the evaluate stage (or rather trades them for additions), and
has a further saving of nearly half the interpolate steps. The idea is to
separate odd and even final coefficients and then perform algorithm C steps C7
and C8 on them separately. The divisors at step C7 become @math{j^2} and the
multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}.

Splitting odd and even parts through positive and negative points can be
thought of as using @math{-1} as a square root of unity. If a 4th root of
unity was available then a further split and speedup would be possible, but no
such root exists for plain integers. Going to complex integers with
@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian
form it takes three real multiplies to do a complex multiply. The existence
of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast
Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}.

Floating point FFTs use complex numbers approximating Nth roots of unity.
Some processors have special support for such FFTs. But these are not used in
GMP since it's very difficult to guarantee an exact result (to some number of
bits). An occasional difference of 1 in the last bit might not matter to a
typical signal processing algorithm, but is of course of vital importance to
GMP.


@node Unbalanced Multiplication, , Other Multiplication, Multiplication Algorithms
@subsection Unbalanced Multiplication
@cindex Unbalanced multiplication

Multiplication of operands with different sizes, both below
@code{MUL_TOOM22_THRESHOLD}, is done with plain schoolbook multiplication
(@pxref{Basecase Multiplication}).

For really large operands, we invoke FFT directly.

For operands between these sizes, we use Toom inspired algorithms suggested by
Alberto Zanoni and Marco Bodrato. The idea is to split the operands into
polynomials of different degree. GMP currently splits the smaller operand
into 2 coefficients, i.e., a polynomial of degree 1, but the larger operand
can be split into 2, 3, or 4 coefficients, i.e., a polynomial of degree 1 to
3.

@c FIXME: This is mighty ugly, but a cleaner @need triggers texinfo bugs that
@c screw up layout here and there in the rest of the manual.
@c @tex
@c \goodbreak
@c @end tex
@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms
@section Division Algorithms
@cindex Division algorithms

@menu
* Single Limb Division::
* Basecase Division::
* Divide and Conquer Division::
* Block-Wise Barrett Division::
* Exact Division::
* Exact Remainder::
* Small Quotient Division::
@end menu


@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms
@subsection Single Limb Division

N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from
high to low, either with a hardware divide instruction or a multiplication by
inverse, whichever is best on a given CPU.

The multiply by inverse follows ``Improved division by invariant integers'' by
M@"oller and Granlund (@pxref{References}) and is implemented as
@code{udiv_qrnnd_preinv} in @file{gmp-impl.h}. The idea is to have a
fixed-point approximation to @math{1/d} (see @code{invert_limb}) and then
multiply by the high limb (plus one bit) of the dividend to get a quotient
@math{q}. With @math{d} normalized (high bit set), @math{q} is no more than 1
too small.
Subtracting @m{qd,q*d} from the dividend gives a remainder, and
reveals whether @math{q} or @math{q-1} is correct.

The result is a division done with two multiplications and four or five
arithmetic operations. On CPUs with low latency multipliers this can be much
faster than a hardware divide, though the cost of calculating the inverse at
the start may mean it's only better on inputs bigger than say 4 or 5 limbs.

When a divisor must be normalized, either for the generic C
@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is
actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and
@math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set.
The bit shifts for the dividend are usually accomplished ``on the fly''
meaning by extracting the appropriate bits at each step. Done this way the
quotient limbs come out aligned ready to store. When only the remainder is
wanted, an alternative is to take the dividend limbs unshifted and calculate
@m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k
\bmod d2^k, r*2^k mod d*2^k}. This can help on CPUs with poor bit shifts or
few registers.

The multiply by inverse can be done two limbs at a time. The calculation is
basically the same, but the inverse is two limbs and the divisor treated as if
padded with a low zero limb. This means more work, since the inverse will
need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are
independent and can therefore be done partly or wholly in parallel. Likewise
for a 2@cross{}1 calculating @m{qd,q*d}. The net effect is to process two
limbs with roughly the same two multiplies worth of latency that one limb at a
time gives. This extends to 3 or 4 limbs at a time, though the extra work to
apply the inverse will almost certainly soon reach the limits of multiplier
throughput.
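
The one-limb-at-a-time loop described at the start of this section can be
sketched in C as follows. This is only an illustration under stated
assumptions: limbs are 32 bits so that @code{uint64_t} can hold a double limb,
the divisor is already normalized, and the function names ending in
@code{_sketch} are invented here. The 2@cross{}1 step is modelled on the
published M@"oller and Granlund formulation and is not a copy of the GMP
macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-point inverse floor((2^64-1)/d) - 2^32 of a normalized divisor. */
uint32_t invert_limb_sketch(uint32_t d)
{
    assert(d >> 31);                      /* high bit must be set */
    return (uint32_t) (UINT64_MAX / d - ((uint64_t) 1 << 32));
}

/* 2x1 division of (nh,nl) by d using the inverse di.  Requires nh < d.
   Returns the quotient limb and stores the remainder in *rem. */
uint32_t div_2by1_preinv_sketch(uint32_t *rem, uint32_t nh, uint32_t nl,
                                uint32_t d, uint32_t di)
{
    /* Candidate quotient from (di + 2^32) * nh + nl, high limb plus one. */
    uint64_t q = (uint64_t) di * nh + (((uint64_t) nh << 32) | nl);
    uint32_t qh = (uint32_t) (q >> 32) + 1;
    uint32_t ql = (uint32_t) q;
    uint32_t r = nl - qh * d;             /* candidate remainder mod 2^32 */

    if (r > ql) {                         /* candidate was one too big */
        qh--;
        r += d;
    }
    if (r >= d) {                         /* rare second adjustment */
        qh++;
        r -= d;
    }
    *rem = r;
    return qh;
}

/* N x 1 division, high limb to low limb.  np[0] is the least significant
   limb; n quotient limbs are stored at qp and the remainder is returned. */
uint32_t divrem_1_sketch(uint32_t *qp, const uint32_t *np, size_t n, uint32_t d)
{
    uint32_t di = invert_limb_sketch(d);
    uint32_t r = 0;

    for (size_t i = n; i-- > 0; )
        qp[i] = div_2by1_preinv_sketch(&r, r, np[i], d, di);
    return r;
}
```

Each step feeds the previous remainder in as the high limb, so the steps form
a dependent chain; that chain is what processing two or more limbs at a time
is meant to shorten.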

A similar approach in reverse can be taken to process just half a limb at a
time if the divisor is only a half limb. In this case the 1@cross{}1 multiply
for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each
limb, which can be a saving on CPUs with a fast half limb multiply, or in fact
if the only multiply is a half limb, and especially if it's not pipelined.


@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms
@subsection Basecase Division

Basecase N@cross{}M division is like long division done by hand, but in base
@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}. See Knuth
section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}.

Briefly stated, while the dividend remains larger than the divisor, a high
quotient limb is formed and the N@cross{}1 product @m{qd,q*d} subtracted at
the top end of the dividend. With a normalized divisor (most significant bit
set), each quotient limb can be formed with a 2@cross{}1 division and a
1@cross{}1 multiplication plus some subtractions. The 2@cross{}1 division is
by the high limb of the divisor and is done either with a hardware divide or a
multiply by inverse (the same as in @ref{Single Limb Division}) whichever is
faster. Such a quotient is sometimes one too big, requiring an addback of the
divisor, but that happens rarely.

With Q=N@minus{}M being the number of quotient limbs, this is an
@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase
Q@cross{}M multiplication, differing in fact only in the extra multiply and
divide for each of the Q quotient limbs.


@node Divide and Conquer Division, Block-Wise Barrett Division, Basecase Division, Division Algorithms
@subsection Divide and Conquer Division

For divisors larger than @code{DC_DIV_QR_THRESHOLD}, division is done by dividing.
Or to be precise by a recursive divide and conquer algorithm based on work by
Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}).

The algorithm consists essentially of recognising that a 2N@cross{}N division
can be done with the basecase division algorithm (@pxref{Basecase Division}),
but using N/2 limbs as a base, not just a single limb. This way the
multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of
Karatsuba and higher multiplication algorithms (@pxref{Multiplication
Algorithms}). The two ``digits'' of the quotient are formed by recursive
N@cross{}(N/2) divisions.

If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication
then the work is about the same as a basecase division, but with more function
call overheads and with some subtractions separated from the multiplies.
These overheads mean that it's only when N/2 is above
@code{MUL_TOOM22_THRESHOLD} that divide and conquer is of use.

@code{DC_DIV_QR_THRESHOLD} is based on the divisor size N, so it will be somewhere
above twice @code{MUL_TOOM22_THRESHOLD}, but how much above depends on the
CPU@. An optimized @code{mpn_mul_basecase} can lower @code{DC_DIV_QR_THRESHOLD} a
little by offering a ready-made advantage over repeated @code{mpn_submul_1}
calls.

Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where
@math{M(N)} is the time for an N@cross{}N multiplication done with FFTs. The
actual time is a sum over multiplications of the recursed sizes, as can be
seen near the end of section 2.2 of Burnikel and Ziegler.
For example, within
the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}. With higher
algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log
N, log(N)}. In practice, at moderate to large sizes, a 2N@cross{}N division
is about 2 to 4 times slower than an N@cross{}N multiplication.


@node Block-Wise Barrett Division, Exact Division, Divide and Conquer Division, Division Algorithms
@subsection Block-Wise Barrett Division

For the largest divisions, a block-wise Barrett division algorithm is used.
Here, the divisor is inverted to a precision determined by the relative size of
the dividend and divisor. Blocks of quotient limbs are then generated by
multiplying blocks from the dividend by the inverse.

Our block-wise algorithm computes a smaller inverse than in the plain Barrett
algorithm. For a @math{2n/n} division, the inverse will be just @m{\lceil n/2
\rceil, ceil(n/2)} limbs.


@node Exact Division, Exact Remainder, Block-Wise Barrett Division, Division Algorithms
@subsection Exact Division


A so-called exact division is when the dividend is known to be an exact
multiple of the divisor. Jebelean's exact division algorithm uses this
knowledge to make some significant optimizations (@pxref{References}).

The idea can be illustrated in decimal for example with 368154 divided by
543. Because the low digit of the dividend is 4, the low digit of the
quotient must be 8. This is arrived at from @m{4 \mathord{\times} 7 \bmod 10,
4*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of
the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7
@equiv{} 1 mod 10}. So @m{8\mathord{\times}543 = 4344,8*543=4344} can be
subtracted from the dividend leaving 363810. Notice the low digit has become
zero.
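
The digit-by-digit procedure, of which the step above is the first, can be
sketched in C working in decimal on ordinary integers. The function names are
invented for this illustration, the inverse is found by a simple search rather
than the method GMP uses for limbs, and the sketch assumes the divisor is
coprime to 10 and divides the dividend exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Inverse modulo 10 of a digit coprime to 10, found by a simple search. */
unsigned inverse_mod10(unsigned digit)
{
    for (unsigned v = 1; v < 10; v++)
        if (digit * v % 10 == 1)
            return v;
    assert(!"digit has no inverse mod 10");
    return 0;
}

/* Exact quotient a/d formed one decimal digit at a time from the low end.
   Requires d coprime to 10 and a an exact multiple of d. */
uint64_t divexact_decimal(uint64_t a, uint64_t d)
{
    unsigned inv = inverse_mod10((unsigned) (d % 10));
    uint64_t q = 0;
    uint64_t place = 1;

    while (a != 0) {
        /* Quotient digit from the low digit of a alone, no division. */
        unsigned qdigit = (unsigned) (a % 10) * inv % 10;

        a -= qdigit * d;    /* low digit of a becomes zero */
        a /= 10;            /* drop it and move to the next digit */
        q += qdigit * place;
        place *= 10;
    }
    return q;
}
```

With the values from the text, @code{divexact_decimal(368154, 543)} passes
through 363810 after the first subtraction, as above. Unlike the optimized
algorithm, this sketch subtracts the full product at every step.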

The procedure is repeated at the second digit, with the next quotient digit 7
(@m{1 \mathord{\times} 7 \bmod 10, 7 @equiv{} 1*7 mod 10}), subtracting
@m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800. And finally at
the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7
mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0.
So the quotient is 678.

Notice however that the multiplies and subtractions don't need to extend past
the low three digits of the dividend, since that's enough to determine the
three quotient digits. For the last quotient digit no subtraction is needed
at all. On a 2N@cross{}N division like this one, only about half the work of
a normal basecase division is necessary.

For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
saving over a normal basecase division is in two parts. Firstly, each of the
Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
multiply. Secondly, the crossproducts are reduced when @math{Q>M} to
@m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2,
Q*(Q-1)/2}. Notice the savings are complementary. If Q is big then many
divisions are saved, or if Q is small then the crossproducts reduce to a small
number.

The modular inverse used is calculated efficiently by @code{binvert_limb} in
@file{gmp-impl.h}. This does four multiplies for a 32-bit limb, or six for a
64-bit limb. @file{tune/modlinv.c} has some alternate implementations that
might suit processors better at bit twiddling than multiplying.

The sub-quadratic exact division described by Jebelean in ``Exact Division
with Karatsuba Complexity'' is not currently implemented. It uses a
rearrangement similar to the divide and conquer for normal division
(@pxref{Divide and Conquer Division}), but operating from low to high.
A
further possibility not currently implemented is ``Bidirectional Exact Integer
Division'' by Krandick and Jebelean which forms quotient limbs from both the
high and low ends of the dividend, and can halve once more the number of
crossproducts needed in a 2N@cross{}N division.

A special case exact division by 3 exists in @code{mpn_divexact_by3},
supporting Toom-3 multiplication and @code{mpq} canonicalizations. It forms
quotient digits with a multiply by the modular inverse of 3 (which is
@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next
limb. The multiplications don't need to be on the dependent chain, as long as
the effect of the borrows is applied, which can help chips with pipelined
multipliers.


@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms
@subsection Exact Remainder
@cindex Exact remainder

If the exact division algorithm is done with a full subtraction at each stage
and the dividend isn't a multiple of the divisor, then low zero limbs are
produced but with a remainder in the high limbs. For dividend @math{a},
divisor @math{d}, quotient @math{q}, and @m{b = 2
\GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this remainder
@math{r} is of the form
@tex
$$ a = qd + r b^n $$
@end tex
@ifnottex

@example
a = q*d + r*b^n
@end example

@end ifnottex
@math{n} represents the number of zero limbs produced by the subtractions,
that being the number of limbs produced for @math{q}. @math{r} will be in the
range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by
a factor of @math{b^n}.

Carrying out full subtractions at each stage means the same number of cross
products must be done as a normal division, but there's still some single limb
divisions saved.
When @math{d} is a single limb some simplifications arise,
providing good speedups on a number of processors.

The functions @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the
internal @code{mpn_redc_X} functions differ subtly in how they return @math{r},
leading to some negations in the above formula, but all are essentially the
same.

@cindex Divisibility algorithm
@cindex Congruence algorithm
Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this
leads to divisibility or congruence tests which are potentially more efficient
than a normal division.

The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is
odd, hence the use of @code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and
@code{mpz_kronecker_ui} etc (@pxref{Greatest Common Divisor Algorithms}).

Montgomery's REDC method for modular multiplications uses operands of the form
of @m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n})
(yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in the exact
remainder to reach a product in the same form @m{(xy)b^{-n}, (x*y)*b^-n}
(@pxref{Modular Powering Algorithm}).

Notice that @math{r} generally gives no useful information about the ordinary
remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything. If
however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the
ordinary remainder. This occurs whenever @math{d} is a factor of
@math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}. For a 32 or
64 bit limb other such factors include 5, 17 and 257, but no particular use
has been found for this.


@node Small Quotient Division, , Exact Remainder, Division Algorithms
@subsection Small Quotient Division

An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is
small can be optimized somewhat.

An ordinary basecase division normalizes the divisor by shifting it to make
the high bit set, shifting the dividend accordingly, and shifting the
remainder back down at the end of the calculation. This is wasteful if only a
few quotient limbs are to be formed. Instead a division of just the top
@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be
used to form a trial quotient. This requires only those limbs normalized, not
the whole of the divisor and dividend.

A multiply and subtract then applies the trial quotient to the M@minus{}Q
unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q
limbs remaining from the trial quotient division). The starting trial
quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1
too big are detected by first comparing the most significant limbs that will
arise from the subtraction. An addback is done if the quotient still turns
out to be 1 too big.

This whole procedure is essentially the same as one step of the basecase
algorithm done in a Q limb base, though with the trial quotient test done only
with the high limbs, not an entire Q limb ``digit'' product. The correctness
of this weaker test can be established by following the argument of Knuth
section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r
+ u_2, v2*q>b*r+u2} condition appropriately relaxed.


@need 1000
@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
@section Greatest Common Divisor
@cindex Greatest common divisor algorithms
@cindex GCD algorithms

@menu
* Binary GCD::
* Lehmer's Algorithm::
* Subquadratic GCD::
* Extended GCD::
* Jacobi Symbol::
@end menu


@node Binary GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
@subsection Binary GCD

At small sizes GMP uses an @math{O(N^2)} binary style GCD@. This is described
in many textbooks, for example Knuth section 4.5.2 algorithm B@. It simply
consists of successively reducing odd operands @math{a} and @math{b} using

@quotation
@math{a,b = @abs{}(a-b),@min{}(a,b)} @*
strip factors of 2 from @math{a}
@end quotation

The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly
computes the quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and replaces
@math{a,b} by @math{b, a - q b}. The binary algorithm has so far been found to
be faster than the Euclidean algorithm everywhere. One reason the binary
method does well is that the implied quotient at each step is usually small,
so often only one or two subtractions are needed to get the same effect as a
division. Quotients 1, 2 and 3 for example occur 67.7% of the time, see Knuth
section 4.5.3 Theorem E.

When the implied quotient is large, meaning @math{b} is much smaller than
@math{a}, then a division is worthwhile. This is the basis for the initial
@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
for both N@cross{}1 and 1@cross{}1 cases). But after that initial reduction,
big quotients occur too rarely to make it worth checking for them.
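
The reduction step given at the start of this section can be sketched in C for
single-word operands. The function name @code{binary_gcd_odd} is made up for
this example, and the sketch assumes both inputs are odd (and hence nonzero);
it leaves out the initial @math{a @bmod b} reduction described above.

```c
#include <assert.h>
#include <stdint.h>

/* GCD of two odd operands by repeating
       a,b = |a-b|, min(a,b)
       strip factors of 2 from a
   until the operands are equal. */
uint64_t binary_gcd_odd(uint64_t a, uint64_t b)
{
    assert((a & 1) && (b & 1));

    while (a != b) {
        uint64_t diff = a > b ? a - b : b - a;
        uint64_t low = a < b ? a : b;

        /* diff is nonzero and even here, since a and b are odd and differ. */
        while ((diff & 1) == 0)
            diff >>= 1;

        a = diff;
        b = low;
    }
    return a;
}
```

Both operands stay odd throughout, so the difference is always even and at
least one bit is stripped per iteration.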

@sp 1
The final @math{1@cross{}1} GCD in @code{mpn_gcd_1} is done in the generic C
code as described above. For two N-bit operands, the algorithm takes about
0.68 iterations per bit. For optimum performance some attention needs to be
paid to the way the factors of 2 are stripped from @math{a}.

Firstly it may be noted that in twos complement the number of low zero bits on
@math{a-b} is the same as @math{b-a}, so counting or testing can begin on
@math{a-b} without waiting for @math{@abs{}(a-b)} to be determined.

A loop stripping low zero bits tends not to branch predict well, since the
condition is data dependent. But on average there's only a few low zeros, so
an option is to strip one or two bits arithmetically then loop for more (as
done for AMD K6). Or use a lookup table to get a count for several bits then
loop for more (as done for AMD K7). An alternative approach is to keep just
one of @math{a} or @math{b} odd and iterate

@quotation
@math{a,b = @abs{}(a-b), @min{}(a,b)} @*
@math{a = a/2} if even @*
@math{b = b/2} if even
@end quotation

This requires about 1.25 iterations per bit, but stripping of a single bit at
each step avoids any branching. Repeating the bit strip reduces to about 0.9
iterations per bit, which may be a worthwhile tradeoff.

Generally with the above approaches a speed of perhaps 6 cycles per bit can be
achieved, which is still not terribly fast with for instance a 64-bit GCD
taking nearly 400 cycles. It's this sort of time which means it's not usually
advantageous to combine a set of divisibility tests into a GCD.

Currently, the binary algorithm is used for GCD only when @math{N < 3}.
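
The alternative iteration that keeps just one operand odd can be sketched in C
as below. The name @code{binary_gcd_one_bit} is hypothetical, and the sketch
assumes both inputs are nonzero with at least one of them odd. Whether a
compiler turns the comparisons into branch-free code is not guaranteed; only
the halving is written without a test on parity.

```c
#include <assert.h>
#include <stdint.h>

/* GCD by iterating
       a,b = |a-b|, min(a,b)
       a = a/2 if even
       b = b/2 if even
   with at least one of a, b odd at every step. */
uint64_t binary_gcd_one_bit(uint64_t a, uint64_t b)
{
    assert(a != 0 && b != 0);
    assert((a & 1) || (b & 1));

    while (a != 0) {
        uint64_t diff = a > b ? a - b : b - a;
        uint64_t low = a < b ? a : b;

        a = diff;
        b = low;

        /* Shift by 1 when the low bit is clear, by 0 otherwise. */
        a >>= (~a & 1);
        b >>= (~b & 1);
    }
    return b;
}
```

The loop ends when the two operands have become equal, at which point the
difference is zero and @code{b} holds the GCD. Since the GCD is odd, halving
an even operand never changes it.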

@node Lehmer's Algorithm, Subquadratic GCD, Binary GCD, Greatest Common Divisor Algorithms
@comment node-name, next, previous, up
@subsection Lehmer's algorithm

Lehmer's improvement of the Euclidean algorithm is based on the observation
that the initial part of the quotient sequence depends only on the most
significant parts of the inputs. The variant of Lehmer's algorithm used in GMP
splits off the most significant two limbs, as suggested, e.g., in ``A
Double-Digit Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The
quotients of two double-limb inputs are collected as a 2 by 2 matrix with
single-limb elements. This is done by the function @code{mpn_hgcd2}. The
resulting matrix is applied to the inputs using @code{mpn_mul_1} and
@code{mpn_submul_1}. Each iteration usually reduces the inputs by almost one
limb. In the rare case of a large quotient, no progress can be made by
examining just the most significant two limbs, and the quotient is computed
using plain division.

The resulting algorithm is asymptotically @math{O(N^2)}, just as the Euclidean
algorithm and the binary algorithm. The quadratic part of the work is
the calls to @code{mpn_mul_1} and @code{mpn_submul_1}. For small sizes, the
linear work is also significant. There are roughly @math{N} calls to the
@code{mpn_hgcd2} function. This function uses a couple of important
optimizations:

@itemize
@item
It uses the same relaxed notion of correctness as @code{mpn_hgcd} (see next
section). This means that when called with the most significant two limbs of
two large numbers, the returned matrix does not always correspond exactly to
the initial quotient sequence for the two large numbers; the final quotient
may sometimes be one off.

@item
It takes advantage of the fact the quotients are usually small.
The division
operator is not used, since the corresponding assembler instruction is very
slow on most architectures. (This code could probably be improved further; it
uses many branches that are unfriendly to prediction.)

@item
It switches from double-limb calculations to single-limb calculations half-way
through, when the input numbers have been reduced in size from two limbs to
one and a half.

@end itemize

@node Subquadratic GCD, Extended GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms
@subsection Subquadratic GCD

For inputs larger than @code{GCD_DC_THRESHOLD}, GCD is computed via the HGCD
(Half GCD) function, as a generalization of Lehmer's algorithm.

Let the inputs @math{a,b} be of size @math{N} limbs each. Put @m{S=\lfloor N/2
\rfloor + 1, S = floor(N/2) + 1}. Then HGCD(a,b) returns a transformation
matrix @math{T} with non-negative elements, and reduced numbers @math{(c;d) =
T^{-1} (a;b)}. The reduced numbers @math{c,d} must be larger than @math{S}
limbs, while their difference @math{abs(c-d)} must fit in @math{S} limbs. The
matrix elements will also be of size roughly @math{N/2}.

The HGCD base case uses Lehmer's algorithm, but with the above stop condition
that returns reduced numbers and the corresponding transformation matrix
half-way through. For inputs larger than @code{HGCD_THRESHOLD}, HGCD is
computed recursively, using the divide and conquer algorithm in ``On
Sch@"onhage's algorithm and subquadratic integer GCD computation'' by M@"oller
(@pxref{References}). The recursive algorithm consists of these main
steps.

@itemize

@item
Call HGCD recursively, on the most significant @math{N/2} limbs. Apply the
resulting matrix @math{T_1} to the full numbers, reducing them to a size just
above @math{3N/2}.

@item
Perform a small number of division or subtraction steps to reduce the numbers
to size below @math{3N/2}. This is essential mainly for the unlikely case of
large quotients.

@item
Call HGCD recursively, on the most significant @math{N/2} limbs of the reduced
numbers. Apply the resulting matrix @math{T_2} to the full numbers, reducing
them to a size just above @math{N/2}.

@item
Compute @math{T = T_1 T_2}.

@item
Perform a small number of division and subtraction steps to satisfy the
requirements, and return.
@end itemize

GCD is then implemented as a loop around HGCD, similarly to Lehmer's
algorithm. Where Lehmer repeatedly chops off the top two limbs, calls
@code{mpn_hgcd2}, and applies the resulting matrix to the full numbers, the
sub-quadratic GCD chops off the most significant third of the limbs (the
proportion is a tuning parameter, and @math{1/3} seems to be more efficient
than, e.g., @math{1/2}), calls @code{mpn_hgcd}, and applies the resulting
matrix. Once the input numbers are reduced to size below
@code{GCD_DC_THRESHOLD}, Lehmer's algorithm is used for the rest of the work.

The asymptotic running time of both HGCD and GCD is @m{O(M(N)\log N),O(M(N)*log(N))},
where @math{M(N)} is the time for multiplying two @math{N}-limb numbers.

@comment  node-name,  next,  previous,  up

@node Extended GCD, Jacobi Symbol, Subquadratic GCD, Greatest Common Divisor Algorithms
@subsection Extended GCD

The extended GCD function, or GCDEXT, calculates @math{@gcd{}(a,b)} and also
cofactors @math{x} and @math{y} satisfying @m{ax+by=\gcd(a@C{}b),
a*x+b*y=gcd(a@C{}b)}. All the algorithms used for plain GCD are extended to
handle this case. The binary algorithm is used only for single-limb GCDEXT.
Lehmer's algorithm is used for sizes up to @code{GCDEXT_DC_THRESHOLD}.
Above
this threshold, GCDEXT is implemented as a loop around HGCD, but with more
book-keeping to keep track of the cofactors. This gives the same asymptotic
running time as for GCD and HGCD, @m{O(M(N)\log N),O(M(N)*log(N))}.

One difference from plain GCD is that while the inputs @math{a} and @math{b} are
reduced as the algorithm proceeds, the cofactors @math{x} and @math{y} grow in
size. This makes the tuning of the chopping-point more difficult. The current
code chops off the most significant half of the inputs for the call to HGCD in
the first iteration, and the most significant two thirds for the remaining
calls. This strategy could surely be improved. Also the stop condition for the
loop, where Lehmer's algorithm is invoked once the inputs are reduced below
@code{GCDEXT_DC_THRESHOLD}, could maybe be improved by taking into account the
current size of the cofactors.

@node Jacobi Symbol, , Extended GCD, Greatest Common Divisor Algorithms
@subsection Jacobi Symbol
@cindex Jacobi symbol algorithm

@c Editor Note: I don't see other people defining the inputs, it would be nice
@c here because the code uses (a/b) where other references use (n/k)

The Jacobi symbol is written @m{\left(a \over b\right), (@var{a}/@var{b})} in
this section.

Initially if either operand fits in a single limb, a reduction is done with
either @code{mpn_mod_1} or @code{mpn_modexact_1_odd}, followed by the binary
algorithm on a single limb. The binary algorithm is well suited to a single limb,
and the whole calculation in this case is quite efficient.

For inputs larger than @code{GCD_DC_THRESHOLD}, @code{mpz_jacobi},
@code{mpz_legendre} and @code{mpz_kronecker} are computed via the HGCD (Half
GCD) function, as a generalization of Lehmer's algorithm.

Most GCD algorithms reduce @math{a} and @math{b} by repeatedly computing the
quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and iteratively replacing

@c Couldn't figure out macros with commas.
@tex
$$ a, b = b, a - q * b$$
@end tex
@ifnottex
@math{a, b = b, a - q * b}
@end ifnottex

Different algorithms use different methods for calculating @math{q}, but the
core algorithm is the same if we use @ref{Lehmer's Algorithm} or
@ref{Subquadratic GCD, HGCD}.

At each step it is possible to determine whether the reduction inverts the
Jacobi symbol based on the two least significant bits of @var{a} and @var{b}.
For more details see ``Efficient computation of the Jacobi symbol'' by
M@"oller (@pxref{References}).

A small set of bits is thus used to track state:
@itemize
@item
current sign of result (1 bit)

@item
two least significant bits of @var{a} and @var{b} (4 bits)

@item
a pointer to which input is currently the denominator (1 bit)
@end itemize

In all the routines sign changes for the result are accumulated using fast bit
twiddling which avoids conditional jumps.

The final result is calculated after verifying the inputs are coprime (GCD = 1)
as @m{(-1)^e,(-1)^e}, where @math{e} is the accumulated sign bit.

Much of the Jacobi code is shared directly with the HGCD implementations, such
as the 2x2 matrix calculation, the @ref{Lehmer's Algorithm} basecase and
@code{GCD_DC_THRESHOLD}.

The asymptotic running time is @m{O(M(N)\log N),O(M(N)*log(N))}, where
@math{M(N)} is the time for multiplying two @math{N}-limb numbers.
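
The sign tracking can be illustrated with the textbook binary Jacobi
algorithm, sketched here in Python. This is not the GMP code: the function
name @code{jacobi} is hypothetical, plain remainders stand in for the Lehmer
and HGCD reductions, and factors of 2 are handled by looking at the
denominator mod 8 rather than with the two-bit state described above. It
does show a sign bit @code{e} being accumulated with XOR rather than
conditional jumps, and the result being zero unless the GCD is 1.

@example
```python
def jacobi(a, b):
    # Jacobi symbol (a/b) for odd positive b (illustrative sketch only).
    if b <= 0 or b % 2 == 0:
        raise ValueError("b must be odd and positive")
    a %= b
    e = 0                               # result is (-1)**e when gcd is 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            # (2/b) is -1 exactly when b mod 8 is 3 or 5
            e ^= ((b ^ (b >> 1)) >> 1) & 1
        a, b = b, a                     # swap numerator and denominator
        # reciprocity flips the sign when both are 3 mod 4
        e ^= (a & b & 2) >> 1
        a %= b
    return 1 - 2 * e if b == 1 else 0
```
@end example

For example @code{jacobi(8, 21)} is @math{-1}, and @code{jacobi(6, 9)} is 0
since the inputs share the factor 3.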

@need 1000
@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
@section Powering Algorithms
@cindex Powering algorithms

@menu
* Normal Powering Algorithm::
* Modular Powering Algorithm::
@end menu


@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
@subsection Normal Powering

Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
successively squaring and then multiplying by the base when a 1 bit is seen in
the exponent, as per Knuth section 4.6.3. The ``left to right''
variant described there is used rather than algorithm A, since it's just as
easy and can be done with somewhat less temporary memory.


@node Modular Powering Algorithm, , Normal Powering Algorithm, Powering Algorithms
@subsection Modular Powering

Modular powering is implemented using a @math{2^k}-ary sliding window
algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85
(@pxref{References}). @math{k} is chosen according to the size of the
exponent. Larger exponents use larger values of @math{k}, the choice being
made to minimize the average number of multiplications that must supplement
the squaring.

The modular multiplies and squarings use either a simple division or the REDC
method by Montgomery (@pxref{References}). REDC is a little faster,
essentially saving N single limb divisions in a fashion similar to an exact
remainder (@pxref{Exact Remainder}).
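
A minimal Python sketch of a sliding window exponentiation follows, as an
illustration of the idea rather than of the GMP implementation. The name
@code{powm_sliding} and the fixed default @math{k=4} are assumptions made for
the example, and ordinary @code{%} reductions stand in for the division or
REDC step.

@example
```python
def powm_sliding(base, exp, mod, k=4):
    # Left-to-right sliding window exponentiation (illustrative sketch).
    if mod == 1:
        return 0
    base %= mod
    # Precompute the odd powers base^1, base^3, ..., base^(2^k - 1).
    square = base * base % mod
    odd = [base]
    for _ in range((1 << (k - 1)) - 1):
        odd.append(odd[-1] * square % mod)
    result = 1
    i = exp.bit_length() - 1
    while i >= 0:
        if not (exp >> i) & 1:
            result = result * result % mod      # a 0 bit costs one squaring
            i -= 1
        else:
            # Longest window of at most k bits that ends in a 1 bit.
            j = max(i - k + 1, 0)
            while not (exp >> j) & 1:
                j += 1
            width = i - j + 1
            window = (exp >> j) & ((1 << width) - 1)
            for _ in range(width):
                result = result * result % mod
            result = result * odd[window >> 1] % mod
            i = j - 1
    return result
```
@end example

The result can be compared against Python's built-in three-argument
@code{pow}. Only one multiplication is spent per window, which is why a
larger @math{k} pays off once the exponent is long enough to amortize the
table of odd powers.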


@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
@section Root Extraction Algorithms
@cindex Root extraction algorithms

@menu
* Square Root Algorithm::
* Nth Root Algorithm::
* Perfect Square Algorithm::
* Perfect Power Algorithm::
@end menu


@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
@subsection Square Root
@cindex Square root algorithm
@cindex Karatsuba square root algorithm

Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
Zimmermann (@pxref{References}).

An input @math{n} is split into four parts of @math{k} bits each, so with
@math{b=2^k} we have @m{n = a_3b^3 + a_2b^2 + a_1b + a_0, n = a3*b^3 + a2*b^2
+ a1*b + a0}. Part @ms{a,3} must be ``normalized'' so that either the high or
second highest bit is set. In GMP, @math{k} is kept on a limb boundary and
the input is left shifted (by an even number of bits) to normalize.

The square root of the high two parts is taken, by recursive application of
the algorithm (bottoming out in a one-limb Newton's method),
@tex
$$ s',r' = \mathop{\rm sqrtrem} \> (a_3b + a_2) $$
@end tex
@ifnottex

@example
s1,r1 = sqrtrem (a3*b + a2)
@end example

@end ifnottex
This is an approximation to the desired root and is extended by a division to
give @math{s},@math{r},
@tex
$$\eqalign{
q,u &= \mathop{\rm divrem} \> (r'b + a_1, 2s') \cr
s &= s'b + q \cr
r &= ub + a_0 - q^2
}$$
@end tex
@ifnottex

@example
q,u = divrem (r1*b + a1, 2*s1)
s = s1*b + q
r = u*b + a0 - q^2
@end example

@end ifnottex
The normalization requirement on @ms{a,3} means at this point @math{s} is
either correct or 1 too big.
@math{r} is negative in the latter case, so
@tex
$$\eqalign{
\mathop{\rm if} \; r &< 0 \; \mathop{\rm then} \cr
r &\leftarrow r + 2s - 1 \cr
s &\leftarrow s - 1
}$$
@end tex
@ifnottex

@example
if r < 0 then
  r = r + 2*s - 1
  s = s - 1
@end example

@end ifnottex
The algorithm is expressed in a divide and conquer form, but as noted in the
paper it can also be viewed as a discrete variant of Newton's method, or as a
variation on the schoolboy method (no longer taught) for square roots two
digits at a time.

If the remainder @math{r} is not required then usually only a few high limbs
of @math{r} and @math{u} need to be calculated to determine whether an
adjustment to @math{s} is required. This optimization is not currently
implemented.

In the Karatsuba multiplication range this algorithm is @m{O({3\over2}
M(N/2)),O(1.5*M(N/2))}, where @math{M(n)} is the time to multiply two numbers
of @math{n} limbs. In the FFT multiplication range this grows to a bound of
@m{O(6 M(N/2)),O(6*M(N/2))}. In practice a factor of about 1.5 to 1.8 is
found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.

The algorithm does all its calculations in integers and the resulting
@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
The extended precision given by @code{mpf_sqrt_ui} is obtained by
padding with zero limbs.


@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
@subsection Nth Root
@cindex Root extraction algorithm
@cindex Nth root algorithm

Integer Nth roots are taken using Newton's method with the following
iteration, where @math{A} is the input and @math{n} is the root to be taken.
@tex
$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
@end tex
@ifnottex

@example
         1         A
a[i+1] = - * ( --------- + (n-1)*a[i] )
         n     a[i]^(n-1)
@end example

@end ifnottex
The initial approximation @m{a_1,a[1]} is generated bitwise by successively
powering a trial root with or without new 1 bits, aiming to be just above the
true root. The iteration converges quadratically when started from a good
approximation. When @math{n} is large more initial bits are needed to get
good convergence. The current implementation is not particularly well
optimized.


@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
@subsection Perfect Square
@cindex Perfect square algorithm

A significant fraction of non-squares can be quickly identified by checking
whether the input is a quadratic residue modulo small integers.

@code{mpz_perfect_square_p} first tests the input mod 256, which means just
examining the low byte. Only 44 different values occur for squares mod 256,
so 82.8% of inputs can be immediately identified as non-squares.

On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total
99.25% of inputs identified as non-squares. On a 64-bit system 97 is tested
too, for a total 99.62%.

These moduli are chosen because they're factors of @math{2^@W{24}-1} (or
@math{2^@W{48}-1} for 64-bits), and such a remainder can be quickly taken just
using additions (see @code{mpn_mod_34lsub1}).

When nails are in use moduli are instead selected by the @file{gen-psqr.c}
program and applied with an @code{mpn_mod_1}. The same @math{2^@W{24}-1} or
@math{2^@W{48}-1} could be done with nails using some extra bit shifts, but
this is not currently implemented.
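
The percentages quoted above can be reproduced with a small Python sketch.
The names @code{square_residues}, @code{maybe_square} and
@code{pass_fraction} are hypothetical. Plain @code{%} remainders and
per-modulus tables are used here for clarity, not the combined
@code{mpn_mod_34lsub1} remainder or the permuted tables of the real code.

@example
```python
def square_residues(m):
    # flags[r] is True when r is a square modulo m.
    flags = [False] * m
    for x in range(m):
        flags[x * x % m] = True
    return flags


MODULI = (256, 9, 5, 7, 13, 17)
TABLES = [square_residues(m) for m in MODULI]


def maybe_square(n):
    # False: n is certainly not a square.
    # True: a square root must still be taken to decide.
    return all(table[n % m] for m, table in zip(MODULI, TABLES))


def pass_fraction(moduli):
    # Fraction of inputs passing every test (moduli pairwise coprime).
    fraction = 1.0
    for m in moduli:
        fraction *= sum(square_residues(m)) / m
    return fraction
```
@end example

Here @code{sum(square_residues(256))} is 44, and
@code{1 - pass_fraction(MODULI)} comes to about 0.9925, rising to about
0.9962 when 97 is appended to the moduli.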

In any case each modulus is applied to the @code{mpn_mod_34lsub1} or
@code{mpn_mod_1} remainder and a table lookup identifies non-squares. By
using a ``modexact'' style calculation, and suitably permuted tables, just one
multiply each is required, see the code for details. Moduli are also combined
to save operations, so long as the lookup tables don't become too big.
@file{gen-psqr.c} does all the pre-calculations.

A square root must still be taken for any value that passes these tests, to
verify it's really a square and not one of the small fraction of non-squares
that get through (i.e.@: a pseudo-square to all the tested bases).

Clearly more residue tests could be done, @code{mpz_perfect_square_p} only
uses a compact and efficient set. Big inputs would probably benefit from more
residue testing, small inputs might be better off with less. The assumed
distribution of squares versus non-squares in the input would affect such
considerations.


@node Perfect Power Algorithm, , Perfect Square Algorithm, Root Extraction Algorithms
@subsection Perfect Power
@cindex Perfect power algorithm

Detecting perfect powers is required by some factorization algorithms.
Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
extractions, though naturally only prime roots need to be considered.
(@xref{Nth Root Algorithm}.)

If a prime divisor @math{p} with multiplicity @math{e} can be found, then only
roots which are divisors of @math{e} need to be considered, much reducing the
work necessary. To this end divisibility by a set of small primes is checked.


@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
@section Radix Conversion
@cindex Radix conversion algorithms

Radix conversions are less important than other algorithms.
A program
dominated by conversions should probably use a different data representation.

@menu
* Binary to Radix::
* Radix to Binary::
@end menu


@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
@subsection Binary to Radix

Conversions from binary to a power-of-2 radix use a simple and fast
@math{O(N)} bit extraction algorithm.

Conversions from binary to other radices use one of two algorithms. Sizes
below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.
Repeated divisions by @math{b^n} are made, where @math{b} is the radix and
@math{n} is the biggest power that fits in a limb. But instead of simply
using the remainder @math{r} from such divisions, an extra divide step is done
to give a fractional limb representing @math{r/b^n}. The digits of @math{r}
can then be extracted using multiplications by @math{b} rather than divisions.
Special case code is provided for decimal, allowing multiplications by 10 to
optimize to shifts and adds.

Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are
calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is
reached. @math{t} is then divided by that largest power, giving a quotient
which is the digits above that power, and a remainder which is those below.
These two parts are in turn divided by the second highest power, and so on
recursively. When a piece has been divided down to less than
@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is
used.
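
The recursive splitting can be sketched in Python as follows. This is an
illustration under simplifying assumptions, not the GMP code: the names
@code{to_digits}, @code{chunk} and @code{cutoff} are hypothetical, with
@code{chunk} playing the role of @math{n} and @code{cutoff} standing in for
@code{GET_STR_DC_THRESHOLD}. The base case divides by the radix one digit at
a time and omits the fractional-limb multiplication trick.

@example
```python
def to_digits_basecase(t, base, width):
    # Quadratic method: repeated division, zero padded to width digits.
    digits = []
    for _ in range(width):
        t, r = divmod(t, base)
        digits.append(r)
    return digits[::-1]


def to_digits(t, base=10, chunk=9, cutoff=2):
    # Digits of t >= 0, most significant first (illustrative sketch).
    # powers[i] = base**(chunk * 2**i), up to a power between sqrt(t) and t.
    powers = [base ** chunk]
    while powers[-1] * powers[-1] <= t:
        powers.append(powers[-1] * powers[-1])

    def convert(x, level):
        # Requires x < base**(chunk * 2**level); returns that many digits.
        if level <= cutoff:
            return to_digits_basecase(x, base, chunk << level)
        high, low = divmod(x, powers[level - 1])
        return convert(high, level - 1) + convert(low, level - 1)

    digits = convert(t, len(powers))
    first = 0
    while first < len(digits) - 1 and digits[first] == 0:
        first += 1                      # drop the zero padding
    return digits[first:]
```
@end example

Each level of the recursion performs one big division by a precomputed
power, and the quotient and remainder are converted independently.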

The advantage of this algorithm is that big divisions can make use of the
sub-quadratic divide and conquer division (@pxref{Divide and Conquer Division}),
and big divisions tend to have lower overhead than lots of
separate single limb divisions anyway. But in any case the cost of
calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.

@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
the same basic thing, the point where it becomes worth doing a big division to
cut the input in half. @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
assumes that's already available, which is the case when recursing.

Since the base case produces digits from least to most significant but they
need to be stored from most to least, it's necessary to calculate in advance
how many digits there will be, or at least be sure not to underestimate that.
For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
from @code{mp_bases}, rounding up. The result is either correct or one too
big.

Examining some of the high bits of the input could increase the chance of
getting the exact number of digits, but an exact result every time would not
be practical, since in general the difference between numbers 100@dots{} and
99@dots{} is only in the last few bits and the work to identify 99@dots{}
might well be almost as much as a full conversion.

The @math{r/b^n} scheme described above for using multiplications to bring out
digits might be useful for more than a single limb. Some brief experiments
with it on the base case when recursing didn't give a noticeable improvement,
but perhaps that was only due to the implementation.
Something similar would
work for the sub-quadratic divisions too, though there would be the cost of
calculating a bigger radix power.

Another possible improvement for the sub-quadratic part would be to arrange
for radix powers that balanced the sizes of quotient and remainder produced,
i.e.@: the highest power would be a @m{b^{nk},b^(n*k)} approximately equal to
@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor. That ought to
smooth out a graph of times against sizes, but may or may not be a net
speedup.


@node Radix to Binary, , Binary to Radix, Radix Conversion Algorithms
@subsection Radix to Binary

@strong{This section needs to be rewritten, it currently describes the
algorithms used before GMP 4.3.}

Conversions from a power-of-2 radix into binary use a simple and fast
@math{O(N)} bitwise concatenation algorithm.

Conversions from other radices use one of two algorithms. Sizes below
@code{SET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method. Groups
of @math{n} digits are converted to limbs, where @math{n} is the biggest
power of the base @math{b} which will fit in a limb, then those groups are
accumulated into the result by multiplying by @math{b^n} and adding. This
saves multi-precision operations, as per Knuth section 4.4 part E
(@pxref{References}). Some special case code is provided for decimal, giving
the compiler a chance to optimize multiplications by 10.

Above @code{SET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
First groups of @math{n} digits are converted into limbs. Then adjacent
limbs are combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x}
and @math{y} are the limbs. Adjacent limb pairs are combined into quads
similarly with @m{xb^{2n}+y,x*b^(2n)+y}. This continues until a single block
remains, that being the result.
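
The pairwise combination just described might be sketched in Python like
this. The function name @code{from_digits} and the parameter @code{chunk}
(playing the role of @math{n}) are hypothetical, and Python's @code{int} is
used to convert each small group of digits, so the sketch covers only the
combining stage and not GMP's limb-level code.

@example
```python
def from_digits(text, base=10, chunk=9):
    # Convert a non-empty digit string to an integer (illustrative sketch).
    text = "0" * (-len(text) % chunk) + text          # pad to whole groups
    # Groups of chunk digits, most significant group first.
    blocks = [int(text[i:i + chunk], base)
              for i in range(0, len(text), chunk)]
    power = base ** chunk
    while len(blocks) > 1:
        if len(blocks) % 2:
            blocks.insert(0, 0)        # pad on the most significant side
        # Combine adjacent blocks x, y into x*power + y.
        blocks = [blocks[i] * power + blocks[i + 1]
                  for i in range(0, len(blocks), 2)]
        power *= power                 # next level uses the squared power
    return blocks[0]
```
@end example

Each pass halves the number of blocks and squares the radix power, so the
later multiplications involve large operands of similar size.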

The advantage of this method is that the multiplications for each @math{x} are
big blocks, allowing Karatsuba and higher algorithms to be used. But the cost
of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome.
@code{SET_STR_PRECOMPUTE_THRESHOLD} usually ends up quite big, around 5000 digits, and on
some processors much bigger still.

@code{SET_STR_PRECOMPUTE_THRESHOLD} is based on the input digits (and tuned
for decimal), though it might be better based on a limb count, so as to be
independent of the base. But that sort of count isn't used by the base case
and so would need some sort of initial calculation or estimate.

The main reason @code{SET_STR_PRECOMPUTE_THRESHOLD} is so much bigger than the
corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is
much faster than @code{mpn_divrem_1} (often by a factor of 5, or more).


@need 1000
@node Other Algorithms, Assembly Coding, Radix Conversion Algorithms, Algorithms
@section Other Algorithms

@menu
* Prime Testing Algorithm::
* Factorial Algorithm::
* Binomial Coefficients Algorithm::
* Fibonacci Numbers Algorithm::
* Lucas Numbers Algorithm::
* Random Number Algorithms::
@end menu


@node Prime Testing Algorithm, Factorial Algorithm, Other Algorithms, Other Algorithms
@subsection Prime Testing
@cindex Prime testing algorithms

The primality testing in @code{mpz_probab_prime_p} (@pxref{Number Theoretic Functions})
first does some trial division by small factors and then uses the
Miller-Rabin probabilistic primality testing algorithm, as described in Knuth
section 4.5.4 algorithm P (@pxref{References}).

For an odd input @math{n}, and with @math{n = q@GMPmultiply{}2^k+1} where
@math{q} is odd, this algorithm selects a random base @math{x} and tests
whether @math{x^q @bmod{} n} is 1 or @math{-1}, or an @m{x^{q2^j} \bmod n,
x^(q*2^j) mod n} is @math{-1}, for @math{1@le{}j<k}. If so then @math{n}
is probably prime, if not then @math{n} is definitely composite.

Any prime @math{n} will pass the test, but some composites do too. Such
composites are known as strong pseudoprimes to base @math{x}. No composite
@math{n} is a strong pseudoprime to more than @math{1/4} of all bases (see
Knuth exercise 22), hence with @math{x} chosen at random there's no more than
a @math{1/4} chance a ``probable prime'' will in fact be composite.

In fact strong pseudoprimes are quite rare, making the test much more
powerful than this analysis would suggest, but @math{1/4} is all that's proven
for an arbitrary @math{n}.


@node Factorial Algorithm, Binomial Coefficients Algorithm, Prime Testing Algorithm, Other Algorithms
@subsection Factorial
@cindex Factorial algorithm

Factorials are calculated by a combination of two algorithms. An idea is
shared among them: to compute the odd part of the factorial; a final step
takes account of the power of @math{2} term, by shifting.

For small @math{n}, the odd factor of @math{n!} is computed with the simple
observation that it is equal to the product of all positive odd numbers
not exceeding @math{n} times the odd factor of @m{\lfloor n/2\rfloor!, [n/2]!},
where @m{\lfloor x\rfloor, [x]} is the integer part of @math{x}, and so on
recursively. The procedure can be best illustrated with an example,

@quotation
@math{23!
= (23.21.19.17.15.13.11.9.7.5.3)(11.9.7.5.3)(5.3)2^{19}}
@end quotation

Current code collects all the factors in a single list, with a loop and no
recursion, and computes the product, with no special care for repeated chunks.

When @math{n} is larger, the computation passes through prime sieving. A helper
function is used, as suggested by Peter Luschny:
@tex
$$\mathop{\rm msf}(n) = {n!\over\lfloor n/2\rfloor!^2\cdot2^k} = \prod_{p=3}^{n}
p^{\mathop{\rm L}(p,n)} $$
@end tex
@ifnottex

@example
                            n
                          -----
          n!               | |   L(p,n)
msf(n) = -------------- =  | |  p
          [n/2]!^2.2^k     p=3
@end example
@end ifnottex

Here @math{p} ranges over the odd prime numbers. The exponent @math{k} is chosen to
obtain an odd integer number: @math{k} is the number of 1 bits in the binary
representation of @m{\lfloor n/2\rfloor, [n/2]}. The function L@math{(p,n)}
can be defined as zero when @math{p} is composite, and, for any prime
@math{p}, it is computed with:
@tex
$$\mathop{\rm L}(p,n) = \sum_{i>0}\left\lfloor{n\over p^i}\right\rfloor\bmod2
\leq\log_p(n)$$
@end tex
@ifnottex

@example
          ---
           \    n
L(p,n) =   /  [---] mod 2   <=  log (n) .
          ---  p^i                 p
          i>0
@end example
@end ifnottex

With this helper function, we are able to compute the odd part of @math{n!}
using the recursion implied by @m{n!=\lfloor n/2\rfloor!^2\cdot\mathop{\rm
msf}(n)\cdot2^k , n!=[n/2]!^2*msf(n)*2^k}. The recursion stops using the
small-@math{n} algorithm on some @m{\lfloor n/2^i\rfloor, [n/2^i]}.

Both the above algorithms use binary splitting to compute the product of many
small factors. At first as many products as possible are accumulated in a
single register, generating a list of factors that fit in a machine word. This
list is then split into halves, and the product is computed recursively.
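
The small-@math{n} method and the binary splitting product can be sketched
together in Python. The names @code{product}, @code{odd_part_factorial} and
@code{factorial} are hypothetical, the step that packs several factors into
one machine word is omitted, and the sieve-based method for large @math{n} is
not shown. The shift count uses the standard identity that the power of 2
dividing @math{n!} is @math{n} minus the number of 1 bits in @math{n}, which
is an addition here and is not taken from the text above.

@example
```python
def product(factors):
    # Binary splitting: multiply two halves of roughly equal size.
    if len(factors) == 0:
        return 1
    if len(factors) == 1:
        return factors[0]
    mid = len(factors) // 2
    return product(factors[:mid]) * product(factors[mid:])


def odd_part_factorial(n):
    # Odd numbers up to n, then up to n//2, n//4, ... in a single list.
    factors = []
    while n > 1:
        factors.extend(range(3, n + 1, 2))
        n //= 2
    return product(factors)


def factorial(n):
    # The power of 2 dividing n! is n minus the count of 1 bits in n.
    shift = n - bin(n).count("1")
    return odd_part_factorial(n) << shift
```
@end example

For @math{n=23} the list holds the three groups of odd factors from the
example above, and the shift is 19.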

Such splitting is more efficient than repeated N@cross{}1 multiplies since it
forms big multiplies, allowing Karatsuba and higher algorithms to be used.
And even below the Karatsuba threshold a big block of work can be more
efficient for the basecase algorithm.


@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms
@subsection Binomial Coefficients
@cindex Binomial coefficient algorithm

Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated
by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) =
\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then
evaluating the following product simply from @math{i=2} to @math{i=k}.
@tex
$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
@end tex
@ifnottex

@example
                      k  (n-k+i)
C(n,k) =  (n-k+1) * prod -------
                     i=2    i
@end example

@end ifnottex
It's easy to show that each denominator @math{i} will divide the product so
far, so the exact division algorithm is used (@pxref{Exact Division}).

The numerators @math{n-k+i} and denominators @math{i} are first accumulated
in groups of as many as fit in a limb, to save multi-precision operations,
though for @code{mpz_bin_ui} this applies only to the divisors, since @math{n}
is an @code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all.


@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
@subsection Fibonacci Numbers
@cindex Fibonacci number algorithm

The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
values efficiently.

For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is
used.
On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
up to @m{F_{93},F[93]}. For convenience the table starts at @m{F_{-1},F[-1]}.

Beyond the table, values are generated with a binary powering algorithm,
calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
low across the bits of @math{n}. The formulas used are
@tex
$$\eqalign{
  F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
  F_{2k-1} &=  F_k^2 + F_{k-1}^2           \cr
  F_{2k}   &= F_{2k+1} - F_{2k-1}
}$$
@end tex
@ifnottex

@example
F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
F[2k-1] =   F[k]^2 + F[k-1]^2

F[2k] = F[2k+1] - F[2k-1]
@end example

@end ifnottex
At each step, @math{k} is the high @math{b} bits of @math{n}. If the next bit
of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if
it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process
repeated until all bits of @math{n} are incorporated. Notice these formulas
require just two squares per bit of @math{n}.

It'd be possible to handle the first few @math{n} above the single limb table
with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} =
F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually
turns out to be faster for only about 10 or 20 values of @math{n}, and
including a block of code for just those doesn't seem worthwhile. If they
really mattered it'd be better to extend the data table.

Using a table avoids lots of calculations on small numbers, and makes small
@math{n} go fast. A bigger table would make more small @math{n} go fast, it's
just a question of balancing size against desired speed. For GMP the code is
kept compact, with the emphasis primarily on a good powering algorithm.
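
The bit-by-bit powering step can be sketched in Python using the three
formulas above. The function name @code{fib_pair} is hypothetical, and the
sketch starts from @m{F_1,F[1]} and @m{F_0,F[0]} at the top bit instead of
from a precomputed table as the GMP code does.

@example
```python
def fib_pair(n):
    # Returns (F[n], F[n-1]) for n >= 0 (illustrative sketch only).
    if n == 0:
        return 0, 1                     # F[0], F[-1]
    f, f1 = 1, 0                        # F[1], F[0]: k is the top bit of n
    for i in range(n.bit_length() - 2, -1, -1):
        k_odd = (n >> (i + 1)) & 1      # parity of the current k
        sq, sq1 = f * f, f1 * f1        # the two squarings for this bit
        f_2k_plus = 4 * sq - sq1 + (-2 if k_odd else 2)
        f_2k_minus = sq + sq1
        f_2k = f_2k_plus - f_2k_minus
        if (n >> i) & 1:
            f, f1 = f_2k_plus, f_2k     # k becomes 2k+1
        else:
            f, f1 = f_2k, f_2k_minus    # k becomes 2k
    return f, f1
```
@end example

For example @code{fib_pair(10)} returns the pair 55, 34, and the first
element of @code{fib_pair(93)} is the largest Fibonacci number fitting in a
64-bit limb.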

@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but
@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}. In this case the last
step of the algorithm can become one multiply instead of two squares. One of
the following two formulas is used, according to whether @math{n} is even or
odd.
@tex
$$\eqalign{
  F_{2k}   &= F_k (F_k + 2F_{k-1}) \cr
  F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k
}$$
@end tex
@ifnottex

@example
F[2k]   = F[k]*(F[k]+2F[k-1])

F[2k+1] = (2F[k]+F[k-1])*(2F[k]-F[k-1]) + 2*(-1)^k
@end example

@end ifnottex
@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a
multiply. For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above
can be applied just to the low limb of the calculation, without a carry or
borrow into further limbs, which saves some code size. See comments with
@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done.


@node Lucas Numbers Algorithm, Random Number Algorithms, Fibonacci Numbers Algorithm, Other Algorithms
@subsection Lucas Numbers
@cindex Lucas number algorithm

@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci
numbers with the following simple formulas.
@tex
$$\eqalign{
  L_k     &=  F_k + 2F_{k-1} \cr
  L_{k-1} &= 2F_k -  F_{k-1}
}$$
@end tex
@ifnottex

@example
L[k]   =   F[k] + 2*F[k-1]
L[k-1] = 2*F[k] -   F[k-1]
@end example

@end ifnottex
@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be
saved. Trailing zero bits on @math{n} can be handled with a single square
each.
@tex
$$ L_{2k} = L_k^2 - 2(-1)^k $$
@end tex
@ifnottex

@example
L[2k] = L[k]^2 - 2*(-1)^k
@end example

@end ifnottex
And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci
numbers, similar to what @code{mpz_fib_ui} does.
@tex
$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$
@end tex
@ifnottex

@example
L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k
@end example

@end ifnottex


@node Random Number Algorithms,  , Lucas Numbers Algorithm, Other Algorithms
@subsection Random Numbers
@cindex Random number algorithms

For the @code{urandomb} functions, random numbers are generated simply by
concatenating bits produced by the generator. As long as the generator has
good randomness properties this will produce well-distributed @math{N} bit
numbers.

For the @code{urandomm} functions, random numbers in a range @math{0@le{}R<N}
are generated by taking values @math{R} of @m{\lceil \log_2 N \rceil,
ceil(log2(N))} bits each until one satisfies @math{R<N}. This will normally
require only one or two attempts, but the attempts are limited in case the
generator is somehow degenerate and produces only 1 bits or similar.

@cindex Mersenne twister algorithm
The Mersenne Twister generator is by Matsumoto and Nishimura
(@pxref{References}). It has a non-repeating period of @math{2^@W{19937}-1},
which is a Mersenne prime, hence the name of the generator. The state is 624
words of 32 bits each, which is iterated with one XOR and shift for each
32-bit word generated, making the algorithm very fast. Randomness properties
are also very good and this is the default algorithm used by GMP.

@cindex Linear congruential algorithm
Linear congruential generators are described in many textbooks, for instance
Knuth volume 2 (@pxref{References}).
With a modulus @math{M} and parameters
@math{A} and @math{C}, an integer state @math{S} is iterated by the formula
@math{S @leftarrow{} A@GMPmultiply{}S+C @bmod{} M}. At each step the new
state is a linear function of the previous, mod @math{M}, hence the name of
the generator.

In GMP only moduli of the form @math{2^N} are supported, and the current
implementation is not as well optimized as it could be. Overheads are
significant when @math{N} is small, and when @math{N} is large clearly the
multiply at each step will become slow. This is not a big concern, since the
Mersenne Twister generator is better in every respect and is therefore
recommended for all normal applications.

For both generators the current state can be deduced by observing enough
output and applying some linear algebra (over GF(2) in the case of the
Mersenne Twister). This generally means raw output is unsuitable for
cryptographic applications without further hashing or the like.


@node Assembly Coding,  , Other Algorithms, Algorithms
@section Assembly Coding
@cindex Assembly coding

The assembly subroutines in GMP are the most significant source of speed at
small to moderate sizes. At larger sizes algorithm selection becomes more
important, but of course speedups in low level routines will still speed up
everything proportionally.

Carry handling and widening multiplies that are important for GMP can't be
easily expressed in C@. GCC @code{asm} blocks help a lot and are provided in
@file{longlong.h}, but hand coding low level routines invariably offers a
speedup over generic C by a factor of anything from 2 to 10.

@menu
* Assembly Code Organisation::
* Assembly Basics::
* Assembly Carry Propagation::
* Assembly Cache Handling::
* Assembly Functional Units::
* Assembly Floating Point::
* Assembly SIMD Instructions::
* Assembly Software Pipelining::
* Assembly Loop Unrolling::
* Assembly Writing Guide::
@end menu


@node Assembly Code Organisation, Assembly Basics, Assembly Coding, Assembly Coding
@subsection Code Organisation
@cindex Assembly code organisation
@cindex Code organisation

The various @file{mpn} subdirectories contain machine-dependent code, written
in C or assembly. The @file{mpn/generic} subdirectory contains default code,
used when there's no machine-specific version of a particular file.

Each @file{mpn} subdirectory is for an ISA family. Generally 32-bit and
64-bit variants in a family cannot share code and have separate directories.
Within a family further subdirectories may exist for CPU variants.

In each directory a @file{nails} subdirectory may exist, holding code with
nails support for that CPU variant. A @code{NAILS_SUPPORT} directive in each
file indicates the nails values the code handles. Nails code only exists
where it's faster, or promises to be faster, than plain code. There's no
effort put into nails if they're not going to enhance a given CPU.


@node Assembly Basics, Assembly Carry Propagation, Assembly Code Organisation, Assembly Coding
@subsection Assembly Basics

@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines
for overall GMP performance. All multiplications and divisions come down to
repeated calls to these. @code{mpn_add_n}, @code{mpn_sub_n},
@code{mpn_lshift} and @code{mpn_rshift} are next most important.
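
To make the shape of such a routine concrete, here is a minimal C sketch of
what an @code{mpn_addmul_1} style loop computes. It is only an illustration
under stated assumptions, not the GMP implementation: the limb is taken to be
32 bits so that a 64-bit C type can hold the widened product, and the names
@code{limb32_t} and @code{addmul_1_sketch} are invented here.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb32_t;   /* hypothetical 32-bit limb for this sketch */

/* Illustration only.  Adds {up,n} * v into {rp,n}, least significant
   limb first, and returns the carry-out limb. */
limb32_t addmul_1_sketch(limb32_t *rp, const limb32_t *up, size_t n,
                         limb32_t v)
{
    assert(n >= 1);          /* this sketch assumes at least one limb */

    limb32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* Widening 32x32->64 multiply plus two 32-bit addends.  The
           largest possible value is (2^32-1)^2 + 2*(2^32-1) = 2^64-1,
           so the sum cannot overflow the 64-bit temporary. */
        uint64_t t = (uint64_t)up[i] * v + rp[i] + carry;
        rp[i] = (limb32_t)t;            /* low half is the result limb */
        carry = (limb32_t)(t >> 32);    /* high half is the next carry */
    }
    return carry;
}
```
@end verbatim

The only dependency between iterations is @code{carry}, which is why the
carry handling dominates how fast such a loop can run.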

On some CPUs assembly versions of the internal functions
@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups,
mainly through avoiding function call overheads. They can also potentially
make better use of a wide superscalar processor, as can bigger primitives like
@code{mpn_addmul_2} or @code{mpn_addmul_4}.

The restrictions on overlaps between sources and destinations
(@pxref{Low-level Functions}) are designed to facilitate a variety of
implementations. For example, knowing @code{mpn_add_n} won't have partly
overlapping sources and destination means reading can be done far ahead of
writing on superscalar processors, and loops can be vectorized on a vector
processor, depending on the carry handling.


@node Assembly Carry Propagation, Assembly Cache Handling, Assembly Basics, Assembly Coding
@subsection Carry Propagation
@cindex Assembly carry propagation

The problem that presents most challenges in GMP is propagating carries from
one limb to the next. In functions like @code{mpn_addmul_1} and
@code{mpn_add_n}, carries are the only dependencies between limb operations.

On processors with carry flags, a straightforward CISC style @code{adc} is
generally best. AMD K6 @code{mpn_addmul_1} however is an example of an
unusual set of circumstances where a branch works out better.

On RISC processors generally an add and compare for overflow is used. This
sort of thing can be seen in @file{mpn/generic/aors_n.c}. Some carry
propagation schemes require 4 instructions, meaning at least 4 cycles per
limb, but other schemes may use just 1 or 2. On wide superscalar processors
performance may be completely determined by the number of dependent
instructions between carry-in and carry-out for each limb.
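
The add and compare idiom can be sketched in portable C as follows. This is
an illustration of the general technique rather than a copy of the GMP
source; the 64-bit limb type @code{limb64_t} and the name
@code{add_n_sketch} are assumptions made for the example.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb64_t;   /* hypothetical 64-bit limb for this sketch */

/* Illustration only.  {rp,n} = {ap,n} + {bp,n}, returning the carry out.
   No carry flag is used; each carry is recovered by comparing a sum with
   one of its operands, since an unsigned sum that wrapped is smaller
   than either input. */
limb64_t add_n_sketch(limb64_t *rp, const limb64_t *ap, const limb64_t *bp,
                      size_t n)
{
    assert(n >= 1);          /* this sketch assumes at least one limb */

    limb64_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        limb64_t a = ap[i];
        limb64_t s = a + bp[i];      /* add the two limbs             */
        limb64_t c1 = s < a;         /* 1 if that addition wrapped    */
        limb64_t r = s + cy;         /* add the carry-in              */
        limb64_t c2 = r < s;         /* 1 if the carry-in wrapped it  */
        cy = c1 | c2;                /* at most one of them can be 1  */
        rp[i] = r;
    }
    return cy;
}
```
@end verbatim

Here the chain from carry-in to carry-out is an add, a compare and an OR,
which is the kind of dependent sequence that limits speed on a wide
superscalar processor.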

On vector processors good use can be made of the fact that a carry bit only
very rarely propagates more than one limb. When adding a single bit to a
limb, there's only a carry out if that limb was @code{0xFF@dots{}FF} which on
random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}},
2^mp_bits_per_limb}. @file{mpn/cray/add_n.c} is an example of this; it adds
all limbs in parallel, adds one set of carry bits in parallel and then only
rarely needs to fall through to a loop propagating further carries.

On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code
for the RISC style idioms that are necessary to handle carry bits in
C@. Often conditional jumps are generated where @code{adc} or @code{sbb} forms
would be better. And so unfortunately almost any loop involving carry bits
needs to be coded in assembly for best results.


@node Assembly Cache Handling, Assembly Functional Units, Assembly Carry Propagation, Assembly Coding
@subsection Cache Handling
@cindex Assembly cache handling

GMP aims to perform well both on operands that fit entirely in L1 cache and
those which don't.

Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on
large operands, so L2 and main memory performance is important for them.
@code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and
square basecases, so L1 performance matters most for them, unless assembly
versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in
which case the remaining uses are mostly for larger operands.

For L2 or main memory operands, memory access times will almost certainly be
more than the calculation time. The aim therefore is to maximize memory
throughput, by starting a load of the next cache line while processing the
contents of the previous one.
Clearly this is only possible if the chip has a
lock-up free cache or some sort of prefetch instruction. Most current chips
have both these features.

Prefetching sources combines well with loop unrolling, since a prefetch can be
initiated once per unrolled loop (or more than once if the loop covers more
than one cache line).

On CPUs without write-allocate caches, prefetching destinations will ensure
individual stores don't go further down the cache hierarchy, limiting
bandwidth. Of course for calculations which are slow anyway, like
@code{mpn_divrem_1}, write-throughs might be fine.

The distance ahead to prefetch will be determined by memory latency versus
throughput. The aim of course is to have data arriving continuously, at peak
throughput. Some CPUs have limits on the number of fetches or prefetches in
progress.

If a special prefetch instruction doesn't exist then a plain load can be used,
but in that case care must be taken not to attempt to read past the end of an
operand, since that might produce a segmentation violation.

Some CPUs or systems have hardware that detects sequential memory accesses and
initiates suitable cache movements automatically, making life easy.


@node Assembly Functional Units, Assembly Floating Point, Assembly Cache Handling, Assembly Coding
@subsection Functional Units

When choosing an approach for an assembly loop, consideration is given to
what operations can execute simultaneously and what throughput can thereby be
achieved. In some cases an algorithm can be tweaked to accommodate available
resources.

Loop control will generally require a counter and pointer updates, costing as
much as 5 instructions, plus any delays a branch introduces.
CPU addressing
modes might reduce pointer updates, perhaps by allowing just one updating
pointer and others expressed as offsets from it, or on CISC chips with all
addressing done with the loop counter as a scaled index.

The final loop control cost can be amortised by processing several limbs in
each iteration (@pxref{Assembly Loop Unrolling}). This at least ensures loop
control isn't a big fraction of the work done.

Memory throughput is always a limit. If perhaps only one load or one store
can be done per cycle then 3 cycles/limb will be the top speed for ``binary''
operations like @code{mpn_add_n}, and any code achieving that is optimal.

Integer resources can be freed up by having the loop counter in a float
register, or by pressing the float units into use for some multiplying,
perhaps doing every second limb on the float side (@pxref{Assembly Floating
Point}).

Float resources can be freed up by doing carry propagation on the integer
side, or even by doing integer to float conversions in integers using bit
twiddling.


@node Assembly Floating Point, Assembly SIMD Instructions, Assembly Functional Units, Assembly Coding
@subsection Floating Point
@cindex Assembly floating point

Floating point arithmetic is used in GMP for multiplications on CPUs with poor
integer multipliers. It's mostly useful for @code{mpn_mul_1},
@code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and
@code{mpn_mul_basecase} on both 32-bit and 64-bit machines.

With IEEE 53-bit double precision floats, integer multiplications producing up
to 53 bits will give exact results. Breaking a 64@cross{}64 multiplication
into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient. With
some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be
used, if one of the lower two 21-bit pieces also uses the sign bit.
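
The exactness of the eight-piece scheme can be checked with a small C sketch.
This is only an illustration, assuming @code{double} is an IEEE 53-bit format;
the names @code{u128_sketch}, @code{fmul_16x32} and
@code{mul_64x64_via_double} are invented here, and the partial products are
summed with plain integer shifts rather than in the scheduled way a real loop
would use.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit result type for this sketch. */
typedef struct { uint64_t hi, lo; } u128_sketch;

/* One 16x32 -> 48 bit product done in floating point.  Both inputs and
   the product are below 2^53, so the double result is exact (assuming
   IEEE double).  The conversion back goes through a signed type. */
static uint64_t fmul_16x32(uint32_t a, uint16_t b)
{
    double p = (double)a * (double)b;
    return (uint64_t)(int64_t)p;
}

/* Add p << shift into the 128-bit accumulator. */
static void add_shifted(u128_sketch *acc, uint64_t p, unsigned shift)
{
    assert(shift <= 80 && p < ((uint64_t)1 << 48));

    uint64_t lo_part, hi_part;
    if (shift == 0) {
        lo_part = p;
        hi_part = 0;
    } else if (shift < 64) {
        lo_part = p << shift;
        hi_part = p >> (64 - shift);
    } else {
        lo_part = 0;
        hi_part = p << (shift - 64);
    }
    uint64_t old = acc->lo;
    acc->lo += lo_part;
    acc->hi += hi_part + (acc->lo < old);   /* carry from the low half */
}

/* Illustration only: u is split into two 32-bit pieces and v into four
   16-bit pieces, giving eight exact floating point products. */
u128_sketch mul_64x64_via_double(uint64_t u, uint64_t v)
{
    u128_sketch acc = { 0, 0 };
    for (unsigned j = 0; j < 2; j++) {
        uint32_t upiece = (uint32_t)(u >> (32 * j));
        for (unsigned i = 0; i < 4; i++) {
            uint16_t vpiece = (uint16_t)(v >> (16 * i));
            add_shifted(&acc, fmul_16x32(upiece, vpiece), 32 * j + 16 * i);
        }
    }
    return acc;
}
```
@end verbatim

For example multiplying @code{0xFFFFFFFFFFFFFFFF} by itself this way gives a
high word of @code{0xFFFFFFFFFFFFFFFE} and a low word of 1, the same as an
exact integer multiply.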

For the @code{mpn_mul_1} family of functions on a 64-bit machine, the
invariant single limb is split at the start, into 3 or 4 pieces. Inside the
loop, the bignum operand is split into 32-bit pieces. Fast conversion of
these unsigned 32-bit pieces to floating point is highly machine-dependent.
In some cases, reading the data into the integer unit, zero-extending to
64 bits, then transferring back to the floating point unit via memory is the
only option.

Converting partial products back to 64-bit limbs is usually best done as a
signed conversion. Since all values are smaller than @m{2^{53},2^53}, signed
and unsigned are the same, but most processors lack unsigned conversions.

@sp 2

Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or
@code{mpn_addmul_1} with a 64-bit limb. The single limb operand V is split
into four 16-bit parts. The multi-limb operand U is split in the loop into
two 32-bit parts.

@tex
\global\newdimen\GMPbits      \global\GMPbits=0.18em
\def\GMPbox#1#2#3{%
  \hbox{%
    \hbox to 128\GMPbits{\hfil
      \vbox{%
        \hrule
        \hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
        \hrule}%
      \hskip #1\GMPbits}%
    \raise \GMPboxdepth \hbox{\hskip 2em #3}}}
%
\GMPdisplay{%
  \vbox{%
    \hbox{%
      \hbox to 128\GMPbits {\hfil
        \vbox{%
          \hrule
          \hbox to 64\GMPbits{%
            \GMPvrule \hfil$v48$\hfil
            \vrule    \hfil$v32$\hfil
            \vrule    \hfil$v16$\hfil
            \vrule    \hfil$v00$\hfil
            \vrule}
          \hrule}}%
       \raise \GMPboxdepth \hbox{\hskip 2em V Operand}}
    \vskip 0.5ex
    \hbox{%
      \hbox to 128\GMPbits {\hfil
        \raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}%
        \vbox{%
          \hrule
          \hbox to 64\GMPbits {%
            \GMPvrule \hfil$u32$\hfil
            \vrule \hfil$u00$\hfil
            \vrule}%
          \hrule}}%
       \raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}%
    \vskip 0.5ex
    \hbox{\vbox to 2ex{\hrule width 128\GMPbits}}%
    \GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}%
    \vskip 0.5ex
    \GMPbox{16}{u00 \times v16}{$p16$}
    \vskip 0.5ex
    \GMPbox{32}{u00 \times v32}{$p32$}
    \vskip 0.5ex
    \GMPbox{48}{u00 \times v48}{$p48$}
    \vskip 0.5ex
    \GMPbox{32}{u32 \times v00}{$r32$}
    \vskip 0.5ex
    \GMPbox{48}{u32 \times v16}{$r48$}
    \vskip 0.5ex
    \GMPbox{64}{u32 \times v32}{$r64$}
    \vskip 0.5ex
    \GMPbox{80}{u32 \times v48}{$r80$}
}}
@end tex
@ifnottex
@example
@group
                +---+---+---+---+
                |v48|v32|v16|v00|    V operand
                +---+---+---+---+

                +-------+---+---+
            x   |  u32  |  u00  |    U operand (one limb)
                +---------------+

---------------------------------

                    +-----------+
                    | u00 x v00 |    p00    48-bit products
                    +-----------+
                +-----------+
                | u00 x v16 |        p16
                +-----------+
            +-----------+
            | u00 x v32 |            p32
            +-----------+
        +-----------+
        | u00 x v48 |                p48
        +-----------+
            +-----------+
            | u32 x v00 |            r32
            +-----------+
        +-----------+
        | u32 x v16 |                r48
        +-----------+
    +-----------+
    | u32 x v32 |                    r64
    +-----------+
+-----------+
| u32 x v48 |                        r80
+-----------+
@end group
@end example
@end ifnottex

@math{p32} and @math{r32} can be summed using floating-point addition, and
likewise @math{p48} and @math{r48}. @math{p00} and @math{p16} can be summed
with @math{r64} and @math{r80} from the previous iteration.

For each loop then, four 49-bit quantities are transferred to the integer unit,
aligned as follows,

@tex
% GMPbox here should be 49 bits wide, but use 51 to better show p16+r80'
% crossing into the upper 64 bits.
\def\GMPbox#1#2#3{%
  \hbox{%
    \hbox to 128\GMPbits {%
      \hfil
      \vbox{%
        \hrule
        \hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
        \hrule}%
      \hskip #1\GMPbits}%
    \raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}%
}}
\newbox\b \setbox\b\hbox{64 bits}%
\newdimen\bw \bw=\wd\b \advance\bw by 2em
\newdimen\x \x=128\GMPbits
\advance\x by -2\bw
\divide\x by4
\GMPdisplay{%
  \vbox{%
    \hbox to 128\GMPbits {%
      \GMPvrule
      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
      \hfil 64 bits\hfil
      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
      \vrule
      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
      \hfil 64 bits\hfil
      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
      \vrule}%
    \vskip 0.7ex
    \GMPbox{0}{p00+r64'}{i00}
    \vskip 0.5ex
    \GMPbox{16}{p16+r80'}{i16}
    \vskip 0.5ex
    \GMPbox{32}{p32+r32}{i32}
    \vskip 0.5ex
    \GMPbox{48}{p48+r48}{i48}
}}
@end tex
@ifnottex
@example
@group
|-----64bits----|-----64bits----|
                   +------------+
                   | p00 + r64' |    i00
                   +------------+
               +------------+
               | p16 + r80' |        i16
               +------------+
           +------------+
           | p32 + r32  |            i32
           +------------+
       +------------+
       | p48 + r48  |                i48
       +------------+
@end group
@end example
@end ifnottex

The challenge then is to sum these efficiently and add in a carry limb,
generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48}
extends 33 bits into the high half).


@node Assembly SIMD Instructions, Assembly Software Pipelining, Assembly Floating Point, Assembly Coding
@subsection SIMD Instructions
@cindex Assembly SIMD

The single-instruction multiple-data support in current microprocessors is
aimed at signal processing algorithms where each data point can be treated
more or less independently. There's generally not much support for
propagating the sort of carries that arise in GMP.

SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much
work as one 32@cross{}32 from GMP's point of view, and need some shifts and
adds besides. But of course if say the SIMD form is fully pipelined and uses
less instruction decoding then it may still be worthwhile.

On the x86 chips, MMX has so far found a use in @code{mpn_rshift} and
@code{mpn_lshift}, and is used in a special case for 16-bit multipliers in the
P55 @code{mpn_mul_1}. SSE2 is used for Pentium 4 @code{mpn_mul_1},
@code{mpn_addmul_1}, and @code{mpn_submul_1}.


@node Assembly Software Pipelining, Assembly Loop Unrolling, Assembly SIMD Instructions, Assembly Coding
@subsection Software Pipelining
@cindex Assembly software pipelining

Software pipelining consists of scheduling instructions around the branch
point in a loop. For example a loop might issue a load not for use in the
present iteration but the next, thereby allowing extra cycles for the data to
arrive from memory.

Naturally this is wanted only when doing things like loads or multiplies that
take several cycles to complete, and only where a CPU has multiple functional
units so that other work can be done in the meantime.

A pipeline with several stages will have a data value in progress at each
stage and each loop iteration moves them along one stage. This is like
juggling.

If the latency of some instruction is greater than the loop time then it will
be necessary to unroll, so one register has a result ready to use while
another (or multiple others) are still in progress (@pxref{Assembly Loop
Unrolling}).


@node Assembly Loop Unrolling, Assembly Writing Guide, Assembly Software Pipelining, Assembly Coding
@subsection Loop Unrolling
@cindex Assembly loop unrolling

Loop unrolling consists of replicating code so that several limbs are
processed in each loop. At a minimum this reduces loop overheads by a
corresponding factor, but it can also allow better register usage, for example
alternately using one register combination and then another. Judicious use of
@command{m4} macros can help avoid lots of duplication in the source code.

Any amount of unrolling can be handled with a loop counter that's decremented
by @math{N} each time, stopping when the remaining count is less than the
further @math{N} the loop will process. Or by subtracting @math{N} at the
start, the termination condition becomes when the counter @math{C} is less
than 0 (and the count of remaining limbs is @math{C+N}).

Alternatively, for a power of 2 unroll, the loop count and remainder can be
established with a shift and mask. This is convenient if also making a
computed jump into the middle of a large loop.
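
The counter arrangement with @math{N} subtracted at the start can be sketched
in C as follows, for @math{N=4}. This is an illustration only; the operation
(a ones complement of each limb) and the name @code{com_unrolled_sketch} are
chosen just for the example, and the leftover limbs are handled here by a
plain loop at the end.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Illustration only.  Stores the ones complement of {up,n} at {rp,n},
   processing 4 limbs per iteration of the main loop. */
void com_unrolled_sketch(uint64_t *rp, const uint64_t *up, long n)
{
    assert(n >= 0);

    long c = n - 4;          /* counter C with N = 4 subtracted up front */
    while (c >= 0) {         /* at least 4 limbs remain                  */
        rp[0] = ~up[0];
        rp[1] = ~up[1];
        rp[2] = ~up[2];
        rp[3] = ~up[3];
        rp += 4;
        up += 4;
        c -= 4;
    }

    c += 4;                  /* remaining limbs, C+N, now 0 to 3 */
    while (c-- > 0)
        *rp++ = ~*up++;
}
```
@end verbatim

With a power of 2 unroll such as this one, the same split could instead be
obtained as @code{n >> 2} passes of the main loop and @code{n & 3} leftover
limbs.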

The limbs not a multiple of the unrolling can be handled in various ways, for
example

@itemize @bullet
@item
A simple loop at the end (or the start) to process the excess. Care will be
wanted that it isn't too much slower than the unrolled part.

@item
A set of binary tests, for example after an 8-limb unrolling, test for 4 more
limbs to process, then a further 2 more or not, and finally 1 more or not.
This will probably take more code space than a simple loop.

@item
A @code{switch} statement, providing separate code for each possible excess,
for example an 8-limb unrolling would have separate code for 0 remaining, 1
remaining, etc, up to 7 remaining. This might take a lot of code, but may be
the best way to optimize all cases in combination with a deep pipelined loop.

@item
A computed jump into the middle of the loop, thus making the first iteration
handle the excess. This should make times smoothly increase with size, which
is attractive, but setups for the jump and adjustments for pointers can be
tricky and could become quite difficult in combination with deep pipelining.
@end itemize


@node Assembly Writing Guide,  , Assembly Loop Unrolling, Assembly Coding
@subsection Writing Guide
@cindex Assembly writing guide

This is a guide to writing software pipelined loops for processing limb
vectors in assembly.

First determine the algorithm and which instructions are needed. Code it
without unrolling or scheduling, to make sure it works. On a 3-operand CPU
try to write each new value to a new register; this will greatly simplify later
steps.

Then note for each instruction the functional unit and/or issue port
requirements. If an instruction can use either of two units, like U0 or U1
then make a category ``U0/U1''.
Count the total using each unit (or combined
unit), and count all instructions.

Figure out from those counts the best possible loop time. The goal will be to
find a perfect schedule where instruction latencies are completely hidden.
The total instruction count might be the limiting factor, or perhaps a
particular functional unit. It might be possible to tweak the instructions to
help the limiting factor.

Suppose the loop time is @math{N}, then make @math{N} issue buckets, with the
final loop branch at the end of the last. Now fill the buckets with dummy
instructions using the functional units desired. Run this to make sure the
intended speed is reached.

Now replace the dummy instructions with the real instructions from the slow
but correct loop you started with. The first will typically be a load
instruction. Then the instruction using that value is placed in a bucket an
appropriate distance down. Run the loop again, to check it still runs at
target speed.

Keep placing instructions, frequently measuring the loop. After a few you
will need to wrap around from the last bucket back to the top of the loop. If
you used the new-register for new-value strategy above then there will be no
register conflicts. If not then take care not to clobber something already in
use. Changing registers at this time is very error prone.

The loop will overlap two or more of the original loop iterations, and the
computation of one vector element result will be started in one iteration of
the new loop, and completed one or several iterations later.

The final step is to create feed-in and wind-down code for the loop.
A good
way to do this is to make a copy (or copies) of the loop at the start and
delete those instructions which don't have valid antecedents, and at the end
replicate and delete those whose results are unwanted (including any further
loads).

The loop will have a minimum number of limbs loaded and processed, so the
feed-in code must test if the request size is smaller and skip either to a
suitable part of the wind-down or to special code for small sizes.


@node Internals, Contributors, Algorithms, Top
@chapter Internals
@cindex Internals

@strong{This chapter is provided only for informational purposes and the
various internals described here may change in future GMP releases.
Applications expecting to be compatible with future releases should use only
the documented interfaces described in previous chapters.}

@menu
* Integer Internals::
* Rational Internals::
* Float Internals::
* Raw Output Internals::
* C++ Interface Internals::
@end menu

@node Integer Internals, Rational Internals, Internals, Internals
@section Integer Internals
@cindex Integer internals

@code{mpz_t} variables represent integers using sign and magnitude, in space
dynamically allocated and reallocated. The fields are as follows.

@table @asis
@item @code{_mp_size}
The number of limbs, or the negative of that when representing a negative
integer. Zero is represented by @code{_mp_size} set to zero, in which case
the @code{_mp_d} data is undefined.

@item @code{_mp_d}
A pointer to an array of limbs which is the magnitude. These are stored
``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the
least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most
significant.
Whenever @code{_mp_size} is non-zero, the most significant limb
is non-zero.

Currently there's always at least one readable limb, so for instance
@code{mpz_get_ui} can fetch @code{_mp_d[0]} unconditionally (though its value
is undefined if @code{_mp_size} is zero).

@item @code{_mp_alloc}
@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
and normally @code{_mp_alloc >= ABS(_mp_size)}. When an @code{mpz} routine
is about to (or might be about to) increase @code{_mp_size}, it checks
@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
@code{MPZ_REALLOC} is generally used for this.

@code{mpz_t} variables initialised with the @code{mpz_roinit_n} function or
the @code{MPZ_ROINIT_N} macro have @code{_mp_alloc = 0} but can have a
non-zero @code{_mp_size}. They can only be used as read-only constants. See
@ref{Integer Special Functions} for details.
@end table

The various bitwise logical functions like @code{mpz_and} behave as if
negative values were twos complement. But sign and magnitude is always used
internally, and necessary adjustments are made during the calculations.
Sometimes this isn't pretty, but sign and magnitude are best for other
routines.

Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
allocation functions. Care is taken to ensure that these are big enough that
no reallocation is necessary (since it would have unpredictable consequences).

@code{_mp_size} and @code{_mp_alloc} are @code{int}, although @code{mp_size_t}
is usually a @code{long}. This is done to make the fields just 32 bits on
some 64-bit systems, thereby saving a few bytes of data space but still
providing plenty of range.
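
The sign and magnitude layout can be illustrated with a small stand-alone C
sketch. The struct @code{mpz_like} and its helper functions are hypothetical
stand-ins invented for this example, with a 64-bit limb assumed; they mirror
the fields described above but are not the GMP definitions, and applications
should keep to the documented interfaces.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields described above; the names and
   the 64-bit limb type are assumptions made for this sketch. */
typedef struct {
    int       alloc;   /* limbs allocated at d (like _mp_alloc)      */
    int       size;    /* limbs in use, negated for negative values  */
    uint64_t *d;       /* magnitude, least significant limb first    */
} mpz_like;

/* Number of limbs in the magnitude, i.e. ABS(size). */
int mpz_like_abs_size(const mpz_like *z)
{
    return z->size < 0 ? -z->size : z->size;
}

/* Sign of the value: -1, 0 or 1, taken from the sign of size alone. */
int mpz_like_sgn(const mpz_like *z)
{
    return (z->size > 0) - (z->size < 0);
}

/* Check the rule that when size is non-zero the most significant limb
   d[ABS(size)-1] is non-zero. */
int mpz_like_is_normalized(const mpz_like *z)
{
    int n = mpz_like_abs_size(z);
    return n == 0 || z->d[n - 1] != 0;
}

/* Convert a value of at most one limb whose magnitude fits int64_t. */
int64_t mpz_like_get_small(const mpz_like *z)
{
    int n = mpz_like_abs_size(z);
    assert(n <= 1);
    if (n == 0)
        return 0;                       /* zero: d is not examined */
    assert(z->d[0] <= (uint64_t)INT64_MAX);
    return z->size < 0 ? -(int64_t)z->d[0] : (int64_t)z->d[0];
}
```
@end verbatim

In this sketch the value @math{-5} would be held as @code{size} equal to
@math{-1} with a single limb 5, and only the sign of @code{size} distinguishes
it from @math{+5}.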


@node Rational Internals, Float Internals, Integer Internals, Internals
@section Rational Internals
@cindex Rational internals

@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
denominator (@pxref{Integer Internals}).

The canonical form adopted is denominator positive (and non-zero), no common
factors between numerator and denominator, and zero uniquely represented as
0/1.

It's believed that casting out common factors at each stage of a calculation
is best in general. A GCD is an @math{O(N^2)} operation so it's better to do
a few small ones immediately than to delay and have to do a big one later.
Knowing the numerator and denominator have no common factors can be used for
example in @code{mpq_mul} to make only two cross GCDs necessary, not four.

This general approach to common factors is badly sub-optimal in the presence
of simple factorizations or little prospect for cancellation, but GMP has no
way to know when this will occur. As per @ref{Efficiency}, that's left to
applications. The @code{mpq_t} framework might still suit, with
@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and
denominator, or of course @code{mpz_t} variables can be used directly.


@node Float Internals, Raw Output Internals, Rational Internals, Internals
@section Float Internals
@cindex Float internals

Efficient calculation is the primary aim of GMP floats and the use of whole
limbs and simple rounding facilitates this.

@code{mpf_t} floats have a variable precision mantissa and a single machine
word signed exponent. The mantissa is represented using sign and magnitude.

@c FIXME: The arrow heads don't join to the lines exactly.
@tex
\global\newdimen\GMPboxwidth \GMPboxwidth=5em
\global\newdimen\GMPboxheight \GMPboxheight=3ex
\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
\GMPdisplay{%
\vbox{%
  \hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb}
  \vskip 0.7ex
  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
  \hbox {
    \hbox to 3\GMPboxwidth {%
      \setbox 0 = \hbox{@code{\_mp\_exp}}%
      \dimen0=3\GMPboxwidth
      \advance\dimen0 by -\wd0
      \divide\dimen0 by 2
      \advance\dimen0 by -1em
      \setbox1 = \hbox{$\rightarrow$}%
      \dimen1=\dimen0
      \advance\dimen1 by -\wd1
      \GMPcentreline{\dimen0}%
      \hfil
      \box0%
      \hfil
      \GMPcentreline{\dimen1{}}%
      \box1}
    \hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}}
  \vskip 0.5ex
  \vbox {%
    \hrule
    \hbox{%
      \vrule height 2ex depth 1ex
      \hbox to \GMPboxwidth {}%
      \vrule
      \hbox to \GMPboxwidth {}%
      \vrule
      \hbox to \GMPboxwidth {}%
      \vrule
      \hbox to \GMPboxwidth {}%
      \vrule
      \hbox to \GMPboxwidth {}%
      \vrule}
    \hrule
  }
  \hbox {%
    \hbox to 0.8 pt {}
    \hbox to 3\GMPboxwidth {%
      \hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}}
  \hbox to 5\GMPboxwidth{%
    \setbox 0 = \hbox{@code{\_mp\_size}}%
    \dimen0 = 5\GMPboxwidth
    \advance\dimen0 by -\wd0
    \divide\dimen0 by 2
    \advance\dimen0 by -1em
    \dimen1 = \dimen0
    \setbox1 = \hbox{$\leftarrow$}%
    \setbox2 = \hbox{$\rightarrow$}%
    \advance\dimen0 by -\wd1
    \advance\dimen1 by -\wd2
    \hbox to 0.3 em {}%
    \box1
    \GMPcentreline{\dimen0}%
    \hfil
    \box0
    \hfil
    \GMPcentreline{\dimen1}%
    \box2}
}}
@end tex
@ifnottex
@example
   most                   least
significant            significant
   limb                   limb

                            _mp_d
 |---- _mp_exp --->           |
  _____ _____ _____ _____ _____
 |_____|_____|_____|_____|_____|
                   . <------------ radix point

  <-------- _mp_size --------->
@sp 1
@end example
@end ifnottex

@noindent
The fields are as follows.

@table @asis
@item @code{_mp_size}
The number of limbs currently in use, or the negative of that when
representing a negative value. Zero is represented by @code{_mp_size} and
@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is
unused. (In the future @code{_mp_exp} might be undefined when representing
zero.)

@item @code{_mp_prec}
The precision of the mantissa, in limbs. In any calculation the aim is to
produce @code{_mp_prec} limbs of result (the most significant being non-zero).

@item @code{_mp_d}
A pointer to the array of limbs which is the absolute value of the mantissa.
These are stored ``little endian'' as per the @code{mpn} functions, so
@code{_mp_d[0]} is the least significant limb and
@code{_mp_d[ABS(_mp_size)-1]} the most significant.

The most significant limb is always non-zero, but there are no other
restrictions on its value, in particular the highest 1 bit can be anywhere
within the limb.

@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being
for convenience (see below). There are no reallocations during a calculation,
only in a change of precision with @code{mpf_set_prec}.

@item @code{_mp_exp}
The exponent, in limbs, determining the location of the implied radix point.
Zero means the radix point is just above the most significant limb. Positive
values mean a radix point offset towards the lower limbs and hence a value
@math{@ge{} 1}, as for example in the diagram above. Negative exponents mean
a radix point further above the highest limb.

Naturally the exponent can be any value; it doesn't have to fall within the
limbs as the diagram shows, and can be a long way above or a long way below.
Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
are treated as zero.
@end table

The @code{_mp_size} and @code{_mp_prec} fields are @code{int}, although the
@code{mp_size_t} type is usually a @code{long}.  The @code{_mp_exp} field is
usually @code{long}.  This is done to make some fields just 32 bits on some
64-bit systems, thereby saving a few bytes of data space but still providing
plenty of precision and a very large range.


@sp 1
@noindent
The following points should be noted.

@table @asis
@item Low Zeros
The least significant limbs @code{_mp_d[0]} etc can be zero, though such low
zeros can always be ignored.  Routines likely to produce low zeros check and
avoid them to save time in subsequent calculations, but for most routines
they're quite unlikely and aren't checked.

@item Mantissa Size Range
The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
the value can be represented in fewer.  This means low precision values or
small integers stored in a high precision @code{mpf_t} can still be operated
on efficiently.

@code{_mp_size} can also be greater than @code{_mp_prec}.  Firstly a value is
allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
@code{_mp_prec}.

@item Rounding
All rounding is done on limb boundaries.  Calculating @code{_mp_prec} limbs
with the high limb non-zero will ensure the minimum precision requested by the
application is obtained.

The use of simple ``trunc'' rounding towards zero is efficient, since there's
no need to examine extra limbs and increment or decrement.

@item Bit Shifts
Since the exponent is in limbs, there are no bit shifts in basic operations
like @code{mpf_add} and @code{mpf_mul}.  When differing exponents are
encountered all that's needed is to adjust pointers to line up the relevant
limbs.

Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts,
but the choice is between an exponent in limbs which requires shifts there, or
one in bits which requires them almost everywhere else.

@item Use of @code{_mp_prec+1} Limbs
The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just
@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its
operation.  @code{mpf_add} for instance will do an @code{mpn_add} of
@code{_mp_prec} limbs.  If there's no carry then that's the result, but if
there is a carry then it's stored in the extra limb of space and
@code{_mp_size} becomes @code{_mp_prec+1}.

Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not
needed for the intended precision, only the @code{_mp_prec} high limbs.  But
zeroing it out or moving the rest down is unnecessary.  Subsequent routines
reading the value will simply take the high limbs they need, and this will be
@code{_mp_prec} if their target has that same precision.  This is no more than
a pointer adjustment, and must be checked anyway since the destination
precision can be different from the sources.

Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs
if available.  This ensures that a variable which has @code{_mp_size} equal to
@code{_mp_prec+1} will get its full exact value copied.
Strictly speaking
this is unnecessary since only @code{_mp_prec} limbs are needed for the
application's requested precision, but it's considered that an @code{mpf_set}
from one variable into another of the same precision ought to produce an exact
copy.

@item Application Precisions
@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
@code{_mp_prec}.  The value in bits is rounded up to a whole limb then an
extra limb is added since the most significant limb of @code{_mp_d} is only
guaranteed to be non-zero and therefore might contain only one bit.

@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
limb from @code{_mp_prec} before converting to bits.  The net effect of
reading back with @code{mpf_get_prec} is simply the precision rounded up to a
multiple of @code{mp_bits_per_limb}.

Note that the extra limb added here for the high limb only being non-zero is
in addition to the extra limb allocated to @code{_mp_d}.  For example with a
32-bit limb, an application request for 250 bits will be rounded up to 8
limbs, then an extra limb added for the high limb being only non-zero, giving
an @code{_mp_prec} of 9.  @code{_mp_d} then gets 10 limbs allocated.  Reading
back with @code{mpf_get_prec} will take @code{_mp_prec}, subtract 1 limb and
multiply by 32, giving 256 bits.

Strictly speaking, the fact the high limb has at least one bit means that a
float with, say, 3 limbs of 32 bits each will be holding at least 65 bits, but
for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice
multiple of the limb size.
@end table


@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
@section Raw Output Internals
@cindex Raw output internals

@noindent
@code{mpz_out_raw} uses the following format.

@tex
\global\newdimen\GMPboxwidth \GMPboxwidth=5em
\global\newdimen\GMPboxheight \GMPboxheight=3ex
\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}}
\GMPdisplay{%
\vbox{%
  \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}}
  \vbox {%
    \hrule
    \hbox{%
      \vrule height 2.5ex depth 1.5ex
      \hbox to \GMPboxwidth {\hfil size\hfil}%
      \vrule
      \hbox to 3\GMPboxwidth {\hfil data bytes\hfil}%
      \vrule}
    \hrule}
}}
@end tex
@ifnottex
@example
+------+------------------------+
| size |       data bytes       |
+------+------------------------+
@end example
@end ifnottex

The size is 4 bytes written most significant byte first, being the number of
subsequent data bytes, or the two's complement negative of that when a negative
integer is represented.  The data bytes are the absolute value of the integer,
written most significant byte first.

The most significant data byte is always non-zero, so the output is the same
on all systems, irrespective of limb size.

In GMP 1, leading zero bytes were written to pad the data bytes to a multiple
of the limb size.  @code{mpz_inp_raw} will still accept this, for
compatibility.

The use of ``big endian'' for both the size and data fields is deliberate; it
makes the data easy to read in a hex dump of a file.  Unfortunately it also
means that the limb data must be reversed when reading or writing, so neither
a big endian nor little endian system can just read and write @code{_mp_d}.


@node C++ Interface Internals,  , Raw Output Internals, Internals
@section C++ Interface Internals
@cindex C++ interface internals

A system of expression templates is used to ensure something like @code{a=b+c}
turns into a simple call to @code{mpz_add} etc.
For @code{mpf_class}
the scheme also ensures the precision of the final
destination is used for any temporaries within a statement like
@code{f=w*x+y*z}.  These are important features which a naive implementation
cannot provide.

A simplified description of the scheme follows.  The true scheme is
complicated by the fact that expressions have different return types.  For
detailed information, refer to the source code.

To perform an operation, say, addition, we first define a ``function object''
evaluating it,

@example
struct __gmp_binary_plus
@{
  static void eval(mpf_t f, const mpf_t g, const mpf_t h)
  @{
    mpf_add(f, g, h);
  @}
@};
@end example

@noindent
And an ``additive expression'' object,

@example
__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
operator+(const mpf_class &f, const mpf_class &g)
@{
  return __gmp_expr
    <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
@}
@end example

The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<@dots{}>>} is used to
encapsulate any possible kind of expression into a single template type.  In
fact even @code{mpf_class} etc are @code{typedef} specializations of
@code{__gmp_expr}.

Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.

@example
template <class T>
mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
@{
  expr.eval(this->get_mpf_t(), this->precision());
  return *this;
@}

template <class Op>
void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
(mpf_t f, mp_bitcnt_t precision)
@{
  Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
@}
@end example

where @code{expr.val1} and @code{expr.val2} are references to the expression's
operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
@code{__gmp_expr}).

This way, the expression is actually evaluated only at the time of assignment,
when the required precision (that of @code{f}) is known.  Furthermore the
target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
with @code{f} as the output argument.

Compound expressions are handled by defining operators taking subexpressions
as their arguments, like this:

@example
template <class T, class U>
__gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
@{
  return __gmp_expr
    <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
    (expr1, expr2);
@}
@end example

And the corresponding specializations of @code{__gmp_expr::eval}:

@example
template <class T, class U, class Op>
void __gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
(mpf_t f, mp_bitcnt_t precision)
@{
  // declare two temporaries
  mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
  Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
@}
@end example

The expression is thus recursively evaluated to any level of complexity and
all
subexpressions are evaluated to the precision of @code{f}.


@node Contributors, References, Internals, Top
@comment  node-name,  next,  previous,  up
@appendix Contributors
@cindex Contributors

Torbj@"orn Granlund wrote the original GMP library and is still the main
developer.  Code not explicitly attributed to others was contributed by
Torbj@"orn.  Several other individuals and organizations have contributed to
GMP.  Here is a list in chronological order of first contribution:

Gunnar Sj@"odin and Hans Riesel helped with mathematical problems in early
versions of the library.

Richard Stallman helped with the interface design and revised the first
version of this manual.

Brian Beuning and Doug Lea helped with testing of early versions of the
library and made creative suggestions.

John Amanatides of York University in Canada contributed the function
@code{mpz_probab_prime_p}.

Paul Zimmermann wrote the REDC-based mpz_powm code, the Sch@"onhage-Strassen
FFT multiply code, and the Karatsuba square root code.  He also improved the
Toom3 code for GMP 4.2.  Paul sparked the development of GMP 2, with his
comparisons between bignum packages.  The ECMNET project Paul is organizing
was a driving force behind many of the optimizations in GMP 3.  Paul also
wrote the new GMP 4.3 nth root code (with Torbj@"orn).

Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
contributed now defunct versions of @code{mpz_gcd}, @code{mpz_divexact},
@code{mpn_gcd}, and @code{mpn_bdivmod}, partially supported by CNPq (Brazil)
grant 301314194-2.

Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
He has also made valuable suggestions and tested numerous intermediary
releases.

Joachim Hollman was involved in the design of the @code{mpf} interface, and in
the @code{mpz} design revisions for version 2.

Bennet Yee contributed the initial versions of @code{mpz_jacobi} and
@code{mpz_legendre}.

Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and
@file{mpn/m68k/rshift.S} (now in @file{.asm} form).

Robert Harley of Inria, France and David Seal of ARM, England, suggested clever
improvements for population count.  Robert also wrote highly optimized
Karatsuba and 3-way Toom multiplication functions for GMP 3, and contributed
the ARM assembly code.

Torsten Ekedahl of the Mathematical department of Stockholm University provided
significant inspiration during several phases of the GMP development.  His
mathematical expertise helped improve several algorithms.

Linus Nordberg wrote the new configure system based on autoconf and
implemented the new random functions.

Kevin Ryde worked on a large number of things: optimized x86 code, m4 asm
macros, parameter tuning, speed measuring, the configure system, function
inlining, divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas
number functions, printf and scanf functions, perl interface, demo expression
parser, the algorithms chapter in the manual, @file{gmpasm-mode.el}, and
various miscellaneous improvements elsewhere.

Kent Boortz made the Mac OS 9 port.

Steve Root helped write the optimized alpha 21264 assembly code.

Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++
@code{istream} input routines.

Jason Moxham rewrote @code{mpz_fac_ui}.

Pedro Gimeno implemented the Mersenne Twister and made other random number
improvements.

Niels M@"oller wrote the sub-quadratic GCD, extended GCD and Jacobi code, the
quadratic Hensel division code, and (with Torbj@"orn) the new divide and
conquer division code for GMP 4.3.  Niels also helped implement the new Toom
multiply code for GMP 4.3 and implemented helper functions to simplify Toom
evaluations for GMP 5.0.  He wrote the original version of mpn_mulmod_bnm1, and
he is the main author of the mini-gmp package used for GMP bootstrapping.

Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply strategy,
and found the optimal strategies for evaluation and interpolation in Toom
multiplication.

Marco Bodrato helped implement the new Toom multiply code for GMP 4.3 and
implemented most of the new Toom multiply and squaring code for 5.0.
He is the main author of the current mpn_mulmod_bnm1, mpn_mullo_n, and
mpn_sqrlo.  Marco also wrote the functions mpn_invert and mpn_invertappr,
and improved the speed of integer root extraction.  He is the author of
mini-mpq, an additional layer to mini-gmp; of most of the combinatorial
functions and the BPSW primality testing implementation, for both the
main library and the mini-gmp package.

David Harvey suggested the internal function @code{mpn_bdiv_dbm1}, implementing
division relevant to Toom multiplication.  He also worked on fast assembly
sequences, in particular on a fast AMD64 @code{mpn_mul_basecase}.  He wrote
the internal middle product functions @code{mpn_mulmid_basecase},
@code{mpn_toom42_mulmid}, @code{mpn_mulmid_n} and related helper routines.

Martin Boij wrote @code{mpn_perfect_power_p}.

Marc Glisse improved @file{gmpxx.h}: use fewer temporaries (faster),
specializations of @code{numeric_limits} and @code{common_type}, C++11
features (move constructors, explicit bool conversion, UDL), make the
conversion from @code{mpq_class} to @code{mpz_class} explicit, optimize
operations where one argument is a small compile-time constant, replace
some heap allocations by stack allocations.  He also fixed the eofbit
handling of C++ streams, and removed one division from @file{mpq/aors.c}.

David S Miller wrote assembly code for SPARC T3 and T4.

Mark Sofroniou cleaned up the types of mul_fft.c, letting it work for huge
operands.

Ulrich Weigand ported GMP to the powerpc64le ABI.

(This list is chronological, not ordered by significance.  If you have
contributed to GMP but are not listed above, please tell
@email{gmp-devel@@gmplib.org} about the omission!)

The development of the floating point functions of GNU MP 2 was supported in
part by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO
(POlynomial System SOlving).

The development of GMP 2, 3, and 4.0 was supported in part by the IDA Center
for Computing Sciences.

The development of GMP 4.3, 5.0, and 5.1 was supported in part by the Swedish
Foundation for Strategic Research.

Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
environment.

@node References, GNU Free Documentation License, Contributors, Top
@comment  node-name,  next,  previous,  up
@appendix References
@cindex References

@c  FIXME: In tex, the @uref's are unhyphenated, which is good for clarity,
@c  but being long words they upset paragraph formatting (the preceding line
@c  can get badly stretched).
@c  Would like a conditional @* style line break
@c  if the uref is too long to fit on the last line of the paragraph, but it's
@c  not clear how to do that.  For now explicit @texlinebreak{}s are used on
@c  paragraphs that come out bad.

@section Books

@itemize @bullet
@item
Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in
Analytic Number Theory and Computational Complexity'', Wiley, 1998.

@item
Richard Crandall and Carl Pomerance, ``Prime Numbers: A Computational
Perspective'', 2nd edition, Springer-Verlag, 2005.
@texlinebreak{} @uref{https://www.math.dartmouth.edu/~carlp/}

@item
Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate
Texts in Mathematics number 138, Springer-Verlag, 1993.
@texlinebreak{} @uref{https://www.math.u-bordeaux.fr/~cohen/}

@item
Donald E. Knuth, ``The Art of Computer Programming'', volume 2,
``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998.
@texlinebreak{} @uref{https://www-cs-faculty.stanford.edu/~knuth/taocp.html}

@item
John D. Lipson, ``Elements of Algebra and Algebraic Computing'',
The Benjamin Cummings Publishing Company Inc, 1981.

@item
Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of
Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/}

@item
Richard M. Stallman and the GCC Developer Community, ``Using the GNU Compiler
Collection'', Free Software Foundation, 2008, available online
@uref{https://gcc.gnu.org/onlinedocs/}, and in the GCC package
@uref{https://ftp.gnu.org/gnu/gcc/}
@end itemize

@section Papers

@itemize @bullet
@item
Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square
Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252.
Also
available online as INRIA Research Report 4475, June 2002,
@uref{https://hal.inria.fr/docs/00/07/21/13/PDF/RR-4475.pdf}

@item
Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'',
Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022,
@texlinebreak{} @uref{https://www.mpi-inf.mpg.de/~ziegler/TechRep.ps.gz}

@item
Torbj@"orn Granlund and Peter L. Montgomery, ``Division by Invariant Integers
using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June
1994.  Also available @uref{https://gmplib.org/~tege/divcnst-pldi94.pdf}.

@item
Niels M@"oller and Torbj@"orn Granlund, ``Improved division by invariant
integers'', IEEE Transactions on Computers, 11 June 2010.
@uref{https://gmplib.org/~tege/division-paper.pdf}

@item
Torbj@"orn Granlund and Niels M@"oller, ``Division of integers large and
small'', to appear.

@item
Tudor Jebelean,
``An algorithm for exact division'',
Journal of Symbolic Computation,
volume 15, 1993, pp.@: 169-180.
Research report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}

@item
Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}

@item
Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
ISSAC 97, pp.@: 339-341.  Technical report available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}

@item
Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
pp.@: 111-116.
Technical report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}

@item
Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
pp.@: 145-157.  Technical report version also available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}

@item
Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455.  Early
technical report version also available
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}

@item
Makoto Matsumoto and Takuji Nishimura, ``Mersenne Twister: A 623-dimensionally
equidistributed uniform pseudorandom number generator'', ACM Transactions on
Modelling and Computer Simulation, volume 8, January 1998, pp.@: 3-30.
Available online @texlinebreak{}
@uref{http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.pdf}

@item
R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'',
Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
Theory, October 1972, pp.@: 90-96.  Reprinted as ``Fast Modular Transforms'',
Journal of Computer and System Sciences, volume 8, number 3, June 1974,
pp.@: 366-386.

@item
Niels M@"oller, ``On Sch@"onhage's algorithm and subquadratic integer GCD
computation'', in Mathematics of Computation, volume 77, January 2008, pp.@:
589-607, @uref{https://www.ams.org/journals/mcom/2008-77-261/S0025-5718-07-02017-0/home.html}

@item
Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
Mathematics of Computation, volume 44, number 170, April 1985.

@item
Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
Zahlen'', Computing 7, 1971, pp.@: 281-292.

@item
Kenneth Weber, ``The accelerated integer GCD algorithm'',
ACM Transactions on Mathematical Software,
volume 21, number 1, March 1995, pp.@: 111-122.

@item
Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
November 1999, @uref{https://hal.inria.fr/inria-00072854/PDF/RR-3805.pdf}

@item
Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
Implementations'', @texlinebreak{}
@uref{https://homepages.loria.fr/PZimmermann/papers/proof-div-sqrt.ps.gz}

@item
Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
Symposium on Computer Arithmetic, 1993, pp.@: 260-271.  Reprinted as ``More
on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
volume 43, number 8, August 1994, pp.@: 899-908.

@item
Niels M@"oller, ``Efficient computation of the Jacobi symbol'', @texlinebreak{}
@uref{https://arxiv.org/abs/1907.07795}
@end itemize

@node GNU Free Documentation License, Concept Index, References, Top
@appendix GNU Free Documentation License
@cindex GNU Free Documentation License
@cindex Free Documentation License
@cindex Documentation license
@include fdl-1.3.texi


@node Concept Index, Function Index, GNU Free Documentation License, Top
@comment  node-name,  next,  previous,  up
@unnumbered Concept Index
@printindex cp

@node Function Index,  , Concept Index, Top
@comment  node-name,  next,  previous,  up
@unnumbered Function and Type Index
@printindex fn

@bye

@c Local variables:
@c fill-column: 78
@c compile-command: "make gmp.info"
@c End: