\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename gmp.info
@documentencoding ISO-8859-1
@include version.texi
@settitle GNU MP @value{VERSION}
@synindex tp fn
@iftex
@afourpaper
@end iftex
@comment %**end of header

@copying
This manual describes how to install and use the GNU multiple precision
arithmetic library, version @value{VERSION}.

Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software
Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover
Texts being ``You have freedom to copy and modify this GNU Manual, like GNU
software''. A copy of the license is included in
@ref{GNU Free Documentation License}.
@end copying
@c Note the @ref above must be on one line, a line break in an @ref within
@c @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes
@c with texinfo 4.7), with messages about missing @endcsname.


@c Texinfo version 4.2 or up will be needed to process this file.
@c
@c The version number and edition number are taken from version.texi provided
@c by automake (note that it's regenerated only if you configure with
@c --enable-maintainer-mode).
@c
@c Notes discussing the present version number of GMP in relation to previous
@c ones (for instance in the "Compatibility" section) must be updated
@c manually though.
@c
@c @cindex entries have been made for function categories and programming
@c topics. The "mpn" section is not included in this, because a beginner
@c looking for "GCD" or something is only going to be confused by pointers to
@c low level routines.
48@c 49@c @cindex entries are present for processors and systems when there's 50@c particular notes concerning them, but not just for everything GMP 51@c supports. 52@c 53@c Index entries for files use @code rather than @file, @samp or @option, 54@c since the latter come out with quotes in TeX, which are nice in the text 55@c but don't look so good in index columns. 56@c 57@c Tex: 58@c 59@c A suitable texinfo.tex is supplied, a newer one should work equally well. 60@c 61@c HTML: 62@c 63@c Nothing special is done for links to external manuals, they just come out 64@c in the usual makeinfo style, eg. "../libc/Locales.html". If you have 65@c local copies of such manuals then this is a good thing, if not then you 66@c may want to search-and-replace to some online source. 67@c 68 69@dircategory GNU libraries 70@direntry 71* gmp: (gmp). GNU Multiple Precision Arithmetic Library. 72@end direntry 73 74@c html <meta name="description" content="..."> 75@documentdescription 76How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}. 77@end documentdescription 78 79@c smallbook 80@finalout 81@setchapternewpage on 82 83@ifnottex 84@node Top, Copying, (dir), (dir) 85@top GNU MP 86@end ifnottex 87 88@iftex 89@titlepage 90@title GNU MP 91@subtitle The GNU Multiple Precision Arithmetic Library 92@subtitle Edition @value{EDITION} 93@subtitle @value{UPDATED} 94 95@author by Torbj@"orn Granlund and the GMP development team 96@c @email{tg@@gmplib.org} 97 98@c Include the Distribution inside the titlepage so 99@c that headings are turned off. 100 101@tex 102\global\parindent=0pt 103\global\parskip=8pt 104\global\baselineskip=13pt 105@end tex 106 107@page 108@vskip 0pt plus 1filll 109@end iftex 110 111@insertcopying 112@ifnottex 113@sp 1 114@end ifnottex 115 116@iftex 117@end titlepage 118@headings double 119@end iftex 120 121@c Don't bother with contents for html, the menus seem adequate. 
122@ifnothtml 123@contents 124@end ifnothtml 125 126@menu 127* Copying:: GMP Copying Conditions (LGPL). 128* Introduction to GMP:: Brief introduction to GNU MP. 129* Installing GMP:: How to configure and compile the GMP library. 130* GMP Basics:: What every GMP user should know. 131* Reporting Bugs:: How to usefully report bugs. 132* Integer Functions:: Functions for arithmetic on signed integers. 133* Rational Number Functions:: Functions for arithmetic on rational numbers. 134* Floating-point Functions:: Functions for arithmetic on floats. 135* Low-level Functions:: Fast functions for natural numbers. 136* Random Number Functions:: Functions for generating random numbers. 137* Formatted Output:: @code{printf} style output. 138* Formatted Input:: @code{scanf} style input. 139* C++ Class Interface:: Class wrappers around GMP types. 140* BSD Compatible Functions:: All functions found in BSD MP. 141* Custom Allocation:: How to customize the internal allocation. 142* Language Bindings:: Using GMP from other languages. 143* Algorithms:: What happens behind the scenes. 144* Internals:: How values are represented behind the scenes. 145 146* Contributors:: Who brings you this library? 147* References:: Some useful papers and books to read. 148* GNU Free Documentation License:: 149* Concept Index:: 150* Function Index:: 151@end menu 152 153 154@c @m{T,N} is $T$ in tex or @math{N} otherwise. This is an easy way to give 155@c different forms for math in tex and info. Commas in N or T don't work, 156@c but @C{} can be used instead. \, works in info but not in tex. 157@iftex 158@macro m {T,N} 159@tex$\T\$@end tex 160@end macro 161@end iftex 162@ifnottex 163@macro m {T,N} 164@math{\N\} 165@end macro 166@end ifnottex 167 168@macro C {} 169, 170@end macro 171 172@c @ms{V,N} is $V_N$ in tex or just vn otherwise. This suits simple 173@c subscripts like @ms{x,0}. 
174@iftex 175@macro ms {V,N} 176@tex$\V\_{\N\}$@end tex 177@end macro 178@end iftex 179@ifnottex 180@macro ms {V,N} 181\V\\N\ 182@end macro 183@end ifnottex 184 185@c @nicode{S} is plain S in info, or @code{S} elsewhere. This can be used 186@c when the quotes that @code{} gives in info aren't wanted, but the 187@c fontification in tex or html is wanted. Doesn't work as @nicode{'\\0'} 188@c though (gives two backslashes in tex). 189@ifinfo 190@macro nicode {S} 191\S\ 192@end macro 193@end ifinfo 194@ifnotinfo 195@macro nicode {S} 196@code{\S\} 197@end macro 198@end ifnotinfo 199 200@c @nisamp{S} is plain S in info, or @samp{S} elsewhere. This can be used 201@c when the quotes that @samp{} gives in info aren't wanted, but the 202@c fontification in tex or html is wanted. 203@ifinfo 204@macro nisamp {S} 205\S\ 206@end macro 207@end ifinfo 208@ifnotinfo 209@macro nisamp {S} 210@samp{\S\} 211@end macro 212@end ifnotinfo 213 214@c Usage: @GMPtimes{} 215@c Give either \times or the word "times". 216@tex 217\gdef\GMPtimes{\times} 218@end tex 219@ifnottex 220@macro GMPtimes 221times 222@end macro 223@end ifnottex 224 225@c Usage: @GMPmultiply{} 226@c Give * in info, or nothing in tex. 227@tex 228\gdef\GMPmultiply{} 229@end tex 230@ifnottex 231@macro GMPmultiply 232* 233@end macro 234@end ifnottex 235 236@c Usage: @GMPabs{x} 237@c Give either |x| in tex, or abs(x) in info or html. 238@tex 239\gdef\GMPabs#1{|#1|} 240@end tex 241@ifnottex 242@macro GMPabs {X} 243@abs{}(\X\) 244@end macro 245@end ifnottex 246 247@c Usage: @GMPfloor{x} 248@c Give either \lfloor x\rfloor in tex, or floor(x) in info or html. 249@tex 250\gdef\GMPfloor#1{\lfloor #1\rfloor} 251@end tex 252@ifnottex 253@macro GMPfloor {X} 254floor(\X\) 255@end macro 256@end ifnottex 257 258@c Usage: @GMPceil{x} 259@c Give either \lceil x\rceil in tex, or ceil(x) in info or html. 
260@tex 261\gdef\GMPceil#1{\lceil #1 \rceil} 262@end tex 263@ifnottex 264@macro GMPceil {X} 265ceil(\X\) 266@end macro 267@end ifnottex 268 269@c Math operators already available in tex, made available in info too. 270@c For example @bmod{} can be used in both tex and info. 271@ifnottex 272@macro bmod 273mod 274@end macro 275@macro gcd 276gcd 277@end macro 278@macro ge 279>= 280@end macro 281@macro le 282<= 283@end macro 284@macro log 285log 286@end macro 287@macro min 288min 289@end macro 290@macro leftarrow 291<- 292@end macro 293@macro rightarrow 294-> 295@end macro 296@end ifnottex 297 298@c New math operators. 299@c @abs{} can be used in both tex and info, or just \abs in tex. 300@tex 301\gdef\abs{\mathop{\rm abs}} 302@end tex 303@ifnottex 304@macro abs 305abs 306@end macro 307@end ifnottex 308 309@c @cross{} is a \times symbol in tex, or an "x" in info. In tex it works 310@c inside or outside $ $. 311@tex 312\gdef\cross{\ifmmode\times\else$\times$\fi} 313@end tex 314@ifnottex 315@macro cross 316x 317@end macro 318@end ifnottex 319 320@c @times{} made available as a "*" in info and html (already works in tex). 321@ifnottex 322@macro times 323* 324@end macro 325@end ifnottex 326 327@c Usage: @W{text} 328@c Like @w{} but working in math mode too. 329@tex 330\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi} 331@end tex 332@ifnottex 333@macro W {S} 334@w{\S\} 335@end macro 336@end ifnottex 337 338@c Usage: \GMPdisplay{text} 339@c Put the given text in an @display style indent, but without turning off 340@c paragraph reflow etc. 341@tex 342\gdef\GMPdisplay#1{% 343\noindent 344\advance\leftskip by \lispnarrowing 345#1\par} 346@end tex 347 348@c Usage: \GMPhat 349@c A new \hat that will work in math mode, unlike the texinfo redefined 350@c version. 351@tex 352\gdef\GMPhat{\mathaccent"705E} 353@end tex 354 355@c Usage: \GMPraise{text} 356@c For use in a $ $ math expression as an alternative to "^". 
This is good 357@c for @code{} in an exponent, since there seems to be no superscript font 358@c for that. 359@tex 360\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}} 361@end tex 362 363@c Usage: @texlinebreak{} 364@c A line break as per @*, but only in tex. 365@iftex 366@macro texlinebreak 367@* 368@end macro 369@end iftex 370@ifnottex 371@macro texlinebreak 372@end macro 373@end ifnottex 374 375@c Usage: @maybepagebreak 376@c Allow tex to insert a page break, if it feels the urge. 377@c Normally blocks of @deftypefun/funx are kept together, which can lead to 378@c some poor page break positioning if it's a big block, like the sets of 379@c division functions etc. 380@tex 381\gdef\maybepagebreak{\penalty0} 382@end tex 383@ifnottex 384@macro maybepagebreak 385@end macro 386@end ifnottex 387 388@c Usage: @GMPreftop{info,title} 389@c Usage: @GMPpxreftop{info,title} 390@c 391@c Like @ref{} and @pxref{}, but designed for a reference to the top of a 392@c document, not a particular section. The TeX output for plain @ref insists 393@c on printing a particular section, GMPreftop gives just the title. 394@c 395@c The texinfo manual recommends putting a likely section name in references 396@c like this, eg. "Introduction", but it seems better to just give the title. 
397@c 398@iftex 399@macro GMPreftop{info,title} 400@i{\title\} 401@end macro 402@macro GMPpxreftop{info,title} 403see @i{\title\} 404@end macro 405@end iftex 406@c 407@ifnottex 408@macro GMPreftop{info,title} 409@ref{Top,\title\,\title\,\info\,\title\} 410@end macro 411@macro GMPpxreftop{info,title} 412@pxref{Top,\title\,\title\,\info\,\title\} 413@end macro 414@end ifnottex 415 416 417@node Copying, Introduction to GMP, Top, Top 418@comment node-name, next, previous, up 419@unnumbered GNU MP Copying Conditions 420@cindex Copying conditions 421@cindex Conditions for copying GNU MP 422@cindex License conditions 423 424This library is @dfn{free}; this means that everyone is free to use it and 425free to redistribute it on a free basis. The library is not in the public 426domain; it is copyrighted and there are restrictions on its distribution, but 427these restrictions are designed to permit everything that a good cooperating 428citizen would want to do. What is not allowed is to try to prevent others 429from further sharing any version of this library that they might get from 430you.@refill 431 432Specifically, we want to make sure that you have the right to give away copies 433of the library, that you receive source code or else can get it if you want 434it, that you can change this library or use pieces of it in new free programs, 435and that you know you can do these things.@refill 436 437To make sure that everyone has such rights, we have to forbid you to deprive 438anyone else of these rights. For example, if you distribute copies of the GNU 439MP library, you must give the recipients all the rights that you have. You 440must make sure that they, too, receive or can get the source code. And you 441must tell them their rights.@refill 442 443Also, for our own protection, we must make certain that everyone finds out 444that there is no warranty for the GNU MP library. 
If it is modified by 445someone else and passed on, we want their recipients to know that what they 446have is not what we distributed, so that any problems introduced by others 447will not reflect on our reputation.@refill 448 449The precise conditions of the license for the GNU MP library are found in the 450Lesser General Public License version 3 that accompanies the source code, 451see @file{COPYING.LIB}. Certain demonstration programs are provided under the 452terms of the plain General Public License version 3, see @file{COPYING}. 453 454 455@node Introduction to GMP, Installing GMP, Copying, Top 456@comment node-name, next, previous, up 457@chapter Introduction to GNU MP 458@cindex Introduction 459 460GNU MP is a portable library written in C for arbitrary precision arithmetic 461on integers, rational numbers, and floating-point numbers. It aims to provide 462the fastest possible arithmetic for all applications that need higher 463precision than is directly supported by the basic C types. 464 465Many applications use just a few hundred bits of precision; but some 466applications may need thousands or even millions of bits. GMP is designed to 467give good performance for both, by choosing algorithms based on the sizes of 468the operands, and by carefully keeping the overhead at a minimum. 469 470The speed of GMP is achieved by using fullwords as the basic arithmetic type, 471by using sophisticated algorithms, by including carefully optimized assembly 472code for the most common inner loops for many different CPUs, and by a general 473emphasis on speed (as opposed to simplicity or elegance). 

There is assembly code for these CPUs:
@cindex CPU types
ARM,
DEC Alpha 21064, 21164, and 21264,
AMD 29000,
AMD K6, K6-2, Athlon, and Athlon64,
Hitachi SuperH and SH-2,
HPPA 1.0, 1.1 and 2.0,
Intel Pentium, Pentium Pro/II/III, Pentium 4, generic x86,
Intel IA-64, i960,
Motorola MC68000, MC68020, MC88100, and MC88110,
Motorola/IBM PowerPC 32 and 64,
National NS32000,
IBM POWER,
MIPS R3000, R4000,
SPARCv7, SuperSPARC, generic SPARCv8, UltraSPARC,
DEC VAX,
and
Zilog Z8000.
There are also some optimizations for
Cray vector systems,
Clipper,
IBM ROMP (RT),
and
Pyramid AP/XP.

@cindex Home page
@cindex Web page
@noindent
For up-to-date information on GMP, please see the GMP web pages at

@display
@uref{http://gmplib.org/}
@end display

@cindex Latest version of GMP
@cindex Anonymous FTP of latest version
@cindex FTP of latest version
@noindent
The latest version of the library is available at

@display
@uref{ftp://ftp.gnu.org/gnu/gmp/}
@end display

Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror
near you. See @uref{http://www.gnu.org/order/ftp.html} for a full list.

@cindex Mailing lists
There are three public mailing lists of interest: one for release
announcements, one for general questions and discussions about usage of the GMP
library, and one for bug reports. For more information, see

@display
@uref{http://gmplib.org/mailman/listinfo/}
@end display

The proper place for bug reports is @email{gmp-bugs@@gmplib.org}. See
@ref{Reporting Bugs}, for information about reporting bugs.

@sp 1
@section How to use this Manual
@cindex About this manual

Everyone should read @ref{GMP Basics}. If you need to install the library
yourself, then read @ref{Installing GMP}.
If you have a system with multiple 541ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used 542on applications. 543 544The rest of the manual can be used for later reference, although it is 545probably a good idea to glance through it. 546 547 548@node Installing GMP, GMP Basics, Introduction to GMP, Top 549@comment node-name, next, previous, up 550@chapter Installing GMP 551@cindex Installing GMP 552@cindex Configuring GMP 553@cindex Building GMP 554 555GMP has an autoconf/automake/libtool based configuration system. On a 556Unix-like system a basic build can be done with 557 558@example 559./configure 560make 561@end example 562 563@noindent 564Some self-tests can be run with 565 566@example 567make check 568@end example 569 570@noindent 571And you can install (under @file{/usr/local} by default) with 572 573@example 574make install 575@end example 576 577If you experience problems, please report them to @email{gmp-bugs@@gmplib.org}. 578See @ref{Reporting Bugs}, for information on what to include in useful bug 579reports. 580 581@menu 582* Build Options:: 583* ABI and ISA:: 584* Notes for Package Builds:: 585* Notes for Particular Systems:: 586* Known Build Problems:: 587* Performance optimization:: 588@end menu 589 590 591@node Build Options, ABI and ISA, Installing GMP, Installing GMP 592@section Build Options 593@cindex Build options 594 595All the usual autoconf configure options are available, run @samp{./configure 596--help} for a summary. The file @file{INSTALL.autoconf} has some generic 597installation information too. 598 599@table @asis 600@item Tools 601@cindex Non-Unix systems 602@samp{configure} requires various Unix-like tools. See @ref{Notes for 603Particular Systems}, for some options on non-Unix systems. 604 605It might be possible to build without the help of @samp{configure}, certainly 606all the code is there, but unfortunately you'll be on your own. 

@item Build Directory
@cindex Build directory
To compile in a separate build directory, @command{cd} to that directory, and
prefix the configure command with the path to the GMP source directory. For
example

@example
cd /my/build/dir
/my/sources/gmp-@value{VERSION}/configure
@end example

Not all @samp{make} programs have the necessary features (@code{VPATH}) to
support this. In particular, SunOS and Solaris @command{make} have bugs that
make them unable to build in a separate directory. Use GNU @command{make}
instead.

@item @option{--prefix} and @option{--exec-prefix}
@cindex Prefix
@cindex Exec prefix
@cindex Install prefix
@cindex @code{--prefix}
@cindex @code{--exec-prefix}
The @option{--prefix} option can be used in the normal way to direct GMP to
install under a particular tree. The default is @samp{/usr/local}.

@option{--exec-prefix} can be used to direct architecture-dependent files like
@file{libgmp.a} to a different location. This can be used to share
architecture-independent parts like the documentation, but separate the
dependent parts. Note however that @file{gmp.h} and @file{mp.h} are
architecture-dependent since they encode certain aspects of @file{libgmp}, so
it will be necessary to ensure both @file{$prefix/include} and
@file{$exec_prefix/include} are available to the compiler.

@item @option{--disable-shared}, @option{--disable-static}
@cindex @code{--disable-shared}
@cindex @code{--disable-static}
By default both shared and static libraries are built (where possible), but
one or the other can be disabled. Shared libraries result in smaller
executables and permit code sharing between separate running processes, but on
some CPUs are slightly slower, having a small cost on each function call.
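
For instance, a build producing only the static library might be configured as
shown below. This is a minimal sketch: it assumes a Unix-like system and uses
only the option named above together with the basic build commands.

@example
./configure --disable-shared
make
make check
@end example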

@item Native Compilation, @option{--build=CPU-VENDOR-OS}
@cindex Native compilation
@cindex Build system
@cindex @code{--build}
For normal native compilation, the system can be specified with
@samp{--build}. By default @samp{./configure} uses the output from running
@samp{./config.guess}. On some systems @samp{./config.guess} can determine
the exact CPU type; on others it will be necessary to give it explicitly. For
example,

@example
./configure --build=ultrasparc-sun-solaris2.7
@end example

In all cases the @samp{OS} part is important, since it controls how libtool
generates shared libraries. Running @samp{./config.guess} is the simplest way
to see what it should be, if you don't know already.

@item Cross Compilation, @option{--host=CPU-VENDOR-OS}
@cindex Cross compiling
@cindex Host system
@cindex @code{--host}
When cross-compiling, the system used for compiling is given by @samp{--build}
and the system where the library will run is given by @samp{--host}. For
example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries,

@example
./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu
@end example

Compiler tools are sought first with the host system type as a prefix. For
example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain
@command{ranlib}. This makes it possible for a set of cross-compiling tools
to co-exist with native tools. The prefix is the argument to @samp{--host},
and this can be an alias, such as @samp{m68k-linux}. But note that tools
don't have to be set up this way; it's enough to just have a @env{PATH} with a
suitable cross-compiling @command{cc} etc.

Compiling for a different CPU in the same family as the build system is a form
of cross-compilation, though very possibly this would merely be special
options on a native compiler.
In any case @samp{./configure} avoids depending 690on being able to run code on the build system, which is important when 691creating binaries for a newer CPU since they very possibly won't run on the 692build system. 693 694In all cases the compiler must be able to produce an executable (of whatever 695format) from a standard C @code{main}. Although only object files will go to 696make up @file{libgmp}, @samp{./configure} uses linking tests for various 697purposes, such as determining what functions are available on the host system. 698 699Currently a warning is given unless an explicit @samp{--build} is used when 700cross-compiling, because it may not be possible to correctly guess the build 701system type if the @env{PATH} has only a cross-compiling @command{cc}. 702 703Note that the @samp{--target} option is not appropriate for GMP@. It's for use 704when building compiler tools, with @samp{--host} being where they will run, 705and @samp{--target} what they'll produce code for. Ordinary programs or 706libraries like GMP are only interested in the @samp{--host} part, being where 707they'll run. (Some past versions of GMP used @samp{--target} incorrectly.) 708 709@item CPU types 710@cindex CPU types 711In general, if you want a library that runs as fast as possible, you should 712configure GMP for the exact CPU type your system uses. However, this may mean 713the binaries won't run on older members of the family, and might run slower on 714other members, older or newer. The best idea is always to build GMP for the 715exact machine type you intend to run it on. 716 717The following CPUs have specific support. See @file{configure.in} for details 718of what code and compiler options they select. 

@itemize @bullet

@c Keep this formatting, it's easy to read and it can be grepped to
@c automatically test that CPUs listed get through ./config.sub

@item
Alpha:
@nisamp{alpha},
@nisamp{alphaev5},
@nisamp{alphaev56},
@nisamp{alphapca56},
@nisamp{alphapca57},
@nisamp{alphaev6},
@nisamp{alphaev67},
@nisamp{alphaev68},
@nisamp{alphaev7}

@item
Cray:
@nisamp{c90},
@nisamp{j90},
@nisamp{t90},
@nisamp{sv1}

@item
HPPA:
@nisamp{hppa1.0},
@nisamp{hppa1.1},
@nisamp{hppa2.0},
@nisamp{hppa2.0n},
@nisamp{hppa2.0w},
@nisamp{hppa64}

@item
IA-64:
@nisamp{ia64},
@nisamp{itanium},
@nisamp{itanium2}

@item
MIPS:
@nisamp{mips},
@nisamp{mips3},
@nisamp{mips64}

@item
Motorola:
@nisamp{m68k},
@nisamp{m68000},
@nisamp{m68010},
@nisamp{m68020},
@nisamp{m68030},
@nisamp{m68040},
@nisamp{m68060},
@nisamp{m68302},
@nisamp{m68360},
@nisamp{m88k},
@nisamp{m88110}

@item
POWER:
@nisamp{power},
@nisamp{power1},
@nisamp{power2},
@nisamp{power2sc}

@item
PowerPC:
@nisamp{powerpc},
@nisamp{powerpc64},
@nisamp{powerpc401},
@nisamp{powerpc403},
@nisamp{powerpc405},
@nisamp{powerpc505},
@nisamp{powerpc601},
@nisamp{powerpc602},
@nisamp{powerpc603},
@nisamp{powerpc603e},
@nisamp{powerpc604},
@nisamp{powerpc604e},
@nisamp{powerpc620},
@nisamp{powerpc630},
@nisamp{powerpc740},
@nisamp{powerpc7400},
@nisamp{powerpc7450},
@nisamp{powerpc750},
@nisamp{powerpc801},
@nisamp{powerpc821},
@nisamp{powerpc823},
@nisamp{powerpc860},
@nisamp{powerpc970}

@item
SPARC:
@nisamp{sparc},
@nisamp{sparcv8},
@nisamp{microsparc},
@nisamp{supersparc},
@nisamp{sparcv9},
@nisamp{ultrasparc},
@nisamp{ultrasparc2},
@nisamp{ultrasparc2i},
@nisamp{ultrasparc3},
@nisamp{sparc64}

@item
x86 family:
827@nisamp{i386}, 828@nisamp{i486}, 829@nisamp{i586}, 830@nisamp{pentium}, 831@nisamp{pentiummmx}, 832@nisamp{pentiumpro}, 833@nisamp{pentium2}, 834@nisamp{pentium3}, 835@nisamp{pentium4}, 836@nisamp{k6}, 837@nisamp{k62}, 838@nisamp{k63}, 839@nisamp{athlon}, 840@nisamp{amd64}, 841@nisamp{viac3}, 842@nisamp{viac32} 843 844@item 845Other: 846@nisamp{a29k}, 847@nisamp{arm}, 848@nisamp{clipper}, 849@nisamp{i960}, 850@nisamp{ns32k}, 851@nisamp{pyramid}, 852@nisamp{sh}, 853@nisamp{sh2}, 854@nisamp{vax}, 855@nisamp{z8k} 856@end itemize 857 858CPUs not listed will use generic C code. 859 860@item Generic C Build 861@cindex Generic C 862If some of the assembly code causes problems, or if otherwise desired, the 863generic C code can be selected with CPU @samp{none}. For example, 864 865@example 866./configure --host=none-unknown-freebsd3.5 867@end example 868 869Note that this will run quite slowly, but it should be portable and should at 870least make it possible to get something running if all else fails. 871 872@item Fat binary, @option{--enable-fat} 873@cindex Fat binary 874@cindex @option{--enable-fat} 875Using @option{--enable-fat} selects a ``fat binary'' build on x86, where 876optimized low level subroutines are chosen at runtime according to the CPU 877detected. This means more code, but gives good performance on all x86 chips. 878(This option might become available for more architectures in the future.) 879 880@item @option{ABI} 881@cindex ABI 882On some systems GMP supports multiple ABIs (application binary interfaces), 883meaning data type sizes and calling conventions. By default GMP chooses the 884best ABI available, but a particular ABI can be selected. For example 885 886@example 887./configure --host=mips64-sgi-irix6 ABI=n32 888@end example 889 890See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what 891applications need to do. 
892 893@item @option{CC}, @option{CFLAGS} 894@cindex C compiler 895@cindex @code{CC} 896@cindex @code{CFLAGS} 897By default the C compiler used is chosen from among some likely candidates, 898with @command{gcc} normally preferred if it's present. The usual 899@samp{CC=whatever} can be passed to @samp{./configure} to choose something 900different. 901 902For various systems, default compiler flags are set based on the CPU and 903compiler. The usual @samp{CFLAGS="-whatever"} can be passed to 904@samp{./configure} to use something different or to set good flags for systems 905GMP doesn't otherwise know. 906 907The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure}, 908and can be found in each generated @file{Makefile}. This is the easiest way 909to check the defaults when considering changing or adding something. 910 911Note that when @samp{CC} and @samp{CFLAGS} are specified on a system 912supporting multiple ABIs it's important to give an explicit 913@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and 914won't be able to select the correct assembly code. 915 916If just @samp{CC} is selected then normal default @samp{CFLAGS} for that 917compiler will be used (if GMP recognises it). For example @samp{CC=gcc} can 918be used to force the use of GCC, with default flags (and default ABI). 919 920@item @option{CPPFLAGS} 921@cindex @code{CPPFLAGS} 922Any flags like @samp{-D} defines or @samp{-I} includes required by the 923preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}. 924Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but 925preprocessing uses just @samp{CPPFLAGS}. This distinction is because most 926preprocessors won't accept all the flags the compiler does. Preprocessing is 927done separately in some configure tests, and in the @samp{ansi2knr} support 928for K&R compilers. 
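
As an illustration, preprocessor and compiler flags might be kept apart as
shown below. This is only a sketch: the include directory
@file{/opt/local/include} and the @samp{-O2} flag are hypothetical choices,
not GMP defaults. On a system supporting multiple ABIs an explicit
@samp{ABI=whatever} would be needed as well, as noted above for @samp{CC} and
@samp{CFLAGS}.

@example
./configure CPPFLAGS="-I/opt/local/include" CFLAGS="-O2"
@end example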
929 930@item @option{CC_FOR_BUILD} 931@cindex @code{CC_FOR_BUILD} 932Some build-time programs are compiled and run to generate host-specific data 933tables. @samp{CC_FOR_BUILD} is the compiler used for this. It doesn't need 934to be in any particular ABI or mode, it merely needs to generate executables 935that can run. The default is to try the selected @samp{CC} and some likely 936candidates such as @samp{cc} and @samp{gcc}, looking for something that works. 937 938No flags are used with @samp{CC_FOR_BUILD} because a simple invocation like 939@samp{cc foo.c} should be enough. If some particular options are required 940they can be included as for instance @samp{CC_FOR_BUILD="cc -whatever"}. 941 942@item C++ Support, @option{--enable-cxx} 943@cindex C++ support 944@cindex @code{--enable-cxx} 945C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a 946C++ compiler will be required. As a convenience @samp{--enable-cxx=detect} 947can be used to enable C++ support only if a compiler can be found. The C++ 948support consists of a library @file{libgmpxx.la} and header file 949@file{gmpxx.h} (@pxref{Headers and Libraries}). 950 951A separate @file{libgmpxx.la} has been adopted rather than having C++ objects 952within @file{libgmp.la} in order to ensure dynamic linked C programs aren't 953bloated by a dependency on the C++ standard library, and to avoid any chance 954that the C++ compiler could be required when linking plain C programs. 955 956@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can 957only be expected to work with @file{libgmp.la} from the same GMP version. 958Future changes to the relevant internals will be accompanied by renaming, so a 959mismatch will cause unresolved symbols rather than perhaps mysterious 960misbehaviour. 961 962In general @file{libgmpxx.la} will be usable only with the C++ compiler that 963built it, since name mangling and runtime support are usually incompatible 964between different compilers. 
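
For example, C++ support might be requested in either of the following ways.
This is a sketch using only the options described above: the first form
requires a C++ compiler, while the second enables C++ support only if a
compiler can be found.

@example
./configure --enable-cxx
./configure --enable-cxx=detect
@end example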
965 966@item @option{CXX}, @option{CXXFLAGS} 967@cindex C++ compiler 968@cindex @code{CXX} 969@cindex @code{CXXFLAGS} 970When C++ support is enabled, the C++ compiler and its flags can be set with 971variables @samp{CXX} and @samp{CXXFLAGS} in the usual way. The default for 972@samp{CXX} is the first compiler that works from a list of likely candidates, 973with @command{g++} normally preferred when available. The default for 974@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then 975for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers 976@samp{-g} or nothing. Trying @samp{CFLAGS} this way is convenient when using 977@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will 978usually suit @samp{g++}. 979 980It's important that the C and C++ compilers match, meaning their startup and 981runtime support routines are compatible and that they generate code in the 982same ABI (if there's a choice of ABIs on the system). @samp{./configure} 983isn't currently able to check these things very well itself, so for that 984reason @samp{--disable-cxx} is the default, to avoid a build failure due to a 985compiler mismatch. Perhaps this will change in the future. 986 987Incidentally, it's normally not good enough to set @samp{CXX} to the same as 988@samp{CC}. Although @command{gcc} for instance recognises @file{foo.cc} as 989C++ code, only @command{g++} will invoke the linker the right way when 990building an executable or shared library from C++ object files. 991 992@item Temporary Memory, @option{--enable-alloca=<choice>} 993@cindex Temporary memory 994@cindex Stack overflow 995@cindex @code{alloca} 996@cindex @code{--enable-alloca} 997GMP allocates temporary workspace using one of the following three methods, 998which can be selected with for instance 999@samp{--enable-alloca=malloc-reentrant}. 1000 1001@itemize @bullet 1002@item 1003@samp{alloca} - C library or compiler builtin. 
@item
@samp{malloc-reentrant} - the heap, in a re-entrant fashion.
@item
@samp{malloc-notreentrant} - the heap, with global variables.
@end itemize

For convenience, the following choices are also available.
@samp{--disable-alloca} is the same as @samp{no}.

@itemize @bullet
@item
@samp{yes} - a synonym for @samp{alloca}.
@item
@samp{no} - a synonym for @samp{malloc-reentrant}.
@item
@samp{reentrant} - @code{alloca} if available, otherwise
@samp{malloc-reentrant}. This is the default.
@item
@samp{notreentrant} - @code{alloca} if available, otherwise
@samp{malloc-notreentrant}.
@end itemize

@code{alloca} is reentrant and fast, and is recommended. It actually allocates
just small blocks on the stack; larger ones use malloc-reentrant.

@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe,
but @samp{malloc-notreentrant} is faster and should be used if reentrancy is
not required.

The two malloc methods in fact use the memory allocation functions selected by
@code{mp_set_memory_functions}, these being @code{malloc} and friends by
default. @xref{Custom Allocation}.

An additional choice @samp{--enable-alloca=debug} is available, to help when
debugging memory related problems (@pxref{Debugging}).

@item FFT Multiplication, @option{--disable-fft}
@cindex FFT multiplication
@cindex @code{--disable-fft}
By default multiplications are done using Karatsuba, 3-way Toom, and
Fermat FFT@. The FFT is only used on large to very large operands and can be
disabled to save code size if desired.
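
For instance, a build that leaves out the FFT code would be configured with
something like the following (a sketch only, with any other options omitted).

@example
./configure --disable-fft
@end example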

@item Berkeley MP, @option{--enable-mpbsd}
@cindex Berkeley MP compatible functions
@cindex BSD MP compatible functions
@cindex @code{--enable-mpbsd}
The Berkeley MP compatibility library (@file{libmp}) and header file
(@file{mp.h}) are built and installed only if @option{--enable-mpbsd} is used.
@xref{BSD Compatible Functions}.

@item Assertion Checking, @option{--enable-assert}
@cindex Assertion checking
@cindex @code{--enable-assert}
This option enables some consistency checking within the library. This can be
of use while debugging, @pxref{Debugging}.

@item Execution Profiling, @option{--enable-profiling=prof/gprof/instrument}
@cindex Execution profiling
@cindex @code{--enable-profiling}
Enable profiling support, in one of various styles, @pxref{Profiling}.

@item @option{MPN_PATH}
@cindex @code{MPN_PATH}
Various assembly versions of each mpn subroutine are provided. For a given
CPU, a search is made through a path to choose a version of each. For example
@samp{sparcv8} has

@example
MPN_PATH="sparc32/v8 sparc32 generic"
@end example

which means look first for v8 code, then plain sparc32 (which is v7), and
finally fall back on generic C@. Knowledgeable users with special requirements
can specify a different path. Normally this is completely unnecessary.

@item Documentation
@cindex Documentation formats
@cindex Texinfo
The source for the document you're now reading is @file{doc/gmp.texi}, in
Texinfo format, see @GMPreftop{texinfo, Texinfo}.

@cindex Postscript
@cindex DVI
@cindex PDF
Info format @samp{doc/gmp.info} is included in the distribution. The usual
automake targets are available to make PostScript, DVI, PDF and HTML (these
will require various @TeX{} and Texinfo tools).
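
For instance, assuming a configured build tree and that the necessary @TeX{}
and Texinfo tools are installed, the standard automake documentation targets
could be run as follows. This is only a sketch, which targets actually
succeed depends on the tools available.

@example
cd doc
make ps
make dvi
make pdf
make html
@end example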

@cindex DocBook
@cindex XML
DocBook and XML can be generated by the Texinfo @command{makeinfo} program
too, see @ref{makeinfo options,, Options for @command{makeinfo}, texinfo,
Texinfo}.

Some supplementary notes can also be found in the @file{doc} subdirectory.

@end table


@need 2000
@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP
@section ABI and ISA
@cindex ABI
@cindex Application Binary Interface
@cindex ISA
@cindex Instruction Set Architecture

ABI (Application Binary Interface) refers to the calling conventions between
functions, meaning what registers are used and what sizes the various C data
types are. ISA (Instruction Set Architecture) refers to the instructions and
registers a CPU has available.

Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the
latter for compatibility with older CPUs in the family. GMP supports some
CPUs like this in both ABIs. In fact within GMP @samp{ABI} means a
combination of chip ABI, plus how GMP chooses to use it. For example in some
32-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit
@code{long long}.

By default GMP chooses the best ABI available for a given system, and this
generally gives significantly greater speed. But an ABI can be chosen
explicitly to make GMP compatible with other libraries, or particular
application requirements. For example,

@example
./configure ABI=32
@end example

In all cases it's vital that all object code used in a given program is
compiled for the same ABI.

Usually a limb is implemented as a @code{long}. When a @code{long long} limb
is used this is encoded in the generated @file{gmp.h}. This is convenient for
applications, but it does mean that @file{gmp.h} will vary, and can't be just
copied around.
@file{gmp.h} remains compiler independent though, since all
compilers for a particular ABI will be expected to use the same limb type.

Currently no attempt is made to follow whatever conventions a system has for
installing library or header files built for a particular ABI@. This will
probably only matter when installing multiple builds of GMP, and it might be
as simple as configuring with a special @samp{libdir}, or it might require
more than that. Note that builds for different ABIs need to be done
separately, with a fresh @command{./configure} and @command{make} each.

@sp 1
@table @asis
@need 1000
@item AMD64 (@samp{x86_64})
@cindex AMD64
On AMD64 systems supporting both 32-bit and 64-bit modes for applications, the
following ABI choices are available.

@table @asis
@item @samp{ABI=64}
The 64-bit ABI uses 64-bit limbs and pointers and makes full use of the chip
architecture. This is the default. Applications will usually not need
special compiler flags, but for reference the option is

@example
gcc -m64
@end example

@item @samp{ABI=32}
The 32-bit ABI is the usual i386 conventions. This will be slower, and is not
recommended except for inter-operating with other code not yet 64-bit capable.
Applications must be compiled with

@example
gcc -m32
@end example

(In GCC 2.95 and earlier there's no @samp{-m32} option, it's the only mode.)
@end table

@sp 1
@need 1000
@item HPPA 2.0 (@samp{hppa2.0*}, @samp{hppa64})
@cindex HPPA
@cindex HP-UX
@table @asis
@item @samp{ABI=2.0w}
The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or
up.
Applications must be compiled with

@example
gcc [built for 2.0w]
cc +DD64
@end example

@item @samp{ABI=2.0n}
The 2.0n ABI means the 32-bit HPPA 1.0 ABI and all its normal calling
conventions, but with 64-bit instructions permitted within functions. GMP
uses a 64-bit @code{long long} for a limb. This ABI is available on hppa64
GNU/Linux and on HP-UX 10 or higher. Applications must be compiled with

@example
gcc [built for 2.0n]
cc +DA2.0 +e
@end example

Note that current versions of GCC (eg.@: 3.2) don't generate 64-bit
instructions for @code{long long} operations and so may be slower than for
2.0w. (The GMP assembly code is the same though.)

@item @samp{ABI=1.0}
HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI@.
No special compiler options are needed for applications.
@end table

All three ABIs are available for CPU types @samp{hppa2.0w}, @samp{hppa2.0} and
@samp{hppa64}, but for CPU type @samp{hppa2.0n} only 2.0n or 1.0 are
considered.

Note that GCC on HP-UX has no options to choose between 2.0n and 2.0w modes,
unlike HP @command{cc}. Instead it must be built for one or the other ABI@.
GMP will detect how it was built, and skip to the corresponding @samp{ABI}.

@sp 1
@need 1500
@item IA-64 under HP-UX (@samp{ia64*-*-hpux*}, @samp{itanium*-*-hpux*})
@cindex IA-64
@cindex HP-UX
HP-UX supports two ABIs for IA-64. GMP performance is the same in both.

@table @asis
@item @samp{ABI=32}
In the 32-bit ABI, pointers, @code{int}s and @code{long}s are 32 bits and GMP
uses a 64-bit @code{long long} for a limb.
Applications can be compiled
without any special flags since this ABI is the default in both HP C and GCC,
but for reference the flags are

@example
gcc -milp32
cc +DD32
@end example

@item @samp{ABI=64}
In the 64-bit ABI, @code{long}s and pointers are 64 bits and GMP uses a
@code{long} for a limb. Applications must be compiled with

@example
gcc -mlp64
cc +DD64
@end example
@end table

On other IA-64 systems, GNU/Linux for instance, @samp{ABI=64} is the only
choice.

@sp 1
@need 1000
@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]})
@cindex MIPS
@cindex IRIX
IRIX 6 always has a 64-bit MIPS 3 or better CPU, and supports ABIs o32, n32,
and 64. n32 or 64 are recommended, and GMP performance will be the same in
each. The default is n32.

@table @asis
@item @samp{ABI=o32}
The o32 ABI is 32-bit pointers and integers, and no 64-bit operations. GMP
will be slower than in n32 or 64; this option only exists to support old
compilers, eg.@: GCC 2.7.2. Applications can be compiled with no special
flags on an old compiler, or on a newer compiler with

@example
gcc -mabi=32
cc -32
@end example

@item @samp{ABI=n32}
The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a
@code{long long}. Applications must be compiled with

@example
gcc -mabi=n32
cc -n32
@end example

@item @samp{ABI=64}
The 64-bit ABI is 64-bit pointers and integers. Applications must be compiled
with

@example
gcc -mabi=64
cc -64
@end example
@end table

Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary
support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code.

@sp 1
@need 1000
@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630}, @samp{powerpc970}, @samp{power4}, @samp{power5})
@cindex PowerPC
@table @asis
@item @samp{ABI=aix64}
@cindex AIX
The AIX 64 ABI uses 64-bit limbs and pointers and is the default on PowerPC 64
@samp{*-*-aix*} systems. Applications must be compiled with

@example
gcc -maix64
xlc -q64
@end example

@item @samp{ABI=mode64}
The @samp{mode64} ABI uses 64-bit limbs and pointers, and is the default on
64-bit GNU/Linux, BSD, and Mac OS X/Darwin systems. Applications must be
compiled with

@example
gcc -m64
@end example

@item @samp{ABI=mode32}
@cindex AIX
The @samp{mode32} ABI uses a 64-bit @code{long long} limb but with the chip
still in 32-bit mode and using 32-bit calling conventions. This is the default
for systems where the true 64-bit ABI is unavailable. No special compiler
options are typically needed for applications.

@item @samp{ABI=32}
This is the basic 32-bit PowerPC ABI, with a 32-bit limb. No special compiler
options are needed for applications.
@end table

GMP's speed is greatest for @samp{aix64} and @samp{mode64}. In @samp{ABI=32}
only the 32-bit ISA is used and this doesn't make full use of a 64-bit chip.
On a suitable system we could perhaps use more of the ISA, but there are no
plans to do so.

@sp 1
@need 1000
@item Sparc V9 (@samp{sparc64}, @samp{sparcv9}, @samp{ultrasparc*})
@cindex Sparc V9
@cindex Solaris
@cindex Sun
@table @asis
@item @samp{ABI=64}
The 64-bit V9 ABI is available on the various BSD sparc64 ports, recent
versions of Sparc64 GNU/Linux, and Solaris 2.7 and up (when the kernel is in
64-bit mode). GCC 3.2 or higher, or Sun @command{cc} is required.
On
GNU/Linux, depending on the default @command{gcc} mode, applications must be
compiled with

@example
gcc -m64
@end example

On Solaris applications must be compiled with

@example
gcc -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9
cc -xarch=v9
@end example

On the BSD sparc64 systems no special options are required, since 64-bits is
the only ABI available.

@item @samp{ABI=32}
For the basic 32-bit ABI, GMP still uses as much of the V9 ISA as it can. In
the Sun documentation this combination is known as ``v8plus''. On GNU/Linux,
depending on the default @command{gcc} mode, applications may need to be
compiled with

@example
gcc -m32
@end example

On Solaris, no special compiler options are required for applications, though
using something like the following is recommended. (@command{gcc} 2.8 and
earlier only support @samp{-mv8} though.)

@example
gcc -mv8plus
cc -xarch=v8plus
@end example
@end table

GMP speed is greatest in @samp{ABI=64}, so it's the default where available.
The speed is partly because there are extra registers available and partly
because 64-bits is considered the more important case and has therefore had
better code written for it.

Don't be confused by the names of the @samp{-m} and @samp{-x} compiler
options, they're called @samp{arch} but effectively control both ABI and ISA@.

On Solaris 2.6 and earlier, only @samp{ABI=32} is available since the kernel
doesn't save all registers.

On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will
reject @samp{ABI=64} because the resulting executables won't run.
@samp{ABI=64} can still be built if desired by making it look like a
cross-compile, for example

@example
./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64
@end example
@end table


@need 2000
@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP
@section Notes for Package Builds
@cindex Build notes for binary packaging
@cindex Packaged builds

GMP should present no great difficulties for packaging in a binary
distribution.

@cindex Libtool versioning
@cindex Shared library versioning
Libtool is used to build the library and @samp{-version-info} is set
appropriately, having started from @samp{3:0:0} in GMP 3.0 (@pxref{Versioning,
Library interface versions, Library interface versions, libtool, GNU
Libtool}).

The GMP 4 series will be upwardly binary compatible in each release and will
be upwardly binary compatible with all of the GMP 3 series. Additional
function interfaces may be added in each release, so on systems where libtool
versioning is not fully checked by the loader an auxiliary mechanism may be
needed to express that a dynamic linked application depends on a new enough
GMP.

An auxiliary mechanism may also be needed to express that @file{libgmpxx.la}
(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la}
from the same GMP version, since this is not done by the libtool versioning,
nor otherwise. A mismatch will result in unresolved symbols from the linker,
or perhaps the loader.

When building a package for a CPU family, care should be taken to use
@samp{--host} (or @samp{--build}) to choose the least common denominator among
the CPUs which might use the package. For example this might mean plain
@samp{sparc} (meaning V7) for SPARCs.
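
For instance, a package meant to run on any SPARC might be configured along
the following lines. The vendor and system parts of the triplet shown here
are only an illustration, and should be replaced by whatever applies to the
system being packaged for.

@example
./configure --build=sparc-sun-solaris2.7
@end example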

For x86s, @option{--enable-fat} sets things up for a fat binary build, making a
runtime selection of optimized low level routines. This is a good choice for
packaging to run on a range of x86 chips.

Users who care about speed will want GMP built for their exact CPU type, to
make best use of the available optimizations. Providing a way to suitably
rebuild a package may be useful. This could be as simple as making it
possible for a user to omit @samp{--build} (and @samp{--host}) so
@samp{./config.guess} will detect the CPU@. But a way to manually specify a
@samp{--build} will be wanted for systems where @samp{./config.guess} is
inexact.

On systems with multiple ABIs, a packaged build will need to decide which
among the choices is to be provided, see @ref{ABI and ISA}. A given run of
@samp{./configure} etc will only build one ABI@. If a second ABI is also
required then a second run of @samp{./configure} etc must be made, starting
from a clean directory tree (@samp{make distclean}).

As noted under ``ABI and ISA'', currently no attempt is made to follow system
conventions for install locations that vary with ABI, such as
@file{/usr/lib/sparcv9} for @samp{ABI=64} as opposed to @file{/usr/lib} for
@samp{ABI=32}. A package build can override @samp{libdir} and other standard
variables as necessary.

Note that @file{gmp.h} is a generated file, and will be architecture and ABI
dependent. When attempting to install two ABIs simultaneously it will be
important that an application compile gets the correct @file{gmp.h} for its
desired ABI@. If compiler include paths don't vary with ABI options then it
might be necessary to create a @file{/usr/include/gmp.h} which tests
preprocessor symbols and chooses the correct actual @file{gmp.h}.
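
As a rough sketch of such a wrapper, the following assumes the two generated
headers have been installed under the hypothetical names @file{gmp-32.h} and
@file{gmp-64.h}, and that the compiler defines @code{__LP64__} or
@code{_LP64} in its 64-bit mode (GCC does so on LP64 targets). Other
compilers and ABIs will need different tests, and a 32-bit ABI using a
@code{long long} limb can't be told apart from a plain 32-bit one this way.

@example
/* Hypothetical /usr/include/gmp.h wrapper, selecting the real
   header for the ABI in use.  The names gmp-64.h and gmp-32.h
   are illustrative only, they are not installed by GMP.  */
#if defined (__LP64__) || defined (_LP64)
#include "gmp-64.h"
#else
#include "gmp-32.h"
#endif
@end example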


@need 2000
@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP
@section Notes for Particular Systems
@cindex Build notes for particular systems
@cindex Particular systems
@cindex Systems
@table @asis

@c This section is more or less meant for notes about performance or about
@c build problems that have been worked around but might leave a user
@c scratching their head. Fun with different ABIs on a system belongs in the
@c above section.

@item AIX 3 and 4
@cindex AIX
On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since
some versions of the native @command{ar} fail on the convenience libraries
used. A shared build can be attempted with

@example
./configure --enable-shared --disable-static
@end example

Note that the @samp{--disable-static} is necessary because in a shared build
libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for
the benefit of old versions of @command{ld} which only recognise @file{.a},
but unfortunately this is done even if a fully functional @command{ld} is
available.

@item ARM
@cindex ARM
On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a
bug in unsigned division, giving wrong results for some operands. GMP
@samp{./configure} will demand GCC 2.95.4 or later.

@item Compaq C++
@cindex Compaq C++
Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and
an old pre-standard one (see @samp{man iostream_intro}). GMP can only use the
standard one, which unfortunately is not the default but must be selected by
defining @code{__USE_STD_IOSTREAM}.
Configure with for instance

@example
./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM
@end example

@item Floating Point Mode
@cindex Floating point mode
@cindex Hardware floating point mode
@cindex Precision of hardware floating point
@cindex x87
On some systems, the hardware floating point has a control mode which can set
all operations to be done in a particular precision, for instance single,
double or extended on x86 systems (x87 floating point). The GMP functions
involving a @code{double} cannot be expected to operate to their full
precision when the hardware is in single precision mode. Of course this
affects all code, including application code, not just GMP.

@item MS-DOS and MS Windows
@cindex MS-DOS
@cindex MS Windows
@cindex Windows
@cindex Cygwin
@cindex DJGPP
@cindex MINGW
On an MS-DOS system DJGPP can be used to build GMP, and on an MS Windows
system Cygwin, DJGPP and MINGW can be used. All three are excellent ports of
GCC and the various GNU tools.

@display
@uref{http://www.cygwin.com/}
@uref{http://www.delorie.com/djgpp/}
@uref{http://www.mingw.org/}
@end display

@cindex Interix
@cindex Services for Unix
Microsoft also publishes an Interix ``Services for Unix'' which can be used to
build GMP on Windows (with a normal @samp{./configure}), but it's not free
software.

@item MS Windows DLLs
@cindex DLLs
@cindex MS Windows
@cindex Windows
On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by
default GMP builds only a static library, but a DLL can be built instead using

@example
./configure --disable-static --enable-shared
@end example

Static and DLL libraries can't both be built, since certain export directives
in @file{gmp.h} must be different.

A MINGW DLL build of GMP can be used with Microsoft C@.
Libtool doesn't
install a @file{.lib} format import library, but it can be created with MS
@command{lib} as follows, and copied to the install directory. Similarly for
@file{libmp} and @file{libgmpxx}.

@example
cd .libs
lib /def:libgmp-3.dll.def /out:libgmp-3.lib
@end example

MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications
wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do
the same. If one of the other C runtime library choices provided by MS C is
desired then the suggestion is to use the GMP string functions and confine I/O
to the application.

@item Motorola 68k CPU Types
@cindex 68000
@samp{m68k} is taken to mean 68000. @samp{m68020} or higher will give a
performance boost on applicable CPUs. @samp{m68360} can be used for CPU32
series chips. @samp{m68302} can be used for ``Dragonball'' series chips,
though this is merely a synonym for @samp{m68000}.

@item OpenBSD 2.6
@cindex OpenBSD
@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it
unsuitable for @file{.asm} file processing. @samp{./configure} will detect
the problem and either abort or choose another m4 in the @env{PATH}. The bug
is fixed in OpenBSD 2.7, so either upgrade or use GNU m4.

@item Power CPU Types
@cindex Power/PowerPC
In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions
not available on the other, so it's important to choose the right one for the
CPU that will be used. Currently GMP has no assembly code support for using
just the common instruction subset. To get executables that run on both, the
current suggestion is to use the generic C code (CPU @samp{none}), possibly
with appropriate compiler options (like @samp{-mcpu=common} for
@command{gcc}).
CPU @samp{rs6000} (which is not a CPU but a family of
workstations) is accepted by @file{config.sub}, but is currently equivalent to
@samp{none}.

@item Sparc CPU Types
@cindex Sparc
@samp{sparcv8} or @samp{supersparc} on relevant systems will give a
significant performance increase over the V7 code selected by plain
@samp{sparc}.

@item Sparc App Regs
@cindex Sparc
The GMP assembly code for both 32-bit and 64-bit Sparc clobbers the
``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way
that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,, SPARC
Options, gcc, Using the GNU Compiler Collection (GCC)}).

This makes that code unsuitable for use with the special V9
@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer), and
for applications wanting to use those registers for special purposes. In these
cases the only suggestion currently is to build GMP with CPU @samp{none} to
avoid the assembly code.

@item SunOS 4
@cindex SunOS
@command{/usr/bin/m4} lacks various features needed to process @file{.asm}
files, and instead @samp{./configure} will automatically use
@command{/usr/5bin/m4}, which we believe is always available (if not then use
GNU m4).

@item x86 CPU Types
@cindex x86
@cindex 80x86
@cindex i386
@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for its intended
P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II,
P-III)@. @samp{i386} is a better choice when making binaries that must run on
both.

@item x86 MMX and SSE2 Code
@cindex MMX
@cindex SSE2
If the CPU selected has MMX code but the assembler doesn't support it, a
warning is given and non-MMX code is used instead.
This will be an inferior
build, since the MMX code that's present is there because it's faster than the
corresponding plain integer code. The same applies to SSE2.

Old versions of @samp{gas} don't support MMX instructions, in particular
version 1.92.3 that comes with FreeBSD 2.2.8 or the more recent OpenBSD 3.1
doesn't.

Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register
to register @code{movq} instructions, and so can't be used for MMX code.
Install a recent @command{gas} if MMX code is wanted on these systems.
@end table


@need 2000
@node Known Build Problems, Performance optimization, Notes for Particular Systems, Installing GMP
@section Known Build Problems
@cindex Build problems known

@c This section is more or less meant for known build problems that are not
@c otherwise worked around and require some sort of manual intervention.

You might find more up-to-date information at @uref{http://gmplib.org/}.

@table @asis
@item Compiler link options
The version of libtool currently in use rather aggressively strips compiler
options when linking a shared library. This will hopefully be relaxed in the
future, but for now if this is a problem the suggestion is to create a little
script to hide them, and for instance configure with

@example
./configure CC=gcc-with-my-options
@end example

@item DJGPP (@samp{*-*-msdosdjgpp*})
@cindex DJGPP
The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure}
script, it exits silently, having died writing a preamble to
@file{config.log}. Use @command{bash} 2.04 or higher.

@samp{make all} was found to run out of memory during the final
@file{libgmp.la} link on one system tested, despite having 64Mb available.
Running @samp{make libgmp.la} directly helped, perhaps recursing into the
various subdirectories uses up memory.

@item GNU binutils @command{strip} prior to 2.12
@cindex Stripped libraries
@cindex Binutils @command{strip}
@cindex GNU @command{strip}
@command{strip} from GNU binutils 2.11 and earlier should not be used on the
static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all
but the last of multiple archive members with the same name, like the three
versions of @file{init.o} in @file{libgmp.a}. Binutils 2.12 or higher can be
used successfully.

The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by
this and any version of @command{strip} can be used on them.

@item @command{make} syntax error
@cindex SCO
@cindex IRIX
On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make}
is unable to handle the long dependencies list for @file{libgmp.la}. The
symptom is a ``syntax error'' on the following line of the top-level
@file{Makefile}.

@example
libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES)
@end example

Either use GNU Make, or as a workaround remove
@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial
build work, but if any recompiling is done @file{libgmp.la} might not be
rebuilt).

@item MacOS X (@samp{*-*-darwin*})
@cindex MacOS X
@cindex Darwin
Libtool currently only knows how to create shared libraries on MacOS X using
the native @command{cc} (which is a modified GCC), not a plain GCC@. A
static-only build should work though (@samp{--disable-shared}).

@item NeXT prior to 3.3
@cindex NeXT
The system compiler on old versions of NeXT was a massacred and old GCC, even
if it called itself @file{cc}. This compiler cannot be used to build GMP, you
need to get a real GCC, and install that.
(NeXT may have fixed this in
release 3.3 of their system.)

@item POWER and PowerPC
@cindex Power/PowerPC
Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or
PowerPC@. If you want to use GCC for these machines, get GCC 2.7.2.1 (or
later).

@item Sequent Symmetry
@cindex Sequent Symmetry
Use the GNU assembler instead of the system assembler, since the latter has
serious bugs.

@item Solaris 2.6
@cindex Solaris
The system @command{sed} prints an error ``Output line too long'' when libtool
builds @file{libgmp.la}. This doesn't seem to cause any obvious ill effects,
but GNU @command{sed} is recommended, to avoid any doubt.

@item Sparc Solaris 2.7 with gcc 2.95.2 in @samp{ABI=32}
@cindex Solaris
A shared library build of GMP seems to fail in this combination, it builds but
then fails the tests, apparently due to some incorrect data relocations within
@code{gmp_randinit_lc_2exp_size}. The exact cause is unknown,
@samp{--disable-shared} is recommended.
@end table


@need 2000
@node Performance optimization, , Known Build Problems, Installing GMP
@section Performance optimization
@cindex Optimizing performance

@c At some point, this should perhaps move to a separate chapter on optimizing
@c performance.

For optimal performance, build GMP for the exact CPU type of the target
computer, see @ref{Build Options}.

Unlike what is the case for most other programs, the compiler typically
doesn't matter much, since GMP uses assembly language for the most critical
operations.

In particular for long-running GMP applications, and applications demanding
extremely large numbers, building and running the @code{tuneup} program in the
@file{tune} subdirectory can be important.
For example,

@example
cd tune
make tuneup
./tuneup
@end example

will generate better contents for the @file{gmp-mparam.h} parameter file.

To use the results, put the output in the file indicated in the
@samp{Parameters for ...} header. Then recompile from scratch.

The @code{tuneup} program takes one useful parameter, @samp{-f NNN}, which
instructs the program how long to check FFT multiply parameters. If you're
going to use GMP for extremely large numbers, you may want to run @code{tuneup}
with a large NNN value.


@node GMP Basics, Reporting Bugs, Installing GMP, Top
@comment node-name, next, previous, up
@chapter GMP Basics
@cindex Basics

@strong{Using functions, macros, data types, etc.@: not documented in this
manual is strongly discouraged. If you do so your application is guaranteed
to be incompatible with future versions of GMP.}

@menu
* Headers and Libraries::
* Nomenclature and Types::
* Function Classes::
* Variable Conventions::
* Parameter Conventions::
* Memory Management::
* Reentrancy::
* Useful Macros and Constants::
* Compatibility with older versions::
* Demonstration Programs::
* Efficiency::
* Debugging::
* Profiling::
* Autoconf::
* Emacs::
@end menu

@node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics
@section Headers and Libraries
@cindex Headers

@cindex @file{gmp.h}
@cindex Include files
@cindex @code{#include}
All declarations needed to use GMP are collected in the include file
@file{gmp.h}. It is designed to work with both C and C++ compilers.

@example
#include <gmp.h>
@end example

@cindex @code{stdio.h}
Note however that prototypes for GMP functions with @code{FILE *} parameters
are only provided if @code{<stdio.h>} is included too.

@example
#include <stdio.h>
#include <gmp.h>
@end example

@cindex @code{stdarg.h}
Likewise @code{<stdarg.h>} (or @code{<varargs.h>}) is required for prototypes
with @code{va_list} parameters, such as @code{gmp_vprintf}.  And
@code{<obstack.h>} for prototypes with @code{struct obstack} parameters, such
as @code{gmp_obstack_printf}, when available.

@cindex Libraries
@cindex Linking
@cindex @code{libgmp}
All programs using GMP must link against the @file{libgmp} library.  On a
typical Unix-like system this can be done with @samp{-lgmp}, for example

@example
gcc myprogram.c -lgmp
@end example

@cindex @code{libgmpxx}
GMP C++ functions are in a separate @file{libgmpxx} library.  This is built
and installed if C++ support has been enabled (@pxref{Build Options}).  For
example,

@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example

@cindex Libtool
GMP is built using Libtool and an application can use that to link if desired,
@GMPpxreftop{libtool, GNU Libtool}.

If GMP has been installed to a non-standard location then it may be necessary
to use @samp{-I} and @samp{-L} compiler options to point to the right
directories, and some sort of run-time path for a shared library.


@node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics
@section Nomenclature and Types
@cindex Nomenclature
@cindex Types

@cindex Integer
@tindex @code{mpz_t}
In this manual, @dfn{integer} usually means a multiple precision integer, as
defined by the GMP library.  The C data type for such integers is @code{mpz_t}.
Here are some examples of how to declare such integers:

@example
mpz_t sum;

struct foo @{ mpz_t x, y; @};

mpz_t vec[20];
@end example

@cindex Rational number
@tindex @code{mpq_t}
@dfn{Rational number} means a multiple precision fraction.  The C data type
for these fractions is @code{mpq_t}.  For example:

@example
mpq_t quotient;
@end example

@cindex Floating-point number
@tindex @code{mpf_t}
@dfn{Floating point number} or @dfn{Float} for short, is an arbitrary precision
mantissa with a limited precision exponent.  The C data type for such objects
is @code{mpf_t}.  For example:

@example
mpf_t fp;
@end example

@tindex @code{mp_exp_t}
The floating point functions accept and return exponents in the C type
@code{mp_exp_t}.  Currently this is usually a @code{long}, but on some systems
it's an @code{int} for efficiency.

@cindex Limb
@tindex @code{mp_limb_t}
A @dfn{limb} means the part of a multi-precision number that fits in a single
machine word.  (We chose this word because a limb of the human body is
analogous to a digit, only larger, and containing several digits.)  Normally a
limb is 32 or 64 bits.  The C data type for a limb is @code{mp_limb_t}.

@tindex @code{mp_size_t}
Counts of limbs of a multi-precision number are represented in the C type
@code{mp_size_t}.  Currently this is normally a @code{long}, but on some
systems it's an @code{int} for efficiency, and on some systems it will be
@code{long long} in the future.

@tindex @code{mp_bitcnt_t}
Counts of bits of a multi-precision number are represented in the C type
@code{mp_bitcnt_t}.  Currently this is always an @code{unsigned long}, but on
some systems it will be an @code{unsigned long long} in the future.
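
As a rough sketch of how these types appear in application code, the following
takes a shift count as an @code{mp_bitcnt_t} and returns a bit length as a
@code{size_t}.  The function name @code{bits_after_shift} is hypothetical,
chosen only for this illustration; the GMP calls are the documented
@code{mpz_init_set_ui}, @code{mpz_mul_2exp}, @code{mpz_sizeinbase} and
@code{mpz_clear}.

@example
#include <gmp.h>

/* Hypothetical helper: number of binary digits in 2^shift.  */
size_t
bits_after_shift (mp_bitcnt_t shift)
@{
  mpz_t n;
  size_t bits;

  mpz_init_set_ui (n, 1UL);
  mpz_mul_2exp (n, n, shift);       /* n = 2^shift */
  bits = mpz_sizeinbase (n, 2);     /* digits of n in base 2 */
  mpz_clear (n);
  return bits;
@}
@end example

For instance @code{bits_after_shift (100)} gives 101, since @math{2^100} is a
one followed by 100 zero bits.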

@cindex Random state
@tindex @code{gmp_randstate_t}
@dfn{Random state} means an algorithm selection and current state data.  The C
data type for such objects is @code{gmp_randstate_t}.  For example:

@example
gmp_randstate_t rstate;
@end example

Also, in general @code{mp_bitcnt_t} is used for bit counts and ranges, and
@code{size_t} is used for byte or character counts.


@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics
@section Function Classes
@cindex Function classes

There are six classes of functions in the GMP library:

@enumerate
@item
Functions for signed integer arithmetic, with names beginning with
@code{mpz_}.  The associated type is @code{mpz_t}.  There are about 150
functions in this class.  (@pxref{Integer Functions})

@item
Functions for rational number arithmetic, with names beginning with
@code{mpq_}.  The associated type is @code{mpq_t}.  There are about 40
functions in this class, but the integer functions can be used for arithmetic
on the numerator and denominator separately.  (@pxref{Rational Number
Functions})

@item
Functions for floating-point arithmetic, with names beginning with
@code{mpf_}.  The associated type is @code{mpf_t}.  There are about 60
functions in this class.  (@pxref{Floating-point Functions})

@item
Functions compatible with Berkeley MP, such as @code{itom}, @code{madd}, and
@code{mult}.  The associated type is @code{MINT}.  (@pxref{BSD Compatible
Functions})

@item
Fast low-level functions that operate on natural numbers.  These are used by
the functions in the preceding groups, and you can also call them directly
from very time-critical user programs.  These functions' names begin with
@code{mpn_}.  The associated type is array of @code{mp_limb_t}.  There are
about 30 (hard-to-use) functions in this class.
(@pxref{Low-level Functions})

@item
Miscellaneous functions.  Functions for setting up custom allocation and
functions for generating random numbers.  (@pxref{Custom Allocation}, and
@pxref{Random Number Functions})
@end enumerate


@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics
@section Variable Conventions
@cindex Variable conventions
@cindex Conventions for variables

GMP functions generally have output arguments before input arguments.  This
notation is by analogy with the assignment operator.  The BSD MP compatibility
functions are exceptions, having the output arguments last.

GMP lets you use the same variable for both input and output in one call.  For
example, the main function for integer multiplication, @code{mpz_mul}, can be
used to square @code{x} and put the result back in @code{x} with

@example
mpz_mul (x, x, x);
@end example

Before you can assign to a GMP variable, you need to initialize it by calling
one of the special initialization functions.  When you're done with a
variable, you need to clear it out, using one of the functions for that
purpose.  Which function to use depends on the type of variable.  See the
chapters on integer functions, rational number functions, and floating-point
functions for details.

A variable should only be initialized once, or at least cleared between each
initialization.  After a variable has been initialized, it may be assigned to
any number of times.

For efficiency reasons, avoid excessive initializing and clearing.  In
general, initialize near the start of a function and clear near the end.
For
example,

@example
void
foo (void)
@{
  mpz_t  n;
  int    i;
  mpz_init (n);
  for (i = 1; i < 100; i++)
    @{
      mpz_mul (n, @dots{});
      mpz_fdiv_q (n, @dots{});
      @dots{}
    @}
  mpz_clear (n);
@}
@end example


@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics
@section Parameter Conventions
@cindex Parameter conventions
@cindex Conventions for parameters

When a GMP variable is used as a function parameter, it's effectively a
call-by-reference, meaning if the function stores a value there it will change
the original in the caller.  Parameters which are input-only can be designated
@code{const} to provoke a compiler error or warning on attempting to modify
them.

When a function is going to return a GMP result, it should designate a
parameter that it sets, like the library functions do.  More than one value
can be returned by having more than one output parameter, again like the
library functions.  A @code{return} of an @code{mpz_t} etc doesn't return the
object, only a pointer, and this is almost certainly not what's wanted.

Here's an example accepting an @code{mpz_t} parameter, doing a calculation,
and storing the result to the indicated parameter.

@example
void
foo (mpz_t result, const mpz_t param, unsigned long n)
@{
  unsigned long  i;
  mpz_mul_ui (result, param, n);
  for (i = 1; i < n; i++)
    mpz_add_ui (result, result, i*7);
@}

int
main (void)
@{
  mpz_t  r, n;
  mpz_init (r);
  mpz_init_set_str (n, "123456", 0);
  foo (r, n, 20L);
  gmp_printf ("%Zd\n", r);
  return 0;
@}
@end example

@code{foo} works even if the mainline passes the same variable for
@code{param} and @code{result}, just like the library functions.
But
sometimes it's tricky to make that work, and an application might not want to
bother supporting that sort of thing.

For interest, the GMP types @code{mpz_t} etc are implemented as one-element
arrays of certain structures.  This is why declaring a variable creates an
object with the fields GMP needs, but then using it as a parameter passes a
pointer to the object.  Note that the actual fields in each @code{mpz_t} etc
are for internal use only and should not be accessed directly by code that
expects to be compatible with future GMP releases.


@need 1000
@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics
@section Memory Management
@cindex Memory management

The GMP types like @code{mpz_t} are small, containing only a couple of sizes,
and pointers to allocated data.  Once a variable is initialized, GMP takes
care of all space allocation.  Additional space is allocated whenever a
variable doesn't have enough.

@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space.
Normally this is the best policy, since it avoids frequent reallocation.
Applications that need to return memory to the heap at some particular point
can use @code{mpz_realloc2}, or clear variables no longer needed.

@code{mpf_t} variables, in the current implementation, use a fixed amount of
space, determined by the chosen precision and allocated at initialization, so
their size doesn't change.

All memory is allocated using @code{malloc} and friends by default, but this
can be changed, see @ref{Custom Allocation}.  Temporary memory on the stack is
also used (via @code{alloca}), but this can be changed at build-time if
desired, see @ref{Build Options}.
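
As a minimal sketch of handing memory back with @code{mpz_realloc2}, assuming
the application knows a variable once held a much larger value than it does
now, something like the following could be used.  The name
@code{shrink_to_fit} is hypothetical and not part of GMP.

@example
#include <gmp.h>

/* Hypothetical helper: reduce the space allocated for X to roughly
   what its current value needs.  mpz_realloc2 preserves the value
   when it fits in the requested number of bits, and mpz_sizeinbase
   returns 1 for zero, so the request is never for 0 bits.  */
void
shrink_to_fit (mpz_t x)
@{
  mpz_realloc2 (x, mpz_sizeinbase (x, 2));
@}
@end example

A typical use would be after a large intermediate result has been replaced by
a small one, for example

@example
mpz_ui_pow_ui (x, 2UL, 100000UL);   /* allocation grows */
mpz_set_ui (x, 5UL);                /* value shrinks, space is kept */
shrink_to_fit (x);                  /* space is handed back */
@end example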


@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics
@section Reentrancy
@cindex Reentrancy
@cindex Thread safety
@cindex Multi-threading

@noindent
GMP is reentrant and thread-safe, with some exceptions:

@itemize @bullet
@item
If configured with @option{--enable-alloca=malloc-notreentrant} (or with
@option{--enable-alloca=notreentrant} when @code{alloca} is not available),
then naturally GMP is not reentrant.

@item
@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the
selected precision.  @code{mpf_init2} can be used instead, and in the C++
interface an explicit precision to the @code{mpf_class} constructor.

@item
@code{mpz_random} and the other old random number functions use a global
random state and are hence not reentrant.  The newer random number functions
that accept a @code{gmp_randstate_t} parameter can be used instead.

@item
@code{gmp_randinit} (obsolete) returns an error indication through a global
variable, which is not thread safe.  Applications are advised to use
@code{gmp_randinit_default} or @code{gmp_randinit_lc_2exp} instead.

@item
@code{mp_set_memory_functions} uses global variables to store the selected
memory allocation functions.

@item
If the memory allocation functions set by a call to
@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are
not reentrant, then GMP will not be reentrant either.

@item
If the standard I/O functions such as @code{fwrite} are not reentrant then the
GMP I/O functions using them will not be reentrant either.

@item
It's safe for two threads to read from the same GMP variable simultaneously,
but it's not safe for one to read while another might be writing, nor for
two threads to write simultaneously.
It's not safe for two threads to
generate a random number from the same @code{gmp_randstate_t} simultaneously,
since this involves an update of that variable.
@end itemize


@need 2000
@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics
@section Useful Macros and Constants
@cindex Useful macros and constants
@cindex Constants

@deftypevr {Global Constant} {const int} mp_bits_per_limb
@findex mp_bits_per_limb
@cindex Bits per limb
@cindex Limb size
The number of bits per limb.
@end deftypevr

@defmac __GNU_MP_VERSION
@defmacx __GNU_MP_VERSION_MINOR
@defmacx __GNU_MP_VERSION_PATCHLEVEL
@cindex Version number
@cindex GMP version number
The major and minor GMP version, and patch level, respectively, as integers.
For GMP i.j, these numbers will be i, j, and 0, respectively.
For GMP i.j.k, these numbers will be i, j, and k, respectively.
@end defmac

@deftypevr {Global Constant} {const char * const} gmp_version
@findex gmp_version
The GMP version number, as a null-terminated string, in the form ``i.j.k''.
This release is @nicode{"@value{VERSION}"}.  Note that before version 4.3.0
the format ``i.j'' was used when k was zero.
@end deftypevr

@defmac __GMP_CC
@defmacx __GMP_CFLAGS
The compiler and compiler flags, respectively, used when compiling GMP, as
strings.
@end defmac


@node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics
@section Compatibility with older versions
@cindex Compatibility with older versions
@cindex Past GMP versions
@cindex Upward compatibility

This version of GMP is upwardly binary compatible with all 4.x and 3.x
versions, and upwardly compatible at the source level with all 2.x versions,
with the following exceptions.

@itemize @bullet
@item
@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency
with other @code{mpn} functions.

@item
@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and
3.0.1, but in 3.1 reverted to the 2.x style.
@end itemize

There are a number of compatibility issues between GMP 1 and GMP 2 that of
course also apply when porting applications from GMP 1 to GMP 4.  Please
see the GMP 2 manual for details.

The Berkeley MP compatibility library (@pxref{BSD Compatible Functions}) is
source and binary compatible with the standard @file{libmp}.

@c @enumerate
@c @item Integer division functions round the result differently.  The obsolete
@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv},
@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the
@c quotient towards
@c @ifinfo
@c @minus{}infinity).
@c @end ifinfo
@c @iftex
@c @tex
@c $-\infty$).
@c @end tex
@c @end iftex
@c There are a lot of functions for integer division, giving the user better
@c control over the rounding.

@c @item The function @code{mpz_mod} now compute the true @strong{mod} function.

@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use
@c @strong{mod} for reduction.

@c @item The assignment functions for rational numbers do no longer canonicalize
@c their results.  In the case a non-canonical result could arise from an
@c assignment, the user need to insert an explicit call to
@c @code{mpq_canonicalize}.  This change was made for efficiency.

@c @item Output generated by @code{mpz_out_raw} in this release cannot be read
@c by @code{mpz_inp_raw} in previous releases.  This change was made for making
@c the file format truly portable between machines with different word sizes.

@c @item Several @code{mpn} functions have changed.  But they were intentionally
@c undocumented in previous releases.

@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui}
@c are now implemented as macros, and thereby sometimes evaluate their
@c arguments multiple times.

@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1
@c for 0^0.  (In version 1, they yielded 0.)

@c In version 1 of the library, @code{mpq_set_den} handled negative
@c denominators by copying the sign to the numerator.  That is no longer done.

@c Pure assignment functions do not canonicalize the assigned variable.  It is
@c the responsibility of the user to canonicalize the assigned variable before
@c any arithmetic operations are performed on that variable.
@c Note that this is an incompatible change from version 1 of the library.

@c @end enumerate


@need 1000
@node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics
@section Demonstration programs
@cindex Demonstration programs
@cindex Example programs
@cindex Sample programs
The @file{demos} subdirectory has some sample programs using GMP@.  These
aren't built or installed, but there's a @file{Makefile} with rules for them.
For instance,

@example
make pexpr
./pexpr 68^975+10
@end example

@noindent
The following programs are provided

@itemize @bullet
@item
@cindex Expression parsing demo
@cindex Parsing expressions demo
@samp{pexpr} is an expression evaluator, the program used on the GMP web page.
@item
@cindex Expression parsing demo
@cindex Parsing expressions demo
The @samp{calc} subdirectory has a similar but simpler evaluator using
@command{lex} and @command{yacc}.
@item
@cindex Expression parsing demo
@cindex Parsing expressions demo
The @samp{expr} subdirectory is yet another expression evaluator, a library
designed for ease of use within a C program.  See @file{demos/expr/README} for
more information.
@item
@cindex Factorization demo
@samp{factorize} is a Pollard-Rho factorization program.
@item
@samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p}
function.
@item
@samp{primes} counts or lists primes in an interval, using a sieve.
@item
@samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic
class numbers.
@item
@cindex @code{perl}
@cindex GMP Perl module
@cindex Perl module
The @samp{perl} subdirectory is a comprehensive perl interface to GMP@.  See
@file{demos/perl/INSTALL} for more information.  Documentation is in POD
format in @file{demos/perl/GMP.pm}.
@end itemize

As an aside, consideration has been given at various times to some sort of
expression evaluation within the main GMP library.  Going beyond something
minimal quickly leads to matters like user-defined functions, looping, fixnums
for control variables, etc, which are considered outside the scope of GMP
(much closer to language interpreters or compilers, @pxref{Language Bindings}.)
Something simple for program input convenience may yet be a possibility, a
combination of the @file{expr} demo and the @file{pexpr} tree back-end
perhaps.  But for now the above evaluators are offered as illustrations.


@need 1000
@node Efficiency, Debugging, Demonstration Programs, GMP Basics
@section Efficiency
@cindex Efficiency

@table @asis
@item Small Operands
@cindex Small operands
On small operands, the time for function call overheads and memory allocation
can be significant in comparison to actual calculation.
This is unavoidable
in a general purpose variable precision library, although GMP attempts to be
as efficient as it can on both large and small operands.

@item Static Linking
@cindex Static linking
On some CPUs, in particular the x86s, the static @file{libgmp.a} should be
used for maximum speed, since the PIC code in the shared @file{libgmp.so} will
have a small overhead on each function call and global data address.  For many
programs this will be insignificant, but for long calculations there's a gain
to be had.

@item Initializing and Clearing
@cindex Initializing and clearing
Avoid excessive initializing and clearing of variables, since this can be
quite time consuming, especially in comparison to otherwise fast operations
like addition.

A language interpreter might want to keep a free list or stack of
initialized variables ready for use.  It should be possible to integrate
something like that with a garbage collector too.

@item Reallocations
@cindex Reallocations
An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing
values will have its memory repeatedly @code{realloc}ed, which could be quite
slow or could fragment memory, depending on the C library.  If an application
can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can
be called to allocate the necessary space from the beginning
(@pxref{Initializing Integers}).

It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2}
is too small, since all functions will do a further reallocation if necessary.
Badly overestimating memory required will waste space though.

@item @code{2exp} Functions
@cindex @code{2exp} functions
It's up to an application to call functions like @code{mpz_mul_2exp} when
appropriate.
General purpose functions like @code{mpz_mul} make no attempt to
identify powers of two or other special forms, because such inputs will
usually be very rare and testing every time would be wasteful.

@item @code{ui} and @code{si} Functions
@cindex @code{ui} and @code{si} functions
The @code{ui} functions and the small number of @code{si} functions exist for
convenience and should be used where applicable.  But if for example an
@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no
need to extract it and call a @code{ui} function, just use the regular
@code{mpz} function.

@item In-Place Operations
@cindex In-place operations
@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg}
and @code{mpf_neg} are fast when used for in-place operations like
@code{mpz_abs(x,x)}, since in the current implementation only a single field
of @code{x} needs changing.  On suitable compilers (GCC for instance) this is
inlined too.

@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui}
benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since
usually only one or two limbs of @code{x} will need to be changed.  The same
applies to the full precision @code{mpz_add} etc if @code{y} is small.  If
@code{y} is big then cache locality may be helped, but that's all.

@code{mpz_mul} is currently the opposite, a separate destination is slightly
better.  A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one
limb, make a temporary copy of @code{x} before forming the result.  Normally
that copying will only be a tiny fraction of the time for the multiply, so
this is not a particularly important consideration.

@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make
no attempt to recognise a copy of something to itself, so a call like
@code{mpz_set(x,x)} will be wasteful.  Naturally that would never be written
deliberately, but if it might arise from two pointers to the same object then
a test to avoid it might be desirable.

@example
if (x != y)
  mpz_set (x, y);
@end example

Note that it's never worth introducing extra @code{mpz_set} calls just to get
in-place operations.  If a result should go to a particular variable then just
direct it there and let GMP take care of data movement.

@item Divisibility Testing (Small Integers)
@cindex Divisibility testing
@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions
for testing whether an @code{mpz_t} is divisible by an individual small
integer.  They use an algorithm which is faster than @code{mpz_tdiv_ui}, but
which gives no useful information about the actual remainder, only whether
it's zero (or a particular value).

However when testing divisibility by several small integers, it's best to take
a remainder modulo their product, to save multi-precision operations.  For
instance to test whether a number is divisible by any of 23, 29 or 31 take a
remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that.

The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well
as a remainder are generally a little slower than the remainder-only functions
like @code{mpz_tdiv_ui}.  If the quotient is only rarely wanted then it's
probably best to just take a remainder and then go back and calculate the
quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the
remainder is zero).
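
The remainder-modulo-a-product idea for 23, 29 and 31 might be sketched as
follows.  This is only an illustration; the function name
@code{divisible_by_23_29_31} is hypothetical, and the sketch relies on
@code{mpz_tdiv_ui} returning the absolute value of the remainder.

@example
#include <gmp.h>

/* Hypothetical helper: non-zero if N is divisible by any of 23, 29
   or 31.  One multi-precision remainder is taken, the three
   follow-up tests are ordinary single-word arithmetic.  */
int
divisible_by_23_29_31 (const mpz_t n)
@{
  unsigned long  r = mpz_tdiv_ui (n, 20677UL);   /* 23*29*31 */
  return (r % 23 == 0) || (r % 29 == 0) || (r % 31 == 0);
@}
@end example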

@item Rational Arithmetic
@cindex Rational arithmetic
The @code{mpq} functions operate on @code{mpq_t} values with no common factors
in the numerator and denominator.  Common factors are checked for and cast out
as necessary.  In general, cancelling factors every time is the best approach
since it minimizes the sizes for subsequent operations.

However, applications that know something about the factorization of the
values they're working with might be able to avoid some of the GCDs used for
canonicalization, or swap them for divisions.  For example when multiplying by
a prime it's enough to check for factors of it in the denominator instead of
doing a full GCD@.  Or when forming a big product it might be known that very
little cancellation will be possible, and so canonicalization can be left to
the end.

The @code{mpq_numref} and @code{mpq_denref} macros give access to the
numerator and denominator to do things outside the scope of the supplied
@code{mpq} functions.  @xref{Applying Integer Functions}.

The canonical form for rationals allows mixed-type @code{mpq_t} and integer
additions or subtractions to be done directly with multiples of the
denominator.  This will be somewhat faster than @code{mpq_add}.  For example,

@example
/* mpq increment */
mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q));

/* mpq += unsigned long */
mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL);

/* mpq -= mpz */
mpz_submul (mpq_numref(q), mpq_denref(q), z);
@end example

@item Number Sequences
@cindex Number sequences
Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui}
are designed for calculating isolated values.  If a range of values is wanted
it's probably best to call one of them to get a starting point and iterate
from there.
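
For factorials, iterating from a starting point might look like the following
sketch.  The function name @code{factorial_range} is hypothetical, and the
caller is assumed to have initialized every element of @code{out} beforehand.

@example
#include <gmp.h>

/* Hypothetical helper: store lo!, (lo+1)!, ... in out[0..count-1].
   mpz_fac_ui is called once, each later value is one mpz_mul_ui.  */
void
factorial_range (mpz_t out[], unsigned long lo, unsigned long count)
@{
  unsigned long  i;
  if (count == 0)
    return;
  mpz_fac_ui (out[0], lo);
  for (i = 1; i < count; i++)
    mpz_mul_ui (out[i], out[i-1], lo + i);
@}
@end example

Each step costs a single multiplication by a one-word value, rather than a
fresh factorial calculation.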

@item Text Input/Output
@cindex Text input/output
Hexadecimal or octal are suggested for input or output in text form.
Power-of-2 bases like these can be converted much more efficiently than other
bases, like decimal.  For big numbers there's usually nothing of particular
interest to be seen in the digits, so the base doesn't matter much.

Maybe we can hope octal will one day become the normal base for everyday use,
as proposed by King Charles XII of Sweden and later reformers.
@c Reference: Knuth volume 2 section 4.1, page 184 of second edition.  :-)
@end table


@node Debugging, Profiling, Efficiency, GMP Basics
@section Debugging
@cindex Debugging

@table @asis
@item Stack Overflow
@cindex Stack overflow
@cindex Segmentation violation
@cindex Bus error
Depending on the system, a segmentation violation or bus error might be the
only indication of stack overflow.  See @samp{--enable-alloca} choices in
@ref{Build Options}, for how to address this.

In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an
overflow is recognised by the system before too much damage is done, or
@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to
add checking if the system itself doesn't do any (@pxref{Code Gen Options,,
Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}).
These options must be added to the @samp{CFLAGS} used in the GMP build
(@pxref{Build Options}), adding them just to an application will have no
effect.  Note also they're a slowdown, adding overhead to each function call
and each stack allocation.

@item Heap Problems
@cindex Heap problems
@cindex Malloc problems
The most likely cause of application problems with GMP is heap corruption.
Failing to @code{init} GMP variables will have unpredictable effects, and
corruption arising elsewhere in a program may well affect GMP@.  Initializing
GMP variables more than once or failing to clear them will cause memory leaks.

@cindex Malloc debugger
In all such cases a @code{malloc} debugger is recommended.  On a GNU or BSD
system the standard C library @code{malloc} has some diagnostic facilities,
see @ref{Allocation Debugging,, Allocation Debugging, libc, The GNU C Library
Reference Manual}, or @samp{man 3 malloc}.  Other possibilities, in no
particular order, include

@display
@uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc/}
@uref{http://dmalloc.com/}
@uref{http://www.perens.com/FreeSoftware/} @ (electric fence)
@uref{http://packages.debian.org/stable/devel/fda}
@uref{http://www.gnupdate.org/components/leakbug/}
@uref{http://people.redhat.com/~otaylor/memprof/}
@uref{http://www.cbmamiga.demon.co.uk/mpatrol/}
@end display

The GMP default allocation routines in @file{memory.c} also have a simple
sentinel scheme which can be enabled with @code{#define DEBUG} in that file.
This is mainly designed for detecting buffer overruns during GMP development,
but might find other uses.

@item Stack Backtraces
@cindex Stack backtrace
On some systems the compiler options GMP uses by default can interfere with
debugging.  In particular on x86 and 68k systems @samp{-fomit-frame-pointer}
is used and this generally inhibits stack backtracing.  Recompiling without
such options may help while debugging, though the usual caveats about it
potentially moving a memory problem or hiding a compiler bug will apply.

@item GDB, the GNU Debugger
@cindex GDB
@cindex GNU Debugger
A sample @file{.gdbinit} is included in the distribution, showing how to call
some undocumented dump functions to print GMP variables from within GDB@.  Note
that these functions shouldn't be used in final application code since they're
undocumented and may be subject to incompatible changes in future versions of
GMP.

@item Source File Paths
GMP has multiple source files with the same name, in different directories.
For example @file{mpz}, @file{mpq} and @file{mpf} each have an
@file{init.c}.  If the debugger can't already determine the right one it may
help to build with absolute paths on each C file.  One way to do that is to
use a separate object directory with an absolute path to the source directory.

@example
cd /my/build/dir
/my/source/dir/gmp-@value{VERSION}/configure
@end example

This works via @code{VPATH}, and might require GNU @command{make}.
Alternately it might be possible to change the @code{.c.lo} rules
appropriately.

@item Assertion Checking
@cindex Assertion checking
The build option @option{--enable-assert} is available to add some consistency
checks to the library (see @ref{Build Options}).  These are likely to be of
limited value to most applications.  Assertion failures are just as likely to
indicate memory corruption as a library or compiler bug.

Applications using the low-level @code{mpn} functions, however, will benefit
from @option{--enable-assert} since it adds checks on the parameters of most
such functions, many of which have subtle restrictions on their usage.  Note
however that only the generic C code has checks, not the assembly code, so
CPU @samp{none} should be used for maximum checking.

@item Temporary Memory Checking
The build option @option{--enable-alloca=debug} arranges that each block of
temporary memory in GMP is allocated with a separate call to @code{malloc} (or
the allocation function set with @code{mp_set_memory_functions}).

This can help a malloc debugger detect accesses outside the intended bounds,
or detect memory not released. In a normal build, on the other hand,
temporary memory is allocated in blocks which GMP divides up for its own use,
or may be allocated with a compiler builtin @code{alloca} which will go
nowhere near any malloc debugger hooks.

@item Maximum Debuggability
To summarize the above, a GMP build for maximum debuggability would be

@example
./configure --disable-shared --enable-assert \
  --enable-alloca=debug --host=none CFLAGS=-g
@end example

For C++, add @samp{--enable-cxx CXXFLAGS=-g}.

@item Checker
@cindex Checker
@cindex GCC Checker
The GCC checker (@uref{http://savannah.nongnu.org/projects/checker/}) can be
used with GMP@. It contains a stub library which means GMP applications
compiled with checker can use a normal GMP build.

A build of GMP with checking within GMP itself can be made. This will run
very slowly. On GNU/Linux for example,

@cindex @command{checkergcc}
@example
./configure --host=none-pc-linux-gnu CC=checkergcc
@end example

@samp{--host=none} must be used, since the GMP assembly code doesn't support
the checking scheme. The GMP C++ features cannot be used, since current
versions of checker (0.9.9.1) don't yet support the standard C++ library.

@item Valgrind
@cindex Valgrind
The valgrind program (@uref{http://valgrind.org/}) is a memory
checker for x86s.
It translates and emulates machine instructions to do
strong checks for uninitialized data (at the level of individual bits), memory
accesses through bad pointers, and memory leaks.

Recent versions of Valgrind are getting support for MMX and SSE/SSE2
instructions; for past versions GMP will need to be configured not to use
those, ie.@: for an x86 without them (for instance plain @samp{i486}).

@item Other Problems
Any suspected bug in GMP itself should be isolated to make sure it's not an
application problem, see @ref{Reporting Bugs}.
@end table


@node Profiling, Autoconf, Debugging, GMP Basics
@section Profiling
@cindex Profiling
@cindex Execution profiling
@cindex @code{--enable-profiling}

Running a program under a profiler is a good way to find where it's spending
most time and where improvements can be best sought. The profiling choices
for a GMP build are as follows.

@table @asis
@item @samp{--disable-profiling}
The default is to add nothing special for profiling.

It should be possible to just compile the mainline of a program with @code{-p}
and use @command{prof} to get a profile consisting of timer-based sampling of
the program counter. Most of the GMP assembly code has the necessary symbol
information.

This approach has the advantage of minimizing interference with normal program
operation, but on most systems the resolution of the sampling is quite low (10
milliseconds for instance), requiring long runs to get accurate information.

@item @samp{--enable-profiling=prof}
@cindex @code{prof}
Build with support for the system @command{prof}, which means @samp{-p} added
to the @samp{CFLAGS}.

This provides call counting in addition to program counter sampling, which
allows the most frequently called routines to be identified, and an average
time spent in each routine to be determined.

The x86 assembly code has support for this option, but on other processors
the assembly routines will be as if compiled without @samp{-p} and therefore
won't appear in the call counts.

On some systems, such as GNU/Linux, @samp{-p} in fact means @samp{-pg} and in
this case @samp{--enable-profiling=gprof} described below should be used
instead.

@item @samp{--enable-profiling=gprof}
@cindex @code{gprof}
Build with support for @command{gprof}, which means @samp{-pg} added to the
@samp{CFLAGS}.

This provides call graph construction in addition to call counting and program
counter sampling, which makes it possible to count calls coming from different
locations. For example the number of calls to @code{mpn_mul} from
@code{mpz_mul} versus the number from @code{mpf_mul}. The program counter
sampling is still flat though, so only a total time in @code{mpn_mul} would be
accumulated, not a separate amount for each call site.

The x86 assembly code has support for this option, but on other processors
the assembly routines will be as if compiled without @samp{-pg} and therefore
not be included in the call counts.

On x86 and m68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are
incompatible, so the latter is omitted from the default flags in that case,
which might result in poorer code generation.

Incidentally, it should be possible to use the @command{gprof} program with a
plain @samp{--enable-profiling=prof} build. But in that case only the
@samp{gprof -p} flat profile and call counts can be expected to be valid, not
the @samp{gprof -q} call graph.

@item @samp{--enable-profiling=instrument}
@cindex @code{-finstrument-functions}
@cindex @code{instrument-functions}
Build with the GCC option @samp{-finstrument-functions} added to the
@samp{CFLAGS} (@pxref{Code Gen Options,, Options for Code Generation, gcc,
Using the GNU Compiler Collection (GCC)}).

This inserts special instrumenting calls at the start and end of each
function, allowing exact timing and full call graph construction.

This instrumenting is not normally a standard system feature and will require
support from an external library, such as

@cindex FunctionCheck
@cindex fnccheck
@display
@uref{http://sourceforge.net/projects/fnccheck/}
@end display

This should be included in @samp{LIBS} during the GMP configure so that test
programs will link. For example,

@example
./configure --enable-profiling=instrument LIBS=-lfc
@end example

On a GNU system the C library provides dummy instrumenting functions, so
programs compiled with this option will link. In this case it's only
necessary to ensure the correct library is added when linking an application.

The x86 assembly code supports this option, but on other processors the
assembly routines will be as if compiled without
@samp{-finstrument-functions} meaning time spent in them will effectively be
attributed to their caller.
@end table


@node Autoconf, Emacs, Profiling, GMP Basics
@section Autoconf
@cindex Autoconf

Autoconf based applications can easily check whether GMP is installed. The
only thing to be noted is that GMP library symbols from version 3 onwards have
prefixes like @code{__gmpz}.
The following therefore would be a simple test,

@cindex @code{AC_CHECK_LIB}
@example
AC_CHECK_LIB(gmp, __gmpz_init)
@end example

This just uses the default @code{AC_CHECK_LIB} actions for found or not found,
but an application that must have GMP would want to generate an error if not
found. For example,

@example
AC_CHECK_LIB(gmp, __gmpz_init, ,
  [AC_MSG_ERROR([GNU MP not found, see http://gmplib.org/])])
@end example

If functions added in some particular version of GMP are required, then one of
those can be used when checking. For example @code{mpz_mul_si} was added in
GMP 3.1,

@example
AC_CHECK_LIB(gmp, __gmpz_mul_si, ,
  [AC_MSG_ERROR(
  [GNU MP not found, or not 3.1 or up, see http://gmplib.org/])])
@end example

An alternative would be to test the version number in @file{gmp.h} using say
@code{AC_EGREP_CPP}. That would make it possible to test the exact version,
if some particular sub-minor release is known to be necessary.

In general it's recommended that applications should simply demand a new
enough GMP rather than trying to provide supplements for features not
available in past versions.

Occasionally an application will need or want to know the size of a type at
configuration or preprocessing time, not just with @code{sizeof} in the code.
This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or
up is best for this, since prior versions needed certain @samp{-D} defines on
systems using a @code{long long} limb.
The following would suit Autoconf 2.50
or up,

@example
AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>])
@end example


@node Emacs, , Autoconf, GMP Basics
@section Emacs
@cindex Emacs
@cindex @code{info-lookup-symbol}

@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation
on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup,
emacs, The Emacs Editor}).

The GMP manual can be included in such lookups by putting the following in
your @file{.emacs},

@c This isn't pretty, but there doesn't seem to be a better way (in emacs
@c 21.2 at least). info-lookup->mode-value could be used for the "assoc"s,
@c but that function isn't documented, whereas info-lookup-alist is.
@c
@example
(eval-after-load "info-look"
  '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist))))
     (setcar (nthcdr 3 mode-value)
             (cons '("(gmp)Function Index" nil "^ -.* " "\\>")
                   (nth 3 mode-value)))))
@end example


@node Reporting Bugs, Integer Functions, GMP Basics, Top
@comment node-name, next, previous, up
@chapter Reporting Bugs
@cindex Reporting bugs
@cindex Bug reporting

If you think you have found a bug in the GMP library, please investigate it
and report it. We have made this library available to you, and it is not too
much to ask you to report the bugs you find.

Before you report a bug, check it's not already addressed in @ref{Known Build
Problems}, or perhaps @ref{Notes for Particular Systems}. You may also want
to check @uref{http://gmplib.org/} for patches for this release.

Please include the following in any report,

@itemize @bullet
@item
The GMP version number, and if pre-packaged or patched then say so.

@item
A test program that makes it possible for us to reproduce the bug.
Include
instructions on how to run the program.

@item
A description of what is wrong. If the results are incorrect, in what way.
If you get a crash, say so.

@item
If you get a crash, include a stack backtrace from the debugger if it's
informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}).

@item
Please do not send core dumps, executables or @command{strace}s.

@item
The configuration options you used when building GMP, if any.

@item
The name of the compiler and its version. For @command{gcc}, get the version
with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar.

@item
The output from running @samp{uname -a}.

@item
The output from running @samp{./config.guess}, and from running
@samp{./configfsf.guess} (might be the same).

@item
If the bug is related to @samp{configure}, then the compressed contents of
@file{config.log}.

@item
If the bug is related to an @file{asm} file not assembling, then the contents
of @file{config.m4} and the offending line or lines from the temporary
@file{mpn/tmp-<file>.s}.
@end itemize

Please make an effort to produce a self-contained report, with something
definite that can be tested or debugged. Vague queries or piecemeal messages
are difficult to act on and don't help the development effort.

It is not uncommon that an observed problem is actually due to a bug in the
compiler; the GMP code tends to explore interesting corners in compilers.

If your bug report is good, we will do our best to help you get a corrected
version of the library; if the bug report is poor, we won't do anything about
it (except maybe ask you to send a better report).

Send your report to: @email{gmp-bugs@@gmplib.org}.

If you think something in this manual is unclear, or downright incorrect, or if
the language needs to be improved, please send a note to the same address.


@node Integer Functions, Rational Number Functions, Reporting Bugs, Top
@comment node-name, next, previous, up
@chapter Integer Functions
@cindex Integer functions

This chapter describes the GMP functions for performing integer arithmetic.
These functions start with the prefix @code{mpz_}.

GMP integers are stored in objects of type @code{mpz_t}.

@menu
* Initializing Integers::
* Assigning Integers::
* Simultaneous Integer Init & Assign::
* Converting Integers::
* Integer Arithmetic::
* Integer Division::
* Integer Exponentiation::
* Integer Roots::
* Number Theoretic Functions::
* Integer Comparisons::
* Integer Logic and Bit Fiddling::
* I/O of Integers::
* Integer Random Numbers::
* Integer Import and Export::
* Miscellaneous Integer Functions::
* Integer Special Functions::
@end menu

@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Integer initialization functions
@cindex Initialization functions

The functions for integer arithmetic assume that all integer objects are
initialized. You do that by calling the function @code{mpz_init}. For
example,

@example
@{
  mpz_t integ;
  mpz_init (integ);
  @dots{}
  mpz_add (integ, @dots{});
  @dots{}
  mpz_sub (integ, @dots{});

  /* Unless the program is about to exit, do ... */
  mpz_clear (integ);
@}
@end example

As you can see, you can store new values any number of times, once an
object is initialized.

@deftypefun void mpz_init (mpz_t @var{x})
Initialize @var{x}, and set its value to 0.
@end deftypefun

@deftypefun void mpz_inits (mpz_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpz_t} variables, and set their
values to 0.
@end deftypefun

@deftypefun void mpz_init2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
Initialize @var{x}, with space for @var{n}-bit numbers, and set its value to 0.
Calling this function instead of @code{mpz_init} or @code{mpz_inits} is never
necessary; reallocation is handled automatically by GMP when needed.

@var{n} is only the initial space, @var{x} will grow automatically in
the normal way, if necessary, for subsequent values stored. @code{mpz_init2}
makes it possible to avoid such reallocations if a maximum size is known in
advance.
@end deftypefun

@deftypefun void mpz_clear (mpz_t @var{x})
Free the space occupied by @var{x}. Call this function for all @code{mpz_t}
variables when you are done with them.
@end deftypefun

@deftypefun void mpz_clears (mpz_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpz_t} variables.
@end deftypefun

@deftypefun void mpz_realloc2 (mpz_t @var{x}, mp_bitcnt_t @var{n})
Change the space allocated for @var{x} to @var{n} bits. The value in @var{x}
is preserved if it fits, or is set to 0 if not.

Calling this function is never necessary; reallocation is handled automatically
by GMP when needed. But this function can be used to increase the space for a
variable in order to avoid repeated automatic reallocations, or to decrease it
to give memory back to the heap.
@end deftypefun


@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions
@comment node-name, next, previous, up
@section Assignment Functions
@cindex Integer assignment functions
@cindex Assignment functions

These functions assign new values to already initialized integers
(@pxref{Initializing Integers}).

@deftypefun void mpz_set (mpz_t @var{rop}, mpz_t @var{op})
@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op})
@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op})
@deftypefunx void mpz_set_q (mpz_t @var{rop}, mpq_t @var{op})
@deftypefunx void mpz_set_f (mpz_t @var{rop}, mpf_t @var{op})
Set the value of @var{rop} from @var{op}.

@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to
make it an integer.
@end deftypefun

@deftypefun int mpz_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
Set the value of @var{rop} from @var{str}, a null-terminated C string in base
@var{base}. White space is allowed in the string, and is simply ignored.

The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and
@code{0B} for binary, @code{0} for octal, or decimal otherwise.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value. For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

This function returns 0 if the entire string is a valid number in base
@var{base}. Otherwise it returns @minus{}1.
@c
@c It turns out that it is not entirely true that this function ignores
@c white-space.
@c It does ignore it between digits, but not after a minus sign
@c or within or after ``0x''. Some thought was given to disallowing all
@c whitespace, but that would be an incompatible change, whitespace has been
@c documented as ignored ever since GMP 1.
@c
@end deftypefun

@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2})
Swap the values @var{rop1} and @var{rop2} efficiently.
@end deftypefun


@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions
@comment node-name, next, previous, up
@section Combined Initialization and Assignment Functions
@cindex Integer assignment functions
@cindex Assignment functions
@cindex Integer initialization functions
@cindex Initialization functions

For convenience, GMP provides a parallel series of initialize-and-set functions
which initialize the output and then store the value there. These functions'
names have the form @code{mpz_init_set@dots{}}.

Here is an example of using one:

@example
@{
  mpz_t pie;
  mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10);
  @dots{}
  mpz_sub (pie, @dots{});
  @dots{}
  mpz_clear (pie);
@}
@end example

@noindent
Once the integer has been initialized by any of the @code{mpz_init_set@dots{}}
functions, it can be used as the source or destination operand for the ordinary
integer functions. Don't use an initialize-and-set function on a variable
already initialized!

@deftypefun void mpz_init_set (mpz_t @var{rop}, mpz_t @var{op})
@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op})
@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op})
Initialize @var{rop} with limb space and set the initial numeric value from
@var{op}.
@end deftypefun

@deftypefun int mpz_init_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base})
Initialize @var{rop} and set its value like @code{mpz_set_str} (see its
documentation above for details).

If the string is a correct base @var{base} number, the function returns 0;
if an error occurs it returns @minus{}1. @var{rop} is initialized even if
an error occurs. (I.e., you have to call @code{mpz_clear} for it.)
@end deftypefun


@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Integer conversion functions
@cindex Conversion functions

This section describes functions for converting GMP integers to standard C
types. Functions for converting @emph{to} GMP integers are described in
@ref{Assigning Integers} and @ref{I/O of Integers}.

@deftypefun {unsigned long int} mpz_get_ui (mpz_t @var{op})
Return the value of @var{op} as an @code{unsigned long}.

If @var{op} is too big to fit an @code{unsigned long} then just the least
significant bits that do fit are returned. The sign of @var{op} is ignored,
only the absolute value is used.
@end deftypefun

@deftypefun {signed long int} mpz_get_si (mpz_t @var{op})
If @var{op} fits into a @code{signed long int} return the value of @var{op}.
Otherwise return the least significant part of @var{op}, with the same sign
as @var{op}.

If @var{op} is too big to fit in a @code{signed long int}, the returned
result is probably not very useful. To find out if the value will fit, use
the function @code{mpz_fits_slong_p}.
@end deftypefun

@deftypefun double mpz_get_d (mpz_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding
towards zero).

If the exponent from the conversion is too big, the result is system
dependent. An infinity is returned where available. A hardware overflow trap
may or may not occur.
@end deftypefun

@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, mpz_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding
towards zero), and returning the exponent separately.

The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
exponent is stored to @code{*@var{exp}}. @m{@var{d} * 2^{exp}, @var{d} *
2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero, the
return is @math{0.0} and 0 is stored to @code{*@var{exp}}.

@cindex @code{frexp}
This is similar to the standard C @code{frexp} function (@pxref{Normalization
Functions,,, libc, The GNU C Library Reference Manual}).
@end deftypefun

@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, mpz_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base argument
may vary from 2 to 62 or from @minus{}2 to @minus{}36.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of storage large
enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base})
+ 2}. The two extra bytes are for a possible minus sign, and the
null-terminator.

A pointer to the result string is returned, being either the allocated block,
or the given @var{str}.
@end deftypefun


@need 2000
@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Integer arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpz_add (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
@deftypefunx void mpz_add_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} + @var{op2}}.
@end deftypefun

@deftypefun void mpz_sub (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, mpz_t @var{op2})
Set @var{rop} to @var{op1} @minus{} @var{op2}.
@end deftypefun

@deftypefun void mpz_mul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
@deftypefunx void mpz_mul_si (mpz_t @var{rop}, mpz_t @var{op1}, long int @var{op2})
@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_addmul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_submul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, mpz_t @var{op1}, mp_bitcnt_t @var{op2})
@cindex Bit shift left
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}. This operation can also be defined as a left shift by @var{op2}
bits.
@end deftypefun

@deftypefun void mpz_neg (mpz_t @var{rop}, mpz_t @var{op})
Set @var{rop} to @minus{}@var{op}.
@end deftypefun

@deftypefun void mpz_abs (mpz_t @var{rop}, mpz_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun


@need 2000
@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions
@section Division Functions
@cindex Integer division functions
@cindex Division functions

Division is undefined if the divisor is zero. Passing a zero divisor to the
division or modulo functions (including the modular powering functions
@code{mpz_powm} and @code{mpz_powm_ui}), will cause an intentional division by
zero. This lets a program handle arithmetic exceptions in these functions the
same way as for normal C @code{int} arithmetic.

@c Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line
@c between each, and seem to let tex do a better job of page breaks than an
@c @sp 1 in the middle of one big set.
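
The three rounding styles used by the division functions in this section
(truncate, floor and ceil) can be modelled on ordinary C @code{long} values.
The following is only an illustrative sketch in plain C and does not call GMP:
the names @code{tdiv_long}, @code{fdiv_long} and @code{cdiv_long} are made up
for this example, and a zero divisor is rejected with @code{assert} here
rather than by the intentional division by zero that the GMP functions use.

```c
#include <assert.h>

/* Plain C model of the three rounding styles, on long values.
   These helper names are hypothetical and are not part of GMP.
   Each function stores q and r such that n == q*d + r and |r| < |d|.
   The divisor d must be non-zero.  (LONG_MIN / -1 overflows and is
   not handled in this sketch.)  */

/* Truncate: q rounds towards zero, r has the same sign as n.
   This is what the C99 `/' and `%' operators do.  */
void tdiv_long (long *q, long *r, long n, long d)
{
  assert (d != 0);
  *q = n / d;
  *r = n % d;
}

/* Floor: q rounds towards -infinity, r has the same sign as d.  */
void fdiv_long (long *q, long *r, long n, long d)
{
  tdiv_long (q, r, n, d);
  /* A non-zero remainder whose sign differs from d means the
     truncated quotient is one too large.  */
  if (*r != 0 && ((*r < 0) != (d < 0)))
    {
      *q -= 1;
      *r += d;
    }
}

/* Ceil: q rounds towards +infinity, r has the opposite sign to d.  */
void cdiv_long (long *q, long *r, long n, long d)
{
  tdiv_long (q, r, n, d);
  /* A non-zero remainder with the same sign as d means the
     truncated quotient is one too small.  */
  if (*r != 0 && ((*r < 0) == (d < 0)))
    {
      *q += 1;
      *r -= d;
    }
}
```

With @math{n=-7} and @math{d=2}, for instance, the truncating version gives
quotient @minus{}3 and remainder @minus{}1, the floor version gives @minus{}4
and 1, and the ceiling version gives @minus{}3 and @minus{}1.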

@deftypefun void mpz_cdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
@maybepagebreak
@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_cdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
@maybepagebreak
@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@end deftypefun

@deftypefun void mpz_fdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
@maybepagebreak
@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_fdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
@maybepagebreak
@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@end deftypefun

@deftypefun void mpz_tdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t
@var{d})
@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
@maybepagebreak
@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}})
@deftypefunx {unsigned long int} mpz_tdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}})
@maybepagebreak
@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{mp_bitcnt_t @var{b}})
@cindex Bit shift right

@sp 1
Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder
@var{r}. For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}.
The rounding is in three styles, each suiting different applications.

@itemize @bullet
@item
@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will
have the opposite sign to @var{d}. The @code{c} stands for ``ceil''.

@item
@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and
@var{r} will have the same sign as @var{d}. The @code{f} stands for
``floor''.

@item
@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign
as @var{n}. The @code{t} stands for ``truncate''.
@end itemize

In all cases @var{q} and @var{r} will satisfy
@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and
@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}.

The @code{q} functions calculate only the quotient, the @code{r} functions
only the remainder, and the @code{qr} functions calculate both. Note that for
@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or
results will be unpredictable.

For the @code{ui} variants the return value is the remainder, and in fact
returning the remainder is all the @code{div_ui} functions do. For
@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the
return value is the absolute value of the remainder.

For the @code{2exp} variants the divisor is @m{2^b,2^@var{b}}. These
functions are implemented as right shifts and bit masks, but of course they
round the same as the other functions.

For positive @var{n} both @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp}
are simple bitwise right shifts. For negative @var{n}, @code{mpz_fdiv_q_2exp}
is effectively an arithmetic right shift treating @var{n} as twos complement
the same as the bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp}
effectively treats @var{n} as sign and magnitude.
@end deftypefun

@deftypefun void mpz_mod (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d})
@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}})
Set @var{r} to @var{n} @code{mod} @var{d}. The sign of the divisor is
ignored; the result is always non-negative.

@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the
remainder as well as setting @var{r}. See @code{mpz_fdiv_ui} above if only
the return value is wanted.
@end deftypefun

@deftypefun void mpz_divexact (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d})
@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, mpz_t @var{n}, unsigned long @var{d})
@cindex Exact division functions
Set @var{q} to @var{n}/@var{d}.
These functions produce correct results only 3375when it is known in advance that @var{d} divides @var{n}. 3376 3377These routines are much faster than the other division functions, and are the 3378best choice when exact division is known to occur, for example reducing a 3379rational to lowest terms. 3380@end deftypefun 3381 3382@deftypefun int mpz_divisible_p (mpz_t @var{n}, mpz_t @var{d}) 3383@deftypefunx int mpz_divisible_ui_p (mpz_t @var{n}, unsigned long int @var{d}) 3384@deftypefunx int mpz_divisible_2exp_p (mpz_t @var{n}, mp_bitcnt_t @var{b}) 3385@cindex Divisibility functions 3386Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of 3387@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}. 3388 3389@var{n} is divisible by @var{d} if there exists an integer @var{q} satisfying 3390@math{@var{n} = @var{q}@GMPmultiply{}@var{d}}. Unlike the other division 3391functions, @math{@var{d}=0} is accepted and following the rule it can be seen 3392that only 0 is considered divisible by 0. 3393@end deftypefun 3394 3395@deftypefun int mpz_congruent_p (mpz_t @var{n}, mpz_t @var{c}, mpz_t @var{d}) 3396@deftypefunx int mpz_congruent_ui_p (mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d}) 3397@deftypefunx int mpz_congruent_2exp_p (mpz_t @var{n}, mpz_t @var{c}, mp_bitcnt_t @var{b}) 3398@cindex Divisibility functions 3399@cindex Congruence functions 3400Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the 3401case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}. 3402 3403@var{n} is congruent to @var{c} mod @var{d} if there exists an integer @var{q} 3404satisfying @math{@var{n} = @var{c} + @var{q}@GMPmultiply{}@var{d}}. Unlike 3405the other division functions, @math{@var{d}=0} is accepted and following the 3406rule it can be seen that @var{n} and @var{c} are considered congruent mod 0 3407only when exactly equal. 
3408@end deftypefun 3409 3410 3411@need 2000 3412@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions 3413@section Exponentiation Functions 3414@cindex Integer exponentiation functions 3415@cindex Exponentiation functions 3416@cindex Powering functions 3417 3418@deftypefun void mpz_powm (mpz_t @var{rop}, mpz_t @var{base}, mpz_t @var{exp}, mpz_t @var{mod}) 3419@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp}, mpz_t @var{mod}) 3420Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp}) 3421modulo @var{mod}}. 3422 3423Negative @var{exp} is supported if an inverse @math{@var{base}^@W{-1} @bmod 3424@var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}). 3425If an inverse doesn't exist then a divide by zero is raised. 3426@end deftypefun 3427 3428@deftypefun void mpz_powm_sec (mpz_t @var{rop}, mpz_t @var{base}, mpz_t @var{exp}, mpz_t @var{mod}) 3429Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp}) 3430modulo @var{mod}}. 3431 3432It is required that @math{@var{exp} > 0} and that @var{mod} is odd. 3433 3434This function is designed to take the same time and have the same cache access 3435patterns for any two same-size arguments, assuming that function arguments are 3436placed at the same position and that the machine state is identical upon 3437function entry. This function is intended for cryptographic purposes, where 3438resilience to side-channel attacks is desired. 3439@end deftypefun 3440 3441@deftypefun void mpz_pow_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp}) 3442@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp}) 3443Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}. The case 3444@math{0^0} yields 1. 
3445@end deftypefun 3446 3447 3448@need 2000 3449@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions 3450@section Root Extraction Functions 3451@cindex Integer root functions 3452@cindex Root extraction functions 3453 3454@deftypefun int mpz_root (mpz_t @var{rop}, mpz_t @var{op}, unsigned long int @var{n}) 3455Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer 3456part of the @var{n}th root of @var{op}. Return non-zero if the computation 3457was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power. 3458@end deftypefun 3459 3460@deftypefun void mpz_rootrem (mpz_t @var{root}, mpz_t @var{rem}, mpz_t @var{u}, unsigned long int @var{n}) 3461Set @var{root} to @m{\lfloor\root n \of {u}\rfloor@C{},} the truncated 3462integer part of the @var{n}th root of @var{u}. Set @var{rem} to the 3463remainder, @m{(@var{u} - @var{root}^n), 3464@var{u}@minus{}@var{root}**@var{n}}. 3465@end deftypefun 3466 3467@deftypefun void mpz_sqrt (mpz_t @var{rop}, mpz_t @var{op}) 3468Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated 3469integer part of the square root of @var{op}. 3470@end deftypefun 3471 3472@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, mpz_t @var{op}) 3473Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part 3474of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{rop2} to the 3475remainder @m{(@var{op} - @var{rop1}^2), 3476@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a 3477perfect square. 3478 3479If @var{rop1} and @var{rop2} are the same variable, the results are 3480undefined. 
3481@end deftypefun 3482 3483@deftypefun int mpz_perfect_power_p (mpz_t @var{op}) 3484@cindex Perfect power functions 3485@cindex Root testing functions 3486Return non-zero if @var{op} is a perfect power, i.e., if there exist integers 3487@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that 3488@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}. 3489 3490Under this definition both 0 and 1 are considered to be perfect powers. 3491Negative values of @var{op} are accepted, but of course can only be odd 3492perfect powers. 3493@end deftypefun 3494 3495@deftypefun int mpz_perfect_square_p (mpz_t @var{op}) 3496@cindex Perfect square functions 3497@cindex Root testing functions 3498Return non-zero if @var{op} is a perfect square, i.e., if the square root of 3499@var{op} is an integer. Under this definition both 0 and 1 are considered to 3500be perfect squares. 3501@end deftypefun 3502 3503 3504@need 2000 3505@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions 3506@section Number Theoretic Functions 3507@cindex Number theoretic functions 3508 3509@deftypefun int mpz_probab_prime_p (mpz_t @var{n}, int @var{reps}) 3510@cindex Prime testing functions 3511@cindex Probable prime testing functions 3512Determine whether @var{n} is prime. Return 2 if @var{n} is definitely prime, 3513return 1 if @var{n} is probably prime (without being certain), or return 0 if 3514@var{n} is definitely composite. 3515 3516This function does some trial divisions, then some Miller-Rabin probabilistic 3517primality tests. @var{reps} controls how many such tests are done, 5 to 10 is 3518a reasonable number, more will reduce the chances of a composite being 3519returned as ``probably prime''. 3520 3521Miller-Rabin and similar tests can be more properly called compositeness 3522tests. Numbers which fail are known to be composite but those which pass 3523might be prime or might be composite. 
Only a few composites pass, hence those
which pass are considered probably prime.
@end deftypefun

@deftypefun void mpz_nextprime (mpz_t @var{rop}, mpz_t @var{op})
@cindex Next prime function
Set @var{rop} to the next prime greater than @var{op}.

This function uses a probabilistic algorithm to identify primes. For
practical purposes it's adequate; the chance of a composite passing will be
extremely small.
@end deftypefun

@c mpz_prime_p not implemented as of gmp 3.0.

@c @deftypefun int mpz_prime_p (mpz_t @var{n})
@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime.
@c This function is far slower than @code{mpz_probab_prime_p}, but then it
@c never returns non-zero for composite numbers.

@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate.
@c The likelihood of a programming error or hardware malfunction is orders
@c of magnitude greater than the likelihood for a composite to pass as a
@c prime, if the @var{reps} argument is in the suggested range.)
@c @end deftypefun

@deftypefun void mpz_gcd (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2})
@cindex Greatest common divisor functions
@cindex GCD functions
Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}.
The result is always positive even if one or both input operands
are negative.
@end deftypefun

@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2})
Compute the greatest common divisor of @var{op1} and @var{op2}. If
@var{rop} is not @code{NULL}, store the result there.

If the result is small enough to fit in an @code{unsigned long int}, it is
returned. If the result does not fit, 0 is returned, and the result is equal
to the argument @var{op1}. Note that the result will always fit if @var{op2}
is non-zero.
3565@end deftypefun 3566 3567@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, mpz_t @var{a}, mpz_t @var{b}) 3568@cindex Extended GCD 3569@cindex GCD extended 3570Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in 3571addition set @var{s} and @var{t} to coefficients satisfying 3572@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}. 3573The value in @var{g} is always positive, even if one or both of @var{a} and 3574@var{b} are negative. The values in @var{s} and @var{t} are chosen such that 3575@math{@GMPabs{@var{s}} @le{} @GMPabs{@var{b}}} and @math{@GMPabs{@var{t}} 3576@le{} @GMPabs{@var{a}}}. 3577 3578If @var{t} is @code{NULL} then that value is not computed. 3579@end deftypefun 3580 3581@deftypefun void mpz_lcm (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) 3582@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long @var{op2}) 3583@cindex Least common multiple functions 3584@cindex LCM functions 3585Set @var{rop} to the least common multiple of @var{op1} and @var{op2}. 3586@var{rop} is always positive, irrespective of the signs of @var{op1} and 3587@var{op2}. @var{rop} will be zero if either @var{op1} or @var{op2} is zero. 3588@end deftypefun 3589 3590@deftypefun int mpz_invert (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) 3591@cindex Modular inverse functions 3592@cindex Inverse modulo functions 3593Compute the inverse of @var{op1} modulo @var{op2} and put the result in 3594@var{rop}. If the inverse exists, the return value is non-zero and @var{rop} 3595will satisfy @math{0 @le{} @var{rop} < @var{op2}}. If an inverse doesn't exist 3596the return value is zero and @var{rop} is undefined. 3597@end deftypefun 3598 3599@deftypefun int mpz_jacobi (mpz_t @var{a}, mpz_t @var{b}) 3600@cindex Jacobi symbol functions 3601Calculate the Jacobi symbol @m{\left(a \over b\right), 3602(@var{a}/@var{b})}. This is defined only for @var{b} odd. 
@end deftypefun

@deftypefun int mpz_legendre (mpz_t @var{a}, mpz_t @var{p})
@cindex Legendre symbol functions
Calculate the Legendre symbol @m{\left(a \over p\right),
(@var{a}/@var{p})}. This is defined only for @var{p} an odd positive
prime, and for such @var{p} it's identical to the Jacobi symbol.
@end deftypefun

@deftypefun int mpz_kronecker (mpz_t @var{a}, mpz_t @var{b})
@deftypefunx int mpz_kronecker_si (mpz_t @var{a}, long @var{b})
@deftypefunx int mpz_kronecker_ui (mpz_t @var{a}, unsigned long @var{b})
@deftypefunx int mpz_si_kronecker (long @var{a}, mpz_t @var{b})
@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, mpz_t @var{b})
@cindex Kronecker symbol functions
Calculate the Jacobi symbol @m{\left(a \over b\right),
(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over
2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} is odd, or
@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} is even.

When @var{b} is odd the Jacobi symbol and Kronecker symbol are
identical, so @code{mpz_kronecker_ui} etc.@: can be used for mixed
precision Jacobi symbols too.

For more information see Henri Cohen section 1.4.2 (@pxref{References}),
or any number theory textbook. See also the example program
@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_remove (mpz_t @var{rop}, mpz_t @var{op}, mpz_t @var{f})
@cindex Remove factor functions
@cindex Factor removal functions
Remove all occurrences of the factor @var{f} from @var{op} and store the
result in @var{rop}. The return value is how many such occurrences were
removed.
@end deftypefun

@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{op})
@cindex Factorial functions
Set @var{rop} to @var{op}!, the factorial of @var{op}.
@end deftypefun

@deftypefun void mpz_bin_ui (mpz_t @var{rop}, mpz_t @var{n}, unsigned long int @var{k})
@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}})
@cindex Binomial coefficient functions
Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over
@var{k}} and store the result in @var{rop}. Negative values of @var{n} are
supported by @code{mpz_bin_ui}, using the identity
@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right),
bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}; see Knuth volume 1 section 1.2.6
part G.
@end deftypefun

@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n})
@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n})
@cindex Fibonacci sequence functions
@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci
number. @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to
@m{F_{n-1},F[n-1]}.

These functions are designed for calculating isolated Fibonacci numbers. When
a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and
iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or
similar.
@end deftypefun

@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n})
@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n})
@cindex Lucas number functions
@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas
number. @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1}
to @m{L_{n-1},L[n-1]}.

These functions are designed for calculating isolated Lucas numbers.
When a 3677sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and 3678iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or 3679similar. 3680 3681The Fibonacci numbers and Lucas numbers are related sequences, so it's never 3682necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}. The 3683formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers 3684Algorithm}, the reverse is straightforward too. 3685@end deftypefun 3686 3687 3688@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions 3689@comment node-name, next, previous, up 3690@section Comparison Functions 3691@cindex Integer comparison functions 3692@cindex Comparison functions 3693 3694@deftypefn Function int mpz_cmp (mpz_t @var{op1}, mpz_t @var{op2}) 3695@deftypefnx Function int mpz_cmp_d (mpz_t @var{op1}, double @var{op2}) 3696@deftypefnx Macro int mpz_cmp_si (mpz_t @var{op1}, signed long int @var{op2}) 3697@deftypefnx Macro int mpz_cmp_ui (mpz_t @var{op1}, unsigned long int @var{op2}) 3698Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} > 3699@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if 3700@math{@var{op1} < @var{op2}}. 3701 3702@code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate their 3703arguments more than once. @code{mpz_cmp_d} can be called with an infinity, 3704but results are undefined for a NaN. 3705@end deftypefn 3706 3707@deftypefn Function int mpz_cmpabs (mpz_t @var{op1}, mpz_t @var{op2}) 3708@deftypefnx Function int mpz_cmpabs_d (mpz_t @var{op1}, double @var{op2}) 3709@deftypefnx Function int mpz_cmpabs_ui (mpz_t @var{op1}, unsigned long int @var{op2}) 3710Compare the absolute values of @var{op1} and @var{op2}. 
Return a positive 3711value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if 3712@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if 3713@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}. 3714 3715@code{mpz_cmpabs_d} can be called with an infinity, but results are undefined 3716for a NaN. 3717@end deftypefn 3718 3719@deftypefn Macro int mpz_sgn (mpz_t @var{op}) 3720@cindex Sign tests 3721@cindex Integer sign tests 3722Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and 3723@math{-1} if @math{@var{op} < 0}. 3724 3725This function is actually implemented as a macro. It evaluates its argument 3726multiple times. 3727@end deftypefn 3728 3729 3730@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions 3731@comment node-name, next, previous, up 3732@section Logical and Bit Manipulation Functions 3733@cindex Logical functions 3734@cindex Bit manipulation functions 3735@cindex Integer logical functions 3736@cindex Integer bit manipulation functions 3737 3738These functions behave as if twos complement arithmetic were used (although 3739sign-magnitude is the actual implementation). The least significant bit is 3740number 0. 3741 3742@deftypefun void mpz_and (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) 3743Set @var{rop} to @var{op1} bitwise-and @var{op2}. 3744@end deftypefun 3745 3746@deftypefun void mpz_ior (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) 3747Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}. 3748@end deftypefun 3749 3750@deftypefun void mpz_xor (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) 3751Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}. 3752@end deftypefun 3753 3754@deftypefun void mpz_com (mpz_t @var{rop}, mpz_t @var{op}) 3755Set @var{rop} to the one's complement of @var{op}. 
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_popcount (mpz_t @var{op})
If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is the
number of 1 bits in the binary representation. If @math{@var{op}<0}, the
number of 1s is infinite, and the return value is the largest possible
@code{mp_bitcnt_t}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_hamdist (mpz_t @var{op1}, mpz_t @var{op2})
If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return the
Hamming distance between the two operands, which is the number of bit positions
where @var{op1} and @var{op2} have different bit values. If one operand is
@math{@ge{}0} and the other @math{<0} then the number of differing bits is
infinite, and the return value is the largest possible @code{mp_bitcnt_t}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpz_scan0 (mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
@deftypefunx {mp_bitcnt_t} mpz_scan1 (mpz_t @var{op}, mp_bitcnt_t @var{starting_bit})
@cindex Bit scanning functions
@cindex Scan bit functions
Scan @var{op}, starting from bit @var{starting_bit}, towards more significant
bits, until the first 0 or 1 bit (respectively) is found. Return the index of
the found bit.

If the bit at @var{starting_bit} is already what's sought, then
@var{starting_bit} is returned.

If there's no bit found, then the largest possible @code{mp_bitcnt_t} is
returned. This will happen in @code{mpz_scan0} past the end of a negative
number, or @code{mpz_scan1} past the end of a nonnegative number.
@end deftypefun

@deftypefun void mpz_setbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
Set bit @var{bit_index} in @var{rop}.
@end deftypefun

@deftypefun void mpz_clrbit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index})
Clear bit @var{bit_index} in @var{rop}.
3795@end deftypefun 3796 3797@deftypefun void mpz_combit (mpz_t @var{rop}, mp_bitcnt_t @var{bit_index}) 3798Complement bit @var{bit_index} in @var{rop}. 3799@end deftypefun 3800 3801@deftypefun int mpz_tstbit (mpz_t @var{op}, mp_bitcnt_t @var{bit_index}) 3802Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly. 3803@end deftypefun 3804 3805@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions 3806@comment node-name, next, previous, up 3807@section Input and Output Functions 3808@cindex Integer input and output functions 3809@cindex Input functions 3810@cindex Output functions 3811@cindex I/O functions 3812 3813Functions that perform input from a stdio stream, and functions that output to 3814a stdio stream, of @code{mpz} numbers. Passing a @code{NULL} pointer for a 3815@var{stream} argument to any of these functions will make them read from 3816@code{stdin} and write to @code{stdout}, respectively. 3817 3818When using any of these functions, it is a good idea to include @file{stdio.h} 3819before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes 3820for these functions. 3821 3822See also @ref{Formatted Output} and @ref{Formatted Input}. 3823 3824@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, mpz_t @var{op}) 3825Output @var{op} on stdio stream @var{stream}, as a string of digits in base 3826@var{base}. The base argument may vary from 2 to 62 or from @minus{}2 to 3827@minus{}36. 3828 3829For @var{base} in the range 2..36, digits and lower-case letters are used; for 3830@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62, 3831digits, upper-case letters, and lower-case letters (in that significance order) 3832are used. 3833 3834Return the number of bytes written, or if an error occurred, return 0. 
@end deftypefun

@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base})
Input a string in base @var{base}, possibly preceded by white space, from
stdio stream @var{stream}, and put the read integer in @var{rop}.

The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading
characters determine the base: @code{0x} and @code{0X} for hexadecimal,
@code{0b} and @code{0B} for binary, @code{0} for octal, or decimal otherwise.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value. For bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

Return the number of bytes read, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpz_out_raw (FILE *@var{stream}, mpz_t @var{op})
Output @var{op} on stdio stream @var{stream}, in raw binary format. The
integer is written in a portable format, with 4 bytes of size information, and
that many bytes of limbs. Both the size and the limbs are written in
decreasing significance order (i.e., in big-endian).

The output can be read with @code{mpz_inp_raw}.

Return the number of bytes written, or if an error occurred, return 0.

The output of this cannot be read by @code{mpz_inp_raw} from GMP 1, because
of changes necessary for compatibility between 32-bit and 64-bit machines.
@end deftypefun

@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream})
Input from stdio stream @var{stream} in the format written by
@code{mpz_out_raw}, and put the result in @var{rop}. Return the number of
bytes read, or if an error occurred, return 0.

This routine can read the output from @code{mpz_out_raw} also from GMP 1, in
spite of changes necessary for compatibility between 32-bit and 64-bit
machines.
@end deftypefun


@need 2000
@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions
@comment  node-name,  next,  previous,  up
@section Random Number Functions
@cindex Integer random number functions
@cindex Random number functions

The random number functions of GMP come in two groups: older functions
that rely on a global state, and newer functions that accept a state
parameter that is read and modified. Please see @ref{Random Number Functions}
for more information on how to use and not to use random
number functions.

@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
Generate a uniformly distributed random integer in the range 0 to @m{2^n-1,
2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization}) before
invoking this function.
@end deftypefun

@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, mpz_t @var{n})
Generate a uniform random integer in the range 0 to @math{@var{n}-1},
inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization})
before invoking this function.
@end deftypefun

@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{n})
Generate a random integer with long strings of zeros and ones in the
binary representation. Useful for testing functions and algorithms,
since random numbers of this kind have proven to be more likely to
trigger corner-case bugs. The random number will be in the range
0 to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@ref{Random State Initialization})
before invoking this function.
@end deftypefun

@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs. The generated
random number doesn't satisfy any particular requirements of randomness.
Negative random numbers are generated when @var{max_size} is negative.

This function is obsolete. Use @code{mpz_urandomb} or
@code{mpz_urandomm} instead.
@end deftypefun

@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size})
Generate a random integer of at most @var{max_size} limbs, with long strings
of zeros and ones in the binary representation. Useful for testing functions
and algorithms, since random numbers of this kind have proven to be more
likely to trigger corner-case bugs. Negative random numbers are generated
when @var{max_size} is negative.

This function is obsolete. Use @code{mpz_rrandomb} instead.
@end deftypefun


@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions
@section Integer Import and Export

@code{mpz_t} variables can be converted to and from arbitrary words of binary
data with the following functions.

@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op})
@cindex Integer import
@cindex Import
Set @var{rop} from an array of word data at @var{op}.

The parameters specify the format of the data. @var{count} many words are
read, each @var{size} bytes. @var{order} can be 1 for most significant word
first or -1 for least significant first.
Within each word @var{endian} can be 39541 for most significant byte first, -1 for least significant first, or 0 for 3955the native endianness of the host CPU@. The most significant @var{nails} bits 3956of each word are skipped, this can be 0 to use the full words. 3957 3958There is no sign taken from the data, @var{rop} will simply be a positive 3959integer. An application can handle any sign itself, and apply it for instance 3960with @code{mpz_neg}. 3961 3962There are no data alignment restrictions on @var{op}, any address is allowed. 3963 3964Here's an example converting an array of @code{unsigned long} data, most 3965significant element first, and host byte order within each value. 3966 3967@example 3968unsigned long a[20]; 3969/* Initialize @var{z} and @var{a} */ 3970mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a); 3971@end example 3972 3973This example assumes the full @code{sizeof} bytes are used for data in the 3974given type, which is usually true, and certainly true for @code{unsigned long} 3975everywhere we know of. However on Cray vector systems it may be noted that 3976@code{short} and @code{int} are always stored in 8 bytes (and with 3977@code{sizeof} indicating that) but use only 32 or 46 bits. The @var{nails} 3978feature can account for this, by passing for instance 3979@code{8*sizeof(int)-INT_BIT}. 3980@end deftypefun 3981 3982@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, mpz_t @var{op}) 3983@cindex Integer export 3984@cindex Export 3985Fill @var{rop} with word data from @var{op}. 3986 3987The parameters specify the format of the data produced. Each word will be 3988@var{size} bytes and @var{order} can be 1 for most significant word first or 3989-1 for least significant first. Within each word @var{endian} can be 1 for 3990most significant byte first, -1 for least significant first, or 0 for the 3991native endianness of the host CPU@. 
The most significant @var{nails} bits of 3992each word are unused and set to zero, this can be 0 to produce full words. 3993 3994The number of words produced is written to @code{*@var{countp}}, or 3995@var{countp} can be @code{NULL} to discard the count. @var{rop} must have 3996enough space for the data, or if @var{rop} is @code{NULL} then a result array 3997of the necessary size is allocated using the current GMP allocation function 3998(@pxref{Custom Allocation}). In either case the return value is the 3999destination used, either @var{rop} or the allocated block. 4000 4001If @var{op} is non-zero then the most significant word produced will be 4002non-zero. If @var{op} is zero then the count returned will be zero and 4003nothing written to @var{rop}. If @var{rop} is @code{NULL} in this case, no 4004block is allocated, just @code{NULL} is returned. 4005 4006The sign of @var{op} is ignored, just the absolute value is exported. An 4007application can use @code{mpz_sgn} to get the sign and handle it as desired. 4008(@pxref{Integer Comparisons}) 4009 4010There are no data alignment restrictions on @var{rop}, any address is allowed. 4011 4012When an application is allocating space itself the required size can be 4013determined with a calculation like the following. Since @code{mpz_sizeinbase} 4014always returns at least 1, @code{count} here will be at least one, which 4015avoids any portability problems with @code{malloc(0)}, though if @code{z} is 4016zero no space at all is actually needed (or written). 
4017 4018@example 4019numb = 8*size - nail; 4020count = (mpz_sizeinbase (z, 2) + numb-1) / numb; 4021p = malloc (count * size); 4022@end example 4023@end deftypefun 4024 4025 4026@need 2000 4027@node Miscellaneous Integer Functions, Integer Special Functions, Integer Import and Export, Integer Functions 4028@comment node-name, next, previous, up 4029@section Miscellaneous Functions 4030@cindex Miscellaneous integer functions 4031@cindex Integer miscellaneous functions 4032 4033@deftypefun int mpz_fits_ulong_p (mpz_t @var{op}) 4034@deftypefunx int mpz_fits_slong_p (mpz_t @var{op}) 4035@deftypefunx int mpz_fits_uint_p (mpz_t @var{op}) 4036@deftypefunx int mpz_fits_sint_p (mpz_t @var{op}) 4037@deftypefunx int mpz_fits_ushort_p (mpz_t @var{op}) 4038@deftypefunx int mpz_fits_sshort_p (mpz_t @var{op}) 4039Return non-zero iff the value of @var{op} fits in an @code{unsigned long int}, 4040@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned 4041short int}, or @code{signed short int}, respectively. Otherwise, return zero. 4042@end deftypefun 4043 4044@deftypefn Macro int mpz_odd_p (mpz_t @var{op}) 4045@deftypefnx Macro int mpz_even_p (mpz_t @var{op}) 4046Determine whether @var{op} is odd or even, respectively. Return non-zero if 4047yes, zero if no. These macros evaluate their argument more than once. 4048@end deftypefn 4049 4050@deftypefun size_t mpz_sizeinbase (mpz_t @var{op}, int @var{base}) 4051@cindex Size in digits 4052@cindex Digits in an integer 4053Return the size of @var{op} measured in number of digits in the given 4054@var{base}. @var{base} can vary from 2 to 62. The sign of @var{op} is 4055ignored, just the absolute value is used. The result will be either exact or 40561 too big. If @var{base} is a power of 2, the result is always exact. If 4057@var{op} is zero the return value is always 1. 4058 4059This function can be used to determine the space required when converting 4060@var{op} to a string. 
The right amount of allocation is normally two more
than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign
and one for the null-terminator.

@cindex Most significant bit
It will be noted that @code{mpz_sizeinbase(@var{op},2)} can be used to locate
the most significant 1 bit in @var{op}, counting from 1. (Unlike the bitwise
functions which start from 0, @xref{Integer Logic and Bit Fiddling,, Logical
and Bit Manipulation Functions}.)
@end deftypefun


@node Integer Special Functions, , Miscellaneous Integer Functions, Integer Functions
@section Special Functions
@cindex Special integer functions
@cindex Integer special functions

The functions in this section are for various special purposes. Most
applications will not need them.

@deftypefun void mpz_array_init (mpz_t @var{integer_array}, mp_size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}})
This is a special type of initialization. @strong{Fixed} space of
@var{fixed_num_bits} is allocated to each of the @var{array_size} integers in
@var{integer_array}. There is no way to free the storage allocated by this
function. Don't call @code{mpz_clear}!

The @var{integer_array} parameter is the first @code{mpz_t} in the array. For
example,

@example
mpz_t arr[20000];
mpz_array_init (arr[0], 20000, 512);
@end example

@c In case anyone's wondering, yes this parameter style is a bit anomalous,
@c it'd probably be nicer if it was "arr" instead of "arr[0]". Obviously the
@c two differ only in the declaration, not the pointer value, but changing is
@c not possible since it'd provoke warnings or errors in existing sources.

This function is only intended for programs that create a large number
of integers and need to reduce memory usage by avoiding the overheads of
allocating and reallocating lots of small blocks.
In normal programs this
function is not recommended.

The space allocated to each integer by this function will not be automatically
increased, unlike the normal @code{mpz_init}, so an application must ensure it
is sufficient for any value stored. The following space requirements apply to
various routines,

@itemize @bullet
@item
@code{mpz_abs}, @code{mpz_neg}, @code{mpz_set}, @code{mpz_set_si} and
@code{mpz_set_ui} need room for the value they store.

@item
@code{mpz_add}, @code{mpz_add_ui}, @code{mpz_sub} and @code{mpz_sub_ui} need
room for the larger of the two operands, plus an extra
@code{mp_bits_per_limb}.

@item
@code{mpz_mul}, @code{mpz_mul_ui} and @code{mpz_mul_si} need room for the sum
of the number of bits in their operands, but each rounded up to a multiple of
@code{mp_bits_per_limb}.

@item
@code{mpz_swap} can be used between two array variables, but not between an
array and a normal variable.
@end itemize

For other functions, or if in doubt, the suggestion is to calculate in a
regular @code{mpz_init} variable and copy the result to an array variable with
@code{mpz_set}.
@end deftypefun

@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc})
Change the space for @var{integer} to @var{new_alloc} limbs. The value in
@var{integer} is preserved if it fits, or is set to 0 if not. The return
value is not useful to applications and should be ignored.

@code{mpz_realloc2} is the preferred way to accomplish allocation changes like
this. @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that
@code{_mpz_realloc} takes its size in limbs.
@end deftypefun

@deftypefun mp_limb_t mpz_getlimbn (mpz_t @var{op}, mp_size_t @var{n})
Return limb number @var{n} from @var{op}. The sign of @var{op} is ignored,
just the absolute value is used.
The least significant limb is number 0.

@code{mpz_size} can be used to find how many limbs make up @var{op}.
@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to
@code{mpz_size(@var{op})-1}.
@end deftypefun

@deftypefun size_t mpz_size (mpz_t @var{op})
Return the size of @var{op} measured in number of limbs. If @var{op} is zero,
the returned value will be zero.
@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.)
@end deftypefun



@node Rational Number Functions, Floating-point Functions, Integer Functions, Top
@comment node-name, next, previous, up
@chapter Rational Number Functions
@cindex Rational number functions

This chapter describes the GMP functions for performing arithmetic on rational
numbers. These functions start with the prefix @code{mpq_}.

Rational numbers are stored in objects of type @code{mpq_t}.

All rational arithmetic functions assume operands have a canonical form, and
canonicalize their result. The canonical form means that the denominator and
the numerator have no common factors, and that the denominator is positive.
Zero has the unique representation 0/1.

Pure assignment functions do not canonicalize the assigned variable. It is
the responsibility of the user to canonicalize the assigned variable before
any arithmetic operations are performed on that variable.

@deftypefun void mpq_canonicalize (mpq_t @var{op})
Remove any factors that are common to the numerator and denominator of
@var{op}, and make the denominator positive.
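
To make the canonical form concrete, the following standalone sketch applies
the same two rules to a fraction held in plain @code{long} variables. It is
only an illustration and not how GMP implements @code{mpq_canonicalize}; the
helper names @code{gcd_long} and @code{canonicalize_fraction} are hypothetical,
and the sketch assumes neither value is @code{LONG_MIN}.

```c
#include <assert.h>
#include <stdlib.h>

/* Greatest common divisor of |a| and |b| by Euclid's algorithm.
   Hypothetical helper; assumes neither argument is LONG_MIN. */
long gcd_long (long a, long b)
{
  a = labs (a);
  b = labs (b);
  while (b != 0)
    {
      long t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Reduce *num / *den to canonical form: no common factors and a
   positive denominator, so zero always ends up as 0/1. */
void canonicalize_fraction (long *num, long *den)
{
  long g;

  assert (*den != 0);          /* a zero denominator is not a rational */
  g = gcd_long (*num, *den);   /* non-zero because *den is non-zero */
  *num /= g;
  *den /= g;
  if (*den < 0)                /* move the sign to the numerator */
    {
      *num = -*num;
      *den = -*den;
    }
}
```

For instance 6/@minus{}4 becomes @minus{}3/2 and 0/5 becomes 0/1. With an
@code{mpq_t} the same effect is obtained by assigning the value and then
calling @code{mpq_canonicalize} before any arithmetic.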
@end deftypefun

@menu
* Initializing Rationals::
* Rational Conversions::
* Rational Arithmetic::
* Comparing Rationals::
* Applying Integer Functions::
* I/O of Rationals::
@end menu

@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Initialization and Assignment Functions
@cindex Rational assignment functions
@cindex Assignment functions
@cindex Rational initialization functions
@cindex Initialization functions

@deftypefun void mpq_init (mpq_t @var{x})
Initialize @var{x} and set it to 0/1. Each variable should normally only be
initialized once, or at least cleared out (using the function @code{mpq_clear})
between each initialization.
@end deftypefun

@deftypefun void mpq_inits (mpq_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpq_t} variables, and set their
values to 0/1.
@end deftypefun

@deftypefun void mpq_clear (mpq_t @var{x})
Free the space occupied by @var{x}. Make sure to call this function for all
@code{mpq_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpq_clears (mpq_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpq_t} variables.
@end deftypefun

@deftypefun void mpq_set (mpq_t @var{rop}, mpq_t @var{op})
@deftypefunx void mpq_set_z (mpq_t @var{rop}, mpz_t @var{op})
Assign @var{rop} from @var{op}.
@end deftypefun

@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2})
@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2})
Set the value of @var{rop} to @var{op1}/@var{op2}.
Note that if @var{op1} and
@var{op2} have common factors, @var{rop} has to be passed to
@code{mpq_canonicalize} before any operations are performed on @var{rop}.
@end deftypefun

@deftypefun int mpq_set_str (mpq_t @var{rop}, char *@var{str}, int @var{base})
Set @var{rop} from a null-terminated string @var{str} in the given @var{base}.

The string can be an integer like ``41'' or a fraction like ``41/152''. The
fraction must be in canonical form (@pxref{Rational Number Functions}), or if
not then @code{mpq_canonicalize} must be called.

The numerator and optional denominator are parsed the same as in
@code{mpz_set_str} (@pxref{Assigning Integers}). White space is allowed in
the string, and is simply ignored. The @var{base} can vary from 2 to 62, or
if @var{base} is 0 then the leading characters are used: @code{0x} or @code{0X} for hex,
@code{0b} or @code{0B} for binary,
@code{0} for octal, or decimal otherwise. Note that this is done separately
for the numerator and denominator, so for instance @code{0xEF/100} is 239/100,
whereas @code{0xEF/0x100} is 239/256.

The return value is 0 if the entire string is a valid number, or @minus{}1 if
not.
@end deftypefun

@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2})
Swap the values @var{rop1} and @var{rop2} efficiently.
@end deftypefun


@need 2000
@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Rational conversion functions
@cindex Conversion functions

@deftypefun double mpq_get_d (mpq_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding
towards zero).

If the exponent from the conversion is too big or too small to fit a
@code{double} then the result is system dependent.
For too big an infinity is
returned when available. For too small @math{0.0} is normally returned.
Hardware overflow, underflow and denorm traps may or may not occur.
@end deftypefun

@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op})
@deftypefunx void mpq_set_f (mpq_t @var{rop}, mpf_t @var{op})
Set @var{rop} to the value of @var{op}. There is no rounding, this conversion
is exact.
@end deftypefun

@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, mpq_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base may vary
from 2 to 36. The string will be of the form @samp{num/den}, or if the
denominator is 1 then just @samp{num}.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of storage large
enough for the result, that being

@example
mpz_sizeinbase (mpq_numref(@var{op}), @var{base})
+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3
@end example

The three extra bytes are for a possible minus sign, possible slash, and the
null-terminator.

A pointer to the result string is returned, being either the allocated block,
or the given @var{str}.
@end deftypefun


@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Rational arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpq_add (mpq_t @var{sum}, mpq_t @var{addend1}, mpq_t @var{addend2})
Set @var{sum} to @var{addend1} + @var{addend2}.
@end deftypefun

@deftypefun void mpq_sub (mpq_t @var{difference}, mpq_t @var{minuend}, mpq_t @var{subtrahend})
Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}.
@end deftypefun

@deftypefun void mpq_mul (mpq_t @var{product}, mpq_t @var{multiplier}, mpq_t @var{multiplicand})
Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}.
@end deftypefun

@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, mpq_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_div (mpq_t @var{quotient}, mpq_t @var{dividend}, mpq_t @var{divisor})
@cindex Division functions
Set @var{quotient} to @var{dividend}/@var{divisor}.
@end deftypefun

@deftypefun void mpq_div_2exp (mpq_t @var{rop}, mpq_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpq_neg (mpq_t @var{negated_operand}, mpq_t @var{operand})
Set @var{negated_operand} to @minus{}@var{operand}.
@end deftypefun

@deftypefun void mpq_abs (mpq_t @var{rop}, mpq_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun

@deftypefun void mpq_inv (mpq_t @var{inverted_number}, mpq_t @var{number})
Set @var{inverted_number} to 1/@var{number}. If the new denominator is
zero, this routine will divide by zero.
@end deftypefun

@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Rational comparison functions
@cindex Comparison functions

@deftypefun int mpq_cmp (mpq_t @var{op1}, mpq_t @var{op2})
Compare @var{op1} and @var{op2}.
Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
@math{@var{op1} < @var{op2}}.

To determine if two rationals are equal, @code{mpq_equal} is faster than
@code{mpq_cmp}.
@end deftypefun

@deftypefn Macro int mpq_cmp_ui (mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2})
@deftypefnx Macro int mpq_cmp_si (mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2})
Compare @var{op1} and @var{num2}/@var{den2}. Return a positive value if
@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} =
@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} <
@var{num2}/@var{den2}}.

@var{num2} and @var{den2} are allowed to have common factors.

These functions are implemented as macros and evaluate their arguments
multiple times.
@end deftypefn

@deftypefn Macro int mpq_sgn (mpq_t @var{op})
@cindex Sign tests
@cindex Rational sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its
argument multiple times.
@end deftypefn

@deftypefun int mpq_equal (mpq_t @var{op1}, mpq_t @var{op2})
Return non-zero if @var{op1} and @var{op2} are equal, zero if they are
non-equal. Although @code{mpq_cmp} can be used for the same purpose, this
function is much faster.
@end deftypefun

@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions
@comment node-name, next, previous, up
@section Applying Integer Functions to Rationals
@cindex Rational numerator and denominator
@cindex Numerator and denominator

The set of @code{mpq} functions is quite small. In particular, there are few
functions for either input or output.
The following functions give direct
access to the numerator and denominator of an @code{mpq_t}.

Note that if an assignment to the numerator and/or denominator could take an
@code{mpq_t} out of the canonical form described at the start of this chapter
(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be
called before any other @code{mpq} functions are applied to that @code{mpq_t}.

@deftypefn Macro mpz_t mpq_numref (mpq_t @var{op})
@deftypefnx Macro mpz_t mpq_denref (mpq_t @var{op})
Return a reference to the numerator and denominator of @var{op}, respectively.
The @code{mpz} functions can be used on the result of these macros.
@end deftypefn

@deftypefun void mpq_get_num (mpz_t @var{numerator}, mpq_t @var{rational})
@deftypefunx void mpq_get_den (mpz_t @var{denominator}, mpq_t @var{rational})
@deftypefunx void mpq_set_num (mpq_t @var{rational}, mpz_t @var{numerator})
@deftypefunx void mpq_set_den (mpq_t @var{rational}, mpz_t @var{denominator})
Get or set the numerator or denominator of a rational. These functions are
equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or
@code{mpq_denref}. Direct use of @code{mpq_numref} or @code{mpq_denref} is
recommended instead of these functions.
@end deftypefun


@need 2000
@node I/O of Rationals, , Applying Integer Functions, Rational Number Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Rational input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

Functions that perform input from a stdio stream, and functions that output to
a stdio stream, of @code{mpq} numbers. Passing a @code{NULL} pointer for a
@var{stream} argument to any of these functions will make them read from
@code{stdin} and write to @code{stdout}, respectively.

When using any of these functions, it is a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

See also @ref{Formatted Output} and @ref{Formatted Input}.

@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, mpq_t @var{op})
Output @var{op} on stdio stream @var{stream}, as a string of digits in base
@var{base}. The base may vary from 2 to 36. Output is in the form
@samp{num/den} or if the denominator is 1 then just @samp{num}.

Return the number of bytes written, or if an error occurred, return 0.
@end deftypefun

@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base})
Read a string of digits from @var{stream} and convert them to a rational in
@var{rop}. Any initial white-space characters are read and discarded. Return
the number of characters read (including white space), or 0 if a rational
could not be read.

The input can be a fraction like @samp{17/63} or just an integer like
@samp{123}. Reading stops at the first character not in this form, and white
space is not permitted within the string. If the input might not be in
canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational
Number Functions}).

The @var{base} can be between 2 and 36, or can be 0 in which case the leading
characters of the string determine the base, @samp{0x} or @samp{0X} for
hexadecimal, @samp{0} for octal, or decimal otherwise. The leading characters
are examined separately for the numerator and denominator of a fraction, so
for instance @samp{0x10/11} is @math{16/11}, whereas @samp{0x10/0x11} is
@math{16/17}.
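
The effect of a @var{base} of 0 can be imitated with the C library alone,
since @code{strtol} with base 0 recognizes the same @samp{0x}, @samp{0X} and
@samp{0} prefixes. The sketch below is an illustration rather than GMP code:
the helper @code{parse_fraction_base0} is hypothetical, it only handles values
that fit a @code{long}, and unlike @code{mpq_inp_str} it works on a string and
lets @code{strtol} accept a sign and leading white space. Its purpose is to
show the prefix being examined separately for the numerator and denominator.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse "num" or "num/den", choosing the base of each part from its
   own prefix.  Returns 1 on success, 0 if the text is not of that form.
   Hypothetical helper for illustration; values must fit a long. */
int parse_fraction_base0 (const char *text, long *num, long *den)
{
  char *end;

  assert (text != NULL && num != NULL && den != NULL);

  *num = strtol (text, &end, 0);           /* prefix of the numerator */
  if (end == text)
    return 0;                              /* no digits at all */

  *den = 1;                                /* plain integer: denominator 1 */
  if (*end == '/')
    {
      const char *den_text = end + 1;
      *den = strtol (den_text, &end, 0);   /* prefix of the denominator */
      if (end == den_text)
        return 0;
    }
  return *end == '\0';                     /* reject trailing characters */
}
```

Given @samp{0x10/11} the helper stores 16 and 11, while @samp{0x10/0x11}
gives 16 and 17, matching the examples above.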
@end deftypefun


@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top
@comment node-name, next, previous, up
@chapter Floating-point Functions
@cindex Floating-point functions
@cindex Float functions
@cindex User-defined precision
@cindex Precision of floats

GMP floating point numbers are stored in objects of type @code{mpf_t} and
functions operating on them have an @code{mpf_} prefix.

The mantissa of each float has a user-selectable precision, limited only by
available memory. Each variable has its own precision, and that can be
increased or decreased at any time.

The exponent of each float is a fixed precision, one machine word on most
systems. In the current implementation the exponent is a count of limbs, so
for example on a 32-bit system this means a range of roughly
@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system
this will be greater. Note however @code{mpf_get_str} can only return an
exponent which fits an @code{mp_exp_t} and currently @code{mpf_set_str}
doesn't accept exponents bigger than a @code{long}.

Each variable keeps a size for the mantissa data actually in use. This means
that if a float is exactly represented in only a few bits then only those bits
will be used in a calculation, even if the selected precision is high.

All calculations are performed to the precision of the destination variable.
Each function is defined to calculate with ``infinite precision'' followed by
a truncation to the destination precision, but of course the work done is only
what's needed to determine a result under that definition.

The precision selected for a variable is a minimum value, GMP may increase it
a little to facilitate efficient calculation.
Currently this means rounding
up to a whole limb, and then sometimes having a further partial limb,
depending on the high limb of the mantissa. But applications shouldn't be
concerned by such details.

The mantissa is stored in binary, as might be imagined from the fact
precisions are expressed in bits. One consequence of this is that decimal
fractions like @math{0.1} cannot be represented exactly. The same is true of
plain IEEE @code{double} floats. This makes both highly unsuitable for
calculations involving money or other values that should be exact decimal
fractions. (Suitably scaled integers, or perhaps rationals, are better
choices.)

@code{mpf} functions and variables have no special notion of infinity or
not-a-number, and applications must take care not to overflow the exponent or
results will be unpredictable. This might change in a future release.

Note that the @code{mpf} functions are @emph{not} intended as a smooth
extension to IEEE P754 arithmetic. In particular results obtained on one
computer often differ from the results on a computer with a different word
size.

@menu
* Initializing Floats::
* Assigning Floats::
* Simultaneous Float Init & Assign::
* Converting Floats::
* Float Arithmetic::
* Float Comparison::
* I/O of Floats::
* Miscellaneous Float Functions::
@end menu

@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions
@comment node-name, next, previous, up
@section Initialization Functions
@cindex Float initialization functions
@cindex Initialization functions

@deftypefun void mpf_set_default_prec (mp_bitcnt_t @var{prec})
Set the default precision to be @strong{at least} @var{prec} bits. All
subsequent calls to @code{mpf_init} will use this precision, but previously
initialized variables are unaffected.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpf_get_default_prec (void)
Return the default precision actually used.
@end deftypefun

An @code{mpf_t} object must be initialized before storing the first value in
it. The functions @code{mpf_init} and @code{mpf_init2} are used for that
purpose.

@deftypefun void mpf_init (mpf_t @var{x})
Initialize @var{x} to 0. Normally, a variable should be initialized once only
or at least be cleared, using @code{mpf_clear}, between initializations. The
precision of @var{x} is undefined unless a default precision has already been
established by a call to @code{mpf_set_default_prec}.
@end deftypefun

@deftypefun void mpf_init2 (mpf_t @var{x}, mp_bitcnt_t @var{prec})
Initialize @var{x} to 0 and set its precision to be @strong{at least}
@var{prec} bits. Normally, a variable should be initialized once only or at
least be cleared, using @code{mpf_clear}, between initializations.
@end deftypefun

@deftypefun void mpf_inits (mpf_t @var{x}, ...)
Initialize a NULL-terminated list of @code{mpf_t} variables, and set their
values to 0. The precision of the initialized variables is undefined unless a
default precision has already been established by a call to
@code{mpf_set_default_prec}.
@end deftypefun

@deftypefun void mpf_clear (mpf_t @var{x})
Free the space occupied by @var{x}. Make sure to call this function for all
@code{mpf_t} variables when you are done with them.
@end deftypefun

@deftypefun void mpf_clears (mpf_t @var{x}, ...)
Free the space occupied by a NULL-terminated list of @code{mpf_t} variables.
@end deftypefun

@need 2000
Here is an example of how to initialize floating-point variables:
@example
@{
  mpf_t x, y;
  mpf_init (x);           /* use default precision */
  mpf_init2 (y, 256);     /* precision @emph{at least} 256 bits */
  @dots{}
  /* Unless the program is about to exit, do ... */
  mpf_clear (x);
  mpf_clear (y);
@}
@end example

The following three functions are useful for changing the precision during a
calculation. A typical use would be for adjusting the precision gradually in
iterative algorithms like Newton-Raphson, making the computation precision
closely match the actual accurate part of the numbers.

@deftypefun {mp_bitcnt_t} mpf_get_prec (mpf_t @var{op})
Return the current precision of @var{op}, in bits.
@end deftypefun

@deftypefun void mpf_set_prec (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits. The
value in @var{rop} will be truncated to the new precision.

This function requires a call to @code{realloc}, and so should not be used in
a tight loop.
@end deftypefun

@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, mp_bitcnt_t @var{prec})
Set the precision of @var{rop} to be @strong{at least} @var{prec} bits,
without changing the memory allocated.

@var{prec} must be no more than the allocated precision for @var{rop}, that
being the precision when @var{rop} was initialized, or in the most recent
@code{mpf_set_prec}.

The value in @var{rop} is unchanged, and in particular if it had a higher
precision than @var{prec} it will retain that higher precision. New values
written to @var{rop} will use the new @var{prec}.

Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another
@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original
allocated precision.
Failing to do so will have unpredictable results.

@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the
original allocated precision. After @code{mpf_set_prec_raw} it reflects the
@var{prec} value set.

@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at
different precisions during a calculation, perhaps to gradually increase
precision in an iteration, or just to use various different precisions for
different purposes during a calculation.
@end deftypefun


@need 2000
@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Assignment Functions
@cindex Float assignment functions
@cindex Assignment functions

These functions assign new values to already initialized floats
(@pxref{Initializing Floats}).

@deftypefun void mpf_set (mpf_t @var{rop}, mpf_t @var{op})
@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op})
@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op})
@deftypefunx void mpf_set_z (mpf_t @var{rop}, mpz_t @var{op})
@deftypefunx void mpf_set_q (mpf_t @var{rop}, mpq_t @var{op})
Set the value of @var{rop} from @var{op}.
@end deftypefun

@deftypefun int mpf_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
Set the value of @var{rop} from the string in @var{str}. The string is of the
form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}.
@samp{M} is the mantissa and @samp{N} is the exponent. The mantissa is always
in the specified base. The exponent is either in the specified base or, if
@var{base} is negative, in decimal. The decimal point expected is taken from
the current locale, on systems providing @code{localeconv}.
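
Since the expected decimal point depends on the locale, a program that
assembles its own input strings may prefer to look the character up rather
than hard-code @samp{.}. A minimal sketch using only the C library follows;
the helper name @code{join_with_locale_point} is hypothetical, and the string
it builds is the kind of mantissa one might then pass to @code{mpf_set_str}.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Write INT_DIGITS, the current locale's decimal point and FRAC_DIGITS
   into OUT.  Returns 1 if the whole string fitted, 0 if it was truncated.
   Hypothetical helper for illustration only. */
int join_with_locale_point (char *out, size_t outsize,
                            const char *int_digits, const char *frac_digits)
{
  const char *point = localeconv ()->decimal_point;
  int len;

  assert (out != NULL && outsize > 0);
  assert (strlen (point) > 0);   /* ISO C: decimal_point is never empty */

  len = snprintf (out, outsize, "%s%s%s", int_digits, point, frac_digits);
  return len >= 0 && (size_t) len < outsize;
}
```

In the default @samp{C} locale the digits @samp{3} and @samp{14} are joined as
@samp{3.14}; after a @code{setlocale} call selecting a locale that uses a
comma, the same call would produce @samp{3,14}.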

The argument @var{base} may be in the ranges 2 to 62, or @minus{}62 to
@minus{}2. Negative values are used to specify that the exponent is in
decimal.

For bases up to 36, case is ignored; upper-case and lower-case letters have
the same value; for bases 37 to 62, upper-case letters represent the usual
10..35 while lower-case letters represent 36..61.

Unlike the corresponding @code{mpz} function, the base will not be determined
from the leading characters of the string if @var{base} is 0. This is so that
numbers like @samp{0.23} are not interpreted as octal.

White space is allowed in the string, and is simply ignored. [This is not
really true; white-space is ignored in the beginning of the string and within
the mantissa, but not in other places, such as after a minus sign or in the
exponent. We are considering changing the definition of this function, making
it fail when there is any white-space in the input, since that makes a lot of
sense. Please tell us your opinion about this change. Do you really want it
to accept @nicode{"3 14"} as meaning 314 as it does now?]

This function returns 0 if the entire string is a valid number in base
@var{base}. Otherwise it returns @minus{}1.
@end deftypefun

@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2})
Swap @var{rop1} and @var{rop2} efficiently. Both the values and the
precisions of the two variables are swapped.
@end deftypefun


@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Combined Initialization and Assignment Functions
@cindex Float assignment functions
@cindex Assignment functions
@cindex Float initialization functions
@cindex Initialization functions

For convenience, GMP provides a parallel series of initialize-and-set functions
which initialize the output and then store the value there. These functions'
names have the form @code{mpf_init_set@dots{}}

Once the float has been initialized by any of the @code{mpf_init_set@dots{}}
functions, it can be used as the source or destination operand for the ordinary
float functions. Don't use an initialize-and-set function on a variable
already initialized!

@deftypefun void mpf_init_set (mpf_t @var{rop}, mpf_t @var{op})
@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op})
@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op})
@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op})
Initialize @var{rop} and set its value from @var{op}.

The precision of @var{rop} will be taken from the active default precision, as
set by @code{mpf_set_default_prec}.
@end deftypefun

@deftypefun int mpf_init_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base})
Initialize @var{rop} and set its value from the string in @var{str}. See
@code{mpf_set_str} above for details on the assignment operation.

Note that @var{rop} is initialized even if an error occurs. (I.e., you have to
call @code{mpf_clear} for it.)

The precision of @var{rop} will be taken from the active default precision, as
set by @code{mpf_set_default_prec}.
@end deftypefun


@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions
@comment node-name, next, previous, up
@section Conversion Functions
@cindex Float conversion functions
@cindex Conversion functions

@deftypefun double mpf_get_d (mpf_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero).

If the exponent in @var{op} is too big or too small to fit a @code{double}
then the result is system dependent. If it is too big, an infinity is returned
when available. If it is too small, @math{0.0} is normally returned. Hardware
overflow, underflow and denorm traps may or may not occur.
@end deftypefun

@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, mpf_t @var{op})
Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding
towards zero), and with an exponent returned separately.

The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the
exponent is stored to @code{*@var{exp}}. @m{@var{d} * 2^{exp}, @var{d} *
2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero, the
return is @math{0.0} and 0 is stored to @code{*@var{exp}}.

@cindex @code{frexp}
This is similar to the standard C @code{frexp} function (@pxref{Normalization
Functions,,, libc, The GNU C Library Reference Manual}).
@end deftypefun

@deftypefun long mpf_get_si (mpf_t @var{op})
@deftypefunx {unsigned long} mpf_get_ui (mpf_t @var{op})
Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any
fraction part. If @var{op} is too big for the return type, the result is
undefined.

See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p}
(@pxref{Miscellaneous Float Functions}).
@end deftypefun

@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
Convert @var{op} to a string of digits in base @var{base}. The base argument
may vary from 2 to 62 or from @minus{}2 to @minus{}36. Up to @var{n_digits}
digits will be generated. Trailing zeros are not returned. No more digits
than can be accurately represented by @var{op} are ever generated. If
@var{n_digits} is 0 then that accurate maximum number of digits is generated.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

If @var{str} is @code{NULL}, the result string is allocated using the current
allocation function (@pxref{Custom Allocation}). The block will be
@code{strlen(str)+1} bytes, that being exactly enough for the string and
null-terminator.

If @var{str} is not @code{NULL}, it should point to a block of
@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a
possible minus sign, and a null-terminator. When @var{n_digits} is 0 to get
all significant digits, an application won't be able to know the space
required, and @var{str} should be @code{NULL} in that case.

The generated string is a fraction, with an implicit radix point immediately
to the left of the first digit. The applicable exponent is written through
the @var{expptr} pointer. For example, the number 3.1416 would be returned as
string @nicode{"31416"} and exponent 1.

When @var{op} is zero, an empty string is produced and the exponent returned
is 0.

A pointer to the result string is returned, being either the allocated block
or the given @var{str}.
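The fraction-plus-exponent convention can be turned into a printable number
with a few lines of C. The following is only a sketch: @code{format_fraction}
is a hypothetical helper, not a GMP function, it assumes base 10 and that the
exponent fits in a @code{long}, and it takes the digit string and exponent as
plain arguments rather than calling @code{mpf_get_str} itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of GMP.  Renders a digit string and
   exponent, in the form mpf_get_str is documented to produce, as
   "0.DIGITSeEXP".  The digit string may start with '-'; an empty
   string stands for zero.  Returns whatever snprintf returns. */
static int format_fraction(char *out, size_t outlen,
                           const char *digits, long exp)
{
    assert(out != NULL && digits != NULL);
    const char *sign = "";
    if (digits[0] == '-') {         /* move the sign in front of "0." */
        sign = "-";
        digits++;
    }
    if (strlen(digits) == 0)        /* zero is returned as an empty string */
        return snprintf(out, outlen, "0");
    /* The radix point sits immediately to the left of the first digit. */
    return snprintf(out, outlen, "%s0.%se%ld", sign, digits, exp);
}
```

With the digits @nicode{"31416"} and exponent 1 from the example above, the
helper writes @nicode{0.31416e1}, which is 3.1416.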
@end deftypefun


@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Arithmetic Functions
@cindex Float arithmetic functions
@cindex Arithmetic functions

@deftypefun void mpf_add (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
@deftypefunx void mpf_add_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} + @var{op2}}.
@end deftypefun

@deftypefun void mpf_sub (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @var{op1} @minus{} @var{op2}.
@end deftypefun

@deftypefun void mpf_mul (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}.
@end deftypefun

Division is undefined if the divisor is zero, and passing a zero divisor to the
divide functions will make these functions intentionally divide by zero. This
lets the user handle arithmetic exceptions in these functions in the same
manner as other arithmetic exceptions.

@deftypefun void mpf_div (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2})
@deftypefunx void mpf_div_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
@cindex Division functions
Set @var{rop} to @var{op1}/@var{op2}.
@end deftypefun

@deftypefun void mpf_sqrt (mpf_t @var{rop}, mpf_t @var{op})
@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op})
@cindex Root extraction functions
Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}.
@end deftypefun

@deftypefun void mpf_pow_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2})
@cindex Exponentiation functions
@cindex Powering functions
Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}.
@end deftypefun

@deftypefun void mpf_neg (mpf_t @var{rop}, mpf_t @var{op})
Set @var{rop} to @minus{}@var{op}.
@end deftypefun

@deftypefun void mpf_abs (mpf_t @var{rop}, mpf_t @var{op})
Set @var{rop} to the absolute value of @var{op}.
@end deftypefun

@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, mpf_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to
@var{op2}}.
@end deftypefun

@deftypefun void mpf_div_2exp (mpf_t @var{rop}, mpf_t @var{op1}, mp_bitcnt_t @var{op2})
Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to
@var{op2}}.
@end deftypefun

@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions
@comment node-name, next, previous, up
@section Comparison Functions
@cindex Float comparison functions
@cindex Comparison functions

@deftypefun int mpf_cmp (mpf_t @var{op1}, mpf_t @var{op2})
@deftypefunx int mpf_cmp_d (mpf_t @var{op1}, double @var{op2})
@deftypefunx int mpf_cmp_ui (mpf_t @var{op1}, unsigned long int @var{op2})
@deftypefunx int mpf_cmp_si (mpf_t @var{op1}, signed long int @var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} >
@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if
@math{@var{op1} < @var{op2}}.

@code{mpf_cmp_d} can be called with an infinity, but results are undefined for
a NaN.
@end deftypefun

@deftypefun int mpf_eq (mpf_t @var{op1}, mpf_t @var{op2}, mp_bitcnt_t @var{op3})
Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are
equal, zero otherwise. I.e., test if @var{op1} and @var{op2} are approximately
equal.

Caution 1: All versions of GMP up to version 4.2.4 compared just whole limbs,
meaning sometimes more than @var{op3} bits, sometimes fewer.

Caution 2: This function will consider XXX11...111 and XX100...000 different,
even if ... is replaced by a semi-infinite number of bits. Such numbers are
really just one ulp off, and should be considered equal.
@end deftypefun

@deftypefun void mpf_reldiff (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2})
Compute the relative difference between @var{op1} and @var{op2} and store the
result in @var{rop}. This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}.
@end deftypefun

@deftypefn Macro int mpf_sgn (mpf_t @var{op})
@cindex Sign tests
@cindex Float sign tests
Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and
@math{-1} if @math{@var{op} < 0}.

This function is actually implemented as a macro. It evaluates its argument
multiple times.
@end deftypefn

@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions
@comment node-name, next, previous, up
@section Input and Output Functions
@cindex Float input and output functions
@cindex Input functions
@cindex Output functions
@cindex I/O functions

Functions that perform input from a stdio stream, and functions that output to
a stdio stream, of @code{mpf} numbers.
Passing a @code{NULL} pointer for a
@var{stream} argument to any of these functions will make them read from
@code{stdin} and write to @code{stdout}, respectively.

When using any of these functions, it is a good idea to include @file{stdio.h}
before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes
for these functions.

See also @ref{Formatted Output} and @ref{Formatted Input}.

@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op})
Print @var{op} to @var{stream}, as a string of digits. Return the number of
bytes written, or if an error occurred, return 0.

The mantissa is prefixed with a @samp{0.} and is in the given @var{base},
which may vary from 2 to 62 or from @minus{}2 to @minus{}36. An exponent is
then printed, separated by an @samp{e}, or if the base is greater than 10 then
by an @samp{@@}. The exponent is always in decimal. The decimal point follows
the current locale, on systems providing @code{localeconv}.

For @var{base} in the range 2..36, digits and lower-case letters are used; for
@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62,
digits, upper-case letters, and lower-case letters (in that significance order)
are used.

Up to @var{n_digits} digits will be printed from the mantissa, except that no
more digits than are accurately representable by @var{op} will be printed.
@var{n_digits} can be 0 to select that accurate maximum.
@end deftypefun

@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base})
Read a string in base @var{base} from @var{stream}, and put the read float in
@var{rop}. The string is of the form @samp{M@@N} or, if the base is 10 or
less, alternatively @samp{MeN}. @samp{M} is the mantissa and @samp{N} is the
exponent. The mantissa is always in the specified base.
The exponent is
either in the specified base or, if @var{base} is negative, in decimal. The
decimal point expected is taken from the current locale, on systems providing
@code{localeconv}.

The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to
@minus{}2. Negative values are used to specify that the exponent is in
decimal.

Unlike the corresponding @code{mpz} function, the base will not be determined
from the leading characters of the string if @var{base} is 0. This is so that
numbers like @samp{0.23} are not interpreted as octal.

Return the number of bytes read, or if an error occurred, return 0.
@end deftypefun

@c @deftypefun void mpf_out_raw (FILE *@var{stream}, mpf_t @var{float})
@c Output @var{float} on stdio stream @var{stream}, in raw binary
@c format. The float is written in a portable format, with 4 bytes of
@c size information, and that many bytes of limbs. Both the size and the
@c limbs are written in decreasing significance order.
@c @end deftypefun

@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream})
@c Input from stdio stream @var{stream} in the format written by
@c @code{mpf_out_raw}, and put the result in @var{float}.
@c @end deftypefun


@node Miscellaneous Float Functions, , I/O of Floats, Floating-point Functions
@comment node-name, next, previous, up
@section Miscellaneous Functions
@cindex Miscellaneous float functions
@cindex Float miscellaneous functions

@deftypefun void mpf_ceil (mpf_t @var{rop}, mpf_t @var{op})
@deftypefunx void mpf_floor (mpf_t @var{rop}, mpf_t @var{op})
@deftypefunx void mpf_trunc (mpf_t @var{rop}, mpf_t @var{op})
@cindex Rounding functions
@cindex Float rounding functions
Set @var{rop} to @var{op} rounded to an integer.
@code{mpf_ceil} rounds to the
next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc}
to the integer towards zero.
@end deftypefun

@deftypefun int mpf_integer_p (mpf_t @var{op})
Return non-zero if @var{op} is an integer.
@end deftypefun

@deftypefun int mpf_fits_ulong_p (mpf_t @var{op})
@deftypefunx int mpf_fits_slong_p (mpf_t @var{op})
@deftypefunx int mpf_fits_uint_p (mpf_t @var{op})
@deftypefunx int mpf_fits_sint_p (mpf_t @var{op})
@deftypefunx int mpf_fits_ushort_p (mpf_t @var{op})
@deftypefunx int mpf_fits_sshort_p (mpf_t @var{op})
Return non-zero if @var{op} would fit in the respective C data type, when
truncated to an integer.
@end deftypefun

@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, mp_bitcnt_t @var{nbits})
@cindex Random number functions
@cindex Float random number functions
Generate a uniformly distributed random float in @var{rop}, such that @math{0
@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa, or
fewer if the precision of @var{rop} is smaller.

The variable @var{state} must be initialized by calling one of the
@code{gmp_randinit} functions (@pxref{Random State Initialization}) before
invoking this function.
@end deftypefun

@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp})
Generate a random float of at most @var{max_size} limbs, with long strings of
zeros and ones in the binary representation. The exponent of the number is in
the interval @minus{}@var{exp} to @var{exp} (in limbs). This function is
useful for testing functions and algorithms, since this kind of random
number has proven to be more likely to trigger corner-case bugs. Negative
random numbers are generated when @var{max_size} is negative.
@end deftypefun

@c @deftypefun size_t mpf_size (mpf_t @var{op})
@c Return the size of @var{op} measured in number of limbs. If @var{op} is
@c zero, the returned value will be zero. (@xref{Nomenclature}, for an
@c explanation of the concept @dfn{limb}.)
@c
@c @strong{This function is obsolete. It will disappear from future GMP
@c releases.}
@c @end deftypefun


@node Low-level Functions, Random Number Functions, Floating-point Functions, Top
@comment node-name, next, previous, up
@chapter Low-level Functions
@cindex Low-level functions

This chapter describes low-level GMP functions, used to implement the
high-level GMP functions, but also intended for time-critical user code.

These functions start with the prefix @code{mpn_}.

@c 1. Some of these functions clobber input operands.
@c

The @code{mpn} functions are designed to be as fast as possible, @strong{not}
to provide a coherent calling interface. The different functions have somewhat
similar interfaces, but there are variations that make them hard to use. These
functions do as little as possible apart from the real multiple precision
computation, so that no time is spent on things that not all callers need.

A source operand is specified by a pointer to the least significant limb and a
limb count. A destination operand is specified by just a pointer. It is the
responsibility of the caller to ensure that the destination has enough space
for storing the result.

With this way of specifying operands, it is possible to perform computations on
subranges of an argument, and store the result into a subrange of a
destination.

A common requirement for all functions is that each source area needs at least
one limb. No size argument may be zero.
Unless otherwise stated, in-place
operations are allowed where source and destination are the same, but not where
they only partly overlap.

The @code{mpn} functions are the base for the implementation of the
@code{mpz_}, @code{mpf_}, and @code{mpq_} functions.

This example adds the number beginning at @var{s1p} and the number beginning at
@var{s2p} and writes the sum at @var{destp}. All areas have @var{n} limbs.

@example
cy = mpn_add_n (destp, s1p, s2p, n);
@end example

It should be noted that the @code{mpn} functions make no attempt to identify
high or low zero limbs on their operands, or other special forms. On random
data such cases will be unlikely and it'd be wasteful for every function to
check every time. An application knowing something about its data can take
steps to trim or perhaps split its calculations.
@c
@c For reference, within gmp mpz_t operands never have high zero limbs, and
@c we rate low zero limbs as unlikely too (or something an application should
@c handle). This is a prime motivation for not stripping zero limbs in say
@c mpn_mul_n etc.
@c
@c Other applications doing variable-length calculations will quite likely do
@c something similar to mpz. And even if not then it's highly likely zero
@c limb stripping can be done at just a few judicious points, which will be
@c more efficient than having lots of mpn functions checking every time.

@sp 1
@noindent
In the notation used below, a source operand is identified by the pointer to
the least significant limb, and the limb count in braces. For example,
@{@var{s1p}, @var{s1n}@}.

@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n}
least significant limbs of the result to @var{rp}.
Return carry, either 0 or
1.

This is the lowest-level function for addition. It is the preferred function
for addition, since it is written in assembly for most CPUs. For addition of
a variable to itself (i.e., @var{s1p} equals @var{s2p}) use @code{mpn_lshift}
with a count of 1 for optimal speed.
@end deftypefun

@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least
significant limbs of the result to @var{rp}. Return carry, either 0 or 1.
@end deftypefun

@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
@var{s1n} least significant limbs of the result to @var{rp}. Return carry,
either 0 or 1.

This function requires that @var{s1n} is greater than or equal to @var{s2n}.
@end deftypefun

@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the
@var{n} least significant limbs of the result to @var{rp}. Return borrow,
either 0 or 1.

This is the lowest-level function for subtraction. It is the preferred
function for subtraction, since it is written in assembly for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least
significant limbs of the result to @var{rp}. Return borrow, either 0 or 1.
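The carry and borrow conventions shared by the addition and subtraction
functions can be modelled in portable C. The sketch below is not GMP's
implementation (which is normally assembly). The names @code{limb_t},
@code{model_add_n} and @code{model_sub_n} are made up for illustration, and a
32-bit limb is assumed so that a 64-bit intermediate can hold the carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* illustrative stand-in for mp_limb_t */

/* Model of the mpn_add_n contract: write the n low limbs of s1 + s2
   to rp and return the carry out of the top limb (0 or 1). */
static limb_t model_add_n(limb_t *rp, const limb_t *s1p,
                          const limb_t *s2p, size_t n)
{
    assert(n >= 1);                    /* no size argument may be zero */
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {   /* least significant limb first */
        uint64_t t = (uint64_t)s1p[i] + s2p[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}

/* Model of the mpn_sub_n contract: write the n low limbs of s1 - s2
   to rp and return the borrow out of the top limb (0 or 1). */
static limb_t model_sub_n(limb_t *rp, const limb_t *s1p,
                          const limb_t *s2p, size_t n)
{
    assert(n >= 1);
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)s1p[i] - s2p[i] - borrow; /* wraps mod 2^64 */
        rp[i] = (limb_t)t;
        borrow = (limb_t)(t >> 63);    /* top bit set means it wrapped */
    }
    return borrow;
}
```

Each limb is read before the corresponding result limb is written, so the
model, like the functions it imitates, tolerates @var{rp} being identical to
either source.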
@end deftypefun

@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the
@var{s1n} least significant limbs of the result to @var{rp}. Return borrow,
either 0 or 1.

This function requires that @var{s1n} is greater than or equal to
@var{s2n}.
@end deftypefun

@deftypefun mp_limb_t mpn_neg (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Perform the negation of @{@var{sp}, @var{n}@}, and write the result to
@{@var{rp}, @var{n}@}. Return carry-out.
@end deftypefun

@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the
2*@var{n}-limb result to @var{rp}.

The destination has to have space for 2*@var{n} limbs, even if the product's
most significant limb is zero. No overlap is permitted between the
destination and either source.

If the two input operands are the same, use @code{mpn_sqr}.
@end deftypefun

@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
(@var{s1n}+@var{s2n})-limb result to @var{rp}. Return the most significant
limb of the result.

The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
product's most significant limb is zero. No overlap is permitted between the
destination and either source.

This function requires that @var{s1n} is greater than or equal to @var{s2n}.
@end deftypefun

@deftypefun void mpn_sqr (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
result to @var{rp}.

The destination has to have space for 2*@var{n} limbs, even if the result's
most significant limb is zero. No overlap is permitted between the
destination and the source.
@end deftypefun

@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least
significant limbs of the product to @var{rp}. Return the most significant
limb of the product. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are
allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}.

This is a low-level function that is a building block for general
multiplication as well as other operations in GMP@. It is written in assembly
for most CPUs.

Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift}
with a count equal to the logarithm of @var{s2limb} instead, for optimal speed.
@end deftypefun

@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least
significant limbs of the product to @{@var{rp}, @var{n}@} and write the result
to @var{rp}. Return the most significant limb of the product, plus carry-out
from the addition.

This is a low-level function that is a building block for general
multiplication as well as other operations in GMP@. It is written in assembly
for most CPUs.
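To illustrate the ``building block'' remark, the following portable sketch
models the documented behaviour of @code{mpn_addmul_1} and uses it for a
schoolbook multiplication. It is an assumption-laden model, not GMP's code:
@code{limb_t}, @code{model_addmul_1} and @code{model_mul} are invented names,
limbs are taken to be 32 bits wide, and GMP itself selects faster algorithms
for large operands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* illustrative stand-in for mp_limb_t */

/* Model of mpn_addmul_1: rp[0..n-1] += s1 * s2limb.  Returns the high
   limb of the product plus the carry out of the addition. */
static limb_t model_addmul_1(limb_t *rp, const limb_t *s1p,
                             size_t n, limb_t s2limb)
{
    assert(n >= 1);
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1, so t cannot overflow. */
        uint64_t t = (uint64_t)s1p[i] * s2limb + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}

/* Schoolbook product: one addmul_1 row per limb of the second operand.
   rp needs s1n + s2n limbs and must not overlap either source. */
static void model_mul(limb_t *rp, const limb_t *s1p, size_t s1n,
                      const limb_t *s2p, size_t s2n)
{
    assert(s1n >= s2n && s2n >= 1);
    for (size_t i = 0; i < s1n + s2n; i++)
        rp[i] = 0;
    for (size_t j = 0; j < s2n; j++)
        rp[j + s1n] = model_addmul_1(rp + j, s1p, s1n, s2p[j]);
}
```

Row @var{j} accumulates into limbs @var{j} through @var{j}+@var{s1n}@minus{}1
and stores its returned limb one position higher, which is still zero at that
point. This is why the destination needs @var{s1n} + @var{s2n} limbs.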
@end deftypefun

@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n}
least significant limbs of the product from @{@var{rp}, @var{n}@} and write the
result to @var{rp}. Return the most significant limb of the product, plus
borrow-out from the subtraction.

This is a low-level function that is a building block for general
multiplication and division as well as other operations in GMP@. It is written
in assembly for most CPUs.
@end deftypefun

@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
@var{dn}@}. The quotient is rounded towards 0.

No overlap is permitted between arguments, except that @var{np} may equal
@var{rp}. The dividend size @var{nn} must be greater than or equal to divisor
size @var{dn}. The most significant limb of the divisor must be non-zero. The
@var{qxn} operand must be zero.
@end deftypefun

@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
performance.]

Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the
quotient at @var{r1p}, with the exception of the most significant limb, which
is returned. The remainder replaces the dividend at @var{rs2p}; it will be
@var{s3n} limbs long (i.e., as many limbs as the divisor).

In addition to an integer quotient, @var{qxn} fraction limbs are developed, and
stored after the integral limbs. For most usages, @var{qxn} will be zero.

It is required that @var{rs2n} is greater than or equal to @var{s3n}. It is
required that the most significant bit of the divisor is set.

If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}. Aside
from that special case, no overlap between arguments is permitted.

Return the most significant limb of the quotient, either 0 or 1.

The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn}
limbs large.
@end deftypefun

@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb})
@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}})
Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at
@var{r1p}. Return the remainder.

The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in
addition @var{qxn} fraction limbs are developed and written to @{@var{r1p},
@var{qxn}@}. Either or both @var{s2n} and @var{qxn} can be zero. For most
usages, @var{qxn} will be zero.

@code{mpn_divmod_1} exists for upward source compatibility and is simply a
macro calling @code{mpn_divrem_1} with a @var{qxn} of 0.

The areas at @var{r1p} and @var{s2p} have to be identical or completely
separate, not partially overlapping.
@end deftypefn

@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n})
[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best
performance.]
@end deftypefun

@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}})
@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry})
Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing
the result to @{@var{rp}, @var{n}@}. If 3 divides exactly, the return value is
zero and the result is the quotient. If not, the return value is non-zero and
the result won't be anything useful.

@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the
return value from a previous call, so a large calculation can be done piece by
piece from low to high. @code{mpn_divexact_by3} is simply a macro calling
@code{mpn_divexact_by3c} with a 0 carry parameter.

These routines use a multiply-by-inverse and will be faster than
@code{mpn_divrem_1} on CPUs with fast multiplication but slow division.

The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i},
and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where
@m{b=2\GMPraise{@code{GMP\_NUMB\_BITS}}, b=2^GMP_NUMB_BITS}. The
return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also
be 0, 1 or 2 (these are both borrows really). When @math{c=0} clearly
@math{q=(a-i)/3}. When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{}
3} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when
@code{mp_bits_per_limb} is even, which is always so currently).
@end deftypefn

@deftypefun mp_limb_t mpn_mod_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb})
Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder.
@var{s1n} can be zero.
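Division by a single limb runs from the most significant limb downwards,
carrying the running remainder into the next step. The sketch below models
that with assumed 32-bit limbs and a 64-bit intermediate. The name
@code{model_divrem_1} and its optional quotient pointer are inventions for
this illustration and do not correspond to a GMP interface, and fraction limbs
(@var{qxn}) are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* illustrative stand-in for mp_limb_t */

/* Divide {sp, n} by the single limb d and return the remainder.
   If qp is not NULL the n quotient limbs are written there; passing
   NULL gives the remainder-only behaviour described for mpn_mod_1.
   n may be zero, in which case the remainder is zero. */
static limb_t model_divrem_1(limb_t *qp, const limb_t *sp,
                             size_t n, limb_t d)
{
    assert(d != 0);
    uint64_t rem = 0;
    for (size_t i = n; i-- > 0; ) {          /* most significant first */
        uint64_t cur = (rem << 32) | sp[i];  /* rem < d, so cur < d * 2^32 */
        if (qp != NULL)
            qp[i] = (limb_t)(cur / d);       /* quotient fits in one limb */
        rem = cur % d;
    }
    return (limb_t)rem;
}
```

Because each source limb is read before the matching quotient limb is written,
the model works when the quotient and source areas are identical, in line with
the overlap rule stated for @code{mpn_divrem_1}.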
@end deftypefun

@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to
@{@var{rp}, @var{n}@}. The bits shifted out at the left are returned in the
least significant @var{count} bits of the return value (the rest of the return
value is zero).

@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
@math{@var{rp} @ge{} @var{sp}}.

This function is written in assembly for most CPUs.
@end deftypefun

@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count})
Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to
@{@var{rp}, @var{n}@}. The bits shifted out at the right are returned in the
most significant @var{count} bits of the return value (the rest of the return
value is zero).

@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The
regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided
@math{@var{rp} @le{} @var{sp}}.

This function is written in assembly for most CPUs.
@end deftypefun

@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a
positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a
negative value if @math{@var{s1} < @var{s2}}.
@end deftypefun

@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn})
Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{xp},
@var{xn}@} and @{@var{yp}, @var{yn}@}.
The result can be up to @var{yn} limbs;
the return value is the actual number produced. Both source operands are
destroyed.

@{@var{xp}, @var{xn}@} must have at least as many bits as @{@var{yp},
@var{yn}@}. @{@var{yp}, @var{yn}@} must be odd. Both operands must have
non-zero most significant limbs. No overlap is permitted between @{@var{xp},
@var{xn}@} and @{@var{yp}, @var{yn}@}.
@end deftypefun

@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t @var{ylimb})
Return the greatest common divisor of @{@var{xp}, @var{xn}@} and @var{ylimb}.
Both operands must be non-zero.
@end deftypefun

@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{gp}, mp_limb_t *@var{sp}, mp_size_t *@var{sn}, mp_limb_t *@var{xp}, mp_size_t @var{xn}, mp_limb_t *@var{yp}, mp_size_t @var{yn})
Let @m{U,@var{U}} be defined by @{@var{xp}, @var{xn}@} and let @m{V,@var{V}} be
defined by @{@var{yp}, @var{yn}@}.

Compute the greatest common divisor @math{G} of @math{U} and @math{V}. Compute
a cofactor @math{S} such that @math{G = US + VT}. The second cofactor @var{T}
is not computed but can easily be obtained from @m{(G - US) / V, (@var{G} -
@var{U}*@var{S}) / @var{V}} (the division will be exact). It is required that
@math{U @ge{} V > 0}.

@math{S} satisfies @math{S = 1} or @math{@GMPabs{S} < V / (2 G)}. @math{S =
0} if and only if @math{V} divides @math{U} (i.e., @math{G = V}).

Store @math{G} at @var{gp} and let the return value define its limb count.
Store @math{S} at @var{sp} and let |*@var{sn}| define its limb count. @math{S}
can be negative; when this happens *@var{sn} will be negative. The areas at
@var{gp} and @var{sp} should each have room for @math{@var{xn}+1} limbs.
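
The relation @math{G = US + VT}, and the recovery of @math{T} from @math{G},
@math{U}, @math{S} and @math{V}, can be checked on machine integers. The
following is a minimal sketch using the textbook extended Euclidean algorithm;
@code{toy_gcdext} and @code{toy_cofactor_t} are hypothetical helpers, not GMP
functions, and the sketch makes no attempt to reproduce the normalisation of
@math{S} described above.

```c
#include <assert.h>

/* Hypothetical helper on machine integers: returns G = gcd(u, v) and
   stores a cofactor S such that G = u*S + v*T for some integer T.
   Requires u >= v > 0, the same precondition as mpn_gcdext. */
long long toy_gcdext(long long u, long long v, long long *s)
{
    long long r0 = u, r1 = v;   /* remainder sequence */
    long long s0 = 1, s1 = 0;   /* matching coefficients of u */
    assert(u >= v && v > 0);
    while (r1 != 0) {
        long long q = r0 / r1;
        long long r2 = r0 - q * r1;
        long long s2 = s0 - q * s1;
        r0 = r1; r1 = r2;
        s0 = s1; s1 = s2;
    }
    *s = s0;
    return r0;
}

/* Recover the second cofactor T = (G - U*S) / V; the division is exact. */
long long toy_cofactor_t(long long g, long long u, long long s, long long v)
{
    return (g - u * s) / v;
}
```

For @math{U = 240} and @math{V = 46} this yields @math{G = 2}, @math{S = -9}
and @math{T = 47}. For @math{U = 12} and @math{V = 4}, where @math{V} divides
@math{U}, it yields @math{S = 0}.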

The areas @{@var{xp}, @math{@var{xn}+1}@} and @{@var{yp}, @math{@var{yn}+1}@}
are destroyed (i.e.@: the input operands plus an extra limb past the end of
each).

Compatibility note: GMP 4.3.0 and 4.3.1 defined @math{S} less strictly.
Earlier as well as later GMP releases define @math{S} as described here.
@end deftypefun

@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Compute the square root of @{@var{sp}, @var{n}@} and put the result at
@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p},
@var{retval}@}. @var{r2p} needs space for @var{n} limbs, but the return value
indicates how many are produced.

The most significant limb of @{@var{sp}, @var{n}@} must be non-zero. The
areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must
be completely separate. The areas @{@var{r2p}, @var{n}@} and @{@var{sp},
@var{n}@} must be either identical or completely separate.

If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this
case the return value is zero or non-zero according to whether the remainder
would have been zero or non-zero.

A return value of zero indicates a perfect square. See also
@code{mpz_perfect_square_p}.
@end deftypefun

@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n})
Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in
base @var{base}, and return the number of characters produced. There may be
leading zeros in the string. The string is not in ASCII; to convert it to
printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on
the base and range. @var{base} can vary from 2 to 256.
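
The conversion from raw digit values to printable characters might look like
the following sketch. @code{toy_digits_to_ascii} is a hypothetical helper, not
part of GMP, and it assumes a base of at most 36 so that every digit has a
conventional character (@samp{0} to @samp{9}, then @samp{A} to @samp{Z}).

```c
#include <assert.h>
#include <stddef.h>

/* Turn n raw digit values (each in 0..base-1) into printable characters
   in place, for 2 <= base <= 36.  Values 0-9 map to '0'-'9' and values
   10-35 map to 'A'-'Z'.  The caller supplies room for the terminator
   at str[n]. */
void toy_digits_to_ascii(unsigned char *str, size_t n, int base)
{
    assert(base >= 2 && base <= 36);
    for (size_t i = 0; i < n; i++) {
        unsigned char d = str[i];
        assert(d < base);
        str[i] = (unsigned char) (d < 10 ? '0' + d : 'A' + (d - 10));
    }
    str[n] = '\0';
}
```

For instance the raw base-16 digits 1, 10, 15 become the string @samp{1AF}.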

The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be
non-zero. The input @{@var{s1p}, @var{s1n}@} is clobbered, except when
@var{base} is a power of 2, in which case it's unchanged.

The area at @var{str} has to have space for the largest possible number
represented by an array of @var{s1n} limbs, plus one extra character.
@end deftypefun

@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base})
Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at
@var{rp}.

@math{@var{str}[0]} is the most significant byte and
@math{@var{str}[@var{strsize}-1]} is the least significant. Each byte should
be a value in the range 0 to @math{@var{base}-1}, not an ASCII character.
@var{base} can vary from 2 to 256.

The return value is the number of limbs written to @var{rp}. If the most
significant input byte is non-zero then the high limb at @var{rp} will be
non-zero, and only that exact number of limbs will be required there.

If the most significant input byte is zero then there may be high zero limbs
written to @var{rp} and included in the return value.

@var{strsize} must be at least 1, and no overlap is permitted between
@{@var{str},@var{strsize}@} and the result at @var{rp}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_scan0 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
Scan @var{s1p} from bit position @var{bit} for the next clear bit.

It is required that there be a clear bit within the area at @var{s1p} at or
beyond bit position @var{bit}, so that the function has something to return.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_scan1 (const mp_limb_t *@var{s1p}, mp_bitcnt_t @var{bit})
Scan @var{s1p} from bit position @var{bit} for the next set bit.
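
A plain C model of this kind of scan is sketched below. It assumes a
hypothetical 32-bit limb type @code{toy_limb_t} with no nails, and
@code{toy_scan1} is not a GMP function. Like @code{mpn_scan1} it takes no
length argument, so it relies on the caller to guarantee that a set bit
exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_limb_t;   /* hypothetical 32-bit limb, no nails */

/* Index of the first set bit at or above position bit in the
   little-endian limb array sp.  There is no length argument, so the
   caller must guarantee that such a bit exists. */
uint64_t toy_scan1(const toy_limb_t *sp, uint64_t bit)
{
    assert(sp != NULL);
    size_t i = (size_t) (bit / 32);
    /* Discard the bits below the starting position in the first limb. */
    toy_limb_t limb = sp[i] & (~(toy_limb_t) 0 << (bit % 32));
    while (limb == 0)
        limb = sp[++i];
    uint64_t pos = (uint64_t) i * 32;
    while ((limb & 1) == 0) {
        limb >>= 1;
        pos++;
    }
    return pos;
}
```

With the limbs @code{@{0, 0x10@}} the only set bit is bit 36, so a scan
starting anywhere from bit 0 to bit 36 returns 36.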

It is required that there be a set bit within the area at @var{s1p} at or
beyond bit position @var{bit}, so that the function has something to return.
@end deftypefun

@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n})
Generate a random number of length @var{r1n} and store it at @var{r1p}. The
most significant limb is always non-zero. @code{mpn_random} generates
uniformly distributed limb data; @code{mpn_random2} generates long strings of
zeros and ones in the binary representation.

@code{mpn_random2} is intended for testing the correctness of the @code{mpn}
routines.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Count the number of set bits in @{@var{s1p}, @var{n}@}.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Compute the Hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p},
@var{n}@}, which is the number of bit positions where the two operands have
different bit values.
@end deftypefun

@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square.
The most significant limb of the input @{@var{s1p}, @var{n}@} must be
non-zero.
@end deftypefun

@deftypefun void mpn_and_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
@var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_ior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_xor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_andn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and the bitwise
complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_iorn_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and the bitwise
complement of @{@var{s2p}, @var{n}@}, and write the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_nand_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical and of @{@var{s1p}, @var{n}@} and @{@var{s2p},
@var{n}@}, and write the bitwise complement of the result to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_nior_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical inclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
@{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_xnor_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n})
Perform the bitwise logical exclusive or of @{@var{s1p}, @var{n}@} and
@{@var{s2p}, @var{n}@}, and write the bitwise complement of the result to
@{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_com (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n})
Perform the bitwise complement of @{@var{sp}, @var{n}@}, and write the result
to @{@var{rp}, @var{n}@}.
@end deftypefun

@deftypefun void mpn_copyi (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, increasingly.
@end deftypefun

@deftypefun void mpn_copyd (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
Copy from @{@var{s1p}, @var{n}@} to @{@var{rp}, @var{n}@}, decreasingly.
@end deftypefun

@deftypefun void mpn_zero (mp_limb_t *@var{rp}, mp_size_t @var{n})
Zero @{@var{rp}, @var{n}@}.
@end deftypefun

@sp 1
@section Nails
@cindex Nails

@strong{Everything in this section is highly experimental and may disappear or
be subject to incompatible changes in a future version of GMP.}

Nails are an experimental feature whereby a few bits are left unused at the
top of each @code{mp_limb_t}. This can significantly improve carry handling
on some processors.

All the @code{mpn} functions accepting limb data will expect the nail bits to
be zero on entry, and will return data with the nails similarly all zero.
This applies both to limb vectors and to single limb arguments.

Nails can be enabled by configuring with @samp{--enable-nails}. By default
the number of bits will be chosen according to what suits the host processor,
but a particular number can be selected with @samp{--enable-nails=N}.

At the mpn level, a nail build is neither source nor binary compatible with a
non-nail build, strictly speaking. But programs acting on limbs only through
the mpn functions are likely to work equally well with either build, and
judicious use of the definitions below should make any program compatible with
either build, at the source level.

For the higher level routines, meaning @code{mpz} etc, a nail build should be
fully source and binary compatible with a non-nail build.

@defmac GMP_NAIL_BITS
@defmacx GMP_NUMB_BITS
@defmacx GMP_LIMB_BITS
@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in
use. @code{GMP_NUMB_BITS} is the number of data bits in a limb.
@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}. In
all cases

@example
GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS
@end example
@end defmac

@defmac GMP_NAIL_MASK
@defmacx GMP_NUMB_MASK
Bit masks for the nail and number parts of a limb. @code{GMP_NAIL_MASK} is 0
when nails are not in use.

@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained
with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which
can help various RISC chips.
@end defmac

@defmac GMP_NUMB_MAX
The maximum value that can be stored in the number part of a limb. This is
the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing
comparisons rather than bit-wise operations.
@end defmac

The term ``nails'' comes from finger or toe nails, which are at the ends of a
limb (arm or leg). ``numb'' is short for number, but is also how the
developers felt after trying for a long time to come up with sensible names
for these things.
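
How the nail definitions above fit together, and why spare top bits can
simplify carry handling, can be seen in a small model. The configuration here
(64-bit limbs with 4 nail bits) is an assumption chosen for illustration, and
the @code{TOY_} names merely mirror the @code{GMP_} macros; none of this is
GMP's own code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical configuration: 64-bit limbs with 4 nail bits.  The TOY_
   names mirror the GMP_ macros described above but are not GMP's. */
typedef uint64_t toy_limb_t;
#define TOY_LIMB_BITS 64
#define TOY_NAIL_BITS 4
#define TOY_NUMB_BITS (TOY_LIMB_BITS - TOY_NAIL_BITS)
#define TOY_NUMB_MASK ((~(toy_limb_t) 0) >> TOY_NAIL_BITS)
#define TOY_NUMB_MAX  TOY_NUMB_MASK

/* Add two limbs whose nails are zero.  The sum cannot overflow the
   64-bit word, so the carry simply appears in the nail part. */
toy_limb_t toy_add_limbs(toy_limb_t x, toy_limb_t y, toy_limb_t *carry)
{
    assert((x >> TOY_NUMB_BITS) == 0 && (y >> TOY_NUMB_BITS) == 0);
    toy_limb_t sum = x + y;
    *carry = sum >> TOY_NUMB_BITS;   /* nail part, no mask constant needed */
    return sum & TOY_NUMB_MASK;      /* number part, nails cleared again */
}
```

Adding 1 to @code{TOY_NUMB_MAX} gives a number part of 0 and a carry of 1,
without any overflow test on the machine word.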

In the future (the distant future most likely) a non-zero nail might be
permitted, giving non-unique representations for numbers in a limb vector.
This would help vector processors since carries would only ever need to
propagate one or two limbs.


@node Random Number Functions, Formatted Output, Low-level Functions, Top
@chapter Random Number Functions
@cindex Random number functions

Sequences of pseudo-random numbers in GMP are generated using a variable of
type @code{gmp_randstate_t}, which holds an algorithm selection and a current
state. Such a variable must be initialized by a call to one of the
@code{gmp_randinit} functions, and can be seeded with one of the
@code{gmp_randseed} functions.

The functions actually generating random numbers are described in @ref{Integer
Random Numbers}, and @ref{Miscellaneous Float Functions}.

The older style random number functions don't accept a @code{gmp_randstate_t}
parameter but instead share a global variable of that type. They use a
default algorithm and are currently not seeded (though perhaps that will
change in the future). The new functions accepting a @code{gmp_randstate_t}
are recommended for applications that care about randomness.

@menu
* Random State Initialization::
* Random State Seeding::
* Random State Miscellaneous::
@end menu

@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions
@section Random State Initialization
@cindex Random number state
@cindex Initialization functions

@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state})
Initialize @var{state} with a default algorithm. This will be a compromise
between speed and randomness, and is recommended for applications with no
special requirements. Currently this is @code{gmp_randinit_mt}.
@end deftypefun

@deftypefun void gmp_randinit_mt (gmp_randstate_t @var{state})
@cindex Mersenne twister random numbers
Initialize @var{state} for a Mersenne Twister algorithm. This algorithm is
fast and has good randomness properties.
@end deftypefun

@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, mpz_t @var{a}, @w{unsigned long @var{c}}, @w{mp_bitcnt_t @var{m2exp}})
@cindex Linear congruential random numbers
Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X +
@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}.

The low bits of @math{X} in this algorithm are not very random. The least
significant bit will have a period no more than 2, and the second bit no more
than 4, etc. For this reason only the high half of each @math{X} is actually
used.

When a random number of more than @math{@var{m2exp}/2} bits is to be
generated, multiple iterations of the recurrence are used and the results
concatenated.
@end deftypefun

@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, mp_bitcnt_t @var{size})
@cindex Linear congruential random numbers
Initialize @var{state} for a linear congruential algorithm as per
@code{gmp_randinit_lc_2exp}. @var{a}, @var{c} and @var{m2exp} are selected
from a table, chosen so that @var{size} bits (or more) of each @math{X} will
be used, i.e.@: @math{@var{m2exp}/2 @ge{} @var{size}}.

If successful the return value is non-zero. If @var{size} is bigger than the
table data provides then the return value is zero. The maximum @var{size}
currently supported is 128.
@end deftypefun

@deftypefun void gmp_randinit_set (gmp_randstate_t @var{rop}, gmp_randstate_t @var{op})
Initialize @var{rop} with a copy of the algorithm and state from @var{op}.
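
What copying ``the algorithm and state'' implies, together with the low-bit
weakness noted under @code{gmp_randinit_lc_2exp}, can be illustrated with a
toy linear congruential generator in plain C. Everything here is an assumption
made for the sketch: @code{struct toy_lcg} and its functions are hypothetical,
the modulus is fixed at @math{2^32}, and none of it is GMP's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy generator X = (a*X + c) mod 2^32.  Any constants a
   caller supplies are illustrative, not GMP's table data. */
struct toy_lcg {
    uint32_t a, c, x;
};

/* Advance one step and return the full 32-bit state X. */
uint32_t toy_lcg_step(struct toy_lcg *g)
{
    assert(g != NULL);
    g->x = g->a * g->x + g->c;   /* unsigned arithmetic wraps mod 2^32 */
    return g->x;
}

/* Advance one step and return only the high half of X, the part that
   a generator of this kind would actually hand out. */
uint32_t toy_lcg_next16(struct toy_lcg *g)
{
    return toy_lcg_step(g) >> 16;
}
```

A structure assignment such as @code{struct toy_lcg h = g;} plays the role of
the copy: from that point @code{g} and @code{h} produce identical sequences.
With odd @code{a} and @code{c}, for example 1664525 and 1013904223, the lowest
bit of successive states simply alternates, which is why only the high half is
returned by @code{toy_lcg_next16}.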
@end deftypefun

@c Although gmp_randinit, gmp_errno and related constants are obsolete, we
@c still put @findex entries for them, since they're still documented and
@c someone might be looking them up when perusing old application code.

@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, @dots{})
@strong{This function is obsolete.}

@findex GMP_RAND_ALG_LC
@findex GMP_RAND_ALG_DEFAULT
Initialize @var{state} with an algorithm selected by @var{alg}. The only
choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size}
described above. A third parameter of type @code{unsigned long} is required;
this is the @var{size} for that function. @code{GMP_RAND_ALG_DEFAULT} or 0
are the same as @code{GMP_RAND_ALG_LC}.

@c For reference, this is the only place gmp_errno has been documented, and
@c due to being non thread safe we won't be adding to its uses.
@findex gmp_errno
@findex GMP_ERROR_UNSUPPORTED_ARGUMENT
@findex GMP_ERROR_INVALID_ARGUMENT
@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to
indicate an error. @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is
unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter
is too big. It may be noted this error reporting is not thread safe (a good
reason to use @code{gmp_randinit_lc_2exp_size} instead).
@end deftypefun

@deftypefun void gmp_randclear (gmp_randstate_t @var{state})
Free all memory occupied by @var{state}.
@end deftypefun


@node Random State Seeding, Random State Miscellaneous, Random State Initialization, Random Number Functions
@section Random State Seeding
@cindex Random number seeding
@cindex Seeding random numbers

@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, mpz_t @var{seed})
@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}})
Set an initial seed value into @var{state}.

The size of a seed determines how many different sequences of random numbers
it's possible to generate. The ``quality'' of the seed is the randomness
of a given seed compared to the previous seed used, and this affects the
randomness of separate number sequences. The method for choosing a seed is
critical if the generated numbers are to be used for important applications,
such as generating cryptographic keys.

Traditionally the system time has been used to seed, but care needs to be
taken with this. If an application seeds often and the resolution of the
system clock is low, then the same sequence of numbers might be repeated.
Also, the system time is quite easy to guess, so if unpredictability is
required then it should definitely not be the only source for the seed value.
On some systems there's a special device @file{/dev/random} which provides
random data better suited for use as a seed.
@end deftypefun


@node Random State Miscellaneous, , Random State Seeding, Random Number Functions
@section Random State Miscellaneous

@deftypefun {unsigned long} gmp_urandomb_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
Return a uniformly distributed random number of @var{n} bits, i.e.@: in the
range 0 to @m{2^n-1,2^@var{n}-1} inclusive. @var{n} must be less than or
equal to the number of bits in an @code{unsigned long}.
@end deftypefun

@deftypefun {unsigned long} gmp_urandomm_ui (gmp_randstate_t @var{state}, unsigned long @var{n})
Return a uniformly distributed random number in the range 0 to
@math{@var{n}-1}, inclusive.
@end deftypefun


@node Formatted Output, Formatted Input, Random Number Functions, Top
@chapter Formatted Output
@cindex Formatted output
@cindex @code{printf} formatted output

@menu
* Formatted Output Strings::
* Formatted Output Functions::
* C++ Formatted Output::
@end menu

@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output
@section Format Strings

@code{gmp_printf} and friends accept format strings similar to the standard C
@code{printf} (@pxref{Formatted Output,, Formatted Output, libc, The GNU C
Library Reference Manual}). A format specification is of the form

@example
% [flags] [width] [.[precision]] [type] conv
@end example

GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
and @code{mpf_t} respectively, @samp{M} for @code{mp_limb_t}, and @samp{N} for
an @code{mp_limb_t} array. @samp{Z}, @samp{Q}, @samp{M} and @samp{N} behave
like integers. @samp{Q} will print a @samp{/} and a denominator, if needed.
@samp{F} behaves like a float. For example,

@example
mpz_t z;
gmp_printf ("%s is an mpz %Zd\n", "here", z);

mpq_t q;
gmp_printf ("a hex rational: %#40Qx\n", q);

mpf_t f;
int n;
gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n);

mp_limb_t l;
gmp_printf ("limb %Mu\n", l);

const mp_limb_t *ptr;
mp_size_t size;
gmp_printf ("limb array %Nx\n", ptr, size);
@end example

For @samp{N} the limbs are expected least significant first, as per the
@code{mpn} functions (@pxref{Low-level Functions}).
A negative size can be
given to print the value as a negative.

All the standard C @code{printf} types behave the same as the C library
@code{printf}, and can be freely intermixed with the GMP extensions. In the
current implementation the standard parts of the format string are simply
handed to @code{printf} and only the GMP extensions handled directly.

The flags accepted are as follows. GLIBC style @nisamp{'} is only for the
standard C types (not the GMP types), and only if the C library supports it.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{0} @tab pad with zeros (rather than spaces)
@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0}
@item @nicode{+} @tab always show a sign
@item (space) @tab show a space or a @samp{-} sign
@item @nicode{'} @tab group digits, GLIBC style (not GMP types)
@end multitable
@end quotation

The optional width and precision can be given as a number within the format
string, or as a @samp{*} to take an extra parameter of type @code{int}, the
same as the standard @code{printf}.

The standard types accepted are as follows. @samp{h} and @samp{l} are
portable, the rest will depend on the compiler (or include files) for the type
and the C library for the output.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{h} @tab @nicode{short}
@item @nicode{hh} @tab @nicode{char}
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
@item @nicode{l} @tab @nicode{long} or @nicode{wchar_t}
@item @nicode{ll} @tab @nicode{long long}
@item @nicode{L} @tab @nicode{long double}
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
@item @nicode{t} @tab @nicode{ptrdiff_t}
@item @nicode{z} @tab @nicode{size_t}
@end multitable
@end quotation

@noindent
The GMP types are

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
@item @nicode{M} @tab @nicode{mp_limb_t}, integer conversions
@item @nicode{N} @tab @nicode{mp_limb_t} array, integer conversions
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
@end multitable
@end quotation

The conversions accepted are as follows. @samp{a} and @samp{A} are always
supported for @code{mpf_t} but depend on the C library for standard C float
types. @samp{m} and @samp{p} depend on the C library.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{a} @nicode{A} @tab hex floats, C99 style
@item @nicode{c} @tab character
@item @nicode{d} @tab decimal integer
@item @nicode{e} @nicode{E} @tab scientific format float
@item @nicode{f} @tab fixed point float
@item @nicode{i} @tab same as @nicode{d}
@item @nicode{g} @nicode{G} @tab fixed or scientific float
@item @nicode{m} @tab @code{strerror} string, GLIBC style
@item @nicode{n} @tab store characters written so far
@item @nicode{o} @tab octal integer
@item @nicode{p} @tab pointer
@item @nicode{s} @tab string
@item @nicode{u} @tab unsigned integer
@item @nicode{x} @nicode{X} @tab hex integer
@end multitable
@end quotation

@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for
types @samp{Z}, @samp{Q} and @samp{N} they are signed. @samp{u} is not
meaningful for @samp{Z}, @samp{Q} and @samp{N}.

@samp{M} is a proxy for the C library @samp{l} or @samp{L}, according to the
size of @code{mp_limb_t}. Unsigned conversions will be usual, but a signed
conversion can be used and will interpret the value as a two's complement
negative.

@samp{n} can be used with any type, even the GMP types.

Other types or conversions that might be accepted by the C library
@code{printf} cannot be used through @code{gmp_printf}, this includes for
instance extensions registered with GLIBC @code{register_printf_function}.
Also currently there's no support for POSIX @samp{$} style numbered arguments
(perhaps this will be added in the future).

The precision field has its usual meaning for integer @samp{Z} and float
@samp{F} types, but is currently undefined for @samp{Q} and should not be used
with that.

@code{mpf_t} conversions only ever generate as many digits as can be
accurately represented by the operand, the same as @code{mpf_get_str} does.
Zeros will be used if necessary to pad to the requested precision. This
happens even for an @samp{f} conversion of an @code{mpf_t} which is an
integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits
precision will only produce about 40 digits, then pad with zeros to the
decimal point. An empty precision field like @samp{%.Fe} or @samp{%.Ff} can
be used to specifically request just the significant digits.

The decimal point character (or string) is taken from the current locale
settings on systems which provide @code{localeconv} (@pxref{Locales,, Locales
and Internationalization, libc, The GNU C Library Reference Manual}). The C
library will normally do the same for standard float output.

The format string is only interpreted as plain @code{char}s, multibyte
characters are not recognised. Perhaps this will change in the future.


@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output
@section Functions
@cindex Output functions

Each of the following functions is similar to the corresponding C library
function. The basic @code{printf} forms take a variable argument list. The
@code{vprintf} forms take an argument pointer, see @ref{Variadic Functions,,
Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
va_start}.

It should be emphasised that if a format string is invalid, or the arguments
don't match what the format specifies, then the behaviour of any of these
functions will be unpredictable. GCC format string checking is not available,
since it doesn't recognise the GMP extensions.

The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return
@math{-1} to indicate a write error.
Output is not ``atomic'', so partial
output may be produced if a write error occurs. All the functions can return
@math{-1} if the C library @code{printf} variant in use returns @math{-1}, but
this shouldn't normally occur.

@deftypefun int gmp_printf (const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap})
Print to the standard output @code{stdout}. Return the number of characters
written, or @math{-1} if an error occurred.
@end deftypefun

@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
Print to the stream @var{fp}. Return the number of characters written, or
@math{-1} if an error occurred.
@end deftypefun

@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}. Return the number of characters
written, excluding the terminating null.

No overlap is permitted between the space at @var{buf} and the string
@var{fmt}.

These functions are not recommended, since there's no protection against
exceeding the space available at @var{buf}.
@end deftypefun

@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in @var{buf}. No more than @var{size} bytes
will be written. To get the full output, @var{size} must be enough for the
string and null-terminator.

The return value is the total number of characters which ought to have been
produced, excluding the terminating null.
If @math{@var{retval} @ge{}
@var{size}} then the actual output has been truncated to the first
@math{@var{size}-1} characters, and a null appended.

No overlap is permitted between the region @{@var{buf},@var{size}@} and the
@var{fmt} string.

Notice the return value is in ISO C99 @code{snprintf} style. This is so even
if the C library @code{vsnprintf} is the older GLIBC 2.0.x style.
@end deftypefun

@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap})
Form a null-terminated string in a block of memory obtained from the current
memory allocation function (@pxref{Custom Allocation}). The block will be the
size of the string and null-terminator. The address of the block is stored to
*@var{pp}. The return value is the number of characters produced, excluding
the null-terminator.

Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return
@math{-1} if there's no more memory available, it lets the current allocation
function handle that.
@end deftypefun

@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap})
@cindex @code{obstack} output
Append to the current object in @var{ob}. The return value is the number of
characters written. A null-terminator is not written.

@var{fmt} cannot be within the current object in @var{ob}, since that object
might move as it grows.

These functions are available only when the C library provides the obstack
feature, which probably means only on GNU systems, see @ref{Obstacks,,
Obstacks, libc, The GNU C Library Reference Manual}.
@end deftypefun


@node C++ Formatted Output, , Formatted Output Functions, Formatted Output
@section C++ Formatted Output
@cindex C++ @code{ostream} output
@cindex @code{ostream} output

The following functions are provided in @file{libgmpxx} (@pxref{Headers and
Libraries}), which is built if C++ support is enabled (@pxref{Build Options}).
Prototypes are available from @code{<gmp.h>}.

@deftypefun ostream& operator<< (ostream& @var{stream}, mpz_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

In hex or octal, @var{op} is printed as a signed number, the same as for
decimal. This is unlike the standard @code{operator<<} routines on @code{int}
etc., which instead give two's complement.
@end deftypefun

@deftypefun ostream& operator<< (ostream& @var{stream}, mpq_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

Output will be a fraction like @samp{5/9}, or if the denominator is 1 then
just a plain integer like @samp{123}.

In hex or octal, @var{op} is printed as a signed value, the same as for
decimal. If @code{ios::showbase} is set then a base indicator is shown on
both the numerator and denominator (if the denominator is required).
@end deftypefun

@deftypefun ostream& operator<< (ostream& @var{stream}, mpf_t @var{op})
Print @var{op} to @var{stream}, using its @code{ios} formatting settings.
@code{ios::width} is reset to 0 after output, the same as the standard
@code{ostream operator<<} routines do.

The decimal point follows the standard library float @code{operator<<}, which
on recent systems means the @code{std::locale} imbued on @var{stream}.

Hex and octal are supported, unlike the standard @code{operator<<} on
@code{double}. The mantissa will be in hex or octal, the exponent will be in
decimal. For hex the exponent delimiter is an @samp{@@}. This is as per
@code{mpf_out_str}.

@code{ios::showbase} is supported, and will put a base on the mantissa, for
example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}.
This last form is slightly strange, but at least differentiates itself from
decimal.
@end deftypefun

These operators mean that GMP types can be printed in the usual C++ way, for
example,

@example
mpz_t z;
int n;
...
cout << "iteration " << n << " value " << z << "\n";
@end example

But note that @code{ostream} output (and @code{istream} input, @pxref{C++
Formatted Input}) is the only overloading available for the GMP types and that
for instance using @code{+} with an @code{mpz_t} will have unpredictable
results. For classes with overloading, see @ref{C++ Class Interface}.


@node Formatted Input, C++ Class Interface, Formatted Output, Top
@chapter Formatted Input
@cindex Formatted input
@cindex @code{scanf} formatted input

@menu
* Formatted Input Strings::
* Formatted Input Functions::
* C++ Formatted Input::
@end menu


@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input
@section Formatted Input Strings

@code{gmp_scanf} and friends accept format strings similar to the standard C
@code{scanf} (@pxref{Formatted Input,, Formatted Input, libc, The GNU C
Library Reference Manual}).
A format specification is of the form

@example
% [flags] [width] [type] conv
@end example

GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t}
and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers.
@samp{Q} will read a @samp{/} and a denominator, if present. @samp{F} behaves
like a float.

GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since
they're already ``call-by-reference''. For example,

@example
/* to read say "a(5) = 1234" */
int n;
mpz_t z;
gmp_scanf ("a(%d) = %Zd\n", &n, z);

mpq_t q1, q2;
gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2);

/* to read say "topleft (1.55,-2.66)" */
mpf_t x, y;
char buf[32];
gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y);
@end example

All the standard C @code{scanf} types behave the same as in the C library
@code{scanf}, and can be freely intermixed with the GMP extensions. In the
current implementation the standard parts of the format string are simply
handed to @code{scanf} and only the GMP extensions handled directly.

The flags accepted are as follows. @samp{a} and @samp{'} will depend on
support from the C library, and @samp{'} cannot be used with GMP types.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{*} @tab read but don't store
@item @nicode{a} @tab allocate a buffer (string conversions)
@item @nicode{'} @tab grouped digits, GLIBC style (not GMP types)
@end multitable
@end quotation

The standard types accepted are as follows. @samp{h} and @samp{l} are
portable, the rest will depend on the compiler (or include files) for the type
and the C library for the input.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{h} @tab @nicode{short}
@item @nicode{hh} @tab @nicode{char}
@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t}
@item @nicode{l} @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t}
@item @nicode{ll} @tab @nicode{long long}
@item @nicode{L} @tab @nicode{long double}
@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t}
@item @nicode{t} @tab @nicode{ptrdiff_t}
@item @nicode{z} @tab @nicode{size_t}
@end multitable
@end quotation

@noindent
The GMP types are

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{F} @tab @nicode{mpf_t}, float conversions
@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions
@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions
@end multitable
@end quotation

The conversions accepted are as follows. @samp{p} and @samp{[} will depend on
support from the C library, the rest are standard.

@quotation
@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM}
@item @nicode{c} @tab character or characters
@item @nicode{d} @tab decimal integer
@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G}
  @tab float
@item @nicode{i} @tab integer with base indicator
@item @nicode{n} @tab characters read so far
@item @nicode{o} @tab octal integer
@item @nicode{p} @tab pointer
@item @nicode{s} @tab string of non-whitespace characters
@item @nicode{u} @tab decimal integer
@item @nicode{x} @nicode{X} @tab hex integer
@item @nicode{[} @tab string of characters in a set
@end multitable
@end quotation

@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all
read either fixed point or scientific format, and either upper or lower case
@samp{e} for the exponent in scientific format.

C99 style hex float format (@code{printf %a}, @pxref{Formatted Output
Strings}) is always accepted for @code{mpf_t}, but for the standard float
types it will depend on the C library.

@samp{x} and @samp{X} are identical, both accept both upper and lower case
hexadecimal.

@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative
values. For the standard C types these are described as ``unsigned''
conversions, but that merely affects certain overflow handling, negatives are
still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of
Integers, libc, The GNU C Library Reference Manual}). For GMP types there are
no overflows, so @samp{d} and @samp{u} are identical.

@samp{Q} type reads the numerator and (optional) denominator as given. If the
value might not be in canonical form then @code{mpq_canonicalize} must be
called before using it in any calculations (@pxref{Rational Number
Functions}).

@samp{Qi} will read a base specification separately for the numerator and
denominator. For example @samp{0x10/11} would be 16/11, whereas
@samp{0x10/0x11} would be 16/17.

@samp{n} can be used with any of the types above, even the GMP types.
@samp{*} to suppress assignment is allowed, though in that case it would do
nothing at all.

Other conversions or types that might be accepted by the C library
@code{scanf} cannot be used through @code{gmp_scanf}.

Whitespace is read and discarded before a field, except for @samp{c} and
@samp{[} conversions.

For float conversions, the decimal point character (or string) expected is
taken from the current locale settings on systems which provide
@code{localeconv} (@pxref{Locales,, Locales and Internationalization, libc,
The GNU C Library Reference Manual}). The C library will normally do the same
for standard float input.
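
The per-field base detection described for @samp{Qi} works in the same way as
the standard @samp{%i} conversion. The following is only a sketch under that
assumption: it uses the C library @code{sscanf} on plain @code{long} fields
rather than @code{gmp_sscanf}, so that it can be compiled without GMP, and the
helper name @code{parse_fraction} is hypothetical.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper, not part of GMP.  Reads "num/den" using the standard
// %i conversion, which looks for a 0x or 0 base prefix on each field
// separately.  Returns true only when both fields were stored.
bool parse_fraction (const char *s, long *num, long *den)
{
  assert (s != nullptr && num != nullptr && den != nullptr);
  return std::sscanf (s, "%li/%li", num, den) == 2;
}
```

With this helper @samp{0x10/11} gives 16 and 11, while @samp{0x10/0x11} gives
16 and 17. With @code{gmp_sscanf} the corresponding format would be
@samp{%Qi} with an @code{mpq_t} argument, as in the example earlier in this
section.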

The format string is only interpreted as plain @code{char}s, multibyte
characters are not recognised. Perhaps this will change in the future.


@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input
@section Formatted Input Functions
@cindex Input functions

Each of the following functions is similar to the corresponding C library
function. The plain @code{scanf} forms take a variable argument list. The
@code{vscanf} forms take an argument pointer, see @ref{Variadic Functions,,
Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3
va_start}.

It should be emphasised that if a format string is invalid, or the arguments
don't match what the format specifies, then the behaviour of any of these
functions will be unpredictable. GCC format string checking is not available,
since it doesn't recognise the GMP extensions.

No overlap is permitted between the @var{fmt} string and any of the results
produced.

@deftypefun int gmp_scanf (const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap})
Read from the standard input @code{stdin}.
@end deftypefun

@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap})
Read from the stream @var{fp}.
@end deftypefun

@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, @dots{})
@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap})
Read from a null-terminated string @var{s}.
@end deftypefun

The return value from each of these functions is the same as the standard C99
@code{scanf}, namely the number of fields successfully parsed and stored.
@samp{%n} fields and fields read but suppressed by @samp{*} don't count
towards the return value.

If end of input (or a file error) is reached before a character for a field or
a literal, and if no previous non-suppressed fields have matched, then the
return value is @code{EOF} instead of 0. A whitespace character in the format
string is only an optional match and doesn't induce an @code{EOF} in this
fashion. Leading whitespace read and discarded for a field doesn't count as
characters for that field.

For the GMP types, input parsing follows C99 rules, namely one character of
lookahead is used and characters are read while they continue to meet the
format requirements. If this doesn't provide a complete number then the
function terminates, with that field not stored nor counted towards the return
value. For instance with @code{mpf_t} an input @samp{1.23e-XYZ} would be read
up to the @samp{X} and that character pushed back since it's not a digit. The
string @samp{1.23e-} would then be considered invalid since an @samp{e} must
be followed by at least one digit.

For the standard C types, in the current implementation GMP calls the C
library @code{scanf} functions, which might have looser rules about what
constitutes a valid input.

Note that @code{gmp_sscanf} is the same as @code{gmp_fscanf} and only does one
character of lookahead when parsing. Although clearly it could look at its
entire input, it is deliberately made identical to @code{gmp_fscanf}, the same
way C99 @code{sscanf} is the same as @code{fscanf}.
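
Since the return value is documented as matching the standard C99
@code{scanf}, the counting rules can be tried out with the C library alone.
The following is a minimal sketch on that basis, using @code{sscanf} with
@code{int} fields rather than any GMP function. The helper name
@code{count_stored} is hypothetical.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper, not part of GMP.  Parses a stored integer, a
// suppressed integer and a second stored integer, then records the offset
// with %n.  The %*d and %n fields never add to the return value.
int count_stored (const char *input)
{
  assert (input != nullptr);
  int first = 0, second = 0, consumed = 0;
  return std::sscanf (input, "%d %*d %d%n", &first, &second, &consumed);
}
```

An input of @samp{1 2 3} returns 2, since only the first and third fields are
stored. An input of @samp{1 2 x} returns 1, and @samp{x} returns 0 because a
character was available but didn't match. An empty string, or one holding
only whitespace, returns @code{EOF} because the input ended before any field
matched.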


@node C++ Formatted Input, , Formatted Input Functions, Formatted Input
@section C++ Formatted Input
@cindex C++ @code{istream} input
@cindex @code{istream} input

The following functions are provided in @file{libgmpxx} (@pxref{Headers and
Libraries}), which is built only if C++ support is enabled (@pxref{Build
Options}). Prototypes are available from @code{<gmp.h>}.

@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop})
An integer like @samp{123} will be read, or a fraction like @samp{5/9}. No
whitespace is allowed around the @samp{/}. If the fraction is not in
canonical form then @code{mpq_canonicalize} must be called (@pxref{Rational
Number Functions}) before operating on it.

As per integer input, a @samp{0} or @samp{0x} base indicator is read when
none of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} are set. This is
done separately for numerator and denominator, so that for instance
@samp{0x10/11} is @math{16/11} and @samp{0x10/0x11} is @math{16/17}.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings.

Hex or octal floats are not supported, but might be in the future, or perhaps
it's best to accept only what the standard float @code{operator>>} does.
@end deftypefun

Note that digit grouping specified by the @code{istream} locale is currently
not accepted. Perhaps this will change in the future.

@sp 1
These operators mean that GMP types can be read in the usual C++ way, for
example,

@example
mpz_t z;
...
cin >> z;
@end example

But note that @code{istream} input (and @code{ostream} output, @pxref{C++
Formatted Output}) is the only overloading available for the GMP types and
that for instance using @code{+} with an @code{mpz_t} will have unpredictable
results. For classes with overloading, see @ref{C++ Class Interface}.



@node C++ Class Interface, BSD Compatible Functions, Formatted Input, Top
@chapter C++ Class Interface
@cindex C++ interface

This chapter describes the C++ class based interface to GMP.

All GMP C language types and functions can be used in C++ programs, since
@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers
overloaded functions and operators which may be more convenient.

Due to the implementation of this interface, a reasonably recent C++ compiler
is required, one supporting namespaces, partial specialization of templates
and member templates. For GCC this means version 2.91 or later.

@strong{Everything described in this chapter is to be considered preliminary
and might be subject to incompatible changes if some unforeseen difficulty
reveals itself.}

@menu
* C++ Interface General::
* C++ Interface Integers::
* C++ Interface Rationals::
* C++ Interface Floats::
* C++ Interface Random Numbers::
* C++ Interface Limitations::
@end menu


@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface
@section C++ Interface General

@noindent
All the C++ classes and functions are available with

@cindex @code{gmpxx.h}
@example
#include <gmpxx.h>
@end example

Programs should be linked with the @file{libgmpxx} and @file{libgmp}
libraries.
For example,

@example
g++ mycxxprog.cc -lgmpxx -lgmp
@end example

@noindent
The classes defined are

@deftp Class mpz_class
@deftpx Class mpq_class
@deftpx Class mpf_class
@end deftp

The standard operators and various standard functions are overloaded to allow
arithmetic with these classes. For example,

@example
int
main (void)
@{
  mpz_class a, b, c;

  a = 1234;
  b = "-5678";
  c = a+b;
  cout << "sum is " << c << "\n";
  cout << "absolute value is " << abs(c) << "\n";

  return 0;
@}
@end example

An important feature of the implementation is that an expression like
@code{a=b+c} results in a single call to the corresponding @code{mpz_add},
without using a temporary for the @code{b+c} part. Expressions which by their
nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries
though.

The classes can be freely intermixed in expressions, as can the classes and
the standard types @code{long}, @code{unsigned long} and @code{double}.
Smaller types like @code{int} or @code{float} can also be intermixed, since
C++ will promote them.

Note that @code{bool} is not accepted directly, but must be explicitly cast to
an @code{int} first. This is because C++ will automatically convert any
pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all
sorts of invalid class and pointer combinations compile but almost certainly
not do anything sensible.

Conversions back from the classes to standard C++ types aren't done
automatically, instead member functions like @code{get_si} are provided (see
the following sections for details).
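
The reasoning given above for rejecting @code{bool} can be seen with a small
toy class. This is only an illustrative sketch: @code{toy_integer} and
@code{value_from_pointer} are hypothetical names, not part of GMP, and the
class merely shows what a @code{bool} constructor would let through.

```cpp
#include <cassert>

// Toy class for illustration only; it is not mpz_class and not part of GMP.
struct toy_integer
{
  long value;
  bool from_bool;

  toy_integer (long n) : value (n), from_bool (false) {}
  // Accepting bool is the problematic choice: any pointer converts to bool.
  toy_integer (bool b) : value (b), from_bool (true) {}
};

// Constructs a toy_integer from a string pointer.  This compiles because the
// pointer converts to bool, so the stored value is 1 whatever the text says.
long value_from_pointer (const char *text)
{
  assert (text != nullptr);   // a null pointer would convert to false instead
  toy_integer t (text);       // pointer -> bool -> toy_integer
  return t.value;
}
```

Here a string such as @samp{1234} silently produces the value 1, since only
the non-null pointer is examined. Leaving out the @code{bool} constructor
turns that call into a compile error instead.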

Also there are no automatic conversions from the classes to the corresponding
GMP C types, instead a reference to the underlying C object can be obtained
with the following functions,

@deftypefun mpz_t mpz_class::get_mpz_t ()
@deftypefunx mpq_t mpq_class::get_mpq_t ()
@deftypefunx mpf_t mpf_class::get_mpf_t ()
@end deftypefun

These can be used to call a C function which doesn't have a C++ class
interface. For example to set @code{a} to the GCD of @code{b} and @code{c},

@example
mpz_class a, b, c;
...
mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t());
@end example

In the other direction, a class can be initialized from the corresponding GMP
C type, or assigned to if an explicit constructor is used. In both cases this
makes a copy of the value, it doesn't create any sort of association. For
example,

@example
mpz_t z;
// ... init and calculate z ...
mpz_class x(z);
mpz_class y;
y = mpz_class (z);
@end example

There are no namespace setups in @file{gmpxx.h}, all types and functions are
simply put into the global namespace. This is what @file{gmp.h} has done in
the past, and continues to do for compatibility. The extras provided by
@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with
anything.


@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface
@section C++ Interface Integers

@deftypefun {} mpz_class::mpz_class (type @var{n})
Construct an @code{mpz_class}. All the standard C++ types may be used, except
@code{long long} and @code{long double}, and all the GMP C++ classes can be
used. Any necessary conversion follows the corresponding C function, for
example @code{double} follows @code{mpz_set_d} (@pxref{Assigning Integers}).
@end deftypefun

@deftypefun explicit mpz_class::mpz_class (mpz_t @var{z})
Construct an @code{mpz_class} from an @code{mpz_t}. The value in @var{z} is
copied into the new @code{mpz_class}, there won't be any permanent association
between it and @var{z}.
@end deftypefun

@deftypefun explicit mpz_class::mpz_class (const char *@var{s}, int @var{base} = 0)
@deftypefunx explicit mpz_class::mpz_class (const string& @var{s}, int @var{base} = 0)
Construct an @code{mpz_class} converted from a string using @code{mpz_set_str}
(@pxref{Assigning Integers}).

If the string is not a valid integer, an @code{std::invalid_argument}
exception is thrown. The same applies to @code{operator=}.
@end deftypefun

@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d})
@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d})
Divisions involving @code{mpz_class} round towards zero, as per the
@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}).
This is the same as the C99 @code{/} and @code{%} operators.

The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called
directly if desired. For example,

@example
mpz_class q, a, d;
...
mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t());
@end example
@end deftypefun

@deftypefun mpz_class abs (mpz_class @var{op1})
@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2})
@maybepagebreak
@deftypefunx bool mpz_class::fits_sint_p (void)
@deftypefunx bool mpz_class::fits_slong_p (void)
@deftypefunx bool mpz_class::fits_sshort_p (void)
@maybepagebreak
@deftypefunx bool mpz_class::fits_uint_p (void)
@deftypefunx bool mpz_class::fits_ulong_p (void)
@deftypefunx bool mpz_class::fits_ushort_p (void)
@maybepagebreak
@deftypefunx double mpz_class::get_d (void)
@deftypefunx long mpz_class::get_si (void)
@deftypefunx string mpz_class::get_str (int @var{base} = 10)
@deftypefunx {unsigned long} mpz_class::get_ui (void)
@maybepagebreak
@deftypefunx int mpz_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpz_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpz_class @var{op})
@deftypefunx mpz_class sqrt (mpz_class @var{op})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.
@end deftypefun

@sp 1
Overloaded operators for combinations of @code{mpz_class} and @code{double}
are provided for completeness, but it should be noted that if the given
@code{double} is not an integer then the way any rounding is done is currently
unspecified. The rounding might take place at the start, in the middle, or at
the end of the operation, and it might change in the future.

Conversions between @code{mpz_class} and @code{double}, however, are defined
to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}.
And comparisons are always made exactly, as per @code{mpz_cmp_d}.


@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface
@section C++ Interface Rationals

In all the following constructors, if a fraction is given then it should be in
canonical form, or if not then @code{mpq_class::canonicalize} must be called.

@deftypefun {} mpq_class::mpq_class (type @var{op})
@deftypefunx {} mpq_class::mpq_class (integer @var{num}, integer @var{den})
Construct an @code{mpq_class}. The initial value can be a single value of any
type, or a pair of integers (@code{mpz_class} or standard C++ integer types)
representing a fraction, except that @code{long long} and @code{long double}
are not supported. For example,

@example
mpq_class q (99);
mpq_class q (1.75);
mpq_class q (1, 3);
@end example
@end deftypefun

@deftypefun explicit mpq_class::mpq_class (mpq_t @var{q})
Construct an @code{mpq_class} from an @code{mpq_t}. The value in @var{q} is
copied into the new @code{mpq_class}, there won't be any permanent association
between it and @var{q}.
@end deftypefun

@deftypefun explicit mpq_class::mpq_class (const char *@var{s}, int @var{base} = 0)
@deftypefunx explicit mpq_class::mpq_class (const string& @var{s}, int @var{base} = 0)
Construct an @code{mpq_class} converted from a string using @code{mpq_set_str}
(@pxref{Initializing Rationals}).

If the string is not a valid rational, an @code{std::invalid_argument}
exception is thrown. The same applies to @code{operator=}.
@end deftypefun

@deftypefun void mpq_class::canonicalize ()
Put an @code{mpq_class} into canonical form, as per @ref{Rational Number
Functions}. All arithmetic operators require their operands in canonical
form, and will return results in canonical form.
@end deftypefun

@deftypefun mpq_class abs (mpq_class @var{op})
@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2})
@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2})
@maybepagebreak
@deftypefunx double mpq_class::get_d (void)
@deftypefunx string mpq_class::get_str (int @var{base} = 10)
@maybepagebreak
@deftypefunx int mpq_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpq_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpq_class @var{op})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.
@end deftypefun

@deftypefun {mpz_class&} mpq_class::get_num ()
@deftypefunx {mpz_class&} mpq_class::get_den ()
Get a reference to an @code{mpz_class} which is the numerator or denominator
of an @code{mpq_class}. This can be used both for read and write access. If
the object returned is modified, it modifies the original @code{mpq_class}.

If direct manipulation might produce a non-canonical value, then
@code{mpq_class::canonicalize} must be called before further operations.
@end deftypefun

@deftypefun mpz_t mpq_class::get_num_mpz_t ()
@deftypefunx mpz_t mpq_class::get_den_mpz_t ()
Get a reference to the underlying @code{mpz_t} numerator or denominator of an
@code{mpq_class}. This can be passed to C functions expecting an
@code{mpz_t}. Any modifications made to the @code{mpz_t} will modify the
original @code{mpq_class}.

If direct manipulation might produce a non-canonical value, then
@code{mpq_class::canonicalize} must be called before further operations.
@end deftypefun

@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop})
Read @var{rop} from @var{stream}, using its @code{ios} formatting settings,
the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}).

If the @var{rop} read might not be in canonical form then
@code{mpq_class::canonicalize} must be called.
@end deftypefun


@node C++ Interface Floats, C++ Interface Random Numbers, C++ Interface Rationals, C++ Class Interface
@section C++ Interface Floats

When an expression requires the use of temporary intermediate @code{mpf_class}
values, like @code{f=g*h+x*y}, those temporaries will have the same precision
as the destination @code{f}. Explicit constructors can be used if this
doesn't suit.

@deftypefun {} mpf_class::mpf_class (type @var{op})
@deftypefunx {} mpf_class::mpf_class (type @var{op}, unsigned long @var{prec})
Construct an @code{mpf_class}. Any standard C++ type can be used, except
@code{long long} and @code{long double}, and any of the GMP C++ classes can be
used.

If @var{prec} is given, the initial precision is that value, in bits. If
@var{prec} is not given, then the initial precision is determined by the type
of @var{op} given. An @code{mpz_class}, @code{mpq_class}, or C++
builtin type will give the default @code{mpf} precision (@pxref{Initializing
Floats}). An @code{mpf_class} or expression will give the precision of that
value. The precision of a binary expression is the higher of the two
operands.

@example
mpf_class f(1.5);        // default precision
mpf_class f(1.5, 500);   // 500 bits (at least)
mpf_class f(x);          // precision of x
mpf_class f(abs(x));     // precision of x
mpf_class f(-g, 1000);   // 1000 bits (at least)
mpf_class f(x+y);        // greater of precisions of x and y
@end example
@end deftypefun

@deftypefun explicit mpf_class::mpf_class (mpf_t @var{f})
@deftypefunx {} mpf_class::mpf_class (mpf_t @var{f}, unsigned long @var{prec})
Construct an @code{mpf_class} from an @code{mpf_t}. The value in @var{f} is
copied into the new @code{mpf_class}, there won't be any permanent association
between it and @var{f}.

If @var{prec} is given, the initial precision is that value, in bits. If
@var{prec} is not given, then the initial precision is that of @var{f}.
@end deftypefun

@deftypefun explicit mpf_class::mpf_class (const char *@var{s})
@deftypefunx {} mpf_class::mpf_class (const char *@var{s}, unsigned long @var{prec}, int @var{base} = 0)
@deftypefunx explicit mpf_class::mpf_class (const string& @var{s})
@deftypefunx {} mpf_class::mpf_class (const string& @var{s}, unsigned long @var{prec}, int @var{base} = 0)
Construct an @code{mpf_class} converted from a string using @code{mpf_set_str}
(@pxref{Assigning Floats}). If @var{prec} is given, the initial precision is
that value, in bits. If not, the default @code{mpf} precision
(@pxref{Initializing Floats}) is used.

If the string is not a valid float, an @code{std::invalid_argument} exception
is thrown. The same applies to @code{operator=}.
@end deftypefun

@deftypefun {mpf_class&} mpf_class::operator= (type @var{op})
Convert and store the given @var{op} value to an @code{mpf_class} object. The
same types are accepted as for the constructors above.
6739 6740Note that @code{operator=} only stores a new value, it doesn't copy or change 6741the precision of the destination, instead the value is truncated if necessary. 6742This is the same as @code{mpf_set} etc. Note in particular this means for 6743@code{mpf_class} a copy constructor is not the same as a default constructor 6744plus assignment. 6745 6746@example 6747mpf_class x (y); // x created with precision of y 6748 6749mpf_class x; // x created with default precision 6750x = y; // value truncated to that precision 6751@end example 6752 6753Applications using templated code may need to be careful about the assumptions 6754the code makes in this area, when working with @code{mpf_class} values of 6755various different or non-default precisions. For instance implementations of 6756the standard @code{complex} template have been seen in both styles above, 6757though of course @code{complex} is normally only actually specified for use 6758with the builtin float types. 6759@end deftypefun 6760 6761@deftypefun mpf_class abs (mpf_class @var{op}) 6762@deftypefunx mpf_class ceil (mpf_class @var{op}) 6763@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2}) 6764@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2}) 6765@maybepagebreak 6766@deftypefunx bool mpf_class::fits_sint_p (void) 6767@deftypefunx bool mpf_class::fits_slong_p (void) 6768@deftypefunx bool mpf_class::fits_sshort_p (void) 6769@maybepagebreak 6770@deftypefunx bool mpf_class::fits_uint_p (void) 6771@deftypefunx bool mpf_class::fits_ulong_p (void) 6772@deftypefunx bool mpf_class::fits_ushort_p (void) 6773@maybepagebreak 6774@deftypefunx mpf_class floor (mpf_class @var{op}) 6775@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2}) 6776@maybepagebreak 6777@deftypefunx double mpf_class::get_d (void) 6778@deftypefunx long mpf_class::get_si (void) 6779@deftypefunx string mpf_class::get_str (mp_exp_t& @var{exp}, int @var{base} = 10, size_t @var{digits} = 0) 6780@deftypefunx {unsigned 
long} mpf_class::get_ui (void)
@maybepagebreak
@deftypefunx int mpf_class::set_str (const char *@var{str}, int @var{base})
@deftypefunx int mpf_class::set_str (const string& @var{str}, int @var{base})
@deftypefunx int sgn (mpf_class @var{op})
@deftypefunx mpf_class sqrt (mpf_class @var{op})
@deftypefunx mpf_class trunc (mpf_class @var{op})
These functions provide a C++ class interface to the corresponding GMP C
routines.

@code{cmp} can be used with any of the classes or the standard C++ types,
except @code{long long} and @code{long double}.

The accuracy provided by @code{hypot} is not currently guaranteed.
@end deftypefun

@deftypefun {mp_bitcnt_t} mpf_class::get_prec ()
@deftypefunx void mpf_class::set_prec (mp_bitcnt_t @var{prec})
@deftypefunx void mpf_class::set_prec_raw (mp_bitcnt_t @var{prec})
Get or set the current precision of an @code{mpf_class}.

The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing
Floats}) apply to @code{mpf_class::set_prec_raw}. Note in particular that the
@code{mpf_class} must be restored to its allocated precision before being
destroyed. This must be done by application code; there is no automatic
mechanism for it.
@end deftypefun


@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface Floats, C++ Class Interface
@section C++ Interface Random Numbers

@deftp Class gmp_randclass
The C++ class interface to the GMP random number functions uses
@code{gmp_randclass} to hold an algorithm selection and current state, as per
@code{gmp_randstate_t}.
@end deftp

@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, @dots{}), @dots{})
Construct a @code{gmp_randclass}, using a call to the given @var{randinit}
function (@pxref{Random State Initialization}).
The arguments expected are
the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}.
For example,

@example
gmp_randclass r1 (gmp_randinit_default);
gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32);
gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp);
gmp_randclass r4 (gmp_randinit_mt);
@end example

@code{gmp_randinit_lc_2exp_size} will fail if the size requested is too big;
a @code{std::length_error} exception is thrown in that case.
@end deftypefun

@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, @dots{})
Construct a @code{gmp_randclass} using the same parameters as
@code{gmp_randinit} (@pxref{Random State Initialization}). This function is
obsolete and the above @var{randinit} style should be preferred.
@end deftypefun

@deftypefun void gmp_randclass::seed (unsigned long int @var{s})
@deftypefunx void gmp_randclass::seed (mpz_class @var{s})
Seed a random number generator. @xref{Random Number Functions}, for how
to choose a good seed.
@end deftypefun

@deftypefun mpz_class gmp_randclass::get_z_bits (unsigned long @var{bits})
@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits})
Generate a random integer with a specified number of bits.
@end deftypefun

@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n})
Generate a random integer in the range 0 to @math{@var{n}-1} inclusive.
@end deftypefun

@deftypefun mpf_class gmp_randclass::get_f ()
@deftypefunx mpf_class gmp_randclass::get_f (unsigned long @var{prec})
Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}. @var{f}
will be to @var{prec} bits precision, or if @var{prec} is not given then to
the precision of the destination. For example,

@example
gmp_randclass r;
...
mpf_class f (0, 512);   // 512 bits precision
f = r.get_f();          // random number, 512 bits
@end example
@end deftypefun



@node C++ Interface Limitations, , C++ Interface Random Numbers, C++ Class Interface
@section C++ Interface Limitations

@table @asis
@item @code{mpq_class} and Templated Reading
A generic piece of template code probably won't know that @code{mpq_class}
requires a @code{canonicalize} call if inputs read with @code{operator>>}
might be non-canonical. This can lead to incorrect results.

@code{operator>>} behaves as it does for reasons of efficiency. A
canonicalize can be quite time consuming on large operands, and is best
avoided if it's not necessary.

But this potential difficulty reduces the usefulness of @code{mpq_class}.
Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in
the future, maybe a preprocessor define, a global flag, or an @code{ios} flag
pressed into service. Or maybe, at the risk of inconsistency, the
@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t}
@code{operator>>} not doing so, for use on those occasions when that's
acceptable. Send feedback or alternate ideas to @email{gmp-bugs@@gmplib.org}.

@item Subclassing
Subclassing the GMP C++ classes works, but is not currently recommended.

Expressions involving subclasses resolve correctly (or seem to), but in normal
C++ fashion the subclass doesn't inherit constructors and assignments.
There are many of those in the GMP classes, and a good way to reestablish them
in a subclass is not yet provided.

@item Templated Expressions
A subtle difficulty exists when using expressions together with
application-defined template functions.
Consider the following, with @code{T} 6904intended to be some numeric type, 6905 6906@example 6907template <class T> 6908T fun (const T &, const T &); 6909@end example 6910 6911@noindent 6912When used with, say, plain @code{mpz_class} variables, it works fine: @code{T} 6913is resolved as @code{mpz_class}. 6914 6915@example 6916mpz_class f(1), g(2); 6917fun (f, g); // Good 6918@end example 6919 6920@noindent 6921But when one of the arguments is an expression, it doesn't work. 6922 6923@example 6924mpz_class f(1), g(2), h(3); 6925fun (f, g+h); // Bad 6926@end example 6927 6928This is because @code{g+h} ends up being a certain expression template type 6929internal to @code{gmpxx.h}, which the C++ template resolution rules are unable 6930to automatically convert to @code{mpz_class}. The workaround is simply to add 6931an explicit cast. 6932 6933@example 6934mpz_class f(1), g(2), h(3); 6935fun (f, mpz_class(g+h)); // Good 6936@end example 6937 6938Similarly, within @code{fun} it may be necessary to cast an expression to type 6939@code{T} when calling a templated @code{fun2}. 6940 6941@example 6942template <class T> 6943void fun (T f, T g) 6944@{ 6945 fun2 (f, f+g); // Bad 6946@} 6947 6948template <class T> 6949void fun (T f, T g) 6950@{ 6951 fun2 (f, T(f+g)); // Good 6952@} 6953@end example 6954@end table 6955 6956 6957@node BSD Compatible Functions, Custom Allocation, C++ Class Interface, Top 6958@comment node-name, next, previous, up 6959@chapter Berkeley MP Compatible Functions 6960@cindex Berkeley MP compatible functions 6961@cindex BSD MP compatible functions 6962 6963These functions are intended to be fully compatible with the Berkeley MP 6964library which is available on many BSD derived U*ix systems. The 6965@samp{--enable-mpbsd} option must be used when building GNU MP to make these 6966available (@pxref{Installing GMP}). 
6967 6968The original Berkeley MP library has a usage restriction: you cannot use the 6969same variable as both source and destination in a single function call. The 6970compatible functions in GNU MP do not share this restriction---inputs and 6971outputs may overlap. 6972 6973It is not recommended that new programs are written using these functions. 6974Apart from the incomplete set of functions, the interface for initializing 6975@code{MINT} objects is more error prone, and the @code{pow} function collides 6976with @code{pow} in @file{libm.a}. 6977 6978@cindex @code{mp.h} 6979@tindex MINT 6980Include the header @file{mp.h} to get the definition of the necessary types and 6981functions. If you are on a BSD derived system, make sure to include GNU 6982@file{mp.h} if you are going to link the GNU @file{libmp.a} to your program. 6983This means that you probably need to give the @samp{-I<dir>} option to the 6984compiler, where @samp{<dir>} is the directory where you have GNU @file{mp.h}. 6985 6986@deftypefun {MINT *} itom (signed short int @var{initial_value}) 6987Allocate an integer consisting of a @code{MINT} object and dynamic limb space. 6988Initialize the integer to @var{initial_value}. Return a pointer to the 6989@code{MINT} object. 6990@end deftypefun 6991 6992@deftypefun {MINT *} xtom (char *@var{initial_value}) 6993Allocate an integer consisting of a @code{MINT} object and dynamic limb space. 6994Initialize the integer from @var{initial_value}, a hexadecimal, 6995null-terminated C string. Return a pointer to the @code{MINT} object. 6996@end deftypefun 6997 6998@deftypefun void move (MINT *@var{src}, MINT *@var{dest}) 6999Set @var{dest} to @var{src} by copying. Both variables must be previously 7000initialized. 7001@end deftypefun 7002 7003@deftypefun void madd (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination}) 7004Add @var{src_1} and @var{src_2} and put the sum in @var{destination}. 
7005@end deftypefun 7006 7007@deftypefun void msub (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination}) 7008Subtract @var{src_2} from @var{src_1} and put the difference in 7009@var{destination}. 7010@end deftypefun 7011 7012@deftypefun void mult (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination}) 7013Multiply @var{src_1} and @var{src_2} and put the product in @var{destination}. 7014@end deftypefun 7015 7016@deftypefun void mdiv (MINT *@var{dividend}, MINT *@var{divisor}, MINT *@var{quotient}, MINT *@var{remainder}) 7017@deftypefunx void sdiv (MINT *@var{dividend}, signed short int @var{divisor}, MINT *@var{quotient}, signed short int *@var{remainder}) 7018Set @var{quotient} to @var{dividend}/@var{divisor}, and @var{remainder} to 7019@var{dividend} mod @var{divisor}. The quotient is rounded towards zero; the 7020remainder has the same sign as the dividend unless it is zero. 7021 7022Some implementations of these functions work differently---or not at all---for 7023negative arguments. 7024@end deftypefun 7025 7026@deftypefun void msqrt (MINT *@var{op}, MINT *@var{root}, MINT *@var{remainder}) 7027Set @var{root} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part 7028of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{remainder} to 7029@m{(@var{op} - @var{root}^2), @var{op}@minus{}@var{root}*@var{root}}, i.e. 7030zero if @var{op} is a perfect square. 7031 7032If @var{root} and @var{remainder} are the same variable, the results are 7033undefined. 7034@end deftypefun 7035 7036@deftypefun void pow (MINT *@var{base}, MINT *@var{exp}, MINT *@var{mod}, MINT *@var{dest}) 7037Set @var{dest} to (@var{base} raised to @var{exp}) modulo @var{mod}. 7038 7039Note that the name @code{pow} clashes with @code{pow} from the standard C math 7040library (@pxref{Exponents and Logarithms,, Exponentiation and Logarithms, 7041libc, The GNU C Library Reference Manual}). An application will only be able 7042to use one or the other. 
@end deftypefun

@deftypefun void rpow (MINT *@var{base}, signed short int @var{exp}, MINT *@var{dest})
Set @var{dest} to @var{base} raised to @var{exp}.
@end deftypefun

@deftypefun void gcd (MINT *@var{op1}, MINT *@var{op2}, MINT *@var{res})
Set @var{res} to the greatest common divisor of @var{op1} and @var{op2}.
@end deftypefun

@deftypefun int mcmp (MINT *@var{op1}, MINT *@var{op2})
Compare @var{op1} and @var{op2}. Return a positive value if @var{op1} >
@var{op2}, zero if @var{op1} = @var{op2}, and a negative value if @var{op1} <
@var{op2}.
@end deftypefun

@deftypefun void min (MINT *@var{dest})
Input a decimal string from @code{stdin}, and put the read integer in
@var{dest}. SPC and TAB are allowed in the number string, and are ignored.
@end deftypefun

@deftypefun void mout (MINT *@var{src})
Output @var{src} to @code{stdout}, as a decimal string. Also output a newline.
@end deftypefun

@deftypefun {char *} mtox (MINT *@var{op})
Convert @var{op} to a hexadecimal string, and return a pointer to the string.
The returned string is allocated using the current memory allocation function,
@code{malloc} by default. It will be @code{strlen(str)+1} bytes, that being
exactly enough for the string and null-terminator.
@end deftypefun

@deftypefun void mfree (MINT *@var{op})
De-allocate the space used by @var{op}.
@strong{This function should only be 7077passed a value returned by @code{itom} or @code{xtom}.} 7078@end deftypefun 7079 7080 7081@node Custom Allocation, Language Bindings, BSD Compatible Functions, Top 7082@comment node-name, next, previous, up 7083@chapter Custom Allocation 7084@cindex Custom allocation 7085@cindex Memory allocation 7086@cindex Allocation of memory 7087 7088By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory 7089allocation, and if they fail GMP prints a message to the standard error output 7090and terminates the program. 7091 7092Alternate functions can be specified, to allocate memory in a different way or 7093to have a different error action on running out of memory. 7094 7095This feature is available in the Berkeley compatibility library (@pxref{BSD 7096Compatible Functions}) as well as the main GMP library. 7097 7098@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t)) 7099Replace the current allocation functions from the arguments. If an argument 7100is @code{NULL}, the corresponding default function is used. 7101 7102These functions will be used for all memory allocation done by GMP, apart from 7103temporary space from @code{alloca} if that function is available and GMP is 7104configured to use it (@pxref{Build Options}). 7105 7106@strong{Be sure to call @code{mp_set_memory_functions} only when there are no 7107active GMP objects allocated using the previous memory functions! Usually 7108that means calling it before any other GMP function.} 7109@end deftypefun 7110 7111The functions supplied should fit the following declarations: 7112 7113@deftypevr Function {void *} allocate_function (size_t @var{alloc_size}) 7114Return a pointer to newly allocated space with at least @var{alloc_size} 7115bytes. 
7116@end deftypevr 7117 7118@deftypevr Function {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size}) 7119Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be 7120@var{new_size} bytes. 7121 7122The block may be moved if necessary or if desired, and in that case the 7123smaller of @var{old_size} and @var{new_size} bytes must be copied to the new 7124location. The return value is a pointer to the resized block, that being the 7125new location if moved or just @var{ptr} if not. 7126 7127@var{ptr} is never @code{NULL}, it's always a previously allocated block. 7128@var{new_size} may be bigger or smaller than @var{old_size}. 7129@end deftypevr 7130 7131@deftypevr Function void free_function (void *@var{ptr}, size_t @var{size}) 7132De-allocate the space pointed to by @var{ptr}. 7133 7134@var{ptr} is never @code{NULL}, it's always a previously allocated block of 7135@var{size} bytes. 7136@end deftypevr 7137 7138A @dfn{byte} here means the unit used by the @code{sizeof} operator. 7139 7140The @var{old_size} parameters to @var{reallocate_function} and 7141@var{free_function} are passed for convenience, but of course can be ignored 7142if not needed. The default functions using @code{malloc} and friends for 7143instance don't use them. 7144 7145No error return is allowed from any of these functions, if they return then 7146they must have performed the specified operation. In particular note that 7147@var{allocate_function} or @var{reallocate_function} mustn't return 7148@code{NULL}. 7149 7150Getting a different fatal error action is a good use for custom allocation 7151functions, for example giving a graphical dialog rather than the default print 7152to @code{stderr}. How much is possible when genuinely out of memory is 7153another question though. 7154 7155There's currently no defined way for the allocation functions to recover from 7156an error such as out of memory, they must terminate program execution. 
A 7157@code{longjmp} or throwing a C++ exception will have undefined results. This 7158may change in the future. 7159 7160GMP may use allocated blocks to hold pointers to other allocated blocks. This 7161will limit the assumptions a conservative garbage collection scheme can make. 7162 7163Since the default GMP allocation uses @code{malloc} and friends, those 7164functions will be linked in even if the first thing a program does is an 7165@code{mp_set_memory_functions}. It's necessary to change the GMP sources if 7166this is a problem. 7167 7168@sp 1 7169@deftypefun void mp_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (**@var{free_func_ptr}) (void *, size_t)) 7170Get the current allocation functions, storing function pointers to the 7171locations given by the arguments. If an argument is @code{NULL}, that 7172function pointer is not stored. 7173 7174@need 1000 7175For example, to get just the current free function, 7176 7177@example 7178void (*freefunc) (void *, size_t); 7179 7180mp_get_memory_functions (NULL, NULL, &freefunc); 7181@end example 7182@end deftypefun 7183 7184@node Language Bindings, Algorithms, Custom Allocation, Top 7185@chapter Language Bindings 7186@cindex Language bindings 7187@cindex Other languages 7188 7189The following packages and projects offer access to GMP from languages other 7190than C, though perhaps with varying levels of functionality and efficiency. 7191 7192@c @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces 7193@c in tex, just to separate the URL from the preceding text a bit. 
7194@iftex 7195@macro spaceuref {U} 7196@ @ @uref{\U\} 7197@end macro 7198@end iftex 7199@ifnottex 7200@macro spaceuref {U} 7201@uref{\U\} 7202@end macro 7203@end ifnottex 7204 7205@sp 1 7206@table @asis 7207@item C++ 7208@itemize @bullet 7209@item 7210GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward 7211interface, expression templates to eliminate temporaries. 7212@item 7213ALP @spaceuref{http://www-sop.inria.fr/saga/logiciels/ALP/} @* Linear algebra and 7214polynomials using templates. 7215@item 7216Arithmos @spaceuref{http://www.win.ua.ac.be/~cant/arithmos/} @* Rationals 7217with infinities and square roots. 7218@item 7219CLN @spaceuref{http://www.ginac.de/CLN/} @* High level classes for arithmetic. 7220@item 7221LiDIA @spaceuref{http://www.cdc.informatik.tu-darmstadt.de/TI/LiDIA/} @* A C++ 7222library for computational number theory. 7223@item 7224Linbox @spaceuref{http://www.linalg.org/} @* Sparse vectors and matrices. 7225@item 7226NTL @spaceuref{http://www.shoup.net/ntl/} @* A C++ number theory library. 7227@end itemize 7228 7229@c @item D 7230@c @itemize @bullet 7231@c @item 7232@c gmp-d @spaceuref{http://home.comcast.net/~benhinkle/gmp-d/} 7233@c @end itemize 7234 7235@item Eiffel 7236@itemize @bullet 7237@item 7238Eiffelroom @spaceuref{http://www.eiffelroom.org/node/442} 7239@end itemize 7240 7241@item Fortran 7242@itemize @bullet 7243@item 7244Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary 7245precision floats. 
7246@end itemize 7247 7248@item Haskell 7249@itemize @bullet 7250@item 7251Glasgow Haskell Compiler @spaceuref{http://www.haskell.org/ghc/} 7252@end itemize 7253 7254@item Java 7255@itemize @bullet 7256@item 7257Kaffe @spaceuref{http://www.kaffe.org/} 7258@item 7259Kissme @spaceuref{http://kissme.sourceforge.net/} 7260@end itemize 7261 7262@item Lisp 7263@itemize @bullet 7264@item 7265GNU Common Lisp @spaceuref{http://www.gnu.org/software/gcl/gcl.html} 7266@item 7267Librep @spaceuref{http://librep.sourceforge.net/} 7268@item 7269@c FIXME: When there's a stable release with gmp support, just refer to it 7270@c rather than bothering to talk about betas. 7271XEmacs (21.5.18 beta and up) @spaceuref{http://www.xemacs.org} @* Optional 7272big integers, rationals and floats using GMP. 7273@end itemize 7274 7275@item M4 7276@itemize @bullet 7277@item 7278@c FIXME: When there's a stable release with gmp support, just refer to it 7279@c rather than bothering to talk about betas. 7280GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu/} @* Optionally provides 7281an arbitrary precision @code{mpeval}. 7282@end itemize 7283 7284@item ML 7285@itemize @bullet 7286@item 7287MLton compiler @spaceuref{http://mlton.org/} 7288@end itemize 7289 7290@item Objective Caml 7291@itemize @bullet 7292@item 7293MLGMP @spaceuref{http://www.di.ens.fr/~monniaux/programmes.html.en} 7294@item 7295Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using 7296GMP. 7297@end itemize 7298 7299@item Oz 7300@itemize @bullet 7301@item 7302Mozart @spaceuref{http://www.mozart-oz.org/} 7303@end itemize 7304 7305@item Pascal 7306@itemize @bullet 7307@item 7308GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de/} @* GMP unit. 7309@item 7310Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal, 7311optionally using GMP. 
@end itemize

@item Perl
@itemize @bullet
@item
GMP module, see @file{demos/perl} in the GMP sources (@pxref{Demonstration
Programs}).
@item
Math::GMP @spaceuref{http://www.cpan.org/} @* Compatible with Math::BigInt, but
not as many functions as the GMP module above.
@item
Math::BigInt::GMP @spaceuref{http://www.cpan.org/} @* Plug Math::GMP into
normal Math::BigInt operations.
@end itemize

@need 1000
@item Pike
@itemize @bullet
@item
mpz module in the standard distribution, @uref{http://pike.ida.liu.se/}
@end itemize

@need 500
@item Prolog
@itemize @bullet
@item
SWI Prolog @spaceuref{http://www.swi-prolog.org/} @*
Arbitrary precision floats.
@end itemize

@item Python
@itemize @bullet
@item
GMPY @spaceuref{http://code.google.com/p/gmpy/}
@end itemize

@item Ruby
@itemize @bullet
@item
gmp gem @spaceuref{http://rubygems.org/gems/gmp}
@end itemize

@item Scheme
@itemize @bullet
@item
GNU Guile (upcoming 1.8) @spaceuref{http://www.gnu.org/software/guile/guile.html}
@item
RScheme @spaceuref{http://www.rscheme.org/}
@item
STklos @spaceuref{http://www.stklos.org/}
@c
@c For reference, MzScheme uses some of gmp, but (as of version 205) it only
@c has copies of some of the generic C code, and we don't consider that a
@c language binding to gmp.
@c
@end itemize

@item Smalltalk
@itemize @bullet
@item
GNU Smalltalk @spaceuref{http://www.smalltalk.org/versions/GNUSmalltalk.html}
@end itemize

@item Other
@itemize @bullet
@item
Axiom @spaceuref{http://savannah.nongnu.org/projects/axiom} @* Computer algebra
using GCL.
@item
DrGenius @spaceuref{http://drgenius.seul.org/} @* Geometry system and
mathematical programming language.
@item
GiNaC @spaceuref{http://www.ginac.de/} @* C++ computer algebra using CLN.
7385@item 7386GOO @spaceuref{http://www.googoogaga.org/} @* Dynamic object oriented 7387language. 7388@item 7389Maxima @uref{http://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma 7390computer algebra using GCL. 7391@item 7392Q @spaceuref{http://q-lang.sourceforge.net/} @* Equational programming system. 7393@item 7394Regina @spaceuref{http://regina.sourceforge.net/} @* Topological calculator. 7395@item 7396Yacas @spaceuref{http://www.xs4all.nl/~apinkus/yacas.html} @* Yet another 7397computer algebra system. 7398@end itemize 7399 7400@end table 7401 7402 7403@node Algorithms, Internals, Language Bindings, Top 7404@chapter Algorithms 7405@cindex Algorithms 7406 7407This chapter is an introduction to some of the algorithms used for various GMP 7408operations. The code is likely to be hard to understand without knowing 7409something about the algorithms. 7410 7411Some GMP internals are mentioned, but applications that expect to be 7412compatible with future GMP releases should take care to use only the 7413documented functions. 7414 7415@menu 7416* Multiplication Algorithms:: 7417* Division Algorithms:: 7418* Greatest Common Divisor Algorithms:: 7419* Powering Algorithms:: 7420* Root Extraction Algorithms:: 7421* Radix Conversion Algorithms:: 7422* Other Algorithms:: 7423* Assembly Coding:: 7424@end menu 7425 7426 7427@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms 7428@section Multiplication 7429@cindex Multiplication algorithms 7430 7431N@cross{}N limb multiplications and squares are done using one of five 7432algorithms, as the size N increases. 
7433 7434@quotation 7435@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 7436@item Algorithm @tab Threshold 7437@item Basecase @tab (none) 7438@item Karatsuba @tab @code{MUL_TOOM22_THRESHOLD} 7439@item Toom-3 @tab @code{MUL_TOOM33_THRESHOLD} 7440@item Toom-4 @tab @code{MUL_TOOM44_THRESHOLD} 7441@item FFT @tab @code{MUL_FFT_THRESHOLD} 7442@end multitable 7443@end quotation 7444 7445Similarly for squaring, with the @code{SQR} thresholds. 7446 7447N@cross{}M multiplications of operands with different sizes above 7448@code{MUL_TOOM22_THRESHOLD} are currently done by special Toom-inspired 7449algorithms or directly with FFT, depending on operand size (@pxref{Unbalanced 7450Multiplication}). 7451 7452@menu 7453* Basecase Multiplication:: 7454* Karatsuba Multiplication:: 7455* Toom 3-Way Multiplication:: 7456* Toom 4-Way Multiplication:: 7457* FFT Multiplication:: 7458* Other Multiplication:: 7459* Unbalanced Multiplication:: 7460@end menu 7461 7462 7463@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms 7464@subsection Basecase Multiplication 7465 7466Basecase N@cross{}M multiplication is a straightforward rectangular set of 7467cross-products, the same as long multiplication done by hand and for that 7468reason sometimes known as the schoolbook or grammar school method. This is an 7469@m{O(NM),O(N*M)} algorithm. See Knuth section 4.3.1 algorithm M 7470(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code. 7471 7472Assembly implementations of @code{mpn_mul_basecase} are essentially the same 7473as the generic C code, but have all the usual assembly tricks and 7474obscurities introduced for speed. 7475 7476A square can be done in roughly half the time of a multiply, by using the fact 7477that the cross products above and below the diagonal are the same. 
A triangle 7478of products below the diagonal is formed, doubled (left shift by one bit), and 7479then the products on the diagonal added. This can be seen in 7480@file{mpn/generic/sqr_basecase.c}. Again the assembly implementations take 7481essentially the same approach. 7482 7483@tex 7484\def\GMPline#1#2#3#4#5#6{% 7485 \hbox {% 7486 \vrule height 2.5ex depth 1ex 7487 \hbox to 2em {\hfil{#2}\hfil}% 7488 \vrule \hbox to 2em {\hfil{#3}\hfil}% 7489 \vrule \hbox to 2em {\hfil{#4}\hfil}% 7490 \vrule \hbox to 2em {\hfil{#5}\hfil}% 7491 \vrule \hbox to 2em {\hfil{#6}\hfil}% 7492 \vrule}} 7493\GMPdisplay{ 7494 \hbox{% 7495 \vbox{% 7496 \hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}% 7497 \hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}% 7498 \hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}% 7499 \hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}% 7500 \hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}% 7501 \hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}% 7502 \vfill}% 7503 \vbox{% 7504 \hbox{% 7505 \hbox to 2em {\hfil u0\hfil}% 7506 \hbox to 2em {\hfil u1\hfil}% 7507 \hbox to 2em {\hfil u2\hfil}% 7508 \hbox to 2em {\hfil u3\hfil}% 7509 \hbox to 2em {\hfil u4\hfil}}% 7510 \vskip 0.7ex 7511 \hrule 7512 \GMPline{u0}{d}{}{}{}{}% 7513 \hrule 7514 \GMPline{u1}{}{d}{}{}{}% 7515 \hrule 7516 \GMPline{u2}{}{}{d}{}{}% 7517 \hrule 7518 \GMPline{u3}{}{}{}{d}{}% 7519 \hrule 7520 \GMPline{u4}{}{}{}{}{d}% 7521 \hrule}}} 7522@end tex 7523@ifnottex 7524@example 7525@group 7526 u0 u1 u2 u3 u4 7527 +---+---+---+---+---+ 7528u0 | d | | | | | 7529 +---+---+---+---+---+ 7530u1 | | d | | | | 7531 +---+---+---+---+---+ 7532u2 | | | d | | | 7533 +---+---+---+---+---+ 7534u3 | | | | d | | 7535 +---+---+---+---+---+ 7536u4 | | | | | d | 7537 +---+---+---+---+---+ 7538@end group 7539@end example 7540@end ifnottex 7541 7542In practice squaring isn't a full 2@cross{} faster than multiplying, it's 7543usually around 1.5@cross{}. 
Less than 1.5@cross{} probably indicates
@code{mpn_sqr_basecase} wants improving on that CPU.

On some CPUs @code{mpn_mul_basecase} can be faster than the generic C
@code{mpn_sqr_basecase} on some small sizes. @code{SQR_BASECASE_THRESHOLD} is
the size at which to use @code{mpn_sqr_basecase}; this will be zero if that
routine should be used always.


@node Karatsuba Multiplication, Toom 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms
@subsection Karatsuba Multiplication
@cindex Karatsuba multiplication

The Karatsuba multiplication algorithm is described in Knuth section 4.3.3
part A, and in various other textbooks. A brief description is given here.

The inputs @math{x} and @math{y} are treated as each split into two parts of
equal length (or the most significant part one limb shorter if N is odd).

@tex
% GMPboxwidth used for all the multiplication pictures
\global\newdimen\GMPboxwidth \global\GMPboxwidth=5em
% GMPboxdepth and GMPboxheight are also used for the float pictures
\global\newdimen\GMPboxdepth  \global\GMPboxdepth=1ex
\global\newdimen\GMPboxheight \global\GMPboxheight=2ex
\gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth}
\def\GMPbox#1#2{%
  \vbox {%
    \hrule
    \hbox to 2\GMPboxwidth{%
      \GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}%
    \hrule}}
\GMPdisplay{%
\vbox{%
  \hbox to 2\GMPboxwidth {high \hfil low}
  \vskip 0.7ex
  \GMPbox{x_1}{x_0}
  \vskip 0.5ex
  \GMPbox{y_1}{y_0}
}}
@end tex
@ifnottex
@example
@group
 high              low
+----------+----------+
|    x1    |    x0    |
+----------+----------+

+----------+----------+
|    y1    |    y0    |
+----------+----------+
@end group
@end example
@end ifnottex

Let @math{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is
@math{k} limbs (@ms{y,0} the same) then
7601@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. 7602With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the 7603following holds, 7604 7605@display 7606@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0, 7607 x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0} 7608@end display 7609 7610This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs, 7611whereas a basecase multiply of N@cross{}N limbs is equivalent to four 7612multiplies of (N/2)@cross{}(N/2). The factors @math{(b^2+b)} etc represent 7613the positions where the three products must be added. 7614 7615@tex 7616\def\GMPboxA#1#2{% 7617 \vbox{% 7618 \hrule 7619 \hbox{% 7620 \GMPvrule 7621 \hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}% 7622 \vrule 7623 \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% 7624 \vrule} 7625 \hrule}} 7626\def\GMPboxB#1#2{% 7627 \hbox{% 7628 \raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}% 7629 \vbox{% 7630 \hrule 7631 \hbox{% 7632 \GMPvrule 7633 \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% 7634 \vrule}% 7635 \hrule}}} 7636\GMPdisplay{% 7637\vbox{% 7638 \hbox to 4\GMPboxwidth {high \hfil low} 7639 \vskip 0.7ex 7640 \GMPboxA{x_1y_1}{x_0y_0} 7641 \vskip 0.5ex 7642 \GMPboxB{$+$}{x_1y_1} 7643 \vskip 0.5ex 7644 \GMPboxB{$+$}{x_0y_0} 7645 \vskip 0.5ex 7646 \GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)} 7647}} 7648@end tex 7649@ifnottex 7650@example 7651@group 7652 high low 7653+--------+--------+ +--------+--------+ 7654| x1*y1 | | x0*y0 | 7655+--------+--------+ +--------+--------+ 7656 +--------+--------+ 7657 add | x1*y1 | 7658 +--------+--------+ 7659 +--------+--------+ 7660 add | x0*y0 | 7661 +--------+--------+ 7662 +--------+--------+ 7663 sub | (x1-x0)*(y1-y0) | 7664 +--------+--------+ 7665@end group 7666@end example 7667@end ifnottex 7668 7669The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an 7670absolute value, and the sign used to choose to add or subtract. 
Notice the 7671sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1), 7672high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb 7673additions, rather than @m{6k,6*k}, but in GMP extra function call overheads 7674outweigh the saving. 7675 7676Squaring is similar to multiplying, but with @math{x=y} the formula reduces to 7677an equivalent with three squares, 7678 7679@display 7680@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2, 7681 x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2} 7682@end display 7683 7684The final result is accumulated from those three squares the same way as for 7685the three multiplies above. The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now 7686always positive. 7687 7688A similar formula for both multiplying and squaring can be constructed with a 7689middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}. But those sums can exceed 7690@math{k} limbs, leading to more carry handling and additions than the form 7691above. 7692 7693Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm, 7694the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies 7695each @math{1/2} the size of the inputs. This is a big improvement over the 7696basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra 7697additions Karatsuba performs. @code{MUL_TOOM22_THRESHOLD} can be as little 7698as 10 limbs. The @code{SQR} threshold is usually about twice the @code{MUL}. 7699 7700The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c, 7701M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN + 7702e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 + 7703{3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}. The 7704factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the 7705basecase code will increase the threshold since they benefit @math{M(N)} more 7706than @math{K(N)}. 
And conversely the @m{3\over2, 3/2} for @math{b} means 7707linear style speedups of @math{b} will reduce the threshold since they 7708benefit @math{K(N)} more than @math{M(N)}. The latter can be seen for 7709instance when adding an optimized @code{mpn_sqr_diagonal} to 7710@code{mpn_sqr_basecase}. Of course all speedups reduce total time, and in 7711that sense the algorithm thresholds are merely of academic interest. 7712 7713 7714@node Toom 3-Way Multiplication, Toom 4-Way Multiplication, Karatsuba Multiplication, Multiplication Algorithms 7715@subsection Toom 3-Way Multiplication 7716@cindex Toom multiplication 7717 7718The Karatsuba formula is the simplest case of a general approach to splitting 7719inputs that leads to both Toom and FFT algorithms. A description of 7720Toom can be found in Knuth section 4.3.3, with an example 3-way 7721calculation after Theorem A@. The 3-way form used in GMP is described here. 7722 7723The operands are each considered split into 3 pieces of equal length (or the 7724most significant part 1 or 2 limbs shorter than the other two). 
7725 7726@tex 7727\def\GMPbox#1#2#3{% 7728 \vbox{% 7729 \hrule \vfil 7730 \hbox to 3\GMPboxwidth {% 7731 \GMPvrule 7732 \hfil$#1$\hfil 7733 \vrule 7734 \hfil$#2$\hfil 7735 \vrule 7736 \hfil$#3$\hfil 7737 \vrule}% 7738 \vfil \hrule 7739}} 7740\GMPdisplay{% 7741\vbox{% 7742 \hbox to 3\GMPboxwidth {high \hfil low} 7743 \vskip 0.7ex 7744 \GMPbox{x_2}{x_1}{x_0} 7745 \vskip 0.5ex 7746 \GMPbox{y_2}{y_1}{y_0} 7747 \vskip 0.5ex 7748}} 7749@end tex 7750@ifnottex 7751@example 7752@group 7753 high low 7754+----------+----------+----------+ 7755| x2 | x1 | x0 | 7756+----------+----------+----------+ 7757 7758+----------+----------+----------+ 7759| y2 | y1 | y0 | 7760+----------+----------+----------+ 7761@end group 7762@end example 7763@end ifnottex 7764 7765@noindent 7766These parts are treated as the coefficients of two polynomials 7767 7768@display 7769@group 7770@m{X(t) = x_2t^2 + x_1t + x_0, 7771 X(t) = x2*t^2 + x1*t + x0} 7772@m{Y(t) = y_2t^2 + y_1t + y_0, 7773 Y(t) = y2*t^2 + y1*t + y0} 7774@end group 7775@end display 7776 7777Let @math{b} equal the power of 2 which is the size of the @ms{x,0}, @ms{x,1}, 7778@ms{y,0} and @ms{y,1} pieces, ie.@: if they're @math{k} limbs each then 7779@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. 7780With this @math{x=X(b)} and @math{y=Y(b)}. 7781 7782Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients 7783are 7784 7785@display 7786@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0, 7787 W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0} 7788@end display 7789 7790The @m{w_i,w[i]} are going to be determined, and when they are they'll give 7791the final result using @math{w=W(b)}, since 7792@m{xy=X(b)Y(b),x*y=X(b)*Y(b)=W(b)}. 
The coefficients will be roughly 7793@math{b^2} each, and the final @math{W(b)} will be an addition like, 7794 7795@tex 7796\def\GMPbox#1#2{% 7797 \moveright #1\GMPboxwidth 7798 \vbox{% 7799 \hrule 7800 \hbox{% 7801 \GMPvrule 7802 \hbox to 2\GMPboxwidth {\hfil$#2$\hfil}% 7803 \vrule}% 7804 \hrule 7805}} 7806\GMPdisplay{% 7807\vbox{% 7808 \hbox to 6\GMPboxwidth {high \hfil low}% 7809 \vskip 0.7ex 7810 \GMPbox{0}{w_4} 7811 \vskip 0.5ex 7812 \GMPbox{1}{w_3} 7813 \vskip 0.5ex 7814 \GMPbox{2}{w_2} 7815 \vskip 0.5ex 7816 \GMPbox{3}{w_1} 7817 \vskip 0.5ex 7818 \GMPbox{4}{w_0} 7819}} 7820@end tex 7821@ifnottex 7822@example 7823@group 7824 high low 7825+-------+-------+ 7826| w4 | 7827+-------+-------+ 7828 +--------+-------+ 7829 | w3 | 7830 +--------+-------+ 7831 +--------+-------+ 7832 | w2 | 7833 +--------+-------+ 7834 +--------+-------+ 7835 | w1 | 7836 +--------+-------+ 7837 +-------+-------+ 7838 | w0 | 7839 +-------+-------+ 7840@end group 7841@end example 7842@end ifnottex 7843 7844The @m{w_i,w[i]} coefficients could be formed by a simple set of cross 7845products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2}, 7846@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all 7847nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely 7848to a basecase multiply. Instead the following approach is used. 7849 7850@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving 7851values of @math{W(t)} at those points. 
In GMP the following points are used, 7852 7853@quotation 7854@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 7855@item Point @tab Value 7856@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately 7857@item @math{t=1} @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0) * (y2+y1+y0)} 7858@item @math{t=-1} @tab @m{(x_2-x_1+x_0)(y_2-y_1+y_0),(x2-x1+x0) * (y2-y1+y0)} 7859@item @math{t=2} @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0) * (4*y2+2*y1+y0)} 7860@item @m{t=\infty,t=inf} @tab @m{x_2y_2,x2 * y2}, which gives @ms{w,4} immediately 7861@end multitable 7862@end quotation 7863 7864At @math{t=-1} the values can be negative and that's handled using the 7865absolute values and tracking the sign separately. At @m{t=\infty,t=inf} the 7866value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over t^4}, X(t)*Y(t)/t^4 in 7867the limit as t approaches infinity}, but it's much easier to think of as 7868simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately (much like 7869@m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately). 7870 7871Each of the points substituted into 7872@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination 7873of the @m{w_i,w[i]} coefficients, and the value of those combinations has just 7874been calculated. 
7875 7876@tex 7877\GMPdisplay{% 7878$\matrix{% 7879W(0) & = & & & & & & & & & w_0 \cr 7880W(1) & = & w_4 & + & w_3 & + & w_2 & + & w_1 & + & w_0 \cr 7881W(-1) & = & w_4 & - & w_3 & + & w_2 & - & w_1 & + & w_0 \cr 7882W(2) & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr 7883W(\infty) & = & w_4 \cr 7884}$} 7885@end tex 7886@ifnottex 7887@example 7888@group 7889W(0) = w0 7890W(1) = w4 + w3 + w2 + w1 + w0 7891W(-1) = w4 - w3 + w2 - w1 + w0 7892W(2) = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0 7893W(inf) = w4 7894@end group 7895@end example 7896@end ifnottex 7897 7898This is a set of five equations in five unknowns, and some elementary linear 7899algebra quickly isolates each @m{w_i,w[i]}. This involves adding or 7900subtracting one @math{W(t)} value from another, and a couple of divisions by 7901powers of 2 and one division by 3, the latter using the special 7902@code{mpn_divexact_by3} (@pxref{Exact Division}). 7903 7904The conversion of @math{W(t)} values to the coefficients is interpolation. A 7905polynomial of degree 4 like @math{W(t)} is uniquely determined by values known 7906at 5 different points. The points are arbitrary and can be chosen to make the 7907linear equations come out with a convenient set of steps for quickly isolating 7908the @m{w_i,w[i]}. 7909 7910Squaring follows the same procedure as multiplication, but there's only one 7911@math{X(t)} and it's evaluated at the 5 points, and those values squared to 7912give values of @math{W(t)}. The interpolation is then identical, and in fact 7913the same @code{toom3_interpolate} subroutine is used for both squaring and 7914multiplying. 7915 7916Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being 7917@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the 7918original size each. This is an improvement over Karatsuba at 7919@math{O(N^@W{1.585})}, though Toom does more work in the evaluation and 7920interpolation and so it only realizes its advantage above a certain size. 
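
As an illustration only, the evaluate, multiply and interpolate steps described above can be sketched with Python's built-in big integers. This is not GMP's Toom-3 code: the name @code{toom3_mul}, the @code{k_bits} parameter and the particular order of the interpolation steps are assumptions chosen for readability, and the five pointwise products use the language's own multiply rather than recursing. Negative values at @math{t=-1} are simply left to Python's signed integers, whereas GMP tracks the sign separately.

```python
def toom3_mul(x, y, k_bits):
    """Multiply non-negative integers x and y with one level of Toom-3.

    k_bits is the size in bits of the two low pieces of each operand,
    i.e. b = 2**k_bits in the notation of the text (an assumed parameter).
    """
    b = 1 << k_bits
    mask = b - 1
    # Split each operand into three pieces, x = x2*b^2 + x1*b + x0.
    x0, x1, x2 = x & mask, (x >> k_bits) & mask, x >> (2 * k_bits)
    y0, y1, y2 = y & mask, (y >> k_bits) & mask, y >> (2 * k_bits)

    # Evaluate X(t) and Y(t) at the five points and multiply pointwise.
    v0 = x0 * y0                                           # W(0)   = w0
    v1 = (x2 + x1 + x0) * (y2 + y1 + y0)                   # W(1)
    vm1 = (x2 - x1 + x0) * (y2 - y1 + y0)                  # W(-1)
    v2 = (4 * x2 + 2 * x1 + x0) * (4 * y2 + 2 * y1 + y0)   # W(2)
    vinf = x2 * y2                                         # W(inf) = w4

    # Interpolate.  Every division below is exact.
    w0 = v0
    w4 = vinf
    w2 = (v1 + vm1) // 2 - w4 - w0              # from W(1)+W(-1)
    s = (v1 - vm1) // 2                         # s = w3 + w1
    t = (v2 - 16 * w4 - 4 * w2 - w0) // 2       # t = 4*w3 + w1
    w3 = (t - s) // 3                           # the single division by 3
    w1 = s - w3

    # Recombine: W(b) = w4*b^4 + w3*b^3 + w2*b^2 + w1*b + w0.
    return (((w4 * b + w3) * b + w2) * b + w1) * b + w0
```

The sketch uses two halvings of sums or differences of @math{W(t)} values, one further division by 2, and one exact division by 3, in line with the description of the interpolation given above.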
7921 7922Near the crossover between Toom-3 and Karatsuba there's generally a range of 7923sizes where the difference between the two is small. 7924@code{MUL_TOOM33_THRESHOLD} is a somewhat arbitrary point in that range and 7925successive runs of the tune program can give different values due to small 7926variations in measuring. A graph of time versus size for the two shows the 7927effect, see @file{tune/README}. 7928 7929At the fairly small sizes where the Toom-3 thresholds occur it's worth 7930remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be 7931expected to make accurate predictions, due of course to the big influence of 7932all sorts of overheads, and the fact that only a few recursions of each are 7933being performed. Even at large sizes there's a good chance machine dependent 7934effects like cache architecture will mean actual performance deviates from 7935what might be predicted. 7936 7937The formula given for the Karatsuba algorithm (@pxref{Karatsuba 7938Multiplication}) has an equivalent for Toom-3 involving only five multiplies, 7939but this would be complicated and unenlightening. 7940 7941An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using 7942a vector to represent the @math{x} and @math{y} splits and a matrix 7943multiplication for the evaluation and interpolation stages. The matrix 7944inverses are not meant to be actually used, and they have elements with values 7945much greater than in fact arise in the interpolation steps. The diagram shown 7946for the 3-way is attractive, but again doesn't have to be implemented that way 7947and for example with a bit of rearrangement just one division by 6 can be 7948done. 7949 7950 7951@node Toom 4-Way Multiplication, FFT Multiplication, Toom 3-Way Multiplication, Multiplication Algorithms 7952@subsection Toom 4-Way Multiplication 7953@cindex Toom multiplication 7954 7955Karatsuba and Toom-3 split the operands into 2 and 3 coefficients, 7956respectively. 
Toom-4 analogously splits the operands into 4 coefficients. 7957Using the notation from the section on Toom-3 multiplication, we form two 7958polynomials: 7959 7960@display 7961@group 7962@m{X(t) = x_3t^3 + x_2t^2 + x_1t + x_0, 7963 X(t) = x3*t^3 + x2*t^2 + x1*t + x0} 7964@m{Y(t) = y_3t^3 + y_2t^2 + y_1t + y_0, 7965 Y(t) = y3*t^3 + y2*t^2 + y1*t + y0} 7966@end group 7967@end display 7968 7969@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 7 points, giving 7970values of @math{W(t)} at those points. In GMP the following points are used, 7971 7972@quotation 7973@multitable {@m{t=-1/2,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} 7974@item Point @tab Value 7975@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately 7976@item @math{t=1/2} @tab @m{(x_3+2x_2+4x_1+8x_0)(y_3+2y_2+4y_1+8y_0),(x3+2*x2+4*x1+8*x0) * (y3+2*y2+4*y1+8*y0)} 7977@item @math{t=-1/2} @tab @m{(-x_3+2x_2-4x_1+8x_0)(-y_3+2y_2-4y_1+8y_0),(-x3+2*x2-4*x1+8*x0) * (-y3+2*y2-4*y1+8*y0)} 7978@item @math{t=1} @tab @m{(x_3+x_2+x_1+x_0)(y_3+y_2+y_1+y_0),(x3+x2+x1+x0) * (y3+y2+y1+y0)} 7979@item @math{t=-1} @tab @m{(-x_3+x_2-x_1+x_0)(-y_3+y_2-y_1+y_0),(-x3+x2-x1+x0) * (-y3+y2-y1+y0)} 7980@item @math{t=2} @tab @m{(8x_3+4x_2+2x_1+x_0)(8y_3+4y_2+2y_1+y_0),(8*x3+4*x2+2*x1+x0) * (8*y3+4*y2+2*y1+y0)} 7981@item @m{t=\infty,t=inf} @tab @m{x_3y_3,x3 * y3}, which gives @ms{w,6} immediately 7982@end multitable 7983@end quotation 7984 7985The number of additions and subtractions for Toom-4 is much larger than for Toom-3. 7986But several subexpressions occur multiple times, for example @m{x_2+x_0,x2+x0} occurs 7987for both @math{t=1} and @math{t=-1}. 7988 7989Toom-4 is asymptotically @math{O(N^@W{1.404})}, the exponent being 7990@m{\log7/\log4,log(7)/log(4)}, representing 7 recursive multiplies of 1/4 the 7991original size each. 
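
The seven evaluation points listed above, and the sharing of subexpressions between a point and its negative, can be checked with a small Python sketch. All names here (@code{poly_mul}, @code{poly_eval}, @code{toom4_evaluate}, @code{toom4_point_products}) are hypothetical helpers for illustration and do not correspond to GMP functions. The values at @math{t=1/2} and @math{t=-1/2} are taken in the scaled form shown in the table, so each operand value is 8 times the polynomial value and each product is 64 times @math{W(t)}.

```python
from fractions import Fraction

def poly_mul(x, y):
    """Schoolbook product of two coefficient lists, low coefficient first."""
    w = [0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            w[i + j] += xi * yj
    return w

def poly_eval(coeffs, t):
    """Evaluate a coefficient list (low coefficient first) at the point t."""
    return sum(c * t ** i for i, c in enumerate(coeffs))

def toom4_evaluate(x):
    """Values of one operand at 0, 1/2, -1/2, 1, -1, 2, inf (table order)."""
    x0, x1, x2, x3 = x
    even, odd = x2 + x0, x3 + x1                    # shared by t=1 and t=-1
    h_even, h_odd = 2 * x2 + 8 * x0, x3 + 4 * x1    # shared by t=1/2, t=-1/2
    return (x0,
            h_even + h_odd,     # 8*X(1/2)
            h_even - h_odd,     # 8*X(-1/2)
            even + odd,         # X(1)
            even - odd,         # X(-1)
            8 * x3 + 4 * x2 + 2 * x1 + x0,  # X(2)
            x3)                 # leading coefficient, the value "at infinity"

def toom4_point_products(x, y):
    """The seven pointwise products, in the same order as the table."""
    return tuple(a * b for a, b in zip(toom4_evaluate(x), toom4_evaluate(y)))
```

Comparing these products with a schoolbook @code{poly_mul} of the same coefficients confirms that they are @math{W(0)}, @math{64W(1/2)}, @math{64W(-1/2)}, @math{W(1)}, @math{W(-1)}, @math{W(2)} and @ms{w,6} respectively.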
7992 7993 7994@node FFT Multiplication, Other Multiplication, Toom 4-Way Multiplication, Multiplication Algorithms 7995@subsection FFT Multiplication 7996@cindex FFT multiplication 7997@cindex Fast Fourier Transform 7998 7999At large to very large sizes a Fermat style FFT multiplication is used, 8000following Sch@"onhage and Strassen (@pxref{References}). Descriptions of FFTs 8001in various forms can be found in many textbooks, for instance Knuth section 80024.3.3 part C or Lipson chapter IX@. A brief description of the form used in 8003GMP is given here. 8004 8005The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given 8006@math{N}. A full product @m{xy,x*y} is obtained by choosing @m{N \ge 8007\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding 8008@math{x} and @math{y} with high zero limbs. The modular product is the native 8009form for the algorithm, so padding to get a full product is unavoidable. 8010 8011The algorithm follows a split, evaluate, pointwise multiply, interpolate and 8012combine similar to that described above for Karatsuba and Toom-3. A @math{k} 8013parameter controls the split, with an FFT-@math{k} splitting into @math{2^k} 8014pieces of @math{M=N/2^k} bits each. @math{N} must be a multiple of 8015@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so 8016the split falls on limb boundaries, avoiding bit shifts in the split and 8017combine stages. 8018 8019The evaluations, pointwise multiplications, and interpolation, are all done 8020modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a 8021multiple of @math{2^k} and of @code{mp_bits_per_limb}. The results of 8022interpolation will be the following negacyclic convolution of the input 8023pieces, and the choice of @math{N'} ensures these sums aren't truncated. 
8024@tex 8025$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$ 8026@end tex 8027@ifnottex 8028 8029@example 8030 --- 8031 \ b 8032w[n] = / (-1) * x[i] * y[j] 8033 --- 8034 i+j==b*2^k+n 8035 b=0,1 8036@end example 8037 8038@end ifnottex 8039The points used for the evaluation are @math{g^i} for @math{i=0} to 8040@math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}. @math{g} is a 8041@m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary 8042cancellations at the interpolation stage, and it's also a power of 2 so the 8043fast Fourier transforms used for the evaluation and interpolation do only 8044shifts, adds and negations. 8045 8046The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either 8047recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or 8048basecase), whichever is optimal at the size @math{N'}. The interpolation is 8049an inverse fast Fourier transform. The resulting set of sums of @m{x_iy_j, 8050x[i]*y[j]} are added at appropriate offsets to give the final result. 8051 8052Squaring is the same, but @math{x} is the only input so it's one transform at 8053the evaluate stage and the pointwise multiplies are squares. The 8054interpolation is the same. 8055 8056For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}), 8057O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed 8058modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original. 8059Each successive @math{k} is an asymptotic improvement, but overheads mean each 8060is only faster at bigger and bigger sizes. In the code, @code{MUL_FFT_TABLE} 8061and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used. Each 8062new @math{k} effectively swaps some multiplying for some shifts, adds and 8063overheads. 
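
The negacyclic convolution above, and the way it yields a product mod @math{2^N+1}, can be illustrated with a direct Python sketch. This deliberately computes the convolution with a quadratic double loop instead of a fast Fourier transform, so it shows only what the FFT computes, not how GMP computes it. The names @code{negacyclic_convolution}, @code{split_bits} and @code{mul_mod_fermat}, and the @code{m_bits} parameter (the piece size @math{M}), are assumptions made for the example, and the inputs are assumed to be less than @math{2^N}.

```python
def negacyclic_convolution(x, y):
    """w[n] = sum of (-1)**b * x[i] * y[j] over i+j == b*len(x) + n, b in 0,1."""
    size = len(x)
    assert len(y) == size
    w = [0] * size
    for i in range(size):
        for j in range(size):
            b, n = divmod(i + j, size)      # b is 0 or 1 since i, j < size
            w[n] += (-1) ** b * x[i] * y[j]
    return w

def split_bits(v, pieces, m_bits):
    """Split v into `pieces` pieces of m_bits bits each, low piece first."""
    mask = (1 << m_bits) - 1
    return [(v >> (i * m_bits)) & mask for i in range(pieces)]

def mul_mod_fermat(x, y, k, m_bits):
    """x*y mod 2^N+1 for N = 2^k * m_bits, assuming 0 <= x, y < 2^N."""
    pieces = 1 << k
    n_bits = pieces * m_bits
    w = negacyclic_convolution(split_bits(x, pieces, m_bits),
                               split_bits(y, pieces, m_bits))
    # Add each w[n] at bit offset n*M; wrapped terms already carry a minus
    # sign because 2^N is congruent to -1 modulo 2^N+1.
    total = sum(wn << (n * m_bits) for n, wn in enumerate(w))
    return total % ((1 << n_bits) + 1)
```

For instance with @math{k=3} and 8-bit pieces the function multiplies 64-bit values modulo @math{2^64+1}, and the result agrees with an ordinary multiply followed by a reduction.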
8064 8065A mod @math{2^N+1} product can be formed with a normal 8066@math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT 8067and Toom-3 etc can be compared directly. A @math{k=4} FFT at 8068@math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at 8069@math{O(N^@W{1.465})}. In practice this is what's found, with 8070@code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between 8071300 and 1000 limbs, depending on the CPU@. So far it's been found that only 8072very large FFTs recurse into pointwise multiplies above these sizes. 8073 8074When an FFT is to give a full product, the change of @math{N} to @math{2N} 8075doesn't alter the theoretical complexity for a given @math{k}, but for the 8076purposes of considering where an FFT might be first used it can be assumed 8077that the FFT is recursing into a normal multiply and that on that basis it's 8078doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of 8079the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}. This would mean 8080@math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3. 8081In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been 8082found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs. 8083 8084The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is 8085rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that 8086when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a 8087multiple of @m{2^{2k-1},2^(2k-1)} bits. The @math{+k+3} means some values of 8088@math{N} just under such a multiple will be rounded to the next. The 8089complexity calculations above assume that a favourable size is used, meaning 8090one which isn't padded through rounding, and it's also assumed that the extra 8091@math{+k+3} bits are negligible at typical FFT sizes. 
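
The rounding of @math{2M+k+3} can be made concrete with a few lines of Python. This is a sketch of the arithmetic as stated in the text only, not of the size selection actually performed in GMP's FFT code. The function name @code{fft_pointwise_bits} and its parameters are hypothetical, and a 64-bit limb is assumed as the default.

```python
import math

def fft_pointwise_bits(n_bits, k, limb_bits=64):
    """Return N' for an FFT-k on an N-bit modular product.

    N' is 2*M + k + 3 rounded up to a common multiple of 2^k and the
    limb size, where M = N / 2^k is the piece size in bits.
    """
    pieces = 1 << k
    # N must let the split fall on limb boundaries.
    assert n_bits % (pieces * limb_bits) == 0
    m_bits = n_bits // pieces
    # Least common multiple of 2^k and the limb size.
    step = pieces * limb_bits // math.gcd(pieces, limb_bits)
    needed = 2 * m_bits + k + 3
    return -(-needed // step) * step        # round up to a multiple of step
```

With @math{k=8} and @math{N=65536} bits the pieces are 256 bits, @math{2M+k+3} is 523, and the result is 768 bits. The @math{+k+3} term pushes the size past 512, which is the kind of padding that the favourable sizes mentioned above avoid.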
8092 8093The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a 8094step-effect into measured speeds. For example @math{k=8} will round @math{N} 8095up to a multiple of 32768 bits, so for a 32-bit limb there'll be 512 limb 8096groups of sizes for which @code{mpn_mul_n} runs at the same speed. Or for 8097@math{k=9} groups of 2048 limbs, @math{k=10} groups of 8192 limbs, etc. In 8098practice it's been found each @math{k} is used at quite small multiples of its 8099size constraint and so the step effect is quite noticeable in a time versus 8100size graph. 8101 8102The threshold determinations currently measure at the mid-points of size 8103steps, but this is sub-optimal since at the start of a new step it can happen 8104that it's better to go back to the previous @math{k} for a while. Something 8105more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be 8106needed. 8107 8108 8109@node Other Multiplication, Unbalanced Multiplication, FFT Multiplication, Multiplication Algorithms 8110@subsection Other Multiplication 8111@cindex Toom multiplication 8112 8113The Toom algorithms described above (@pxref{Toom 3-Way Multiplication}, 8114@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary 8115number of pieces, as per Knuth section 4.3.3 algorithm C@. This is not 8116currently used. The notes here are merely for interest. 8117 8118In general a split into @math{r+1} pieces is made, and evaluations and 8119pointwise multiplications done at @m{2r+1,2*r+1} points. A 4-way split does 7 8120pointwise multiplies, 5-way does 9, etc. Asymptotically an @math{(r+1)}-way 8121algorithm is @m{O(N^{\log(2r+1)/\log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}. 
Only 8122the pointwise multiplications count towards big-@math{O} complexity, but the 8123time spent in the evaluate and interpolate stages grows with @math{r} and has 8124a significant practical impact, with the asymptotic advantage of each @math{r} 8125realized only at bigger and bigger sizes. The overheads grow as 8126@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log 8127r), O(N*log(r))}. 8128 8129Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4 8130uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small 8131multiplies in the evaluate stage (or rather trades them for additions), and 8132has a further saving of nearly half the interpolate steps. The idea is to 8133separate odd and even final coefficients and then perform algorithm C steps C7 8134and C8 on them separately. The divisors at step C7 become @math{j^2} and the 8135multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}. 8136 8137Splitting odd and even parts through positive and negative points can be 8138thought of as using @math{-1} as a square root of unity. If a 4th root of 8139unity was available then a further split and speedup would be possible, but no 8140such root exists for plain integers. Going to complex integers with 8141@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in Cartesian 8142form it takes three real multiplies to do a complex multiply. The existence 8143of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast 8144Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}. 8145 8146Floating point FFTs use complex numbers approximating Nth roots of unity. 8147Some processors have special support for such FFTs. But these are not used in 8148GMP since it's very difficult to guarantee an exact result (to some number of 8149bits). 
An occasional difference of 1 in the last bit might not matter to a 8150typical signal processing algorithm, but is of course of vital importance to 8151GMP. 8152 8153 8154@node Unbalanced Multiplication, , Other Multiplication, Multiplication Algorithms 8155@subsection Unbalanced Multiplication 8156@cindex Unbalanced multiplication 8157 8158Multiplication of operands with different sizes, both below 8159@code{MUL_TOOM22_THRESHOLD}, is done with plain schoolbook multiplication 8160(@pxref{Basecase Multiplication}). 8161 8162For really large operands, we invoke FFT directly. 8163 8164For operands between these sizes, we use Toom-inspired algorithms suggested by 8165Alberto Zanoni and Marco Bodrato. The idea is to split the operands into 8166polynomials of different degree. GMP currently splits the smaller operand 8167into 2 coefficients, i.e., a polynomial of degree 1, but the larger operand 8168can be split into 2, 3, or 4 coefficients, i.e., a polynomial of degree 1 to 81693. 8170 8171@c FIXME: This is mighty ugly, but a cleaner @need triggers texinfo bugs that 8172@c screws up layout here and there in the rest of the manual. 8173@c @tex 8174@c \goodbreak 8175@c @end tex 8176@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms 8177@section Division Algorithms 8178@cindex Division algorithms 8179 8180@menu 8181* Single Limb Division:: 8182* Basecase Division:: 8183* Divide and Conquer Division:: 8184* Block-Wise Barrett Division:: 8185* Exact Division:: 8186* Exact Remainder:: 8187* Small Quotient Division:: 8188@end menu 8189 8190 8191@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms 8192@subsection Single Limb Division 8193 8194N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from 8195high to low, either with a hardware divide instruction or a multiplication by 8196inverse, whichever is best on a given CPU. 
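
The high to low sequence of 2@cross{}1 divisions can be sketched in Python as follows. This is an illustration under assumed names (@code{to_limbs}, @code{from_limbs} and @code{divrem_1} are hypothetical helpers, not the @code{mpn} interface), with limbs held in a list, least significant first, and with Python's @code{divmod} standing in for whichever 2@cross{}1 division step a CPU would use.

```python
def to_limbs(value, limb_bits):
    """Split a non-negative integer into limbs, least significant first."""
    mask = (1 << limb_bits) - 1
    limbs = []
    while value:
        limbs.append(value & mask)
        value >>= limb_bits
    return limbs

def from_limbs(limbs, limb_bits):
    """Inverse of to_limbs."""
    return sum(limb << (i * limb_bits) for i, limb in enumerate(limbs))

def divrem_1(limbs, d, limb_bits=64):
    """Divide a limb list by a single limb d; return (quotient limbs, remainder)."""
    base = 1 << limb_bits
    assert 0 < d < base
    quotient = [0] * len(limbs)
    r = 0
    for i in reversed(range(len(limbs))):       # high limb to low limb
        # One 2x1 division of the two-limb value (r, limbs[i]) by d.
        # Since r < d the quotient always fits in a single limb.
        quotient[i], r = divmod(r * base + limbs[i], d)
    return quotient, r
```

The remainder from each step becomes the high limb of the next two-limb dividend, which is what keeps every quotient limb within one limb.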
8197 8198The multiply by inverse follows ``Improved division by invariant integers'' by 8199M@"oller and Granlund (@pxref{References}) and is implemented as 8200@code{udiv_qrnnd_preinv} in @file{gmp-impl.h}. The idea is to have a 8201fixed-point approximation to @math{1/d} (see @code{invert_limb}) and then 8202multiply by the high limb (plus one bit) of the dividend to get a quotient 8203@math{q}. With @math{d} normalized (high bit set), @math{q} is no more than 1 8204too small. Subtracting @m{qd,q*d} from the dividend gives a remainder, and 8205reveals whether @math{q} or @math{q-1} is correct. 8206 8207The result is a division done with two multiplications and four or five 8208arithmetic operations. On CPUs with low latency multipliers this can be much 8209faster than a hardware divide, though the cost of calculating the inverse at 8210the start may mean it's only better on inputs bigger than say 4 or 5 limbs. 8211 8212When a divisor must be normalized, either for the generic C 8213@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is 8214actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and 8215@math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set. 8216The bit shifts for the dividend are usually accomplished ``on the fly'' 8217meaning by extracting the appropriate bits at each step. Done this way the 8218quotient limbs come out aligned ready to store. When only the remainder is 8219wanted, an alternative is to take the dividend limbs unshifted and calculate 8220@m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k 8221\bmod d2^k, r*2^k mod d*2^k}. This can help on CPUs with poor bit shifts or 8222few registers. 8223 8224The multiply by inverse can be done two limbs at a time. The calculation is 8225basically the same, but the inverse is two limbs and the divisor treated as if 8226padded with a low zero limb. 
This means more work, since the inverse will 8227need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are 8228independent and can therefore be done partly or wholly in parallel. Likewise 8229for a 2@cross{}1 calculating @m{qd,q*d}. The net effect is to process two 8230limbs with roughly the same two multiplies worth of latency that one limb at a 8231time gives. This extends to 3 or 4 limbs at a time, though the extra work to 8232apply the inverse will almost certainly soon reach the limits of multiplier 8233throughput. 8234 8235A similar approach in reverse can be taken to process just half a limb at a 8236time if the divisor is only a half limb. In this case the 1@cross{}1 multiply 8237for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each 8238limb, which can be a saving on CPUs with a fast half limb multiply, or in fact 8239if the only multiply is a half limb, and especially if it's not pipelined. 8240 8241 8242@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms 8243@subsection Basecase Division 8244 8245Basecase N@cross{}M division is like long division done by hand, but in base 8246@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}. See Knuth 8247section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}. 8248 8249Briefly stated, while the dividend remains larger than the divisor, a high 8250quotient limb is formed and the N@cross{}1 product @m{qd,q*d} subtracted at 8251the top end of the dividend. With a normalized divisor (most significant bit 8252set), each quotient limb can be formed with a 2@cross{}1 division and a 82531@cross{}1 multiplication plus some subtractions. The 2@cross{}1 division is 8254by the high limb of the divisor and is done either with a hardware divide or a 8255multiply by inverse (the same as in @ref{Single Limb Division}) whichever is 8256faster. 
Such a quotient is sometimes one too big, requiring an addback of the 8257divisor, but that happens rarely. 8258 8259With Q=N@minus{}M being the number of quotient limbs, this is an 8260@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase 8261Q@cross{}M multiplication, differing in fact only in the extra multiply and 8262divide for each of the Q quotient limbs. 8263 8264 8265@node Divide and Conquer Division, Block-Wise Barrett Division, Basecase Division, Division Algorithms 8266@subsection Divide and Conquer Division 8267 8268For divisors larger than @code{DC_DIV_QR_THRESHOLD}, division is done by dividing. 8269Or to be precise by a recursive divide and conquer algorithm based on work by 8270Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}). 8271 8272The algorithm consists essentially of recognising that a 2N@cross{}N division 8273can be done with the basecase division algorithm (@pxref{Basecase Division}), 8274but using N/2 limbs as a base, not just a single limb. This way the 8275multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of 8276Karatsuba and higher multiplication algorithms (@pxref{Multiplication 8277Algorithms}). The two ``digits'' of the quotient are formed by recursive 8278N@cross{}(N/2) divisions. 8279 8280If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication 8281then the work is about the same as a basecase division, but with more function 8282call overheads and with some subtractions separated from the multiplies. 8283These overheads mean that it's only when N/2 is above 8284@code{MUL_TOOM22_THRESHOLD} that divide and conquer is of use. 8285 8286@code{DC_DIV_QR_THRESHOLD} is based on the divisor size N, so it will be somewhere 8287above twice @code{MUL_TOOM22_THRESHOLD}, but how much above depends on the 8288CPU@. 
An optimized @code{mpn_mul_basecase} can lower @code{DC_DIV_QR_THRESHOLD} a 8289little by offering a ready-made advantage over repeated @code{mpn_submul_1} 8290calls. 8291 8292Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where 8293@math{M(N)} is the time for an N@cross{}N multiplication done with FFTs. The 8294actual time is a sum over multiplications of the recursed sizes, as can be 8295seen near the end of section 2.2 of Burnikel and Ziegler. For example, within 8296the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}. With higher 8297algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log 8298N, log(N)}. In practice, at moderate to large sizes, a 2N@cross{}N division 8299is about 2 to 4 times slower than an N@cross{}N multiplication. 8300 8301 8302@node Block-Wise Barrett Division, Exact Division, Divide and Conquer Division, Division Algorithms 8303@subsection Block-Wise Barrett Division 8304 8305For the largest divisions, a block-wise Barrett division algorithm is used. 8306Here, the divisor is inverted to a precision determined by the relative size of 8307the dividend and divisor. Blocks of quotient limbs are then generated by 8308multiplying blocks from the dividend by the inverse. 8309 8310Our block-wise algorithm computes a smaller inverse than in the plain Barrett 8311algorithm. For a @math{2n/n} division, the inverse will be just @m{\lceil n/2 8312\rceil, ceil(n/2)} limbs. 8313 8314 8315@node Exact Division, Exact Remainder, Block-Wise Barrett Division, Division Algorithms 8316@subsection Exact Division 8317 8318 8319A so-called exact division is when the dividend is known to be an exact 8320multiple of the divisor. Jebelean's exact division algorithm uses this 8321knowledge to make some significant optimizations (@pxref{References}). 8322 8323The idea can be illustrated in decimal for example with 368154 divided by 8324543. 
Because the low digit of the dividend is 4, the low digit of the
quotient must be 8. This is arrived at from @m{4 \mathord{\times} 7 \bmod 10,
4*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of
the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7
@equiv{} 1 mod 10}. So @m{8\mathord{\times}543 = 4344,8*543=4344} can be
subtracted from the dividend leaving 363810. Notice the low digit has become
zero.

The procedure is repeated at the second digit, with the next quotient digit 7
(@m{1 \mathord{\times} 7 \bmod 10, 7 @equiv{} 1*7 mod 10}), subtracting
@m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800. And finally at
the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7
mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0.
So the quotient is 678.

Notice however that the multiplies and subtractions don't need to extend past
the low three digits of the dividend, since that's enough to determine the
three quotient digits. For the last quotient digit no subtraction is needed
at all. On a 2N@cross{}N division like this one, only about half the work of
a normal basecase division is necessary.

For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the
saving over a normal basecase division is in two parts. Firstly, each of the
Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and
multiply. Secondly, the crossproducts are reduced when @math{Q>M} to
@m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2,
Q*(Q-1)/2}. Notice the savings are complementary. If Q is big then many
divisions are saved, or if Q is small then the crossproducts reduce to a small
number.

The modular inverse used is calculated efficiently by @code{binvert_limb} in
@file{gmp-impl.h}.
This does four multiplies for a 32-bit limb, or six for a
64-bit limb. @file{tune/modlinv.c} has some alternate implementations that
might suit processors better at bit twiddling than multiplying.

The sub-quadratic exact division described by Jebelean in ``Exact Division
with Karatsuba Complexity'' is not currently implemented. It uses a
rearrangement similar to the divide and conquer for normal division
(@pxref{Divide and Conquer Division}), but operating from low to high. A
further possibility not currently implemented is ``Bidirectional Exact Integer
Division'' by Krandick and Jebelean which forms quotient limbs from both the
high and low ends of the dividend, and can halve once more the number of
crossproducts needed in a 2N@cross{}N division.

A special case exact division by 3 exists in @code{mpn_divexact_by3},
supporting Toom-3 multiplication and @code{mpq} canonicalizations. It forms
quotient digits with a multiply by the modular inverse of 3 (which is
@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next
limb. The multiplications don't need to be on the dependent chain, as long as
the effect of the borrows is applied, which can help chips with pipelined
multipliers.


@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms
@subsection Exact Remainder
@cindex Exact remainder

If the exact division algorithm is done with a full subtraction at each stage
and the dividend isn't a multiple of the divisor, then low zero limbs are
produced but with a remainder in the high limbs.
For dividend @math{a},
divisor @math{d}, quotient @math{q}, and @m{b = 2
\GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this remainder
@math{r} is of the form
@tex
$$ a = qd + r b^n $$
@end tex
@ifnottex

@example
a = q*d + r*b^n
@end example

@end ifnottex
@math{n} represents the number of zero limbs produced by the subtractions,
that being the number of limbs produced for @math{q}. @math{r} will be in the
range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by
a factor of @math{b^n}.

Carrying out full subtractions at each stage means the same number of cross
products must be done as a normal division, but there's still some single limb
divisions saved. When @math{d} is a single limb some simplifications arise,
providing good speedups on a number of processors.

@code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the @code{mpn_redc_X}
functions differ subtly in how they return @math{r}, leading to some negations
in the above formula, but all are essentially the same.

@cindex Divisibility algorithm
@cindex Congruence algorithm
Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this
leads to divisibility or congruence tests which are potentially more efficient
than a normal division.

The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is
odd, hence the use of @code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and
@code{mpz_kronecker_ui} etc (@pxref{Greatest Common Divisor Algorithms}).
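
The low-to-high procedure can be sketched in a few lines of Python, using
decimal digits in place of limbs. This is only an illustration of the identity
@math{a = q*d + r*b^n}: the name @code{exact_remainder} and its argument order
are inventions of this sketch, not GMP functions, and its sign convention is
its own (for a small @math{a} the returned @math{r} can be negative, whereas
the GMP routines mentioned above each apply their own negations).

```python
def exact_remainder(a, d, b, n):
    """Return (q, r) with a == q*d + r*b**n, working from the low end.

    Hypothetical helper for illustration; requires gcd(d, b) == 1.
    """
    d_inv = pow(d % b, -1, b)          # modular inverse of the low digit of d
    q = 0
    for i in range(n):
        q_i = (a % b) * d_inv % b      # digit that makes the low digit of a vanish
        a = (a - q_i * d) // b         # full subtraction, then drop the zero digit
        q += q_i * b**i
    return q, a                        # what is left of a is the shifted remainder


print(exact_remainder(368154, 543, 10, 3))   # (678, 0): an exact multiple
print(exact_remainder(368155, 543, 10, 3))   # (85, 322): 85*543 + 322*1000
```

A zero @math{r} signals divisibility without any trial quotient estimation,
which is the property the divisibility and congruence tests rely on.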

Montgomery's REDC method for modular multiplications uses operands of the form
@m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n} and on calculating @m{(xb^{-n})
(yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in the exact
remainder to reach a product in the same form @m{(xy)b^{-n}, (x*y)*b^-n}
(@pxref{Modular Powering Algorithm}).

Notice that @math{r} generally gives no useful information about the ordinary
remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything. If
however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the
ordinary remainder. This occurs whenever @math{d} is a factor of
@math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}. For a 32 or
64 bit limb other such factors include 5, 17 and 257, but no particular use
has been found for this.


@node Small Quotient Division, , Exact Remainder, Division Algorithms
@subsection Small Quotient Division

An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is
small can be optimized somewhat.

An ordinary basecase division normalizes the divisor by shifting it to make
the high bit set, shifting the dividend accordingly, and shifting the
remainder back down at the end of the calculation. This is wasteful if only a
few quotient limbs are to be formed. Instead a division of just the top
@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be
used to form a trial quotient. This requires only those limbs normalized, not
the whole of the divisor and dividend.

A multiply and subtract then applies the trial quotient to the M@minus{}Q
unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q
limbs remaining from the trial quotient division).
The starting trial
quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1
too big are detected by first comparing the most significant limbs that will
arise from the subtraction. An addback is done if the quotient still turns
out to be 1 too big.

This whole procedure is essentially the same as one step of the basecase
algorithm done in a Q limb base, though with the trial quotient test done only
with the high limbs, not an entire Q limb ``digit'' product. The correctness
of this weaker test can be established by following the argument of Knuth
section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r
+ u_2, v2*q>b*r+u2} condition appropriately relaxed.


@need 1000
@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms
@section Greatest Common Divisor
@cindex Greatest common divisor algorithms
@cindex GCD algorithms

@menu
* Binary GCD::
* Lehmer's Algorithm::
* Subquadratic GCD::
* Extended GCD::
* Jacobi Symbol::
@end menu


@node Binary GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms
@subsection Binary GCD

At small sizes GMP uses an @math{O(N^2)} binary style GCD@. This is described
in many textbooks, for example Knuth section 4.5.2 algorithm B@. It simply
consists of successively reducing odd operands @math{a} and @math{b} using

@quotation
@math{a,b = @abs{}(a-b),@min{}(a,b)} @*
strip factors of 2 from @math{a}
@end quotation

The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly
computes the quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and replaces
@math{a,b} by @math{b, a - q b}. The binary algorithm has so far been found to
be faster than the Euclidean algorithm everywhere.
One reason the binary
method does well is that the implied quotient at each step is usually small,
so often only one or two subtractions are needed to get the same effect as a
division. Quotients 1, 2 and 3 for example occur 67.7% of the time, see Knuth
section 4.5.3 Theorem E.

When the implied quotient is large, meaning @math{b} is much smaller than
@math{a}, then a division is worthwhile. This is the basis for the initial
@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter
for both N@cross{}1 and 1@cross{}1 cases). But after that initial reduction,
big quotients occur too rarely to make it worth checking for them.

@sp 1
The final @math{1@cross{}1} GCD in @code{mpn_gcd_1} is done in the generic C
code as described above. For two N-bit operands, the algorithm takes about
0.68 iterations per bit. For optimum performance some attention needs to be
paid to the way the factors of 2 are stripped from @math{a}.

Firstly it may be noted that in twos complement the number of low zero bits on
@math{a-b} is the same as @math{b-a}, so counting or testing can begin on
@math{a-b} without waiting for @math{@abs{}(a-b)} to be determined.

A loop stripping low zero bits tends not to branch predict well, since the
condition is data dependent. But on average there's only a few low zeros, so
an option is to strip one or two bits arithmetically then loop for more (as
done for AMD K6). Or use a lookup table to get a count for several bits then
loop for more (as done for AMD K7). An alternative approach is to keep just
one of @math{a} or @math{b} odd and iterate

@quotation
@math{a,b = @abs{}(a-b), @min{}(a,b)} @*
@math{a = a/2} if even @*
@math{b = b/2} if even
@end quotation

This requires about 1.25 iterations per bit, but stripping of a single bit at
each step avoids any branching.
Repeating the bit strip reduces to about 0.9
iterations per bit, which may be a worthwhile tradeoff.

Generally with the above approaches a speed of perhaps 6 cycles per bit can be
achieved, which is still not terribly fast with for instance a 64-bit GCD
taking nearly 400 cycles. It's this sort of time which means it's not usually
advantageous to combine a set of divisibility tests into a GCD.

Currently, the binary algorithm is used for GCD only when @math{N < 3}.

@node Lehmer's Algorithm, Subquadratic GCD, Binary GCD, Greatest Common Divisor Algorithms
@comment node-name, next, previous, up
@subsection Lehmer's Algorithm

Lehmer's improvement of the Euclidean algorithm is based on the observation
that the initial part of the quotient sequence depends only on the most
significant parts of the inputs. The variant of Lehmer's algorithm used in GMP
splits off the most significant two limbs, as suggested, e.g., in ``A
Double-Digit Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The
quotients of two double-limb inputs are collected as a 2 by 2 matrix with
single-limb elements. This is done by the function @code{mpn_hgcd2}. The
resulting matrix is applied to the inputs using @code{mpn_mul_1} and
@code{mpn_submul_1}. Each iteration usually reduces the inputs by almost one
limb. In the rare case of a large quotient, no progress can be made by
examining just the most significant two limbs, and the quotient is computed
using plain division.

The resulting algorithm is asymptotically @math{O(N^2)}, just as the Euclidean
algorithm and the binary algorithm. The quadratic part of the work is
the calls to @code{mpn_mul_1} and @code{mpn_submul_1}. For small sizes, the
linear work is also significant. There are roughly @math{N} calls to the
@code{mpn_hgcd2} function.
This function uses a couple of important
optimizations:

@itemize
@item
It uses the same relaxed notion of correctness as @code{mpn_hgcd} (see next
section). This means that when called with the most significant two limbs of
two large numbers, the returned matrix does not always correspond exactly to
the initial quotient sequence for the two large numbers; the final quotient
may sometimes be one off.

@item
It takes advantage of the fact the quotients are usually small. The division
operator is not used, since the corresponding assembler instruction is very
slow on most architectures. (This code could probably be improved further, it
uses many branches that are unfriendly to prediction).

@item
It switches from double-limb calculations to single-limb calculations half-way
through, when the input numbers have been reduced in size from two limbs to
one and a half.

@end itemize

@node Subquadratic GCD, Extended GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms
@subsection Subquadratic GCD

For inputs larger than @code{GCD_DC_THRESHOLD}, GCD is computed via the HGCD
(Half GCD) function, as a generalization of Lehmer's algorithm.

Let the inputs @math{a,b} be of size @math{N} limbs each. Put @m{S=\lfloor N/2
\rfloor + 1, S = floor(N/2) + 1}. Then HGCD(a,b) returns a transformation
matrix @math{T} with non-negative elements, and reduced numbers @math{(c;d) =
T^{-1} (a;b)}. The reduced numbers @math{c,d} must be larger than @math{S}
limbs, while their difference @math{abs(c-d)} must fit in @math{S} limbs. The
matrix elements will also be of size roughly @math{N/2}.

The HGCD base case uses Lehmer's algorithm, but with the above stop condition
that returns reduced numbers and the corresponding transformation matrix
half-way through.
For inputs larger than @code{HGCD_THRESHOLD}, HGCD is
computed recursively, using the divide and conquer algorithm in ``On
Sch@"onhage's algorithm and subquadratic integer GCD computation'' by M@"oller
(@pxref{References}). The recursive algorithm consists of these main
steps.

@itemize

@item
Call HGCD recursively, on the most significant @math{N/2} limbs. Apply the
resulting matrix @math{T_1} to the full numbers, reducing them to a size just
above @math{3N/4}.

@item
Perform a small number of division or subtraction steps to reduce the numbers
to size below @math{3N/4}. This is essential mainly for the unlikely case of
large quotients.

@item
Call HGCD recursively, on the most significant @math{N/2} limbs of the reduced
numbers. Apply the resulting matrix @math{T_2} to the full numbers, reducing
them to a size just above @math{N/2}.

@item
Compute @math{T = T_1 T_2}.

@item
Perform a small number of division and subtraction steps to satisfy the
requirements, and return.
@end itemize

GCD is then implemented as a loop around HGCD, similarly to Lehmer's
algorithm. Where Lehmer repeatedly chops off the top two limbs, calls
@code{mpn_hgcd2}, and applies the resulting matrix to the full numbers, the
subquadratic GCD chops off the most significant third of the limbs (the
proportion is a tuning parameter, and @math{1/3} seems to be more efficient
than, e.g., @math{1/2}), calls @code{mpn_hgcd}, and applies the resulting
matrix. Once the input numbers are reduced to size below
@code{GCD_DC_THRESHOLD}, Lehmer's algorithm is used for the rest of the work.

The asymptotic running time of both HGCD and GCD is @m{O(M(N)\log N),O(M(N)*log(N))},
where @math{M(N)} is the time for multiplying two @math{N}-limb numbers.

@comment node-name, next, previous, up

@node Extended GCD, Jacobi Symbol, Subquadratic GCD, Greatest Common Divisor Algorithms
@subsection Extended GCD

The extended GCD function, or GCDEXT, calculates @math{@gcd{}(a,b)} and also
cofactors @math{x} and @math{y} satisfying @m{ax+by=\gcd(a@C{}b),
a*x+b*y=gcd(a@C{}b)}. All the algorithms used for plain GCD are extended to
handle this case. The binary algorithm is used only for single-limb GCDEXT.
Lehmer's algorithm is used for sizes up to @code{GCDEXT_DC_THRESHOLD}. Above
this threshold, GCDEXT is implemented as a loop around HGCD, but with more
book-keeping to keep track of the cofactors. This gives the same asymptotic
running time as for GCD and HGCD, @m{O(M(N)\log N),O(M(N)*log(N))}.

One difference to plain GCD is that while the inputs @math{a} and @math{b} are
reduced as the algorithm proceeds, the cofactors @math{x} and @math{y} grow in
size. This makes the tuning of the chopping-point more difficult. The current
code chops off the most significant half of the inputs for the call to HGCD in
the first iteration, and the most significant two thirds for the remaining
calls. This strategy could surely be improved. Also the stop condition for the
loop, where Lehmer's algorithm is invoked once the inputs are reduced below
@code{GCDEXT_DC_THRESHOLD}, could maybe be improved by taking into account the
current size of the cofactors.

@node Jacobi Symbol, , Extended GCD, Greatest Common Divisor Algorithms
@subsection Jacobi Symbol
@cindex Jacobi symbol algorithm

@code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a
simple binary algorithm similar to that described for the GCDs (@pxref{Binary
GCD}). They're not very fast when both inputs are large. Lehmer's multi-step
improvement or a binary based multi-step algorithm is likely to be better.
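
As an illustration of what a binary style Jacobi symbol calculation looks
like, the following Python sketch uses only subtraction, halving and the usual
reciprocity and supplementary laws. It is a textbook formulation written for
this example (the name @code{jacobi_binary} is hypothetical) and is not a
transcription of the GMP code, which accumulates the sign with bit twiddling.

```python
def jacobi_binary(a, n):
    """Jacobi symbol (a/n) for odd n > 0, using subtract-and-shift steps."""
    assert n > 0 and n % 2 == 1
    a %= n
    sign = 1
    while a != 0:
        while a % 2 == 0:              # strip factors of 2 from a
            a //= 2
            if n % 8 in (3, 5):        # (2/n) = -1 when n = 3 or 5 mod 8
                sign = -sign
        if a < n:                      # keep the larger operand in a
            a, n = n, a
            if a % 4 == 3 and n % 4 == 3:
                sign = -sign           # quadratic reciprocity for odd a and n
        a -= n                         # (a/n) = ((a-n)/n); the difference is even
    return sign if n == 1 else 0       # n now holds the gcd of the inputs


print(jacobi_binary(2, 7), jacobi_binary(3, 7), jacobi_binary(6, 15))   # 1 -1 0
```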

When one operand fits a single limb, and that includes @code{mpz_kronecker_ui}
and friends, an initial reduction is done with either @code{mpn_mod_1} or
@code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb.
The binary algorithm is well suited to a single limb, and the whole
calculation in this case is quite efficient.

In all the routines sign changes for the result are accumulated using some bit
twiddling, avoiding table lookups or conditional jumps.


@need 1000
@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms
@section Powering Algorithms
@cindex Powering algorithms

@menu
* Normal Powering Algorithm::
* Modular Powering Algorithm::
@end menu


@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms
@subsection Normal Powering

Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm,
successively squaring and then multiplying by the base when a 1 bit is seen in
the exponent, as per Knuth section 4.6.3. The ``left to right''
variant described there is used rather than algorithm A, since it's just as
easy and can be done with somewhat less temporary memory.


@node Modular Powering Algorithm, , Normal Powering Algorithm, Powering Algorithms
@subsection Modular Powering

Modular powering is implemented using a @math{2^k}-ary sliding window
algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85
(@pxref{References}). @math{k} is chosen according to the size of the
exponent. Larger exponents use larger values of @math{k}, the choice being
made to minimize the average number of multiplications that must supplement
the squaring.
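
A sliding window exponentiation of this kind can be sketched as follows. The
sketch is a plain Python rendering of the general technique, with an ordinary
@code{%} reduction after every product; the function name
@code{powm_sliding_window} and the fixed default @math{k=4} are assumptions
made for the example, whereas GMP selects @math{k} from the exponent size.

```python
def powm_sliding_window(base, exp, mod, k=4):
    """Compute base**exp % mod with a 2**k-ary sliding window (exp >= 0, mod >= 1)."""
    if mod == 1:
        return 0
    base %= mod
    base_sq = base * base % mod
    # odd_powers[i] holds base**(2*i + 1) % mod, for exponents 1, 3, ..., 2**k - 1
    odd_powers = [base]
    for _ in range((1 << (k - 1)) - 1):
        odd_powers.append(odd_powers[-1] * base_sq % mod)
    result = 1
    i = exp.bit_length() - 1                    # index of the bit being examined
    while i >= 0:
        if not (exp >> i) & 1:
            result = result * result % mod      # a 0 bit costs one squaring
            i -= 1
            continue
        # Longest window of at most k bits, starting at bit i, that ends in a 1 bit.
        j = max(i - k + 1, 0)
        while not (exp >> j) & 1:
            j += 1
        width = i - j + 1
        window = (exp >> j) & ((1 << width) - 1)    # odd and below 2**k
        for _ in range(width):
            result = result * result % mod
        result = result * odd_powers[window >> 1] % mod
        i = j - 1
    return result
```

Only odd powers of the base are tabulated, since every window ends in a 1 bit;
a larger @math{k} means a bigger table but fewer multiplications between the
squarings.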

The modular multiplies and squares use either a simple division or the REDC
method by Montgomery (@pxref{References}). REDC is a little faster,
essentially saving N single limb divisions in a fashion similar to an exact
remainder (@pxref{Exact Remainder}).


@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms
@section Root Extraction Algorithms
@cindex Root extraction algorithms

@menu
* Square Root Algorithm::
* Nth Root Algorithm::
* Perfect Square Algorithm::
* Perfect Power Algorithm::
@end menu


@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms
@subsection Square Root
@cindex Square root algorithm
@cindex Karatsuba square root algorithm

Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul
Zimmermann (@pxref{References}).

An input @math{n} is split into four parts of @math{k} bits each, so with
@math{b=2^k} we have @m{n = a_3b^3 + a_2b^2 + a_1b + a_0, n = a3*b^3 + a2*b^2
+ a1*b + a0}. Part @ms{a,3} must be ``normalized'' so that either the high or
second highest bit is set. In GMP, @math{k} is kept on a limb boundary and
the input is left shifted (by an even number of bits) to normalize.

The square root of the high two parts is taken, by recursive application of
the algorithm (bottoming out in a one-limb Newton's method),
@tex
$$ s',r' = \mathop{\rm sqrtrem} \> (a_3b + a_2) $$
@end tex
@ifnottex

@example
s1,r1 = sqrtrem (a3*b + a2)
@end example

@end ifnottex
This is an approximation to the desired root and is extended by a division to
give @math{s},@math{r},
@tex
$$\eqalign{
q,u &= \mathop{\rm divrem} \> (r'b + a_1, 2s') \cr
s &= s'b + q \cr
r &= ub + a_0 - q^2
}$$
@end tex
@ifnottex

@example
q,u = divrem (r1*b + a1, 2*s1)
s = s1*b + q
r = u*b + a0 - q^2
@end example

@end ifnottex
The normalization requirement on @ms{a,3} means at this point @math{s} is
either correct or 1 too big. @math{r} is negative in the latter case, so
@tex
$$\eqalign{
\mathop{\rm if} \; r &< 0 \; \mathop{\rm then} \cr
r &\leftarrow r + 2s - 1 \cr
s &\leftarrow s - 1
}$$
@end tex
@ifnottex

@example
if r < 0 then
  r = r + 2*s - 1
  s = s - 1
@end example

@end ifnottex
The algorithm is expressed in a divide and conquer form, but as noted in the
paper it can also be viewed as a discrete variant of Newton's method, or as a
variation on the schoolboy method (no longer taught) for square roots two
digits at a time.

If the remainder @math{r} is not required then usually only a few high limbs
of @math{r} and @math{u} need to be calculated to determine whether an
adjustment to @math{s} is required. This optimization is not currently
implemented.

In the Karatsuba multiplication range this algorithm is @m{O({3\over2}
M(N/2)),O(1.5*M(N/2))}, where @math{M(n)} is the time to multiply two numbers
of @math{n} limbs. In the FFT multiplication range this grows to a bound of
@m{O(6 M(N/2)),O(6*M(N/2))}.
In practice a factor of about 1.5 to 1.8 is
found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range.

The algorithm does all its calculations in integers and the resulting
@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}.
The extended precision given by @code{mpf_sqrt_ui} is obtained by
padding with zero limbs.


@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms
@subsection Nth Root
@cindex Root extraction algorithm
@cindex Nth root algorithm

Integer Nth roots are taken using Newton's method with the following
iteration, where @math{A} is the input and @math{n} is the root to be taken.
@tex
$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$
@end tex
@ifnottex

@example
         1         A
a[i+1] = - * ( --------- + (n-1)*a[i] )
         n     a[i]^(n-1)
@end example

@end ifnottex
The initial approximation @m{a_1,a[1]} is generated bitwise by successively
powering a trial root with or without new 1 bits, aiming to be just above the
true root. The iteration converges quadratically when started from a good
approximation. When @math{n} is large more initial bits are needed to get
good convergence. The current implementation is not particularly well
optimized.


@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms
@subsection Perfect Square
@cindex Perfect square algorithm

A significant fraction of non-squares can be quickly identified by checking
whether the input is a quadratic residue modulo small integers.

@code{mpz_perfect_square_p} first tests the input mod 256, which means just
examining the low byte. Only 44 different values occur for squares mod 256,
so 82.8% of inputs can be immediately identified as non-squares.
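
The mod 256 filter is easy to reproduce. The Python sketch below builds the
set of square residues, confirms the count of 44, and uses it as a quick
rejection step ahead of a full square root. The names
@code{SQUARES_MOD_256}, @code{may_be_square} and @code{is_perfect_square}
belong to this example only; GMP's own tables and further moduli are produced
by @file{gen-psqr.c}.

```python
import math

# Residues that a square can leave mod 256; there are 44 of them.
SQUARES_MOD_256 = frozenset(i * i % 256 for i in range(256))


def may_be_square(n):
    """Cheap filter on the low byte: False means n is certainly not a square."""
    return (n & 0xFF) in SQUARES_MOD_256


def is_perfect_square(n):
    """Exact test: the residue filter first, then a square root to confirm."""
    if n < 0 or not may_be_square(n):
        return False
    root = math.isqrt(n)
    return root * root == n


print(len(SQUARES_MOD_256))                       # 44
print(is_perfect_square(144), is_perfect_square(145))   # True False
```

With 44 of the 256 residues possible, the filter rejects
@math{212/256}, or about 82.8%, of uniformly distributed inputs.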

On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total
of 99.25% of inputs identified as non-squares. On a 64-bit system 97 is tested
too, for a total of 99.62%.

These moduli are chosen because they're factors of @math{2^@W{24}-1} (or
@math{2^@W{48}-1} for 64-bits), and such a remainder can be quickly taken just
using additions (see @code{mpn_mod_34lsub1}).

When nails are in use moduli are instead selected by the @file{gen-psqr.c}
program and applied with an @code{mpn_mod_1}. The same @math{2^@W{24}-1} or
@math{2^@W{48}-1} could be done with nails using some extra bit shifts, but
this is not currently implemented.

In any case each modulus is applied to the @code{mpn_mod_34lsub1} or
@code{mpn_mod_1} remainder and a table lookup identifies non-squares. By
using a ``modexact'' style calculation, and suitably permuted tables, just one
multiply each is required, see the code for details. Moduli are also combined
to save operations, so long as the lookup tables don't become too big.
@file{gen-psqr.c} does all the pre-calculations.

A square root must still be taken for any value that passes these tests, to
verify it's really a square and not one of the small fraction of non-squares
that get through (ie.@: a pseudo-square to all the tested bases).

Clearly more residue tests could be done, @code{mpz_perfect_square_p} only
uses a compact and efficient set. Big inputs would probably benefit from more
residue testing, small inputs might be better off with less. The assumed
distribution of squares versus non-squares in the input would affect such
considerations.


@node Perfect Power Algorithm, , Perfect Square Algorithm, Root Extraction Algorithms
@subsection Perfect Power
@cindex Perfect power algorithm

Detecting perfect powers is required by some factorization algorithms.
Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root
extractions, though naturally only prime roots need to be considered.
(@xref{Nth Root Algorithm}.)

If a prime divisor @math{p} with multiplicity @math{e} can be found, then only
roots which are divisors of @math{e} need to be considered, much reducing the
work necessary. To this end divisibility by a set of small primes is checked.


@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms
@section Radix Conversion
@cindex Radix conversion algorithms

Radix conversions are less important than other algorithms. A program
dominated by conversions should probably use a different data representation.

@menu
* Binary to Radix::
* Radix to Binary::
@end menu


@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms
@subsection Binary to Radix

Conversions from binary to a power-of-2 radix use a simple and fast
@math{O(N)} bit extraction algorithm.

Conversions from binary to other radices use one of two algorithms. Sizes
below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method.
Repeated divisions by @math{b^n} are made, where @math{b} is the radix and
@math{n} is the biggest power that fits in a limb. But instead of simply
using the remainder @math{r} from such divisions, an extra divide step is done
to give a fractional limb representing @math{r/b^n}. The digits of @math{r}
can then be extracted using multiplications by @math{b} rather than divisions.
Special case code is provided for decimal, allowing multiplications by 10 to
optimize to shifts and adds.

Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are
calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is
reached. @math{t} is then divided by that largest power, giving a quotient
which is the digits above that power, and a remainder which is those below.
These two parts are in turn divided by the second highest power, and so on
recursively. When a piece has been divided down to less than
@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is
used.

The advantage of this algorithm is that big divisions can make use of the
sub-quadratic divide and conquer division (@pxref{Divide and Conquer
Division}), and big divisions tend to have less overheads than lots of
separate single limb divisions anyway. But in any case the cost of
calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome.

@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent
the same basic thing, the point where it becomes worth doing a big division to
cut the input in half. @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost
of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD}
assumes that's already available, which is the case when recursing.

Since the base case produces digits from least to most significant but they
want to be stored from most to least, it's necessary to calculate in advance
how many digits there will be, or at least be sure not to underestimate that.
For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly}
from @code{mp_bases}, rounding up. The result is either correct or one too
big.
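
The ``correct or one too big'' behaviour of such an estimate can be checked
with a short Python sketch. It assumes the per-bit factor is
@math{log(2)/log(b)}, computed here in double precision; that value and the
name @code{digit_count_estimate} are assumptions of this example rather than
a description of the @code{mp_bases} table.

```python
import math


def digit_count_estimate(n, base=10):
    """Upper estimate of the number of base-`base` digits of n >= 1."""
    chars_per_bit = math.log(2) / math.log(base)   # assumed value of the factor
    return math.ceil(n.bit_length() * chars_per_bit)


for n in (9, 10, 999, 1000, 2**200):
    print(n.bit_length(), digit_count_estimate(n), len(str(n)))
```

The estimate depends only on the bit length, so 999 and 1000 (both 10 bits)
get the same answer of 4, which is exact for 1000 and one too big for 999.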

Examining some of the high bits of the input could increase the chance of
getting the exact number of digits, but an exact result every time would not
be practical, since in general the difference between numbers 100@dots{} and
99@dots{} is only in the last few bits and the work to identify 99@dots{}
might well be almost as much as a full conversion.

@code{mpf_get_str} doesn't currently use the algorithm described here; it
multiplies or divides by a power of @math{b} to move the radix point to
just above the highest non-zero digit (or at worst one above that location),
then multiplies by @math{b^n} to bring out digits. This is @math{O(N^2)} and
is certainly not optimal.

The @math{r/b^n} scheme described above for using multiplications to bring out
digits might be useful for more than a single limb. Some brief experiments
with it on the base case when recursing didn't give a noticeable improvement,
but perhaps that was only due to the implementation. Something similar would
work for the sub-quadratic divisions too, though there would be the cost of
calculating a bigger radix power.

Another possible improvement for the sub-quadratic part would be to arrange
for radix powers that balanced the sizes of quotient and remainder produced,
ie.@: the highest power would be a @m{b^{nk},b^(n*k)} approximately equal to
@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor. That ought to
smooth out a graph of times against sizes, but may or may not be a net
speedup.


@node Radix to Binary, , Binary to Radix, Radix Conversion Algorithms
@subsection Radix to Binary

@strong{This section needs to be rewritten, it currently describes the
algorithms used before GMP 4.3.}

Conversions from a power-of-2 radix into binary use a simple and fast
@math{O(N)} bitwise concatenation algorithm.

Conversions from other radices use one of two algorithms. Sizes below
@code{SET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method. Groups
of @math{n} digits are converted to limbs, where @math{n} is the biggest
power of the base @math{b} which will fit in a limb, then those groups are
accumulated into the result by multiplying by @math{b^n} and adding. This
saves multi-precision operations, as per Knuth section 4.4 part E
(@pxref{References}). Some special case code is provided for decimal, giving
the compiler a chance to optimize multiplications by 10.

Above @code{SET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used.
First groups of @math{n} digits are converted into limbs. Then adjacent
limbs are combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x}
and @math{y} are the limbs. Adjacent limb pairs are combined into quads
similarly with @m{xb^{2n}+y,x*b^(2n)+y}. This continues until a single block
remains, that being the result.

The advantage of this method is that the multiplications for each @math{x} are
big blocks, allowing Karatsuba and higher algorithms to be used. But the cost
of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome.
@code{SET_STR_PRECOMPUTE_THRESHOLD} usually ends up quite big, around 5000 digits, and on
some processors much bigger still.

@code{SET_STR_PRECOMPUTE_THRESHOLD} is based on the input digits (and tuned
for decimal), though it might be better based on a limb count, so as to be
independent of the base. But that sort of count isn't used by the base case
and so would need some sort of initial calculation or estimate.

The main reason @code{SET_STR_PRECOMPUTE_THRESHOLD} is so much bigger than the
corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is
much faster than @code{mpn_divrem_1} (often by a factor of 5, or more).
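
The pairwise combining scheme described above can be sketched as follows.
This is a Python model under stated assumptions rather than GMP's code: the
function name @code{from_str_dc} and the default group size of 9 digits are
made up for the example, and Python integers stand in for limb blocks.

```python
def from_str_dc(s, base=10, n=9):
    # Split the digit string into groups of n digits, most significant
    # first.  Only the leading group may be shorter than n digits.
    first = len(s) % n or n
    blocks = [int(s[:first], base)]
    blocks += [int(s[i:i + n], base) for i in range(first, len(s), n)]

    power = base ** n
    while len(blocks) > 1:
        if len(blocks) % 2:
            blocks.insert(0, 0)      # pad at the high end to keep pairs aligned
        # Combine each adjacent pair as x*b^(n*2^i) + y.
        blocks = [blocks[i] * power + blocks[i + 1]
                  for i in range(0, len(blocks), 2)]
        power *= power               # next level uses the squared power
    return blocks[0]
```

Each pass halves the number of blocks and doubles their size, so the later
multiplications are between large operands, which is where the faster
multiplication algorithms pay off.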


@need 1000
@node Other Algorithms, Assembly Coding, Radix Conversion Algorithms, Algorithms
@section Other Algorithms

@menu
* Prime Testing Algorithm::
* Factorial Algorithm::
* Binomial Coefficients Algorithm::
* Fibonacci Numbers Algorithm::
* Lucas Numbers Algorithm::
* Random Number Algorithms::
@end menu


@node Prime Testing Algorithm, Factorial Algorithm, Other Algorithms, Other Algorithms
@subsection Prime Testing
@cindex Prime testing algorithms

The primality testing in @code{mpz_probab_prime_p} (@pxref{Number Theoretic
Functions}) first does some trial division by small factors and then uses the
Miller-Rabin probabilistic primality testing algorithm, as described in Knuth
section 4.5.4 algorithm P (@pxref{References}).

For an odd input @math{n}, and with @math{n = q@GMPmultiply{}2^k+1} where
@math{q} is odd, this algorithm selects a random base @math{x} and tests
whether @math{x^q @bmod{} n} is 1 or @math{-1}, or an @m{x^{q2^j} \bmod n,
x^(q*2^j) mod n} is @math{-1}, for @math{1@le{}j<k}. If so then @math{n}
is probably prime, if not then @math{n} is definitely composite.

Any prime @math{n} will pass the test, but some composites do too. Such
composites are known as strong pseudoprimes to base @math{x}. No @math{n} is
a strong pseudoprime to more than @math{1/4} of all bases (see Knuth exercise
22), hence with @math{x} chosen at random there's no more than a @math{1/4}
chance a ``probable prime'' will in fact be composite.

In fact strong pseudoprimes are quite rare, making the test much more
powerful than this analysis would suggest, but @math{1/4} is all that's proven
for an arbitrary @math{n}.
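
A minimal Python sketch of the strong probable prime test follows, in the
form given by Knuth's algorithm P: @math{n} passes for base @math{x} if
@math{x^q} is 1 or @math{-1} mod @math{n}, or if one of the subsequent
squarings gives @math{-1}. The function names, the list of trial divisors and
the default of 25 repetitions are assumptions for the example, not what
@code{mpz_probab_prime_p} actually uses.

```python
import random

def strong_probable_prime(n, x):
    # n must be odd and greater than 2; write n - 1 = q * 2**k with q odd.
    q, k = n - 1, 0
    while q % 2 == 0:
        q //= 2
        k += 1
    y = pow(x, q, n)
    if y == 1 or y == n - 1:
        return True
    # Square up to k-1 more times, looking for -1 (that is, n - 1).
    for _ in range(k - 1):
        y = y * y % n
        if y == n - 1:
            return True
    return False

def probably_prime(n, reps=25):
    if n < 2:
        return False
    # Some trial division by small factors first.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Then repeated tests with random bases.
    return all(strong_probable_prime(n, random.randrange(2, n - 1))
               for _ in range(reps))
```

For instance 2047 = 23 * 89 passes the test to base 2 but fails to base 3,
which is why more than one base is normally tried.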


@node Factorial Algorithm, Binomial Coefficients Algorithm, Prime Testing Algorithm, Other Algorithms
@subsection Factorial
@cindex Factorial algorithm

Factorials are calculated by a combination of removal of twos, powering, and
binary splitting. The procedure can be best illustrated with an example,

@quotation
@math{23! = 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16.17.18.19.20.21.22.23}
@end quotation

@noindent
has factors of two removed,

@quotation
@math{23! = 2^{19}.1.1.3.1.5.3.7.1.9.5.11.3.13.7.15.1.17.9.19.5.21.11.23}
@end quotation

@noindent
and the resulting terms collected up according to their multiplicity,

@quotation
@math{23! = 2^{19}.(3.5)^3.(7.9.11)^2.(13.15.17.19.21.23)}
@end quotation

Each sequence such as @math{13.15.17.19.21.23} is evaluated by splitting into
every second term, as for instance @math{(13.17.21).(15.19.23)}, and the same
recursively on each half. This is implemented iteratively using some bit
twiddling.

Such splitting is more efficient than repeated N@cross{}1 multiplies since it
forms big multiplies, allowing Karatsuba and higher algorithms to be used.
And even below the Karatsuba threshold a big block of work can be more
efficient for the basecase algorithm.

Splitting into subsequences of every second term keeps the resulting products
more nearly equal in size than would the simpler approach of say taking the
first half and second half of the sequence. Nearly equal products are more
efficient for the current multiply implementation.
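
The three ingredients above (removal of twos, powering, and splitting into
every second term) can be put together in a short Python sketch. It is
written recursively for clarity, whereas GMP's code is iterative, and the
names @code{product} and @code{factorial} are simply chosen for the example.

```python
def product(terms):
    # Multiply a sequence by splitting it into every second term, so the
    # two sub-products end up nearly equal in size.
    if len(terms) == 0:
        return 1
    if len(terms) == 1:
        return terms[0]
    return product(terms[0::2]) * product(terms[1::2])

def factorial(n):
    result = 1
    twos = 0          # exponent of 2 in n!
    j = 1             # multiplicity of the current group of odd terms
    hi = n
    while hi > 1:
        lo = hi // 2
        # Odd numbers m with lo < m <= hi occur j times once twos are removed.
        odds = list(range((lo + 1) | 1, hi + 1, 2))
        result *= product(odds) ** j
        twos += lo
        hi = lo
        j += 1
    return result << twos
```

For @math{n = 23} the loop visits the groups 13 to 23, then 7 to 11 squared,
then 3 and 5 cubed, and accumulates the shift count 11+5+2+1 = 19, matching
the worked example.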


@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms
@subsection Binomial Coefficients
@cindex Binomial coefficient algorithm

Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated
by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) =
\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then
evaluating the following product simply from @math{i=2} to @math{i=k}.
@tex
$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$
@end tex
@ifnottex

@example
                          k  (n-k+i)
C(n,k) =  (n-k+1) * prod  -------
                         i=2    i
@end example

@end ifnottex
It's easy to show that each denominator @math{i} will divide the product so
far, so the exact division algorithm is used (@pxref{Exact Division}).

The numerators @math{n-k+i} and denominators @math{i} are first accumulated
into as many as fit in a limb, to save multi-precision operations, though for
@code{mpz_bin_ui} this applies only to the divisors, since @math{n} is an
@code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all.


@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms
@subsection Fibonacci Numbers
@cindex Fibonacci number algorithm

The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed
for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]}
values efficiently.

For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is
used. On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb
up to @m{F_{93},F[93]}. For convenience the table starts at @m{F_{-1},F[-1]}.
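
The table limits quoted above can be checked with a few lines of Python. The
helper name @code{largest_fib_index} is made up for this check and is not
part of GMP.

```python
def largest_fib_index(limb_bits):
    # a, b hold F[i-1], F[i]; advance while F[i+1] still fits in one limb.
    a, b, i = 0, 1, 1
    while a + b < 2 ** limb_bits:
        a, b, i = b, a + b, i + 1
    return i
```

Calling it with 32 and 64 gives the indices 47 and 93 respectively.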

Beyond the table, values are generated with a binary powering algorithm,
calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to
low across the bits of @math{n}. The formulas used are
@tex
$$\eqalign{
  F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr
  F_{2k-1} &=  F_k^2 + F_{k-1}^2           \cr
  F_{2k}   &= F_{2k+1} - F_{2k-1}
}$$
@end tex
@ifnottex

@example
F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k
F[2k-1] =   F[k]^2 + F[k-1]^2

F[2k] = F[2k+1] - F[2k-1]
@end example

@end ifnottex
At each step, @math{k} is the high @math{b} bits of @math{n}. If the next bit
of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if
it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process
repeated until all bits of @math{n} are incorporated. Notice these formulas
require just two squares per bit of @math{n}.

It'd be possible to handle the first few @math{n} above the single limb table
with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} =
F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually
turns out to be faster for only about 10 or 20 values of @math{n}, and
including a block of code for just those doesn't seem worthwhile. If they
really mattered it'd be better to extend the data table.

Using a table avoids lots of calculations on small numbers, and makes small
@math{n} go fast. A bigger table would make more small @math{n} go fast, it's
just a question of balancing size against desired speed. For GMP the code is
kept compact, with the emphasis primarily on a good powering algorithm.

@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but
@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}. In this case the last
step of the algorithm can become one multiply instead of two squares.
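
The binary powering scheme with two squares per bit, as given by the three
formulas for @m{F_{2k+1},F[2k+1]}, @m{F_{2k-1},F[2k-1]} and
@m{F_{2k},F[2k]}, can be sketched in Python as below. The function name
@code{fib_pair} is an assumption for the example, and the sketch starts from
@m{F_1,F[1]} rather than from a table as GMP does.

```python
def fib_pair(n):
    # Return (F[n], F[n-1]) for n >= 1, working from the high bit of n down.
    fk, fk1, k = 1, 0, 1                 # F[1], F[0]
    for bit in bin(n)[3:]:               # bits of n below the leading 1
        sq_k, sq_k1 = fk * fk, fk1 * fk1 # the two squares for this bit
        sign = -1 if k % 2 else 1        # (-1)^k
        f2k_plus = 4 * sq_k - sq_k1 + 2 * sign   # F[2k+1]
        f2k_minus = sq_k + sq_k1                 # F[2k-1]
        f2k = f2k_plus - f2k_minus               # F[2k]
        if bit == "1":
            fk, fk1, k = f2k_plus, f2k, 2 * k + 1
        else:
            fk, fk1, k = f2k, f2k_minus, 2 * k
    return fk, fk1
```

For @math{n = 10} (binary 1010) the pairs visited are those for @math{k} = 1,
2, 5 and 10, ending with @math{(55, 34)}.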
One of
the following two formulas is used, according as @math{n} is odd or even.
@tex
$$\eqalign{
  F_{2k}   &= F_k (F_k + 2F_{k-1}) \cr
  F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k
}$$
@end tex
@ifnottex

@example
F[2k]   = F[k]*(F[k]+2*F[k-1])

F[2k+1] = (2*F[k]+F[k-1])*(2*F[k]-F[k-1]) + 2*(-1)^k
@end example

@end ifnottex
@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a
multiply. For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above
can be applied just to the low limb of the calculation, without a carry or
borrow into further limbs, which saves some code size. See comments with
@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done.


@node Lucas Numbers Algorithm, Random Number Algorithms, Fibonacci Numbers Algorithm, Other Algorithms
@subsection Lucas Numbers
@cindex Lucas number algorithm

@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci
numbers with the following simple formulas.
@tex
$$\eqalign{
  L_k     &=  F_k + 2F_{k-1} \cr
  L_{k-1} &= 2F_k -  F_{k-1}
}$$
@end tex
@ifnottex

@example
L[k]   =   F[k] + 2*F[k-1]
L[k-1] = 2*F[k] -   F[k-1]
@end example

@end ifnottex
@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be
saved. Trailing zero bits on @math{n} can be handled with a single square
each.
@tex
$$ L_{2k} = L_k^2 - 2(-1)^k $$
@end tex
@ifnottex

@example
L[2k] = L[k]^2 - 2*(-1)^k
@end example

@end ifnottex
And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci
numbers, similar to what @code{mpz_fib_ui} does.
@tex
$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$
@end tex
@ifnottex

@example
L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k
@end example

@end ifnottex


@node Random Number Algorithms, , Lucas Numbers Algorithm, Other Algorithms
@subsection Random Numbers
@cindex Random number algorithms

For the @code{urandomb} functions, random numbers are generated simply by
concatenating bits produced by the generator. As long as the generator has
good randomness properties this will produce well-distributed @math{N} bit
numbers.

For the @code{urandomm} functions, random numbers in a range @math{0@le{}R<N}
are generated by taking values @math{R} of @m{\lceil \log_2 N \rceil,
ceil(log2(N))} bits each until one satisfies @math{R<N}. This will normally
require only one or two attempts, but the attempts are limited in case the
generator is somehow degenerate and produces only 1 bits or similar.

@cindex Mersenne twister algorithm
The Mersenne Twister generator is by Matsumoto and Nishimura
(@pxref{References}). It has a non-repeating period of @math{2^@W{19937}-1},
which is a Mersenne prime, hence the name of the generator. The state is 624
words of 32-bits each, which is iterated with one XOR and shift for each
32-bit word generated, making the algorithm very fast. Randomness properties
are also very good and this is the default algorithm used by GMP.

@cindex Linear congruential algorithm
Linear congruential generators are described in many text books, for instance
Knuth volume 2 (@pxref{References}). With a modulus @math{M} and parameters
@math{A} and @math{C}, an integer state @math{S} is iterated by the formula
@math{S @leftarrow{} A@GMPmultiply{}S+C @bmod{} M}. At each step the new
state is a linear function of the previous, mod @math{M}, hence the name of
the generator.
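
The linear congruential iteration can be sketched in a few lines of Python
for a power of 2 modulus, where the reduction mod @math{M} is just a mask.
The generator name @code{lcg} and the parameter values in the usage note are
assumptions for illustration and are not the values GMP uses.

```python
def lcg(a, c, n_bits, seed):
    # Iterate S <- (A*S + C) mod 2^n_bits, yielding each new state.
    mask = (1 << n_bits) - 1
    s = seed & mask
    while True:
        s = (a * s + c) & mask
        yield s
```

For example @code{lcg(5, 3, 4, 7)} steps through the states 6, 1, 8, 11 and
so on, each one determined entirely by the one before it.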

In GMP only moduli of the form @math{2^N} are supported, and the current
implementation is not as well optimized as it could be. Overheads are
significant when @math{N} is small, and when @math{N} is large clearly the
multiply at each step will become slow. This is not a big concern, since the
Mersenne Twister generator is better in every respect and is therefore
recommended for all normal applications.

For both generators the current state can be deduced by observing enough
output and applying some linear algebra (over GF(2) in the case of the
Mersenne Twister). This generally means raw output is unsuitable for
cryptographic applications without further hashing or the like.


@node Assembly Coding, , Other Algorithms, Algorithms
@section Assembly Coding
@cindex Assembly coding

The assembly subroutines in GMP are the most significant source of speed at
small to moderate sizes. At larger sizes algorithm selection becomes more
important, but of course speedups in low level routines will still speed up
everything proportionally.

Carry handling and widening multiplies that are important for GMP can't be
easily expressed in C@. GCC @code{asm} blocks help a lot and are provided in
@file{longlong.h}, but hand coding low level routines invariably offers a
speedup over generic C by a factor of anything from 2 to 10.

@menu
* Assembly Code Organisation::
* Assembly Basics::
* Assembly Carry Propagation::
* Assembly Cache Handling::
* Assembly Functional Units::
* Assembly Floating Point::
* Assembly SIMD Instructions::
* Assembly Software Pipelining::
* Assembly Loop Unrolling::
* Assembly Writing Guide::
@end menu


@node Assembly Code Organisation, Assembly Basics, Assembly Coding, Assembly Coding
@subsection Code Organisation
@cindex Assembly code organisation
@cindex Code organisation

The various @file{mpn} subdirectories contain machine-dependent code, written
in C or assembly. The @file{mpn/generic} subdirectory contains default code,
used when there's no machine-specific version of a particular file.

Each @file{mpn} subdirectory is for an ISA family. Generally 32-bit and
64-bit variants in a family cannot share code and have separate directories.
Within a family further subdirectories may exist for CPU variants.

In each directory a @file{nails} subdirectory may exist, holding code with
nails support for that CPU variant. A @code{NAILS_SUPPORT} directive in each
file indicates the nails values the code handles. Nails code only exists
where it's faster, or promises to be faster, than plain code. There's no
effort put into nails if they're not going to enhance a given CPU.


@node Assembly Basics, Assembly Carry Propagation, Assembly Code Organisation, Assembly Coding
@subsection Assembly Basics

@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines
for overall GMP performance. All multiplications and divisions come down to
repeated calls to these. @code{mpn_add_n}, @code{mpn_sub_n},
@code{mpn_lshift} and @code{mpn_rshift} are next most important.
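
To make the role of @code{mpn_addmul_1} concrete, here is a Python model of
what such a routine computes, and of a schoolbook multiply built from
repeated calls to it. This is only a model of the arithmetic: limb vectors
are Python lists with the least significant limb first, the @code{offset}
argument stands in for pointer arithmetic, and the helper names are
assumptions for the example.

```python
def addmul_1(rp, up, v, offset=0, bits=64):
    # rp[offset:offset+len(up)] += up * v; returns the carry-out limb.
    mask = (1 << bits) - 1
    carry = 0
    for i, u in enumerate(up):
        t = u * v + rp[offset + i] + carry   # always fits in two limbs
        rp[offset + i] = t & mask
        carry = t >> bits
    return carry

def mul_basecase(up, vp, bits=64):
    # Schoolbook product of two limb vectors, one addmul_1 per limb of vp.
    rp = [0] * (len(up) + len(vp))
    for j, v in enumerate(vp):
        rp[j + len(up)] = addmul_1(rp, up, v, offset=j, bits=bits)
    return rp

def to_limbs(x, n, bits=64):
    return [(x >> (bits * i)) & ((1 << bits) - 1) for i in range(n)]

def from_limbs(limbs, bits=64):
    return sum(limb << (bits * i) for i, limb in enumerate(limbs))
```

Every limb of the product passes through the inner loop of @code{addmul_1},
which is why its speed dominates at small to moderate sizes.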

On some CPUs assembly versions of the internal functions
@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups,
mainly through avoiding function call overheads. They can also potentially
make better use of a wide superscalar processor, as can bigger primitives like
@code{mpn_addmul_2} or @code{mpn_addmul_4}.

The restrictions on overlaps between sources and destinations
(@pxref{Low-level Functions}) are designed to facilitate a variety of
implementations. For example, knowing @code{mpn_add_n} won't have partly
overlapping sources and destination means reading can be done far ahead of
writing on superscalar processors, and loops can be vectorized on a vector
processor, depending on the carry handling.


@node Assembly Carry Propagation, Assembly Cache Handling, Assembly Basics, Assembly Coding
@subsection Carry Propagation
@cindex Assembly carry propagation

The problem that presents most challenges in GMP is propagating carries from
one limb to the next. In functions like @code{mpn_addmul_1} and
@code{mpn_add_n}, carries are the only dependencies between limb operations.

On processors with carry flags, a straightforward CISC style @code{adc} is
generally best. AMD K6 @code{mpn_addmul_1} however is an example of an
unusual set of circumstances where a branch works out better.

On RISC processors generally an add and compare for overflow is used. This
sort of thing can be seen in @file{mpn/generic/aors_n.c}. Some carry
propagation schemes require 4 instructions, meaning at least 4 cycles per
limb, but other schemes may use just 1 or 2. On wide superscalar processors
performance may be completely determined by the number of dependent
instructions between carry-in and carry-out for each limb.
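
The ``add and compare for overflow'' idiom can be modelled in Python by
masking each sum to the limb width, which mimics fixed width unsigned
arithmetic. This is a sketch of the idea only, with a made-up name
@code{add_n}, and is not a transcription of @file{mpn/generic/aors_n.c}.

```python
def add_n(up, vp, bits=64):
    # Add two equal-length limb vectors (least significant limb first).
    # Returns (result limbs, carry out).
    mask = (1 << bits) - 1
    carry = 0
    rp = []
    for u, v in zip(up, vp):
        s = (u + v) & mask
        c1 = s < u              # wrapped around, so u + v overflowed
        r = (s + carry) & mask
        c2 = r < s              # adding the carry-in overflowed
        rp.append(r)
        carry = int(c1 or c2)   # at most one of the two can be set
    return rp, carry
```

The chain from carry-in to carry-out here is an add, a compare and an OR,
which is the kind of dependent sequence that bounds the speed per limb.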

On vector processors good use can be made of the fact that a carry bit only
very rarely propagates more than one limb. When adding a single bit to a
limb, there's only a carry out if that limb was @code{0xFF@dots{}FF} which on
random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}},
2^mp_bits_per_limb}. @file{mpn/cray/add_n.c} is an example of this, it adds
all limbs in parallel, adds one set of carry bits in parallel and then only
rarely needs to fall through to a loop propagating further carries.

On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code
for the RISC style idioms that are necessary to handle carry bits in
C@. Often conditional jumps are generated where @code{adc} or @code{sbb} forms
would be better. And so unfortunately almost any loop involving carry bits
needs to be coded in assembly for best results.


@node Assembly Cache Handling, Assembly Functional Units, Assembly Carry Propagation, Assembly Coding
@subsection Cache Handling
@cindex Assembly cache handling

GMP aims to perform well both on operands that fit entirely in L1 cache and
those which don't.

Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on
large operands, so L2 and main memory performance is important for them.
@code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and
square basecases, so L1 performance matters most for them, unless assembly
versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in
which case the remaining uses are mostly for larger operands.

For L2 or main memory operands, memory access times will almost certainly be
more than the calculation time. The aim therefore is to maximize memory
throughput, by starting a load of the next cache line while processing the
contents of the previous one.
Clearly this is only possible if the chip has a
lock-up free cache or some sort of prefetch instruction. Most current chips
have both these features.

Prefetching sources combines well with loop unrolling, since a prefetch can be
initiated once per unrolled loop (or more than once if the loop covers more
than one cache line).

On CPUs without write-allocate caches, prefetching destinations will ensure
individual stores don't go further down the cache hierarchy, limiting
bandwidth. Of course for calculations which are slow anyway, like
@code{mpn_divrem_1}, write-throughs might be fine.

The distance ahead to prefetch will be determined by memory latency versus
throughput. The aim of course is to have data arriving continuously, at peak
throughput. Some CPUs have limits on the number of fetches or prefetches in
progress.

If a special prefetch instruction doesn't exist then a plain load can be used,
but in that case care must be taken not to attempt to read past the end of an
operand, since that might produce a segmentation violation.

Some CPUs or systems have hardware that detects sequential memory accesses and
initiates suitable cache movements automatically, making life easy.


@node Assembly Functional Units, Assembly Floating Point, Assembly Cache Handling, Assembly Coding
@subsection Functional Units

When choosing an approach for an assembly loop, consideration is given to
what operations can execute simultaneously and what throughput can thereby be
achieved. In some cases an algorithm can be tweaked to accommodate available
resources.

Loop control will generally require a counter and pointer updates, costing as
much as 5 instructions, plus any delays a branch introduces.
CPU addressing
modes might reduce pointer updates, perhaps by allowing just one updating
pointer and others expressed as offsets from it, or on CISC chips with all
addressing done with the loop counter as a scaled index.

The final loop control cost can be amortised by processing several limbs in
each iteration (@pxref{Assembly Loop Unrolling}). This at least ensures loop
control isn't a big fraction of the work done.

Memory throughput is always a limit. If perhaps only one load or one store
can be done per cycle then 3 cycles/limb will be the top speed for ``binary''
operations like @code{mpn_add_n}, and any code achieving that is optimal.

Integer resources can be freed up by having the loop counter in a float
register, or by pressing the float units into use for some multiplying,
perhaps doing every second limb on the float side (@pxref{Assembly Floating
Point}).

Float resources can be freed up by doing carry propagation on the integer
side, or even by doing integer to float conversions in integers using bit
twiddling.


@node Assembly Floating Point, Assembly SIMD Instructions, Assembly Functional Units, Assembly Coding
@subsection Floating Point
@cindex Assembly floating point

Floating point arithmetic is used in GMP for multiplications on CPUs with poor
integer multipliers. It's mostly useful for @code{mpn_mul_1},
@code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and
@code{mpn_mul_basecase} on both 32-bit and 64-bit machines.

With IEEE 53-bit double precision floats, integer multiplications producing up
to 53 bits will give exact results. Breaking a 64@cross{}64 multiplication
into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient. With
some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be
used, if one of the lower two 21-bit pieces also uses the sign bit.
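
The exactness of the eight 16@cross{}32 bit pieces is easy to check in
Python, whose @code{float} type is an IEEE double on all common platforms.
The function below is a sketch for that purpose only: its name is made up,
and it sums the partial products with Python integers rather than with the
carefully scheduled float and integer additions an assembly loop would use.

```python
def mul_64x64_via_doubles(u, v):
    # Split u into two 32-bit pieces and v into four 16-bit pieces.
    u_parts = [(u >> s) & 0xFFFFFFFF for s in (0, 32)]
    v_parts = [(v >> s) & 0xFFFF for s in (0, 16, 32, 48)]
    total = 0
    for i, up in enumerate(u_parts):
        for j, vp in enumerate(v_parts):
            p = float(up) * float(vp)      # below 2^48, so exact in a double
            total += int(p) << (32 * i + 16 * j)
    return total
```

Each product is at most 48 bits, comfortably inside the 53 bits a double
holds exactly, so converting back to an integer loses nothing.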

For the @code{mpn_mul_1} family of functions on a 64-bit machine, the
invariant single limb is split at the start, into 3 or 4 pieces. Inside the
loop, the bignum operand is split into 32-bit pieces. Fast conversion of
these unsigned 32-bit pieces to floating point is highly machine-dependent.
In some cases, reading the data into the integer unit, zero-extending to
64-bits, then transferring to the floating point unit via memory is the
only option.

Converting partial products back to 64-bit limbs is usually best done as a
signed conversion. Since all values are smaller than @m{2^{53},2^53}, signed
and unsigned are the same, but most processors lack unsigned conversions.

@sp 2

Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or
@code{mpn_addmul_1} with a 64-bit limb. The single limb operand V is split
into four 16-bit parts. The multi-limb operand U is split in the loop into
two 32-bit parts.

@tex
\global\newdimen\GMPbits      \global\GMPbits=0.18em
\def\GMPbox#1#2#3{%
  \hbox{%
    \hbox to 128\GMPbits{\hfil
      \vbox{%
        \hrule
        \hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
        \hrule}%
      \hskip #1\GMPbits}%
    \raise \GMPboxdepth \hbox{\hskip 2em #3}}}
%
\GMPdisplay{%
  \vbox{%
    \hbox{%
      \hbox to 128\GMPbits {\hfil
        \vbox{%
          \hrule
          \hbox to 64\GMPbits{%
            \GMPvrule \hfil$v48$\hfil
            \vrule    \hfil$v32$\hfil
            \vrule    \hfil$v16$\hfil
            \vrule    \hfil$v00$\hfil
            \vrule}
          \hrule}}%
       \raise \GMPboxdepth \hbox{\hskip 2em V Operand}}
    \vskip 0.5ex
    \hbox{%
      \hbox to 128\GMPbits {\hfil
        \raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}%
        \vbox{%
          \hrule
          \hbox to 64\GMPbits {%
            \GMPvrule \hfil$u32$\hfil
            \vrule \hfil$u00$\hfil
            \vrule}%
          \hrule}}%
       \raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}%
    \vskip 0.5ex
    \hbox{\vbox to 2ex{\hrule width 128\GMPbits}}%
    \GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}%
    \vskip 0.5ex
    \GMPbox{16}{u00 \times v16}{$p16$}
    \vskip 0.5ex
    \GMPbox{32}{u00 \times v32}{$p32$}
    \vskip 0.5ex
    \GMPbox{48}{u00 \times v48}{$p48$}
    \vskip 0.5ex
    \GMPbox{32}{u32 \times v00}{$r32$}
    \vskip 0.5ex
    \GMPbox{48}{u32 \times v16}{$r48$}
    \vskip 0.5ex
    \GMPbox{64}{u32 \times v32}{$r64$}
    \vskip 0.5ex
    \GMPbox{80}{u32 \times v48}{$r80$}
}}
@end tex
@ifnottex
@example
@group
                +---+---+---+---+
                |v48|v32|v16|v00|    V operand
                +---+---+---+---+

                +-------+---+---+
            x   |  u32  |  u00  |    U operand (one limb)
                +---------------+

---------------------------------

                    +-----------+
                    | u00 x v00 |    p00    48-bit products
                    +-----------+
                +-----------+
                | u00 x v16 |        p16
                +-----------+
            +-----------+
            | u00 x v32 |            p32
            +-----------+
        +-----------+
        | u00 x v48 |                p48
        +-----------+
            +-----------+
            | u32 x v00 |            r32
            +-----------+
        +-----------+
        | u32 x v16 |                r48
        +-----------+
    +-----------+
    | u32 x v32 |                    r64
    +-----------+
+-----------+
| u32 x v48 |                        r80
+-----------+
@end group
@end example
@end ifnottex

@math{p32} and @math{r32} can be summed using floating-point addition, and
likewise @math{p48} and @math{r48}. @math{p00} and @math{p16} can be summed
with @math{r64} and @math{r80} from the previous iteration.

For each loop then, four 49-bit quantities are transferred to the integer unit,
aligned as follows,

@tex
% GMPbox here should be 49 bits wide, but use 51 to better show p16+r80'
% crossing into the upper 64 bits.
\def\GMPbox#1#2#3{%
  \hbox{%
    \hbox to 128\GMPbits {%
      \hfil
      \vbox{%
        \hrule
        \hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}%
        \hrule}%
      \hskip #1\GMPbits}%
    \raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}%
}}
\newbox\b \setbox\b\hbox{64 bits}%
\newdimen\bw \bw=\wd\b \advance\bw by 2em
\newdimen\x \x=128\GMPbits
\advance\x by -2\bw
\divide\x by4
\GMPdisplay{%
  \vbox{%
    \hbox to 128\GMPbits {%
      \GMPvrule
      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
      \hfil 64 bits\hfil
      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
      \vrule
      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
      \hfil 64 bits\hfil
      \raise 0.5ex \vbox{\hrule \hbox to \x {}}%
      \vrule}%
    \vskip 0.7ex
    \GMPbox{0}{p00+r64'}{i00}
    \vskip 0.5ex
    \GMPbox{16}{p16+r80'}{i16}
    \vskip 0.5ex
    \GMPbox{32}{p32+r32}{i32}
    \vskip 0.5ex
    \GMPbox{48}{p48+r48}{i48}
}}
@end tex
@ifnottex
@example
@group
|-----64bits----|-----64bits----|
                   +------------+
                   | p00 + r64' |    i00
                   +------------+
               +------------+
               | p16 + r80' |        i16
               +------------+
           +------------+
           | p32 + r32  |            i32
           +------------+
       +------------+
       | p48 + r48  |                i48
       +------------+
@end group
@end example
@end ifnottex

The challenge then is to sum these efficiently and add in a carry limb,
generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48}
extends 33 bits into the high half).


@node Assembly SIMD Instructions, Assembly Software Pipelining, Assembly Floating Point, Assembly Coding
@subsection SIMD Instructions
@cindex Assembly SIMD

The single-instruction multiple-data support in current microprocessors is
aimed at signal processing algorithms where each data point can be treated
more or less independently. There's generally not much support for
propagating the sort of carries that arise in GMP.

SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much
work as one 32@cross{}32 from GMP's point of view, and need some shifts and
adds besides. But of course if say the SIMD form is fully pipelined and uses
less instruction decoding then it may still be worthwhile.

On the x86 chips, MMX has so far found a use in @code{mpn_rshift} and
@code{mpn_lshift}, and is used in a special case for 16-bit multipliers in the
P55 @code{mpn_mul_1}. SSE2 is used for Pentium 4 @code{mpn_mul_1},
@code{mpn_addmul_1}, and @code{mpn_submul_1}.


@node Assembly Software Pipelining, Assembly Loop Unrolling, Assembly SIMD Instructions, Assembly Coding
@subsection Software Pipelining
@cindex Assembly software pipelining

Software pipelining consists of scheduling instructions around the branch
point in a loop. For example a loop might issue a load not for use in the
present iteration but the next, thereby allowing extra cycles for the data to
arrive from memory.

Naturally this is wanted only when doing things like loads or multiplies that
take several cycles to complete, and only where a CPU has multiple functional
units so that other work can be done in the meantime.

A pipeline with several stages will have a data value in progress at each
stage and each loop iteration moves them along one stage. This is like
juggling.

If the latency of some instruction is greater than the loop time then it will
be necessary to unroll, so one register has a result ready to use while
another (or multiple others) are still in progress (@pxref{Assembly Loop
Unrolling}).


@node Assembly Loop Unrolling, Assembly Writing Guide, Assembly Software Pipelining, Assembly Coding
@subsection Loop Unrolling
@cindex Assembly loop unrolling

Loop unrolling consists of replicating code so that several limbs are
processed in each loop. At a minimum this reduces loop overheads by a
corresponding factor, but it can also allow better register usage, for example
alternately using one register combination and then another. Judicious use of
@command{m4} macros can help avoid lots of duplication in the source code.

Any amount of unrolling can be handled with a loop counter that's decremented
by @math{N} each time, stopping when the remaining count is less than the
further @math{N} the loop will process. Or by subtracting @math{N} at the
start, the termination condition becomes when the counter @math{C} is less
than 0 (and the count of remaining limbs is @math{C+N}).

Alternately for a power of 2 unroll the loop count and remainder can be
established with a shift and mask. This is convenient if also making a
computed jump into the middle of a large loop.
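
The shift and mask bookkeeping for a power of 2 unroll can be illustrated
with a small Python sketch of a 4-way unrolled loop. It only shows the
counting, since Python gains nothing from unrolling; the function name is
made up, and the leftover limbs are handled with a simple loop at the start.

```python
def sum_limbs_unrolled(limbs):
    n = len(limbs)
    excess = n & 3          # n mod 4, by mask
    blocks = n >> 2         # n div 4, by shift
    total = 0
    i = 0
    for _ in range(excess): # limbs not a multiple of the unrolling
        total += limbs[i]
        i += 1
    for _ in range(blocks): # body replicated four times per iteration
        total += limbs[i]
        total += limbs[i + 1]
        total += limbs[i + 2]
        total += limbs[i + 3]
        i += 4
    return total
```

With 11 limbs, for instance, the mask gives 3 leftover limbs and the shift
gives 2 passes through the unrolled body.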

Limbs in excess of a multiple of the unrolling factor can be handled in
various ways, for example

@itemize @bullet
@item
A simple loop at the end (or the start) to process the excess.  Care is
needed that it isn't too much slower than the unrolled part.

@item
A set of binary tests, for example after an 8-limb unrolling, test for 4 more
limbs to process, then a further 2 more or not, and finally 1 more or not.
This will probably take more code space than a simple loop.

@item
A @code{switch} statement, providing separate code for each possible excess,
for example an 8-limb unrolling would have separate code for 0 remaining, 1
remaining, etc., up to 7 remaining.  This might take a lot of code, but may be
the best way to optimize all cases in combination with a deep pipelined loop.

@item
A computed jump into the middle of the loop, thus making the first iteration
handle the excess.  This should make times smoothly increase with size, which
is attractive, but setups for the jump and adjustments for pointers can be
tricky and could become quite difficult in combination with deep pipelining.
@end itemize


@node Assembly Writing Guide,  , Assembly Loop Unrolling, Assembly Coding
@subsection Writing Guide
@cindex Assembly writing guide

This is a guide to writing software pipelined loops for processing limb
vectors in assembly.

First determine the algorithm and which instructions are needed.  Code it
without unrolling or scheduling, to make sure it works.  On a 3-operand CPU
try to write each new value to a new register, since this will greatly
simplify later steps.

Then note for each instruction the functional unit and/or issue port
requirements.  If an instruction can use either of two units, like U0 or U1,
then make a category ``U0/U1''.
Count the total using each unit (or combined 9796unit), and count all instructions. 9797 9798Figure out from those counts the best possible loop time. The goal will be to 9799find a perfect schedule where instruction latencies are completely hidden. 9800The total instruction count might be the limiting factor, or perhaps a 9801particular functional unit. It might be possible to tweak the instructions to 9802help the limiting factor. 9803 9804Suppose the loop time is @math{N}, then make @math{N} issue buckets, with the 9805final loop branch at the end of the last. Now fill the buckets with dummy 9806instructions using the functional units desired. Run this to make sure the 9807intended speed is reached. 9808 9809Now replace the dummy instructions with the real instructions from the slow 9810but correct loop you started with. The first will typically be a load 9811instruction. Then the instruction using that value is placed in a bucket an 9812appropriate distance down. Run the loop again, to check it still runs at 9813target speed. 9814 9815Keep placing instructions, frequently measuring the loop. After a few you 9816will need to wrap around from the last bucket back to the top of the loop. If 9817you used the new-register for new-value strategy above then there will be no 9818register conflicts. If not then take care not to clobber something already in 9819use. Changing registers at this time is very error prone. 9820 9821The loop will overlap two or more of the original loop iterations, and the 9822computation of one vector element result will be started in one iteration of 9823the new loop, and completed one or several iterations later. 9824 9825The final step is to create feed-in and wind-down code for the loop. 
A good 9826way to do this is to make a copy (or copies) of the loop at the start and 9827delete those instructions which don't have valid antecedents, and at the end 9828replicate and delete those whose results are unwanted (including any further 9829loads). 9830 9831The loop will have a minimum number of limbs loaded and processed, so the 9832feed-in code must test if the request size is smaller and skip either to a 9833suitable part of the wind-down or to special code for small sizes. 9834 9835 9836@node Internals, Contributors, Algorithms, Top 9837@chapter Internals 9838@cindex Internals 9839 9840@strong{This chapter is provided only for informational purposes and the 9841various internals described here may change in future GMP releases. 9842Applications expecting to be compatible with future releases should use only 9843the documented interfaces described in previous chapters.} 9844 9845@menu 9846* Integer Internals:: 9847* Rational Internals:: 9848* Float Internals:: 9849* Raw Output Internals:: 9850* C++ Interface Internals:: 9851@end menu 9852 9853@node Integer Internals, Rational Internals, Internals, Internals 9854@section Integer Internals 9855@cindex Integer internals 9856 9857@code{mpz_t} variables represent integers using sign and magnitude, in space 9858dynamically allocated and reallocated. The fields are as follows. 9859 9860@table @asis 9861@item @code{_mp_size} 9862The number of limbs, or the negative of that when representing a negative 9863integer. Zero is represented by @code{_mp_size} set to zero, in which case 9864the @code{_mp_d} data is unused. 9865 9866@item @code{_mp_d} 9867A pointer to an array of limbs which is the magnitude. These are stored 9868``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the 9869least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most 9870significant. Whenever @code{_mp_size} is non-zero, the most significant limb 9871is non-zero. 

Currently there's always at least one limb allocated, so for instance
@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
@code{_mp_d[0]} unconditionally (though its value is then only wanted if
@code{_mp_size} is non-zero).

@item @code{_mp_alloc}
@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
and naturally @code{_mp_alloc >= ABS(_mp_size)}.  When an @code{mpz} routine
is about to (or might be about to) increase @code{_mp_size}, it checks
@code{_mp_alloc} to see whether there's enough space, and reallocates if not.
@code{MPZ_REALLOC} is generally used for this.
@end table

The various bitwise logical functions like @code{mpz_and} behave as if
negative values were two's complement.  But sign and magnitude is always used
internally, and necessary adjustments are made during the calculations.
Sometimes this isn't pretty, but sign and magnitude is best for other
routines.

Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these
have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory
allocation functions.  Care is taken to ensure that these are big enough that
no reallocation is necessary (since it would have unpredictable consequences).

@code{_mp_size} and @code{_mp_alloc} are @code{int}, although @code{mp_size_t}
is usually a @code{long}.  This is done to make the fields just 32 bits on
some 64-bit systems, thereby saving a few bytes of data space but still
providing plenty of range.


@node Rational Internals, Float Internals, Integer Internals, Internals
@section Rational Internals
@cindex Rational internals

@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and
denominator (@pxref{Integer Internals}).
9909 9910The canonical form adopted is denominator positive (and non-zero), no common 9911factors between numerator and denominator, and zero uniquely represented as 99120/1. 9913 9914It's believed that casting out common factors at each stage of a calculation 9915is best in general. A GCD is an @math{O(N^2)} operation so it's better to do 9916a few small ones immediately than to delay and have to do a big one later. 9917Knowing the numerator and denominator have no common factors can be used for 9918example in @code{mpq_mul} to make only two cross GCDs necessary, not four. 9919 9920This general approach to common factors is badly sub-optimal in the presence 9921of simple factorizations or little prospect for cancellation, but GMP has no 9922way to know when this will occur. As per @ref{Efficiency}, that's left to 9923applications. The @code{mpq_t} framework might still suit, with 9924@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and 9925denominator, or of course @code{mpz_t} variables can be used directly. 9926 9927 9928@node Float Internals, Raw Output Internals, Rational Internals, Internals 9929@section Float Internals 9930@cindex Float internals 9931 9932Efficient calculation is the primary aim of GMP floats and the use of whole 9933limbs and simple rounding facilitates this. 9934 9935@code{mpf_t} floats have a variable precision mantissa and a single machine 9936word signed exponent. The mantissa is represented using sign and magnitude. 9937 9938@c FIXME: The arrow heads don't join to the lines exactly. 
9939@tex 9940\global\newdimen\GMPboxwidth \GMPboxwidth=5em 9941\global\newdimen\GMPboxheight \GMPboxheight=3ex 9942\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}} 9943\GMPdisplay{% 9944\vbox{% 9945 \hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb} 9946 \vskip 0.7ex 9947 \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}} 9948 \hbox { 9949 \hbox to 3\GMPboxwidth {% 9950 \setbox 0 = \hbox{@code{\_mp\_exp}}% 9951 \dimen0=3\GMPboxwidth 9952 \advance\dimen0 by -\wd0 9953 \divide\dimen0 by 2 9954 \advance\dimen0 by -1em 9955 \setbox1 = \hbox{$\rightarrow$}% 9956 \dimen1=\dimen0 9957 \advance\dimen1 by -\wd1 9958 \GMPcentreline{\dimen0}% 9959 \hfil 9960 \box0% 9961 \hfil 9962 \GMPcentreline{\dimen1{}}% 9963 \box1} 9964 \hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}} 9965 \vskip 0.5ex 9966 \vbox {% 9967 \hrule 9968 \hbox{% 9969 \vrule height 2ex depth 1ex 9970 \hbox to \GMPboxwidth {}% 9971 \vrule 9972 \hbox to \GMPboxwidth {}% 9973 \vrule 9974 \hbox to \GMPboxwidth {}% 9975 \vrule 9976 \hbox to \GMPboxwidth {}% 9977 \vrule 9978 \hbox to \GMPboxwidth {}% 9979 \vrule} 9980 \hrule 9981 } 9982 \hbox {% 9983 \hbox to 0.8 pt {} 9984 \hbox to 3\GMPboxwidth {% 9985 \hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}} 9986 \hbox to 5\GMPboxwidth{% 9987 \setbox 0 = \hbox{@code{\_mp\_size}}% 9988 \dimen0 = 5\GMPboxwidth 9989 \advance\dimen0 by -\wd0 9990 \divide\dimen0 by 2 9991 \advance\dimen0 by -1em 9992 \dimen1 = \dimen0 9993 \setbox1 = \hbox{$\leftarrow$}% 9994 \setbox2 = \hbox{$\rightarrow$}% 9995 \advance\dimen0 by -\wd1 9996 \advance\dimen1 by -\wd2 9997 \hbox to 0.3 em {}% 9998 \box1 9999 \GMPcentreline{\dimen0}% 10000 \hfil 10001 \box0 10002 \hfil 10003 \GMPcentreline{\dimen1}% 10004 \box2} 10005}} 10006@end tex 10007@ifnottex 10008@example 10009 most least 10010significant significant 10011 limb limb 10012 10013 _mp_d 10014 |---- _mp_exp ---> | 10015 _____ _____ _____ _____ _____ 10016 
|_____|_____|_____|_____|_____| 10017 . <------------ radix point 10018 10019 <-------- _mp_size ---------> 10020@sp 1 10021@end example 10022@end ifnottex 10023 10024@noindent 10025The fields are as follows. 10026 10027@table @asis 10028@item @code{_mp_size} 10029The number of limbs currently in use, or the negative of that when 10030representing a negative value. Zero is represented by @code{_mp_size} and 10031@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is 10032unused. (In the future @code{_mp_exp} might be undefined when representing 10033zero.) 10034 10035@item @code{_mp_prec} 10036The precision of the mantissa, in limbs. In any calculation the aim is to 10037produce @code{_mp_prec} limbs of result (the most significant being non-zero). 10038 10039@item @code{_mp_d} 10040A pointer to the array of limbs which is the absolute value of the mantissa. 10041These are stored ``little endian'' as per the @code{mpn} functions, so 10042@code{_mp_d[0]} is the least significant limb and 10043@code{_mp_d[ABS(_mp_size)-1]} the most significant. 10044 10045The most significant limb is always non-zero, but there are no other 10046restrictions on its value, in particular the highest 1 bit can be anywhere 10047within the limb. 10048 10049@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being 10050for convenience (see below). There are no reallocations during a calculation, 10051only in a change of precision with @code{mpf_set_prec}. 10052 10053@item @code{_mp_exp} 10054The exponent, in limbs, determining the location of the implied radix point. 10055Zero means the radix point is just above the most significant limb. Positive 10056values mean a radix point offset towards the lower limbs and hence a value 10057@math{@ge{} 1}, as for example in the diagram above. Negative exponents mean 10058a radix point further above the highest limb. 

Naturally the exponent can be any value.  It doesn't have to fall within the
limbs as the diagram shows, and can be a long way above or a long way below.
Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data
are treated as zero.
@end table

The @code{_mp_size} and @code{_mp_prec} fields are @code{int}, although the
@code{mp_size_t} type is usually a @code{long}.  The @code{_mp_exp} field is
usually @code{long}.  This is done to make some fields just 32 bits on some
64-bit systems, thereby saving a few bytes of data space but still providing
plenty of precision and a very large range.


@sp 1
@noindent
The following points should be noted.

@table @asis
@item Low Zeros
The least significant limbs @code{_mp_d[0]} etc.@: can be zero, though such low
zeros can always be ignored.  Routines likely to produce low zeros check and
avoid them to save time in subsequent calculations, but for most routines
they're quite unlikely and aren't checked.

@item Mantissa Size Range
The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if
the value can be represented in fewer.  This means low precision values or
small integers stored in a high precision @code{mpf_t} can still be operated
on efficiently.

@code{_mp_size} can also be greater than @code{_mp_prec}.  Firstly a value is
allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d},
and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves
@code{_mp_size} unchanged and so the size can be arbitrarily bigger than
@code{_mp_prec}.

@item Rounding
All rounding is done on limb boundaries.  Calculating @code{_mp_prec} limbs
with the high limb non-zero will ensure the minimum precision requested by the
application is obtained.
10100 10101The use of simple ``trunc'' rounding towards zero is efficient, since there's 10102no need to examine extra limbs and increment or decrement. 10103 10104@item Bit Shifts 10105Since the exponent is in limbs, there are no bit shifts in basic operations 10106like @code{mpf_add} and @code{mpf_mul}. When differing exponents are 10107encountered all that's needed is to adjust pointers to line up the relevant 10108limbs. 10109 10110Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts, 10111but the choice is between an exponent in limbs which requires shifts there, or 10112one in bits which requires them almost everywhere else. 10113 10114@item Use of @code{_mp_prec+1} Limbs 10115The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just 10116@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its 10117operation. @code{mpf_add} for instance will do an @code{mpn_add} of 10118@code{_mp_prec} limbs. If there's no carry then that's the result, but if 10119there is a carry then it's stored in the extra limb of space and 10120@code{_mp_size} becomes @code{_mp_prec+1}. 10121 10122Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not 10123needed for the intended precision, only the @code{_mp_prec} high limbs. But 10124zeroing it out or moving the rest down is unnecessary. Subsequent routines 10125reading the value will simply take the high limbs they need, and this will be 10126@code{_mp_prec} if their target has that same precision. This is no more than 10127a pointer adjustment, and must be checked anyway since the destination 10128precision can be different from the sources. 10129 10130Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs 10131if available. This ensures that a variable which has @code{_mp_size} equal to 10132@code{_mp_prec+1} will get its full exact value copied. 
Strictly speaking this is unnecessary since only @code{_mp_prec} limbs are
needed for the application's requested precision, but it's considered that an
@code{mpf_set} from one variable into another of the same precision ought to
produce an exact copy.

@item Application Precisions
@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an
@code{_mp_prec}.  The value in bits is rounded up to a whole number of limbs,
then an extra limb is added, since the most significant limb of @code{_mp_d}
is only guaranteed to be non-zero and therefore might contain only one
significant bit.

@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra
limb from @code{_mp_prec} before converting to bits.  The net effect of
reading back with @code{mpf_get_prec} is simply the precision rounded up to a
multiple of @code{mp_bits_per_limb}.

Note that the extra limb added here, to allow for the high limb being merely
non-zero, is in addition to the extra limb allocated to @code{_mp_d}.  For
example with a 32-bit limb, an application request for 250 bits will be
rounded up to 8 limbs, then an extra limb added for the high limb being only
non-zero, giving an @code{_mp_prec} of 9.  @code{_mp_d} then gets 10 limbs
allocated.  Reading back with @code{mpf_get_prec} will take @code{_mp_prec},
subtract 1 limb and multiply by 32, giving 256 bits.

Strictly speaking, the fact the high limb has at least one bit means that a
float with, say, 3 limbs of 32 bits each will be holding at least 65 bits, but
for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice
multiple of the limb size.
@end table


@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals
@section Raw Output Internals
@cindex Raw output internals

@noindent
@code{mpz_out_raw} uses the following format.
10170 10171@tex 10172\global\newdimen\GMPboxwidth \GMPboxwidth=5em 10173\global\newdimen\GMPboxheight \GMPboxheight=3ex 10174\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}} 10175\GMPdisplay{% 10176\vbox{% 10177 \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}} 10178 \vbox {% 10179 \hrule 10180 \hbox{% 10181 \vrule height 2.5ex depth 1.5ex 10182 \hbox to \GMPboxwidth {\hfil size\hfil}% 10183 \vrule 10184 \hbox to 3\GMPboxwidth {\hfil data bytes\hfil}% 10185 \vrule} 10186 \hrule} 10187}} 10188@end tex 10189@ifnottex 10190@example 10191+------+------------------------+ 10192| size | data bytes | 10193+------+------------------------+ 10194@end example 10195@end ifnottex 10196 10197The size is 4 bytes written most significant byte first, being the number of 10198subsequent data bytes, or the twos complement negative of that when a negative 10199integer is represented. The data bytes are the absolute value of the integer, 10200written most significant byte first. 10201 10202The most significant data byte is always non-zero, so the output is the same 10203on all systems, irrespective of limb size. 10204 10205In GMP 1, leading zero bytes were written to pad the data bytes to a multiple 10206of the limb size. @code{mpz_inp_raw} will still accept this, for 10207compatibility. 10208 10209The use of ``big endian'' for both the size and data fields is deliberate, it 10210makes the data easy to read in a hex dump of a file. Unfortunately it also 10211means that the limb data must be reversed when reading or writing, so neither 10212a big endian nor little endian system can just read and write @code{_mp_d}. 10213 10214 10215@node C++ Interface Internals, , Raw Output Internals, Internals 10216@section C++ Interface Internals 10217@cindex C++ interface internals 10218 10219A system of expression templates is used to ensure something like @code{a=b+c} 10220turns into a simple call to @code{mpz_add} etc. 
For @code{mpf_class}
the scheme also ensures the precision of the final
destination is used for any temporaries within a statement like
@code{f=w*x+y*z}.  These are important features which a naive implementation
cannot provide.

A simplified description of the scheme follows.  The true scheme is
complicated by the fact that expressions have different return types.  For
detailed information, refer to the source code.

To perform an operation, say, addition, we first define a ``function object''
evaluating it,

@example
struct __gmp_binary_plus
@{
  static void eval(mpf_t f, mpf_t g, mpf_t h) @{ mpf_add(f, g, h); @}
@};
@end example

@noindent
And an ``additive expression'' object,

@example
__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >
operator+(const mpf_class &f, const mpf_class &g)
@{
  return __gmp_expr
    <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g);
@}
@end example

The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<@dots{}>>} is used to
encapsulate any possible kind of expression into a single template type.  In
fact even @code{mpf_class} etc are @code{typedef} specializations of
@code{__gmp_expr}.

Next we define assignment of @code{__gmp_expr} to @code{mpf_class}.

@example
template <class T>
mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr)
@{
  expr.eval(this->get_mpf_t(), this->precision());
  return *this;
@}

template <class Op>
void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval
(mpf_t f, mp_bitcnt_t precision)
@{
  Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t());
@}
@end example

where @code{expr.val1} and @code{expr.val2} are references to the expression's
operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the
@code{__gmp_expr}).

This way, the expression is actually evaluated only at the time of assignment,
when the required precision (that of @code{f}) is known.  Furthermore the
target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly
with @code{f} as the output argument.

Compound expressions are handled by defining operators taking subexpressions
as their arguments, like this:

@example
template <class T, class U>
__gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2)
@{
  return __gmp_expr
    <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> >
    (expr1, expr2);
@}
@end example

And the corresponding specializations of @code{__gmp_expr::eval}:

@example
template <class T, class U, class Op>
void __gmp_expr
<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval
(mpf_t f, mp_bitcnt_t precision)
@{
  // declare two temporaries
  mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision);
  Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t());
@}
@end example

The expression is thus recursively evaluated to any level of complexity and
all
subexpressions are evaluated to the precision of @code{f}.


@node Contributors, References, Internals, Top
@comment  node-name,  next,  previous,  up
@appendix Contributors
@cindex Contributors

Torbj@"orn Granlund wrote the original GMP library and is still the main
developer.  Code not explicitly attributed to others was contributed by
Torbj@"orn.  Several other individuals and organizations have contributed
to GMP.  Here is a list in chronological order of first contribution:

Gunnar Sj@"odin and Hans Riesel helped with mathematical problems in early
versions of the library.

Richard Stallman helped with the interface design and revised the first
version of this manual.

Brian Beuning and Doug Lea helped with testing of early versions of the
library and made creative suggestions.

John Amanatides of York University in Canada contributed the function
@code{mpz_probab_prime_p}.

Paul Zimmermann wrote the REDC-based @code{mpz_powm} code, the
Sch@"onhage-Strassen FFT multiply code, and the Karatsuba square root code.
He also improved the Toom3 code for GMP 4.2.  Paul sparked the development of
GMP 2, with his comparisons between bignum packages.  The ECMNET project Paul
is organizing was a driving force behind many of the optimizations in GMP 3.
Paul also wrote the new GMP 4.3 nth root code (with Torbj@"orn).

Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul)
contributed now defunct versions of @code{mpz_gcd}, @code{mpz_divexact},
@code{mpn_gcd}, and @code{mpn_bdivmod}, partially supported by CNPq (Brazil)
grant 301314194-2.

Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure.
He has also made valuable suggestions and tested numerous intermediary
releases.
10354 10355Joachim Hollman was involved in the design of the @code{mpf} interface, and in 10356the @code{mpz} design revisions for version 2. 10357 10358Bennet Yee contributed the initial versions of @code{mpz_jacobi} and 10359@code{mpz_legendre}. 10360 10361Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and 10362@file{mpn/m68k/rshift.S} (now in @file{.asm} form). 10363 10364Robert Harley of Inria, France and David Seal of ARM, England, suggested clever 10365improvements for population count. Robert also wrote highly optimized 10366Karatsuba and 3-way Toom multiplication functions for GMP 3, and contributed 10367the ARM assembly code. 10368 10369Torsten Ekedahl of the Mathematical department of Stockholm University provided 10370significant inspiration during several phases of the GMP development. His 10371mathematical expertise helped improve several algorithms. 10372 10373Linus Nordberg wrote the new configure system based on autoconf and 10374implemented the new random functions. 10375 10376Kevin Ryde worked on a large number of things: optimized x86 code, m4 asm 10377macros, parameter tuning, speed measuring, the configure system, function 10378inlining, divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas 10379number functions, printf and scanf functions, perl interface, demo expression 10380parser, the algorithms chapter in the manual, @file{gmpasm-mode.el}, and 10381various miscellaneous improvements elsewhere. 10382 10383Kent Boortz made the Mac OS 9 port. 10384 10385Steve Root helped write the optimized alpha 21264 assembly code. 10386 10387Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++ 10388@code{istream} input routines. 10389 10390Jason Moxham rewrote @code{mpz_fac_ui}. 10391 10392Pedro Gimeno implemented the Mersenne Twister and made other random number 10393improvements. 

Niels M@"oller wrote the sub-quadratic GCD and extended GCD code, the
quadratic Hensel division code, and (with Torbj@"orn) the new divide and
conquer division code for GMP 4.3.  Niels also helped implement the new Toom
multiply code for GMP 4.3 and implemented helper functions to simplify Toom
evaluations for GMP 5.0.  He wrote the original version of
@code{mpn_mulmod_bnm1}.

Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply strategy,
and found the optimal strategies for evaluation and interpolation in Toom
multiplication.

Marco Bodrato helped implement the new Toom multiply code for GMP 4.3 and
implemented most of the new Toom multiply and squaring code for 5.0.
He is the main author of the current @code{mpn_mulmod_bnm1} and
@code{mpn_mullo_n}.  Marco also wrote the functions @code{mpn_invert} and
@code{mpn_invertappr}.

David Harvey suggested the internal function @code{mpn_bdiv_dbm1}, implementing
division relevant to Toom multiplication.  He also worked on fast assembly
sequences, in particular on a fast AMD64 @code{mpn_mul_basecase}.

Martin Boij wrote @code{mpn_perfect_power_p}.

(This list is chronological, not ordered by significance.  If you have
contributed to GMP but are not listed above, please tell
@email{gmp-devel@@gmplib.org} about the omission!)

The development of the floating point functions of GNU MP 2 was supported in
part by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO
(POlynomial System SOlving).

The development of GMP 2, 3, and 4 was supported in part by the IDA Center for
Computing Sciences.

Thanks go to Hans Thorsen for donating an SGI system for the GMP test system
environment.
10429 10430@node References, GNU Free Documentation License, Contributors, Top 10431@comment node-name, next, previous, up 10432@appendix References 10433@cindex References 10434 10435@c FIXME: In tex, the @uref's are unhyphenated, which is good for clarity, 10436@c but being long words they upset paragraph formatting (the preceding line 10437@c can get badly stretched). Would like an conditional @* style line break 10438@c if the uref is too long to fit on the last line of the paragraph, but it's 10439@c not clear how to do that. For now explicit @texlinebreak{}s are used on 10440@c paragraphs that come out bad. 10441 10442@section Books 10443 10444@itemize @bullet 10445@item 10446Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in 10447Analytic Number Theory and Computational Complexity'', Wiley, 1998. 10448 10449@item 10450Richard Crandall and Carl Pomerance, ``Prime Numbers: A Computational 10451Perspective'', 2nd edition, Springer-Verlag, 2005. 10452@texlinebreak{} @uref{http://math.dartmouth.edu/~carlp/} 10453 10454@item 10455Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate 10456Texts in Mathematics number 138, Springer-Verlag, 1993. 10457@texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen/} 10458 10459@item 10460Donald E. Knuth, ``The Art of Computer Programming'', volume 2, 10461``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998. 10462@texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html} 10463 10464@item 10465John D. Lipson, ``Elements of Algebra and Algebraic Computing'', 10466The Benjamin Cummings Publishing Company Inc, 1981. 10467 10468@item 10469Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of 10470Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/} 10471 10472@item 10473Richard M. 
Stallman and the GCC Developer Community, ``Using the GNU Compiler 10474Collection'', Free Software Foundation, 2008, available online 10475@uref{http://gcc.gnu.org/onlinedocs/}, and in the GCC package 10476@uref{ftp://ftp.gnu.org/gnu/gcc/} 10477@end itemize 10478 10479@section Papers 10480 10481@itemize @bullet 10482@item 10483Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square 10484Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252. Also 10485available online as INRIA Research Report 4475, June 2001, 10486@uref{http://www.inria.fr/rrrt/rr-4475.html} 10487 10488@item 10489Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'', 10490Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022, 10491@texlinebreak{} @uref{http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022} 10492 10493@item 10494Torbj@"orn Granlund and Peter L. Montgomery, ``Division by Invariant Integers 10495using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June 104961994. Also available @uref{ftp://ftp.cwi.nl/pub/pmontgom/divcnst.psa4.gz} 10497(and .psl.gz). 10498 10499@item 10500Niels M@"oller and Torbj@"orn Granlund, ``Improved division by invariant 10501integers'', IEEE Transactions on Computers, 11 June 2010. 10502@uref{http://gmplib.org/~tege/division-paper.pdf} 10503 10504@item 10505Torbj@"orn Granlund and Niels M@"oller, ``Division of integers large and 10506small'', to appear. 10507 10508@item 10509Tudor Jebelean, 10510``An algorithm for exact division'', 10511Journal of Symbolic Computation, 10512volume 15, 1993, pp.@: 169-180. 
Research report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz}

@item
Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended
Abstract'', RISC-Linz technical report 96-31, @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz}

@item
Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'',
ISSAC 97, pp.@: 339-341. Technical report available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz}

@item
Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93,
pp.@: 111-116. Technical report version available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz}

@item
Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD
of Long Integers'', Journal of Symbolic Computation, volume 19, 1995,
pp.@: 145-157. Technical report version also available @texlinebreak{}
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz}

@item
Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'',
Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455. Early
technical report version also available
@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz}

@item
Makoto Matsumoto and Takuji Nishimura, ``Mersenne Twister: A 623-dimensionally
equidistributed uniform pseudorandom number generator'', ACM Transactions on
Modeling and Computer Simulation, volume 8, January 1998, pp.@: 3-30.
Available online @texlinebreak{}
@uref{http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.ps.gz} (or .pdf)

@item
R. Moenck and A.
Borodin, ``Fast Modular Transforms via Division'',
Proceedings of the 13th Annual IEEE Symposium on Switching and Automata
Theory, October 1972, pp.@: 90-96. Reprinted as ``Fast Modular Transforms'',
Journal of Computer and System Sciences, volume 8, number 3, June 1974,
pp.@: 366-386.

@item
Niels M@"oller, ``On Sch@"onhage's algorithm and subquadratic integer GCD
computation'', in Mathematics of Computation, volume 77, January 2008, pp.@:
589-607.

@item
Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in
Mathematics of Computation, volume 44, number 170, April 1985.

@item
Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser
Zahlen'', Computing 7, 1971, pp.@: 281-292.

@item
Kenneth Weber, ``The accelerated integer GCD algorithm'',
ACM Transactions on Mathematical Software,
volume 21, number 1, March 1995, pp.@: 111-122.

@item
Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805,
November 1999, @uref{http://www.inria.fr/rrrt/rr-3805.html}

@item
Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root
Implementations'', @texlinebreak{}
@uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz}

@item
Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE
Symposium on Computer Arithmetic, 1993, pp.@: 260-271. Reprinted as ``More
on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers,
volume 43, number 8, August 1994, pp.@: 899-908.
@end itemize


@node GNU Free Documentation License, Concept Index, References, Top
@appendix GNU Free Documentation License
@cindex GNU Free Documentation License
@cindex Free Documentation License
@cindex Documentation license
@include fdl-1.3.texi


@node Concept Index, Function Index, GNU Free Documentation License, Top
@comment node-name, next, previous, up
@unnumbered Concept Index
@printindex cp

@node Function Index, , Concept Index, Top
@comment node-name, next, previous, up
@unnumbered Function and Type Index
@printindex fn

@bye

@c Local variables:
@c fill-column: 78
@c compile-command: "make gmp.info"
@c End: