.ft CW
.ta 8n +8n +8n +8n +8n +8n +8n
.ft
.TL
A Manual for the Plan 9 assembler
.AU
.I "Rob Pike"
.AI
rob@plan9.bell-labs.com
.SH
Machines
.PP
There is an assembler for each of the MIPS, SPARC, Intel 386, AMD64,
Motorola 68020 and 68000, IBM PowerPC, DEC Alpha, and ARM.
The 68020 assembler,
.CW 2a ,
is the oldest and in many ways the prototype.
The assemblers are really just variations of a single program:
they share many properties such as left-to-right assignment order for
instruction operands and the synthesis of macro instructions
such as
.CW MOVE
to hide the peculiarities of the load and store structure of the machines.
To keep things concrete, the first part of this manual is
specifically about the 68020.
At the end is a description of the differences among
the other assemblers.
.ig
.PP
The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
is a prerequisite for this manual.
..
.SH
Registers
.PP
All pre-defined symbols in the assembler are upper-case.
Data registers are
.CW R0
through
.CW R7 ;
address registers are
.CW A0
through
.CW A7 ;
floating-point registers are
.CW F0
through
.CW F7 .
.PP
A pointer in
.CW A6
is used by the C compiler to point to data, enabling short addresses to
be used more often.
The value of
.CW A6
is constant and must be set during C program initialization
to the address of the externally-defined symbol
.CW a6base .
.PP
The following hardware registers are defined in the assembler; their
meaning should be obvious given a 68020 manual:
.CW CAAR ,
.CW CACR ,
.CW CCR ,
.CW DFC ,
.CW ISP ,
.CW MSP ,
.CW SFC ,
.CW SR ,
.CW USP ,
and
.CW VBR .
.PP
The assembler also defines several pseudo-registers that
manipulate the stack:
.CW FP ,
.CW SP ,
and
.CW TOS .
.CW FP
is the frame pointer, so
.CW 0(FP)
is the first argument,
.CW 4(FP)
is the second, and so on.
.CW SP
is the local stack pointer, where automatic variables are held
(SP is a pseudo-register only on the 68020);
.CW 0(SP)
is the first automatic, and so on as with
.CW FP .
Finally,
.CW TOS
is the top-of-stack register, used for pushing parameters to procedures,
saving temporary values, and so on.
.PP
The assembler and loader track these pseudo-registers so
the above statements are true regardless of what has been
pushed on the hardware stack, pointed to by
.CW A7 .
The name
.CW A7
refers to the hardware stack pointer, but beware of mixed use of
.CW A7
and the above stack-related pseudo-registers, which will cause trouble.
Note, too, that the loader observes that the
.CW PEA
instruction alters
.CW SP
and therefore inserts a corresponding pop before all returns.
The assembler accepts a label-like name to be attached to
.CW FP
and
.CW SP
uses, such as
.CW p+0(FP) ,
to help document that
.CW p
is the first argument to a routine.
The name goes in the symbol table but has no significance to the result
of the program.
.SH
Referring to data
.PP
All external references must be made relative to some pseudo-register,
either
.CW PC
(the virtual program counter) or
.CW SB
(the ``static base'' register).
.CW PC
counts instructions, not bytes of data.
For example, to branch to the second following instruction, that is,
to skip one instruction, one may write
.P1
	BRA	2(PC)
.P2
Labels are also allowed, as in
.P1
	BRA	return
	NOP
return:
	RTS
.P2
When using labels, there is no
.CW (PC)
annotation.
.PP
The pseudo-register
.CW SB
refers to the beginning of the address space of the program.
Thus, references to global data and procedures are written as
offsets to
.CW SB ,
as in
.P1
	MOVL	$array(SB), TOS
.P2
to push the address of a global array on the stack, or
.P1
	MOVL	array+4(SB), TOS
.P2
to push the second (4-byte) element of the array.
Note the use of an offset; the complete list of addressing modes is given below.
Similarly, subroutine calls must use
.CW SB :
.P1
	BSR	exit(SB)
.P2
File-static variables have syntax
.P1
	local<>+4(SB)
.P2
The
.CW <>
will be filled in at load time by a unique integer.
.PP
When a program starts, it must execute
.P1
	MOVL	$a6base(SB), A6
.P2
before accessing any global data.
(On machines such as the MIPS and SPARC that cannot load a register
in a single instruction, constants are loaded through the static base
register. The loader recognizes code that initializes the static
base register and treats it specially. You must be careful, however,
not to load large constants on such machines when the static base
register is not set up, such as early in interrupt routines.)
.SH
Expressions
.PP
Expressions are mostly what one might expect.
Where an offset or a constant is expected,
a primary expression with unary operators is allowed.
A general C constant expression is allowed in parentheses.
.PP
Source files are preprocessed exactly as in the C compiler, so
.CW #define
and
.CW #include
work.
.SH
Addressing modes
.PP
The simple addressing modes are shared by all the assemblers.
Here, for completeness, follows a table of all the 68020 addressing modes,
since that machine has the richest set.
In the table,
.CW o
is an offset, which if zero may be elided, and
.CW d
is a displacement, which is a constant between -128 and 127 inclusive.
Many of the modes listed have the same name;
scrutiny of the format will show what default is being applied.
For instance, indexed mode with no address register supplied operates
as though a zero-valued register were used.
For "offset" read "displacement."
For "\f(CW.s\fP" read one of
.CW .L
or
.CW .W
followed by
.CW *1 ,
.CW *2 ,
.CW *4 ,
or
.CW *8
to indicate the size and scaling of the data.
.IP
.TS
l lfCW.
data register	R0
address register	A0
floating-point register	F0
special names	CAAR, CACR, etc.
constant	$con
floating point constant	$fcon
external symbol	name+o(SB)
local symbol	name<>+o(SB)
automatic symbol	name+o(SP)
argument	name+o(FP)
address of external	$name+o(SB)
address of local	$name<>+o(SB)
indirect post-increment	(A0)+
indirect pre-decrement	-(A0)
indirect with offset	o(A0)
indexed with offset	o()(R0.s)
indexed with offset	o(A0)(R0.s)
external indexed	name+o(SB)(R0.s)
local indexed	name<>+o(SB)(R0.s)
automatic indexed	name+o(SP)(R0.s)
parameter indexed	name+o(FP)(R0.s)
offset indirect post-indexed	d(o())(R0.s)
offset indirect post-indexed	d(o(A0))(R0.s)
external indirect post-indexed	d(name+o(SB))(R0.s)
local indirect post-indexed	d(name<>+o(SB))(R0.s)
automatic indirect post-indexed	d(name+o(SP))(R0.s)
parameter indirect post-indexed	d(name+o(FP))(R0.s)
offset indirect pre-indexed	d(o()(R0.s))
offset indirect pre-indexed	d(o(A0))
offset indirect pre-indexed	d(o(A0)(R0.s))
external indirect pre-indexed	d(name+o(SB))
external indirect pre-indexed	d(name+o(SB)(R0.s))
local indirect pre-indexed	d(name<>+o(SB))
local indirect pre-indexed	d(name<>+o(SB)(R0.s))
automatic indirect pre-indexed	d(name+o(SP))
automatic indirect pre-indexed	d(name+o(SP)(R0.s))
parameter indirect pre-indexed	d(name+o(FP))
parameter indirect pre-indexed	d(name+o(FP)(R0.s))
.TE
.in
.SH
Laying down data
.PP
Placing data in the instruction stream, say for interrupt vectors, is easy:
the
pseudo-instructions
.CW LONG
and
.CW WORD
(but not
.CW BYTE )
lay down the value of their single argument, of the appropriate size,
as if it were an instruction:
.P1
	LONG	$12345
.P2
places the long 12345 (base 10)
in the instruction stream.
(On most machines,
the only such operator is
.CW WORD
and it lays down 32-bit quantities.
The 386 has all three:
.CW LONG ,
.CW WORD ,
and
.CW BYTE .
The AMD64 adds
.CW QUAD
for 64-bit values.)
.PP
Placing information in the data section is more painful.
The pseudo-instruction
.CW DATA
does the work, given two arguments: an address at which to place the item,
including its size,
and the value to place there. For example, to define a character array
.CW array
containing the characters
.CW abc
and a terminating null:
.P1
	DATA	array+0(SB)/1, $'a'
	DATA	array+1(SB)/1, $'b'
	DATA	array+2(SB)/1, $'c'
	GLOBL	array(SB), $4
.P2
or
.P1
	DATA	array+0(SB)/4, $"abc\ez"
	GLOBL	array(SB), $4
.P2
The
.CW /1
defines the number of bytes to define,
.CW GLOBL
makes the symbol global, and the
.CW $4
says how many bytes the symbol occupies.
Uninitialized data is zeroed automatically.
The character
.CW \ez
is equivalent to the C
.CW \e0 .
The string in a
.CW DATA
statement may contain a maximum of eight bytes;
build larger strings piecewise.
Two pseudo-instructions,
.CW DYNT
and
.CW INIT ,
allow the (obsolete) Alef compilers to build dynamic type information during the load
phase.
The
.CW DYNT
pseudo-instruction has two forms:
.P1
	DYNT	, ALEF_SI_5+0(SB)
	DYNT	ALEF_AS+0(SB), ALEF_SI_5+0(SB)
.P2
In the first form,
.CW DYNT
defines the symbol to be a small unique integer constant, chosen by the loader,
which is some multiple of the word size.
In the second form,
.CW DYNT
defines the second symbol in the same way,
places the address of the most recently
defined text symbol in the array specified by the first symbol at the
index defined by the value of the second symbol,
and then adjusts the size of the array accordingly.
.PP
The
.CW INIT
pseudo-instruction takes the same parameters as a
.CW DATA
statement. Its symbol is used as the base of an array and the
data item is installed in the array at the offset specified by the most recent
.CW DYNT
pseudo-instruction.
The size of the array is adjusted accordingly.
The
.CW DYNT
and
.CW INIT
pseudo-instructions are not implemented on the 68020.
.SH
Defining a procedure
.PP
Entry points are defined by the pseudo-operation
.CW TEXT ,
which takes as arguments the name of the procedure (including the ubiquitous
.CW (SB) )
and the number of bytes of automatic storage to pre-allocate on the stack,
which will usually be zero when writing assembly language programs.
On machines with a link register, such as the MIPS and SPARC,
the special value -4 instructs the loader to generate no PC save
and restore instructions, even if the function is not a leaf.
Here is a complete procedure that returns the sum
of its two arguments:
.P1
TEXT	sum(SB), $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
An optional middle argument
to the
.CW TEXT
pseudo-op is a bit field of options to the loader.
Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
the program.
For example,
.P1
TEXT	sum(SB), 1, $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
will not be profiled; the first version above would be.
Subroutines with peculiar state, such as system call routines,
should not be profiled.
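.PP
Because source files pass through the C preprocessor, the option bits
can be given names.
The following is a sketch only:
.CW NOPROF
and
.CW sum3
are names invented here for illustration, not symbols the assembler defines,
and the third argument is assumed to sit at
.CW 8(FP) ,
continuing the 4-byte spacing described for
.CW FP .
.P1
#define	NOPROF	1	/* hypothetical name for the 1 (no profiling) bit */

TEXT	sum3(SB), NOPROF, $0
	MOVL	a+0(FP), R0	/* first argument */
	ADDL	b+4(FP), R0	/* add the second */
	ADDL	c+8(FP), R0	/* add the third */
	RTS
.P2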
.PP
Setting the 2 bit allows multiple definitions of the same
.CW TEXT
symbol in a program; the loader will place only one such function in the image.
It was emitted only by the Alef compilers.
.PP
Subroutines to be called from C should place their result in
.CW R0 ,
even if it is an address.
Floating point values are returned in
.CW F0 .
Functions that return a structure to a C program
receive as their first argument the address of the location to
store the result;
.CW R0
is unused in the calling protocol for such procedures.
A caller is responsible for saving its own registers,
and a subroutine is therefore free to use any registers without saving them (``caller saves'').
.CW A6
and
.CW A7
are the exceptions as described above.
.SH
When in doubt
.PP
If you get confused, try using the
.CW -S
option to
.CW 2c
and compiling a sample program.
The standard output is valid input to the assembler.
.SH
Instructions
.PP
The instruction set of the assembler is not identical to that
of the machine.
It is chosen to match what the compiler generates, augmented
slightly by specific needs of the operating system.
For example,
.CW 2a
does not distinguish between the various forms of
.CW MOVE
instruction: move quick, move address, etc. Instead the context
does the job. For example,
.P1
	MOVL	$1, R1
	MOVL	A0, R2
	MOVW	SR, R3
.P2
generates official
.CW MOVEQ ,
.CW MOVEA ,
and
.CW MOVESR
instructions.
A number of instructions do not have the syntax necessary to specify
their entire capabilities. Notable examples are the bitfield
instructions, the
multiply and divide instructions, etc.
For a complete set of generated instruction names (in
.CW 2a
notation, not Motorola's) see the file
.CW /sys/src/cmd/2c/2.out.h .
Despite its name, this file contains an enumeration of the
instructions that appear in the intermediate files generated
by the compiler, which correspond exactly to lines of assembly language.
.PP
The MC68000 assembler,
.CW 1a ,
is essentially the same, honoring the appropriate subset of the instructions
and addressing modes.
The definitions of these are, nonetheless, part of
.CW 2.out.h .
.SH
Laying down instructions
.PP
The loader modifies the code produced by the assembler and compiler.
It folds branches,
copies short sequences of code to eliminate branches,
and discards unreachable code.
The first instruction of every function is assumed to be reachable.
The pseudo-instruction
.CW NOP ,
which you may see in compiler output,
means no instruction at all, rather than an instruction that does nothing.
The loader discards all
.CW NOP 's.
.PP
To generate a true
.CW NOP
instruction, or any other instruction not known to the assembler, use a
.CW WORD
pseudo-instruction.
Such instructions on RISCs are not scheduled by the loader and must have
their delay slots filled manually.
.SH
MIPS
.PP
The registers are only addressed by number:
.CW R0
through
.CW R31 .
.CW R29
is the stack pointer;
.CW R30
is used as the static base pointer, the analogue of
.CW A6
on the 68020.
Its value is the address of the global symbol
.CW setR30(SB) .
The register holding returned values from subroutines is
.CW R1 .
When a function is called, space for the first argument
is reserved at
.CW 0(FP)
but in C (not Alef) the value is passed in
.CW R1
instead.
.PP
The loader uses
.CW R28
as a temporary. The system uses
.CW R26
and
.CW R27
as interrupt-time temporaries. Therefore none of these registers
should be used in user code.
.PP
The control registers are not known to the assembler.
Instead they are numbered registers
.CW M0 ,
.CW M1 ,
etc.
Use this trick to access, say,
.CW STATUS :
.P1
#define	STATUS	12
	MOVW	M(STATUS), R1
.P2
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention,
.CW F24
must be initialized to the value 0.0,
.CW F26
to 0.5,
.CW F28
to 1.0, and
.CW F30
to 2.0;
this is done by the operating system.
.PP
The instructions and their syntax are different from those of the manufacturer's
manual.
There are no
.CW lui
and kin; instead there are
.CW MOVW
(move word),
.CW MOVH
(move halfword),
and
.CW MOVB
(move byte) pseudo-instructions. If the operand is unsigned, the instructions
are
.CW MOVHU
and
.CW MOVBU .
The order of operands is from left to right in dataflow order, just as
on the 68020 but not as in MIPS documentation.
This means that the
.CW Bcond
instructions are reversed with respect to the book; for example, a
.CW va
.CW BGTZ
generates a MIPS
.CW bltz
instruction.
.PP
The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
It understands the 64-bit instructions
.CW MOVV ,
.CW MOVVL ,
.CW ADDV ,
.CW ADDVU ,
.CW SUBV ,
.CW SUBVU ,
.CW MULV ,
.CW MULVU ,
.CW DIVV ,
.CW DIVVU ,
.CW SLLV ,
.CW SRLV ,
and
.CW SRAV .
The assembler does not have any cache, load-linked, or store-conditional instructions.
.PP
Some assembler instructions are expanded into multiple instructions by the loader.
For example the loader may convert the load of a 32 bit constant into an
.CW lui
followed by an
.CW ori .
.PP
Assembler instructions should be laid out as if there
were no load, branch, or floating point compare delay slots;
the loader will rearrange\(em\f2schedule\f1\(emthe instructions
to guarantee correctness and improve performance.
The only exception is that the correct scheduling of instructions
that use control registers varies from model to model of machine
(and is often undocumented) so you should schedule such instructions
by hand to guarantee correct behavior.
The loader generates
.P1
	NOR	R0, R0, R0
.P2
when it needs a true no-op instruction.
Use exactly this instruction when scheduling code manually;
the loader recognizes it and schedules the code before it and after it independently. Also,
.CW WORD
pseudo-ops are scheduled like no-ops.
.PP
The
.CW NOSCHED
pseudo-op disables instruction scheduling
(scheduling is enabled by default);
.CW SCHED
re-enables it.
Branch folding, code copying, and dead code elimination are
disabled for instructions that are not scheduled.
.SH
SPARC
.PP
Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
Registers have numerical names only:
.CW R0
through
.CW R31 .
Forget about register windows: Plan 9 doesn't use them at all.
The machine has 32 global registers, period.
.CW R1
[sic] is the stack pointer.
.CW R2
is the static base register, with value the address of
.CW setSB(SB) .
.CW R7
is the return register and also the register holding the first
argument to a C (not Alef) function, again with space reserved at
.CW 0(FP) .
.CW R14
is the loader temporary.
.PP
Floating-point registers are exactly as on the MIPS.
.PP
The control registers are known by names such as
.CW FSR .
The instructions to access these registers are
.CW MOVW
instructions, for example
.P1
	MOVW	Y, R8
.P2
for the SPARC instruction
.P1
	rdy	%r8
.P2
.PP
Move instructions are similar to those on the MIPS: pseudo-operations
that turn into appropriate sequences of
.CW sethi
instructions, adds, etc.
Instructions read from left to right.
Because the arguments are
flipped to
.CW SUBCC ,
the condition codes are not inverted as on the MIPS.
.PP
The syntax for the ASI stuff is, for example to move a word from ASI 2:
.P1
	MOVW	(R7, 2), R8
.P2
The syntax for double indexing is
.P1
	MOVW	(R7+R8), R9
.P2
.PP
The SPARC's instruction scheduling is similar to the MIPS's.
The official no-op instruction is:
.P1
	ORN	R0, R0, R0
.P2
.SH
i386
.PP
The assembler assumes 32-bit protected mode.
The register names are
.CW SP ,
.CW AX ,
.CW BX ,
.CW CX ,
.CW DX ,
.CW BP ,
.CW DI ,
and
.CW SI .
The stack pointer (not a pseudo-register) is
.CW SP
and the return register is
.CW AX .
There is no physical frame pointer but, as for the MIPS,
.CW FP
is a pseudo-register that acts as
a frame pointer.
.PP
Opcode names are mostly the same as those listed in the Intel manual
with an
.CW L ,
.CW W ,
or
.CW B
appended to identify 32-bit,
16-bit, and 8-bit operations.
The exceptions are loads, stores, and conditionals.
All load and store opcodes to and from general registers, special registers
(such as
.CW CR0 ,
.CW CR3 ,
.CW GDTR ,
.CW IDTR ,
.CW SS ,
.CW CS ,
.CW DS ,
.CW ES ,
.CW FS ,
and
.CW GS )
or memory are written
as
.P1
	MOV\f2x\fP	src,dst
.P2
where
.I x
is
.CW L ,
.CW W ,
or
.CW B .
Thus to get
.CW AL
use a
.CW MOVB
instruction. If you need to access
.CW AH ,
you must mention it explicitly in a
.CW MOVB :
.P1
	MOVB	AH, BX
.P2
There are many examples of illegal moves, for example,
.P1
	MOVB	BP, DI
.P2
that the loader actually implements as pseudo-operations.
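.PP
As an illustrative sketch of these conventions, the 68020 procedure
that adds its two arguments might be rendered for the 386 as follows.
The argument names are arbitrary, and the layout of the arguments at
.CW 0(FP)
and
.CW 4(FP)
is assumed to match the 68020 description rather than stated above.
.P1
TEXT	sum(SB), $0
	MOVL	a+0(FP), AX	/* load: written MOVL, not an Intel load opcode */
	ADDL	b+4(FP), AX	/* Intel ADD with L appended for 32 bits */
	RET			/* result is left in AX */
.P2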
.PP
The names of conditions in all conditional instructions
.CW J , (
.CW SET )
follow the conventions of the 68020 instead of those of the Intel
assembler:
.CW JOS ,
.CW JOC ,
.CW JCS ,
.CW JCC ,
.CW JEQ ,
.CW JNE ,
.CW JLS ,
.CW JHI ,
.CW JMI ,
.CW JPL ,
.CW JPS ,
.CW JPC ,
.CW JLT ,
.CW JGE ,
.CW JLE ,
and
.CW JGT
instead of
.CW JO ,
.CW JNO ,
.CW JB ,
.CW JNB ,
.CW JZ ,
.CW JNZ ,
.CW JBE ,
.CW JNBE ,
.CW JS ,
.CW JNS ,
.CW JP ,
.CW JNP ,
.CW JL ,
.CW JNL ,
.CW JLE ,
and
.CW JNLE .
.PP
The addressing modes have syntax like
.CW AX ,
.CW (AX) ,
.CW (AX)(BX*4) ,
.CW 10(AX) ,
and
.CW 10(AX)(BX*4) .
The offsets from
.CW AX
can be replaced by offsets from
.CW FP
or
.CW SB
to access names, for example
.CW extern+5(SB)(AX*2) .
.PP
Other notes: Non-relative
.CW JMP
and
.CW CALL
have a
.CW *
added to the syntax.
Only
.CW LOOP ,
.CW LOOPEQ ,
and
.CW LOOPNE
are legal loop instructions. Only
.CW REP
and
.CW REPN
are recognized repeaters. These are not prefixes, but rather
stand-alone opcodes that precede the strings, for example
.P1
	CLD; REP; MOVSL
.P2
Segment override prefixes in
.CW MOD/RM
fields are not supported.
.SH
AMD64
.PP
The assembler's conventions are similar to those for the 386, above.
The architecture provides extra fixed-point registers
.CW R8
to
.CW R15 .
All registers are 64 bit, but instructions access low-order 8, 16 and 32 bits
as described in the processor handbook.
For example,
.CW MOVL
to
.CW AX
puts a value in the low-order 32 bits and clears the top 32 bits to zero.
Literal operands are limited to signed 32 bit values, which are sign-extended
to 64 bits in 64 bit operations; the exception is
.CW MOVQ ,
which allows 64-bit literals.
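A short sketch of these three rules follows; the operand values are
arbitrary and chosen here only for illustration.
.P1
	MOVL	$1, AX	/* low 32 bits set; top 32 bits cleared */
	ADDQ	$-1, AX	/* 32-bit literal, sign-extended to 64 bits */
	MOVQ	$0x123456789ABCDEF0, BX	/* MOVQ accepts a 64-bit literal */
.P2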
MMX registers are
.CW M0
to
.CW M7 ,
and
XMM registers are
.CW X0
to
.CW X15 .
.PP
There are many new instructions, including the MMX and XMM media instructions,
and conditional move instructions.
As with the 386 instruction names,
all new 64-bit integer instructions, and the MMX and XMM instructions
uniformly use
.CW L
for `long word' (32 bits) and
.CW Q
for `quad word' (64 bits).
Some instructions use
.CW O
(`octword') for 128-bit values, where the processor handbook
variously uses
.CW O
or
.CW DQ .
The assembler also consistently uses
.CW PL
for `packed long' in
XMM instructions, instead of
.CW Q ,
.CW DQ
or
.CW PI .
Either
.CW MOVL
or
.CW MOVQ
can be used to move values to and from control registers, even when
the registers might be 64 bits.
The assembler often accepts the handbook's name to ease conversion
of existing code (but remember that the operand order is uniformly
source then destination).
.PP
C's
.CW "long long"
type is 64 bits, but passed and returned by value, not by reference.
More notably, C pointer values are 64 bits, and thus
.CW "long long"
and
.CW "unsigned long long"
are the only integer types wide enough to hold a pointer value.
The C compiler and library use the XMM floating-point instructions, not
the old 387 ones, although the latter are implemented by assembler and loader.
The compiler provides external registers,
allocated from
.CW R15
down.
.PP
The calling conventions are different from the 386.
.CW CALL
pushes, and
.CW RET
pops a 64-bit return address on the stack.
The first argument, if it is an integer or pointer, is passed in a register,
namely
.CW BP
(it can be referred to in assembly code by the pseudonym
.CW RARG ).
.CW AX
holds the return value from subroutines as before.
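.PP
As a hedged sketch of this convention, a routine
.CW inc
(a name invented here for illustration) that returns its 64-bit integer
argument plus one might be written
.P1
TEXT	inc(SB), $0
	MOVQ	RARG, AX	/* first argument arrives in BP, alias RARG */
	ADDQ	$1, AX
	RET			/* result is returned in AX */
.P2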
Floating-point results are returned in
.CW X0 ,
although currently the first parameter is not passed in a register if floating-point.
All parameters less than 8 bytes in length have 8 byte slots reserved on the stack
to preserve alignment and simplify variable-length argument list access,
including the first parameter when passed in a register,
although bytes 4 to 7 are not initialized.
.PP
The assembler assumes 64-bit mode unless a
.CW MODE
pseudo-operation is given:
.P1
	MODE	$32
.P2
to change to 32-bit mode.
The effect is mainly to diagnose instructions that are illegal in
the given mode, but the loader will also assume 32-bit operands and addresses,
and 32-bit PC values for call and return.
.SH
Alpha
.PP
On the Alpha, all registers are 64 bits. The architecture handles 32-bit values
by giving them a canonical format (sign extension in the case of integer registers).
Registers are numbered
.CW R0
through
.CW R31 .
.CW R0
holds the return value from subroutines, and also the first parameter.
.CW R30
is the stack pointer,
.CW R29
is the static base,
.CW R26
is the link register, and
.CW R27
and
.CW R28
are linker temporaries.
.PP
Floating point registers are numbered
.CW F0
to
.CW F31 .
.CW F28
contains
.CW 0.5 ,
.CW F29
contains
.CW 1.0 ,
and
.CW F30
contains
.CW 2.0 .
.CW F31
is always
.CW 0.0
on the Alpha.
.PP
The extension character for
.CW MOV
follows DEC's notation:
.CW B
for byte (8 bits),
.CW W
for word (16 bits),
.CW L
for long (32 bits),
and
.CW Q
for quadword (64 bits).
Byte and ``word'' loads and stores may be made unsigned
by appending a
.CW U .
.CW S
and
.CW T
refer to IEEE floating point single precision (32 bits) and double precision (64 bits), respectively.
.SH
PowerPC
.PP
The PowerPC follows the Plan 9 model set by the MIPS and SPARC,
not the elaborate ABIs.
The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
there is no support for the older POWER instructions.
Registers are
.CW R0
through
.CW R31 .
.CW R0
is initialized to zero; this is done by C start up code
and assumed by the compiler and loader.
.CW R1
is the stack pointer.
.CW R2
is the static base register, with value the address of
.CW setSB(SB) .
.CW R3
is the return register and also the register holding the first
argument to a C function, with space reserved at
.CW 0(FP)
as on the MIPS.
.CW R31
is the loader temporary.
The external registers in Plan 9's C are allocated from
.CW R30
down.
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention, several registers are initialized
to specific values; this is done by the operating system.
.CW F27
must be initialized to the value
.CW 0x4330000080000000
(used by float-to-int conversion),
.CW F28
to the value 0.0,
.CW F29
to 0.5,
.CW F30
to 1.0, and
.CW F31
to 2.0.
.PP
As on the MIPS and SPARC, the assembler accepts arbitrary literals
as operands to
.CW MOVW ,
and also to
.CW ADD
and others where `immediate' variants exist,
and the loader generates sequences
of
.CW addi ,
.CW addis ,
.CW oris ,
etc. as required.
The register indirect addressing modes use the same syntax as the SPARC,
including double indexing when allowed.
.PP
The instruction names are generally derived from the Motorola ones,
subject to slight transformation:
the
.CW . ' `
marking the setting of condition codes is replaced by
.CW CC ,
and when the letter
.CW o ' `
represents `OE=1' it is replaced by
.CW V .
Thus
.CW add ,
.CW addo.
and
.CW subfzeo.
become
.CW ADD ,
.CW ADDVCC
and
.CW SUBFZEVCC .
As well as the three-operand conditional branch instruction
.CW BC ,
the assembler provides pseudo-instructions for the common cases:
.CW BEQ ,
.CW BNE ,
.CW BGT ,
.CW BGE ,
.CW BLT ,
.CW BLE ,
.CW BVC ,
and
.CW BVS .
The unconditional branch instruction is
.CW BR .
Indirect branches use
.CW "(CTR)"
or
.CW "(LR)"
as target.
.PP
Load or store operations are replaced by
.CW MOV
variants in the usual way:
.CW MOVW
(move word),
.CW MOVH
(move halfword with sign extension), and
.CW MOVB
(move byte with sign extension, a pseudo-instruction),
with unsigned variants
.CW MOVHZ
and
.CW MOVBZ ,
and byte-reversing
.CW MOVWBR
and
.CW MOVHBR .
`Load or store with update' versions are
.CW MOVWU ,
.CW MOVHU ,
and
.CW MOVBZU .
Load or store multiple is
.CW MOVMW .
The exceptions are the string instructions, which are
.CW LSW
and
.CW STSW ,
and the reservation instructions
.CW lwarx
and
.CW stwcx. ,
which are
.CW LWAR
and
.CW STWCCC ,
all with operands in the usual data-flow order.
Floating-point load or store instructions are
.CW FMOVD ,
.CW FMOVDU ,
.CW FMOVS ,
and
.CW FMOVSU .
The register to register move instructions
.CW fmr
and
.CW fmr.
are written
.CW FMOVD
and
.CW FMOVDCC .
.PP
The assembler knows the commonly used special purpose registers:
.CW CR ,
.CW CTR ,
.CW DEC ,
.CW LR ,
.CW MSR ,
and
.CW XER .
The rest, which are often architecture-dependent, are referenced as
.CW SPR(n) .
The segment registers of the 60x series are similarly
.CW SEG(n) ,
but
.I n
can also be a register name, as in
.CW SEG(R3) .
Moves between special purpose registers and general purpose ones,
when allowed by the architecture,
are written as
.CW MOVW ,
replacing
.CW mfcr ,
.CW mtcr ,
.CW mfmsr ,
.CW mtmsr ,
.CW mtspr ,
.CW mfspr ,
.CW mftb ,
and many others.
.PP
The fields of the condition register
.CW CR
are referenced as
.CW CR(0)
through
.CW CR(7) .
They are used by the
.CW MOVFL
(move field) pseudo-instruction,
which produces
.CW mcrf
or
.CW mtcrf .
For example:
.P1
	MOVFL	CR(3), CR(0)
	MOVFL	R3, CR(1)
	MOVFL	R3, $7, CR
.P2
They are also accepted in
the conditional branch instruction, for example
.P1
	BEQ	CR(7), label
.P2
Fields of the
.CW FPSCR
are accessed using
.CW MOVFL
in a similar way:
.P1
	MOVFL	FPSCR, F0
	MOVFL	F0, FPSCR
	MOVFL	F0, $7, FPSCR
	MOVFL	$0, FPSCR(3)
.P2
producing
.CW mffs ,
.CW mtfsf ,
or
.CW mtfsfi
as appropriate.
.SH
ARM
.PP
The assembler provides access to
.CW R0
through
.CW R14
and the
.CW PC .
The stack pointer is
.CW R13 ,
the link register is
.CW R14 ,
and the static base register is
.CW R12 .
.CW R0
is the return register and also the register holding
the first argument to a subroutine.
The assembler supports the
.CW CPSR
and
.CW SPSR
registers.
It also knows about coprocessor registers
.CW C0
through
.CW C15 .
Floating registers are
.CW F0
through
.CW F7 ,
.CW FPSR
and
.CW FPCR .
.PP
As with the other architectures, loads and stores are called
.CW MOV ,
e.g.
.CW MOVW
for load word or store word, and
.CW MOVM
for
load or store multiple,
depending on the operands.
.PP
Addressing modes are supported by suffixes to the instructions:
.CW .IA
(increment after),
.CW .IB
(increment before),
.CW .DA
(decrement after), and
.CW .DB
(decrement before).
These can only be used with the
.CW MOV
instructions.
The move multiple instruction,
.CW MOVM ,
defines a range of registers using brackets, e.g.
.CW [R0-R12] .
The special
.CW MOVM
addressing mode bits
.CW W ,
.CW U ,
and
.CW P
are written in the same manner, for example,
.CW MOVM.DB.W .
A
.CW .S
suffix allows a
.CW MOVM
instruction to access user
.CW R13
and
.CW R14
when in another processor mode.
Shifts and rotates in addressing modes are supported by binary operators
.CW <<
(logical left shift),
.CW >>
(logical right shift),
.CW ->
(arithmetic right shift), and
.CW @>
(rotate right); for example
.CW "R7>>R2"
or
.CW "R2@>2" .
The assembler does not support indexing by a shifted expression;
only names can be doubly indexed.
.PP
Any instruction can be followed by a suffix that makes the instruction conditional:
.CW .EQ ,
.CW .NE ,
and so on, as in the ARM manual, with synonyms
.CW .HS
(for
.CW .CS )
and
.CW .LO
(for
.CW .CC ),
for example
.CW ADD.NE .
Arithmetic
and logical instructions
can have a
.CW .S
suffix, as ARM allows, to set condition codes.
.PP
The syntax of the
.CW MCR
and
.CW MRC
coprocessor instructions is largely as in the manual, with the usual adjustments.
The assembler directly supports only the ARM floating-point coprocessor
operations used by the compiler:
.CW CMP ,
.CW ADD ,
.CW SUB ,
.CW MUL ,
and
.CW DIV ,
all with
.CW F
or
.CW D
suffix selecting single or double precision.
Floating-point load or store become
.CW MOVF
and
.CW MOVD .
Conversion instructions are also specified by moves:
.CW MOVWD ,
.CW MOVWF ,
.CW MOVDW ,
.CW MOVFW ,
.CW MOVFD ,
and
.CW MOVDF .
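.PP
As a closing sketch of the ARM register conventions described above,
a routine that adds its two arguments might look as follows.
It rests on assumptions not stated in this section: that the second
argument is found at
.CW 4(FP)
as on the other machines, and that the assembler accepts a
.CW RET
pseudo-instruction for the return.
.P1
TEXT	sum(SB), $0
	MOVW	b+4(FP), R1	/* second argument, loaded from the frame */
	ADD	R1, R0		/* first argument is already in R0 */
	RET			/* result is returned in R0 */
.P2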