.HTML "A Manual for the Plan 9 assembler
.ft CW
.ta 8n +8n +8n +8n +8n +8n +8n
.ft
.TL
A Manual for the Plan 9 assembler
.AU
Rob Pike
rob@plan9.bell-labs.com
.SH
Machines
.PP
There is an assembler for each of the MIPS, SPARC, Intel 386, AMD64,
Power PC, and ARM.
The 68020 assembler,
.CW 2a
(no longer distributed),
is the oldest and in many ways the prototype.
The assemblers are really just variations of a single program:
they share many properties such as left-to-right assignment order for
instruction operands and the synthesis of macro instructions
such as
.CW MOVE
to hide the peculiarities of the load and store structure of the machines.
To keep things concrete, the first part of this manual is
specifically about the 68020.
At the end is a description of the differences among
the other assemblers.
.PP
The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
is a prerequisite for this manual.
.SH
Registers
.PP
All pre-defined symbols in the assembler are upper-case.
Data registers are
.CW R0
through
.CW R7 ;
address registers are
.CW A0
through
.CW A7 ;
floating-point registers are
.CW F0
through
.CW F7 .
.PP
A pointer in
.CW A6
is used by the C compiler to point to data, enabling short addresses to
be used more often.
The value of
.CW A6
is constant and must be set during C program initialization
to the address of the externally-defined symbol
.CW a6base .
.PP
The following hardware registers are defined in the assembler; their
meaning should be obvious given a 68020 manual:
.CW CAAR ,
.CW CACR ,
.CW CCR ,
.CW DFC ,
.CW ISP ,
.CW MSP ,
.CW SFC ,
.CW SR ,
.CW USP ,
and
.CW VBR .
.PP
The assembler also defines several pseudo-registers that
manipulate the stack:
.CW FP ,
.CW SP ,
and
.CW TOS .
.CW FP
is the frame pointer, so
.CW 0(FP)
is the first argument,
.CW 4(FP)
is the second, and so on.
.CW SP
is the local stack pointer, where automatic variables are held
(SP is a pseudo-register only on the 68020);
.CW 0(SP)
is the first automatic, and so on as with
.CW FP .
Finally,
.CW TOS
is the top-of-stack register, used for pushing parameters to procedures,
saving temporary values, and so on.
.PP
The assembler and loader track these pseudo-registers so
the above statements are true regardless of what has been
pushed on the hardware stack, pointed to by
.CW A7 .
The name
.CW A7
refers to the hardware stack pointer, but beware of mixed use of
.CW A7
and the above stack-related pseudo-registers, which will cause trouble.
Note, too, that the loader observes that the
.CW PEA
instruction alters
.CW SP
and therefore inserts a corresponding pop before all returns.
The assembler accepts a label-like name to be attached to
.CW FP
and
.CW SP
uses, such as
.CW p+0(FP) ,
to help document that
.CW p
is the first argument to a routine.
The name goes in the symbol table but has no significance to the result
of the program.
.SH
Referring to data
.PP
All external references must be made relative to some pseudo-register,
either
.CW PC
(the virtual program counter) or
.CW SB
(the ``static base'' register).
.CW PC
counts instructions, not bytes of data.
For example, to branch to the second following instruction, that is,
to skip one instruction, one may write
.P1
	BRA	2(PC)
.P2
Labels are also allowed, as in
.P1
	BRA	return
	NOP
return:
	RTS
.P2
When using labels, there is no
.CW (PC)
annotation.
.PP
The pseudo-register
.CW SB
refers to the beginning of the address space of the program.
Thus, references to global data and procedures are written as
offsets to
.CW SB ,
as in
.P1
	MOVL	$array(SB), TOS
.P2
to push the address of a global array on the stack, or
.P1
	MOVL	array+4(SB), TOS
.P2
to push the second (4-byte) element of the array.
Note the use of an offset; the complete list of addressing modes is given below.
Similarly, subroutine calls must use
.CW SB :
.P1
	BSR	exit(SB)
.P2
File-static variables have syntax
.P1
	local<>+4(SB)
.P2
The
.CW <>
will be filled in at load time by a unique integer.
.PP
When a program starts, it must execute
.P1
	MOVL	$a6base(SB), A6
.P2
before accessing any global data.
(On machines such as the MIPS and SPARC that cannot load a 32-bit constant
into a register in a single instruction, constants are loaded through the static base
register. The loader recognizes code that initializes the static
base register and treats it specially. You must be careful, however,
not to load large constants on such machines when the static base
register is not set up, such as early in interrupt routines.)
.SH
Expressions
.PP
Expressions are mostly what one might expect.
Where an offset or a constant is expected,
a primary expression with unary operators is allowed.
A general C constant expression is allowed in parentheses.
.PP
Source files are preprocessed exactly as in the C compiler, so
.CW #define
and
.CW #include
work.
.SH
Addressing modes
.PP
The simple addressing modes are shared by all the assemblers.
Here, for completeness, follows a table of all the 68020 addressing modes,
since that machine has the richest set.
In the table,
.CW o
is an offset, which if zero may be elided, and
.CW d
is a displacement, which is a constant between -128 and 127 inclusive.
Many of the modes listed have the same name;
scrutiny of the format will show what default is being applied.
For instance, indexed mode with no address register supplied operates
as though a zero-valued register were used.
For "offset" read "displacement."
For "\f(CW.s\fP" read one of
.CW .L
or
.CW .W
followed by
.CW *1 ,
.CW *2 ,
.CW *4 ,
or
.CW *8
to indicate the size and scaling of the data.
.IP
.TS
l lfCW.
data register	R0
address register	A0
floating-point register	F0
special names	CAAR, CACR, etc.
constant	$con
floating point constant	$fcon
external symbol	name+o(SB)
local symbol	name<>+o(SB)
automatic symbol	name+o(SP)
argument	name+o(FP)
address of external	$name+o(SB)
address of local	$name<>+o(SB)
indirect post-increment	(A0)+
indirect pre-decrement	-(A0)
indirect with offset	o(A0)
indexed with offset	o()(R0.s)
indexed with offset	o(A0)(R0.s)
external indexed	name+o(SB)(R0.s)
local indexed	name<>+o(SB)(R0.s)
automatic indexed	name+o(SP)(R0.s)
parameter indexed	name+o(FP)(R0.s)
offset indirect post-indexed	d(o())(R0.s)
offset indirect post-indexed	d(o(A0))(R0.s)
external indirect post-indexed	d(name+o(SB))(R0.s)
local indirect post-indexed	d(name<>+o(SB))(R0.s)
automatic indirect post-indexed	d(name+o(SP))(R0.s)
parameter indirect post-indexed	d(name+o(FP))(R0.s)
offset indirect pre-indexed	d(o()(R0.s))
offset indirect pre-indexed	d(o(A0))
offset indirect pre-indexed	d(o(A0)(R0.s))
external indirect pre-indexed	d(name+o(SB))
external indirect pre-indexed	d(name+o(SB)(R0.s))
local indirect pre-indexed	d(name<>+o(SB))
local indirect pre-indexed	d(name<>+o(SB)(R0.s))
automatic indirect pre-indexed	d(name+o(SP))
automatic indirect pre-indexed	d(name+o(SP)(R0.s))
parameter indirect pre-indexed	d(name+o(FP))
parameter indirect pre-indexed	d(name+o(FP)(R0.s))
.TE
.in
.SH
Laying down data
.PP
Placing data in the instruction stream, say for interrupt vectors, is easy:
the
pseudo-instructions
.CW LONG
and
.CW WORD
(but not
.CW BYTE )
lay down the value of their single argument, of the appropriate size,
as if it were an instruction:
.P1
	LONG	$12345
.P2
places the long 12345 (base 10)
in the instruction stream.
(On most machines,
the only such operator is
.CW WORD
and it lays down 32-bit quantities.
The 386 has all three:
.CW LONG ,
.CW WORD ,
and
.CW BYTE .
The AMD64 adds
.CW QUAD
to that for 64-bit values.
The 960 has only one,
.CW LONG .)
.PP
Placing information in the data section is more painful.
The pseudo-instruction
.CW DATA
does the work, given two arguments: an address at which to place the item,
including its size,
and the value to place there. For example, to define a character array
.CW array
containing the characters
.CW abc
and a terminating null:
.P1
	DATA	array+0(SB)/1, $'a'
	DATA	array+1(SB)/1, $'b'
	DATA	array+2(SB)/1, $'c'
	GLOBL	array(SB), $4
.P2
or
.P1
	DATA	array+0(SB)/4, $"abc\ez"
	GLOBL	array(SB), $4
.P2
The
.CW /1
gives the number of bytes to define,
.CW GLOBL
makes the symbol global, and the
.CW $4
says how many bytes the symbol occupies.
Uninitialized data is zeroed automatically,
which supplies the terminating null in the first form.
The character
.CW \ez
is equivalent to the C
.CW \e0 .
The string in a
.CW DATA
statement may contain a maximum of eight bytes;
build larger strings piecewise.
Two pseudo-instructions,
.CW DYNT
and
.CW INIT ,
allow the (obsolete) Alef compilers to build dynamic type information during the load
phase.
The
.CW DYNT
pseudo-instruction has two forms:
.P1
	DYNT	, ALEF_SI_5+0(SB)
	DYNT	ALEF_AS+0(SB), ALEF_SI_5+0(SB)
.P2
In the first form,
.CW DYNT
defines the symbol to be a small unique integer constant, chosen by the loader,
which is some multiple of the word size.
In the second form,
.CW DYNT
defines the second symbol in the same way,
places the address of the most recently
defined text symbol in the array specified by the first symbol at the
index defined by the value of the second symbol,
and then adjusts the size of the array accordingly.
.PP
The
.CW INIT
pseudo-instruction takes the same parameters as a
.CW DATA
statement. Its symbol is used as the base of an array and the
data item is installed in the array at the offset specified by the most recent
.CW DYNT
pseudo-instruction.
The size of the array is adjusted accordingly.
The
.CW DYNT
and
.CW INIT
pseudo-instructions are not implemented on the 68020.
.SH
Defining a procedure
.PP
Entry points are defined by the pseudo-operation
.CW TEXT ,
which takes as arguments the name of the procedure (including the ubiquitous
.CW (SB) )
and the number of bytes of automatic storage to pre-allocate on the stack,
which will usually be zero when writing assembly language programs.
On machines with a link register, such as the MIPS and SPARC,
the special value -4 instructs the loader to generate no PC save
and restore instructions, even if the function is not a leaf.
Here is a complete procedure that returns the sum
of its two arguments:
.P1
TEXT	sum(SB), $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
An optional middle argument
to the
.CW TEXT
pseudo-op is a bit field of options to the loader.
Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
the program.
For example,
.P1
TEXT	sum(SB), 1, $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
will not be profiled; the first version above would be.
Subroutines with peculiar state, such as system call routines,
should not be profiled.
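.PP
As a further illustration of the conventions just described,
here is a minimal sketch of a procedure that returns the sum of three
arguments, with profiling suppressed through the optional middle argument.
The procedure name
.CW sum3
and the argument labels
.CW a ,
.CW b ,
and
.CW c
are hypothetical, chosen only for this example.
The sketch assumes 4-byte arguments at successive offsets from
.CW FP ,
as in the two-argument version, and has not been run through
.CW 2a .
.P1
TEXT	sum3(SB), 1, $0		/* 1 bit set: do not profile */
	MOVL	a+0(FP), R0	/* first argument */
	ADDL	b+4(FP), R0	/* add the second argument */
	ADDL	c+8(FP), R0	/* add the third; result is left in R0 */
	RTS
.P2
The labels serve only as documentation; the offsets from
.CW FP
select the arguments.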
.PP
Setting the 2 bit allows multiple definitions of the same
.CW TEXT
symbol in a program; the loader will place only one such function in the image.
It was emitted only by the Alef compilers.
.PP
Subroutines to be called from C should place their result in
.CW R0 ,
even if it is an address.
Floating point values are returned in
.CW F0 .
Functions that return a structure to a C program
receive as their first argument the address of the location to
store the result;
.CW R0
is unused in the calling protocol for such procedures.
A caller is responsible for saving its own registers,
and a subroutine is therefore free to use any registers without saving them (``caller saves'').
.CW A6
and
.CW A7
are the exceptions as described above.
.SH
When in doubt
.PP
If you get confused, try using the
.CW -S
option to
.CW 2c
and compiling a sample program.
The standard output is valid input to the assembler.
.SH
Instructions
.PP
The instruction set of the assembler is not identical to that
of the machine.
It is chosen to match what the compiler generates, augmented
slightly by specific needs of the operating system.
For example,
.CW 2a
does not distinguish between the various forms of
.CW MOVE
instruction: move quick, move address, etc. Instead the context
does the job. For example,
.P1
	MOVL	$1, R1
	MOVL	A0, R2
	MOVW	SR, R3
.P2
generates official
.CW MOVEQ ,
.CW MOVEA ,
and
.CW MOVESR
instructions.
A number of instructions do not have the syntax necessary to specify
their entire capabilities. Notable examples are the bitfield
instructions, the
multiply and divide instructions, etc.
For a complete set of generated instruction names (in
.CW 2a
notation, not Motorola's) see the file
.CW /sys/src/cmd/2c/2.out.h .
Despite its name, this file contains an enumeration of the
instructions that appear in the intermediate files generated
by the compiler, which correspond exactly to lines of assembly language.
.SH
Laying down instructions
.PP
The loader modifies the code produced by the assembler and compiler.
It folds branches,
copies short sequences of code to eliminate branches,
and discards unreachable code.
The first instruction of every function is assumed to be reachable.
The pseudo-instruction
.CW NOP ,
which you may see in compiler output,
means no instruction at all, rather than an instruction that does nothing.
The loader discards all
.CW NOP 's.
.PP
To generate a true
.CW NOP
instruction, or any other instruction not known to the assembler, use a
.CW WORD
pseudo-instruction.
Such instructions on RISCs are not scheduled by the loader and must have
their delay slots filled manually.
.SH
MIPS
.PP
The registers are only addressed by number:
.CW R0
through
.CW R31 .
.CW R29
is the stack pointer;
.CW R30
is used as the static base pointer, the analogue of
.CW A6
on the 68020.
Its value is the address of the global symbol
.CW setR30(SB) .
The register holding returned values from subroutines is
.CW R1 .
When a function is called, space for the first argument
is reserved at
.CW 0(FP)
but in C (not Alef) the value is passed in
.CW R1
instead.
.PP
The loader uses
.CW R28
as a temporary. The system uses
.CW R26
and
.CW R27
as interrupt-time temporaries. Therefore none of these registers
should be used in user code.
.PP
The control registers are not known to the assembler.
Instead they are numbered registers
.CW M0 ,
.CW M1 ,
etc.
Use this trick to access, say,
.CW STATUS :
.P1
#define	STATUS	12
	MOVW	M(STATUS), R1
.P2
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention,
.CW F24
must be initialized to the value 0.0,
.CW F26
to 0.5,
.CW F28
to 1.0, and
.CW F30
to 2.0;
this is done by the operating system.
.PP
The instructions and their syntax are different from those of the manufacturer's
manual.
There are no
.CW lui
and kin; instead there are
.CW MOVW
(move word),
.CW MOVH
(move halfword),
and
.CW MOVB
(move byte) pseudo-instructions. If the operand is unsigned, the instructions
are
.CW MOVHU
and
.CW MOVBU .
The order of operands is from left to right in dataflow order, just as
on the 68020 but not as in MIPS documentation.
This means that the
.CW Bcond
instructions are reversed with respect to the book; for example, a
.CW va
.CW BGTZ
generates a MIPS
.CW bltz
instruction.
.PP
The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
It understands the 64-bit instructions
.CW MOVV ,
.CW MOVVL ,
.CW ADDV ,
.CW ADDVU ,
.CW SUBV ,
.CW SUBVU ,
.CW MULV ,
.CW MULVU ,
.CW DIVV ,
.CW DIVVU ,
.CW SLLV ,
.CW SRLV ,
and
.CW SRAV .
The assembler does not have any cache, load-linked, or store-conditional instructions.
.PP
Some assembler instructions are expanded into multiple instructions by the loader.
For example the loader may convert the load of a 32 bit constant into an
.CW lui
followed by an
.CW ori .
.PP
Assembler instructions should be laid out as if there
were no load, branch, or floating point compare delay slots;
the loader will rearrange\(em\f2schedule\f1\(emthe instructions
to guarantee correctness and improve performance.
The only exception is that the correct scheduling of instructions
that use control registers varies from model to model of machine
(and is often undocumented) so you should schedule such instructions
by hand to guarantee correct behavior.
The loader generates
.P1
	NOR	R0, R0, R0
.P2
when it needs a true no-op instruction.
Use exactly this instruction when scheduling code manually;
the loader recognizes it and schedules the code before it and after it independently. Also,
.CW WORD
pseudo-ops are scheduled like no-ops.
.PP
The
.CW NOSCHED
pseudo-op disables instruction scheduling
(scheduling is enabled by default);
.CW SCHED
re-enables it.
Branch folding, code copying, and dead code elimination are
disabled for instructions that are not scheduled.
.SH
SPARC
.PP
Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
Registers have numerical names only:
.CW R0
through
.CW R31 .
Forget about register windows: Plan 9 doesn't use them at all.
The machine has 32 global registers, period.
.CW R1
[sic] is the stack pointer.
.CW R2
is the static base register, with value the address of
.CW setSB(SB) .
.CW R7
is the return register and also the register holding the first
argument to a C (not Alef) function, again with space reserved at
.CW 0(FP) .
.CW R14
is the loader temporary.
.PP
Floating-point registers are exactly as on the MIPS.
.PP
The control registers are known by names such as
.CW FSR .
The instructions to access these registers are
.CW MOVW
instructions, for example
.P1
	MOVW	Y, R8
.P2
for the SPARC instruction
.P1
	rdy	%r8
.P2
.PP
Move instructions are similar to those on the MIPS: pseudo-operations
that turn into appropriate sequences of
.CW sethi
instructions, adds, etc.
Instructions read from left to right.
Because the arguments are
flipped to
.CW SUBCC ,
the condition codes are not inverted as on the MIPS.
.PP
To access an alternate address space, give its ASI after the register;
for example, to move a word from ASI 2:
.P1
	MOVW	(R7, 2), R8
.P2
The syntax for double indexing is
.P1
	MOVW	(R7+R8), R9
.P2
.PP
The SPARC's instruction scheduling is similar to the MIPS's.
The official no-op instruction is:
.P1
	ORN	R0, R0, R0
.P2
.SH
i960
.PP
Registers are numbered
.CW R0
through
.CW R31 .
The stack pointer is
.CW R29 ;
the return register is
.CW R4 ;
the static base is
.CW R28 ,
which is initialized to the address of
.CW setSB(SB) .
.CW R3
must be zero; this should be done manually early in execution by
.P1
	SUBO	R3, R3
.P2
.CW R27
is the loader temporary.
.PP
There is no support for floating point.
.PP
The Intel calling convention is not supported and cannot be used; use
.CW BAL
instead.
Instructions are mostly as in the book. The major change is that
.CW LOAD
and
.CW STORE
are both called
.CW MOV .
The extension character for
.CW MOV
is as in the manual:
.CW O
for ordinal,
.CW W
for signed, etc.
.SH
i386
.PP
The assembler assumes 32-bit protected mode.
The register names are
.CW SP ,
.CW AX ,
.CW BX ,
.CW CX ,
.CW DX ,
.CW BP ,
.CW DI ,
and
.CW SI .
The stack pointer (not a pseudo-register) is
.CW SP
and the return register is
.CW AX .
There is no physical frame pointer but, as for the MIPS,
.CW FP
is a pseudo-register that acts as
a frame pointer.
.PP
Opcode names are mostly the same as those listed in the Intel manual
with an
.CW L ,
.CW W ,
or
.CW B
appended to identify 32-bit,
16-bit, and 8-bit operations.
The exceptions are loads, stores, and conditionals.
All load and store opcodes to and from general registers, special registers
(such as
.CW CR0 ,
.CW CR3 ,
.CW GDTR ,
.CW IDTR ,
.CW SS ,
.CW CS ,
.CW DS ,
.CW ES ,
.CW FS ,
and
.CW GS )
or memory are written
as
.P1
	MOV\f2x\fP	src,dst
.P2
where
.I x
is
.CW L ,
.CW W ,
or
.CW B .
Thus to get
.CW AL
use a
.CW MOVB
instruction. If you need to access
.CW AH ,
you must mention it explicitly in a
.CW MOVB :
.P1
	MOVB	AH, BX
.P2
There are many examples of illegal moves, for example,
.P1
	MOVB	BP, DI
.P2
that the loader actually implements as pseudo-operations.
.PP
The names of conditions in all conditional instructions
.CW J , (
.CW SET )
follow the conventions of the 68020 instead of those of the Intel
assembler:
.CW JOS ,
.CW JOC ,
.CW JCS ,
.CW JCC ,
.CW JEQ ,
.CW JNE ,
.CW JLS ,
.CW JHI ,
.CW JMI ,
.CW JPL ,
.CW JPS ,
.CW JPC ,
.CW JLT ,
.CW JGE ,
.CW JLE ,
and
.CW JGT
instead of
.CW JO ,
.CW JNO ,
.CW JB ,
.CW JNB ,
.CW JZ ,
.CW JNZ ,
.CW JBE ,
.CW JNBE ,
.CW JS ,
.CW JNS ,
.CW JP ,
.CW JNP ,
.CW JL ,
.CW JNL ,
.CW JLE ,
and
.CW JNLE .
.PP
The addressing modes have syntax like
.CW AX ,
.CW (AX) ,
.CW (AX)(BX*4) ,
.CW 10(AX) ,
and
.CW 10(AX)(BX*4) .
The offsets from
.CW AX
can be replaced by offsets from
.CW FP
or
.CW SB
to access names, for example
.CW extern+5(SB)(AX*2) .
.PP
Other notes: Non-relative
.CW JMP
and
.CW CALL
have a
.CW *
added to the syntax.
Only
.CW LOOP ,
.CW LOOPEQ ,
and
.CW LOOPNE
are legal loop instructions. Only
.CW REP
and
.CW REPN
are recognized repeaters.
These are not prefixes, but rather
stand-alone opcodes that precede the strings, for example
.P1
	CLD; REP; MOVSL
.P2
Segment override prefixes in
.CW MOD/RM
fields are not supported.
.SH
AMD64
.PP
The assembler assumes 64-bit mode unless a
.CW MODE
pseudo-operation is given:
.P1
	MODE	$32
.P2
to change to 32-bit mode.
The effect is mainly to diagnose instructions that are illegal in
the given mode, but the loader will also assume 32-bit operands and addresses,
and 32-bit PC values for call and return.
The assembler's conventions are similar to those for the 386, above.
The architecture provides extra fixed-point registers
.CW R8
to
.CW R15 .
All registers are 64 bit, but instructions access low-order 8, 16 and 32 bits
as described in the processor handbook.
For example,
.CW MOVL
to
.CW AX
puts a value in the low-order 32 bits and clears the top 32 bits to zero.
Literal operands are limited to signed 32 bit values, which are sign-extended
to 64 bits in 64 bit operations; the exception is
.CW MOVQ ,
which allows 64-bit literals.
The external registers in Plan 9's C are allocated from
.CW R15
down.
.PP
There are many new instructions, including the MMX and XMM media instructions,
and conditional move instructions.
MMX registers are
.CW M0
to
.CW M7 ,
and
XMM registers are
.CW X0
to
.CW X15 .
As with the 386 instruction names,
all new 64-bit integer instructions, and the MMX and XMM instructions
uniformly use
.CW L
for `long word' (32 bits) and
.CW Q
for `quad word' (64 bits).
Some instructions use
.CW O
(`octword') for 128-bit values, where the processor handbook
variously uses
.CW O
or
.CW DQ .
The assembler also consistently uses
.CW PL
for `packed long' in
XMM instructions, instead of
.CW Q ,
.CW DQ
or
.CW PI .
Either
.CW MOVL
or
.CW MOVQ
can be used to move values to and from control registers, even when
the registers might be 64 bits.
The assembler often accepts the handbook's name to ease conversion
of existing code (but remember that the operand order is uniformly
source then destination).
.PP
C's
.CW long
.CW long
type is 64 bits, but passed and returned by value, not by reference.
More notably, C pointer values are 64 bits, and thus
.CW long
.CW long
and
.CW unsigned
.CW long
.CW long
are the only integer types wide enough to hold a pointer value.
The C compiler and library use the XMM floating-point instructions, not
the old 387 ones, although the latter are implemented by assembler and loader.
Unlike the 386, the first integer or pointer argument is passed in a register,
.CW BP
(it can be referred to in assembly code by the pseudonym
.CW RARG ).
.CW AX
holds the return value from subroutines as before.
Floating-point results are returned in
.CW X0 ,
although currently the first floating-point parameter is not passed in a register.
All parameters less than 8 bytes in length have 8 byte slots reserved on the stack
to preserve alignment and simplify variable-length argument list access,
including the first parameter when passed in a register,
even though bytes 4 to 7 are not initialized.
.
.SH
Power PC
.PP
The Power PC follows the Plan 9 model set by the MIPS and SPARC,
not the elaborate ABIs.
The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
there is no support for the older POWER instructions.
Registers are
.CW R0
through
.CW R31 .
.CW R0
is initialized to zero; this is done by C start up code
and assumed by the compiler and loader.
.CW R1
is the stack pointer.
.CW R2
is the static base register, with value the address of
.CW setSB(SB) .
.CW R3
is the return register and also the register holding the first
argument to a C function, with space reserved at
.CW 0(FP)
as on the MIPS.
.CW R31
is the loader temporary.
The external registers in Plan 9's C are allocated from
.CW R30
down.
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention, several registers are initialized
to specific values; this is done by the operating system.
.CW F27
must be initialized to the value
.CW 0x4330000080000000
(used by float-to-int conversion),
.CW F28
to the value 0.0,
.CW F29
to 0.5,
.CW F30
to 1.0, and
.CW F31
to 2.0.
.PP
As on the MIPS and SPARC, the assembler accepts arbitrary literals
as operands to
.CW MOVW ,
and also to
.CW ADD
and others where `immediate' variants exist,
and the loader generates sequences
of
.CW addi ,
.CW addis ,
.CW oris ,
etc. as required.
The register indirect addressing modes use the same syntax as the SPARC,
including double indexing when allowed.
.PP
The instruction names are generally derived from the Motorola ones,
subject to slight transformation:
the
.CW . ' `
marking the setting of condition codes is replaced by
.CW CC ,
and when the letter
.CW o ' `
represents `OE=1' it is replaced by
.CW V .
Thus
.CW add ,
.CW addo.
and
.CW subfzeo.
become
.CW ADD ,
.CW ADDVCC
and
.CW SUBFZEVCC .
As well as the three-operand conditional branch instruction
.CW BC ,
the assembler provides pseudo-instructions for the common cases:
.CW BEQ ,
.CW BNE ,
.CW BGT ,
.CW BGE ,
.CW BLT ,
.CW BLE ,
.CW BVC ,
and
.CW BVS .
The unconditional branch instruction is
.CW BR .
Indirect branches use
.CW "(CTR)"
or
.CW "(LR)"
as target.
.PP
Load or store operations are replaced by
.CW MOV
variants in the usual way:
.CW MOVW
(move word),
.CW MOVH
(move halfword with sign extension), and
.CW MOVB
(move byte with sign extension, a pseudo-instruction),
with unsigned variants
.CW MOVHZ
and
.CW MOVBZ ,
and byte-reversing
.CW MOVWBR
and
.CW MOVHBR .
`Load or store with update' versions are
.CW MOVWU ,
.CW MOVHU ,
and
.CW MOVBZU .
Load or store multiple is
.CW MOVMW .
The exceptions are the string instructions, which are
.CW LSW
and
.CW STSW ,
and the reservation instructions
.CW lwarx
and
.CW stwcx. ,
which are
.CW LWAR
and
.CW STWCCC ,
all with operands in the usual data-flow order.
Floating-point load or store instructions are
.CW FMOVD ,
.CW FMOVDU ,
.CW FMOVS ,
and
.CW FMOVSU .
The register to register move instructions
.CW fmr
and
.CW fmr.
are written
.CW FMOVD
and
.CW FMOVDCC .
.PP
The assembler knows the commonly used special purpose registers:
.CW CR ,
.CW CTR ,
.CW DEC ,
.CW LR ,
.CW MSR ,
and
.CW XER .
The rest, which are often architecture-dependent, are referenced as
.CW SPR(n) .
The segment registers of the 60x series are similarly
.CW SEG(n) ,
but
.I n
can also be a register name, as in
.CW SEG(R3) .
Moves between special purpose registers and general purpose ones,
when allowed by the architecture,
are written as
.CW MOVW ,
replacing
.CW mfcr ,
.CW mtcr ,
.CW mfmsr ,
.CW mtmsr ,
.CW mtspr ,
.CW mfspr ,
.CW mftb ,
and many others.
.PP
The fields of the condition register
.CW CR
are referenced as
.CW CR(0)
through
.CW CR(7) .
They are used by the
.CW MOVFL
(move field) pseudo-instruction,
which produces
.CW mcrf
or
.CW mtcrf .
For example:
.P1
	MOVFL	CR(3), CR(0)
	MOVFL	R3, CR(1)
	MOVFL	R3, $7, CR
.P2
They are also accepted in
the conditional branch instruction, for example
.P1
	BEQ	CR(7), label
.P2
Fields of the
.CW FPSCR
are accessed using
.CW MOVFL
in a similar way:
.P1
	MOVFL	FPSCR, F0
	MOVFL	F0, FPSCR
	MOVFL	F0, $7, FPSCR
	MOVFL	$0, FPSCR(3)
.P2
producing
.CW mffs ,
.CW mtfsf
or
.CW mtfsfi ,
as appropriate.
.SH
ARM
.PP
The assembler provides access to
.CW R0
through
.CW R14
and the
.CW PC .
The stack pointer is
.CW R13 ,
the link register is
.CW R14 ,
and the static base register is
.CW R12 .
.CW R0
is the return register and also the register holding
the first argument to a subroutine.
The external registers in Plan 9's C are allocated from
.CW R10
down.
.CW R11
is used by the loader as a temporary register.
The assembler supports the
.CW CPSR
and
.CW SPSR
registers.
It also knows about coprocessor registers
.CW C0
through
.CW C15 .
Floating registers are
.CW F0
through
.CW F7 ,
.CW FPSR
and
.CW FPCR .
.PP
As with the other architectures, loads and stores are called
.CW MOV ,
e.g.
.CW MOVW
for load word or store word, and
.CW MOVM
for
load or store multiple,
depending on the operands.
.PP
Addressing modes are supported by suffixes to the instructions:
.CW .IA
(increment after),
.CW .IB
(increment before),
.CW .DA
(decrement after), and
.CW .DB
(decrement before).
These can only be used with the
.CW MOV
instructions.
The move multiple instruction,
.CW MOVM ,
defines a range of registers using brackets, e.g.
.CW [R0-R12] .
The special
.CW MOVM
addressing mode bits
.CW W ,
.CW U ,
and
.CW P
are written in the same manner, for example,
.CW MOVM.DB.W .
A
.CW .S
suffix allows a
.CW MOVM
instruction to access user
.CW R13
and
.CW R14
when in another processor mode.
Shifts and rotates in addressing modes are supported by binary operators
.CW <<
(logical left shift),
.CW >>
(logical right shift),
.CW ->
(arithmetic right shift), and
.CW @>
(rotate right); for example
.CW "R7>>R2"
or
.CW "R2@>2" .
The assembler does not support indexing by a shifted expression;
only names can be doubly indexed.
.PP
Any instruction can be followed by a suffix that makes the instruction conditional:
.CW .EQ ,
.CW .NE ,
and so on, as in the ARM manual, with synonyms
.CW .HS
(for
.CW .CS )
and
.CW .LO
(for
.CW .CC ),
for example
.CW ADD.NE .
Arithmetic
and logical instructions
can have a
.CW .S
suffix, as ARM allows, to set condition codes.
.PP
The syntax of the
.CW MCR
and
.CW MRC
coprocessor instructions is largely as in the manual, with the usual adjustments.
The assembler directly supports only the ARM floating-point coprocessor
operations used by the compiler:
.CW CMP ,
.CW ADD ,
.CW SUB ,
.CW MUL ,
and
.CW DIV ,
all with
.CW F
or
.CW D
suffix selecting single or double precision.
Floating-point load or store become
.CW MOVF
and
.CW MOVD .
Conversion instructions are also specified by moves:
.CW MOVWD ,
.CW MOVWF ,
.CW MOVDW ,
.CW MOVFW ,
.CW MOVFD ,
and
.CW MOVDF .
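.PP
To tie the ARM conventions together, the following is a minimal sketch
of a leaf routine that adds its two arguments.
It relies on the convention stated above that
.CW R0
holds both the first argument and the result, and it assumes that,
as on the MIPS, space for the first argument is reserved at
.CW 0(FP)
so that the second lies at
.CW 4(FP) .
The routine name
.CW add ,
the label
.CW b ,
and the use of
.CW RET
as the return instruction are assumptions made for illustration;
compare against the output of the ARM C compiler's
.CW -S
option before relying on the details.
.P1
TEXT	add(SB), $0
	MOVW	b+4(FP), R1	/* second argument, taken from the stack */
	ADD	R1, R0		/* R0 already holds the first argument */
	RET			/* result is returned in R0 */
.P2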