.HTML "A Manual for the Plan 9 assembler
.ft CW
.ta 8n +8n +8n +8n +8n +8n +8n
.ft
.TL
A Manual for the Plan 9 assembler
.AU
Rob Pike
rob@plan9.bell-labs.com
.SH
Machines
.PP
There is an assembler for each of the MIPS, SPARC, Intel 386,
Intel 960, AMD 29000, Motorola 68020 and 68000, Motorola Power PC,
AMD64, DEC Alpha, and Acorn ARM.
The 68020 assembler,
.CW 2a ,
is the oldest and in many ways the prototype.
The assemblers are really just variations of a single program:
they share many properties such as left-to-right assignment order for
instruction operands and the synthesis of macro instructions
such as
.CW MOVE
to hide the peculiarities of the load and store structure of the machines.
To keep things concrete, the first part of this manual is
specifically about the 68020.
At the end is a description of the differences among
the other assemblers.
.PP
The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
is a prerequisite for this manual.
.SH
Registers
.PP
All pre-defined symbols in the assembler are upper-case.
Data registers are
.CW R0
through
.CW R7 ;
address registers are
.CW A0
through
.CW A7 ;
floating-point registers are
.CW F0
through
.CW F7 .
.PP
A pointer in
.CW A6
is used by the C compiler to point to data, enabling short addresses to
be used more often.
The value of
.CW A6
is constant and must be set during C program initialization
to the address of the externally-defined symbol
.CW a6base .
.PP
The following hardware registers are defined in the assembler; their
meaning should be obvious given a 68020 manual:
.CW CAAR ,
.CW CACR ,
.CW CCR ,
.CW DFC ,
.CW ISP ,
.CW MSP ,
.CW SFC ,
.CW SR ,
.CW USP ,
and
.CW VBR .
.PP
The assembler also defines several pseudo-registers that
manipulate the stack:
.CW FP ,
.CW SP ,
and
.CW TOS .
.CW FP
is the frame pointer, so
.CW 0(FP)
is the first argument,
.CW 4(FP)
is the second, and so on.
.CW SP
is the local stack pointer, where automatic variables are held
(SP is a pseudo-register only on the 68020);
.CW 0(SP)
is the first automatic, and so on as with
.CW FP .
Finally,
.CW TOS
is the top-of-stack register, used for pushing parameters to procedures,
saving temporary values, and so on.
.PP
The assembler and loader track these pseudo-registers so
the above statements are true regardless of what has been
pushed on the hardware stack, pointed to by
.CW A7 .
The name
.CW A7
refers to the hardware stack pointer, but beware of mixed use of
.CW A7
and the above stack-related pseudo-registers, which will cause trouble.
Note, too, that the loader observes that the
.CW PEA
instruction alters SP and therefore inserts a corresponding pop before all returns.
The assembler accepts a label-like name to be attached to
.CW FP
and
.CW SP
uses, such as
.CW p+0(FP) ,
to help document that
.CW p
is the first argument to a routine.
The name goes in the symbol table but has no significance to the result
of the program.
.SH
Referring to data
.PP
All external references must be made relative to some pseudo-register,
either
.CW PC
(the virtual program counter) or
.CW SB
(the ``static base'' register).
.CW PC
counts instructions, not bytes of data.
For example, to branch to the second following instruction, that is,
to skip one instruction, one may write
.P1
	BRA	2(PC)
.P2
Labels are also allowed, as in
.P1
	BRA	return
	NOP
return:
	RTS
.P2
When using labels, there is no
.CW (PC)
annotation.
.PP
The pseudo-register
.CW SB
refers to the beginning of the address space of the program.
Thus, references to global data and procedures are written as
offsets to
.CW SB ,
as in
.P1
	MOVL	$array(SB), TOS
.P2
to push the address of a global array on the stack, or
.P1
	MOVL	array+4(SB), TOS
.P2
to push the second (4-byte) element of the array.
Note the use of an offset; the complete list of addressing modes is given below.
Similarly, subroutine calls must use
.CW SB :
.P1
	BSR	exit(SB)
.P2
File-static variables have syntax
.P1
	local<>+4(SB)
.P2
The
.CW <>
will be filled in at load time by a unique integer.
.PP
When a program starts, it must execute
.P1
	MOVL	$a6base(SB), A6
.P2
before accessing any global data.
(On machines such as the MIPS and SPARC that cannot load a register
in a single instruction, constants are loaded through the static base
register.  The loader recognizes code that initializes the static
base register and treats it specially.  You must be careful, however,
not to load large constants on such machines when the static base
register is not set up, such as early in interrupt routines.)
.SH
Expressions
.PP
Expressions are mostly what one might expect.
Where an offset or a constant is expected,
a primary expression with unary operators is allowed.
A general C constant expression is allowed in parentheses.
.PP
Source files are preprocessed exactly as in the C compiler, so
.CW #define
and
.CW #include
work.
.SH
Addressing modes
.PP
The simple addressing modes are shared by all the assemblers.
Here, for completeness, follows a table of all the 68020 addressing modes,
since that machine has the richest set.
In the table,
.CW o
is an offset, which if zero may be elided, and
.CW d
is a displacement, which is a constant between -128 and 127 inclusive.
Many of the modes listed have the same name;
scrutiny of the format will show what default is being applied.
For instance, indexed mode with no address register supplied operates
as though a zero-valued register were used.
For "offset" read "displacement."
For "\f(CW.s\fP" read one of
.CW .L
or
.CW .W
followed by
.CW *1 ,
.CW *2 ,
.CW *4 ,
or
.CW *8
to indicate the size and scaling of the data.
.IP
.TS
l lfCW.
data register	R0
address register	A0
floating-point register	F0
special names	CAAR, CACR, etc.
constant	$con
floating point constant	$fcon
external symbol	name+o(SB)
local symbol	name<>+o(SB)
automatic symbol	name+o(SP)
argument	name+o(FP)
address of external	$name+o(SB)
address of local	$name<>+o(SB)
indirect post-increment	(A0)+
indirect pre-decrement	-(A0)
indirect with offset	o(A0)
indexed with offset	o()(R0.s)
indexed with offset	o(A0)(R0.s)
external indexed	name+o(SB)(R0.s)
local indexed	name<>+o(SB)(R0.s)
automatic indexed	name+o(SP)(R0.s)
parameter indexed	name+o(FP)(R0.s)
offset indirect post-indexed	d(o())(R0.s)
offset indirect post-indexed	d(o(A0))(R0.s)
external indirect post-indexed	d(name+o(SB))(R0.s)
local indirect post-indexed	d(name<>+o(SB))(R0.s)
automatic indirect post-indexed	d(name+o(SP))(R0.s)
parameter indirect post-indexed	d(name+o(FP))(R0.s)
offset indirect pre-indexed	d(o()(R0.s))
offset indirect pre-indexed	d(o(A0))
offset indirect pre-indexed	d(o(A0)(R0.s))
external indirect pre-indexed	d(name+o(SB))
external indirect pre-indexed	d(name+o(SB)(R0.s))
local indirect pre-indexed	d(name<>+o(SB))
local indirect pre-indexed	d(name<>+o(SB)(R0.s))
automatic indirect pre-indexed	d(name+o(SP))
automatic indirect pre-indexed	d(name+o(SP)(R0.s))
parameter indirect pre-indexed	d(name+o(FP))
parameter indirect pre-indexed	d(name+o(FP)(R0.s))
.TE
.in
.SH
Laying down data
.PP
Placing data in the instruction stream, say for interrupt vectors, is easy:
the
pseudo-instructions
.CW LONG
and
.CW WORD
(but not
.CW BYTE )
lay down the value of their single argument, of the appropriate size,
as if it were an instruction:
.P1
	LONG	$12345
.P2
places the long 12345 (base 10)
in the instruction stream.
(On most machines,
the only such operator is
.CW WORD
and it lays down 32-bit quantities.
The 386 has all three:
.CW LONG ,
.CW WORD ,
and
.CW BYTE .
The AMD64 adds
.CW QUAD
to that for 64-bit values.
The 960 has only one,
.CW LONG .)
.PP
Placing information in the data section is more painful.
The pseudo-instruction
.CW DATA
does the work, given two arguments: an address at which to place the item,
including its size,
and the value to place there.  For example, to define a character array
.CW array
containing the characters
.CW abc
and a terminating null:
.P1
	DATA	array+0(SB)/1, $'a'
	DATA	array+1(SB)/1, $'b'
	DATA	array+2(SB)/1, $'c'
	GLOBL	array(SB), $4
.P2
or
.P1
	DATA	array+0(SB)/4, $"abc\ez"
	GLOBL	array(SB), $4
.P2
The
.CW /1
gives the number of bytes to define,
.CW GLOBL
makes the symbol global, and the
.CW $4
says how many bytes the symbol occupies.
Uninitialized data is zeroed automatically.
The character
.CW \ez
is equivalent to the C
.CW \e0 .
The string in a
.CW DATA
statement may contain a maximum of eight bytes;
build larger strings piecewise.
.PP
Two pseudo-instructions,
.CW DYNT
and
.CW INIT ,
allow the (obsolete) Alef compilers to build dynamic type information during the load
phase.
The
.CW DYNT
pseudo-instruction has two forms:
.P1
	DYNT	, ALEF_SI_5+0(SB)
	DYNT	ALEF_AS+0(SB), ALEF_SI_5+0(SB)
.P2
In the first form,
.CW DYNT
defines the symbol to be a small unique integer constant, chosen by the loader,
which is some multiple of the word size.
In the second form,
.CW DYNT
defines the second symbol in the same way,
places the address of the most recently
defined text symbol in the array specified by the first symbol at the
index defined by the value of the second symbol,
and then adjusts the size of the array accordingly.
.PP
The
.CW INIT
pseudo-instruction takes the same parameters as a
.CW DATA
statement.  Its symbol is used as the base of an array and the
data item is installed in the array at the offset specified by the most recent
.CW DYNT
pseudo-instruction.
The size of the array is adjusted accordingly.
The
.CW DYNT
and
.CW INIT
pseudo-instructions are not implemented on the 68020.
.SH
Defining a procedure
.PP
Entry points are defined by the pseudo-operation
.CW TEXT ,
which takes as arguments the name of the procedure (including the ubiquitous
.CW (SB) )
and the number of bytes of automatic storage to pre-allocate on the stack,
which will usually be zero when writing assembly language programs.
On machines with a link register, such as the MIPS and SPARC,
the special value -4 instructs the loader to generate no PC save
and restore instructions, even if the function is not a leaf.
Here is a complete procedure that returns the sum
of its two arguments:
.P1
TEXT	sum(SB), $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
An optional middle argument
to the
.CW TEXT
pseudo-op is a bit field of options to the loader.
Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
the program.
For example,
.P1
TEXT	sum(SB), 1, $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
will not be profiled; the first version above would be.
Subroutines with peculiar state, such as system call routines,
should not be profiled.
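.PP
The frame-size argument and the stack pseudo-registers can be seen working
together in a slightly larger sketch.
It is an illustration built from the 68020 conventions stated in this manual,
not compiler output: the names
.CW twice ,
.CW a ,
.CW b ,
and
.CW tmp
are invented for the example, and it assumes, as described earlier, that
.CW 0(SP)
addresses the first automatic.
The routine reserves four bytes of automatic storage and returns
twice the sum of its two arguments:
.P1
TEXT	twice(SB), $4
	MOVL	a+0(FP), R0	/* first argument */
	ADDL	b+4(FP), R0	/* R0 = a + b */
	MOVL	R0, tmp+0(SP)	/* save the sum in the automatic */
	ADDL	tmp+0(SP), R0	/* R0 = 2*(a + b), the result */
	RTS
.P2
The automatic is used here only to show the syntax; a real routine this
small would simply add
.CW R0
to itself.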
.PP
Setting the 2 bit allows multiple definitions of the same
.CW TEXT
symbol in a program; the loader will place only one such function in the image.
It was emitted only by the Alef compilers.
.PP
Subroutines to be called from C should place their result in
.CW R0 ,
even if it is an address.
Floating point values are returned in
.CW F0 .
Functions that return a structure to a C program
receive as their first argument the address of the location to
store the result;
.CW R0
is unused in the calling protocol for such procedures.
A routine that calls a subroutine is responsible for saving its own registers,
so the subroutine is free to use any registers without saving them (``caller saves'').
.CW A6
and
.CW A7
are the exceptions as described above.
.SH
When in doubt
.PP
If you get confused, try using the
.CW -S
option to
.CW 2c
and compiling a sample program.
The standard output is valid input to the assembler.
.SH
Instructions
.PP
The instruction set of the assembler is not identical to that
of the machine.
It is chosen to match what the compiler generates, augmented
slightly by specific needs of the operating system.
For example,
.CW 2a
does not distinguish between the various forms of
.CW MOVE
instruction: move quick, move address, etc.  Instead the context
does the job.  For example,
.P1
	MOVL	$1, R1
	MOVL	A0, R2
	MOVW	SR, R3
.P2
generates official
.CW MOVEQ ,
.CW MOVEA ,
and
.CW MOVESR
instructions.
A number of instructions do not have the syntax necessary to specify
their entire capabilities.  Notable examples are the bitfield,
multiply, and divide instructions.
For a complete set of generated instruction names (in
.CW 2a
notation, not Motorola's) see the file
.CW /sys/src/cmd/2c/2.out.h .
Despite its name, this file contains an enumeration of the
instructions that appear in the intermediate files generated
by the compiler, which correspond exactly to lines of assembly language.
.PP
The MC68000 assembler,
.CW 1a ,
is essentially the same, honoring the appropriate subset of the instructions
and addressing modes.
The definitions of these are, nonetheless, part of
.CW 2.out.h .
.SH
Laying down instructions
.PP
The loader modifies the code produced by the assembler and compiler.
It folds branches,
copies short sequences of code to eliminate branches,
and discards unreachable code.
The first instruction of every function is assumed to be reachable.
The pseudo-instruction
.CW NOP ,
which you may see in compiler output,
means no instruction at all, rather than an instruction that does nothing.
The loader discards all
.CW NOP 's.
.PP
To generate a true
.CW NOP
instruction, or any other instruction not known to the assembler, use a
.CW WORD
pseudo-instruction.
Such instructions on RISCs are not scheduled by the loader and must have
their delay slots filled manually.
.SH
MIPS
.PP
The registers are only addressed by number:
.CW R0
through
.CW R31 .
.CW R29
is the stack pointer;
.CW R30
is used as the static base pointer, the analogue of
.CW A6
on the 68020.
Its value is the address of the global symbol
.CW setR30(SB) .
The register holding returned values from subroutines is
.CW R1 .
When a function is called, space for the first argument
is reserved at
.CW 0(FP)
but in C (not Alef) the value is passed in
.CW R1
instead.
.PP
The loader uses
.CW R28
as a temporary.  The system uses
.CW R26
and
.CW R27
as interrupt-time temporaries.  Therefore none of these registers
should be used in user code.
.PP
The control registers are not known to the assembler.
Instead they are numbered registers
.CW M0 ,
.CW M1 ,
etc.
Use this trick to access, say,
.CW STATUS :
.P1
#define	STATUS	12
	MOVW	M(STATUS), R1
.P2
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention,
.CW F24
must be initialized to the value 0.0,
.CW F26
to 0.5,
.CW F28
to 1.0, and
.CW F30
to 2.0;
this is done by the operating system.
.PP
The instructions and their syntax are different from those of the manufacturer's
manual.
There are no
.CW lui
and kin; instead there are
.CW MOVW
(move word),
.CW MOVH
(move halfword),
and
.CW MOVB
(move byte) pseudo-instructions.  If the operand is unsigned, the instructions
are
.CW MOVHU
and
.CW MOVBU .
The order of operands is from left to right in dataflow order, just as
on the 68020 but not as in MIPS documentation.
This means that the
.CW Bcond
instructions are reversed with respect to the book; for example, a
.CW va
.CW BGTZ
generates a MIPS
.CW bltz
instruction.
.PP
The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
It understands the 64-bit instructions
.CW MOVV ,
.CW MOVVL ,
.CW ADDV ,
.CW ADDVU ,
.CW SUBV ,
.CW SUBVU ,
.CW MULV ,
.CW MULVU ,
.CW DIVV ,
.CW DIVVU ,
.CW SLLV ,
.CW SRLV ,
and
.CW SRAV .
The assembler does not have any cache, load-linked, or store-conditional instructions.
.PP
Some assembler instructions are expanded into multiple instructions by the loader.
For example the loader may convert the load of a 32 bit constant into an
.CW lui
followed by an
.CW ori .
.PP
Assembler instructions should be laid out as if there
were no load, branch, or floating point compare delay slots;
the loader will rearrange\(em\f2schedule\f1\(emthe instructions
to guarantee correctness and improve performance.
The only exception is that the correct scheduling of instructions
that use control registers varies from model to model of machine
(and is often undocumented) so you should schedule such instructions
by hand to guarantee correct behavior.
The loader generates
.P1
	NOR	R0, R0, R0
.P2
when it needs a true no-op instruction.
Use exactly this instruction when scheduling code manually;
the loader recognizes it and schedules the code before it and after it independently.  Also,
.CW WORD
pseudo-ops are scheduled like no-ops.
.PP
The
.CW NOSCHED
pseudo-op disables instruction scheduling
(scheduling is enabled by default);
.CW SCHED
re-enables it.
Branch folding, code copying, and dead code elimination are
disabled for instructions that are not scheduled.
.SH
SPARC
.PP
Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
Registers have numerical names only:
.CW R0
through
.CW R31 .
Forget about register windows: Plan 9 doesn't use them at all.
The machine has 32 global registers, period.
.CW R1
[sic] is the stack pointer.
.CW R2
is the static base register, with value the address of
.CW setSB(SB) .
.CW R7
is the return register and also the register holding the first
argument to a C (not Alef) function, again with space reserved at
.CW 0(FP) .
.CW R14
is the loader temporary.
.PP
Floating-point registers are exactly as on the MIPS.
.PP
The control registers are known by names such as
.CW FSR .
The instructions to access these registers are
.CW MOVW
instructions, for example
.P1
	MOVW	Y, R8
.P2
for the SPARC instruction
.P1
	rdy	%r8
.P2
.PP
Move instructions are similar to those on the MIPS: pseudo-operations
that turn into appropriate sequences of
.CW sethi
instructions, adds, etc.
Instructions read from left to right.
Because the arguments are
flipped to
.CW SUBCC ,
the condition codes are not inverted as on the MIPS.
.PP
An alternate address space (ASI) is named inside the operand;
for example, to move a word from ASI 2:
.P1
	MOVW	(R7, 2), R8
.P2
The syntax for double indexing is
.P1
	MOVW	(R7+R8), R9
.P2
.PP
The SPARC's instruction scheduling is similar to the MIPS's.
The official no-op instruction is:
.P1
	ORN	R0, R0, R0
.P2
.SH
i960
.PP
Registers are numbered
.CW R0
through
.CW R31 .
The stack pointer is
.CW R29 ;
the return register is
.CW R4 ;
the static base is
.CW R28 ;
it is initialized to the address of
.CW setSB(SB) .
.CW R3
must be zero; this should be done manually early in execution by
.P1
	SUBO	R3, R3
.P2
.CW R27
is the loader temporary.
.PP
There is no support for floating point.
.PP
The Intel calling convention is not supported and cannot be used; use
.CW BAL
instead.
Instructions are mostly as in the book.  The major change is that
.CW LOAD
and
.CW STORE
are both called
.CW MOV .
The extension character for
.CW MOV
is as in the manual:
.CW O
for ordinal,
.CW W
for signed, etc.
.SH
i386
.PP
The assembler assumes 32-bit protected mode.
The register names are
.CW SP ,
.CW AX ,
.CW BX ,
.CW CX ,
.CW DX ,
.CW BP ,
.CW DI ,
and
.CW SI .
The stack pointer (not a pseudo-register) is
.CW SP
and the return register is
.CW AX .
There is no physical frame pointer but, as for the MIPS,
.CW FP
is a pseudo-register that acts as
a frame pointer.
.PP
Opcode names are mostly the same as those listed in the Intel manual
with an
.CW L ,
.CW W ,
or
.CW B
appended to identify 32-bit,
16-bit, and 8-bit operations.
The exceptions are loads, stores, and conditionals.
All load and store opcodes to and from general registers, special registers
(such as
.CW CR0 ,
.CW CR3 ,
.CW GDTR ,
.CW IDTR ,
.CW SS ,
.CW CS ,
.CW DS ,
.CW ES ,
.CW FS ,
and
.CW GS )
or memory are written
as
.P1
	MOV\f2x\fP	src,dst
.P2
where
.I x
is
.CW L ,
.CW W ,
or
.CW B .
Thus to get
.CW AL
use a
.CW MOVB
instruction.  If you need to access
.CW AH ,
you must mention it explicitly in a
.CW MOVB :
.P1
	MOVB	AH, BX
.P2
There are many examples of illegal moves, for example,
.P1
	MOVB	BP, DI
.P2
that the loader actually implements as pseudo-operations.
.PP
The names of conditions in all conditional instructions
.CW J , (
.CW SET )
follow the conventions of the 68020 instead of those of the Intel
assembler:
.CW JOS ,
.CW JOC ,
.CW JCS ,
.CW JCC ,
.CW JEQ ,
.CW JNE ,
.CW JLS ,
.CW JHI ,
.CW JMI ,
.CW JPL ,
.CW JPS ,
.CW JPC ,
.CW JLT ,
.CW JGE ,
.CW JLE ,
and
.CW JGT
instead of
.CW JO ,
.CW JNO ,
.CW JB ,
.CW JNB ,
.CW JZ ,
.CW JNZ ,
.CW JBE ,
.CW JNBE ,
.CW JS ,
.CW JNS ,
.CW JP ,
.CW JNP ,
.CW JL ,
.CW JNL ,
.CW JLE ,
and
.CW JNLE .
.PP
The addressing modes have syntax like
.CW AX ,
.CW (AX) ,
.CW (AX)(BX*4) ,
.CW 10(AX) ,
and
.CW 10(AX)(BX*4) .
The offsets from
.CW AX
can be replaced by offsets from
.CW FP
or
.CW SB
to access names, for example
.CW extern+5(SB)(AX*2) .
.PP
Other notes: Non-relative
.CW JMP
and
.CW CALL
have a
.CW *
added to the syntax.
Only
.CW LOOP ,
.CW LOOPEQ ,
and
.CW LOOPNE
are legal loop instructions.  Only
.CW REP
and
.CW REPN
are recognized repeaters.
These are not prefixes, but rather
stand-alone opcodes that precede the string instructions, for example
.P1
	CLD; REP; MOVSL
.P2
Segment override prefixes in
.CW MOD/RM
fields are not supported.
.SH
AMD64
.PP
The assembler assumes 64-bit mode unless a
.CW MODE
pseudo-operation is given:
.P1
	MODE	$32
.P2
to change to 32-bit mode.
The effect is mainly to diagnose instructions that are illegal in
the given mode, but the loader will also assume 32-bit operands and addresses,
and 32-bit PC values for call and return.
The assembler's conventions are similar to those for the 386, above.
The architecture provides extra fixed-point registers
.CW R8
to
.CW R15 .
All registers are 64 bits, but instructions access the low-order 8, 16 and 32 bits
as described in the processor handbook.
For example,
.CW MOVL
to
.CW AX
puts a value in the low-order 32 bits and clears the top 32 bits to zero.
Literal operands are limited to signed 32-bit values, which are sign-extended
to 64 bits in 64-bit operations; the exception is
.CW MOVQ ,
which allows 64-bit literals.
The external registers in Plan 9's C are allocated from
.CW R15
down.
.PP
There are many new instructions, including the MMX and XMM media instructions,
and conditional move instructions.
MMX registers are
.CW M0
to
.CW M7 ,
and
XMM registers are
.CW X0
to
.CW X15 .
As with the 386 instruction names,
all new 64-bit integer instructions, and the MMX and XMM instructions
uniformly use
.CW L
for `long word' (32 bits) and
.CW Q
for `quad word' (64 bits).
Some instructions use
.CW O
(`octword') for 128-bit values, where the processor handbook
variously uses
.CW O
or
.CW DQ .
The assembler also consistently uses
.CW PL
for `packed long' in
XMM instructions, instead of
.CW Q ,
.CW DQ
or
.CW PI .
Either
.CW MOVL
or
.CW MOVQ
can be used to move values to and from control registers, even when
the registers might be 64 bits.
The assembler often accepts the handbook's name to ease conversion
of existing code (but remember that the operand order is uniformly
source then destination).
.PP
C's
.CW "long long"
type is 64 bits, but passed and returned by value, not by reference.
More notably, C pointer values are 64 bits, and thus
.CW "long long"
and
.CW "unsigned long long"
are the only integer types wide enough to hold a pointer value.
The C compiler and library use the XMM floating-point instructions, not
the old 387 ones, although the latter are implemented by assembler and loader.
Unlike on the 386, the first integer or pointer argument is passed in a register,
.CW BP
(it can be referred to in assembly code by the pseudonym
.CW RARG ).
.CW AX
holds the return value from subroutines as before.
Floating-point results are returned in
.CW X0 ,
although currently the first floating-point parameter is not passed in a register.
All parameters less than 8 bytes in length have 8 byte slots reserved on the stack
to preserve alignment and simplify variable-length argument list access,
including the first parameter when passed in a register,
even though bytes 4 to 7 are not initialized.
.SH
Alpha
.PP
On the Alpha, all registers are 64 bits.  The architecture handles 32-bit values
by giving them a canonical format (sign extension in the case of integer registers).
Registers are numbered
.CW R0
through
.CW R31 .
.CW R0
holds the return value from subroutines, and also the first parameter.
.CW R30
is the stack pointer,
.CW R29
is the static base,
.CW R26
is the link register, and
.CW R27
and
.CW R28
are linker temporaries.
.PP
Floating point registers are numbered
.CW F0
to
.CW F31 .
.CW F28
contains
.CW 0.5 ,
.CW F29
contains
.CW 1.0 ,
and
.CW F30
contains
.CW 2.0 .
.CW F31
is always
.CW 0.0
on the Alpha.
.PP
The extension character for
.CW MOV
follows DEC's notation:
.CW B
for byte (8 bits),
.CW W
for word (16 bits),
.CW L
for long (32 bits),
and
.CW Q
for quadword (64 bits).
Byte and ``word'' loads and stores may be made unsigned
by appending a
.CW U .
.CW S
and
.CW T
refer to IEEE floating point single precision (32 bits) and double precision (64 bits), respectively.
.SH
Power PC
.PP
The Power PC follows the Plan 9 model set by the MIPS and SPARC,
not the elaborate ABIs.
The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
there is no support for the older POWER instructions.
Registers are
.CW R0
through
.CW R31 .
.CW R0
is initialized to zero; this is done by C start up code
and assumed by the compiler and loader.
.CW R1
is the stack pointer.
.CW R2
is the static base register, with value the address of
.CW setSB(SB) .
.CW R3
is the return register and also the register holding the first
argument to a C function, with space reserved at
.CW 0(FP)
as on the MIPS.
.CW R31
is the loader temporary.
The external registers in Plan 9's C are allocated from
.CW R30
down.
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention, several registers are initialized
to specific values; this is done by the operating system.
.CW F27
must be initialized to the value
.CW 0x4330000080000000
(used by float-to-int conversion),
.CW F28
to the value 0.0,
.CW F29
to 0.5,
.CW F30
to 1.0, and
.CW F31
to 2.0.
.PP
As on the MIPS and SPARC, the assembler accepts arbitrary literals
as operands to
.CW MOVW ,
and also to
.CW ADD
and others where `immediate' variants exist,
and the loader generates sequences
of
.CW addi ,
.CW addis ,
.CW oris ,
etc. as required.
The register indirect addressing modes use the same syntax as the SPARC,
including double indexing when allowed.
.PP
The instruction names are generally derived from the Motorola ones,
subject to slight transformation:
the
.CW . ' `
marking the setting of condition codes is replaced by
.CW CC ,
and when the letter
.CW o ' `
represents `OE=1' it is replaced by
.CW V .
Thus
.CW add ,
.CW addo.
and
.CW subfzeo.
become
.CW ADD ,
.CW ADDVCC
and
.CW SUBFZEVCC .
As well as the three-operand conditional branch instruction
.CW BC ,
the assembler provides pseudo-instructions for the common cases:
.CW BEQ ,
.CW BNE ,
.CW BGT ,
.CW BGE ,
.CW BLT ,
.CW BLE ,
.CW BVC ,
and
.CW BVS .
The unconditional branch instruction is
.CW BR .
Indirect branches use
.CW "(CTR)"
or
.CW "(LR)"
as target.
.PP
Load or store operations are replaced by
.CW MOV
variants in the usual way:
.CW MOVW
(move word),
.CW MOVH
(move halfword with sign extension), and
.CW MOVB
(move byte with sign extension, a pseudo-instruction),
with unsigned variants
.CW MOVHZ
and
.CW MOVBZ ,
and byte-reversing
.CW MOVWBR
and
.CW MOVHBR .
`Load or store with update' versions are
.CW MOVWU ,
.CW MOVHU ,
and
.CW MOVBZU .
Load or store multiple is
.CW MOVMW .
The exceptions are the string instructions, which are
.CW LSW
and
.CW STSW ,
and the reservation instructions
.CW lwarx
and
.CW stwcx. ,
which are
.CW LWAR
and
.CW STWCCC ,
all with operands in the usual data-flow order.
Floating-point load or store instructions are
.CW FMOVD ,
.CW FMOVDU ,
.CW FMOVS ,
and
.CW FMOVSU .
The register to register move instructions
.CW fmr
and
.CW fmr.
are written
.CW FMOVD
and
.CW FMOVDCC .
.PP
The assembler knows the commonly used special purpose registers:
.CW CR ,
.CW CTR ,
.CW DEC ,
.CW LR ,
.CW MSR ,
and
.CW XER .
The rest, which are often architecture-dependent, are referenced as
.CW SPR(n) .
The segment registers of the 60x series are similarly
.CW SEG(n) ,
but
.I n
can also be a register name, as in
.CW SEG(R3) .
Moves between special purpose registers and general purpose ones,
when allowed by the architecture,
are written as
.CW MOVW ,
replacing
.CW mfcr ,
.CW mtcr ,
.CW mfmsr ,
.CW mtmsr ,
.CW mtspr ,
.CW mfspr ,
.CW mftb ,
and many others.
.PP
The fields of the condition register
.CW CR
are referenced as
.CW CR(0)
through
.CW CR(7) .
They are used by the
.CW MOVFL
(move field) pseudo-instruction,
which produces
.CW mcrf
or
.CW mtcrf .
For example:
.P1
	MOVFL	CR(3), CR(0)
	MOVFL	R3, CR(1)
	MOVFL	R3, $7, CR
.P2
They are also accepted in
the conditional branch instruction, for example
.P1
	BEQ	CR(7), label
.P2
Fields of the
.CW FPSCR
are accessed using
.CW MOVFL
in a similar way:
.P1
	MOVFL	FPSCR, F0
	MOVFL	F0, FPSCR
	MOVFL	F0, $7, FPSCR
	MOVFL	$0, FPSCR(3)
.P2
producing
.CW mffs ,
.CW mtfsf
or
.CW mtfsfi ,
as appropriate.
.SH
ARM
.PP
The assembler provides access to
.CW R0
through
.CW R14
and the
.CW PC .
The stack pointer is
.CW R13 ,
the link register is
.CW R14 ,
and the static base register is
.CW R12 .
.CW R0
is the return register and also the register holding
the first argument to a subroutine.
The external registers in Plan 9's C are allocated from
.CW R10
down.
.CW R11
is used by the loader as a temporary register.
The assembler supports the
.CW CPSR
and
.CW SPSR
registers.
It also knows about coprocessor registers
.CW C0
through
.CW C15 .
Floating registers are
.CW F0
through
.CW F7 ,
.CW FPSR
and
.CW FPCR .
.PP
As with the other architectures, loads and stores are called
.CW MOV ,
e.g.
.CW MOVW
for load word or store word, and
.CW MOVM
for
load or store multiple,
depending on the operands.
.PP
Addressing modes are supported by suffixes to the instructions:
.CW .IA
(increment after),
.CW .IB
(increment before),
.CW .DA
(decrement after), and
.CW .DB
(decrement before).
These can only be used with the
.CW MOV
instructions.
The move multiple instruction,
.CW MOVM ,
defines a range of registers using brackets, e.g.
.CW [R0-R12] .
The special
.CW MOVM
addressing mode bits
.CW W ,
.CW U ,
and
.CW P
are written in the same manner, for example,
.CW MOVM.DB.W .
A
.CW .S
suffix allows a
.CW MOVM
instruction to access user
.CW R13
and
.CW R14
when in another processor mode.
Shifts and rotates in addressing modes are supported by binary operators
.CW <<
(logical left shift),
.CW >>
(logical right shift),
.CW ->
(arithmetic right shift), and
.CW @>
(rotate right); for example
.CW "R7>>R2"
or
.CW "R2@>2" .
The assembler does not support indexing by a shifted expression;
only names can be doubly indexed.
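.PP
As an illustration only, the following fragment sketches how the suffixes
and shift operators described above might be combined.
The choice of registers is arbitrary, and the comments describe the intended
effect rather than verified assembler output.
.P1
	MOVM.DB.W	[R4-R7], (R13)	/* store R4-R7 below R13, updating R13 */
	MOVW	R2<<3, R1		/* R1 = R2 shifted left 3 bits */
	ADD	R7>>R2, R3		/* R3 += R7 shifted right by the count in R2 */
	MOVW	R2@>2, R0		/* R0 = R2 rotated right 2 bits */
	MOVM.IA.W	(R13), [R4-R7]	/* reload R4-R7, updating R13 */
.P2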
.PP
Any instruction can be followed by a suffix that makes the instruction conditional:
.CW .EQ ,
.CW .NE ,
and so on, as in the ARM manual, with synonyms
.CW .HS
(for
.CW .CS )
and
.CW .LO
(for
.CW .CC ),
for example
.CW ADD.NE .
Arithmetic
and logical instructions
can have a
.CW .S
suffix, as ARM allows, to set condition codes.
.PP
The syntax of the
.CW MCR
and
.CW MRC
coprocessor instructions is largely as in the manual, with the usual adjustments.
The assembler directly supports only the ARM floating-point coprocessor
operations used by the compiler:
.CW CMP ,
.CW ADD ,
.CW SUB ,
.CW MUL ,
and
.CW DIV ,
all with
.CW F
or
.CW D
suffix selecting single or double precision.
Floating-point loads and stores become
.CW MOVF
and
.CW MOVD .
Conversion instructions are also specified by moves:
.CW MOVWD ,
.CW MOVWF ,
.CW MOVDW ,
.CW MOVFW ,
.CW MOVFD ,
and
.CW MOVDF .
.SH
AMD 29000
.PP
For details about this assembly language, which was built for the AMD 29240,
look at the sources or examine compiler output.