.ft CW
.ta 8n +8n +8n +8n +8n +8n +8n
.ft
.TL
A Manual for the Plan 9 assembler
.AU
.I "Rob Pike"
.AI
rob@plan9.bell-labs.com
.SH
Machines
.PP
There is an assembler for each of the MIPS, SPARC, Intel 386,
Motorola 68020 and 68000, IBM Power PC, DEC Alpha, and ARM.
The 68020 assembler,
.CW 2a ,
is the oldest and in many ways the prototype.
The assemblers are really just variations of a single program:
they share many properties such as left-to-right assignment order for
instruction operands and the synthesis of macro instructions
such as
.CW MOVE
to hide the peculiarities of the load and store structure of the machines.
To keep things concrete, the first part of this manual is
specifically about the 68020.
At the end is a description of the differences among
the other assemblers.
.ig
.PP
The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
is a prerequisite for this manual.
..
.SH
Registers
.PP
All pre-defined symbols in the assembler are upper-case.
Data registers are
.CW R0
through
.CW R7 ;
address registers are
.CW A0
through
.CW A7 ;
floating-point registers are
.CW F0
through
.CW F7 .
.PP
A pointer in
.CW A6
is used by the C compiler to point to data, enabling short addresses to
be used more often.
The value of
.CW A6
is constant and must be set during C program initialization
to the address of the externally-defined symbol
.CW a6base .
.PP
The following hardware registers are defined in the assembler; their
meaning should be obvious given a 68020 manual:
.CW CAAR ,
.CW CACR ,
.CW CCR ,
.CW DFC ,
.CW ISP ,
.CW MSP ,
.CW SFC ,
.CW SR ,
.CW USP ,
and
.CW VBR .
.PP
The assembler also defines several pseudo-registers that
manipulate the stack:
.CW FP ,
.CW SP ,
and
.CW TOS .
.CW FP
is the frame pointer, so
.CW 0(FP)
is the first argument,
.CW 4(FP)
is the second, and so on.
.CW SP
is the local stack pointer, where automatic variables are held
(SP is a pseudo-register only on the 68020);
.CW 0(SP)
is the first automatic, and so on as with
.CW FP .
Finally,
.CW TOS
is the top-of-stack register, used for pushing parameters to procedures,
saving temporary values, and so on.
.PP
The assembler and loader track these pseudo-registers so
the above statements are true regardless of what has been
pushed on the hardware stack, pointed to by
.CW A7 .
The name
.CW A7
refers to the hardware stack pointer, but beware of mixed use of
.CW A7
and the above stack-related pseudo-registers, which will cause trouble.
Note, too, that the loader observes that the
.CW PEA
instruction alters SP, and it therefore inserts
a corresponding pop before all returns.
The assembler accepts a label-like name to be attached to
.CW FP
and
.CW SP
uses, such as
.CW p+0(FP) ,
to help document that
.CW p
is the first argument to a routine.
The name goes in the symbol table but has no significance to the result
of the program.
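.PP
As a minimal sketch of the naming convention
(the routine name
.CW second
and the argument name
.CW q
are illustrative, not taken from any library),
the following routine returns its second argument:
.P1
TEXT	second(SB), $0
	MOVL	q+4(FP), R0	/* q only documents that 4(FP) is the second argument */
	RTS
.P2
The
.CW TEXT
pseudo-operation is described under ``Defining a procedure''.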
.SH
Referring to data
.PP
All external references must be made relative to some pseudo-register,
either
.CW PC
(the virtual program counter) or
.CW SB
(the ``static base'' register).
.CW PC
counts instructions, not bytes of data.
For example, to branch to the second following instruction, that is,
to skip one instruction, one may write
.P1
	BRA	2(PC)
.P2
Labels are also allowed, as in
.P1
	BRA	return
	NOP
return:
	RTS
.P2
When using labels, there is no
.CW (PC)
annotation.
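.PP
The two styles can be mixed in one routine.
As a sketch (the routine name
.CW nonzero
and the label
.CW done
are made up for illustration, and
.CW TSTL
and
.CW BNE
are assumed to be accepted by
.CW 2a
under those names),
this returns 1 if its argument is non-zero and 0 otherwise:
.P1
TEXT	nonzero(SB), $0
	MOVL	x+0(FP), R0
	TSTL	R0
	BNE	2(PC)		/* x != 0: skip the following BRA */
	BRA	done		/* x == 0: R0 already holds 0 */
	MOVL	$1, R0
done:
	RTS
.P2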
.PP
The pseudo-register
.CW SB
refers to the beginning of the address space of the program.
Thus, references to global data and procedures are written as
offsets to
.CW SB ,
as in
.P1
	MOVL	$array(SB), TOS
.P2
to push the address of a global array on the stack, or
.P1
	MOVL	array+4(SB), TOS
.P2
to push the second (4-byte) element of the array.
Note the use of an offset; the complete list of addressing modes is given below.
Similarly, subroutine calls must use
.CW SB :
.P1
	BSR	exit(SB)
.P2
File-static variables have syntax
.P1
	local<>+4(SB)
.P2
The
.CW <>
will be filled in at load time by a unique integer.
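.PP
For instance, a counter private to one source file might be declared
and incremented as follows.
This is only a sketch: the names
.CW count
and
.CW bump
are hypothetical.
.P1
GLOBL	count<>(SB), $4		/* 4 bytes of file-static storage */
TEXT	bump(SB), $0
	MOVL	count<>+0(SB), R0
	ADDL	$1, R0
	MOVL	R0, count<>+0(SB)
	RTS			/* the new count is returned in R0 */
.P2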
.PP
When a program starts, it must execute
.P1
	MOVL	$a6base(SB), A6
.P2
before accessing any global data.
(On machines such as the MIPS and SPARC that cannot load a register
in a single instruction, constants are loaded through the static base
register.  The loader recognizes code that initializes the static
base register and treats it specially.  You must be careful, however,
not to load large constants on such machines when the static base
register is not set up, such as early in interrupt routines.)
.SH
Expressions
.PP
Expressions are mostly what one might expect.
Where an offset or a constant is expected,
a primary expression with unary operators is allowed.
A general C constant expression is allowed in parentheses.
.PP
Source files are preprocessed exactly as in the C compiler, so
.CW #define
and
.CW #include
work.
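.PP
A short sketch of both rules, using the hypothetical macro names
.CW NBUF
and
.CW BUFSIZE :
.P1
#define	NBUF	8
#define	BUFSIZE	64
	MOVL	$-1, R0			/* unary operator on a primary expression */
	MOVL	$(NBUF*BUFSIZE), R1	/* general expression, so parenthesized */
.P2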
.SH
Addressing modes
.PP
The simple addressing modes are shared by all the assemblers.
Here, for completeness, follows a table of all the 68020 addressing modes,
since that machine has the richest set.
In the table,
.CW o
is an offset, which if zero may be elided, and
.CW d
is a displacement, which is a constant between -128 and 127 inclusive.
Many of the modes listed have the same name;
scrutiny of the format will show what default is being applied.
For instance, indexed mode with no address register supplied operates
as though a zero-valued register were used.
For "offset" read "displacement."
For "\f(CW.s\fP" read either
.CW .L
or
.CW .W
followed by
.CW *1 ,
.CW *2 ,
.CW *4 ,
or
.CW *8
to indicate the size and scaling of the data.
.IP
.TS
l lfCW.
data register	R0
address register	A0
floating-point register	F0
special names	CAAR, CACR, etc.
constant	$con
floating point constant	$fcon
external symbol	name+o(SB)
local symbol	name<>+o(SB)
automatic symbol	name+o(SP)
argument	name+o(FP)
address of external	$name+o(SB)
address of local	$name<>+o(SB)
indirect post-increment	(A0)+
indirect pre-decrement	-(A0)
indirect with offset	o(A0)
indexed with offset	o()(R0.s)
indexed with offset	o(A0)(R0.s)
external indexed	name+o(SB)(R0.s)
local indexed	name<>+o(SB)(R0.s)
automatic indexed	name+o(SP)(R0.s)
parameter indexed	name+o(FP)(R0.s)
offset indirect post-indexed	d(o())(R0.s)
offset indirect post-indexed	d(o(A0))(R0.s)
external indirect post-indexed	d(name+o(SB))(R0.s)
local indirect post-indexed	d(name<>+o(SB))(R0.s)
automatic indirect post-indexed	d(name+o(SP))(R0.s)
parameter indirect post-indexed	d(name+o(FP))(R0.s)
offset indirect pre-indexed	d(o()(R0.s))
offset indirect pre-indexed	d(o(A0))
offset indirect pre-indexed	d(o(A0)(R0.s))
external indirect pre-indexed	d(name+o(SB))
external indirect pre-indexed	d(name+o(SB)(R0.s))
local indirect pre-indexed	d(name<>+o(SB))
local indirect pre-indexed	d(name<>+o(SB)(R0.s))
automatic indirect pre-indexed	d(name+o(SP))
automatic indirect pre-indexed	d(name+o(SP)(R0.s))
parameter indirect pre-indexed	d(name+o(FP))
parameter indirect pre-indexed	d(name+o(FP)(R0.s))
.TE
.in
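.PP
A few of the simpler modes in use, as a sketch.
The symbol
.CW tab
is hypothetical and the register choices are arbitrary;
each operand is written by following the corresponding table entry.
.P1
	MOVL	(A0)+, R1		/* indirect post-increment */
	MOVL	R1, -(A1)		/* indirect pre-decrement */
	MOVL	8(A0), R2		/* indirect with offset */
	MOVL	tab+0(SB)(R0.L*4), R3	/* external indexed, R0 scaled by 4 */
.P2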
.SH
Laying down data
.PP
Placing data in the instruction stream, say for interrupt vectors, is easy:
the pseudo-instructions
.CW LONG
and
.CW WORD
(but not
.CW BYTE )
lay down the value of their single argument, of the appropriate size,
as if it were an instruction:
.P1
	LONG	$12345
.P2
places the long 12345 (base 10)
in the instruction stream.
(On most machines,
the only such operator is
.CW WORD
and it lays down 32-bit quantities.
The 386 has all three:
.CW LONG ,
.CW WORD ,
and
.CW BYTE .
The AMD64 adds
.CW QUAD
for 64-bit values.)
.PP
Placing information in the data section is more painful.
The pseudo-instruction
.CW DATA
does the work, given two arguments: an address at which to place the item,
including its size,
and the value to place there.  For example, to define a character array
.CW array
containing the characters
.CW abc
and a terminating null:
.P1
	DATA    array+0(SB)/1, $'a'
	DATA    array+1(SB)/1, $'b'
	DATA    array+2(SB)/1, $'c'
	GLOBL   array(SB), $4
.P2
or
.P1
	DATA    array+0(SB)/4, $"abc\ez"
	GLOBL   array(SB), $4
.P2
The
.CW /1
defines the number of bytes to define,
.CW GLOBL
makes the symbol global, and the
.CW $4
says how many bytes the symbol occupies.
Uninitialized data is zeroed automatically,
which is why the first form needs no explicit terminating null.
The character
.CW \ez
is equivalent to the C
.CW \e0 .
The string in a
.CW DATA
statement may contain a maximum of eight bytes;
build larger strings piecewise.
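.PP
For example, a 16-byte string (15 characters and a terminating null)
can be laid down in two 8-byte pieces.
The symbol name
.CW title
is illustrative only.
.P1
	DATA	title+0(SB)/8, $"assembly"
	DATA	title+8(SB)/8, $" manual\ez"
	GLOBL	title(SB), $16
.P2
.PP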
Two pseudo-instructions,
.CW DYNT
and
.CW INIT ,
allow the (obsolete) Alef compilers to build dynamic type information during the load
phase.
The
.CW DYNT
pseudo-instruction has two forms:
.P1
	DYNT	, ALEF_SI_5+0(SB)
	DYNT	ALEF_AS+0(SB), ALEF_SI_5+0(SB)
.P2
In the first form,
.CW DYNT
defines the symbol to be a small unique integer constant, chosen by the loader,
which is some multiple of the word size.  In the second form,
.CW DYNT
defines the second symbol in the same way,
places the address of the most recently
defined text symbol in the array specified by the first symbol at the
index defined by the value of the second symbol,
and then adjusts the size of the array accordingly.
.PP
The
.CW INIT
pseudo-instruction takes the same parameters as a
.CW DATA
statement.  Its symbol is used as the base of an array and the
data item is installed in the array at the offset specified by the most recent
.CW DYNT
pseudo-instruction.
The size of the array is adjusted accordingly.
The
.CW DYNT
and
.CW INIT
pseudo-instructions are not implemented on the 68020.
.SH
Defining a procedure
.PP
Entry points are defined by the pseudo-operation
.CW TEXT ,
which takes as arguments the name of the procedure (including the ubiquitous
.CW (SB) )
and the number of bytes of automatic storage to pre-allocate on the stack,
which will usually be zero when writing assembly language programs.
On machines with a link register, such as the MIPS and SPARC,
the special value -4 instructs the loader to generate no PC save
and restore instructions, even if the function is not a leaf.
Here is a complete procedure that returns the sum
of its two arguments:
.P1
TEXT	sum(SB), $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
An optional middle argument
to the
.CW TEXT
pseudo-op is a bit field of options to the loader.
Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
the program.
For example,
.P1
TEXT	sum(SB), 1, $0
	MOVL	arg1+0(FP), R0
	ADDL	arg2+4(FP), R0
	RTS
.P2
will not be profiled; the first version above would be.
Subroutines with peculiar state, such as system call routines,
should not be profiled.
.PP
Setting the 2 bit allows multiple definitions of the same
.CW TEXT
symbol in a program; the loader will place only one such function in the image.
It was emitted only by the Alef compilers.
.PP
Subroutines to be called from C should place their result in
.CW R0 ,
even if it is an address.
Floating point values are returned in
.CW F0 .
Functions that return a structure to a C program
receive as their first argument the address of the location to
store the result;
.CW R0
is unused in the calling protocol for such procedures.
A caller is responsible for saving any registers whose values it needs
after the call (``caller saves''),
so a subroutine is free to use any registers without saving them.
.CW A6
and
.CW A7
are the exceptions as described above.
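.PP
As a sketch of the convention for address results
(the names
.CW tabaddr
and
.CW tab
are hypothetical),
a routine that returns the address of a global array
leaves it in a data register, not an address register:
.P1
TEXT	tabaddr(SB), $0
	MOVL	$tab(SB), R0	/* address result is returned in R0 */
	RTS
.P2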
.SH
When in doubt
.PP
If you get confused, try using the
.CW -S
option to
.CW 2c
and compiling a sample program.
The standard output is valid input to the assembler.
.SH
Instructions
.PP
The instruction set of the assembler is not identical to that
of the machine.
It is chosen to match what the compiler generates, augmented
slightly by specific needs of the operating system.
For example,
.CW 2a
does not distinguish between the various forms of
.CW MOVE
instruction: move quick, move address, etc.  Instead the context
does the job.  For example,
.P1
	MOVL	$1, R1
	MOVL	A0, R2
	MOVW	SR, R3
.P2
generates official
.CW MOVEQ ,
.CW MOVEA ,
and
.CW MOVESR
instructions.
A number of instructions do not have the syntax necessary to specify
their entire capabilities.  Notable examples are the bitfield
instructions, the
multiply and divide instructions, etc.
For a complete set of generated instruction names (in
.CW 2a
notation, not Motorola's) see the file
.CW /sys/src/cmd/2c/2.out.h .
Despite its name, this file contains an enumeration of the
instructions that appear in the intermediate files generated
by the compiler, which correspond exactly to lines of assembly language.
.PP
The MC68000 assembler,
.CW 1a ,
is essentially the same, honoring the appropriate subset of the instructions
and addressing modes.
The definitions of these are, nonetheless, part of
.CW 2.out.h .
.SH
Laying down instructions
.PP
The loader modifies the code produced by the assembler and compiler.
It folds branches,
copies short sequences of code to eliminate branches,
and discards unreachable code.
The first instruction of every function is assumed to be reachable.
The pseudo-instruction
.CW NOP ,
which you may see in compiler output,
means no instruction at all, rather than an instruction that does nothing.
The loader discards all
.CW NOP 's.
.PP
To generate a true
.CW NOP
instruction, or any other instruction not known to the assembler, use a
.CW WORD
pseudo-instruction.
Such instructions on RISCs are not scheduled by the loader and must have
their delay slots filled manually.
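.PP
For example, the 68020 hardware no-op has the 16-bit encoding 0x4E71
(a value taken from the Motorola manual, not from the assembler).
Assuming
.CW WORD
lays down a 16-bit quantity on the 68020,
a true no-op could be written as
.P1
	WORD	$0x4e71		/* Motorola NOP, emitted as data */
.P2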
.SH
MIPS
.PP
The registers are only addressed by number:
.CW R0
through
.CW R31 .
.CW R29
is the stack pointer;
.CW R30
is used as the static base pointer, the analogue of
.CW A6
on the 68020.
Its value is the address of the global symbol
.CW setR30(SB) .
The register holding returned values from subroutines is
.CW R1 .
When a function is called, space for the first argument
is reserved at
.CW 0(FP)
but in C (not Alef) the value is passed in
.CW R1
instead.
.PP
The loader uses
.CW R28
as a temporary.  The system uses
.CW R26
and
.CW R27
as interrupt-time temporaries.  Therefore none of these registers
should be used in user code.
.PP
The control registers are not known to the assembler.
Instead they are numbered registers
.CW M0 ,
.CW M1 ,
etc.
Use this trick to access, say,
.CW STATUS :
.P1
#define	STATUS	12
	MOVW	M(STATUS), R1
.P2
.PP
Floating point registers are called
.CW F0
through
.CW F31 .
By convention,
.CW F24
must be initialized to the value 0.0,
.CW F26
to 0.5,
.CW F28
to 1.0, and
.CW F30
to 2.0;
this is done by the operating system.
.PP
The instructions and their syntax are different from those of the manufacturer's
manual.
There are no
.CW lui
and kin; instead there are
.CW MOVW
(move word),
.CW MOVH
(move halfword),
and
.CW MOVB
(move byte) pseudo-instructions.  If the operand is unsigned, the instructions
are
.CW MOVHU
and
.CW MOVBU .
The order of operands is from left to right in dataflow order, just as
on the 68020 but not as in MIPS documentation.
This means that the
.CW Bcond
instructions are reversed with respect to the book; for example, a
.CW va
.CW BGTZ
generates a MIPS
.CW bltz
instruction.
593*46439007SCharles.Forsyth.PP
594*46439007SCharles.ForsythThe assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
595*46439007SCharles.ForsythIt understands the 64-bit instructions
596*46439007SCharles.Forsyth.CW MOVV ,
597*46439007SCharles.Forsyth.CW MOVVL ,
598*46439007SCharles.Forsyth.CW ADDV ,
599*46439007SCharles.Forsyth.CW ADDVU ,
600*46439007SCharles.Forsyth.CW SUBV ,
601*46439007SCharles.Forsyth.CW SUBVU ,
602*46439007SCharles.Forsyth.CW MULV ,
603*46439007SCharles.Forsyth.CW MULVU ,
604*46439007SCharles.Forsyth.CW DIVV ,
605*46439007SCharles.Forsyth.CW DIVVU ,
606*46439007SCharles.Forsyth.CW SLLV ,
607*46439007SCharles.Forsyth.CW SRLV ,
608*46439007SCharles.Forsythand
609*46439007SCharles.Forsyth.CW SRAV .
610*46439007SCharles.ForsythThe assembler does not have any cache, load-linked, or store-conditional instructions.
611*46439007SCharles.Forsyth.PP
612*46439007SCharles.ForsythSome assembler instructions are expanded into multiple instructions by the loader.
For example, the loader may convert the load of a 32-bit constant into an
614*46439007SCharles.Forsyth.CW lui
615*46439007SCharles.Forsythfollowed by an
616*46439007SCharles.Forsyth.CW ori .
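.PP
To illustrate, assuming an arbitrary constant and register, the single
assembler line below is the kind of instruction the loader may expand;
the comment shows the likely shape of the result, not a guaranteed encoding:
.P1
	MOVW	$0x12345678, R3	/* perhaps lui 0x1234, then ori 0x5678 */
.P2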
617*46439007SCharles.Forsyth.PP
618*46439007SCharles.ForsythAssembler instructions should be laid out as if there
619*46439007SCharles.Forsythwere no load, branch, or floating point compare delay slots;
620*46439007SCharles.Forsyththe loader will rearrange\(em\f2schedule\f1\(emthe instructions
621*46439007SCharles.Forsythto guarantee correctness and improve performance.
622*46439007SCharles.ForsythThe only exception is that the correct scheduling of instructions
623*46439007SCharles.Forsyththat use control registers varies from model to model of machine
624*46439007SCharles.Forsyth(and is often undocumented) so you should schedule such instructions
625*46439007SCharles.Forsythby hand to guarantee correct behavior.
626*46439007SCharles.ForsythThe loader generates
627*46439007SCharles.Forsyth.P1
628*46439007SCharles.Forsyth	NOR	R0, R0, R0
629*46439007SCharles.Forsyth.P2
630*46439007SCharles.Forsythwhen it needs a true no-op instruction.
631*46439007SCharles.ForsythUse exactly this instruction when scheduling code manually;
632*46439007SCharles.Forsyththe loader recognizes it and schedules the code before it and after it independently.  Also,
633*46439007SCharles.Forsyth.CW WORD
634*46439007SCharles.Forsythpseudo-ops are scheduled like no-ops.
635*46439007SCharles.Forsyth.PP
636*46439007SCharles.ForsythThe
637*46439007SCharles.Forsyth.CW NOSCHED
638*46439007SCharles.Forsythpseudo-op disables instruction scheduling
639*46439007SCharles.Forsyth(scheduling is enabled by default);
640*46439007SCharles.Forsyth.CW SCHED
641*46439007SCharles.Forsythre-enables it.
642*46439007SCharles.ForsythBranch folding, code copying, and dead code elimination are
643*46439007SCharles.Forsythdisabled for instructions that are not scheduled.
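.PP
A hand-scheduled fragment might look like the following sketch.
The registers are arbitrary, and whether the no-op is really needed
depends on the model of machine; the point is only the bracketing by
.CW NOSCHED
and
.CW SCHED
and the use of the recognized no-op:
.P1
	NOSCHED
	MOVW	0(R2), R3	/* load */
	NOR	R0, R0, R0	/* hand-placed no-op after the load */
	ADDU	R3, R4	/* uses the loaded value */
	SCHED
.P2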
644*46439007SCharles.Forsyth.SH
645*46439007SCharles.ForsythSPARC
646*46439007SCharles.Forsyth.PP
647*46439007SCharles.ForsythOnce you understand the Plan 9 model for the MIPS, the SPARC is familiar.
648*46439007SCharles.ForsythRegisters have numerical names only:
649*46439007SCharles.Forsyth.CW R0
650*46439007SCharles.Forsyththrough
651*46439007SCharles.Forsyth.CW R31 .
652*46439007SCharles.ForsythForget about register windows: Plan 9 doesn't use them at all.
653*46439007SCharles.ForsythThe machine has 32 global registers, period.
654*46439007SCharles.Forsyth.CW R1
655*46439007SCharles.Forsyth[sic] is the stack pointer.
656*46439007SCharles.Forsyth.CW R2
657*46439007SCharles.Forsythis the static base register, with value the address of
658*46439007SCharles.Forsyth.CW setSB(SB) .
659*46439007SCharles.Forsyth.CW R7
660*46439007SCharles.Forsythis the return register and also the register holding the first
661*46439007SCharles.Forsythargument to a C (not Alef) function, again with space reserved at
662*46439007SCharles.Forsyth.CW 0(FP) .
663*46439007SCharles.Forsyth.CW R14
664*46439007SCharles.Forsythis the loader temporary.
665*46439007SCharles.Forsyth.PP
666*46439007SCharles.ForsythFloating-point registers are exactly as on the MIPS.
667*46439007SCharles.Forsyth.PP
668*46439007SCharles.ForsythThe control registers are known by names such as
669*46439007SCharles.Forsyth.CW FSR .
670*46439007SCharles.ForsythThe instructions to access these registers are
671*46439007SCharles.Forsyth.CW MOVW
672*46439007SCharles.Forsythinstructions, for example
673*46439007SCharles.Forsyth.P1
674*46439007SCharles.Forsyth	MOVW	Y, R8
675*46439007SCharles.Forsyth.P2
676*46439007SCharles.Forsythfor the SPARC instruction
677*46439007SCharles.Forsyth.P1
678*46439007SCharles.Forsyth	rdy	%r8
679*46439007SCharles.Forsyth.P2
680*46439007SCharles.Forsyth.PP
681*46439007SCharles.ForsythMove instructions are similar to those on the MIPS: pseudo-operations
682*46439007SCharles.Forsyththat turn into appropriate sequences of
683*46439007SCharles.Forsyth.CW sethi
684*46439007SCharles.Forsythinstructions, adds, etc.
685*46439007SCharles.ForsythInstructions read from left to right.  Because the arguments are
686*46439007SCharles.Forsythflipped to
687*46439007SCharles.Forsyth.CW SUBCC ,
688*46439007SCharles.Forsyththe condition codes are not inverted as on the MIPS.
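.PP
For instance (a sketch with arbitrary registers and an arbitrary constant),
the moves below read source first, destination second; the first is the
kind the loader may turn into a
.CW sethi
followed by an instruction that supplies the low bits:
.P1
	MOVW	$0x12345678, R8	/* constant to register */
	MOVW	R8, 0(R9)	/* store word */
	MOVW	4(R9), R10	/* load word */
.P2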
689*46439007SCharles.Forsyth.PP
An alternate address space (ASI) is named after the register inside the parentheses; for example, to move a word from ASI 2:
691*46439007SCharles.Forsyth.P1
692*46439007SCharles.Forsyth	MOVW	(R7, 2), R8
693*46439007SCharles.Forsyth.P2
694*46439007SCharles.ForsythThe syntax for double indexing is
695*46439007SCharles.Forsyth.P1
696*46439007SCharles.Forsyth	MOVW	(R7+R8), R9
697*46439007SCharles.Forsyth.P2
698*46439007SCharles.Forsyth.PP
699*46439007SCharles.ForsythThe SPARC's instruction scheduling is similar to the MIPS's.
700*46439007SCharles.ForsythThe official no-op instruction is:
701*46439007SCharles.Forsyth.P1
702*46439007SCharles.Forsyth	ORN	R0, R0, R0
703*46439007SCharles.Forsyth.P2
704*46439007SCharles.Forsyth.SH
705*46439007SCharles.Forsythi386
706*46439007SCharles.Forsyth.PP
707*46439007SCharles.ForsythThe assembler assumes 32-bit protected mode.
708*46439007SCharles.ForsythThe register names are
709*46439007SCharles.Forsyth.CW SP ,
710*46439007SCharles.Forsyth.CW AX ,
711*46439007SCharles.Forsyth.CW BX ,
712*46439007SCharles.Forsyth.CW CX ,
713*46439007SCharles.Forsyth.CW DX ,
714*46439007SCharles.Forsyth.CW BP ,
715*46439007SCharles.Forsyth.CW DI ,
716*46439007SCharles.Forsythand
717*46439007SCharles.Forsyth.CW SI .
718*46439007SCharles.ForsythThe stack pointer (not a pseudo-register) is
719*46439007SCharles.Forsyth.CW SP
720*46439007SCharles.Forsythand the return register is
721*46439007SCharles.Forsyth.CW AX .
722*46439007SCharles.ForsythThere is no physical frame pointer but, as for the MIPS,
723*46439007SCharles.Forsyth.CW FP
724*46439007SCharles.Forsythis a pseudo-register that acts as
725*46439007SCharles.Forsytha frame pointer.
726*46439007SCharles.Forsyth.PP
727*46439007SCharles.ForsythOpcode names are mostly the same as those listed in the Intel manual
728*46439007SCharles.Forsythwith an
729*46439007SCharles.Forsyth.CW L ,
730*46439007SCharles.Forsyth.CW W ,
731*46439007SCharles.Forsythor
732*46439007SCharles.Forsyth.CW B
733*46439007SCharles.Forsythappended to identify 32-bit,
734*46439007SCharles.Forsyth16-bit, and 8-bit operations.
735*46439007SCharles.ForsythThe exceptions are loads, stores, and conditionals.
736*46439007SCharles.ForsythAll load and store opcodes to and from general registers, special registers
737*46439007SCharles.Forsyth(such as
.CW CR0 ,
.CW CR3 ,
.CW GDTR ,
.CW IDTR ,
.CW SS ,
.CW CS ,
.CW DS ,
.CW ES ,
.CW FS ,
747*46439007SCharles.Forsythand
748*46439007SCharles.Forsyth.CW GS )
749*46439007SCharles.Forsythor memory are written
750*46439007SCharles.Forsythas
751*46439007SCharles.Forsyth.P1
752*46439007SCharles.Forsyth	MOV\f2x\fP	src,dst
753*46439007SCharles.Forsyth.P2
754*46439007SCharles.Forsythwhere
755*46439007SCharles.Forsyth.I x
756*46439007SCharles.Forsythis
757*46439007SCharles.Forsyth.CW L ,
758*46439007SCharles.Forsyth.CW W ,
759*46439007SCharles.Forsythor
760*46439007SCharles.Forsyth.CW B .
761*46439007SCharles.ForsythThus to get
762*46439007SCharles.Forsyth.CW AL
763*46439007SCharles.Forsythuse a
764*46439007SCharles.Forsyth.CW MOVB
765*46439007SCharles.Forsythinstruction.  If you need to access
766*46439007SCharles.Forsyth.CW AH ,
767*46439007SCharles.Forsythyou must mention it explicitly in a
768*46439007SCharles.Forsyth.CW MOVB :
769*46439007SCharles.Forsyth.P1
770*46439007SCharles.Forsyth	MOVB	AH, BX
771*46439007SCharles.Forsyth.P2
Many moves that are illegal on the hardware, for example
.P1
	MOVB	BP, DI
.P2
are nevertheless accepted: the loader implements them as pseudo-operations.
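.PP
A few moves in this style, as a sketch with arbitrarily chosen registers
and offsets:
.P1
	MOVL	$1, AX	/* 32-bit constant to register */
	MOVL	AX, 8(BX)	/* 32-bit store */
	MOVW	4(BX), CX	/* 16-bit load */
	MOVB	AX, (BX)	/* 8-bit store of AL */
.P2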
777*46439007SCharles.Forsyth.PP
778*46439007SCharles.ForsythThe names of conditions in all conditional instructions
779*46439007SCharles.Forsyth.CW J , (
780*46439007SCharles.Forsyth.CW SET )
781*46439007SCharles.Forsythfollow the conventions of the 68020 instead of those of the Intel
782*46439007SCharles.Forsythassembler:
783*46439007SCharles.Forsyth.CW JOS ,
784*46439007SCharles.Forsyth.CW JOC ,
785*46439007SCharles.Forsyth.CW JCS ,
786*46439007SCharles.Forsyth.CW JCC ,
787*46439007SCharles.Forsyth.CW JEQ ,
788*46439007SCharles.Forsyth.CW JNE ,
789*46439007SCharles.Forsyth.CW JLS ,
790*46439007SCharles.Forsyth.CW JHI ,
791*46439007SCharles.Forsyth.CW JMI ,
792*46439007SCharles.Forsyth.CW JPL ,
793*46439007SCharles.Forsyth.CW JPS ,
794*46439007SCharles.Forsyth.CW JPC ,
795*46439007SCharles.Forsyth.CW JLT ,
796*46439007SCharles.Forsyth.CW JGE ,
797*46439007SCharles.Forsyth.CW JLE ,
798*46439007SCharles.Forsythand
799*46439007SCharles.Forsyth.CW JGT
800*46439007SCharles.Forsythinstead of
801*46439007SCharles.Forsyth.CW JO ,
802*46439007SCharles.Forsyth.CW JNO ,
803*46439007SCharles.Forsyth.CW JB ,
804*46439007SCharles.Forsyth.CW JNB ,
805*46439007SCharles.Forsyth.CW JZ ,
806*46439007SCharles.Forsyth.CW JNZ ,
807*46439007SCharles.Forsyth.CW JBE ,
808*46439007SCharles.Forsyth.CW JNBE ,
809*46439007SCharles.Forsyth.CW JS ,
810*46439007SCharles.Forsyth.CW JNS ,
811*46439007SCharles.Forsyth.CW JP ,
812*46439007SCharles.Forsyth.CW JNP ,
813*46439007SCharles.Forsyth.CW JL ,
814*46439007SCharles.Forsyth.CW JNL ,
815*46439007SCharles.Forsyth.CW JLE ,
816*46439007SCharles.Forsythand
817*46439007SCharles.Forsyth.CW JNLE .
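.PP
As an illustration, the fragment below (the label
.CW differ
and the register use are invented for the example) tests two registers
for equality using the 68020-style name where the Intel manual would write
.CW JNZ :
.P1
	CMPL	AX, BX
	JNE	differ	/* Intel manual: JNZ */
	MOVL	$1, AX
	RET
differ:
	MOVL	$0, AX
	RET
.P2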
818*46439007SCharles.Forsyth.PP
819*46439007SCharles.ForsythThe addressing modes have syntax like
820*46439007SCharles.Forsyth.CW AX ,
821*46439007SCharles.Forsyth.CW (AX) ,
822*46439007SCharles.Forsyth.CW (AX)(BX*4) ,
823*46439007SCharles.Forsyth.CW 10(AX) ,
824*46439007SCharles.Forsythand
825*46439007SCharles.Forsyth.CW 10(AX)(BX*4) .
826*46439007SCharles.ForsythThe offsets from
827*46439007SCharles.Forsyth.CW AX
828*46439007SCharles.Forsythcan be replaced by offsets from
829*46439007SCharles.Forsyth.CW FP
830*46439007SCharles.Forsythor
831*46439007SCharles.Forsyth.CW SB
832*46439007SCharles.Forsythto access names, for example
833*46439007SCharles.Forsyth.CW extern+5(SB)(AX*2) .
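.PP
Used as source operands, the modes look like this (a sketch; the destination
.CW CX
is arbitrary and
.CW extern
is the placeholder name from the text above):
.P1
	MOVL	(AX), CX	/* indirect */
	MOVL	10(AX), CX	/* base plus displacement */
	MOVL	10(AX)(BX*4), CX	/* base, displacement, scaled index */
	MOVL	extern+5(SB)(AX*2), CX	/* external name, scaled index */
.P2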
834*46439007SCharles.Forsyth.PP
835*46439007SCharles.ForsythOther notes: Non-relative
836*46439007SCharles.Forsyth.CW JMP
837*46439007SCharles.Forsythand
838*46439007SCharles.Forsyth.CW CALL
839*46439007SCharles.Forsythhave a
840*46439007SCharles.Forsyth.CW *
841*46439007SCharles.Forsythadded to the syntax.
842*46439007SCharles.ForsythOnly
843*46439007SCharles.Forsyth.CW LOOP ,
844*46439007SCharles.Forsyth.CW LOOPEQ ,
845*46439007SCharles.Forsythand
846*46439007SCharles.Forsyth.CW LOOPNE
847*46439007SCharles.Forsythare legal loop instructions.  Only
848*46439007SCharles.Forsyth.CW REP
849*46439007SCharles.Forsythand
850*46439007SCharles.Forsyth.CW REPN
851*46439007SCharles.Forsythare recognized repeaters.  These are not prefixes, but rather
852*46439007SCharles.Forsythstand-alone opcodes that precede the strings, for example
853*46439007SCharles.Forsyth.P1
854*46439007SCharles.Forsyth	CLD; REP; MOVSL
855*46439007SCharles.Forsyth.P2
856*46439007SCharles.ForsythSegment override prefixes in
857*46439007SCharles.Forsyth.CW MOD/RM
858*46439007SCharles.Forsythfields are not supported.
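.PP
A fuller version of the string example might read as follows.
This is only a sketch: the external names
.CW src
and
.CW dst
and the count are hypothetical.
.P1
	MOVL	$src(SB), SI	/* source address */
	MOVL	$dst(SB), DI	/* destination address */
	MOVL	$16, CX	/* number of 32-bit words */
	CLD; REP; MOVSL
.P2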
859*46439007SCharles.Forsyth.SH
860*46439007SCharles.ForsythAMD64
861*46439007SCharles.Forsyth.PP
862*46439007SCharles.ForsythThe assembler's conventions are similar to those for the 386, above.
863*46439007SCharles.ForsythThe architecture provides extra fixed-point registers
864*46439007SCharles.Forsyth.CW R8
865*46439007SCharles.Forsythto
866*46439007SCharles.Forsyth.CW R15 .
All registers are 64 bits wide, but instructions can access the low-order 8, 16, and 32 bits
868*46439007SCharles.Forsythas described in the processor handbook.
869*46439007SCharles.ForsythFor example,
870*46439007SCharles.Forsyth.CW MOVL
871*46439007SCharles.Forsythto
872*46439007SCharles.Forsyth.CW AX
873*46439007SCharles.Forsythputs a value in the low-order 32 bits and clears the top 32 bits to zero.
Literal operands are limited to signed 32-bit values, which are sign-extended
to 64 bits in 64-bit operations; the exception is
876*46439007SCharles.Forsyth.CW MOVQ ,
877*46439007SCharles.Forsythwhich allows 64-bit literals.
878*46439007SCharles.ForsythMMX registers are
879*46439007SCharles.Forsyth.CW M0
880*46439007SCharles.Forsythto
881*46439007SCharles.Forsyth.CW M7 ,
882*46439007SCharles.Forsythand
883*46439007SCharles.ForsythXMM registers are
884*46439007SCharles.Forsyth.CW X0
885*46439007SCharles.Forsythto
886*46439007SCharles.Forsyth.CW X15 .
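.PP
The rules for operand width and literals can be sketched as follows
(the values and registers are arbitrary):
.P1
	MOVQ	$0x123456789abcdef0, AX	/* 64-bit literal: MOVQ only */
	MOVL	$1, AX	/* sets low 32 bits, clears top 32 */
	ADDQ	$-1, BX	/* 32-bit literal, sign-extended */
	MOVQ	AX, R8	/* one of the extra registers */
.P2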
887*46439007SCharles.Forsyth.PP
888*46439007SCharles.ForsythThere are many new instructions, including the MMX and XMM media instructions,
889*46439007SCharles.Forsythand conditional move instructions.
890*46439007SCharles.ForsythAs with the 386 instruction names,
891*46439007SCharles.Forsythall new 64-bit integer instructions, and the MMX and XMM instructions
892*46439007SCharles.Forsythuniformly use
893*46439007SCharles.Forsyth.CW L
894*46439007SCharles.Forsythfor `long word' (32 bits) and
895*46439007SCharles.Forsyth.CW Q
896*46439007SCharles.Forsythfor `quad word' (64 bits).
897*46439007SCharles.ForsythSome instructions use
898*46439007SCharles.Forsyth.CW O
899*46439007SCharles.Forsyth(`octword') for 128-bit values, where the processor handbook
900*46439007SCharles.Forsythvariously uses
901*46439007SCharles.Forsyth.CW O
902*46439007SCharles.Forsythor
903*46439007SCharles.Forsyth.CW DQ .
904*46439007SCharles.ForsythThe assembler also consistently uses
905*46439007SCharles.Forsyth.CW PL
906*46439007SCharles.Forsythfor `packed long' in
907*46439007SCharles.ForsythXMM instructions, instead of
908*46439007SCharles.Forsyth.CW Q ,
909*46439007SCharles.Forsyth.CW DQ
910*46439007SCharles.Forsythor
911*46439007SCharles.Forsyth.CW PI .
912*46439007SCharles.ForsythEither
913*46439007SCharles.Forsyth.CW MOVL
914*46439007SCharles.Forsythor
915*46439007SCharles.Forsyth.CW MOVQ
916*46439007SCharles.Forsythcan be used to move values to and from control registers, even when
917*46439007SCharles.Forsyththe registers might be 64 bits.
918*46439007SCharles.ForsythThe assembler often accepts the handbook's name to ease conversion
919*46439007SCharles.Forsythof existing code (but remember that the operand order is uniformly
920*46439007SCharles.Forsythsource then destination).
921*46439007SCharles.Forsyth.PP
922*46439007SCharles.ForsythC's
923*46439007SCharles.Forsyth.CW "long long"
924*46439007SCharles.Forsythtype is 64 bits, but passed and returned by value, not by reference.
925*46439007SCharles.ForsythMore notably, C pointer values are 64 bits, and thus
926*46439007SCharles.Forsyth.CW "long long"
927*46439007SCharles.Forsythand
928*46439007SCharles.Forsyth.CW "unsigned long long"
929*46439007SCharles.Forsythare the only integer types wide enough to hold a pointer value.
930*46439007SCharles.ForsythThe C compiler and library use the XMM floating-point instructions, not
931*46439007SCharles.Forsyththe old 387 ones, although the latter are implemented by assembler and loader.
932*46439007SCharles.ForsythThe compiler provides external registers,
933*46439007SCharles.Forsythallocated from
934*46439007SCharles.Forsyth.CW R15
935*46439007SCharles.Forsythdown.
936*46439007SCharles.Forsyth.PP
937*46439007SCharles.ForsythThe calling conventions are different from the 386.
938*46439007SCharles.Forsyth.CW CALL
939*46439007SCharles.Forsythpushes, and
940*46439007SCharles.Forsyth.CW RET
941*46439007SCharles.Forsythpops a 64-bit return address on the stack.
The first argument, if it is an integer or pointer, is passed in register
.CW BP
(it can be referred to in assembly code by the pseudonym
.CW RARG ).
946*46439007SCharles.Forsyth.CW AX
947*46439007SCharles.Forsythholds the return value from subroutines as before.
948*46439007SCharles.ForsythFloating-point results are returned in
949*46439007SCharles.Forsyth.CW X0 ,
although currently a floating-point first parameter is not passed in a register.
All parameters less than 8 bytes in length have 8-byte slots reserved on the stack
to preserve alignment and simplify variable-length argument list access,
including the first parameter when passed in a register,
although bytes 4 to 7 are not initialized.
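.PP
Under these conventions a leaf routine taking one 64-bit integer might be
sketched as below.
The name
.CW addone
and the zero frame size are assumptions made for the example, not
part of any library:
.P1
TEXT	addone(SB), $0
	MOVQ	RARG, AX	/* first argument arrives in BP */
	ADDQ	$1, AX
	RET	/* result is returned in AX */
.P2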
955*46439007SCharles.Forsyth.PP
956*46439007SCharles.ForsythThe assembler assumes 64-bit mode unless a
957*46439007SCharles.Forsyth.CW MODE
958*46439007SCharles.Forsythpseudo-operation is given:
959*46439007SCharles.Forsyth.P1
960*46439007SCharles.Forsyth	MODE $32
961*46439007SCharles.Forsyth.P2
962*46439007SCharles.Forsythto change to 32-bit mode.
963*46439007SCharles.ForsythThe effect is mainly to diagnose instructions that are illegal in
964*46439007SCharles.Forsyththe given mode, but the loader will also assume 32-bit operands and addresses,
965*46439007SCharles.Forsythand 32-bit PC values for call and return.
966*46439007SCharles.Forsyth.SH
967*46439007SCharles.ForsythAlpha
968*46439007SCharles.Forsyth.PP
969*46439007SCharles.ForsythOn the Alpha, all registers are 64 bits.  The architecture handles 32-bit values
970*46439007SCharles.Forsythby giving them a canonical format (sign extension in the case of integer registers).
971*46439007SCharles.ForsythRegisters are numbered
972*46439007SCharles.Forsyth.CW R0
973*46439007SCharles.Forsyththrough
974*46439007SCharles.Forsyth.CW R31 .
975*46439007SCharles.Forsyth.CW R0
976*46439007SCharles.Forsythholds the return value from subroutines, and also the first parameter.
977*46439007SCharles.Forsyth.CW R30
978*46439007SCharles.Forsythis the stack pointer,
979*46439007SCharles.Forsyth.CW R29
980*46439007SCharles.Forsythis the static base,
981*46439007SCharles.Forsyth.CW R26
982*46439007SCharles.Forsythis the link register, and
983*46439007SCharles.Forsyth.CW R27
984*46439007SCharles.Forsythand
985*46439007SCharles.Forsyth.CW R28
986*46439007SCharles.Forsythare linker temporaries.
987*46439007SCharles.Forsyth.PP
988*46439007SCharles.ForsythFloating point registers are numbered
989*46439007SCharles.Forsyth.CW F0
990*46439007SCharles.Forsythto
991*46439007SCharles.Forsyth.CW F31 .
992*46439007SCharles.Forsyth.CW F28
993*46439007SCharles.Forsythcontains
994*46439007SCharles.Forsyth.CW 0.5 ,
995*46439007SCharles.Forsyth.CW F29
996*46439007SCharles.Forsythcontains
997*46439007SCharles.Forsyth.CW 1.0 ,
998*46439007SCharles.Forsythand
999*46439007SCharles.Forsyth.CW F30
1000*46439007SCharles.Forsythcontains
1001*46439007SCharles.Forsyth.CW 2.0 .
1002*46439007SCharles.Forsyth.CW F31
1003*46439007SCharles.Forsythis always
1004*46439007SCharles.Forsyth.CW 0.0
1005*46439007SCharles.Forsython the Alpha.
1006*46439007SCharles.Forsyth.PP
1007*46439007SCharles.ForsythThe extension character for
1008*46439007SCharles.Forsyth.CW MOV
1009*46439007SCharles.Forsythfollows DEC's notation:
1010*46439007SCharles.Forsyth.CW B
1011*46439007SCharles.Forsythfor byte (8 bits),
1012*46439007SCharles.Forsyth.CW W
1013*46439007SCharles.Forsythfor word (16 bits),
1014*46439007SCharles.Forsyth.CW L
1015*46439007SCharles.Forsythfor long (32 bits),
1016*46439007SCharles.Forsythand
1017*46439007SCharles.Forsyth.CW Q
1018*46439007SCharles.Forsythfor quadword (64 bits).
1019*46439007SCharles.ForsythByte and ``word'' loads and stores may be made unsigned
1020*46439007SCharles.Forsythby appending a
1021*46439007SCharles.Forsyth.CW U .
1022*46439007SCharles.Forsyth.CW S
1023*46439007SCharles.Forsythand
1024*46439007SCharles.Forsyth.CW T
1025*46439007SCharles.Forsythrefer to IEEE floating point single precision (32 bits) and double precision (64 bits), respectively.
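.PP
For example (a sketch with arbitrary registers and offsets, assuming the
extension characters combine with
.CW MOV
as just described):
.P1
	MOVQ	0(R1), R2	/* 64-bit load */
	MOVL	R2, 8(R1)	/* 32-bit store */
	MOVBU	0(R1), R3	/* unsigned byte load */
	MOVT	F1, 16(R1)	/* double-precision store */
.P2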
1026*46439007SCharles.Forsyth.SH
1027*46439007SCharles.ForsythPowerPC
1028*46439007SCharles.Forsyth.PP
1029*46439007SCharles.ForsythThe PowerPC follows the Plan 9 model set by the MIPS and SPARC,
1030*46439007SCharles.Forsythnot the elaborate ABIs.
1031*46439007SCharles.ForsythThe 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
1032*46439007SCharles.Forsyththere is no support for the older POWER instructions.
1033*46439007SCharles.ForsythRegisters are
1034*46439007SCharles.Forsyth.CW R0
1035*46439007SCharles.Forsyththrough
1036*46439007SCharles.Forsyth.CW R31 .
1037*46439007SCharles.Forsyth.CW R0
1038*46439007SCharles.Forsythis initialized to zero; this is done by C start up code
1039*46439007SCharles.Forsythand assumed by the compiler and loader.
1040*46439007SCharles.Forsyth.CW R1
1041*46439007SCharles.Forsythis the stack pointer.
1042*46439007SCharles.Forsyth.CW R2
1043*46439007SCharles.Forsythis the static base register, with value the address of
1044*46439007SCharles.Forsyth.CW setSB(SB) .
1045*46439007SCharles.Forsyth.CW R3
1046*46439007SCharles.Forsythis the return register and also the register holding the first
1047*46439007SCharles.Forsythargument to a C function, with space reserved at
1048*46439007SCharles.Forsyth.CW 0(FP)
1049*46439007SCharles.Forsythas on the MIPS.
1050*46439007SCharles.Forsyth.CW R31
1051*46439007SCharles.Forsythis the loader temporary.
1052*46439007SCharles.ForsythThe external registers in Plan 9's C are allocated from
1053*46439007SCharles.Forsyth.CW R30
1054*46439007SCharles.Forsythdown.
1055*46439007SCharles.Forsyth.PP
1056*46439007SCharles.ForsythFloating point registers are called
1057*46439007SCharles.Forsyth.CW F0
1058*46439007SCharles.Forsyththrough
1059*46439007SCharles.Forsyth.CW F31 .
1060*46439007SCharles.ForsythBy convention, several registers are initialized
1061*46439007SCharles.Forsythto specific values; this is done by the operating system.
1062*46439007SCharles.Forsyth.CW F27
1063*46439007SCharles.Forsythmust be initialized to the value
1064*46439007SCharles.Forsyth.CW 0x4330000080000000
1065*46439007SCharles.Forsyth(used by float-to-int conversion),
1066*46439007SCharles.Forsyth.CW F28
1067*46439007SCharles.Forsythto the value 0.0,
1068*46439007SCharles.Forsyth.CW F29
1069*46439007SCharles.Forsythto 0.5,
1070*46439007SCharles.Forsyth.CW F30
1071*46439007SCharles.Forsythto 1.0, and
1072*46439007SCharles.Forsyth.CW F31
1073*46439007SCharles.Forsythto 2.0.
1074*46439007SCharles.Forsyth.PP
1075*46439007SCharles.ForsythAs on the MIPS and SPARC, the assembler accepts arbitrary literals
1076*46439007SCharles.Forsythas operands to
1077*46439007SCharles.Forsyth.CW MOVW ,
1078*46439007SCharles.Forsythand also to
1079*46439007SCharles.Forsyth.CW ADD
1080*46439007SCharles.Forsythand others where `immediate' variants exist,
1081*46439007SCharles.Forsythand the loader generates sequences
1082*46439007SCharles.Forsythof
1083*46439007SCharles.Forsyth.CW addi ,
1084*46439007SCharles.Forsyth.CW addis ,
1085*46439007SCharles.Forsyth.CW oris ,
1086*46439007SCharles.Forsythetc. as required.
1087*46439007SCharles.ForsythThe register indirect addressing modes use the same syntax as the SPARC,
1088*46439007SCharles.Forsythincluding double indexing when allowed.
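.PP
A sketch of these forms, with arbitrary constants and registers; the
instruction sequences the loader actually emits are not shown and may differ
with the value of the literal:
.P1
	MOVW	$0x12345678, R4	/* literal too big for one instruction */
	ADD	$0x12345, R4	/* likewise for an immediate add */
	MOVW	(R4+R5), R6	/* double-indexed load */
.P2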
1089*46439007SCharles.Forsyth.PP
1090*46439007SCharles.ForsythThe instruction names are generally derived from the Motorola ones,
1091*46439007SCharles.Forsythsubject to slight transformation:
1092*46439007SCharles.Forsyththe
1093*46439007SCharles.Forsyth.CW . ' `
1094*46439007SCharles.Forsythmarking the setting of condition codes is replaced by
1095*46439007SCharles.Forsyth.CW CC ,
1096*46439007SCharles.Forsythand when the letter
1097*46439007SCharles.Forsyth.CW o ' `
1098*46439007SCharles.Forsythrepresents `OE=1' it is replaced by
1099*46439007SCharles.Forsyth.CW V .
1100*46439007SCharles.ForsythThus
1101*46439007SCharles.Forsyth.CW add ,
1102*46439007SCharles.Forsyth.CW addo.
1103*46439007SCharles.Forsythand
1104*46439007SCharles.Forsyth.CW subfzeo.
1105*46439007SCharles.Forsythbecome
1106*46439007SCharles.Forsyth.CW ADD ,
1107*46439007SCharles.Forsyth.CW ADDVCC
1108*46439007SCharles.Forsythand
1109*46439007SCharles.Forsyth.CW SUBFZEVCC .
1110*46439007SCharles.ForsythAs well as the three-operand conditional branch instruction
1111*46439007SCharles.Forsyth.CW BC ,
1112*46439007SCharles.Forsyththe assembler provides pseudo-instructions for the common cases:
1113*46439007SCharles.Forsyth.CW BEQ ,
1114*46439007SCharles.Forsyth.CW BNE ,
1115*46439007SCharles.Forsyth.CW BGT ,
1116*46439007SCharles.Forsyth.CW BGE ,
1117*46439007SCharles.Forsyth.CW BLT ,
1118*46439007SCharles.Forsyth.CW BLE ,
1119*46439007SCharles.Forsyth.CW BVC ,
1120*46439007SCharles.Forsythand
1121*46439007SCharles.Forsyth.CW BVS .
1122*46439007SCharles.ForsythThe unconditional branch instruction is
1123*46439007SCharles.Forsyth.CW BR .
1124*46439007SCharles.ForsythIndirect branches use
1125*46439007SCharles.Forsyth.CW "(CTR)"
1126*46439007SCharles.Forsythor
1127*46439007SCharles.Forsyth.CW "(LR)"
1128*46439007SCharles.Forsythas target.
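.PP
The naming rules and branch pseudo-instructions combine as in this sketch
(the labels and registers are invented for the example):
.P1
	ADDCC	R3, R4, R5	/* Motorola add. */
	BEQ	zero	/* taken when the sum is zero */
	ADDV	R3, R4, R5	/* Motorola addo */
	BR	done
zero:
	MOVW	$1, R5
done:
	MOVW	R5, R3
.P2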
1129*46439007SCharles.Forsyth.PP
1130*46439007SCharles.ForsythLoad or store operations are replaced by
1131*46439007SCharles.Forsyth.CW MOV
1132*46439007SCharles.Forsythvariants in the usual way:
1133*46439007SCharles.Forsyth.CW MOVW
1134*46439007SCharles.Forsyth(move word),
1135*46439007SCharles.Forsyth.CW MOVH
1136*46439007SCharles.Forsyth(move halfword with sign extension), and
1137*46439007SCharles.Forsyth.CW MOVB
1138*46439007SCharles.Forsyth(move byte with sign extension, a pseudo-instruction),
1139*46439007SCharles.Forsythwith unsigned variants
1140*46439007SCharles.Forsyth.CW MOVHZ
1141*46439007SCharles.Forsythand
1142*46439007SCharles.Forsyth.CW MOVBZ ,
1143*46439007SCharles.Forsythand byte-reversing
1144*46439007SCharles.Forsyth.CW MOVWBR
1145*46439007SCharles.Forsythand
1146*46439007SCharles.Forsyth.CW MOVHBR .
1147*46439007SCharles.Forsyth`Load or store with update' versions are
1148*46439007SCharles.Forsyth.CW MOVWU ,
1149*46439007SCharles.Forsyth.CW MOVHU ,
1150*46439007SCharles.Forsythand
1151*46439007SCharles.Forsyth.CW MOVBZU .
1152*46439007SCharles.ForsythLoad or store multiple is
1153*46439007SCharles.Forsyth.CW MOVMW .
1154*46439007SCharles.ForsythThe exceptions are the string instructions, which are
1155*46439007SCharles.Forsyth.CW LSW
1156*46439007SCharles.Forsythand
1157*46439007SCharles.Forsyth.CW STSW ,
1158*46439007SCharles.Forsythand the reservation instructions
1159*46439007SCharles.Forsyth.CW lwarx
1160*46439007SCharles.Forsythand
1161*46439007SCharles.Forsyth.CW stwcx. ,
1162*46439007SCharles.Forsythwhich are
1163*46439007SCharles.Forsyth.CW LWAR
1164*46439007SCharles.Forsythand
1165*46439007SCharles.Forsyth.CW STWCCC ,
1166*46439007SCharles.Forsythall with operands in the usual data-flow order.
1167*46439007SCharles.ForsythFloating-point load or store instructions are
1168*46439007SCharles.Forsyth.CW FMOVD ,
1169*46439007SCharles.Forsyth.CW FMOVDU ,
1170*46439007SCharles.Forsyth.CW FMOVS ,
1171*46439007SCharles.Forsythand
1172*46439007SCharles.Forsyth.CW FMOVSU .
1173*46439007SCharles.ForsythThe register to register move instructions
1174*46439007SCharles.Forsyth.CW fmr
1175*46439007SCharles.Forsythand
1176*46439007SCharles.Forsyth.CW fmr.
1177*46439007SCharles.Forsythare written
1178*46439007SCharles.Forsyth.CW FMOVD
1179*46439007SCharles.Forsythand
1180*46439007SCharles.Forsyth.CW FMOVDCC .
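.PP
Some representative uses, as a sketch with arbitrary registers and offsets:
.P1
	MOVW	0(R3), R4	/* load word */
	MOVBZ	4(R3), R5	/* load byte, zero-extended */
	MOVWU	R4, -4(R6)	/* store word with update */
	LWAR	(R3), R7	/* lwarx */
	STWCCC	R7, (R3)	/* stwcx. */
	FMOVD	F1, F2	/* fmr */
.P2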
1181*46439007SCharles.Forsyth.PP
1182*46439007SCharles.ForsythThe assembler knows the commonly used special purpose registers:
1183*46439007SCharles.Forsyth.CW CR ,
1184*46439007SCharles.Forsyth.CW CTR ,
1185*46439007SCharles.Forsyth.CW DEC ,
1186*46439007SCharles.Forsyth.CW LR ,
1187*46439007SCharles.Forsyth.CW MSR ,
1188*46439007SCharles.Forsythand
1189*46439007SCharles.Forsyth.CW XER .
1190*46439007SCharles.ForsythThe rest, which are often architecture-dependent, are referenced as
1191*46439007SCharles.Forsyth.CW SPR(n) .
1192*46439007SCharles.ForsythThe segment registers of the 60x series are similarly
1193*46439007SCharles.Forsyth.CW SEG(n) ,
1194*46439007SCharles.Forsythbut
1195*46439007SCharles.Forsyth.I n
1196*46439007SCharles.Forsythcan also be a register name, as in
1197*46439007SCharles.Forsyth.CW SEG(R3) .
1198*46439007SCharles.ForsythMoves between special purpose registers and general purpose ones,
1199*46439007SCharles.Forsythwhen allowed by the architecture,
1200*46439007SCharles.Forsythare written as
1201*46439007SCharles.Forsyth.CW MOVW ,
1202*46439007SCharles.Forsythreplacing
1203*46439007SCharles.Forsyth.CW mfcr ,
1204*46439007SCharles.Forsyth.CW mtcr ,
1205*46439007SCharles.Forsyth.CW mfmsr ,
1206*46439007SCharles.Forsyth.CW mtmsr ,
1207*46439007SCharles.Forsyth.CW mtspr ,
1208*46439007SCharles.Forsyth.CW mfspr ,
1209*46439007SCharles.Forsyth.CW mftb ,
1210*46439007SCharles.Forsythand many others.
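.PP
For example (a sketch; the general registers are arbitrary and the
number in
.CW SPR(272)
is chosen only for illustration):
.P1
	MOVW	LR, R5	/* from the link register */
	MOVW	R5, CTR	/* to the count register */
	MOVW	MSR, R6	/* from the machine state register */
	MOVW	SPR(272), R7	/* from a numbered special register */
.P2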
1211*46439007SCharles.Forsyth.PP
1212*46439007SCharles.ForsythThe fields of the condition register
1213*46439007SCharles.Forsyth.CW CR
1214*46439007SCharles.Forsythare referenced as
1215*46439007SCharles.Forsyth.CW CR(0)
1216*46439007SCharles.Forsyththrough
1217*46439007SCharles.Forsyth.CW CR(7) .
1218*46439007SCharles.ForsythThey are used by the
1219*46439007SCharles.Forsyth.CW MOVFL
1220*46439007SCharles.Forsyth(move field) pseudo-instruction,
1221*46439007SCharles.Forsythwhich produces
1222*46439007SCharles.Forsyth.CW mcrf
1223*46439007SCharles.Forsythor
1224*46439007SCharles.Forsyth.CW mtcrf .
1225*46439007SCharles.ForsythFor example:
1226*46439007SCharles.Forsyth.P1
1227*46439007SCharles.Forsyth	MOVFL	CR(3), CR(0)
1228*46439007SCharles.Forsyth	MOVFL	R3, CR(1)
1229*46439007SCharles.Forsyth	MOVFL	R3, $7, CR
1230*46439007SCharles.Forsyth.P2
1231*46439007SCharles.ForsythThey are also accepted in
1232*46439007SCharles.Forsyththe conditional branch instruction, for example
1233*46439007SCharles.Forsyth.P1
1234*46439007SCharles.Forsyth	BEQ	CR(7), label
1235*46439007SCharles.Forsyth.P2
1236*46439007SCharles.ForsythFields of the
1237*46439007SCharles.Forsyth.CW FPSCR
1238*46439007SCharles.Forsythare accessed using
1239*46439007SCharles.Forsyth.CW MOVFL
1240*46439007SCharles.Forsythin a similar way:
1241*46439007SCharles.Forsyth.P1
1242*46439007SCharles.Forsyth	MOVFL	FPSCR, F0
1243*46439007SCharles.Forsyth	MOVFL	F0, FPSCR
1244*46439007SCharles.Forsyth	MOVFL	F0, $7, FPSCR
1245*46439007SCharles.Forsyth	MOVFL	$0, FPSCR(3)
1246*46439007SCharles.Forsyth.P2
1247*46439007SCharles.Forsythproducing
1248*46439007SCharles.Forsyth.CW mffs ,
1249*46439007SCharles.Forsyth.CW mtfsf ,
1250*46439007SCharles.Forsythor
1251*46439007SCharles.Forsyth.CW mtfsfi
1252*46439007SCharles.Forsythas appropriate.
.SH
ARM
.PP
The assembler provides access to
.CW R0
through
.CW R14
and the
.CW PC .
The stack pointer is
.CW R13 ,
the link register is
.CW R14 ,
and the static base register is
.CW R12 .
.CW R0
is the return register and also the register holding
the first argument to a subroutine.
The assembler supports the
.CW CPSR
and
.CW SPSR
registers.
It also knows about coprocessor registers
.CW C0
through
.CW C15 .
Floating registers are
.CW F0
through
.CW F7 ,
.CW FPSR
and
.CW FPCR .
.PP
As with the other architectures, loads and stores are called
.CW MOV ,
e.g.
.CW MOVW
for load word or store word, and
.CW MOVM
for
load or store multiple,
depending on the operands.
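.PP
As a sketch of the convention (the registers and offsets here are
arbitrary, chosen only for illustration), the direction of the transfer
follows from which operand is the memory reference:
.P1
	MOVW	4(R1), R2	/* load: word at R1+4 into R2 */
	MOVW	R2, 8(R1)	/* store: R2 into word at R1+8 */
.P2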
.PP
Addressing modes are supported by suffixes to the instructions:
.CW .IA
(increment after),
.CW .IB
(increment before),
.CW .DA
(decrement after), and
.CW .DB
(decrement before).
These can only be used with the
.CW MOV
instructions.
The move multiple instruction,
.CW MOVM ,
defines a range of registers using brackets, e.g.
.CW [R0-R12] .
The special
.CW MOVM
addressing mode bits
.CW W ,
.CW U ,
and
.CW P
are written in the same manner, for example,
.CW MOVM.DB.W .
A
.CW .S
suffix allows a
.CW MOVM
instruction to access user
.CW R13
and
.CW R14
when in another processor mode.
Shifts and rotates in addressing modes are supported by binary operators
.CW <<
(logical left shift),
.CW >>
(logical right shift),
.CW ->
(arithmetic right shift), and
.CW @>
(rotate right); for example
.CW "R7>>R2"
or
.CW "R2@>2" .
The assembler does not support indexing by a shifted expression;
only names can be doubly indexed.
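.PP
The following lines sketch how these suffixes and operators might appear
in practice; the register choices are arbitrary and the sequence is
illustrative rather than taken from any particular program:
.P1
	MOVM.DB.W	[R0-R12], (R13)	/* store R0-R12 below R13, updating R13 */
	MOVM.IA.W	(R13), [R0-R12]	/* reload them, updating R13 again */
	ADD	R1<<2, R0	/* R0 += R1 shifted left by 2 */
	MOVW	R7>>R2, R3	/* R3 = R7 shifted right logically by R2 */
.P2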
.PP
Any instruction can be followed by a suffix that makes the instruction conditional:
.CW .EQ ,
.CW .NE ,
and so on, as in the ARM manual, with synonyms
.CW .HS
(for
.CW .CS )
and
.CW .LO
(for
.CW .CC ),
for example
.CW ADD.NE .
Arithmetic
and logical instructions
can have a
.CW .S
suffix, as ARM allows, to set condition codes.
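.PP
A short sketch, with arbitrary registers and a hypothetical label
.CW loop ,
of how the conditional and
.CW .S
suffixes might be combined:
.P1
loop:
	CMP	$0, R1	/* compare R1 with zero */
	MOVW.EQ	$1, R0	/* if equal, R0 = 1 */
	ADD.NE	R1, R0	/* otherwise R0 += R1 */
	SUB.S	$1, R2	/* R2 -= 1, setting the condition codes */
	B.NE	loop	/* repeat while R2 is non-zero */
.P2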
.PP
The syntax of the
.CW MCR
and
.CW MRC
coprocessor instructions is largely as in the manual, with the usual adjustments.
The assembler directly supports only the ARM floating-point coprocessor
operations used by the compiler:
.CW CMP ,
.CW ADD ,
.CW SUB ,
.CW MUL ,
and
.CW DIV ,
all with
.CW F
or
.CW D
suffix selecting single or double precision.
Floating-point loads and stores become
.CW MOVF
and
.CW MOVD .
Conversion instructions are also specified by moves:
.CW MOVWD ,
.CW MOVWF ,
.CW MOVDW ,
.CW MOVFW ,
.CW MOVFD ,
and
.CW MOVDF .
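.PP
For illustration only, a sequence using these forms might read as follows;
the registers and offsets are arbitrary:
.P1
	MOVWD	R0, F0	/* convert the integer in R0 to double in F0 */
	MOVD	8(R1), F1	/* load a double from R1+8 */
	ADDD	F1, F0	/* F0 += F1, double precision */
	MOVDF	F0, F2	/* convert double to single */
	MOVF	F2, 4(R1)	/* store a single at R1+4 */
.P2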
1395