<html>

<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>SLJIT tutorial</title>

  <style type="text/css">
    body {
      background-color: #707070;
      color: #000000;
      font-family: "garamond"
    }
    td.main {
      background-color: #ffffff;
      color: #000000;
      font-family: "garamond"
    }
  </style>
</head>

<body>

<center>
<table width="760" cellspacing=0 cellpadding=0>
<tr height=20><td width=20 class="main"></td><td width=720 class="main"></td><td width=20 class="main"></td></tr>
<tr><td width=20 class="main"></td><td width=720 class="main">

<center>
<a href="http://sourceforge.net"><img src="http://sflogo.sourceforge.net/sflogo.php?group_id=248047&amp;type=2" width="125" height="37" border="0" alt="SourceForge.net Logo" /></a>
</center>
<h1><center>SLJIT tutorial</center></h1>

<h2>Before you start</h2>

<a href="">Download the tutorial sources</a><br>
<br>
SLJIT is a lightweight, platform-independent JIT compiler that is easy to
embed in your own project. Because it is 'stack-less', SLJIT places
some limits on register usage.<br>
<br>
Here are some other JIT compilers I have looked into recently, listed here in case you are interested:<br>

<ul>
  <b>Libjit/liblightning:</b> - the backend of GNU.net<br>
  <b>Libgccjit:</b> - introduced in GCC 5. It differs from the other JIT libraries:
                    using it feels like constructing C code, and it uses the GCC backend.<br>
  <b>AsmJIT:</b> - branched from the famous V8 project (the JavaScript engine in Chrome),
                   supports only X86/X86_64.<br>
  <b>DynASM:</b> - used in LuaJIT.<br>
</ul>

<br>
AsmJIT and DynASM work at the instruction level, so using them looks like coding in assembly language.
SLJIT also looks like assembly, but it hides the details of the specific CPU, which makes it more
general and portable. Libjit works at a higher layer, and with libgccjit, as mentioned,
you are really constructing C code.<br>

<h2>First program</h2>

Usage of SLJIT:
<ul>
1. #include "sljitLir.h" at the top of your C/C++ program<br>
2. Compile with sljit_src/sljitLir.c<br>
</ul>

All examples can be compiled like this:
<ul>
gcc -Wall -Ipath/to/sljit_src -DSLJIT_CONFIG_AUTO=1 \<br>
  <ul><b>xxx.c</b> path/to/sljit_src/sljitLir.c -o program</ul>
</ul>

OK, let's take a look at the first program. In this program we create a function that
returns the sum of its 3 arguments.<br>
<br>
<div style='font-family:Courier New;font-size:11px'>
<ul>
#include "sljitLir.h"<br>
 <br>
#include &lt;stdio.h&gt;<br>
#include &lt;stdlib.h&gt;<br>
 <br>
typedef sljit_sw (*func3_t)(sljit_sw a, sljit_sw b, sljit_sw c);<br>
 <br>
static int add3(sljit_sw a, sljit_sw b, sljit_sw c)<br>
{<br>
   <ul>
    void *code;<br>
    sljit_uw len;<br>
    func3_t func;<br>
   <br>
    /* Create a SLJIT compiler */<br>
    struct sljit_compiler *C = sljit_create_compiler();<br>
   <br>
    /* Start a context (function entry) with 3 arguments; the parameters are discussed later */<br>
    sljit_emit_enter(C, 0,  3,  1, 3, 0, 0, 0);<br>
   <br>
    /* The first argument of the function is in register SLJIT_S0, the second in SLJIT_S1, etc. */<br>
    /* R0 = first */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_R0, 0, SLJIT_S0, 0);<br>
   <br>
    /* R0 = R0 + second */<br>
    sljit_emit_op2(C, SLJIT_ADD, SLJIT_R0, 0, SLJIT_R0, 0, SLJIT_S1, 0);<br>
   <br>
    /* R0 = R0 + third */<br>
    sljit_emit_op2(C, SLJIT_ADD, SLJIT_R0, 0, SLJIT_R0, 0, SLJIT_S2, 0);<br>
   <br>
    /* This statement moves R0 to the return register and returns */<br>
    /* (in fact, R0 is the return register itself) */<br>
    sljit_emit_return(C, SLJIT_MOV, SLJIT_R0, 0);<br>
   <br>
    /* Generate machine code */<br>
    code = sljit_generate_code(C);<br>
    len = sljit_get_generated_code_size(C);<br>
   <br>
    /* Execute code */<br>
    func = (func3_t)code;<br>
    printf("func return %ld\n", (long)func(a, b, c));<br>
   <br>
    /* dump_code(code, len); */<br>
   <br>
    /* Clean up */<br>
    sljit_free_compiler(C);<br>
    sljit_free_code(code);<br>
    return 0;<br>
   </ul>
}<br>
 <br>
int main(void)<br>
{<br>
   <ul>
    return add3(4, 5, 6);<br>
   </ul>
}<br>
</ul>
</div>

<br>
The function sljit_emit_enter creates a context: it saves some registers to the stack
and creates a call frame. sljit_emit_return restores the saved registers and cleans up
the frame. SLJIT is designed to be embedded into other applications, so the code it generates
has to follow some basic rules.<br>
<br>
These rules are called the Application Binary Interface, or ABI for short. Here is the
document for X86_64 CPUs (<a href="http://www.x86-64.org/documentation/abi.pdf">ABI.pdf</a>);
almost all Linux/Unix systems follow this standard. MS Windows has its own; read this for more:
<a href="http://en.wikipedia.org/wiki/X86_calling_conventions">X86_calling_conventions</a><br>
<br>
When I first read the documentation of sljit_emit_enter, the parameters 'saveds' and 'scratches'
confused me. The fact is that CPU registers have different roles in the ABI specification:
some are used to pass arguments, some are 'callee-saved', and some are
'temporaries'. Take X86_64 for example: RAX, R10 and R11 are temporaries, which means
their values may have changed after a call instruction. RBX and R12-R15 are callee-saved, so
they keep the same values across the call. The rule is that every function must save
those registers before using them.<br>
<br>
Fortunately, SLJIT does most of the work for us: SLJIT_S[0-9] represent those 'safe'
registers, whereas SLJIT_R[0-9] are only for temporary use.<br>
<br>
When a function starts, SLJIT moves the function arguments to the S0, S1 and S2 registers,
which means function arguments are always 'safe' within the context. Because SLJIT does not use
the stack for passing arguments, it supports at most 3 arguments.<br>
<br>
sljit_emit_opX is easy to understand. In SLJIT a data value is represented by 2
parameters; it can be a register, in-memory data, or an immediate number.<br>
<br>

<table align="center" cellspacing="0">
<tr><td>First parameter</td>	<td>Second parameter</td>	<td>Meaning</td></tr>
<tr><td>SLJIT_R*, SLJIT_S*</td>	<td>0</td>	<td>Temporary/saved registers</td></tr>
<tr><td>SLJIT_IMM</td>	<td>Number</td>	<td>Immediate number</td></tr>
<tr><td>SLJIT_MEM0()</td>	<td>Address</td>	<td>In-memory data at an absolute address</td></tr>
<tr><td>SLJIT_MEM1(r)</td>	<td>Offset</td>	<td>In-memory data at [r + offset]</td></tr>
<tr><td>SLJIT_MEM2(r1, r2)</td>	<td>Shift (size)</td>	<td>In-memory array element at [r1 + (r2 &lt;&lt; shift)]: r1 is the base address, r2 the index, <br>
								and shift the log2 of the element size (0 for 1 byte, 1 for 2 bytes, 2 for <br>
								4 bytes, 3 for 8 bytes)</td></tr>
</table>
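<br>
To make the table more concrete, here is a minimal sketch in plain C (it does not use SLJIT) of how such a pair of parameters could be resolved to a value. The names operand_kind, machine and read_operand are hypothetical and exist only for this illustration; SLJIT encodes the operand kind differently, and the two-register array form is left out here.<br>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical operand kinds, mirroring rows of the table above. */
enum operand_kind { OP_REG, OP_IMM, OP_MEM0, OP_MEM1 };

/* A toy machine: a few word-sized registers. */
struct machine {
    intptr_t reg[4];
};

/* Read one word-sized operand described by (kind, reg_index, w). */
static intptr_t read_operand(const struct machine *m, enum operand_kind kind,
                             int reg_index, intptr_t w)
{
    intptr_t value = 0;

    switch (kind) {
    case OP_REG:  /* (register, 0): the register itself */
        assert(w == 0);
        return m->reg[reg_index];
    case OP_IMM:  /* (immediate, number): the number itself */
        return w;
    case OP_MEM0: /* (absolute memory, address): word stored at that address */
        memcpy(&value, (const void *)w, sizeof(value));
        return value;
    case OP_MEM1: /* (memory via r, offset): word stored at [r + offset] */
        memcpy(&value, (const char *)m->reg[reg_index] + w, sizeof(value));
        return value;
    }
    return value;
}
```

In this model the second parameter is a plain number whose meaning (nothing, a constant, an address or an offset) depends entirely on the first parameter, which is the point of the table.<br>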

<h2>Branch</h2>
<div style='font-family:Courier New;font-size:11px'>
<ul>
#include "sljitLir.h"<br>
 <br>
#include &lt;stdio.h&gt;<br>
#include &lt;stdlib.h&gt;<br>
 <br>
typedef sljit_sw (*func3_t)(sljit_sw a, sljit_sw b, sljit_sw c);<br>
 <br>
/*<br>
 In this example we generate a function like this:<br>
 <br>
sljit_sw func(sljit_sw a, sljit_sw b, sljit_sw c)<br>
{<br>
    <ul>
    if ((a & 1) == 0)<br>
    <ul>
        return c;<br>
    </ul>
    return b;<br>
</ul>
}<br>
 <br>
 */<br>
static int branch(sljit_sw a, sljit_sw b, sljit_sw c)<br>
{<br>
   <ul>
    void *code;<br>
    sljit_uw len;<br>
    func3_t func;<br>
   <br>
    struct sljit_jump *ret_c;<br>
    struct sljit_jump *out;<br>
   <br>
    /* Create a SLJIT compiler */<br>
    struct sljit_compiler *C = sljit_create_compiler();<br>
   <br>
    /* 3 args, 1 temp reg, 3 saved regs */<br>
    sljit_emit_enter(C, 0,  3,  1, 3, 0, 0, 0);<br>
   <br>
    /* R0 = a & 1, S0 is argument a */<br>
    sljit_emit_op2(C, SLJIT_AND, SLJIT_R0, 0, SLJIT_S0, 0, SLJIT_IMM, 1);<br>
   <br>
    /* if R0 == 0 then jump to ret_c; the position of ret_c is assigned later */<br>
    ret_c = sljit_emit_cmp(C, SLJIT_EQUAL, SLJIT_R0, 0, SLJIT_IMM, 0);<br>
   <br>
    /* R0 = b, S1 is argument b */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_RETURN_REG, 0, SLJIT_S1, 0);<br>
   <br>
    /* jump to out */<br>
    out = sljit_emit_jump(C, SLJIT_JUMP);<br>
   <br>
    /* this is where 'ret_c' should jump to: emit a label and bind ret_c to it */<br>
    sljit_set_label(ret_c, sljit_emit_label(C));<br>
   <br>
    /* R0 = c, S2 is argument c */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_RETURN_REG, 0, SLJIT_S2, 0);<br>
   <br>
    /* this is where 'out' should jump to */<br>
    sljit_set_label(out, sljit_emit_label(C));<br>
   <br>
    /* end of function */<br>
    sljit_emit_return(C, SLJIT_MOV, SLJIT_RETURN_REG, 0);<br>
   <br>
    /* Generate machine code */<br>
    code = sljit_generate_code(C);<br>
    len = sljit_get_generated_code_size(C);<br>
   <br>
    /* Execute code */<br>
    func = (func3_t)code;<br>
    printf("func return %ld\n", (long)func(a, b, c));<br>
   <br>
    /* dump_code(code, len); */<br>
   <br>
    /* Clean up */<br>
    sljit_free_compiler(C);<br>
    sljit_free_code(code);<br>
    return 0;<br>
</ul>
}<br>
 <br>
int main(void)<br>
{<br>
<ul>
    return branch(4, 5, 6);<br>
</ul>
}<br>
</ul>
</div>

The key to implementing a branch is 'struct sljit_jump' and 'struct sljit_label'.
A 'jump' holds a jump instruction, which does not know where to jump until
you set a label on it. A 'label' is a code address, just like a label in assembly
language.<br>
<br>
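The idea of a jump whose target is filled in later can be sketched in plain C with a toy instruction buffer. This is only an illustration of the technique (often called backpatching); all toy_* names are hypothetical, and this is not a description of how SLJIT works internally.<br>

```c
#include <assert.h>

enum { TOY_MAX = 16, TOY_UNSET = -1 };

/* Toy instruction set: just enough to show a forward jump. */
enum toy_op { TOY_NOP, TOY_JUMP, TOY_HALT };

struct toy_insn {
    enum toy_op op;
    int target;            /* index to jump to, TOY_UNSET until patched */
};

struct toy_code {
    struct toy_insn insn[TOY_MAX];
    int count;
};

/* Append an instruction and return its index. */
static int toy_emit(struct toy_code *c, enum toy_op op)
{
    assert(c->count < TOY_MAX);
    c->insn[c->count].op = op;
    c->insn[c->count].target = TOY_UNSET;
    return c->count++;
}

/* A "label" is simply the index of the next instruction to be emitted. */
static int toy_label(const struct toy_code *c)
{
    return c->count;
}

/* Bind a previously emitted jump to a label (the backpatch step). */
static void toy_set_label(struct toy_code *c, int jump, int label)
{
    assert(c->insn[jump].op == TOY_JUMP);
    c->insn[jump].target = label;
}

/* Run the toy code and return how many NOPs were executed. */
static int toy_run(const struct toy_code *c)
{
    int pc = 0, nops = 0;

    while (pc < c->count && c->insn[pc].op != TOY_HALT) {
        if (c->insn[pc].op == TOY_JUMP) {
            assert(c->insn[pc].target != TOY_UNSET);
            pc = c->insn[pc].target;
        } else {
            nops++;
            pc++;
        }
    }
    return nops;
}
```

Emitting NOP, JUMP, NOP, then binding the jump to a fresh label and emitting NOP, HALT makes the run skip the middle NOP: the jump was emitted before its destination existed and was patched afterwards.<br>
<br>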
sljit_emit_cmp and sljit_emit_jump generate a conditional and an unconditional jump, respectively.
Take this statement, for example:<br>
<ul>
ret_c = sljit_emit_cmp(C, SLJIT_EQUAL, SLJIT_R0, 0, SLJIT_IMM, 0);<br>
</ul>
It creates a jump instruction whose condition is that R0 equals 0; the
jump target is assigned later with the sljit_set_label statement.<br>
<br>
In this example, it creates a branch like this:<br>
<ul>
    <ul>
    R0 = a & 1;<br>
    if R0 == 0 then goto ret_c;<br>
    R0 = b;<br>
    goto out;<br>
    </ul>
ret_c:<br>
    <ul>
    R0 = c;<br>
    </ul>
out:<br>
    <ul>
    return R0;<br>
    </ul>
</ul>
<br>
This is how a high-level language compiler handles a branch.<br>
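<br>
For comparison, the same control flow can be written as ordinary C with goto, which compiles and runs without SLJIT. This is only a reference sketch: branch_ref is a hypothetical name, and long stands in for sljit_sw.<br>

```c
#include <assert.h>

/* Plain C with the same labels and jumps as the generated code. */
static long branch_ref(long a, long b, long c)
{
    long r0;

    r0 = a & 1;
    if (r0 == 0)
        goto ret_c;
    r0 = b;
    goto out;
ret_c:
    r0 = c;
out:
    return r0;
}

/* Usage: the same arguments as the tutorial's branch(4, 5, 6) call. */
static void branch_ref_selfcheck(void)
{
    assert(branch_ref(4, 5, 6) == 6);  /* a is even, so c is returned */
    assert(branch_ref(3, 5, 6) == 5);  /* a is odd, so b is returned */
}
```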
<br>

<h2>Loop</h2>

The loop example is similar to the branch example.

<div style='font-family:Courier New;font-size:11px'>
<ul>
/*<br>
 In this example we generate a function like this:<br>
 <br>
sljit_sw func(sljit_sw a, sljit_sw b)<br>
{<br>
<ul>
    sljit_sw i;<br>
    sljit_sw ret = 0;<br>
    for (i = 0; i &lt; a; ++i) {<br>
    <ul>
        ret += b;<br>
    </ul>
    }<br>
    return ret;<br>
</ul>
}<br>
*/<br>
<br>
<ul>
    struct sljit_label *loopstart;<br>
    struct sljit_jump *out;<br>
    <br>
    /* 2 args, 2 temp regs, 2 saved regs */<br>
    sljit_emit_enter(C, 0, 2, 2, 2, 0, 0, 0);<br>
    <br>
    /* R1 = 0 (the loop counter i) */<br>
    sljit_emit_op2(C, SLJIT_XOR, SLJIT_R1, 0, SLJIT_R1, 0, SLJIT_R1, 0);<br>
    /* RET = 0 */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_RETURN_REG, 0, SLJIT_IMM, 0);<br>
    /* loopstart: */<br>
    loopstart = sljit_emit_label(C);<br>
    /* R1 &gt;= a --> jump out (signed compare, because i and a are signed in the C version) */<br>
    out = sljit_emit_cmp(C, SLJIT_SIG_GREATER_EQUAL, SLJIT_R1, 0, SLJIT_S0, 0);<br>
    /* RET += b */<br>
    sljit_emit_op2(C, SLJIT_ADD, SLJIT_RETURN_REG, 0, SLJIT_RETURN_REG, 0, SLJIT_S1, 0);<br>
    /* R1 += 1 */<br>
    sljit_emit_op2(C, SLJIT_ADD, SLJIT_R1, 0, SLJIT_R1, 0, SLJIT_IMM, 1);<br>
    /* jump loopstart */<br>
    sljit_set_label(sljit_emit_jump(C, SLJIT_JUMP), loopstart);<br>
    /* out: */<br>
    sljit_set_label(out, sljit_emit_label(C));<br>
    <br>
    /* return RET */<br>
    sljit_emit_return(C, SLJIT_MOV, SLJIT_RETURN_REG, 0);<br>
</ul>
</ul>
</div>
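<br>
As a cross-check, here is the C function from the comment above rewritten with the two labels used by the JIT code. It is a plain C sketch that does not need SLJIT; loop_ref is a hypothetical name and long stands in for sljit_sw.<br>

```c
#include <assert.h>

static long loop_ref(long a, long b)
{
    long r1 = 0;    /* loop counter, R1 in the JIT code */
    long ret = 0;   /* accumulator, SLJIT_RETURN_REG in the JIT code */

loopstart:
    if (r1 >= a)    /* signed comparison, as in "i < a" */
        goto out;
    ret += b;
    r1 += 1;
    goto loopstart;
out:
    return ret;
}

/* Usage: adding b to the result a times is a * b for non-negative a. */
static void loop_ref_selfcheck(void)
{
    assert(loop_ref(3, 4) == 12);
    assert(loop_ref(0, 4) == 0);
    assert(loop_ref(-1, 4) == 0);  /* the loop body never runs */
}
```
<br>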

After this example, you are ready to construct any program that contains complex branches
and loops.<br>
<br>
Here is an interesting fact: 'xor reg, reg' is preferable to 'mov reg, 0' because its encoding
is shorter on X86 machines (for example, 'xor eax, eax' takes 2 bytes, while 'mov eax, 0' takes 5).<br>
<br>
I will give only the key code in the rest of this tutorial; the full source of each
chapter can be found in the attachment.<br>


<h2>Call external function</h2>

It's easy to call an external function in SLJIT: we use sljit_emit_ijump with a SLJIT_CALL*
operation to do so.<br>
<br>
SLJIT_CALL[N] is used to call a function with N arguments. SLJIT has only SLJIT_CALL0,
CALL1, CALL2 and CALL3, which means you can call a function with at most 3 arguments (that
disappoints me: no chance to call fwrite from SLJIT). The arguments for the callee
are passed in SLJIT_R0, R1 and R2. Keep in mind that these are 'temp registers', so save any
values you still need before the call.<br>
<br>
Assume that we have an external function:<br>
<ul>
    sljit_sw print_num(sljit_sw a);
</ul>

JIT code to call print_num(S1):

<div style='font-family:Courier New;font-size:11px'>
<ul>
    /* R0 = S1; */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_R0, 0, SLJIT_S1, 0);<br>
    /* print_num(R0) */<br>
    sljit_emit_ijump(C, SLJIT_CALL1, SLJIT_IMM, SLJIT_FUNC_OFFSET(print_num));<br>
</ul>
</div>
<br>
This code calls an immediate value (the address of print_num), which is resolved properly when the
program is loaded. That is no problem when you compile and execute in the same run, but if you plan
to save the generated code to a file and load and execute it later, the address may not be what you expect:
on platforms that support PIC, print_num may be relocated to a different
address at run time. See:
<a href="http://en.wikipedia.org/wiki/Position-independent_code">PIC</a><br>
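<br>
What "calling an immediate value" means can be shown in plain C, without SLJIT. The sketch below turns a function into an integer and calls through it again; double_num is a stand-in for print_num, and func_to_imm and call_imm are hypothetical helper names. The integer is only meaningful inside the process that computed it, which is why it cannot safely be stored in a file and reused.<br>

```c
#include <assert.h>
#include <stdint.h>

typedef long (*num_func_t)(long);

/* Stand-in for the tutorial's print_num; it returns its argument doubled
   so that the call can be checked. */
static long double_num(long a)
{
    return a * 2;
}

/* Turn a function into a plain integer, the way an immediate operand holds it.
   Converting a function pointer to an integer is implementation-defined in
   ISO C, but it works on the common platforms. */
static intptr_t func_to_imm(num_func_t f)
{
    return (intptr_t)f;
}

/* Call through the integer again, which is what an emitted call to an
   immediate address amounts to. */
static long call_imm(intptr_t imm, long arg)
{
    assert(imm != 0);
    return ((num_func_t)imm)(arg);
}
```

For example, call_imm(func_to_imm(double_num), 21) calls double_num(21).<br>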
<br>

<h2>Structure access</h2>

SLJIT uses SLJIT_MEM1 to implement [Reg + offset] memory access.<br>
<div style='font-family:Courier New;font-size:11px'>
<ul>
struct point_st {<br>
    <ul>
    sljit_sw x;<br>
    int y;<br>
    short z;<br>
    char d;<br>
    char e;<br>
    </ul>
};<br>
<br>
sljit_emit_op1(C, SLJIT_MOV_SI, SLJIT_R0, 0, SLJIT_MEM1(SLJIT_S0),<br>
<ul>
SLJIT_OFFSETOF(struct point_st, y));<br>
</ul>
</ul>
</div>

In this case, SLJIT_S0 holds the address of the point_st structure, and the offset of member 'y'
is determined at compile time. The important MOV operation comes with a
'signedness/size' suffix; here _SI means 'signed 32-bit integer'. The suffix
list:<br>
<ul>
   <b>UB</b> = unsigned byte (8 bit)<br>
   <b>SB</b> = signed byte (8 bit)<br>
   <b>UH</b> = unsigned half (16 bit)<br>
   <b>SH</b> = signed half (16 bit)<br>
   <b>UI</b> = unsigned int (32 bit)<br>
   <b>SI</b> = signed int (32 bit)<br>
   <b>P</b>  = pointer (sljit_p) size<br>
</ul>
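<br>
The effect of the signed and unsigned suffixes can be reproduced in plain C. The sketch below loads the int member at [base + offset] and widens it to a word, once with sign extension and once with zero extension. It assumes a 32-bit int; the load_* names are hypothetical, and long stands in for sljit_sw.<br>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct point_st {
    long x;      /* stands in for sljit_sw */
    int y;
    short z;
    char d;
    char e;
};

/* Load the int stored at [base + offset] and widen it to a word,
   keeping the sign (the plain C analogue of a signed 32-bit move). */
static long load_signed_int(const void *base, size_t offset)
{
    int tmp;

    assert(base != NULL);
    memcpy(&tmp, (const char *)base + offset, sizeof(tmp));
    return (long)tmp;
}

/* The same load, but zero-extended (the unsigned 32-bit flavour). */
static unsigned long load_unsigned_int(const void *base, size_t offset)
{
    unsigned int tmp;

    assert(base != NULL);
    memcpy(&tmp, (const char *)base + offset, sizeof(tmp));
    return (unsigned long)tmp;
}
```

With y set to -5, load_signed_int(&amp;p, offsetof(struct point_st, y)) gives -5 again, while the unsigned variant gives the same bit pattern read as a non-negative number.<br>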

<h2>Array accessing</h2>

SLJIT uses SLJIT_MEM2 to access arrays, like this:<br>

<div style='font-family:Courier New;font-size:11px'>
<ul>
sljit_emit_op1(C, SLJIT_MOV, SLJIT_R0, 0, SLJIT_MEM2(SLJIT_S0, SLJIT_S2),<br>
<ul>
SLJIT_WORD_SHIFT);
</ul>
</ul>
</div>

This statement generates code like this:<br>
<ul>
WORD S0[];<br>
R0 = S0[S2]<br>
</ul>
<br>
The array S0 is declared as WORD, which means each element is sizeof(sljit_sw) bytes long.
SLJIT uses a 'shift' to represent the element size (0 for a single byte, 1 for 2
bytes, 2 for 4 bytes, 3 for 8 bytes).<br>
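<br>
The address arithmetic behind the shift can be checked in plain C. In the sketch below, size_to_shift and load_indexed_word are hypothetical helper names, and intptr_t stands in for sljit_sw; the assumption is that SLJIT_WORD_SHIFT corresponds to size_to_shift(sizeof(sljit_sw)).<br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Shift value for an element size that is a power of two: 1 -> 0, 2 -> 1, ... */
static unsigned size_to_shift(size_t size)
{
    unsigned shift = 0;

    assert(size != 0 && (size & (size - 1)) == 0);
    while (((size_t)1 << shift) < size)
        shift++;
    return shift;
}

/* Load one word from [base + (index << shift)]. */
static intptr_t load_indexed_word(const void *base, size_t index, unsigned shift)
{
    intptr_t value;

    memcpy(&value, (const char *)base + (index << shift), sizeof(value));
    return value;
}
```

For an array of words, load_indexed_word(arr, i, size_to_shift(sizeof(intptr_t))) reads the same element as arr[i].<br>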
<br>
The file array_access.c demonstrates an array-printing example, which should be easy
to understand.<br>

<h2>Local variables</h2>

SLJIT provides SLJIT_MEM1(SLJIT_SP) to access the space reserved by the
last parameter of sljit_emit_enter.<br>
In this example we have to pass an address to print_arr, so a local variable
is the only choice.<br>

<div style='font-family:Courier New;font-size:11px'>
<ul>
    /* reserve space on the stack for sljit_sw arr[3] */<br>
    sljit_emit_enter(C, 0,  3,  2, 3, 0, 0, 3 * sizeof(sljit_sw));<br>
    /*                  opt arg R  S  FR FS local_size */<br>
   <br>
    /* arr[0] = S0, SLJIT_SP is the start address of the local variables */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), 0, SLJIT_S0, 0);<br>
    /* arr[1] = S1 */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), 1 * sizeof(sljit_sw), SLJIT_S1, 0);<br>
    /* arr[2] = S2 */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), 2 * sizeof(sljit_sw), SLJIT_S2, 0);<br>
   <br>
    /* R0 = arr; SLJIT_SP is in fact the address of arr, but SLJIT does not allow a direct move from it */<br>
    sljit_get_local_base(C, SLJIT_R0, 0, 0);   /* get the address of the local variables */<br>
    sljit_emit_op1(C, SLJIT_MOV, SLJIT_R1, 0, SLJIT_IMM, 3);   /* R1 = 3; */<br>
    sljit_emit_ijump(C, SLJIT_CALL2, SLJIT_IMM, SLJIT_FUNC_OFFSET(print_arr));<br>
    sljit_emit_return(C, SLJIT_MOV, SLJIT_R0, 0);<br>
</ul>
</div>
<br>
SLJIT_SP can only be used as SLJIT_MEM1(SLJIT_SP). In this case SP is the
address of 'arr', but we cannot assign it to a register with the SLJIT_MOV operation.
Instead, we use sljit_get_local_base, which loads the address of the local
area plus an offset into the target.<br>
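<br>
Written as ordinary C, the JIT code in this chapter does roughly the following. This is a reference sketch that runs without SLJIT: sum_arr is a hypothetical stand-in for print_arr (it adds the elements instead of printing them, so the result can be checked), and long stands in for sljit_sw.<br>

```c
#include <assert.h>

/* Stand-in for the tutorial's print_arr: sums the elements instead of
   printing them. */
static long sum_arr(const long *arr, long n)
{
    long i, sum = 0;

    for (i = 0; i < n; i++)
        sum += arr[i];
    return sum;
}

/* The shape of the JIT-generated function, in plain C. */
static long locals_ref(long a, long b, long c)
{
    long arr[3];            /* the 3 words of reserved local space */

    arr[0] = a;             /* store S0 at [SP + 0 words] */
    arr[1] = b;             /* store S1 at [SP + 1 word]  */
    arr[2] = c;             /* store S2 at [SP + 2 words] */
    return sum_arr(arr, 3); /* R0 = address of arr, R1 = 3, then call */
}

/* Usage example. */
static void locals_ref_selfcheck(void)
{
    assert(locals_ref(4, 5, 6) == 15);
}
```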

<h2>Brainfuck compiler</h2>

OK, the basic usage of SLJIT ends here. For more detail, I suggest reading
sljitLir.h directly. Have fun hacking on the wonders of SLJIT!<br>
<br>
An introduction to the Brainfuck machine can be found here:
<a href="http://en.wikipedia.org/wiki/Brainfuck">Brainfuck</a><br>
<br>

<h2>Extra</h2>

1. Dump_code function<br>
SLJIT does not provide a disassembler; here is a simple function that does the job (X86 only)<br>
<br>

<div style='font-family:Courier New;font-size:11px'>
<ul>
static void dump_code(void *code, sljit_uw len)<br>
{<br>
<ul>
    /* write the raw machine code to a temporary file */<br>
    FILE *fp = fopen("/tmp/slj_dump", "wb");<br>
    if (!fp)<br>
    <ul>
        return;<br>
    </ul>
    fwrite(code, len, 1, fp);<br>
    fclose(fp);<br>
</ul>
    /* let objdump disassemble the file as a flat binary */<br>
#if defined(SLJIT_CONFIG_X86_64)<br>
<ul>
    system("objdump -b binary -m i386:x86-64 -D /tmp/slj_dump");<br>
</ul>
#elif defined(SLJIT_CONFIG_X86_32)<br>
<ul>
    system("objdump -b binary -m i386 -D /tmp/slj_dump");<br>
</ul>
#endif<br>
}
</ul>
</div>
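<br>
On other CPUs, or where objdump is not available, a plain hex dump of the generated bytes can still be useful. The following is a small sketch in portable C; format_hex is a hypothetical helper and not part of SLJIT.<br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write `len` bytes of `code` into `out` as space-separated hex pairs.
   Returns the number of characters written (excluding the terminator),
   or -1 if `out` is too small. */
static int format_hex(const void *code, size_t len, char *out, size_t out_size)
{
    const unsigned char *p = (const unsigned char *)code;
    size_t i, pos = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (i = 0; i < len; i++) {
        size_t need = (i == 0) ? 2 : 3;   /* two digits, plus a separator */

        if (pos + need + 1 > out_size)
            return -1;
        if (i > 0)
            out[pos++] = ' ';
        snprintf(out + pos, out_size - pos, "%02x", (unsigned)p[i]);
        pos += 2;
    }
    return (int)pos;
}

/* Usage: the bytes of "mov %rbx,%rax" followed by "retq" on X86_64. */
static void format_hex_selfcheck(void)
{
    static const unsigned char bytes[] = { 0x48, 0x89, 0xd8, 0xc3 };
    char buf[32];

    assert(format_hex(bytes, sizeof(bytes), buf, sizeof(buf)) == 11);
    assert(strcmp(buf, "48 89 d8 c3") == 0);
}
```
<br>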

Disassembly of the branch example:<br>
 <br>
0000000000000000 &lt;.data&gt;:<br>
<ul>
<table>
<tr><td>0:</td><td>53</td><td>push   %rbx</td></tr>
<tr><td>1:</td><td>41 57</td><td>push   %r15</td></tr>
<tr><td>3:</td><td>41 56</td><td>push   %r14</td></tr>
<tr><td>5:</td><td>48 8b df</td><td>mov    %rdi,%rbx</td></tr>
<tr><td>8:</td><td>4c 8b fe</td><td>mov    %rsi,%r15</td></tr>
<tr><td>b:</td><td>4c 8b f2</td><td>mov    %rdx,%r14</td></tr>
<tr><td>e:</td><td>48 83 ec 10</td><td>sub    $0x10,%rsp</td></tr>
<tr><td>12:</td><td>48 89 d8</td><td>mov    %rbx,%rax</td></tr>
<tr><td>15:</td><td>48 83 e0 01</td><td>and    $0x1,%rax</td></tr>
<tr><td>19:</td><td>48 83 f8 00</td><td>cmp    $0x0,%rax</td></tr>
<tr><td>1d:</td><td>74 05</td><td>je     0x24</td></tr>
<tr><td>1f:</td><td>4c 89 f8</td><td>mov    %r15,%rax</td></tr>
<tr><td>22:</td><td>eb 03</td><td>jmp    0x27</td></tr>
<tr><td>24:</td><td>4c 89 f0</td><td>mov    %r14,%rax</td></tr>
<tr><td>27:</td><td>48 83 c4 10</td><td>add    $0x10,%rsp</td></tr>
<tr><td>2b:</td><td>41 5e</td><td>pop    %r14</td></tr>
<tr><td>2d:</td><td>41 5f</td><td>pop    %r15</td></tr>
<tr><td>2f:</td><td>5b</td><td>pop    %rbx</td></tr>
<tr><td>30:</td><td>c3</td><td>retq</td></tr>
</table>
</ul>
<br>
The same function compiled with GCC -O2:<br>
0000000000000000 &lt;func&gt;:<br>
<ul>
<table>
<tr><td>0:</td><td>48 89 d0</td><td>mov    %rdx,%rax</td></tr>
<tr><td>3:</td><td>83 e7 01</td><td>and    $0x1,%edi</td></tr>
<tr><td>6:</td><td>48 0f 45 c6</td><td>cmovne %rsi,%rax</td></tr>
<tr><td>a:</td><td>c3</td><td>retq</td></tr>
</table>
</ul>
<br>
Err... OK, either the optimization here is weak, or the optimization there is crazy... :-)<br>

<table width="100%" cellspacing=0 cellpadding=0>
<tr><td align=right>By wenxichang#163.com, 2015.5.10</td></tr></table>

</td><td width=20 class="main"></td></tr>
<tr height=20><td width=20 class="main"></td><td width=720 class="main"></td><td width=20 class="main"></td></tr>
</table>
</center>

</body>
</html>