Copyright 2002, 2005 Free Software Foundation, Inc.

This file is part of the GNU MP Library.

The GNU MP Library is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by the
Free Software Foundation; either version 3 of the License, or (at your
option) any later version.

The GNU MP Library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License
for more details.

You should have received a copy of the GNU Lesser General Public License along
with the GNU MP Library.  If not, see http://www.gnu.org/licenses/.



This directory contains assembly code for the nails-enabled 21264 (Alpha EV6).
The code is not very well optimized.

For addmul_N, as N grows larger, we could issue multiple loads together and
then sustain about 3.3 instructions per cycle (i/c).  Starting 10 cycles after
the last load, we can increase to 4 i/c.  This would surely allow addmul_4 to
run at 2 cycles per limb (c/l), and the same should also be possible for
addmul_3 and perhaps even addmul_2.


                current         fair            best
Routine         c/l  unroll     c/l  unroll     c/l  i/c
mul_1           3.25            2.75            2.75 3.273
addmul_1        4.0     4       3.5     4 14    3.25 3.385
addmul_2        4.0     1       2.5     2 10    2.25 3.333
addmul_3        3.0     1       2.33    2 14    2    3.333
addmul_4        2.5     1       2.125   2 17    2    3.135

addmul_5                        2       1 10
addmul_6                        2       1 12
addmul_7                        2       1 14

(The "best" column doesn't account for bookkeeping instructions and
thereby assumes infinite unrolling.)

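As a rough cross-check of the "best" column, c/l multiplied by i/c gives the
number of instructions spent per limb.  The sketch below assumes c/l means
cycles per limb and i/c means instructions per cycle; the names BEST and
instructions_per_limb are made up for illustration and are not part of GMP.

```python
# "best" column of the table above: (cycles per limb, instructions per cycle).
BEST = {
    "mul_1":    (2.75, 3.273),
    "addmul_1": (3.25, 3.385),
    "addmul_2": (2.25, 3.333),
    "addmul_3": (2.0,  3.333),
    "addmul_4": (2.0,  3.135),
}


def instructions_per_limb(cl, ic):
    """cycles/limb * instructions/cycle = instructions/limb."""
    return cl * ic


if __name__ == "__main__":
    # Print the implied instruction count per limb for each routine.
    for routine, (cl, ic) in BEST.items():
        print(f"{routine:9s} {instructions_per_limb(cl, ic):6.2f} insn/limb")
```

Under that reading, mul_1 comes out at about 9 instructions per limb and
addmul_1 at about 11, before any loop bookkeeping is counted.
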
Basecase usages:

1        addmul_1
2        addmul_2
3        addmul_3
4        addmul_4
5        addmul_3 + addmul_2    2.3998
6        addmul_4 + addmul_2
7        addmul_4 + addmul_3

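The 2.3998 figure on row 5 appears to be a limb-weighted average of the "fair"
c/l values for addmul_3 and addmul_2.  That reading is an assumption, as is
taking the table's 2.33 as 2.333; the names FAIR_CL, BASECASE and combined_cl
are hypothetical.  A minimal sketch:

```python
# "fair" c/l estimates from the table above.  addmul_3 is listed as 2.33;
# 2.333 is assumed here because it reproduces the 2.3998 figure.
FAIR_CL = {1: 3.5, 2: 2.5, 3: 2.333, 4: 2.125}

# Basecase decompositions listed above: size -> addmul_N routines used.
BASECASE = {1: [1], 2: [2], 3: [3], 4: [4], 5: [3, 2], 6: [4, 2], 7: [4, 3]}


def combined_cl(parts, cl=FAIR_CL):
    """Limb-weighted average c/l of a combination of addmul_N routines."""
    return sum(n * cl[n] for n in parts) / sum(parts)


if __name__ == "__main__":
    # Print the estimated c/l for every basecase size in the list.
    for size, parts in BASECASE.items():
        names = " + ".join(f"addmul_{n}" for n in parts)
        print(f"{size}\t{names}\t{combined_cl(parts):.4f}")
```

For size 5 this gives (3 * 2.333 + 2 * 2.5) / 5 = 2.3998, matching the value
in the list.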