README

Copyright 1996, 1999, 2001, 2002, 2004 Free Software Foundation, Inc.

This file is part of the GNU MP Library.

The GNU MP Library is free software; you can redistribute it and/or modify
it under the terms of either:

  * the GNU Lesser General Public License as published by the Free
    Software Foundation; either version 3 of the License, or (at your
    option) any later version.

or

  * the GNU General Public License as published by the Free Software
    Foundation; either version 2 of the License, or (at your option) any
    later version.

or both in parallel, as here.

The GNU MP Library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received copies of the GNU General Public License and the
GNU Lesser General Public License along with the GNU MP Library.  If not,
see https://www.gnu.org/licenses/.



This directory contains mpn functions for various HP PA-RISC chips.  Code
that runs faster on the PA7100 and later implementations is in the pa7100
directory.

RELEVANT OPTIMIZATION ISSUES

  Load and Store timing

On the PA7000, no memory instruction can issue during the two cycles after
a store.  For the PA7100, this is reduced to one cycle.

The PA7100 has a lockup-free cache, so it helps to schedule a load and its
dependent instruction as far apart as possible.

STATUS

1. mpn_mul_1 could be improved to 6.5 cycles/limb on the PA7100, using the
   instructions below (but some software pipelining is needed to avoid the
   xmpyu-fstds delay):

	fldds	s1_ptr

	xmpyu
	fstds	N(%r30)
	xmpyu
	fstds	N(%r30)

	ldws	N(%r30)
	ldws	N(%r30)
	ldws	N(%r30)
	ldws	N(%r30)

	addc
	stws	res_ptr
	addc
	stws	res_ptr

	addib	Loop

2. mpn_addmul_1 could be improved from the current 10 to 7.5 cycles/limb
   (asymptotically) on the PA7100, using the instructions below.  With proper
   software pipelining and the unrolling level below, the speed becomes 8
   cycles/limb.

	fldds	s1_ptr
	fldds	s1_ptr

	xmpyu
	fstds	N(%r30)
	xmpyu
	fstds	N(%r30)
	xmpyu
	fstds	N(%r30)
	xmpyu
	fstds	N(%r30)

	ldws	N(%r30)
	ldws	N(%r30)
	ldws	N(%r30)
	ldws	N(%r30)
	ldws	N(%r30)
	ldws	N(%r30)
	ldws	N(%r30)
	ldws	N(%r30)
	addc
	addc
	addc
	addc
	addc	%r0,%r0,cy-limb

	ldws	res_ptr
	ldws	res_ptr
	ldws	res_ptr
	ldws	res_ptr
	add
	stws	res_ptr
	addc
	stws	res_ptr
	addc
	stws	res_ptr
	addc
	stws	res_ptr

	addib

3. For the PA8000 we have to stick to using 32-bit limbs until compiler
   support emerges.  But we want to use 64-bit operations whenever possible,
   in particular for loads and stores.  It is possible to handle mpn_add_n
   efficiently by rotating (when s1/s2 are aligned) and by masking plus
   bit-field inserting (when they are not).  The speed should double
   compared to the code used today.


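For orientation, the per-limb arithmetic that the instruction sequences in
items 1 and 2 have to perform can be sketched in portable C.  This is only
a reference model, not the scheduled assembly: the names limb_t, ref_mul_1
and ref_addmul_1 are made up for this sketch, and the comments mapping C
statements to instructions are a rough guide under the assumption of
32-bit limbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical name for a 32-bit limb */

/* rp[0..n-1] = s1[0..n-1] * v; returns the most significant limb. */
static limb_t ref_mul_1(limb_t *rp, const limb_t *s1, size_t n, limb_t v)
{
    assert(n > 0);
    limb_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t p = (uint64_t)s1[i] * v;  /* 32x32 -> 64 product (xmpyu) */
        limb_t lo = (limb_t)p;             /* the two halves reach the    */
        limb_t hi = (limb_t)(p >> 32);     /* integer side via memory     */
        uint64_t t = (uint64_t)lo + cy;    /* carry chain (addc)          */
        rp[i] = (limb_t)t;
        cy = hi + (limb_t)(t >> 32);       /* hi <= 2^32-2, cannot wrap   */
    }
    return cy;
}

/* rp[0..n-1] += s1[0..n-1] * v; returns the carry-out limb. */
static limb_t ref_addmul_1(limb_t *rp, const limb_t *s1, size_t n, limb_t v)
{
    assert(n > 0);
    limb_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        /* product + old result limb + carry-in always fits in 64 bits */
        uint64_t t = (uint64_t)s1[i] * v + rp[i] + cy;
        rp[i] = (limb_t)t;
        cy = (limb_t)(t >> 32);
    }
    return cy;
}
```

The extra load of the old result limb and the second carry chain in
ref_addmul_1 are what account for the additional ldws/add/addc group in
item 2 compared with item 1.
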
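The aligned case of item 3 can be modelled in portable C as below.  This
is a sketch under stated assumptions, not the intended PA8000 code: a
64-bit big-endian load of two adjacent 32-bit limbs is simulated by
load_pair_be, the limb count is assumed to be even, and all function names
are made up for illustration.  The unaligned (masking plus bit-field
insert) case is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical name for a 32-bit limb */

/* One limb per step: rp[] = s1[] + s2[], returns the carry-out (0 or 1). */
static limb_t ref_add_n(limb_t *rp, const limb_t *s1, const limb_t *s2,
                        size_t n)
{
    limb_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)s1[i] + s2[i] + cy;
        rp[i] = (limb_t)t;
        cy = (limb_t)(t >> 32);
    }
    return cy;
}

/* Models a 64-bit big-endian load of two adjacent limbs: the limb at the
   lower address (the less significant one) lands in the upper half. */
static uint64_t load_pair_be(const limb_t *p)
{
    return ((uint64_t)p[0] << 32) | p[1];
}

/* Rotate by 32 bits, i.e. swap the two 32-bit halves. */
static uint64_t rot32(uint64_t x)
{
    return (x << 32) | (x >> 32);
}

/* Two limbs per step using 64-bit adds; n must be even in this sketch. */
static limb_t paired_add_n(limb_t *rp, const limb_t *s1, const limb_t *s2,
                           size_t n)
{
    assert(n % 2 == 0);
    uint64_t cy = 0;
    for (size_t i = 0; i < n; i += 2) {
        uint64_t a = rot32(load_pair_be(s1 + i)); /* limb i in low half */
        uint64_t b = rot32(load_pair_be(s2 + i));
        uint64_t t = a + b;
        uint64_t c1 = t < a;       /* carry out of a + b */
        uint64_t r = t + cy;
        uint64_t c2 = r < t;       /* carry out of adding the carry-in */
        cy = c1 | c2;              /* at most one of the two can be set */
        uint64_t w = rot32(r);     /* back to memory order for the store */
        rp[i] = (limb_t)(w >> 32);
        rp[i + 1] = (limb_t)w;
    }
    return (limb_t)cy;
}
```

The rotation is needed because the carry has to propagate from limb i into
limb i+1, which requires limb i to sit in the low half of the 64-bit
register during the add.
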
LABEL SYNTAX

The HP-UX assembler takes labels starting in column 0 with no colon,

	L$loop  ldws,mb -4(0,%r25),%r22

Gas on hppa GNU/Linux however requires a colon,

	L$loop: ldws,mb -4(0,%r25),%r22

This is covered by using LDEF() from asm-defs.m4.  An alternative would be
to use ".label" which is accepted by both,

		.label  L$loop
		ldws,mb -4(0,%r25),%r22

but that's not as nice to look at, not if you're used to assembler code
having labels in column 0.



REFERENCES

Hewlett Packard, "HP Assembler Reference Manual", 9th edition, June 1998,
part number 92432-90012.


----------------
Local variables:
mode: text
fill-column: 76
End: