dnl  IBM POWER mpn_mul_1 -- Multiply a limb vector with a limb and store the
dnl  result in a second limb vector.

dnl  Copyright 1992, 1994, 1999-2001 Free Software Foundation, Inc.

dnl  This file is part of the GNU MP Library.
dnl
dnl  The GNU MP Library is free software; you can redistribute it and/or modify
dnl  it under the terms of either:
dnl
dnl    * the GNU Lesser General Public License as published by the Free
dnl      Software Foundation; either version 3 of the License, or (at your
dnl      option) any later version.
dnl
dnl  or
dnl
dnl    * the GNU General Public License as published by the Free Software
dnl      Foundation; either version 2 of the License, or (at your option) any
dnl      later version.
dnl
dnl  or both in parallel, as here.
dnl
dnl  The GNU MP Library is distributed in the hope that it will be useful, but
dnl  WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
dnl  or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
dnl  for more details.
dnl
dnl  You should have received copies of the GNU General Public License and the
dnl  GNU Lesser General Public License along with the GNU MP Library.  If not,
dnl  see https://www.gnu.org/licenses/.


dnl  INPUT PARAMETERS
dnl  res_ptr	r3
dnl  s1_ptr	r4
dnl  size	r5
dnl  s2_limb	r6

dnl  The POWER architecture has no unsigned 32x32->64 bit multiplication
dnl  instruction.  To obtain that operation, we have to use the 32x32->64
dnl  signed multiplication instruction, and add the appropriate compensation to
dnl  the high limb of the result.  We add the multiplicand if the multiplier
dnl  has its most significant bit set, and we add the multiplier if the
dnl  multiplicand has its most significant bit set.  We need to preserve the
dnl  carry flag between each iteration, so we have to compute the compensation
dnl  carefully (the natural, srai+and doesn't work).  Since all POWER can
dnl  branch in zero cycles, we use conditional branches for the compensation.

include(`../config.m4')

ASM_START()
PROLOGUE(mpn_mul_1)
	cal	3,-4(3)
	l	0,0(4)
	cmpi	0,6,0
	mtctr	5
	mul	9,0,6
	srai	7,0,31
	and	7,7,6
	mfmq	8
	ai	0,0,0		C reset carry
	cax	9,9,7
	blt	Lneg
Lpos:	bdz	Lend
Lploop:	lu	0,4(4)
	stu	8,4(3)
	cmpi	0,0,0
	mul	10,0,6
	mfmq	0
	ae	8,0,9
	bge	Lp0
	cax	10,10,6		C adjust high limb for negative limb from s1
Lp0:	bdz	Lend0
	lu	0,4(4)
	stu	8,4(3)
	cmpi	0,0,0
	mul	9,0,6
	mfmq	0
	ae	8,0,10
	bge	Lp1
	cax	9,9,6		C adjust high limb for negative limb from s1
Lp1:	bdn	Lploop
	b	Lend

Lneg:	cax	9,9,0
	bdz	Lend
Lnloop:	lu	0,4(4)
	stu	8,4(3)
	cmpi	0,0,0
	mul	10,0,6
	cax	10,10,0		C adjust high limb for negative s2_limb
	mfmq	0
	ae	8,0,9
	bge	Ln0
	cax	10,10,6		C adjust high limb for negative limb from s1
Ln0:	bdz	Lend0
	lu	0,4(4)
	stu	8,4(3)
	cmpi	0,0,0
	mul	9,0,6
	cax	9,9,0		C adjust high limb for negative s2_limb
	mfmq	0
	ae	8,0,10
	bge	Ln1
	cax	9,9,6		C adjust high limb for negative limb from s1
Ln1:	bdn	Lnloop
	b	Lend

Lend0:	cal	9,0(10)
Lend:	st	8,4(3)
	aze	3,9
	br
EPILOGUE(mpn_mul_1)