copysign-casts.ll - OpenGrok history log for /llvm-project/llvm/test/CodeGen/RISCV/copysign-casts.ll

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 9122c523	15-Nov-2024	Pengcheng Wang <wangpengcheng.pp@bytedance.com>	[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445) This is based on other targets like PPC/AArch64 and some experiments. This PR will only enable bidirectional schedu [RISCV] Enable bidirectional scheduling and tracking register pressure (#115445) This is based on other targets like PPC/AArch64 and some experiments. This PR will only enable bidirectional scheduling and tracking register pressure. Disclaimer: I haven't tested it on many cores, maybe we should make some options being features. I believe downstreams must have tried this before, so feedbacks are welcome. show more ...
# 08411c85	06-Nov-2024	Gergely Futo <gergely.futo@hightec-rt.com>	[RISCV] Correct fcopysign pattern for zdinx (#114954) Correcting the pattern fixes the following error: fatal error: error in backend: Cannot select: t17: f64 = fcopysign t5, t8
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4
# 9a1eded9	03-Sep-2024	Craig Topper <craig.topper@sifive.com>	[RISCV] Custom legalize f16/bf16 FCOPYSIGN with Zfhmin/Zbfmin. (#107039) The LegalizeDAG expansion will go through memory since i16 isn't a legal type. Avoid this by using FMV nodes. Similar to [RISCV] Custom legalize f16/bf16 FCOPYSIGN with Zfhmin/Zbfmin. (#107039) The LegalizeDAG expansion will go through memory since i16 isn't a legal type. Avoid this by using FMV nodes. Similar to what we did for #106886 for FNEG and FABS. Special care is needed to handle the Sign operand being a different type. show more ...
# 3bdec313	01-Sep-2024	Craig Topper <craig.topper@sifive.com>	[RISCV] Custom legalize f16/bf16 FNEG/FABS with Zfhmin/Zbfmin. (#106886) The LegalizeDAG expansion will go through memory since i16 isn't a legal type. Avoid this by using FMV nodes.
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# eabaee0c	07-Jan-2024	Fangrui Song <i@maskray.me>	[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467) R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530 [RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467) R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530 `call foo` assembles to R_RISCV_CALL_PLT. The `@plt` suffix is not useful and can be removed now (matching AArch64 and PowerPC). GNU assembler assembles `call foo` to RISCV_CALL_PLT since 2022-09 (70f35d72ef04cd23771875c1661c9975044a749c). Without this patch, unconditionally changing MO_CALL to MO_PLT could create `jump .L1@plt, a0`, which is invalid in LLVM integrated assembler and GNU assembler. show more ...
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# ee34fa00	06-Jul-2023	Craig Topper <craig.topper@sifive.com>	[RISCV] Add DAG combine for (fmv_w_x_rv64 (fmv_x_anyextw_rv64 X)) This pattern started showing up more after D151284
# 5ba40c7b	30-Jun-2023	Alex Bradbury <asb@igalia.com>	[RISCV] Custom lower FP_TO_FP16 and FP16_TO_FP to correct ABI of of libcall As introduced in D99148, RISC-V uses the softPromoteHalf legalisation for fp16 values without zfh, with logic ensuring tha [RISCV] Custom lower FP_TO_FP16 and FP16_TO_FP to correct ABI of of libcall As introduced in D99148, RISC-V uses the softPromoteHalf legalisation for fp16 values without zfh, with logic ensuring that f16 values are passed in lower bits of FPRs (see D98670) when F or D support is present. This legalisation produces ISD::FP_TO_FP16 and ISD::FP16_TO_FP nodes which (as described in ISDOpcodes.h) provide a "semi-softened interface for dealing with f16 (as an i16)". i.e. the return type of the FP_TO_FP16 is an integer rather than a float (and the arg of FP16_TO_FP is an integer). The remainder of the description focuses primarily on FP_TO_FP16 for ease of explanation. FP_TO_FP16 is lowered to a libcall to `__truncsfhf2 (float)` or `__truncdfhf2 (double)`. As of D92241, `_Float16` is used as the return type of these libcalls if the host compiler accepts `_Float16` in a test input (i.e. dst_t is set to `_Float16`). `_Float16` is enabled for the RISC-V target as of D105001 and so the return value should be passed in an FPR on hard float ABIs. This patch fixes the ABI issue in what appears to be a minimally invasive way - leaving the softPromoteHalf logic undisturbed, and lowering FP_TO_FP16 to an f32-returning libcall, converting its result to an XLen integer value. As can be seen in the test changes, the custom lowering for FP16_TO_FP means the libcall is no longer tail-callable. Although this patch fixes the issue, there are two open items: * Redundant fmv.x.w and fmv.w.x pairs are now somtimes produced during lowering (not a correctness issue). * Now coverage for STRICT variants of FP16 conversion opcodes. Differential Revision: https://reviews.llvm.org/D151284 show more ...
Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# 7b0c4184	28-Mar-2023	Craig Topper <craig.topper@sifive.com>	[RISCV] Move compressible registers to the beginning of the FP allocation order. We don't have very many compressible FP instructions, just load and store. These instruction require the FP register [RISCV] Move compressible registers to the beginning of the FP allocation order. We don't have very many compressible FP instructions, just load and store. These instruction require the FP register to be f8-f15. This patch changes the FP allocation order to prioritize f10-f15 first. These are also the FP argument registers. So I allocated them in reverse order starting at f15 to avoid taking the first argument registers. This appears to match gcc allocation order. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D146488 show more ...
Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init
# 86eff6be	20-Jan-2023	Philip Reames <preames@rivosinc.com>	[MachineCombiner] Use default latency model when no detailed model available This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction lat [MachineCombiner] Use default latency model when no detailed model available This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost. Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.) The motivation here is two fold. First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler. Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain and simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree. Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch. For the test diffs, I don't seen any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes. Differential Revision: https://reviews.llvm.org/D141017 show more ...
Revision tags: llvmorg-15.0.7
# 002005e6	22-Dec-2022	Hsiangkai Wang <hsiangkai@google.com>	[RISCV] Add integer scalar instructions to isAssociativeAndCommutative Inspired by D138107. We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative to increase instruction-l [RISCV] Add integer scalar instructions to isAssociativeAndCommutative Inspired by D138107. We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative to increase instruction-level parallelism by the existing MachineCombiner pass. Differential Revision: https://reviews.llvm.org/D140530 show more ...
# 7b50c183	30-Nov-2022	Monk Chiang <monk.chiang@sifive.com>	[RISCV] Codegen support for Zfhmin. The Zfhmin subset only has FLH, FSH, FMV.X.H, FMV.H.X, FCVT.S.H, and FCVT.H.S. If the D extension is present, the FCVT.D.H and FCVT.H.D instructions are also incl [RISCV] Codegen support for Zfhmin. The Zfhmin subset only has FLH, FSH, FMV.X.H, FMV.H.X, FCVT.S.H, and FCVT.H.S. If the D extension is present, the FCVT.D.H and FCVT.H.D instructions are also included. Since most instructions are not included for Zfhmin, so most operations are promoted. The patch primarily about making f16 a legal type. RISC-V ISA info: https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D139391 show more ...
# a8c79121	30-Nov-2022	Craig Topper <craig.topper@sifive.com>	[RISCV] Teach getRegAllocationHints about compressible SRAI/SRLI. Similar to previous patches for ADDI/ADDIW/SLLI/ADD, but restricted to only cases where the register is x8-x15(GPRC reg class). I'v [RISCV] Teach getRegAllocationHints about compressible SRAI/SRLI. Similar to previous patches for ADDI/ADDIW/SLLI/ADD, but restricted to only cases where the register is x8-x15(GPRC reg class). I've restricted it so that we can be precise about whether the resulting instruction would be compressible. Changing the register allocation may make some other instruction not compressible so we should try to be accurate. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D138740 show more ...
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 7cbfb4eb	29-Jun-2022	Craig Topper <craig.topper@sifive.com>	[RISCV] Select (srl (and X, C2) as (slli (srliw X, C3), C3-C). If C2 has 32 leading zeros and C3 trailing zeros.
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 6a54776f	16-Mar-2022	Haocong.Lu <Haocong.Lu@streamcomputing.com>	[RISCV] Select SRLI+SLLI for AND with leading ones mask Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask. It's useful in RV64 when the mask exceeds simm32 (cannot be generated [RISCV] Select SRLI+SLLI for AND with leading ones mask Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask. It's useful in RV64 when the mask exceeds simm32 (cannot be generated by LUI). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121598 show more ...
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5
# a9d5bb92	06-Apr-2021	Kito Cheng <kito.cheng@sifive.com>	[RISCV] Use __extendhfsf2/__truncsfhf2 for fp16 <-> fp32 `__gnu_h2f_ieee` and `__gnu_f2h_ieee` are introduce by ARM and set that as default name for fp16 and fp32 conversion in LLVM. However RISC-V [RISCV] Use __extendhfsf2/__truncsfhf2 for fp16 <-> fp32 `__gnu_h2f_ieee` and `__gnu_f2h_ieee` are introduce by ARM and set that as default name for fp16 and fp32 conversion in LLVM. However RISC-V GCC using default naming scheme for that, which is `__extendhfsf2` and `__truncsfhf2` for that, that cause runtime ABI incompatible issue. Although we didn't have formal runtime ABI spec to specify those naming convention yet, but I think it would be great to fix the incompatible issue first. And I've plan to create a runtime ABI spec undere psABI spec this year. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118207 show more ...
# a0a76fee	15-Jan-2022	Shao-Ce SUN <shaoce@nj.iscas.ac.cn>	[RISCV] update zfh and zfhmin extention to v1.0 `zfh` and `zfhmin` have been ratified, with version 1.0. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117098
# bd653f64	11-Jan-2022	Haocong.Lu <Haocong.Lu@streamcomputing.com>	[RISCV] Use shift for zero extension when Zbb and Zbp are not enabled Now AND is used for zero extension when both Zbb and Zbp are not enabled. It may be better to use shift operation if the trailin [RISCV] Use shift for zero extension when Zbb and Zbp are not enabled Now AND is used for zero extension when both Zbb and Zbp are not enabled. It may be better to use shift operation if the trailing ones mask exceeds simm12. This patch optimzes LUI+ADDI+AND to SLLI+SRLI. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116720 show more ...
# 137d3474	16-Nov-2021	Hsiangkai Wang <kai.wang@sifive.com>	[RISCV] Reverse the order of loading/storing callee-saved registers. Currently, we restore the return address register as the last restoring instruction in the epilog. The next instruction is `ret` [RISCV] Reverse the order of loading/storing callee-saved registers. Currently, we restore the return address register as the last restoring instruction in the epilog. The next instruction is `ret` usually. It is a use of return address register. In some microarchitectures, there is load-to-use data hazard. To avoid the load-to-use data hazard, we could separate the load instruction from its use as far as possible. In this patch, we reverse the order of restoring callee-saved registers to increase the distance of `load ra` and `ret` in the epilog. Differential Revision: https://reviews.llvm.org/D113967 show more ...
# af0ecfcc	22-Nov-2021	wangpc <pc.wang@linux.alibaba.com>	[RISCV] Generate pseudo instruction li Add an alias of `addi [x], zero, imm` to generate pseudo instruction li, which makes assembly mush more readable. For existed tests, users can update them by r [RISCV] Generate pseudo instruction li Add an alias of `addi [x], zero, imm` to generate pseudo instruction li, which makes assembly mush more readable. For existed tests, users can update them by running script `llvm/utils/update_llc_test_checks.py`. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D112692 show more ...
# 1b941745	26-Aug-2021	Craig Topper <craig.topper@sifive.com>	[RISCV] Insert a sext_inreg when type legalizing i32 shl by constant on RV64. Similar to what we do for add/sub/mul. This can help remove some sext.w. There are some regressions on some bswap tests [RISCV] Insert a sext_inreg when type legalizing i32 shl by constant on RV64. Similar to what we do for add/sub/mul. This can help remove some sext.w. There are some regressions on some bswap tests, but I have an idea how to fix that for a follow up. A new PACKW pattern is added to handle the new sext_inreg placement. Differential Revision: https://reviews.llvm.org/D108663 show more ...
# 010f0f00	27-Jun-2021	Craig Topper <craig.topper@sifive.com>	Revert "[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions." I thought this might help with another optimization I was thinking about, but I don't think Revert "[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions." I thought this might help with another optimization I was thinking about, but I don't think it will. So it just wastes compile time calling computeKnownBits for no benefit. This reverts commit 81b2f95971edd47a0057ac4a77b674d7ea620c01. show more ...
# 81b2f959	26-Jun-2021	Craig Topper <craig.topper@sifive.com>	[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions.
# dbbc95e3	01-Apr-2021	Craig Topper <craig.topper@sifive.com>	[RISCV] Use softPromoteHalf legalization for fp16 without Zfh rather than PromoteFloat. The default legalization strategy is PromoteFloat which keeps half in single precision format through multiple [RISCV] Use softPromoteHalf legalization for fp16 without Zfh rather than PromoteFloat. The default legalization strategy is PromoteFloat which keeps half in single precision format through multiple floating point operations. Conversion to/from float is done at loads, stores, bitcasts, and other places that care about the exact size being 16 bits. This patches switches to the alternative method softPromoteHalf. This aims to keep the type in 16-bit format between every operation. So we promote to float and immediately round for any arithmetic operation. This should be closer to the IR semantics since we are rounding after each operation and not accumulating extra precision across multiple operations. X86 is the only other target that enables this today. See https://reviews.llvm.org/D73749 I had to update getRegisterTypeForCallingConv to force f16 to use f32 when the F extension is enabled. This way we can still pass it in the lower bits of an FPR for ilp32f and lp64f ABIs. The softPromoteHalf would otherwise always give i16 as the argument type. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D99148 show more ...
# d61b40ed	01-Apr-2021	Craig Topper <craig.topper@sifive.com>	[RISCV] Improve 64-bit integer materialization for some cases. This adds a new integer materialization strategy mainly targeted at 64-bit constants like 0xffffffff where there are 32 or more trailin [RISCV] Improve 64-bit integer materialization for some cases. This adds a new integer materialization strategy mainly targeted at 64-bit constants like 0xffffffff where there are 32 or more trailing ones with leading zeros. We can materialize these by using an addi -1 and srli to restore the leading zeros. This matches what gcc does. I haven't limited to just these cases though. The implementation here takes the constant, shifts out all the leading zeros and shifts ones into the LSBs, creates the new sequence, adds an srli, and checks if this is shorter than our original strategy. I've separated the recursive portion into a standalone function so I could append the new strategy outside of the recursion. Since external users are no longer using the recursive function, I've cleaned up the external interface to return the sequence instead of taking a vector by reference. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D98821 show more ...
Revision tags: llvmorg-12.0.0-rc4
# a33fcafa	30-Mar-2021	Craig Topper <craig.topper@sifive.com>	[RISCV] Pass 'half' in the lower 16 bits of an f32 value when F extension is enabled, but Zfh is not. Without Zfh the half type isn't legal, but it could still be used as an argument/return in IR. C [RISCV] Pass 'half' in the lower 16 bits of an f32 value when F extension is enabled, but Zfh is not. Without Zfh the half type isn't legal, but it could still be used as an argument/return in IR. Clang will not generate this today. Previously we promoted the half value to float for arguments and returns if the F extension is enabled but Zfh isn't. Then depending on which ABI is enabled we would pass it in either an FPR or a GPR in float format. If the F extension isn't enabled, it would get passed in the lower 16 bits of a GPR in half format. With this patch the value will always in half format and will be in the lower bits of a GPR or FPR. This should be consistent with where the bits are located when Zfh is enabled. I've based this implementation off of how this is done on ARM. I've manually nan-boxed the value to 32 bits using integer ops. It looks like flw, fsw, fmv.s, fmv.w.x, fmf.x.w won't canonicalize nans so should leave the value alone. I think those are the instructions that could get used on this value. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D98670 show more ...
12