History log of /llvm-project/llvm/test/CodeGen/RISCV/copysign-casts.ll (Results 1 – 25 of 30)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 9122c523 15-Nov-2024 Pengcheng Wang <wangpengcheng.pp@bytedance.com>

[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)


This is based on other targets like PPC/AArch64 and some experiments.

This PR will only enable bidirectional schedu

[RISCV] Enable bidirectional scheduling and tracking register pressure (#115445)


This is based on other targets like PPC/AArch64 and some experiments.

This PR will only enable bidirectional scheduling and tracking register
pressure.

Disclaimer: I haven't tested it on many cores, maybe we should make
some options being features. I believe downstreams must have tried
this before, so feedbacks are welcome.

show more ...


# 08411c85 06-Nov-2024 Gergely Futo <gergely.futo@hightec-rt.com>

[RISCV] Correct fcopysign pattern for zdinx (#114954)

Correcting the pattern fixes the following error:
fatal error: error in backend: Cannot select: t17: f64 = fcopysign t5,
t8


Revision tags: llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4
# 9a1eded9 03-Sep-2024 Craig Topper <craig.topper@sifive.com>

[RISCV] Custom legalize f16/bf16 FCOPYSIGN with Zfhmin/Zbfmin. (#107039)

The LegalizeDAG expansion will go through memory since i16 isn't a legal
type. Avoid this by using FMV nodes.

Similar to

[RISCV] Custom legalize f16/bf16 FCOPYSIGN with Zfhmin/Zbfmin. (#107039)

The LegalizeDAG expansion will go through memory since i16 isn't a legal
type. Avoid this by using FMV nodes.

Similar to what we did for #106886 for FNEG and FABS. Special care is
needed to handle the Sign operand being a different type.

show more ...


# 3bdec313 01-Sep-2024 Craig Topper <craig.topper@sifive.com>

[RISCV] Custom legalize f16/bf16 FNEG/FABS with Zfhmin/Zbfmin. (#106886)

The LegalizeDAG expansion will go through memory since i16 isn't a legal
type. Avoid this by using FMV nodes.


Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# eabaee0c 07-Jan-2024 Fangrui Song <i@maskray.me>

[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)

R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and
R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530

[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)

R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and
R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530
`call foo` assembles to R_RISCV_CALL_PLT. The `@plt` suffix is not
useful and can be removed now (matching AArch64 and PowerPC).

GNU assembler assembles `call foo` to RISCV_CALL_PLT since 2022-09
(70f35d72ef04cd23771875c1661c9975044a749c).

Without this patch, unconditionally changing MO_CALL to MO_PLT could
create `jump .L1@plt, a0`, which is invalid in LLVM integrated assembler
and GNU assembler.

show more ...


Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# ee34fa00 06-Jul-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Add DAG combine for (fmv_w_x_rv64 (fmv_x_anyextw_rv64 X))

This pattern started showing up more after D151284


# 5ba40c7b 30-Jun-2023 Alex Bradbury <asb@igalia.com>

[RISCV] Custom lower FP_TO_FP16 and FP16_TO_FP to correct ABI of of libcall

As introduced in D99148, RISC-V uses the softPromoteHalf legalisation
for fp16 values without zfh, with logic ensuring tha

[RISCV] Custom lower FP_TO_FP16 and FP16_TO_FP to correct ABI of of libcall

As introduced in D99148, RISC-V uses the softPromoteHalf legalisation
for fp16 values without zfh, with logic ensuring that f16 values are
passed in lower bits of FPRs (see D98670) when F or D support is
present. This legalisation produces ISD::FP_TO_FP16 and ISD::FP16_TO_FP
nodes which (as described in ISDOpcodes.h) provide a "semi-softened
interface for dealing with f16 (as an i16)". i.e. the return type of the
FP_TO_FP16 is an integer rather than a float (and the arg of FP16_TO_FP
is an integer). The remainder of the description focuses primarily on
FP_TO_FP16 for ease of explanation.

FP_TO_FP16 is lowered to a libcall to `__truncsfhf2 (float)` or
`__truncdfhf2 (double)`. As of D92241, `_Float16` is used as the return
type of these libcalls if the host compiler accepts `_Float16` in a test
input (i.e. dst_t is set to `_Float16`). `_Float16` is enabled for the
RISC-V target as of D105001 and so the return value should be passed in
an FPR on hard float ABIs.

This patch fixes the ABI issue in what appears to be a minimally
invasive way - leaving the softPromoteHalf logic undisturbed, and
lowering FP_TO_FP16 to an f32-returning libcall, converting its result
to an XLen integer value.

As can be seen in the test changes, the custom lowering for FP16_TO_FP
means the libcall is no longer tail-callable.

Although this patch fixes the issue, there are two open items:
* Redundant fmv.x.w and fmv.w.x pairs are now somtimes produced during
lowering (not a correctness issue).
* Now coverage for STRICT variants of FP16 conversion opcodes.

Differential Revision: https://reviews.llvm.org/D151284

show more ...


Revision tags: llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1
# 7b0c4184 28-Mar-2023 Craig Topper <craig.topper@sifive.com>

[RISCV] Move compressible registers to the beginning of the FP allocation order.

We don't have very many compressible FP instructions, just load and store.
These instruction require the FP register

[RISCV] Move compressible registers to the beginning of the FP allocation order.

We don't have very many compressible FP instructions, just load and store.
These instruction require the FP register to be f8-f15.

This patch changes the FP allocation order to prioritize f10-f15 first.
These are also the FP argument registers. So I allocated them in reverse
order starting at f15 to avoid taking the first argument registers.
This appears to match gcc allocation order.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D146488

show more ...


Revision tags: llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init
# 86eff6be 20-Jan-2023 Philip Reames <preames@rivosinc.com>

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction lat

[MachineCombiner] Use default latency model when no detailed model available

This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost.

Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.)

The motivation here is two fold.

First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler.

Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain and simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree.

Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch.

For the test diffs, I don't seen any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes.

Differential Revision: https://reviews.llvm.org/D141017

show more ...


Revision tags: llvmorg-15.0.7
# 002005e6 22-Dec-2022 Hsiangkai Wang <hsiangkai@google.com>

[RISCV] Add integer scalar instructions to isAssociativeAndCommutative

Inspired by D138107.

We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative
to increase instruction-l

[RISCV] Add integer scalar instructions to isAssociativeAndCommutative

Inspired by D138107.

We can add ADD, AND, OR, XOR, MUL, MIN[U]/MAX[U] to isAssociativeAndCommutative
to increase instruction-level parallelism by the existing MachineCombiner pass.

Differential Revision: https://reviews.llvm.org/D140530

show more ...


# 7b50c183 30-Nov-2022 Monk Chiang <monk.chiang@sifive.com>

[RISCV] Codegen support for Zfhmin.

The Zfhmin subset only has FLH, FSH, FMV.X.H, FMV.H.X, FCVT.S.H, and FCVT.H.S.
If the D extension is present, the FCVT.D.H and FCVT.H.D instructions are also incl

[RISCV] Codegen support for Zfhmin.

The Zfhmin subset only has FLH, FSH, FMV.X.H, FMV.H.X, FCVT.S.H, and FCVT.H.S.
If the D extension is present, the FCVT.D.H and FCVT.H.D instructions are also included.
Since most instructions are not included for Zfhmin, so most operations are promoted.
The patch primarily about making f16 a legal type.

RISC-V ISA info:
https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D139391

show more ...


# a8c79121 30-Nov-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Teach getRegAllocationHints about compressible SRAI/SRLI.

Similar to previous patches for ADDI/ADDIW/SLLI/ADD, but restricted
to only cases where the register is x8-x15(GPRC reg class).

I'v

[RISCV] Teach getRegAllocationHints about compressible SRAI/SRLI.

Similar to previous patches for ADDI/ADDIW/SLLI/ADD, but restricted
to only cases where the register is x8-x15(GPRC reg class).

I've restricted it so that we can be precise about whether the
resulting instruction would be compressible. Changing the register
allocation may make some other instruction not compressible so we
should try to be accurate.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D138740

show more ...


Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 7cbfb4eb 29-Jun-2022 Craig Topper <craig.topper@sifive.com>

[RISCV] Select (srl (and X, C2) as (slli (srliw X, C3), C3-C).

If C2 has 32 leading zeros and C3 trailing zeros.


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 6a54776f 16-Mar-2022 Haocong.Lu <Haocong.Lu@streamcomputing.com>

[RISCV] Select SRLI+SLLI for AND with leading ones mask

Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask.
It's useful in RV64 when the mask exceeds simm32 (cannot be generated

[RISCV] Select SRLI+SLLI for AND with leading ones mask

Select SRLI+SLLI for and i64 %x, imm if the imm is a leading ones mask.
It's useful in RV64 when the mask exceeds simm32 (cannot be generated by LUI).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121598

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5
# a9d5bb92 06-Apr-2021 Kito Cheng <kito.cheng@sifive.com>

[RISCV] Use __extendhfsf2/__truncsfhf2 for fp16 <-> fp32

`__gnu_h2f_ieee` and `__gnu_f2h_ieee` are introduce by ARM and set that as
default name for fp16 and fp32 conversion in LLVM.

However RISC-V

[RISCV] Use __extendhfsf2/__truncsfhf2 for fp16 <-> fp32

`__gnu_h2f_ieee` and `__gnu_f2h_ieee` are introduce by ARM and set that as
default name for fp16 and fp32 conversion in LLVM.

However RISC-V GCC using default naming scheme for that, which is
`__extendhfsf2` and `__truncsfhf2` for that, that cause runtime ABI
incompatible issue.

Although we didn't have formal runtime ABI spec to specify those naming
convention yet, but I think it would be great to fix the incompatible
issue first.

And I've plan to create a runtime ABI spec undere psABI spec this year.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118207

show more ...


# a0a76fee 15-Jan-2022 Shao-Ce SUN <shaoce@nj.iscas.ac.cn>

[RISCV] update zfh and zfhmin extention to v1.0

`zfh` and `zfhmin` have been ratified, with version 1.0.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117098


# bd653f64 11-Jan-2022 Haocong.Lu <Haocong.Lu@streamcomputing.com>

[RISCV] Use shift for zero extension when Zbb and Zbp are not enabled

Now AND is used for zero extension when both Zbb and Zbp are not enabled.
It may be better to use shift operation if the trailin

[RISCV] Use shift for zero extension when Zbb and Zbp are not enabled

Now AND is used for zero extension when both Zbb and Zbp are not enabled.
It may be better to use shift operation if the trailing ones mask exceeds simm12.

This patch optimzes LUI+ADDI+AND to SLLI+SRLI.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116720

show more ...


# 137d3474 16-Nov-2021 Hsiangkai Wang <kai.wang@sifive.com>

[RISCV] Reverse the order of loading/storing callee-saved registers.

Currently, we restore the return address register as the last restoring
instruction in the epilog. The next instruction is `ret`

[RISCV] Reverse the order of loading/storing callee-saved registers.

Currently, we restore the return address register as the last restoring
instruction in the epilog. The next instruction is `ret` usually. It is
a use of return address register. In some microarchitectures, there is
load-to-use data hazard. To avoid the load-to-use data hazard, we could
separate the load instruction from its use as far as possible. In this
patch, we reverse the order of restoring callee-saved registers to
increase the distance of `load ra` and `ret` in the epilog.

Differential Revision: https://reviews.llvm.org/D113967

show more ...


# af0ecfcc 22-Nov-2021 wangpc <pc.wang@linux.alibaba.com>

[RISCV] Generate pseudo instruction li

Add an alias of `addi [x], zero, imm` to generate pseudo
instruction li, which makes assembly mush more readable.
For existed tests, users can update them by r

[RISCV] Generate pseudo instruction li

Add an alias of `addi [x], zero, imm` to generate pseudo
instruction li, which makes assembly mush more readable.
For existed tests, users can update them by running script
`llvm/utils/update_llc_test_checks.py`.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D112692

show more ...


# 1b941745 26-Aug-2021 Craig Topper <craig.topper@sifive.com>

[RISCV] Insert a sext_inreg when type legalizing i32 shl by constant on RV64.

Similar to what we do for add/sub/mul.

This can help remove some sext.w. There are some regressions on
some bswap tests

[RISCV] Insert a sext_inreg when type legalizing i32 shl by constant on RV64.

Similar to what we do for add/sub/mul.

This can help remove some sext.w. There are some regressions on
some bswap tests, but I have an idea how to fix that for a follow up.

A new PACKW pattern is added to handle the new sext_inreg placement.

Differential Revision: https://reviews.llvm.org/D108663

show more ...


# 010f0f00 27-Jun-2021 Craig Topper <craig.topper@sifive.com>

Revert "[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions."

I thought this might help with another optimization I was
thinking about, but I don't think

Revert "[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions."

I thought this might help with another optimization I was
thinking about, but I don't think it will. So it just wastes
compile time calling computeKnownBits for no benefit.

This reverts commit 81b2f95971edd47a0057ac4a77b674d7ea620c01.

show more ...


# 81b2f959 26-Jun-2021 Craig Topper <craig.topper@sifive.com>

[RISCV] Use zexti32/sexti32 in srliw/sraiw isel patterns to improve usage of those instructions.


# dbbc95e3 01-Apr-2021 Craig Topper <craig.topper@sifive.com>

[RISCV] Use softPromoteHalf legalization for fp16 without Zfh rather than PromoteFloat.

The default legalization strategy is PromoteFloat which keeps
half in single precision format through multiple

[RISCV] Use softPromoteHalf legalization for fp16 without Zfh rather than PromoteFloat.

The default legalization strategy is PromoteFloat which keeps
half in single precision format through multiple floating point
operations. Conversion to/from float is done at loads, stores,
bitcasts, and other places that care about the exact size being 16
bits.

This patches switches to the alternative method softPromoteHalf.
This aims to keep the type in 16-bit format between every operation.
So we promote to float and immediately round for any arithmetic
operation. This should be closer to the IR semantics since we
are rounding after each operation and not accumulating extra
precision across multiple operations. X86 is the only other
target that enables this today. See https://reviews.llvm.org/D73749

I had to update getRegisterTypeForCallingConv to force f16 to
use f32 when the F extension is enabled. This way we can still
pass it in the lower bits of an FPR for ilp32f and lp64f ABIs.
The softPromoteHalf would otherwise always give i16 as the
argument type.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D99148

show more ...


# d61b40ed 01-Apr-2021 Craig Topper <craig.topper@sifive.com>

[RISCV] Improve 64-bit integer materialization for some cases.

This adds a new integer materialization strategy mainly targeted
at 64-bit constants like 0xffffffff where there are 32 or more trailin

[RISCV] Improve 64-bit integer materialization for some cases.

This adds a new integer materialization strategy mainly targeted
at 64-bit constants like 0xffffffff where there are 32 or more trailing
ones with leading zeros. We can materialize these by using an addi -1
and srli to restore the leading zeros. This matches what gcc does.

I haven't limited to just these cases though. The implementation
here takes the constant, shifts out all the leading zeros and
shifts ones into the LSBs, creates the new sequence, adds an srli,
and checks if this is shorter than our original strategy.

I've separated the recursive portion into a standalone function
so I could append the new strategy outside of the recursion. Since
external users are no longer using the recursive function, I've
cleaned up the external interface to return the sequence instead of
taking a vector by reference.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D98821

show more ...


Revision tags: llvmorg-12.0.0-rc4
# a33fcafa 30-Mar-2021 Craig Topper <craig.topper@sifive.com>

[RISCV] Pass 'half' in the lower 16 bits of an f32 value when F extension is enabled, but Zfh is not.

Without Zfh the half type isn't legal, but it could still be
used as an argument/return in IR. C

[RISCV] Pass 'half' in the lower 16 bits of an f32 value when F extension is enabled, but Zfh is not.

Without Zfh the half type isn't legal, but it could still be
used as an argument/return in IR. Clang will not generate this today.

Previously we promoted the half value to float for arguments and
returns if the F extension is enabled but Zfh isn't. Then depending on
which ABI is enabled we would pass it in either an FPR or a GPR in
float format.

If the F extension isn't enabled, it would get passed in the lower
16 bits of a GPR in half format.

With this patch the value will always in half format and will be
in the lower bits of a GPR or FPR. This should be consistent
with where the bits are located when Zfh is enabled.

I've based this implementation off of how this is done on ARM.

I've manually nan-boxed the value to 32 bits using integer ops.
It looks like flw, fsw, fmv.s, fmv.w.x, fmf.x.w won't
canonicalize nans so should leave the value alone. I think those
are the instructions that could get used on this value.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D98670

show more ...


12