Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2 |
9ee4fe63 | 13-Apr-2023 | Archibald Elliott <archibald.elliott@arm.com>
[ARM] Fix Crashes in fp16/bf16 Inline Asm
We were still seeing occasional crashes with inline assembly blocks using fp16/bf16 after my previous patches:
- https://reviews.llvm.org/rGff4027d152d0
- https://reviews.llvm.org/rG7d15212b8c0c
- https://reviews.llvm.org/rG20b2d11896d9
It turns out:
- The original two commits were wrong: we should always have been choosing the SPR register class, not the HPR register class, so that LLVM's SelectionDAGBuilder performs the correct splits/joins.
- The `splitValueIntoRegisterParts`/`joinRegisterPartsIntoValue` changes from rG20b2d11896d9 are still correct, even though they sometimes result in inefficient codegen of casts between fp16/bf16 and i32/f32 (which is visible in these tests).
This patch fixes crashes in `getCopyToParts` and when trying to select `(bf16 (bitconvert (fp16 ...)))` DAGs when Neon is enabled.
This patch also adds support for passing fp16/bf16 values using the LLVM-specific 'x' constraint. This should broadly match how values are passed with 't' and 'w', but with a different set of valid S registers.
Differential Revision: https://reviews.llvm.org/D147715
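To make the "different set of valid S registers" concrete, here is a minimal sketch of the highest S register each constraint can select. The ranges (s0-s31 for 't' and 'w', s0-s15 for 'x') are taken from the LLVM Language Reference rather than from this commit, and the helper name `max_s_register_for_constraint` is hypothetical.

```c
/* Hypothetical helper, not an LLVM API.
 * Returns the highest S register index that an ARM inline asm constraint
 * may select for a 32-bit (or smaller) floating-point operand, or -1 if
 * the constraint letter is not one of the three discussed here.
 * Ranges assumed from the LLVM Language Reference. */
static int max_s_register_for_constraint(char constraint) {
  switch (constraint) {
  case 't': /* s0-s31 */
  case 'w': /* s0-s31 */
    return 31;
  case 'x': /* s0-s15 */
    return 15;
  default:
    return -1;
  }
}
```

For example, `max_s_register_for_constraint('x')` yields 15, so an fp16 value passed with 'x' can only land in the lower half of the S register file.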
Revision tags: llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4 |
20b2d118 | 02-Mar-2023 | Archibald Elliott <archibald.elliott@arm.com>
[ARM] Fix Crash in 't'/'w' handling without fp16/bf16
After https://reviews.llvm.org/rGff4027d152d0 and https://reviews.llvm.org/rG7d15212b8c0c we saw crashes in SelectionDAG when trying to use these constraints when you don't have the fp16 or bf16 extensions.
However, it is still possible to move 16-bit floating point values into the right place in S registers with a normal `vmov`, even if we don't have fp16 instructions we can use within the inline assembly string. This patch therefore fixes the crash.
I think we weren't getting this crash before because the Clang frontend emitted an error diagnostic for the `__fp16` and `__bf16` types when you didn't have the right architectural extensions to use them. This restriction was recently relaxed.
The approach for bf16 needs a bit more explanation. Exactly how BF16 is legalized was changed in rGb769eb02b526e3966847351e15d283514c2ec767. Effectively, whether you have the right instructions to get a bf16 value into/out of an S register with MoveTo/FromHPR depends on hasFullFP16, but whether you use an HPR for a value of type MVT::bf16 depends on hasBF16. This is why the tests are not changed by `+bf16` vs `-bf16`, but I've left both sets of RUN lines in case this changes in the future.
Test Changes:
- Added more tests for inline asm (the core part).
- fp16-promote.ll and pr47454.ll show improvements where unnecessary fp16-fp32 up/down-casts are no longer emitted. This results in fewer libcalls where those casts would be done with a libcall.
- aes-erratum-fix.ll is fairly noisy, and I need to revisit this test so that the IR is more minimal than it is right now, because most of the changes in this commit do not relate to what the test is actually trying to verify.
Differential Revision: https://reviews.llvm.org/D143711
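The idea of moving a 16-bit floating-point value "into the right place" in an S register can be modelled at the bit level. The sketch below is an illustration only, not the LLVM implementation: it assumes the split step widens the 16-bit pattern into the low half of a 32-bit container and the join step truncates it back. The function names are hypothetical, and the upper 16 bits are zeroed here even though a real any-extend leaves them unspecified.

```c
#include <stdint.h>

/* Model of the split step: place the 16-bit fp16/bf16 bit pattern in the
 * low half of a 32-bit, S-register-sized container. The upper half is
 * zeroed in this model; treat it as "don't care" in general. */
static uint32_t split_half_into_s_reg(uint16_t half_bits) {
  return (uint32_t)half_bits;
}

/* Model of the join step: recover the 16-bit pattern by discarding the
 * upper half of the 32-bit container, whatever it holds. */
static uint16_t join_s_reg_into_half(uint32_t s_reg_bits) {
  return (uint16_t)(s_reg_bits & 0xFFFFu);
}
```

Because only whole-register moves and integer truncation are involved, nothing in this model needs fp16 or bf16 arithmetic instructions, which matches the observation that a normal `vmov` is enough.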
Revision tags: llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3 |
7d15212b | 11-Oct-2022 | Archibald Elliott <archibald.elliott@arm.com>
[ARM] Support fp16/bf16 using w constraint
fp16 and bf16 values can be used in GCC's inline assembly using the "w" constraint, which means "VFP floating-point registers d0-d31". fp16 and bf16 values are stored in S registers (which alias the D registers).
This change ensures that LLVM is compatible with GCC for programs that use fp16 and the 'w' constraint.
Differential Revision: https://reviews.llvm.org/D135662
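The aliasing between S and D registers mentioned above follows the standard VFP layout, where each of d0-d15 overlaps a pair of S registers (d16-d31 have no S aliases). A minimal sketch of that mapping, with hypothetical helper names:

```c
/* Hypothetical helpers illustrating the VFP register file layout:
 * D register n (for n in 0..15) overlaps S registers 2n and 2n+1. */

/* Returns the D register that contains S register `s`, or -1 if `s`
 * is outside s0-s31. */
static int d_register_for_s(unsigned s) {
  return s <= 31 ? (int)(s / 2) : -1;
}

/* Returns 1 if S register `s` is the upper 32 bits of its D register,
 * 0 if it is the lower 32 bits. */
static int s_is_high_half_of_d(unsigned s) {
  return (int)(s & 1u);
}
```

So a 16-bit value allocated to s3, for instance, lives in the upper half of d1.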
Revision tags: working, llvmorg-15.0.2 |
ff4027d1 | 23-Sep-2022 | Archibald Elliott <archibald.elliott@arm.com>
[ARM] Support fp16/bf16 using t constraint
fp16 and bf16 values can be used in GCC's inline assembly using the "t" constraint, which means "VFP floating-point registers s0-s31". fp16 and bf16 values are stored in S registers too.
This change ensures that LLVM is compatible with GCC for programs that use fp16 and the 't' constraint.
Fixes #57753
Differential Revision: https://reviews.llvm.org/D134553
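As a rough illustration of the kind of program this enables, the sketch below passes a half-precision value through inline assembly with the 't' constraint. It is not taken from the patch or its tests. The `half_t` typedef, the `HAVE_ARM_FP16_ASM` macro, and the function name are hypothetical, and the inline asm path is only compiled for 32-bit ARM targets that advertise hardware floating point and the IEEE fp16 format; elsewhere a plain C fallback is used so the sketch still builds.

```c
/* Use the real half-precision type only where the compiler advertises it. */
#if defined(__arm__) && defined(__ARM_FP) && defined(__ARM_FP16_FORMAT_IEEE)
typedef __fp16 half_t;
#define HAVE_ARM_FP16_ASM 1
#else
typedef float half_t; /* portable stand-in so the sketch builds elsewhere */
#define HAVE_ARM_FP16_ASM 0
#endif

/* Hypothetical helper: copy a half-precision value via S registers. */
static half_t copy_via_s_register(half_t in) {
#if HAVE_ARM_FP16_ASM
  half_t out;
  /* 't' requests an S register for each operand. A whole-register
   * vmov.f32 copies all 32 bits, so it carries the 16-bit payload
   * without needing any fp16 arithmetic instructions. */
  __asm__("vmov.f32 %0, %1" : "=t"(out) : "t"(in));
  return out;
#else
  return in; /* fallback: no inline asm on non-ARM targets */
#endif
}
```

Swapping "t" for "w" in both operand constraints would be the analogous use of the 'w' constraint.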