Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5 |
|
#
ad9dcd96 |
| 22-Nov-2024 |
Sergei Barannikov <barannikov88@gmail.com> |
Reland "[ARM] Stop gluing FP comparisons to FMSTAT" (#117248)
Following #116547, this changes the result of `ARMISD::CMPFP*` and the
operand of `ARMISD::FMSTAT` from a special `Glue` type to a norm
Reland "[ARM] Stop gluing FP comparisons to FMSTAT" (#117248)
Following #116547, this changes the result of `ARMISD::CMPFP*` and the
operand of `ARMISD::FMSTAT` from a special `Glue` type to a normal type.
This change allows comparisons to be CSEd and scheduled around as can be
seen in the test changes.
Note that `ARMISD::FMSTAT` is still glued to its consumer nodes; this is
going to be changed in a separate patch.
This patch also sets `CopyCost` of `cl_FPSCR_NZCV` register class to a
negative value. The reason is the same as for CCR register class: it
makes DAG scheduler and InstrEmitter try to avoid copies of `FPCSR_NZCV`
register to / from virtual registers. Previously, this was not
necessary, since no attempt was made to create copies in the first
place.
`TRI::getCrossCopyRegClass` is modified in a way that prevents DAG
scheduler from copying FPSCR into a virtual register. The register
allocator might need to spill the virtual register, but that only seem
to work in Thumb mode.
show more ...
|
#
5d32a140 |
| 21-Nov-2024 |
Sergei Barannikov <barannikov88@gmail.com> |
Revert "[ARM] Stop gluing FP comparisons to FMSTAT" (#117175)
Reverts llvm/llvm-project#116676
Reverting per post-commit feedback (causes miscompilation errors and/or
assertion failures).
|
#
8c56dd30 |
| 20-Nov-2024 |
Sergei Barannikov <barannikov88@gmail.com> |
[ARM] Stop gluing FP comparisons to FMSTAT (#116676)
Following #116547, this changes the result of `ARMISD::CMPFP*` and the
operand of `ARMISD::FMSTAT` from a special `Glue` type to a normal type.
[ARM] Stop gluing FP comparisons to FMSTAT (#116676)
Following #116547, this changes the result of `ARMISD::CMPFP*` and the
operand of `ARMISD::FMSTAT` from a special `Glue` type to a normal type.
This change allows comparisons to be CSEd and scheduled around as can be
seen in the test changes.
Note that `ARMISD::FMSTAT` is still glued to its consumer nodes; this is
going to be changed in a separate patch.
This patch also sets `CopyCost` of `cl_FPSCR_NZCV` register class to a
negative value. The reason is the same as for CCR register class: it
makes DAG scheduler and InstrEmitter try to avoid copies of `FPCSR_NZCV`
register to / from virtual registers. Previously, this was not
necessary, since no attempt was made to create copies in the first
place.
There might be a case when a copy can't be avoided (although not found
in existing tests). If a copy is necessary, the virtual register will be
created with `cl_FPSCR_NZCV` register class. If this register class is
inappropriate, `TRI::getCrossCopyRegClass` should be modified to return
the correct class.
Pull Request: https://github.com/llvm/llvm-project/pull/116676
show more ...
|
Revision tags: llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2 |
|
#
52864d9c |
| 02-Feb-2024 |
Harald van Dijk <harald@gigawatt.nl> |
[ARM] Switch to soft promoting half types. (#80440)
The traditional promotion is known to generate wrong code.
Fixes #73805.
|
Revision tags: llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
#
bed1c7f0 |
| 19-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[ARM] Convert some tests to opaque pointers (NFC)
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3 |
|
#
74ca67c1 |
| 05-Jul-2020 |
David Green <david.green@arm.com> |
[ARM] Remove hasSideEffects from FP converts
Whether an instruction is deemed to have side effects in determined by whether it has a tblgen pattern that emits a single instruction. Because of the wa
[ARM] Remove hasSideEffects from FP converts
Whether an instruction is deemed to have side effects in determined by whether it has a tblgen pattern that emits a single instruction. Because of the way a lot of the the vcvt instructions are specified either in dagtodag code or with patterns that emit multiple instructions, they don't get marked as not having side effects.
This just marks them as not having side effects manually. It can help especially with instruction scheduling, to not create artificial barriers, but one of these tests also managed to produce fewer instructions.
Differential Revision: https://reviews.llvm.org/D81639
show more ...
|
Revision tags: llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
#
78bfe3ab |
| 08-Oct-2019 |
Kristof Beyls <kristof.beyls@arm.com> |
[ARM] Generate vcmp instead of vcmpe
Based on the discussion in http://lists.llvm.org/pipermail/llvm-dev/2019-October/135574.html, the conclusion was reached that the ARM backend should produce vcmp
[ARM] Generate vcmp instead of vcmpe
Based on the discussion in http://lists.llvm.org/pipermail/llvm-dev/2019-October/135574.html, the conclusion was reached that the ARM backend should produce vcmp instead of vcmpe instructions by default, i.e. not be producing an Invalid Operation exception when either arguments in a floating point compare are quiet NaNs.
In the future, after constrained floating point intrinsics for floating point compare have been introduced, vcmpe instructions probably should be produced for those intrinsics - depending on the exact semantics they'll be defined to have.
This patch logically consists of the following parts: - Revert http://llvm.org/viewvc/llvm-project?rev=294945&view=rev and http://llvm.org/viewvc/llvm-project?rev=294968&view=rev, which implemented fine-tuning for when to produce vcmpe (i.e. not do it for equality comparisons). The complexity introduced by those patches isn't needed anymore if we just always produce vcmp instead. Maybe these patches need to be reintroduced again once support is needed to map potential LLVM-IR constrained floating point compare intrinsics to the ARM instruction set. - Simply select vcmp, instead of vcmpe, see simple changes in lib/Target/ARM/ARMInstrVFP.td - Adapt lots of tests that tested for vcmpe (instead of vcmp). For all of these test, the intent of what is tested for isn't related to whether the vcmp should produce an Invalid Operation exception or not.
Fixes PR43374.
Differential Revision: https://reviews.llvm.org/D68463
llvm-svn: 374025
show more ...
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4 |
|
#
7b63a953 |
| 02-Jul-2019 |
Simon Tatham <simon.tatham@arm.com> |
[ARM] Stop using scalar FP instructions in integer-only MVE mode.
If you compile with `-mattr=+mve` (enabling integer MVE instructions but not floating-point ones), then the scalar FP //registers//
[ARM] Stop using scalar FP instructions in integer-only MVE mode.
If you compile with `-mattr=+mve` (enabling integer MVE instructions but not floating-point ones), then the scalar FP //registers// exist and it's legal to move things in and out of them, load and store them, but it's not legal to do arithmetic on them.
In D60708, the calls to `addRegisterClass` in ARMISelLowering that enable use of the scalar FP registers became conditionalised on `Subtarget->hasFPRegs()` instead of `Subtarget->hasVFP2Base()`, so that loads, stores and moves of those registers would work. But I didn't realise that that would also enable all the operations on those types by default.
Now, if the target doesn't have basic VFP, we follow up those `addRegisterClass` calls by turning back off all the nontrivial operations you can perform on f32 and f64. That causes several knock-on failures, which are fixed by allowing the `VMOVDcc` and `VMOVScc` instructions to be selected even if all you have is `HasFPRegs`, and adjusting several checks for 'is this a double in a single-precision-only world?' to the more general 'is this any FP type we can't do arithmetic on?'. Between those, the whole of the `float-ops.ll` and `fp16-instructions.ll` tests can now run in MVE-without-FP mode and generate correct-looking code.
One odd side effect is that I had to relax the check lines in that test so that they permit test functions like `add_f` to be generated as tailcalls to software FP library functions, instead of ordinary calls. Doing that is entirely legal, but the mystery is why this is the first RUN line that's needed the relaxation: on the usual kind of non-FP target, no tailcalls ever seem to be generated. Going by the llc messages, I think `SoftenFloatResult` must be perturbing the code generation in some way, but that's as much as I can guess.
Reviewers: dmgreen, ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63938
llvm-svn: 364909
show more ...
|
Revision tags: llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2 |
|
#
760df47b |
| 28-May-2019 |
Simon Tatham <simon.tatham@arm.com> |
[ARM] Replace fp-only-sp and d16 with fp64 and d32.
Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the arch
[ARM] Replace fp-only-sp and d16 with fp64 and d32.
Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits.
Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options.
A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list *before* rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature.
Reviewers: dmgreen, samparker, SjoerdMeijer
Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D60691
llvm-svn: 361845
show more ...
|
#
21542cd6 |
| 26-May-2019 |
David Green <david.green@arm.com> |
[ARM] Select a number of fp16 rounding functions
This add patterns for fp16 round and ceil etc. Same as the float and double patterns.
Differential Revision: https://reviews.llvm.org/D62326
llvm-s
[ARM] Select a number of fp16 rounding functions
This add patterns for fp16 round and ceil etc. Same as the float and double patterns.
Differential Revision: https://reviews.llvm.org/D62326
llvm-svn: 361718
show more ...
|
Revision tags: llvmorg-8.0.1-rc1 |
|
#
17586cda |
| 05-Apr-2019 |
Simon Pilgrim <llvm-dev@redking.me.uk> |
[SelectionDAG] Add fcmp UNDEF handling to SelectionDAG::FoldSetCC
Second half of PR40800, this patch adds DAG undef handling to fcmp instructions to match the behavior in llvm::ConstantFoldCompareIn
[SelectionDAG] Add fcmp UNDEF handling to SelectionDAG::FoldSetCC
Second half of PR40800, this patch adds DAG undef handling to fcmp instructions to match the behavior in llvm::ConstantFoldCompareInstruction, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.).
This involves a lot of tweaking to reduced tests as bugpoint loves to reduce fcmp arguments to undef........
Differential Revision: https://reviews.llvm.org/D60006
llvm-svn: 357765
show more ...
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
#
611b533f |
| 30-Oct-2018 |
Jonas Paulsson <paulsson@linux.vnet.ibm.com> |
[SchedModel] Fix for read advance cycles with implicit pseudo operands.
The SchedModel allows the addition of ReadAdvances to express that certain operands of the instructions are needed at a later
[SchedModel] Fix for read advance cycles with implicit pseudo operands.
The SchedModel allows the addition of ReadAdvances to express that certain operands of the instructions are needed at a later point than the others.
RegAlloc may add pseudo operands that are not part of the instruction descriptor, and therefore cannot have any read advance entries. This meant that in some cases the desired read advance was nullified by such a pseudo operand, which still had the original latency.
This patch fixes this by making sure that such pseudo operands get a zero latency during DAG construction.
Review: Matthias Braun, Ulrich Weigand. https://reviews.llvm.org/D49671
llvm-svn: 345606
show more ...
|
#
c045c557 |
| 29-Oct-2018 |
Matthias Braun <matze@braunis.de> |
Relax fast register allocator related test cases; NFC
- Relex hard coded registers and stack frame sizes - Some test cleanups - Change phi-dbg.ll to match on mir output after phi elimination instead
Relax fast register allocator related test cases; NFC
- Relex hard coded registers and stack frame sizes - Some test cleanups - Change phi-dbg.ll to match on mir output after phi elimination instead of going through the whole codegen pipeline.
This is in preparation for https://reviews.llvm.org/D52010 I'm committing all the test changes upfront that work before and after independently.
llvm-svn: 345532
show more ...
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
#
8652c53d |
| 15-May-2018 |
Sanjay Patel <spatel@rotateright.com> |
[DAG] propagate FMF for all FPMathOperators
This is a simple hack based on what's proposed in D37686, but we can extend it if needed in follow-ups. It gets us most of the FMF functionality that we
[DAG] propagate FMF for all FPMathOperators
This is a simple hack based on what's proposed in D37686, but we can extend it if needed in follow-ups. It gets us most of the FMF functionality that we want without adding any state bits to the flags. It also intentionally leaves out non-FMF flags (nsw, etc) to minimize the patch.
It should provide a superset of the functionality from D46563 - the extra tests show propagation and codegen diffs for fcmp, vecreduce, and FP libcalls.
The PPC log2() test shows the limits of this most basic approach - we only applied 'afn' to the last node created for the call. AFAIK, there aren't any libcall optimizations based on the flags currently, so that shouldn't make any difference.
Differential Revision: https://reviews.llvm.org/D46854
llvm-svn: 332358
show more ...
|
#
a79ea80d |
| 19-Apr-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] Add some missing FP16 VSEL test cases
Differential Revision: https://reviews.llvm.org/D45724
llvm-svn: 330313
|
Revision tags: llvmorg-6.0.1-rc1 |
|
#
834f7dc7 |
| 13-Apr-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] FP16 vmaxnm/vminnm scalar instructions
This adds code generation support for the FP16 vmaxnm/vminnm scalar instructions.
Differential Revision: https://reviews.llvm.org/D44675
llvm-svn: 3300
[ARM] FP16 vmaxnm/vminnm scalar instructions
This adds code generation support for the FP16 vmaxnm/vminnm scalar instructions.
Differential Revision: https://reviews.llvm.org/D44675
llvm-svn: 330034
show more ...
|
#
ac96d7c4 |
| 11-Apr-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] FP16 VSEL codegen
This is a follow up of rL327695 to instruction select more variants of VSELGT and VSELGE, for which it is necessary to custom lower SELECT.
More work is required in this are
[ARM] FP16 VSEL codegen
This is a follow up of rL327695 to instruction select more variants of VSELGT and VSELGE, for which it is necessary to custom lower SELECT.
More work is required in this area, which will be addressed soon: - more variants need to be regression tested, but this depends on the next point. - first LowerConstantFP need to be adjusted for fp16 values.
Differential Revision: https://reviews.llvm.org/D45205
llvm-svn: 329788
show more ...
|
Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1 |
|
#
d391a1a9 |
| 16-Mar-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] FP16 codegen support for VSEL
This implements lowering of SELECT_CC for f16s, which enables codegen of VSEL with f16 types.
Differential Revision: https://reviews.llvm.org/D44518
llvm-svn: 3
[ARM] FP16 codegen support for VSEL
This implements lowering of SELECT_CC for f16s, which enables codegen of VSEL with f16 types.
Differential Revision: https://reviews.llvm.org/D44518
llvm-svn: 327695
show more ...
|
Revision tags: llvmorg-6.0.0, llvmorg-6.0.0-rc3 |
|
#
4d5c4049 |
| 20-Feb-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] Lower BR_CC for f16
This case wasn't handled yet.
Differential Revision: https://reviews.llvm.org/D43508
llvm-svn: 325616
|
#
9430c8cd |
| 15-Feb-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] f16 vcmp fixes
This adds f16 VCMP match rules and fixes the test cases.
Differential Revision: https://reviews.llvm.org/D43291
llvm-svn: 325228
|
#
3b4294ed |
| 14-Feb-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] f16 stack spill/reloads
This adds support for handling f16 stack spills/reloads.
Differential Revision: https://reviews.llvm.org/D43280
llvm-svn: 325130
|
#
101ee430 |
| 13-Feb-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[Thumb] Handle addressing mode AddrMode5FP16
This addressing mode wasn't checked, so we were running in an assert.
Differential Revision: https://reviews.llvm.org/D43179
llvm-svn: 324996
|
Revision tags: llvmorg-6.0.0-rc2 |
|
#
8c073934 |
| 07-Feb-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] FP16 mov imm pattern
This is a follow up of r324321, adding a match pattern for mov with a FP16 immediate (also fixing operand vfp_f16imm that wasn't even compiling).
Differential Revision: h
[ARM] FP16 mov imm pattern
This is a follow up of r324321, adding a match pattern for mov with a FP16 immediate (also fixing operand vfp_f16imm that wasn't even compiling).
Differential Revision: https://reviews.llvm.org/D42973
llvm-svn: 324456
show more ...
|
#
d2718ba9 |
| 06-Feb-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] f16 conversions
This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion match patterns.
Differential Revision: https://reviews.llvm.org/D42954
llvm-svn: 324360
|
#
89ea2648 |
| 06-Feb-2018 |
Sjoerd Meijer <sjoerd.meijer@arm.com> |
[ARM] Armv8.2-A FP16 code generation (part 3/3)
This adds most of the FP16 codegen support, but these areas need further work:
- FP16 literals and immediates are not properly supported yet (e.g. li
[ARM] Armv8.2-A FP16 code generation (part 3/3)
This adds most of the FP16 codegen support, but these areas need further work:
- FP16 literals and immediates are not properly supported yet (e.g. literal pool needs work), - Instructions that are generated from intrinsics (e.g. vabs) haven't been added.
This will be addressed in follow-up patches.
Differential Revision: https://reviews.llvm.org/D42849
llvm-svn: 324321
show more ...
|