|
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init |
|
| #
9e9907f1 |
| 17-Jan-2024 |
Fangrui Song <i@maskray.me> |
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.
For IR files without a target triple, -mtriple= specifies the full
target triple while
[AMDGPU,test] Change llc -march= to -mtriple= (#75982)
Similar to 806761a7629df268c8aed49657aeccffa6bca449.
For IR files without a target triple, -mtriple= specifies the full
target triple while -march= merely sets the architecture part of the
default target triple, leaving a target triple which may not make sense,
e.g. amdgpu-apple-darwin.
Therefore, -march= is error-prone and not recommended for tests without
a target triple. The issue has been benign as we recognize
$unknown-apple-darwin as ELF instead of rejecting it outrightly.
This patch changes AMDGPU tests to not rely on the default
OS/environment components. Tests that need fixes are not changed:
```
LLVM :: CodeGen/AMDGPU/fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fabs.ll
LLVM :: CodeGen/AMDGPU/floor.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll
LLVM :: CodeGen/AMDGPU/fneg-fabs.ll
LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll
LLVM :: CodeGen/AMDGPU/schedule-if-2.ll
```
show more ...
|
|
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7 |
|
| #
bdf2fbba |
| 19-Dec-2022 |
Nikita Popov <npopov@redhat.com> |
[AMDGPU] Convert some tests to opaque pointers (NFC)
|
|
Revision tags: llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init |
|
| #
79f67cae |
| 14-Jul-2020 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Rename add/sub with carry out instructions
The hardware has created a real mess in the naming for add/sub, which have been renamed basically every generation. Switch the carry out pseudos to
AMDGPU: Rename add/sub with carry out instructions
The hardware has created a real mess in the naming for add/sub, which have been renamed basically every generation. Switch the carry out pseudos to have the gfx9/gfx10 names. We were using the original SI/CI v_add_i32/v_sub_i32 names. Later targets reintroduced these names as carryless instructions with a saturating clamp bit, which we do not define. Do this rename so we can unambiguously add these missing instructions.
The carry-in versions should also be renamed, but at least those had a consistent _u32 name to begin with. The 16-bit instructions were also renamed, but aren't ambiguous.
This does regress assembler error message quality in some cases. In mismatched wave32/wave64 situations, this will switch from "unsupported instruction" to "invalid operand", with the error pointing at the wrong position. I couldn't quite follow how the assembler selects these, but the previous behavior seemed accidental to me. It looked like there was a partial attempt to handle this which was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it isn't used for anything).
show more ...
|
|
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3 |
|
| #
9cac4e6d |
| 19-Jun-2019 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Rename ExpandISelPseudo->FinalizeISel, delay register reservation
This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are call
Rename ExpandISelPseudo->FinalizeISel, delay register reservation
This allows targets to make more decisions about reserved registers after isel. For example, now it should be certain there are calls or stack objects in the frame or not, which could have been introduced by legalization.
Patch by Matthias Braun
llvm-svn: 363757
show more ...
|
|
Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2 |
|
| #
43e94b15 |
| 31-Jan-2018 |
Puyan Lotfi <puyan@puyan.org> |
Followup on Proposal to move MIR physical register namespace to '$' sigil.
Discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html
In preparation for adding support for n
Followup on Proposal to move MIR physical register namespace to '$' sigil.
Discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html
In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs.
llvm-svn: 323922
show more ...
|
|
Revision tags: llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1 |
|
| #
6c452834 |
| 24-Oct-2017 |
Justin Bogner <mail@justinbogner.com> |
MIR: Print the register class or bank in vreg defs
This updates the MIRPrinter to include the regclass when printing virtual register defs, which is already valid syntax for the parser. That is, giv
MIR: Print the register class or bank in vreg defs
This updates the MIRPrinter to include the regclass when printing virtual register defs, which is already valid syntax for the parser. That is, given 64 bit %0 and %1 in a "gpr" regbank,
%1(s64) = COPY %0(s64)
would now be written as
%1:gpr(s64) = COPY %0(s64)
While this change alone introduces a bit of redundancy with the registers block, it allows us to update the tests to be more concise and understandable and brings us closer to being able to remove the registers block completely.
Note: We generally only print the class in defs, but there is one exception. If there are uses without any defs whatsoever, we'll print the class on all uses. I'm not completely convinced this comes up in meaningful machine IR, but for now the MIRParser and MachineVerifier both accept that kind of stuff, so we don't want to have a situation where we can print something we can't parse.
llvm-svn: 316479
show more ...
|
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1 |
|
| #
3dbeefa9 |
| 21-Mar-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default ca
AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel).
llvm-svn: 298444
show more ...
|
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
| #
0d162b1c |
| 16-Nov-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass
Summary: 1. Don't try to copy values to and from the same register class. 2. Replace copies with of registers with immediate
AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass
Summary: 1. Don't try to copy values to and from the same register class. 2. Replace copies with of registers with immediate values with v_mov/s_mov instructions.
The main purpose of this change is to make MachineSink do a better job of determining when it is beneficial to split a critical edge, since the pass assumes that copies will become move instructions.
This prevents a regression in uniform-cfg.ll if we enable critical edge splitting for AMDGPU.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: https://reviews.llvm.org/D23408
llvm-svn: 287131
show more ...
|
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1 |
|
| #
545e558b |
| 13-Jul-2016 |
Quentin Colombet <qcolombet@apple.com> |
[MIR] Print on the given output instead of stderr.
Currently the MIR framework prints all its outputs (errors and actual representation) on stderr.
This patch fixes that by printing the regular out
[MIR] Print on the given output instead of stderr.
Currently the MIR framework prints all its outputs (errors and actual representation) on stderr.
This patch fixes that by printing the regular output in the output specified with -o.
Differential Revision: http://reviews.llvm.org/D22251
llvm-svn: 275314
show more ...
|
|
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
| #
3d1c1deb |
| 14-Apr-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Run SIFoldOperands after PeepholeOptimizer
PeepholeOptimizer cleans up redundant copies, which makes the operand folding more effective.
shader-db stats:
Totals: SGPRS: 34200 -> 34336 (0.4
AMDGPU: Run SIFoldOperands after PeepholeOptimizer
PeepholeOptimizer cleans up redundant copies, which makes the operand folding more effective.
shader-db stats:
Totals: SGPRS: 34200 -> 34336 (0.40 %) VGPRS: 22118 -> 21655 (-2.09 %) Code Size: 632144 -> 633460 (0.21 %) bytes LDS: 11 -> 11 (0.00 %) blocks Scratch: 10240 -> 11264 (10.00 %) bytes per wave Max Waves: 8822 -> 8918 (1.09 %) Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders: SGPRS: 7704 -> 7840 (1.77 %) VGPRS: 5169 -> 4706 (-8.96 %) Code Size: 234444 -> 235760 (0.56 %) bytes LDS: 2 -> 2 (0.00 %) blocks Scratch: 0 -> 1024 (0.00 %) bytes per wave Max Waves: 1188 -> 1284 (8.08 %) Wait states: 0 -> 0 (0.00 %)
Increases: SGPRS: 35 (0.01 %) VGPRS: 1 (0.00 %) Code Size: 59 (0.02 %) LDS: 0 (0.00 %) Scratch: 1 (0.00 %) Max Waves: 48 (0.02 %) Wait states: 0 (0.00 %)
Decreases: SGPRS: 26 (0.01 %) VGPRS: 54 (0.02 %) Code Size: 68 (0.03 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) Max Waves: 4 (0.00 %) Wait states: 0 (0.00 %)
llvm-svn: 266378
show more ...
|
| #
44e5483a |
| 12-Apr-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add volatile to test loads and stores
When the memory vectorizer is enabled, these tests break. These tests don't really care about the memory instructions, and it's easier to write check li
AMDGPU: Add volatile to test loads and stores
When the memory vectorizer is enabled, these tests break. These tests don't really care about the memory instructions, and it's easier to write check lines with the unmerged loads.
llvm-svn: 266071
show more ...
|
|
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1 |
|
| #
ea03cf2f |
| 30-Nov-2015 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Don't reserve SCRATCH_PTR input register
This hasn't been doing anything since using relocations was added.
llvm-svn: 254304
|
|
Revision tags: llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1 |
|
| #
e8c0891e |
| 21-Oct-2015 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix verifier error in SIFoldOperands
There may be other use operands that also need their kill flags cleared.
This happens in a few tests when SIFoldOperands is moved after PeepholeOptimize
AMDGPU: Fix verifier error in SIFoldOperands
There may be other use operands that also need their kill flags cleared.
This happens in a few tests when SIFoldOperands is moved after PeepholeOptimizer.
PeepholeOptimizer rewrites cases that look like: %vreg0 = ... %vreg1 = COPY %vreg0 use %vreg1<kill> %vreg2 = COPY %vreg0 use %vreg2<kill>
to use the earlier source to %vreg0 = ... use %vreg0 use %vreg0
Currently SIFoldOperands sees the copied registers, so there is only one use. So far I haven't managed to come up with a test that currently has multiple uses of a foldable VGPR -> VGPR copy.
llvm-svn: 250960
show more ...
|
|
Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, studio-1.4 |
|
| #
b4d0d6a3 |
| 31-Jul-2015 |
Alex Lorenz <arphaman@gmail.com> |
AMDGPU/SI: Add implicit register operands in the correct order.
This commit fixes a bug in the class 'SIInstrInfo' where the implicit register machine operands were added to a machine instruction in
AMDGPU/SI: Add implicit register operands in the correct order.
This commit fixes a bug in the class 'SIInstrInfo' where the implicit register machine operands were added to a machine instruction in an incorrect order - the implicit uses were added before the implicit defs.
I found this bug while working on moving the implicit register operand verification code from the MIR parser to the machine verifier.
This commit also makes the method 'addImplicitDefUseOperands' in the machine instruction class public so that it can be reused in the 'SIInstrInfo' class.
Reviewers: Matt Arsenault
Differential Revision: http://reviews.llvm.org/D11689
llvm-svn: 243799
show more ...
|