Revision tags: llvmorg-5.0.0-rc2 |
|
#
1d6317c3 |
| 02-Aug-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix emitting encoded calls
This was failing on out of bounds access to the extra operands on the s_swappc_b64 beyond those in the instruction definition.
This was working, but somehow regressed within the past few weeks, although I don't see any obvious commit.
llvm-svn: 309782
|
#
6ed7b9bf |
| 02-Aug-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Analyze callee resource usage in AsmPrinter
llvm-svn: 309781
|
#
b62a4eb5 |
| 01-Aug-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Initial implementation of calls
Includes a hack to fix the type selected for the GlobalAddress of the function, which will be fixed by changing the default datalayout to use generic pointers for 0.
llvm-svn: 309732
|
Revision tags: llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3 |
|
#
6bda14b3 |
| 06-Jun-2017 |
Chandler Carruth <chandlerc@gmail.com> |
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
|
Revision tags: llvmorg-4.0.1-rc2 |
|
#
2b1f9aa5 |
| 17-May-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Start defining a calling convention
Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely neither do sret or other on-stack return values.
llvm-svn: 303308
|
Revision tags: llvmorg-4.0.1-rc1 |
|
#
15a96b1d |
| 21-Apr-2017 |
Yaxun Liu <Yaxun.Liu@amd.com> |
[AMDGPU] Handle SI_MASKED_UNREACHABLE in instruction emitter
SI_MASKED_UNREACHABLE does not have machine instruction encoding. It needs special handling in AMDGPUAsmPrinter::EmitInstruction like some other pseudo instructions.
This patch fixes compilation failure of RadeonRays.
Differential Revision: https://reviews.llvm.org/D32364
llvm-svn: 301025
|
#
5b20fbb7 |
| 21-Mar-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Rename SI_RETURN
This is used for a specific type of return to a shader part's epilog code. Rename to try avoiding confusion from a true call's return.
llvm-svn: 298452
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2 |
|
#
8f844f39 |
| 07-Feb-2017 |
Yaxun Liu <Yaxun.Liu@amd.com> |
[AMDGPU] Lower null pointers in static variable initializer
For amdgcn target Clang generates addrspacecast to represent null pointers in private and local address spaces.
In LLVM codegen, the static variable initializer is lowered by virtual function AsmPrinter::lowerConstant which is target generic. Since addrspacecast is target specific, AsmPrinter::lowerConst
This patch overrides AsmPrinter::lowerConstant with AMDGPUAsmPrinter::lowerConstant, which is able to lower the target-specific addrspacecast in the null pointer representation so that -1 is co
Differential Revision: https://reviews.llvm.org/D29284
llvm-svn: 294265
|
#
8c209aa8 |
| 28-Jan-2017 |
Matthias Braun <matze@braunis.de> |
Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize them (this should just consistently implement the things discussed in http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html
For reference:
- Public headers should just declare the dump() method but not use LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
    #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
    LLVM_DUMP_METHOD void MyClass::dump() {
      // print stuff to dbgs()...
    }
    #endif
llvm-svn: 293359
|
Revision tags: llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
#
ea91cca5 |
| 15-Nov-2016 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Add wave barrier builtin
The wave barrier represents the discardable barrier. Its main purpose is to carry convergent attribute, thus preventing illegal CFG optimizations. All lanes in a wave come to convergence point simultaneously with SIMT, thus no special instruction is needed in the ISA. The barrier is discarded during code generation.
Differential Revision: https://reviews.llvm.org/D26585
llvm-svn: 287007
|
#
c96b5d70 |
| 14-Oct-2016 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external and global address space variables
Differential Revision: https://reviews.llvm.org/D25562
llvm-svn: 284196
|
#
11f74020 |
| 06-Oct-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "AMDGPU: Support using tablegened MC pseudo expansions"
Fix bad merge
llvm-svn: 283470
|
#
cbc879ee |
| 06-Oct-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Revert "AMDGPU: Support using tablegened MC pseudo expansions"
llvm-svn: 283469
|
#
d20a2dd7 |
| 06-Oct-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Support using tablegened MC pseudo expansions
Make the necessary refactorings to make use of PseudoInstExpansion
llvm-svn: 283467
|
#
6bc43d86 |
| 06-Oct-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
BranchRelaxation: Support expanding unconditional branches
AMDGPU needs to expand unconditional branches in a new block with an indirect branch.
llvm-svn: 283464
|
#
1b9748c6 |
| 26-Sep-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Don't crash on anonymous GlobalValues
Summary: We need to call AsmPrinter::getNameWithPrefix() in order to handle anonymous GlobalValues (e.g. @0, @1).
Reviewers: arsenm, b-sumner
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D24865
llvm-svn: 282420
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1 |
|
#
418beb76 |
| 13-Jul-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Add support for R_AMDGPU_GOTPCREL
Reviewers: rafael, ruiu, tony-tye, arsenm, kzhuravl
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21484
llvm-svn: 275268
|
#
a74374a8 |
| 08-Jul-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Move si_mask_branch register operand to be a use
llvm-svn: 274818
|
#
9cfc75c2 |
| 30-Jun-2016 |
Duncan P. N. Exon Smith <dexonsmith@apple.com> |
CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement.
Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary.
This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader.
As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753.
Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on.
llvm-svn: 274189
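The one non-mechanical change in the commit above, a reference-taking entry point that forwards either the same instruction or its bundle leader to an `Impl` function, can be sketched roughly as follows. The `Instr` type and both function names are hypothetical stand-ins, not LLVM's `MachineInstr` or the ARM backend's actual code.

```cpp
// Hypothetical stand-in for an instruction; not LLVM's MachineInstr.
struct Instr {
  int Latency;               // made-up per-instruction latency
  const Instr *BundleLeader; // null when the instruction is not bundled
};

// The Impl function takes a reference, so callers must already hold a
// valid instruction; there is no null case to handle here.
int getLatencyImpl(const Instr &MI) { return MI.Latency; }

// The public entry point also takes a reference. Instead of rewriting a
// pointer argument in place, it forwards either the same instruction or
// the bundle leader to the Impl function.
int getLatency(const Instr &MI) {
  if (MI.BundleLeader)
    return getLatencyImpl(*MI.BundleLeader);
  return getLatencyImpl(MI);
}
```

With a reference parameter the "must be a valid instruction" contract is visible in the signature, which is the API improvement the commit message describes.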
|
#
43e92fe3 |
| 24-Jun-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target.
llvm-svn: 273652
|
#
9babdf42 |
| 22-Jun-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix verifier errors in SILowerControlFlow
The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking.
Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return.
llvm-svn: 273467
|
#
bf3e6e5b |
| 14-Jun-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Refactor fixup handling for constant addrspace variables
Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup.
This refactoring will make it easier to use the same code for global address space variables and also simplifies the code.
Re-commit this after fixing a bug where we were trying to use a reference to a Triple object that had already been destroyed.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21154
llvm-svn: 272705
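The effect of folding the 4 byte offset into the global address, rather than into the fixup, can be illustrated with plain arithmetic. This is a sketch under stated assumptions: it assumes the fixup location sits 4 bytes past the address the code uses as its PC base (the commit message only says that a 4 byte offset is required), and all addresses and the `resolvePCRel` helper are made up for the example.

```cpp
#include <cstdint>

// A generic PC-relative fixup resolves to: symbol + addend - fixup location.
constexpr std::int64_t resolvePCRel(std::uint64_t Symbol, std::int64_t Addend,
                                    std::uint64_t FixupLoc) {
  return static_cast<std::int64_t>(Symbol) + Addend -
         static_cast<std::int64_t>(FixupLoc);
}

// Made-up addresses for illustration only.
constexpr std::uint64_t PCBase = 0x1000;        // value the code treats as PC
constexpr std::uint64_t FixupLoc = PCBase + 4;  // assumed: fixup is 4 bytes on
constexpr std::uint64_t Symbol = 0x2000;        // address of the variable

// With no addend, PC base plus the resolved value lands 4 bytes short.
static_assert(PCBase + resolvePCRel(Symbol, 0, FixupLoc) == Symbol - 4,
              "unadjusted fixup is 4 bytes short");

// With the +4 folded into the address (the addend), the result is exact,
// and the fixup itself stays a standard PC-relative one.
static_assert(PCBase + resolvePCRel(Symbol, 4, FixupLoc) == Symbol,
              "adjusted address resolves to the symbol");
```

Because the adjustment lives in the addend, the fixup kind needs no target-specific special case, which matches the motivation given in the commit message.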
|
#
b1a523fa |
| 14-Jun-2016 |
Tom Stellard <thomas.stellard@amd.com> |
Revert "AMDGPU/SI: Refactor fixup handling for constant addrspace variables"
This reverts commit r272675.
llvm-svn: 272677
|
#
5e6298b0 |
| 14-Jun-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Refactor fixup handling for constant addrspace variables
Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup.
This refactoring will make it easier to use the same code for global address space variables and also simplifies the code.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21154
llvm-svn: 272675
|
#
f3af8414 |
| 10-Jun-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Don't use fixup_si_rodata for scratch rsrc relocations
Summary: We need to set the fixup type to FK_Data_4 for the SCRATCH_RSRC_DWORD[01] symbols, since these require absolute relocations, and fixup_si_rodata is for relative relocations.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21153
llvm-svn: 272417
|