Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init |
|
#
9dcd75f8 |
| 22-Jul-2021 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Allow frontends to disable null export for pixel shaders
Disable null export (for kills) when a frontend defines a pixel shader as not exporting using amdgpu-color-export and amdgpu-depth-e
[AMDGPU] Allow frontends to disable null export for pixel shaders
Disable null export (for kills) when a frontend defines a pixel shader as not exporting using amdgpu-color-export and amdgpu-depth-export function attrbutes. This allows the generation of export free pixel shaders.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D105683
show more ...
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
#
98f48723 |
| 24-Jun-2021 |
Carl Ritson <carl.ritson@amd.com> |
[AMDGPU] Add 224-bit vector types and link 192-bit types to MVTs
Add SReg_224, VReg_224, AReg_224, etc. Link 224-bit types with v7i32/v7f32. Link existing 192-bit types to newly added v3i64/v3f64/v6
[AMDGPU] Add 224-bit vector types and link 192-bit types to MVTs
Add SReg_224, VReg_224, AReg_224, etc. Link 224-bit types with v7i32/v7f32. Link existing 192-bit types to newly added v3i64/v3f64/v6i32/v6f32.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D104622
show more ...
|
Revision tags: llvmorg-12.0.1-rc2 |
|
#
294efbbd |
| 08-Jun-2021 |
Brendon Cahoon <brendon.cahoon@amd.com> |
Reland "[AMDGPU] Add gfx1013 target"
This reverts commit 211e584fa2a4c032e4d573e7cdbffd622aad0a8f.
Fixed a use-after-free error that caused the sanitizers to fail.
|
#
211e584f |
| 08-Jun-2021 |
Brendon Cahoon <brendon.cahoon@amd.com> |
Revert "[AMDGPU] Add gfx1013 target"
This reverts commit ea10a86984ea73fcec3b12d22404a15f2f59b219.
A sanitizer buildbot reports an error.
|
#
ea10a869 |
| 01-Jun-2021 |
Brendon Cahoon <brendon.cahoon@amd.com> |
[AMDGPU] Add gfx1013 target
Differential Revision: https://reviews.llvm.org/D103663
|
Revision tags: llvmorg-12.0.1-rc1 |
|
#
6fb02596 |
| 12-Apr-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Add support for architected flat scratch
Add support for the readonly flat Scratch register initialized by the SPI.
Differential Revision: https://reviews.llvm.org/D102432
|
#
72d570ca |
| 30-Apr-2021 |
David Stuttard <david.stuttard@amd.com> |
[AMDGPU][AsmParser/Disassembler] Correct A16 and G16 handling
A16 support for image instructions assembly/disassembly (gfx10) was missing
Also refactor MIMG op addr size calcs to common function
W
[AMDGPU][AsmParser/Disassembler] Correct A16 and G16 handling
A16 support for image instructions assembly/disassembly (gfx10) was missing
Also refactor MIMG op addr size calcs to common function
We'd got 3 places where the same operation was being done.
One test is now marked XFAIL until a related codegen patch is in place
Differential Revision: https://reviews.llvm.org/D102231
Change-Id: I7e86e730ef8c71901457855cba570581f4f576bb
show more ...
|
#
4433f460 |
| 11-May-2021 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Fix extra waitcnt being added with BUFFER_INVL2
The waitcnt pass would increment the number of vmem events for some buffer invalidates that were not handled by the pass.
Reviewed By: rampi
[AMDGPU] Fix extra waitcnt being added with BUFFER_INVL2
The waitcnt pass would increment the number of vmem events for some buffer invalidates that were not handled by the pass.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D102252
show more ...
|
#
4fae63c6 |
| 08-Apr-2021 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU: Add gfx90c support to code object v2 for backwards compatibility
Differential Revision: https://reviews.llvm.org/D100126
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5 |
|
#
0f5ebbcc |
| 01-Apr-2021 |
Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com> |
[AMDGPU][MC] Added flag to identify VOP instructions which have a single variant
By convention, VOP1/2/C instructions which can be promoted to VOP3 have _e32 suffix while promoted instructions have
[AMDGPU][MC] Added flag to identify VOP instructions which have a single variant
By convention, VOP1/2/C instructions which can be promoted to VOP3 have _e32 suffix while promoted instructions have _e64 suffix. Instructions which have a single variant should have no _e32/_e64 suffix. Unfortunately there was no simple way to identify single variant instructions - it was implemented by a hack. See bug https://bugs.llvm.org/show_bug.cgi?id=39086.
This fix simplifies handling of single VOP instructions by adding a dedicated flag.
Differential Revision: https://reviews.llvm.org/D99408
show more ...
|
Revision tags: llvmorg-12.0.0-rc4 |
|
#
f4ace637 |
| 24-Mar-2021 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU: Add target id and code object v4 support
- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id) - Add code object v4 support (https://llvm.org/docs/AMDG
AMDGPU: Add target id and code object v4 support
- Add target id support (https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id) - Add code object v4 support (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object) - Add kernarg_size to kernel descriptor - Change trap handler ABI to no longer move queue pointer into s[0:1] - Cleanup ELF definitions - Add V2, V3, V4 suffixes to make a clear distinction for code object version - Consolidate note names
Differential Revision: https://reviews.llvm.org/D95638
show more ...
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
#
78b6d73a |
| 19-Feb-2021 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add even aligned VGPR/AGPR register classes
gfx90a operations require even aligned registers, but this was previously achieved by reserving registers inside the full class.
Ideally this wou
AMDGPU: Add even aligned VGPR/AGPR register classes
gfx90a operations require even aligned registers, but this was previously achieved by reserving registers inside the full class.
Ideally this would be captured in the static instruction definitions for the operands, and we would have different instructions per subtarget. The hackiest part of this is we need to manually reassign AGPR register classes after instruction selection (we get away without this for VGPRs since those types are actually registered for legal types).
show more ...
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init |
|
#
67f06208 |
| 25-Jan-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Update s_sendmsg messages
Update the list of s_sendmsg messages known to the assembler and disassembler and validate the ones that were added or removed in gfx9 and gfx10.
Differential Rev
[AMDGPU] Update s_sendmsg messages
Update the list of s_sendmsg messages known to the assembler and disassembler and validate the ones that were added or removed in gfx9 and gfx10.
Differential Revision: https://reviews.llvm.org/D97295
show more ...
|
#
a8d9d507 |
| 17-Feb-2021 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] gfx90a support
Differential Revision: https://reviews.llvm.org/D96906
|
Revision tags: llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
#
2291bd13 |
| 30-Nov-2020 |
Austin Kerbow <Austin.Kerbow@amd.com> |
[AMDGPU] Update subtarget features for new target ID support
Support for XNACK and SRAMECC is not static on some GPUs. We must be able to differentiate between different scenarios for these dynamic
[AMDGPU] Update subtarget features for new target ID support
Support for XNACK and SRAMECC is not static on some GPUs. We must be able to differentiate between different scenarios for these dynamic subtarget features.
The possible settings are:
- Unsupported: The GPU has no support for XNACK/SRAMECC. - Any: Preference is unspecified. Use conservative settings that can run anywhere. - Off: Request support for XNACK/SRAMECC Off - On: Request support for XNACK/SRAMECC On
GCNSubtarget will track the four options based on the following criteria. If the subtarget does not support XNACK/SRAMECC we say the setting is "Unsupported". If no subtarget features for XNACK/SRAMECC are requested we must support "Any" mode. If the subtarget features XNACK/SRAMECC exist in the feature string when initializing the subtarget, the settings are "On/Off".
The defaults are updated to be conservatively correct, meaning if no setting for XNACK or SRAMECC is explicitly requested, defaults will be used which generate code that can be run anywhere. This corresponds to the "Any" setting.
Differential Revision: https://reviews.llvm.org/D85882
show more ...
|
#
745064e3 |
| 26-Jan-2021 |
Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com> |
[AMDGPU][MC] Refactored exp tgt handling
Summary: - Separated tgt encoding from parsing; - Separated tgt decoding from printing; - Improved errors handling; - Disabled leading zeroes in index. The f
[AMDGPU][MC] Refactored exp tgt handling
Summary: - Separated tgt encoding from parsing; - Separated tgt decoding from printing; - Improved errors handling; - Disabled leading zeroes in index. The following code is no longer accepted: exp pos00 v3, v2, v1, v0
Reviewers: arsenm, rampitec, foad
Differential Revision: https://reviews.llvm.org/D95216
show more ...
|
#
560d7e04 |
| 20-Jan-2021 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
#
18cb7441 |
| 19-Jan-2021 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Simpler names for arch-specific ttmp registers. NFC.
Rename the *_gfx9_gfx10 ttmp registers to *_gfx9plus for simplicity, and use the corresponding isGFX9Plus predicate to decide when to us
[AMDGPU] Simpler names for arch-specific ttmp registers. NFC.
Rename the *_gfx9_gfx10 ttmp registers to *_gfx9plus for simplicity, and use the corresponding isGFX9Plus predicate to decide when to use them instead of the old *_vi versions.
Differential Revision: https://reviews.llvm.org/D94975
show more ...
|
#
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <daniil.fukalov@amd.com> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
#
91445979 |
| 15-Dec-2020 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Unify flat offset logic
Move getNumFlatOffsetBits from AMDGPUAsmParser and SIInstrInfo into AMDGPUBaseInfo.
Differential Revision: https://reviews.llvm.org/D93287
|
#
5733167f |
| 09-Dec-2020 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Mark amdgpu_gfx functions as module entry function
- Allows lds allocations - Writes resource usage into COMPUTE_PGM_RSRC1 registers in PAL metadata
Differential Revision: https://reviews.
[AMDGPU] Mark amdgpu_gfx functions as module entry function
- Allows lds allocations - Writes resource usage into COMPUTE_PGM_RSRC1 registers in PAL metadata
Differential Revision: https://reviews.llvm.org/D92946
show more ...
|
Revision tags: llvmorg-11.0.1-rc1 |
|
#
4f87d30a |
| 25-Nov-2020 |
Jay Foad <jay.foad@amd.com> |
[AMDGPU] Introduce and use isGFX10Plus. NFC.
It's more future-proof to use isGFX10Plus from the start, on the assumption that future architectures will be based on current architectures.
Also make
[AMDGPU] Introduce and use isGFX10Plus. NFC.
It's more future-proof to use isGFX10Plus from the start, on the assumption that future architectures will be based on current architectures.
Also make use of the existing isGFX9Plus in a few places.
Differential Revision: https://reviews.llvm.org/D92092
show more ...
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
#
a022b1cc |
| 16-Sep-2020 |
Sebastian Neubauer <sebastian.neubauer@amd.com> |
[AMDGPU] Add amdgpu_gfx calling convention
Add a calling convention called amdgpu_gfx for real function calls within graphics shaders. For the moment, this uses the same calling convention as other
[AMDGPU] Add amdgpu_gfx calling convention
Add a calling convention called amdgpu_gfx for real function calls within graphics shaders. For the moment, this uses the same calling convention as other calls in amdgpu, with registers excluded for return address, stack pointer and stack buffer descriptor.
Differential Revision: https://reviews.llvm.org/D88540
show more ...
|
#
3fdf3b15 |
| 14-Oct-2020 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU: Update AMDHSA code object version handling
Differential Revision: https://reviews.llvm.org/D89076
|
#
acce6b60 |
| 06-Oct-2020 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Create isGFX9Plus utility function
Introduce a utility function to make it more convenient to write code that is the same on the GFX9 and GFX10 subtargets.
Use isGFX9Plus in the AsmParser
[AMDGPU] Create isGFX9Plus utility function
Introduce a utility function to make it more convenient to write code that is the same on the GFX9 and GFX10 subtargets.
Use isGFX9Plus in the AsmParser for AMDGPU.
Authored By: Joe_Nash
Differential Revision: https://reviews.llvm.org/D88908
show more ...
|