#
e823d92f |
| 18-Feb-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Merge initial gfx9 support
llvm-svn: 295554
|
#
19f98c6a |
| 15-Feb-2017 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Fix MaxWorkGroupsPerCU for large workgroups
This patch corrects the maximum workgroups per CU if we have big workgroups (more than 128). This calculation contributes to the occupancy calcul
[AMDGPU] Fix MaxWorkGroupsPerCU for large workgroups
This patch corrects the maximum workgroups per CU if we have big workgroups (more than 128). This calculation contributes to the occupancy calculation in respect to LDS size.
Differential Revision: https://reviews.llvm.org/D29974
llvm-svn: 295134
show more ...
|
#
d96089b2 |
| 14-Feb-2017 |
Eugene Zelenko <eugene.zelenko@gmail.com> |
[MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
Same changes in files affected by reduced MC headers dependencies.
llvm-svn: 295009
|
#
fd871377 |
| 09-Feb-2017 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Calculate number of min/max SGPRs/VGPRs for WavesPerEU instead of using switch statement
Differential Revision: https://reviews.llvm.org/D29741
llvm-svn: 294627
|
Revision tags: llvmorg-4.0.0-rc2 |
|
#
9f89ede1 |
| 08-Feb-2017 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Add target information that is required by tools to metadata
Differential Revision: https://reviews.llvm.org/D28760#fb670e28
llvm-svn: 294449
|
#
43b61561 |
| 03-Feb-2017 |
Artem Tamazov <artem.tamazov@amd.com> |
[AMDGPU][mc] Fix AddressSanitizer leftover issue in gfx7_asm_all test
Issue occurs when assembling "ds_ordered_count v0, v0 gds".
llvm-svn: 294004
|
#
ca16621b |
| 30-Jan-2017 |
Tom Stellard <thomas.stellard@amd.com> |
Re-commit AMDGPU/GlobalISel: Add support for simple shaders
Fix build when global-isel is disabled and fix a warning.
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Revi
Re-commit AMDGPU/GlobalISel: Add support for simple shaders
Fix build when global-isel is disabled and fix a warning.
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D26730
llvm-svn: 293551
show more ...
|
#
7a19d56f |
| 30-Jan-2017 |
Tom Stellard <thomas.stellard@amd.com> |
Revert "AMDGPU/GlobalISel: Add support for simple shaders"
This reverts commit r293503.
Revert while I investigate some of the buildbot failures.
llvm-svn: 293509
|
#
e48f60ae |
| 30-Jan-2017 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/GlobalISel: Add support for simple shaders
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
Subscribers: meh
AMDGPU/GlobalISel: Add support for simple shaders
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.
Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm
Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D26730
llvm-svn: 293503
show more ...
|
#
33b01e9c |
| 27-Jan-2017 |
Artem Tamazov <artem.tamazov@amd.com> |
[AMDGPU][mc] Fix memory corruption uncovered by AddressSanitizer during coverage/smoke Gfx7/8 testing.
Coverage/smoke Gfx7/8 tests were committed r292922 but then reverted by r292974 due to AddressS
[AMDGPU][mc] Fix memory corruption uncovered by AddressSanitizer during coverage/smoke Gfx7/8 testing.
Coverage/smoke Gfx7/8 tests were committed r292922 but then reverted by r292974 due to AddressSanitizer failure, which is fixed by this patch. Tests to be re-committed soon.
llvm-svn: 293338
show more ...
|
#
08efb7eb |
| 27-Jan-2017 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Different
AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D29068
llvm-svn: 293321
show more ...
|
#
5d910194 |
| 25-Jan-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Set call_convention bit in kernel_code_t
According to the documentation this is supposed to be -1 if indirect calls are not supported.
llvm-svn: 293081
|
Revision tags: llvmorg-4.0.0-rc1 |
|
#
9dffada9 |
| 17-Jan-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] Assembler: fix v_mac_f16 immediates
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.
[AMDGPU] Assembler: fix v_mac_f16 immediates
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D28802
llvm-svn: 292224
show more ...
|
#
31dbb039 |
| 06-Jan-2017 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Remove extra semicolon. NFC
llvm-svn: 291246
|
#
4bd72361 |
| 10-Dec-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix handling of 16-bit immediates
Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determi
AMDGPU: Fix handling of 16-bit immediates
Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type.
Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER.
The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already.
llvm-svn: 289306
show more ...
|
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3 |
|
#
26faed39 |
| 05-Dec-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Consolidate inline immediate predicate functions
llvm-svn: 288718
|
Revision tags: llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
#
b133fbb9 |
| 27-Oct-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Handle hazard with > 8 byte VMEM stores
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D25577
l
AMDGPU/SI: Handle hazard with > 8 byte VMEM stores
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D25577
llvm-svn: 285359
show more ...
|
#
94add85a |
| 26-Oct-2016 |
Yaxun Liu <Yaxun.Liu@amd.com> |
AMDGPU: Refactor processor definition to use ISA version features
Add missing ISA versions 7.0.2/8.0.4/8.1.0. to backend.
Refactor processor definition to use ISA version features.
Fixed ISA versi
AMDGPU: Refactor processor definition to use ISA version features
Add missing ISA versions 7.0.2/8.0.4/8.1.0. to backend.
Refactor processor definition to use ISA version features.
Fixed ISA version for stoney.
Based on Laurent Morichetti's patch.
Differential Revision: https://reviews.llvm.org/D25919
llvm-svn: 285210
show more ...
|
#
08326b62 |
| 20-Oct-2016 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Emit constant address space data in .rodata section and use relocations instead of fixups (amdhsa only)
Differential Revision: https://reviews.llvm.org/D25693
llvm-svn: 284759
|
#
c8715503 |
| 19-Oct-2016 |
Krzysztof Parzyszek <kparzysz@codeaurora.org> |
[AMDGPU] Stop using MCRegisterClass::getSize()
Differential Review: https://reviews.llvm.org/D24675
llvm-svn: 284619
|
#
cdd45476 |
| 11-Oct-2016 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Refactor waitcnt encoding
- Refactor bit packing/unpacking - Calculate bit mask given bit shift and bit width - Introduce function for decoding bits of waitcnt - Introduce function for enco
[AMDGPU] Refactor waitcnt encoding
- Refactor bit packing/unpacking - Calculate bit mask given bit shift and bit width - Introduce function for decoding bits of waitcnt - Introduce function for encoding bits of waitcnt - Introduce function for getting waitcnt mask (instead of using bare numbers) - Introduce function fot getting max waitcnt(s) (instead of using bare numbers)
Differential Revision: https://reviews.llvm.org/D25298
llvm-svn: 283919
show more ...
|
#
98317d20 |
| 11-Oct-2016 |
Changpeng Fang <changpeng.fang@gmail.com> |
AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11.
Differential Revision: http://reviews.llvm.org/D25454
Reviewers: tstellarAMD
llvm-svn: 283893
|
#
a3ec5c10 |
| 07-Oct-2016 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx to AMDGPUBaseInfo.h
Reviewers: artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
[AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx to AMDGPUBaseInfo.h
Reviewers: artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D25084
llvm-svn: 283560
show more ...
|
#
836cbff6 |
| 30-Sep-2016 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Choose VMCNT, EXPCNT, LGKMCNT masks and shifts based on the isa version
Differential Revision: https://reviews.llvm.org/D24973
llvm-svn: 282877
|
#
1eeb11bf |
| 09-Sep-2016 |
Sam Kolton <Sam.Kolton@amd.com> |
AMDGPU] Assembler: better support for immediate literals in assembler.
Summary: Prevously assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we
AMDGPU] Assembler: better support for immediate literals in assembler.
Summary: Prevously assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least.
With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values either integer or floating-point. Then we convert parsed literals to correct form based on information about type of operand parsed (was literal floating or binary) and type of expected instruction operands (is this f32/64 or b32/64 instruction). Here are rules how we convert literals: - We parsed fp literal: - Instruction expects 64-bit operand: - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5) - then we do nothing this literal - Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5) - report error - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant - If instruction expect fp operand type (f64) - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5) - If so then do nothing - Else (e.g. v_fract_f64 v[0:1], 3.1415) - report warning that low 32 bits will be set to zeroes and precision will be lost - set low 32 bits of literal to zeroes - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5) - report error as it is unclear how to encode this literal - Instruction expects 32-bit operand: - Convert parsed 64 bit fp literal to 32 bit fp. Allow lose of precision but not overflow or underflow - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5) - do nothing - Else report error - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0) - Parsed binary literal: - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35) - do nothing - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35) - report error - Else, literal is not-inlinable and we are not required to inline it - Are high 32 bit of literal zeroes or same as sign bit (32 bit) - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef) - Else - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0)
For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types: ''' enum OperandType { OPERAND_REG_IMM32_INT, OPERAND_REG_IMM32_FP, OPERAND_REG_INLINE_C_INT, OPERAND_REG_INLINE_C_FP, } '''
This is not working yet: - Several tests are failing - Problems with predicate methods for inline immediates - LLVM generated assembler parts try to select e64 encoding before e32. More changes are required for several AsmOperands.
Reviewers: vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl, artem.tamazov
Differential Revision: https://reviews.llvm.org/D22922
llvm-svn: 281050
show more ...
|