#
c24d5e28 |
| 08-Feb-2018 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Minor cleanups
Column limit, typo, unnecessary reference
llvm-svn: 324666
|
Revision tags: llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1 |
|
#
f1caa283 |
| 15-Dec-2017 |
Matthias Braun <matze@braunis.de> |
MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.
llvm-svn: 320884
|
#
a8a83d15 |
| 07-Dec-2017 |
Francis Visoiu Mistrih <francisvm@yahoo.com> |
[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the interfaces.
For MachineOperand::print, keep a simple v
[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the interfaces.
For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print.
Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr).
https://reviews.llvm.org/D40836
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'
llvm-svn: 320022
show more ...
|
Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3 |
|
#
8ae38bc0 |
| 05-Dec-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix SDWA crash on inline asm
This was only searching for explicit defs, and asserting for any implicit or variadic instruction defs, like inline asm.
llvm-svn: 319826
|
#
5f7f32c3 |
| 04-Dec-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] SDWA: add support for PRESERVE into SDWA peephole.
Summary:
Reviewers: arsenm, vpykhtin, rampitec
Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye
Differentia
[AMDGPU] SDWA: add support for PRESERVE into SDWA peephole.
Summary:
Reviewers: arsenm, vpykhtin, rampitec
Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D37817
llvm-svn: 319662
show more ...
|
#
93ef1458 |
| 30-Nov-2017 |
Francis Visoiu Mistrih <francisvm@yahoo.com> |
[CodeGen] Print "%vreg0" as "%0" in both MIR and debug output
As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the curr
[CodeGen] Print "%vreg0" as "%0" in both MIR and debug output
As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities).
Basically:
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g" * grep -nr '%vreg' . and fix if needed * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g" * grep -nr 'vreg[0-9]\+' . and fix if needed
Differential Revision: https://reviews.llvm.org/D40420
llvm-svn: 319427
show more ...
|
Revision tags: llvmorg-5.0.1-rc2 |
|
#
b3bde2ea |
| 17-Nov-2017 |
David Blaikie <dblaikie@gmail.com> |
Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, n
Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around).
llvm-svn: 318490
show more ...
|
Revision tags: llvmorg-5.0.1-rc1 |
|
#
f42074b6 |
| 10-Oct-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix missing skipFunction calls
llvm-svn: 315361
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2 |
|
#
59e12826 |
| 08-Aug-2017 |
Eugene Zelenko <eugene.zelenko@gmail.com> |
[AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 310328
|
Revision tags: llvmorg-5.0.0-rc1 |
|
#
a179d25b |
| 27-Jun-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions
Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This
[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions
Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if SDWA pseudo instruction has OMod operand and then copy it. 2. There were several problems with support of VOPC instructions in SDWA peephole pass.
Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl
Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye
Differential Revision: https://reviews.llvm.org/D34626
llvm-svn: 306413
show more ...
|
#
3c4933fc |
| 22-Jun-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] SDWA: add support for GFX9 in peephole pass
Summary: Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers. Added several s
[AMDGPU] SDWA: add support for GFX9 in peephole pass
Summary: Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers. Added several subtarget features for GFX9 SDWA. This diff also contains changes from D34026. Depends D34026
Reviewers: vpykhtin, rampitec, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D34241
llvm-svn: 305986
show more ...
|
#
549c89d2 |
| 21-Jun-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] SDWA: merge VI and GFX9 pseudo instructions
Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both o
[AMDGPU] SDWA: merge VI and GFX9 pseudo instructions
Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both of them. Added verifier to check that operands conform either VI or GFX9.
Reviewers: dp, arsenm, vpykhtin
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov
Differential Revision: https://reviews.llvm.org/D34026
llvm-svn: 305886
show more ...
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3 |
|
#
e4cda741 |
| 06-Jun-2017 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Return correct value from SDWA pass
Differential Revision: https://reviews.llvm.org/D33927
llvm-svn: 304805
|
#
6bda14b3 |
| 06-Jun-2017 |
Chandler Carruth <chandlerc@gmail.com> |
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
show more ...
|
#
03306604 |
| 03-Jun-2017 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Untangle SDWA pass from SIShrinkInstructions
Remove dependency of SDWA pass on SIShrinkInstructions. The goal is to move SDWA even higher in the stack to avoid second run of MachineLICM, Ma
[AMDGPU] Untangle SDWA pass from SIShrinkInstructions
Remove dependency of SDWA pass on SIShrinkInstructions. The goal is to move SDWA even higher in the stack to avoid second run of MachineLICM, MachineCSE and SIFoldOperands.
Also added handling to preserve original src modifiers.
Differential Revision: https://reviews.llvm.org/D33860
llvm-svn: 304665
show more ...
|
Revision tags: llvmorg-4.0.1-rc2 |
|
#
56ea488d |
| 30-May-2017 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Allow SDWA in instructions with immediates and SGPRs
An encoding does not allow to use SDWA in an instruction with scalar operands, either literals or SGPRs. That is however possible to cop
[AMDGPU] Allow SDWA in instructions with immediates and SGPRs
An encoding does not allow to use SDWA in an instruction with scalar operands, either literals or SGPRs. That is however possible to copy these operands into a VGPR first.
Several copies of the value are produced if multiple SDWA conversions were done. To cleanup MachineLICM (to hoist copies out of loops), MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace SGPR to VGPR copy with immediate copy right to the VGPR) runs are added after the SDWA pass.
Differential Revision: https://reviews.llvm.org/D33583
llvm-svn: 304219
show more ...
|
#
ebfdaf73 |
| 18-May-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] SDWA operands should not intersect with potential MIs
Summary: There should be no intesection between SDWA operands and potential MIs. E.g.: ``` v_and_b32 v0, 0xff, v1 -> src:v1 sel:BYTE_0
[AMDGPU] SDWA operands should not intersect with potential MIs
Summary: There should be no intesection between SDWA operands and potential MIs. E.g.: ``` v_and_b32 v0, 0xff, v1 -> src:v1 sel:BYTE_0 v_and_b32 v2, 0xff, v0 -> src:v0 sel:BYTE_0 v_add_u32 v3, v4, v2 ``` In that example it is possible that we would fold 2nd instruction into 3rd (v_add_u32_sdwa) and then try to fold 1st instruction into 2nd (that was already destroyed). So if SDWAOperand is also a potential MI then do not apply it.
Reviewers: vpykhtin, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D32804
llvm-svn: 303347
show more ...
|
Revision tags: llvmorg-4.0.1-rc1 |
|
#
aff8341d |
| 12-Apr-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] SDWA: make pass global
Summary: Remove checks for basic blocks.
Reviewers: vpykhtin, rampitec, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Re
[AMDGPU] SDWA: make pass global
Summary: Remove checks for basic blocks.
Reviewers: vpykhtin, rampitec, arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D31935
llvm-svn: 300040
show more ...
|
#
9fa16960 |
| 06-Apr-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] Resubmit SDWA peephole: enable by default Reviewers: vpykhtin, rampitec, arsenm
Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https
[AMDGPU] Resubmit SDWA peephole: enable by default Reviewers: vpykhtin, rampitec, arsenm
Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Differential Revision: https://reviews.llvm.org/D31671
llvm-svn: 299654
show more ...
|
#
27e0f8bc |
| 31-Mar-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns
Previously compiler often extracted common immediates into specific register, e.g.: ``` %vreg0 = S_MOV_B32 0xff; %vreg2 = V_AND
[AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns
Previously compiler often extracted common immediates into specific register, e.g.: ``` %vreg0 = S_MOV_B32 0xff; %vreg2 = V_AND_B32_e32 %vreg0, %vreg1 %vreg4 = V_AND_B32_e32 %vreg0, %vreg3 ``` Because of this SDWA peephole failed to find SDWA convertible pattern. E.g. in previous example this could be converted into 2 SDWA src operands: ``` SDWA src: %vreg2 src_sel:BYTE_0 SDWA src: %vreg4 src_sel:BYTE_0 ``` With this change peephole check if operand is either immediate or register that is copy of immediate.
llvm-svn: 299202
show more ...
|
#
f60ad58d |
| 21-Mar-2017 |
Sam Kolton <Sam.Kolton@amd.com> |
[ADMGPU] SDWA peephole optimization pass.
Summary: First iteration of SDWA peephole.
This pass tries to combine several instruction into one SDWA instruction. E.g. it converts: ''' V_LSHRREV_B3
[ADMGPU] SDWA peephole optimization pass.
Summary: First iteration of SDWA peephole.
This pass tries to combine several instruction into one SDWA instruction. E.g. it converts: ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1 V_ADD_I32_e32 %vreg2, %vreg0, %vreg3 V_LSHLREV_B32_e32 %vreg4, 16, %vreg2 ''' Into: ''' V_ADD_I32_sdwa %vreg4, %vreg1, %vreg3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD '''
Pass structure: 1. Iterate over machine instruction in basic block and try to apply "SDWA patterns" to each of them. SDWA patterns match machine instruction into either source or destination SDWA operand. E.g. ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1''' is matched to source SDWA operand '''%vreg1 src_sel:WORD_1'''. 2. Iterate over found SDWA operands and find instruction that could be potentially coverted into SDWA. E.g. for source SDWA operand potential instruction are all instruction in this basic block that uses '''%vreg0''' 3. Iterate over all potential instructions and check if they can be converted into SDWA. 4. Convert instructions to SDWA.
This review contains basic implementation of SDWA peephole pass. This pass requires additional testing fot both correctness and performance (no performance testing done). There are several ways this pass can be improved: 1. Make this pass work on whole function not only basic block. As I can see this can be done right now without changes to pass. 2. Introduce more SDWA patterns 3. Introduce mnemonics to limit when SDWA patterns should apply
Reviewers: vpykhtin, alex-t, arsenm, rampitec
Subscribers: wdng, nhaehnle, mgorny
Differential Revision: https://reviews.llvm.org/D30038
llvm-svn: 298365
show more ...
|