History log of /llvm-project/llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp (Results 26 – 50 of 71)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-11.1.0-rc1
# 314e29ed 07-Jan-2021 Joe Nash <Joseph.Nash@amd.com>

[AMDGPU] Add _e64 suffix to VOP3 Insts

Previously, instructions which could be
expressed as VOP3 in addition to another
encoding had a _e64 suffix on the tablegen
record name, while those
only avail

[AMDGPU] Add _e64 suffix to VOP3 Insts

Previously, instructions which could be
expressed as VOP3 in addition to another
encoding had a _e64 suffix on the tablegen
record name, while those
only available as VOP3 did not. With this
patch, all VOP3s will have the _e64 suffix.
The assembly does not change, only the mir.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D94341

Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423

show more ...


# 6a87e9b0 25-Dec-2020 dfukalov <daniil.fukalov@amd.com>

[NFC][AMDGPU] Reduce include files dependency.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 34978602 20-Aug-2020 Jay Foad <jay.foad@amd.com>

[AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister

... in favour of the isPhysical/isVirtual methods.


Revision tags: llvmorg-11.0.0-rc2
# 0462aef5 13-Aug-2020 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Inhibit SDWA if target instruction has FI

Differential Revision: https://reviews.llvm.org/D85918


Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init
# 79f67cae 14-Jul-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Rename add/sub with carry out instructions

The hardware has created a real mess in the naming for add/sub, which
have been renamed basically every generation. Switch the carry out
pseudos to

AMDGPU: Rename add/sub with carry out instructions

The hardware has created a real mess in the naming for add/sub, which
have been renamed basically every generation. Switch the carry out
pseudos to have the gfx9/gfx10 names. We were using the original SI/CI
v_add_i32/v_sub_i32 names. Later targets reintroduced these names as
carryless instructions with a saturating clamp bit, which we do not
define. Do this rename so we can unambiguously add these missing
instructions.

The carry-in versions should also be renamed, but at least those had a
consistent _u32 name to begin with. The 16-bit instructions were also
renamed, but aren't ambiguous.

This does regress assembler error message quality in some cases. In
mismatched wave32/wave64 situations, this will switch from
"unsupported instruction" to "invalid operand", with the error
pointing at the wrong position. I couldn't quite follow how the
assembler selects these, but the previous behavior seemed accidental
to me. It looked like there was a partial attempt to handle this which
was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it
isn't used for anything).

show more ...


Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 07cd19ef 27-May-2020 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Fix dropping MI flags when rewriting instructions

All 3 passes that change instruction encodings were dropping MI
flags. This avoids scheduling regressions caused by setting
mayRaiseFPExcept

AMDGPU: Fix dropping MI flags when rewriting instructions

All 3 passes that change instruction encodings were dropping MI
flags. This avoids scheduling regressions caused by setting
mayRaiseFPExceptions on FP instructions for non-strictfp functions.

show more ...


Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2
# a19de320 12-Feb-2020 Hans Wennborg <hans@chromium.org>

Fix unused function warning (PR44808)


Revision tags: llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2
# 3d5ba7c6 27-Nov-2019 Tim Renouf <tim.renouf@amd.com>

AMDGPU: Fixed indeterminate map iteration in SIPeepholeSDWA

Differential Revision: https://reviews.llvm.org/D70783

Change-Id: Ic26f915a4acb4c00ecefa9d09d7c24cec370ed06


Revision tags: llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3
# 0c476111 15-Aug-2019 Daniel Sanders <daniel_l_sanders@apple.com>

Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM

Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Re

Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM

Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

llvm-svn: 369041

show more ...


# 0eaee545 15-Aug-2019 Jonas Devlieghere <jonas@devlieghere.com>

[llvm] Migrate llvm::make_unique to std::make_unique

Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of

[llvm] Migrate llvm::make_unique to std::make_unique

Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013

show more ...


Revision tags: llvmorg-9.0.0-rc2
# 2bea69bf 01-Aug-2019 Daniel Sanders <daniel_l_sanders@apple.com>

Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC

llvm-svn: 367633


Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# 52500216 16-Jun-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx10 conditional registers handling

This is cpp source part of wave32 support, excluding overriden
getRegClass().

Differential Revision: https://reviews.llvm.org/D63351

llvm-svn: 363513


Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1
# 28a1936f 04-May-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx1010: use fmac instructions

Differential Revision: https://reviews.llvm.org/D61527

llvm-svn: 359959


Revision tags: llvmorg-8.0.0
# da644c02 13-Mar-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Silence gcc 7 warnings

Differential Revision: https://reviews.llvm.org/D59330

llvm-svn: 356100


Revision tags: llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# 2946cd70 19-Jan-2019 Chandler Carruth <chandlerc@gmail.com>

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the ne

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636

show more ...


Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3
# 16de4fd2 03-Dec-2018 Ron Lieberman <ronlieb.g@gmail.com>

[AMDGPU] Add sdwa support for ADD|SUB U64 decomposed Pseudos

The introduction of S_{ADD|SUB}_U64_PSEUDO instructions which are decomposed
into VOP3 instruction pairs for S_ADD_U64_PSEUDO:
V_ADD_I3

[AMDGPU] Add sdwa support for ADD|SUB U64 decomposed Pseudos

The introduction of S_{ADD|SUB}_U64_PSEUDO instructions which are decomposed
into VOP3 instruction pairs for S_ADD_U64_PSEUDO:
V_ADD_I32_e64
V_ADDC_U32_e64
and for S_SUB_U64_PSEUDO
V_SUB_I32_e64
V_SUBB_U32_e64
preclude the use of SDWA to encode a constant.
SDWA: Sub-Dword addressing is supported on VOP1 and VOP2 instructions,
but not on VOP3 instructions.

We desire to fold the bit-and operand into the instruction encoding
for the V_ADD_I32 instruction. This requires that we transform the
VOP3 into a VOP2 form of the instruction (_e32).
%19:vgpr_32 = V_AND_B32_e32 255,
killed %16:vgpr_32, implicit $exec
%47:vgpr_32, %49:sreg_64_xexec = V_ADD_I32_e64
%26.sub0:vreg_64, %19:vgpr_32, implicit $exec
%48:vgpr_32, dead %50:sreg_64_xexec = V_ADDC_U32_e64
%26.sub1:vreg_64, %54:vgpr_32, killed %49:sreg_64_xexec, implicit $exec

which then allows the SDWA encoding and becomes
%47:vgpr_32 = V_ADD_I32_sdwa
0, %26.sub0:vreg_64, 0, killed %16:vgpr_32, 0, 6, 0, 6, 0,
implicit-def $vcc, implicit $exec
%48:vgpr_32 = V_ADDC_U32_e32
0, %26.sub1:vreg_64, implicit-def $vcc, implicit $vcc, implicit $exec


Differential Revision: https://reviews.llvm.org/D54882

llvm-svn: 348132

show more ...


Revision tags: llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1
# 5bfbae5c 11-Jul-2018 Tom Stellard <tstellar@redhat.com>

AMDGPU: Refactor Subtarget classes

Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Me

AMDGPU: Refactor Subtarget classes

Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

llvm-svn: 336851

show more ...


Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2
# 44b30b45 22-May-2018 Tom Stellard <tstellar@redhat.com>

AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers

Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are hu

AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers

Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are huge so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

llvm-svn: 332930

show more ...


# d34e60ca 14-May-2018 Nicola Zaghen <nicola.zaghen@imgtec.com>

Rename DEBUG macro to LLVM_DEBUG.

The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/

Rename DEBUG macro to LLVM_DEBUG.

The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240

show more ...


# 432a3883 30-Apr-2018 Nico Weber <nicolasweber@gmx.de>

IWYU for llvm-config.h in llvm, additions.

See r331124 for how I made a list of files missing the include.
I then ran this Python script:

for f in open('filelist.txt'):
f = f.strip()

IWYU for llvm-config.h in llvm, additions.

See r331124 for how I made a list of files missing the include.
I then ran this Python script:

for f in open('filelist.txt'):
f = f.strip()
fl = open(f).readlines()

found = False
for i in xrange(len(fl)):
p = '#include "llvm/'
if not fl[i].startswith(p):
continue
if fl[i][len(p):] > 'Config':
fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
found = True
break
if not found:
print 'not found', f
else:
open(f, 'w').write(''.join(fl))

and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.

No intended behavior change.

llvm-svn: 331184

show more ...


# cbebba49 23-Apr-2018 Nicolai Haehnle <nhaehnle@gmail.com>

AMDGPU: Fix SDWA peephole for V_AND_B32

Summary:
Found by inspection. We care about the operand that *doesn't*
contain the immediate.

I believe this is currently not hit because we fold 0xff / 0xff

AMDGPU: Fix SDWA peephole for V_AND_B32

Summary:
Found by inspection. We care about the operand that *doesn't*
contain the immediate.

I believe this is currently not hit because we fold 0xff / 0xffff
immediates only later.

Change-Id: Ic3cf8538bc7da5eff3200d96eccf9d339e6345a7

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45886

llvm-svn: 330586

show more ...


Revision tags: llvmorg-6.0.1-rc1
# 4c45e6ff 16-Apr-2018 Dmitry Preobrazhensky <dmitry.preobrazhensky@amd.com>

[AMDGPU][MC][VI][GFX9] Added support of SDWA/DPP for v_cndmask_b32

See bug 36356: https://bugs.llvm.org/show_bug.cgi?id=36356

Differential Revision: https://reviews.llvm.org/D45446

Reviewers: arte

[AMDGPU][MC][VI][GFX9] Added support of SDWA/DPP for v_cndmask_b32

See bug 36356: https://bugs.llvm.org/show_bug.cgi?id=36356

Differential Revision: https://reviews.llvm.org/D45446

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 330123

show more ...


# 59e5ef79 30-Mar-2018 Michael Bedy <mjbedy@gmail.com>

[AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE.

Summary:
The phase attempts to transform operations that extract a portion of a value
into an SDWA src operand in cases wh

[AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE.

Summary:
The phase attempts to transform operations that extract a portion of a value
into an SDWA src operand in cases where that value is used only once. It
was not prepared for this use to be the preserved portion of a value for
dst:UNUSED_PRESERVE, resulting in a crash or assert.

This change either rejects the illegal SDWA attempt, or in the case where
dst:WORD_1 and the src_sel would be WORD_0, removes the unneeded
extract instruction.

Reviewers: arsenm, #amdgpu

Reviewed By: arsenm, #amdgpu

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44364

llvm-svn: 328856

show more ...


Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1
# 80cf9ff5 11-Mar-2018 Michael Bedy <mjbedy@gmail.com>

Test commit - change comment slightly.

llvm-svn: 327234


Revision tags: llvmorg-6.0.0, llvmorg-6.0.0-rc3
# 9c2f3c48 08-Feb-2018 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Process SDWA block at a time

Right now this loops over the entire function every time there
is a change, which is not very efficient. There's no practical
reason to track this so globally, s

AMDGPU: Process SDWA block at a time

Right now this loops over the entire function every time there
is a change, which is not very efficient. There's no practical
reason to track this so globally, since the code motion optimization
passes should be sinking instructions with single uses and
the pass currently will not fold with multiple uses.

llvm-svn: 324667

show more ...


123