History log of /llvm-project/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h (Results 101 – 125 of 186)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# f7060f4f 30-Apr-2020 Ram Nalamothu <VenkataRamanaiah.Nalamothu@amd.com>

For PAL, make sure Scratch Buffer Descriptor do not clobber GIT pointer

Since SRSRC has alignment requirements, first find non GIT pointer clobbered
registers for SRSRC and then if those registers c

For PAL, make sure Scratch Buffer Descriptor do not clobber GIT pointer

Since SRSRC has alignment requirements, first find non GIT pointer clobbered
registers for SRSRC and then if those registers clobber preloaded Scratch Wave
Offset register, copy the Scratch Wave Offset register to a free SGPR.

show more ...


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1
# 60b1967c 21-Jan-2020 Scott Linder <Scott.Linder@amd.com>

[AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions

Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in
the entry function prologue. This allows us t

[AMDGPU] Add Scratch Wave Offset to Scratch Buffer Descriptor in entry functions

Add the scratch wave offset to the scratch buffer descriptor (SRSrc) in
the entry function prologue. This allows us to removes the scratch wave
offset register from the calling convention ABI.

As part of this change, allow the use of an inline constant zero for the
SOffset of MUBUF instructions accessing the stack in entry functions
when a frame pointer is not requested/required. Entry functions with
calls still need to set up the calling convention ABI stack pointer
register, and reference it in order to address arguments of called
functions. The ABI stack pointer register remains unswizzled, but is now
wave-relative instead of queue-relative.

Non-entry functions also use an inline constant zero SOffset for
wave-relative scratch access, but continue to use the stack and frame
pointers as before. When the stack or frame pointer is converted to a
swizzled offset it is now scaled directly, as the scratch wave offset no
longer needs to be subtracted first.

Update llvm/docs/AMDGPUUsage.rst to reflect these changes to the calling
convention.

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75138

show more ...


# db099f99 11-Mar-2020 Scott Linder <Scott.Linder@amd.com>

[AMDGPU][NFC] Refactor some uses of unsigned to Register

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76035


Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2
# 1024b73e 03-Dec-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Split denormal mode tracking bits

Prepare to accurately track the future denormal-fp-math attribute
changes. The way to actually set these separately is not wired in yet.

This is just a mec

AMDGPU: Split denormal mode tracking bits

Prepare to accurately track the future denormal-fp-math attribute
changes. The way to actually set these separately is not wired in yet.

This is just a mechanical change, and mostly still assumes the input
and output mode match. This should be refined for some cases. For
example, fcanonicalize lowering should use the flushing variant if
either input or output flushing is enabled

show more ...


Revision tags: llvmorg-9.0.1-rc1
# db0ed3e4 01-Nov-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Refactor treatment of denormal mode

Start moving towards treating this as a property of the calling
convention, and not the subtarget. The default denormal mode should
not be part of the sub

AMDGPU: Refactor treatment of denormal mode

Start moving towards treating this as a property of the calling
convention, and not the subtarget. The default denormal mode should
not be part of the subtarget, and be moved into a separate function
attribute.

This patch is still NFC. The denormal mode remains as a subtarget
feature for now, but make the necessary changes to switch to using an
attribute.

show more ...


# 19e7f8a2 28-Oct-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add default denormal mode to MachineFunctionInfo

The default FP mode should really be a property of a specific
function, and not a subtarget. Introduce the necessary fields to the
SIMachineF

AMDGPU: Add default denormal mode to MachineFunctionInfo

The default FP mode should really be a property of a specific
function, and not a subtarget. Introduce the necessary fields to the
SIMachineFunctionInfo to help move towards this goal.

show more ...


Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3
# ff07631b 27-Aug-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add amdgpu-32bit-address-high-bits to MIR serialization

llvm-svn: 370089


# 0eaee545 15-Aug-2019 Jonas Devlieghere <jonas@devlieghere.com>

[llvm] Migrate llvm::make_unique to std::make_unique

Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of

[llvm] Migrate llvm::make_unique to std::make_unique

Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013

show more ...


Revision tags: llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init
# 937ff6e7 11-Jul-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] gfx908 agpr spilling

Differential Revision: https://reviews.llvm.org/D64594

llvm-svn: 365833


# 58426a37 10-Jul-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Serialize mode from MachineFunctionInfo

llvm-svn: 365653


Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4
# 71dfb7ec 08-Jul-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Make s34 the FP register

Make the FP register callee saved.

This is tricky because now the FP needs to be spilled in the prolog
relative to the incoming SP register, rather than the frame r

AMDGPU: Make s34 the FP register

Make the FP register callee saved.

This is tricky because now the FP needs to be spilled in the prolog
relative to the incoming SP register, rather than the frame register
used throughout the rest of the function. I don't like how this
bypassess the standard mechanism for CSR spills just to get the
correct insert point. I may look for a better solution, since all CSR
VGPRs may also need to have all lanes activated. Another option might
be to make getFrameIndexReference change the base register if the
frame index is a CSR, and then try to figure out the right insertion
point in emitProlog.

If there is a free VGPR lane available for SGPR spilling, try to use
it for the FP. If that would require intrtoducing a new VGPR spill,
try to use a free call clobbered SGPR. Only fallback to introducing a
new VGPR spill as a last resort.

This also doesn't attempt to handle SGPR spilling with scalar stores.

llvm-svn: 365372

show more ...


# 80177ca5 03-Jul-2019 Michael Liao <michael.hliao@gmail.com>

[AMDGPU] Enable serializing of argument info.

Summary:
- Support serialization of all arguments in machine function info. This
enables fabricating MIR tests depending on argument info.

Reviewers:

[AMDGPU] Enable serializing of argument info.

Summary:
- Support serialization of all arguments in machine function info. This
enables fabricating MIR tests depending on argument info.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64096

llvm-svn: 364995

show more ...


# 4dc3b2bf 01-Jul-2019 Nicolai Haehnle <nhaehnle@gmail.com>

AMDGPU: Support GDS atomics

Summary:
Original patch by Marek Olšák

Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, y

AMDGPU: Support GDS atomics

Summary:
Original patch by Marek Olšák

Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63452

llvm-svn: 364814

show more ...


# 1b317685 01-Jul-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Convert some places to Register

llvm-svn: 364769


Revision tags: llvmorg-8.0.1-rc3
# 4be636eb 25-Jun-2019 Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>

[AMDGPU] Removed dead SIMachineFunctionInfo::getWorkItemIDVGPR()

Differential Revision: https://reviews.llvm.org/D63780

llvm-svn: 364339


# 4d55d024 19-Jun-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

Reapply "AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics"

This reapplies r363678, using the correct chain for the CopyToReg for
v0. glueCopyToM0 counterintuitively changes the operands of the
or

Reapply "AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics"

This reapplies r363678, using the correct chain for the CopyToReg for
v0. glueCopyToM0 counterintuitively changes the operands of the
original node.

llvm-svn: 363870

show more ...


# 128ce93c 19-Jun-2019 Simon Pilgrim <llvm-dev@redking.me.uk>

Revert rL363678 : AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics

There may or may not be additional work to handle this correctly on
SI/CI.
........
Breaks EXPENSIVE_CHECKS buildbots - http://l

Revert rL363678 : AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics

There may or may not be additional work to handle this correctly on
SI/CI.
........
Breaks EXPENSIVE_CHECKS buildbots - http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/78/

llvm-svn: 363797

show more ...


# 8d35dcd7 18-Jun-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Add ds_gws_init / ds_gws_barrier intrinsics

There may or may not be additional work to handle this correctly on
SI/CI.

llvm-svn: 363678


# e683eba0 17-Jun-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Cleanup custom PseudoSourceValue definitions

Use separate enums for each kind, avoid repeating overloads, and add
missing classof implementation.

llvm-svn: 363558


Revision tags: llvmorg-8.0.1-rc2
# b812b7a4 05-Jun-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Invert frame index offset interpretation

Since the beginning, the offset of a frame index has been consistently
interpreted backwards. It was treating it as an offset from the
scratch wave o

AMDGPU: Invert frame index offset interpretation

Since the beginning, the offset of a frame index has been consistently
interpreted backwards. It was treating it as an offset from the
scratch wave offset register as a frame register. The correct
interpretation is the offset from the SP on entry to the function,
before the prolog. Frame index elimination then should select either
SP or another register as an FP.

Treat the scratch wave offset on kernel entry as the pre-incremented
SP. Rely more heavily on the standard hasFP and frame pointer
elimination logic, and clean up the private reservation code. This
saves a copy in most callee functions.

The kernel prolog emission code is still kind of a mess relying on
checking the uses of physical registers, which I would prefer to
eliminate.

Currently selection directly emits MUBUF instructions, which require
using a reference to some register. Use the register chosen for SP,
and then ignore this later. This should probably be cleaned up to use
pseudos that don't refer to any specific base register until frame
index elimination.

Add a workaround for shaders using large numbers of SGPRs. I'm not
sure these cases were ever working correctly, since as far as I can
tell the logic for figuring out which SGPR is the scratch wave offset
doesn't match up with the shader input initialization in the shader
programming guide.

llvm-svn: 362661

show more ...


Revision tags: llvmorg-8.0.1-rc1
# 0a30f33c 01-Apr-2019 Neil Henning <neil.henning@amd.com>

[AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure.

This change incorporates an effort by Connor Abbot to change how we deal
with WWM operations potentially trashing valid values in inactiv

[AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure.

This change incorporates an effort by Connor Abbot to change how we deal
with WWM operations potentially trashing valid values in inactive lanes.

Previously, the SIFixWWMLiveness pass would work out which registers
were being trashed within WWM regions, and ensure that the register
allocator did not have any values it was depending on resident in those
registers if the WWM section would trash them. This worked perfectly
well, but would cause sometimes severe register pressure when the WWM
section resided before divergent control flow (or at least that is where
I mostly observed it).

This fix instead runs through the WWM sections and pre allocates some
registers for WWM. It then reserves these registers so that the register
allocator cannot use them. This results in a significant register
saving on some WWM shaders I'm working with (130 -> 104 VGPRs, with just
this change!).

Differential Revision: https://reviews.llvm.org/D59295

llvm-svn: 357400

show more ...


# 055e4dce 29-Mar-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove dx10-clamp from subtarget features

Since this can be set with s_setreg*, it should not be a subtarget
property. Set a default based on the calling convention, and Introduce
a new amdg

AMDGPU: Remove dx10-clamp from subtarget features

Since this can be set with s_setreg*, it should not be a subtarget
property. Set a default based on the calling convention, and Introduce
a new amdgpu-dx10-clamp attribute to override this if desired.

Also introduce a new amdgpu-ieee attribute to match.

The values need to match to allow inlining. I think it is OK for the
caller's dx10-clamp attribute to override the callee, but there
doesn't appear to be the infrastructure to do this currently without
definining the attribute in the generic Attributes.td.

Eventually the calling convention lowering will need to insert a mode
switch somewhere for these.

llvm-svn: 357302

show more ...


Revision tags: llvmorg-8.0.0
# bc6d07ca 14-Mar-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

MIR: Allow targets to serialize MachineFunctionInfo

This has been a very painful missing feature that has made producing
reduced testcases difficult. In particular the various registers
determined f

MIR: Allow targets to serialize MachineFunctionInfo

This has been a very painful missing feature that has made producing
reduced testcases difficult. In particular the various registers
determined for stack access during function lowering were necessary to
avoid undefined register errors in a large percentage of
cases. Implement a subset of the important fields that need to be
preserved for AMDGPU.

Most of the changes are to support targets parsing register fields and
properly reporting errors. The biggest sort-of bug remaining is for
fields that can be initialized from the IR section will be overwritten
by a default initialized machineFunctionInfo section. Another
remaining bug is the machineFunctionInfo section is still printed even
if empty.

llvm-svn: 356215

show more ...


Revision tags: llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3
# aa6fb4c4 21-Feb-2019 Matt Arsenault <Matthew.Arsenault@amd.com>

AMDGPU: Remove debugger related subtarget features

As far as I know these aren't needed anymore.

llvm-svn: 354634


Revision tags: llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# 2946cd70 19-Jan-2019 Chandler Carruth <chandlerc@gmail.com>

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the ne

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636

show more ...


12345678