Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1 |
|
#
766c77ef |
| 21-Jun-2018 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU/AMDHSA: Remove GridWorkGroupCountX/Y/Z and everything that comes with it from implementation and v3 header files.
Leave definition in v2 header files for backwards compatibility.
Differentia
AMDGPU/AMDHSA: Remove GridWorkGroupCountX/Y/Z and everything that comes with it from implementation and v3 header files.
Leave definition in v2 header files for backwards compatibility.
Differential Revision: https://reviews.llvm.org/D48191
llvm-svn: 335267
show more ...
|
Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
#
d4b500cb |
| 31-May-2018 |
Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> |
[AMDGPU] Track occupancy in MFI
Keep track of achieved occupancy in SIMachineFunctionInfo. At the moment we have a lot of duplicated or even missed code to query and maintain occupancy info. Record
[AMDGPU] Track occupancy in MFI
Keep track of achieved occupancy in SIMachineFunctionInfo. At the moment we have a lot of duplicated or even missed code to query and maintain occupancy info. Record it in the MFI and query in a single call. Interfaces:
- getOccupancy() - returns current recorded achieved occupancy. - getMinAllowedOccupancy() - returns lesser of the achieved occupancy and the lowest occupancy we are ready to tolerate. For example if a kernel is memory bound we are ready to tolerate 4 waves. - limitOccupancy() - record occupancy level if we have to lower it. - increaseOccupancy() - record occupancy if scheduler managed to increase the occupancy.
MFI takes care of integrating different checks affecting occupancy, including LDS use and waves-per-eu attribute. Note that scheduler starts with not yet known register pressure, so has to record either limit or increase in occupancy after it is done. Later passes can just query a resulting value.
New interface is used in the active scheduler and NFC wrt its work. Changes are also made to experimental schedulers to use it and record an occupancy after they are done. Before the change waves-per-eu was ignored by experimental schedulers and tolerance window for memory bound kernels was not used.
Differential Revision: https://reviews.llvm.org/D47509
llvm-svn: 333629
show more ...
|
#
44b30b45 |
| 22-May-2018 |
Tom Stellard <tstellar@redhat.com> |
AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction and register defintions, which are hu
AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction and register defintions, which are huge so we only want to include them where needed.
This will also make it easier if we want to split the R600 and GCN definitions into separate tablegenerated files.
I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too.
Reviewers: arsenm, nhaehnle
Reviewed By: nhaehnle
Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D46272
llvm-svn: 332930
show more ...
|
#
5f8f34e4 |
| 01-May-2018 |
Adrian Prantl <aprantl@apple.com> |
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they ar
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
show more ...
|
Revision tags: llvmorg-6.0.1-rc1 |
|
#
03ae399d |
| 29-Mar-2018 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Support realigning stack
While the stack access instructions don't care about alignment > 4, some transformations on the pointer calculation do make assumptions based on knowing the low bits
AMDGPU: Support realigning stack
While the stack access instructions don't care about alignment > 4, some transformations on the pointer calculation do make assumptions based on knowing the low bits of a pointer are 0. If a stack object ends up being accessed through its absolute address (relative to the kernel scratch wave offset), the addressing expression may depend on the stack frame being properly aligned. This was breaking in a testcase due to the add->or combine.
I think some of the SP/FP handling logic is still backwards, and overly simplistic to support all of the stack features. Code which tries to modify the SP with inline asm for example or variable sized objects will probably require redoing this.
llvm-svn: 328831
show more ...
|
Revision tags: llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3 |
|
#
8234b489 |
| 20-Feb-2018 |
Tim Renouf <tpr.llvm@botech.co.uk> |
[AMDGPU] stop buffer_store being moved illegally
Summary: The machine instruction scheduler was illegally moving a buffer store past a buffer load with the same descriptor and offset. Fixed by marki
[AMDGPU] stop buffer_store being moved illegally
Summary: The machine instruction scheduler was illegally moving a buffer store past a buffer load with the same descriptor and offset. Fixed by marking buffer ops as mayAlias and isAliased. This may be overly conservative, and we may need to revisit.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D43332
Change-Id: Iff3173d9e0653e830474546276ab9d30318b8ef7 llvm-svn: 325567
show more ...
|
#
923712b6 |
| 09-Feb-2018 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
Reapply "AMDGPU: Add 32-bit constant address space"
This reverts r324494 and reapplies r324487.
llvm-svn: 324747
|
Revision tags: llvmorg-6.0.0-rc2 |
|
#
f4e3f3e3 |
| 07-Feb-2018 |
Rafael Espindola <rafael.espindola@gmail.com> |
Revert "AMDGPU: Add 32-bit constant address space"
This reverts commit r324487.
It broke clang tests.
llvm-svn: 324494
|
#
871c30e5 |
| 07-Feb-2018 |
Marek Olsak <marek.olsak@amd.com> |
AMDGPU: Add 32-bit constant address space
Note: This is a candidate for LLVM 6.0, because it was planned to be in that release but was delayed due to a long review period.
Merge conflict in r
AMDGPU: Add 32-bit constant address space
Note: This is a candidate for LLVM 6.0, because it was planned to be in that release but was delayed due to a long review period.
Merge conflict in release_60 - resolution: Add "-p6:32:32" into the second (non-amdgiz) string.
Only scalar loads support 32-bit pointers. An address in a VGPR will fail to compile. That's OK because the results of loads will only be used in places where VGPRs are forbidden.
Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC. The tests cover all uses cases we need for Mesa.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D41651
llvm-svn: 324487
show more ...
|
Revision tags: llvmorg-6.0.0-rc1 |
|
#
75ced9d5 |
| 12-Jan-2018 |
Tim Renouf <tpr.llvm@botech.co.uk> |
[AMDGPU] stop image_store being moved illegally
Summary: A recent change 321556: AMDGPU: Remove mayLoad/hasSideEffects from MIMG stores can allow the machine instruction scheduler to move an image s
[AMDGPU] stop image_store being moved illegally
Summary: A recent change 321556: AMDGPU: Remove mayLoad/hasSideEffects from MIMG stores can allow the machine instruction scheduler to move an image store past an image load using the same descriptor.
V2: Fixed by marking image ops as mayAlias and isAliased. This may be overly conservative, and we may need to revisit. V3: Reverted test change done on 321556.
Reviewers: arsenm, nhaehnle, dstuttard
Subscribers: llvm-commits, t-tye, yaxunl, wdng, kzhuravl
Differential Revision: https://reviews.llvm.org/D41969
llvm-svn: 322419
show more ...
|
#
e19bc2ee |
| 29-Dec-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Use unique PSVs for buffer resources
Also fixes using the wrong memory type for some intrinsics when custom lowering them.
llvm-svn: 321557
|
#
905f3518 |
| 29-Dec-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Implement getTgtMemIntrinsic for images
Currently all images are lowered to have a single image PseudoSourceValue. Image stores happen to have overly strict mayLoad/mayStore/hasSideEffects f
AMDGPU: Implement getTgtMemIntrinsic for images
Currently all images are lowered to have a single image PseudoSourceValue. Image stores happen to have overly strict mayLoad/mayStore/hasSideEffects flags set on them, so this happens to work. When these are fixed to be correct, the scheduler breaks this because the identical PSVs are assumed to be the same address. These need to be unique to the image resource value.
llvm-svn: 321555
show more ...
|
Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2 |
|
#
3f833edc |
| 08-Nov-2017 |
David Blaikie <dblaikie@gmail.com> |
Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the
Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation.
llvm-svn: 317647
show more ...
|
#
275a4f76 |
| 02-Nov-2017 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU: Fix warning discovered by r317266 [-Wunused-private-field]
llvm-svn: 317280
|
#
b695cd41 |
| 02-Nov-2017 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
AMDGPU: Remove outdated fixme (it was already fixed)
llvm-svn: 317266
|
Revision tags: llvmorg-5.0.1-rc1 |
|
#
13229158 |
| 29-Sep-2017 |
Tim Renouf <tim.renouf@amd.com> |
[AMDGPU] AMDPAL scratch buffer support
Summary: Added support for scratch (including spilling) for OS type amdpal: generates code to set up the scratch descriptor if it is needed.
With amdpal, the
[AMDGPU] AMDPAL scratch buffer support
Summary: Added support for scratch (including spilling) for OS type amdpal: generates code to set up the scratch descriptor if it is needed.
With amdpal, the scratch resource descriptor is loaded from offset 0 of the global information table. The low 32 bits of the address of the global information table is passed in s0.
Added amdgpu-git-ptr-high function attribute to hard-wire the high 32 bits of the address of the global information table. If the function attribute is not specified, or is 0xffffffff, then the backend generates code to use the high 32 bits of pc.
The documentation for the AMDPAL ABI will be added in a later commit.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye
Differential Revision: https://reviews.llvm.org/D37483
llvm-svn: 314501
show more ...
|
#
312ccf76 |
| 14-Sep-2017 |
Jan Sjodin <jan_sjodin@yahoo.com> |
Add AddresSpace to PseudoSourceValue.
Differential Revision: https://reviews.llvm.org/D35089
llvm-svn: 313297
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3 |
|
#
71bcbd45 |
| 11-Aug-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Start adding tail call support
Handle the sibling call cases.
llvm-svn: 310753
|
Revision tags: llvmorg-5.0.0-rc2 |
|
#
59e12826 |
| 08-Aug-2017 |
Eugene Zelenko <eugene.zelenko@gmail.com> |
[AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 310328
|
#
8623e8d8 |
| 03-Aug-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Pass special input registers to functions
llvm-svn: 309998
|
#
8e8f8f43 |
| 02-Aug-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix clobbering CSR VGPRs when spilling SGPR to it
llvm-svn: 309783
|
#
9166ce86 |
| 28-Jul-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Annotate implicitarg.ptr usage
We need to pass something to functions for this to work. It isn't derivable just from the kernarg segment pointer because the implicit arguments are placed aft
AMDGPU: Annotate implicitarg.ptr usage
We need to pass something to functions for this to work. It isn't derivable just from the kernarg segment pointer because the implicit arguments are placed after the kernel arguments.
Also fixes missing test for the intrinsic.
llvm-svn: 309398
show more ...
|
Revision tags: llvmorg-5.0.0-rc1 |
|
#
1cc47f84 |
| 18-Jul-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Figure out private memory regs after lowering
Introduce pseudo-registers for registers needed for stack access, which are replaced during finalizeLowering. Note these pseudo-registers are cu
AMDGPU: Figure out private memory regs after lowering
Introduce pseudo-registers for registers needed for stack access, which are replaced during finalizeLowering. Note these pseudo-registers are currently only used for the used register location, and not for determining their input argument register.
This is better because it avoids the need to try to predict whether a call will be emitted from the IR, and also detects stack objects introduced by legalization.
Test changes are from the HasStackObjects check being more accurate since stack objects introduced during legalization are now known.
llvm-svn: 308325
show more ...
|
#
e15855d9 |
| 17-Jul-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Annotate features from x work item/group IDs.
This wasn't necessary before since they are always enabled for kernels, but this is necessary if they need to be forwarded to a callable functio
AMDGPU: Annotate features from x work item/group IDs.
This wasn't necessary before since they are always enabled for kernels, but this is necessary if they need to be forwarded to a callable function.
llvm-svn: 308226
show more ...
|
#
10fc062b |
| 26-Jun-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling
This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register c
AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling
This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register classes, and requires the function argument info to have more context about the function's type and environment.
Also add missing test coverage for the intrinsic, and emit an error for HSA. This also encovers that the intrinsic is broken unless there happen to be stack objects.
llvm-svn: 306264
show more ...
|