Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3 |
|
#
6bda14b3 |
| 06-Jun-2017 |
Chandler Carruth <chandlerc@gmail.com> |
Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes aren't healthy. These are either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or places where we have particular formatting around #include lines that I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
|
Revision tags: llvmorg-4.0.1-rc2 |
|
#
2b1f9aa5 |
| 17-May-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Start defining a calling convention
Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely neither do sret or other on-stack return values.
llvm-svn: 303308
|
Revision tags: llvmorg-4.0.1-rc1 |
|
#
1c0ae397 |
| 24-Apr-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add StackPtr and FramePtr registers to MFI
These will be necessary for setting up call sequences.
llvm-svn: 301208
|
#
161e2b42 |
| 18-Apr-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Make MFI fields private
llvm-svn: 300596
|
#
e622dc38 |
| 11-Apr-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Refactor argument lowering
Split into smaller functions and prepare for handling non-entry functions.
llvm-svn: 299998
|
#
678e111e |
| 10-Apr-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Fix crash when disassembling VOP3 mac
The unused dummy src2_modifiers is missing, so it crashes when trying to print it.
I tried to fully remove src2_modifiers, but there are some irritations in the places where it is converted to mad since it starts to require modifying use lists while iterating over them.
llvm-svn: 299861
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3 |
|
#
e0bf7d02 |
| 21-Feb-2017 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Don't use stack space for SGPR->VGPR spills
Before frame offsets are calculated, try to eliminate the frame indexes used by SGPR spills. Then we can delete them after.
I think for now we can be sure that no other instruction will be re-using the same frame indexes. It should be easy to notice if this assumption ever breaks since everything asserts if it tries to use a dead frame index later.
The unused emergency stack slot seems to still be left behind, so an additional 4 bytes is still wasted.
llvm-svn: 295753
|
Revision tags: llvmorg-4.0.0-rc2 |
|
#
2f3f9855 |
| 25-Jan-2017 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU: Add support for spilling to buffers pointed to by a user SGPR
Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].
Patch By: Dave Airlie
Reviewers: nhaehnle, arsenm, tstellarAMD
Reviewed By: arsenm
Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D25428
llvm-svn: 293000
|
#
6620376d |
| 21-Jan-2017 |
Eugene Zelenko <eugene.zelenko@gmail.com> |
[AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292688
|
Revision tags: llvmorg-4.0.0-rc1 |
|
#
bb138886 |
| 20-Dec-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Make a function const
llvm-svn: 290185
|
#
6f9ef14b |
| 20-Dec-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.*
Reviewers: arsenm, nhaehnle, mareko
Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D27834
llvm-svn: 290184
|
#
244891d1 |
| 20-Dec-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU/SI: Add a MachineMemOperand to MIMG instructions
Summary: Without a MachineMemOperand, the scheduler was assuming MIMG instructions were ordered memory references, so no loads or stores could be reordered across them.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27536
llvm-svn: 290179
|
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
#
1d65026c |
| 06-Sep-2016 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Wave and register controls
- Implemented amdgpu-flat-work-group-size attribute
- Implemented amdgpu-num-active-waves-per-eu attribute
- Implemented amdgpu-num-sgpr attribute
- Implemented amdgpu-num-vgpr attribute
- Dynamic LDS constraints are in a separate patch
Patch by Tom Stellard and Konstantin Zhuravlyov
Differential Revision: https://reviews.llvm.org/D21562
llvm-svn: 280747
|
#
43e5fe3f |
| 29-Aug-2016 |
Saleem Abdulrasool <compnerd@compnerd.org> |
AMDGPU: fix mismatch tags, NFC
llvm-svn: 280006
|
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2 |
|
#
69fd2c11 |
| 11-Aug-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Remove unused tracking of flat instructions
llvm-svn: 278361
|
Revision tags: llvmorg-3.9.0-rc1 |
|
#
52ef4019 |
| 26-Jul-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Make AMDGPUMachineFunction fields private
ABIArgOffset is a problem because properly setting the KernArgSize requires that the reserved area before the real kernel arguments be correctly aligned, which requires fixing clover.
llvm-svn: 276766
|
#
8d718dcf |
| 22-Jul-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Add HSA dispatch id intrinsic
llvm-svn: 276437
|
#
0532c190 |
| 13-Jul-2016 |
Marek Olsak <marek.olsak@amd.com> |
AMDGPU/SI: Emit the number of SGPR and VGPR spills
Summary: v2: don't count SGPRs spilled to scratch twice
I think this is sufficient. It doesn't count private memory usage, which happens often and uses scratch but isn't technically a spill. The private memory usage can be computed by: [scratch_per_thread - vgpr_spills - a random multiple of SGPR spills].
The fact that SGPR spills add very large amounts to the scratch size makes that computation a guessing game, but I don't have a solution to that.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D22197
llvm-svn: 275288
|
#
5cbd41e0 |
| 27-Jun-2016 |
NAKAMURA Takumi <geek4civic@gmail.com> |
SIMachineFunctionInfo.cpp: Appease msc18 to use std::array.
llvm-svn: 273860
|
#
d377ad80 |
| 27-Jun-2016 |
NAKAMURA Takumi <geek4civic@gmail.com> |
Reformat blank lines.
llvm-svn: 273858
|
#
f2f3d147 |
| 25-Jun-2016 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.
Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
- offset 0: work group ID x
- offset 4: work group ID y
- offset 8: work group ID z
- offset 16: work item ID x
- offset 20: work item ID y
- offset 24: work item ID z
Set:
- amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
- amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
- amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled
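The scratch layout listed in this commit message can be pictured as a plain struct. This is only a sketch: the struct and field names are hypothetical and do not come from the LLVM sources, each ID is assumed to be a 32-bit value (the 4-byte spacing suggests this), and the commit does not say what, if anything, occupies offset 12, so it is modelled here as an unused slot.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the debugger prologue scratch layout described in
// the commit message. Names are illustrative only, not LLVM identifiers.
struct DebuggerPrologueLayout {
  std::uint32_t WorkGroupIDX; // offset 0
  std::uint32_t WorkGroupIDY; // offset 4
  std::uint32_t WorkGroupIDZ; // offset 8
  std::uint32_t Unused;       // offset 12: assumed gap, not stated in the commit
  std::uint32_t WorkItemIDX;  // offset 16
  std::uint32_t WorkItemIDY;  // offset 20
  std::uint32_t WorkItemIDZ;  // offset 24
};

// Compile-time checks that the struct matches the offsets listed above.
static_assert(offsetof(DebuggerPrologueLayout, WorkGroupIDX) == 0, "x group ID");
static_assert(offsetof(DebuggerPrologueLayout, WorkGroupIDY) == 4, "y group ID");
static_assert(offsetof(DebuggerPrologueLayout, WorkGroupIDZ) == 8, "z group ID");
static_assert(offsetof(DebuggerPrologueLayout, WorkItemIDX) == 16, "x item ID");
static_assert(offsetof(DebuggerPrologueLayout, WorkItemIDY) == 20, "y item ID");
static_assert(offsetof(DebuggerPrologueLayout, WorkItemIDZ) == 24, "z item ID");
```

A debugger-side reader could, under the same assumptions, overlay this struct on the start of the scratch region to recover the IDs.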
Differential Revision: http://reviews.llvm.org/D20335
llvm-svn: 273769
|
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
#
29ddd2b2 |
| 24-May-2016 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegs
Differential Revision: http://reviews.llvm.org/D20081
llvm-svn: 270594
|
#
71515e57 |
| 26-Apr-2016 |
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> |
[AMDGPU] Move reserved vgpr count for trap handler usage to SIMachineFunctionInfo + minor commenting changes
Differential Revision: http://reviews.llvm.org/D19537
llvm-svn: 267573
|
#
99c14524 |
| 25-Apr-2016 |
Matt Arsenault <Matthew.Arsenault@amd.com> |
AMDGPU: Implement addrspacecast
llvm-svn: 267452
|
#
79a1fd71 |
| 14-Apr-2016 |
Tom Stellard <thomas.stellard@amd.com> |
AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit
Summary: For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use fewer registers, as we have to run more waves per SIMD.
This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults to 256, as that has no wave restrictions.
Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug.
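The trade-off between workgroup size and register budget can be illustrated with a little arithmetic. This is a rough sketch, not the logic of the patch: the hardware constants (64 work items per wave, 4 SIMDs per compute unit, 256 VGPRs per SIMD) are assumptions about GCN-class hardware that do not appear in the commit message, all function names are hypothetical, and allocation granularity is ignored.

```cpp
// Assumed GCN-style parameters; none of these numbers come from the commit.
constexpr unsigned WavefrontSize = 64; // work items per wave (assumption)
constexpr unsigned SIMDsPerCU = 4;     // SIMDs per compute unit (assumption)
constexpr unsigned VGPRsPerSIMD = 256; // VGPR budget shared by the waves on one SIMD (assumption)

// Integer division rounding up.
constexpr unsigned divideCeil(unsigned A, unsigned B) {
  return (A + B - 1) / B;
}

// Waves each SIMD must host so that a whole workgroup fits in one compute unit.
constexpr unsigned wavesPerSIMD(unsigned MaxWorkGroupSize) {
  return divideCeil(divideCeil(MaxWorkGroupSize, WavefrontSize), SIMDsPerCU);
}

// Upper bound on VGPRs per wave when that many waves share one SIMD.
constexpr unsigned maxVGPRsPerWave(unsigned MaxWorkGroupSize) {
  return VGPRsPerSIMD / wavesPerSIMD(MaxWorkGroupSize);
}

// 256 work items: one wave per SIMD, so the full register budget is available.
static_assert(wavesPerSIMD(256) == 1, "default size imposes no wave restriction");
static_assert(maxVGPRsPerWave(256) == 256, "full VGPR budget");
// 1024 work items: four waves per SIMD, so each wave gets a quarter of the budget.
static_assert(wavesPerSIMD(1024) == 4, "four waves share each SIMD");
static_assert(maxVGPRsPerWave(1024) == 64, "quarter of the VGPR budget");
```

Under these assumptions, the default of 256 corresponds to one wave per SIMD, which is consistent with the statement that it has no wave restrictions.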
Reviewers: mareko, arsenm, tstellarAMD, nhaehnle
Subscribers: FireBurn, kerberizer, llvm-commits, arsenm
Differential Revision: http://reviews.llvm.org/D18340
Patch By: Bas Nieuwenhuizen
llvm-svn: 266337
|