SIMachineFunctionInfo.h - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7
# 67c55b1f	18-Dec-2024	Ruiling, Song <ruiling.song@amd.com>	[AMDGPU] Make max dwords of memory cluster configurable (#119342) We find it helpful to increase the value for graphics workload. Make it configurable so we can experiment with a different value.
Revision tags: llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 2b5b57c5	12-Nov-2024	Christudasan Devadasan <christudasan.devadasan@amd.com>	[AMDGPU] Skip non-wwm reg implicit-def from bb prolog (#115834) Currently all implicit-def instructions are part of bb prolog. We should only include the wwm-register's implicit definitions into the [AMDGPU] Skip non-wwm reg implicit-def from bb prolog (#115834) Currently all implicit-def instructions are part of bb prolog. We should only include the wwm-register's implicit definitions into the BB prolog. The other vector class registers' implicit defs when exist at the bb top might cause interference when pushed the LR_split copy insertion downwards. The SplitKit is very strict on altering the insertion points and will assert such instances. show more ...
# bc7e099a	07-Nov-2024	dyung <douglas.yung@sony.com>	Revert "[AMDGPU][MIR] Serialize NumPhysicalVGPRSpillLanes" (#115353) Reverts llvm/llvm-project#115291 Reverting due to test failures on many bots including https://lab.llvm.org/buildbot/#/builde Revert "[AMDGPU][MIR] Serialize NumPhysicalVGPRSpillLanes" (#115353) Reverts llvm/llvm-project#115291 Reverting due to test failures on many bots including https://lab.llvm.org/buildbot/#/builders/174/builds/8049 show more ...
# 21835ee2	07-Nov-2024	Akshat Oke <Akshat.Oke@amd.com>	[AMDGPU][MIR] Serialize NumPhysicalVGPRSpillLanes (#115291)
# 3495d045	05-Nov-2024	Akshat Oke <Akshat.Oke@amd.com>	[AMDGPU][MIR] Serialize SpillPhysVGPRs (#113129)
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 8d13e7b8	03-Oct-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Qualify auto. NFC. (#110878) Generated automatically with: $ clang-tidy -fix -checks=-*,llvm-qualified-auto $(find lib/Target/AMDGPU/ -type f)
Revision tags: llvmorg-19.1.1
# ac0f64f0	30-Sep-2024	Christudasan Devadasan <christudasan.devadasan@amd.com>	[AMDGPU] Split vgpr regalloc pipeline (#93526) Allocating wwm-registers and per-thread VGPR operands together imposes many challenges in the way the registers are reused during allocation. There a [AMDGPU] Split vgpr regalloc pipeline (#93526) Allocating wwm-registers and per-thread VGPR operands together imposes many challenges in the way the registers are reused during allocation. There are times when regalloc reuses the registers of regular VGPRs operations for wwm-operations in a small range leading to unwantedly clobbering their inactive lanes causing correctness issues that are hard to trace. This patch splits the VGPR allocation pipeline further to allocate wwm-registers first and the regular VGPR operands in a separate pipeline. The splitting would ensure that the physical registers used for wwm allocations won't take part in the next allocation pipeline to avoid any such clobbering. show more ...
Revision tags: llvmorg-19.1.0
# 33562085	13-Sep-2024	Diana Picus <Diana-Magda.Picus@amd.com>	Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108512) This reverts commit https://github.com/llvm/llvm-project/commit/7792b4ae79e5ac9355ee13b01f16e25455f8427f. The problem was a Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108512) This reverts commit https://github.com/llvm/llvm-project/commit/7792b4ae79e5ac9355ee13b01f16e25455f8427f. The problem was a conflict with https://github.com/llvm/llvm-project/commit/e55d6f5ea2656bf842973d8bee86c3ace31bc865 "[AMDGPU] Simplify and improve codegen for llvm.amdgcn.set.inactive (https://github.com/llvm/llvm-project/pull/107889)" which changed the syntax of V_SET_INACTIVE (and thus made my MIR test crash). ...if only we had a merge queue. show more ...
# 7792b4ae	12-Sep-2024	Diana Picus <Diana-Magda.Picus@amd.com>	Revert "Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054)"" (#108341) Reverts llvm/llvm-project#108173 si-init-whole-wave.mir crashes on some buildbots (although it passed bo Revert "Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054)"" (#108341) Reverts llvm/llvm-project#108173 si-init-whole-wave.mir crashes on some buildbots (although it passed both locally with sanitizers enabled and in pre-merge tests). Investigating. show more ...
# 703ebca8	12-Sep-2024	Diana Picus <Diana-Magda.Picus@amd.com>	Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054)" (#108173) This reverts commit https://github.com/llvm/llvm-project/commit/c7a7767fca736d0447832ea4d4587fb3b9e797c2. The bui Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054)" (#108173) This reverts commit https://github.com/llvm/llvm-project/commit/c7a7767fca736d0447832ea4d4587fb3b9e797c2. The buildbots failed because I removed a MI from its parent before updating LIS. This PR should fix that. show more ...
# f4dd1bc8	11-Sep-2024	Fraser Cormack <fraser@codeplay.com>	[AMDGPU] Fix leak and self-assignment in copy assignment operator (#107847) A static analyzer identified that this operator was unsafe in the case of self-assignment. In the placement new statem [AMDGPU] Fix leak and self-assignment in copy assignment operator (#107847) A static analyzer identified that this operator was unsafe in the case of self-assignment. In the placement new statement, StringValue's copy constructor was being implicitly called, which received a reference to "itself". In fact, it was being passed an old StringValue at the same address - one whose lifetime had already ended. The copy constructor was thus copying fields from a dead object. We need to be careful when switching active union members, and calling the destructor on the old StringValue will avoid memory leaks which I believe the old code exhibited. show more ...
# c7a7767f	10-Sep-2024	Vitaly Buka <vitalybuka@google.com>	Revert "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054) Breaks bots, see #105822. Reverts llvm/llvm-project#105822
# 44556e64	10-Sep-2024	Diana Picus <Diana-Magda.Picus@amd.com>	[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (#105822) This intrinsic is meant to be used in functions that have a "tail" that needs to be run with all the lanes enabled. The "tail" may conta [amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (#105822) This intrinsic is meant to be used in functions that have a "tail" that needs to be run with all the lanes enabled. The "tail" may contain complex control flow that makes it unsuitable for the use of the existing WWM intrinsics. Instead, we will pretend that the function starts with all the lanes enabled, then branches into the actual body of the function for the lanes that were meant to run it, and then finally all the lanes will rejoin and run the tail. As such, the intrinsic will return the EXEC mask for the body of the function, and is meant to be used only as part of a very limited pattern (for now only in amdgpu_cs_chain functions): ``` entry: %func_exec = call i1 @llvm.amdgcn.init.whole.wave() br i1 %func_exec, label %func, label %tail func: ; ... stuff that should run with the actual EXEC mask br label %tail tail: ; ... stuff that runs with all the lanes enabled; ; can contain more than one basic block ``` It's an error to use the result of this intrinsic for anything other than a branch (but unfortunately checking that in the verifier is non-trivial because SIAnnotateControlFlow will introduce an amdgcn.if between the intrinsic and the branch). The intrinsic is lowered to a SI_INIT_WHOLE_WAVE pseudo, which for now is expanded in si-wqm (which is where SI_INIT_EXEC is handled too); however the information that the function was conceptually started in whole wave mode is stored in the machine function info (hasInitWholeWave). This will be useful in prolog epilog insertion, where we can skip saving the inactive lanes for CSRs (since if the function started with all the lanes active, then there are no inactive lanes to preserve). show more ...
Revision tags: llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 7e9b49f6	25-Jun-2024	Nicolai Hähnle <nicolai.haehnle@amd.com>	AMDGPU: Add plumbing for private segment size argument (#96445) The actual size of scratch/private is determined at dispatch time, so add more plumbing to request it. Will be used in subsequent cha AMDGPU: Add plumbing for private segment size argument (#96445) The actual size of scratch/private is determined at dispatch time, so add more plumbing to request it. Will be used in subsequent change. show more ...
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7
# 197c3a3e	02-Jun-2024	Kazu Hirata <kazu@google.com>	Use llvm::less_first (NFC) (#94136)
Revision tags: llvmorg-18.1.6
# cd4287bc	03-May-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Convert PrologEpilogSGPRSpills from DenseMap to sorted vector (#90957) In practice PrologEpilogSGPRSpills never has more than 3 entries so DenseMap is overkill. In addition this means that [AMDGPU] Convert PrologEpilogSGPRSpills from DenseMap to sorted vector (#90957) In practice PrologEpilogSGPRSpills never has more than 3 entries so DenseMap is overkill. In addition this means that iteration happens in register number order, instead of DenseMap's hashed order, so it will not be affected by future patches that define new physical registers. This should reduce future test case churn. show more ...
Revision tags: llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2
# c4e517f5	12-Mar-2024	Jun Wang <jwang86@yahoo.com>	[AMDGPU] Adding the amdgpu_num_work_groups function attribute (#79035) A new function attribute named amdgpu_num_work_groups is added. This attribute, which consists of three integers, allows progr [AMDGPU] Adding the amdgpu_num_work_groups function attribute (#79035) A new function attribute named amdgpu_num_work_groups is added. This attribute, which consists of three integers, allows programmers to let the compiler know the number of workgroups to be launched in each of the three dimensions and do optimizations based on that information. --------- Co-authored-by: Jun Wang <jun.wang7@amd.com> show more ...
Revision tags: llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1
# 70fc9703	24-Jan-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Move architected SGPR implementation into isel (#79120)
Revision tags: llvmorg-19-init
# 230c13d5	24-Jan-2024	Christudasan Devadasan <christudasan.devadasan@amd.com>	[AMDGPU] Pick available high VGPR for CSR SGPR spilling (#78669) CSR SGPR spilling currently uses the early available physical VGPRs. It currently imposes a high register pressure while trying to a [AMDGPU] Pick available high VGPR for CSR SGPR spilling (#78669) CSR SGPR spilling currently uses the early available physical VGPRs. It currently imposes a high register pressure while trying to allocate large VGPR tuples within the default register budget. This patch changes the spilling strategy by picking the VGPRs in the reverse order, the highest available VGPR first and later after regalloc shift them back to the lowest available range. With that, the initial VGPRs would be available for allocation and possibility of finding large number of contiguous registers will be more. show more ...
# 818f13fc	23-Jan-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Remove getWorkGroupIDSGPR, unused since aa6fb4c45e01
# 51392996	17-Dec-2023	Carl Ritson <carl.ritson@amd.com>	[AMDGPU] Track physical VGPRs used for SGPR spills (#75573) Physical VGPRs used for SGPR spills need to be tracked independent of WWM reserved registers. The WWM reserved set contains extra registe [AMDGPU] Track physical VGPRs used for SGPR spills (#75573) Physical VGPRs used for SGPR spills need to be tracked independent of WWM reserved registers. The WWM reserved set contains extra registers allocated during WWM pre-allocation pass. This causes SGPR spills allocated after WWM pre-allocation to overlap with WWM register usage, e.g. if frame pointer is spilt during prologue/epilog insertion. show more ...
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2
# 0455596e	02-Aug-2023	Austin Kerbow <Austin.Kerbow@amd.com>	[AMDGPU] Add DAG ISel support for preloaded kernel arguments This patch adds the DAG isel changes for kernel argument preloading. These changes are not usable with older firmware but subsequent patc [AMDGPU] Add DAG ISel support for preloaded kernel arguments This patch adds the DAG isel changes for kernel argument preloading. These changes are not usable with older firmware but subsequent patches in the series will make the codegen backwards compatible. This patch should only be submitted alongside that subsequent patch. Preloading here begins from the start of the kernel arguments until the amount of arguments indicated by the CL flag amdgpu-kernarg-preload-count. Aggregates and arguments passed by-ref are not supported. Special care for the alignment of the kernarg segment is needed as well as consideration of the alignment of addressable SGPR tuples when we cannot directly use misaligned large tuples that the arguments are loaded to. Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D158579 show more ...
# 343be513	19-Aug-2023	Austin Kerbow <Austin.Kerbow@amd.com>	[AMDGPU] Add utilities to track number of user SGPRs. NFC. Factor out and unify some common code that calculates and tracks the number of user SGRPs. Reviewed By: arsenm Differential Revision: htt [AMDGPU] Add utilities to track number of user SGPRs. NFC. Factor out and unify some common code that calculates and tracks the number of user SGRPs. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D159439 show more ...
Revision tags: llvmorg-17.0.0-rc1
# 5272ae66	27-Jul-2023	Diana Picus <Diana-Magda.Picus@amd.com>	[AMDGPU] Add IsChainFunction to the MachineFunctionInfo This will represent functions with the amdgpu_cs_chain or amdgpu_cs_chain_preserve calling conventions. Differential Revision: https://review [AMDGPU] Add IsChainFunction to the MachineFunctionInfo This will represent functions with the amdgpu_cs_chain or amdgpu_cs_chain_preserve calling conventions. Differential Revision: https://reviews.llvm.org/D156410 show more ...
# 4d42e8b5	28-Jul-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting" This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd. The workaround in c26dfc81e254c78dc2 Reapply "[CodeGen]Allow targets to use target specific COPY instructions for live range splitting" This reverts commit a496c8be6e638ae58bb45f13113dbe3a4b7b23fd. The workaround in c26dfc81e254c78dc23579cf3d1336f77249e1f6 should work around the underlying problem with SUBREG_TO_REG. show more ...
12 3 4 5 6 7 8