AMDGPULegalizerInfo.h - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# ea33af63	01-Nov-2024	Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>	Reapply "[AMDGPU][GlobalISel] Fix load/store of pointer vectors, buffer..pN (#110714)" v3 (#114443) This reverts commit 8a849a2a567d4e519b246a16936b6e7519936d4b. It seems I missed a spot when trying to ensure the code in the instruction selection tests were actually legalized MIR.
Revision tags: llvmorg-19.1.3, llvmorg-19.1.2
# 8a849a2a	10-Oct-2024	Mikhail Goncharov <goncharov.mikhail@gmail.com>	Revert "Reapply "[AMDGPU][GlobalISel] Fix load/store of pointer vectors, buffer..pN (#110714)" v2 (#111708)" This reverts commit 4b4a0d419c81b8b12a7dbb33dae1f7e9be91a88f. New test fails on buildbots: https://lab.llvm.org/buildbot/#/builders/63/builds/2039 https://lab.llvm.org/buildbot/#/builders/127/builds/1055
# 4b4a0d41	09-Oct-2024	Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>	Reapply "[AMDGPU][GlobalISel] Fix load/store of pointer vectors, buffer..pN (#110714)" v2 (#111708) This adds `-disable-gisel-legality-check` to some gfx6 and gfx7 test lines to prevent behavior mismatches between debug and release builds. The first attempted reapply was #111059. This reverts commit e075dcf7d270fd52dc837163ff24e8c872dfeb49.
# e075dcf7	06-Oct-2024	NAKAMURA Takumi <geek4civic@gmail.com>	Revert "Reapply "[AMDGPU][GlobalISel] Fix load/store of pointer vectors, buffer..pN (#110714)" (#111059)" This reverts commit 98a15c7b0c6ec129d371f0c121dbe9396c4f5609. (llvmorg-20-init-8051-g98a15c7b0c6e)
# 98a15c7b	04-Oct-2024	Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>	Reapply "[AMDGPU][GlobalISel] Fix load/store of pointer vectors, buffer..pN (#110714)" (#111059) This reverts commit 650c41aad2eb43c634a05b2b5799a0c13a73b92f. The test failures appear to be from conflicts with other PRs that landed around this time.
# 650c41aa	03-Oct-2024	NAKAMURA Takumi <geek4civic@gmail.com>	Revert "[AMDGPU][GlobalISel] Fix load/store of pointer vectors, buffer..pN (#110714)" Some builders has been failing tests. ``` Failed Tests (2): LLVM :: CodeGen/AMDGPU/GlobalISel/inst-select-load-global-old-legalization.mir LLVM :: CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir ``` This reverts commit ae5bd2a9f292037c605b2ec0ee31200581bd8701. (llvmorg-20-init-7805-gae5bd2a9f292)
# ae5bd2a9	02-Oct-2024	Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>	[AMDGPU][GlobalISel] Fix load/store of pointer vectors, buffer..pN (#110714) Certain pointer address spaces were not being correctly handled by the GlobalISel lowering for buffer_load and buffer_store. 1. ptr addrspace(1) and addrspace(4) did not have rewrite patterns defined for them, while p0 did, since those pointer types weren't in the list of types that was iterated to form the patterns. 2. Vectors of pointers need to be bitcast to vectors of the corresponding scalars, since there doesn't seem to be a good way to define the rewrite patterns for buffer_load/store of those types. The need to bitcast vectors of pointers was also revealed to affect ordinary `G_LOAD` and `G_STORE` in some cases, so `shouldBitcastLoadStore()` has been fixed to handle it properly.
Revision tags: llvmorg-19.1.1, llvmorg-19.1.0
# 0745219d	06-Sep-2024	Stanislav Mekhanoshin <rampitec@users.noreply.github.com>	[AMDGPU] Add target intrinsic for s_buffer_prefetch_data (#107293)
Revision tags: llvmorg-19.1.0-rc4
# 26b0bef1	29-Aug-2024	Changpeng Fang <changpeng.fang@amd.com>	AMDGPU: Use pattern to select instruction for intrinsic llvm.fptrunc.round (#105761) Use GCNPat instead of Custom Lowering to select instructions for intrinsic llvm.fptrunc.round. "SupportedRoundMode : TImmLeaf" is used as a predicate to select only when the rounding mode is supported. "as_hw_round_mode : SDNodeXForm" is developed to translate the round modes to the corresponding ones that hardware recognizes.
Revision tags: llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 4477ff68	27-Jun-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Remove ds_fmin/ds_fmax intrinsics (#96739) These have been replaced with atomicrmw.
# 5feb32ba	25-Jun-2024	Vikram Hegde <115221833+vikramRH@users.noreply.github.com>	[AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (#89217) This patch is intended to be the first of a series with end goal to adapt atomic optimizer pass to support i64 and f64 operations (along with removing all unnecessary bitcasts). This legalizes 64 bit readlane, writelane and readfirstlane ops pre-ISel. Co-authored-by: vikramRH <vikhegde@amd.com>
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7
# fb2c6597	19-May-2024	Leon Clark <PeddleSpam@users.noreply.github.com>	[AMDGPU] Use LSH for lowering ctlz_zero_undef.i8/i16 (#88512) Use LSH to lower ctlz_zero_undef instead of subtracting leading zeros for i8 and i16. Related to [77615](https://github.com/llvm/llvm-project/pull/77615). Co-authored-by: Leon Clark <leoclark@amd.com>
Revision tags: llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3
# d365a45c	23-Mar-2024	Evgenii Kudriashov <evgenii.kudriashov@intel.com>	[GlobalISel] Introduce G_TRAP, G_DEBUGTRAP, G_UBSANTRAP (#84941) Here we introduce three new GMIR instructions to cover a set of trap intrinsics. The idea behind it is that generic intrinsics shouldn't be used with G_INTRINSIC opcode. These new instructions can match perfectly with existing trap ISD nodes. It allows X86, AArch64, RISCV and Mips to reuse SelectionDAG patterns for selection and avoid manual selection. However AMDGPU is an exception. It selects traps during legalization regardless SelectionDAG or GlobalISel. Since there are not many places where traps are used, this change attempts to clean up all the usages of G_INTRINSIC with trap intrinsics. So, there is no stage when both G_TRAP and G_INTRINSIC_W_SIDE_EFFECTS(@llvm.trap) are allowed.
Revision tags: llvmorg-18.1.2, llvmorg-18.1.1
# 1fc5e50c	06-Mar-2024	Joseph Huber <huberjn@outlook.com>	[AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv' (#83906) Summary: This patch implements the LLVM floating point environment control intrinsics and also exposes it through clang. We encode the floating point environment as a 64-bit value that simply concatenates the values of the mode registers and the current trap status. We only fetch the bits relevant for floating point instructions. That is, rounding mode, denormalization mode, ieee, dx10 clamp, debug, enabled traps, f16 overflow, and active exceptions.
Revision tags: llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1
# 45d2d775	25-Jan-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325) This is only valid on targets with architected SGPRs.
Revision tags: llvmorg-19-init
# f22cde10	08-Jan-2024	Ningning Shi(史宁宁) <shiningning@iscas.ac.cn>	[GlobalISel][NFC]Delete the comments of XXLegalizerInfo (#76918) Delete the LegalizerInfo comments of AArch64/AMD64/ARM/M68k/RISCV/x86, they are copied from register bank.
# d659bd16	03-Jan-2024	David Green <david.green@arm.com>	[GlobalISel][AArch64] Tail call libcalls. (#74929) This tries to allow libcalls to be tail called, using a similar method to DAG where the type is checked to make sure they match, and if so the backend, through lowerCall checks that the tailcall is valid for all arguments.
Revision tags: llvmorg-17.0.6
# f3138524	14-Nov-2023	Acim-Maravic <119684637+Acim-Maravic@users.noreply.github.com>	[AMDGPU] Generic lowering for rint and nearbyint (#69596) There are three different rounding intrinsics that are brought down to the same instruction. Co-authored-by: Acim Maravic <acim.maravic@amd.com>
Revision tags: llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3
# aa5158cd	10-Oct-2023	Thomas Symalla <5754458+tsymalla@users.noreply.github.com>	[AMDGPU] Use absolute relocations when compiling for AMDPAL and Mesa3D (#67791) The primary ISA-independent justification for using PC-relative addressing is that it makes code position-independent and therefore allows sharing of .text pages between processes. When not sharing .text pages, we can use absolute relocations instead, which will possibly prevent a bubble introduced by s_getpc_b64. Co-authored-by: Thomas Symalla <thomas.symalla@amd.com>
Revision tags: llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3
# 72a7024a	16-Aug-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Correctly lower llvm.sqrt.f32 Make codegen emit correctly rounded sqrt by default. Emit the fast but only kind of fast expansion in AMDGPUCodeGenPrepare based on !fpmath, like the fdiv case. Hack around visitation ordering problems from AMDGPUCodeGenPrepare using forward iteration instead of a well behaved combiner. https://reviews.llvm.org/D158129
# 4b7b4b94	14-Aug-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix fast f32 log/log10 OpenCL conformance didn't like interpreting afn as ignore the denormal handling. https://reviews.llvm.org/D157940
Revision tags: llvmorg-17.0.0-rc2
# 10304835	30-Jul-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU/GlobalISel: Handle stacksave/stackrestore https://reviews.llvm.org/D156670
Revision tags: llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6
# e3fd8f83	20-Nov-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Correctly expand f64 sqrt intrinsic rocm-device-libs and llpc were avoiding using f64 sqrt intrinsics in favor of their own expansions. Port the expansion into the backend. Both of these users should be updated to call the intrinsic instead. The library and llpc expansions are slightly different. llpc uses an ldexp to do the scale; the library uses a multiply. Use ldexp to do the scale instead of the multiply. I believe v_ldexp_f64 and v_mul_f64 are always the same number of cycles, but it's cheaper to materialize the 32-bit integer constant than the 64-bit double constant. The libraries have another fast version of sqrt which will be handled separately. I am tempted to do this in an IR expansion instead. In the IR we could take advantage of computeKnownFPClass to avoid the 0-or-inf argument check.
# 54916662	14-Jun-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Correctly lower llvm.exp.f32 The library expansion has too many paths for all the permutations of DAZ, unsafe and the 3 exp functions. It's easier to expand it in the backend when we know all of these things. The library currently misses the no-infinity check on the overflow, which this handles optimizing out. Some of the <3 x half> fast tests regress due to vector widening dropping flags which will be fixed separately. Apparently there is no exp10 intrinsic, but there should be. Adds some deadish code in preparation for adding one while I'm following along with the current library expansion.
# ed556a1a	14-Jun-2023	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Correctly lower llvm.exp2.f32 Previously this did a fast math expansion only.