llvm.mulo.ll - OpenGrok history log for /llvm-project/llvm/test/CodeGen/AMDGPU/llvm.mulo.ll

Revision	Date	Author	Comments
Revision tags: llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4
# 6548b635	09-Nov-2024	Shilei Tian <i@tianshilei.me>	Reapply "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)" This reverts commit ca33649abe5fad93c57afef54e43ed9b3249cd86.
# ca33649a	08-Nov-2024	Shilei Tian <i@tianshilei.me>	Revert "[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)" This reverts commit e215a1e27d84adad2635a52393621eb4fa439dc9 as it broke both hip and openmp buildbots.
# e215a1e2	08-Nov-2024	Shilei Tian <i@tianshilei.me>	[AMDGPU] Still set up the two SGPRs for queue ptr even it is COV5 (#112403)
Revision tags: llvmorg-19.1.3
# 3277c7cd	21-Oct-2024	Stanislav Mekhanoshin <rampitec@users.noreply.github.com>	[AMDGPU] Skip VGPR deallocation for waveslot limited kernels (#112765) MSG_DEALLOC_VGPRS slows down very small waveslot limited kernels. It's been identified this message is only really needed for VGPR limited kernels. A kernel becomes VGPR limited if a total number of VGPRs per SIMD / number of used VGPRs is more than a number of wave slots.
Revision tags: llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init
# 229e1185	23-Jul-2024	Christudasan Devadasan <christudasan.devadasan@amd.com>	[AMDGPU] Codegen support for constrained multi-dword sloads (#96163) For targets that support xnack replay feature (gfx8+), the multi-dword scalar loads shouldn't clobber any register that holds the src address. The constrained version of the scalar loads have the early clobber flag attached to the dst operand to restrict RA from re-allocating any of the src regs for its dst operand.
# b1bcb7ca	15-Jul-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	Reapply "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commit adaff46d087799072438dd744b038e6fd50a2d78. Drop the -O3 checks from default-attributes.hip. I don't know why they are different on some bots but reverting this is far too disruptive.
# adaff46d	15-Jul-2024	dyung <douglas.yung@sony.com>	Revert "AMDGPU: Move attributor into optimization pipeline (#83131)" and follow up commit "clang/AMDGPU: Defeat attribute optimization in attribute test" (#98851) This reverts commits 677cc15e0ff2e0e6aa30538eb187990a6a8f53c0 and 78bc1b64a6dc3fb6191355a5e1b502be8b3668e7. The test CodeGenHIP/default-attributes.hip is failing on multiple bots even after the attempted fix including the following: - https://lab.llvm.org/buildbot/#/builders/3/builds/1473 - https://lab.llvm.org/buildbot/#/builders/65/builds/1380 - https://lab.llvm.org/buildbot/#/builders/161/builds/595 - https://lab.llvm.org/buildbot/#/builders/154/builds/1372 - https://lab.llvm.org/buildbot/#/builders/133/builds/1547 - https://lab.llvm.org/buildbot/#/builders/81/builds/755 - https://lab.llvm.org/buildbot/#/builders/40/builds/570 - https://lab.llvm.org/buildbot/#/builders/13/builds/748 - https://lab.llvm.org/buildbot/#/builders/12/builds/1845 - https://lab.llvm.org/buildbot/#/builders/11/builds/1695 - https://lab.llvm.org/buildbot/#/builders/190/builds/1829 - https://lab.llvm.org/buildbot/#/builders/193/builds/962 - https://lab.llvm.org/buildbot/#/builders/23/builds/991 - https://lab.llvm.org/buildbot/#/builders/144/builds/2256 - https://lab.llvm.org/buildbot/#/builders/46/builds/1614 These bots have been broken for a day, so reverting to get everything back to green.
# 78bc1b64	14-Jul-2024	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Move attributor into optimization pipeline (#83131) Removing it from the codegen pipeline induces a lot of test churn because llc is no longer optimizing out implicit arguments to kernels. Mostly mechanical, but there are some creative test updates. I preferred to take the changes as-is in tests where the ABI isn't relevant. In cases where it's more relevant, or the optimize out logic was too ingrained in the test, I pre-run the optimization. Some cases manually add attributes to disable inputs.
Revision tags: llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init
# ba52f06f	18-Jan-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438) Update SIMemoryLegalizer and SIInsertWaitcnts to use separate wait instructions per counter (e.g. S_WAIT_LOADCNT) and split VMCNT into separate LOADCNT, SAMPLECNT and BVHCNT counters.
# e9e9d1b0	17-Jan-2024	Jay Foad <jay.foad@amd.com>	[AMDGPU] Disable V_MAD_U64_U32/V_MAD_I64_I32 workaround for GFX12 (#77927)
# 9e9907f1	17-Jan-2024	Fangrui Song <i@maskray.me>	[AMDGPU,test] Change llc -march= to -mtriple= (#75982) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outrightly. This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed: ``` LLVM :: CodeGen/AMDGPU/fabs.f64.ll LLVM :: CodeGen/AMDGPU/fabs.ll LLVM :: CodeGen/AMDGPU/floor.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.ll LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll LLVM :: CodeGen/AMDGPU/schedule-if-2.ll ```
# dd051295	14-Dec-2023	Valery Pykhtin <valery.pykhtin@gmail.com>	[AMDGPU] Enable GCNRewritePartialRegUses pass by default. (#72975) Let's try once again after #69957 has landed.
Revision tags: llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init
# 7fa7a08f	19-Jul-2023	Jay Foad <jay.foad@amd.com>	[AMDGPU] Insert s_nop before s_sendmsg sendmsg(MSG_DEALLOC_VGPRS) Differential Revision: https://reviews.llvm.org/D155681
# f2c164c8	21-Jun-2023	Jay Foad <jay.foad@amd.com>	[AMDGPU] Do not wait for vscnt on function entry and return SIInsertWaitcnts inserts waitcnt instructions to resolve data dependencies. The GFX10+ vscnt (VMEM store count) counter is never used in this way. It is only used to resolve memory dependencies, and that is handled by SIMemoryLegalizer. Hence there is no need to conservatively wait for vscnt to be 0 on function entry and before returns. Differential Revision: https://reviews.llvm.org/D153537
Revision tags: llvmorg-16.0.6
# 342acfc9	06-Jun-2023	Valery Pykhtin <valery.pykhtin@gmail.com>	[AMDGPU] Turn off pass to rewrite partially used virtual superregisters after RenameIndependentSubregs pass with registers of minimal size. There is a failure with this pass in the case when target register class for a subregister isn't known from instruction description (for ex. COPY). Currently in this situation the RC is obtained using TargetRegisterInfo::getSubRegisterClass but in general it's not working. In order to fix this two things should be done: 1. Stop processing a subregister if the target register class is unknown (conservative approach) 2. Improve deduction of subregister' target register class (i.e by processing COPY chain) I was going to implement point 1 but my tests use implicit operands for S_NOP and they don't have associated target register class and all tests fail. Therefore I decided to turn off the pass now, implement point 1 and fix my tests. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D152291
Revision tags: llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7
# 8d0412ce	09-Dec-2022	Valery Pykhtin <valery.pykhtin@gmail.com>	[AMDGPU] Add pass to rewrite partially used virtual superregisters after RenameIndependentSubregs pass with registers of minimal size. The main purpose of this is to simplify register pressure tracking as after the pass there is no need to track subreg liveness anymore. On the other hand this pass creates more possibilites for the subreg unaware code, as many of the subregs becomes ordinary registers. Intersting sideeffect: spill-vgpr.ll has lost a lot of spills. Reviewed By: #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D139732
Revision tags: llvmorg-15.0.6
# fb1d166e	25-Nov-2022	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Bulk update some generic intrinsic tests to opaque pointers Done purely with the script.
# 30eff7f2	29-Nov-2022	Simon Pilgrim <llvm-dev@redking.me.uk>	[DAG] Attempt to replace a mul node with an existing umul_lohi/smul_lohi node (PR59217) As discussed on Issue #59217, under certain circumstances the DAG can generate duplicate MUL and MUL_LOHI nodes, often during MULO legalization. This patch attempts to replace MUL nodes with additional uses of the LO result from the MUL_LOHI node Differential Revision: https://reviews.llvm.org/D138790
Revision tags: llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, working, llvmorg-15.0.2, llvmorg-15.0.1
# fbdea5a2	09-Sep-2022	Alexander Timofeev <alexander.timofeev@amd.com>	[AMDGPU] Always select s_cselect_b32 for uniform 'select' SDNode This patch contains changes necessary to carry physical condition register (SCC) dependencies through the SDNode scheduler. It adds the edge in the SDNodeScheduler dependency graph instead of inserting the SCC copy between each definition and use. This approach lets the scheduler place instructions in an optimal way placing the copy only when the dependency cannot be resolved. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D133593
Revision tags: llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# d1af09ad	23-Jun-2022	Joe Nash <Joseph.Nash@amd.com>	[AMDGPU] gfx11 Generate VOPD Instructions We form VOPD instructions in the GCNCreateVOPD pass by combining back-to-back component instructions. There are strict register constraints for creating a legal VOPD, namely that the matching operands (e.g. src0x and src0y, src1x and src1y) must be in different register banks. We add a PostRA scheduler mutation to put possible VOPD components back-to-back. Depends on D128442, D128270 Reviewed By: #amdgpu, rampitec Differential Revision: https://reviews.llvm.org/D128656
# 0f94d2b3	30-Jun-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] GFX11: automatically release VGPRs at the end of the shader GFX11 has a new message type MSG_DEALLOC_VGPRS which can be used to release a shader's VGPRs. Sending this at the end of a shader (just before the s_endpgm) can help overall system performance in cases where the s_endpgm would have to wait for outstanding VMEM stores to complete before releasing the VGPRs. Differential Revision: https://reviews.llvm.org/D128442
Revision tags: llvmorg-14.0.6
# cfb7ffde	21-Jun-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] New AMDGPUInsertDelayAlu pass Differential Revision: https://reviews.llvm.org/D128270
# 77851cc1	15-Jun-2022	David Stuttard <david.stuttard@amd.com>	[AMDGPU] Change use null for dead sdst to be gfx1030+ Pre gfx1030 null for sdst is different. c97436f8b6e2 [AMDGPU] Use null for dead sdst operand - requires a change to make it not apply to pre gfx1030 Differential Revision: https://reviews.llvm.org/D127869
# c97436f8	10-Jun-2022	Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	[AMDGPU] Use null for dead sdst operand Differential Revision: https://reviews.llvm.org/D127542
# d943c514	13-Jun-2022	Jay Foad <jay.foad@amd.com>	[AMDGPU] Fix GFX11 codegen for V_MAD_U64_U32 and V_MAD_I64_I32 GFX11 uses different pseudos for these because of a new constraint on which operands' registers can overlap. Differential Revision: https://reviews.llvm.org/D127659