SIFoldOperands.cpp - OpenGrok history log for /llvm-project/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1
# 69e3001b	11-Jan-2017	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix folding immediates into mac src2 Whether it is legal or not needs to check for the instruction it will be replaced with. llvm-svn: 291711
# 51818c14	10-Jan-2017	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Constant fold when immediate is materialized In future commits these patterns will appear after moveToVALU changes. llvm-svn: 291615
# 4bd72361	10-Dec-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix handling of 16-bit immediates Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306
# 8485fa09	07-Dec-2016	Tom Stellard <thomas.stellard@amd.com>	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues. Patch By: Wei Ding Summary: This patch fixes the fdiv precision issues. Reviewers: b-sumner, cfang, wdng, arsenm Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26424 llvm-svn: 288879
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2
# ff8bb49b	29-Nov-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Refactor immediate folding logic Change the logic for when to fold immediates to consider the destination operand rather than the source of the materializing mov instruction. No change yet, but this will allow for correctly handling i16/f16 operands. Since 32-bit moves are used to materialize constants for these, the same bitvalue will not be in the register. llvm-svn: 288184
Revision tags: llvmorg-3.9.1-rc1
# a24d84be	23-Nov-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Cleanup immediate folding code Move code down to use, reorder to avoid hard to follow immediate folding logic. llvm-svn: 287818
# 391c3ea9	23-Nov-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix debug printing The uint8_t was printed as a char which didn't really work. llvm-svn: 287817
# f86e4b72	13-Nov-2016	Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>	[AMDGPU] Add f16 support (VI+) Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753
# 5e63a04e	06-Oct-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Don't fold undef uses or copies with implicit uses llvm-svn: 283476
# c2ee42cd	06-Oct-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Remove leftover implicit operands when folding immediates When constant folding an operation to a copy or an immediate mov, the implicit uses/defs of the old instruction were left behind, e.g. replacing v_or_b32 left the implicit exec use on the new copy. llvm-svn: 283471
# 117296c0	01-Oct-2016	Mehdi Amini <mehdi.amini@apple.com>	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004
# 2bc198a3	14-Sep-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Support folding FrameIndex operands This avoids test regressions in a future commit. llvm-svn: 281491
# fa5f767a	14-Sep-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Improve splitting 64-bit bit ops by constants This addresses a TODO to handle operations besides and. This also starts eliminating no-op operations with a constant that can emerge later. llvm-svn: 281488
Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2
# 3661e90e	15-Aug-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Don't fold subregister extracts into tied operands llvm-svn: 278676
Revision tags: llvmorg-3.9.0-rc1
# 9cfc75c2	30-Jun-2016	Duncan P. N. Exon Smith <dexonsmith@apple.com>	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189
# 43e92fe3	24-Jun-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Cleanup subtarget handling. Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1
# 7de74af9	25-Apr-2016	Andrew Kaylor <andrew.kaylor@intel.com>	Add optimization bisect opt-in calls for AMDGPU passes Differential Revision: http://reviews.llvm.org/D19450 llvm-svn: 267485
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3
# 427c5489	11-Feb-2016	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix passes depending on dominator tree for no reason llvm-svn: 260494
Revision tags: llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1
# 926c56f5	13-Jan-2016	Marek Olsak <marek.olsak@amd.com>	AMDGPU/SI: Fix a bug in SIFoldOperands Summary: ret.ll will contain a test for this Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16029 llvm-svn: 257590
# 82fc962c	07-Jan-2016	Nicolai Haehnle <nhaehnle@gmail.com>	AMDGPU/SI: Fold operands with sub-registers Summary: Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs, increasing the code size and VGPR pressure. These moves are now folded away. Note that this lack of operand folding was not a problem for VMEM loads, because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register coalescer. Some tests are updated, note that the fsub.ll test explicitly checks that the move is elided. With the IR generated by current Mesa, the changes are obviously relatively minor: 7063 shaders in 3531 tests Totals: SGPRS: 351872 -> 352560 (0.20 %) VGPRS: 199984 -> 200732 (0.37 %) Code Size: 9876968 -> 9881112 (0.04 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave Wait states: 295164 -> 295337 (0.06 %) Totals from affected shaders: SGPRS: 65784 -> 66472 (1.05 %) VGPRS: 38064 -> 38812 (1.97 %) Code Size: 1993828 -> 1997972 (0.21 %) bytes LDS: 42 -> 42 (0.00 %) blocks Scratch: 795648 -> 783360 (-1.54 %) bytes per wave Wait states: 54026 -> 54199 (0.32 %) Reviewers: tstellarAMD, arsenm, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15875 llvm-svn: 257074
Revision tags: llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1
# e8c0891e	21-Oct-2015	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix verifier error in SIFoldOperands There may be other use operands that also need their kill flags cleared. This happens in a few tests when SIFoldOperands is moved after PeepholeOptimizer. PeepholeOptimizer rewrites cases that look like: %vreg0 = ... %vreg1 = COPY %vreg0 use %vreg1<kill> %vreg2 = COPY %vreg0 use %vreg2<kill> to use the earlier source to %vreg0 = ... use %vreg0 use %vreg0 Currently SIFoldOperands sees the copied registers, so there is only one use. So far I haven't managed to come up with a test that currently has multiple uses of a foldable VGPR -> VGPR copy. llvm-svn: 250960
# 16c4da03	28-Sep-2015	Andrew Kaylor <andrew.kaylor@intel.com>	Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735
# 0cb8517d	25-Sep-2015	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU: Fix recomputing dominator tree unnecessarily SIFixSGPRCopies does not modify the CFG, but this was being recomputed before running SIFoldOperands. llvm-svn: 248587
# ad46e0c1	10-Sep-2015	Matt Arsenault <Matthew.Arsenault@amd.com>	AMDGPU/SI: Fix creating v_mov_b32s without exec uses This will be caught by existing tests with a verifier check to be added in a future commit. llvm-svn: 247229
# 9a197676	09-Sep-2015	Tom Stellard <thomas.stellard@amd.com>	AMDGPU/SI: Fold operands through REG_SEQUENCE instructions Summary: This helps mostly when we use add instructions for address calculations that contain immediates. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12256 llvm-svn: 247157