MachinePipeliner.cpp - OpenGrok history log for /llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 7c7e368a	14-Nov-2019	Sumanth Gundapaneni <sgundapa@quicinc.com>	[Pipeliner] Fix an assertion caused by iterator invalidation.
# 9026518e	02-Oct-2019	James Molloy <jmolloy@google.com>	[ModuloSchedule] Peel out prologs and epilogs, generate actual code Summary: This extends the PeelingModuloScheduleExpander to generate prolog and epilog code, and correctly stitch uses through the prolog, kernel, epilog DAG. The key concept in this patch is to ensure that all transforms are local; only a function of a block and its immediate predecessor and successor. By defining the problem in this way we can inductively rewrite the entire DAG using only local knowledge that is easy to reason about. For example, we assume that all prologs and epilogs are near-perfect clones of the steady-state kernel. This means that if a block has an instruction that is predicated out, we can redirect all users of that instruction to that equivalent instruction in our immediate predecessor. As all blocks are clones, every instruction must have an equivalent in every other block. Similarly we can make the assumption by construction that if a value defined in a block is used outside that block, the only possible user is its immediate successors. We maintain this even for values that are used outside the loop by creating a limited form of LCSSA. This code isn't small, but it isn't complex. Enabled a bunch of testing from Hexagon. There are a couple of tests not enabled yet; I'm about 80% sure there isn't buggy codegen but the tests are checking for patterns that we don't produce. Those still need a bit more investigation. In the meantime we (Google) are happy with the code produced by this on our downstream SMS implementation, and believe it generates correct code. Subscribers: mgorny, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68205 llvm-svn: 373462
# f5524f04	26-Sep-2019	Changpeng Fang <changpeng.fang@gmail.com>	Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D58360 llvm-svn: 373024
# 8a74eca3	21-Sep-2019	James Molloy <jmolloy@google.com>	[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount Recommit: fix asan errors. The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition.
virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372463
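The three-way return contract of createTripCountGreaterCondition quoted in the 8a74eca3 message (true, false, or nullopt plus a filled-in condition) can be illustrated outside LLVM. The following is only a minimal sketch under stated assumptions: ToyLoopInfo, its constructor, and the string-based Cond vector are hypothetical stand-ins invented for this example, std::optional replaces llvm::Optional, and the MBB parameter is dropped. It is not the target hook implementation.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy model of a target-provided loop object. None of these
// types are LLVM's; they only mimic the shape of the quoted interface.
class ToyLoopInfo {
  // Empty when the trip count is only known at run time.
  std::optional<int> StaticTripCount;

public:
  explicit ToyLoopInfo(std::optional<int> TC) : StaticTripCount(TC) {}

  // Returns true/false when the answer is statically known. Otherwise
  // returns nullopt and appends a placeholder test condition to Cond.
  std::optional<bool>
  createTripCountGreaterCondition(int TC, std::vector<std::string> &Cond) const {
    if (StaticTripCount)
      return *StaticTripCount > TC;
    // Assumption: a string stands in for real machine operands.
    Cond.push_back("tripcount > " + std::to_string(TC));
    return std::nullopt;
  }

  // Models "trip count becomes OriginalTC + TripCountAdjust" for the
  // statically known case; the unknown case is left untouched here.
  void adjustTripCount(int TripCountAdjust) {
    if (StaticTripCount)
      *StaticTripCount += TripCountAdjust;
  }
};
```

Keeping the query (createTripCountGreaterCondition) separate from the mutation (adjustTripCount) lets the caller ask about several thresholds before changing the loop, which is the separation the commit message credits for the cleaner code in ModuloSchedule.cpp.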
# 72a3d859	20-Sep-2019	Mitch Phillips <mitchphillips@outlook.com>	Revert "[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount" This commit broke the ASan buildbot. See comments in rL372376 for more information. This reverts commit 15e27b0b6d9d51362fad85dbe95ac5b3fadf0a06. llvm-svn: 372425
# 15e27b0b	20-Sep-2019	James Molloy <jmolloy@google.com>	[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition.
virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372376
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4
# fef9f590	04-Sep-2019	James Molloy <jmolloy@google.com>	[ModuloSchedule] Introduce PeelingModuloScheduleExpander This is the beginnings of a reimplementation of ModuloScheduleExpander. It works by generating a single-block correct pipelined kernel and then peeling out the prolog and epilogs. This patch implements kernel generation as well as a validator that will confirm the number of phis added is the same as the ModuloScheduleExpander. Prolog and epilog peeling will come in a different patch. Differential Revision: https://reviews.llvm.org/D67081 llvm-svn: 370893
# 93549957	03-Sep-2019	James Molloy <jmolloy@google.com>	[MachinePipeliner] Add a way to unit-test the schedule emitter Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself. One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway. This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at. We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with: llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir And run the emission in isolation with: llc < test.mir -run-pass=modulo-schedule-test llvm-svn: 370705
# 790a779f	30-Aug-2019	James Molloy <jmolloy@google.com>	[MachinePipeliner] Separate schedule emission, NFC This is the first stage in refactoring the pipeliner and making it more accessible for backends to override and control. This separates the logic and state required to emit a schedule from the logic that computes and validates a schedule. This will enable (a) new schedule emitters and (b) new modulo scheduling implementations to coexist. NFC. Differential Revision: https://reviews.llvm.org/D67006 llvm-svn: 370500
# 22714592	30-Aug-2019	Bjorn Pettersson <bjorn.a.pettersson@ericsson.com>	[CodeGen] Introduce MachineBasicBlock::replacePhiUsesWith helper and use it. NFC Summary: Found a couple of places in the code where all the PHI nodes of an MBB are updated, replacing references to one MBB with references to another MBB instead. This patch simply refactors the code to use a common helper (MachineBasicBlock::replacePhiUsesWith) for such PHI node updates. Reviewers: t.p.northover, arsenm, uabelho Subscribers: wdng, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66750 llvm-svn: 370463
Revision tags: llvmorg-9.0.0-rc3
# 0c476111	15-Aug-2019	Daniel Sanders <daniel_l_sanders@apple.com>	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041
Revision tags: llvmorg-9.0.0-rc2
# 6349ce5c	09-Aug-2019	Jinsong Ji <jji@us.ibm.com>	[MachinePipeliner] Avoid indeterminate order in FuncUnitSorter Summary: This is exposed by adding a new testcase in PowerPC in https://reviews.llvm.org/rL367732 The testcase got different output on different platforms, hence breaking buildbots. The problem is that we get a different FuncUnitOrder in calculateResMII. The root cause is: 1. Two MachineInstr might get the SAME priority (MFUsx) from minFuncUnits. 2. The current comparison operator() will return `MFUs1 > MFUs2`. 3. We use iterators for MachineInstr, so the input to FuncUnitSorter might be different on different platforms due to the iterator nature. So for two MI with the same MFU, their order actually depends on the iterator order, which is platform (implementation) dependent. This is risky, and may cause cross-compiling problems. The fix is to make sure we assign a deterministic order when they are equal. Reviewers: bcahoon, hfinkel, jmolloy Subscribers: nemanjai, hiraditya, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65992 llvm-svn: 368441
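The tie-breaking problem described in 6349ce5c can be shown in isolation. This is a minimal sketch, not the LLVM FuncUnitSorter: the Instr struct, its Id field, and the choice of Id as the secondary key are assumptions made for illustration, and the commit message does not say which secondary key the real fix uses. The point is only that a comparator returning `MFUs1 > MFUs2` leaves equal-priority elements in an input-dependent order, while a second, stable key removes that dependence.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine instruction; not an LLVM type.
struct Instr {
  int Id;                // stable identifier, independent of container order
  unsigned MinFuncUnits; // priority key, analogous to MFUs in the message
};

// Orders by priority first, then by the stable Id, so that ties resolve
// the same way no matter how the input happened to be ordered.
struct DeterministicSorter {
  bool operator()(const Instr &A, const Instr &B) const {
    if (A.MinFuncUnits != B.MinFuncUnits)
      return A.MinFuncUnits > B.MinFuncUnits;
    return A.Id < B.Id; // assumed tie-break key for this sketch
  }
};

// Sorts a copy of the input and returns the resulting sequence of Ids.
std::vector<int> sortedIds(std::vector<Instr> Instrs) {
  std::sort(Instrs.begin(), Instrs.end(), DeterministicSorter());
  std::vector<int> Ids;
  for (const Instr &I : Instrs)
    Ids.push_back(I.Id);
  return Ids;
}
```

With this comparator, two differently ordered inputs holding the same instructions yield the same Id sequence, which is the property the buildbots needed.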
# 2bea69bf	01-Aug-2019	Daniel Sanders <daniel_l_sanders@apple.com>	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633
Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init
# 95770866	12-Jul-2019	Jinsong Ji <jji@us.ibm.com>	[MachinePipeliner] Fix order for nodes with Anti dependence in same cycle Summary: Problem exposed in PowerPC functional testing. We did not consider Anti dependence for nodes in same cycle, so we may end up generating bad machine code. eg: the reduced test won't verify. * Bad machine code: Using an undefined physical register * - function: lame_encode_buffer_interleaved - basic block: %bb.4 (0x4bde4e12928) - instruction: %29:gprc = ADDZE %27:gprc, implicit-def dead $carry, implicit $carry - operand 3: implicit $carry Reviewers: bcahoon, kparzysz, hfinkel Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64192 llvm-svn: 365859
Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4
# cbd64f76	09-Jul-2019	Jinsong Ji <jji@us.ibm.com>	[MachinePipeliner] Fix Phi refers to Phi in same stage in 1st epilogue Summary: This is exposed by functional testing on PowerPC. In some pipelined loops, a Phi referring to a Phi did not get the value defined by the Phi, hence getting a wrong value later. As the comment mentioned, we should "use the value defined by the Phi, unless we're generating the firstepilog and the Phi refers to a Phi in a different stage.", so a Phi referring to a same-stage Phi should use the value defined by the Phi here. Reviewers: bcahoon, hfinkel Reviewed By: hfinkel Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64035 llvm-svn: 365428
Revision tags: llvmorg-8.0.1-rc3
# fee855b5	25-Jun-2019	Jinsong Ji <jji@us.ibm.com>	[MachinePipeliner] Fix risky iterator usage R++, --R When we calculate MII, we use two loops, one with iterator R++ to check whether we can reserve the resource, then --R to move back the iterator to do reservation. This is risky, as R++, --R may not point to the same element at all. This can cause a wrong MII. Differential Revision: https://reviews.llvm.org/D63536 llvm-svn: 364353
# dc8de603	21-Jun-2019	Fangrui Song <maskray@google.com>	Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC llvm-svn: 364006
# ba43840b	18-Jun-2019	Jinsong Ji <jji@us.ibm.com>	[MachinePipeliner][NFC] Do resource tracking log only when requested. In most cases we don't need to do resource tracking debug, so leave them off by default. llvm-svn: 363733
# 1c884458	13-Jun-2019	Jinsong Ji <jji@us.ibm.com>	[MachinePiepliner] Don't check boundary node in checkValidNodeOrder This was exposed by PowerPC target enablement. In ScheduleDAG, if we haven't seen any uses in this scheduling region, we will create a dependence edge to ExitSU to model the live-out latency. This is required for vreg defs with no in-region use, and prefetches with no vreg def. When we build NodeOrder in Scheduler, we ignore these boundary nodes. However, when we check Succs in checkValidNodeOrder, we did not skip them, so we still assume all the nodes have been sorted and in order in Indices array. So when we call lower_bound() for ExitSU, it will return Indices.end(), causing memory issues in following Node access. Differential Revision: https://reviews.llvm.org/D63282 llvm-svn: 363329
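The failure mode in 1c884458, where lower_bound() returns end() for a node that was never placed in the sorted array, can be reproduced with standard containers. The sketch below is an illustration under assumptions, not the checkValidNodeOrder code: IndexEntry and findPosition are hypothetical names, and the guard shown here (test the iterator before dereferencing) is a general defensive pattern, whereas the commit itself fixes the problem by skipping boundary nodes before the lookup.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical (node id, position) pairs kept sorted by node id, loosely
// modelled on the "Indices" array mentioned in the commit message.
using IndexEntry = std::pair<int, unsigned>;

// Returns true and sets Pos when NodeId is present. Returns false for a
// node that was never inserted (for example a boundary node), instead of
// dereferencing the end iterator.
bool findPosition(const std::vector<IndexEntry> &Indices, int NodeId,
                  unsigned &Pos) {
  auto It = std::lower_bound(
      Indices.begin(), Indices.end(), NodeId,
      [](const IndexEntry &E, int Id) { return E.first < Id; });
  if (It == Indices.end() || It->first != NodeId)
    return false;
  Pos = It->second;
  return true;
}
```

A lookup for an absent id leaves Pos untouched and reports failure, so the caller can skip that node rather than read past the array.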
# ef2d6d99	11-Jun-2019	Jinsong Ji <jji@us.ibm.com>	[PowerPC] Enable MachinePipeliner for P9 with -ppc-enable-pipeliner Implement necessary target hooks to enable MachinePipeliner for P9 only. The pass is off by default, can be enabled with -ppc-enable-pipeliner for P9. Differential Revision: https://reviews.llvm.org/D62164 llvm-svn: 363085
Revision tags: llvmorg-8.0.1-rc2
# 6c5d5ce5	05-Jun-2019	Ulrich Weigand <ulrich.weigand@de.ibm.com>	Allow target to handle STRICT floating-point nodes The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions as depending on a floating-point status and control register, or mark instructions as possibly trapping). This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules. To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts: - A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic. Any MI instruction that is both marked as mayRaiseFPException and FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling). Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files.
The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transferred to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today. This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663
# 18e7bf5c	31-May-2019	Jinsong Ji <jji@us.ibm.com>	[MachinePipeliner][NFC] Add some debug log and statistics This is to add some log and statistics for debugging Differential Revision: https://reviews.llvm.org/D62165 llvm-svn: 362233
# c77aff7e	29-May-2019	Richard Trieu <rtrieu@google.com>	Inline a variable into debug section to fix unused variable warning. llvm-svn: 361927
# e8698ead	29-May-2019	Richard Trieu <rtrieu@google.com>	Inline value into debug statement to avoid unused variable warning. llvm-svn: 361924
# f6cb3bcb	29-May-2019	Jinsong Ji <jji@us.ibm.com>	Support resource tracking with InstrSchedModel The current design uses DFA to do resource tracking in SMS, and DFA only supports InstrItins, and also has scaling limitations. This patch extends SMS to allow Subtarget to use ProcResource in InstrSchedModel instead. Differential Revision: https://reviews.llvm.org/D62163 llvm-svn: 361919