Revision tags: llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
#
7c7e368a |
| 14-Nov-2019 |
Sumanth Gundapaneni <sgundapa@quicinc.com> |
[Pipeliner] Fix an assertion caused by iterator invalidation.
|
#
9026518e |
| 02-Oct-2019 |
James Molloy <jmolloy@google.com> |
[ModuloSchedule] Peel out prologs and epilogs, generate actual code
Summary: This extends the PeelingModuloScheduleExpander to generate prolog and epilog code, and correctly stitch uses through the
[ModuloSchedule] Peel out prologs and epilogs, generate actual code
Summary: This extends the PeelingModuloScheduleExpander to generate prolog and epilog code, and correctly stitch uses through the prolog, kernel, epilog DAG.
The key concept in this patch is to ensure that all transforms are *local*; only a function of a block and its immediate predecessor and successor. By defining the problem in this way we can inductively rewrite the entire DAG using only local knowledge that is easy to reason about.
For example, we assume that all prologs and epilogs are near-perfect clones of the steady-state kernel. This means that if a block has an instruction that is predicated out, we can redirect all users of that instruction to that equivalent instruction in our immediate predecessor. As all blocks are clones, every instruction must have an equivalent in every other block.
Similarly we can make the assumption by construction that if a value defined in a block is used outside that block, the only possible user is its immediate successors. We maintain this even for values that are used outside the loop by creating a limited form of LCSSA.
This code isn't small, but it isn't complex.
Enabled a bunch of testing from Hexagon. There are a couple of tests not enabled yet; I'm about 80% sure there isn't buggy codegen but the tests are checking for patterns that we don't produce. Those still need a bit more investigation. In the meantime we (Google) are happy with the code produced by this on our downstream SMS implementation, and believe it generates correct code.
Subscribers: mgorny, hiraditya, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68205
llvm-svn: 373462
show more ...
|
#
f5524f04 |
| 26-Sep-2019 |
Changpeng Fang <changpeng.fang@gmail.com> |
Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint
Reviewers: arsenm
Differential Revision: https://reviews.llvm.org/D58360
llvm-svn: 373024
|
#
8a74eca3 |
| 21-Sep-2019 |
James Molloy <jmolloy@google.com> |
[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount
Recommit: fix asan errors.
The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one
[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount
Recommit: fix asan errors.
The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit).
This patch introduces a new API:
/// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const;
The return value is expected to be an implementation of the abstract class:
/// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0;
/// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0;
/// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0;
/// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0;
/// Called when the loop is being removed. virtual void disposed() = 0; };
The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp.
llvm-svn: 372463
show more ...
|
#
72a3d859 |
| 20-Sep-2019 |
Mitch Phillips <mitchphillips@outlook.com> |
Revert "[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount"
This commit broke the ASan buildbot. See comments in rL372376 for more information.
This reverts commit 15e27
Revert "[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount"
This commit broke the ASan buildbot. See comments in rL372376 for more information.
This reverts commit 15e27b0b6d9d51362fad85dbe95ac5b3fadf0a06.
llvm-svn: 372425
show more ...
|
#
15e27b0b |
| 20-Sep-2019 |
James Molloy <jmolloy@google.com> |
[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount
The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount.
[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount
The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit).
This patch introduces a new API:
/// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const;
The return value is expected to be an implementation of the abstract class:
/// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0;
/// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0;
/// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0;
/// Called when the loop's preheader has been modified to NewPreheader. virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0;
/// Called when the loop is being removed. virtual void disposed() = 0; };
The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp.
llvm-svn: 372376
show more ...
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4 |
|
#
fef9f590 |
| 04-Sep-2019 |
James Molloy <jmolloy@google.com> |
[ModuloSchedule] Introduce PeelingModuloScheduleExpander
This is the beginnings of a reimplementation of ModuloScheduleExpander. It works by generating a single-block correct pipelined kernel and th
[ModuloSchedule] Introduce PeelingModuloScheduleExpander
This is the beginnings of a reimplementation of ModuloScheduleExpander. It works by generating a single-block correct pipelined kernel and then peeling out the prolog and epilogs.
This patch implements kernel generation as well as a validator that will confirm the number of phis added is the same as the ModuloScheduleExpander.
Prolog and epilog peeling will come in a different patch.
Differential Revision: https://reviews.llvm.org/D67081
llvm-svn: 370893
show more ...
|
#
93549957 |
| 03-Sep-2019 |
James Molloy <jmolloy@google.com> |
[MachinePipeliner] Add a way to unit-test the schedule emitter
Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the H
[MachinePipeliner] Add a way to unit-test the schedule emitter
Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself.
One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway.
This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at.
We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with:
llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir
And run the emission in isolation with:
llc < test.mir -run-pass=modulo-schedule-test
llvm-svn: 370705
show more ...
|
#
790a779f |
| 30-Aug-2019 |
James Molloy <jmolloy@google.com> |
[MachinePipeliner] Separate schedule emission, NFC
This is the first stage in refactoring the pipeliner and making it more accessible for backends to override and control. This separates the logic a
[MachinePipeliner] Separate schedule emission, NFC
This is the first stage in refactoring the pipeliner and making it more accessible for backends to override and control. This separates the logic and state required to *emit* a scheudule from the logic that *computes* and validates a schedule.
This will enable (a) new schedule emitters and (b) new modulo scheduling implementations to coexist.
NFC.
Differential Revision: https://reviews.llvm.org/D67006
llvm-svn: 370500
show more ...
|
#
22714592 |
| 30-Aug-2019 |
Bjorn Pettersson <bjorn.a.pettersson@ericsson.com> |
[CodeGen] Introduce MachineBasicBlock::replacePhiUsesWith helper and use it. NFC
Summary: Found a couple of places in the code where all the PHI nodes of a MBB is updated, replacing references to on
[CodeGen] Introduce MachineBasicBlock::replacePhiUsesWith helper and use it. NFC
Summary: Found a couple of places in the code where all the PHI nodes of a MBB is updated, replacing references to one MBB by reference to another MBB instead.
This patch simply refactors the code to use a common helper (MachineBasicBlock::replacePhiUsesWith) for such PHI node updates.
Reviewers: t.p.northover, arsenm, uabelho
Subscribers: wdng, hiraditya, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66750
llvm-svn: 370463
show more ...
|
Revision tags: llvmorg-9.0.0-rc3 |
|
#
0c476111 |
| 15-Aug-2019 |
Daniel Sanders <daniel_l_sanders@apple.com> |
Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Re
Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
show more ...
|
Revision tags: llvmorg-9.0.0-rc2 |
|
#
6349ce5c |
| 09-Aug-2019 |
Jinsong Ji <jji@us.ibm.com> |
[MachinePipeliner] Avoid indeterminate order in FuncUnitSorter
Summary: This is exposed by adding a new testcase in PowerPC in https://reviews.llvm.org/rL367732
The testcase got different output on
[MachinePipeliner] Avoid indeterminate order in FuncUnitSorter
Summary: This is exposed by adding a new testcase in PowerPC in https://reviews.llvm.org/rL367732
The testcase got different output on different platform, hence breaking buildbots.
The problem is that we get differnt FuncUnitOrder when calculateResMII.
The root cause is: 1. Two MachineInstr might get SAME priority(MFUsx) from minFuncUnits. 2. Current comparison operator() will return `MFUs1 > MFUs2`. 3. We use iterators for MachineInstr, so the input to FuncUnitSorter might be different on differnt platform due to the iterator nature.
So for two MI with same MFU, their order is actually depends on the iterator order, which is platform (implemtation) dependent.
This is risky, and may cause cross-compiling problems.
The fix is to check make sure we assign a determine order when they are equal.
Reviewers: bcahoon, hfinkel, jmolloy
Subscribers: nemanjai, hiraditya, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65992
llvm-svn: 368441
show more ...
|
#
2bea69bf |
| 01-Aug-2019 |
Daniel Sanders <daniel_l_sanders@apple.com> |
Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC
llvm-svn: 367633
|
Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init |
|
#
95770866 |
| 12-Jul-2019 |
Jinsong Ji <jji@us.ibm.com> |
[MachinePipeliner] Fix order for nodes with Anti dependence in same cycle
Summary: Problem exposed in PowerPC functional testing.
We did not consider Anti dependence for nodes in same cycle, so we
[MachinePipeliner] Fix order for nodes with Anti dependence in same cycle
Summary: Problem exposed in PowerPC functional testing.
We did not consider Anti dependence for nodes in same cycle, so we may end up generating bad machine code. eg: the reduced test won't verify.
*** Bad machine code: Using an undefined physical register *** - function: lame_encode_buffer_interleaved - basic block: %bb.4 (0x4bde4e12928) - instruction: %29:gprc = ADDZE %27:gprc, implicit-def dead $carry, implicit $carry - operand 3: implicit $carry
Reviewers: bcahoon, kparzysz, hfinkel
Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64192
llvm-svn: 365859
show more ...
|
Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4 |
|
#
cbd64f76 |
| 09-Jul-2019 |
Jinsong Ji <jji@us.ibm.com> |
[MachinePipeliner] Fix Phi refers to Phi in same stage in 1st epilogue
Summary: This is exposed by functional testing on PowerPC. In some pipelined loops, Phi refer to phi did not get value defined
[MachinePipeliner] Fix Phi refers to Phi in same stage in 1st epilogue
Summary: This is exposed by functional testing on PowerPC. In some pipelined loops, Phi refer to phi did not get value defined by the Phi, hence getting wrong value later.
As the comment mentioned, we should "use the value defined by the Phi, unless we're generating the firstepilog and the Phi refers to a Phi in a different stage.", so Phi refering to same stage Phi should use the value defined by the Phi here.
Reviewers: bcahoon, hfinkel
Reviewed By: hfinkel
Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64035
llvm-svn: 365428
show more ...
|
Revision tags: llvmorg-8.0.1-rc3 |
|
#
fee855b5 |
| 25-Jun-2019 |
Jinsong Ji <jji@us.ibm.com> |
[MachinePipeliner] Fix risky iterator usage R++, --R
When we calculate MII, we use two loops, one with iterator R++ to check whether we can reserve the resource, then --R to move back the iterator t
[MachinePipeliner] Fix risky iterator usage R++, --R
When we calculate MII, we use two loops, one with iterator R++ to check whether we can reserve the resource, then --R to move back the iterator to do reservation.
This is risky, as R++, --R may not point to the same element at all. The can cause wrong MII.
Differential Revision: https://reviews.llvm.org/D63536
llvm-svn: 364353
show more ...
|
#
dc8de603 |
| 21-Jun-2019 |
Fangrui Song <maskray@google.com> |
Simplify std::lower_bound with llvm::{bsearch,lower_bound}. NFC
llvm-svn: 364006
|
#
ba43840b |
| 18-Jun-2019 |
Jinsong Ji <jji@us.ibm.com> |
[MachinePipeliner][NFC] Do resource tracking log only when requested.
In most cases we don't need to do resource tracking debug, so leave them off by default.
llvm-svn: 363733
|
#
1c884458 |
| 13-Jun-2019 |
Jinsong Ji <jji@us.ibm.com> |
[MachinePiepliner] Don't check boundary node in checkValidNodeOrder
This was exposed by PowerPC target enablement.
In ScheduleDAG, if we haven't seen any uses in this scheduling region, we will cre
[MachinePiepliner] Don't check boundary node in checkValidNodeOrder
This was exposed by PowerPC target enablement.
In ScheduleDAG, if we haven't seen any uses in this scheduling region, we will create a dependence edge to ExitSU to model the live-out latency. This is required for vreg defs with no in-region use, and prefetches with no vreg def.
When we build NodeOrder in Scheduler, we ignore these boundary nodes. However, when we check Succs in checkValidNodeOrder, we did not skip them, so we still assume all the nodes have been sorted and in order in Indices array. So when we call lower_bound() for ExitSU, it will return Indices.end(), causing memory issues in following Node access.
Differential Revision: https://reviews.llvm.org/D63282
llvm-svn: 363329
show more ...
|
#
ef2d6d99 |
| 11-Jun-2019 |
Jinsong Ji <jji@us.ibm.com> |
[PowerPC] Enable MachinePipeliner for P9 with -ppc-enable-pipeliner
Implement necessary target hooks to enable MachinePipeliner for P9 only. The pass is off by default, can be enabled with -ppc-enab
[PowerPC] Enable MachinePipeliner for P9 with -ppc-enable-pipeliner
Implement necessary target hooks to enable MachinePipeliner for P9 only. The pass is off by default, can be enabled with -ppc-enable-pipeliner for P9.
Differential Revision: https://reviews.llvm.org/D62164
llvm-svn: 363085
show more ...
|
Revision tags: llvmorg-8.0.1-rc2 |
|
#
6c5d5ce5 |
| 05-Jun-2019 |
Ulrich Weigand <ulrich.weigand@de.ibm.com> |
Allow target to handle STRICT floating-point nodes
The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes i
Allow target to handle STRICT floating-point nodes
The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions are depending on a floating-point status and control register, or mark instructions as possibly trapping).
This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules.
To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts:
- A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic.
Any MI instruction that is *both* marked as mayRaiseFPException *and* FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling).
Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files.
The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transfered to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today.
This patch includes both common code changes required to implement the new features, and the SystemZ implementation.
Reviewed By: andrew.w.kaylor
Differential Revision: https://reviews.llvm.org/D55506
llvm-svn: 362663
show more ...
|
#
18e7bf5c |
| 31-May-2019 |
Jinsong Ji <jji@us.ibm.com> |
[MachinePipeliner][NFC] Add some debug log and statistics
This is to add some log and statistics for debugging
Differential Revision: https://reviews.llvm.org/D62165
llvm-svn: 362233
|
#
c77aff7e |
| 29-May-2019 |
Richard Trieu <rtrieu@google.com> |
Inline a variable into debug section to fix unused variable warning.
llvm-svn: 361927
|
#
e8698ead |
| 29-May-2019 |
Richard Trieu <rtrieu@google.com> |
Inline value into debug statement to avoid unused variable warning.
llvm-svn: 361924
|
#
f6cb3bcb |
| 29-May-2019 |
Jinsong Ji <jji@us.ibm.com> |
Support resource tracking with InstrSchedModel
The current design use DFA to do resource tracking in SMS, and DFA only support InstrItins, and also has scaling limitation.
This patch extend SMS to
Support resource tracking with InstrSchedModel
The current design use DFA to do resource tracking in SMS, and DFA only support InstrItins, and also has scaling limitation.
This patch extend SMS to allow Subtarget to use ProcResource in InstrSchedModel instead.
Differential Revision: https://reviews.llvm.org/D62163
llvm-svn: 361919
show more ...
|