@c Copyright (c) 2006, 2007, 2008 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
* Lambda::                      Linear loop transformations framework.
* Omega::                       A solver for linear programming problems.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and the
functions that can be used to build, modify and analyze this
representation.  Most of the interfaces and data structures are declared
in @file{cfgloop.h}.  At the moment, loop structures are analyzed and
this information is updated only by the optimization passes that deal
with loops, but some efforts are being made to make it available
throughout most of the optimization passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branch in the middle of the loop.
The representation of loops in GCC however allows only loops with a
single latch.  During loop analysis, headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information and the structure of the
induction variables in the loops is used to determine whether the
latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.  You may avoid CFG changes by passing the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery.
Note however that most other loop manipulation functions will not work
correctly for loops with multiple latch edges (the functions that only
query membership of blocks in loops and subloop relationships, or
enumerate and test loop exits, can be expected to work).

The body of the loop is the set of blocks that are dominated by its
header and reachable from its latch against the direction of edges in
the CFG@.  The loops are organized in a containment hierarchy (tree)
such that all the loops immediately contained inside loop L are the
children of L in the tree.  This tree is represented by the
@code{struct loops} structure.  The root of this tree is a fake loop
that contains all blocks in the function.  Each of the loops is
represented in a @code{struct loop} structure.  Each loop is assigned an
index (the @code{num} field of the @code{struct loop} structure), and
the pointer to the loop is stored in the corresponding field of the
@code{larray} vector in the loops structure.  The indices do not have to
be contiguous; there may be empty (@code{NULL}) entries in the
@code{larray} created by deleting loops.  Also, there is no guarantee on
the relative order of a loop and its subloops in the numbering.  The
index of a loop never changes.

The entries of the @code{larray} field should not be accessed directly.
The function @code{get_loop} returns the loop description for a loop
with the given index.  The @code{number_of_loops} function returns the
number of loops in the function.  To traverse all loops, use the
@code{FOR_EACH_LOOP} macro.  The @code{flags} argument of the macro is
used to determine the direction of traversal and the set of loops
visited.  Each loop is guaranteed to be visited exactly once, regardless
of the changes to the loop tree, and the loops may be removed during the
traversal.  The newly created loops are never traversed; if they need to
be visited, this must be done separately after their creation.  The
@code{FOR_EACH_LOOP} macro allocates temporary variables.  If the
@code{FOR_EACH_LOOP} loop were ended using @code{break} or @code{goto},
they would not be released; the @code{FOR_EACH_LOOP_BREAK} macro must be
used instead.

Each basic block contains a reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG@.  The global variable @code{current_loops} contains the
@code{struct loops} structure.  Many of the loop manipulation functions
assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
changes to the CFG will be performed in the loop analysis; in
particular, loops with multiple latch edges will not be disambiguated.
If a loop has multiple latches, its latch block is set to @code{NULL}.
Most of the loop manipulation functions will not work for loops in this
shape.  No other flags that require CFG changes can be passed to
@code{loop_optimizer_init}.
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulation of the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded
and updated for each loop.  This makes some functions (e.g.,
@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
@code{single_exit}) can be used only if the lists of exits are
recorded.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{record_loop_exits}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can
be used to clean up the CFG while updating the loops structures if
@code{current_loops} is set.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most useful
fields of the loop structure (that are kept up-to-date at all times)
are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerate the basic blocks in the
loop in depth-first search order on the reversed CFG, ordered by the
dominance relation, and in breadth-first search order, respectively.
@item @code{single_exit}: Returns the single exit edge of the loop, or
@code{NULL} if the loop has more than one exit.  You can only use this
function if the lists of exits are recorded (the
@code{LOOPS_HAVE_RECORDED_EXITS} property).
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize

@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Most low-level CFG functions update loops automatically.  The following
functions handle some more complicated cases of CFG manipulations:

@itemize
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{split_loop_exit_edge}: Splits an exit edge of the loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE@.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These functions
create a copy of a loop, and a branch before them that selects one of
them depending on the prescribed condition.  This is useful for
optimizations that need to verify some assumptions at runtime (one of
the copies of the loop is usually left unchanged, while the other one is
transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form: no SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than using generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.
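
To illustrate the rewrite, consider a sum accumulated in a loop and used
after it.  In the following hand-written GIMPLE-like sketch (simplified,
not actual compiler output, and with invented SSA names), rewriting into
LCSSA inserts a new single-argument PHI node @code{sum_3} at the loop
exit, so that the only use of @code{sum_1} outside the loop body is in
that PHI node:

@smallexample
@r{Before LCSSA:}                  @r{After LCSSA:}
loop:                            loop:
  sum_1 = PHI (0, sum_2);          sum_1 = PHI (0, sum_2);
  sum_2 = sum_1 + a_i;             sum_2 = sum_1 + a_i;
  if (cond) goto loop;             if (cond) goto loop;
                                 exit:
                                   sum_3 = PHI (sum_1);
use (sum_1);                     use (sum_3);
@end smallexample

A later transformation that duplicates or moves the loop body then only
needs to fix up @code{sum_3} locally, instead of rewriting uses
arbitrarily far away.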

@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE@.  They enable us to represent variables with
complicated behavior in a simple and consistent way (we only use it to
express values of polynomial induction variables, but it is possible to
extend it).  The interfaces to SCEV analysis are declared in
@file{tree-scalar-evolution.h}.  To use the scalar evolutions analysis,
@code{scev_initialize} must be called first; to stop using SCEV,
@code{scev_finalize} should be called.  SCEV analysis caches results in
order to save time and memory.  This cache, however, is invalidated by
most of the loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV, however,
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try replacing all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing complete
information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example:

@smallexample
while (@dots{})
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then
overflow in the addition would cause undefined behavior, and we may
assume that this does not happen.  Hence, the value with this SCEV
cannot overflow (which restricts the number of iterations of such a
loop).

In many cases, one wants to restrict attention just to affine induction
variables.  In that case, the extra expressive power of SCEV is not
useful, and may complicate the optimizations.  Here the
@code{simple_iv} function may be used to analyze a value -- the result
is a loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simpler, and only allows
analysis of affine induction variables, in one loop at a time.  The
interface is declared in @file{cfgloop.h}.  Before analyzing induction
variables in a loop L, the @code{iv_analysis_loop_init} function must be
called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception: if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be 0,
@code{mult} must be 1, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  The
number of iterations of a loop in GCC is defined as the number of
executions of the loop latch.  In many cases, it is not possible to
determine the number of iterations unconditionally -- the determined
number is correct only if some assumptions are satisfied.  The analysis
tries to verify these conditions using the information contained in the
program; if it fails, the conditions are returned together with the
result.  The following information and conditions are provided by the
analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL@.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize
445*e4b17023SJohn Marino
Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct
niter_desc} structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy-to-determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL@.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated queries for the number of iterations are not so costly --
@code{number_of_latch_executions} on GIMPLE and @code{get_simple_loop_desc}
on RTL.

Note that some of these functions may behave slightly differently from
the others -- some of them return only the expression for the number of
iterations, and fail if any assumptions are required.  The function
@code{number_of_latch_executions} works only for single-exit loops.
The function @code{number_of_cond_exit_executions} can be used to
determine the number of executions of the exit condition of a single-exit
loop (i.e., the result of @code{number_of_latch_executions} increased by
one).

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the current analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} into the dependence relation
array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references thus appear in the array in
the same order as they syntactically occur in the program.  This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

Three types of data references are currently handled: @code{ARRAY_REF},
@code{INDIRECT_REF} and @code{COMPONENT_REF}@.  The data structure for
the data reference is @code{data_reference}, where @code{data_reference_p}
is the name of a pointer to the data reference structure.  The structure
contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base object
of the data reference and its access functions.  These access functions
represent the evolution of the data reference in the loop relative to
its base, in keeping with the classical meaning of the data reference
access function for the support of arrays.  For example, for a reference
@code{a.b[i][j]}, the base object is @code{a.b} and the access functions,
one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the
first location accessed by the data reference in the loop and about the
access function used to represent the evolution relative to this
location.  This data is used to support pointers, and is not used for
arrays (for which we have base objects).  Pointer accesses are
represented as a one-dimensional access that starts from the first
location accessed in the loop.  For example:

@smallexample
      for1 i
         for2 j
          *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we can't
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
if both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (using access functions based
on base objects).
Otherwise, if both @code{a} and @code{b} are represented as pointers,
compare @code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (using access functions based
on first locations).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and there could potentially exist a
dependence between these data references, and to @code{NULL_TREE} if
there exists a dependence relation between the data references; in this
last case, the description of the dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints the
details of a data dependence relations array with maximum verbosity,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors of a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.

@node Lambda
@section Linear loop transformations framework
@cindex Linear loop transformations framework

Lambda is a framework that allows transformations of loops using
non-singular matrix-based transformations of the iteration space and
loop bounds.  This allows compositions of skewing, scaling, interchange,
and reversal transformations.  These transformations are often used to
improve cache behavior or remove inner loop dependencies to allow
parallelization and vectorization to take place.

To perform these transformations, Lambda requires that the loopnest be
converted into an internal form that can be matrix transformed easily.
To do this conversion, the function
@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
be transformed using lambda, this function will return NULL.

Once a @code{lambda_loopnest} is obtained from the conversion function,
it can be transformed by using @code{lambda_loopnest_transform}, which
takes a transformation matrix to apply.  Note that it is up to the
caller to verify that the transformation matrix is legal to apply to the
loop (dependence respecting, etc.).  Lambda simply applies whatever
matrix it is given.  It could be extended to make legal matrices
out of any non-singular matrix, but this is not currently implemented.
Legality of a matrix for a given loopnest can be verified using
@code{lambda_transform_legal_p}.

Given a transformed loopnest, conversion back into gcc IR is done by
@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
loops so that they match the transformed loopnest.


@node Omega
@section Omega, a solver for linear programming problems
@cindex Omega a solver for linear programming problems

The data dependence analysis contains several solvers, triggered
sequentially from the least complex to the most sophisticated ones.
To ensure the consistency of the results of these solvers, a data
dependence check pass has been implemented, based on two different
solvers.  The second method that has been integrated into GCC is based
on the Omega dependence solver, written in the 1990s by William Pugh
and David Wonnacott.  Data dependence tests can be formulated using a
subset of Presburger arithmetic that can be translated to linear
constraint systems.  These linear constraint systems can then be
solved using the Omega solver.

The Omega solver uses the Fourier-Motzkin algorithm for variable
elimination: a linear constraint system containing @code{n} variables
is reduced to a linear constraint system with @code{n-1} variables.
The Omega solver can also be used for solving other problems that can
be expressed in the form of a system of linear equalities and
inequalities.  The Omega solver is known to have an exponential worst
case, also known as the ``omega nightmare'' in the literature, but in
practice the omega test is known to be efficient for the common data
dependence tests.

The interface used by the Omega solver for describing linear
programming problems is described in @file{omega.h}, and the main
entry point of the solver is @code{omega_solve_problem}.