@c Copyright (c) 2006, 2007, 2008 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
* Lambda::                      Linear loop transformations framework.
* Omega::                       A solver for linear programming problems.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and the
functions that can be used to build, modify and analyze this
representation.  Most of the interfaces and data structures are declared
in @file{cfgloop.h}.  At the moment, loop structures are analyzed and
this information is updated only by the optimization passes that deal
with loops, but some efforts are being made to make it available
throughout most of the optimization passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from inside the
loop.  Loops with several latches may appear if several loops share a
single header, or if there is branching in the middle of the loop.  The
representation of loops in GCC, however, allows only loops with a single
latch.  During loop analysis, headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information and the structure of the
induction variables in the loops is used to determine whether the
latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.  You may avoid CFG changes by passing the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery; note,
however, that most other loop manipulation functions will not work
correctly for loops with multiple latch edges (the functions that only
query membership of blocks in loops and subloop relationships, or
enumerate and test loop exits, can be expected to work).

The body of the loop is the set of blocks that are dominated by its
header and reachable from its latch against the direction of edges in
the CFG@.  The loops are organized in a containment hierarchy (tree)
such that all the loops immediately contained inside loop L are the
children of L in the tree.  This tree is represented by the
@code{struct loops} structure.  The root of this tree is a fake loop
that contains all blocks in the function.  Each of the loops is
represented in a @code{struct loop} structure.  Each loop is assigned an
index (the @code{num} field of the @code{struct loop} structure), and
the pointer to the loop is stored in the corresponding field of the
@code{larray} vector in the loops structure.  The indices do not have to
be contiguous; there may be empty (@code{NULL}) entries in the
@code{larray} created by deleting loops.  Also, there is no guarantee on
the relative order of a loop and its subloops in the numbering.
The index of a loop never changes.

The entries of the @code{larray} field should not be accessed directly.
The function @code{get_loop} returns the loop description for a loop
with the given index.  The @code{number_of_loops} function returns the
number of loops in the function.  To traverse all loops, use the
@code{FOR_EACH_LOOP} macro.  The @code{flags} argument of the macro is
used to determine the direction of traversal and the set of loops
visited.  Each loop is guaranteed to be visited exactly once, regardless
of the changes to the loop tree, and the loops may be removed during the
traversal.  Newly created loops are never traversed; if they need to be
visited, this must be done separately after their creation.  The
@code{FOR_EACH_LOOP} macro allocates temporary variables.  If the
@code{FOR_EACH_LOOP} loop were ended using @code{break} or @code{goto},
they would not be released; the @code{FOR_EACH_LOOP_BREAK} macro must be
used instead.

Each basic block contains a reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG@.  The global variable @code{current_loops} contains the
@code{struct loops} structure.  Many of the loop manipulation functions
assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
changes to the CFG will be performed in the loop analysis; in
particular, loops with multiple latch edges will not be disambiguated.
If a loop has multiple latches, its latch block is set to @code{NULL}.
Most of the loop manipulation functions will not work for loops in this
shape.  No other flags that require CFG changes can be passed to
@code{loop_optimizer_init}.
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulation of the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.
Note that with this flag, the ``normal'' loop without any control flow
inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded
and updated for each loop.  This makes some functions (e.g.,
@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
@code{single_exit}) can be used only if the lists of exits are
recorded.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{record_loop_exits}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.
On GIMPLE, the @code{cleanup_tree_cfg_loop} function can be used to
clean up the CFG while updating the loop structures if
@code{current_loops} is set.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most useful
fields of the loop structure (that are kept up-to-date at all times)
are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimate the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerate the basic blocks in the
loop in depth-first search order in the reversed CFG, in the order given
by the dominance relation, and in breadth-first search order,
respectively.
@item @code{single_exit}: Returns the single exit edge of the loop, or
@code{NULL} if the loop has more than one exit.  You can only use this
function if the @code{LOOPS_HAVE_RECORDED_EXITS} property is used.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize

@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Most low-level CFG functions update loops automatically.  The following
functions handle some more complicated cases of CFG manipulations:

@itemize
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{split_loop_exit_edge}: Splits an exit edge of the loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE@.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These functions
create a copy of a loop, and a branch before them that selects one of
them depending on the prescribed condition.
This is useful for
optimizations that need to verify some assumptions at runtime (one of
the copies of the loop is usually left unchanged, while the other one is
transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on the tree level, one extra condition
is enforced on the SSA form: no SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than using generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite the SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.

@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent the results of induction
variable analysis on GIMPLE@.  They enable us to represent variables
with complicated behavior in a simple and consistent way (we only use it
to express values of polynomial induction variables, but it is possible
to extend it).  The interfaces to SCEV analysis are declared in
@file{tree-scalar-evolution.h}.  To use the scalar evolutions analysis,
@code{scev_initialize} must be used.  To stop using SCEV,
@code{scev_finalize} should be used.  SCEV analysis caches results in
order to save time and memory.  This cache, however, is invalidated by
most of the loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV, however,
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try replacing all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing complete
information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.
A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example:

@smallexample
while (@dots{})
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has a signed type, then
overflow in the addition would cause undefined behavior, and we may
assume that this does not happen.  Hence, the value with this SCEV
cannot overflow (which restricts the number of iterations of such a
loop).

In many cases, one wants to restrict attention just to affine induction
variables.  The extra expressive power of SCEV is then not useful and
may complicate the optimizations; in this case, the @code{simple_iv}
function may be used to analyze a value -- the result is a
loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simpler: it only allows
analysis of affine induction variables, and only in one loop at a time.
The interface is declared in @file{cfgloop.h}.
Before analyzing induction variables
in a loop L, the @code{iv_analysis_loop_init} function must be called on
L.  After the analysis (possibly calling @code{iv_analysis_loop_init}
for several loops) is finished, @code{iv_analysis_done} should be
called.  The following functions can be used to access the results of
the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.
In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception: if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be 0,
@code{mult} must be 1, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  The
number of iterations of a loop in GCC is defined as the number of
executions of the loop latch.
In many cases, it is not possible to
determine the number of iterations unconditionally -- the determined
number is correct only if some assumptions are satisfied.  The analysis
tries to verify these conditions using the information contained in the
program; if it fails, the conditions are returned together with the
result.  The following information and conditions are provided by the
analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL@.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in a @code{struct tree_niter_desc}
structure.
The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct niter_desc}
structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy-to-determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL@.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated calls to number of iterations are not so costly --
@code{number_of_latch_executions} on GIMPLE and @code{get_simple_loop_desc}
on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_latch_executions} works only for single-exit loops.
The function @code{number_of_cond_exit_executions} can be used to
determine the number of executions of the exit condition of a single-exit
loop (i.e., the @code{number_of_latch_executions} increased by one).

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that can potentially
interfere with the data references in the current analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} in the dependence relation
array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references syntactically occur in the
program in the same order as in the array of data references.  This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

Three types of data references are currently handled: @code{ARRAY_REF},
@code{INDIRECT_REF} and @code{COMPONENT_REF}.  The data structure for the
data reference is @code{data_reference}, where @code{data_reference_p} is
the name of a pointer to the data reference structure.  The structure
contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base object
of the data reference and its access functions.  These access functions
represent the evolution of the data reference in the loop relative to
its base, in keeping with the classical meaning of the data reference
access function for the support of arrays.  For example, for a reference
@code{a.b[i][j]}, the base object is @code{a.b} and the access functions,
one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the first
location accessed by the data reference in the loop and about the access
function used to represent the evolution relative to this location.  This
data is used to support pointers, and is not used for arrays (for which we
have base objects).  Pointer accesses are represented as a one-dimensional
access that starts from the first location accessed in the loop.  For
example:

@smallexample
for1 i
   for2 j
    *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we can't
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based on
base_objects).
Else if both @code{a} and @code{b} are represented as pointers, compare
@code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based on
first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.
This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able to
determine any useful result and potentially there could exist a
dependence between these data references, and to @code{NULL_TREE} if
there exists a dependence relation between the data references; in the
last case, the description of this dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.

@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints with
maximum verbosity the details of a data dependence relations array,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.

@node Lambda
@section Linear loop transformations framework
@cindex Linear loop transformations framework

Lambda is a framework that allows transformations of loops using
non-singular matrix based transformations of the iteration space and
loop bounds.  This allows compositions of skewing, scaling, interchange,
and reversal transformations.  These transformations are often used to
improve cache behavior or remove inner loop dependencies to allow
parallelization and vectorization to take place.

To perform these transformations, Lambda requires that the loopnest be
converted into an internal form that can be matrix transformed easily.
To do this conversion, the function
@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
be transformed using lambda, this function will return NULL.

Once a @code{lambda_loopnest} is obtained from the conversion function,
it can be transformed by using @code{lambda_loopnest_transform}, which
takes a transformation matrix to apply.  Note that it is up to the
caller to verify that the transformation matrix is legal to apply to the
loop (dependence respecting, etc.).  Lambda simply applies whatever
matrix it is given.  It can be extended to make legal matrices
out of any non-singular matrix, but this is not currently implemented.
Legality of a matrix for a given loopnest can be verified using
@code{lambda_transform_legal_p}.

Given a transformed loopnest, conversion back into GCC IR is done by
@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
loops so that they match the transformed loopnest.


@node Omega
@section Omega, a solver for linear programming problems
@cindex Omega a solver for linear programming problems

The data dependence analysis contains several solvers, triggered
sequentially from the less complex ones to the more sophisticated ones.
To ensure the consistency of the results of these solvers, a data
dependence check pass has been implemented based on two different
solvers.  The second method that has been integrated into GCC is based
on the Omega dependence solver, written in the 1990s by William Pugh
and David Wonnacott.  Data dependence tests can be formulated using a
subset of Presburger arithmetic that can be translated to linear
constraint systems.  These linear constraint systems can then be
solved using the Omega solver.

The Omega solver uses the Fourier-Motzkin algorithm for variable
elimination: a linear constraint system containing @code{n} variables
is reduced to a linear constraint system with @code{n-1} variables.
The Omega solver can also be used for solving other problems that can
be expressed in the form of a system of linear equalities and
inequalities.  The Omega solver is known to have an exponential worst
case, also known under the name of ``omega nightmare'' in the
literature, but in practice, the omega test is known to be efficient
for the common data dependence tests.

The interface used by the Omega solver for describing the linear
programming problems is described in @file{omega.h}, and the solver is
@code{omega_solve_problem}.