@c Copyright (C) 2006-2015 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related
analyses (induction variable analysis and number of iterations
analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
* Omega::                       A solver for linear programming problems.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and the
functions that can be used to build, modify and analyze this
representation.  Most of the interfaces and data structures are
declared in @file{cfgloop.h}.  Loop structures are analyzed, and this
information is discarded or updated at the discretion of individual
passes.  Still, most of the generic CFG manipulation routines are aware
of loop structures and try to keep them up-to-date.  In this way an
increasing part of the compilation pipeline is set up to maintain loop
structures across passes, which allows meta information to be attached
to individual loops for consumption by later passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branching in the middle of the loop.
The representation of loops in GCC however allows only loops with a
single latch.  During loop analysis, headers of such loops are split
and forwarder blocks are created in order to disambiguate their
structures.  A heuristic based on profile information and the structure
of the induction variables in the loops is used to determine whether
the latches correspond to sub-loops or to control flow in a single
loop.  This means that the analysis sometimes changes the CFG, and if
you run it in the middle of an optimization pass, you must be able to
deal with the new blocks.  You may avoid CFG changes by passing the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery; note
however that most other loop manipulation functions will not work
correctly for loops with multiple latch edges (the functions that only
query membership of blocks in loops and subloop relationships, or
enumerate and test loop exits, can be expected to work).

The body of a loop is the set of blocks that are dominated by its
header, and reachable from its latch against the direction of edges in
the CFG@.
The loops are organized in a containment hierarchy (a tree) such that
all the loops immediately contained inside a loop L are the children of
L in the tree.  This tree is represented by the @code{struct loops}
structure.  The root of this tree is a fake loop that contains all
blocks in the function.  Each of the loops is represented in a
@code{struct loop} structure.  Each loop is assigned an index (the
@code{num} field of the @code{struct loop} structure), and the pointer
to the loop is stored in the corresponding field of the @code{larray}
vector in the loops structure.  The indices do not have to be
continuous; there may be empty (@code{NULL}) entries in the
@code{larray} created by deleting loops.  Also, there is no guarantee
on the relative order of a loop and its subloops in the numbering.  The
index of a loop never changes.

The entries of the @code{larray} field should not be accessed directly.
The function @code{get_loop} returns the loop description for a loop
with the given index.  The @code{number_of_loops} function returns the
number of loops in the function.  To traverse all loops, use the
@code{FOR_EACH_LOOP} macro.  The @code{flags} argument of the macro is
used to determine the direction of traversal and the set of loops
visited.  Each loop is guaranteed to be visited exactly once,
regardless of the changes to the loop tree, and the loops may be
removed during the traversal.  The newly created loops are never
traversed; if they need to be visited, this must be done separately
after their creation.  The @code{FOR_EACH_LOOP} macro allocates
temporary variables.  If the @code{FOR_EACH_LOOP} loop were ended using
break or goto, they would not be released; the
@code{FOR_EACH_LOOP_BREAK} macro must be used instead.
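
As an illustration, a pass that visits every loop might be structured
as in the following sketch.  This is only an illustration, not a
definitive recipe: the exact form of @code{FOR_EACH_LOOP} has varied
between GCC releases, and @code{process_loop} stands for whatever
per-loop work the pass performs:

@smallexample
loop_iterator li;
struct loop *loop;

/* Visit inner loops before the loops containing them.  */
FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
  @{
    if (!process_loop (loop))
      /* Do not leave the traversal with break or goto; this releases
         the temporaries allocated by FOR_EACH_LOOP.  */
      FOR_EACH_LOOP_BREAK (li);
  @}
@end smallexample
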
Each basic block contains the reference to the innermost loop it
belongs to (@code{loop_father}).  For this reason, it is only possible
to have one @code{struct loops} structure initialized at the same time
for each CFG@.  The global variable @code{current_loops} contains the
@code{struct loops} structure.  Many of the loop manipulation functions
assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
changes to the CFG will be performed in the loop analysis, in
particular, loops with multiple latch edges will not be disambiguated.
If a loop has multiple latches, its latch block is set to NULL@.  Most
of the loop manipulation functions will not work for loops in this
shape.  No other flags that require CFG changes can be passed to
@code{loop_optimizer_init}.
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulation of the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded
and updated for each loop.  This makes some functions (e.g.,
@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
@code{single_exit}) can be used only if the lists of exits are
recorded.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{record_loop_exits}.
The properties can be queried using @code{loops_state_satisfies_p}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.  When loop structures are set
up to be preserved across passes, this function reduces the information
to be kept up-to-date to a minimum (only the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} state remains set).
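
A pass that performs its own loop discovery therefore typically
brackets its work as in the following sketch.  The combination of flags
shown is only an example; @file{cfgloop.h} also provides the
@code{LOOPS_NORMAL} shorthand for a commonly used set:

@smallexample
loop_optimizer_init (LOOPS_HAVE_PREHEADERS
                     | LOOPS_HAVE_SIMPLE_LATCHES
                     | LOOPS_HAVE_RECORDED_EXITS);

/* current_loops is now set up, and the requested properties can be
   relied upon and queried.  */
gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_RECORDED_EXITS));

/* ... analyze and transform loops here ...  */

loop_optimizer_finalize ();
@end smallexample
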
The CFG manipulation functions in general do not update loop
structures.  Specialized versions that additionally do so are provided
for the most common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop}
function can be used to clean up the CFG while updating the loop
structures if @code{current_loops} is set.

At the moment, loop structures are preserved from the start of GIMPLE
loop optimizations until the end of RTL loop optimizations.  During
this time a loop can be tracked by its @code{struct loop} and number.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most
useful fields of the loop structure (that are kept up-to-date at all
times) are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{depth}: The depth of the loop in the loops tree, i.e., the
number of super-loops of the loop.
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the
loop in depth-first search order in reversed CFG, ordered by dominance
relation, and in breadth-first search order, respectively.
@item @code{single_exit}: Returns the single exit edge of the loop, or
@code{NULL} if the loop has more than one exit.  You can only use this
function if the @code{LOOPS_HAVE_RECORDED_EXITS} property is used.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize
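
For example, the following sketch uses these interfaces to count the
blocks of a loop that do not belong to any of its sub-loops.  Note that
@code{get_loop_body} returns a freshly allocated array of
@code{num_nodes} blocks that the caller must free:

@smallexample
basic_block *body = get_loop_body (loop);
unsigned i, n_own_blocks = 0;

for (i = 0; i < loop->num_nodes; i++)
  if (body[i]->loop_father == loop)
    n_own_blocks++;

free (body);
@end smallexample
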
@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Most low-level CFG functions update loops automatically.  The following
functions handle some more complicated cases of CFG manipulations:

@itemize
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{split_loop_exit_edge}: Splits an exit edge of the loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE@.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment (see the sketch after this list).
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These functions
create a copy of a loop, and a branch before the two versions that
selects one of them depending on the prescribed condition.  This is
useful for optimizations that need to verify some assumptions at run
time (one of the copies of the loop is usually left unchanged, while the
other one is transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize
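
The following sketch shows how @code{create_iv} might be used to insert
into a loop a new unsigned induction variable that starts at zero and
is incremented by one in each iteration.  The exact signature has
varied slightly between releases, so treat this as an illustration
rather than a definitive recipe:

@smallexample
gimple_stmt_iterator incr_pos;
bool insert_after;
tree var_before, var_after;

/* Find the canonical place for the increment of the new iv.  */
standard_iv_increment_position (loop, &incr_pos, &insert_after);

/* Passing NULL_TREE as the variable lets create_iv create a new
   temporary; var_before and var_after are the SSA names valid
   before and after the increment, respectively.  */
create_iv (build_int_cst (unsigned_type_node, 0),
           build_int_cst (unsigned_type_node, 1),
           NULL_TREE, loop, &incr_pos, insert_after,
           &var_before, &var_after);
@end smallexample
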
@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form:  No SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than to use generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.

@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE@.  They enable us to represent variables
with complicated behavior in a simple and consistent way (we only use
it to express values of polynomial induction variables, but it is
possible to extend it).  The interfaces to SCEV analysis are declared
in @file{tree-scalar-evolution.h}.  To use the scalar evolution
analysis, @code{scev_initialize} must be called first.  To stop using
SCEV, call @code{scev_finalize}.  SCEV analysis caches results in order
to save time and memory.  This cache, however, is invalidated by most
of the loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV however
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of
SCEV only for some analysis, and when you work with a whole nest of
loops at once.  It will try replacing all SSA names by their SCEV in
all loops, including the super-loops of the current loop, thus
providing complete information about the behavior of the variable in
the loop nest.  @code{resolve_mixers} is useful if you work with only
one loop at a time, and if you possibly need to create code based on
the value of the induction variable.  It will only resolve the SSA
names defined in the current loop, leaving the SSA names defined
outside unchanged, even if their evolution in the outer loops is known.
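
A typical use of the interface looks like the following sketch, which
analyzes the evolution of an SSA name @code{name} with respect to
@code{loop} (both assumed to be in scope) and fully instantiates the
result:

@smallexample
tree chrec;

scev_initialize ();

chrec = analyze_scalar_evolution (loop, name);
chrec = instantiate_parameters (loop, chrec);
/* chrec now either contains no references to SSA names defined
   inside loops, or is chrec_dont_know if the analysis failed.  */

scev_finalize ();
@end smallexample
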
The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step
may contain further polynomial chrecs).  The type of the expression and
of the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example

@smallexample
while (@dots{})
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then the
overflow in addition would cause undefined behavior, and we may assume
that this does not happen.  Hence, the value with this SCEV cannot
overflow (which restricts the number of iterations of such a loop).

In many cases, one wants to restrict attention just to affine induction
variables.  In that case, the extra expressive power of SCEV is not
useful and may complicate the optimizations, and the @code{simple_iv}
function may be used instead to analyze a value -- the result is a
loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simple and only allows
analysis of affine induction variables, and only in one loop at a time.
The interface is declared in @file{cfgloop.h}.  Before analyzing
induction variables in a loop L, the @code{iv_analysis_loop_init}
function must be called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment
in the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception:  if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} 0, @code{mult} 1
and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.
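
A minimal use of this interface, analyzing a register @code{reg} used
in insn @code{insn} inside @code{loop} (all assumed to be in scope),
might look as follows.  Note that the exact types of the insn arguments
have changed over time, so this is a sketch rather than a definitive
recipe:

@smallexample
struct rtx_iv iv;

iv_analysis_loop_init (loop);

if (iv_analyze (insn, reg, &iv))
  @{
    /* iv describes reg as delta + mult * ... (see above).  The value
       in the first iteration can be obtained with get_iv_value.  */
    rtx initial = get_iv_value (&iv, const0_rtx);
    /* ... use the description ...  */
  @}

iv_analysis_done ();
@end smallexample
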
@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  The
number of iterations of a loop in GCC is defined as the number of
executions of the loop latch.  In many cases, it is not possible to
determine the number of iterations unconditionally -- the determined
number is correct only if some assumptions are satisfied.  The analysis
tries to verify these conditions using the information contained in the
program; if it fails, the conditions are returned together with the
result.  The following information and conditions are provided by the
analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL@.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through
a given exit can be determined using the
@code{number_of_iterations_exit} function.  On RTL, the results are
returned in the @code{struct niter_desc} structure.  The corresponding
function is named @code{check_simple_exit}.  There are also functions
that pass through all the exits of a loop and try to find one with an
easy-to-determine number of iterations -- @code{find_loop_niter} on
GIMPLE and @code{find_simple_exit} on RTL@.  Finally, there are
functions that provide the same information, but additionally cache it,
so that repeated calls to number of iterations are not so costly --
@code{number_of_latch_executions} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_latch_executions} works only for single-exit loops.
The function @code{number_of_cond_exit_executions} can be used to
determine the number of executions of the exit condition of a
single-exit loop (i.e., the @code{number_of_latch_executions} increased
by one).
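
On GIMPLE, a typical query then looks like the following sketch.  It
assumes SCEV and the @code{LOOPS_HAVE_RECORDED_EXITS} property (needed
by @code{single_exit}) have been set up; the last argument of
@code{number_of_iterations_exit} controls warnings, later releases
added further optional parameters, and @code{use_niter_expr} stands for
whatever the pass does with the result:

@smallexample
struct tree_niter_desc niter;
edge exit = single_exit (loop);

if (exit && number_of_iterations_exit (loop, exit, &niter, false))
  @{
    /* niter.niter is the number of latch executions, valid provided
       that niter.assumptions holds and niter.may_be_zero is false.  */
    if (integer_onep (niter.assumptions)
        && integer_zerop (niter.may_be_zero))
      use_niter_expr (niter.niter);
  @}
@end smallexample
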
@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the currently analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} in the dependence relation
array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references syntactically occur in the
program in the same order as in the array of data references.  This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

Three types of data references are currently handled:
@code{ARRAY_REF}, @code{INDIRECT_REF} and @code{COMPONENT_REF}@.  The
data structure for the data reference is @code{data_reference}, where
@code{data_reference_p} is the name of a pointer to the data reference
structure.  The structure contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base
object of the data reference and its access functions.  These access
functions represent the evolution of the data reference in the loop
relative to its base, in keeping with the classical meaning of the data
reference access function for the support of arrays.  For example, for
a reference @code{a.b[i][j]}, the base object is @code{a.b} and the
access functions, one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the
first location accessed by the data reference in the loop and about the
access function used to represent evolution relative to this location.
This data is used to support pointers, and is not used for arrays (for
which we have base objects).  Pointer accesses are represented as a
one-dimensional access that starts from the first location accessed in
the loop.  For example:

@smallexample
for1 i
  for2 j
    *((int *) p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we can't
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based on
base_objects).
Else if both @code{a} and @code{b} are represented as pointers, compare
@code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based on
first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and a dependence could potentially exist
between them, and to @code{NULL_TREE} if there exists a dependence
relation between the data references; in the last case the description
of this dependence relation is given in the @code{subscripts},
@code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
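
Putting this together, a client of the analysis might look like the
following sketch.  The exact signature of
@code{compute_data_dependences_for_loop} (in particular how the vectors
and the loop nest are passed) has varied between releases, so this
follows the general shape described above rather than one specific
version:

@smallexample
auto_vec<loop_p, 3> loop_nest;
auto_vec<data_reference_p> datarefs;
auto_vec<ddr_p> dependence_relations;

if (compute_data_dependences_for_loop (loop, true, &loop_nest,
                                       &datarefs,
                                       &dependence_relations))
  @{
    /* Print what was found, using the dump functions described
       above.  */
    dump_data_references (dump_file, datarefs);
    dump_ddrs (dump_file, dependence_relations);
  @}

free_dependence_relations (dependence_relations);
free_data_refs (datarefs);
@end smallexample
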
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints with
maximum verbosity the details of a data dependence relations array,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.


@node Omega
@section Omega, a solver for linear programming problems
@cindex Omega a solver for linear programming problems

The data dependence analysis contains several solvers triggered
sequentially from the less complex ones to the more sophisticated.
For ensuring the consistency of the results of these solvers, a data
dependence check pass has been implemented based on two different
solvers.  The second method that has been integrated into GCC is based
on the Omega dependence solver, written in the 1990's by William Pugh
and David Wonnacott.  Data dependence tests can be formulated using a
subset of Presburger arithmetic that can be translated to linear
constraint systems.  These linear constraint systems can then be
solved using the Omega solver.

The Omega solver uses Fourier-Motzkin's algorithm for variable
elimination: a linear constraint system containing @code{n} variables
is reduced to a linear constraint system with @code{n-1} variables.
The Omega solver can also be used for solving other problems that can
be expressed in the form of a system of linear equalities and
inequalities.  The Omega solver is known to have an exponential worst
case, also known as the ``omega nightmare'' in the literature, but in
practice, the omega test is known to be efficient for the common data
dependence tests.

The interface used by the Omega solver for describing the linear
programming problems is described in @file{omega.h}, and the solver is
@code{omega_solve_problem}.