@c Copyright (C) 2006-2020 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and functions
that can be used to build, modify and analyze this representation.
Most of the interfaces and data structures are declared in
@file{cfgloop.h}.  Loop structures are analyzed, and this information is
discarded or updated at the discretion of the individual passes.  Still,
most of the generic CFG manipulation routines are aware of loop
structures and try to keep them up-to-date.  By this means an increasing
part of the compilation pipeline is set up to maintain loop structures
across passes, to allow attaching meta information to individual loops
for consumption by later passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branching in the middle of the loop.
The representation of loops in GCC however allows only loops with a
single latch.  During loop analysis, headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information and the structure of the
induction variables in the loops is used to determine whether the
latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.
You may avoid CFG changes by passing the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery;
note however that most other loop manipulation functions will not work
correctly for loops with multiple latch edges (the functions that only
query membership of blocks to loops and subloop relationships, or
enumerate and test loop exits, can be expected to work).

The body of the loop is the set of blocks that are dominated by its
header, and reachable from its latch against the direction of edges in
the CFG@.  The loops are organized in a containment hierarchy (tree)
such that all the loops immediately contained inside loop L are the
children of L in the tree.  This tree is represented by the
@code{struct loops} structure.  The root of this tree is a fake loop
that contains all blocks in the function.  Each of the loops is
represented in a @code{struct loop} structure.  Each loop is assigned an
index (the @code{num} field of the @code{struct loop} structure), and
the pointer to the loop is stored in the corresponding field of the
@code{larray} vector in the loops structure.  The indices do not have to
be contiguous; there may be empty (@code{NULL}) entries in the
@code{larray} created by deleting loops.  Also, there is no guarantee on
the relative order of a loop and its subloops in the numbering.  The
index of a loop never changes.

The entries of the @code{larray} field should not be accessed directly.
The function @code{get_loop} returns the loop description for a loop
with the given index.  The @code{number_of_loops} function returns the
number of loops in the function.  To traverse all loops, use the
@code{FOR_EACH_LOOP} macro.
The @code{flags} argument of the macro is used to determine the
direction of traversal and the set of loops visited.  Each loop is
guaranteed to be visited exactly once, regardless of the changes to the
loop tree, and loops may be removed during the traversal.  Newly created
loops are never traversed; if they need to be visited, this must be done
separately after their creation.

Each basic block contains a reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG@.  The global variable @code{current_loops} contains the
@code{struct loops} structure.  Many of the loop manipulation functions
assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
changes to the CFG will be performed in the loop analysis; in
particular, loops with multiple latch edges will not be disambiguated.
If a loop has multiple latches, its latch block is set to NULL@.  Most
of the loop manipulation functions will not work for loops in this
shape.  No other flags that require CFG changes can be passed to
@code{loop_optimizer_init}.
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulation of the loops significantly easier.
Most of the loop manipulation functions assume that the loops are in
this shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded
and updated for each loop.  This makes some functions (e.g.,
@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
@code{single_exit}) can be used only if the lists of exits are
recorded.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{record_loop_exits}.
The properties can be queried using @code{loops_state_satisfies_p}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.  When loop structures are set
up to be preserved across passes, this function reduces the information
that has to be kept up-to-date to a minimum (only the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} property remains set).

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can
be used to clean up the CFG while updating the loop structures if
@code{current_loops} is set.

At the moment, loop structure is preserved from the start of GIMPLE
loop optimizations until the end of RTL loop optimizations.  During
this time a loop can be tracked by its @code{struct loop} and number.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.
The most useful fields of the loop structure (that are kept up-to-date
at all times) are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{loop_depth}: The depth of the loop in the loops tree, i.e.,
the number of super-loops of the loop.
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the
loop in depth-first search order in reversed CFG, ordered by dominance
relation, and breadth-first search order, respectively.
@item @code{single_exit}: Returns the single exit edge of the loop, or
@code{NULL} if the loop has more than one exit.  You can only use this
function if the @code{LOOPS_HAVE_RECORDED_EXITS} property is used.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize

@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Most low-level CFG functions update loops automatically.  The following
functions handle some more complicated cases of CFG manipulations:

@itemize
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{split_loop_exit_edge}: Splits the exit edge of the loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE@.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}: This function creates a copy of a loop, and
a branch before them that selects one of them depending on the
prescribed condition.
This is useful for optimizations that need to
verify some assumptions at run time (one of the copies of the loop is
usually left unchanged, while the other one is transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form:  No SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than using generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.

@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE@.  They enable us to represent variables
with complicated behavior in a simple and consistent way (we only use it
to express values of polynomial induction variables, but it is possible
to extend it).  The interfaces to SCEV analysis are declared in
@file{tree-scalar-evolution.h}.  To use scalar evolution analysis,
@code{scev_initialize} must be called first.  To stop using SCEV,
@code{scev_finalize} should be called.
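A typical usage pattern of these entry points inside a pass might look
as follows (a sketch only; the surrounding pass boilerplate is omitted,
and @code{loop} and @code{name} stand for a loop and an SSA name chosen
by the pass):

@smallexample
scev_initialize ();
/* @r{@dots{} for each value of interest @dots{}}  */
tree ev = analyze_scalar_evolution (loop, name);
ev = instantiate_parameters (loop, ev);
/* @r{@dots{} use @code{ev}, calling @code{scev_reset} after any
   transformation that invalidates the cache @dots{}}  */
scev_finalize ();
@end smallexample

The analysis and instantiation functions used here are described below.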
The SCEV analysis caches its results in order to save time and memory.
This cache, however, is invalidated by most of the loop transformations,
including removal of code.  If such a transformation is performed,
@code{scev_reset} must be called to clean the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV however
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try replacing all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing complete
information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.
A polynomial chrec
has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example

@smallexample
while (@dots{})
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then the
overflow in addition would cause undefined behavior, and we may assume
that this does not happen.  Hence, the value with this SCEV cannot
overflow (which restricts the number of iterations of such a loop).

In many cases, one wants to restrict the attention just to affine
induction variables, for which the extra expressive power of SCEV is not
useful and may complicate the optimizations.  In this case, the
@code{simple_iv} function may be used to analyze a value -- the result
is a loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simple and only allows
analysis of affine induction variables, and only in one loop at a time.
The interface is declared in @file{cfgloop.h}.  Before analyzing
induction variables in a loop L, the @code{iv_analysis_loop_init}
function must be called on L.
After the analysis (possibly calling @code{iv_analysis_loop_init} for
several loops) is finished, @code{iv_analysis_done} should be called.
The following functions can be used to access the results of the
analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception:  if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.
However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be 0,
@code{mult} must be 1, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  The
number of iterations of a loop in GCC is defined as the number of
executions of the loop latch.  In many cases, it is not possible to
determine the number of iterations unconditionally -- the determined
number is correct only if some assumptions are satisfied.  The analysis
tries to verify these conditions using the information contained in the
program; if it fails, the conditions are returned together with the
result.  The following information and conditions are provided by the
analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL@.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct
niter_desc} structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy-to-determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL@.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated queries for the number of iterations are not so costly --
@code{number_of_latch_executions} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are any assumptions.  The function
@code{number_of_latch_executions} works only for single-exit loops.
The function @code{number_of_cond_exit_executions} can be used to
determine the number of executions of the exit condition of a
single-exit loop (i.e., the @code{number_of_latch_executions} increased
by one).

On GIMPLE, the constraint flags below affect the semantics of some APIs
of the number of iterations analyzer:

@itemize
@item @code{LOOP_C_INFINITE}: If this constraint flag is set, the loop
is known to be infinite.  APIs like @code{number_of_iterations_exit} can
return false directly without doing any analysis.
@item @code{LOOP_C_FINITE}: If this constraint flag is set, the loop is
known to be finite; in other words, the loop's number of iterations can
be computed with @code{assumptions} being true.
@end itemize

Generally, the constraint flags are set/cleared by consumers, which are
loop optimizers.  It is also the consumers' responsibility to set/clear
constraints correctly; failing to do that might result in
hard-to-track-down bugs in scev/niter consumers.  One typical use case
is the vectorizer: it drives the number of iterations analyzer by
setting @code{LOOP_C_FINITE} and vectorizes a possibly infinite loop by
versioning the loop using the analysis result.  In return, constraints
set by consumers can also help the number of iterations analyzer in
subsequent optimizers.  For example, the @code{niter} of a loop
versioned under @code{assumptions} is valid unconditionally.

Other constraints may be added in the future, for example, a constraint
indicating that a loop's latch must execute at least once, so that
@code{may_be_zero} would be false unconditionally.

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the currently analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} in the dependence relation
array.

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references syntactically occur in the
program in the same order as in the array of data references.
This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

Three types of data references are currently handled: ARRAY_REF,
INDIRECT_REF and COMPONENT_REF@.  The data structure for the data
reference is @code{data_reference}, where @code{data_reference_p} is the
name of a pointer to the data reference structure.  The structure
contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base object
of the data reference and its access functions.  These access functions
represent the evolution of the data reference in the loop relative to
its base, in keeping with the classical meaning of the data reference
access function for the support of arrays.  For example, for a reference
@code{a.b[i][j]}, the base object is @code{a.b} and the access functions,
one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the first
location accessed by the data reference in the loop and about the access
function used to represent evolution relative to this location.  This data
is used to support pointers, and is not used for arrays (for which we
have base objects).  Pointer accesses are represented as a one-dimensional
access that starts from the first location accessed in the loop.
For example:

@smallexample
for1 i
  for2 j
    *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we cannot
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based on
base_objects).
Else if both @code{a} and @code{b} are represented as pointers, compare
@code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based on
first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and there could potentially exist a
dependence between these data references, and to @code{NULL_TREE} if
there exists a dependence relation between the data references; in the
latter case, the description of this dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints the
details of a data dependence relations array with maximum verbosity,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.