xref: /netbsd-src/external/gpl3/gcc.old/dist/gcc/doc/loop.texi (revision 8feb0f0b7eaff0608f8350bbfa3098827b4bb91b)
@c Copyright (C) 2006-2020 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and on RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and the
functions that can be used to build, modify and analyze this
representation.  Most of the interfaces and data structures are declared
in @file{cfgloop.h}.  Loop structures are analyzed, and this information
is discarded or updated at the discretion of individual passes.  Still,
most of the generic CFG manipulation routines are aware of loop
structures and try to keep them up-to-date.  This way, an increasing
part of the compilation pipeline is set up to maintain loop structures
across passes, which allows meta information to be attached to
individual loops for consumption by later passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is branching in the middle of the loop.
The representation of loops in GCC however allows only loops with a
single latch.  During loop analysis, headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information and the structure of the
induction variables in the loops is used to determine whether the
latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.  You may avoid CFG changes by passing the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery; note
however that most other loop manipulation functions will not work
correctly for loops with multiple latch edges (the functions that only
query membership of blocks in loops and subloop relationships, or
enumerate and test loop exits, can be expected to work).

The body of the loop is the set of blocks that are dominated by its
header and reachable from its latch against the direction of edges in
the CFG@.  The loops are organized in a containment hierarchy (tree)
such that all the loops immediately contained inside loop L are the
children of L in the tree.  This tree is represented by the
@code{struct loops} structure.  The root of this tree is a fake loop
that contains all blocks in the function.  Each of the loops is
represented in a @code{struct loop} structure.  Each loop is assigned an
index (the @code{num} field of the @code{struct loop} structure), and
the pointer to the loop is stored in the corresponding field of the
@code{larray} vector in the loops structure.  The indices do not have to
be contiguous; there may be empty (@code{NULL}) entries in the
@code{larray} created by deleting loops.  Also, there is no guarantee on
the relative order of a loop and its subloops in the numbering.  The
index of a loop never changes.

The entries of the @code{larray} field should not be accessed directly.
The function @code{get_loop} returns the loop description for a loop
with the given index.  The @code{number_of_loops} function returns the
number of loops in the function.  To traverse all loops, use the
@code{FOR_EACH_LOOP} macro.  The @code{flags} argument of the macro is
used to determine the direction of traversal and the set of loops
visited.  Each loop is guaranteed to be visited exactly once, regardless
of the changes to the loop tree, and the loops may be removed during the
traversal.  Newly created loops are never traversed; if they need to be
visited, this must be done separately after their creation.

Each basic block contains the reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG@.  The global variable @code{current_loops} contains the
@code{struct loops} structure.  Many of the loop manipulation functions
assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
changes to the CFG will be performed in the loop analysis; in
particular, loops with multiple latch edges will not be disambiguated.
If a loop has multiple latches, its latch block is set to NULL@.  Most
of the loop manipulation functions will not work for loops in this
shape.  No other flags that require CFG changes can be passed to
@code{loop_optimizer_init}.
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulating the loops significantly easier.  Most
of the loop manipulation functions assume that the loops are in this
shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with the
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded
and updated for each loop.  This makes some functions (e.g.,
@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
@code{single_exit}) can be used only if the lists of exits are
recorded.
@end itemize

These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{record_loop_exits}.
The properties can be queried using @code{loops_state_satisfies_p}.

The memory occupied by the loops structures should be freed with the
@code{loop_optimizer_finalize} function.  When loop structures are set
up to be preserved across passes, this function reduces the information
to be kept up-to-date to a minimum (only
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} remains set).

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can
be used to clean up the CFG while updating the loop structures, provided
that @code{current_loops} is set.

At the moment, loop structure is preserved from the start of GIMPLE
loop optimizations until the end of RTL loop optimizations.  During
this time a loop can be tracked by its @code{struct loop} and number.
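
The typical shape of a pass using this infrastructure can be sketched as
follows.  This fragment is merely illustrative and only makes sense
inside GCC itself; @code{LI_FROM_INNERMOST} stands for one of the
traversal flags accepted by @code{FOR_EACH_LOOP}.

@smallexample
loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);

struct loop *loop;
FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
  @{
    /* Inspect or transform LOOP here.  */
  @}

loop_optimizer_finalize ();
@end smallexample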

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query the information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most useful
fields of the loop structure (that are kept up-to-date at all times)
are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{loop_depth}: The depth of the loop in the loops tree, i.e.,
the number of super-loops of the loop.
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the
loop in depth-first search order in the reversed CFG, ordered by the
dominance relation, and in breadth-first search order, respectively.
@item @code{single_exit}: Returns the single exit edge of the loop, or
@code{NULL} if the loop has more than one exit.  You can only use this
function if the @code{LOOPS_HAVE_RECORDED_EXITS} property is set.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize
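
The tree-walking queries above are easy to get wrong.  The following
self-contained C sketch models just the parent links of the loop tree
and reimplements @code{loop_depth}, @code{flow_loop_nested_p} and
@code{find_common_loop} in miniature; @code{struct toy_loop} and the
@code{toy_} functions are invented for this example and are not GCC's
actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the loop tree: each loop records only its immediate
   super-loop (NULL for the root loop that covers the whole function).  */
struct toy_loop
{
  struct toy_loop *outer;
};

/* Number of super-loops, like loop_depth (the root has depth 0).  */
static int
toy_loop_depth (struct toy_loop *l)
{
  int d = 0;
  for (struct toy_loop *s = l->outer; s; s = s->outer)
    d++;
  return d;
}

/* Like flow_loop_nested_p: is INNER strictly contained in OUTER?  */
static int
toy_loop_nested_p (struct toy_loop *outer, struct toy_loop *inner)
{
  for (struct toy_loop *s = inner->outer; s; s = s->outer)
    if (s == outer)
      return 1;
  return 0;
}

/* Like find_common_loop: the deepest loop containing both A and B
   (a loop counts as containing itself here).  */
static struct toy_loop *
toy_find_common_loop (struct toy_loop *a, struct toy_loop *b)
{
  int da = toy_loop_depth (a), db = toy_loop_depth (b);
  while (da > db) { a = a->outer; da--; }   /* bring A up to B's depth */
  while (db > da) { b = b->outer; db--; }   /* bring B up to A's depth */
  while (a != b)  { a = a->outer; b = b->outer; }
  return a;
}
```

Walking to a common depth first, as above, is the standard trick for
lowest-common-ancestor queries on parent-linked trees.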

@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Most low-level CFG functions update loops automatically.  The following
functions handle some more complicated cases of CFG manipulations:

@itemize
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{split_loop_exit_edge}: Splits the exit edge of a loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE@.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}: This function creates a copy of a loop, and
a branch before them that selects one of them depending on the
prescribed condition.  This is useful for optimizations that need to
verify some assumptions at runtime (one of the copies of the loop is
usually left unchanged, while the other one is transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form:  No SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than using generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite the SSA form to
LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
LCSSA is preserved.
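
As an illustration, consider a hypothetical loop that accumulates into
@code{sum_1} and uses the result after the loop (the names and block
labels below are made up for this example).  Rewriting into LCSSA
inserts the exit PHI node @code{sum_3}, so that the only use of a name
defined in the loop outside of it is as an argument of that PHI node:

@smallexample
# Before LCSSA:                # After LCSSA:
loop:                          loop:
  sum_1 = PHI <0, sum_2>         sum_1 = PHI <0, sum_2>
  sum_2 = sum_1 + a_1            sum_2 = sum_1 + a_1
  if (cond) goto loop            if (cond) goto loop
exit:                          exit:
                                 sum_3 = PHI <sum_1>
  use (sum_1)                    use (sum_3)
@end smallexample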

@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE@.  They enable us to represent variables
with complicated behavior in a simple and consistent way (we only use
them to express values of polynomial induction variables, but it is
possible to extend them).  The interfaces to SCEV analysis are declared
in @file{tree-scalar-evolution.h}.  To use the scalar evolutions
analysis, @code{scev_initialize} must be called first; to stop using
SCEV, call @code{scev_finalize}.  SCEV analysis caches results in order
to save time and memory.  This cache however is invalidated by most of
the loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV however
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try replacing all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing complete
information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.

The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example:

@smallexample
while (@dots{})
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has signed type, then the
overflow in addition would cause undefined behavior, and we may assume
that this does not happen.  Hence, the value with this SCEV cannot
overflow (which restricts the number of iterations of such a loop).
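
In other words, @code{POLYNOMIAL_CHREC(base, step, loop)} has the
closed-form value @code{base + i * step} in the i-th iteration of the
loop.  The following self-contained C sketch (function names invented
for this example) checks that the closed form agrees with the phi-based
recurrence shown above:

```c
#include <assert.h>

/* Closed form of POLYNOMIAL_CHREC (base, step, loop): the value of the
   variable in the i-th iteration, counting from i = 0.  */
static int
chrec_value (int base, int step, int i)
{
  return base + i * step;
}

/* Simulate  x_1 = phi (base, x_2);  x_2 = x_1 + step;  for N iterations
   and return the value of x_1 observed in the last iteration.  */
static int
simulate_iv (int base, int step, int n)
{
  int x_1 = base;               /* value on entry (first iteration) */
  for (int i = 1; i < n; i++)
    x_1 = x_1 + step;           /* x_2 flows back into x_1 */
  return x_1;
}
```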

In many cases, one wants to restrict the attention just to affine
induction variables.  Then the extra expressive power of SCEV is not
useful, and may complicate the optimizations.  In this case, the
@code{simple_iv} function may be used to analyze a value -- the result
is a loop-invariant base and step.

@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simpler; it only allows
analysis of affine induction variables, and only in one loop at a time.
The interface is declared in @file{cfgloop.h}.  Before analyzing
induction variables in a loop L, the @code{iv_analysis_loop_init}
function must be called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception:  if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be 0,
@code{mult} must be 1, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.
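
The formula above can be evaluated directly in plain C for a concrete
choice of modes.  This self-contained sketch (names invented for the
example; it is not the real @code{get_iv_value}) takes @code{mode} to be
QImode (8 bits) inside a 64-bit @code{extend_mode} with sign extension,
which shows why the narrowing subreg matters: the wide value wraps at
the 8-bit boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluate  delta + mult * sign_extend (subreg_{QImode} (base + i * step))
   for a sign-extending rtx_iv, with QImode = 8 bits and a 64-bit
   extend_mode.  */
static int64_t
iv_value_qi_sext (int64_t delta, int64_t mult,
                  int64_t base, int64_t step, int64_t i)
{
  uint8_t narrow = (uint8_t) (base + i * step); /* subreg_{QImode} */
  int64_t wide = (int8_t) narrow;               /* sign extension  */
  return delta + mult * wide;
}
```

With @code{base = 100}, @code{step = 10}, the third iteration computes
130, which an 8-bit signed subreg sees as @minus{}126.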

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  The
number of iterations of a loop in GCC is defined as the number of
executions of the loop latch.  In many cases, it is not possible to
determine the number of iterations unconditionally -- the determined
number is correct only if some assumptions are satisfied.  The analysis
tries to verify these conditions using the information contained in the
program; if it fails, the conditions are returned together with the
result.  The following information and conditions are provided by the
analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL@.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct
niter_desc} structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one whose number of iterations
is easy to determine -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL@.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated queries for the number of iterations are not so costly --
@code{number_of_latch_executions} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_latch_executions} works only for single-exit loops.
The function @code{number_of_cond_exit_executions} can be used to
determine the number of executions of the exit condition of a
single-exit loop (i.e., the @code{number_of_latch_executions} increased
by one).
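
The latch-based definition is worth internalizing: for a loop whose body
runs @code{n} times, the latch (back edge) is taken only @code{n - 1}
times, while the exit condition is tested @code{n} times.  This
self-contained C sketch (names invented for the example) counts both
events for a simple @code{do}-@code{while} loop and checks the
``plus one'' relationship described above:

```c
#include <assert.h>

/* Run the loop  do { i++; } while (i < n);  starting from i = 0, and
   count how often the back edge (latch) is taken and how often the
   exit condition is evaluated.  */
static void
count_loop_events (int n, int *latch_execs, int *cond_execs)
{
  *latch_execs = 0;
  *cond_execs = 0;
  int i = 0;
  do
    {
      i++;                  /* loop body */
      (*cond_execs)++;      /* exit condition is tested */
      if (!(i < n))
        break;              /* exit edge taken */
      (*latch_execs)++;     /* back edge (latch) taken */
    }
  while (1);
}
```

For @code{n = 5} the body runs five times, the latch four times, and the
condition five times, matching
@code{number_of_cond_exit_executions == number_of_latch_executions + 1}.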

On GIMPLE, the following constraint flags affect the semantics of some
interfaces of the number of iterations analyzer:

@itemize
@item @code{LOOP_C_INFINITE}: If this constraint flag is set, the loop
is known to be infinite.  APIs like @code{number_of_iterations_exit} can
return false directly without doing any analysis.
@item @code{LOOP_C_FINITE}: If this constraint flag is set, the loop is
known to be finite; in other words, the loop's number of iterations can
be computed with @code{assumptions} being true.
@end itemize

4871debfc3dSmrgGenerally, the constraint flags are set/cleared by consumers which are
4881debfc3dSmrgloop optimizers.  It's also the consumers' responsibility to set/clear
4891debfc3dSmrgconstraints correctly.  Failing to do that might result in hard to track
4901debfc3dSmrgdown bugs in scev/niter consumers.  One typical use case is vectorizer:
4911debfc3dSmrgit drives number of iterations analyzer by setting @code{LOOP_C_FINITE}
4921debfc3dSmrgand vectorizes possibly infinite loop by versioning loop with analysis
4931debfc3dSmrgresult.  In return, constraints set by consumers can also help number of
4941debfc3dSmrgiterations analyzer in following optimizers.  For example, @code{niter}
4951debfc3dSmrgof a loop versioned under @code{assumptions} is valid unconditionally.
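
As a purely source-level sketch of the versioning idea (not the
vectorizer's actual code), the specialized copy below is only entered
when the assumption holds, so inside that copy its iteration count is
valid unconditionally:

@smallexample
#include <assert.h>

/* The 4-way unrolled copy is guarded by the assumption (n is a
   positive multiple of 4) under which the iteration count n / 4 is
   known to be valid; otherwise the original loop runs.  */
static int
sum (const int *a, int n)
{
  int s = 0;
  if (n > 0 && n % 4 == 0)           /* assumptions hold */
    for (int i = 0; i < n; i += 4)   /* executes exactly n / 4 times */
      s += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
  else                               /* fallback: original loop */
    for (int i = 0; i < n; i++)
      s += a[i];
  return s;
}

int
main (void)
{
  int a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  assert (sum (a, 8) == 36);  /* versioned path */
  assert (sum (a, 7) == 28);  /* fallback path */
  return 0;
}
@end smallexample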

Other constraints may be added in the future; for example, a constraint
indicating that a loop's latch must roll, so that @code{may_be_zero}
would be false unconditionally.

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references in a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
the first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus, if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the analyzed loop, the analysis
stops while scanning the loop body for data references, and inserts a
single @code{chrec_dont_know} in the dependence relation array.
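
For instance, a loop of the following shape would stop the scan: the
call may read or write arbitrary memory, so a single
@code{chrec_dont_know} is recorded instead of the pairwise relations.
The function name here is only illustrative:

@smallexample
#include <assert.h>

static int g;

/* An opaque call: from the analyzer's point of view it may touch any
   memory reachable from the loop.  */
static void
mystery (void)
{
  g++;
}

int
main (void)
{
  int a[4];
  for (int i = 0; i < 4; i++)
    {
      a[i] = i;        /* data reference found by the scan */
      mystery ();      /* potential interference: the scan would stop
                          here and record chrec_dont_know */
    }
  assert (g == 4 && a[3] == 3);
  return 0;
}
@end smallexample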

The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references thus occur in the array in
the same order as they syntactically occur in the program.  This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

Three types of data references are currently handled: @code{ARRAY_REF},
@code{INDIRECT_REF} and @code{COMPONENT_REF}@.  The data structure for a
data reference is @code{data_reference}, where @code{data_reference_p}
is the name of a pointer to the data reference structure.  The structure
contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base object
of the data reference and its access functions.  These access functions
represent the evolution of the data reference in the loop relative to
its base, in keeping with the classical meaning of the data reference
access function for the support of arrays.  For example, for a reference
@code{a.b[i][j]}, the base object is @code{a.b} and the access functions,
one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the first
location accessed by the data reference in the loop and about the access
function used to represent the evolution relative to this location.  This
data is used to support pointers, and is not used for arrays (for which
we have base objects).  Pointer accesses are represented as a
one-dimensional access that starts from the first location accessed in
the loop.  For example:

@smallexample
      for1 i
         for2 j
          *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we cannot
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
if both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based on
base_objects).
Otherwise, if both @code{a} and @code{b} are represented as pointers,
compare @code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based on
first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize
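
The comparison strategy above can be sketched as follows; the types and
helper here are a toy model for illustration only, not GCC's internal
API:

@smallexample
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum dr_repr { DR_ARRAY, DR_POINTER };
struct toy_dr
{
  enum dr_repr repr;     /* which representation is filled in */
  const char *base;      /* base_object or first_location */
  bool is_decl;          /* base is a distinct declared object */
};

/* Returns true when the pair needs dependence testing, false when the
   references are known not to overlap.  */
static bool
needs_dependence_tests (const struct toy_dr *a, const struct toy_dr *b)
{
  if (a->repr == b->repr)
    /* Same representation: run the subscript tests only when the
       bases (base_object, or first_location) are equal.  */
    return strcmp (a->base, b->base) == 0;
  /* Different representations: assume a dependence unless the bases
     are provably distinct objects.  */
  return !(a->is_decl && b->is_decl && strcmp (a->base, b->base) != 0);
}

int
main (void)
{
  struct toy_dr a1 = { DR_ARRAY, "a", true };
  struct toy_dr a2 = { DR_ARRAY, "a", true };
  struct toy_dr p = { DR_POINTER, "p", false };

  assert (needs_dependence_tests (&a1, &a2));  /* equal bases */
  assert (needs_dependence_tests (&a1, &p));   /* cannot prove disjoint */
  return 0;
}
@end smallexample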

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, to @code{chrec_dont_know} if the analysis was not able
to determine any useful result and potentially there could exist a
dependence between these data references, and to @code{NULL_TREE} if
there exists a dependence relation between the data references; in the
last case, the description of this dependence relation is given in the
@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences in the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize
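
As a concrete instance of a distance vector: for the loop below, the
write of @code{A[i + 1]} in iteration @code{i} and the read of the same
element in iteration @code{i + 1} give the distance vector @code{(1)}
and the direction vector @code{(<)}.  The instrumentation arrays are
only for this example:

@smallexample
#include <assert.h>

int
main (void)
{
  int A[8] = { 0 };
  int write_iter[8], read_iter[8];

  for (int k = 0; k < 8; k++)
    write_iter[k] = read_iter[k] = -1;

  for (int i = 0; i < 7; i++)
    {
      read_iter[i] = i;       /* A[i] is read in iteration i */
      A[i + 1] = A[i] + 1;
      write_iter[i + 1] = i;  /* A[i + 1] is written in iteration i */
    }

  /* Each element written in one iteration is read one iteration
     later: the dependence distance is 1 for every element.  */
  for (int k = 1; k < 7; k++)
    assert (read_iter[k] - write_iter[k] == 1);
  return 0;
}
@end smallexample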

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints the
details of a data dependence relations array with maximum verbosity,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.