llvm-gcc-4.2/gcc/doc/loop.texi - llvm-archive - Git at Google

 @c Copyright (c) 2006 Free Software Foundation, Inc.
 @c Free Software Foundation, Inc.
 @c This is part of the GCC manual.
 @c For copying conditions, see the file gcc.texi.

 @c ---------------------------------------------------------------------
 @c Loop Representation
 @c ---------------------------------------------------------------------

 @node Loop Analysis and Representation
 @chapter Analysis and Representation of Loops

 GCC provides extensive infrastructure for work with natural loops, i.e.,
 strongly connected components of CFG with only one entry block.  This
 chapter describes representation of loops in GCC, both on GIMPLE and in
 RTL, as well as the interfaces to loop-related analyses (induction
 variable analysis and number of iterations analysis).

 @menu
 * Loop representation::		Representation and analysis of loops.
 * Loop querying::		Getting information about loops.
 * Loop manipulation::		Loop manipulation functions.
 * LCSSA::			Loop-closed SSA form.
 * Scalar evolutions::   	Induction variables on GIMPLE.
 * loop-iv::			Induction variables on RTL.
 * Number of iterations::	Number of iterations analysis.
 * Dependency analysis::		Data dependency analysis.
 * Lambda::			Linear loop transformations framework.
 @end menu

 @node Loop representation
 @section Loop representation
 @cindex Loop representation
 @cindex Loop analysis

 This chapter describes the representation of loops in GCC, and functions
 that can be used to build, modify and analyze this representation.  Most
 of the interfaces and data structures are declared in @file{cfgloop.h}.
 At the moment, loop structures are analyzed and this information is
 updated only by the optimization passes that deal with loops, but some
 efforts are being made to make it available throughout most of the
 optimization passes.

 In general, a natural loop has one entry block (header) and possibly
 several back edges (latches) leading to the header from the inside of
 the loop.  Loops with several latches may appear if several loops share
 a single header, or if there is a branching in the middle of the loop.
 The representation of loops in GCC however allows only loops with a
 single latch.  During loop analysis, headers of such loops are split and
 forwarder blocks are created in order to disambiguate their structures.
 A heuristic based on profile information is used to determine whether
 the latches correspond to sub-loops or to control flow in a single loop.
 This means that the analysis sometimes changes the CFG, and if you run
 it in the middle of an optimization pass, you must be able to deal with
 the new blocks.

 Body of the loop is the set of blocks that are dominated by its header,
 and reachable from its latch against the direction of edges in CFG.  The
 loops are organized in a containment hierarchy (tree) such that all the
 loops immediately contained inside loop L are the children of L in the
 tree.  This tree is represented by the @code{struct loops} structure.
 The root of this tree is a fake loop that contains all blocks in the
 function.  Each of the loops is represented in a @code{struct loop}
 structure.  Each loop is assigned an index (@code{num} field of the
 @code{struct loop} structure), and the pointer to the loop is stored in
 the corresponding field of the @code{parray} field of the loops
 structure.  Index of a sub-loop is always greater than the index of its
 super-loop.  The indices do not have to be continuous, there may be
 empty (@code{NULL}) entries in the @code{parray} created by deleting
 loops.  The index of a loop never changes.  The first unused index is
 stored in the @code{num} field of the loops structure.

 Each basic block contains the reference to the innermost loop it belongs
 to (@code{loop_father}).  For this reason, it is only possible to have
 one @code{struct loops} structure initialized at the same time for each
 CFG.  It is recommended to use the global variable @code{current_loops}
 to contain the @code{struct loops} structure, especially if the loop
 structures are updated throughout several passes.  Many of the loop
 manipulation functions assume that dominance information is up-to-date.

 The loops are analyzed through @code{loop_optimizer_init} function.  The
 argument of this function is a set of flags represented in an integer
 bitmask.  These flags specify what other properties of the loop
 structures should be calculated/enforced and preserved later:

 @itemize
 @item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
 a way that each loop has only one entry edge, and additionally, the
 source block of this entry edge has only one successor.  This creates a
 natural place where the code can be moved out of the loop, and ensures
 that the entry edge of the loop leads from its immediate super-loop.
 @item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
 force the latch block of each loop to have only one successor.  This
 ensures that the latch of the loop does not belong to any of its
 sub-loops, and makes manipulation with the loops significantly easier.
 Most of the loop manipulation functions assume that the loops are in
 this shape.  Note that with this flag, the ``normal'' loop without any
 control flow inside and with one exit consists of two basic blocks.
 @item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
 edges in the strongly connected components that are not natural loops
 (have more than one entry block) are marked with
 @code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
 flag is not set for blocks and edges that belong to natural loops that
 are in such an irreducible region (but it is set for the entry and exit
 edges of such a loop, if they lead to/from this region).
 @item @code{LOOPS_HAVE_MARKED_SINGLE_EXITS}: If a loop has exactly one
 exit edge, this edge is stored in @code{single_exit} field of the loop
 structure.  @code{NULL} is stored there otherwise.
 @end itemize

 These properties may also be computed/enforced later, using functions
 @code{create_preheaders}, @code{force_single_succ_latches},
 @code{mark_irreducible_loops} and @code{mark_single_exit_loops}.

 The memory occupied by the loops structures should be freed with
 @code{loop_optimizer_finalize} function.

 The CFG manipulation functions in general do not update loop structures.
 Specialized versions that additionally do so are provided for the most
 common tasks.  On GIMPLE, @code{cleanup_tree_cfg_loop} function can be
 used to cleanup CFG while updating the loops structures if
 @code{current_loops} is set.

 @node Loop querying
 @section Loop querying
 @cindex Loop querying

 The functions to query the information about loops are declared in
 @file{cfgloop.h}.  Some of the information can be taken directly from
 the structures.  @code{loop_father} field of each basic block contains
 the innermost loop to that the block belongs.  The most useful fields of
 loop structure (that are kept up-to-date at all times) are:

 @itemize
 @item @code{header}, @code{latch}: Header and latch basic blocks of the
 loop.
 @item @code{num_nodes}: Number of basic blocks in the loop (including
 the basic blocks of the sub-loops).
 @item @code{depth}: The depth of the loop in the loops tree, i.e., the
 number of super-loops of the loop.
 @item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
 sub-loop, and the sibling of the loop in the loops tree.
 @item @code{single_exit}: The exit edge of the loop, if the loop has
 exactly one exit and the loops were analyzed with
 LOOPS_HAVE_MARKED_SINGLE_EXITS.
 @end itemize

 There are other fields in the loop structures, many of them used only by
 some of the passes, or not updated during CFG changes; in general, they
 should not be accessed directly.

 The most important functions to query loop structures are:

 @itemize
 @item @code{flow_loops_dump}: Dumps the information about loops to a
 file.
 @item @code{verify_loop_structure}: Checks consistency of the loop
 structures.
 @item @code{loop_latch_edge}: Returns the latch edge of a loop.
 @item @code{loop_preheader_edge}: If loops have preheaders, returns
 the preheader edge of a loop.
 @item @code{flow_loop_nested_p}: Tests whether loop is a sub-loop of
 another loop.
 @item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
 to a loop (including its sub-loops).
 @item @code{find_common_loop}: Finds the common super-loop of two loops.
 @item @code{superloop_at_depth}: Returns the super-loop of a loop with
 the given depth.
 @item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
 number of insns in the loop, on GIMPLE and on RTL.
 @item @code{loop_exit_edge_p}: Tests whether edge is an exit from a
 loop.
 @item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
 with @code{EDGE_LOOP_EXIT} flag.
 @item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
 @code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the
 loop in depth-first search order in reversed CFG, ordered by dominance
 relation, and breath-first search order, respectively.
 @item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
 @item @code{just_once_each_iteration_p}: Returns true if the basic block
 is executed exactly once during each iteration of a loop (that is, it
 does not belong to a sub-loop, and it dominates the latch of the loop).
 @end itemize

 @node Loop manipulation
 @section Loop manipulation
 @cindex Loop manipulation

 The loops tree can be manipulated using the following functions:

 @itemize
 @item @code{flow_loop_tree_node_add}: Adds a node to the tree.
 @item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
 @item @code{add_bb_to_loop}: Adds a basic block to a loop.
 @item @code{remove_bb_from_loops}: Removes a basic block from loops.
 @end itemize

 The specialized versions of several low-level CFG functions that also
 update loop structures are provided:

 @itemize
 @item @code{loop_split_edge_with}: Splits an edge, and places a
 specified RTL code on it.  On GIMPLE, the function can still be used,
 but the code must be NULL.
 @item @code{bsi_insert_on_edge_immediate_loop}: Inserts code on edge,
 splitting it if necessary.  Only works on GIMPLE.
 @item @code{remove_path}: Removes an edge and all blocks it dominates.
 @item @code{loop_commit_inserts}: Commits insertions scheduled on edges,
 and sets loops for the new blocks.  This function can only be used on
 GIMPLE.
 @item @code{split_loop_exit_edge}: Splits exit edge of the loop,
 ensuring that PHI node arguments remain in the loop (this ensures that
 loop-closed SSA form is preserved).  Only useful on GIMPLE.
 @end itemize

 Finally, there are some higher-level loop transformations implemented.
 While some of them are written so that they should work on non-innermost
 loops, they are mostly untested in that case, and at the moment, they
 are only reliable for the innermost loops:

 @itemize
 @item @code{create_iv}: Creates a new induction variable.  Only works on
 GIMPLE.  @code{standard_iv_increment_position} can be used to find a
 suitable place for the iv increment.
 @item @code{duplicate_loop_to_header_edge},
 @code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
 on GIMPLE) duplicate the body of the loop prescribed number of times on
 one of the edges entering loop header, thus performing either loop
 unrolling or loop peeling.  @code{can_duplicate_loop_p}
 (@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
 loop.
 @item @code{loop_version}, @code{tree_ssa_loop_version}: These function
 create a copy of a loop, and a branch before them that selects one of
 them depending on the prescribed condition.  This is useful for
 optimizations that need to verify some assumptions in runtime (one of
 the copies of the loop is usually left unchanged, while the other one is
 transformed in some way).
 @item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
 extra iterations to make the number of iterations divisible by unroll
 factor, updating the exit condition, and removing the exits that now
 cannot be taken.  Works only on GIMPLE.
 @end itemize

 @node LCSSA
 @section Loop-closed SSA form
 @cindex LCSSA
 @cindex Loop-closed SSA form

 Throughout the loop optimizations on tree level, one extra condition is
 enforced on the SSA form:  No SSA name is used outside of the loop in
 that it is defined.  The SSA form satisfying this condition is called
 ``loop-closed SSA form'' -- LCSSA.  To enforce LCSSA, PHI nodes must be
 created at the exits of the loops for the SSA names that are used
 outside of them.  Only the real operands (not virtual SSA names) are
 held in LCSSA, in order to save memory.

 There are various benefits of LCSSA:

 @itemize
 @item Many optimizations (value range analysis, final value
 replacement) are interested in the values that are defined in the loop
 and used outside of it, i.e., exactly those for that we create new PHI
 nodes.
 @item In induction variable analysis, it is not necessary to specify the
 loop in that the analysis should be performed -- the scalar evolution
 analysis always returns the results with respect to the loop in that the
 SSA name is defined.
 @item It makes updating of SSA form during loop transformations simpler.
 Without LCSSA, operations like loop unrolling may force creation of PHI
 nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
 updated locally.  However, since we only keep real operands in LCSSA, we
 cannot use this advantage (we could have local updating of real
 operands, but it is not much more efficient than to use generic SSA form
 updating for it as well; the amount of changes to SSA is the same).
 @end itemize

 However, it also means LCSSA must be updated.  This is usually
 straightforward, unless you create a new value in loop and use it
 outside, or unless you manipulate loop exit edges (functions are
 provided to make these manipulations simple).
 @code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
 LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant of
 LCSSA is preserved.

 @node Scalar evolutions
 @section Scalar evolutions
 @cindex Scalar evolutions
 @cindex IV analysis on GIMPLE

 Scalar evolutions (SCEV) are used to represent results of induction
 variable analysis on GIMPLE.  They enable us to represent variables with
 complicated behavior in a simple and consistent way (we only use it to
 express values of polynomial induction variables, but it is possible to
 extend it).  The interfaces to SCEV analysis are declared in
 @file{tree-scalar-evolution.h}.  To use scalar evolutions analysis,
 @code{scev_initialize} must be used.  To stop using SCEV,
 @code{scev_finalize} should be used.  SCEV analysis caches results in
 order to save time and memory.  This cache however is made invalid by
 most of the loop transformations, including removal of code.  If such a
 transformation is performed, @code{scev_reset} must be called to clean
 the caches.

 Given an SSA name, its behavior in loops can be analyzed using the
 @code{analyze_scalar_evolution} function.  The returned SCEV however
 does not have to be fully analyzed and it may contain references to
 other SSA names defined in the loop.  To resolve these (potentially
 recursive) references, @code{instantiate_parameters} or
 @code{resolve_mixers} functions must be used.
 @code{instantiate_parameters} is useful when you use the results of SCEV
 only for some analysis, and when you work with whole nest of loops at
 once.  It will try replacing all SSA names by their SCEV in all loops,
 including the super-loops of the current loop, thus providing a complete
 information about the behavior of the variable in the loop nest.
 @code{resolve_mixers} is useful if you work with only one loop at a
 time, and if you possibly need to create code based on the value of the
 induction variable.  It will only resolve the SSA names defined in the
 current loop, leaving the SSA names defined outside unchanged, even if
 their evolution in the outer loops is known.

 The SCEV is a normal tree expression, except for the fact that it may
 contain several special tree nodes.  One of them is
 @code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
 expressed.  The other one is @code{POLYNOMIAL_CHREC}.  Polynomial chrec
 has three arguments -- base, step and loop (both base and step may
 contain further polynomial chrecs).  Type of the expression and of base
 and step must be the same.  A variable has evolution
 @code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
 loop) equivalent to @code{x_1} in the following example

 @smallexample
 while (...)
   @{
     x_1 = phi (base, x_2);
     x_2 = x_1 + step;
   @}
 @end smallexample

 Note that this includes the language restrictions on the operations.
 For example, if we compile C code and @code{x} has signed type, then the
 overflow in addition would cause undefined behavior, and we may assume
 that this does not happen.  Hence, the value with this SCEV cannot
 overflow (which restricts the number of iterations of such a loop).

 In many cases, one wants to restrict the attention just to affine
 induction variables.  In this case, the extra expressive power of SCEV
 is not useful, and may complicate the optimizations.  In this case,
 @code{simple_iv} function may be used to analyze a value -- the result
 is a loop-invariant base and step.

 @node loop-iv
 @section IV analysis on RTL
 @cindex IV analysis on RTL

 The induction variable on RTL is simple and only allows analysis of
 affine induction variables, and only in one loop at once.  The interface
 is declared in @file{cfgloop.h}.  Before analyzing induction variables
 in a loop L, @code{iv_analysis_loop_init} function must be called on L.
 After the analysis (possibly calling @code{iv_analysis_loop_init} for
 several loops) is finished, @code{iv_analysis_done} should be called.
 The following functions can be used to access the results of the
 analysis:

 @itemize
 @item @code{iv_analyze}: Analyzes a single register used in the given
 insn.  If no use of the register in this insn is found, the following
 insns are scanned, so that this function can be called on the insn
 returned by get_condition.
 @item @code{iv_analyze_result}: Analyzes result of the assignment in the
 given insn.
 @item @code{iv_analyze_expr}: Analyzes a more complicated expression.
 All its operands are analyzed by @code{iv_analyze}, and hence they must
 be used in the specified insn or one of the following insns.
 @end itemize

 The description of the induction variable is provided in @code{struct
 rtx_iv}.  In order to handle subregs, the representation is a bit
 complicated; if the value of the @code{extend} field is not
 @code{UNKNOWN}, the value of the induction variable in the i-th
 iteration is

 @smallexample
 delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
 @end smallexample

 with the following exception:  if @code{first_special} is true, then the
 value in the first iteration (when @code{i} is zero) is @code{delta +
 mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
 then @code{first_special} must be false, @code{delta} 0, @code{mult} 1
 and the value in the i-th iteration is

 @smallexample
 subreg_@{mode@} (base + i * step)
 @end smallexample

 The function @code{get_iv_value} can be used to perform these
 calculations.

 @node Number of iterations
 @section Number of iterations analysis
 @cindex Number of iterations analysis

 Both on GIMPLE and on RTL, there are functions available to determine
 the number of iterations of a loop, with a similar interface.  In many
 cases, it is not possible to determine number of iterations
 unconditionally -- the determined number is correct only if some
 assumptions are satisfied.  The analysis tries to verify these
 conditions using the information contained in the program; if it fails,
 the conditions are returned together with the result.  The following
 information and conditions are provided by the analysis:

 @itemize
 @item @code{assumptions}: If this condition is false, the rest of
 the information is invalid.
 @item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
 this condition is true, the loop exits in the first iteration.
 @item @code{infinite}: If this condition is true, the loop is infinite.
 This condition is only available on RTL.  On GIMPLE, conditions for
 finiteness of the loop are included in @code{assumptions}.
 @item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
 that gives number of iterations.  The number of iterations is defined as
 the number of executions of the loop latch.
 @end itemize

 Both on GIMPLE and on RTL, it necessary for the induction variable
 analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
 On GIMPLE, the results are stored to @code{struct tree_niter_desc}
 structure.  Number of iterations before the loop is exited through a
 given exit can be determined using @code{number_of_iterations_exit}
 function.  On RTL, the results are returned in @code{struct niter_desc}
 structure.  The corresponding function is named
 @code{check_simple_exit}.  There are also functions that pass through
 all the exits of a loop and try to find one with easy to determine
 number of iterations -- @code{find_loop_niter} on GIMPLE and
 @code{find_simple_exit} on RTL.  Finally, there are functions that
 provide the same information, but additionally cache it, so that
 repeated calls to number of iterations are not so costly --
 @code{number_of_iterations_in_loop} on GIMPLE and
 @code{get_simple_loop_desc} on RTL.

 Note that some of these functions may behave slightly differently than
 others -- some of them return only the expression for the number of
 iterations, and fail if there are some assumptions.  The function
 @code{number_of_iterations_in_loop} works only for single-exit loops,
 and it returns the value for number of iterations higher by one with
 respect to all other functions (i.e., it returns number of executions of
 the exit statement, not of the loop latch).

 @node Dependency analysis
 @section Data Dependency Analysis
 @cindex Data Dependency Analysis

 The code for the data dependence analysis can be found in
 @file{tree-data-ref.c} and its interface and data structures are
 described in @file{tree-data-ref.h}.  The function that computes the
 data dependences for all the array and pointer references for a given
 loop is @code{compute_data_dependences_for_loop}.  This function is
 currently used by the linear loop transform and the vectorization
 passes.  Before calling this function, one has to allocate two vectors:
 a first vector will contain the set of data references that are
 contained in the analyzed loop body, and the second vector will contain
 the dependence relations between the data references.  Thus if the
 vector of data references is of size @code{n}, the vector containing the
 dependence relations will contain @code{n*n} elements.  However if the
 analyzed loop contains side effects, such as calls that potentially can
 interfere with the data references in the current analyzed loop, the
 analysis stops while scanning the loop body for data references, and
 inserts a single @code{chrec_dont_know} in the dependence relation
 array.

 The data references are discovered in a particular order during the
 scanning of the loop body: the loop body is analyzed in execution order,
 and the data references of each statement are pushed at the end of the
 data reference array.  Two data references syntactically occur in the
 program in the same order as in the array of data references.  This
 syntactic order is important in some classical data dependence tests,
 and mapping this order to the elements of this array avoids costly
 queries to the loop body representation.

 Three types of data references are currently handled: ARRAY_REF,
 INDIRECT_REF and COMPONENT_REF. The data structure for the data reference
 is @code{data_reference}, where @code{data_reference_p} is a name of a
 pointer to the data reference structure. The structure contains the
 following elements:

 @itemize
 @item @code{base_object_info}: Provides information about the base object
 of the data reference and its access functions. These access functions
 represent the evolution of the data reference in the loop relative to
 its base, in keeping with the classical meaning of the data reference
 access function for the support of arrays. For example, for a reference
 @code{a.b[i][j]}, the base object is @code{a.b} and the access functions,
 one for each array subscript, are:
 @code{@{i_init, + i_step@}_1, @{j_init, +, j_step@}_2}.

 @item @code{first_location_in_loop}: Provides information about the first
 location accessed by the data reference in the loop and about the access
 function used to represent evolution relative to this location. This data
 is used to support pointers, and is not used for arrays (for which we
 have base objects). Pointer accesses are represented as a one-dimensional
 access that starts from the first location accessed in the loop. For
 example:

 @smallexample
       for1 i
          for2 j
           *((int *)p + i + j) = a[i][j];
 @end smallexample

 The access function of the pointer access is @code{@{0, + 4B@}_for2}
 relative to @code{p + i}. The access functions of the array are
 @code{@{i_init, + i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
 relative to @code{a}.

 Usually, the object the pointer refers to is either unknown, or we can't
 prove that the access is confined to the boundaries of a certain object.

 Two data references can be compared only if at least one of these two
 representations has all its fields filled for both data references.

 The current strategy for data dependence tests is as follows:
 If both @code{a} and @code{b} are represented as arrays, compare
 @code{a.base_object} and @code{b.base_object};
 if they are equal, apply dependence tests (use access functions based on
 base_objects).
 Else if both @code{a} and @code{b} are represented as pointers, compare
 @code{a.first_location} and @code{b.first_location};
 if they are equal, apply dependence tests (use access functions based on
 first location).
 However, if @code{a} and @code{b} are represented differently, only try
 to prove that the bases are definitely different.

 @item Aliasing information.
 @item Alignment information.
 @end itemize

 The structure describing the relation between two data references is
 @code{data_dependence_relation} and the shorter name for a pointer to
 such a structure is @code{ddr_p}.  This structure contains:

 @itemize
 @item a pointer to each data reference,
 @item a tree node @code{are_dependent} that is set to @code{chrec_known}
 if the analysis has proved that there is no dependence between these two
 data references, @code{chrec_dont_know} if the analysis was not able to
 determine any useful result and potentially there could exist a
 dependence between these data references, and @code{are_dependent} is
 set to @code{NULL_TREE} if there exist a dependence relation between the
 data references, and the description of this dependence relation is
 given in the @code{subscripts}, @code{dir_vects}, and @code{dist_vects}
 arrays,
 @item a boolean that determines whether the dependence relation can be
 represented by a classical distance vector,
 @item an array @code{subscripts} that contains a description of each
 subscript of the data references.  Given two array accesses a
 subscript is the tuple composed of the access functions for a given
 dimension.  For example, given @code{A[f1][f2][f3]} and
 @code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
 g2), (f3, g3)}.
 @item two arrays @code{dir_vects} and @code{dist_vects} that contain
 classical representations of the data dependences under the form of
 direction and distance dependence vectors,
 @item an array of loops @code{loop_nest} that contains the loops to
 which the distance and direction vectors refer to.
 @end itemize

 Several functions for pretty printing the information extracted by the
 data dependence analysis are available: @code{dump_ddrs} prints with a
 maximum verbosity the details of a data dependence relations array,
 @code{dump_dist_dir_vectors} prints only the classical distance and
 direction vectors for a data dependence relations array, and
 @code{dump_data_references} prints the details of the data references
 contained in a data reference array.

 @node Lambda
 @section Linear loop transformations framework
 @cindex Linear loop transformations framework

 Lambda is a framework that allows transformations of loops using
 non-singular matrix based transformations of the iteration space and
 loop bounds. This allows compositions of skewing, scaling, interchange,
 and reversal transformations.  These transformations are often used to
 improve cache behavior or remove inner loop dependencies to allow
 parallelization and vectorization to take place.

 To perform these transformations, Lambda requires that the loopnest be
 converted into a internal form that can be matrix transformed easily.
 To do this conversion, the function
 @code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop cannot
 be transformed using lambda, this function will return NULL.

 Once a @code{lambda_loopnest} is obtained from the conversion function,
 it can be transformed by using @code{lambda_loopnest_transform}, which
 takes a transformation matrix to apply.  Note that it is up to the
 caller to verify that the transformation matrix is legal to apply to the
 loop (dependence respecting, etc).  Lambda simply applies whatever
 matrix it is told to provide.  It can be extended to make legal matrices
 out of any non-singular matrix, but this is not currently implemented.
 Legality of a matrix for a given loopnest can be verified using
 @code{lambda_transform_legal_p}.

 Given a transformed loopnest, conversion back into gcc IR is done by
 @code{lambda_loopnest_to_gcc_loopnest}.  This function will modify the
 loops so that they match the transformed loopnest.