| @c Copyright (c) 2004, 2005 Free Software Foundation, Inc. |
| @c Free Software Foundation, Inc. |
| @c This is part of the GCC manual. |
| @c For copying conditions, see the file gcc.texi. |
| |
| @c --------------------------------------------------------------------- |
| @c Tree SSA |
| @c --------------------------------------------------------------------- |
| |
| @node Tree SSA |
| @chapter Analysis and Optimization of GIMPLE Trees |
| @cindex Tree SSA |
| @cindex Optimization infrastructure for GIMPLE |
| |
| GCC uses three main intermediate languages to represent the program |
| during compilation: GENERIC, GIMPLE and RTL@. GENERIC is a |
| language-independent representation generated by each front end. It |
| is used to serve as an interface between the parser and optimizer. |
| GENERIC is a common representation that is able to represent programs |
| written in all the languages supported by GCC@. |
| |
| GIMPLE and RTL are used to optimize the program. GIMPLE is used for |
| target and language independent optimizations (e.g., inlining, |
| constant propagation, tail call elimination, redundancy elimination, |
| etc). Much like GENERIC, GIMPLE is a language independent, tree based |
| representation. However, it differs from GENERIC in that the GIMPLE |
| grammar is more restrictive: expressions contain no more than 3 |
| operands (except function calls), it has no control flow structures |
| and expressions with side-effects are only allowed on the right hand |
| side of assignments. See the chapter describing GENERIC and GIMPLE |
| for more details. |
| |
| This chapter describes the data structures and functions used in the |
| GIMPLE optimizers (also known as ``tree optimizers'' or ``middle |
| end''). In particular, it focuses on all the macros, data structures, |
| functions and programming constructs needed to implement optimization |
| passes for GIMPLE@. |
| |
| @menu |
| * GENERIC:: A high-level language-independent representation. |
| * GIMPLE:: A lower-level factored tree representation. |
| * Annotations:: Attributes for statements and variables. |
| * Statement Operands:: Variables referenced by GIMPLE statements. |
| * SSA:: Static Single Assignment representation. |
| * Alias analysis:: Representing aliased loads and stores. |
| @end menu |
| |
| @node GENERIC |
| @section GENERIC |
| @cindex GENERIC |
| |
| The purpose of GENERIC is simply to provide a language-independent way of |
| representing an entire function in trees. To this end, it was necessary to |
| add a few new tree codes to the back end, but most everything was already |
| there. If you can express it with the codes in @code{gcc/tree.def}, it's |
| GENERIC@. |
| |
| Early on, there was a great deal of debate about how to think about |
| statements in a tree IL@. In GENERIC, a statement is defined as any |
| expression whose value, if any, is ignored. A statement will always |
| have @code{TREE_SIDE_EFFECTS} set (or it will be discarded), but a |
| non-statement expression may also have side effects. A |
| @code{CALL_EXPR}, for instance. |
| |
| It would be possible for some local optimizations to work on the |
| GENERIC form of a function; indeed, the adapted tree inliner works |
| fine on GENERIC, but the current compiler performs inlining after |
| lowering to GIMPLE (a restricted form described in the next section). |
| Indeed, currently the frontends perform this lowering before handing |
| off to @code{tree_rest_of_compilation}, but this seems inelegant. |
| |
| If necessary, a front end can use some language-dependent tree codes |
| in its GENERIC representation, so long as it provides a hook for |
| converting them to GIMPLE and doesn't expect them to work with any |
| (hypothetical) optimizers that run before the conversion to GIMPLE@. |
| The intermediate representation used while parsing C and C++ looks |
| very little like GENERIC, but the C and C++ gimplifier hooks are |
| perfectly happy to take it as input and spit out GIMPLE@. |
| |
| @node GIMPLE |
| @section GIMPLE |
| @cindex GIMPLE |
| |
| GIMPLE is a simplified subset of GENERIC for use in optimization. The |
| particular subset chosen (and the name) was heavily influenced by the |
| SIMPLE IL used by the McCAT compiler project at McGill University |
| (@uref{http://www-acaps.cs.mcgill.ca/info/McCAT/McCAT.html}), |
| though we have made some different choices. For one thing, SIMPLE |
| doesn't support @code{goto}; a production compiler can't afford that |
| kind of restriction. |
| |
| GIMPLE retains much of the structure of the parse trees: lexical |
| scopes are represented as containers, rather than markers. However, |
| expressions are broken down into a 3-address form, using temporary |
| variables to hold intermediate values. Also, control structures are |
| lowered to gotos. |
| |
| In GIMPLE no container node is ever used for its value; if a |
| @code{COND_EXPR} or @code{BIND_EXPR} has a value, it is stored into a |
| temporary within the controlled blocks, and that temporary is used in |
| place of the container. |
| |
| The compiler pass which lowers GENERIC to GIMPLE is referred to as the |
| @samp{gimplifier}. The gimplifier works recursively, replacing complex |
| statements with sequences of simple statements. |
| |
| @c Currently, the only way to |
| @c tell whether or not an expression is in GIMPLE form is by recursively |
| @c examining it; in the future there will probably be a flag to help avoid |
| @c redundant work. FIXME FIXME |
| |
| @menu |
| * Interfaces:: |
| * Temporaries:: |
| * GIMPLE Expressions:: |
| * Statements:: |
| * GIMPLE Example:: |
| * Rough GIMPLE Grammar:: |
| @end menu |
| |
| @node Interfaces |
| @subsection Interfaces |
| @cindex gimplification |
| |
| The tree representation of a function is stored in |
| @code{DECL_SAVED_TREE}. It is lowered to GIMPLE by a call to |
| @code{gimplify_function_tree}. |
| |
| If a front end wants to include language-specific tree codes in the tree |
| representation which it provides to the back end, it must provide a |
| definition of @code{LANG_HOOKS_GIMPLIFY_EXPR} which knows how to |
| convert the front end trees to GIMPLE@. Usually such a hook will involve |
| much of the same code for expanding front end trees to RTL@. This function |
| can return fully lowered GIMPLE, or it can return GENERIC trees and let the |
| main gimplifier lower them the rest of the way; this is often simpler. |
| |
| The C and C++ front ends currently convert directly from front end |
| trees to GIMPLE, and hand that off to the back end rather than first |
| converting to GENERIC@. Their gimplifier hooks know about all the |
| @code{_STMT} nodes and how to convert them to GENERIC forms. There |
| was some work done on a genericization pass which would run first, but |
| the existence of @code{STMT_EXPR} meant that in order to convert all |
| of the C statements into GENERIC equivalents would involve walking the |
| entire tree anyway, so it was simpler to lower all the way. This |
| might change in the future if someone writes an optimization pass |
| which would work better with higher-level trees, but currently the |
| optimizers all expect GIMPLE@. |
| |
| A front end which wants to use the tree optimizers (and already has |
| some sort of whole-function tree representation) only needs to provide |
| a definition of @code{LANG_HOOKS_GIMPLIFY_EXPR}, call |
| @code{gimplify_function_tree} to lower to GIMPLE, and then hand off to |
| @code{tree_rest_of_compilation} to compile and output the function. |
| |
| You can tell the compiler to dump a C-like representation of the GIMPLE |
| form with the flag @option{-fdump-tree-gimple}. |
| |
| @node Temporaries |
| @subsection Temporaries |
| @cindex Temporaries |
| |
| When gimplification encounters a subexpression which is too complex, it |
| creates a new temporary variable to hold the value of the subexpression, |
| and adds a new statement to initialize it before the current statement. |
| These special temporaries are known as @samp{expression temporaries}, and are |
| allocated using @code{get_formal_tmp_var}. The compiler tries to |
| always evaluate identical expressions into the same temporary, to simplify |
| elimination of redundant calculations. |
| |
| We can only use expression temporaries when we know that it will not be |
| reevaluated before its value is used, and that it will not be otherwise |
| modified@footnote{These restrictions are derived from those in Morgan 4.8.}. |
| Other temporaries can be allocated using |
| @code{get_initialized_tmp_var} or @code{create_tmp_var}. |
| |
| Currently, an expression like @code{a = b + 5} is not reduced any |
| further. We tried converting it to something like |
| @smallexample |
| T1 = b + 5; |
| a = T1; |
| @end smallexample |
| but this bloated the representation for minimal benefit. However, a |
| variable which must live in memory cannot appear in an expression; its |
| value is explicitly loaded into a temporary first. Similarly, storing |
| the value of an expression to a memory variable goes through a |
| temporary. |
| |
| @node GIMPLE Expressions |
| @subsection Expressions |
| @cindex GIMPLE Expressions |
| |
| In general, expressions in GIMPLE consist of an operation and the |
| appropriate number of simple operands; these operands must either be a |
| GIMPLE rvalue (@code{is_gimple_val}), i.e.@: a constant or a register |
| variable. More complex operands are factored out into temporaries, so |
| that |
| @smallexample |
| a = b + c + d |
| @end smallexample |
| becomes |
| @smallexample |
| T1 = b + c; |
| a = T1 + d; |
| @end smallexample |
| |
| The same rule holds for arguments to a @code{CALL_EXPR}. |
| |
| The target of an assignment is usually a variable, but can also be an |
| @code{INDIRECT_REF} or a compound lvalue as described below. |
| |
| @menu |
| * Compound Expressions:: |
| * Compound Lvalues:: |
| * Conditional Expressions:: |
| * Logical Operators:: |
| @end menu |
| |
| @node Compound Expressions |
| @subsubsection Compound Expressions |
| @cindex Compound Expressions |
| |
| The left-hand side of a C comma expression is simply moved into a separate |
| statement. |
| |
| @node Compound Lvalues |
| @subsubsection Compound Lvalues |
| @cindex Compound Lvalues |
| |
| Currently compound lvalues involving array and structure field references |
| are not broken down; an expression like @code{a.b[2] = 42} is not reduced |
| any further (though complex array subscripts are). This restriction is a |
| workaround for limitations in later optimizers; if we were to convert this |
| to |
| |
| @smallexample |
| T1 = &a.b; |
| T1[2] = 42; |
| @end smallexample |
| |
| alias analysis would not remember that the reference to @code{T1[2]} came |
| by way of @code{a.b}, so it would think that the assignment could alias |
| another member of @code{a}; this broke @code{struct-alias-1.c}. Future |
| optimizer improvements may make this limitation unnecessary. |
| |
| @node Conditional Expressions |
| @subsubsection Conditional Expressions |
| @cindex Conditional Expressions |
| |
| A C @code{?:} expression is converted into an @code{if} statement with |
| each branch assigning to the same temporary. So, |
| |
| @smallexample |
| a = b ? c : d; |
| @end smallexample |
| becomes |
| @smallexample |
| if (b) |
| T1 = c; |
| else |
| T1 = d; |
| a = T1; |
| @end smallexample |
| |
| Tree level if-conversion pass re-introduces @code{?:} expression, if appropriate. |
| It is used to vectorize loops with conditions using vector conditional operations. |
| |
| Note that in GIMPLE, @code{if} statements are also represented using |
| @code{COND_EXPR}, as described below. |
| |
| @node Logical Operators |
| @subsubsection Logical Operators |
| @cindex Logical Operators |
| |
| Except when they appear in the condition operand of a @code{COND_EXPR}, |
| logical `and' and `or' operators are simplified as follows: |
| @code{a = b && c} becomes |
| |
| @smallexample |
| T1 = (bool)b; |
| if (T1) |
| T1 = (bool)c; |
| a = T1; |
| @end smallexample |
| |
| Note that @code{T1} in this example cannot be an expression temporary, |
| because it has two different assignments. |
| |
| @node Statements |
| @subsection Statements |
| @cindex Statements |
| |
| Most statements will be assignment statements, represented by |
| @code{MODIFY_EXPR}. A @code{CALL_EXPR} whose value is ignored can |
| also be a statement. No other C expressions can appear at statement level; |
| a reference to a volatile object is converted into a @code{MODIFY_EXPR}. |
| In GIMPLE form, type of @code{MODIFY_EXPR} is not meaningful. Instead, use type |
| of LHS or RHS@. |
| |
| There are also several varieties of complex statements. |
| |
| @menu |
| * Blocks:: |
| * Statement Sequences:: |
| * Empty Statements:: |
| * Loops:: |
| * Selection Statements:: |
| * Jumps:: |
| * Cleanups:: |
| * GIMPLE Exception Handling:: |
| @end menu |
| |
| @node Blocks |
| @subsubsection Blocks |
| @cindex Blocks |
| |
| Block scopes and the variables they declare in GENERIC and GIMPLE are |
| expressed using the @code{BIND_EXPR} code, which in previous versions of |
| GCC was primarily used for the C statement-expression extension. |
| |
| Variables in a block are collected into @code{BIND_EXPR_VARS} in |
| declaration order. Any runtime initialization is moved out of |
| @code{DECL_INITIAL} and into a statement in the controlled block. When |
| gimplifying from C or C++, this initialization replaces the |
| @code{DECL_STMT}. |
| |
| Variable-length arrays (VLAs) complicate this process, as their size often |
| refers to variables initialized earlier in the block. To handle this, we |
| currently split the block at that point, and move the VLA into a new, inner |
| @code{BIND_EXPR}. This strategy may change in the future. |
| |
| @code{DECL_SAVED_TREE} for a GIMPLE function will always be a |
| @code{BIND_EXPR} which contains declarations for the temporary variables |
| used in the function. |
| |
| A C++ program will usually contain more @code{BIND_EXPR}s than there are |
| syntactic blocks in the source code, since several C++ constructs have |
| implicit scopes associated with them. On the other hand, although the C++ |
| front end uses pseudo-scopes to handle cleanups for objects with |
| destructors, these don't translate into the GIMPLE form; multiple |
| declarations at the same level use the same @code{BIND_EXPR}. |
| |
| @node Statement Sequences |
| @subsubsection Statement Sequences |
| @cindex Statement Sequences |
| |
| Multiple statements at the same nesting level are collected into a |
| @code{STATEMENT_LIST}. Statement lists are modified and traversed |
| using the interface in @samp{tree-iterator.h}. |
| |
| @node Empty Statements |
| @subsubsection Empty Statements |
| @cindex Empty Statements |
| |
| Whenever possible, statements with no effect are discarded. But if they |
| are nested within another construct which cannot be discarded for some |
| reason, they are instead replaced with an empty statement, generated by |
| @code{build_empty_stmt}. Initially, all empty statements were shared, |
| after the pattern of the Java front end, but this caused a lot of trouble in |
| practice. |
| |
| An empty statement is represented as @code{(void)0}. |
| |
| @node Loops |
| @subsubsection Loops |
| @cindex Loops |
| |
| At one time loops were expressed in GIMPLE using @code{LOOP_EXPR}, but |
| now they are lowered to explicit gotos. |
| |
| @node Selection Statements |
| @subsubsection Selection Statements |
| @cindex Selection Statements |
| |
| A simple selection statement, such as the C @code{if} statement, is |
| expressed in GIMPLE using a void @code{COND_EXPR}. If only one branch is |
| used, the other is filled with an empty statement. |
| |
| Normally, the condition expression is reduced to a simple comparison. If |
| it is a shortcut (@code{&&} or @code{||}) expression, however, we try to |
| break up the @code{if} into multiple @code{if}s so that the implied shortcut |
| is taken directly, much like the transformation done by @code{do_jump} in |
| the RTL expander. |
| |
| A @code{SWITCH_EXPR} in GIMPLE contains the condition and a |
| @code{TREE_VEC} of @code{CASE_LABEL_EXPR}s describing the case values |
| and corresponding @code{LABEL_DECL}s to jump to. The body of the |
| @code{switch} is moved after the @code{SWITCH_EXPR}. |
| |
| @node Jumps |
| @subsubsection Jumps |
| @cindex Jumps |
| |
| Other jumps are expressed by either @code{GOTO_EXPR} or @code{RETURN_EXPR}. |
| |
| The operand of a @code{GOTO_EXPR} must be either a label or a variable |
| containing the address to jump to. |
| |
| The operand of a @code{RETURN_EXPR} is either @code{NULL_TREE} or a |
| @code{MODIFY_EXPR} which sets the return value. It would be nice to |
| move the @code{MODIFY_EXPR} into a separate statement, but the special |
| return semantics in @code{expand_return} make that difficult. It may |
| still happen in the future, perhaps by moving most of that logic into |
| @code{expand_assignment}. |
| |
| @node Cleanups |
| @subsubsection Cleanups |
| @cindex Cleanups |
| |
| Destructors for local C++ objects and similar dynamic cleanups are |
| represented in GIMPLE by a @code{TRY_FINALLY_EXPR}. When the controlled |
| block exits, the cleanup is run. |
| |
| @code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup |
| needs to appear on every edge out of the controlled block; this |
| reduces the freedom to move code across these edges. Therefore, the |
| EH lowering pass which runs before most of the optimization passes |
| eliminates these expressions by explicitly adding the cleanup to each |
| edge. |
| |
| @node GIMPLE Exception Handling |
| @subsubsection Exception Handling |
| @cindex GIMPLE Exception Handling |
| |
| Other exception handling constructs are represented using |
| @code{TRY_CATCH_EXPR}. The handler operand of a @code{TRY_CATCH_EXPR} |
| can be a normal statement to be executed if the controlled block throws an |
| exception, or it can have one of two special forms: |
| |
| @enumerate |
| @item A @code{CATCH_EXPR} executes its handler if the thrown exception |
| matches one of the allowed types. Multiple handlers can be |
| expressed by a sequence of @code{CATCH_EXPR} statements. |
| @item An @code{EH_FILTER_EXPR} executes its handler if the thrown |
| exception does not match one of the allowed types. |
| @end enumerate |
| |
| Currently throwing an exception is not directly represented in GIMPLE, |
| since it is implemented by calling a function. At some point in the future |
| we will want to add some way to express that the call will throw an |
| exception of a known type. |
| |
| Just before running the optimizers, the compiler lowers the high-level |
| EH constructs above into a set of @samp{goto}s, magic labels, and EH |
| regions. Continuing to unwind at the end of a cleanup is represented |
| with a @code{RESX_EXPR}. |
| |
| @node GIMPLE Example |
| @subsection GIMPLE Example |
| @cindex GIMPLE Example |
| |
| @smallexample |
| struct A @{ A(); ~A(); @}; |
| |
| int i; |
| int g(); |
| void f() |
| @{ |
| A a; |
| int j = (--i, i ? 0 : 1); |
| |
| for (int x = 42; x > 0; --x) |
| @{ |
| i += g()*4 + 32; |
| @} |
| @} |
| @end smallexample |
| |
| becomes |
| |
| @smallexample |
| void f() |
| @{ |
| int i.0; |
| int T.1; |
| int iftmp.2; |
| int T.3; |
| int T.4; |
| int T.5; |
| int T.6; |
| |
| @{ |
| struct A a; |
| int j; |
| |
| __comp_ctor (&a); |
| try |
| @{ |
| i.0 = i; |
| T.1 = i.0 - 1; |
| i = T.1; |
| i.0 = i; |
| if (i.0 == 0) |
| iftmp.2 = 1; |
| else |
| iftmp.2 = 0; |
| j = iftmp.2; |
| @{ |
| int x; |
| |
| x = 42; |
| goto test; |
| loop:; |
| |
| T.3 = g (); |
| T.4 = T.3 * 4; |
| i.0 = i; |
| T.5 = T.4 + i.0; |
| T.6 = T.5 + 32; |
| i = T.6; |
| x = x - 1; |
| |
| test:; |
| if (x > 0) |
| goto loop; |
| else |
| goto break_; |
| break_:; |
| @} |
| @} |
| finally |
| @{ |
| __comp_dtor (&a); |
| @} |
| @} |
| @} |
| @end smallexample |
| |
| @node Rough GIMPLE Grammar |
| @subsection Rough GIMPLE Grammar |
| @cindex Rough GIMPLE Grammar |
| |
| @smallexample |
| function : FUNCTION_DECL |
| DECL_SAVED_TREE -> compound-stmt |
| |
| compound-stmt: STATEMENT_LIST |
| members -> stmt |
| |
| stmt : block |
| | if-stmt |
| | switch-stmt |
| | goto-stmt |
| | return-stmt |
| | resx-stmt |
| | label-stmt |
| | try-stmt |
| | modify-stmt |
| | call-stmt |
| |
| block : BIND_EXPR |
| BIND_EXPR_VARS -> chain of DECLs |
| BIND_EXPR_BLOCK -> BLOCK |
| BIND_EXPR_BODY -> compound-stmt |
| |
| if-stmt : COND_EXPR |
| op0 -> condition |
| op1 -> compound-stmt |
| op2 -> compound-stmt |
| |
| switch-stmt : SWITCH_EXPR |
| op0 -> val |
| op1 -> NULL |
| op2 -> TREE_VEC of CASE_LABEL_EXPRs |
| The CASE_LABEL_EXPRs are sorted by CASE_LOW, |
| and default is last. |
| |
| goto-stmt : GOTO_EXPR |
| op0 -> LABEL_DECL | val |
| |
| return-stmt : RETURN_EXPR |
| op0 -> return-value |
| |
| return-value : NULL |
| | RESULT_DECL |
| | MODIFY_EXPR |
| op0 -> RESULT_DECL |
| op1 -> lhs |
| |
| resx-stmt : RESX_EXPR |
| |
| label-stmt : LABEL_EXPR |
| op0 -> LABEL_DECL |
| |
| try-stmt : TRY_CATCH_EXPR |
| op0 -> compound-stmt |
| op1 -> handler |
| | TRY_FINALLY_EXPR |
| op0 -> compound-stmt |
| op1 -> compound-stmt |
| |
| handler : catch-seq |
| | EH_FILTER_EXPR |
| | compound-stmt |
| |
| catch-seq : STATEMENT_LIST |
| members -> CATCH_EXPR |
| |
| modify-stmt : MODIFY_EXPR |
| op0 -> lhs |
| op1 -> rhs |
| |
| call-stmt : CALL_EXPR |
| op0 -> val | OBJ_TYPE_REF |
| op1 -> call-arg-list |
| |
| call-arg-list: TREE_LIST |
| members -> lhs | CONST |
| |
| addr-expr-arg: ID |
| | compref |
| |
| addressable : addr-expr-arg |
| | indirectref |
| |
| with-size-arg: addressable |
| | call-stmt |
| |
| indirectref : INDIRECT_REF |
| op0 -> val |
| |
| lhs : addressable |
| | bitfieldref |
| | WITH_SIZE_EXPR |
| op0 -> with-size-arg |
| op1 -> val |
| |
| min-lval : ID |
| | indirectref |
| |
| bitfieldref : BIT_FIELD_REF |
| op0 -> inner-compref |
| op1 -> CONST |
| op2 -> var |
| |
| compref : inner-compref |
| | REALPART_EXPR |
| op0 -> inner-compref |
| | IMAGPART_EXPR |
| op0 -> inner-compref |
| |
| inner-compref: min-lval |
| | COMPONENT_REF |
| op0 -> inner-compref |
| op1 -> FIELD_DECL |
| op2 -> val |
| | ARRAY_REF |
| op0 -> inner-compref |
| op1 -> val |
| op2 -> val |
| op3 -> val |
| | ARRAY_RANGE_REF |
| op0 -> inner-compref |
| op1 -> val |
| op2 -> val |
| op3 -> val |
| | VIEW_CONVERT_EXPR |
| op0 -> inner-compref |
| |
| condition : val |
| | RELOP |
| op0 -> val |
| op1 -> val |
| |
| val : ID |
| | CONST |
| |
| rhs : lhs |
| | CONST |
| | call-stmt |
| | ADDR_EXPR |
| op0 -> addr-expr-arg |
| | UNOP |
| op0 -> val |
| | BINOP |
| op0 -> val |
| op1 -> val |
| | RELOP |
| op0 -> val |
| op1 -> val |
| @end smallexample |
| |
| @node Annotations |
| @section Annotations |
| @cindex annotations |
| |
| The optimizers need to associate attributes with statements and |
| variables during the optimization process. For instance, we need to |
| know what basic block a statement belongs to or whether a variable |
| has aliases. All these attributes are stored in data structures |
| called annotations which are then linked to the field @code{ann} in |
| @code{struct tree_common}. |
| |
| Presently, we define annotations for statements (@code{stmt_ann_t}), |
| variables (@code{var_ann_t}) and SSA names (@code{ssa_name_ann_t}). |
| Annotations are defined and documented in @file{tree-flow.h}. |
| |
| |
| @node Statement Operands |
| @section Statement Operands |
| @cindex operands |
| @cindex virtual operands |
| @cindex real operands |
| @findex get_stmt_operands |
| @findex modify_stmt |
| |
| Almost every GIMPLE statement will contain a reference to a variable |
| or memory location. Since statements come in different shapes and |
| sizes, their operands are going to be located at various spots inside |
| the statement's tree. To facilitate access to the statement's |
| operands, they are organized into arrays associated inside each |
| statement's annotation. Each element in an operand array is a pointer |
| to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node. |
| This provides a very convenient way of examining and replacing |
| operands. |
| |
| Data flow analysis and optimization is done on all tree nodes |
| representing variables. Any node for which @code{SSA_VAR_P} returns |
| nonzero is considered when scanning statement operands. However, not |
| all @code{SSA_VAR_P} variables are processed in the same way. For the |
| purposes of optimization, we need to distinguish between references to |
| local scalar variables and references to globals, statics, structures, |
| arrays, aliased variables, etc. The reason is simple, the compiler |
| can gather complete data flow information for a local scalar. On the |
| other hand, a global variable may be modified by a function call, it |
| may not be possible to keep track of all the elements of an array or |
| the fields of a structure, etc. |
| |
| The operand scanner gathers two kinds of operands: @dfn{real} and |
| @dfn{virtual}. An operand for which @code{is_gimple_reg} returns true |
| is considered real, otherwise it is a virtual operand. We also |
| distinguish between uses and definitions. An operand is used if its |
| value is loaded by the statement (e.g., the operand at the RHS of an |
| assignment). If the statement assigns a new value to the operand, the |
| operand is considered a definition (e.g., the operand at the LHS of |
| an assignment). |
| |
| Virtual and real operands also have very different data flow |
| properties. Real operands are unambiguous references to the |
| full object that they represent. For instance, given |
| |
| @smallexample |
| @{ |
| int a, b; |
| a = b |
| @} |
| @end smallexample |
| |
| Since @code{a} and @code{b} are non-aliased locals, the statement |
| @code{a = b} will have one real definition and one real use because |
| variable @code{b} is completely modified with the contents of |
| variable @code{a}. Real definition are also known as @dfn{killing |
| definitions}. Similarly, the use of @code{a} reads all its bits. |
| |
| In contrast, virtual operands are used with variables that can have |
| a partial or ambiguous reference. This includes structures, arrays, |
| globals, and aliased variables. In these cases, we have two types of |
| definitions. For globals, structures, and arrays, we can determine from |
| a statement whether a variable of these types has a killing definition. |
| If the variable does, then the statement is marked as having a |
| @dfn{must definition} of that variable. However, if a statement is only |
| defining a part of the variable (i.e.@: a field in a structure), or if we |
| know that a statement might define the variable but we cannot say for sure, |
| then we mark that statement as having a @dfn{may definition}. For |
| instance, given |
| |
| @smallexample |
| @{ |
| int a, b, *p; |
| |
| if (...) |
| p = &a; |
| else |
| p = &b; |
| *p = 5; |
| return *p; |
| @} |
| @end smallexample |
| |
| The assignment @code{*p = 5} may be a definition of @code{a} or |
| @code{b}. If we cannot determine statically where @code{p} is |
| pointing to at the time of the store operation, we create virtual |
| definitions to mark that statement as a potential definition site for |
| @code{a} and @code{b}. Memory loads are similarly marked with virtual |
| use operands. Virtual operands are shown in tree dumps right before |
| the statement that contains them. To request a tree dump with virtual |
| operands, use the @option{-vops} option to @option{-fdump-tree}: |
| |
| @smallexample |
| @{ |
| int a, b, *p; |
| |
| if (...) |
| p = &a; |
| else |
| p = &b; |
| # a = V_MAY_DEF <a> |
| # b = V_MAY_DEF <b> |
| *p = 5; |
| |
| # VUSE <a> |
| # VUSE <b> |
| return *p; |
| @} |
| @end smallexample |
| |
| Notice that @code{V_MAY_DEF} operands have two copies of the referenced |
| variable. This indicates that this is not a killing definition of |
| that variable. In this case we refer to it as a @dfn{may definition} |
| or @dfn{aliased store}. The presence of the second copy of the |
| variable in the @code{V_MAY_DEF} operand will become important when the |
| function is converted into SSA form. This will be used to link all |
| the non-killing definitions to prevent optimizations from making |
| incorrect assumptions about them. |
| |
| Operands are collected by @file{tree-ssa-operands.c}. They are stored |
| inside each statement's annotation and can be accessed with |
| @code{DEF_OPS}, @code{USE_OPS}, @code{V_MAY_DEF_OPS}, |
| @code{V_MUST_DEF_OPS} and @code{VUSE_OPS}. The following are all the |
| accessor macros available to access USE operands. To access all the |
| other operand arrays, just change the name accordingly. Note that |
| this interface to the operands is deprecated, and is slated for |
| removal in a future version of gcc. The preferred interface is the |
| operand iterator interface. Unless you need to discover the number of |
| operands of a given type on a statement, you are strongly urged not to |
| use this interface. |
| |
| @defmac USE_OPS (@var{ann}) |
| Returns the array of operands used by the statement with annotation |
| @var{ann}. |
| @end defmac |
| |
| @defmac STMT_USE_OPS (@var{stmt}) |
| Alternate version of USE_OPS that takes the statement @var{stmt} as |
| input. |
| @end defmac |
| |
| @defmac NUM_USES (@var{ops}) |
| Return the number of USE operands in array @var{ops}. |
| @end defmac |
| |
| @defmac USE_OP_PTR (@var{ops}, @var{i}) |
| Return a pointer to the @var{i}th operand in array @var{ops}. |
| @end defmac |
| |
| @defmac USE_OP (@var{ops}, @var{i}) |
| Return the @var{i}th operand in array @var{ops}. |
| @end defmac |
| |
| The following function shows how to print all the operands of a given |
| statement: |
| |
| @smallexample |
| void |
| print_ops (tree stmt) |
| @{ |
| vuse_optype vuses; |
| v_may_def_optype v_may_defs; |
| v_must_def_optype v_must_defs; |
| def_optype defs; |
| use_optype uses; |
| stmt_ann_t ann; |
| size_t i; |
| |
| get_stmt_operands (stmt); |
| ann = stmt_ann (stmt); |
| |
| defs = DEF_OPS (ann); |
| for (i = 0; i < NUM_DEFS (defs); i++) |
| print_generic_expr (stderr, DEF_OP (defs, i), 0); |
| |
| uses = USE_OPS (ann); |
| for (i = 0; i < NUM_USES (uses); i++) |
| print_generic_expr (stderr, USE_OP (uses, i), 0); |
| |
| v_may_defs = V_MAY_DEF_OPS (ann); |
| for (i = 0; i < NUM_V_MAY_DEFS (v_may_defs); i++) |
| @{ |
| print_generic_expr (stderr, V_MAY_DEF_OP (v_may_defs, i), 0); |
| print_generic_expr (stderr, V_MAY_DEF_RESULT (v_may_defs, i), 0); |
| @} |
| |
| v_must_defs = V_MUST_DEF_OPS (ann); |
| for (i = 0; i < NUM_V_MUST_DEFS (v_must_defs); i++) |
| print_generic_expr (stderr, V_MUST_DEF_OP (v_must_defs, i), 0); |
| |
| vuses = VUSE_OPS (ann); |
| for (i = 0; i < NUM_VUSES (vuses); i++) |
| print_generic_expr (stderr, VUSE_OP (vuses, i), 0); |
| @} |
| @end smallexample |
| |
| To collect the operands, you first need to call |
| @code{get_stmt_operands}. Since that is a potentially expensive |
| operation, statements are only scanned if they have been marked |
| modified by a call to @code{modify_stmt}. So, if your pass replaces |
| operands in a statement, make sure to call @code{modify_stmt}. |
| |
| @subsection Operand Iterators |
| @cindex Operand Iterators |
| |
| There is an alternative to iterating over the operands in a statement. |
| It is especially useful when you wish to perform the same operation on |
| more than one type of operand. The previous example could be |
| rewritten as follows: |
| |
| @smallexample |
| void |
| print_ops (tree stmt) |
| @{ |
| ssa_op_iter; |
| tree var; |
| |
| get_stmt_operands (stmt); |
| FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS) |
| print_generic_expr (stderr, var, 0); |
| @} |
| @end smallexample |
| |
| |
| @enumerate |
| @item Determine whether you are need to see the operand pointers, or just the |
| trees, and choose the appropriate macro: |
| |
| @smallexample |
| Need Macro: |
| ---- ------- |
| use_operand_p FOR_EACH_SSA_USE_OPERAND |
| def_operand_p FOR_EACH_SSA_DEF_OPERAND |
| tree FOR_EACH_SSA_TREE_OPERAND |
| @end smallexample |
| |
| @item You need to declare a variable of the type you are interested |
| in, and an ssa_op_iter structure which serves as the loop |
| controlling variable. |
| |
| @item Determine which operands you wish to use, and specify the flags of |
| those you are interested in. They are documented in |
| @file{tree-ssa-operands.h}: |
| |
| @smallexample |
| #define SSA_OP_USE 0x01 /* @r{Real USE operands.} */ |
| #define SSA_OP_DEF 0x02 /* @r{Real DEF operands.} */ |
| #define SSA_OP_VUSE 0x04 /* @r{VUSE operands.} */ |
| #define SSA_OP_VMAYUSE 0x08 /* @r{USE portion of V_MAY_DEFS.} */ |
| #define SSA_OP_VMAYDEF 0x10 /* @r{DEF portion of V_MAY_DEFS.} */ |
| #define SSA_OP_VMUSTDEF 0x20 /* @r{V_MUST_DEF definitions.} */ |
| |
| /* @r{These are commonly grouped operand flags.} */ |
| #define SSA_OP_VIRTUAL_USES (SSA_OP_VUSE | SSA_OP_VMAYUSE) |
| #define SSA_OP_VIRTUAL_DEFS (SSA_OP_VMAYDEF | SSA_OP_VMUSTDEF) |
| #define SSA_OP_ALL_USES (SSA_OP_VIRTUAL_USES | SSA_OP_USE) |
| #define SSA_OP_ALL_DEFS (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF) |
| #define SSA_OP_ALL_OPERANDS (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS) |
| @end smallexample |
| @end enumerate |
| |
| So if you want to look at the use pointers for all the @code{USE} and |
| @code{VUSE} operands, you would do something like: |
| |
| @smallexample |
| use_operand_p use_p; |
| ssa_op_iter iter; |
| |
| FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE)) |
| @{ |
| process_use_ptr (use_p); |
| @} |
| @end smallexample |
| |
| The @code{_TREE_} macro is basically the same as the @code{USE} and |
| @code{DEF} macros, only with the use or def dereferenced via |
| @code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}. Since we |
| aren't using operand pointers, use and defs flags can be mixed. |
| |
| @smallexample |
| tree var; |
| ssa_op_iter iter; |
| |
| FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE | SSA_OP_VMUSTDEF) |
| @{ |
| print_generic_expr (stderr, var, TDF_SLIM); |
| @} |
| @end smallexample |
| |
| @code{V_MAY_DEF}s are broken into two flags, one for the |
| @code{DEF} portion (@code{SSA_OP_VMAYDEF}) and one for the USE portion |
| (@code{SSA_OP_VMAYUSE}). If all you want to look at are the |
| @code{V_MAY_DEF}s together, there is a fourth iterator macro for this, |
| which returns both a def_operand_p and a use_operand_p for each |
| @code{V_MAY_DEF} in the statement. Note that you don't need any flags for |
| this one. |
| |
| @smallexample |
| use_operand_p use_p; |
| def_operand_p def_p; |
| ssa_op_iter iter; |
| |
| FOR_EACH_SSA_MAYDEF_OPERAND (def_p, use_p, stmt, iter) |
| @{ |
| my_code; |
| @} |
| @end smallexample |
| |
| @code{V_MUST_DEF}s are broken into two flags, one for the |
| @code{DEF} portion (@code{SSA_OP_VMUSTDEF}) and one for the kill portion |
| (@code{SSA_OP_VMUSTDEFKILL}). If all you want to look at are the |
| @code{V_MUST_DEF}s together, there is a fourth iterator macro for this, |
| which returns both a def_operand_p and a use_operand_p for each |
| @code{V_MUST_DEF} in the statement. Note that you don't need any flags for |
| this one. |
| |
| @smallexample |
| use_operand_p kill_p; |
| def_operand_p def_p; |
| ssa_op_iter iter; |
| |
| FOR_EACH_SSA_MUSTDEF_OPERAND (def_p, kill_p, stmt, iter) |
| @{ |
| my_code; |
| @} |
| @end smallexample |
| |
| |
| There are many examples in the code as well, as well as the |
| documentation in @file{tree-ssa-operands.h}. |
| |
| |
| @node SSA |
| @section Static Single Assignment |
| @cindex SSA |
| @cindex static single assignment |
| |
| Most of the tree optimizers rely on the data flow information provided |
| by the Static Single Assignment (SSA) form. We implement the SSA form |
| as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and |
| K. Zadeck. Efficiently Computing Static Single Assignment Form and the |
| Control Dependence Graph. ACM Transactions on Programming Languages |
| and Systems, 13(4):451-490, October 1991}. |
| |
| The SSA form is based on the premise that program variables are |
| assigned in exactly one location in the program. Multiple assignments |
| to the same variable create new versions of that variable. Naturally, |
| actual programs are seldom in SSA form initially because variables |
| tend to be assigned multiple times. The compiler modifies the program |
| representation so that every time a variable is assigned in the code, |
| a new version of the variable is created. Different versions of the |
| same variable are distinguished by subscripting the variable name with |
| its version number. Variables used in the right-hand side of |
| expressions are renamed so that their version number matches that of |
| the most recent assignment. |
| |
| We represent variable versions using @code{SSA_NAME} nodes. The |
| renaming process in @file{tree-ssa.c} wraps every real and |
| virtual operand with an @code{SSA_NAME} node which contains |
| the version number and the statement that created the |
| @code{SSA_NAME}. Only definitions and virtual definitions may |
| create new @code{SSA_NAME} nodes. |
| |
| Sometimes, flow of control makes it impossible to determine what is the |
| most recent version of a variable. In these cases, the compiler |
| inserts an artificial definition for that variable called |
| @dfn{PHI function} or @dfn{PHI node}. This new definition merges |
| all the incoming versions of the variable to create a new name |
| for it. For instance, |
| |
| @smallexample |
| if (...) |
| a_1 = 5; |
| else if (...) |
| a_2 = 2; |
| else |
| a_3 = 13; |
| |
| # a_4 = PHI <a_1, a_2, a_3> |
| return a_4; |
| @end smallexample |
| |
| Since it is not possible to determine which of the three branches |
| will be taken at runtime, we don't know which of @code{a_1}, |
| @code{a_2} or @code{a_3} to use at the return statement. So, the |
| SSA renamer creates a new version @code{a_4} which is assigned |
| the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}. |
| Hence, PHI nodes mean ``one of these operands. I don't know |
| which''. |
| |
| The following macros can be used to examine PHI nodes |
| |
| @defmac PHI_RESULT (@var{phi}) |
| Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e., |
| @var{phi}'s LHS)@. |
| @end defmac |
| |
| @defmac PHI_NUM_ARGS (@var{phi}) |
| Returns the number of arguments in @var{phi}. This number is exactly |
| the number of incoming edges to the basic block holding @var{phi}@. |
| @end defmac |
| |
| @defmac PHI_ARG_ELT (@var{phi}, @var{i}) |
| Returns a tuple representing the @var{i}th argument of @var{phi}@. |
| Each element of this tuple contains an @code{SSA_NAME} @var{var} and |
| the incoming edge through which @var{var} flows. |
| @end defmac |
| |
| @defmac PHI_ARG_EDGE (@var{phi}, @var{i}) |
| Returns the incoming edge for the @var{i}th argument of @var{phi}. |
| @end defmac |
| |
| @defmac PHI_ARG_DEF (@var{phi}, @var{i}) |
| Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}. |
| @end defmac |
| |
| |
| @subsection Preserving the SSA form |
| @findex vars_to_rename |
| @cindex preserving SSA form |
| Some optimization passes make changes to the function that |
| invalidate the SSA property. This can happen when a pass has |
| added new variables or changed the program so that variables that |
| were previously aliased aren't anymore. |
| |
| Whenever something like this happens, the affected variables must |
| be renamed into SSA form again. To do this, you should mark the |
| new variables in the global bitmap @code{vars_to_rename}. Once |
| your pass has finished, the pass manager will invoke the SSA |
| renamer to put the program into SSA once more. |
| |
| @subsection Examining @code{SSA_NAME} nodes |
| @cindex examining SSA_NAMEs |
| |
| The following macros can be used to examine @code{SSA_NAME} nodes |
| |
| @defmac SSA_NAME_DEF_STMT (@var{var}) |
| Returns the statement @var{s} that creates the @code{SSA_NAME} |
| @var{var}. If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT |
| (@var{s})} returns @code{true}), it means that the first reference to |
| this variable is a USE or a VUSE@. |
| @end defmac |
| |
| @defmac SSA_NAME_VERSION (@var{var}) |
| Returns the version number of the @code{SSA_NAME} object @var{var}. |
| @end defmac |
| |
| |
| @subsection Walking use-def chains |
| |
| @deftypefn {Tree SSA function} void walk_use_def_chains (@var{var}, @var{fn}, @var{data}) |
| |
| Walks use-def chains starting at the @code{SSA_NAME} node @var{var}. |
| Calls function @var{fn} at each reaching definition found. Function |
| @var{FN} takes three arguments: @var{var}, its defining statement |
| (@var{def_stmt}) and a generic pointer to whatever state information |
| that @var{fn} may want to maintain (@var{data}). Function @var{fn} is |
| able to stop the walk by returning @code{true}, otherwise in order to |
| continue the walk, @var{fn} should return @code{false}. |
| |
| Note, that if @var{def_stmt} is a @code{PHI} node, the semantics are |
| slightly different. For each argument @var{arg} of the PHI node, this |
| function will: |
| |
| @enumerate |
| @item Walk the use-def chains for @var{arg}. |
| @item Call @code{FN (@var{arg}, @var{phi}, @var{data})}. |
| @end enumerate |
| |
| Note how the first argument to @var{fn} is no longer the original |
| variable @var{var}, but the PHI argument currently being examined. |
| If @var{fn} wants to get at @var{var}, it should call |
| @code{PHI_RESULT} (@var{phi}). |
| @end deftypefn |
| |
| @subsection Walking the dominator tree |
| |
| @deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb}) |
| |
| This function walks the dominator tree for the current CFG calling a |
| set of callback functions defined in @var{struct dom_walk_data} in |
| @file{domwalk.h}. The call back functions you need to define give you |
| hooks to execute custom code at various points during traversal: |
| |
| @enumerate |
| @item Once to initialize any local data needed while processing |
| @var{bb} and its children. This local data is pushed into an |
| internal stack which is automatically pushed and popped as the |
| walker traverses the dominator tree. |
| |
| @item Once before traversing all the statements in the @var{bb}. |
| |
| @item Once for every statement inside @var{bb}. |
| |
| @item Once after traversing all the statements and before recursing |
| into @var{bb}'s dominator children. |
| |
| @item It then recurses into all the dominator children of @var{bb}. |
| |
| @item After recursing into all the dominator children of @var{bb} it |
| can, optionally, traverse every statement in @var{bb} again |
| (i.e., repeating steps 2 and 3). |
| |
| @item Once after walking the statements in @var{bb} and @var{bb}'s |
| dominator children. At this stage, the block local data stack |
| is popped. |
| @end enumerate |
| @end deftypefn |
| |
| @node Alias analysis |
| @section Alias analysis |
| @cindex alias |
| @cindex flow-sensitive alias analysis |
| @cindex flow-insensitive alias analysis |
| |
| Alias analysis proceeds in 3 main phases: |
| |
| @enumerate |
| @item Points-to and escape analysis. |
| |
| This phase walks the use-def chains in the SSA web looking for |
| three things: |
| |
| @itemize @bullet |
| @item Assignments of the form @code{P_i = &VAR} |
| @item Assignments of the form P_i = malloc() |
| @item Pointers and ADDR_EXPR that escape the current function. |
| @end itemize |
| |
| The concept of `escaping' is the same one used in the Java world. |
| When a pointer or an ADDR_EXPR escapes, it means that it has been |
| exposed outside of the current function. So, assignment to |
| global variables, function arguments and returning a pointer are |
| all escape sites. |
| |
| This is where we are currently limited. Since not everything is |
| renamed into SSA, we lose track of escape properties when a |
| pointer is stashed inside a field in a structure, for instance. |
| In those cases, we are assuming that the pointer does escape. |
| |
| We use escape analysis to determine whether a variable is |
| call-clobbered. Simply put, if an ADDR_EXPR escapes, then the |
| variable is call-clobbered. If a pointer P_i escapes, then all |
| the variables pointed-to by P_i (and its memory tag) also escape. |
| |
| @item Compute flow-sensitive aliases |
| |
| We have two classes of memory tags. Memory tags associated with |
| the pointed-to data type of the pointers in the program. These |
| tags are called ``type memory tag'' (TMT)@. The other class are |
| those associated with SSA_NAMEs, called ``name memory tag'' (NMT)@. |
| The basic idea is that when adding operands for an INDIRECT_REF |
| *P_i, we will first check whether P_i has a name tag, if it does |
| we use it, because that will have more precise aliasing |
| information. Otherwise, we use the standard type tag. |
| |
| In this phase, we go through all the pointers we found in |
| points-to analysis and create alias sets for the name memory tags |
| associated with each pointer P_i. If P_i escapes, we mark |
| call-clobbered the variables it points to and its tag. |
| |
| |
| @item Compute flow-insensitive aliases |
| |
| This pass will compare the alias set of every type memory tag and |
| every addressable variable found in the program. Given a type |
| memory tag TMT and an addressable variable V@. If the alias sets |
| of TMT and V conflict (as computed by may_alias_p), then V is |
| marked as an alias tag and added to the alias set of TMT@. |
| @end enumerate |
| |
| For instance, consider the following function: |
| |
| @smallexample |
| foo (int i) |
| @{ |
| int *p, *q, a, b; |
| |
| if (i > 10) |
| p = &a; |
| else |
| q = &b; |
| |
| *p = 3; |
| *q = 5; |
| a = b + 2; |
| return *p; |
| @} |
| @end smallexample |
| |
| After aliasing analysis has finished, the type memory tag for |
| pointer @code{p} will have two aliases, namely variables @code{a} and |
| @code{b}. |
| Every time pointer @code{p} is dereferenced, we want to mark the |
| operation as a potential reference to @code{a} and @code{b}. |
| |
| @smallexample |
| foo (int i) |
| @{ |
| int *p, a, b; |
| |
| if (i_2 > 10) |
| p_4 = &a; |
| else |
| p_6 = &b; |
| # p_1 = PHI <p_4(1), p_6(2)>; |
| |
| # a_7 = V_MAY_DEF <a_3>; |
| # b_8 = V_MAY_DEF <b_5>; |
| *p_1 = 3; |
| |
| # a_9 = V_MAY_DEF <a_7> |
| # VUSE <b_8> |
| a_9 = b_8 + 2; |
| |
| # VUSE <a_9>; |
| # VUSE <b_8>; |
| return *p_1; |
| @} |
| @end smallexample |
| |
| In certain cases, the list of may aliases for a pointer may grow |
| too large. This may cause an explosion in the number of virtual |
| operands inserted in the code. Resulting in increased memory |
| consumption and compilation time. |
| |
| When the number of virtual operands needed to represent aliased |
| loads and stores grows too large (configurable with @option{--param |
| max-aliased-vops}), alias sets are grouped to avoid severe |
| compile-time slow downs and memory consumption. The alias |
| grouping heuristic proceeds as follows: |
| |
| @enumerate |
| @item Sort the list of pointers in decreasing number of contributed |
| virtual operands. |
| |
| @item Take the first pointer from the list and reverse the role |
| of the memory tag and its aliases. Usually, whenever an |
| aliased variable Vi is found to alias with a memory tag |
| T, we add Vi to the may-aliases set for T@. Meaning that |
| after alias analysis, we will have: |
| |
| @smallexample |
| may-aliases(T) = @{ V1, V2, V3, ..., Vn @} |
| @end smallexample |
| |
| This means that every statement that references T, will get |
| @code{n} virtual operands for each of the Vi tags. But, when |
| alias grouping is enabled, we make T an alias tag and add it |
| to the alias set of all the Vi variables: |
| |
| @smallexample |
| may-aliases(V1) = @{ T @} |
| may-aliases(V2) = @{ T @} |
| ... |
| may-aliases(Vn) = @{ T @} |
| @end smallexample |
| |
| This has two effects: (a) statements referencing T will only get |
| a single virtual operand, and, (b) all the variables Vi will now |
| appear to alias each other. So, we lose alias precision to |
| improve compile time. But, in theory, a program with such a high |
| level of aliasing should not be very optimizable in the first |
| place. |
| |
| @item Since variables may be in the alias set of more than one |
| memory tag, the grouping done in step (2) needs to be extended |
| to all the memory tags that have a non-empty intersection with |
| the may-aliases set of tag T@. For instance, if we originally |
| had these may-aliases sets: |
| |
| @smallexample |
| may-aliases(T) = @{ V1, V2, V3 @} |
| may-aliases(R) = @{ V2, V4 @} |
| @end smallexample |
| |
| In step (2) we would have reverted the aliases for T as: |
| |
| @smallexample |
| may-aliases(V1) = @{ T @} |
| may-aliases(V2) = @{ T @} |
| may-aliases(V3) = @{ T @} |
| @end smallexample |
| |
| But note that now V2 is no longer aliased with R@. We could |
| add R to may-aliases(V2), but we are in the process of |
| grouping aliases to reduce virtual operands so what we do is |
| add V4 to the grouping to obtain: |
| |
| @smallexample |
| may-aliases(V1) = @{ T @} |
| may-aliases(V2) = @{ T @} |
| may-aliases(V3) = @{ T @} |
| may-aliases(V4) = @{ T @} |
| @end smallexample |
| |
| @item If the total number of virtual operands due to aliasing is |
| still above the threshold set by max-alias-vops, go back to (2). |
| @end enumerate |