| @c Copyright (C) 2002, 2003, 2004 |
| @c Free Software Foundation, Inc. |
| @c This is part of the GCC manual. |
| @c For copying conditions, see the file gcc.texi. |
| |
| @node Type Information |
| @chapter Memory Management and Type Information |
| @cindex GGC |
| @findex GTY |
| |
| GCC uses some fairly sophisticated memory management techniques, which |
| involve determining information about GCC's data structures from GCC's |
| source code and using this information to perform garbage collection and |
| implement precompiled headers. |
| |
| A full C parser would be too complicated for this task, so a limited |
| subset of C is interpreted and special markers are used to determine |
| what parts of the source to look at. All @code{struct} and |
| @code{union} declarations that define data structures that are |
| allocated under control of the garbage collector must be marked. All |
| global variables that hold pointers to garbage-collected memory must |
| also be marked. Finally, all global variables that need to be saved |
| and restored by a precompiled header must be marked. (The precompiled |
| header mechanism can only save static variables if they're scalar. |
| Complex data structures must be allocated in garbage-collected memory |
| to be saved in a precompiled header.) |
| |
| The full format of a marker is |
| @smallexample |
| GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{})) |
| @end smallexample |
| @noindent |
| but in most cases no options are needed. The outer double parentheses |
| are still necessary, though: @code{GTY(())}. Markers can appear: |
| |
| @itemize @bullet |
| @item |
| In a structure definition, before the open brace; |
| @item |
| In a global variable declaration, after the keyword @code{static} or |
| @code{extern}; and |
| @item |
| In a structure field definition, before the name of the field. |
| @end itemize |
| |
| Here are some examples of marking simple data structures and globals. |
| |
| @smallexample |
| struct @var{tag} GTY(()) |
| @{ |
| @var{fields}@dots{} |
| @}; |
| |
| typedef struct @var{tag} GTY(()) |
| @{ |
| @var{fields}@dots{} |
| @} *@var{typename}; |
| |
| static GTY(()) struct @var{tag} *@var{list}; /* @r{points to GC memory} */ |
| static GTY(()) int @var{counter}; /* @r{save counter in a PCH} */ |
| @end smallexample |
| |
| The parser understands simple typedefs such as |
| @code{typedef struct @var{tag} *@var{name};} and |
| @code{typedef int @var{name};}. |
| These don't need to be marked. |
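
The following is a minimal sketch of how such markers sit in ordinary C code, not code taken from GCC. It assumes that @code{GTY} is defined as a macro expanding to nothing, so that only the type machinery sees the marker; the names @code{struct node}, @code{node_list}, @code{node_counter} and @code{push_node} are hypothetical, and @code{malloc} merely stands in for a garbage-collected allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption for this sketch: the marker is invisible to the C
   compiler and is only read by the type machinery.  */
#define GTY(x)

/* A marked structure, with the marker before the open brace.  */
struct node GTY(())
{
  int value;
  struct node *next;
};

/* A simple typedef; it does not need a marker.  */
typedef struct node *node_ptr;

/* Marked globals, with the marker after the keyword `static'.  */
static GTY(()) struct node *node_list;  /* would point to GC memory */
static GTY(()) int node_counter;        /* scalar, could be saved in a PCH */

/* Hypothetical helper: push a new node onto node_list.  malloc is
   only a stand-in here; it is not a collector allocation.  */
static node_ptr
push_node (int value)
{
  node_ptr n = malloc (sizeof (struct node));
  assert (n != NULL);
  n->value = value;
  n->next = node_list;
  node_list = n;
  node_counter++;
  return n;
}
```

Because the macro expands to nothing in this sketch, the compiler sees plain declarations, and the markers only matter to a tool that scans the source text.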
| |
| @menu |
| * GTY Options:: What goes inside a @code{GTY(())}. |
| * GGC Roots:: Making global variables GGC roots. |
| * Files:: How the generated files work. |
| @end menu |
| |
| @node GTY Options |
| @section The Inside of a @code{GTY(())} |
| |
| Sometimes the C code is not enough to fully describe the type |
| structure. Extra information can be provided with @code{GTY} options |
| and additional markers. Some options take a parameter, which may be |
| either a string or a type name, depending on the parameter. If an |
| option takes no parameter, it is acceptable either to omit the |
| parameter entirely, or to provide an empty string as a parameter. For |
| example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are |
| equivalent. |
| |
| When the parameter is a string, often it is a fragment of C code. Four |
| special escapes may be used in these strings, to refer to pieces of |
| the data structure being marked: |
| |
| @cindex % in GTY option |
| @table @code |
| @item %h |
| The current structure. |
| @item %1 |
| The structure that immediately contains the current structure. |
| @item %0 |
| The outermost structure that contains the current structure. |
| @item %a |
| A partial expression of the form @code{[i1][i2]...} that indexes |
| the array item currently being marked. |
| @end table |
| |
| For instance, suppose that you have a structure of the form |
| @smallexample |
| struct A @{ |
| ... |
| @}; |
| struct B @{ |
| struct A foo[12]; |
| @}; |
| @end smallexample |
| @noindent |
| and @code{b} is a variable of type @code{struct B}. When marking |
| @samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]}, |
| @code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a} |
| would expand to @samp{[11]}. |
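
To make the expansion concrete, here is a small sketch that substitutes the four escapes textually. The helper @code{expand_escapes} is hypothetical and only illustrates the expansions described above; it makes no claim about how the type machinery performs the substitution internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy TMPL into OUT (of size OUT_SIZE),
   replacing %h, %1, %0 and %a with the strings H, ONE, ZERO and A.
   Returns 0 on success, or -1 if OUT is too small or an unknown
   escape is seen.  */
static int
expand_escapes (const char *tmpl, const char *h, const char *one,
                const char *zero, const char *a,
                char *out, size_t out_size)
{
  size_t used = 0;

  assert (tmpl != NULL && out != NULL);
  if (out_size == 0)
    return -1;

  while (*tmpl != '\0')
    {
      const char *piece = tmpl;  /* default: copy one literal character */
      size_t len = 1;

      if (*tmpl == '%')
        {
          switch (tmpl[1])
            {
            case 'h': piece = h; break;
            case '1': piece = one; break;
            case '0': piece = zero; break;
            case 'a': piece = a; break;
            default: return -1;
            }
          len = strlen (piece);
          tmpl += 2;
        }
      else
        tmpl += 1;

      /* Leave room for the terminating null character.  */
      if (used + len >= out_size)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }

  out[used] = '\0';
  return 0;
}
```

With the values from the example above (@samp{b.foo[11]} for @code{%h}, @samp{b} for @code{%1} and @code{%0}, and @samp{[11]} for @code{%a}), the string @samp{%h.num_elem} becomes @samp{b.foo[11].num_elem}.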
| |
| As in ordinary C, adjacent strings will be concatenated; this is |
| helpful when you have a complicated expression. |
| @smallexample |
| @group |
| GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE" |
| " ? TYPE_NEXT_VARIANT (&%h.generic)" |
| " : TREE_CHAIN (&%h.generic)"))) |
| @end group |
| @end smallexample |
| |
| The available options are: |
| |
| @table @code |
| @findex length |
| @item length ("@var{expression}") |
| |
| There are two places the type machinery will need to be explicitly told |
| the length of an array. The first case is when a structure ends in a |
| variable-length array, like this: |
| @smallexample |
| struct rtvec_def GTY(()) @{ |
| int num_elem; /* @r{number of elements} */ |
| rtx GTY ((length ("%h.num_elem"))) elem[1]; |
| @}; |
| @end smallexample |
| |
| In this case, the @code{length} option is used to override the specified |
| array length (which should usually be @code{1}). The parameter of the |
| option is a fragment of C code that calculates the length. |
| |
| The second case is when a structure or a global variable contains a |
| pointer to an array, like this: |
| @smallexample |
| tree * |
| GTY ((length ("%h.regno_pointer_align_length"))) regno_decl; |
| @end smallexample |
| In this case, @code{regno_decl} has been allocated by writing something like |
| @smallexample |
| x->regno_decl = |
| ggc_alloc (x->regno_pointer_align_length * sizeof (tree)); |
| @end smallexample |
| @noindent |
| and the @code{length} provides the length of the field. |
| |
| This second use of @code{length} also works on global variables, like: |
| @verbatim |
| static GTY((length ("reg_base_value_size"))) |
| rtx *reg_base_value; |
| @end verbatim |
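
As a rough illustration of why the length is needed, the sketch below hand-writes a walker for a pointer-to-array field. It is not the code the type machinery generates; @code{struct item}, @code{struct item_table} and @code{mark_item_table} are hypothetical names, and @code{GTY} is assumed to expand to nothing so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

#define GTY(x)  /* assumption: invisible to the C compiler */

struct item GTY(())
{
  int marked;
};

struct item_table GTY(())
{
  size_t count;  /* number of elements in `items' */
  struct item ** GTY ((length ("%h.count"))) items;
};

/* Hypothetical hand-written walker: visit each element of the array,
   bounded by the same value that the length expression names.
   Returns the number of non-null elements it marked.  */
static size_t
mark_item_table (struct item_table *t)
{
  size_t i, n = 0;

  assert (t != NULL);
  for (i = 0; i < t->count; i++)
    if (t->items[i] != NULL)
      {
        t->items[i]->marked = 1;
        n++;
      }
  return n;
}
```

Without the bound supplied by @code{count}, a walker has no way to know how many elements follow the pointer; that is the information the @code{length} option supplies.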
| |
| @findex skip |
| @item skip |
| |
| If @code{skip} is applied to a field, the type machinery will ignore it. |
| This is somewhat dangerous; the only safe use is in a union when one |
| field really isn't ever used. |
| |
| @findex desc |
| @findex tag |
| @findex default |
| @item desc ("@var{expression}") |
| @itemx tag ("@var{constant}") |
| @itemx default |
| |
| The type machinery needs to be told which field of a @code{union} is |
| currently active. This is done by giving each field a constant |
| @code{tag} value, and then specifying a discriminator using @code{desc}. |
| The value of the expression given by @code{desc} is compared against |
| each @code{tag} value, each of which should be different. If no |
| @code{tag} is matched, the field marked with @code{default} is used if |
| there is one, otherwise no field in the union will be marked. |
| |
| In the @code{desc} option, the ``current structure'' is the union that |
| it discriminates. Use @code{%1} to mean the structure containing it. |
| There are no escapes available to the @code{tag} option, since it is a |
| constant. |
| |
| For example, |
| @smallexample |
| struct tree_binding GTY(()) |
| @{ |
| struct tree_common common; |
| union tree_binding_u @{ |
| tree GTY ((tag ("0"))) scope; |
| struct cp_binding_level * GTY ((tag ("1"))) level; |
| @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope; |
| tree value; |
| @}; |
| @end smallexample |
| |
| In this example, the value of @code{BINDING_HAS_LEVEL_P} when applied to a |
| @code{struct tree_binding *} is presumed to be 0 or 1. If 1, the type |
| mechanism will treat the field @code{level} as being present and if 0, |
| will treat the field @code{scope} as being present. |
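
The selection logic can be sketched by hand as a @code{switch} on the discriminator. This is only an illustration under stated assumptions: @code{struct binding}, its @code{has_level} field and @code{mark_binding} are hypothetical, @code{GTY} is assumed to expand to nothing, and the code is not what the type machinery emits.

```c
#include <assert.h>
#include <stddef.h>

#define GTY(x)  /* assumption: invisible to the C compiler */

struct scope_info GTY(()) { int marked; };
struct level_info GTY(()) { int marked; };

struct binding GTY(())
{
  int has_level;  /* discriminator, expected to be 0 or 1 */
  union binding_u {
    struct scope_info * GTY ((tag ("0"))) scope;
    struct level_info * GTY ((tag ("1"))) level;
  } GTY ((desc ("%1.has_level"))) u;
};

/* Hypothetical hand-written walker: compare the discriminator with
   each tag and mark only the matching field.  Returns the tag that
   matched, or -1 if none did (there is no `default' field here, so
   nothing is marked in that case).  */
static int
mark_binding (struct binding *b)
{
  assert (b != NULL);
  switch (b->has_level)
    {
    case 0:
      if (b->u.scope != NULL)
        b->u.scope->marked = 1;
      return 0;
    case 1:
      if (b->u.level != NULL)
        b->u.level->marked = 1;
      return 1;
    default:
      return -1;
    }
}
```

The @code{desc} string uses @code{%1} because, inside @code{desc}, the current structure is the union itself and the discriminator lives in the structure that contains it.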
| |
| @findex param_is |
| @findex use_param |
| @item param_is (@var{type}) |
| @itemx use_param |
| |
| Sometimes it's convenient to define some data structure to work on |
| generic pointers (that is, @code{PTR}) and then use it with a specific |
| type. @code{param_is} specifies the real type pointed to, and |
| @code{use_param} says where in the generic data structure that type |
| should be put. |
| |
| For instance, to have a @code{htab_t} that points to trees, one would |
| write the definition of @code{htab_t} like this: |
| @smallexample |
| typedef struct GTY(()) @{ |
| @dots{} |
| void ** GTY ((use_param, @dots{})) entries; |
| @dots{} |
| @} htab_t; |
| @end smallexample |
| @noindent |
| and then declare variables like this: |
| @smallexample |
| static htab_t GTY ((param_is (union tree_node))) ict; |
| @end smallexample |
| |
| @findex param@var{n}_is |
| @findex use_param@var{n} |
| @item param@var{n}_is (@var{type}) |
| @itemx use_param@var{n} |
| |
| In more complicated cases, the data structure might need to work on |
| several different types, which might not necessarily all be pointers. |
| For this, @code{param1_is} through @code{param9_is} may be used to |
| specify the real type of a field identified by @code{use_param1} through |
| @code{use_param9}. |
| |
| @findex use_params |
| @item use_params |
| |
| When a structure contains another structure that is parameterized, |
| there's no need to do anything special; the inner structure inherits the |
| parameters of the outer one. When a structure contains a pointer to a |
| parameterized structure, the type machinery won't automatically detect |
| this (it could, it just doesn't yet), so it's necessary to tell it that |
| the pointed-to structure should use the same parameters as the outer |
| structure. This is done by marking the pointer with the |
| @code{use_params} option. |
| |
| @findex deletable |
| @item deletable |
| |
| @code{deletable}, when applied to a global variable, indicates that when |
| garbage collection runs, there's no need to mark anything pointed to |
| by this variable; it can just be set to @code{NULL} instead. This is used |
| to keep a list of free structures around for re-use. |
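
A free list of this kind might look like the sketch below. The names @code{struct cell}, @code{free_cells}, @code{release_cell} and @code{reuse_cell} are hypothetical, @code{GTY} is assumed to expand to nothing, and @code{simulate_collection} only imitates the effect described above; it is not a GGC entry point.

```c
#include <assert.h>
#include <stddef.h>

#define GTY(x)  /* assumption: invisible to the C compiler */

struct cell GTY(())
{
  struct cell *next;
  int payload;
};

/* Cache of cells available for re-use.  Nothing else depends on its
   contents, so losing them at collection time is harmless.  */
static GTY ((deletable)) struct cell *free_cells;

/* Put C on the free list.  */
static void
release_cell (struct cell *c)
{
  assert (c != NULL);
  c->next = free_cells;
  free_cells = c;
}

/* Take a cell from the free list, or return NULL if it is empty, in
   which case the caller has to allocate a fresh cell.  */
static struct cell *
reuse_cell (void)
{
  struct cell *c = free_cells;
  if (c != NULL)
    free_cells = c->next;
  return c;
}

/* Hypothetical stand-in for the effect of a collection on a
   deletable root: the variable is simply cleared.  */
static void
simulate_collection (void)
{
  free_cells = NULL;
}
```

The important property is that every user of the cache copes with finding it empty, since a collection may clear it at any time.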
| |
| @findex if_marked |
| @item if_marked ("@var{expression}") |
| |
| Suppose you want some kinds of object to be unique, and so you put them |
| in a hash table. If garbage collection marks the hash table, these |
| objects will never be freed, even if the last other reference to them |
| goes away. GGC has special handling to deal with this: if you use the |
| @code{if_marked} option on a global hash table, GGC will call the |
| routine whose name is the parameter to the option on each hash table |
| entry. If the routine returns nonzero, the hash table entry will |
| be marked as usual. If the routine returns zero, the hash table entry |
| will be deleted. |
| |
| The routine @code{ggc_marked_p} can be used to determine if an element |
| has been marked already; in fact, the usual case is to use |
| @code{if_marked ("ggc_marked_p")}. |
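
The effect on the table can be pictured with the following sketch, which sweeps an array of entries with a predicate. Everything here is hypothetical: @code{struct object}, @code{object_marked_p} (a stand-in playing the role of the routine named by the option) and @code{sweep_entries} are made up for illustration and do not reflect how GGC stores or traverses hash tables.

```c
#include <assert.h>
#include <stddef.h>

struct object
{
  int marked;  /* set when something else still references the object */
  int key;
};

/* Hypothetical predicate in the role of the routine named by
   if_marked: nonzero means "keep this entry".  */
static int
object_marked_p (const void *p)
{
  return ((const struct object *) p)->marked != 0;
}

/* Hypothetical sweep over N table slots: entries for which KEEP_P
   returns zero are deleted (set to NULL); the others are left in
   place.  Returns the number of entries deleted.  */
static size_t
sweep_entries (struct object **entries, size_t n,
               int (*keep_p) (const void *))
{
  size_t i, deleted = 0;

  assert (entries != NULL && keep_p != NULL);
  for (i = 0; i < n; i++)
    if (entries[i] != NULL && !keep_p (entries[i]))
      {
        entries[i] = NULL;
        deleted++;
      }
  return deleted;
}
```

An object that is referenced only from the table is never marked by anything else, so the predicate returns zero for it and its entry disappears.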
| |
| @findex maybe_undef |
| @item maybe_undef |
| |
| When applied to a field, @code{maybe_undef} indicates that it's OK if |
| the structure that this field points to is never defined, so long as |
| this field is always @code{NULL}. This is used to avoid requiring |
| back ends to define certain optional structures. It doesn't work with |
| language front ends. |
| |
| @findex nested_ptr |
| @item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}") |
| |
| The type machinery expects all pointers to point to the start of an |
| object. Sometimes for abstraction purposes it's convenient to have |
| a pointer which points inside an object. So long as it's possible to |
| convert the original object to and from the pointer, such pointers |
| can still be used. @var{type} is the type of the original object, |
| the @var{to expression} returns the pointer given the original object, |
| and the @var{from expression} returns the original object given |
| the pointer. The pointer will be available using the @code{%h} |
| escape. |
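
The two conversions can be sketched in plain C with @code{offsetof}. The names @code{struct wrapper}, @code{payload_of} and @code{wrapper_of} are hypothetical; the sketch shows only the pair of expressions that such an option would need, not the option syntax as used in any real GCC source file.

```c
#include <assert.h>
#include <stddef.h>

struct header
{
  int refs;
};

/* The original object; pointers handed out to clients point at the
   `payload' member, not at the start of the object.  */
struct wrapper
{
  struct header hdr;
  int payload;
};

/* The "to" direction: the interior pointer, given the original
   object.  */
static int *
payload_of (struct wrapper *w)
{
  assert (w != NULL);
  return &w->payload;
}

/* The "from" direction: the original object, given the interior
   pointer.  This is valid only for pointers obtained from
   payload_of.  */
static struct wrapper *
wrapper_of (int *p)
{
  assert (p != NULL);
  return (struct wrapper *) ((char *) p - offsetof (struct wrapper, payload));
}
```

The two functions must be exact inverses of each other, otherwise the machinery would end up looking at the wrong address.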
| |
| @findex chain_next |
| @findex chain_prev |
| @item chain_next ("@var{expression}") |
| @itemx chain_prev ("@var{expression}") |
| |
| It's helpful for the type machinery to know if objects are often |
| chained together in long lists; this lets it generate code that uses |
| less stack space by iterating along the list instead of recursing down |
| it. @code{chain_next} is an expression for the next item in the list, |
| @code{chain_prev} is an expression for the previous item. For singly |
| linked lists, use only @code{chain_next}; for doubly linked lists, use |
| both. The machinery requires that taking the next item of the |
| previous item gives the original item. |
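
The difference between recursing and iterating can be seen in this sketch of a hand-written marker for a doubly linked list. It is an illustration only: @code{struct link}, @code{mark_chain} and @code{chain_consistent_p} are hypothetical, @code{GTY} is assumed to expand to nothing, and the generated code may look quite different.

```c
#include <assert.h>
#include <stddef.h>

#define GTY(x)  /* assumption: invisible to the C compiler */

struct link GTY ((chain_next ("%h.next"), chain_prev ("%h.prev")))
{
  struct link *next;
  struct link *prev;
  int marked;
};

/* Hypothetical iterative marker: follow `next' in a loop, so the
   stack depth stays constant however long the list is.  Stops at
   the first item that is already marked.  Returns the number of
   items newly marked.  */
static size_t
mark_chain (struct link *x)
{
  size_t n = 0;

  for (; x != NULL && !x->marked; x = x->next)
    {
      x->marked = 1;
      n++;
    }
  return n;
}

/* Check the stated requirement: for every item that has a previous
   item, the next item of that previous item is the item itself.  */
static int
chain_consistent_p (const struct link *x)
{
  for (; x != NULL; x = x->next)
    if (x->prev != NULL && x->prev->next != x)
      return 0;
  return 1;
}
```

A recursive marker would use one stack frame per list element, which is exactly what the chain options let the machinery avoid.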
| |
| @findex reorder |
| @item reorder ("@var{function name}") |
| |
| Some data structures depend on the relative ordering of pointers. If |
| the precompiled header machinery needs to change that ordering, it |
| will call the function referenced by the @code{reorder} option, before |
| changing the pointers in the object that's pointed to by the field the |
| option applies to. The function must take four arguments, with the |
| signature @samp{@w{void *, void *, gt_pointer_operator, void *}}. |
| The first parameter is a pointer to the structure that contains the |
| object being updated, or the object itself if there is no containing |
| structure. The second parameter is a cookie that should be ignored. |
| The third parameter is a routine that, given a pointer, will update it |
| to its correct new value. The fourth parameter is a cookie that must |
| be passed as the second argument to that routine. |
| |
| PCH cannot handle data structures that depend on the absolute values |
| of pointers. @code{reorder} functions can be expensive. When |
| possible, it is better to depend on properties of the data, like an ID |
| number or the hash of a string instead. |
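
A function with this shape might be sketched as follows, for an array that is kept sorted by pointer value. Several things here are assumptions rather than facts taken from this chapter: the @code{typedef} for @code{gt_pointer_operator} (taking the address of a pointer and a cookie), the @code{void} return type, and the names @code{struct sorted_ptrs}, @code{resort_sorted_ptrs}, @code{struct mirror_map} and @code{mirror_operator}. The last two exist only so that the sketch can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the operator receives the address of a pointer and a
   cookie, and overwrites the pointer with its new value.  */
typedef void (*gt_pointer_operator) (void *, void *);

struct sorted_ptrs
{
  size_t len;
  void *elts[8];  /* kept sorted by pointer value */
};

/* qsort has no user-data argument, so the operator and its cookie
   are passed to the comparison function through file-scope state.  */
static gt_pointer_operator resort_op;
static void *resort_cookie;

/* Compare two elements by the values they WILL have; only local
   copies are updated, the array itself keeps the old pointers.  */
static int
compare_new_values (const void *a, const void *b)
{
  void *pa = *(void *const *) a;
  void *pb = *(void *const *) b;

  resort_op (&pa, resort_cookie);
  resort_op (&pb, resort_cookie);
  return ((uintptr_t) pa > (uintptr_t) pb) - ((uintptr_t) pa < (uintptr_t) pb);
}

/* Hypothetical reorder function: the second parameter is ignored and
   the fourth is handed on as the second argument of OP.  */
static void
resort_sorted_ptrs (void *obj, void *orig_obj,
                    gt_pointer_operator op, void *cookie)
{
  struct sorted_ptrs *sp = obj;

  (void) orig_obj;
  assert (sp != NULL && op != NULL);
  resort_op = op;
  resort_cookie = cookie;
  qsort (sp->elts, sp->len, sizeof (void *), compare_new_values);
}

/* Toy operator for trying the sketch: maps each address in a small
   pool to its mirror image, which reverses the relative order.  */
struct mirror_map
{
  char *base;
  size_t size;
};

static void
mirror_operator (void *ptr_p, void *cookie)
{
  void **p = ptr_p;
  struct mirror_map *m = cookie;
  char *old = *p;

  *p = m->base + (m->size - 1 - (size_t) (old - m->base));
}
```

After the call, the array still holds the old pointer values, but in the order that will be correct once every pointer has been updated.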
| |
| @findex special |
| @item special ("@var{name}") |
| |
| The @code{special} option is used to mark types that have to be dealt |
| with by special case machinery. The parameter is the name of the |
| special case. See @file{gengtype.c} for further details. Avoid |
| adding new special cases unless there is no other alternative. |
| @end table |
| |
| @node GGC Roots |
| @section Marking Roots for the Garbage Collector |
| @cindex roots, marking |
| @cindex marking roots |
| |
| In addition to keeping track of types, the type machinery also locates |
| the global variables (@dfn{roots}) that the garbage collector starts |
| at. Roots must be declared using one of the following syntaxes: |
| |
| @itemize @bullet |
| @item |
| @code{extern GTY(([@var{options}])) @var{type} @var{name};} |
| @item |
| @code{static GTY(([@var{options}])) @var{type} @var{name};} |
| @end itemize |
| @noindent |
| The syntax |
| @itemize @bullet |
| @item |
| @code{GTY(([@var{options}])) @var{type} @var{name};} |
| @end itemize |
| @noindent |
| is @emph{not} accepted. There should be an @code{extern} declaration |
| of such a variable in a header somewhere---mark that, not the |
| definition. Or, if the variable is only used in one file, make it |
| @code{static}. |
| |
| @node Files |
| @section Source Files Containing Type Information |
| @cindex generated files |
| @cindex files, generated |
| |
| Whenever you add @code{GTY} markers to a source file that previously |
| had none, or create a new source file containing @code{GTY} markers, |
| there are two things you need to do: |
| |
| @enumerate |
| @item |
| You need to add the file to the list of source files the type |
| machinery scans. There are four cases: |
| |
| @enumerate a |
| @item |
| For a back-end file, this is usually done |
| automatically; if not, you should add it to @code{target_gtfiles} in |
| the appropriate port's entries in @file{config.gcc}. |
| |
| @item |
| For files shared by all front ends, add the filename to the |
| @code{GTFILES} variable in @file{Makefile.in}. |
| |
| @item |
| For files that are part of one front end, add the filename to the |
| @code{gtfiles} variable defined in the appropriate |
| @file{config-lang.in}. For C, the file is @file{c-config-lang.in}. |
| |
| @item |
| For files that are part of some but not all front ends, add the |
| filename to the @code{gtfiles} variable of @emph{all} the front ends |
| that use it. |
| @end enumerate |
| |
| @item |
| If the file was a header file, you'll need to check that it's included |
| in the right place to be visible to the generated files. For a back-end |
| header file, this should be done automatically. For a front-end header |
| file, it needs to be included by the same file that includes |
| @file{gtype-@var{lang}.h}. For other header files, it needs to be |
| included in @file{gtype-desc.c}, which is a generated file, so add it to |
| @code{ifiles} in @code{open_base_file} in @file{gengtype.c}. |
| |
| For source files that aren't header files, the machinery will generate a |
| header file that should be included in the source file you just changed. |
| The file will be called @file{gt-@var{path}.h} where @var{path} is the |
| pathname relative to the @file{gcc} directory with slashes replaced by |
| @verb{|-|}, so for example the header file to be included in |
| @file{cp/parser.c} is called @file{gt-cp-parser.h}. The |
| generated header file should be included after everything else in the |
| source file. Don't forget to mention this file as a dependency in the |
| @file{Makefile}! |
| |
| @end enumerate |
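
The naming rule for the generated header can be written down as a short function. This is a sketch based only on the rule and the @file{cp/parser.c} example above; in particular, dropping the source file's extension is inferred from that example, and the helper @code{gt_header_name} is hypothetical rather than part of @file{gengtype.c}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the name of the generated header for
   PATH (relative to the gcc directory) into OUT.  Returns 0 on
   success, or -1 if the name does not fit in OUT_SIZE bytes.  */
static int
gt_header_name (const char *path, char *out, size_t out_size)
{
  const char *dot, *slash;
  size_t stem_len, i;
  int n;

  assert (path != NULL && out != NULL);
  dot = strrchr (path, '.');
  slash = strrchr (path, '/');

  /* Treat the dot as an extension separator only if it appears in
     the last path component.  */
  if (dot == NULL || (slash != NULL && dot < slash))
    dot = path + strlen (path);
  stem_len = (size_t) (dot - path);

  n = snprintf (out, out_size, "gt-%.*s.h", (int) stem_len, path);
  if (n < 0 || (size_t) n >= out_size)
    return -1;

  /* Replace the slashes in the copied path; the stem starts after
     the three characters of the "gt-" prefix.  */
  for (i = 3; i < 3 + stem_len; i++)
    if (out[i] == '/')
      out[i] = '-';
  return 0;
}
```

For @file{cp/parser.c} this yields @file{gt-cp-parser.h}, which is the name to use in the @code{#include} at the end of that source file.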
| |
| For language front ends, there is another file that needs to be included |
| somewhere. It will be called @file{gtype-@var{lang}.h}, where |
| @var{lang} is the name of the subdirectory the language is contained in. |