1@c Copyright (C) 2002, 2003, 2004, 2007, 2008, 2009 2@c Free Software Foundation, Inc. 3@c This is part of the GCC manual. 4@c For copying conditions, see the file gcc.texi. 5 6@node Type Information 7@chapter Memory Management and Type Information 8@cindex GGC 9@findex GTY 10 11GCC uses some fairly sophisticated memory management techniques, which 12involve determining information about GCC's data structures from GCC's 13source code and using this information to perform garbage collection and 14implement precompiled headers. 15 16A full C parser would be too complicated for this task, so a limited 17subset of C is interpreted and special markers are used to determine 18what parts of the source to look at. All @code{struct} and 19@code{union} declarations that define data structures that are 20allocated under control of the garbage collector must be marked. All 21global variables that hold pointers to garbage-collected memory must 22also be marked. Finally, all global variables that need to be saved 23and restored by a precompiled header must be marked. (The precompiled 24header mechanism can only save static variables if they're scalar. 25Complex data structures must be allocated in garbage-collected memory 26to be saved in a precompiled header.) 27 28The full format of a marker is 29@smallexample 30GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{})) 31@end smallexample 32@noindent 33but in most cases no options are needed. The outer double parentheses 34are still necessary, though: @code{GTY(())}. Markers can appear: 35 36@itemize @bullet 37@item 38In a structure definition, before the open brace; 39@item 40In a global variable declaration, after the keyword @code{static} or 41@code{extern}; and 42@item 43In a structure field definition, before the name of the field. 44@end itemize 45 46Here are some examples of marking simple data structures and globals. 47 48@smallexample 49struct GTY(()) @var{tag} 50@{ 51 @var{fields}@dots{} 52@}; 53 54typedef struct GTY(()) @var{tag} 55@{ 56 @var{fields}@dots{} 57@} *@var{typename}; 58 59static GTY(()) struct @var{tag} *@var{list}; /* @r{points to GC memory} */ 60static GTY(()) int @var{counter}; /* @r{save counter in a PCH} */ 61@end smallexample 62 63The parser understands simple typedefs such as 64@code{typedef struct @var{tag} *@var{name};} and 65@code{typedef int @var{name};}. 66These don't need to be marked. 67 68@menu 69* GTY Options:: What goes inside a @code{GTY(())}. 70* GGC Roots:: Making global variables GGC roots. 71* Files:: How the generated files work. 72* Invoking the garbage collector:: How to invoke the garbage collector. 73@end menu 74 75@node GTY Options 76@section The Inside of a @code{GTY(())} 77 78Sometimes the C code is not enough to fully describe the type 79structure. Extra information can be provided with @code{GTY} options 80and additional markers. Some options take a parameter, which may be 81either a string or a type name, depending on the parameter. If an 82option takes no parameter, it is acceptable either to omit the 83parameter entirely, or to provide an empty string as a parameter. For 84example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are 85equivalent. 86 87When the parameter is a string, often it is a fragment of C code. Four 88special escapes may be used in these strings, to refer to pieces of 89the data structure being marked: 90 91@cindex % in GTY option 92@table @code 93@item %h 94The current structure. 95@item %1 96The structure that immediately contains the current structure. 97@item %0 98The outermost structure that contains the current structure. 99@item %a 100A partial expression of the form @code{[i1][i2]@dots{}} that indexes 101the array item currently being marked. 102@end table 103 104For instance, suppose that you have a structure of the form 105@smallexample 106struct A @{ 107 @dots{} 108@}; 109struct B @{ 110 struct A foo[12]; 111@}; 112@end smallexample 113@noindent 114and @code{b} is a variable of type @code{struct B}. When marking 115@samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]}, 116@code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a} 117would expand to @samp{[11]}. 118 119As in ordinary C, adjacent strings will be concatenated; this is 120helpful when you have a complicated expression. 121@smallexample 122@group 123GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE" 124 " ? TYPE_NEXT_VARIANT (&%h.generic)" 125 " : TREE_CHAIN (&%h.generic)"))) 126@end group 127@end smallexample 128 129The available options are: 130 131@table @code 132@findex length 133@item length ("@var{expression}") 134 135There are two places the type machinery will need to be explicitly told 136the length of an array. The first case is when a structure ends in a 137variable-length array, like this: 138@smallexample 139struct GTY(()) rtvec_def @{ 140 int num_elem; /* @r{number of elements} */ 141 rtx GTY ((length ("%h.num_elem"))) elem[1]; 142@}; 143@end smallexample 144 145In this case, the @code{length} option is used to override the specified 146array length (which should usually be @code{1}). The parameter of the 147option is a fragment of C code that calculates the length. 148 149The second case is when a structure or a global variable contains a 150pointer to an array, like this: 151@smallexample 152tree * 153 GTY ((length ("%h.regno_pointer_align_length"))) regno_decl; 154@end smallexample 155In this case, @code{regno_decl} has been allocated by writing something like 156@smallexample 157 x->regno_decl = 158 ggc_alloc (x->regno_pointer_align_length * sizeof (tree)); 159@end smallexample 160and the @code{length} provides the length of the field. 161 162This second use of @code{length} also works on global variables, like: 163@verbatim 164 static GTY((length ("reg_base_value_size"))) 165 rtx *reg_base_value; 166@end verbatim 167 168@findex skip 169@item skip 170 171If @code{skip} is applied to a field, the type machinery will ignore it. 172This is somewhat dangerous; the only safe use is in a union when one 173field really isn't ever used. 174 175@findex desc 176@findex tag 177@findex default 178@item desc ("@var{expression}") 179@itemx tag ("@var{constant}") 180@itemx default 181 182The type machinery needs to be told which field of a @code{union} is 183currently active. This is done by giving each field a constant 184@code{tag} value, and then specifying a discriminator using @code{desc}. 185The value of the expression given by @code{desc} is compared against 186each @code{tag} value, each of which should be different. If no 187@code{tag} is matched, the field marked with @code{default} is used if 188there is one, otherwise no field in the union will be marked. 189 190In the @code{desc} option, the ``current structure'' is the union that 191it discriminates. Use @code{%1} to mean the structure containing it. 192There are no escapes available to the @code{tag} option, since it is a 193constant. 194 195For example, 196@smallexample 197struct GTY(()) tree_binding 198@{ 199 struct tree_common common; 200 union tree_binding_u @{ 201 tree GTY ((tag ("0"))) scope; 202 struct cp_binding_level * GTY ((tag ("1"))) level; 203 @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope; 204 tree value; 205@}; 206@end smallexample 207 208In this example, the value of BINDING_HAS_LEVEL_P when applied to a 209@code{struct tree_binding *} is presumed to be 0 or 1. If 1, the type 210mechanism will treat the field @code{level} as being present and if 0, 211will treat the field @code{scope} as being present. 212 213@findex param_is 214@findex use_param 215@item param_is (@var{type}) 216@itemx use_param 217 218Sometimes it's convenient to define some data structure to work on 219generic pointers (that is, @code{PTR}) and then use it with a specific 220type. @code{param_is} specifies the real type pointed to, and 221@code{use_param} says where in the generic data structure that type 222should be put. 223 224For instance, to have a @code{htab_t} that points to trees, one would 225write the definition of @code{htab_t} like this: 226@smallexample 227typedef struct GTY(()) @{ 228 @dots{} 229 void ** GTY ((use_param, @dots{})) entries; 230 @dots{} 231@} htab_t; 232@end smallexample 233and then declare variables like this: 234@smallexample 235 static htab_t GTY ((param_is (union tree_node))) ict; 236@end smallexample 237 238@findex param@var{n}_is 239@findex use_param@var{n} 240@item param@var{n}_is (@var{type}) 241@itemx use_param@var{n} 242 243In more complicated cases, the data structure might need to work on 244several different types, which might not necessarily all be pointers. 245For this, @code{param1_is} through @code{param9_is} may be used to 246specify the real type of a field identified by @code{use_param1} through 247@code{use_param9}. 248 249@findex use_params 250@item use_params 251 252When a structure contains another structure that is parameterized, 253there's no need to do anything special, the inner structure inherits the 254parameters of the outer one. When a structure contains a pointer to a 255parameterized structure, the type machinery won't automatically detect 256this (it could, it just doesn't yet), so it's necessary to tell it that 257the pointed-to structure should use the same parameters as the outer 258structure. This is done by marking the pointer with the 259@code{use_params} option. 260 261@findex deletable 262@item deletable 263 264@code{deletable}, when applied to a global variable, indicates that when 265garbage collection runs, there's no need to mark anything pointed to 266by this variable, it can just be set to @code{NULL} instead. This is used 267to keep a list of free structures around for re-use. 268 269@findex if_marked 270@item if_marked ("@var{expression}") 271 272Suppose you want some kinds of object to be unique, and so you put them 273in a hash table. If garbage collection marks the hash table, these 274objects will never be freed, even if the last other reference to them 275goes away. GGC has special handling to deal with this: if you use the 276@code{if_marked} option on a global hash table, GGC will call the 277routine whose name is the parameter to the option on each hash table 278entry. If the routine returns nonzero, the hash table entry will 279be marked as usual. If the routine returns zero, the hash table entry 280will be deleted. 281 282The routine @code{ggc_marked_p} can be used to determine if an element 283has been marked already; in fact, the usual case is to use 284@code{if_marked ("ggc_marked_p")}. 285 286@findex mark_hook 287@item mark_hook ("@var{hook-routine-name}") 288 289If provided for a structure or union type, the given 290@var{hook-routine-name} (between double-quotes) is the name of a 291routine called when the garbage collector has just marked the data as 292reachable. This routine should not change the data, or call any ggc 293routine. Its only argument is a pointer to the just marked (const) 294structure or union. 295 296@findex maybe_undef 297@item maybe_undef 298 299When applied to a field, @code{maybe_undef} indicates that it's OK if 300the structure that this fields points to is never defined, so long as 301this field is always @code{NULL}. This is used to avoid requiring 302backends to define certain optional structures. It doesn't work with 303language frontends. 304 305@findex nested_ptr 306@item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}") 307 308The type machinery expects all pointers to point to the start of an 309object. Sometimes for abstraction purposes it's convenient to have 310a pointer which points inside an object. So long as it's possible to 311convert the original object to and from the pointer, such pointers 312can still be used. @var{type} is the type of the original object, 313the @var{to expression} returns the pointer given the original object, 314and the @var{from expression} returns the original object given 315the pointer. The pointer will be available using the @code{%h} 316escape. 317 318@findex chain_next 319@findex chain_prev 320@findex chain_circular 321@item chain_next ("@var{expression}") 322@itemx chain_prev ("@var{expression}") 323@itemx chain_circular ("@var{expression}") 324 325It's helpful for the type machinery to know if objects are often 326chained together in long lists; this lets it generate code that uses 327less stack space by iterating along the list instead of recursing down 328it. @code{chain_next} is an expression for the next item in the list, 329@code{chain_prev} is an expression for the previous item. For singly 330linked lists, use only @code{chain_next}; for doubly linked lists, use 331both. The machinery requires that taking the next item of the 332previous item gives the original item. @code{chain_circular} is similar 333to @code{chain_next}, but can be used for circular single linked lists. 334 335@findex reorder 336@item reorder ("@var{function name}") 337 338Some data structures depend on the relative ordering of pointers. If 339the precompiled header machinery needs to change that ordering, it 340will call the function referenced by the @code{reorder} option, before 341changing the pointers in the object that's pointed to by the field the 342option applies to. The function must take four arguments, with the 343signature @samp{@w{void *, void *, gt_pointer_operator, void *}}. 344The first parameter is a pointer to the structure that contains the 345object being updated, or the object itself if there is no containing 346structure. The second parameter is a cookie that should be ignored. 347The third parameter is a routine that, given a pointer, will update it 348to its correct new value. The fourth parameter is a cookie that must 349be passed to the second parameter. 350 351PCH cannot handle data structures that depend on the absolute values 352of pointers. @code{reorder} functions can be expensive. When 353possible, it is better to depend on properties of the data, like an ID 354number or the hash of a string instead. 355 356@findex special 357@item special ("@var{name}") 358 359The @code{special} option is used to mark types that have to be dealt 360with by special case machinery. The parameter is the name of the 361special case. See @file{gengtype.c} for further details. Avoid 362adding new special cases unless there is no other alternative. 363@end table 364 365@node GGC Roots 366@section Marking Roots for the Garbage Collector 367@cindex roots, marking 368@cindex marking roots 369 370In addition to keeping track of types, the type machinery also locates 371the global variables (@dfn{roots}) that the garbage collector starts 372at. Roots must be declared using one of the following syntaxes: 373 374@itemize @bullet 375@item 376@code{extern GTY(([@var{options}])) @var{type} @var{name};} 377@item 378@code{static GTY(([@var{options}])) @var{type} @var{name};} 379@end itemize 380@noindent 381The syntax 382@itemize @bullet 383@item 384@code{GTY(([@var{options}])) @var{type} @var{name};} 385@end itemize 386@noindent 387is @emph{not} accepted. There should be an @code{extern} declaration 388of such a variable in a header somewhere---mark that, not the 389definition. Or, if the variable is only used in one file, make it 390@code{static}. 391 392@node Files 393@section Source Files Containing Type Information 394@cindex generated files 395@cindex files, generated 396 397Whenever you add @code{GTY} markers to a source file that previously 398had none, or create a new source file containing @code{GTY} markers, 399there are three things you need to do: 400 401@enumerate 402@item 403You need to add the file to the list of source files the type 404machinery scans. There are four cases: 405 406@enumerate a 407@item 408For a back-end file, this is usually done 409automatically; if not, you should add it to @code{target_gtfiles} in 410the appropriate port's entries in @file{config.gcc}. 411 412@item 413For files shared by all front ends, add the filename to the 414@code{GTFILES} variable in @file{Makefile.in}. 415 416@item 417For files that are part of one front end, add the filename to the 418@code{gtfiles} variable defined in the appropriate 419@file{config-lang.in}. For C, the file is @file{c-config-lang.in}. 420Headers should appear before non-headers in this list. 421 422@item 423For files that are part of some but not all front ends, add the 424filename to the @code{gtfiles} variable of @emph{all} the front ends 425that use it. 426@end enumerate 427 428@item 429If the file was a header file, you'll need to check that it's included 430in the right place to be visible to the generated files. For a back-end 431header file, this should be done automatically. For a front-end header 432file, it needs to be included by the same file that includes 433@file{gtype-@var{lang}.h}. For other header files, it needs to be 434included in @file{gtype-desc.c}, which is a generated file, so add it to 435@code{ifiles} in @code{open_base_file} in @file{gengtype.c}. 436 437For source files that aren't header files, the machinery will generate a 438header file that should be included in the source file you just changed. 439The file will be called @file{gt-@var{path}.h} where @var{path} is the 440pathname relative to the @file{gcc} directory with slashes replaced by 441@verb{|-|}, so for example the header file to be included in 442@file{cp/parser.c} is called @file{gt-cp-parser.c}. The 443generated header file should be included after everything else in the 444source file. Don't forget to mention this file as a dependency in the 445@file{Makefile}! 446 447@end enumerate 448 449For language frontends, there is another file that needs to be included 450somewhere. It will be called @file{gtype-@var{lang}.h}, where 451@var{lang} is the name of the subdirectory the language is contained in. 452 453Plugins can add additional root tables. Run the @code{gengtype} 454utility in plugin mode as @code{gengtype -P pluginout.h @var{source-dir} 455@var{file-list} @var{plugin*.c}} with your plugin files 456@var{plugin*.c} using @code{GTY} to generate the @var{pluginout.h} file. 457The GCC build tree is needed to be present in that mode. 458 459 460@node Invoking the garbage collector 461@section How to invoke the garbage collector 462@cindex garbage collector, invocation 463@findex ggc_collect 464 465The GCC garbage collector GGC is only invoked explicitly. In contrast 466with many other garbage collectors, it is not implicitly invoked by 467allocation routines when a lot of memory has been consumed. So the 468only way to have GGC reclaim storage it to call the @code{ggc_collect} 469function explicitly. This call is an expensive operation, as it may 470have to scan the entire heap. Beware that local variables (on the GCC 471call stack) are not followed by such an invocation (as many other 472garbage collectors do): you should reference all your data from static 473or external @code{GTY}-ed variables, and it is advised to call 474@code{ggc_collect} with a shallow call stack. The GGC is an exact mark 475and sweep garbage collector (so it does not scan the call stack for 476pointers). In practice GCC passes don't often call @code{ggc_collect} 477themselves, because it is called by the pass manager between passes. 478