\input texinfo
@c Copyright (C) 1988-2024 Free Software Foundation, Inc.
@setfilename bfdint.info

@settitle BFD Internals
@iftex
@titlepage
@title{BFD Internals}
@author{Ian Lance Taylor}
@author{Cygnus Solutions}
@page
@end iftex

@copying
This file documents the internals of the BFD library.

Copyright @copyright{} 1988-2024 Free Software Foundation, Inc.
Contributed by Cygnus Support.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``GNU General Public License'' and ``Funding
Free Software'', the Front-Cover texts being (a) (see below), and with
the Back-Cover Texts being (b) (see below).  A copy of the license is
included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software.  Copies published by the Free Software Foundation raise
     funds for GNU development.
@end copying

@node Top
@top BFD Internals
@raisesections
@cindex bfd internals

This document describes some BFD internal information which may be
helpful when working on BFD.  It is very incomplete.

This document is not updated regularly, and may be out of date.

The initial version of this document was written by Ian Lance Taylor
@email{ian@@cygnus.com}.

@menu
* BFD overview::                BFD overview
* BFD guidelines::              BFD programming guidelines
* BFD target vector::           BFD target vector
* BFD generated files::         BFD generated files
* BFD multiple compilations::   Files compiled multiple times in BFD
* BFD relocation handling::     BFD relocation handling
* BFD ELF support::             BFD ELF support
* BFD glossary::                Glossary
* Index::                       Index
@end menu

@node BFD overview
@section BFD overview

BFD is a library which provides a single interface to read and write
object files, executables, archive files, and core files in any format.

@menu
* BFD library interfaces::      BFD library interfaces
* BFD library users::           BFD library users
* BFD view::                    The BFD view of a file
* BFD blindness::               BFD loses information
@end menu

@node BFD library interfaces
@subsection BFD library interfaces

One way to look at the BFD library is to divide it into four parts by
type of interface.

The first interface is the set of generic functions which programs using
the BFD library will call.  These generic functions normally translate
directly or indirectly into calls to routines which are specific to a
particular object file format.  Many of these generic functions are
actually defined as macros in @file{bfd.h}.  These functions comprise
the official BFD interface.

The second interface is the set of functions which appear in the target
vectors.  This is the bulk of the code in BFD.  A target vector is a set
of function pointers specific to a particular object file format.  The
target vector is used to implement the generic BFD functions.  These
functions are always called through the target vector, and are never
called directly.  The target vector is described in detail in
@ref{BFD target vector}.  The set of functions which appear in a
particular target vector is often referred to as a BFD backend.
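
As a rough illustration of how a generic entry point can dispatch
through a per-format table of function pointers, consider the following
self-contained C sketch.  Every name in it (@samp{toy_target},
@samp{toy_bfd}, @samp{toy_bfd_init}, @samp{toy_get_symtab_upper_bound},
and the two toy backends) is invented for this example and is not part
of BFD; the real target vector has many more fields.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for this sketch only;
   these are not the real BFD types.  */
struct toy_bfd;

struct toy_target
{
  const char *name;
  /* Format-specific routine, reached only through the vector.  */
  size_t (*symtab_upper_bound) (const struct toy_bfd *);
};

struct toy_bfd
{
  const struct toy_target *xvec;  /* Backend chosen for this file.  */
  size_t symcount;
};

/* Two toy backends which need different amounts of memory per symbol.  */
static size_t
toy_aout_symtab_upper_bound (const struct toy_bfd *abfd)
{
  return (abfd->symcount + 1) * 4;
}

static size_t
toy_elf_symtab_upper_bound (const struct toy_bfd *abfd)
{
  return (abfd->symcount + 1) * 8;
}

static const struct toy_target toy_aout_vec =
  { "toy-aout", toy_aout_symtab_upper_bound };
static const struct toy_target toy_elf_vec =
  { "toy-elf", toy_elf_symtab_upper_bound };

/* Attach a backend to a file descriptor.  */
static void
toy_bfd_init (struct toy_bfd *abfd, const struct toy_target *vec,
              size_t symcount)
{
  assert (abfd != NULL && vec != NULL);
  abfd->xvec = vec;
  abfd->symcount = symcount;
}

/* The generic entry point is a macro which dispatches through the
   vector; callers never name a backend routine directly.  */
#define toy_get_symtab_upper_bound(abfd) \
  ((abfd)->xvec->symtab_upper_bound (abfd))
```
@end verbatim

In this sketch a caller only ever writes
@samp{toy_get_symtab_upper_bound (abfd)}; which backend routine runs
depends on the vector stored in the file descriptor.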

The third interface is a set of oddball functions which are typically
specific to a particular object file format, are not generic functions,
and are called from outside of the BFD library.  These are used as hooks
by the linker and the assembler when a particular object file format
requires some action which the BFD generic interface does not provide.
These functions are typically declared in @file{bfd.h}, but in many
cases they are only provided when BFD is configured with support for a
particular object file format.  These functions live in a grey area, and
are not really part of the official BFD interface.

The fourth interface is the set of BFD support functions which are
called by the other BFD functions.  These manage issues like memory
allocation, error handling, file access, hash tables, swapping, and the
like.  These functions are never called from outside of the BFD library.

@node BFD library users
@subsection BFD library users

Another way to look at the BFD library is to divide it into three parts
by the manner in which it is used.

The first use is to read an object file.  The object file readers are
programs like @samp{gdb}, @samp{nm}, @samp{objdump}, and @samp{objcopy}.
These programs use BFD to view an object file in a generic form.  The
official BFD interface is normally fully adequate for these programs.

The second use is to write an object file.  The object file writers are
programs like @samp{gas} and @samp{objcopy}.  These programs use BFD to
create an object file.  The official BFD interface is normally adequate
for these programs, but for some object file formats the assembler needs
some additional hooks in order to set particular flags or other
information.  The official BFD interface includes functions to copy
private information from one object file to another, and these functions
are used by @samp{objcopy} to avoid information loss.

The third use is to link object files.  There is only one object file
linker, @samp{ld}.  Originally, @samp{ld} was an object file reader and
an object file writer, and it did the link operation using the generic
BFD structures.  However, this turned out to be too slow and too memory
intensive.

The official BFD linker functions were written to permit specific BFD
backends to perform the link without translating through the generic
structures, in the normal case where all the input files and output file
have the same object file format.  Not all of the backends currently
implement the new interface, and there are default linking functions
within BFD which use the generic structures and which work with all
backends.

For several object file formats the linker needs additional hooks which
are not provided by the official BFD interface, particularly for dynamic
linking support.  These functions are typically called from the linker
emulation template.

@node BFD view
@subsection The BFD view of a file

BFD uses generic structures to manage information.  It translates data
into the generic form when reading files, and out of the generic form
when writing files.

BFD describes a file as a pointer to the @samp{bfd} type.  A @samp{bfd}
is composed of the following elements.  The BFD information can be
displayed using the @samp{objdump} program with various options.

@table @asis
@item general information
The object file format, a few general flags, the start address.
@item architecture
The architecture, including both a general processor type (m68k, MIPS,
etc.) and a specific machine number (m68000, R4000, etc.).
@item sections
A list of sections.
@item symbols
A symbol table.
@end table

BFD represents a section as a pointer to the @samp{asection} type.  Each
section has a name and a size.
Most sections also have an associated
block of data, known as the section contents.  Sections also have
associated flags, a virtual memory address, a load memory address, a
required alignment, a list of relocations, and other miscellaneous
information.

BFD represents a relocation as a pointer to the @samp{arelent} type.  A
relocation describes an action which the linker must take to modify the
section contents.  Relocations have a symbol, an address, an addend, and
a pointer to a howto structure which describes how to perform the
relocation.  For more information, see @ref{BFD relocation handling}.

BFD represents a symbol as a pointer to the @samp{asymbol} type.  A
symbol has a name, a pointer to a section, an offset within that
section, and some flags.

Archive files do not have any sections or symbols.  Instead, BFD
represents an archive file as a file which contains a list of
@samp{bfd}s.  BFD also provides access to the archive symbol map, as a
list of symbol names.  BFD provides a function to return the @samp{bfd}
within the archive which corresponds to a particular entry in the
archive symbol map.

@node BFD blindness
@subsection BFD loses information

Most object file formats have information which BFD cannot represent in
its generic form, at least as currently defined.

There is often explicit information which BFD cannot represent, for
example the COFF version stamp or the ELF program segments.  BFD
provides special hooks to handle this information when copying,
printing, or linking an object file.  The BFD support for a particular
object file format will normally store this information in private data
and handle it using the special hooks.

In some cases there is also implicit information which BFD cannot
represent.
For example, the MIPS processor distinguishes small and
large symbols, and requires that all small symbols be within 32K of the
GP register.  This means that the MIPS assembler must be able to mark
variables as either small or large, and the MIPS linker must know to put
small symbols within range of the GP register.  Since BFD cannot
represent this information, the assembler and linker
must have information that is specific to a particular object file
format which is outside of the BFD library.

This loss of information indicates areas where the BFD paradigm breaks
down.  It is not actually possible to represent the myriad differences
among object file formats using a single generic interface, at least not
in the manner which BFD does it today.

Nevertheless, the BFD library does greatly simplify the task of dealing
with object files, and particular problems caused by information loss
can normally be solved using some sort of relatively constrained hook
into the library.



@node BFD guidelines
@section BFD programming guidelines
@cindex bfd programming guidelines
@cindex programming guidelines for bfd
@cindex guidelines, bfd programming

There is a lot of poorly written and confusing code in BFD.  New BFD
code should be written to a higher standard.  Merely because some BFD
code is written in a particular manner does not mean that you should
emulate it.

Here are some general BFD programming guidelines:

@itemize @bullet
@item
Follow the GNU coding standards.

@item
Avoid global variables.  We ideally want BFD to be fully reentrant, so
that it can be used in multiple threads.  All uses of global or static
variables interfere with that.  Initialized constant variables are OK,
and they should be explicitly marked with @samp{const}.  Instead of global
variables, use data attached to a BFD or to a linker hash table.

@item
All externally visible functions should have names which start with
@samp{bfd_}.  All such functions should be declared in some header file,
typically @file{bfd.h}.  See, for example, the various declarations near
the end of @file{bfd-in.h}, which mostly declare functions required by
specific linker emulations.

@item
All functions which need to be visible from one file to another within
BFD, but should not be visible outside of BFD, should start with
@samp{_bfd_}.  Although external names beginning with @samp{_} are
prohibited by the ANSI standard, in practice this usage will always
work, and it is required by the GNU coding standards.

@item
Always remember that people can compile using @samp{--enable-targets} to
build several, or all, targets at once.  It must be possible to link
together the files for all targets.

@item
BFD code should compile with few or no warnings using @samp{gcc -Wall}.
Some warnings are OK, like the absence of certain function declarations
which may or may not be declared in system header files.  Warnings about
ambiguous expressions and the like should always be fixed.
@end itemize

@node BFD target vector
@section BFD target vector
@cindex bfd target vector
@cindex target vector in bfd

BFD supports multiple object file formats by using the @dfn{target
vector}.  This is simply a set of function pointers which implement
behaviour that is specific to a particular object file format.

In this section I list all of the entries in the target vector and
describe what they do.

@menu
* BFD target vector miscellaneous::     Miscellaneous constants
* BFD target vector swap::              Swapping functions
* BFD target vector format::            Format type dependent functions
* BFD_JUMP_TABLE macros::               BFD_JUMP_TABLE macros
* BFD target vector generic::           Generic functions
* BFD target vector copy::              Copy functions
* BFD target vector core::              Core file support functions
* BFD target vector archive::           Archive functions
* BFD target vector symbols::           Symbol table functions
* BFD target vector relocs::            Relocation support
* BFD target vector write::             Output functions
* BFD target vector link::              Linker functions
* BFD target vector dynamic::           Dynamic linking information functions
@end menu

@node BFD target vector miscellaneous
@subsection Miscellaneous constants

The target vector starts with a set of constants.

@table @samp
@item name
The name of the target vector.  This is an arbitrary string.  This is
how the target vector is named in command-line options for tools which
use BFD, such as the @samp{--oformat} linker option.

@item flavour
A general description of the type of target.  The following flavours are
currently defined:

@table @samp
@item bfd_target_unknown_flavour
Undefined or unknown.
@item bfd_target_aout_flavour
a.out.
@item bfd_target_coff_flavour
COFF.
@item bfd_target_ecoff_flavour
ECOFF.
@item bfd_target_elf_flavour
ELF.
@item bfd_target_tekhex_flavour
Tektronix hex format.
@item bfd_target_srec_flavour
Motorola S-record format.
@item bfd_target_ihex_flavour
Intel hex format.
@item bfd_target_som_flavour
SOM (used on HP-UX).
@item bfd_target_verilog_flavour
Verilog memory hex dump format.
@item bfd_target_msdos_flavour
MS-DOS.
@item bfd_target_evax_flavour
OpenVMS.
@item bfd_target_mmo_flavour
Donald Knuth's MMIXware object format.
@end table

@item byteorder
The byte order of data in the object file.  One of
@samp{BFD_ENDIAN_BIG}, @samp{BFD_ENDIAN_LITTLE}, or
@samp{BFD_ENDIAN_UNKNOWN}.  The latter would be used for a format such
as S-records which do not record the architecture of the data.

@item header_byteorder
The byte order of header information in the object file.  Normally the
same as the @samp{byteorder} field, but there are certain cases where it
may be different.

@item object_flags
Flags which may appear in the @samp{flags} field of a BFD with this
format.

@item section_flags
Flags which may appear in the @samp{flags} field of a section within a
BFD with this format.

@item symbol_leading_char
A character which the C compiler normally puts before a symbol.  For
example, an a.out compiler will typically generate the symbol
@samp{_foo} for a function named @samp{foo} in the C source, in which
case this field would be @samp{_}.  If there is no such character, this
field will be @samp{0}.

@item ar_pad_char
The padding character to use at the end of an archive name.  Normally
@samp{/}.

@item ar_max_namelen
The maximum length of a short name in an archive.  Normally @samp{14}.

@item backend_data
A pointer to constant backend data.  This is used by backends to store
whatever additional information they need to distinguish similar target
vectors which use the same sets of functions.
@end table

@node BFD target vector swap
@subsection Swapping functions

Every target vector has function pointers used for swapping information
in and out of the target representation.  There are two sets of
functions: one for data information, and one for header information.
Each set has three sizes: 64-bit, 32-bit, and 16-bit.  Each size has
three actual functions: put, get unsigned, and get signed.
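
To make the put, get unsigned, and get signed roles concrete, here is a
minimal self-contained C sketch of one such trio, for 32-bit big-endian
values only.  The names @samp{toy_putb32}, @samp{toy_getb32}, and
@samp{toy_getb_signed_32} are invented for this example; they only
illustrate the kind of routine a target vector might point at, and a
little-endian target would supply routines with the byte order reversed.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers; the names are invented for this sketch.  */

/* put: store VAL at ADDR, most significant byte first.  */
static void
toy_putb32 (uint32_t val, unsigned char *addr)
{
  assert (addr != NULL);
  addr[0] = (unsigned char) (val >> 24);
  addr[1] = (unsigned char) (val >> 16);
  addr[2] = (unsigned char) (val >> 8);
  addr[3] = (unsigned char) val;
}

/* get unsigned: assemble four target bytes into a host integer.  */
static uint32_t
toy_getb32 (const unsigned char *addr)
{
  return (((uint32_t) addr[0] << 24) | ((uint32_t) addr[1] << 16)
          | ((uint32_t) addr[2] << 8) | (uint32_t) addr[3]);
}

/* get signed: as above, then sign-extend from bit 31.  Flipping the
   sign bit and subtracting 2^31 avoids relying on how the host
   converts out-of-range unsigned values to signed ones.  */
static int32_t
toy_getb_signed_32 (const unsigned char *addr)
{
  uint32_t v = toy_getb32 (addr);

  return (int32_t) ((int64_t) (v ^ 0x80000000u) - (int64_t) 0x80000000);
}
```
@end verbatim

Because the routines work one byte at a time, the result does not
depend on the byte order of the host on which the sketch is compiled.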

These 18 functions are used to convert data between the host and target
representations.

@node BFD target vector format
@subsection Format type dependent functions

Every target vector has three arrays of function pointers which are
indexed by the BFD format type.  The BFD format types are as follows:

@table @samp
@item bfd_unknown
Unknown format.  Not used for anything useful.
@item bfd_object
Object file.
@item bfd_archive
Archive file.
@item bfd_core
Core file.
@end table

The three arrays of function pointers are as follows:

@table @samp
@item bfd_check_format
Check whether the BFD is of a particular format (object file, archive
file, or core file) corresponding to this target vector.  This is called
by the @samp{bfd_check_format} function when examining an existing BFD.
If the BFD matches the desired format, this function will initialize any
format specific information such as the @samp{tdata} field of the BFD.
This function must be called before any other BFD target vector function
on a file opened for reading.

@item bfd_set_format
Set the format of a BFD which was created for output.  This is called by
the @samp{bfd_set_format} function after creating the BFD with a
function such as @samp{bfd_openw}.  This function will initialize format
specific information required to write out a file of
the given format.  This function must be called before any other BFD
target vector function on a file opened for writing.

@item bfd_write_contents
Write out the contents of the BFD in the given format.  This is called
by the @samp{bfd_close} function for a BFD opened for writing.  This really
should not be an array selected by format type, as the
@samp{bfd_set_format} function provides all the required information.
In fact, BFD will fail if a different format is used when calling
through the @samp{bfd_set_format} and the @samp{bfd_write_contents}
arrays; fortunately, since @samp{bfd_close} gets it right, this is a
difficult error to make.
@end table

@node BFD_JUMP_TABLE macros
@subsection @samp{BFD_JUMP_TABLE} macros
@cindex @samp{BFD_JUMP_TABLE}

Most target vectors are defined using @samp{BFD_JUMP_TABLE} macros.
These macros take a single argument, which is a prefix applied to a set
of functions.  The macros are then used to initialize the fields in the
target vector.

For example, the @samp{BFD_JUMP_TABLE_RELOCS} macro defines three
functions: @samp{_get_reloc_upper_bound}, @samp{_canonicalize_reloc},
and @samp{_bfd_reloc_type_lookup}.  A reference like
@samp{BFD_JUMP_TABLE_RELOCS (foo)} will expand into three functions
prefixed with @samp{foo}: @samp{foo_get_reloc_upper_bound}, etc.  The
@samp{BFD_JUMP_TABLE_RELOCS} macro will be placed such that those three
functions initialize the appropriate fields in the BFD target vector.

This is done because it turns out that many different target vectors can
share certain classes of functions.  For example, archives are similar
on most platforms, so most target vectors can use the same archive
functions.  Those target vectors all use @samp{BFD_JUMP_TABLE_ARCHIVE}
with the same argument, calling a set of functions which is defined in
@file{archive.c}.

Each of the @samp{BFD_JUMP_TABLE} macros is mentioned below along with
the description of the function pointers which it defines.  The function
pointers will be described using the name without the prefix which the
@samp{BFD_JUMP_TABLE} macro defines.  This name is normally the same as
the name of the field in the target vector structure.  Any differences
will be noted.
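
The prefix mechanism can be sketched with ordinary C token pasting.
The following self-contained example is only an illustration under
stated assumptions: @samp{toy_reloc_target}, @samp{TOY_JUMP_TABLE_RELOCS},
@samp{toy_reloc_bytes}, and the @samp{foo_} functions are all invented
here, and the function signatures are deliberately much simpler than
those of the real target vector.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a target vector; not the real structure.  */
struct toy_reloc_target
{
  const char *name;
  long (*get_reloc_upper_bound) (size_t reloc_count);
  long (*canonicalize_reloc) (size_t reloc_count);
  int (*reloc_type_lookup) (int code);
};

/* Expands to three initializers, each formed by pasting NAME onto a
   fixed suffix, in the same order as the fields above.  */
#define TOY_JUMP_TABLE_RELOCS(NAME) \
  NAME##_get_reloc_upper_bound, \
  NAME##_canonicalize_reloc, \
  NAME##_bfd_reloc_type_lookup

/* One toy backend providing the three functions for prefix "foo".  */
static long
foo_get_reloc_upper_bound (size_t reloc_count)
{
  /* Room for one pointer per relocation plus a trailing NULL.  */
  return (long) ((reloc_count + 1) * sizeof (void *));
}

static long
foo_canonicalize_reloc (size_t reloc_count)
{
  return (long) reloc_count;
}

static int
foo_bfd_reloc_type_lookup (int code)
{
  /* Pretend that code 1 is the only relocation this backend knows.  */
  return code == 1 ? 32 : -1;
}

static const struct toy_reloc_target toy_foo_vec =
{
  "toy-foo",
  TOY_JUMP_TABLE_RELOCS (foo)
};

/* Callers reach the backend only through the vector.  */
static long
toy_reloc_bytes (const struct toy_reloc_target *vec, size_t reloc_count)
{
  assert (vec != NULL);
  return vec->get_reloc_upper_bound (reloc_count);
}
```
@end verbatim

A second backend that shares the same relocation handling could reuse
@samp{TOY_JUMP_TABLE_RELOCS (foo)} in its own vector without repeating
the three function names.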

@node BFD target vector generic
@subsection Generic functions
@cindex @samp{BFD_JUMP_TABLE_GENERIC}

The @samp{BFD_JUMP_TABLE_GENERIC} macro is used for some catch-all
functions which don't easily fit into other categories.

@table @samp
@item _close_and_cleanup
Free any target specific information associated with the BFD that
isn't freed by @samp{_bfd_free_cached_info}.  This is called when any
BFD is closed (the @samp{bfd_write_contents} function mentioned
earlier is only called for a BFD opened for writing).  This function
pointer is typically set to @samp{_bfd_generic_close_and_cleanup},
which simply returns true.

@item _bfd_free_cached_info
This function is designed for use by the generic archive routines, and
is also called by @samp{bfd_close}.  After creating the archive map,
archive element bfds don't need symbols and other structures.  Many
targets use @samp{bfd_alloc} to allocate target specific information
and thus do not need to do anything special for this entry point, and
just set it to @samp{_bfd_generic_free_cached_info}, which throws away
objalloc memory for the bfd.  Note that this means the bfd tdata and
sections are no longer available.  Targets that malloc memory,
attaching it to the bfd tdata or to a section's @samp{used_by_bfd}
field, should implement a target version of this function to free that
memory before calling @samp{_bfd_generic_free_cached_info}.

@item _new_section_hook
This is called from @samp{bfd_make_section_anyway} whenever a new
section is created.  Most targets use it to initialize section specific
information.  This function is called whether or not the section
corresponds to an actual section in an actual BFD.

@item _get_section_contents
Get the contents of a section.  This is called from
@samp{bfd_get_section_contents}.
Most targets set this to
@samp{_bfd_generic_get_section_contents}, which does a @samp{bfd_seek}
based on the section's @samp{filepos} field and a @samp{bfd_read}.  The
corresponding field in the target vector is named
@samp{_bfd_get_section_contents}.

@end table

@node BFD target vector copy
@subsection Copy functions
@cindex @samp{BFD_JUMP_TABLE_COPY}

The @samp{BFD_JUMP_TABLE_COPY} macro is used for functions which are
called when copying BFDs, and for a couple of functions which deal with
internal BFD information.

@table @samp
@item _bfd_copy_private_bfd_data
This is called when copying a BFD, via @samp{bfd_copy_private_bfd_data}.
If the input and output BFDs have the same format, this will copy any
private information over.  This is called after all the section contents
have been written to the output file.  Only a few targets do anything in
this function.

@item _bfd_merge_private_bfd_data
This is called when linking, via @samp{bfd_merge_private_bfd_data}.  It
gives the backend linker code a chance to set any special flags in the
output file based on the contents of the input file.  Only a few targets
do anything in this function.

@item _bfd_copy_private_section_data
This is similar to @samp{_bfd_copy_private_bfd_data}, but it is called
for each section, via @samp{bfd_copy_private_section_data}.  This
function is called before any section contents have been written.  Only
a few targets do anything in this function.

@item _bfd_copy_private_symbol_data
This is called via @samp{bfd_copy_private_symbol_data}, but I don't
think anything actually calls it.  If it were defined, it could be used
to copy private symbol data from one BFD to another.  However, most BFDs
store extra symbol information by allocating space which is larger than
the @samp{asymbol} structure and storing private information in the
extra space.
Since @samp{objcopy} and other programs copy symbol
information by copying pointers to @samp{asymbol} structures, the
private symbol information is automatically copied as well.  Most
targets do not do anything in this function.

@item _bfd_set_private_flags
This is called via @samp{bfd_set_private_flags}.  It is basically a hook
for the assembler to set magic information.  For example, the PowerPC
ELF assembler uses it to set flags which appear in the e_flags field of
the ELF header.  Most targets do not do anything in this function.

@item _bfd_print_private_bfd_data
This is called by @samp{objdump} when the @samp{-p} option is used.  It
is called via @samp{bfd_print_private_bfd_data}.  It prints any
interesting information about the BFD which cannot otherwise be
represented by BFD and thus cannot be printed by @samp{objdump}.  Most
targets do not do anything in this function.
@end table

@node BFD target vector core
@subsection Core file support functions
@cindex @samp{BFD_JUMP_TABLE_CORE}

The @samp{BFD_JUMP_TABLE_CORE} macro is used for functions which deal
with core files.  Obviously, these functions only do something
interesting for targets which have core file support.

@table @samp
@item _core_file_failing_command
Given a core file, this returns the command which was run to produce the
core file.

@item _core_file_failing_signal
Given a core file, this returns the signal number which produced the
core file.

@item _core_file_matches_executable_p
Given a core file and a BFD for an executable, this returns whether the
core file was generated by the executable.
@end table

@node BFD target vector archive
@subsection Archive functions
@cindex @samp{BFD_JUMP_TABLE_ARCHIVE}

The @samp{BFD_JUMP_TABLE_ARCHIVE} macro is used for functions which deal
with archive files.
Most targets use COFF style archive files
(including ELF targets), and these use @samp{_bfd_archive_coff} as the
argument to @samp{BFD_JUMP_TABLE_ARCHIVE}.  Some targets use BSD/a.out
style archives, and these use @samp{_bfd_archive_bsd}.  (The main
difference between BSD and COFF archives is the format of the archive
symbol table.)  Targets with no archive support use
@samp{_bfd_noarchive}.  Finally, a few targets have unusual archive
handling.

@table @samp
@item _slurp_armap
Read in the archive symbol table, storing it in private BFD data.  This
is normally called from the archive @samp{check_format} routine.  The
corresponding field in the target vector is named
@samp{_bfd_slurp_armap}.

@item _slurp_extended_name_table
Read in the extended name table from the archive, if there is one,
storing it in private BFD data.  This is normally called from the
archive @samp{check_format} routine.  The corresponding field in the
target vector is named @samp{_bfd_slurp_extended_name_table}.

@item _construct_extended_name_table
Build and return an extended name table if one is needed to write out
the archive.  This also adjusts the archive headers to refer to the
extended name table appropriately.  This is normally called from the
archive @samp{write_contents} routine.  The corresponding field in the
target vector is named @samp{_bfd_construct_extended_name_table}.

@item _truncate_arname
This copies a file name into an archive header, truncating it as
required.  It is normally called from the archive @samp{write_contents}
routine.  This function is more interesting in targets which do not
support extended name tables, but I think the GNU @samp{ar} program
always uses extended name tables anyhow.  The corresponding field in the
target vector is named @samp{_bfd_truncate_arname}.

@item _write_armap
Write out the archive symbol table using calls to @samp{bfd_write}.
This is normally called from the archive @samp{write_contents} routine.
The corresponding field in the target vector is named @samp{write_armap}
(no leading underscore).

@item _read_ar_hdr
Read and parse an archive header.  This handles expanding the archive
header name into the real file name using the extended name table.  This
is called by routines which read the archive symbol table or the archive
itself.  The corresponding field in the target vector is named
@samp{_bfd_read_ar_hdr_fn}.

@item _openr_next_archived_file
Given an archive and a BFD representing a file stored within the
archive, return a BFD for the next file in the archive.  This is called
via @samp{bfd_openr_next_archived_file}.  The corresponding field in the
target vector is named @samp{openr_next_archived_file} (no leading
underscore).

@item _get_elt_at_index
Given an archive and an index, return a BFD for the file in the archive
corresponding to that entry in the archive symbol table.  This is called
via @samp{bfd_get_elt_at_index}.  The corresponding field in the target
vector is named @samp{_bfd_get_elt_at_index}.

@item _generic_stat_arch_elt
Do a stat on an element of an archive, returning information read from
the archive header (modification time, uid, gid, file mode, size).  This
is called via @samp{bfd_stat_arch_elt}.  The corresponding field in the
target vector is named @samp{_bfd_stat_arch_elt}.

@item _update_armap_timestamp
After the entire contents of an archive have been written out, update
the timestamp of the archive symbol table to be newer than that of the
file.  This is required for a.out style archives.  This is normally
called by the archive @samp{write_contents} routine.  The corresponding
field in the target vector is named @samp{_bfd_update_armap_timestamp}.
@end table

@node BFD target vector symbols
@subsection Symbol table functions
@cindex @samp{BFD_JUMP_TABLE_SYMBOLS}

The @samp{BFD_JUMP_TABLE_SYMBOLS} macro is used for functions which deal
with symbols.

@table @samp
@item _get_symtab_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the symbol table.  In practice most targets return the
amount of memory required to hold @samp{asymbol} pointers for all the
symbols plus a trailing @samp{NULL} entry, and store the actual symbol
information in BFD private data.  This is called via
@samp{bfd_get_symtab_upper_bound}.  The corresponding field in the
target vector is named @samp{_bfd_get_symtab_upper_bound}.

@item _canonicalize_symtab
Read in the symbol table.  This is called via
@samp{bfd_canonicalize_symtab}.  The corresponding field in the target
vector is named @samp{_bfd_canonicalize_symtab}.

@item _make_empty_symbol
Create an empty symbol for the BFD.  This is needed because most targets
store extra information with each symbol by allocating a structure
larger than an @samp{asymbol} and storing the extra information at the
end.  This function will allocate the right amount of memory, and return
what looks like a pointer to an empty @samp{asymbol}.  This is called
via @samp{bfd_make_empty_symbol}.  The corresponding field in the target
vector is named @samp{_bfd_make_empty_symbol}.

@item _print_symbol
Print information about the symbol.  This is called via
@samp{bfd_print_symbol}.  One of the arguments indicates what sort of
information should be printed:

@table @samp
@item bfd_print_symbol_name
Just print the symbol name.
@item bfd_print_symbol_more
Print the symbol name and some interesting flags.  I don't think
anything actually uses this.
@item bfd_print_symbol_all
Print all information about the symbol.
This is used by @samp{objdump}
when run with the @samp{-t} option.
@end table
The corresponding field in the target vector is named
@samp{_bfd_print_symbol}.

@item _get_symbol_info
Return a standard set of information about the symbol. This is called
via @samp{bfd_symbol_info}. The corresponding field in the target
vector is named @samp{_bfd_get_symbol_info}.

@item _bfd_is_local_label_name
Return whether the given string would normally represent the name of a
local label. This is called via @samp{bfd_is_local_label} and
@samp{bfd_is_local_label_name}. Local labels are normally discarded by
the assembler. In the linker, this defines the difference between the
@samp{-x} and @samp{-X} options.

@item _get_lineno
Return line number information for a symbol. This is only meaningful
for a COFF target. This is called when writing out COFF line numbers.

@item _find_nearest_line
Given an address within a section, use the debugging information to find
the matching file name, function name, and line number, if any. This is
called via @samp{bfd_find_nearest_line}. The corresponding field in the
target vector is named @samp{_bfd_find_nearest_line}.

@item _bfd_make_debug_symbol
Make a debugging symbol. This is only meaningful for a COFF target,
where it simply returns a symbol which will be placed in the
@samp{N_DEBUG} section when it is written out. This is called via
@samp{bfd_make_debug_symbol}.

@item _read_minisymbols
Minisymbols are used to reduce the memory requirements of programs like
@samp{nm}. A minisymbol is a cookie pointing to internal symbol
information which the caller can use to extract complete symbol
information. This permits BFD to not convert all the symbols into
generic form, but to instead convert them one at a time. This is called
via @samp{bfd_read_minisymbols}.
Most targets do not implement this,
and just use generic support which is based on using standard
@samp{asymbol} structures.

@item _minisymbol_to_symbol
Convert a minisymbol to a standard @samp{asymbol}. This is called via
@samp{bfd_minisymbol_to_symbol}.
@end table

@node BFD target vector relocs
@subsection Relocation support
@cindex @samp{BFD_JUMP_TABLE_RELOCS}

The @samp{BFD_JUMP_TABLE_RELOCS} macro is used for functions which deal
with relocations.

@table @samp
@item _get_reloc_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the relocations for a section. In practice most
targets return the amount of memory required to hold @samp{arelent}
pointers for all the relocations plus a trailing @samp{NULL} entry, and
store the actual relocation information in BFD private data. This is
called via @samp{bfd_get_reloc_upper_bound}.

@item _canonicalize_reloc
Return the relocation information for a section. This is called via
@samp{bfd_canonicalize_reloc}. The corresponding field in the target
vector is named @samp{_bfd_canonicalize_reloc}.

@item _bfd_reloc_type_lookup
Given a relocation code, return the corresponding howto structure
(@pxref{BFD relocation codes}). This is called via
@samp{bfd_reloc_type_lookup}. The corresponding field in the target
vector is named @samp{reloc_type_lookup}.
@end table

@node BFD target vector write
@subsection Output functions
@cindex @samp{BFD_JUMP_TABLE_WRITE}

The @samp{BFD_JUMP_TABLE_WRITE} macro is used for functions which deal
with writing out a BFD.

@table @samp
@item _set_arch_mach
Set the architecture and machine number for a BFD. This is called via
@samp{bfd_set_arch_mach}. Most targets implement this by calling
@samp{bfd_default_set_arch_mach}. The corresponding field in the target
vector is named @samp{_bfd_set_arch_mach}.

@item _set_section_contents
Write out the contents of a section. This is called via
@samp{bfd_set_section_contents}. The corresponding field in the target
vector is named @samp{_bfd_set_section_contents}.
@end table

@node BFD target vector link
@subsection Linker functions
@cindex @samp{BFD_JUMP_TABLE_LINK}

The @samp{BFD_JUMP_TABLE_LINK} macro is used for functions called by the
linker.

@table @samp
@item _sizeof_headers
Return the size of the header information required for a BFD. This is
used to implement the @samp{SIZEOF_HEADERS} linker script function. It
is normally used to align the first section at an efficient position on
the page. This is called via @samp{bfd_sizeof_headers}. The
corresponding field in the target vector is named
@samp{_bfd_sizeof_headers}.

@item _bfd_get_relocated_section_contents
Read the contents of a section and apply the relocation information.
This handles both a final link and a relocatable link; in the latter
case, it adjusts the relocation information as well. This is called via
@samp{bfd_get_relocated_section_contents}. Most targets implement it by
calling @samp{bfd_generic_get_relocated_section_contents}.

@item _bfd_relax_section
Try to use relaxation to shrink the size of a section. This is called
by the linker when the @samp{-relax} option is used. This is called via
@samp{bfd_relax_section}. Most targets do not support any sort of
relaxation.

@item _bfd_link_hash_table_create
Create the symbol hash table to use for the linker. This linker hook
permits the backend to control the size and information of the elements
in the linker symbol hash table. This is called via
@samp{bfd_link_hash_table_create}.

@item _bfd_link_add_symbols
Given an object file or an archive, add all symbols into the linker
symbol hash table. Use callbacks to the linker to include archive
elements in the link.
This is called via @samp{bfd_link_add_symbols}.

@item _bfd_final_link
Finish the linking process. The linker calls this hook after all of the
input files have been read, when it is ready to finish the link and
generate the output file. This is called via @samp{bfd_final_link}.

@item _bfd_link_split_section
I don't know what this is for. Nothing seems to call it. The only
non-trivial definition is in @file{som.c}.
@end table

@node BFD target vector dynamic
@subsection Dynamic linking information functions
@cindex @samp{BFD_JUMP_TABLE_DYNAMIC}

The @samp{BFD_JUMP_TABLE_DYNAMIC} macro is used for functions which read
dynamic linking information.

@table @samp
@item _get_dynamic_symtab_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the dynamic symbol table. In practice most targets
return the amount of memory required to hold @samp{asymbol} pointers for
all the symbols plus a trailing @samp{NULL} entry, and store the actual
symbol information in BFD private data. This is called via
@samp{bfd_get_dynamic_symtab_upper_bound}. The corresponding field in
the target vector is named @samp{_bfd_get_dynamic_symtab_upper_bound}.

@item _canonicalize_dynamic_symtab
Read the dynamic symbol table. This is called via
@samp{bfd_canonicalize_dynamic_symtab}. The corresponding field in the
target vector is named @samp{_bfd_canonicalize_dynamic_symtab}.

@item _get_dynamic_reloc_upper_bound
Return a sensible upper bound on the amount of memory which will be
required to read the dynamic relocations. In practice most targets
return the amount of memory required to hold @samp{arelent} pointers for
all the relocations plus a trailing @samp{NULL} entry, and store the
actual relocation information in BFD private data. This is called via
@samp{bfd_get_dynamic_reloc_upper_bound}.
The corresponding field in
the target vector is named @samp{_bfd_get_dynamic_reloc_upper_bound}.

@item _canonicalize_dynamic_reloc
Read the dynamic relocations. This is called via
@samp{bfd_canonicalize_dynamic_reloc}. The corresponding field in the
target vector is named @samp{_bfd_canonicalize_dynamic_reloc}.
@end table

@node BFD generated files
@section BFD generated files
@cindex generated files in bfd
@cindex bfd generated files

BFD contains several automatically generated files. This section
describes them. Some files are created at configure time, when you
configure BFD. Some files are created at make time, when you build
BFD. Some files are automatically rebuilt at make time, but only if
you configure with the @samp{--enable-maintainer-mode} option. Some
files live in the object directory---the directory from which you run
configure---and some live in the source directory. All files that live
in the source directory are checked into the git repository.

@table @file
@item bfd.h
@cindex @file{bfd.h}
@cindex @file{bfd-in3.h}
Lives in the object directory. Created at make time from
@file{bfd-in2.h} via @file{bfd-in3.h}. @file{bfd-in3.h} is created at
configure time from @file{bfd-in2.h}. There are automatic dependencies
to rebuild @file{bfd-in3.h} and hence @file{bfd.h} if @file{bfd-in2.h}
changes, so you can normally ignore @file{bfd-in3.h}, and just think
about @file{bfd-in2.h} and @file{bfd.h}.

@file{bfd.h} is built by replacing a few strings in @file{bfd-in2.h}.
To see them, search for @samp{@@} in @file{bfd-in2.h}. They mainly
control whether BFD is built for a 32 bit target or a 64 bit target.

@item bfd-in2.h
@cindex @file{bfd-in2.h}
Lives in the source directory. Created from @file{bfd-in.h} and several
other BFD source files.
If you configure with the
@samp{--enable-maintainer-mode} option, @file{bfd-in2.h} is rebuilt
automatically when a source file changes.

@item elf32-target.h
@itemx elf64-target.h
@cindex @file{elf32-target.h}
@cindex @file{elf64-target.h}
Live in the object directory. Created from @file{elfxx-target.h}.
These files are versions of @file{elfxx-target.h} customized for either
a 32 bit ELF target or a 64 bit ELF target.

@item libbfd.h
@cindex @file{libbfd.h}
Lives in the source directory. Created from @file{libbfd-in.h} and
several other BFD source files. If you configure with the
@samp{--enable-maintainer-mode} option, @file{libbfd.h} is rebuilt
automatically when a source file changes.

@item libcoff.h
@cindex @file{libcoff.h}
Lives in the source directory. Created from @file{libcoff-in.h} and
@file{coffcode.h}. If you configure with the
@samp{--enable-maintainer-mode} option, @file{libcoff.h} is rebuilt
automatically when a source file changes.

@item targmatch.h
@cindex @file{targmatch.h}
Lives in the object directory. Created at make time from
@file{config.bfd}. This file is used to map configuration triplets into
BFD target vector variable names at run time.
@end table

@node BFD multiple compilations
@section Files compiled multiple times in BFD
Several files in BFD are compiled multiple times. By this I mean that
there are header files which contain function definitions. These header
files are included by other files, and thus the functions are compiled
once per file which includes them.

Preprocessor macros are used to control the compilation, so that each
time the files are compiled the resulting functions are slightly
different. Naturally, if they weren't different, there would be no
reason to compile them multiple times.

This is not a particularly good programming technique, and future BFD
work should avoid it.

@itemize @bullet
@item
Since this technique is rarely used, even experienced C programmers find
it confusing.

@item
It is difficult to debug programs which use BFD, since there is no way
to describe which version of a particular function you are looking at.

@item
Programs which use BFD wind up incorporating two or more slightly
different versions of the same function, which wastes space in the
executable.

@item
This technique is never required nor is it especially efficient. It is
always possible to use statically initialized structures holding
function pointers and magic constants instead.
@end itemize

The following is a list of the files which are compiled multiple times.

@table @file
@item aout-target.h
@cindex @file{aout-target.h}
Describes a few functions and the target vector for a.out targets. This
is used by individual a.out targets with different definitions of
@samp{N_TXTADDR} and similar a.out macros.

@item aoutf1.h
@cindex @file{aoutf1.h}
Implements standard SunOS a.out files. In principle it supports 64 bit
a.out targets based on the preprocessor macro @samp{ARCH_SIZE}, but
since all known a.out targets are 32 bits, this code may or may not
work. This file is only included by a few other files, and it is
difficult to justify its existence.

@item aoutx.h
@cindex @file{aoutx.h}
Implements basic a.out support routines. This file can be compiled for
either 32 or 64 bit support. Since all known a.out targets are 32 bits,
the 64 bit support may or may not work. I believe the original
intention was that this file would only be included by @samp{aout32.c}
and @samp{aout64.c}, and that other a.out targets would simply refer to
the functions it defined. Unfortunately, some other a.out targets
started including it directly, leading to a somewhat confused state of
affairs.

@item coffcode.h
@cindex @file{coffcode.h}
Implements basic COFF support routines. This file is included by every
COFF target. It implements code which handles COFF magic numbers as
well as various hook functions called by the generic COFF functions in
@file{coffgen.c}. This file is controlled by a number of different
macros, and more are added regularly.

@item coffswap.h
@cindex @file{coffswap.h}
Implements COFF swapping routines. This file is included by
@file{coffcode.h}, and thus by every COFF target. It implements the
routines which swap COFF structures between internal and external
format. The main control for this file is the external structure
definitions in the files in the @file{include/coff} directory. A COFF
target file will include one of those files before including
@file{coffcode.h} and thus @file{coffswap.h}. There are a few other
macros which affect @file{coffswap.h} as well, mostly describing whether
certain fields are present in the external structures.

@item ecoffswap.h
@cindex @file{ecoffswap.h}
Implements ECOFF swapping routines. This is like @file{coffswap.h}, but
for ECOFF. It is included by the ECOFF target files (of which there are
only two). The control is the preprocessor macro @samp{ECOFF_32} or
@samp{ECOFF_64}.

@item elfcode.h
@cindex @file{elfcode.h}
Implements ELF functions that use external structure definitions. This
file is included by two other files: @file{elf32.c} and @file{elf64.c}.
It is controlled by the @samp{ARCH_SIZE} macro which is defined to be
@samp{32} or @samp{64} before including it. The @samp{NAME} macro is
used internally to give the functions different names for the two target
sizes.

@item elfcore.h
@cindex @file{elfcore.h}
Like @file{elfcode.h}, but for functions that are specific to ELF core
files. This is included only by @file{elfcode.h}.

@item elfxx-target.h
@cindex @file{elfxx-target.h}
This file is the source for the generated files @file{elf32-target.h}
and @file{elf64-target.h}, one of which is included by every ELF target.
It defines the ELF target vector.

@item netbsd.h
@cindex @file{netbsd.h}
Used by all NetBSD a.out targets. Several other files include it.

@item peicode.h
@cindex @file{peicode.h}
Provides swapping routines and other hooks for PE targets.
@file{coffcode.h} will include this rather than @file{coffswap.h} for a
PE target. This defines PE specific versions of the COFF swapping
routines, and also defines some macros which control @file{coffcode.h}
itself.
@end table

@node BFD relocation handling
@section BFD relocation handling
@cindex bfd relocation handling
@cindex relocations in bfd

The handling of relocations is one of the more confusing aspects of BFD.
Relocation handling has been implemented in various different ways, all
somewhat incompatible, none perfect.

@menu
* BFD relocation concepts::     BFD relocation concepts
* BFD relocation functions::    BFD relocation functions
* BFD relocation codes::        BFD relocation codes
* BFD relocation future::       BFD relocation future
@end menu

@node BFD relocation concepts
@subsection BFD relocation concepts

A relocation is an action which the linker must take when linking. It
describes a change to the contents of a section. The change is normally
based on the final value of one or more symbols. Relocations are
created by the assembler when it creates an object file.

Most relocations are simple. A typical simple relocation is to set 32
bits at a given offset in a section to the value of a symbol. This type
of relocation would be generated for code like @code{int *p = &i;} where
@samp{p} and @samp{i} are global variables.
A relocation for the symbol
@samp{i} would be generated such that the linker would initialize the
area of memory which holds the value of @samp{p} to the value of the
symbol @samp{i}.

Slightly more complex relocations may include an addend, which is a
constant to add to the symbol value before using it. In some cases a
relocation will require adding the symbol value to the existing contents
of the section in the object file. In others the relocation will simply
replace the contents of the section with the symbol value. Some
relocations are PC relative, so that the value to be stored in the
section is the difference between the value of a symbol and the final
address of the section contents.

In general, relocations can be arbitrarily complex. For example,
relocations used in dynamic linking systems often require the linker to
allocate space in a different section and use the offset within that
section as the value to store.

When doing a relocatable link, the linker may or may not have to do
anything with a relocation, depending upon the definition of the
relocation. Simple relocations generally do not require any special
action.

@node BFD relocation functions
@subsection BFD relocation functions

In BFD, each section has an array of @samp{arelent} structures. Each
structure has a pointer to a symbol, an address within the section, an
addend, and a pointer to a @samp{reloc_howto_struct} structure. The
howto structure has a bunch of fields describing the reloc, including a
type field. The type field is specific to the object file format
backend; none of the generic code in BFD examines it.

Originally, the function @samp{bfd_perform_relocation} was supposed to
handle all relocations. In theory, many relocations would be simple
enough to be described by the fields in the howto structure.
For those
that weren't, the howto structure included a @samp{special_function}
field to use as an escape.

While this seems plausible, a look at @samp{bfd_perform_relocation}
shows that it failed. The function has odd special cases. Some of the
fields in the howto structure, such as @samp{pcrel_offset}, were not
adequately documented.

The linker uses @samp{bfd_perform_relocation} to do all relocations when
the input and output file have different formats (e.g., when generating
S-records). The generic linker code, which is used by all targets which
do not define their own special purpose linker, uses
@samp{bfd_get_relocated_section_contents}, which for most targets turns
into a call to @samp{bfd_generic_get_relocated_section_contents}, which
calls @samp{bfd_perform_relocation}. So @samp{bfd_perform_relocation}
is still widely used, which makes it difficult to change, since it is
difficult to test all possible cases.

The assembler used @samp{bfd_perform_relocation} for a while. This
turned out to be the wrong thing to do, since
@samp{bfd_perform_relocation} was written to handle relocations on an
existing object file, while the assembler needed to create relocations
in a new object file. The assembler was changed to use the new function
@samp{bfd_install_relocation} instead, and @samp{bfd_install_relocation}
was created as a copy of @samp{bfd_perform_relocation}.

Unfortunately, the work did not progress any farther, so
@samp{bfd_install_relocation} remains a simple copy of
@samp{bfd_perform_relocation}, with all the odd special cases and
confusing code. This again is difficult to change, because again any
change can affect any assembler target, and so is difficult to test.
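
To make the simple cases described at the start of this section more
concrete, here is a minimal sketch in C of applying one 32 bit
relocation. It is not BFD code. The @samp{toy_howto} and
@samp{toy_reloc} structures and the @samp{toy_apply_reloc} function are
invented for illustration; they borrow a few field names from the real
structures but omit almost everything else, and the sketch assumes a
little-endian 32 bit field with no overflow checking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration only.  BFD's real arelent and
   reloc_howto_struct have many more fields.  */
struct toy_howto
{
  bool pc_relative;      /* Subtract the address of the patched field.  */
  bool partial_inplace;  /* Add in the value already stored there.  */
};

struct toy_reloc
{
  uint32_t symbol_value;         /* Final value of the symbol.  */
  uint32_t address;              /* Offset of the field in the section.  */
  uint32_t addend;               /* Constant added to the symbol value.  */
  const struct toy_howto *howto; /* How to apply the relocation.  */
};

/* Apply one 32 bit little-endian relocation to CONTENTS, the contents
   of a section whose final address is SECTION_VMA.  The caller must
   ensure that reloc->address + 4 does not exceed the section size.  */
void
toy_apply_reloc (uint8_t *contents, uint32_t section_vma,
                 const struct toy_reloc *reloc)
{
  assert (contents != NULL && reloc != NULL && reloc->howto != NULL);

  uint8_t *p = contents + reloc->address;
  uint32_t value = reloc->symbol_value + reloc->addend;

  /* Some relocations add to what the assembler left in the section.  */
  if (reloc->howto->partial_inplace)
    value += ((uint32_t) p[0] | ((uint32_t) p[1] << 8)
              | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24));

  /* PC relative: store the distance from the field to the symbol.  */
  if (reloc->howto->pc_relative)
    value -= section_vma + reloc->address;

  p[0] = value & 0xff;
  p[1] = (value >> 8) & 0xff;
  p[2] = (value >> 16) & 0xff;
  p[3] = (value >> 24) & 0xff;
}
```

Even this toy needs two flags to cover the absolute, in-place and PC
relative cases, which hints at why a single general purpose routine
accumulated special cases.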

The new linker, when using the same object file format for all input
files and the output file, does not convert relocations into
@samp{arelent} structures, so it cannot use
@samp{bfd_perform_relocation} at all. Instead, users of the new linker
are expected to write a @samp{relocate_section} function which will
handle relocations in a target specific fashion.

There are two helper functions for target specific relocation:
@samp{_bfd_final_link_relocate} and @samp{_bfd_relocate_contents}.
These functions use a howto structure, but they @emph{do not} use the
@samp{special_function} field. Since the functions are normally called
from target specific code, the @samp{special_function} field adds
little; any relocations which require special handling can be handled
without calling those functions.

So, if you want to add a new target, or add a new relocation to an
existing target, you need to do the following:

@itemize @bullet
@item
Make sure you clearly understand what the contents of the section should
look like after assembly, after a relocatable link, and after a final
link. Make sure you clearly understand the operations the linker must
perform during a relocatable link and during a final link.

@item
Write a howto structure for the relocation. The howto structure is
flexible enough to represent any relocation which should be handled by
setting a contiguous bitfield in the destination to the value of a
symbol, possibly with an addend, possibly adding the symbol value to the
value already present in the destination.

@item
Change the assembler to generate your relocation. The assembler will
call @samp{bfd_install_relocation}, so your howto structure has to be
able to handle that. You may need to set the @samp{special_function}
field to handle assembly correctly.
Be careful to ensure that any code
you write to handle the assembler will also work correctly when doing a
relocatable link. For example, see @samp{bfd_elf_generic_reloc}.

@item
Test the assembler. Consider the cases of relocation against an
undefined symbol, a common symbol, a symbol defined in the object file
in the same section, and a symbol defined in the object file in a
different section. These cases may not all be applicable for your
reloc.

@item
If your target uses the new linker, which is recommended, add any
required handling to the target specific relocation function. In simple
cases this will just involve a call to @samp{_bfd_final_link_relocate}
or @samp{_bfd_relocate_contents}, depending upon the definition of the
relocation and whether the link is relocatable or not.

@item
Test the linker. Test the case of a final link. If the relocation can
overflow, use a linker script to force an overflow and make sure the
error is reported correctly. Test a relocatable link, whether the
symbol is defined or undefined in the relocatable output. For both the
final and relocatable link, test the case when the symbol is a common
symbol, when the symbol looked like a common symbol but became a defined
symbol, when the symbol is defined in a different object file, and when
the symbol is defined in the same object file.

@item
In order for linking to another object file format, such as S-records,
to work correctly, @samp{bfd_perform_relocation} has to do the right
thing for the relocation. You may need to set the
@samp{special_function} field to handle this correctly. Test this by
doing a link in which the output object file format is S-records.

@item
Using the linker to generate relocatable output in a different object
file format is impossible in the general case, so you generally don't
have to worry about that.
The GNU linker makes sure to stop that from
happening when an input file in a different format has relocations.

Linking input files of different object file formats together is quite
unusual, but if you're really dedicated you may want to consider testing
this case, both when the output object file format is the same as your
format, and when it is different.
@end itemize

@node BFD relocation codes
@subsection BFD relocation codes

BFD has another way of describing relocations besides the howto
structures described above: the enum @samp{bfd_reloc_code_real_type}.

Every known relocation type can be described as a value in this
enumeration. The enumeration contains many target specific relocations,
but where two or more targets have the same relocation, a single code is
used. For example, the single value @samp{BFD_RELOC_32} is used for all
simple 32 bit relocation types.

The main purpose of this relocation code is to give the assembler some
mechanism to create @samp{arelent} structures. In order for the
assembler to create an @samp{arelent} structure, it has to be able to
obtain a howto structure. The function @samp{bfd_reloc_type_lookup},
which simply calls the target vector entry point
@samp{reloc_type_lookup}, takes a relocation code and returns a howto
structure.

The function @samp{bfd_get_reloc_code_name} returns the name of a
relocation code. This is mainly used in error messages.

Using both howto structures and relocation codes can be somewhat
confusing. There are many processor specific relocation codes.
However, the relocation is only fully defined by the howto structure.
The same relocation code will map to different howto structures in
different object file formats. For example, the addend handling may be
different.

Most of the relocation codes are not really general.
The assembler
cannot use them without already understanding what sorts of relocations can
be used for a particular target. It might be possible to replace the
relocation codes with something simpler.

@node BFD relocation future
@subsection BFD relocation future

Clearly the current BFD relocation support is in bad shape. A
wholesale rewrite would be very difficult, because it would require
thorough testing of every BFD target. So some sort of incremental
change is required.

My vague thoughts on this would involve defining a new, clearly defined,
howto structure. Some mechanism would be used to determine which type
of howto structure was being used by a particular format.

The new howto structure would clearly define the relocation behaviour in
the case of an assembly, a relocatable link, and a final link. At
least one special function would be defined as an escape, and it might
make sense to define more.

One or more generic functions similar to @samp{bfd_perform_relocation}
would be written to handle the new howto structure.

This should make it possible to write a generic version of the relocate
section functions used by the new linker. The target specific code
would provide some mechanism (a function pointer or an initial
conversion) to convert target specific relocations into howto
structures.

Ideally it would be possible to use this generic relocate section
function for the generic linker as well. That is, it would replace the
@samp{bfd_generic_get_relocated_section_contents} function which is
currently normally used.

For the special case of ELF dynamic linking, more consideration needs to
be given to writing ELF specific but ELF target generic code to handle
special relocation types such as GOT and PLT.
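
As one possible reading of the function pointer mechanism mentioned
above, the following sketch in C shows generic code that sees only howto
structures, with the target supplying a conversion hook. Nothing here
exists in BFD: @samp{sketch_howto}, @samp{sketch_type_to_howto_fn}, the
invented target with two relocation types, and
@samp{sketch_generic_patched_bytes} are all hypothetical names chosen
for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much simplified howto; not BFD's reloc_howto_struct.  */
struct sketch_howto
{
  const char *name;
  unsigned int size_bytes;  /* Width of the field the reloc patches.  */
};

/* Hypothetical per-target hook: map a raw, target specific relocation
   type number to a howto, or return NULL if the type is unknown.  */
typedef const struct sketch_howto *(*sketch_type_to_howto_fn)
  (unsigned int raw_type);

/* An invented target with two relocation types.  */
static const struct sketch_howto sketch_target_howtos[] =
{
  { "SKETCH_NONE", 0 },
  { "SKETCH_32", 4 },
};

const struct sketch_howto *
sketch_target_type_to_howto (unsigned int raw_type)
{
  size_t count = (sizeof sketch_target_howtos
                  / sizeof sketch_target_howtos[0]);
  return raw_type < count ? &sketch_target_howtos[raw_type] : NULL;
}

/* Generic code: it never interprets RAW_TYPES itself, only the howtos
   returned by TO_HOWTO.  Return the total number of bytes that the
   COUNT relocations would patch, or -1 if the hook rejects any type.  */
long
sketch_generic_patched_bytes (const unsigned int *raw_types, size_t count,
                              sketch_type_to_howto_fn to_howto)
{
  assert (to_howto != NULL);

  long total = 0;
  for (size_t i = 0; i < count; i++)
    {
      const struct sketch_howto *howto = to_howto (raw_types[i]);
      if (howto == NULL)
        return -1;
      total += (long) howto->size_bytes;
    }
  return total;
}
```

A real generic relocate section function would patch section contents
rather than count bytes; the point of the sketch is only the division of
labour between the target hook and the generic loop.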

@node BFD ELF support
@section BFD ELF support
@cindex elf support in bfd
@cindex bfd elf support

The ELF object file format is defined in two parts: a generic ABI and a
processor specific supplement. The ELF support in BFD is split in a
similar fashion. The processor specific support is largely kept within
a single file. The generic support is provided by several other files.
The processor specific support provides a set of function pointers and
constants used by the generic support.

@menu
* BFD ELF sections and segments::       ELF sections and segments
* BFD ELF generic support::             BFD ELF generic support
* BFD ELF processor specific support::  BFD ELF processor specific support
* BFD ELF core files::                  BFD ELF core files
* BFD ELF future::                      BFD ELF future
@end menu

@node BFD ELF sections and segments
@subsection ELF sections and segments

The ELF ABI permits a file to have either sections or segments or both.
Relocatable object files conventionally have only sections.
Executables conventionally have both. Core files conventionally have
only program segments.

ELF sections are similar to sections in other object file formats: they
have a name, a VMA, file contents, flags, and other miscellaneous
information. ELF relocations are stored in sections of a particular
type; BFD automatically converts these sections into internal relocation
information.

ELF program segments are intended for fast interpretation by a system
loader. They have a type, a VMA, an LMA, file contents, and a couple of
other fields. When an ELF executable is run on a Unix system, the
system loader will examine the program segments to decide how to load
it. The loader will ignore the section information. Loadable program
segments (type @samp{PT_LOAD}) are directly loaded into memory.
Other
program segments are interpreted by the loader, and generally provide
dynamic linking information.

When an ELF file has both program segments and sections, an ELF program
segment may encompass one or more ELF sections, in the sense that the
portion of the file which corresponds to the program segment may include
the portions of the file corresponding to one or more sections. When
there is more than one section in a loadable program segment, the
relative positions of the section contents in the file must correspond
to the relative positions they should hold when the program segment is
loaded. This requirement should be obvious if you consider that the
system loader will load an entire program segment at a time.

On a system which supports dynamic paging, such as any native Unix
system, the contents of a loadable program segment must be at the same
offset in the file as in memory, modulo the memory page size used on the
system. This is because the system loader will map the file into memory
starting at the start of a page. The system loader can easily remap
entire pages to the correct load address. However, if the contents of
the file were not correctly aligned within the page, the system loader
would have to shift the contents around within the page, which is too
expensive. For example, if the LMA of a loadable program segment is
@samp{0x40080} and the page size is @samp{0x1000}, then the position of
the segment contents within the file must equal @samp{0x80} modulo
@samp{0x1000}.

BFD has only a single set of sections. It does not provide any generic
way to examine both sections and segments. When BFD is used to open an
object file or executable, the BFD sections will represent ELF sections.
When BFD is used to open a core file, the BFD sections will represent
ELF program segments.
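
The page alignment rule for loadable segments described above can be
checked mechanically. The following is a minimal sketch in C, not code
taken from BFD; the function name @samp{segment_offset_is_congruent} and
its parameters are hypothetical, and the sketch assumes that the page
size is a nonzero power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of BFD.  Return true if FILE_OFFSET and
   LMA are equal modulo PAGE_SIZE, which is what a loader that maps whole
   pages requires.  PAGE_SIZE is assumed to be a nonzero power of two.  */
bool
segment_offset_is_congruent (uint64_t file_offset, uint64_t lma,
                             uint64_t page_size)
{
  assert (page_size != 0 && (page_size & (page_size - 1)) == 0);

  /* For a power of two, masking with PAGE_SIZE - 1 is the same as
     taking the remainder modulo PAGE_SIZE.  */
  uint64_t mask = page_size - 1;
  return (file_offset & mask) == (lma & mask);
}
```

With the numbers from the example above, an LMA of @samp{0x40080} and a
page size of @samp{0x1000}, file offsets such as @samp{0x80} or
@samp{0x2080} pass the check, while @samp{0x100} does not.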

When BFD is used to examine an object file or executable, any program
segments will be read to set the LMA of the sections. This is because
ELF sections only have a VMA, while ELF program segments have both a VMA
and an LMA. Any program segments will be copied by the
@samp{copy_private} entry points. They will be printed by the
@samp{print_private} entry point. Otherwise, the program segments are
ignored. In particular, programs which use BFD currently have no direct
access to the program segments.

When BFD is used to create an executable, the program segments will be
created automatically based on the section information. This is done in
the function @samp{assign_file_positions_for_segments} in @file{elf.c}.
This function has been tweaked many times, and probably still has
problems that arise in particular cases.

There is a hook which may be used to explicitly define the program
segments when creating an executable: the @samp{bfd_record_phdr}
function in @file{bfd.c}. If this function is called, BFD will not
create program segments itself, but will only create the program
segments specified by the caller. The linker uses this function to
implement the @samp{PHDRS} linker script command.

@node BFD ELF generic support
@subsection BFD ELF generic support

In general, functions which do not read external data from the ELF file
are found in @file{elf.c}. They operate on the internal forms of the
ELF structures, which are defined in @file{include/elf/internal.h}. The
internal structures are defined in terms of @samp{bfd_vma}, and so may
be used for both 32 bit and 64 bit ELF targets.

The file @file{elfcode.h} contains functions which operate on the
external data.
@file{elfcode.h} is compiled twice, once via
@file{elf32.c} with @samp{ARCH_SIZE} defined as @samp{32}, and once via
@file{elf64.c} with @samp{ARCH_SIZE} defined as @samp{64}.
@file{elfcode.h} includes functions to swap the ELF structures in and
out of external form, as well as a few more complex functions.

Linker support is found in @file{elflink.c}. The
linker support is only used if the processor specific file defines
@samp{elf_backend_relocate_section}, which is required to relocate the
section contents. If that macro is not defined, the generic linker code
is used, and relocations are handled via @samp{bfd_perform_relocation}.

The core file support is in @file{elfcore.h}, which is compiled twice,
for both 32 and 64 bit support. The more interesting cases of core file
support only work on a native system which has the @file{sys/procfs.h}
header file. Without that file, the core file support does little more
than read the ELF program segments as BFD sections.

The BFD internal header file @file{elf-bfd.h} is used for communication
among these files and the processor specific files.

The default entries for the BFD ELF target vector are found mainly in
@file{elf.c}. Some functions are found in @file{elfcode.h}.

The processor specific files may override particular entries in the
target vector, but most do not, with one exception: the
@samp{bfd_reloc_type_lookup} entry point is always processor specific.

@node BFD ELF processor specific support
@subsection BFD ELF processor specific support

By convention, the processor specific support for a particular processor
will be found in @file{elf@var{nn}-@var{cpu}.c}, where @var{nn} is
either 32 or 64, and @var{cpu} is the name of the processor.

@menu
* BFD ELF processor required:: Required processor specific support
* BFD ELF processor linker:: Processor specific linker support
* BFD ELF processor other:: Other processor specific support options
@end menu

@node BFD ELF processor required
@subsubsection Required processor specific support

When writing a @file{elf@var{nn}-@var{cpu}.c} file, you must do the
following:

@itemize @bullet
@item
Define either @samp{TARGET_BIG_SYM} or @samp{TARGET_LITTLE_SYM}, or
both, to a unique C name to use for the target vector. This name should
appear in the list of target vectors in @file{targets.c}, and will also
have to appear in @file{config.bfd} and @file{configure.ac}. Define
@samp{TARGET_BIG_SYM} for a big-endian processor,
@samp{TARGET_LITTLE_SYM} for a little-endian processor, and define both
for a bi-endian processor.
@item
Define either @samp{TARGET_BIG_NAME} or @samp{TARGET_LITTLE_NAME}, or
both, to a string used as the name of the target vector. This is the
name which a user of the BFD tool would use to specify the object file
format. It would normally appear in a linker emulation parameters
file.
@item
Define @samp{ELF_ARCH} to the BFD architecture (an element of the
@samp{bfd_architecture} enum, typically @samp{bfd_arch_@var{cpu}}).
@item
Define @samp{ELF_MACHINE_CODE} to the magic number which should appear
in the @samp{e_machine} field of the ELF header. As of this writing,
these magic numbers are assigned by Caldera; if you want to get a magic
number for a particular processor, try sending a note to
@email{registry@@caldera.com}. In the BFD sources, the magic numbers are
found in @file{include/elf/common.h}; they have names beginning with
@samp{EM_}.
@item
Define @samp{ELF_MAXPAGESIZE} to the maximum size of a virtual page in
memory.
This can normally be found at the start of chapter 5 in the
processor specific supplement. For a processor which will only be used
in an embedded system, or which has no memory management hardware, this
can simply be @samp{1}.
@item
If the format should use @samp{Rel} rather than @samp{Rela} relocations,
define @samp{USE_REL}. This is normally defined in chapter 4 of the
processor specific supplement.

In the absence of a supplement, it's easier to work with @samp{Rela}
relocations. @samp{Rela} relocations will require more space in object
files (but not in executables, except when using dynamic linking).
However, this is outweighed by the simplicity of addend handling when
using @samp{Rela} relocations. With @samp{Rel} relocations, the addend
must be stored in the section contents, which makes relocatable links
more complex.

For example, consider C code like @code{i = a[1000];} where @samp{a} is
a global array. The instructions which load the value of @samp{a[1000]}
will most likely use a relocation which refers to the symbol
representing @samp{a}, with an addend that gives the offset from the
start of @samp{a} to element @samp{1000}. When using @samp{Rel}
relocations, that addend must be stored in the instructions themselves.
If you are adding support for a RISC chip which uses two or more
instructions to load an address, then the addend may not fit in a single
instruction, and will have to be somehow split among the instructions.
This makes linking awkward, particularly when doing a relocatable link
in which the addend may have to be updated. It can be done---the MIPS
ELF support does it---but it should be avoided when possible.

It is possible, though somewhat awkward, to support both @samp{Rel} and
@samp{Rela} relocations for a single target; @file{elf64-mips.c} does it
by overriding the relocation reading and writing routines.
@item
Define howto structures for all the relocation types.
@item
Define a @samp{bfd_reloc_type_lookup} routine. This must be named
@samp{bfd_elf@var{nn}_bfd_reloc_type_lookup}, and may be either a
function or a macro. It must translate a BFD relocation code into a
howto structure. This is normally a table lookup or a simple switch.
@item
If using @samp{Rel} relocations, define @samp{elf_info_to_howto_rel}.
If using @samp{Rela} relocations, define @samp{elf_info_to_howto}.
Either way, this is a macro defined as the name of a function which
takes an @samp{arelent} and a @samp{Rel} or @samp{Rela} structure, and
sets the @samp{howto} field of the @samp{arelent} based on the
@samp{Rel} or @samp{Rela} structure. This normally uses
@samp{ELF@var{nn}_R_TYPE} to get the ELF relocation type and uses it as
an index into a table of howto structures.
@end itemize

You must also add the magic number for this processor to the
@samp{prep_headers} function in @file{elf.c}.

You must also create a header file in the @file{include/elf} directory
called @file{@var{cpu}.h}. This file should define any target specific
information which may be needed outside of the BFD code. In particular
it should use the @samp{START_RELOC_NUMBERS}, @samp{RELOC_NUMBER},
@samp{FAKE_RELOC}, @samp{EMPTY_RELOC} and @samp{END_RELOC_NUMBERS}
macros to create a table mapping the number used to identify a
relocation to a name describing that relocation.

While not a BFD component, you probably also want to make the binutils
program @samp{readelf} parse your ELF objects. For this, you need to add
code for @code{EM_@var{cpu}} as appropriate in @file{binutils/readelf.c}.

@node BFD ELF processor linker
@subsubsection Processor specific linker support

The linker will be much more efficient if you define a relocate section
function.
This will permit BFD to use the ELF specific linker support.

If you do not define a relocate section function, BFD must use the
generic linker support, which requires converting all symbols and
relocations into BFD @samp{asymbol} and @samp{arelent} structures. In
this case, relocations will be handled by calling
@samp{bfd_perform_relocation}, which will use the howto structures you
have defined. @xref{BFD relocation handling}.

In order to support linking into a different object file format, such as
S-records, @samp{bfd_perform_relocation} must work correctly with your
howto structures, so you can't skip that step. However, if you define
the relocate section function, then in the normal case of linking into
an ELF file the linker will not need to convert symbols and relocations,
and will be much more efficient.

To use a relocate section function, define the macro
@samp{elf_backend_relocate_section} as the name of a function which will
take the contents of a section, as well as relocation, symbol, and other
information, and modify the section contents according to the relocation
information. In simple cases, this is little more than a loop over the
relocations which computes the value of each relocation and calls
@samp{_bfd_final_link_relocate}. The function must check for a
relocatable link, and in that case normally needs to do nothing other
than adjust the addend for relocations against a section symbol.

The complex cases generally have to do with dynamic linker support. GOT
and PLT relocations must be handled specially, and the linker normally
arranges to set up the GOT and PLT sections while handling relocations.
When generating a shared library, random relocations must normally be
copied into the shared library, or converted to RELATIVE relocations
when possible.
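
The overall shape of a simple relocate section function can be
illustrated with a toy model. The sketch below does not use the BFD
API: @samp{struct toy_reloc} and @samp{toy_relocate_section} are
hypothetical stand-ins, a plain table of symbol values replaces the
symbol lookup, and a single 32 bit absolute store in host byte order
replaces the call to @samp{_bfd_final_link_relocate}. It only shows the
loop over the relocations and the early check for a relocatable link.

@example
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for an ELF Rela entry; hypothetical, not BFD.  */
struct toy_reloc
@{
  size_t offset;        /* Where in the section contents to patch.  */
  unsigned int symndx;  /* Index into the table of symbol values.  */
  int32_t addend;       /* Rela style addend.  */
@};

/* Store SYMBOL + ADDEND as a 32 bit word at each relocation site.
   Return 0 on success, or -1 if a relocation does not fit inside
   the section.  For a relocatable link do nothing; this toy has no
   section symbols whose addends would need adjusting.  */
static int
toy_relocate_section (uint8_t *contents, size_t size,
                      const struct toy_reloc *relocs, size_t nrelocs,
                      const uint32_t *symvals, int relocatable)
@{
  size_t i;

  if (relocatable)
    return 0;

  for (i = 0; i < nrelocs; i++)
    @{
      uint32_t value = symvals[relocs[i].symndx]
                       + (uint32_t) relocs[i].addend;

      if (relocs[i].offset > size
          || size - relocs[i].offset < sizeof value)
        return -1;
      memcpy (contents + relocs[i].offset, &value, sizeof value);
    @}
  return 0;
@}
@end example

A real relocate section function receives considerably more
information and must dispatch on the relocation type via the howto
structures; the comments in @file{elf-bfd.h} describe the actual hook.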

@node BFD ELF processor other
@subsubsection Other processor specific support options

There are many other macros which may be defined in
@file{elf@var{nn}-@var{cpu}.c}. These macros may be found in
@file{elfxx-target.h}.

Macros may be used to override some of the generic ELF target vector
functions.

There are several processor specific hook functions which may be defined
as macros. These functions are found as function pointers in the
@samp{elf_backend_data} structure defined in @file{elf-bfd.h}. In
general, a hook function is set by defining a macro
@samp{elf_backend_@var{name}}.

There are a few processor specific constants which may also be defined.
These are again found in the @samp{elf_backend_data} structure.

I will not define the various functions and constants here; see the
comments in @file{elf-bfd.h}.

Normally any odd characteristic of a particular ELF processor is handled
via a hook function. For example, the special @samp{SHN_MIPS_SCOMMON}
section number found in MIPS ELF is handled via the hooks
@samp{section_from_bfd_section}, @samp{symbol_processing},
@samp{add_symbol_hook}, and @samp{output_symbol_hook}.

Dynamic linking support, which involves processor specific relocations
requiring special handling, is also implemented via hook functions.

@node BFD ELF core files
@subsection BFD ELF core files
@cindex elf core files

On native ELF Unix systems, core files are generated without any
sections. Instead, they only have program segments.

When BFD is used to read an ELF core file, the BFD sections will
actually represent program segments. Since ELF program segments do not
have names, BFD will invent names like @samp{segment@var{n}} where
@var{n} is a number.

A single ELF program segment may include both an initialized part and an
uninitialized part.
The size of the initialized part is given by the
@samp{p_filesz} field. The total size of the segment is given by the
@samp{p_memsz} field. If @samp{p_memsz} is larger than @samp{p_filesz},
then the extra space is uninitialized, or, more precisely, initialized
to zero.

BFD will represent such a program segment as two different sections.
The first, named @samp{segment@var{n}a}, will represent the initialized
part of the program segment. The second, named @samp{segment@var{n}b},
will represent the uninitialized part.

ELF core files store special information such as register values in
program segments with the type @samp{PT_NOTE}. BFD will attempt to
interpret the information in these segments, and will create additional
sections holding the information. Some of this interpretation requires
information found in the host header file @file{sys/procfs.h}, and so
will only work when BFD is built on a native system.

BFD does not currently provide any way to create an ELF core file. In
general, BFD does not provide a way to create core files. The way to
implement this would be to write @samp{bfd_set_format} and
@samp{bfd_write_contents} routines for the @samp{bfd_core} type; see
@ref{BFD target vector format}.

@node BFD ELF future
@subsection BFD ELF future

The current dynamic linking support has too much code duplication.
While each processor has particular differences, much of the dynamic
linking support is quite similar for each processor. The GOT and PLT
are handled in fairly similar ways, the details of -Bsymbolic linking
are generally similar, etc. This code should be reworked to use more
generic functions, eliminating the duplication.

Similarly, the relocation handling has too much duplication. Many of
the @samp{reloc_type_lookup} and @samp{info_to_howto} functions are
quite similar.
The relocate section functions are also often quite
similar, both in the standard linker handling and the dynamic linker
handling. Many of the COFF processor specific backends share a single
relocate section function (@samp{_bfd_coff_generic_relocate_section}),
and it should be possible to do something like this for the ELF targets
as well.

The appearance of the processor specific magic number in
@samp{prep_headers} in @file{elf.c} is somewhat bogus. It should be
possible to add support for a new processor without changing the generic
support.

The processor function hooks and constants are ad hoc and need better
documentation.

@node BFD glossary
@section BFD glossary
@cindex glossary for bfd
@cindex bfd glossary

This is a short glossary of some BFD terms.

@table @asis
@item a.out
The a.out object file format. The original Unix object file format.
Still used on SunOS, though not Solaris. Supports only three sections.

@item archive
A collection of object files produced and manipulated by the @samp{ar}
program.

@item backend
The implementation within BFD of a particular object file format. The
set of functions which appear in a particular target vector.

@item BFD
The BFD library itself. Also, each object file, archive, or executable
opened by the BFD library has the type @samp{bfd *}, and is sometimes
referred to as a bfd.

@item COFF
The Common Object File Format. Used on Unix SVR3. Used by some
embedded targets, although ELF is normally better.

@item DLL
A shared library on Windows.

@item dynamic linker
When a program linked against a shared library is run, the dynamic
linker will locate the appropriate shared library and arrange to somehow
include it in the running image.

@item dynamic object
Another name for an ELF shared library.

@item ECOFF
The Extended Common Object File Format. Used on Alpha Digital Unix
(formerly OSF/1), as well as Ultrix and Irix 4. A variant of COFF.

@item ELF
The Executable and Linking Format. The object file format used on most
modern Unix systems, including GNU/Linux, Solaris, Irix, and SVR4. Also
used on many embedded systems.

@item executable
A program, with instructions and symbols, and perhaps dynamic linking
information. Normally produced by a linker.

@item LMA
Load Memory Address. This is the address at which a section will be
loaded. Compare with VMA, below.

@item object file
A binary file including machine instructions, symbols, and relocation
information. Normally produced by an assembler.

@item object file format
The format of an object file. Typically object files and executables
for a particular system are in the same format, although executables
will not contain any relocation information.

@item PE
The Portable Executable format. This is the object file format used for
Windows (specifically, Win32) object files. It is based closely on
COFF, but has a few significant differences.

@item PEI
The Portable Executable Image format. This is the object file format
used for Windows (specifically, Win32) executables. It is very similar
to PE, but includes some additional header information.

@item relocations
Information used by the linker to adjust section contents. Also called
relocs.

@item section
Object files and executables are composed of sections. Sections have
optional data and optional relocation information.

@item shared library
A library of functions which may be used by many executables without
actually being linked into each executable. There are several different
implementations of shared libraries, each having slightly different
features.

@item symbol
Each object file and executable may have a list of symbols, often
referred to as the symbol table. A symbol is basically a name and an
address. There may also be some additional information like the type of
symbol, although the type of a symbol is normally something simple like
function or object, and should not be confused with the more complex C
notion of type. Typically every global function and variable in a C
program will have an associated symbol.

@item target vector
A set of functions which implement support for a particular object file
format. The @samp{bfd_target} structure.

@item Win32
The current Windows API, implemented by Windows 95 and later and Windows
NT 3.51 and later, but not by Windows 3.1.

@item XCOFF
The eXtended Common Object File Format. Used on AIX. A variant of
COFF, with a completely different symbol table implementation.

@item VMA
Virtual Memory Address. This is the address a section will have when
an executable is run. Compare with LMA, above.
@end table

@node Index
@unnumberedsec Index
@printindex cp

@contents
@bye