.\" Copyright (c) 1991 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" This man page is derived from documentation contributed to Berkeley by
.\" Donn Seeley at UUNET Technologies, Inc.
.\"
.\" %sccs.include.redist.roff%
.\"
.\"	@(#)a.out.5	6.3 (Berkeley) 04/29/91
.\"
.Dd
.Dt A.OUT 5
.Os
.Sh NAME
.Nm a.out
.Nd format of executable binary files
.Sh SYNOPSIS
.Fd #include <a.out.h>
.Sh DESCRIPTION
The include file
.Aq Pa a.out.h
declares three structures and several macros.
The structures describe the format of
executable machine code files
.Pq Sq binaries
on the system.
.Pp
A binary file consists of up to 7 sections.
In order, these sections are:
.Bl -tag -width "text relocations"
.It exec header
Contains parameters used by the kernel
to load a binary file into memory and execute it,
and by the link editor
.Xr ld 1
to combine a binary file with other binary files.
This section is the only mandatory one.
.It text segment
Contains machine code and related data
that are loaded into memory when a program executes.
May be loaded read-only.
.It data segment
Contains initialized data; always loaded into writable memory.
.It text relocations
Contains records used by the link editor
to update pointers in the text segment when combining binary files.
.It data relocations
Like the text relocation section, but for data segment pointers.
.It symbol table
Contains records used by the link editor
to cross-reference the addresses of named variables and functions
.Pq Sq symbols
between binary files.
.It string table
Contains the character strings corresponding to the symbol names.
.El
.Pp
Every binary file begins with an
.Fa exec
structure:
.Bd -literal -offset indent
struct exec {
	unsigned short	a_mid;
	unsigned short	a_magic;
	unsigned long	a_text;
	unsigned long	a_data;
	unsigned long	a_bss;
	unsigned long	a_syms;
	unsigned long	a_entry;
	unsigned long	a_trsize;
	unsigned long	a_drsize;
};
.Ed
.Pp
The fields have the following functions:
.Bl -tag -width a_trsize
.It Fa a_mid
Contains a bit pattern that
identifies binaries that were built for
certain sub-classes of an architecture
.Pq Sq machine IDs
or variants of the operating system on a given architecture.
The kernel may not support all machine IDs
on a given architecture.
The
.Fa a_mid
field is not present on some architectures;
in this case, the
.Fa a_magic
field has type
.Em unsigned long .
.It Fa a_magic
Contains a bit pattern
.Pq Sq magic number
that uniquely identifies binary files
and distinguishes different loading conventions.
The field must contain one of the following values:
.Bl -tag -width ZMAGIC
.It Dv OMAGIC
The text and data segments immediately follow the header
and are contiguous.
The kernel loads both text and data segments into writable memory.
.It Dv NMAGIC
As with
.Dv OMAGIC ,
text and data segments immediately follow the header and are contiguous.
However, the kernel loads the text into read-only memory
and loads the data into writable memory at the next
page boundary after the text.
.It Dv ZMAGIC
The kernel loads individual pages on demand from the binary.
The header, text segment and data segment are all
padded by the link editor to a multiple of the page size.
Pages that the kernel loads from the text segment are read-only,
while pages from the data segment are writable.
.El
.It Fa a_text
Contains the size of the text segment in bytes.
.It Fa a_data
Contains the size of the data segment in bytes.
.It Fa a_bss
Contains the number of bytes in the
.Sq bss segment
and is used by the kernel to set the initial break
.Pq Xr brk 2
after the data segment.
The kernel loads the program so that this amount of writable memory
appears to follow the data segment and initially reads as zeroes.
.It Fa a_syms
Contains the size in bytes of the symbol table section.
.It Fa a_entry
Contains the address in memory of the entry point
of the program after the kernel has loaded it;
the kernel starts the execution of the program
from the machine instruction at this address.
.It Fa a_trsize
Contains the size in bytes of the text relocation table.
.It Fa a_drsize
Contains the size in bytes of the data relocation table.
.El
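.Pp
As an illustration of how the header fields above might be inspected,
the following C sketch mirrors the
.Fa exec
structure with fixed-width types and classifies the magic number.
It is a minimal sketch, not the system header:
every name beginning with
.Li sk_
or
.Li SK_
is hypothetical,
the layout assumes a 32-bit target that has the
.Fa a_mid
field,
and the octal magic values are the traditional BSD ones,
which this page does not itself state.

```c
#include <stddef.h>
#include <stdint.h>

/* Traditional BSD magic numbers (octal). Assumed; check the local <a.out.h>. */
#define SK_OMAGIC 0407
#define SK_NMAGIC 0410
#define SK_ZMAGIC 0413

/* Hypothetical fixed-width mirror of struct exec (32-bit target with a_mid). */
struct sk_exec {
	uint16_t a_mid;
	uint16_t a_magic;
	uint32_t a_text;
	uint32_t a_data;
	uint32_t a_bss;
	uint32_t a_syms;
	uint32_t a_entry;
	uint32_t a_trsize;
	uint32_t a_drsize;
};

/* Describe the loading convention, or return NULL for an unrecognized magic. */
static const char *
sk_magic_name(const struct sk_exec *ex)
{
	switch (ex->a_magic) {
	case SK_OMAGIC:
		return "OMAGIC: text and data both writable";
	case SK_NMAGIC:
		return "NMAGIC: read-only text, data at next page boundary";
	case SK_ZMAGIC:
		return "ZMAGIC: demand paged";
	default:
		return NULL;
	}
}

/* Bytes that follow the text in memory: initialized data plus zero-filled bss. */
static uint64_t
sk_writable_size(const struct sk_exec *ex)
{
	return (uint64_t)ex->a_data + ex->a_bss;
}
```

The sketch reads fields in host byte order;
a header written on a machine with the opposite byte order would need its fields swapped first.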
.Pp
The
.Pa a.out.h
include file defines several macros which use an
.Fa exec
structure to test consistency or to locate section offsets in the binary file.
.Bl -tag -width N_BADMAG(exec)
.It Fn N_BADMAG exec
Nonzero if the
.Fa a_magic
field does not contain a recognized value.
.It Fn N_TXTOFF exec
The byte offset in the binary file of the beginning of the text segment.
.It Fn N_SYMOFF exec
The byte offset of the beginning of the symbol table.
.It Fn N_STROFF exec
The byte offset of the beginning of the string table.
.El
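.Pp
The offsets that these macros report follow from the section order given earlier.
The C sketch below recomputes them by hand to make the arithmetic explicit.
It is only an approximation of what the real macros do:
the
.Li sk_
names are hypothetical,
the 32-byte header size and the
.Dv ZMAGIC
value are assumptions,
and the page size is passed in as a parameter
because it differs between architectures.

```c
#include <stdint.h>

#define SK_ZMAGIC   0413 /* traditional value; an assumption */
#define SK_HDR_SIZE 32u  /* 32-bit header with a_mid; an assumption */

/* Only the header fields needed for offset arithmetic (hypothetical type). */
struct sk_sizes {
	uint16_t a_magic;
	uint32_t a_text;
	uint32_t a_data;
	uint32_t a_syms;
	uint32_t a_trsize;
	uint32_t a_drsize;
};

/* Text starts after the header, which ZMAGIC pads to one full page here. */
static uint32_t
sk_txtoff(const struct sk_sizes *s, uint32_t page_size)
{
	return s->a_magic == SK_ZMAGIC ? page_size : SK_HDR_SIZE;
}

/* Data follows text. */
static uint32_t
sk_datoff(const struct sk_sizes *s, uint32_t page_size)
{
	return sk_txtoff(s, page_size) + s->a_text;
}

/* Text relocations follow data. */
static uint32_t
sk_treloff(const struct sk_sizes *s, uint32_t page_size)
{
	return sk_datoff(s, page_size) + s->a_data;
}

/* Data relocations follow text relocations. */
static uint32_t
sk_dreloff(const struct sk_sizes *s, uint32_t page_size)
{
	return sk_treloff(s, page_size) + s->a_trsize;
}

/* The symbol table follows both relocation tables. */
static uint32_t
sk_symoff(const struct sk_sizes *s, uint32_t page_size)
{
	return sk_dreloff(s, page_size) + s->a_drsize;
}

/* The string table follows the symbol table. */
static uint32_t
sk_stroff(const struct sk_sizes *s, uint32_t page_size)
{
	return sk_symoff(s, page_size) + s->a_syms;
}
```

Real programs should prefer the macros from
.Pa a.out.h ,
since the header padding rules vary between systems.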
.Pp
Relocation records have a standard format which
is described by the
.Fa relocation_info
structure:
.Bd -literal -offset indent
struct relocation_info {
	int		r_address;
	unsigned int	r_symbolnum : 24,
			r_pcrel : 1,
			r_length : 2,
			r_extern : 1,
			: 4;
};
.Ed
.Pp
The
.Fa relocation_info
fields are used as follows:
.Bl -tag -width r_symbolnum
.It Fa r_address
Contains the byte offset of a pointer that needs to be link-edited.
Text relocation offsets are reckoned from the start of the text segment,
and data relocation offsets from the start of the data segment.
The link editor adds the value that is already stored at this offset
to the new value that it computes using this relocation record.
.It Fa r_symbolnum
Contains the ordinal number of a symbol structure
in the symbol table (it is
.Em not
a byte offset).
After the link editor resolves the absolute address for this symbol,
it adds that address to the pointer that is undergoing relocation.
(If the
.Fa r_extern
bit is clear, the situation is different; see below.)
.It Fa r_pcrel
If this is set,
the link editor assumes that it is updating a pointer
that is part of a machine code instruction using pc-relative addressing.
The address of the relocated pointer is implicitly added
to its value when the running program uses it.
.It Fa r_length
Contains the log base 2 of the length of the pointer in bytes;
0 for 1-byte displacements, 1 for 2-byte displacements,
2 for 4-byte displacements.
.It Fa r_extern
Set if this relocation requires an external reference;
the link editor must use a symbol address to update the pointer.
When the
.Fa r_extern
bit is clear, the relocation is
.Sq local ;
the link editor updates the pointer to reflect
changes in the load addresses of the various segments,
rather than changes in the value of a symbol.
In this case, the content of the
.Fa r_symbolnum
field is an
.Fa n_type
value (see below);
this type field tells the link editor
what segment the relocated pointer points into.
.El
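.Pp
To make the role of
.Fa r_address
and
.Fa r_length
concrete, the following C sketch adds a caller-supplied adjustment
to the pointer stored at a relocation site.
It is a simplified illustration, not the algorithm used by
.Xr ld 1 :
the
.Li sk_
names are hypothetical,
the record is held in an already decoded form with plain fields
(bit-field layout is compiler-specific),
values are read and written in host byte order,
and the caller is assumed to have worked out the adjustment,
that is, a symbol address for external relocations
or a change in segment load address for local ones.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded relocation record (hypothetical; not the on-disk bit-field layout). */
struct sk_reloc {
	uint32_t r_address;   /* byte offset of the pointer within its segment */
	uint32_t r_symbolnum; /* symbol index, or an n_type value if r_extern is 0 */
	unsigned r_pcrel;     /* nonzero for pc-relative sites */
	unsigned r_length;    /* log2 of the pointer width in bytes: 0, 1 or 2 */
	unsigned r_extern;    /* nonzero if a symbol address is required */
};

/*
 * Add delta to the value already stored at the relocation site.
 * Returns 0 on success, -1 if the width is unsupported or the site
 * does not fit inside the segment buffer.
 */
static int
sk_apply_reloc(unsigned char *seg, size_t seg_len,
    const struct sk_reloc *r, uint32_t delta)
{
	size_t width;
	unsigned char *p;

	if (r->r_length > 2)
		return -1;
	width = (size_t)1 << r->r_length;
	if (r->r_address > seg_len || seg_len - r->r_address < width)
		return -1;
	p = seg + r->r_address;

	switch (r->r_length) {
	case 0: {
		uint8_t v;
		memcpy(&v, p, sizeof v);
		v = (uint8_t)(v + delta);
		memcpy(p, &v, sizeof v);
		break;
	}
	case 1: {
		uint16_t v;
		memcpy(&v, p, sizeof v);
		v = (uint16_t)(v + delta);
		memcpy(p, &v, sizeof v);
		break;
	}
	default: {
		uint32_t v;
		memcpy(&v, p, sizeof v);
		v += delta;
		memcpy(p, &v, sizeof v);
		break;
	}
	}
	return 0;
}
```

The sketch leaves the handling of
.Fa r_pcrel
to the caller, which would fold any pc-relative correction into the adjustment.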
.Pp
Symbols map names to addresses (or more generally, strings to values).
Since the link editor adjusts addresses,
a symbol's name must be used to stand for its address
until an absolute value has been assigned.
Symbols consist of a fixed-length record in the symbol table
and a variable-length name in the string table.
The symbol table is an array of
.Fa nlist
structures:
.Bd -literal -offset indent
struct nlist {
	union {
		char	*n_name;
		long	n_strx;
	} n_un;
	unsigned char	n_type;
	char		n_other;
	short		n_desc;
	unsigned long	n_value;
};
.Ed
.Pp
The fields are used as follows:
.Bl -tag -width n_un.n_strx
.It Fa n_un.n_strx
Contains a byte offset into the string table
for the name of this symbol.
When a program accesses a symbol table with the
.Xr nlist 3
function,
this field is replaced with the
.Fa n_un.n_name
field, which is a pointer to the string in memory.
.It Fa n_type
Used by the link editor to determine
how to update the symbol's value.
The
.Fa n_type
field is broken down into three sub-fields using bitmasks.
The link editor treats symbols with the
.Dv N_EXT
type bit set as
.Sq external
symbols and permits references to them from other binary files.
The
.Dv N_TYPE
mask selects bits of interest to the link editor:
.Bl -tag -width N_TEXT
.It Dv N_UNDF
An undefined symbol.
The link editor must locate an external symbol with the same name
in another binary file to determine the absolute value of this symbol.
As a special case, if the
.Fa n_value
field is nonzero and no binary file in the link-edit defines this symbol,
the link editor will resolve this symbol to an address
in the bss segment,
reserving a number of bytes equal to
.Fa n_value .
If this symbol is undefined in more than one binary file
and the binary files do not agree on the size,
the link editor chooses the greatest size found across all binaries.
.It Dv N_ABS
An absolute symbol.
The link editor does not update an absolute symbol.
.It Dv N_TEXT
A text symbol.
This symbol's value is a text address and
the link editor will update it when it merges binary files.
.It Dv N_DATA
A data symbol; similar to
.Dv N_TEXT
but for data addresses.
The values for text and data symbols are not file offsets but
addresses; to recover the file offsets, it is necessary
to identify the loaded address of the beginning of the corresponding
section and subtract it, then add the offset of the section.
.It Dv N_BSS
A bss symbol; like a text or data symbol, but it
has no corresponding offset in the binary file.
.It Dv N_FN
A filename symbol.
The link editor inserts this symbol before
the other symbols from a binary file when
merging binary files.
The name of the symbol is the filename given to the link editor,
and its value is the first text address from that binary file.
Filename symbols are not needed for link-editing or loading,
but are useful for debuggers.
.El
.Pp
The
.Dv N_STAB
mask selects bits of interest to symbolic debuggers
such as
.Xr gdb 1 ;
the values are described in
.Xr stab 5 .
.It Fa n_other
This field is currently unused.
.It Fa n_desc
Reserved for use by debuggers; passed untouched by the link editor.
Different debuggers use this field for different purposes.
.It Fa n_value
Contains the value of the symbol.
For text, data and bss symbols, this is an address;
for other symbols (such as debugger symbols),
the value may be arbitrary.
.El
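.Pp
The three sub-fields of
.Fa n_type
and the address-to-file-offset rule described above
can be sketched in C as follows.
This is a hedged illustration only:
the
.Li sk_
and
.Li SK_
names are hypothetical,
and the mask and type values are the traditional BSD ones,
which this page does not list and which should be checked against the local headers.

```c
#include <stdint.h>

/* Traditional BSD mask and type values; assumptions, not taken from this page. */
#define SK_N_EXT  0x01 /* external bit */
#define SK_N_TYPE 0x1e /* bits of interest to the link editor */
#define SK_N_STAB 0xe0 /* bits of interest to debuggers */

#define SK_N_UNDF 0x00
#define SK_N_ABS  0x02
#define SK_N_TEXT 0x04
#define SK_N_DATA 0x06
#define SK_N_BSS  0x08

/* Name the class of a symbol from its n_type byte. */
static const char *
sk_type_name(uint8_t n_type)
{
	if (n_type & SK_N_STAB)
		return "debugger";
	switch (n_type & SK_N_TYPE) {
	case SK_N_UNDF:
		return "undefined";
	case SK_N_ABS:
		return "absolute";
	case SK_N_TEXT:
		return "text";
	case SK_N_DATA:
		return "data";
	case SK_N_BSS:
		return "bss";
	default:
		return "other";
	}
}

/* Nonzero if other binary files may refer to the symbol. */
static int
sk_is_external(uint8_t n_type)
{
	return (n_type & SK_N_EXT) != 0;
}

/*
 * File offset of a text or data symbol: subtract the loaded address of
 * the start of its section, then add the section's offset in the file.
 */
static uint32_t
sk_sym_fileoff(uint32_t n_value, uint32_t sect_addr, uint32_t sect_fileoff)
{
	return n_value - sect_addr + sect_fileoff;
}
```

The caller supplies the section's load address and file offset,
since both depend on the magic number and the architecture.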
.Pp
The string table consists of an
.Em unsigned long
length followed by null-terminated symbol strings.
The length represents the size of the entire table in bytes,
so its minimum value (or the offset of the first string)
is always 4 on 32-bit machines.
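.Pp
A symbol name can be looked up from its
.Fa n_un.n_strx
offset roughly as in the following C sketch.
It assumes a 4-byte length word in host byte order, as on the 32-bit machines mentioned above,
and the function name
.Li sk_symname
is hypothetical.
The bounds checks are a design choice for the sketch rather than something this page requires.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the name stored at byte offset strx of a string table, or NULL
 * if the offset is out of range or the string is not terminated.
 * tab points at the start of the table, including its 4-byte length word.
 */
static const char *
sk_symname(const unsigned char *tab, size_t tab_len, uint32_t strx)
{
	uint32_t total;

	if (tab_len < 4)
		return NULL;
	memcpy(&total, tab, sizeof total);
	/* The length counts the whole table, so it can never be below 4. */
	if (total < 4 || total > tab_len)
		return NULL;
	/* Offsets below 4 would point into the length word itself. */
	if (strx < 4 || strx >= total)
		return NULL;
	if (memchr(tab + strx, '\0', total - strx) == NULL)
		return NULL;
	return (const char *)(tab + strx);
}
```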
.Sh SEE ALSO
.Xr ld 1 ,
.Xr execve 2 ,
.Xr nlist 3 ,
.Xr core 5 ,
.Xr dbx 5 ,
.Xr stab 5
.Sh HISTORY
The
.Pa a.out.h
include file appeared in
.At v7 .
.Sh BUGS
Since not all of the supported architectures use the
.Fa a_mid
field,
it can be difficult to determine what
architecture a binary will execute on
without examining its actual machine code.
Even with a machine identifier,
the byte order of the
.Fa exec
header is machine-dependent.
.Pp
Nobody seems to agree on what
.Em bss
stands for.
.Pp
New binary file formats may be supported in the future,
and they probably will not be compatible at any level
with this ancient format.