.\"	$NetBSD: magic.5,v 1.7 2012/02/22 17:53:50 christos Exp $
.\"
.\" $File: magic.man,v 1.71 2011/12/07 11:58:24 rrt Exp $
.Dd April 20, 2011
.Dt MAGIC 5
.Os
.\" install as magic.4 on USG, magic.5 on V7, Berkeley and Linux systems.
.Sh NAME
.Nm magic
.Nd file command's magic pattern file
.Sh DESCRIPTION
This manual page documents the format of the magic file as
used by the
.Xr file 1
command, version 5.11.
The
.Xr file 1
command identifies the type of a file using,
among other tests,
a test for whether the file contains certain
.Dq "magic patterns" .
The file
.Pa /usr/share/misc/magic
specifies what patterns are to be tested for, what message or
MIME type to print if a particular pattern is found,
and additional information to extract from the file.
.Pp
Each line of the file specifies a test to be performed.
A test compares the data starting at a particular offset
in the file with a byte value, a string or a numeric value.
If the test succeeds, a message is printed.
The line consists of the following fields:
.Bl -tag -width ".Dv message"
.It Dv offset
A number specifying the offset, in bytes, into the file of the data
which is to be tested.
.It Dv type
The type of the data to be tested.
The possible values are:
.Bl -tag -width ".Dv lestring16"
.It Dv byte
A one-byte value.
.It Dv short
A two-byte value in this machine's native byte order.
.It Dv long
A four-byte value in this machine's native byte order.
.It Dv quad
An eight-byte value in this machine's native byte order.
.It Dv float
A 32-bit single precision IEEE floating point number in this machine's native byte order.
.It Dv double
A 64-bit double precision IEEE floating point number in this machine's native byte order.
.It Dv string
A string of bytes.
The string type specification can be optionally followed
by /[WwcCtb]*.
The
.Dq W
flag compacts whitespace in the target, which must
contain at least one whitespace character.
If the magic has
.Dv n
consecutive blanks, the target needs at least
.Dv n
consecutive blanks to match.
The
.Dq w
flag treats every blank in the magic as an optional blank.
The
.Dq c
flag specifies case insensitive matching: lower case
characters in the magic match both lower and upper case characters in the
target, whereas upper case characters in the magic only match upper case
characters in the target.
The
.Dq C
flag specifies case insensitive matching: upper case
characters in the magic match both lower and upper case characters in the
target, whereas lower case characters in the magic only match lower case
characters in the target.
To do a complete case insensitive match, specify both
.Dq c
and
.Dq C .
The
.Dq t
flag forces the test to be done for text files, while the
.Dq b
flag forces the test to be done for binary files.
.It Dv pstring
A Pascal-style string where the first byte/short/int is interpreted as an
unsigned length.
The length defaults to byte and can be specified as a modifier.
The following modifiers are supported:
.Bl -tag -compact -width B
.It B
A byte length (default).
.It H
A 2 byte big endian length.
.It h
A 2 byte little endian length.
.It L
A 4 byte big endian length.
.It l
A 4 byte little endian length.
.It J
The length includes itself in its count.
.El
The string is not NUL terminated.
.Dq J
is used rather than the more
obvious
.Dq I
because this type of length is a feature of the JPEG
format.
.It Dv date
A four-byte value interpreted as a UNIX date.
.It Dv qdate
An eight-byte value interpreted as a UNIX date.
.It Dv ldate
A four-byte value interpreted as a UNIX-style date, but interpreted as
local time rather than UTC.
.It Dv qldate
An eight-byte value interpreted as a UNIX-style date, but interpreted as
local time rather than UTC.
.It Dv beid3
A 32-bit ID3 length in big-endian byte order.
.It Dv beshort
A two-byte value in big-endian byte order.
.It Dv belong
A four-byte value in big-endian byte order.
.It Dv bequad
An eight-byte value in big-endian byte order.
.It Dv befloat
A 32-bit single precision IEEE floating point number in big-endian byte order.
.It Dv bedouble
A 64-bit double precision IEEE floating point number in big-endian byte order.
.It Dv bedate
A four-byte value in big-endian byte order,
interpreted as a UNIX date.
.It Dv beqdate
An eight-byte value in big-endian byte order,
interpreted as a UNIX date.
.It Dv beldate
A four-byte value in big-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv beqldate
An eight-byte value in big-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv bestring16
A two-byte Unicode (UCS16) string in big-endian byte order.
.It Dv leid3
A 32-bit ID3 length in little-endian byte order.
.It Dv leshort
A two-byte value in little-endian byte order.
.It Dv lelong
A four-byte value in little-endian byte order.
.It Dv lequad
An eight-byte value in little-endian byte order.
.It Dv lefloat
A 32-bit single precision IEEE floating point number in little-endian byte order.
.It Dv ledouble
A 64-bit double precision IEEE floating point number in little-endian byte order.
.It Dv ledate
A four-byte value in little-endian byte order,
interpreted as a UNIX date.
.It Dv leqdate
An eight-byte value in little-endian byte order,
interpreted as a UNIX date.
.It Dv leldate
A four-byte value in little-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv leqldate
An eight-byte value in little-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv lestring16
A two-byte Unicode (UCS16) string in little-endian byte order.
.It Dv melong
A four-byte value in middle-endian (PDP-11) byte order.
.It Dv medate
A four-byte value in middle-endian (PDP-11) byte order,
interpreted as a UNIX date.
.It Dv meldate
A four-byte value in middle-endian (PDP-11) byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv indirect
Starting at the given offset, consult the magic database again.
.It Dv regex
A regular expression match in extended POSIX regular expression syntax
(like egrep).
Regular expressions can take exponential time to process, and their
performance is hard to predict, so their use is discouraged.
When used in production environments, their performance
should be carefully checked.
The type specification can be optionally followed by
.Dv /[c][s] .
The
.Dq c
flag makes the match case insensitive, while the
.Dq s
flag updates the offset to the start offset of the match, rather than the end.
The regular expression is tested against line
.Dv N + 1
onwards, where
.Dv N
is the given offset.
Line endings are assumed to be in the machine's native format.
.Dv ^
and
.Dv $
match the beginning and end of individual lines, respectively,
not beginning and end of file.
.It Dv search
A literal string search starting at the given offset.
The same modifier flags can be used as for string patterns.
The modifier flags (if any) must be followed by
.Dv /number ,
the range, that is, the number of positions at which the match will be
attempted, starting from the start offset.
This is suitable for
searching larger binary expressions with variable offsets, using
.Dv \e
escapes for special characters.
The offset works as for regex.
.It Dv default
This is intended to be used with the test
.Em x
(which is always true) and a message that is to be used if there are
no other matches.
.El
.Pp
Each top-level magic pattern (see below for an explanation of levels)
is classified as text or binary according to the types used.
Types
.Dq regex
and
.Dq search
are classified as text tests, unless non-printable characters are used
in the pattern.
All other tests are classified as binary.
A top-level
pattern is considered to be a text test when all its patterns are text
patterns; otherwise, it is considered to be a binary pattern.
When
matching a file, binary patterns are tried first; if no match is
found, and the file looks like text, then its encoding is determined
and the text patterns are tried.
.Pp
The numeric types may optionally be followed by
.Dv \*[Am]
and a numeric value,
to specify that the value is to be AND'ed with the
numeric value before any comparisons are done.
Prepending a
.Dv u
to the type indicates that ordered comparisons should be unsigned.
.It Dv test
The value to be compared with the value from the file.
If the type is
numeric, this value
is specified in C form; if it is a string, it is specified as a C string
with the usual escapes permitted (e.g. \en for new-line).
.Pp
Numeric values
may be preceded by a character indicating the operation to be performed.
It may be
.Dv = ,
to specify that the value from the file must equal the specified value,
.Dv \*[Lt] ,
to specify that the value from the file must be less than the specified
value,
.Dv \*[Gt] ,
to specify that the value from the file must be greater than the specified
value,
.Dv \*[Am] ,
to specify that the value from the file must have set all of the bits
that are set in the specified value,
.Dv ^ ,
to specify that the value from the file must have clear any of the bits
that are set in the specified value,
.Dv ~ ,
to specify that the value specified after it is negated before the test, or
.Dv x ,
to specify that any value will match.
If the character is omitted, it is assumed to be
.Dv = .
Operators
.Dv \*[Am] ,
.Dv ^ ,
and
.Dv ~
do not work with floats and doubles.
The operator
.Dv !\&
specifies that the line matches if the test does
.Em not
succeed.
.Pp
Numeric values are specified in C form; e.g.
.Dv 13
is decimal,
.Dv 013
is octal, and
.Dv 0x13
is hexadecimal.
.Pp
For string values, the string from the
file must match the specified string.
The operators
.Dv = ,
.Dv \*[Lt]
and
.Dv \*[Gt]
(but not
.Dv \*[Am] )
can be applied to strings.
The length used for matching is that of the string argument
in the magic file.
This means that a line can match any non-empty string (usually used to
then print the string), with
.Em \*[Gt]\e0
(because all non-empty strings are greater than the empty string).
.Pp
The special test
.Em x
always evaluates to true.
.It Dv message
The message to be printed if the comparison succeeds.
If the string contains a
.Xr printf 3
format specification, the value from the file (with any specified masking
performed) is printed using the message as the format string.
If the string begins with
.Dq \eb ,
the message printed is the remainder of the string with no whitespace
added before it: multiple matches are normally separated by a single
space.
.El
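.Pp
As an illustrative sketch of how the four fields combine, the entries
below describe a hypothetical format; the signature
.Dq HYPO
and all offsets are assumptions made for this example and are not taken
from any real magic file:
.Bd -literal -offset indent
# offset  type           test     message
0         string/c       hypo     Hypothetical data
# unsigned big-endian short, compared numerically
\*[Gt]4        ubeshort       \*[Gt]0x7fff  \eb, extended header
# mask with 0x0f before comparing; print the masked value
\*[Gt]6        byte\*[Am]0x0f      x        \eb, version %d
# Pascal-style string with a 2 byte big endian length
\*[Gt]8        pstring/H      x        \eb, name "%s"
.Ed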
.Pp
A 4+4 character Apple creator and type can be specified as:
.Bd -literal -offset indent
!:apple	CREATYPE
.Ed
.Pp
A MIME type is given on a separate line, which must be the next
line that is neither blank nor a comment after the magic line that
identifies the file type, and has the following format:
.Bd -literal -offset indent
!:mime	MIMETYPE
.Ed
.Pp
i.e. the literal string
.Dq !:mime
followed by the MIME type.
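.Pp
For instance, a sketch pairing a test with its MIME type might read as
follows; the entry is shown only to illustrate the placement of the
.Dq !:mime
line:
.Bd -literal -offset indent
0       string  %PDF-   PDF document
!:mime  application/pdf
.Ed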
.Pp
An optional strength can be supplied on a separate line which refers to
the current magic description using the following format:
.Bd -literal -offset indent
!:strength OP VALUE
.Ed
.Pp
The operator
.Dv OP
can be:
.Dv + ,
.Dv - ,
.Dv * ,
or
.Dv /
and
.Dv VALUE
is a constant between 0 and 255.
This constant is applied using the specified operator
to the currently computed default magic strength.
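.Pp
As a sketch, the following hypothetical entry adds 10 to its computed
default strength; the signature is an assumption made for this example:
.Bd -literal -offset indent
0       string  HYPO    Hypothetical data
!:strength + 10
.Ed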
.Pp
Some file formats contain additional information which is to be printed
along with the file type or need additional tests to determine the true
file type.
These additional tests are introduced by one or more
.Em \*[Gt]
characters preceding the offset.
The number of
.Em \*[Gt]
on the line indicates the level of the test; a line with no
.Em \*[Gt]
at the beginning is considered to be at level 0.
Tests are arranged in a tree-like hierarchy:
if the test on a line at level
.Em n
succeeds, all following tests at level
.Em n+1
are performed, and the messages printed if the tests succeed, until a line
with level
.Em n
(or less) appears.
For more complex files, one can use empty messages to get just the
"if/then" effect, in the following way:
.Bd -literal -offset indent
0      string   MZ
\*[Gt]0x18  leshort  \*[Lt]0x40   MS-DOS executable
\*[Gt]0x18  leshort  \*[Gt]0x3f   extended PC executable (e.g., MS Windows)
.Ed
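.Pp
The level rules can also be sketched with a hypothetical format; the
signature and field layout are assumptions made for this example.
The level 2 line is tried only when the level 1 line directly above it
succeeds:
.Bd -literal -offset indent
0     string  HYPO  Hypothetical data
\*[Gt]4    byte    1     \eb, version 1
\*[Gt]\*[Gt]5   byte    x     \eb, revision %d
\*[Gt]4    byte    2     \eb, version 2
.Ed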
.Pp
Offsets do not need to be constant, but can also be read from the file
being examined.
If the first character following the last
.Em \*[Gt]
is a
.Em \&(
then the string after the parenthesis is interpreted as an indirect offset.
That means that the number after the parenthesis is used as an offset in
the file.
The value at that offset is read, and is used again as an offset
in the file.
Indirect offsets are of the form:
.Em (( x [.[bislBISLm]][+\-][ y ]) .
The value of
.Em x
is used as an offset in the file.
A byte, id3 length, short or long is read at that offset depending on the
.Em [bislBISLm]
type specifier.
The capitalized types interpret the number as a big endian
value, whereas the small letter versions interpret the number as a little
endian value;
the
.Em m
type interprets the number as a middle endian (PDP-11) value.
To that number the value of
.Em y
is added and the result is used as an offset in the file.
The default type if one is not specified is long.
.Pp
That way variable length structures can be examined:
.Bd -literal -offset indent
# MS Windows executables are also valid MS-DOS executables
0           string  MZ
\*[Gt]0x18       leshort \*[Lt]0x40   MZ executable (MS-DOS)
# skip the whole block below if it is not an extended executable
\*[Gt]0x18       leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)  string  PE\e0\e0  PE executable (MS-Windows)
\*[Gt]\*[Gt](0x3c.l)  string  LX\e0\e0  LX executable (OS/2)
.Ed
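.Pp
To make the arithmetic concrete, consider a hypothetical format (an
assumption made for this example) whose header stores a big endian
two-byte value at offset 0x10.
If that value is 0x0020, the second test below is applied at offset
0x20 + 4 = 0x24:
.Bd -literal -offset indent
0             string  HYPO  Hypothetical data
\*[Gt](0x10.S+4)   string  BODY  \eb, with body section
.Ed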
.Pp
This strategy of examining has a drawback: you must make sure that
you eventually print something, or users may get empty output (for example,
when there is neither PE\e0\e0 nor LX\e0\e0 in the above example).
.Pp
If this indirect offset cannot be used directly, simple calculations are
possible: appending
.Em [+-*/%\*[Am]|^]number
inside parentheses allows one to modify
the value read from the file before it is used as an offset:
.Bd -literal -offset indent
# MS Windows executables are also valid MS-DOS executables
0           string  MZ
# sometimes, the value at 0x18 is less than 0x40 but there's still an
# extended executable, simply appended to the file
\*[Gt]0x18       leshort \*[Lt]0x40
\*[Gt]\*[Gt](4.s*512) leshort 0x014c  COFF executable (MS-DOS, DJGPP)
\*[Gt]\*[Gt](4.s*512) leshort !0x014c MZ executable (MS-DOS)
.Ed
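.Pp
As a worked example of such a calculation: if the little endian short at
offset 4 holds the value 3, then (4.s*512) evaluates to 3 * 512 = 1536.
The same idea is sketched below for a hypothetical format (an assumption
made for this example) that stores a position in 16-byte units:
.Bd -literal -offset indent
0            string  HYPO  Hypothetical data
# the short at offset 8 counts 16-byte units; a value of 5 yields offset 80
\*[Gt](8.s*16)    string  DATA  \eb, payload found
.Ed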
.Pp
Sometimes you do not know the exact offset as this depends on the length or
position (when indirection was used before) of preceding fields.
You can specify an offset relative to the end of the last up-level
field using
.Sq \*[Am]
as a prefix to the offset:
.Bd -literal -offset indent
0           string  MZ
\*[Gt]0x18       leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)  string  PE\e0\e0    PE executable (MS-Windows)
# immediately following the PE signature is the CPU type
\*[Gt]\*[Gt]\*[Gt]\*[Am]0       leshort 0x14c     for Intel 80386
\*[Gt]\*[Gt]\*[Gt]\*[Am]0       leshort 0x184     for DEC Alpha
.Ed
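.Pp
As a smaller sketch, again for a hypothetical format: the four-byte
signature below is matched at offset 0 and therefore ends at offset 4, so
.Dq \*[Am]0
refers to offset 4 and
.Dq \*[Am]2
to offset 6:
.Bd -literal -offset indent
0      string  HYPO  Hypothetical data
\*[Gt]\*[Am]0    byte    x     \eb, version %d
\*[Gt]\*[Am]2    leshort x     \eb, %d records
.Ed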
.Pp
Indirect and relative offsets can be combined:
.Bd -literal -offset indent
0             string  MZ
\*[Gt]0x18         leshort \*[Lt]0x40
\*[Gt]\*[Gt](4.s*512)   leshort !0x014c MZ executable (MS-DOS)
# if it's not COFF, go back 512 bytes and add the offset taken
# from byte 2/3, which is yet another way of finding the start
# of the extended executable
\*[Gt]\*[Gt]\*[Gt]\*[Am](2.s-514) string  LE      LE executable (MS Windows VxD driver)
.Ed
.Pp
Or the other way around:
.Bd -literal -offset indent
0                 string  MZ
\*[Gt]0x18             leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)        string  LE\e0\e0  LE executable (MS-Windows)
# at offset 0x80 (-4, since relative offsets start at the end
# of the up-level match) inside the LE header, we find the absolute
# offset to the code area, where we look for a specific signature
\*[Gt]\*[Gt]\*[Gt](\*[Am]0x7c.l+0x26) string  UPX     \eb, UPX compressed
.Ed
.Pp
Or even both!
.Bd -literal -offset indent
0                string  MZ
\*[Gt]0x18            leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)       string  LE\e0\e0 LE executable (MS-Windows)
# at offset 0x58 inside the LE header, we find the relative offset
# to a data area where we look for a specific signature
\*[Gt]\*[Gt]\*[Gt]\*[Am](\*[Am]0x54.l-3)  string  UNACE  \eb, ACE self-extracting archive
.Ed
.Pp
Finally, if you have to deal with offset/length pairs in your file, even the
second value in a parenthesized expression can be taken from the file itself,
using another set of parentheses.
Note that this additional indirect offset is always relative to the
start of the main indirect offset.
.Bd -literal -offset indent
0                 string       MZ
\*[Gt]0x18             leshort      \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)        string       PE\e0\e0 PE executable (MS-Windows)
# search for the PE section called ".idata"...
\*[Gt]\*[Gt]\*[Gt]\*[Am]0xf4          search/0x140 .idata
# ...and go to the end of it, calculated from start+length;
# these are located 14 and 10 bytes after the section name
\*[Gt]\*[Gt]\*[Gt]\*[Gt](\*[Am]0xe.l+(-4)) string       PK\e3\e4 \eb, ZIP self-extracting archive
.Ed
.Sh SEE ALSO
.Xr file 1
\- the command that reads this file.
.Sh BUGS
The formats
.Dv long ,
.Dv belong ,
.Dv lelong ,
.Dv melong ,
.Dv short ,
.Dv beshort ,
.Dv leshort ,
.Dv date ,
.Dv bedate ,
.Dv medate ,
.Dv ledate ,
.Dv beldate ,
.Dv leldate ,
and
.Dv meldate
are system-dependent; perhaps they should be specified as a number
of bytes (2B, 4B, etc),
since the files being recognized typically come from
a system on which the lengths are invariant.
.\"
.\" From: guy@sun.uucp (Guy Harris)
.\" Newsgroups: net.bugs.usg
.\" Subject: /etc/magic's format isn't well documented
.\" Message-ID: <2752@sun.uucp>
.\" Date: 3 Sep 85 08:19:07 GMT
.\" Organization: Sun Microsystems, Inc.
.\" Lines: 136
.\"
.\" Here's a manual page for the format accepted by the "file" made by adding
.\" the changes I posted to the S5R2 version.
.\"
.\" Modified for Ian Darwin's version of the file command.
567ef01931fSBen Gras.\" Modified for Ian Darwin's version of the file command.
568