1*ddb17682Schristos.\"	$NetBSD: magic.5,v 1.25 2023/08/18 19:00:10 christos Exp $
21b108b8bSchristos.\"
3*ddb17682Schristos.\" $File: magic.man,v 1.103 2023/07/20 14:32:07 christos Exp $
.Dd April 18, 2023
51b108b8bSchristos.Dt MAGIC 5
61b108b8bSchristos.Os
71b108b8bSchristos.\" install as magic.4 on USG, magic.5 on V7, Berkeley and Linux systems.
81b108b8bSchristos.Sh NAME
91b108b8bSchristos.Nm magic
101b108b8bSchristos.Nd file command's magic pattern file
111b108b8bSchristos.Sh DESCRIPTION
1274db5203SchristosThis manual page documents the format of magic files as
131b108b8bSchristosused by the
141b108b8bSchristos.Xr file 1
15*ddb17682Schristoscommand, version 5.45.
161b108b8bSchristosThe
171b108b8bSchristos.Xr file 1
181b108b8bSchristoscommand identifies the type of a file using,
191b108b8bSchristosamong other tests,
201b108b8bSchristosa test for whether the file contains certain
211b108b8bSchristos.Dq "magic patterns" .
2274db5203SchristosThe database of these
2374db5203Schristos.Dq "magic patterns"
2474db5203Schristosis usually located in a binary file in
2574db5203Schristos.Pa /usr/share/misc/magic.mgc
2674db5203Schristosor a directory of source text magic pattern fragment files in
2774db5203Schristos.Pa /usr/share/misc/magic .
2874db5203SchristosThe database specifies what patterns are to be tested for, what message or
291b108b8bSchristosMIME type to print if a particular pattern is found,
301b108b8bSchristosand additional information to extract from the file.
311b108b8bSchristos.Pp
3274db5203SchristosThe format of the source fragment files that are used to build this database
3374db5203Schristosis as follows:
3474db5203SchristosEach line of a fragment file specifies a test to be performed.
351b108b8bSchristosA test compares the data starting at a particular offset
361b108b8bSchristosin the file with a byte value, a string or a numeric value.
371b108b8bSchristosIf the test succeeds, a message is printed.
381b108b8bSchristosThe line consists of the following fields:
391b108b8bSchristos.Bl -tag -width ".Dv message"
401b108b8bSchristos.It Dv offset
415efe63deSchristosA number specifying the offset (in bytes) into the file of the data
421b108b8bSchristoswhich is to be tested.
435efe63deSchristosThis offset can be a negative number if it is:
445efe63deSchristos.Bl -bullet  -compact
455efe63deSchristos.It
465efe63deSchristosThe first direct offset of the magic entry (at continuation level 0),
in which case it is interpreted as an offset from the end of the file
485efe63deSchristosgoing backwards.
4978a23c3aSchristosThis works only when a file descriptor to the file is available and it
505efe63deSchristosis a regular file.
515efe63deSchristos.It
525efe63deSchristosA continuation offset relative to the end of the last up-level field
535efe63deSchristos.Dv ( \*[Am] ) .
545efe63deSchristos.El
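.Pp
As a minimal sketch of a negative direct offset, the following entry tests
the last 22 bytes of a regular file for the ZIP end-of-central-directory
signature, which sits there when the archive has no trailing comment.
The message text is illustrative.
.Bd -literal -offset indent
-22     string  PK\e005\e006    Zip archive (end record found at end of file)
.Ed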
551b108b8bSchristos.It Dv type
561b108b8bSchristosThe type of the data to be tested.
571b108b8bSchristosThe possible values are:
581b108b8bSchristos.Bl -tag -width ".Dv lestring16"
591b108b8bSchristos.It Dv byte
601b108b8bSchristosA one-byte value.
611b108b8bSchristos.It Dv short
621b108b8bSchristosA two-byte value in this machine's native byte order.
631b108b8bSchristos.It Dv long
641b108b8bSchristosA four-byte value in this machine's native byte order.
651b108b8bSchristos.It Dv quad
661b108b8bSchristosAn eight-byte value in this machine's native byte order.
671b108b8bSchristos.It Dv float
681b108b8bSchristosA 32-bit single precision IEEE floating point number in this machine's native byte order.
691b108b8bSchristos.It Dv double
701b108b8bSchristosA 64-bit double precision IEEE floating point number in this machine's native byte order.
711b108b8bSchristos.It Dv string
721b108b8bSchristosA string of bytes.
The string type specification can be optionally followed by a /<width>
option and optionally followed by a set of flags /[bCcfTtWw]*.
75*ddb17682SchristosThe width limits the number of characters to be copied.
76*ddb17682SchristosZero means all characters.
77*ddb17682SchristosThe following flags are supported:
78*ddb17682Schristos.Bl -tag -width B -compact -offset XXXX
79*ddb17682Schristos.It b
80*ddb17682SchristosForce binary file test.
81*ddb17682Schristos.It C
82*ddb17682SchristosUse upper case insensitive matching: upper case
83*ddb17682Schristoscharacters in the magic match both lower and upper case characters in the
target, whereas lower case characters in the magic only match lower case
85*ddb17682Schristoscharacters in the target.
86*ddb17682Schristos.It c
87*ddb17682SchristosUse lower case insensitive matching: lower case
88*ddb17682Schristoscharacters in the magic match both lower and upper case characters in the
89*ddb17682Schristostarget, whereas upper case characters in the magic only match upper case
90*ddb17682Schristoscharacters in the target.
91*ddb17682SchristosTo do a complete case insensitive match, specify both
92*ddb17682Schristos.Dq c
93*ddb17682Schristosand
94*ddb17682Schristos.Dq C .
95*ddb17682Schristos.It f
96*ddb17682SchristosRequire that the matched string is a full word, not a partial word match.
.It T
Trim the string, i.e. leading and trailing whitespace
is deleted before the string is printed.
.It t
Force text file test.
.It W
Compact whitespace in the target, which must
contain at least one whitespace character.
If the magic has
.Dv n
consecutive blanks, the target needs at least
.Dv n
consecutive blanks to match.
.It w
Treat every blank in the magic as an optional blank.
112*ddb17682Schristos.El
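.Pp
A sketch of how the flags combine, assuming a script whose interpreter line
may or may not contain a blank after the first two characters:
.Bd -literal -offset indent
0       string/wt       #!\e /bin/sh    POSIX shell script text
.Ed
.Pp
Here
.Dq w
makes the escaped blank optional, so both
.Dq #!/bin/sh
and
.Dq #! /bin/sh
would match, and
.Dq t
forces the entry to be treated as a text test.
The message text is illustrative.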
1131b108b8bSchristos.It Dv pstring
11420d96732SchristosA Pascal-style string where the first byte/short/int is interpreted as the
1151b108b8bSchristosunsigned length.
1162344ff98SchristosThe length defaults to byte and can be specified as a modifier.
1172344ff98SchristosThe following modifiers are supported:
118*ddb17682Schristos.Bl -tag -width B -compact -offset XXXX
1192344ff98Schristos.It B
1202344ff98SchristosA byte length (default).
1212344ff98Schristos.It H
122d0c65b7bSchristosA 2 byte big endian length.
123e2725312Schristos.It h
124c02f7f97SchristosA 2 byte little endian length.
125e2725312Schristos.It L
126c02f7f97SchristosA 4 byte big endian length.
1272344ff98Schristos.It l
128d0c65b7bSchristosA 4 byte little endian length.
1292344ff98Schristos.It J
1302344ff98SchristosThe length includes itself in its count.
1312344ff98Schristos.El
1321b108b8bSchristosThe string is not NUL terminated.
1332344ff98Schristos.Dq J
1342344ff98Schristosis used rather than the more
intuitive
1362344ff98Schristos.Dq I
1372344ff98Schristosbecause this type of length is a feature of the JPEG
1382344ff98Schristosformat.
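.Pp
For illustration, assuming a hypothetical record file whose two-byte
big-endian length at offset 4 counts its own two bytes, the modifiers can
be combined:
.Bd -literal -offset indent
0       string          HYPO    Hypothetical record file
\*[Gt]4      pstring/HJ      x       \eb, comment: "%s"
.Ed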
1391b108b8bSchristos.It Dv date
1401b108b8bSchristosA four-byte value interpreted as a UNIX date.
1411b108b8bSchristos.It Dv qdate
14278a23c3aSchristosAn eight-byte value interpreted as a UNIX date.
1431b108b8bSchristos.It Dv ldate
1441b108b8bSchristosA four-byte value interpreted as a UNIX-style date, but interpreted as
1451b108b8bSchristoslocal time rather than UTC.
1461b108b8bSchristos.It Dv qldate
1471b108b8bSchristosAn eight-byte value interpreted as a UNIX-style date, but interpreted as
1481b108b8bSchristoslocal time rather than UTC.
14920d96732Schristos.It Dv qwdate
15020d96732SchristosAn eight-byte value interpreted as a Windows-style date.
1511b108b8bSchristos.It Dv beid3
1521b108b8bSchristosA 32-bit ID3 length in big-endian byte order.
1531b108b8bSchristos.It Dv beshort
1541b108b8bSchristosA two-byte value in big-endian byte order.
1551b108b8bSchristos.It Dv belong
1561b108b8bSchristosA four-byte value in big-endian byte order.
1571b108b8bSchristos.It Dv bequad
1581b108b8bSchristosAn eight-byte value in big-endian byte order.
1591b108b8bSchristos.It Dv befloat
1601b108b8bSchristosA 32-bit single precision IEEE floating point number in big-endian byte order.
1611b108b8bSchristos.It Dv bedouble
1621b108b8bSchristosA 64-bit double precision IEEE floating point number in big-endian byte order.
1631b108b8bSchristos.It Dv bedate
1641b108b8bSchristosA four-byte value in big-endian byte order,
1651b108b8bSchristosinterpreted as a Unix date.
1661b108b8bSchristos.It Dv beqdate
1671b108b8bSchristosAn eight-byte value in big-endian byte order,
1681b108b8bSchristosinterpreted as a Unix date.
1691b108b8bSchristos.It Dv beldate
1701b108b8bSchristosA four-byte value in big-endian byte order,
1711b108b8bSchristosinterpreted as a UNIX-style date, but interpreted as local time rather
1721b108b8bSchristosthan UTC.
1731b108b8bSchristos.It Dv beqldate
1741b108b8bSchristosAn eight-byte value in big-endian byte order,
1751b108b8bSchristosinterpreted as a UNIX-style date, but interpreted as local time rather
1761b108b8bSchristosthan UTC.
17720d96732Schristos.It Dv beqwdate
17820d96732SchristosAn eight-byte value in big-endian byte order,
17920d96732Schristosinterpreted as a Windows-style date.
1801b108b8bSchristos.It Dv bestring16
1811b108b8bSchristosA two-byte unicode (UCS16) string in big-endian byte order.
1821b108b8bSchristos.It Dv leid3
1831b108b8bSchristosA 32-bit ID3 length in little-endian byte order.
1841b108b8bSchristos.It Dv leshort
1851b108b8bSchristosA two-byte value in little-endian byte order.
1861b108b8bSchristos.It Dv lelong
1871b108b8bSchristosA four-byte value in little-endian byte order.
1881b108b8bSchristos.It Dv lequad
1891b108b8bSchristosAn eight-byte value in little-endian byte order.
1901b108b8bSchristos.It Dv lefloat
1911b108b8bSchristosA 32-bit single precision IEEE floating point number in little-endian byte order.
1921b108b8bSchristos.It Dv ledouble
1931b108b8bSchristosA 64-bit double precision IEEE floating point number in little-endian byte order.
1941b108b8bSchristos.It Dv ledate
1951b108b8bSchristosA four-byte value in little-endian byte order,
1961b108b8bSchristosinterpreted as a UNIX date.
1971b108b8bSchristos.It Dv leqdate
1981b108b8bSchristosAn eight-byte value in little-endian byte order,
1991b108b8bSchristosinterpreted as a UNIX date.
2001b108b8bSchristos.It Dv leldate
2011b108b8bSchristosA four-byte value in little-endian byte order,
2021b108b8bSchristosinterpreted as a UNIX-style date, but interpreted as local time rather
2031b108b8bSchristosthan UTC.
2041b108b8bSchristos.It Dv leqldate
2051b108b8bSchristosAn eight-byte value in little-endian byte order,
2061b108b8bSchristosinterpreted as a UNIX-style date, but interpreted as local time rather
2071b108b8bSchristosthan UTC.
20820d96732Schristos.It Dv leqwdate
20920d96732SchristosAn eight-byte value in little-endian byte order,
21020d96732Schristosinterpreted as a Windows-style date.
2111b108b8bSchristos.It Dv lestring16
2121b108b8bSchristosA two-byte unicode (UCS16) string in little-endian byte order.
2131b108b8bSchristos.It Dv melong
2141b108b8bSchristosA four-byte value in middle-endian (PDP-11) byte order.
2151b108b8bSchristos.It Dv medate
2161b108b8bSchristosA four-byte value in middle-endian (PDP-11) byte order,
2171b108b8bSchristosinterpreted as a UNIX date.
2181b108b8bSchristos.It Dv meldate
2191b108b8bSchristosA four-byte value in middle-endian (PDP-11) byte order,
2201b108b8bSchristosinterpreted as a UNIX-style date, but interpreted as local time rather
2211b108b8bSchristosthan UTC.
2221b108b8bSchristos.It Dv indirect
2231b108b8bSchristosStarting at the given offset, consult the magic database again.
22474db5203SchristosThe offset of the
225fa9ee498Schristos.Dv indirect
226fa9ee498Schristosmagic is by default absolute in the file, but one can specify
227fa9ee498Schristos.Dv /r
228fa9ee498Schristosto indicate that the offset is relative from the beginning of the entry.
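.Pp
A sketch, assuming a hypothetical wrapper format with a fixed 16-byte header
followed by an embedded file:
.Bd -literal -offset indent
0       string          HYPOWRAP        Hypothetical wrapper containing:
\*[Gt]16     indirect        x
.Ed
.Pp
The second line restarts identification at offset 16, so the description of
the embedded file would be appended to the message.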
22920d96732Schristos.It Dv name
23020d96732SchristosDefine a
23120d96732Schristos.Dq named
23220d96732Schristosmagic instance that can be called from another
23320d96732Schristos.Dv use
23420d96732Schristosmagic entry, like a subroutine call.
23520d96732SchristosNamed instance direct magic offsets are relative to the offset of the
23620d96732Schristosprevious matched entry, but indirect offsets are relative to the beginning
23720d96732Schristosof the file as usual.
23820d96732SchristosNamed magic entries always match.
23920d96732Schristos.It Dv use
24020d96732SchristosRecursively call the named magic starting from the current offset.
If the name of the referenced magic begins with a
24220d96732Schristos.Dv ^
24320d96732Schristosthen the endianness of the magic is switched; if the magic mentioned
24420d96732Schristos.Dv leshort
24520d96732Schristosfor example,
24620d96732Schristosit is treated as
24720d96732Schristos.Dv beshort
24820d96732Schristosand vice versa.
24920d96732SchristosThis is useful to avoid duplicating the rules for different endianness.
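.Pp
A minimal sketch of the subroutine style, using a hypothetical format that
exists in both byte orders; the name
.Dq hypo-header
and the magic strings are made up for illustration:
.Bd -literal -offset indent
# shared description of the header fields
0       name    hypo-header
\*[Gt]0      leshort x       \eb, version %d
\*[Gt]2      lelong  x       \eb, %d records
# callers, one per byte order
0       string  HYPL    Hypothetical data (little-endian)
\*[Gt]4      use     hypo-header
0       string  LPYH    Hypothetical data (big-endian)
\*[Gt]4      use     \e^hypo-header
.Ed
.Pp
The offsets inside the named entry are relative to offset 4, where each
.Dv use
line calls it, and the second call reads the same fields as
.Dv beshort
and
.Dv belong .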
2501b108b8bSchristos.It Dv regex
2511b108b8bSchristosA regular expression match in extended POSIX regular expression syntax
25283bb9c40Swiz(like egrep).
25383bb9c40SwizRegular expressions can take exponential time to process, and their
25483bb9c40Swizperformance is hard to predict, so their use is discouraged.
25583bb9c40SwizWhen used in production environments, their performance
25683bb9c40Swizshould be carefully checked.
257819e6405SchristosThe size of the string to search should also be limited by specifying
258819e6405Schristos.Dv /<length> ,
259819e6405Schristosto avoid performance issues scanning long files.
260819e6405SchristosThe type specification can also be optionally followed by
261819e6405Schristos.Dv /[c][s][l] .
2621b108b8bSchristosThe
2631b108b8bSchristos.Dq c
2641b108b8bSchristosflag makes the match case insensitive, while the
2651b108b8bSchristos.Dq s
flag updates the offset to the start offset of the match, rather than the end.
267819e6405SchristosThe
268819e6405Schristos.Dq l
modifier changes the limit of length to mean number of lines instead of a
byte count.
Lines are delimited by the platform's native line delimiter.
When a line count is specified, an implicit byte count is also computed assuming
each line is 80 characters long.
If neither a byte nor a line count is specified, the search is limited automatically
to 8KiB.
2761b108b8bSchristos.Dv ^
2771b108b8bSchristosand
2781b108b8bSchristos.Dv $
2791b108b8bSchristosmatch the beginning and end of individual lines, respectively,
2801b108b8bSchristosnot beginning and end of file.
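.Pp
As a sketch of a bounded regular expression test (the pattern and message
are illustrative), the following looks for a line starting with
.Dq CFLAGS
within the first 100 lines:
.Bd -literal -offset indent
0       regex/100l      \e^CFLAGS       makefile script text
.Ed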
2811b108b8bSchristos.It Dv search
28283bb9c40SwizA literal string search starting at the given offset.
28383bb9c40SwizThe same modifier flags can be used as for string patterns.
284819e6405SchristosThe search expression must contain the range in the form
.Dv /number ,
286819e6405Schristosthat is the number of positions at which the match will be
28783bb9c40Swizattempted, starting from the start offset.
28883bb9c40SwizThis is suitable for
2891b108b8bSchristossearching larger binary expressions with variable offsets, using
2901b108b8bSchristos.Dv \e
29183bb9c40Swizescapes for special characters.
292819e6405SchristosThe order of modifier and number is not relevant.
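.Pp
A sketch, assuming a hypothetical marker string that may start at any of the
first 4096 positions and should be matched regardless of case:
.Bd -literal -offset indent
0       search/4096/c   hypo_marker     data with hypothetical marker
.Ed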
2931b108b8bSchristos.It Dv default
2941b108b8bSchristosThis is intended to be used with the test
2951b108b8bSchristos.Em x
2968dd459ccSchristos(which is always true) and it has no type.
2978dd459ccSchristosIt matches when no other test at that continuation level has matched before.
Clearing the matched tests for a continuation level can be done using the
2998dd459ccSchristos.Dv clear
3008dd459ccSchristostest.
3018dd459ccSchristos.It Dv clear
3028dd459ccSchristosThis test is always true and clears the match flag for that continuation level.
3038dd459ccSchristosIt is intended to be used with the
3048dd459ccSchristos.Dv default
3058dd459ccSchristostest.
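.Pp
The two tests are typically paired as in this sketch, where the format and
its version byte are hypothetical:
.Bd -literal -offset indent
0       string  HYPO    Hypothetical archive
\*[Gt]4      clear   x
\*[Gt]4      byte    1       \eb, version 1
\*[Gt]4      byte    2       \eb, version 2
\*[Gt]4      default x       \eb, unknown version
.Ed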
30629faeba7Schristos.It Dv der
30729faeba7SchristosParse the file as a DER Certificate file.
30829faeba7SchristosThe test field is used as a der type that needs to be matched.
30929faeba7SchristosThe DER types are:
31029faeba7Schristos.Dv eoc ,
31129faeba7Schristos.Dv bool ,
31229faeba7Schristos.Dv int ,
31329faeba7Schristos.Dv bit_str ,
31429faeba7Schristos.Dv octet_str ,
31529faeba7Schristos.Dv null ,
31629faeba7Schristos.Dv obj_id ,
31729faeba7Schristos.Dv obj_desc ,
31829faeba7Schristos.Dv ext ,
31929faeba7Schristos.Dv real ,
32029faeba7Schristos.Dv enum ,
32129faeba7Schristos.Dv embed ,
32229faeba7Schristos.Dv utf8_str ,
32329faeba7Schristos.Dv rel_oid ,
32429faeba7Schristos.Dv time ,
32529faeba7Schristos.Dv res2 ,
32629faeba7Schristos.Dv seq ,
32729faeba7Schristos.Dv set ,
32829faeba7Schristos.Dv num_str ,
32929faeba7Schristos.Dv prt_str ,
33029faeba7Schristos.Dv t61_str ,
33129faeba7Schristos.Dv vid_str ,
33229faeba7Schristos.Dv ia5_str ,
33329faeba7Schristos.Dv utc_time ,
33429faeba7Schristos.Dv gen_time ,
33529faeba7Schristos.Dv gr_str ,
33629faeba7Schristos.Dv vis_str ,
33729faeba7Schristos.Dv gen_str ,
33829faeba7Schristos.Dv univ_str ,
33929faeba7Schristos.Dv char_str ,
34029faeba7Schristos.Dv bmp_str ,
34129faeba7Schristos.Dv date ,
34229faeba7Schristos.Dv tod ,
34329faeba7Schristos.Dv datetime ,
34429faeba7Schristos.Dv duration ,
34529faeba7Schristos.Dv oid-iri ,
34629faeba7Schristos.Dv rel-oid-iri .
34729faeba7SchristosThese types can be followed by an optional numeric size, which indicates
34829faeba7Schristosthe field width in bytes.
34929faeba7Schristos.It Dv guid
35029faeba7SchristosA Globally Unique Identifier, parsed and printed as
35129faeba7SchristosXXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.
Its format is a string.
35329faeba7Schristos.It Dv offset
35429faeba7SchristosThis is a quad value indicating the current offset of the file.
35529faeba7SchristosIt can be used to determine the size of the file or the magic buffer.
35629faeba7SchristosFor example the magic entries:
35729faeba7Schristos.Bd -literal -offset indent
35829faeba7Schristos-0	offset	x	this file is %lld bytes
35929faeba7Schristos-0	offset	<=100	must be more than 100 \e
36029faeba7Schristos    bytes and is only %lld
36129faeba7Schristos.Ed
3621d4cb158Schristos.It Dv octal
3631d4cb158SchristosA string representing an octal number.
3641d4cb158Schristos.El
3661b108b8bSchristos.Pp
367c2e19894SchristosFor compatibility with the Single
368c2e19894Schristos.Ux
369c2e19894SchristosStandard, the type specifiers
370c2e19894Schristos.Dv dC
371c2e19894Schristosand
372c2e19894Schristos.Dv d1
373c2e19894Schristosare equivalent to
374c2e19894Schristos.Dv byte ,
375c2e19894Schristosthe type specifiers
376c2e19894Schristos.Dv uC
377c2e19894Schristosand
378c2e19894Schristos.Dv u1
379c2e19894Schristosare equivalent to
380c2e19894Schristos.Dv ubyte ,
381c2e19894Schristosthe type specifiers
382c2e19894Schristos.Dv dS
383c2e19894Schristosand
384c2e19894Schristos.Dv d2
385c2e19894Schristosare equivalent to
386c2e19894Schristos.Dv short ,
387c2e19894Schristosthe type specifiers
388c2e19894Schristos.Dv uS
389c2e19894Schristosand
390c2e19894Schristos.Dv u2
391c2e19894Schristosare equivalent to
392c2e19894Schristos.Dv ushort ,
393c2e19894Schristosthe type specifiers
394c2e19894Schristos.Dv dI ,
395c2e19894Schristos.Dv dL ,
396c2e19894Schristosand
397c2e19894Schristos.Dv d4
398c2e19894Schristosare equivalent to
399c2e19894Schristos.Dv long ,
400c2e19894Schristosthe type specifiers
401c2e19894Schristos.Dv uI ,
402c2e19894Schristos.Dv uL ,
403c2e19894Schristosand
404c2e19894Schristos.Dv u4
405c2e19894Schristosare equivalent to
406c2e19894Schristos.Dv ulong ,
407c2e19894Schristosthe type specifier
408c2e19894Schristos.Dv d8
409c2e19894Schristosis equivalent to
410c2e19894Schristos.Dv quad ,
411c2e19894Schristosthe type specifier
412c2e19894Schristos.Dv u8
413c2e19894Schristosis equivalent to
414c2e19894Schristos.Dv uquad ,
415c2e19894Schristosand the type specifier
416c2e19894Schristos.Dv s
417c2e19894Schristosis equivalent to
418c2e19894Schristos.Dv string .
419c2e19894SchristosIn addition, the type specifier
420c2e19894Schristos.Dv dQ
421c2e19894Schristosis equivalent to
422c2e19894Schristos.Dv quad
423c2e19894Schristosand the type specifier
424c2e19894Schristos.Dv uQ
425c2e19894Schristosis equivalent to
426c2e19894Schristos.Dv uquad .
427c2e19894Schristos.Pp
4281b108b8bSchristosEach top-level magic pattern (see below for an explanation of levels)
42983bb9c40Swizis classified as text or binary according to the types used.
43083bb9c40SwizTypes
4311b108b8bSchristos.Dq regex
4321b108b8bSchristosand
4331b108b8bSchristos.Dq search
4341b108b8bSchristosare classified as text tests, unless non-printable characters are used
43583bb9c40Swizin the pattern.
43683bb9c40SwizAll other tests are classified as binary.
43783bb9c40SwizA top-level
pattern is considered to be a text pattern when all its patterns are text
43983bb9c40Swizpatterns; otherwise, it is considered to be a binary pattern.
44083bb9c40SwizWhen
4411b108b8bSchristosmatching a file, binary patterns are tried first; if no match is
4421b108b8bSchristosfound, and the file looks like text, then its encoding is determined
4431b108b8bSchristosand the text patterns are tried.
4441b108b8bSchristos.Pp
4451b108b8bSchristosThe numeric types may optionally be followed by
4461b108b8bSchristos.Dv \*[Am]
4471b108b8bSchristosand a numeric value,
4481b108b8bSchristosto specify that the value is to be AND'ed with the
4491b108b8bSchristosnumeric value before any comparisons are done.
4501b108b8bSchristosPrepending a
4511b108b8bSchristos.Dv u
4521b108b8bSchristosto the type indicates that ordered comparisons should be unsigned.
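.Pp
A sketch of both modifiers, using a hypothetical format whose first long
carries a tag in its upper 16 bits:
.Bd -literal -offset indent
# compare only the upper 16 bits of a big-endian long
0       belong\*[Am]0xffff0000       0xcafe0000      Hypothetical tagged data
# ordered comparison on an unsigned byte
\*[Gt]4      ubyte   \*[Gt]0x7f   \eb, extended version %u
.Ed
.Pp
With a signed
.Dv byte
the second test could never succeed, because no signed one-byte value
exceeds 0x7f.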
4531b108b8bSchristos.It Dv test
4541b108b8bSchristosThe value to be compared with the value from the file.
4551b108b8bSchristosIf the type is
4561b108b8bSchristosnumeric, this value
4571b108b8bSchristosis specified in C form; if it is a string, it is specified as a C string
4581b108b8bSchristoswith the usual escapes permitted (e.g. \en for new-line).
4591b108b8bSchristos.Pp
4601b108b8bSchristosNumeric values
4611b108b8bSchristosmay be preceded by a character indicating the operation to be performed.
4621b108b8bSchristosIt may be
4631b108b8bSchristos.Dv = ,
4641b108b8bSchristosto specify that the value from the file must equal the specified value,
4651b108b8bSchristos.Dv \*[Lt] ,
4661b108b8bSchristosto specify that the value from the file must be less than the specified
4671b108b8bSchristosvalue,
4681b108b8bSchristos.Dv \*[Gt] ,
4691b108b8bSchristosto specify that the value from the file must be greater than the specified
4701b108b8bSchristosvalue,
4711b108b8bSchristos.Dv \*[Am] ,
4721b108b8bSchristosto specify that the value from the file must have set all of the bits
4731b108b8bSchristosthat are set in the specified value,
4741b108b8bSchristos.Dv ^ ,
to specify that the value from the file must have at least one of the bits
that are set in the specified value clear,
.Dv ~ ,
to specify that the value given after it is negated before being tested, or
.Dv x ,
to specify that any value will match.
4811b108b8bSchristosIf the character is omitted, it is assumed to be
4821b108b8bSchristos.Dv = .
4831b108b8bSchristosOperators
4841b108b8bSchristos.Dv \*[Am] ,
4851b108b8bSchristos.Dv ^ ,
4861b108b8bSchristosand
4871b108b8bSchristos.Dv ~
4881b108b8bSchristosdon't work with floats and doubles.
4891b108b8bSchristosThe operator
4901b108b8bSchristos.Dv !\&
4911b108b8bSchristosspecifies that the line matches if the test does
4921b108b8bSchristos.Em not
4931b108b8bSchristossucceed.
4941b108b8bSchristos.Pp
4951b108b8bSchristosNumeric values are specified in C form; e.g.
4961b108b8bSchristos.Dv 13
4971b108b8bSchristosis decimal,
4981b108b8bSchristos.Dv 013
4991b108b8bSchristosis octal, and
5001b108b8bSchristos.Dv 0x13
5011b108b8bSchristosis hexadecimal.
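.Pp
The operators and number forms can be sketched on a hypothetical format as
follows; the offsets and messages are assumptions made for illustration:
.Bd -literal -offset indent
0       string  HYPO    Hypothetical data
\*[Gt]4      byte    \*[Am]0x80   \eb, high bit set
\*[Gt]4      byte    ^0x80   \eb, high bit clear
\*[Gt]5      leshort !0      \eb, nonzero field
\*[Gt]7      byte    \*[Lt]013    \eb, below octal 013 (decimal 11)
.Ed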
5021b108b8bSchristos.Pp
Numeric operations are not performed on date types; instead, the numeric
value is interpreted as an offset.
505819e6405Schristos.Pp
5061b108b8bSchristosFor string values, the string from the
5071b108b8bSchristosfile must match the specified string.
5081b108b8bSchristosThe operators
5091b108b8bSchristos.Dv = ,
5101b108b8bSchristos.Dv \*[Lt]
5111b108b8bSchristosand
5121b108b8bSchristos.Dv \*[Gt]
5131b108b8bSchristos(but not
5141b108b8bSchristos.Dv \*[Am] )
5151b108b8bSchristoscan be applied to strings.
5161b108b8bSchristosThe length used for matching is that of the string argument
5171b108b8bSchristosin the magic file.
5181b108b8bSchristosThis means that a line can match any non-empty string (usually used to
5191b108b8bSchristosthen print the string), with
5201b108b8bSchristos.Em \*[Gt]\e0
5211b108b8bSchristos(because all non-empty strings are greater than the empty string).
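.Pp
For example, a sketch that prints a name stored at offset 8 of a
hypothetical format only when the name is not empty:
.Bd -literal -offset indent
0       string  HYPO    Hypothetical data
\*[Gt]8      string  \*[Gt]\e0    \eb, name "%s"
.Ed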
5221b108b8bSchristos.Pp
523819e6405SchristosDates are treated as numerical values in the respective internal
524819e6405Schristosrepresentation.
525819e6405Schristos.Pp
5261b108b8bSchristosThe special test
5271b108b8bSchristos.Em x
5281b108b8bSchristosalways evaluates to true.
5292344ff98Schristos.It Dv message
5301b108b8bSchristosThe message to be printed if the comparison succeeds.
5311b108b8bSchristosIf the string contains a
5321b108b8bSchristos.Xr printf 3
5331b108b8bSchristosformat specification, the value from the file (with any specified masking
5341b108b8bSchristosperformed) is printed using the message as the format string.
5351b108b8bSchristosIf the string begins with
5361b108b8bSchristos.Dq \eb ,
5371b108b8bSchristosthe message printed is the remainder of the string with no whitespace
5381b108b8bSchristosadded before it: multiple matches are normally separated by a single
5391b108b8bSchristosspace.
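.Pp
A sketch of both message features, assuming a hypothetical image format
that stores its width and height as little-endian shorts at offsets 4 and 6:
.Bd -literal -offset indent
0       string  HYPO    Hypothetical image
\*[Gt]4      leshort x       \eb, %d x
\*[Gt]6      leshort x       %d pixels
.Ed
.Pp
For a file whose fields hold 640 and 480, this would be expected to print
.Dq Hypothetical image, 640 x 480 pixels .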
5401b108b8bSchristos.El
5411b108b8bSchristos.Pp
An Apple 4+4 character creator and type can be specified as:
5431b108b8bSchristos.Bd -literal -offset indent
5441b108b8bSchristos!:apple	CREATYPE
5451b108b8bSchristos.Ed
5461b108b8bSchristos.Pp
547*ddb17682SchristosA slash-separated list of commonly found filename extensions can be specified
548*ddb17682Schristosas:
549*ddb17682Schristos.Bd -literal -offset indent
550*ddb17682Schristos!:ext	ext[/ext...]
551*ddb17682Schristos.Ed
552*ddb17682Schristos.Pp
553*ddb17682Schristosi.e. the literal string
554*ddb17682Schristos.Dq !:ext
555*ddb17682Schristosfollowed by a slash-separated list of commonly found extensions; for example
556*ddb17682Schristosfor JPEG images:
557*ddb17682Schristos.Bd -literal -offset indent
558*ddb17682Schristos!:ext jpeg/jpg/jpe/jfif
559*ddb17682Schristos.Ed
560*ddb17682Schristos.Pp
A MIME type is given on a separate line, which must be the next line that is
neither blank nor a comment after the magic line that identifies the
5631b108b8bSchristosfile type, and has the following format:
5641b108b8bSchristos.Bd -literal -offset indent
5651b108b8bSchristos!:mime	MIMETYPE
5661b108b8bSchristos.Ed
5671b108b8bSchristos.Pp
5681b108b8bSchristosi.e. the literal string
5691b108b8bSchristos.Dq !:mime
5701b108b8bSchristosfollowed by the MIME type.
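.Pp
Put together, an entry for a hypothetical image format might carry its
annotations as follows; the magic string, MIME type and extensions are
assumptions made for illustration:
.Bd -literal -offset indent
0       string  HYPO    Hypothetical image data
!:mime  image/x-hypothetical
!:ext   hyp/hypo
.Ed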
5711b108b8bSchristos.Pp
5721b108b8bSchristosAn optional strength can be supplied on a separate line which refers to
5731b108b8bSchristosthe current magic description using the following format:
5741b108b8bSchristos.Bd -literal -offset indent
5751b108b8bSchristos!:strength OP VALUE
5761b108b8bSchristos.Ed
5771b108b8bSchristos.Pp
The operator
5791b108b8bSchristos.Dv OP
5801b108b8bSchristoscan be:
5811b108b8bSchristos.Dv + ,
5821b108b8bSchristos.Dv - ,
5831b108b8bSchristos.Dv * ,
5841b108b8bSchristosor
5851b108b8bSchristos.Dv /
5861b108b8bSchristosand
5871b108b8bSchristos.Dv VALUE
5881b108b8bSchristosis a constant between 0 and 255.
This constant is applied using the specified operator
5901b108b8bSchristosto the currently computed default magic strength.
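.Pp
As a sketch, a hypothetical entry could have its computed strength raised
by 20 like this:
.Bd -literal -offset indent
0       string  HY      Hypothetical short-magic data
!:strength + 20
.Ed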
5911b108b8bSchristos.Pp
5921b108b8bSchristosSome file formats contain additional information which is to be printed
5931b108b8bSchristosalong with the file type or need additional tests to determine the true
5941b108b8bSchristosfile type.
5951b108b8bSchristosThese additional tests are introduced by one or more
5961b108b8bSchristos.Em \*[Gt]
5971b108b8bSchristoscharacters preceding the offset.
5981b108b8bSchristosThe number of
5991b108b8bSchristos.Em \*[Gt]
6001b108b8bSchristoson the line indicates the level of the test; a line with no
6011b108b8bSchristos.Em \*[Gt]
6021b108b8bSchristosat the beginning is considered to be at level 0.
6031b108b8bSchristosTests are arranged in a tree-like hierarchy:
6042344ff98Schristosif the test on a line at level
6051b108b8bSchristos.Em n
6061b108b8bSchristossucceeds, all following tests at level
6071b108b8bSchristos.Em n+1
6082344ff98Schristosare performed, and the messages printed if the tests succeed, until a line
6091b108b8bSchristoswith level
6101b108b8bSchristos.Em n
6111b108b8bSchristos(or less) appears.
6121b108b8bSchristosFor more complex files, one can use empty messages to get just the
6131b108b8bSchristos"if/then" effect, in the following way:
6141b108b8bSchristos.Bd -literal -offset indent
6151b108b8bSchristos0      string   MZ
6161b108b8bSchristos\*[Gt]0x18  leshort  \*[Lt]0x40   MS-DOS executable
6171b108b8bSchristos\*[Gt]0x18  leshort  \*[Gt]0x3f   extended PC executable (e.g., MS Windows)
6181b108b8bSchristos.Ed
6191b108b8bSchristos.Pp
6201b108b8bSchristosOffsets do not need to be constant, but can also be read from the file
6211b108b8bSchristosbeing examined.
6221b108b8bSchristosIf the first character following the last
6231b108b8bSchristos.Em \*[Gt]
6241b108b8bSchristosis a
6252344ff98Schristos.Em \&(
6261b108b8bSchristosthen the string after the parenthesis is interpreted as an indirect offset.
6271b108b8bSchristosThat means that the number after the parenthesis is used as an offset in
6281b108b8bSchristosthe file.
6291b108b8bSchristosThe value at that offset is read, and is used again as an offset
6301b108b8bSchristosin the file.
6311b108b8bSchristosIndirect offsets are of the form:
632c02f7f97Schristos.Em (( x [[.,][bBcCeEfFgGhHiIlmsSqQ]][+\-][ y ]) .
6331b108b8bSchristosThe value of
6341b108b8bSchristos.Em x
6351b108b8bSchristosis used as an offset in the file.
6361b108b8bSchristosA byte, id3 length, short or long is read at that offset depending on the
637c02f7f97Schristos.Em [bBcCeEfFgGhHiIlmsSqQ]
6381b108b8bSchristostype specifier.
63974db5203SchristosThe value is treated as signed if
64074db5203Schristos.Dq ,
64174db5203Schristosis specified or unsigned if
64274db5203Schristos.Dq .
64374db5203Schristosis specified.
6441b108b8bSchristosThe capitalized types interpret the number as a big endian
6451b108b8bSchristosvalue, whereas the small letter versions interpret the number as a little
6461b108b8bSchristosendian value;
6471b108b8bSchristosthe
6481b108b8bSchristos.Em m
6491b108b8bSchristostype interprets the number as a middle endian (PDP-11) value.
6501b108b8bSchristosTo that number the value of
6511b108b8bSchristos.Em y
6521b108b8bSchristosis added and the result is used as an offset in the file.
6531b108b8bSchristosThe default type if one is not specified is long.
654c02f7f97SchristosThe following types are recognized:
655c02f7f97Schristos.Bl -column -offset indent "Type" "Half/Short" "Little" "Size"
656c02f7f97Schristos.It Sy Type	Sy Mnemonic	Sy Endian	Sy Size
.It bcBC	Byte/Char	N/A	1
.It efg	Double	Little	8
.It EFG	Double	Big	8
.It hs	Half/Short	Little	2
.It HS	Half/Short	Big	2
.It i	ID3	Little	4
.It I	ID3	Big	4
.It l	Long	Little	4
.It L	Long	Big	4
664c02f7f97Schristos.It m	Middle	Middle	4
6651d4cb158Schristos.It o	Octal	Textual	Variable
666c02f7f97Schristos.It q	Quad	Little	8
667c02f7f97Schristos.It Q	Quad	Big	8
668c02f7f97Schristos.El
.Pp
That way variable length structures can be examined:
.Bd -literal -offset indent
# MS Windows executables are also valid MS-DOS executables
0           string  MZ
\*[Gt]0x18       leshort \*[Lt]0x40   MZ executable (MS-DOS)
# skip the whole block below if it is not an extended executable
\*[Gt]0x18       leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)  string  PE\e0\e0  PE executable (MS-Windows)
\*[Gt]\*[Gt](0x3c.l)  string  LX\e0\e0  LX executable (OS/2)
.Ed
.Pp
This strategy has a drawback: you must make sure that you
eventually print something, or users may get empty output (such as when
there is neither PE\e0\e0 nor LX\e0\e0 in the above example).
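.Pp
As a further sketch of the type specifiers, assume a hypothetical format
(not one found in the shipped database) that begins with the string
.Dq HYPO
and stores the offset of its payload as a big endian 16-bit value at offset 4.
The top-level line prints a message of its own, so the output is never empty
even if neither continuation matches:
.Bd -literal -offset indent
# hypothetical format, for illustration only
0           string  HYPO    hypothetical container
# read an unsigned big endian short at offset 4, use it as the offset
\*[Gt](4.S)      string  DATA    \eb, with DATA payload
# same value read from the file, but test 4 bytes further on
\*[Gt](4.S+4)    byte    1       \eb, version 1
.Ed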
.Pp
If this indirect offset cannot be used directly, simple calculations are
possible: appending
.Em [+-*/%\*[Am]|^]number
inside parentheses allows one to modify
the value read from the file before it is used as an offset:
.Bd -literal -offset indent
# MS Windows executables are also valid MS-DOS executables
0           string  MZ
# sometimes, the value at 0x18 is less than 0x40 but there's still an
# extended executable, simply appended to the file
\*[Gt]0x18       leshort \*[Lt]0x40
\*[Gt]\*[Gt](4.s*512) leshort 0x014c  COFF executable (MS-DOS, DJGPP)
\*[Gt]\*[Gt](4.s*512) leshort !0x014c MZ executable (MS-DOS)
.Ed
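.Pp
To make the arithmetic concrete: the expression
.Em (4.s*512)
reads the little endian short at offset 4 and multiplies it by 512
before using the result as the offset to test.
The values below are illustrative only and are not taken from any
particular file:
.Bd -literal -offset indent
short at offset 4    resulting offset
1                    512    (0x200)
3                    1536   (0x600)
40                   20480  (0x5000)
.Ed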
.Pp
Sometimes you do not know the exact offset as this depends on the length or
position (when indirection was used before) of preceding fields.
You can specify an offset relative to the end of the last up-level
field using
.Sq \*[Am]
as a prefix to the offset:
.Bd -literal -offset indent
0           string  MZ
\*[Gt]0x18       leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)  string  PE\e0\e0    PE executable (MS-Windows)
# immediately following the PE signature is the CPU type
\*[Gt]\*[Gt]\*[Gt]\*[Am]0       leshort 0x14c     for Intel 80386
\*[Gt]\*[Gt]\*[Gt]\*[Am]0       leshort 0x184     for DEC Alpha
.Ed
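.Pp
As a worked illustration, assume that the long stored at offset 0x3c
is 0x80 (an assumed value, chosen only for the arithmetic).
The four-byte signature is then matched at 0x80, and the relative offset
.Sq \*[Am]0
resolves to the first byte after it:
.Bd -literal -offset indent
(0x3c.l)   = 0x80    start of the PE\e0\e0 signature (assumed)
0x80 + 4   = 0x84    end of the signature, i.e. \*[Am]0
0x84       leshort   CPU type is read here
.Ed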
.Pp
Indirect and relative offsets can be combined:
.Bd -literal -offset indent
0             string  MZ
\*[Gt]0x18         leshort \*[Lt]0x40
\*[Gt]\*[Gt](4.s*512)   leshort !0x014c MZ executable (MS-DOS)
# if it's not COFF, go back 512 bytes and add the offset taken
# from byte 2/3, which is yet another way of finding the start
# of the extended executable
\*[Gt]\*[Gt]\*[Gt]\*[Am](2.s-514) string  LE      LE executable (MS Windows VxD driver)
.Ed
.Pp
Or the other way around:
.Bd -literal -offset indent
0                 string  MZ
\*[Gt]0x18             leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)        string  LE\e0\e0  LE executable (MS-Windows)
# at offset 0x80 (-4, since relative offsets start at the end
# of the up-level match) inside the LE header, we find the absolute
# offset to the code area, where we look for a specific signature
\*[Gt]\*[Gt]\*[Gt](\*[Am]0x7c.l+0x26) string  UPX     \eb, UPX compressed
.Ed
.Pp
Or even both!
.Bd -literal -offset indent
0                string  MZ
\*[Gt]0x18            leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)       string  LE\e0\e0 LE executable (MS-Windows)
# at offset 0x58 inside the LE header, we find the relative offset
# to a data area where we look for a specific signature
\*[Gt]\*[Gt]\*[Gt]\*[Am](\*[Am]0x54.l-3)  string  UNACE  \eb, ACE self-extracting archive
.Ed
.Pp
If you have to deal with offset/length pairs in your file, even the
second value in a parenthesized expression can be taken from the file itself,
using another set of parentheses.
Note that this additional indirect offset is always relative to the
start of the main indirect offset.
.Bd -literal -offset indent
0                 string       MZ
\*[Gt]0x18             leshort      \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l)        string       PE\e0\e0 PE executable (MS-Windows)
# search for the PE section called ".idata"...
\*[Gt]\*[Gt]\*[Gt]\*[Am]0xf4          search/0x140 .idata
# ...and go to the end of it, calculated from start+length;
# these are located 14 and 10 bytes after the section name
\*[Gt]\*[Gt]\*[Gt]\*[Gt](\*[Am]0xe.l+(-4)) string       PK\e3\e4 \eb, ZIP self-extracting archive
.Ed
.Pp
If you have a list of known values at a particular continuation level,
you can provide a switch-like default case:
.Bd -literal -offset indent
# clear that continuation level match
\*[Gt]18	clear
\*[Gt]18	lelong	1	one
\*[Gt]18	lelong	2	two
\*[Gt]18	default	x
# print default match
\*[Gt]\*[Gt]18	lelong	x	unmatched 0x%x
.Ed
.Sh SEE ALSO
.Xr file 1
\- the command that reads this file.
.Sh BUGS
The formats
.Dv long ,
.Dv belong ,
.Dv lelong ,
.Dv melong ,
.Dv short ,
.Dv beshort ,
and
.Dv leshort
do not depend on the length of the C data types
.Dv short
and
.Dv long
on the platform, even though the Single
.Ux
Specification implies that they do.
However, as OS X Mountain Lion has passed the Single
.Ux
Specification validation suite, and supplies a version of
.Xr file 1
in which they do not depend on the sizes of the C data types and that is
built for a 64-bit environment in which
.Dv long
is 8 bytes rather than 4 bytes, presumably the validation suite does not
test whether, for example,
.Dv long
refers to an item with the same size as the C data type
.Dv long .
There should probably be
.Dv type
names
.Dv int8 ,
.Dv uint8 ,
.Dv int16 ,
.Dv uint16 ,
.Dv int32 ,
.Dv uint32 ,
.Dv int64 ,
and
.Dv uint64 ,
and specified-byte-order variants of them,
to make it clearer that those types have specified widths.
.\"
.\" From: guy@sun.uucp (Guy Harris)
.\" Newsgroups: net.bugs.usg
.\" Subject: /etc/magic's format isn't well documented
.\" Message-ID: <2752@sun.uucp>
.\" Date: 3 Sep 85 08:19:07 GMT
.\" Organization: Sun Microsystems, Inc.
.\" Lines: 136
.\"
.\" Here's a manual page for the format accepted by the "file" made by adding
.\" the changes I posted to the S5R2 version.
.\"
.\" Modified for Ian Darwin's version of the file command.