.\" $NetBSD: magic.5,v 1.7 2012/02/22 17:53:50 christos Exp $
.\"
.\" $File: magic.man,v 1.71 2011/12/07 11:58:24 rrt Exp $
.Dd April 20, 2011
.Dt MAGIC 5
.Os
.\" install as magic.4 on USG, magic.5 on V7, Berkeley and Linux systems.
.Sh NAME
.Nm magic
.Nd file command's magic pattern file
.Sh DESCRIPTION
This manual page documents the format of the magic file as
used by the
.Xr file 1
command, version 5.11.
The
.Xr file 1
command identifies the type of a file using,
among other tests,
a test for whether the file contains certain
.Dq "magic patterns" .
The file
.Pa /usr/share/misc/magic
specifies what patterns are to be tested for, what message or
MIME type to print if a particular pattern is found,
and additional information to extract from the file.
.Pp
Each line of the file specifies a test to be performed.
A test compares the data starting at a particular offset
in the file with a byte value, a string or a numeric value.
If the test succeeds, a message is printed.
The line consists of the following fields:
.Bl -tag -width ".Dv message"
.It Dv offset
A number specifying the offset, in bytes, into the file of the data
which is to be tested.
.It Dv type
The type of the data to be tested.
The possible values are:
.Bl -tag -width ".Dv lestring16"
.It Dv byte
A one-byte value.
.It Dv short
A two-byte value in this machine's native byte order.
.It Dv long
A four-byte value in this machine's native byte order.
.It Dv quad
An eight-byte value in this machine's native byte order.
.It Dv float
A 32-bit single precision IEEE floating point number in this machine's native byte order.
.It Dv double
A 64-bit double precision IEEE floating point number in this machine's native byte order.
.It Dv string
A string of bytes.
The string type specification can be optionally followed
by /[WwcCtb]*.
The
.Dq W
flag compacts whitespace in the target, which must
contain at least one whitespace character.
If the magic has
.Dv n
consecutive blanks, the target needs at least
.Dv n
consecutive blanks to match.
The
.Dq w
flag treats every blank in the magic as an optional blank.
The
.Dq c
flag specifies case insensitive matching: lower case
characters in the magic match both lower and upper case characters in the
target, whereas upper case characters in the magic only match upper case
characters in the target.
The
.Dq C
flag specifies case insensitive matching: upper case
characters in the magic match both lower and upper case characters in the
target, whereas lower case characters in the magic only match lower case
characters in the target.
To do a complete case insensitive match, specify both
.Dq c
and
.Dq C .
The
.Dq t
flag forces the test to be done for text files, while the
.Dq b
flag forces the test to be done for binary files.
.It Dv pstring
A Pascal-style string where the first byte/short/int is interpreted as an
unsigned length.
The length defaults to byte and can be specified as a modifier.
The following modifiers are supported:
.Bl -tag -compact -width B
.It B
A byte length (default).
.It H
A 2 byte big endian length.
.It h
A 2 byte little endian length.
.It L
A 4 byte big endian length.
.It l
A 4 byte little endian length.
.It J
The length includes itself in its count.
.El
The string is not NUL terminated.
.Dq J
is used rather than the more
valuable
.Dq I
because this type of length is a feature of the JPEG
format.
.It Dv date
A four-byte value interpreted as a UNIX date.
.It Dv qdate
An eight-byte value interpreted as a UNIX date.
.It Dv ldate
A four-byte value interpreted as a UNIX-style date, but interpreted as
local time rather than UTC.
.It Dv qldate
An eight-byte value interpreted as a UNIX-style date, but interpreted as
local time rather than UTC.
.It Dv beid3
A 32-bit ID3 length in big-endian byte order.
.It Dv beshort
A two-byte value in big-endian byte order.
.It Dv belong
A four-byte value in big-endian byte order.
.It Dv bequad
An eight-byte value in big-endian byte order.
.It Dv befloat
A 32-bit single precision IEEE floating point number in big-endian byte order.
.It Dv bedouble
A 64-bit double precision IEEE floating point number in big-endian byte order.
.It Dv bedate
A four-byte value in big-endian byte order,
interpreted as a UNIX date.
.It Dv beqdate
An eight-byte value in big-endian byte order,
interpreted as a UNIX date.
.It Dv beldate
A four-byte value in big-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv beqldate
An eight-byte value in big-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv bestring16
A two-byte Unicode (UCS16) string in big-endian byte order.
.It Dv leid3
A 32-bit ID3 length in little-endian byte order.
.It Dv leshort
A two-byte value in little-endian byte order.
.It Dv lelong
A four-byte value in little-endian byte order.
.It Dv lequad
An eight-byte value in little-endian byte order.
.It Dv lefloat
A 32-bit single precision IEEE floating point number in little-endian byte order.
.It Dv ledouble
A 64-bit double precision IEEE floating point number in little-endian byte order.
.It Dv ledate
A four-byte value in little-endian byte order,
interpreted as a UNIX date.
.It Dv leqdate
An eight-byte value in little-endian byte order,
interpreted as a UNIX date.
.It Dv leldate
A four-byte value in little-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv leqldate
An eight-byte value in little-endian byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv lestring16
A two-byte Unicode (UCS16) string in little-endian byte order.
.It Dv melong
A four-byte value in middle-endian (PDP-11) byte order.
.It Dv medate
A four-byte value in middle-endian (PDP-11) byte order,
interpreted as a UNIX date.
.It Dv meldate
A four-byte value in middle-endian (PDP-11) byte order,
interpreted as a UNIX-style date, but interpreted as local time rather
than UTC.
.It Dv indirect
Starting at the given offset, consult the magic database again.
.It Dv regex
A regular expression match in extended POSIX regular expression syntax
(like egrep).
Regular expressions can take exponential time to process, and their
performance is hard to predict, so their use is discouraged.
When used in production environments, their performance
should be carefully checked.
The type specification can be optionally followed by
.Dv /[c][s] .
The
.Dq c
flag makes the match case insensitive, while the
.Dq s
flag updates the offset to the start offset of the match, rather than the end.
The regular expression is tested against line
.Dv N + 1
onwards, where
.Dv N
is the given offset.
Line endings are assumed to be in the machine's native format.
.Dv ^
and
.Dv $
match the beginning and end of individual lines, respectively,
not beginning and end of file.
.It Dv search
A literal string search starting at the given offset.
The same modifier flags can be used as for string patterns.
The modifier flags (if any) must be followed by
.Dv /number ,
the range, that is, the number of positions at which the match will be
attempted, starting from the start offset.
This is suitable for
searching larger binary expressions with variable offsets, using
.Dv \e
escapes for special characters.
The offset works as for regex.
.It Dv default
This is intended to be used with the test
.Em x
(which is always true) and a message that is to be used if there are
no other matches.
.El
.Pp
Each top-level magic pattern (see below for an explanation of levels)
is classified as text or binary according to the types used.
Types
.Dq regex
and
.Dq search
are classified as text tests, unless non-printable characters are used
in the pattern.
All other tests are classified as binary.
A top-level
pattern is considered to be a text pattern when all its patterns are text
patterns; otherwise, it is considered to be a binary pattern.
When
matching a file, binary patterns are tried first; if no match is
found, and the file looks like text, then its encoding is determined
and the text patterns are tried.
.Pp
The numeric types may optionally be followed by
.Dv \*[Am]
and a numeric value,
to specify that the value is to be AND'ed with the
numeric value before any comparisons are done.
Prepending a
.Dv u
to the type indicates that ordered comparisons should be unsigned.
.It Dv test
The value to be compared with the value from the file.
If the type is
numeric, this value
is specified in C form; if it is a string, it is specified as a C string
with the usual escapes permitted (e.g. \en for new-line).
.Pp
Numeric values
may be preceded by a character indicating the operation to be performed.
It may be
.Dv = ,
to specify that the value from the file must equal the specified value,
.Dv \*[Lt] ,
to specify that the value from the file must be less than the specified
value,
.Dv \*[Gt] ,
to specify that the value from the file must be greater than the specified
value,
.Dv \*[Am] ,
to specify that the value from the file must have set all of the bits
that are set in the specified value,
.Dv ^ ,
to specify that the value from the file must have clear at least one of the bits
that are set in the specified value,
.Dv ~ ,
to specify that the value given after it is negated before it is tested, or
.Dv x ,
to specify that any value will match.
If the character is omitted, it is assumed to be
.Dv = .
Operators
.Dv \*[Am] ,
.Dv ^ ,
and
.Dv ~
don't work with floats and doubles.
The operator
.Dv !\&
specifies that the line matches if the test does
.Em not
succeed.
.Pp
Numeric values are specified in C form; e.g.
.Dv 13
is decimal,
.Dv 013
is octal, and
.Dv 0x13
is hexadecimal.
.Pp
For string values, the string from the
file must match the specified string.
The operators
.Dv = ,
.Dv \*[Lt]
and
.Dv \*[Gt]
(but not
.Dv \*[Am] )
can be applied to strings.
The length used for matching is that of the string argument
in the magic file.
This means that a line can match any non-empty string (usually used to
then print the string), with
.Em \*[Gt]\e0
(because all non-empty strings are greater than the empty string).
.Pp
The special test
.Em x
always evaluates to true.
.It Dv message
The message to be printed if the comparison succeeds.
If the string contains a
.Xr printf 3
format specification, the value from the file (with any specified masking
performed) is printed using the message as the format string.
If the string begins with
.Dq \eb ,
the message printed is the remainder of the string with no whitespace
added before it: multiple matches are normally separated by a single
space.
.El
.Pp
A 4+4 character Apple creator and type can be specified as:
.Bd -literal -offset indent
!:apple CREATYPE
.Ed
.Pp
A MIME type is given on a separate line, which must be the next
non-blank, non-comment line after the magic line that identifies the
file type, and has the following format:
.Bd -literal -offset indent
!:mime MIMETYPE
.Ed
.Pp
i.e. the literal string
.Dq !:mime
followed by the MIME type.
.Pp
An optional strength can be supplied on a separate line which refers to
the current magic description using the following format:
.Bd -literal -offset indent
!:strength OP VALUE
.Ed
.Pp
The operator
.Dv OP
can be:
.Dv + ,
.Dv - ,
.Dv * ,
or
.Dv /
and
.Dv VALUE
is a constant between 0 and 255.
This constant is applied using the specified operator
to the currently computed default magic strength.
.Pp
Some file formats contain additional information which is to be printed
along with the file type or need additional tests to determine the true
file type.
These additional tests are introduced by one or more
.Em \*[Gt]
characters preceding the offset.
The number of
.Em \*[Gt]
on the line indicates the level of the test; a line with no
.Em \*[Gt]
at the beginning is considered to be at level 0.
Tests are arranged in a tree-like hierarchy:
if the test on a line at level
.Em n
succeeds, all following tests at level
.Em n+1
are performed, and the messages printed if the tests succeed, until a line
with level
.Em n
(or less) appears.
For more complex files, one can use empty messages to get just the
"if/then" effect, in the following way:
.Bd -literal -offset indent
0 string MZ
\*[Gt]0x18 leshort \*[Lt]0x40 MS-DOS executable
\*[Gt]0x18 leshort \*[Gt]0x3f extended PC executable (e.g., MS Windows)
.Ed
.Pp
Offsets do not need to be constant, but can also be read from the file
being examined.
If the first character following the last
.Em \*[Gt]
is a
.Em \&(
then the string after the parenthesis is interpreted as an indirect offset.
That means that the number after the parenthesis is used as an offset in
the file.
The value at that offset is read, and is used again as an offset
in the file.
Indirect offsets are of the form:
.Em \&( x [.[bislBISLm]][+\-][ y ]) .
The value of
.Em x
is used as an offset in the file.
A byte, id3 length, short or long is read at that offset depending on the
.Em [bislBISLm]
type specifier.
The capitalized types interpret the number as a big endian
value, whereas the small letter versions interpret the number as a little
endian value;
the
.Em m
type interprets the number as a middle endian (PDP-11) value.
To that number the value of
.Em y
is added and the result is used as an offset in the file.
The default type if one is not specified is long.
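.Pp
As a minimal sketch of the rule above, assume a hypothetical file format
(not a real one) whose signature is
.Dq HYPO
and whose four-byte little-endian value at offset 0x10 holds the position
of a header, with a marker 8 bytes into that header:
.Bd -literal -offset indent
# hypothetical format, for illustration only:
# read the little-endian long stored at 0x10, add 8,
# and test the bytes found at the resulting offset
0 string HYPO
\*[Gt](0x10.l+8) string DATA hypothetical container with DATA marker
.Ed
.Pp
If the long stored at 0x10 is 0x200, the string test is applied at
offset 0x208.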
.Pp
That way variable length structures can be examined:
.Bd -literal -offset indent
# MS Windows executables are also valid MS-DOS executables
0 string MZ
\*[Gt]0x18 leshort \*[Lt]0x40 MZ executable (MS-DOS)
# skip the whole block below if it is not an extended executable
\*[Gt]0x18 leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l) string PE\e0\e0 PE executable (MS-Windows)
\*[Gt]\*[Gt](0x3c.l) string LX\e0\e0 LX executable (OS/2)
.Ed
.Pp
This strategy of examining has a drawback: you must make sure that
you eventually print something, or users may get empty output (for example, when
there is neither PE\e0\e0 nor LX\e0\e0 in the above example).
.Pp
If this indirect offset cannot be used directly, simple calculations are
possible: appending
.Em [+-*/%\*[Am]|^]number
inside parentheses allows one to modify
the value read from the file before it is used as an offset:
.Bd -literal -offset indent
# MS Windows executables are also valid MS-DOS executables
0 string MZ
# sometimes, the value at 0x18 is less than 0x40 but there's still an
# extended executable, simply appended to the file
\*[Gt]0x18 leshort \*[Lt]0x40
\*[Gt]\*[Gt](4.s*512) leshort 0x014c COFF executable (MS-DOS, DJGPP)
\*[Gt]\*[Gt](4.s*512) leshort !0x014c MZ executable (MS-DOS)
.Ed
.Pp
Sometimes you do not know the exact offset as this depends on the length or
position (when
indirection was used before) of preceding fields.
You can specify an offset relative to the end of the last up-level
field using
.Sq \*[Am]
as a prefix to the offset:
.Bd -literal -offset indent
0 string MZ
\*[Gt]0x18 leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l) string PE\e0\e0 PE executable (MS-Windows)
# immediately following the PE signature is the CPU type
\*[Gt]\*[Gt]\*[Gt]\*[Am]0 leshort 0x14c for Intel 80386
\*[Gt]\*[Gt]\*[Gt]\*[Am]0 leshort 0x184 for DEC Alpha
.Ed
.Pp
Indirect and relative offsets can be combined:
.Bd -literal -offset indent
0 string MZ
\*[Gt]0x18 leshort \*[Lt]0x40
\*[Gt]\*[Gt](4.s*512) leshort !0x014c MZ executable (MS-DOS)
# if it's not COFF, go back 512 bytes and add the offset taken
# from byte 2/3, which is yet another way of finding the start
# of the extended executable
\*[Gt]\*[Gt]\*[Gt]\*[Am](2.s-514) string LE LE executable (MS Windows VxD driver)
.Ed
.Pp
Or the other way around:
.Bd -literal -offset indent
0 string MZ
\*[Gt]0x18 leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l) string LE\e0\e0 LE executable (MS-Windows)
# at offset 0x80 (-4, since relative offsets start at the end
# of the up-level match) inside the LE header, we find the absolute
# offset to the code area, where we look for a specific signature
\*[Gt]\*[Gt]\*[Gt](\*[Am]0x7c.l+0x26) string UPX \eb, UPX compressed
.Ed
.Pp
Or even both!
.Bd -literal -offset indent
0 string MZ
\*[Gt]0x18 leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l) string LE\e0\e0 LE executable (MS-Windows)
# at offset 0x58 inside the LE header, we find the relative offset
# to a data area where we look for a specific signature
\*[Gt]\*[Gt]\*[Gt]\*[Am](\*[Am]0x54.l-3) string UNACE \eb, ACE self-extracting archive
.Ed
.Pp
Finally, if you have to deal with offset/length pairs in your file, even the
second value in a parenthesized expression can be taken from the file itself,
using another set of parentheses.
Note that this additional indirect offset is always relative to the
start of the main indirect offset.
.Bd -literal -offset indent
0 string MZ
\*[Gt]0x18 leshort \*[Gt]0x3f
\*[Gt]\*[Gt](0x3c.l) string PE\e0\e0 PE executable (MS-Windows)
# search for the PE section called ".idata"...
\*[Gt]\*[Gt]\*[Gt]\*[Am]0xf4 search/0x140 .idata
# ...and go to the end of it, calculated from start+length;
# these are located 14 and 10 bytes after the section name
\*[Gt]\*[Gt]\*[Gt]\*[Gt](\*[Am]0xe.l+(-4)) string PK\e3\e4 \eb, ZIP self-extracting archive
.Ed
.Sh SEE ALSO
.Xr file 1
\- the command that reads this file.
.Sh BUGS
The formats
.Dv long ,
.Dv belong ,
.Dv lelong ,
.Dv melong ,
.Dv short ,
.Dv beshort ,
.Dv leshort ,
.Dv date ,
.Dv bedate ,
.Dv medate ,
.Dv ledate ,
.Dv beldate ,
.Dv leldate ,
and
.Dv meldate
are system-dependent; perhaps they should be specified as a number
of bytes (2B, 4B, etc),
since the files being recognized typically come from
a system on which the lengths are invariant.
.\"
.\" From: guy@sun.uucp (Guy Harris)
.\" Newsgroups: net.bugs.usg
.\" Subject: /etc/magic's format isn't well documented
.\" Message-ID: <2752@sun.uucp>
.\" Date: 3 Sep 85 08:19:07 GMT
.\" Organization: Sun Microsystems, Inc.
.\" Lines: 136
.\"
.\" Here's a manual page for the format accepted by the "file" made by adding
.\" the changes I posted to the S5R2 version.
.\"
.\" Modified for Ian Darwin's version of the file command.
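.Sh EXAMPLES
The following entry is a minimal sketch that combines several of the
fields described in this page: a string test, a masked numeric test,
.Xr printf 3
formats in the message, the
.Dq \eb
prefix, and the
.Dq !:mime
and
.Dq !:strength
lines.
The file format, its
.Dq HYPO
signature, its field layout and its MIME type are hypothetical and are
used only for illustration:
.Bd -literal -offset indent
# hypothetical format: 4-byte signature, a one-byte version at
# offset 4, and a big-endian short at offset 5 whose low 12 bits
# hold a record count
0 string HYPO Hypothetical data file
!:mime application/x-hypothetical
!:strength + 10
\*[Gt]4 byte x \eb, version %d
\*[Gt]5 beshort\*[Am]0x0fff \*[Gt]0 \eb, %d records
.Ed
.Pp
The level 1 tests are only tried when the signature at offset 0 matches,
and the record count is printed after the mask has been applied.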