.\"	$NetBSD: bzip2.1,v 1.5 2019/07/21 21:07:12 wiz Exp $
.\"
.Dd July 13, 2019
.Dt BZIP2 1
.Os
.Sh NAME
.Nm bzip2 ,
.Nm bunzip2 ,
.Nm bzcat ,
.Nm bzip2recover
.Nd block-sorting file compressor
.Sh SYNOPSIS
.Nm bzip2
.Op Fl 123456789cdfkLqstVvz
.Op Ar filename ...
.Pp
.Nm bunzip2
.Op Fl fkLVvs
.Op Ar filename ...
.Pp
.Nm bzcat
.Op Fl s
.Op Ar filename ...
.Pp
.Nm bzip2recover
.Ar filename
.Sh DESCRIPTION
.Nm bzip2
compresses files using the Burrows-Wheeler block sorting
text compression algorithm, and Huffman coding.
Compression is generally considerably better than that achieved by
more conventional LZ77/LZ78-based compressors, and approaches the
performance of the PPM family of statistical compressors.
.Pp
.Nm bzcat
decompresses files to stdout, and
.Nm bzip2recover
recovers data from damaged bzip2 files.
.Pp
The command-line options are deliberately very similar to
those of
.Xr gzip 1 ,
but they are not identical.
.Pp
.Nm bzip2
expects a list of file names to accompany the command-line flags.
Each file is replaced by a compressed version of
itself, with the name
.Dq Pa original_name.bz2 .
Each compressed file has the same modification date, permissions, and,
when possible, ownership as the corresponding original, so that these
properties can be correctly restored at decompression time.
File name handling is naive in the sense that there is no mechanism
for preserving original file names, permissions, ownerships or dates
in filesystems which lack these concepts, or have serious file name
length restrictions, such as
.Tn MS-DOS .
.Nm bzip2
and
.Nm bunzip2
will by default not overwrite existing files.
If you want this to happen, specify the
.Fl f
flag.
.Pp
If no file names are specified,
.Nm bzip2
compresses from standard input to standard output.
In this case,
.Nm bzip2
will decline to write compressed output to a terminal, as this would
be entirely incomprehensible and therefore pointless.
.Pp
.Nm bunzip2
(or
.Nm bzip2 Fl d )
decompresses all specified files.
Files which were not created by
.Nm bzip2
will be detected and ignored, and a warning issued.
.Nm bzip2
attempts to guess the filename for the decompressed file
from that of the compressed file as follows:
.Bl -column "filename.tbz2" "becomes" -offset indent
.It Pa filename.bz2  Ta becomes Ta Pa filename
.It Pa filename.bz   Ta becomes Ta Pa filename
.It Pa filename.tbz2 Ta becomes Ta Pa filename.tar
.It Pa filename.tbz  Ta becomes Ta Pa filename.tar
.It Pa anyothername  Ta becomes Ta Pa anyothername.out
.El
.Pp
If the file does not end in one of the recognised endings,
.Pa .bz2 ,
.Pa .bz ,
.Pa .tbz2 ,
or
.Pa .tbz ,
.Nm bzip2
complains that it cannot guess the name of the original file, and uses
the original name with
.Pa .out
appended.
.Pp
As with compression, supplying no filenames causes decompression from
standard input to standard output.
.Pp
.Nm bunzip2
will correctly decompress a file which is the concatenation of two or
more compressed files.
The result is the concatenation of the corresponding uncompressed
files.
Integrity testing
.Pq Fl t
of concatenated compressed files is also supported.
.Pp
You can also compress or decompress files to the standard output by
giving the
.Fl c
flag.
Multiple files may be compressed and decompressed like this.
The resulting outputs are fed sequentially to stdout.
Compression of multiple files in this manner generates a stream
containing multiple compressed file representations.
Such a stream can be decompressed correctly only by
.Nm bzip2
version 0.9.0 or later.
Earlier versions of
.Nm bzip2
will stop after decompressing
the first file in the stream.
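.Pp
As a minimal sketch of the behaviour described above, the following
commands build a stream of two compressed representations and then
decompress it in one pass.
The file names are placeholders chosen for this illustration.
.Bd -literal -offset indent
bzip2 -c first.txt second.txt > both.bz2
bzip2 -dc both.bz2 > both.txt
.Ed
.Pp
With version 0.9.0 or later,
.Pa both.txt
should then hold the contents of
.Pa first.txt
followed by those of
.Pa second.txt .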
.Pp
.Nm bzcat
(or
.Nm bzip2 Fl dc )
decompresses all specified files to the standard output.
.Pp
Compression is always performed, even if the compressed file is
slightly larger than the original.
Files of less than about one hundred bytes tend to get larger, since
the compression mechanism has a constant overhead in the region of 50
bytes.
Random data (including the output of most file compressors) is coded
at about 8.05 bits per byte, giving an expansion of around 0.5%.
.Pp
As a self-check for your protection,
.Nm bzip2
uses 32-bit CRCs to make sure that the decompressed version of a file
is identical to the original.
This guards against corruption of the compressed data, and against
undetected bugs in
.Nm bzip2
(hopefully very unlikely).
The chance of data corruption going undetected is microscopic, about
one in four billion for each file processed.
Be aware, though, that the check occurs upon decompression, so it can
only tell you that something is wrong.
It can't help you recover the original uncompressed data.
You can use
.Nm bzip2recover
to try to recover data from
damaged files.
.Sh OPTIONS
.Bl -tag -width "XXrepetitiveXfastXX"
.It Fl Fl
Treats all subsequent arguments as file names, even if they start with
a dash.
This is so you can handle files with names beginning with a dash, for
example:
.Dl bzip2 -- -myfilename
.It Fl 1 , Fl Fl fast
to
.It Fl 9 , Fl Fl best
Set the block size to 100 k, 200 k ... 900 k when compressing.
Has no effect when decompressing.
See
.Sx MEMORY MANAGEMENT
below.
The
.Fl Fl fast
and
.Fl Fl best
aliases are primarily for GNU
.Xr gzip 1
compatibility.
In particular,
.Fl Fl fast
doesn't make things significantly faster, and
.Fl Fl best
merely selects the default behaviour.
.It Fl c , Fl Fl stdout
Compress or decompress to standard output.
.It Fl d , Fl Fl decompress
Force decompression.
.Nm bzip2 ,
.Nm bunzip2 ,
and
.Nm bzcat
are really the same program, and the decision about what actions to
take is made on the basis of which name is used.
This flag overrides that mechanism, and forces
.Nm bzip2
to decompress.
.It Fl f , Fl Fl force
Force overwrite of output files.
Normally,
.Nm bzip2
will not overwrite existing output files.
Also forces
.Nm bzip2
to break hard links
to files, which it otherwise wouldn't do.
.Pp
.Nm bzip2
normally declines to decompress files which don't have the correct
magic header bytes.
If forced
.Pq Fl f ,
however, it will pass such files through unmodified.
This is how GNU
.Xr gzip 1
behaves.
.It Fl k , Fl Fl keep
Keep (don't delete) input files during compression
or decompression.
.It Fl L , Fl Fl license
Display the license terms and conditions.
.It Fl q , Fl Fl quiet
Suppress non-essential warning messages.
Messages pertaining to I/O errors and other critical events will not
be suppressed.
.It Fl Fl repetitive-fast
.It Fl Fl repetitive-best
These flags are redundant in versions 0.9.5 and above.
They provided some coarse control over the behaviour of the sorting
algorithm in earlier versions, which was sometimes useful.
0.9.5 and above have an improved algorithm which renders these flags
irrelevant.
.It Fl s , Fl Fl small
Reduce memory usage, for compression, decompression and testing.
Files are decompressed and tested using a modified algorithm which
only requires 2.5 bytes per block byte.
This means any file can be decompressed in 2300k of memory, albeit at
about half the normal speed.
During compression,
.Fl s
selects a block size of 200k, which limits memory use to around the
same figure, at the expense of your compression ratio.
In short, if your machine is low on memory (8 megabytes or less), use
.Fl s
for everything.
See
.Sx MEMORY MANAGEMENT
below.
.It Fl t , Fl Fl test
Check integrity of the specified file(s), but don't decompress them.
This really performs a trial decompression and throws away the result.
.It Fl V , Fl Fl version
Display the software version.
.It Fl v , Fl Fl verbose
Verbose mode: show the compression ratio for each file processed.
Further
.Fl v Ap s
increase the verbosity level, spewing out lots of information which is
primarily of interest for diagnostic purposes.
.It Fl z , Fl Fl compress
The complement to
.Fl d :
forces compression, regardless of the invocation name.
.El
.Ss MEMORY MANAGEMENT
.Nm bzip2
compresses large files in blocks.
The block size affects both the compression ratio achieved, and the
amount of memory needed for compression and decompression.
The flags
.Fl 1
through
.Fl 9
specify the block size to be 100,000 bytes through 900,000 bytes (the
default) respectively.
At decompression time, the block size used for compression is read
from the header of the compressed file, and
.Nm bunzip2
then allocates itself just enough memory to decompress the file.
Since block sizes are stored in compressed files, it follows that the
flags
.Fl 1
to
.Fl 9
are irrelevant to and so ignored during decompression.
.Pp
Compression and decompression requirements, in bytes, can be estimated
as:
.Bl -tag -width "Decompression:" -offset indent
.It Compression :
400k + ( 8 x block size )
.It Decompression :
100k + ( 4 x block size ), or 100k + ( 2.5 x block size )
.El
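.Pp
As a worked check of these estimates, here they are evaluated for the
default 900k block size selected by
.Fl 9 ,
taking the 2.5 x variant to be the reduced-memory algorithm selected by
.Fl s :
.Bd -literal -offset indent
Compression:         400k + ( 8   x 900k ) = 7600k
Decompression:       100k + ( 4   x 900k ) = 3700k
Decompression (-s):  100k + ( 2.5 x 900k ) = 2350k
.Ed
.Pp
These agree with the figures quoted for
.Fl 9
elsewhere in this section.
.Pp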
Larger block sizes give rapidly diminishing marginal returns.
Most of the compression comes from the first two or three hundred k of
block size, a fact worth bearing in mind when using
.Nm bzip2
on small machines.
It is also important to appreciate that the decompression memory
requirement is set at compression time by the choice of block size.
.Pp
For files compressed with the default 900k block size,
.Nm bunzip2
will require about 3700 kbytes to decompress.
To support decompression of any file on a 4 megabyte machine,
.Nm bunzip2
has an option to decompress using approximately half this amount of
memory, about 2300 kbytes.
Decompression speed is also halved, so you should use this option only
where necessary.
The relevant flag is
.Fl s .
.Pp
In general, try to use the largest block size memory constraints
allow, since that maximises the compression achieved.
Compression and decompression speed are virtually unaffected by block
size.
.Pp
Another significant point applies to files which fit in a single block
-- that means most files you'd encounter using a large block size.
The amount of real memory touched is proportional to the size of the
file, since the file is smaller than a block.
For example, compressing a file 20,000 bytes long with the flag
.Fl 9
will cause the compressor to allocate around 7600k of memory, but only
touch 400k + 20000 * 8 = 560 kbytes of it.
Similarly, the decompressor will allocate 3700k but only touch 100k +
20000 * 4 = 180 kbytes.
.Pp
Here is a table which summarises the maximum memory usage for different
block sizes.
Also recorded is the total compressed size for 14 files of the Calgary
Text Compression Corpus totalling 3,141,622 bytes.
This column gives some feel for how compression varies with block size.
These figures tend to understate the advantage of larger block sizes
for larger files, since the Corpus is dominated by smaller files.
.Bl -column "Flag" "Compression" "Decompression" "DecompressionXXs" "Corpus size"
.It Sy Flag Ta Sy Compression Ta Sy Decompression Ta Sy Decompression Fl s Ta Sy Corpus size
.It -1 Ta 1200k Ta  500k Ta  350k Ta 914704
.It -2 Ta 2000k Ta  900k Ta  600k Ta 877703
.It -3 Ta 2800k Ta 1300k Ta  850k Ta 860338
.It -4 Ta 3600k Ta 1700k Ta 1100k Ta 846899
.It -5 Ta 4400k Ta 2100k Ta 1350k Ta 845160
.It -6 Ta 5200k Ta 2500k Ta 1600k Ta 838626
.It -7 Ta 6100k Ta 2900k Ta 1850k Ta 834096
.It -8 Ta 6800k Ta 3300k Ta 2100k Ta 828642
.It -9 Ta 7600k Ta 3700k Ta 2350k Ta 828642
.El
.Ss RECOVERING DATA FROM DAMAGED FILES
.Nm bzip2
compresses files in blocks, usually 900 kbytes long.
Each block is handled independently.
If a media or transmission error causes a multi-block
.Pa .bz2
file to become damaged, it may be possible to recover data from the
undamaged blocks in the file.
.Pp
The compressed representation of each block is delimited by a 48-bit
pattern, which makes it possible to find the block boundaries with
reasonable certainty.
Each block also carries its own 32-bit CRC, so damaged blocks can be
distinguished from undamaged ones.
.Pp
.Nm bzip2recover
is a simple program whose purpose is to search for blocks in
.Pa .bz2
files, and write each block out into its own
.Pa .bz2
file.
You can then use
.Nm bzip2
.Fl t
to test the integrity of the resulting files, and decompress those
which are undamaged.
.Pp
.Nm bzip2recover
takes a single argument, the name of the damaged file, and writes a
number of files
.Dq Pa rec00001file.bz2 ,
.Dq Pa rec00002file.bz2 ,
etc., containing the extracted blocks.
The output filenames are designed so that the use of wildcards in
subsequent processing -- for example,
.Dl bzip2 -dc rec*file.bz2 \*[Gt] recovered_data
-- processes the files in the correct order.
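.Pp
Put together, a recovery session might look like the following sketch.
The name
.Pa damaged.bz2
is a placeholder, and the wildcard assumes the output naming scheme
described above.
.Bd -literal -offset indent
bzip2recover damaged.bz2
bzip2 -t rec*damaged.bz2
bzip2 -dc rec*damaged.bz2 > recovered_data
.Ed
.Pp
Any block file that fails the
.Fl t
step would need to be left out of the final command, so the recovered
output will have a gap at that point.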
.Pp
.Nm bzip2recover
should be of most use dealing with large
.Pa .bz2
files, as these will contain many blocks.
It is clearly futile to use it on damaged single-block files, since a
damaged block cannot be recovered.
If you wish to minimise any potential data loss through media or
transmission errors, you might consider compressing with a smaller
block size.
.Ss PERFORMANCE NOTES
The sorting phase of compression gathers together similar strings in
the file.
Because of this, files containing very long runs of repeated
symbols, like
.Dq aabaabaabaab...
(repeated several hundred times) may compress more slowly than normal.
Versions 0.9.5 and above fare much better than previous versions in
this respect.
The ratio between worst-case and average-case compression time is in
the region of 10:1.
For previous versions, this figure was more like 100:1.
You can use the
.Fl vvvv
option to monitor progress in great detail, if you want.
.Pp
Decompression speed is unaffected by these phenomena.
.Pp
.Nm bzip2
usually allocates several megabytes of memory to operate in, and then
charges all over it in a fairly random fashion.
This means that performance, both for compressing and decompressing,
is largely determined by the speed at which your machine can service
cache misses.
Because of this, small changes to the code to reduce the miss rate
have been observed to give disproportionately large performance
improvements.
I imagine
.Nm bzip2
will perform best on machines with very large caches.
.Sh ENVIRONMENT
.Nm bzip2
will read arguments from the environment variables
.Ev BZIP2
and
.Ev BZIP ,
in that order, and will process them before any arguments read from
the command line.
This gives a convenient way to supply default arguments.
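.Pp
For instance, assuming a Bourne-style shell, defaults could be supplied
as in the following sketch; the choice of flags and the file name are
only illustrative.
.Bd -literal -offset indent
BZIP2=-kv
export BZIP2
bzip2 somefile
.Ed
.Pp
Here
.Pa somefile
would be compressed as though
.Fl k
and
.Fl v
had been given on the command line.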
.Sh EXIT STATUS
0 for a normal exit, 1 for environmental problems (file not found,
invalid flags, I/O errors, etc.), 2 to indicate a corrupt compressed
file, 3 for an internal consistency error (e.g., bug) which caused
.Nm bzip2
to panic.
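.Pp
A script can branch on these values.
The following Bourne shell sketch, which uses a placeholder file name,
shows one way to do so:
.Bd -literal -offset indent
bzip2 -t archive.bz2
case $? in
0) echo "archive.bz2 is intact" ;;
2) echo "archive.bz2 is corrupt" ;;
*) echo "bzip2 could not test archive.bz2" ;;
esac
.Ed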
.Sh AUTHORS
.An -nosplit
.An Julian Seward
.Aq jseward@bzip.org
.Pp
.Pa http://www.bzip.org
.Pp
The ideas embodied in
.Nm bzip2
are due to (at least) the following people:
.An Michael Burrows
and
.An David Wheeler
(for the block sorting transformation),
.An David Wheeler
(again, for the Huffman coder),
.An Peter Fenwick
(for the structured coding model in the original
.Nm bzip ,
and many refinements), and
.An Alistair Moffat ,
.An Radford Neal ,
and
.An Ian Witten
(for the arithmetic coder in the original
.Nm bzip ) .
I am much indebted for their help, support and advice.
See the manual in the source distribution for pointers to sources of
documentation.
Christian von Roques encouraged me to look for faster sorting
algorithms, so as to speed up compression.
Bela Lubkin encouraged me to improve the worst-case compression
performance.
Donna Robinson XMLised the documentation.
The bz* scripts are derived from those of GNU gzip.
Many people sent patches, helped with portability problems, lent
machines, gave advice and were generally helpful.
.Sh CAVEATS
I/O error messages are not as helpful as they could be.
.Nm bzip2
tries hard to detect I/O errors and exit cleanly, but the details of
what the problem is sometimes seem rather misleading.
.Pp
This manual page pertains to version 1.0.8 of
.Nm bzip2 .
Compressed data created by this version is entirely forwards and
backwards compatible with the previous public releases, versions
0.1pl2, 0.9.0, 0.9.5, 1.0.0, 1.0.1, 1.0.2 and above, but with the
following exception: 0.9.0 and above can correctly decompress multiple
concatenated compressed files.
0.1pl2 cannot do this; it will stop after decompressing just the first
file in the stream.
.Pp
.Nm bzip2recover
versions prior to 1.0.2 used 32-bit integers to represent bit
positions in compressed files, so they could not handle compressed
files more than 512 megabytes long.
Versions 1.0.2 and above use 64-bit integers on some platforms which
support them (GNU supported targets, and Windows).
To establish whether or not
.Nm bzip2recover
was built with such a limitation, run it without arguments.
In any event you can build yourself an unlimited version if you can
recompile it with MaybeUInt64 set to be an unsigned 64-bit integer.