This is the README for bzip2, a block-sorting file compressor, version
1.0.  This version is fully compatible with the previous public
releases, bzip2-0.1pl2, bzip2-0.9.0 and bzip2-0.9.5.

bzip2-1.0 is distributed under a BSD-style license.  For details,
see the file LICENSE.

Complete documentation is available in PostScript form (manual.ps) or
HTML (manual_toc.html).  A plain-text version of the manual page is
available as bzip2.txt.  A statement about Y2K issues is now included
in the file Y2K_INFO.


HOW TO BUILD -- UNIX

Type `make'.  This builds the library libbz2.a and then the
programs bzip2 and bzip2recover.  Six self-tests are run.
If the self-tests complete ok, carry on to installation:

To install in /usr/bin, /usr/lib, /usr/man and /usr/include, type
   make install
To install somewhere else, e.g., /xxx/yyy/{bin,lib,man,include}, type
   make install PREFIX=/xxx/yyy
If you are (justifiably) paranoid and want to see what 'make install'
is going to do, you can first do
   make -n install                      or
   make -n install PREFIX=/xxx/yyy      respectively.
The -n instructs make to show the commands it would execute, but
not actually execute them.


HOW TO BUILD -- UNIX, shared library libbz2.so.

Do 'make -f Makefile-libbz2_so'.  This Makefile seems to work for
Linux-ELF (RedHat 5.2 on an x86 box), with gcc.  I make no claims
that it works for any other platform, though I suspect it probably
will work for most platforms employing both ELF and gcc.

bzip2-shared, a client of the shared library, is also built, but
not self-tested.  So I suggest you also build using the normal
Makefile, since that conducts a self-test.

Important note for people upgrading .so's from 0.9.0/0.9.5 to
version 1.0.  All the functions in the library have been renamed,
from (e.g.) bzCompress to BZ2_bzCompress, to avoid namespace pollution.
Unfortunately this means that the libbz2.so created by
Makefile-libbz2_so will not work with any program which used an
older version of the library.  Sorry.  I do encourage library
clients to make the effort to upgrade to use version 1.0, since
it is both faster and more robust than previous versions.


HOW TO BUILD -- Windows 95, NT, DOS, Mac, etc.

It's difficult for me to support compilation on all these platforms.
My approach is to collect binaries for these platforms, and put them
on the master web page (http://sourceware.cygnus.com/bzip2).  Look
there.  However (FWIW), bzip2-1.0 is very standard ANSI C and should
compile unmodified with MS Visual C.  For Win32, there is one
important caveat: in bzip2.c, you must set BZ_UNIX to 0 and
BZ_LCCWIN32 to 1 before building.  If you have difficulties building,
you might want to read README.COMPILATION.PROBLEMS.


VALIDATION

Correct operation, in the sense that a compressed file can always be
decompressed to reproduce the original, is obviously of paramount
importance.  To validate bzip2, I used a modified version of Mark
Nelson's churn program.  Churn is an automated test driver which
recursively traverses a directory structure, using bzip2 to compress
and then decompress each file it encounters, and checking that the
decompressed data is the same as the original.  There are more details
in Section 4 of the user guide.
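
The churn-style check described above can be sketched in a few lines
of Python.  This is only an illustration, not the modified churn
program itself: it assumes Python's standard bz2 module as a stand-in
for the bzip2 executable, it reads each file fully into memory, and
the function name churn_check is made up for this example.

```python
import bz2
import os


def churn_check(root):
    """Round-trip every file under root; return the paths that fail.

    Assumption: Python's bz2 module stands in for the bzip2 program.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                original = f.read()
            # Compress, decompress, and compare with the original bytes.
            restored = bz2.decompress(bz2.compress(original))
            if restored != original:
                failures.append(path)
    return failures
```

Calling churn_check on a directory returns an empty list when every
file survives the round trip.  Unreadable files or dangling symbolic
links would raise an exception in this sketch; a real test driver
would need to handle those cases.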



Please read and be aware of the following:

WARNING:

   This program (attempts to) compress data by performing several
   non-trivial transformations on it.  Unless you are 100% familiar
   with *all* the algorithms contained herein, and with the
   consequences of modifying them, you should NOT meddle with the
   compression or decompression machinery.  Incorrect changes can and
   very likely *will* lead to disastrous loss of data.


DISCLAIMER:

   I TAKE NO RESPONSIBILITY FOR ANY LOSS OF DATA ARISING FROM THE
   USE OF THIS PROGRAM, HOWSOEVER CAUSED.

   Every compression of a file implies an assumption that the
   compressed file can be decompressed to reproduce the original.
   Great efforts in design, coding and testing have been made to
   ensure that this program works correctly.  However, the complexity
   of the algorithms, and, in particular, the presence of various
   special cases in the code which occur with very low but non-zero
   probability make it impossible to rule out the possibility of bugs
   remaining in the program.  DO NOT COMPRESS ANY DATA WITH THIS
   PROGRAM UNLESS YOU ARE PREPARED TO ACCEPT THE POSSIBILITY, HOWEVER
   SMALL, THAT THE DATA WILL NOT BE RECOVERABLE.

   That is not to say this program is inherently unreliable.  Indeed,
   I very much hope the opposite is true.  bzip2 has been carefully
   constructed and extensively tested.


PATENTS:

   To the best of my knowledge, bzip2 does not use any patented
   algorithms.  However, I do not have the resources available to
   carry out a full patent search.  Therefore I cannot give any
   guarantee of the above statement.

End of legalities.


WHAT'S NEW IN 0.9.0 (as compared to 0.1pl2)?

   * Approx 10% faster compression, 30% faster decompression
   * -t (test mode) is a lot quicker
   * Can decompress concatenated compressed files
   * Programming interface, so programs can directly read/write .bz2 files
   * Less restrictive (BSD-style) licensing
   * Flag handling more compatible with GNU gzip
   * Much more documentation, i.e., a proper user manual
   * Hopefully, improved portability (at least of the library)
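
The item about concatenated compressed files can be illustrated as
follows.  This is a sketch that assumes Python's standard bz2 module
(version 3.3 or later, which accepts multi-stream input) rather than
the bzip2 command-line tool; the helper name concat_streams is
hypothetical.  Two independently compressed streams are joined byte
for byte, and decompressing the result yields the two originals back
to back.

```python
import bz2


def concat_streams(chunks):
    """Compress each chunk on its own and join the resulting streams."""
    return b"".join(bz2.compress(chunk) for chunk in chunks)


chunks = [b"first file\n", b"second file\n"]
joined = concat_streams(chunks)

# Multi-stream input is decompressed as one continuous output.
restored = bz2.decompress(joined)
assert restored == b"".join(chunks)
```

With the command-line tool, the corresponding experiment would be to
concatenate two .bz2 files and decompress the combined file.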

WHAT'S NEW IN 0.9.5?

   * Compression speed is much less sensitive to the input
     data than in previous versions.  Specifically, the very
     slow performance caused by repetitive data is fixed.
   * Many small improvements in file and flag handling.
   * A Y2K statement.

WHAT'S NEW IN 1.0

   See the CHANGES file.

I hope you find bzip2 useful.  Feel free to contact me at
   jseward@acm.org
if you have any suggestions or queries.  Many people mailed me with
comments, suggestions and patches after the releases of bzip-0.15,
bzip-0.21, bzip2-0.1pl2 and bzip2-0.9.0, and the changes in bzip2 are
largely a result of this feedback.  I thank you for your comments.

At least for the time being, bzip2's "home" is (or can be reached via)
http://www.muraroa.demon.co.uk.

Julian Seward
jseward@acm.org

Cambridge, UK
18   July 1996 (version 0.15)
25 August 1996 (version 0.21)
 7 August 1997 (bzip2, version 0.1)
29 August 1997 (bzip2, version 0.1pl2)
23 August 1998 (bzip2, version 0.9.0)
 8   June 1999 (bzip2, version 0.9.5)
 4   Sept 1999 (bzip2, version 0.9.5d)
 5    May 2000 (bzip2, version 1.0pre8)