<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>bzip2 and libbzip2, version 1.0.6</title>
<meta name="generator" content="DocBook XSL Stylesheets V1.75.2">
<style type="text/css" media="screen">/* Colours:
#74240f  dark brown      h1, h2, h3, h4
#336699  medium blue     links
#339999  turquoise       link hover colour
#202020  almost black    general text
#761596  purple          md5sum text
#626262  dark gray       pre border
#eeeeee  very light gray pre background
#f2f2f9  very light blue nav table background
#3366cc  medium blue     nav table border
*/

a, a:link, a:visited, a:active { color: #336699; }
a:hover { color: #339999; }

body { font: 80%/126% sans-serif; }
h1, h2, h3, h4 { color: #74240f; }

dt { color: #336699; font-weight: bold }
dd {
 margin-left: 1.5em;
 padding-bottom: 0.8em;
}

/* -- ruler -- */
div.hr_blue {
  height:  3px;
  background:#ffffff url("/images/hr_blue.png") repeat-x; }
div.hr_blue hr { display:none; }

/* release styles */
#release p { margin-top: 0.4em; }
#release .md5sum { color: #761596; }


/* ------ styles for docs|manuals|howto ------ */
/* -- lists -- */
ul  {
 margin:     0px 4px 16px 16px;
 padding:    0px;
 list-style: url("/images/li-blue.png");
}
ul li {
 margin-bottom: 10px;
}
ul ul	{
 list-style-type:  none;
 list-style-image: none;
 margin-left:      0px;
}

/* header / footer nav tables */
table.nav {
 border:     solid 1px #3366cc;
 background: #f2f2f9;
 background-color: #f2f2f9;
 margin-bottom: 0.5em;
}
/* don't have underlined links in chunked nav menus */
table.nav a { text-decoration: none; }
table.nav a:hover { text-decoration: underline; }
table.nav td { font-size: 85%; }

code, tt, pre { font-size: 120%; }
code, tt { color: #761596; }

div.literallayout, pre.programlisting, pre.screen {
 color:      #000000;
 padding:    0.5em;
 background: #eeeeee;
 border:     1px solid #626262;
 background-color: #eeeeee;
 margin: 4px 0px 4px 0px;
}
</style>
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div lang="en" class="book" title="bzip2 and libbzip2, version 1.0.6">
<div class="titlepage">
<div>
<div><h1 class="title">
<a name="userman"></a>bzip2 and libbzip2, version 1.0.6</h1></div>
<div><h2 class="subtitle">A program and library for data compression</h2></div>
<div><div class="authorgroup"><div class="author">
<h3 class="author">
<span class="firstname">Julian</span> <span class="surname">Seward</span>
</h3>
<div class="affiliation"><span class="orgname">http://www.bzip.org<br></span></div>
</div></div></div>
<div><p class="releaseinfo">Version 1.0.6 of 6 September 2010</p></div>
<div><p class="copyright">Copyright &copy; 1996-2010 Julian Seward</p></div>
<div><div class="legalnotice" title="Legal Notice">
<a name="id537185"></a><p>This program, <code class="computeroutput">bzip2</code>, the
  associated library <code class="computeroutput">libbzip2</code>, and
  all documentation, are copyright &copy; 1996-2010 Julian Seward.
  All rights reserved.</p>
<p>Redistribution and use in source and binary forms, with
  or without modification, are permitted provided that the
  following conditions are met:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p>Redistributions of source code must retain the
   above copyright notice, this list of conditions and the
   following disclaimer.</p></li>
<li class="listitem" style="list-style-type: disc"><p>The origin of this software must not be
   misrepresented; you must not claim that you wrote the original
   software.  If you use this software in a product, an
   acknowledgment in the product documentation would be
   appreciated but is not required.</p></li>
<li class="listitem" style="list-style-type: disc"><p>Altered source versions must be plainly marked
   as such, and must not be misrepresented as being the original
   software.</p></li>
<li class="listitem" style="list-style-type: disc"><p>The name of the author may not be used to
   endorse or promote products derived from this software without
   specific prior written permission.</p></li>
</ul></div>
<p>THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY
  EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
  THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
  PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
  AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
  EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
  TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
  ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
  IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
  THE POSSIBILITY OF SUCH DAMAGE.</p>
<p>PATENTS: To the best of my knowledge,
 <code class="computeroutput">bzip2</code> and
 <code class="computeroutput">libbzip2</code> do not use any patented
 algorithms.  However, I do not have the resources to carry
 out a patent search.  Therefore I cannot give any guarantee of
 the above statement.
 </p>
</div></div>
</div>
<hr>
</div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="chapter"><a href="#intro">1. Introduction</a></span></dt>
<dt><span class="chapter"><a href="#using">2. How to use bzip2</a></span></dt>
<dd><dl>
<dt><span class="sect1"><a href="#name">2.1. NAME</a></span></dt>
<dt><span class="sect1"><a href="#synopsis">2.2. SYNOPSIS</a></span></dt>
<dt><span class="sect1"><a href="#description">2.3. DESCRIPTION</a></span></dt>
<dt><span class="sect1"><a href="#options">2.4. OPTIONS</a></span></dt>
<dt><span class="sect1"><a href="#memory-management">2.5. MEMORY MANAGEMENT</a></span></dt>
<dt><span class="sect1"><a href="#recovering">2.6. RECOVERING DATA FROM DAMAGED FILES</a></span></dt>
<dt><span class="sect1"><a href="#performance">2.7. PERFORMANCE NOTES</a></span></dt>
<dt><span class="sect1"><a href="#caveats">2.8. CAVEATS</a></span></dt>
<dt><span class="sect1"><a href="#author">2.9. AUTHOR</a></span></dt>
</dl></dd>
<dt><span class="chapter"><a href="#libprog">3.
Programming with <code class="computeroutput">libbzip2</code>
</a></span></dt>
<dd><dl>
<dt><span class="sect1"><a href="#top-level">3.1. Top-level structure</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#ll-summary">3.1.1. Low-level summary</a></span></dt>
<dt><span class="sect2"><a href="#hl-summary">3.1.2. High-level summary</a></span></dt>
<dt><span class="sect2"><a href="#util-fns-summary">3.1.3. Utility functions summary</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#err-handling">3.2. Error handling</a></span></dt>
<dt><span class="sect1"><a href="#low-level">3.3. Low-level interface</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#bzcompress-init">3.3.1. BZ2_bzCompressInit</a></span></dt>
<dt><span class="sect2"><a href="#bzCompress">3.3.2. BZ2_bzCompress</a></span></dt>
<dt><span class="sect2"><a href="#bzCompress-end">3.3.3. BZ2_bzCompressEnd</a></span></dt>
<dt><span class="sect2"><a href="#bzDecompress-init">3.3.4. BZ2_bzDecompressInit</a></span></dt>
<dt><span class="sect2"><a href="#bzDecompress">3.3.5. BZ2_bzDecompress</a></span></dt>
<dt><span class="sect2"><a href="#bzDecompress-end">3.3.6. BZ2_bzDecompressEnd</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#hl-interface">3.4. High-level interface</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#bzreadopen">3.4.1. BZ2_bzReadOpen</a></span></dt>
<dt><span class="sect2"><a href="#bzread">3.4.2. BZ2_bzRead</a></span></dt>
<dt><span class="sect2"><a href="#bzreadgetunused">3.4.3. BZ2_bzReadGetUnused</a></span></dt>
<dt><span class="sect2"><a href="#bzreadclose">3.4.4. BZ2_bzReadClose</a></span></dt>
<dt><span class="sect2"><a href="#bzwriteopen">3.4.5. BZ2_bzWriteOpen</a></span></dt>
<dt><span class="sect2"><a href="#bzwrite">3.4.6. BZ2_bzWrite</a></span></dt>
<dt><span class="sect2"><a href="#bzwriteclose">3.4.7. BZ2_bzWriteClose</a></span></dt>
<dt><span class="sect2"><a href="#embed">3.4.8. Handling embedded compressed data streams</a></span></dt>
<dt><span class="sect2"><a href="#std-rdwr">3.4.9. Standard file-reading/writing code</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#util-fns">3.5. Utility functions</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#bzbufftobuffcompress">3.5.1. BZ2_bzBuffToBuffCompress</a></span></dt>
<dt><span class="sect2"><a href="#bzbufftobuffdecompress">3.5.2. BZ2_bzBuffToBuffDecompress</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#zlib-compat">3.6. zlib compatibility functions</a></span></dt>
<dt><span class="sect1"><a href="#stdio-free">3.7. Using the library in a stdio-free environment</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#stdio-bye">3.7.1. Getting rid of stdio</a></span></dt>
<dt><span class="sect2"><a href="#critical-error">3.7.2. Critical error handling</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#win-dll">3.8. Making a Windows DLL</a></span></dt>
</dl></dd>
<dt><span class="chapter"><a href="#misc">4. Miscellanea</a></span></dt>
<dd><dl>
<dt><span class="sect1"><a href="#limits">4.1. Limitations of the compressed file format</a></span></dt>
<dt><span class="sect1"><a href="#port-issues">4.2. Portability issues</a></span></dt>
<dt><span class="sect1"><a href="#bugs">4.3. Reporting bugs</a></span></dt>
<dt><span class="sect1"><a href="#package">4.4. Did you get the right package?</a></span></dt>
<dt><span class="sect1"><a href="#reading">4.5. Further Reading</a></span></dt>
</dl></dd>
</dl>
</div>
<div class="chapter" title="1.&nbsp;Introduction">
<div class="titlepage"><div><div><h2 class="title">
<a name="intro"></a>1.&nbsp;Introduction</h2></div></div></div>
<p><code class="computeroutput">bzip2</code> compresses files
using the Burrows-Wheeler block-sorting text compression
algorithm, and Huffman coding.  Compression is generally
considerably better than that achieved by more conventional
LZ77/LZ78-based compressors, and approaches the performance of
the PPM family of statistical compressors.</p>
<p><code class="computeroutput">bzip2</code> is built on top of
<code class="computeroutput">libbzip2</code>, a flexible library for
handling compressed data in the
<code class="computeroutput">bzip2</code> format.  This manual
describes both how to use the program and how to work with the
library interface.  Most of the manual is devoted to this
library, not the program, which is good news if your interest is
only in the program.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p><a class="xref" href="#using" title="2.&nbsp;How to use bzip2">How to use bzip2</a> describes how to use
 <code class="computeroutput">bzip2</code>; this is the only part
 you need to read if you just want to know how to operate the
 program.</p></li>
<li class="listitem" style="list-style-type: disc"><p><a class="xref" href="#libprog" title="3.&nbsp; Programming with libbzip2">Programming with libbzip2</a> describes the
 programming interfaces in detail, and</p></li>
<li class="listitem" style="list-style-type: disc"><p><a class="xref" href="#misc" title="4.&nbsp;Miscellanea">Miscellanea</a> records some
 miscellaneous notes which I thought ought to be recorded
 somewhere.</p></li>
</ul></div>
</div>
<div class="chapter" title="2.&nbsp;How to use bzip2">
<div class="titlepage"><div><div><h2 class="title">
<a name="using"></a>2.&nbsp;How to use bzip2</h2></div></div></div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="sect1"><a href="#name">2.1. NAME</a></span></dt>
<dt><span class="sect1"><a href="#synopsis">2.2. SYNOPSIS</a></span></dt>
<dt><span class="sect1"><a href="#description">2.3. DESCRIPTION</a></span></dt>
<dt><span class="sect1"><a href="#options">2.4. OPTIONS</a></span></dt>
<dt><span class="sect1"><a href="#memory-management">2.5. MEMORY MANAGEMENT</a></span></dt>
<dt><span class="sect1"><a href="#recovering">2.6. RECOVERING DATA FROM DAMAGED FILES</a></span></dt>
<dt><span class="sect1"><a href="#performance">2.7. PERFORMANCE NOTES</a></span></dt>
<dt><span class="sect1"><a href="#caveats">2.8. CAVEATS</a></span></dt>
<dt><span class="sect1"><a href="#author">2.9. AUTHOR</a></span></dt>
</dl>
</div>
<p>This chapter contains a copy of the
<code class="computeroutput">bzip2</code> man page, and nothing
else.</p>
<div class="sect1" title="2.1.&nbsp;NAME">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="name"></a>2.1.&nbsp;NAME</h2></div></div></div>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2</code>,
  <code class="computeroutput">bunzip2</code> - a block-sorting file
  compressor, v1.0.6</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzcat</code> -
   decompresses files to stdout</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2recover</code> -
   recovers data from damaged bzip2 files</p></li>
</ul></div>
</div>
<div class="sect1" title="2.2.&nbsp;SYNOPSIS">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="synopsis"></a>2.2.&nbsp;SYNOPSIS</h2></div></div></div>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2</code> [
  -cdfkqstvzVL123456789 ] [ filenames ...  ]</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bunzip2</code> [
  -fkvsVL ] [ filenames ...  ]</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzcat</code> [ -s ] [
  filenames ...  ]</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2recover</code>
  filename</p></li>
</ul></div>
</div>
<div class="sect1" title="2.3.&nbsp;DESCRIPTION">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="description"></a>2.3.&nbsp;DESCRIPTION</h2></div></div></div>
<p><code class="computeroutput">bzip2</code> compresses files
using the Burrows-Wheeler block sorting text compression
algorithm, and Huffman coding.  Compression is generally
considerably better than that achieved by more conventional
LZ77/LZ78-based compressors, and approaches the performance of
the PPM family of statistical compressors.</p>
<p>The command-line options are deliberately very similar to
those of GNU <code class="computeroutput">gzip</code>, but they are
not identical.</p>
<p><code class="computeroutput">bzip2</code> expects a list of
file names to accompany the command-line flags.  Each file is
replaced by a compressed version of itself, with the name
<code class="computeroutput">original_name.bz2</code>.  Each
compressed file has the same modification date, permissions, and,
when possible, ownership as the corresponding original, so that
these properties can be correctly restored at decompression time.
File name handling is naive in the sense that there is no
mechanism for preserving original file names, permissions,
ownerships or dates in filesystems which lack these concepts, or
have serious file name length restrictions, such as
MS-DOS.</p>
<p><code class="computeroutput">bzip2</code> and
<code class="computeroutput">bunzip2</code> will by default not
overwrite existing files.  If you want this to happen, specify
the <code class="computeroutput">-f</code> flag.</p>
<p>If no file names are specified,
<code class="computeroutput">bzip2</code> compresses from standard
input to standard output.  In this case,
<code class="computeroutput">bzip2</code> will decline to write
compressed output to a terminal, as this would be entirely
incomprehensible and therefore pointless.</p>
<p><code class="computeroutput">bunzip2</code> (or
<code class="computeroutput">bzip2 -d</code>) decompresses all
specified files.  Files which were not created by
<code class="computeroutput">bzip2</code> will be detected and
ignored, and a warning issued.
<code class="computeroutput">bzip2</code> attempts to guess the
filename for the decompressed file from that of the compressed
file as follows:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">filename.bz2 </code>
  becomes
  <code class="computeroutput">filename</code></p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">filename.bz </code>
  becomes
  <code class="computeroutput">filename</code></p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">filename.tbz2</code>
  becomes
  <code class="computeroutput">filename.tar</code></p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">filename.tbz </code>
  becomes
  <code class="computeroutput">filename.tar</code></p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">anyothername </code>
  becomes
  <code class="computeroutput">anyothername.out</code></p></li>
</ul></div>
<p>If the file does not end in one of the recognised endings,
<code class="computeroutput">.bz2</code>,
<code class="computeroutput">.bz</code>,
<code class="computeroutput">.tbz2</code> or
<code class="computeroutput">.tbz</code>,
<code class="computeroutput">bzip2</code> complains that it cannot
guess the name of the original file, and uses the original name
with <code class="computeroutput">.out</code> appended.</p>
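<p>The renaming rules above can be sketched in a few lines of Python.
This is an illustration of the table only, not the code used by
<code class="computeroutput">bzip2</code> itself; the names
<code class="computeroutput">SUFFIX_MAP</code> and
<code class="computeroutput">guess_output_name</code> are hypothetical.</p>

```python
# Suffix rewrites taken from the list in the manual.
# SUFFIX_MAP and guess_output_name are illustrative names, not part of bzip2.
SUFFIX_MAP = [
    (".bz2", ""),
    (".bz", ""),
    (".tbz2", ".tar"),
    (".tbz", ".tar"),
]


def guess_output_name(compressed_name: str) -> str:
    """Return the name the decompressed file would be given."""
    for suffix, replacement in SUFFIX_MAP:
        if compressed_name.endswith(suffix):
            return compressed_name[: -len(suffix)] + replacement
    # No recognised ending: keep the whole name and append ".out".
    return compressed_name + ".out"


for name in ("notes.bz2", "notes.bz", "src.tbz2", "src.tbz", "mystery"):
    print(name, "->", guess_output_name(name))
```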
<p>As with compression, supplying no filenames causes
decompression from standard input to standard output.</p>
<p><code class="computeroutput">bunzip2</code> will correctly
decompress a file which is the concatenation of two or more
compressed files.  The result is the concatenation of the
corresponding uncompressed files.  Integrity testing
(<code class="computeroutput">-t</code>) of concatenated compressed
files is also supported.</p>
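<p>The concatenation property can be tried without touching the file
system.  The sketch below uses Python's standard
<code class="computeroutput">bz2</code> module rather than the
<code class="computeroutput">bunzip2</code> program, and assumes
Python 3.3 or later, where
<code class="computeroutput">bz2.decompress</code> accepts several
streams placed back to back.</p>

```python
import bz2

# Two independently compressed "files".
first = bz2.compress(b"first file\n")
second = bz2.compress(b"second file\n")

# Plain byte-level concatenation of the two compressed streams.
combined = first + second

# Decompressing the concatenation yields the concatenated originals.
restored = bz2.decompress(combined)
print(restored)
```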
<p>You can also compress or decompress files to the standard
output by giving the <code class="computeroutput">-c</code> flag.
Multiple files may be compressed and decompressed like this.  The
resulting outputs are fed sequentially to stdout.  Compression of
multiple files in this manner generates a stream containing
multiple compressed file representations.  Such a stream can be
decompressed correctly only by
<code class="computeroutput">bzip2</code> version 0.9.0 or later.
Earlier versions of <code class="computeroutput">bzip2</code> will
stop after decompressing the first file in the stream.</p>
<p><code class="computeroutput">bzcat</code> (or
<code class="computeroutput">bzip2 -dc</code>) decompresses all
specified files to the standard output.</p>
<p><code class="computeroutput">bzip2</code> will read arguments
from the environment variables
<code class="computeroutput">BZIP2</code> and
<code class="computeroutput">BZIP</code>, in that order, and will
process them before any arguments read from the command line.
This gives a convenient way to supply default arguments.</p>
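<p>The processing order described above can be modelled as follows.
This is a simplified sketch: the function name
<code class="computeroutput">effective_args</code> is hypothetical, and
splitting the variables on whitespace is an assumption made for the
example rather than a statement about how
<code class="computeroutput">bzip2</code> parses them.</p>

```python
def effective_args(environ, argv):
    """Model the order in which arguments are processed.

    environ: mapping of environment variables (for example os.environ).
    argv:    arguments given on the command line, program name excluded.
    """
    args = []
    # BZIP2 is consulted first, then BZIP (order taken from the manual).
    for name in ("BZIP2", "BZIP"):
        # Assumption: each variable holds whitespace-separated arguments.
        args.extend(environ.get(name, "").split())
    # Command-line arguments are processed last.
    args.extend(argv)
    return args


print(effective_args({"BZIP2": "-9 -k", "BZIP": "-v"}, ["archive.tar"]))
```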
<p>Compression is always performed, even if the compressed
file is slightly larger than the original.  Files of less than
about one hundred bytes tend to get larger, since the compression
mechanism has a constant overhead in the region of 50 bytes.
Random data (including the output of most file compressors) is
coded at about 8.05 bits per byte, giving an expansion of around
0.5%.</p>
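<p>Both effects are easy to observe.  The sketch below again uses
Python's standard <code class="computeroutput">bz2</code> module as a
stand-in for the program, with pseudo-random bytes from a seeded
generator playing the role of incompressible input; the exact sizes
printed will depend on the library build, so none are quoted here.</p>

```python
import bz2
import random

# A very small input: the fixed overhead dominates, so the output is larger.
tiny = b"hello"
tiny_packed = bz2.compress(tiny)
print(len(tiny), "->", len(tiny_packed), "bytes")

# Pseudo-random bytes stand in for incompressible data.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(100_000))
noise_packed = bz2.compress(noise)

# Ratio of compressed size to original size; slightly above 1.0 for noise.
expansion = len(noise_packed) / len(noise)
print(f"expansion factor: {expansion:.4f}")
```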
<p>As a self-check for your protection,
<code class="computeroutput">bzip2</code> uses 32-bit CRCs to make
sure that the decompressed version of a file is identical to the
original.  This guards against corruption of the compressed data,
and against undetected bugs in
<code class="computeroutput">bzip2</code> (hopefully very unlikely).
The chance of data corruption going undetected is microscopic,
about one chance in four billion for each file processed.  Be
aware, though, that the check occurs upon decompression, so it
can only tell you that something is wrong.  It can't help you
recover the original uncompressed data.  You can use
<code class="computeroutput">bzip2recover</code> to try to recover
data from damaged files.</p>
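<p>A damaged stream can be simulated by altering one byte of
compressed data.  In this sketch, which uses Python's standard
<code class="computeroutput">bz2</code> module, the failure may be
reported by the CRC comparison or by an earlier structural check in
the decoder; either way the damage is noticed only at decompression
time, and the original data is not recovered.</p>

```python
import bz2

original = b"The quick brown fox jumps over the lazy dog.\n" * 200
packed = bytearray(bz2.compress(original))

# Flip every bit of one byte in the middle of the compressed stream.
packed[len(packed) // 2] ^= 0xFF

try:
    bz2.decompress(bytes(packed))
    detected = False
except (OSError, ValueError, EOFError):
    # The bz2 module reports invalid or incomplete streams with these types.
    detected = True

print("corruption detected:", detected)
```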
<p>Return values: 0 for a normal exit, 1 for environmental
problems (file not found, invalid flags, I/O errors, etc.), 2
to indicate a corrupt compressed file, 3 for an internal
consistency error (e.g., a bug) which caused
<code class="computeroutput">bzip2</code> to panic.</p>
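<p>A script that runs <code class="computeroutput">bzip2</code> might
translate these return values into messages along the following
lines.  The table simply restates the paragraph above;
<code class="computeroutput">EXIT_MEANINGS</code> and
<code class="computeroutput">describe_exit_status</code> are
hypothetical names chosen for this sketch.</p>

```python
# Return values as listed in the manual; the names here are illustrative.
EXIT_MEANINGS = {
    0: "normal exit",
    1: "environmental problem (file not found, invalid flags, I/O error, etc.)",
    2: "corrupt compressed file",
    3: "internal consistency error (bzip2 panicked)",
}


def describe_exit_status(code: int) -> str:
    """Map a bzip2 return value to a short description."""
    return EXIT_MEANINGS.get(code, f"undocumented status {code}")


for code in (0, 1, 2, 3):
    print(code, describe_exit_status(code))
```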
409*4a711beaSLionel Sambuc</div>
410*4a711beaSLionel Sambuc<div class="sect1" title="2.4.�OPTIONS">
411*4a711beaSLionel Sambuc<div class="titlepage"><div><div><h2 class="title" style="clear: both">
412*4a711beaSLionel Sambuc<a name="options"></a>2.4.�OPTIONS</h2></div></div></div>
413*4a711beaSLionel Sambuc<div class="variablelist"><dl>
414*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">-c --stdout</code></span></dt>
415*4a711beaSLionel Sambuc<dd><p>Compress or decompress to standard
416*4a711beaSLionel Sambuc  output.</p></dd>
<dt><span class="term"><code class="computeroutput">-d --decompress</code></span></dt>
<dd><p>Force decompression.
  <code class="computeroutput">bzip2</code>,
  <code class="computeroutput">bunzip2</code> and
  <code class="computeroutput">bzcat</code> are really the same
  program, and the decision about what action to take is made on
  the basis of which name is used.  This flag overrides that
  mechanism, and forces <code class="computeroutput">bzip2</code> to decompress.</p></dd>
<dt><span class="term"><code class="computeroutput">-z --compress</code></span></dt>
<dd><p>The complement to
  <code class="computeroutput">-d</code>: forces compression,
  regardless of the invocation name.</p></dd>
429*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">-t --test</code></span></dt>
430*4a711beaSLionel Sambuc<dd><p>Check integrity of the specified file(s), but
431*4a711beaSLionel Sambuc  don't decompress them.  This really performs a trial
432*4a711beaSLionel Sambuc  decompression and throws away the result.</p></dd>
433*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">-f --force</code></span></dt>
434*4a711beaSLionel Sambuc<dd>
435*4a711beaSLionel Sambuc<p>Force overwrite of output files.  Normally,
436*4a711beaSLionel Sambuc  <code class="computeroutput">bzip2</code> will not overwrite
437*4a711beaSLionel Sambuc  existing output files.  Also forces
438*4a711beaSLionel Sambuc  <code class="computeroutput">bzip2</code> to break hard links to
439*4a711beaSLionel Sambuc  files, which it otherwise wouldn't do.</p>
440*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2</code> normally declines
441*4a711beaSLionel Sambuc  to decompress files which don't have the correct magic header
442*4a711beaSLionel Sambuc  bytes. If forced (<code class="computeroutput">-f</code>),
443*4a711beaSLionel Sambuc  however, it will pass such files through unmodified. This is
444*4a711beaSLionel Sambuc  how GNU <code class="computeroutput">gzip</code> behaves.</p>
445*4a711beaSLionel Sambuc</dd>
446*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">-k --keep</code></span></dt>
447*4a711beaSLionel Sambuc<dd><p>Keep (don't delete) input files during
448*4a711beaSLionel Sambuc  compression or decompression.</p></dd>
449*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">-s --small</code></span></dt>
450*4a711beaSLionel Sambuc<dd>
451*4a711beaSLionel Sambuc<p>Reduce memory usage, for compression,
452*4a711beaSLionel Sambuc  decompression and testing.  Files are decompressed and tested
453*4a711beaSLionel Sambuc  using a modified algorithm which only requires 2.5 bytes per
454*4a711beaSLionel Sambuc  block byte.  This means any file can be decompressed in 2300k
455*4a711beaSLionel Sambuc  of memory, albeit at about half the normal speed.</p>
456*4a711beaSLionel Sambuc<p>During compression, <code class="computeroutput">-s</code>
457*4a711beaSLionel Sambuc  selects a block size of 200k, which limits memory use to around
458*4a711beaSLionel Sambuc  the same figure, at the expense of your compression ratio.  In
459*4a711beaSLionel Sambuc  short, if your machine is low on memory (8 megabytes or less),
460*4a711beaSLionel Sambuc  use <code class="computeroutput">-s</code> for everything.  See
  <a class="xref" href="#memory-management" title="2.5.&#160;MEMORY MANAGEMENT">MEMORY MANAGEMENT</a> below.</p>
462*4a711beaSLionel Sambuc</dd>
463*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">-q --quiet</code></span></dt>
464*4a711beaSLionel Sambuc<dd><p>Suppress non-essential warning messages.
465*4a711beaSLionel Sambuc  Messages pertaining to I/O errors and other critical events
466*4a711beaSLionel Sambuc  will not be suppressed.</p></dd>
467*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">-v --verbose</code></span></dt>
468*4a711beaSLionel Sambuc<dd><p>Verbose mode -- show the compression ratio for
469*4a711beaSLionel Sambuc  each file processed.  Further
470*4a711beaSLionel Sambuc  <code class="computeroutput">-v</code>'s increase the verbosity
471*4a711beaSLionel Sambuc  level, spewing out lots of information which is primarily of
472*4a711beaSLionel Sambuc  interest for diagnostic purposes.</p></dd>
473*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">-L --license -V --version</code></span></dt>
474*4a711beaSLionel Sambuc<dd><p>Display the software version, license terms and
475*4a711beaSLionel Sambuc  conditions.</p></dd>
<dt><span class="term"><code class="computeroutput">-1</code> (or
 <code class="computeroutput">--fast</code>) to
 <code class="computeroutput">-9</code> (or
 <code class="computeroutput">--best</code>)</span></dt>
<dd><p>Set the block size to 100 k, 200 k ...  900 k
  when compressing.  Has no effect when decompressing.  See <a class="xref" href="#memory-management" title="2.5.&#160;MEMORY MANAGEMENT">MEMORY MANAGEMENT</a> below.  The
482*4a711beaSLionel Sambuc  <code class="computeroutput">--fast</code> and
483*4a711beaSLionel Sambuc  <code class="computeroutput">--best</code> aliases are primarily
484*4a711beaSLionel Sambuc  for GNU <code class="computeroutput">gzip</code> compatibility.
485*4a711beaSLionel Sambuc  In particular, <code class="computeroutput">--fast</code> doesn't
486*4a711beaSLionel Sambuc  make things significantly faster.  And
487*4a711beaSLionel Sambuc  <code class="computeroutput">--best</code> merely selects the
488*4a711beaSLionel Sambuc  default behaviour.</p></dd>
489*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">--</code></span></dt>
490*4a711beaSLionel Sambuc<dd><p>Treats all subsequent arguments as file names,
491*4a711beaSLionel Sambuc  even if they start with a dash.  This is so you can handle
492*4a711beaSLionel Sambuc  files with names beginning with a dash, for example:
493*4a711beaSLionel Sambuc  <code class="computeroutput">bzip2 --
494*4a711beaSLionel Sambuc  -myfilename</code>.</p></dd>
495*4a711beaSLionel Sambuc<dt>
496*4a711beaSLionel Sambuc<span class="term"><code class="computeroutput">--repetitive-fast</code>, </span><span class="term"><code class="computeroutput">--repetitive-best</code></span>
497*4a711beaSLionel Sambuc</dt>
498*4a711beaSLionel Sambuc<dd><p>These flags are redundant in versions 0.9.5 and
499*4a711beaSLionel Sambuc  above.  They provided some coarse control over the behaviour of
500*4a711beaSLionel Sambuc  the sorting algorithm in earlier versions, which was sometimes
501*4a711beaSLionel Sambuc  useful.  0.9.5 and above have an improved algorithm which
502*4a711beaSLionel Sambuc  renders these flags irrelevant.</p></dd>
503*4a711beaSLionel Sambuc</dl></div>
504*4a711beaSLionel Sambuc</div>
<div class="sect1" title="2.5.&#160;MEMORY MANAGEMENT">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="memory-management"></a>2.5.&#160;MEMORY MANAGEMENT</h2></div></div></div>
508*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2</code> compresses large
509*4a711beaSLionel Sambucfiles in blocks.  The block size affects both the compression
510*4a711beaSLionel Sambucratio achieved, and the amount of memory needed for compression
511*4a711beaSLionel Sambucand decompression.  The flags <code class="computeroutput">-1</code>
512*4a711beaSLionel Sambucthrough <code class="computeroutput">-9</code> specify the block
513*4a711beaSLionel Sambucsize to be 100,000 bytes through 900,000 bytes (the default)
514*4a711beaSLionel Sambucrespectively.  At decompression time, the block size used for
515*4a711beaSLionel Sambuccompression is read from the header of the compressed file, and
516*4a711beaSLionel Sambuc<code class="computeroutput">bunzip2</code> then allocates itself
517*4a711beaSLionel Sambucjust enough memory to decompress the file.  Since block sizes are
518*4a711beaSLionel Sambucstored in compressed files, it follows that the flags
519*4a711beaSLionel Sambuc<code class="computeroutput">-1</code> to
520*4a711beaSLionel Sambuc<code class="computeroutput">-9</code> are irrelevant to and so
521*4a711beaSLionel Sambucignored during decompression.</p>
522*4a711beaSLionel Sambuc<p>Compression and decompression requirements, in bytes, can be
523*4a711beaSLionel Sambucestimated as:</p>
524*4a711beaSLionel Sambuc<pre class="programlisting">Compression:   400k + ( 8 x block size )
525*4a711beaSLionel Sambuc
526*4a711beaSLionel SambucDecompression: 100k + ( 4 x block size ), or
527*4a711beaSLionel Sambuc               100k + ( 2.5 x block size )</pre>
528*4a711beaSLionel Sambuc<p>Larger block sizes give rapidly diminishing marginal
529*4a711beaSLionel Sambucreturns.  Most of the compression comes from the first two or
530*4a711beaSLionel Sambucthree hundred k of block size, a fact worth bearing in mind when
531*4a711beaSLionel Sambucusing <code class="computeroutput">bzip2</code> on small machines.
532*4a711beaSLionel SambucIt is also important to appreciate that the decompression memory
533*4a711beaSLionel Sambucrequirement is set at compression time by the choice of block
534*4a711beaSLionel Sambucsize.</p>
535*4a711beaSLionel Sambuc<p>For files compressed with the default 900k block size,
536*4a711beaSLionel Sambuc<code class="computeroutput">bunzip2</code> will require about 3700
537*4a711beaSLionel Sambuckbytes to decompress.  To support decompression of any file on a
538*4a711beaSLionel Sambuc4 megabyte machine, <code class="computeroutput">bunzip2</code> has
539*4a711beaSLionel Sambucan option to decompress using approximately half this amount of
540*4a711beaSLionel Sambucmemory, about 2300 kbytes.  Decompression speed is also halved,
541*4a711beaSLionel Sambucso you should use this option only where necessary.  The relevant
542*4a711beaSLionel Sambucflag is <code class="computeroutput">-s</code>.</p>
<p>In general, try to use the largest block size memory
constraints allow, since that maximises the compression achieved.
Compression and decompression speed are virtually unaffected by
block size.</p>
547*4a711beaSLionel Sambuc<p>Another significant point applies to files which fit in a
548*4a711beaSLionel Sambucsingle block -- that means most files you'd encounter using a
549*4a711beaSLionel Sambuclarge block size.  The amount of real memory touched is
550*4a711beaSLionel Sambucproportional to the size of the file, since the file is smaller
551*4a711beaSLionel Sambucthan a block.  For example, compressing a file 20,000 bytes long
552*4a711beaSLionel Sambucwith the flag <code class="computeroutput">-9</code> will cause the
553*4a711beaSLionel Sambuccompressor to allocate around 7600k of memory, but only touch
554*4a711beaSLionel Sambuc400k + 20000 * 8 = 560 kbytes of it.  Similarly, the decompressor
555*4a711beaSLionel Sambucwill allocate 3700k but only touch 100k + 20000 * 4 = 180
556*4a711beaSLionel Sambuckbytes.</p>
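<p>The estimates above can be written out as a small C sketch.
The function names are hypothetical, and the code assumes that
"k" in the formulas means 1,000 bytes, which is what the quoted
figures imply.  It only restates the arithmetic in the text; it
does not measure what <code class="computeroutput">bzip2</code>
really allocates.</p>
<pre class="programlisting">/* Block size in bytes selected by the flags -1 .. -9 (level 1 .. 9). */
unsigned long block_bytes(int level)
{
    return (unsigned long)level * 100000UL;
}

/* Estimated allocation in bytes: 400k + ( 8 x block size ). */
unsigned long compress_alloc(int level)
{
    return 400000UL + 8UL * block_bytes(level);
}

/* Estimated allocation in bytes: 100k + ( 4 x block size ). */
unsigned long decompress_alloc(int level)
{
    return 100000UL + 4UL * block_bytes(level);
}

/* With -s: 2.5 bytes per block byte, written as 5/2 to stay in integers. */
unsigned long decompress_small_alloc(int level)
{
    return 100000UL + 5UL * block_bytes(level) / 2UL;
}

/* Memory actually touched when the file is shorter than one block. */
unsigned long compress_touched(int level, unsigned long file_bytes)
{
    unsigned long n = block_bytes(level);
    if (file_bytes &lt; n)
        n = file_bytes;
    return 400000UL + 8UL * n;
}

unsigned long decompress_touched(int level, unsigned long file_bytes)
{
    unsigned long n = block_bytes(level);
    if (file_bytes &lt; n)
        n = file_bytes;
    return 100000UL + 4UL * n;
}</pre>
<p>For level 9 these functions give 7,600,000, 3,700,000 and
2,350,000 bytes allocated, and 560,000 and 180,000 bytes touched
for a 20,000-byte file, in line with the figures quoted in this
section.</p>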
557*4a711beaSLionel Sambuc<p>Here is a table which summarises the maximum memory usage
558*4a711beaSLionel Sambucfor different block sizes.  Also recorded is the total compressed
559*4a711beaSLionel Sambucsize for 14 files of the Calgary Text Compression Corpus
560*4a711beaSLionel Sambuctotalling 3,141,622 bytes.  This column gives some feel for how
561*4a711beaSLionel Sambuccompression varies with block size.  These figures tend to
562*4a711beaSLionel Sambucunderstate the advantage of larger block sizes for larger files,
563*4a711beaSLionel Sambucsince the Corpus is dominated by smaller files.</p>
564*4a711beaSLionel Sambuc<pre class="programlisting">        Compress   Decompress   Decompress   Corpus
565*4a711beaSLionel SambucFlag     usage      usage       -s usage     Size
566*4a711beaSLionel Sambuc
567*4a711beaSLionel Sambuc -1      1200k       500k         350k      914704
568*4a711beaSLionel Sambuc -2      2000k       900k         600k      877703
569*4a711beaSLionel Sambuc -3      2800k      1300k         850k      860338
570*4a711beaSLionel Sambuc -4      3600k      1700k        1100k      846899
571*4a711beaSLionel Sambuc -5      4400k      2100k        1350k      845160
572*4a711beaSLionel Sambuc -6      5200k      2500k        1600k      838626
573*4a711beaSLionel Sambuc -7      6100k      2900k        1850k      834096
574*4a711beaSLionel Sambuc -8      6800k      3300k        2100k      828642
575*4a711beaSLionel Sambuc -9      7600k      3700k        2350k      828642</pre>
576*4a711beaSLionel Sambuc</div>
<div class="sect1" title="2.6.&#160;RECOVERING DATA FROM DAMAGED FILES">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="recovering"></a>2.6.&#160;RECOVERING DATA FROM DAMAGED FILES</h2></div></div></div>
580*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2</code> compresses files in
581*4a711beaSLionel Sambucblocks, usually 900kbytes long.  Each block is handled
582*4a711beaSLionel Sambucindependently.  If a media or transmission error causes a
583*4a711beaSLionel Sambucmulti-block <code class="computeroutput">.bz2</code> file to become
584*4a711beaSLionel Sambucdamaged, it may be possible to recover data from the undamaged
585*4a711beaSLionel Sambucblocks in the file.</p>
586*4a711beaSLionel Sambuc<p>The compressed representation of each block is delimited by
587*4a711beaSLionel Sambuca 48-bit pattern, which makes it possible to find the block
588*4a711beaSLionel Sambucboundaries with reasonable certainty.  Each block also carries
589*4a711beaSLionel Sambucits own 32-bit CRC, so damaged blocks can be distinguished from
590*4a711beaSLionel Sambucundamaged ones.</p>
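<p>To make the idea of searching for block boundaries concrete,
here is a minimal C sketch of a bit-level scan for a 48-bit
pattern.  It is not the code used by
<code class="computeroutput">bzip2recover</code>.  The pattern value
<code class="computeroutput">0x314159265359</code> and the
most-significant-bit-first ordering are not stated in this section
and are assumptions about the file format; the function name
<code class="computeroutput">find_block_starts</code> is
hypothetical.  Blocks need not start on a byte boundary, which is
why the scan advances one bit at a time.</p>
<pre class="programlisting">#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

#define BLOCK_MAGIC UINT64_C(0x314159265359)  /* assumed block-start pattern */
#define MASK48      UINT64_C(0xFFFFFFFFFFFF)

/* Scan buf bit by bit, most significant bit first, for BLOCK_MAGIC.
   Stores the bit offset at which each match starts in hits[] (at most
   max_hits entries) and returns the total number of matches found. */
size_t find_block_starts(const unsigned char *buf, size_t len,
                         uint64_t *hits, size_t max_hits)
{
    uint64_t window = 0;     /* the last 48 bits read */
    uint64_t bits_seen = 0;  /* number of bits consumed so far */
    size_t found = 0;
    size_t i;
    int bit;

    for (i = 0; i &lt; len; i++) {
        for (bit = 7; bit &gt;= 0; bit--) {
            window = ((window &lt;&lt; 1) | ((buf[i] &gt;&gt; bit) &amp; 1u)) &amp; MASK48;
            bits_seen++;
            if (bits_seen &gt;= 48 &amp;&amp; window == BLOCK_MAGIC) {
                if (found &lt; max_hits)
                    hits[found] = bits_seen - 48;
                found++;
            }
        }
    }
    return found;
}</pre>
<p>A match only marks a candidate boundary.  The per-block CRC
described above is what distinguishes a damaged block from an
undamaged one.</p>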
591*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2recover</code> is a simple
592*4a711beaSLionel Sambucprogram whose purpose is to search for blocks in
593*4a711beaSLionel Sambuc<code class="computeroutput">.bz2</code> files, and write each block
594*4a711beaSLionel Sambucout into its own <code class="computeroutput">.bz2</code> file.  You
595*4a711beaSLionel Sambuccan then use <code class="computeroutput">bzip2 -t</code> to test
596*4a711beaSLionel Sambucthe integrity of the resulting files, and decompress those which
597*4a711beaSLionel Sambucare undamaged.</p>
598*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2recover</code> takes a
599*4a711beaSLionel Sambucsingle argument, the name of the damaged file, and writes a
600*4a711beaSLionel Sambucnumber of files <code class="computeroutput">rec0001file.bz2</code>,
601*4a711beaSLionel Sambuc<code class="computeroutput">rec0002file.bz2</code>, etc, containing
602*4a711beaSLionel Sambucthe extracted blocks.  The output filenames are designed so that
603*4a711beaSLionel Sambucthe use of wildcards in subsequent processing -- for example,
604*4a711beaSLionel Sambuc<code class="computeroutput">bzip2 -dc rec*file.bz2 &gt;
605*4a711beaSLionel Sambucrecovered_data</code> -- lists the files in the correct
606*4a711beaSLionel Sambucorder.</p>
607*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2recover</code> should be of
608*4a711beaSLionel Sambucmost use dealing with large <code class="computeroutput">.bz2</code>
609*4a711beaSLionel Sambucfiles, as these will contain many blocks.  It is clearly futile
610*4a711beaSLionel Sambucto use it on damaged single-block files, since a damaged block
611*4a711beaSLionel Sambuccannot be recovered.  If you wish to minimise any potential data
612*4a711beaSLionel Sambucloss through media or transmission errors, you might consider
613*4a711beaSLionel Sambuccompressing with a smaller block size.</p>
614*4a711beaSLionel Sambuc</div>
<div class="sect1" title="2.7.&#160;PERFORMANCE NOTES">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="performance"></a>2.7.&#160;PERFORMANCE NOTES</h2></div></div></div>
618*4a711beaSLionel Sambuc<p>The sorting phase of compression gathers together similar
619*4a711beaSLionel Sambucstrings in the file.  Because of this, files containing very long
620*4a711beaSLionel Sambucruns of repeated symbols, like "aabaabaabaab ..."  (repeated
621*4a711beaSLionel Sambucseveral hundred times) may compress more slowly than normal.
622*4a711beaSLionel SambucVersions 0.9.5 and above fare much better than previous versions
623*4a711beaSLionel Sambucin this respect.  The ratio between worst-case and average-case
624*4a711beaSLionel Sambuccompression time is in the region of 10:1.  For previous
625*4a711beaSLionel Sambucversions, this figure was more like 100:1.  You can use the
626*4a711beaSLionel Sambuc<code class="computeroutput">-vvvv</code> option to monitor progress
627*4a711beaSLionel Sambucin great detail, if you want.</p>
628*4a711beaSLionel Sambuc<p>Decompression speed is unaffected by these
629*4a711beaSLionel Sambucphenomena.</p>
630*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2</code> usually allocates
631*4a711beaSLionel Sambucseveral megabytes of memory to operate in, and then charges all
632*4a711beaSLionel Sambucover it in a fairly random fashion.  This means that performance,
633*4a711beaSLionel Sambucboth for compressing and decompressing, is largely determined by
634*4a711beaSLionel Sambucthe speed at which your machine can service cache misses.
635*4a711beaSLionel SambucBecause of this, small changes to the code to reduce the miss
636*4a711beaSLionel Sambucrate have been observed to give disproportionately large
637*4a711beaSLionel Sambucperformance improvements.  I imagine
638*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> will perform best on
639*4a711beaSLionel Sambucmachines with very large caches.</p>
640*4a711beaSLionel Sambuc</div>
<div class="sect1" title="2.8.&#160;CAVEATS">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="caveats"></a>2.8.&#160;CAVEATS</h2></div></div></div>
644*4a711beaSLionel Sambuc<p>I/O error messages are not as helpful as they could be.
645*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> tries hard to detect I/O
646*4a711beaSLionel Sambucerrors and exit cleanly, but the details of what the problem is
647*4a711beaSLionel Sambucsometimes seem rather misleading.</p>
<p>This manual page pertains to version 1.0.6 of
<code class="computeroutput">bzip2</code>.  Compressed data created by
this version is entirely forwards and backwards compatible with the
previous public releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0,
1.0.1, 1.0.2 and 1.0.3, with the following exception: 0.9.0 and
above can correctly decompress multiple concatenated compressed files.
0.1pl2 cannot do this; it will stop after decompressing just the first
file in the stream.</p>
<p><code class="computeroutput">bzip2recover</code> versions
prior to 1.0.2 used 32-bit integers to represent bit positions in
compressed files, so they could not handle compressed files more
than 512 megabytes long (2<sup>32</sup> bits).  Versions 1.0.2 and above use 64-bit ints
on some platforms which support them (GNU supported targets, and
Windows). To establish whether or not
<code class="computeroutput">bzip2recover</code> was built with such
a limitation, run it without arguments. In any event you can
build yourself an unlimited version if you can recompile it with
<code class="computeroutput">MaybeUInt64</code> set to be an
unsigned 64-bit integer.</p>
667*4a711beaSLionel Sambuc</div>
<div class="sect1" title="2.9.&#160;AUTHOR">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="author"></a>2.9.&#160;AUTHOR</h2></div></div></div>
671*4a711beaSLionel Sambuc<p>Julian Seward,
672*4a711beaSLionel Sambuc<code class="computeroutput">jseward@bzip.org</code></p>
673*4a711beaSLionel Sambuc<p>The ideas embodied in
674*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> are due to (at least) the
675*4a711beaSLionel Sambucfollowing people: Michael Burrows and David Wheeler (for the
676*4a711beaSLionel Sambucblock sorting transformation), David Wheeler (again, for the
677*4a711beaSLionel SambucHuffman coder), Peter Fenwick (for the structured coding model in
678*4a711beaSLionel Sambucthe original <code class="computeroutput">bzip</code>, and many
679*4a711beaSLionel Sambucrefinements), and Alistair Moffat, Radford Neal and Ian Witten
680*4a711beaSLionel Sambuc(for the arithmetic coder in the original
681*4a711beaSLionel Sambuc<code class="computeroutput">bzip</code>).  I am much indebted for
682*4a711beaSLionel Sambuctheir help, support and advice.  See the manual in the source
683*4a711beaSLionel Sambucdistribution for pointers to sources of documentation.  Christian
684*4a711beaSLionel Sambucvon Roques encouraged me to look for faster sorting algorithms,
685*4a711beaSLionel Sambucso as to speed up compression.  Bela Lubkin encouraged me to
686*4a711beaSLionel Sambucimprove the worst-case compression performance.
687*4a711beaSLionel SambucDonna Robinson XMLised the documentation.
688*4a711beaSLionel SambucMany people sent
689*4a711beaSLionel Sambucpatches, helped with portability problems, lent machines, gave
690*4a711beaSLionel Sambucadvice and were generally helpful.</p>
691*4a711beaSLionel Sambuc</div>
692*4a711beaSLionel Sambuc</div>
<div class="chapter" title="3.&#160; Programming with libbzip2">
<div class="titlepage"><div><div><h2 class="title">
<a name="libprog"></a>3.&#160;
Programming with <code class="computeroutput">libbzip2</code>
</h2></div></div></div>
698*4a711beaSLionel Sambuc<div class="toc">
699*4a711beaSLionel Sambuc<p><b>Table of Contents</b></p>
700*4a711beaSLionel Sambuc<dl>
701*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#top-level">3.1. Top-level structure</a></span></dt>
702*4a711beaSLionel Sambuc<dd><dl>
703*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#ll-summary">3.1.1. Low-level summary</a></span></dt>
704*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#hl-summary">3.1.2. High-level summary</a></span></dt>
705*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#util-fns-summary">3.1.3. Utility functions summary</a></span></dt>
706*4a711beaSLionel Sambuc</dl></dd>
707*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#err-handling">3.2. Error handling</a></span></dt>
708*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#low-level">3.3. Low-level interface</a></span></dt>
709*4a711beaSLionel Sambuc<dd><dl>
710*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzcompress-init">3.3.1. BZ2_bzCompressInit</a></span></dt>
711*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzCompress">3.3.2. BZ2_bzCompress</a></span></dt>
712*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzCompress-end">3.3.3. BZ2_bzCompressEnd</a></span></dt>
713*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzDecompress-init">3.3.4. BZ2_bzDecompressInit</a></span></dt>
714*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzDecompress">3.3.5. BZ2_bzDecompress</a></span></dt>
715*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzDecompress-end">3.3.6. BZ2_bzDecompressEnd</a></span></dt>
716*4a711beaSLionel Sambuc</dl></dd>
717*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#hl-interface">3.4. High-level interface</a></span></dt>
718*4a711beaSLionel Sambuc<dd><dl>
719*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzreadopen">3.4.1. BZ2_bzReadOpen</a></span></dt>
720*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzread">3.4.2. BZ2_bzRead</a></span></dt>
721*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzreadgetunused">3.4.3. BZ2_bzReadGetUnused</a></span></dt>
722*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzreadclose">3.4.4. BZ2_bzReadClose</a></span></dt>
723*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzwriteopen">3.4.5. BZ2_bzWriteOpen</a></span></dt>
724*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzwrite">3.4.6. BZ2_bzWrite</a></span></dt>
725*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzwriteclose">3.4.7. BZ2_bzWriteClose</a></span></dt>
726*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#embed">3.4.8. Handling embedded compressed data streams</a></span></dt>
727*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#std-rdwr">3.4.9. Standard file-reading/writing code</a></span></dt>
728*4a711beaSLionel Sambuc</dl></dd>
729*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#util-fns">3.5. Utility functions</a></span></dt>
730*4a711beaSLionel Sambuc<dd><dl>
731*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzbufftobuffcompress">3.5.1. BZ2_bzBuffToBuffCompress</a></span></dt>
732*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#bzbufftobuffdecompress">3.5.2. BZ2_bzBuffToBuffDecompress</a></span></dt>
733*4a711beaSLionel Sambuc</dl></dd>
734*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#zlib-compat">3.6. zlib compatibility functions</a></span></dt>
735*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#stdio-free">3.7. Using the library in a stdio-free environment</a></span></dt>
736*4a711beaSLionel Sambuc<dd><dl>
737*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#stdio-bye">3.7.1. Getting rid of stdio</a></span></dt>
738*4a711beaSLionel Sambuc<dt><span class="sect2"><a href="#critical-error">3.7.2. Critical error handling</a></span></dt>
739*4a711beaSLionel Sambuc</dl></dd>
740*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#win-dll">3.8. Making a Windows DLL</a></span></dt>
741*4a711beaSLionel Sambuc</dl>
742*4a711beaSLionel Sambuc</div>
743*4a711beaSLionel Sambuc<p>This chapter describes the programming interface to
744*4a711beaSLionel Sambuc<code class="computeroutput">libbzip2</code>.</p>
745*4a711beaSLionel Sambuc<p>For general background information, particularly about
746*4a711beaSLionel Sambucmemory use and performance aspects, you'd be well advised to read
<a class="xref" href="#using" title="2.&#160;How to use bzip2">How to use bzip2</a> as well.</p>
<div class="sect1" title="3.1.&#160;Top-level structure">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="top-level"></a>3.1.&#160;Top-level structure</h2></div></div></div>
<p><code class="computeroutput">libbzip2</code> is a flexible
library for compressing and decompressing data in the
<code class="computeroutput">bzip2</code> data format.  Although
packaged as a single entity, it helps to regard the library as
three separate parts: the low-level interface, the high-level
interface, and some utility functions.</p>
757*4a711beaSLionel Sambuc<p>The structure of
758*4a711beaSLionel Sambuc<code class="computeroutput">libbzip2</code>'s interfaces is similar
759*4a711beaSLionel Sambucto that of Jean-loup Gailly's and Mark Adler's excellent
760*4a711beaSLionel Sambuc<code class="computeroutput">zlib</code> library.</p>
761*4a711beaSLionel Sambuc<p>All externally visible symbols have names beginning
762*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_</code>.  This is new in version
763*4a711beaSLionel Sambuc1.0.  The intention is to minimise pollution of the namespaces of
764*4a711beaSLionel Sambuclibrary clients.</p>
765*4a711beaSLionel Sambuc<p>To use any part of the library, you need to
766*4a711beaSLionel Sambuc<code class="computeroutput">#include &lt;bzlib.h&gt;</code>
767*4a711beaSLionel Sambucinto your sources.</p>
<div class="sect2" title="3.1.1.&#160;Low-level summary">
<div class="titlepage"><div><div><h3 class="title">
<a name="ll-summary"></a>3.1.1.&#160;Low-level summary</h3></div></div></div>
771*4a711beaSLionel Sambuc<p>This interface provides services for compressing and
772*4a711beaSLionel Sambucdecompressing data in memory.  There's no provision for dealing
773*4a711beaSLionel Sambucwith files, streams or any other I/O mechanisms, just straight
774*4a711beaSLionel Sambucmemory-to-memory work.  In fact, this part of the library can be
775*4a711beaSLionel Sambuccompiled without inclusion of
776*4a711beaSLionel Sambuc<code class="computeroutput">stdio.h</code>, which may be helpful
777*4a711beaSLionel Sambucfor embedded applications.</p>
778*4a711beaSLionel Sambuc<p>The low-level part of the library has no global variables
779*4a711beaSLionel Sambucand is therefore thread-safe.</p>
<p>Six routines make up the low-level interface:
781*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressInit</code>,
782*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code>, and
783*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressEnd</code> for
784*4a711beaSLionel Sambuccompression, and a corresponding trio
785*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompressInit</code>,
786*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompress</code> and
787*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompressEnd</code> for
788*4a711beaSLionel Sambucdecompression.  The <code class="computeroutput">*Init</code>
789*4a711beaSLionel Sambucfunctions allocate memory for compression/decompression and do
790*4a711beaSLionel Sambucother initialisations, whilst the
791*4a711beaSLionel Sambuc<code class="computeroutput">*End</code> functions close down
792*4a711beaSLionel Sambucoperations and release memory.</p>
793*4a711beaSLionel Sambuc<p>The real work is done by
794*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> and
795*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompress</code>.  These
796*4a711beaSLionel Sambuccompress and decompress data from a user-supplied input buffer to
797*4a711beaSLionel Sambuca user-supplied output buffer.  These buffers can be any size;
798*4a711beaSLionel Sambucarbitrary quantities of data are handled by making repeated calls
799*4a711beaSLionel Sambucto these functions.  This is a flexible mechanism allowing a
800*4a711beaSLionel Sambucconsumer-pull style of activity, or producer-push, or a mixture
801*4a711beaSLionel Sambucof both.</p>
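<p>The calling pattern described above might look roughly like the
following C sketch.  The helper names
<code class="computeroutput">compress_mem</code> and
<code class="computeroutput">decompress_mem</code> and the
<code class="computeroutput">WINDOW</code> size are hypothetical.
The <code class="computeroutput">BZ2_</code> calls, the
<code class="computeroutput">bz_stream</code> fields and the return
codes are used as declared in
<code class="computeroutput">bzlib.h</code>; the parameter values
(900k blocks, no verbosity, default work factor) are arbitrary
choices for illustration.  The output is deliberately offered in
small windows so that the functions are called repeatedly.</p>
<pre class="programlisting">#include &lt;bzlib.h&gt;
#include &lt;string.h&gt;

#define WINDOW 4096u  /* arbitrary output window, to force repeated calls */

/* Compress src into dst.  Returns BZ_OK and sets *dst_len on success. */
int compress_mem(char *src, unsigned int src_len,
                 char *dst, unsigned int dst_cap, unsigned int *dst_len)
{
    bz_stream strm;
    unsigned int used = 0, room;
    int ret;

    memset(&amp;strm, 0, sizeof strm);            /* NULL bzalloc/bzfree: use malloc/free */
    ret = BZ2_bzCompressInit(&amp;strm, 9, 0, 0); /* 900k blocks, silent, default work factor */
    if (ret != BZ_OK) return ret;

    strm.next_in = src;
    strm.avail_in = src_len;
    do {
        room = dst_cap - used;
        if (room == 0) { ret = BZ_OUTBUFF_FULL; break; }
        if (room &gt; WINDOW) room = WINDOW;
        strm.next_out = dst + used;
        strm.avail_out = room;
        ret = BZ2_bzCompress(&amp;strm, BZ_FINISH); /* all input is already available */
        used += room - strm.avail_out;
    } while (ret == BZ_FINISH_OK);

    BZ2_bzCompressEnd(&amp;strm);                  /* release memory in every case */
    if (ret != BZ_STREAM_END) return ret;
    *dst_len = used;
    return BZ_OK;
}

/* Decompress src into dst.  Returns BZ_OK and sets *dst_len on success. */
int decompress_mem(char *src, unsigned int src_len,
                   char *dst, unsigned int dst_cap, unsigned int *dst_len)
{
    bz_stream strm;
    unsigned int used = 0, room;
    int ret;

    memset(&amp;strm, 0, sizeof strm);
    ret = BZ2_bzDecompressInit(&amp;strm, 0, 0);   /* silent, normal (not small) algorithm */
    if (ret != BZ_OK) return ret;

    strm.next_in = src;
    strm.avail_in = src_len;
    do {
        room = dst_cap - used;
        if (room == 0) { ret = BZ_OUTBUFF_FULL; break; }
        if (room &gt; WINDOW) room = WINDOW;
        strm.next_out = dst + used;
        strm.avail_out = room;
        ret = BZ2_bzDecompress(&amp;strm);
        used += room - strm.avail_out;
        if (ret == BZ_OK &amp;&amp; strm.avail_in == 0 &amp;&amp; strm.avail_out &gt; 0)
            ret = BZ_UNEXPECTED_EOF;           /* input ended before the stream did */
    } while (ret == BZ_OK);

    BZ2_bzDecompressEnd(&amp;strm);
    if (ret != BZ_STREAM_END) return ret;
    *dst_len = used;
    return BZ_OK;
}</pre>
<p>A caller that receives its input piecemeal would instead refill
<code class="computeroutput">next_in</code> and
<code class="computeroutput">avail_in</code> between calls, passing
<code class="computeroutput">BZ_RUN</code> to
<code class="computeroutput">BZ2_bzCompress</code> until the last
piece has been supplied.</p>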
802*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.1.2.&#160;High-level summary">
<div class="titlepage"><div><div><h3 class="title">
<a name="hl-summary"></a>3.1.2.&#160;High-level summary</h3></div></div></div>
806*4a711beaSLionel Sambuc<p>This interface provides some handy wrappers around the
807*4a711beaSLionel Sambuclow-level interface to facilitate reading and writing
808*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> format files
809*4a711beaSLionel Sambuc(<code class="computeroutput">.bz2</code> files).  The routines
810*4a711beaSLionel Sambucprovide hooks to facilitate reading files in which the
811*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> data stream is embedded
812*4a711beaSLionel Sambucwithin some larger-scale file structure, or where there are
813*4a711beaSLionel Sambucmultiple <code class="computeroutput">bzip2</code> data streams
814*4a711beaSLionel Sambucconcatenated end-to-end.</p>
815*4a711beaSLionel Sambuc<p>For reading files,
816*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadOpen</code>,
817*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzRead</code>,
818*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadClose</code> and
819*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadGetUnused</code> are
820*4a711beaSLionel Sambucsupplied.  For writing files,
821*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzWriteOpen</code>,
822*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzWrite</code> and
<code class="computeroutput">BZ2_bzWriteClose</code> are
824*4a711beaSLionel Sambucavailable.</p>
825*4a711beaSLionel Sambuc<p>As with the low-level library, no global variables are used
826*4a711beaSLionel Sambucso the library is per se thread-safe.  However, if I/O errors
827*4a711beaSLionel Sambucoccur whilst reading or writing the underlying compressed files,
828*4a711beaSLionel Sambucyou may have to consult <code class="computeroutput">errno</code> to
829*4a711beaSLionel Sambucdetermine the cause of the error.  In that case, you'd need a C
830*4a711beaSLionel Sambuclibrary which correctly supports
831*4a711beaSLionel Sambuc<code class="computeroutput">errno</code> in a multithreaded
832*4a711beaSLionel Sambucenvironment.</p>
833*4a711beaSLionel Sambuc<p>To make the library a little simpler and more portable,
834*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadOpen</code> and
835*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzWriteOpen</code> require you to
836*4a711beaSLionel Sambucpass them file handles (<code class="computeroutput">FILE*</code>s)
837*4a711beaSLionel Sambucwhich have previously been opened for reading or writing
838*4a711beaSLionel Sambucrespectively.  That avoids portability problems associated with
839*4a711beaSLionel Sambucfile operations and file attributes, whilst not being much of an
840*4a711beaSLionel Sambucimposition on the programmer.</p>
841*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.1.3.&#160;Utility functions summary">
<div class="titlepage"><div><div><h3 class="title">
<a name="util-fns-summary"></a>3.1.3.&#160;Utility functions summary</h3></div></div></div>
845*4a711beaSLionel Sambuc<p>For very simple needs,
846*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzBuffToBuffCompress</code> and
847*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzBuffToBuffDecompress</code> are
provided.  These compress or decompress data in memory from one buffer to
849*4a711beaSLionel Sambucanother buffer in a single function call.  You should assess
850*4a711beaSLionel Sambucwhether these functions fulfill your memory-to-memory
851*4a711beaSLionel Sambuccompression/decompression requirements before investing effort in
852*4a711beaSLionel Sambucunderstanding the more general but more complex low-level
853*4a711beaSLionel Sambucinterface.</p>
854*4a711beaSLionel Sambuc<p>Yoshioka Tsuneo
855*4a711beaSLionel Sambuc(<code class="computeroutput">tsuneo@rr.iij4u.or.jp</code>) has
856*4a711beaSLionel Sambuccontributed some functions to give better
857*4a711beaSLionel Sambuc<code class="computeroutput">zlib</code> compatibility.  These
858*4a711beaSLionel Sambucfunctions are <code class="computeroutput">BZ2_bzopen</code>,
859*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzread</code>,
860*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzwrite</code>,
861*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzflush</code>,
862*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzclose</code>,
863*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzerror</code> and
864*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzlibVersion</code>.  You may find
865*4a711beaSLionel Sambucthese functions more convenient for simple file reading and
writing than those in the high-level interface.  These functions
867*4a711beaSLionel Sambucare not (yet) officially part of the library, and are minimally
868*4a711beaSLionel Sambucdocumented here.  If they break, you get to keep all the pieces.
869*4a711beaSLionel SambucI hope to document them properly when time permits.</p>
870*4a711beaSLionel Sambuc<p>Yoshioka also contributed modifications to allow the
871*4a711beaSLionel Sambuclibrary to be built as a Windows DLL.</p>
872*4a711beaSLionel Sambuc</div>
873*4a711beaSLionel Sambuc</div>
<div class="sect1" title="3.2.&#160;Error handling">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="err-handling"></a>3.2.&#160;Error handling</h2></div></div></div>
877*4a711beaSLionel Sambuc<p>The library is designed to recover cleanly in all
878*4a711beaSLionel Sambucsituations, including the worst-case situation of decompressing
879*4a711beaSLionel Sambucrandom data.  I'm not 100% sure that it can always do this, so
880*4a711beaSLionel Sambucyou might want to add a signal handler to catch segmentation
881*4a711beaSLionel Sambucviolations during decompression if you are feeling especially
882*4a711beaSLionel Sambucparanoid.  I would be interested in hearing more about the
883*4a711beaSLionel Sambucrobustness of the library to corrupted compressed data.</p>
<p>Version 1.0.3 is more robust in this respect than any
previous version.  Investigations with Valgrind (a tool for detecting
problems with memory management) indicate
that, at least for the few files I tested, all single-bit errors
in the compressed data are caught properly, with no
889*4a711beaSLionel Sambucsegmentation faults, no uses of uninitialised data, no out of
890*4a711beaSLionel Sambucrange reads or writes, and no infinite looping in the decompressor.
891*4a711beaSLionel SambucSo it's certainly pretty robust, although
892*4a711beaSLionel SambucI wouldn't claim it to be totally bombproof.</p>
893*4a711beaSLionel Sambuc<p>The file <code class="computeroutput">bzlib.h</code> contains
894*4a711beaSLionel Sambucall definitions needed to use the library.  In particular, you
895*4a711beaSLionel Sambucshould definitely not include
896*4a711beaSLionel Sambuc<code class="computeroutput">bzlib_private.h</code>.</p>
897*4a711beaSLionel Sambuc<p>In <code class="computeroutput">bzlib.h</code>, the various
898*4a711beaSLionel Sambucreturn values are defined.  The following list is not intended as
899*4a711beaSLionel Sambucan exhaustive description of the circumstances in which a given
900*4a711beaSLionel Sambucvalue may be returned -- those descriptions are given later.
901*4a711beaSLionel SambucRather, it is intended to convey the rough meaning of each return
value.  The first five values are normal and are not intended to
denote an error situation.</p>
904*4a711beaSLionel Sambuc<div class="variablelist"><dl>
905*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_OK</code></span></dt>
906*4a711beaSLionel Sambuc<dd><p>The requested action was completed
907*4a711beaSLionel Sambuc   successfully.</p></dd>
908*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_RUN_OK, BZ_FLUSH_OK,
909*4a711beaSLionel Sambuc    BZ_FINISH_OK</code></span></dt>
910*4a711beaSLionel Sambuc<dd><p>In
911*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzCompress</code>, the requested
912*4a711beaSLionel Sambuc   flush/finish/nothing-special action was completed
913*4a711beaSLionel Sambuc   successfully.</p></dd>
914*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_STREAM_END</code></span></dt>
915*4a711beaSLionel Sambuc<dd><p>Compression of data was completed, or the
916*4a711beaSLionel Sambuc   logical stream end was detected during
917*4a711beaSLionel Sambuc   decompression.</p></dd>
918*4a711beaSLionel Sambuc</dl></div>
919*4a711beaSLionel Sambuc<p>The following return values indicate an error of some
920*4a711beaSLionel Sambuckind.</p>
921*4a711beaSLionel Sambuc<div class="variablelist"><dl>
922*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_CONFIG_ERROR</code></span></dt>
923*4a711beaSLionel Sambuc<dd><p>Indicates that the library has been improperly
924*4a711beaSLionel Sambuc   compiled on your platform -- a major configuration error.
925*4a711beaSLionel Sambuc   Specifically, it means that
926*4a711beaSLionel Sambuc   <code class="computeroutput">sizeof(char)</code>,
927*4a711beaSLionel Sambuc   <code class="computeroutput">sizeof(short)</code> and
928*4a711beaSLionel Sambuc   <code class="computeroutput">sizeof(int)</code> are not 1, 2 and
929*4a711beaSLionel Sambuc   4 respectively, as they should be.  Note that the library
930*4a711beaSLionel Sambuc   should still work properly on 64-bit platforms which follow
931*4a711beaSLionel Sambuc   the LP64 programming model -- that is, where
932*4a711beaSLionel Sambuc   <code class="computeroutput">sizeof(long)</code> and
933*4a711beaSLionel Sambuc   <code class="computeroutput">sizeof(void*)</code> are 8.  Under
934*4a711beaSLionel Sambuc   LP64, <code class="computeroutput">sizeof(int)</code> is still 4,
935*4a711beaSLionel Sambuc   so <code class="computeroutput">libbzip2</code>, which doesn't
936*4a711beaSLionel Sambuc   use the <code class="computeroutput">long</code> type, is
937*4a711beaSLionel Sambuc   OK.</p></dd>
938*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_SEQUENCE_ERROR</code></span></dt>
939*4a711beaSLionel Sambuc<dd><p>When using the library, it is important to call
940*4a711beaSLionel Sambuc   the functions in the correct sequence and with data structures
941*4a711beaSLionel Sambuc   (buffers etc) in the correct states.
942*4a711beaSLionel Sambuc   <code class="computeroutput">libbzip2</code> checks as much as it
943*4a711beaSLionel Sambuc   can to ensure this is happening, and returns
944*4a711beaSLionel Sambuc   <code class="computeroutput">BZ_SEQUENCE_ERROR</code> if not.
945*4a711beaSLionel Sambuc   Code which complies precisely with the function semantics, as
946*4a711beaSLionel Sambuc   detailed below, should never receive this value; such an event
947*4a711beaSLionel Sambuc   denotes buggy code which you should
948*4a711beaSLionel Sambuc   investigate.</p></dd>
949*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_PARAM_ERROR</code></span></dt>
950*4a711beaSLionel Sambuc<dd><p>Returned when a parameter to a function call is
951*4a711beaSLionel Sambuc   out of range or otherwise manifestly incorrect.  As with
952*4a711beaSLionel Sambuc   <code class="computeroutput">BZ_SEQUENCE_ERROR</code>, this
953*4a711beaSLionel Sambuc   denotes a bug in the client code.  The distinction between
954*4a711beaSLionel Sambuc   <code class="computeroutput">BZ_PARAM_ERROR</code> and
955*4a711beaSLionel Sambuc   <code class="computeroutput">BZ_SEQUENCE_ERROR</code> is a bit
956*4a711beaSLionel Sambuc   hazy, but still worth making.</p></dd>
957*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_MEM_ERROR</code></span></dt>
958*4a711beaSLionel Sambuc<dd><p>Returned when a request to allocate memory
959*4a711beaSLionel Sambuc   failed.  Note that the quantity of memory needed to decompress
960*4a711beaSLionel Sambuc   a stream cannot be determined until the stream's header has
961*4a711beaSLionel Sambuc   been read.  So
962*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzDecompress</code> and
963*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzRead</code> may return
964*4a711beaSLionel Sambuc   <code class="computeroutput">BZ_MEM_ERROR</code> even though some
965*4a711beaSLionel Sambuc   of the compressed data has been read.  The same is not true
966*4a711beaSLionel Sambuc   for compression; once
967*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzCompressInit</code> or
968*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzWriteOpen</code> have
969*4a711beaSLionel Sambuc   successfully completed,
970*4a711beaSLionel Sambuc   <code class="computeroutput">BZ_MEM_ERROR</code> cannot
971*4a711beaSLionel Sambuc   occur.</p></dd>
972*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_DATA_ERROR</code></span></dt>
973*4a711beaSLionel Sambuc<dd><p>Returned when a data integrity error is
974*4a711beaSLionel Sambuc   detected during decompression.  Most importantly, this means
975*4a711beaSLionel Sambuc   when stored and computed CRCs for the data do not match.  This
976*4a711beaSLionel Sambuc   value is also returned upon detection of any other anomaly in
977*4a711beaSLionel Sambuc   the compressed data.</p></dd>
978*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_DATA_ERROR_MAGIC</code></span></dt>
979*4a711beaSLionel Sambuc<dd><p>As a special case of
980*4a711beaSLionel Sambuc   <code class="computeroutput">BZ_DATA_ERROR</code>, it is
981*4a711beaSLionel Sambuc   sometimes useful to know when the compressed stream does not
982*4a711beaSLionel Sambuc   start with the correct magic bytes (<code class="computeroutput">'B' 'Z'
983*4a711beaSLionel Sambuc   'h'</code>).</p></dd>
984*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_IO_ERROR</code></span></dt>
985*4a711beaSLionel Sambuc<dd><p>Returned by
986*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzRead</code> and
987*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzWrite</code> when there is an
988*4a711beaSLionel Sambuc   error reading or writing in the compressed file, and by
989*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzReadOpen</code> and
990*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzWriteOpen</code> for attempts
991*4a711beaSLionel Sambuc   to use a file for which the error indicator (viz,
992*4a711beaSLionel Sambuc   <code class="computeroutput">ferror(f)</code>) is set.  On
993*4a711beaSLionel Sambuc   receipt of <code class="computeroutput">BZ_IO_ERROR</code>, the
994*4a711beaSLionel Sambuc   caller should consult <code class="computeroutput">errno</code>
995*4a711beaSLionel Sambuc   and/or <code class="computeroutput">perror</code> to acquire
996*4a711beaSLionel Sambuc   operating-system specific information about the
997*4a711beaSLionel Sambuc   problem.</p></dd>
998*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_UNEXPECTED_EOF</code></span></dt>
999*4a711beaSLionel Sambuc<dd><p>Returned by
1000*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzRead</code> when the
1001*4a711beaSLionel Sambuc   compressed file finishes before the logical end of stream is
1002*4a711beaSLionel Sambuc   detected.</p></dd>
1003*4a711beaSLionel Sambuc<dt><span class="term"><code class="computeroutput">BZ_OUTBUFF_FULL</code></span></dt>
1004*4a711beaSLionel Sambuc<dd><p>Returned by
1005*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzBuffToBuffCompress</code> and
1006*4a711beaSLionel Sambuc   <code class="computeroutput">BZ2_bzBuffToBuffDecompress</code> to
1007*4a711beaSLionel Sambuc   indicate that the output data will not fit into the output
1008*4a711beaSLionel Sambuc   buffer provided.</p></dd>
1009*4a711beaSLionel Sambuc</dl></div>
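<p>As a rough illustration of how a caller might sort these values
into "normal" and "error", here is a small sketch.  The helper names
<code class="computeroutput">bz_code_name</code> and
<code class="computeroutput">bz_is_error</code> are hypothetical and
not part of the library.  The numeric values are local copies of the
definitions in <code class="computeroutput">bzlib.h</code> from
version 1.0.6, repeated only so that the fragment compiles on its own;
real code should include the header instead.</p>

```c
/* Local copies of the return codes, so this sketch is self-contained.
   Real code should #include <bzlib.h> rather than repeat them. */
#ifndef BZ_OK
#define BZ_OK                0
#define BZ_RUN_OK            1
#define BZ_FLUSH_OK          2
#define BZ_FINISH_OK         3
#define BZ_STREAM_END        4
#define BZ_SEQUENCE_ERROR    (-1)
#define BZ_PARAM_ERROR       (-2)
#define BZ_MEM_ERROR         (-3)
#define BZ_DATA_ERROR        (-4)
#define BZ_DATA_ERROR_MAGIC  (-5)
#define BZ_IO_ERROR          (-6)
#define BZ_UNEXPECTED_EOF    (-7)
#define BZ_OUTBUFF_FULL      (-8)
#define BZ_CONFIG_ERROR      (-9)
#endif

/* Hypothetical helper: symbolic name of a libbzip2 return value. */
const char *bz_code_name(int code)
{
    switch (code) {
    case BZ_OK:               return "BZ_OK";
    case BZ_RUN_OK:           return "BZ_RUN_OK";
    case BZ_FLUSH_OK:         return "BZ_FLUSH_OK";
    case BZ_FINISH_OK:        return "BZ_FINISH_OK";
    case BZ_STREAM_END:       return "BZ_STREAM_END";
    case BZ_SEQUENCE_ERROR:   return "BZ_SEQUENCE_ERROR";
    case BZ_PARAM_ERROR:      return "BZ_PARAM_ERROR";
    case BZ_MEM_ERROR:        return "BZ_MEM_ERROR";
    case BZ_DATA_ERROR:       return "BZ_DATA_ERROR";
    case BZ_DATA_ERROR_MAGIC: return "BZ_DATA_ERROR_MAGIC";
    case BZ_IO_ERROR:         return "BZ_IO_ERROR";
    case BZ_UNEXPECTED_EOF:   return "BZ_UNEXPECTED_EOF";
    case BZ_OUTBUFF_FULL:     return "BZ_OUTBUFF_FULL";
    case BZ_CONFIG_ERROR:     return "BZ_CONFIG_ERROR";
    default:                  return "unknown";
    }
}

/* Hypothetical helper: the five normal values are non-negative,
   every error value is negative. */
int bz_is_error(int code)
{
    return code < 0;
}
```

<p>A caller would typically test
<code class="computeroutput">bz_is_error(rc)</code> after each library
call and, for <code class="computeroutput">BZ_IO_ERROR</code>, go on
to inspect <code class="computeroutput">errno</code>.</p>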
1010*4a711beaSLionel Sambuc</div>
<div class="sect1" title="3.3.&#160;Low-level interface">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="low-level"></a>3.3.&#160;Low-level interface</h2></div></div></div>
<div class="sect2" title="3.3.1.&#160;BZ2_bzCompressInit">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzcompress-init"></a>3.3.1.&#160;BZ2_bzCompressInit</h3></div></div></div>
1017*4a711beaSLionel Sambuc<pre class="programlisting">typedef struct {
1018*4a711beaSLionel Sambuc  char *next_in;
1019*4a711beaSLionel Sambuc  unsigned int avail_in;
1020*4a711beaSLionel Sambuc  unsigned int total_in_lo32;
1021*4a711beaSLionel Sambuc  unsigned int total_in_hi32;
1022*4a711beaSLionel Sambuc
1023*4a711beaSLionel Sambuc  char *next_out;
1024*4a711beaSLionel Sambuc  unsigned int avail_out;
1025*4a711beaSLionel Sambuc  unsigned int total_out_lo32;
1026*4a711beaSLionel Sambuc  unsigned int total_out_hi32;
1027*4a711beaSLionel Sambuc
1028*4a711beaSLionel Sambuc  void *state;
1029*4a711beaSLionel Sambuc
1030*4a711beaSLionel Sambuc  void *(*bzalloc)(void *,int,int);
1031*4a711beaSLionel Sambuc  void (*bzfree)(void *,void *);
1032*4a711beaSLionel Sambuc  void *opaque;
1033*4a711beaSLionel Sambuc} bz_stream;
1034*4a711beaSLionel Sambuc
1035*4a711beaSLionel Sambucint BZ2_bzCompressInit ( bz_stream *strm,
1036*4a711beaSLionel Sambuc                         int blockSize100k,
1037*4a711beaSLionel Sambuc                         int verbosity,
1038*4a711beaSLionel Sambuc                         int workFactor );</pre>
1039*4a711beaSLionel Sambuc<p>Prepares for compression.  The
1040*4a711beaSLionel Sambuc<code class="computeroutput">bz_stream</code> structure holds all
1041*4a711beaSLionel Sambucdata pertaining to the compression activity.  A
1042*4a711beaSLionel Sambuc<code class="computeroutput">bz_stream</code> structure should be
1043*4a711beaSLionel Sambucallocated and initialised prior to the call.  The fields of
1044*4a711beaSLionel Sambuc<code class="computeroutput">bz_stream</code> comprise the entirety
1045*4a711beaSLionel Sambucof the user-visible data.  <code class="computeroutput">state</code>
1046*4a711beaSLionel Sambucis a pointer to the private data structures required for
1047*4a711beaSLionel Sambuccompression.</p>
1048*4a711beaSLionel Sambuc<p>Custom memory allocators are supported, via fields
1049*4a711beaSLionel Sambuc<code class="computeroutput">bzalloc</code>,
1050*4a711beaSLionel Sambuc<code class="computeroutput">bzfree</code>, and
1051*4a711beaSLionel Sambuc<code class="computeroutput">opaque</code>.  The value
<code class="computeroutput">opaque</code> is passed as the first
1053*4a711beaSLionel Sambucargument to all calls to <code class="computeroutput">bzalloc</code>
1054*4a711beaSLionel Sambucand <code class="computeroutput">bzfree</code>, but is otherwise
1055*4a711beaSLionel Sambucignored by the library.  The call <code class="computeroutput">bzalloc (
1056*4a711beaSLionel Sambucopaque, n, m )</code> is expected to return a pointer
1057*4a711beaSLionel Sambuc<code class="computeroutput">p</code> to <code class="computeroutput">n *
1058*4a711beaSLionel Sambucm</code> bytes of memory, and <code class="computeroutput">bzfree (
1059*4a711beaSLionel Sambucopaque, p )</code> should free that memory.</p>
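<p>To make the allocator contract concrete, here is a minimal sketch
of a pair of callbacks with the signatures expected by
<code class="computeroutput">bzalloc</code> and
<code class="computeroutput">bzfree</code>.  The names
<code class="computeroutput">alloc_stats</code>,
<code class="computeroutput">counting_alloc</code> and
<code class="computeroutput">counting_free</code> are made up for this
example; they simply forward to
<code class="computeroutput">malloc</code> /
<code class="computeroutput">free</code> and count live blocks through
the <code class="computeroutput">opaque</code> pointer.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record, reached through the opaque pointer. */
typedef struct {
    int live_blocks;   /* allocations not yet freed */
} alloc_stats;

/* Same signature as bzalloc: return n * m bytes, or NULL on failure. */
void *counting_alloc(void *opaque, int n, int m)
{
    alloc_stats *stats = opaque;
    void *p;

    if (n <= 0 || m <= 0)
        return NULL;
    /* Guard the multiplication against size_t overflow. */
    if ((size_t)n > SIZE_MAX / (size_t)m)
        return NULL;

    p = malloc((size_t)n * (size_t)m);
    if (p != NULL && stats != NULL)
        stats->live_blocks++;
    return p;
}

/* Same signature as bzfree: release a block obtained from counting_alloc. */
void counting_free(void *opaque, void *p)
{
    alloc_stats *stats = opaque;

    if (p == NULL)
        return;
    if (stats != NULL)
        stats->live_blocks--;
    free(p);
}
```

<p>To use them, one would set
<code class="computeroutput">strm.bzalloc = counting_alloc</code>,
<code class="computeroutput">strm.bzfree = counting_free</code> and
<code class="computeroutput">strm.opaque</code> to the address of an
<code class="computeroutput">alloc_stats</code> before calling
<code class="computeroutput">BZ2_bzCompressInit</code>.</p>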
1060*4a711beaSLionel Sambuc<p>If you don't want to use a custom memory allocator, set
1061*4a711beaSLionel Sambuc<code class="computeroutput">bzalloc</code>,
1062*4a711beaSLionel Sambuc<code class="computeroutput">bzfree</code> and
1063*4a711beaSLionel Sambuc<code class="computeroutput">opaque</code> to
1064*4a711beaSLionel Sambuc<code class="computeroutput">NULL</code>, and the library will then
1065*4a711beaSLionel Sambucuse the standard <code class="computeroutput">malloc</code> /
1066*4a711beaSLionel Sambuc<code class="computeroutput">free</code> routines.</p>
1067*4a711beaSLionel Sambuc<p>Before calling
1068*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressInit</code>, fields
1069*4a711beaSLionel Sambuc<code class="computeroutput">bzalloc</code>,
1070*4a711beaSLionel Sambuc<code class="computeroutput">bzfree</code> and
1071*4a711beaSLionel Sambuc<code class="computeroutput">opaque</code> should be filled
1072*4a711beaSLionel Sambucappropriately, as just described.  Upon return, the internal
1073*4a711beaSLionel Sambucstate will have been allocated and initialised, and
1074*4a711beaSLionel Sambuc<code class="computeroutput">total_in_lo32</code>,
1075*4a711beaSLionel Sambuc<code class="computeroutput">total_in_hi32</code>,
1076*4a711beaSLionel Sambuc<code class="computeroutput">total_out_lo32</code> and
1077*4a711beaSLionel Sambuc<code class="computeroutput">total_out_hi32</code> will have been
1078*4a711beaSLionel Sambucset to zero.  These four fields are used by the library to inform
1079*4a711beaSLionel Sambucthe caller of the total amount of data passed into and out of the
1080*4a711beaSLionel Sambuclibrary, respectively.  You should not try to change them.  As of
1081*4a711beaSLionel Sambucversion 1.0, 64-bit counts are maintained, even on 32-bit
1082*4a711beaSLionel Sambucplatforms, using the <code class="computeroutput">_hi32</code>
fields to store the upper 32 bits of the count.  So, for example,
the total amount of data in is <code class="computeroutput">(total_in_hi32
&lt;&lt; 32) + total_in_lo32</code>, evaluated in a 64-bit unsigned
type so that the shift does not overflow.</p>
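<p>In C the two halves have to be widened before they are combined,
since shifting a 32-bit <code class="computeroutput">unsigned int</code>
left by 32 is undefined.  A small sketch, with
<code class="computeroutput">bz_total64</code> being a hypothetical
helper name rather than a library function:</p>

```c
#include <stdint.h>

/* Combine the _hi32 and _lo32 halves of a libbzip2 byte count.
   The cast to a 64-bit type happens before the shift. */
uint64_t bz_total64(unsigned int hi32, unsigned int lo32)
{
    return ((uint64_t)hi32 << 32) | (uint64_t)lo32;
}
```

<p>For instance,
<code class="computeroutput">bz_total64(strm.total_in_hi32,
strm.total_in_lo32)</code> would give the number of input bytes
consumed so far.</p>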
1086*4a711beaSLionel Sambuc<p>Parameter <code class="computeroutput">blockSize100k</code>
1087*4a711beaSLionel Sambucspecifies the block size to be used for compression.  It should
1088*4a711beaSLionel Sambucbe a value between 1 and 9 inclusive, and the actual block size
1089*4a711beaSLionel Sambucused is 100000 x this figure.  9 gives the best compression but
1090*4a711beaSLionel Sambuctakes most memory.</p>
1091*4a711beaSLionel Sambuc<p>Parameter <code class="computeroutput">verbosity</code> should
1092*4a711beaSLionel Sambucbe set to a number between 0 and 4 inclusive.  0 is silent, and
1093*4a711beaSLionel Sambucgreater numbers give increasingly verbose monitoring/debugging
1094*4a711beaSLionel Sambucoutput.  If the library has been compiled with
1095*4a711beaSLionel Sambuc<code class="computeroutput">-DBZ_NO_STDIO</code>, no such output
1096*4a711beaSLionel Sambucwill appear for any verbosity setting.</p>
1097*4a711beaSLionel Sambuc<p>Parameter <code class="computeroutput">workFactor</code>
1098*4a711beaSLionel Sambuccontrols how the compression phase behaves when presented with
1099*4a711beaSLionel Sambucworst case, highly repetitive, input data.  If compression runs
1100*4a711beaSLionel Sambucinto difficulties caused by repetitive data, the library switches
1101*4a711beaSLionel Sambucfrom the standard sorting algorithm to a fallback algorithm.  The
1102*4a711beaSLionel Sambucfallback is slower than the standard algorithm by perhaps a
1103*4a711beaSLionel Sambucfactor of three, but always behaves reasonably, no matter how bad
1104*4a711beaSLionel Sambucthe input.</p>
1105*4a711beaSLionel Sambuc<p>Lower values of <code class="computeroutput">workFactor</code>
1106*4a711beaSLionel Sambucreduce the amount of effort the standard algorithm will expend
1107*4a711beaSLionel Sambucbefore resorting to the fallback.  You should set this parameter
carefully: too low, and many inputs will be handled by the
fallback algorithm and so compress rather slowly; too high, and
1110*4a711beaSLionel Sambucyour average-to-worst case compression times can become very
1111*4a711beaSLionel Sambuclarge.  The default value of 30 gives reasonable behaviour over a
1112*4a711beaSLionel Sambucwide range of circumstances.</p>
1113*4a711beaSLionel Sambuc<p>Allowable values range from 0 to 250 inclusive.  0 is a
1114*4a711beaSLionel Sambucspecial case, equivalent to using the default value of 30.</p>
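<p>The rule for <code class="computeroutput">workFactor</code> can be
written down in a couple of lines.  This is only a sketch of the
behaviour described above;
<code class="computeroutput">effective_work_factor</code> is a
hypothetical helper, and returning -1 for an out-of-range value is a
convention of this example.</p>

```c
/* Hypothetical helper: the work factor that is actually in effect.
   0 selects the default of 30; anything outside 0..250 yields -1. */
int effective_work_factor(int workFactor)
{
    if (workFactor < 0 || workFactor > 250)
        return -1;
    return workFactor == 0 ? 30 : workFactor;
}
```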
1115*4a711beaSLionel Sambuc<p>Note that the compressed output generated is the same
1116*4a711beaSLionel Sambucregardless of whether or not the fallback algorithm is
1117*4a711beaSLionel Sambucused.</p>
1118*4a711beaSLionel Sambuc<p>Be aware also that this parameter may disappear entirely in
1119*4a711beaSLionel Sambucfuture versions of the library.  In principle it should be
1120*4a711beaSLionel Sambucpossible to devise a good way to automatically choose which
1121*4a711beaSLionel Sambucalgorithm to use.  Such a mechanism would render the parameter
1122*4a711beaSLionel Sambucobsolete.</p>
1123*4a711beaSLionel Sambuc<p>Possible return values:</p>
1124*4a711beaSLionel Sambuc<pre class="programlisting">BZ_CONFIG_ERROR
1125*4a711beaSLionel Sambuc  if the library has been mis-compiled
1126*4a711beaSLionel SambucBZ_PARAM_ERROR
1127*4a711beaSLionel Sambuc  if strm is NULL
  or blockSize100k &lt; 1 or blockSize100k &gt; 9
1129*4a711beaSLionel Sambuc  or verbosity &lt; 0 or verbosity &gt; 4
1130*4a711beaSLionel Sambuc  or workFactor &lt; 0 or workFactor &gt; 250
1131*4a711beaSLionel SambucBZ_MEM_ERROR
1132*4a711beaSLionel Sambuc  if not enough memory is available
1133*4a711beaSLionel SambucBZ_OK
1134*4a711beaSLionel Sambuc  otherwise</pre>
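<p>If you would rather reject bad arguments before they reach the
library, the documented ranges are easy to check up front.  The
following is a sketch only:
<code class="computeroutput">compress_params_ok</code> is a
hypothetical helper that mirrors the conditions listed above, and it
says nothing about which of them a given library version actually
enforces.</p>

```c
#include <stddef.h>

/* Hypothetical helper: 1 if the arguments lie inside the documented
   ranges for BZ2_bzCompressInit, 0 if BZ_PARAM_ERROR is to be expected. */
int compress_params_ok(const void *strm, int blockSize100k,
                       int verbosity, int workFactor)
{
    if (strm == NULL)
        return 0;
    if (blockSize100k < 1 || blockSize100k > 9)
        return 0;
    if (verbosity < 0 || verbosity > 4)
        return 0;
    if (workFactor < 0 || workFactor > 250)
        return 0;
    return 1;
}
```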
1135*4a711beaSLionel Sambuc<p>Allowable next actions:</p>
1136*4a711beaSLionel Sambuc<pre class="programlisting">BZ2_bzCompress
1137*4a711beaSLionel Sambuc  if BZ_OK is returned
1138*4a711beaSLionel Sambuc  no specific action needed in case of error</pre>
1139*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.3.2.&#160;BZ2_bzCompress">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzCompress"></a>3.3.2.&#160;BZ2_bzCompress</h3></div></div></div>
1143*4a711beaSLionel Sambuc<pre class="programlisting">int BZ2_bzCompress ( bz_stream *strm, int action );</pre>
1144*4a711beaSLionel Sambuc<p>Provides more input and/or output buffer space for the
1145*4a711beaSLionel Sambuclibrary.  The caller maintains input and output buffers, and
1146*4a711beaSLionel Sambuccalls <code class="computeroutput">BZ2_bzCompress</code> to transfer
1147*4a711beaSLionel Sambucdata between them.</p>
1148*4a711beaSLionel Sambuc<p>Before each call to
1149*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code>,
1150*4a711beaSLionel Sambuc<code class="computeroutput">next_in</code> should point at the data
1151*4a711beaSLionel Sambucto be compressed, and <code class="computeroutput">avail_in</code>
1152*4a711beaSLionel Sambucshould indicate how many bytes the library may read.
1153*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> updates
1154*4a711beaSLionel Sambuc<code class="computeroutput">next_in</code>,
1155*4a711beaSLionel Sambuc<code class="computeroutput">avail_in</code> and
<code class="computeroutput">total_in_lo32</code> / <code class="computeroutput">total_in_hi32</code> to reflect the number
1157*4a711beaSLionel Sambucof bytes it has read.</p>
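<p>The bookkeeping just described can be pictured with a toy model.
The structure <code class="computeroutput">toy_in</code> and the
function <code class="computeroutput">toy_consume</code> below are
invented for illustration and are not the library's code; they assume
a 32-bit <code class="computeroutput">unsigned int</code>, which the
library requires anyway.</p>

```c
/* Toy stand-in for the input half of bz_stream.  The field names
   follow the manual, but this is not the real structure. */
typedef struct {
    const char  *next_in;
    unsigned int avail_in;
    unsigned int total_in_lo32;
    unsigned int total_in_hi32;
} toy_in;

/* Record that n bytes (n <= avail_in) have been read from next_in. */
void toy_consume(toy_in *s, unsigned int n)
{
    unsigned int old_lo = s->total_in_lo32;

    s->next_in  += n;               /* advance past the bytes read   */
    s->avail_in -= n;               /* fewer bytes remain available  */
    s->total_in_lo32 += n;          /* wraps modulo 2^32             */
    if (s->total_in_lo32 < old_lo)  /* wrapped: carry into high word */
        s->total_in_hi32++;
}
```

<p>The output side behaves the same way, with
<code class="computeroutput">next_out</code>,
<code class="computeroutput">avail_out</code> and the
<code class="computeroutput">total_out</code> pair in place of the
input fields.</p>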
1158*4a711beaSLionel Sambuc<p>Similarly, <code class="computeroutput">next_out</code> should
1159*4a711beaSLionel Sambucpoint to a buffer in which the compressed data is to be placed,
1160*4a711beaSLionel Sambucwith <code class="computeroutput">avail_out</code> indicating how
1161*4a711beaSLionel Sambucmuch output space is available.
1162*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> updates
1163*4a711beaSLionel Sambuc<code class="computeroutput">next_out</code>,
1164*4a711beaSLionel Sambuc<code class="computeroutput">avail_out</code> and
<code class="computeroutput">total_out_lo32</code> / <code class="computeroutput">total_out_hi32</code> to reflect the number
1166*4a711beaSLionel Sambucof bytes output.</p>
1167*4a711beaSLionel Sambuc<p>You may provide and remove as little or as much data as you
1168*4a711beaSLionel Sambuclike on each call of
1169*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code>.  In the limit,
1170*4a711beaSLionel Sambucit is acceptable to supply and remove data one byte at a time,
1171*4a711beaSLionel Sambucalthough this would be terribly inefficient.  You should always
1172*4a711beaSLionel Sambucensure that at least one byte of output space is available at
1173*4a711beaSLionel Sambuceach call.</p>
1174*4a711beaSLionel Sambuc<p>A second purpose of
1175*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> is to request a
1176*4a711beaSLionel Sambucchange of mode of the compressed stream.</p>
1177*4a711beaSLionel Sambuc<p>Conceptually, a compressed stream can be in one of four
1178*4a711beaSLionel Sambucstates: IDLE, RUNNING, FLUSHING and FINISHING.  Before
1179*4a711beaSLionel Sambucinitialisation
1180*4a711beaSLionel Sambuc(<code class="computeroutput">BZ2_bzCompressInit</code>) and after
1181*4a711beaSLionel Sambuctermination (<code class="computeroutput">BZ2_bzCompressEnd</code>),
1182*4a711beaSLionel Sambuca stream is regarded as IDLE.</p>
1183*4a711beaSLionel Sambuc<p>Upon initialisation
1184*4a711beaSLionel Sambuc(<code class="computeroutput">BZ2_bzCompressInit</code>), the stream
1185*4a711beaSLionel Sambucis placed in the RUNNING state.  Subsequent calls to
1186*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> should pass
<code class="computeroutput">BZ_RUN</code> as the requested action
for as long as there is more input to supply; the only other
legal actions in this state are
<code class="computeroutput">BZ_FLUSH</code> and
<code class="computeroutput">BZ_FINISH</code>.</p>
1190*4a711beaSLionel Sambuc<p>At some point, the calling program will have provided all
1191*4a711beaSLionel Sambucthe input data it wants to.  It will then want to finish up -- in
1192*4a711beaSLionel Sambuceffect, asking the library to process any data it might have
1193*4a711beaSLionel Sambucbuffered internally.  In this state,
1194*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> will no longer
1195*4a711beaSLionel Sambucattempt to read data from
1196*4a711beaSLionel Sambuc<code class="computeroutput">next_in</code>, but it will want to
1197*4a711beaSLionel Sambucwrite data to <code class="computeroutput">next_out</code>.  Because
1198*4a711beaSLionel Sambucthe output buffer supplied by the user can be arbitrarily small,
1199*4a711beaSLionel Sambucthe finishing-up operation cannot necessarily be done with a
1200*4a711beaSLionel Sambucsingle call of
1201*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code>.</p>
1202*4a711beaSLionel Sambuc<p>Instead, the calling program passes
1203*4a711beaSLionel Sambuc<code class="computeroutput">BZ_FINISH</code> as an action to
1204*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code>.  This changes
1205*4a711beaSLionel Sambucthe stream's state to FINISHING.  Any remaining input (ie,
1206*4a711beaSLionel Sambuc<code class="computeroutput">next_in[0 .. avail_in-1]</code>) is
1207*4a711beaSLionel Sambuccompressed and transferred to the output buffer.  To do this,
1208*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> must be called
1209*4a711beaSLionel Sambucrepeatedly until all the output has been consumed.  At that
1210*4a711beaSLionel Sambucpoint, <code class="computeroutput">BZ2_bzCompress</code> returns
1211*4a711beaSLionel Sambuc<code class="computeroutput">BZ_STREAM_END</code>, and the stream's
1212*4a711beaSLionel Sambucstate is set back to IDLE.
1213*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressEnd</code> should then be
1214*4a711beaSLionel Sambuccalled.</p>
1215*4a711beaSLionel Sambuc<p>Just to make sure the calling program does not cheat, the
1216*4a711beaSLionel Sambuclibrary makes a note of <code class="computeroutput">avail_in</code>
1217*4a711beaSLionel Sambucat the time of the first call to
1218*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> which has
1219*4a711beaSLionel Sambuc<code class="computeroutput">BZ_FINISH</code> as an action (ie, at
1220*4a711beaSLionel Sambucthe time the program has announced its intention to not supply
1221*4a711beaSLionel Sambucany more input).  By comparing this value with that of
1222*4a711beaSLionel Sambuc<code class="computeroutput">avail_in</code> over subsequent calls
1223*4a711beaSLionel Sambucto <code class="computeroutput">BZ2_bzCompress</code>, the library
1224*4a711beaSLionel Sambuccan detect any attempts to slip in more data to compress.  Any
1225*4a711beaSLionel Sambuccalls for which this is detected will return
1226*4a711beaSLionel Sambuc<code class="computeroutput">BZ_SEQUENCE_ERROR</code>.  This
1227*4a711beaSLionel Sambucindicates a programming mistake which should be corrected.</p>
1228*4a711beaSLionel Sambuc<p>Instead of asking to finish, the calling program may ask
1229*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> to take all the
1230*4a711beaSLionel Sambucremaining input, compress it and terminate the current
1231*4a711beaSLionel Sambuc(Burrows-Wheeler) compression block.  This could be useful for
1232*4a711beaSLionel Sambucerror control purposes.  The mechanism is analogous to that for
1233*4a711beaSLionel Sambucfinishing: call <code class="computeroutput">BZ2_bzCompress</code>
1234*4a711beaSLionel Sambucwith an action of <code class="computeroutput">BZ_FLUSH</code>,
1235*4a711beaSLionel Sambucremove output data, and persist with the
1236*4a711beaSLionel Sambuc<code class="computeroutput">BZ_FLUSH</code> action until the value
<code class="computeroutput">BZ_RUN_OK</code> is returned.  As with
1238*4a711beaSLionel Sambucfinishing, <code class="computeroutput">BZ2_bzCompress</code>
1239*4a711beaSLionel Sambucdetects any attempt to provide more input data once the flush has
1240*4a711beaSLionel Sambucbegun.</p>
1241*4a711beaSLionel Sambuc<p>Once the flush is complete, the stream returns to the
1242*4a711beaSLionel Sambucnormal RUNNING state.</p>
1243*4a711beaSLionel Sambuc<p>This all sounds pretty complex, but isn't really.  Here's a
1244*4a711beaSLionel Sambuctable which shows which actions are allowable in each state, what
1245*4a711beaSLionel Sambucaction will be taken, what the next state is, and what the
1246*4a711beaSLionel Sambucnon-error return values are.  Note that you can't explicitly ask
1247*4a711beaSLionel Sambucwhat state the stream is in, but nor do you need to -- it can be
1248*4a711beaSLionel Sambucinferred from the values returned by
1249*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code>.</p>
1250*4a711beaSLionel Sambuc<pre class="programlisting">IDLE/any
1251*4a711beaSLionel Sambuc  Illegal.  IDLE state only exists after BZ2_bzCompressEnd or
1252*4a711beaSLionel Sambuc  before BZ2_bzCompressInit.
1253*4a711beaSLionel Sambuc  Return value = BZ_SEQUENCE_ERROR
1254*4a711beaSLionel Sambuc
1255*4a711beaSLionel SambucRUNNING/BZ_RUN
1256*4a711beaSLionel Sambuc  Compress from next_in to next_out as much as possible.
1257*4a711beaSLionel Sambuc  Next state = RUNNING
1258*4a711beaSLionel Sambuc  Return value = BZ_RUN_OK
1259*4a711beaSLionel Sambuc
1260*4a711beaSLionel SambucRUNNING/BZ_FLUSH
  Remember current value of avail_in. Compress from next_in
1262*4a711beaSLionel Sambuc  to next_out as much as possible, but do not accept any more input.
1263*4a711beaSLionel Sambuc  Next state = FLUSHING
1264*4a711beaSLionel Sambuc  Return value = BZ_FLUSH_OK
1265*4a711beaSLionel Sambuc
1266*4a711beaSLionel SambucRUNNING/BZ_FINISH
  Remember current value of avail_in. Compress from next_in
1268*4a711beaSLionel Sambuc  to next_out as much as possible, but do not accept any more input.
1269*4a711beaSLionel Sambuc  Next state = FINISHING
1270*4a711beaSLionel Sambuc  Return value = BZ_FINISH_OK
1271*4a711beaSLionel Sambuc
1272*4a711beaSLionel SambucFLUSHING/BZ_FLUSH
1273*4a711beaSLionel Sambuc  Compress from next_in to next_out as much as possible,
1274*4a711beaSLionel Sambuc  but do not accept any more input.
1275*4a711beaSLionel Sambuc  If all the existing input has been used up and all compressed
1276*4a711beaSLionel Sambuc  output has been removed
1277*4a711beaSLionel Sambuc    Next state = RUNNING; Return value = BZ_RUN_OK
1278*4a711beaSLionel Sambuc  else
1279*4a711beaSLionel Sambuc    Next state = FLUSHING; Return value = BZ_FLUSH_OK
1280*4a711beaSLionel Sambuc
1281*4a711beaSLionel SambucFLUSHING/other
1282*4a711beaSLionel Sambuc  Illegal.
1283*4a711beaSLionel Sambuc  Return value = BZ_SEQUENCE_ERROR
1284*4a711beaSLionel Sambuc
1285*4a711beaSLionel SambucFINISHING/BZ_FINISH
1286*4a711beaSLionel Sambuc  Compress from next_in to next_out as much as possible,
  but do not accept any more input.
1288*4a711beaSLionel Sambuc  If all the existing input has been used up and all compressed
1289*4a711beaSLionel Sambuc  output has been removed
1290*4a711beaSLionel Sambuc    Next state = IDLE; Return value = BZ_STREAM_END
1291*4a711beaSLionel Sambuc  else
1292*4a711beaSLionel Sambuc    Next state = FINISHING; Return value = BZ_FINISH_OK
1293*4a711beaSLionel Sambuc
1294*4a711beaSLionel SambucFINISHING/other
1295*4a711beaSLionel Sambuc  Illegal.
1296*4a711beaSLionel Sambuc  Return value = BZ_SEQUENCE_ERROR</pre>
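<p>One way to read the table is as a small transition function.  The
sketch below models only the state bookkeeping, exactly as the table is
written; it does not call the library.  All names in it
(<code class="computeroutput">model_step</code>, the
<code class="computeroutput">ST_</code>,
<code class="computeroutput">ACT_</code> and
<code class="computeroutput">RES_</code> enumerators, and the
<code class="computeroutput">drained</code> flag) are hypothetical
stand-ins, not identifiers from
<code class="computeroutput">bzlib.h</code>.</p>

```c
/* Model of the BZ2_bzCompress state table; not part of libbzip2. */
enum model_state  { ST_IDLE, ST_RUNNING, ST_FLUSHING, ST_FINISHING };
enum model_action { ACT_RUN, ACT_FLUSH, ACT_FINISH };
enum model_result { RES_SEQUENCE_ERROR, RES_RUN_OK, RES_FLUSH_OK,
                    RES_FINISH_OK, RES_STREAM_END };

/* Apply one action to *st.  `drained` is nonzero when all existing
   input has been used up and all compressed output has been removed. */
static enum model_result model_step(enum model_state *st,
                                    enum model_action act, int drained)
{
    switch (*st) {
    case ST_RUNNING:
        if (act == ACT_RUN)
            return RES_RUN_OK;
        if (act == ACT_FLUSH) {
            *st = ST_FLUSHING;
            return RES_FLUSH_OK;
        }
        *st = ST_FINISHING;            /* ACT_FINISH */
        return RES_FINISH_OK;

    case ST_FLUSHING:
        if (act != ACT_FLUSH)
            return RES_SEQUENCE_ERROR;
        if (drained) {
            *st = ST_RUNNING;
            return RES_RUN_OK;
        }
        return RES_FLUSH_OK;

    case ST_FINISHING:
        if (act != ACT_FINISH)
            return RES_SEQUENCE_ERROR;
        if (drained) {
            *st = ST_IDLE;
            return RES_STREAM_END;
        }
        return RES_FINISH_OK;

    case ST_IDLE:
    default:
        return RES_SEQUENCE_ERROR;     /* IDLE/any is illegal */
    }
}
```

<p>Since the state cannot be queried directly, a real caller would drive
its loop from the values returned by
<code class="computeroutput">BZ2_bzCompress</code>, much as this model
drives its state from the action and the
<code class="computeroutput">drained</code> flag.</p>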
1297*4a711beaSLionel Sambuc<p>That still looks complicated?  Well, fair enough.  The
1298*4a711beaSLionel Sambucusual sequence of calls for compressing a load of data is:</p>
1299*4a711beaSLionel Sambuc<div class="orderedlist"><ol class="orderedlist" type="1">
1300*4a711beaSLionel Sambuc<li class="listitem"><p>Get started with
1301*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzCompressInit</code>.</p></li>
1302*4a711beaSLionel Sambuc<li class="listitem"><p>Shovel data in and shlurp out its compressed form
1303*4a711beaSLionel Sambuc  using zero or more calls of
1304*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzCompress</code> with action =
1305*4a711beaSLionel Sambuc  <code class="computeroutput">BZ_RUN</code>.</p></li>
1306*4a711beaSLionel Sambuc<li class="listitem"><p>Finish up. Repeatedly call
1307*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzCompress</code> with action =
1308*4a711beaSLionel Sambuc  <code class="computeroutput">BZ_FINISH</code>, copying out the
1309*4a711beaSLionel Sambuc  compressed output, until
1310*4a711beaSLionel Sambuc  <code class="computeroutput">BZ_STREAM_END</code> is
1311*4a711beaSLionel Sambuc  returned.</p></li>
1312*4a711beaSLionel Sambuc<li class="listitem"><p>Close up and go home.  Call
1313*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzCompressEnd</code>.</p></li>
1314*4a711beaSLionel Sambuc</ol></div>
1315*4a711beaSLionel Sambuc<p>If the data you want to compress fits into your input
1316*4a711beaSLionel Sambucbuffer all at once, you can skip the calls of
1317*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress ( ..., BZ_RUN )</code>
1318*4a711beaSLionel Sambucand just do the <code class="computeroutput">BZ2_bzCompress ( ..., BZ_FINISH
1319*4a711beaSLionel Sambuc)</code> calls.</p>
1320*4a711beaSLionel Sambuc<p>All required memory is allocated by
1321*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressInit</code>.  The
1322*4a711beaSLionel Sambuccompression library can accept any data at all (obviously).  So
1323*4a711beaSLionel Sambucyou shouldn't get any error return values from the
1324*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code> calls.  If you
1325*4a711beaSLionel Sambucdo, they will be
1326*4a711beaSLionel Sambuc<code class="computeroutput">BZ_SEQUENCE_ERROR</code>, and indicate
1327*4a711beaSLionel Sambuca bug in your programming.</p>
1328*4a711beaSLionel Sambuc<p>Trivial other possible return values:</p>
1329*4a711beaSLionel Sambuc<pre class="programlisting">BZ_PARAM_ERROR
1330*4a711beaSLionel Sambuc  if strm is NULL, or strm-&gt;s is NULL</pre>
1331*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.3.3.&#160;BZ2_bzCompressEnd">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzCompress-end"></a>3.3.3.&#160;BZ2_bzCompressEnd</h3></div></div></div>
1335*4a711beaSLionel Sambuc<pre class="programlisting">int BZ2_bzCompressEnd ( bz_stream *strm );</pre>
1336*4a711beaSLionel Sambuc<p>Releases all memory associated with a compression
1337*4a711beaSLionel Sambucstream.</p>
1338*4a711beaSLionel Sambuc<p>Possible return values:</p>
1339*4a711beaSLionel Sambuc<pre class="programlisting">BZ_PARAM_ERROR  if strm is NULL or strm-&gt;s is NULL
1340*4a711beaSLionel SambucBZ_OK           otherwise</pre>
1341*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.3.4.&#160;BZ2_bzDecompressInit">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzDecompress-init"></a>3.3.4.&#160;BZ2_bzDecompressInit</h3></div></div></div>
1345*4a711beaSLionel Sambuc<pre class="programlisting">int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small );</pre>
1346*4a711beaSLionel Sambuc<p>Prepares for decompression.  As with
1347*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressInit</code>, a
1348*4a711beaSLionel Sambuc<code class="computeroutput">bz_stream</code> record should be
1349*4a711beaSLionel Sambucallocated and initialised before the call.  Fields
1350*4a711beaSLionel Sambuc<code class="computeroutput">bzalloc</code>,
1351*4a711beaSLionel Sambuc<code class="computeroutput">bzfree</code> and
1352*4a711beaSLionel Sambuc<code class="computeroutput">opaque</code> should be set if a custom
1353*4a711beaSLionel Sambucmemory allocator is required, or made
1354*4a711beaSLionel Sambuc<code class="computeroutput">NULL</code> for the normal
1355*4a711beaSLionel Sambuc<code class="computeroutput">malloc</code> /
1356*4a711beaSLionel Sambuc<code class="computeroutput">free</code> routines.  Upon return, the
1357*4a711beaSLionel Sambucinternal state will have been initialised, and
1358*4a711beaSLionel Sambuc<code class="computeroutput">total_in</code> and
1359*4a711beaSLionel Sambuc<code class="computeroutput">total_out</code> will be zero.</p>
1360*4a711beaSLionel Sambuc<p>For the meaning of parameter
1361*4a711beaSLionel Sambuc<code class="computeroutput">verbosity</code>, see
1362*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressInit</code>.</p>
1363*4a711beaSLionel Sambuc<p>If <code class="computeroutput">small</code> is nonzero, the
1364*4a711beaSLionel Sambuclibrary will use an alternative decompression algorithm which
1365*4a711beaSLionel Sambucuses less memory but at the cost of decompressing more slowly
1366*4a711beaSLionel Sambuc(roughly speaking, half the speed, but the maximum memory
requirement drops to around 2300k).  See <a class="xref" href="#using" title="2.&#160;How to use bzip2">How to use bzip2</a>
1368*4a711beaSLionel Sambucfor more information on memory management.</p>
1369*4a711beaSLionel Sambuc<p>Note that the amount of memory needed to decompress a
1370*4a711beaSLionel Sambucstream cannot be determined until the stream's header has been
1371*4a711beaSLionel Sambucread, so even if
1372*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompressInit</code> succeeds, a
1373*4a711beaSLionel Sambucsubsequent <code class="computeroutput">BZ2_bzDecompress</code>
1374*4a711beaSLionel Sambuccould fail with
1375*4a711beaSLionel Sambuc<code class="computeroutput">BZ_MEM_ERROR</code>.</p>
1376*4a711beaSLionel Sambuc<p>Possible return values:</p>
1377*4a711beaSLionel Sambuc<pre class="programlisting">BZ_CONFIG_ERROR
1378*4a711beaSLionel Sambuc  if the library has been mis-compiled
1379*4a711beaSLionel SambucBZ_PARAM_ERROR
1380*4a711beaSLionel Sambuc  if ( small != 0 &amp;&amp; small != 1 )
  or (verbosity &lt; 0 || verbosity &gt; 4)
1382*4a711beaSLionel SambucBZ_MEM_ERROR
1383*4a711beaSLionel Sambuc  if insufficient memory is available</pre>
1384*4a711beaSLionel Sambuc<p>Allowable next actions:</p>
1385*4a711beaSLionel Sambuc<pre class="programlisting">BZ2_bzDecompress
1386*4a711beaSLionel Sambuc  if BZ_OK was returned
1387*4a711beaSLionel Sambuc  no specific action required in case of error</pre>
1388*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.3.5.&#160;BZ2_bzDecompress">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzDecompress"></a>3.3.5.&#160;BZ2_bzDecompress</h3></div></div></div>
1392*4a711beaSLionel Sambuc<pre class="programlisting">int BZ2_bzDecompress ( bz_stream *strm );</pre>
<p>Provides more input and/or output buffer space for the
1394*4a711beaSLionel Sambuclibrary.  The caller maintains input and output buffers, and uses
1395*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompress</code> to transfer
1396*4a711beaSLionel Sambucdata between them.</p>
1397*4a711beaSLionel Sambuc<p>Before each call to
1398*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompress</code>,
1399*4a711beaSLionel Sambuc<code class="computeroutput">next_in</code> should point at the
1400*4a711beaSLionel Sambuccompressed data, and <code class="computeroutput">avail_in</code>
1401*4a711beaSLionel Sambucshould indicate how many bytes the library may read.
1402*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompress</code> updates
1403*4a711beaSLionel Sambuc<code class="computeroutput">next_in</code>,
1404*4a711beaSLionel Sambuc<code class="computeroutput">avail_in</code> and
1405*4a711beaSLionel Sambuc<code class="computeroutput">total_in</code> to reflect the number
1406*4a711beaSLionel Sambucof bytes it has read.</p>
1407*4a711beaSLionel Sambuc<p>Similarly, <code class="computeroutput">next_out</code> should
1408*4a711beaSLionel Sambucpoint to a buffer in which the uncompressed output is to be
1409*4a711beaSLionel Sambucplaced, with <code class="computeroutput">avail_out</code>
1410*4a711beaSLionel Sambucindicating how much output space is available.
<code class="computeroutput">BZ2_bzDecompress</code> updates
1412*4a711beaSLionel Sambuc<code class="computeroutput">next_out</code>,
1413*4a711beaSLionel Sambuc<code class="computeroutput">avail_out</code> and
1414*4a711beaSLionel Sambuc<code class="computeroutput">total_out</code> to reflect the number
1415*4a711beaSLionel Sambucof bytes output.</p>
1416*4a711beaSLionel Sambuc<p>You may provide and remove as little or as much data as you
1417*4a711beaSLionel Sambuclike on each call of
1418*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompress</code>.  In the limit,
1419*4a711beaSLionel Sambucit is acceptable to supply and remove data one byte at a time,
1420*4a711beaSLionel Sambucalthough this would be terribly inefficient.  You should always
1421*4a711beaSLionel Sambucensure that at least one byte of output space is available at
1422*4a711beaSLionel Sambuceach call.</p>
1423*4a711beaSLionel Sambuc<p>Use of <code class="computeroutput">BZ2_bzDecompress</code> is
1424*4a711beaSLionel Sambucsimpler than
1425*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompress</code>.</p>
1426*4a711beaSLionel Sambuc<p>You should provide input and remove output as described
1427*4a711beaSLionel Sambucabove, and repeatedly call
1428*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompress</code> until
1429*4a711beaSLionel Sambuc<code class="computeroutput">BZ_STREAM_END</code> is returned.
1430*4a711beaSLionel SambucAppearance of <code class="computeroutput">BZ_STREAM_END</code>
1431*4a711beaSLionel Sambucdenotes that <code class="computeroutput">BZ2_bzDecompress</code>
1432*4a711beaSLionel Sambuchas detected the logical end of the compressed stream.
1433*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompress</code> will not
1434*4a711beaSLionel Sambucproduce <code class="computeroutput">BZ_STREAM_END</code> until all
1435*4a711beaSLionel Sambucoutput data has been placed into the output buffer, so once
1436*4a711beaSLionel Sambuc<code class="computeroutput">BZ_STREAM_END</code> appears, you are
1437*4a711beaSLionel Sambucguaranteed to have available all the decompressed output, and
1438*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompressEnd</code> can safely
1439*4a711beaSLionel Sambucbe called.</p>
<p>In case of an error return value, you should call
1441*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompressEnd</code> to clean up
1442*4a711beaSLionel Sambucand release memory.</p>
1443*4a711beaSLionel Sambuc<p>Possible return values:</p>
1444*4a711beaSLionel Sambuc<pre class="programlisting">BZ_PARAM_ERROR
1445*4a711beaSLionel Sambuc  if strm is NULL or strm-&gt;s is NULL
1446*4a711beaSLionel Sambuc  or strm-&gt;avail_out &lt; 1
1447*4a711beaSLionel SambucBZ_DATA_ERROR
1448*4a711beaSLionel Sambuc  if a data integrity error is detected in the compressed stream
1449*4a711beaSLionel SambucBZ_DATA_ERROR_MAGIC
1450*4a711beaSLionel Sambuc  if the compressed stream doesn't begin with the right magic bytes
1451*4a711beaSLionel SambucBZ_MEM_ERROR
1452*4a711beaSLionel Sambuc  if there wasn't enough memory available
1453*4a711beaSLionel SambucBZ_STREAM_END
  if the logical end of the data stream was detected and all
  output has been consumed, eg strm-&gt;avail_out &gt; 0
1456*4a711beaSLionel SambucBZ_OK
1457*4a711beaSLionel Sambuc  otherwise</pre>
1458*4a711beaSLionel Sambuc<p>Allowable next actions:</p>
1459*4a711beaSLionel Sambuc<pre class="programlisting">BZ2_bzDecompress
1460*4a711beaSLionel Sambuc  if BZ_OK was returned
1461*4a711beaSLionel SambucBZ2_bzDecompressEnd
1462*4a711beaSLionel Sambuc  otherwise</pre>
1463*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.3.6.&#160;BZ2_bzDecompressEnd">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzDecompress-end"></a>3.3.6.&#160;BZ2_bzDecompressEnd</h3></div></div></div>
1467*4a711beaSLionel Sambuc<pre class="programlisting">int BZ2_bzDecompressEnd ( bz_stream *strm );</pre>
1468*4a711beaSLionel Sambuc<p>Releases all memory associated with a decompression
1469*4a711beaSLionel Sambucstream.</p>
1470*4a711beaSLionel Sambuc<p>Possible return values:</p>
1471*4a711beaSLionel Sambuc<pre class="programlisting">BZ_PARAM_ERROR
1472*4a711beaSLionel Sambuc  if strm is NULL or strm-&gt;s is NULL
1473*4a711beaSLionel SambucBZ_OK
1474*4a711beaSLionel Sambuc  otherwise</pre>
1475*4a711beaSLionel Sambuc<p>Allowable next actions:</p>
1476*4a711beaSLionel Sambuc<pre class="programlisting">  None.</pre>
1477*4a711beaSLionel Sambuc</div>
1478*4a711beaSLionel Sambuc</div>
<div class="sect1" title="3.4.&#160;High-level interface">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="hl-interface"></a>3.4.&#160;High-level interface</h2></div></div></div>
1482*4a711beaSLionel Sambuc<p>This interface provides functions for reading and writing
1483*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> format files.  First, some
1484*4a711beaSLionel Sambucgeneral points.</p>
1485*4a711beaSLionel Sambuc<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
1486*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>All of the functions take an
1487*4a711beaSLionel Sambuc  <code class="computeroutput">int*</code> first argument,
1488*4a711beaSLionel Sambuc  <code class="computeroutput">bzerror</code>.  After each call,
1489*4a711beaSLionel Sambuc  <code class="computeroutput">bzerror</code> should be consulted
1490*4a711beaSLionel Sambuc  first to determine the outcome of the call.  If
1491*4a711beaSLionel Sambuc  <code class="computeroutput">bzerror</code> is
1492*4a711beaSLionel Sambuc  <code class="computeroutput">BZ_OK</code>, the call completed
1493*4a711beaSLionel Sambuc  successfully, and only then should the return value of the
1494*4a711beaSLionel Sambuc  function (if any) be consulted.  If
1495*4a711beaSLionel Sambuc  <code class="computeroutput">bzerror</code> is
1496*4a711beaSLionel Sambuc  <code class="computeroutput">BZ_IO_ERROR</code>, there was an
1497*4a711beaSLionel Sambuc  error reading/writing the underlying compressed file, and you
1498*4a711beaSLionel Sambuc  should then consult <code class="computeroutput">errno</code> /
1499*4a711beaSLionel Sambuc  <code class="computeroutput">perror</code> to determine the cause
1500*4a711beaSLionel Sambuc  of the difficulty.  <code class="computeroutput">bzerror</code>
1501*4a711beaSLionel Sambuc  may also be set to various other values; precise details are
1502*4a711beaSLionel Sambuc  given on a per-function basis below.</p></li>
1503*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>If <code class="computeroutput">bzerror</code> indicates
1504*4a711beaSLionel Sambuc  an error (ie, anything except
1505*4a711beaSLionel Sambuc  <code class="computeroutput">BZ_OK</code> and
1506*4a711beaSLionel Sambuc  <code class="computeroutput">BZ_STREAM_END</code>), you should
1507*4a711beaSLionel Sambuc  immediately call
1508*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzReadClose</code> (or
1509*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzWriteClose</code>, depending on
1510*4a711beaSLionel Sambuc  whether you are attempting to read or to write) to free up all
1511*4a711beaSLionel Sambuc  resources associated with the stream.  Once an error has been
1512*4a711beaSLionel Sambuc  indicated, behaviour of all calls except
1513*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzReadClose</code>
1514*4a711beaSLionel Sambuc  (<code class="computeroutput">BZ2_bzWriteClose</code>) is
1515*4a711beaSLionel Sambuc  undefined.  The implication is that (1)
1516*4a711beaSLionel Sambuc  <code class="computeroutput">bzerror</code> should be checked
1517*4a711beaSLionel Sambuc  after each call, and (2) if
1518*4a711beaSLionel Sambuc  <code class="computeroutput">bzerror</code> indicates an error,
1519*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzReadClose</code>
1520*4a711beaSLionel Sambuc  (<code class="computeroutput">BZ2_bzWriteClose</code>) should then
1521*4a711beaSLionel Sambuc  be called to clean up.</p></li>
1522*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>The <code class="computeroutput">FILE*</code> arguments
1523*4a711beaSLionel Sambuc  passed to <code class="computeroutput">BZ2_bzReadOpen</code> /
1524*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzWriteOpen</code> should be set
1525*4a711beaSLionel Sambuc  to binary mode.  Most Unix systems will do this by default, but
1526*4a711beaSLionel Sambuc  other platforms, including Windows and Mac, will not.  If you
1527*4a711beaSLionel Sambuc  omit this, you may encounter problems when moving code to new
1528*4a711beaSLionel Sambuc  platforms.</p></li>
1529*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>Memory allocation requests are handled by
1530*4a711beaSLionel Sambuc  <code class="computeroutput">malloc</code> /
1531*4a711beaSLionel Sambuc  <code class="computeroutput">free</code>.  At present there is no
1532*4a711beaSLionel Sambuc  facility for user-defined memory allocators in the file I/O
1533*4a711beaSLionel Sambuc  functions (could easily be added, though).</p></li>
1534*4a711beaSLionel Sambuc</ul></div>
<div class="sect2" title="3.4.1.&#160;BZ2_bzReadOpen">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzreadopen"></a>3.4.1.&#160;BZ2_bzReadOpen</h3></div></div></div>
1538*4a711beaSLionel Sambuc<pre class="programlisting">typedef void BZFILE;
1539*4a711beaSLionel Sambuc
1540*4a711beaSLionel SambucBZFILE *BZ2_bzReadOpen( int *bzerror, FILE *f,
1541*4a711beaSLionel Sambuc                        int verbosity, int small,
1542*4a711beaSLionel Sambuc                        void *unused, int nUnused );</pre>
1543*4a711beaSLionel Sambuc<p>Prepare to read compressed data from file handle
1544*4a711beaSLionel Sambuc<code class="computeroutput">f</code>.
1545*4a711beaSLionel Sambuc<code class="computeroutput">f</code> should refer to a file which
1546*4a711beaSLionel Sambuchas been opened for reading, and for which the error indicator
(<code class="computeroutput">ferror(f)</code>) is not set.  If
1548*4a711beaSLionel Sambuc<code class="computeroutput">small</code> is 1, the library will try
1549*4a711beaSLionel Sambucto decompress using less memory, at the expense of speed.</p>
1550*4a711beaSLionel Sambuc<p>For reasons explained below,
1551*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzRead</code> will decompress the
1552*4a711beaSLionel Sambuc<code class="computeroutput">nUnused</code> bytes starting at
1553*4a711beaSLionel Sambuc<code class="computeroutput">unused</code>, before starting to read
1554*4a711beaSLionel Sambucfrom the file <code class="computeroutput">f</code>.  At most
1555*4a711beaSLionel Sambuc<code class="computeroutput">BZ_MAX_UNUSED</code> bytes may be
1556*4a711beaSLionel Sambucsupplied like this.  If this facility is not required, you should
1557*4a711beaSLionel Sambucpass <code class="computeroutput">NULL</code> and
1558*4a711beaSLionel Sambuc<code class="computeroutput">0</code> for
1559*4a711beaSLionel Sambuc<code class="computeroutput">unused</code> and
<code class="computeroutput">nUnused</code> respectively.</p>
1561*4a711beaSLionel Sambuc<p>For the meaning of parameters
1562*4a711beaSLionel Sambuc<code class="computeroutput">small</code> and
1563*4a711beaSLionel Sambuc<code class="computeroutput">verbosity</code>, see
1564*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompressInit</code>.</p>
1565*4a711beaSLionel Sambuc<p>The amount of memory needed to decompress a file cannot be
1566*4a711beaSLionel Sambucdetermined until the file's header has been read.  So it is
1567*4a711beaSLionel Sambucpossible that <code class="computeroutput">BZ2_bzReadOpen</code>
1568*4a711beaSLionel Sambucreturns <code class="computeroutput">BZ_OK</code> but a subsequent
1569*4a711beaSLionel Sambuccall of <code class="computeroutput">BZ2_bzRead</code> will return
1570*4a711beaSLionel Sambuc<code class="computeroutput">BZ_MEM_ERROR</code>.</p>
1571*4a711beaSLionel Sambuc<p>Possible assignments to
1572*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code>:</p>
1573*4a711beaSLionel Sambuc<pre class="programlisting">BZ_CONFIG_ERROR
1574*4a711beaSLionel Sambuc  if the library has been mis-compiled
1575*4a711beaSLionel SambucBZ_PARAM_ERROR
1576*4a711beaSLionel Sambuc  if f is NULL
1577*4a711beaSLionel Sambuc  or small is neither 0 nor 1
1578*4a711beaSLionel Sambuc  or ( unused == NULL &amp;&amp; nUnused != 0 )
1579*4a711beaSLionel Sambuc  or ( unused != NULL &amp;&amp; !(0 &lt;= nUnused &lt;= BZ_MAX_UNUSED) )
1580*4a711beaSLionel SambucBZ_IO_ERROR
1581*4a711beaSLionel Sambuc  if ferror(f) is nonzero
1582*4a711beaSLionel SambucBZ_MEM_ERROR
1583*4a711beaSLionel Sambuc  if insufficient memory is available
1584*4a711beaSLionel SambucBZ_OK
1585*4a711beaSLionel Sambuc  otherwise.</pre>
1586*4a711beaSLionel Sambuc<p>Possible return values:</p>
1587*4a711beaSLionel Sambuc<pre class="programlisting">Pointer to an abstract BZFILE
1588*4a711beaSLionel Sambuc  if bzerror is BZ_OK
1589*4a711beaSLionel SambucNULL
1590*4a711beaSLionel Sambuc  otherwise</pre>
1591*4a711beaSLionel Sambuc<p>Allowable next actions:</p>
1592*4a711beaSLionel Sambuc<pre class="programlisting">BZ2_bzRead
1593*4a711beaSLionel Sambuc  if bzerror is BZ_OK
BZ2_bzReadClose
1595*4a711beaSLionel Sambuc  otherwise</pre>
1596*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.4.2.&#160;BZ2_bzRead">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzread"></a>3.4.2.&#160;BZ2_bzRead</h3></div></div></div>
1600*4a711beaSLionel Sambuc<pre class="programlisting">int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len );</pre>
1601*4a711beaSLionel Sambuc<p>Reads up to <code class="computeroutput">len</code>
1602*4a711beaSLionel Sambuc(uncompressed) bytes from the compressed file
1603*4a711beaSLionel Sambuc<code class="computeroutput">b</code> into the buffer
1604*4a711beaSLionel Sambuc<code class="computeroutput">buf</code>.  If the read was
1605*4a711beaSLionel Sambucsuccessful, <code class="computeroutput">bzerror</code> is set to
1606*4a711beaSLionel Sambuc<code class="computeroutput">BZ_OK</code> and the number of bytes
1607*4a711beaSLionel Sambucread is returned.  If the logical end-of-stream was detected,
1608*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code> will be set to
1609*4a711beaSLionel Sambuc<code class="computeroutput">BZ_STREAM_END</code>, and the number of
1610*4a711beaSLionel Sambucbytes read is returned.  All other
1611*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code> values denote an
1612*4a711beaSLionel Sambucerror.</p>
1613*4a711beaSLionel Sambuc<p><code class="computeroutput">BZ2_bzRead</code> will supply
1614*4a711beaSLionel Sambuc<code class="computeroutput">len</code> bytes, unless the logical
1615*4a711beaSLionel Sambucstream end is detected or an error occurs.  Because of this, it
1616*4a711beaSLionel Sambucis possible to detect the stream end by observing when the number
1617*4a711beaSLionel Sambucof bytes returned is less than the number requested.
1618*4a711beaSLionel SambucNevertheless, this is regarded as inadvisable; you should instead
1619*4a711beaSLionel Sambuccheck <code class="computeroutput">bzerror</code> after every call
1620*4a711beaSLionel Sambucand watch out for
1621*4a711beaSLionel Sambuc<code class="computeroutput">BZ_STREAM_END</code>.</p>
1622*4a711beaSLionel Sambuc<p>Internally, <code class="computeroutput">BZ2_bzRead</code>
1623*4a711beaSLionel Sambuccopies data from the compressed file in chunks of size
1624*4a711beaSLionel Sambuc<code class="computeroutput">BZ_MAX_UNUSED</code> bytes before
1625*4a711beaSLionel Sambucdecompressing it.  If the file contains more bytes than strictly
1626*4a711beaSLionel Sambucneeded to reach the logical end-of-stream,
1627*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzRead</code> will almost certainly
1628*4a711beaSLionel Sambucread some of the trailing data before signalling
<code class="computeroutput">BZ_STREAM_END</code>.  To collect the
read but unused data once
<code class="computeroutput">BZ_STREAM_END</code> has appeared,
1632*4a711beaSLionel Sambuccall <code class="computeroutput">BZ2_bzReadGetUnused</code>
1633*4a711beaSLionel Sambucimmediately before
1634*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadClose</code>.</p>
1635*4a711beaSLionel Sambuc<p>Possible assignments to
1636*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code>:</p>
1637*4a711beaSLionel Sambuc<pre class="programlisting">BZ_PARAM_ERROR
1638*4a711beaSLionel Sambuc  if b is NULL or buf is NULL or len &lt; 0
1639*4a711beaSLionel SambucBZ_SEQUENCE_ERROR
1640*4a711beaSLionel Sambuc  if b was opened with BZ2_bzWriteOpen
1641*4a711beaSLionel SambucBZ_IO_ERROR
1642*4a711beaSLionel Sambuc  if there is an error reading from the compressed file
1643*4a711beaSLionel SambucBZ_UNEXPECTED_EOF
1644*4a711beaSLionel Sambuc  if the compressed file ended before
1645*4a711beaSLionel Sambuc  the logical end-of-stream was detected
1646*4a711beaSLionel SambucBZ_DATA_ERROR
1647*4a711beaSLionel Sambuc  if a data integrity error was detected in the compressed stream
1648*4a711beaSLionel SambucBZ_DATA_ERROR_MAGIC
1649*4a711beaSLionel Sambuc  if the stream does not begin with the requisite header bytes
1650*4a711beaSLionel Sambuc  (ie, is not a bzip2 data file).  This is really
1651*4a711beaSLionel Sambuc  a special case of BZ_DATA_ERROR.
1652*4a711beaSLionel SambucBZ_MEM_ERROR
1653*4a711beaSLionel Sambuc  if insufficient memory was available
1654*4a711beaSLionel SambucBZ_STREAM_END
1655*4a711beaSLionel Sambuc  if the logical end of stream was detected.
1656*4a711beaSLionel SambucBZ_OK
1657*4a711beaSLionel Sambuc  otherwise.</pre>
1658*4a711beaSLionel Sambuc<p>Possible return values:</p>
1659*4a711beaSLionel Sambuc<pre class="programlisting">number of bytes read
1660*4a711beaSLionel Sambuc  if bzerror is BZ_OK or BZ_STREAM_END
1661*4a711beaSLionel Sambucundefined
1662*4a711beaSLionel Sambuc  otherwise</pre>
1663*4a711beaSLionel Sambuc<p>Allowable next actions:</p>
1664*4a711beaSLionel Sambuc<pre class="programlisting">collect data from buf, then BZ2_bzRead or BZ2_bzReadClose
1665*4a711beaSLionel Sambuc  if bzerror is BZ_OK
1666*4a711beaSLionel Sambuccollect data from buf, then BZ2_bzReadClose or BZ2_bzReadGetUnused
  if bzerror is BZ_STREAM_END
1668*4a711beaSLionel SambucBZ2_bzReadClose
1669*4a711beaSLionel Sambuc  otherwise</pre>
1670*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.4.3.&nbsp;BZ2_bzReadGetUnused">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzreadgetunused"></a>3.4.3.&nbsp;BZ2_bzReadGetUnused</h3></div></div></div>
1674*4a711beaSLionel Sambuc<pre class="programlisting">void BZ2_bzReadGetUnused( int* bzerror, BZFILE *b,
1675*4a711beaSLionel Sambuc                          void** unused, int* nUnused );</pre>
1676*4a711beaSLionel Sambuc<p>Returns data which was read from the compressed file but
1677*4a711beaSLionel Sambucwas not needed to get to the logical end-of-stream.
1678*4a711beaSLionel Sambuc<code class="computeroutput">*unused</code> is set to the address of
1679*4a711beaSLionel Sambucthe data, and <code class="computeroutput">*nUnused</code> to the
1680*4a711beaSLionel Sambucnumber of bytes.  <code class="computeroutput">*nUnused</code> will
1681*4a711beaSLionel Sambucbe set to a value between <code class="computeroutput">0</code> and
1682*4a711beaSLionel Sambuc<code class="computeroutput">BZ_MAX_UNUSED</code> inclusive.</p>
1683*4a711beaSLionel Sambuc<p>This function may only be called once
1684*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzRead</code> has signalled
1685*4a711beaSLionel Sambuc<code class="computeroutput">BZ_STREAM_END</code> but before
1686*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadClose</code>.</p>
1687*4a711beaSLionel Sambuc<p>Possible assignments to
1688*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code>:</p>
1689*4a711beaSLionel Sambuc<pre class="programlisting">BZ_PARAM_ERROR
1690*4a711beaSLionel Sambuc  if b is NULL
1691*4a711beaSLionel Sambuc  or unused is NULL or nUnused is NULL
1692*4a711beaSLionel SambucBZ_SEQUENCE_ERROR
1693*4a711beaSLionel Sambuc  if BZ_STREAM_END has not been signalled
1694*4a711beaSLionel Sambuc  or if b was opened with BZ2_bzWriteOpen
1695*4a711beaSLionel SambucBZ_OK
1696*4a711beaSLionel Sambuc  otherwise</pre>
1697*4a711beaSLionel Sambuc<p>Allowable next actions:</p>
1698*4a711beaSLionel Sambuc<pre class="programlisting">BZ2_bzReadClose</pre>
1699*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.4.4.&nbsp;BZ2_bzReadClose">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzreadclose"></a>3.4.4.&nbsp;BZ2_bzReadClose</h3></div></div></div>
1703*4a711beaSLionel Sambuc<pre class="programlisting">void BZ2_bzReadClose ( int *bzerror, BZFILE *b );</pre>
1704*4a711beaSLionel Sambuc<p>Releases all memory pertaining to the compressed file
1705*4a711beaSLionel Sambuc<code class="computeroutput">b</code>.
1706*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadClose</code> does not call
1707*4a711beaSLionel Sambuc<code class="computeroutput">fclose</code> on the underlying file
1708*4a711beaSLionel Sambuchandle, so you should do that yourself if appropriate.
1709*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadClose</code> should be called
1710*4a711beaSLionel Sambucto clean up after all error situations.</p>
1711*4a711beaSLionel Sambuc<p>Possible assignments to
1712*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code>:</p>
1713*4a711beaSLionel Sambuc<pre class="programlisting">BZ_SEQUENCE_ERROR
  if b was opened with BZ2_bzWriteOpen
1715*4a711beaSLionel SambucBZ_OK
1716*4a711beaSLionel Sambuc  otherwise</pre>
1717*4a711beaSLionel Sambuc<p>Allowable next actions:</p>
1718*4a711beaSLionel Sambuc<pre class="programlisting">none</pre>
1719*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.4.5.&nbsp;BZ2_bzWriteOpen">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzwriteopen"></a>3.4.5.&nbsp;BZ2_bzWriteOpen</h3></div></div></div>
1723*4a711beaSLionel Sambuc<pre class="programlisting">BZFILE *BZ2_bzWriteOpen( int *bzerror, FILE *f,
1724*4a711beaSLionel Sambuc                         int blockSize100k, int verbosity,
1725*4a711beaSLionel Sambuc                         int workFactor );</pre>
1726*4a711beaSLionel Sambuc<p>Prepare to write compressed data to file handle
1727*4a711beaSLionel Sambuc<code class="computeroutput">f</code>.
1728*4a711beaSLionel Sambuc<code class="computeroutput">f</code> should refer to a file which
1729*4a711beaSLionel Sambuchas been opened for writing, and for which the error indicator
(<code class="computeroutput">ferror(f)</code>) is not set.</p>
1731*4a711beaSLionel Sambuc<p>For the meaning of parameters
1732*4a711beaSLionel Sambuc<code class="computeroutput">blockSize100k</code>,
1733*4a711beaSLionel Sambuc<code class="computeroutput">verbosity</code> and
1734*4a711beaSLionel Sambuc<code class="computeroutput">workFactor</code>, see
1735*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressInit</code>.</p>
1736*4a711beaSLionel Sambuc<p>All required memory is allocated at this stage, so if the
1737*4a711beaSLionel Sambuccall completes successfully,
1738*4a711beaSLionel Sambuc<code class="computeroutput">BZ_MEM_ERROR</code> cannot be signalled
1739*4a711beaSLionel Sambucby a subsequent call to
1740*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzWrite</code>.</p>
1741*4a711beaSLionel Sambuc<p>Possible assignments to
1742*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code>:</p>
1743*4a711beaSLionel Sambuc<pre class="programlisting">BZ_CONFIG_ERROR
1744*4a711beaSLionel Sambuc  if the library has been mis-compiled
1745*4a711beaSLionel SambucBZ_PARAM_ERROR
  if f is NULL
  or blockSize100k &lt; 1 or blockSize100k &gt; 9
  or verbosity &lt; 0 or verbosity &gt; 4
  or workFactor &lt; 0 or workFactor &gt; 250
1748*4a711beaSLionel SambucBZ_IO_ERROR
1749*4a711beaSLionel Sambuc  if ferror(f) is nonzero
1750*4a711beaSLionel SambucBZ_MEM_ERROR
1751*4a711beaSLionel Sambuc  if insufficient memory is available
1752*4a711beaSLionel SambucBZ_OK
1753*4a711beaSLionel Sambuc  otherwise</pre>
1754*4a711beaSLionel Sambuc<p>Possible return values:</p>
1755*4a711beaSLionel Sambuc<pre class="programlisting">Pointer to an abstract BZFILE
1756*4a711beaSLionel Sambuc  if bzerror is BZ_OK
1757*4a711beaSLionel SambucNULL
1758*4a711beaSLionel Sambuc  otherwise</pre>
1759*4a711beaSLionel Sambuc<p>Allowable next actions:</p>
1760*4a711beaSLionel Sambuc<pre class="programlisting">BZ2_bzWrite
1761*4a711beaSLionel Sambuc  if bzerror is BZ_OK
1762*4a711beaSLionel Sambuc  (you could go directly to BZ2_bzWriteClose, but this would be pretty pointless)
1763*4a711beaSLionel SambucBZ2_bzWriteClose
1764*4a711beaSLionel Sambuc  otherwise</pre>
1765*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.4.6.&nbsp;BZ2_bzWrite">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzwrite"></a>3.4.6.&nbsp;BZ2_bzWrite</h3></div></div></div>
1769*4a711beaSLionel Sambuc<pre class="programlisting">void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len );</pre>
1770*4a711beaSLionel Sambuc<p>Absorbs <code class="computeroutput">len</code> bytes from the
1771*4a711beaSLionel Sambucbuffer <code class="computeroutput">buf</code>, eventually to be
1772*4a711beaSLionel Sambuccompressed and written to the file.</p>
1773*4a711beaSLionel Sambuc<p>Possible assignments to
1774*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code>:</p>
1775*4a711beaSLionel Sambuc<pre class="programlisting">BZ_PARAM_ERROR
1776*4a711beaSLionel Sambuc  if b is NULL or buf is NULL or len &lt; 0
1777*4a711beaSLionel SambucBZ_SEQUENCE_ERROR
1778*4a711beaSLionel Sambuc  if b was opened with BZ2_bzReadOpen
1779*4a711beaSLionel SambucBZ_IO_ERROR
1780*4a711beaSLionel Sambuc  if there is an error writing the compressed file.
1781*4a711beaSLionel SambucBZ_OK
1782*4a711beaSLionel Sambuc  otherwise</pre>
1783*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.4.7.&nbsp;BZ2_bzWriteClose">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzwriteclose"></a>3.4.7.&nbsp;BZ2_bzWriteClose</h3></div></div></div>
<pre class="programlisting">void BZ2_bzWriteClose( int *bzerror, BZFILE* b,
                       int abandon,
                       unsigned int* nbytes_in,
                       unsigned int* nbytes_out );

void BZ2_bzWriteClose64( int *bzerror, BZFILE* b,
                         int abandon,
                         unsigned int* nbytes_in_lo32,
                         unsigned int* nbytes_in_hi32,
                         unsigned int* nbytes_out_lo32,
                         unsigned int* nbytes_out_hi32 );</pre>
1798*4a711beaSLionel Sambuc<p>Compresses and flushes to the compressed file all data so
1799*4a711beaSLionel Sambucfar supplied by <code class="computeroutput">BZ2_bzWrite</code>.
1800*4a711beaSLionel SambucThe logical end-of-stream markers are also written, so subsequent
1801*4a711beaSLionel Sambuccalls to <code class="computeroutput">BZ2_bzWrite</code> are
1802*4a711beaSLionel Sambucillegal.  All memory associated with the compressed file
1803*4a711beaSLionel Sambuc<code class="computeroutput">b</code> is released.
1804*4a711beaSLionel Sambuc<code class="computeroutput">fflush</code> is called on the
1805*4a711beaSLionel Sambuccompressed file, but it is not
1806*4a711beaSLionel Sambuc<code class="computeroutput">fclose</code>'d.</p>
1807*4a711beaSLionel Sambuc<p>If <code class="computeroutput">BZ2_bzWriteClose</code> is
1808*4a711beaSLionel Sambuccalled to clean up after an error, the only action is to release
1809*4a711beaSLionel Sambucthe memory.  The library records the error codes issued by
1810*4a711beaSLionel Sambucprevious calls, so this situation will be detected automatically.
1811*4a711beaSLionel SambucThere is no attempt to complete the compression operation, nor to
1812*4a711beaSLionel Sambuc<code class="computeroutput">fflush</code> the compressed file.  You
1813*4a711beaSLionel Sambuccan force this behaviour to happen even in the case of no error,
1814*4a711beaSLionel Sambucby passing a nonzero value to
1815*4a711beaSLionel Sambuc<code class="computeroutput">abandon</code>.</p>
<p>If <code class="computeroutput">nbytes_in</code> is non-null,
<code class="computeroutput">*nbytes_in</code> will be set to the
total volume of uncompressed data handled.  Similarly, if
<code class="computeroutput">nbytes_out</code> is non-null,
<code class="computeroutput">*nbytes_out</code> will be set to the
total volume of compressed data written.  For compatibility with
older versions of the library,
<code class="computeroutput">BZ2_bzWriteClose</code> only yields the
lower 32 bits of these counts.  Use
<code class="computeroutput">BZ2_bzWriteClose64</code> if you want
the full 64-bit counts.  These two functions are otherwise
identical.</p>
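<p>The two halves reported by
<code class="computeroutput">BZ2_bzWriteClose64</code> can be joined
into a single 64-bit value.  The following is a minimal sketch, not
part of the library: <code class="computeroutput">bz_count64</code>
is a hypothetical helper name, and the code assumes a C99 compiler
that provides <code class="computeroutput">uint64_t</code>.</p>

```c
#include <stdint.h>

/* Join the low and high 32-bit halves of a byte count, as reported by
   BZ2_bzWriteClose64, into one 64-bit value.
   bz_count64 is a hypothetical helper, not a libbzip2 function. */
uint64_t bz_count64(unsigned int lo32, unsigned int hi32)
{
    /* Mask the low half in case unsigned int is wider than 32 bits. */
    return ((uint64_t)hi32 << 32) | ((uint64_t)lo32 & 0xFFFFFFFFu);
}

/* Typical use, after the last BZ2_bzWrite:
 *
 *   unsigned int in_lo, in_hi, out_lo, out_hi;
 *   BZ2_bzWriteClose64(&bzerror, b, 0, &in_lo, &in_hi, &out_lo, &out_hi);
 *   uint64_t total_in  = bz_count64(in_lo,  in_hi);
 *   uint64_t total_out = bz_count64(out_lo, out_hi);
 */
```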
1827*4a711beaSLionel Sambuc<p>Possible assignments to
1828*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code>:</p>
1829*4a711beaSLionel Sambuc<pre class="programlisting">BZ_SEQUENCE_ERROR
1830*4a711beaSLionel Sambuc  if b was opened with BZ2_bzReadOpen
1831*4a711beaSLionel SambucBZ_IO_ERROR
1832*4a711beaSLionel Sambuc  if there is an error writing the compressed file
1833*4a711beaSLionel SambucBZ_OK
1834*4a711beaSLionel Sambuc  otherwise</pre>
1835*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.4.8.&nbsp;Handling embedded compressed data streams">
<div class="titlepage"><div><div><h3 class="title">
<a name="embed"></a>3.4.8.&nbsp;Handling embedded compressed data streams</h3></div></div></div>
1839*4a711beaSLionel Sambuc<p>The high-level library facilitates use of
1840*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> data streams which form
1841*4a711beaSLionel Sambucsome part of a surrounding, larger data stream.</p>
1842*4a711beaSLionel Sambuc<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
1843*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>For writing, the library takes an open file handle,
1844*4a711beaSLionel Sambuc  writes compressed data to it,
1845*4a711beaSLionel Sambuc  <code class="computeroutput">fflush</code>es it but does not
1846*4a711beaSLionel Sambuc  <code class="computeroutput">fclose</code> it.  The calling
1847*4a711beaSLionel Sambuc  application can write its own data before and after the
1848*4a711beaSLionel Sambuc  compressed data stream, using that same file handle.</p></li>
1849*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>Reading is more complex, and the facilities are not as
1850*4a711beaSLionel Sambuc  general as they could be since generality is hard to reconcile
1851*4a711beaSLionel Sambuc  with efficiency.  <code class="computeroutput">BZ2_bzRead</code>
1852*4a711beaSLionel Sambuc  reads from the compressed file in blocks of size
1853*4a711beaSLionel Sambuc  <code class="computeroutput">BZ_MAX_UNUSED</code> bytes, and in
1854*4a711beaSLionel Sambuc  doing so probably will overshoot the logical end of compressed
1855*4a711beaSLionel Sambuc  stream.  To recover this data once decompression has ended,
1856*4a711beaSLionel Sambuc  call <code class="computeroutput">BZ2_bzReadGetUnused</code> after
1857*4a711beaSLionel Sambuc  the last call of <code class="computeroutput">BZ2_bzRead</code>
1858*4a711beaSLionel Sambuc  (the one returning
1859*4a711beaSLionel Sambuc  <code class="computeroutput">BZ_STREAM_END</code>) but before
1860*4a711beaSLionel Sambuc  calling
1861*4a711beaSLionel Sambuc  <code class="computeroutput">BZ2_bzReadClose</code>.</p></li>
1862*4a711beaSLionel Sambuc</ul></div>
1863*4a711beaSLionel Sambuc<p>This mechanism makes it easy to decompress multiple
1864*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> streams placed end-to-end.
At the end of one stream, when
1866*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzRead</code> returns
1867*4a711beaSLionel Sambuc<code class="computeroutput">BZ_STREAM_END</code>, call
1868*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadGetUnused</code> to collect
1869*4a711beaSLionel Sambucthe unused data (copy it into your own buffer somewhere).  That
1870*4a711beaSLionel Sambucdata forms the start of the next compressed stream.  To start
1871*4a711beaSLionel Sambucuncompressing that next stream, call
1872*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadOpen</code> again, feeding in
1873*4a711beaSLionel Sambucthe unused data via the <code class="computeroutput">unused</code> /
1874*4a711beaSLionel Sambuc<code class="computeroutput">nUnused</code> parameters.  Keep doing
1875*4a711beaSLionel Sambucthis until <code class="computeroutput">BZ_STREAM_END</code> return
1876*4a711beaSLionel Sambuccoincides with the physical end of file
1877*4a711beaSLionel Sambuc(<code class="computeroutput">feof(f)</code>).  In this situation
1878*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadGetUnused</code> will of
1879*4a711beaSLionel Sambuccourse return no data.</p>
1880*4a711beaSLionel Sambuc<p>This should give some feel for how the high-level interface
1881*4a711beaSLionel Sambuccan be used.  If you require extra flexibility, you'll have to
1882*4a711beaSLionel Sambucbite the bullet and get to grips with the low-level
1883*4a711beaSLionel Sambucinterface.</p>
1884*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.4.9.&nbsp;Standard file-reading/writing code">
<div class="titlepage"><div><div><h3 class="title">
<a name="std-rdwr"></a>3.4.9.&nbsp;Standard file-reading/writing code</h3></div></div></div>
1888*4a711beaSLionel Sambuc<p>Here's how you'd write data to a compressed file:</p>
<pre class="programlisting">FILE*   f;
BZFILE* b;
int     nBuf;
char    buf[ /* whatever size you like */ ];
int     bzerror;

f = fopen ( "myfile.bz2", "wb" );
if ( !f ) {
 /* handle error */
}
/* block size 9, verbosity 0, default work factor (0) */
b = BZ2_bzWriteOpen( &amp;bzerror, f, 9, 0, 0 );
if (bzerror != BZ_OK) {
 /* abandon = 1: just release memory, no byte counts wanted */
 BZ2_bzWriteClose ( &amp;bzerror, b, 1, NULL, NULL );
 /* handle error */
}

while ( /* condition */ ) {
 /* get data to write into buf, and set nBuf appropriately */
 BZ2_bzWrite ( &amp;bzerror, b, buf, nBuf );   /* returns void */
 if (bzerror == BZ_IO_ERROR) {
   BZ2_bzWriteClose ( &amp;bzerror, b, 1, NULL, NULL );
   /* handle error */
 }
}

/* abandon = 0: finish the stream and fflush the file */
BZ2_bzWriteClose( &amp;bzerror, b, 0, NULL, NULL );
if (bzerror == BZ_IO_ERROR) {
 /* handle error */
}
/* f is still open; fclose it yourself if appropriate */</pre>
1919*4a711beaSLionel Sambuc<p>And to read from a compressed file:</p>
<pre class="programlisting">FILE*   f;
BZFILE* b;
int     nBuf;
char    buf[ /* whatever size you like */ ];
int     bzerror;

f = fopen ( "myfile.bz2", "rb" );
if ( !f ) {
  /* handle error */
}
/* verbosity 0, small 0, no unused data from an earlier stream */
b = BZ2_bzReadOpen ( &amp;bzerror, f, 0, 0, NULL, 0 );
if ( bzerror != BZ_OK ) {
  BZ2_bzReadClose ( &amp;bzerror, b );
  /* handle error */
}

bzerror = BZ_OK;
while ( bzerror == BZ_OK &amp;&amp; /* arbitrary other conditions */) {
  nBuf = BZ2_bzRead ( &amp;bzerror, b, buf, /* size of buf */ );
  /* the final call, signalling BZ_STREAM_END, may also return data */
  if ( bzerror == BZ_OK || bzerror == BZ_STREAM_END ) {
    /* do something with buf[0 .. nBuf-1] */
  }
}
if ( bzerror != BZ_STREAM_END ) {
   BZ2_bzReadClose ( &amp;bzerror, b );
   /* handle error */
} else {
   BZ2_bzReadClose ( &amp;bzerror, b );
}
/* f is still open; fclose it yourself if appropriate */</pre>
1950*4a711beaSLionel Sambuc</div>
1951*4a711beaSLionel Sambuc</div>
<div class="sect1" title="3.5.&nbsp;Utility functions">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="util-fns"></a>3.5.&nbsp;Utility functions</h2></div></div></div>
<div class="sect2" title="3.5.1.&nbsp;BZ2_bzBuffToBuffCompress">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzbufftobuffcompress"></a>3.5.1.&nbsp;BZ2_bzBuffToBuffCompress</h3></div></div></div>
1958*4a711beaSLionel Sambuc<pre class="programlisting">int BZ2_bzBuffToBuffCompress( char*         dest,
1959*4a711beaSLionel Sambuc                              unsigned int* destLen,
1960*4a711beaSLionel Sambuc                              char*         source,
1961*4a711beaSLionel Sambuc                              unsigned int  sourceLen,
1962*4a711beaSLionel Sambuc                              int           blockSize100k,
1963*4a711beaSLionel Sambuc                              int           verbosity,
1964*4a711beaSLionel Sambuc                              int           workFactor );</pre>
1965*4a711beaSLionel Sambuc<p>Attempts to compress the data in <code class="computeroutput">source[0
1966*4a711beaSLionel Sambuc.. sourceLen-1]</code> into the destination buffer,
1967*4a711beaSLionel Sambuc<code class="computeroutput">dest[0 .. *destLen-1]</code>.  If the
1968*4a711beaSLionel Sambucdestination buffer is big enough,
1969*4a711beaSLionel Sambuc<code class="computeroutput">*destLen</code> is set to the size of
1970*4a711beaSLionel Sambucthe compressed data, and <code class="computeroutput">BZ_OK</code>
1971*4a711beaSLionel Sambucis returned.  If the compressed data won't fit,
1972*4a711beaSLionel Sambuc<code class="computeroutput">*destLen</code> is unchanged, and
1973*4a711beaSLionel Sambuc<code class="computeroutput">BZ_OUTBUFF_FULL</code> is
1974*4a711beaSLionel Sambucreturned.</p>
1975*4a711beaSLionel Sambuc<p>Compression in this manner is a one-shot event, done with a
1976*4a711beaSLionel Sambucsingle call to this function.  The resulting compressed data is a
1977*4a711beaSLionel Sambuccomplete <code class="computeroutput">bzip2</code> format data
1978*4a711beaSLionel Sambucstream.  There is no mechanism for making additional calls to
1979*4a711beaSLionel Sambucprovide extra input data.  If you want that kind of mechanism,
1980*4a711beaSLionel Sambucuse the low-level interface.</p>
1981*4a711beaSLionel Sambuc<p>For the meaning of parameters
1982*4a711beaSLionel Sambuc<code class="computeroutput">blockSize100k</code>,
1983*4a711beaSLionel Sambuc<code class="computeroutput">verbosity</code> and
1984*4a711beaSLionel Sambuc<code class="computeroutput">workFactor</code>, see
1985*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressInit</code>.</p>
1986*4a711beaSLionel Sambuc<p>To guarantee that the compressed data will fit in its
1987*4a711beaSLionel Sambucbuffer, allocate an output buffer of size 1% larger than the
1988*4a711beaSLionel Sambucuncompressed data, plus six hundred extra bytes.</p>
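<p>The sizing rule above can be written as a small helper.  This is
a minimal sketch:
<code class="computeroutput">bz_dest_bound</code> is a hypothetical
name, not a library function, and rounding the 1% upwards and
returning 0 on overflow are choices made for this example.</p>

```c
#include <limits.h>

/* Worst-case output buffer size for BZ2_bzBuffToBuffCompress, following
   the manual's rule: 1% larger than the input, plus 600 bytes.
   Returns 0 if the result does not fit in an unsigned int.
   bz_dest_bound is a hypothetical helper, not a libbzip2 function. */
unsigned int bz_dest_bound(unsigned int sourceLen)
{
    /* sourceLen / 100 truncates, so add 1 to round the 1% upwards. */
    unsigned int extra = sourceLen / 100 + 1 + 600;

    if (sourceLen > UINT_MAX - extra)
        return 0;               /* would overflow unsigned int */
    return sourceLen + extra;
}
```

<p>A caller would allocate that many bytes for
<code class="computeroutput">dest</code> and pass the same value in
<code class="computeroutput">*destLen</code>.</p>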
<p><code class="computeroutput">BZ2_bzBuffToBuffCompress</code>
1990*4a711beaSLionel Sambucwill not write data at or beyond
1991*4a711beaSLionel Sambuc<code class="computeroutput">dest[*destLen]</code>, even in case of
1992*4a711beaSLionel Sambucbuffer overflow.</p>
1993*4a711beaSLionel Sambuc<p>Possible return values:</p>
1994*4a711beaSLionel Sambuc<pre class="programlisting">BZ_CONFIG_ERROR
1995*4a711beaSLionel Sambuc  if the library has been mis-compiled
1996*4a711beaSLionel SambucBZ_PARAM_ERROR
1997*4a711beaSLionel Sambuc  if dest is NULL or destLen is NULL
1998*4a711beaSLionel Sambuc  or blockSize100k &lt; 1 or blockSize100k &gt; 9
1999*4a711beaSLionel Sambuc  or verbosity &lt; 0 or verbosity &gt; 4
2000*4a711beaSLionel Sambuc  or workFactor &lt; 0 or workFactor &gt; 250
2001*4a711beaSLionel SambucBZ_MEM_ERROR
2002*4a711beaSLionel Sambuc  if insufficient memory is available
2003*4a711beaSLionel SambucBZ_OUTBUFF_FULL
2004*4a711beaSLionel Sambuc  if the size of the compressed data exceeds *destLen
2005*4a711beaSLionel SambucBZ_OK
2006*4a711beaSLionel Sambuc  otherwise</pre>
2007*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.5.2.&nbsp;BZ2_bzBuffToBuffDecompress">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzbufftobuffdecompress"></a>3.5.2.&nbsp;BZ2_bzBuffToBuffDecompress</h3></div></div></div>
2011*4a711beaSLionel Sambuc<pre class="programlisting">int BZ2_bzBuffToBuffDecompress( char*         dest,
2012*4a711beaSLionel Sambuc                                unsigned int* destLen,
2013*4a711beaSLionel Sambuc                                char*         source,
2014*4a711beaSLionel Sambuc                                unsigned int  sourceLen,
2015*4a711beaSLionel Sambuc                                int           small,
2016*4a711beaSLionel Sambuc                                int           verbosity );</pre>
2017*4a711beaSLionel Sambuc<p>Attempts to decompress the data in <code class="computeroutput">source[0
2018*4a711beaSLionel Sambuc.. sourceLen-1]</code> into the destination buffer,
2019*4a711beaSLionel Sambuc<code class="computeroutput">dest[0 .. *destLen-1]</code>.  If the
2020*4a711beaSLionel Sambucdestination buffer is big enough,
2021*4a711beaSLionel Sambuc<code class="computeroutput">*destLen</code> is set to the size of
2022*4a711beaSLionel Sambucthe uncompressed data, and <code class="computeroutput">BZ_OK</code>
is returned.  If the uncompressed data won't fit,
2024*4a711beaSLionel Sambuc<code class="computeroutput">*destLen</code> is unchanged, and
2025*4a711beaSLionel Sambuc<code class="computeroutput">BZ_OUTBUFF_FULL</code> is
2026*4a711beaSLionel Sambucreturned.</p>
2027*4a711beaSLionel Sambuc<p><code class="computeroutput">source</code> is assumed to hold
2028*4a711beaSLionel Sambuca complete <code class="computeroutput">bzip2</code> format data
2029*4a711beaSLionel Sambucstream.
2030*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzBuffToBuffDecompress</code> tries
2031*4a711beaSLionel Sambucto decompress the entirety of the stream into the output
2032*4a711beaSLionel Sambucbuffer.</p>
2033*4a711beaSLionel Sambuc<p>For the meaning of parameters
2034*4a711beaSLionel Sambuc<code class="computeroutput">small</code> and
2035*4a711beaSLionel Sambuc<code class="computeroutput">verbosity</code>, see
2036*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzDecompressInit</code>.</p>
2037*4a711beaSLionel Sambuc<p>Because the compression ratio of the compressed data cannot
2038*4a711beaSLionel Sambucbe known in advance, there is no easy way to guarantee that the
2039*4a711beaSLionel Sambucoutput buffer will be big enough.  You may of course make
2040*4a711beaSLionel Sambucarrangements in your code to record the size of the uncompressed
2041*4a711beaSLionel Sambucdata, but such a mechanism is beyond the scope of this
2042*4a711beaSLionel Sambuclibrary.</p>
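<p>One possible arrangement is to store the uncompressed length in a
small header in front of the compressed data, so that the reader can
size its output buffer before calling
<code class="computeroutput">BZ2_bzBuffToBuffDecompress</code>.  The
sketch below is an assumption made for illustration: the 4-byte
little-endian header and the names
<code class="computeroutput">frame_put_len</code> and
<code class="computeroutput">frame_get_len</code> are hypothetical
and are not part of the <code class="computeroutput">bzip2</code>
format or the library.</p>

```c
#include <stdint.h>

/* Size in bytes of the hypothetical length header that precedes the
   compressed data in this example's framing. */
enum { FRAME_HDR_LEN = 4 };

/* Store n as four little-endian bytes, so the header reads the same on
   any host regardless of its byte order. */
void frame_put_len(unsigned char hdr[FRAME_HDR_LEN], uint32_t n)
{
    hdr[0] = (unsigned char)(n & 0xFF);
    hdr[1] = (unsigned char)((n >> 8) & 0xFF);
    hdr[2] = (unsigned char)((n >> 16) & 0xFF);
    hdr[3] = (unsigned char)((n >> 24) & 0xFF);
}

/* Read back the length written by frame_put_len. */
uint32_t frame_get_len(const unsigned char hdr[FRAME_HDR_LEN])
{
    return  (uint32_t)hdr[0]
         | ((uint32_t)hdr[1] << 8)
         | ((uint32_t)hdr[2] << 16)
         | ((uint32_t)hdr[3] << 24);
}
```

<p>If the header comes from an untrusted source, check the recorded
length against a sensible upper limit before allocating the
buffer.</p>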
2043*4a711beaSLionel Sambuc<p><code class="computeroutput">BZ2_bzBuffToBuffDecompress</code>
2044*4a711beaSLionel Sambucwill not write data at or beyond
2045*4a711beaSLionel Sambuc<code class="computeroutput">dest[*destLen]</code>, even in case of
2046*4a711beaSLionel Sambucbuffer overflow.</p>
2047*4a711beaSLionel Sambuc<p>Possible return values:</p>
2048*4a711beaSLionel Sambuc<pre class="programlisting">BZ_CONFIG_ERROR
2049*4a711beaSLionel Sambuc  if the library has been mis-compiled
2050*4a711beaSLionel SambucBZ_PARAM_ERROR
2051*4a711beaSLionel Sambuc  if dest is NULL or destLen is NULL
2052*4a711beaSLionel Sambuc  or small != 0 &amp;&amp; small != 1
2053*4a711beaSLionel Sambuc  or verbosity &lt; 0 or verbosity &gt; 4
2054*4a711beaSLionel SambucBZ_MEM_ERROR
2055*4a711beaSLionel Sambuc  if insufficient memory is available
2056*4a711beaSLionel SambucBZ_OUTBUFF_FULL
  if the size of the uncompressed data exceeds *destLen
2058*4a711beaSLionel SambucBZ_DATA_ERROR
2059*4a711beaSLionel Sambuc  if a data integrity error was detected in the compressed data
2060*4a711beaSLionel SambucBZ_DATA_ERROR_MAGIC
2061*4a711beaSLionel Sambuc  if the compressed data doesn't begin with the right magic bytes
2062*4a711beaSLionel SambucBZ_UNEXPECTED_EOF
2063*4a711beaSLionel Sambuc  if the compressed data ends unexpectedly
2064*4a711beaSLionel SambucBZ_OK
2065*4a711beaSLionel Sambuc  otherwise</pre>
2066*4a711beaSLionel Sambuc</div>
2067*4a711beaSLionel Sambuc</div>
<div class="sect1" title="3.6.&nbsp;zlib compatibility functions">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="zlib-compat"></a>3.6.&nbsp;zlib compatibility functions</h2></div></div></div>
2071*4a711beaSLionel Sambuc<p>Yoshioka Tsuneo has contributed some functions to give
2072*4a711beaSLionel Sambucbetter <code class="computeroutput">zlib</code> compatibility.
2073*4a711beaSLionel SambucThese functions are <code class="computeroutput">BZ2_bzopen</code>,
2074*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzread</code>,
2075*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzwrite</code>,
2076*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzflush</code>,
2077*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzclose</code>,
2078*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzerror</code> and
2079*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzlibVersion</code>.  These
2080*4a711beaSLionel Sambucfunctions are not (yet) officially part of the library.  If they
2081*4a711beaSLionel Sambucbreak, you get to keep all the pieces.  Nevertheless, I think
2082*4a711beaSLionel Sambucthey work ok.</p>
2083*4a711beaSLionel Sambuc<pre class="programlisting">typedef void BZFILE;
2084*4a711beaSLionel Sambuc
2085*4a711beaSLionel Sambucconst char * BZ2_bzlibVersion ( void );</pre>
2086*4a711beaSLionel Sambuc<p>Returns a string indicating the library version.</p>
2087*4a711beaSLionel Sambuc<pre class="programlisting">BZFILE * BZ2_bzopen  ( const char *path, const char *mode );
2088*4a711beaSLionel SambucBZFILE * BZ2_bzdopen ( int        fd,    const char *mode );</pre>
2089*4a711beaSLionel Sambuc<p>Opens a <code class="computeroutput">.bz2</code> file for
2090*4a711beaSLionel Sambucreading or writing, using either its name or a pre-existing file
2091*4a711beaSLionel Sambucdescriptor.  Analogous to <code class="computeroutput">fopen</code>
2092*4a711beaSLionel Sambucand <code class="computeroutput">fdopen</code>.</p>
2093*4a711beaSLionel Sambuc<pre class="programlisting">int BZ2_bzread  ( BZFILE* b, void* buf, int len );
2094*4a711beaSLionel Sambucint BZ2_bzwrite ( BZFILE* b, void* buf, int len );</pre>
2095*4a711beaSLionel Sambuc<p>Reads/writes data from/to a previously opened
2096*4a711beaSLionel Sambuc<code class="computeroutput">BZFILE</code>.  Analogous to
2097*4a711beaSLionel Sambuc<code class="computeroutput">fread</code> and
2098*4a711beaSLionel Sambuc<code class="computeroutput">fwrite</code>.</p>
2099*4a711beaSLionel Sambuc<pre class="programlisting">int  BZ2_bzflush ( BZFILE* b );
2100*4a711beaSLionel Sambucvoid BZ2_bzclose ( BZFILE* b );</pre>
2101*4a711beaSLionel Sambuc<p>Flushes/closes a <code class="computeroutput">BZFILE</code>.
2102*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzflush</code> doesn't actually do
2103*4a711beaSLionel Sambucanything.  Analogous to <code class="computeroutput">fflush</code>
2104*4a711beaSLionel Sambucand <code class="computeroutput">fclose</code>.</p>
<pre class="programlisting">const char * BZ2_bzerror ( BZFILE *b, int *errnum );</pre>
<p>Returns a string describing the most recent error status of
2107*4a711beaSLionel Sambuc<code class="computeroutput">b</code>, and also sets
2108*4a711beaSLionel Sambuc<code class="computeroutput">*errnum</code> to its numerical
2109*4a711beaSLionel Sambucvalue.</p>
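<p>As a rough illustration of how these calls fit together, the
sketch below writes a string to a <code class="computeroutput">.bz2</code>
file and then counts the bytes it decompresses to.  The helper names
<code class="computeroutput">bz_write_text</code> and
<code class="computeroutput">bz_count_bytes</code> are made up for this
example and are not part of the library.  It assumes that
<code class="computeroutput">BZ2_bzread</code> returns zero at end of
stream and a negative value on error, and that
<code class="computeroutput">BZ2_bzopen</code> returns
<code class="computeroutput">NULL</code> on failure.</p>
<pre class="programlisting">#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;bzlib.h&gt;

/* Hypothetical helper: write `text` to a new .bz2 file at `path`.
   Returns 0 on success, -1 on failure. */
int bz_write_text ( const char *path, const char *text )
{
   BZFILE *b = BZ2_bzopen ( path, "w" );
   int len = (int)strlen ( text );
   int written;
   if (b == NULL) return -1;
   /* BZ2_bzwrite takes a non-const buffer, hence the cast. */
   written = BZ2_bzwrite ( b, (void *)text, len );
   BZ2_bzclose ( b );
   return written == len ? 0 : -1;
}

/* Hypothetical helper: decompress the file at `path` and return the
   number of bytes it holds, or -1 on error. */
long bz_count_bytes ( const char *path )
{
   char buf[4096];
   long total = 0;
   int n;
   BZFILE *b = BZ2_bzopen ( path, "r" );
   if (b == NULL) return -1;
   while ((n = BZ2_bzread ( b, buf, (int)sizeof buf )) &gt; 0)
      total += n;
   if (n &lt; 0) {
      int errnum;
      const char *msg = BZ2_bzerror ( b, &amp;errnum );
      fprintf ( stderr, "%s: %s (code %d)\n", path, msg, errnum );
      total = -1;
   }
   BZ2_bzclose ( b );
   return total;
}</pre>
<p>A program using these helpers has to be linked against the library,
typically with <code class="computeroutput">-lbz2</code>.</p>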
2110*4a711beaSLionel Sambuc</div>
<div class="sect1" title="3.7.&nbsp;Using the library in a stdio-free environment">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="stdio-free"></a>3.7.&nbsp;Using the library in a stdio-free environment</h2></div></div></div>
<div class="sect2" title="3.7.1.&nbsp;Getting rid of stdio">
<div class="titlepage"><div><div><h3 class="title">
<a name="stdio-bye"></a>3.7.1.&nbsp;Getting rid of stdio</h3></div></div></div>
2117*4a711beaSLionel Sambuc<p>In a deeply embedded application, you might want to use
2118*4a711beaSLionel Sambucjust the memory-to-memory functions.  You can do this
2119*4a711beaSLionel Sambucconveniently by compiling the library with preprocessor symbol
2120*4a711beaSLionel Sambuc<code class="computeroutput">BZ_NO_STDIO</code> defined.  Doing this
2121*4a711beaSLionel Sambucgives you a library containing only the following eight
2122*4a711beaSLionel Sambucfunctions:</p>
<p><code class="computeroutput">BZ2_bzCompressInit</code>,
<code class="computeroutput">BZ2_bzCompress</code>,
<code class="computeroutput">BZ2_bzCompressEnd</code>,
<code class="computeroutput">BZ2_bzDecompressInit</code>,
<code class="computeroutput">BZ2_bzDecompress</code>,
<code class="computeroutput">BZ2_bzDecompressEnd</code>,
<code class="computeroutput">BZ2_bzBuffToBuffCompress</code> and
<code class="computeroutput">BZ2_bzBuffToBuffDecompress</code>.</p>
2131*4a711beaSLionel Sambuc<p>When compiled like this, all functions will ignore
2132*4a711beaSLionel Sambuc<code class="computeroutput">verbosity</code> settings.</p>
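<p>A minimal sketch of memory-to-memory use, which needs nothing from
<code class="computeroutput">stdio</code>, might look like the
following.  The function name
<code class="computeroutput">bz_roundtrip_ok</code> and the fixed
buffer sizes are assumptions made for this example.  The compressed
buffer is sized generously, following the guidance for
<code class="computeroutput">BZ2_bzBuffToBuffCompress</code> that the
destination should be 1% larger than the input plus 600 bytes.</p>
<pre class="programlisting">#include &lt;string.h&gt;
#include &lt;bzlib.h&gt;

/* Hypothetical helper: compress `src`, decompress the result again,
   and return 1 if the bytes survive the round trip, 0 otherwise. */
int bz_roundtrip_ok ( const char *src, unsigned int srcLen )
{
   char packed[1024];     /* room for 256 bytes * 1.01 + 600 */
   char unpacked[256];
   unsigned int packedLen   = sizeof packed;
   unsigned int unpackedLen = sizeof unpacked;

   if (srcLen &gt; sizeof unpacked) return 0;

   /* blockSize100k = 1, verbosity = 0, workFactor = 0 (default) */
   if (BZ2_bzBuffToBuffCompress ( packed, &amp;packedLen,
                                  (char *)src, srcLen,
                                  1, 0, 0 ) != BZ_OK)
      return 0;

   /* small = 0, verbosity = 0 */
   if (BZ2_bzBuffToBuffDecompress ( unpacked, &amp;unpackedLen,
                                    packed, packedLen,
                                    0, 0 ) != BZ_OK)
      return 0;

   return unpackedLen == srcLen
          &amp;&amp; memcmp ( src, unpacked, srcLen ) == 0;
}</pre>
<p>When linking against a library built with
<code class="computeroutput">BZ_NO_STDIO</code>, the program must also
supply <code class="computeroutput">bz_internal_error</code>; the
sketch leaves that out.</p>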
2133*4a711beaSLionel Sambuc</div>
<div class="sect2" title="3.7.2.&nbsp;Critical error handling">
<div class="titlepage"><div><div><h3 class="title">
<a name="critical-error"></a>3.7.2.&nbsp;Critical error handling</h3></div></div></div>
2137*4a711beaSLionel Sambuc<p><code class="computeroutput">libbzip2</code> contains a number
2138*4a711beaSLionel Sambucof internal assertion checks which should, needless to say, never
2139*4a711beaSLionel Sambucbe activated.  Nevertheless, if an assertion should fail,
2140*4a711beaSLionel Sambucbehaviour depends on whether or not the library was compiled with
2141*4a711beaSLionel Sambuc<code class="computeroutput">BZ_NO_STDIO</code> set.</p>
2142*4a711beaSLionel Sambuc<p>For a normal compile, an assertion failure yields the
2143*4a711beaSLionel Sambucmessage:</p>
2144*4a711beaSLionel Sambuc<div class="blockquote"><blockquote class="blockquote">
2145*4a711beaSLionel Sambuc<p>bzip2/libbzip2: internal error number N.</p>
2146*4a711beaSLionel Sambuc<p>This is a bug in bzip2/libbzip2, 1.0.6 of 6 September 2010.
2147*4a711beaSLionel SambucPlease report it to me at: jseward@bzip.org.  If this happened
2148*4a711beaSLionel Sambucwhen you were using some program which uses libbzip2 as a
2149*4a711beaSLionel Sambuccomponent, you should also report this bug to the author(s)
2150*4a711beaSLionel Sambucof that program.  Please make an effort to report this bug;
2151*4a711beaSLionel Sambuctimely and accurate bug reports eventually lead to higher
2152*4a711beaSLionel Sambucquality software.  Thanks.  Julian Seward, 6 September 2010.
2153*4a711beaSLionel Sambuc</p>
2154*4a711beaSLionel Sambuc</blockquote></div>
2155*4a711beaSLionel Sambuc<p>where <code class="computeroutput">N</code> is some error code
2156*4a711beaSLionel Sambucnumber.  If <code class="computeroutput">N == 1007</code>, it also
2157*4a711beaSLionel Sambucprints some extra text advising the reader that unreliable memory
is often associated with internal error 1007.  (This was a
frequently observed phenomenon with versions 1.0.0 and 1.0.1.)</p>
2160*4a711beaSLionel Sambuc<p><code class="computeroutput">exit(3)</code> is then
2161*4a711beaSLionel Sambuccalled.</p>
2162*4a711beaSLionel Sambuc<p>For a <code class="computeroutput">stdio</code>-free library,
2163*4a711beaSLionel Sambucassertion failures result in a call to a function declared
2164*4a711beaSLionel Sambucas:</p>
2165*4a711beaSLionel Sambuc<pre class="programlisting">extern void bz_internal_error ( int errcode );</pre>
2166*4a711beaSLionel Sambuc<p>The relevant code is passed as a parameter.  You should
2167*4a711beaSLionel Sambucsupply such a function.</p>
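<p>One possible shape for such a function is sketched below.  It
records the code and unwinds to a recovery point chosen by the
application instead of returning into the library.  The variables
<code class="computeroutput">bz_panic_env</code> and
<code class="computeroutput">bz_last_internal_error</code> are
hypothetical application state, not part of
<code class="computeroutput">libbzip2</code>, and the use of
<code class="computeroutput">setjmp</code>/<code class="computeroutput">longjmp</code>
assumes your environment provides them.</p>
<pre class="programlisting">#include &lt;setjmp.h&gt;

/* Hypothetical application state; not part of libbzip2. */
jmp_buf      bz_panic_env;            /* recovery point set by the caller */
volatile int bz_last_internal_error;  /* code from the latest failure     */

/* Called by a BZ_NO_STDIO build of the library when an internal
   assertion fails. */
void bz_internal_error ( int errcode )
{
   bz_last_internal_error = errcode;
   /* Do not return into the library: jump back to the caller's
      recovery point instead. */
   longjmp ( bz_panic_env, 1 );
}</pre>
<p>The application would call
<code class="computeroutput">setjmp ( bz_panic_env )</code> before
entering the library and treat a non-zero return as a fatal library
error.  On a target without
<code class="computeroutput">setjmp</code>, logging the code and
halting or resetting is a reasonable alternative.</p>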
2168*4a711beaSLionel Sambuc<p>In either case, once an assertion failure has occurred, any
2169*4a711beaSLionel Sambuc<code class="computeroutput">bz_stream</code> records involved can
2170*4a711beaSLionel Sambucbe regarded as invalid.  You should not attempt to resume normal
2171*4a711beaSLionel Sambucoperation with them.</p>
2172*4a711beaSLionel Sambuc<p>You may, of course, change critical error handling to suit
2173*4a711beaSLionel Sambucyour needs.  As I said above, critical errors indicate bugs in
2174*4a711beaSLionel Sambucthe library and should not occur.  All "normal" error situations
2175*4a711beaSLionel Sambucare indicated via error return codes from functions, and can be
2176*4a711beaSLionel Sambucrecovered from.</p>
2177*4a711beaSLionel Sambuc</div>
2178*4a711beaSLionel Sambuc</div>
<div class="sect1" title="3.8.&nbsp;Making a Windows DLL">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="win-dll"></a>3.8.&nbsp;Making a Windows DLL</h2></div></div></div>
2182*4a711beaSLionel Sambuc<p>Everything related to Windows has been contributed by
2183*4a711beaSLionel SambucYoshioka Tsuneo
2184*4a711beaSLionel Sambuc(<code class="computeroutput">tsuneo@rr.iij4u.or.jp</code>), so
2185*4a711beaSLionel Sambucyou should send your queries to him (but perhaps Cc: me,
2186*4a711beaSLionel Sambuc<code class="computeroutput">jseward@bzip.org</code>).</p>
2187*4a711beaSLionel Sambuc<p>My vague understanding of what to do is: using Visual C++
2188*4a711beaSLionel Sambuc5.0, open the project file
2189*4a711beaSLionel Sambuc<code class="computeroutput">libbz2.dsp</code>, and build.  That's
2190*4a711beaSLionel Sambucall.</p>
2191*4a711beaSLionel Sambuc<p>If you can't open the project file for some reason, make a
2192*4a711beaSLionel Sambucnew one, naming these files:
2193*4a711beaSLionel Sambuc<code class="computeroutput">blocksort.c</code>,
2194*4a711beaSLionel Sambuc<code class="computeroutput">bzlib.c</code>,
2195*4a711beaSLionel Sambuc<code class="computeroutput">compress.c</code>,
2196*4a711beaSLionel Sambuc<code class="computeroutput">crctable.c</code>,
2197*4a711beaSLionel Sambuc<code class="computeroutput">decompress.c</code>,
2198*4a711beaSLionel Sambuc<code class="computeroutput">huffman.c</code>,
2199*4a711beaSLionel Sambuc<code class="computeroutput">randtable.c</code> and
2200*4a711beaSLionel Sambuc<code class="computeroutput">libbz2.def</code>.  You will also need
2201*4a711beaSLionel Sambucto name the header files <code class="computeroutput">bzlib.h</code>
2202*4a711beaSLionel Sambucand <code class="computeroutput">bzlib_private.h</code>.</p>
2203*4a711beaSLionel Sambuc<p>If you don't use VC++, you may need to define the
preprocessor symbol
2205*4a711beaSLionel Sambuc<code class="computeroutput">_WIN32</code>.</p>
2206*4a711beaSLionel Sambuc<p>Finally, <code class="computeroutput">dlltest.c</code> is a
2207*4a711beaSLionel Sambucsample program using the DLL.  It has a project file,
2208*4a711beaSLionel Sambuc<code class="computeroutput">dlltest.dsp</code>.</p>
2209*4a711beaSLionel Sambuc<p>If you just want a makefile for Visual C, have a look at
2210*4a711beaSLionel Sambuc<code class="computeroutput">makefile.msc</code>.</p>
2211*4a711beaSLionel Sambuc<p>Be aware that if you compile
2212*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> itself on Win32, you must
2213*4a711beaSLionel Sambucset <code class="computeroutput">BZ_UNIX</code> to 0 and
2214*4a711beaSLionel Sambuc<code class="computeroutput">BZ_LCCWIN32</code> to 1, in the file
2215*4a711beaSLionel Sambuc<code class="computeroutput">bzip2.c</code>, before compiling.
2216*4a711beaSLionel SambucOtherwise the resulting binary won't work correctly.</p>
2217*4a711beaSLionel Sambuc<p>I haven't tried any of this stuff myself, but it all looks
2218*4a711beaSLionel Sambucplausible.</p>
2219*4a711beaSLionel Sambuc</div>
2220*4a711beaSLionel Sambuc</div>
<div class="chapter" title="4.&nbsp;Miscellanea">
<div class="titlepage"><div><div><h2 class="title">
<a name="misc"></a>4.&nbsp;Miscellanea</h2></div></div></div>
2224*4a711beaSLionel Sambuc<div class="toc">
2225*4a711beaSLionel Sambuc<p><b>Table of Contents</b></p>
2226*4a711beaSLionel Sambuc<dl>
2227*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#limits">4.1. Limitations of the compressed file format</a></span></dt>
2228*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#port-issues">4.2. Portability issues</a></span></dt>
2229*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#bugs">4.3. Reporting bugs</a></span></dt>
2230*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#package">4.4. Did you get the right package?</a></span></dt>
2231*4a711beaSLionel Sambuc<dt><span class="sect1"><a href="#reading">4.5. Further Reading</a></span></dt>
2232*4a711beaSLionel Sambuc</dl>
2233*4a711beaSLionel Sambuc</div>
2234*4a711beaSLionel Sambuc<p>These are just some random thoughts of mine.  Your mileage
2235*4a711beaSLionel Sambucmay vary.</p>
<div class="sect1" title="4.1.&nbsp;Limitations of the compressed file format">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="limits"></a>4.1.&nbsp;Limitations of the compressed file format</h2></div></div></div>
2239*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2-1.0.X</code>,
2240*4a711beaSLionel Sambuc<code class="computeroutput">0.9.5</code> and
2241*4a711beaSLionel Sambuc<code class="computeroutput">0.9.0</code> use exactly the same file
2242*4a711beaSLionel Sambucformat as the original version,
2243*4a711beaSLionel Sambuc<code class="computeroutput">bzip2-0.1</code>.  This decision was
2244*4a711beaSLionel Sambucmade in the interests of stability.  Creating yet another
2245*4a711beaSLionel Sambucincompatible compressed file format would create further
2246*4a711beaSLionel Sambucconfusion and disruption for users.</p>
2247*4a711beaSLionel Sambuc<p>Nevertheless, this is not a painless decision.  Development
2248*4a711beaSLionel Sambucwork since the release of
2249*4a711beaSLionel Sambuc<code class="computeroutput">bzip2-0.1</code> in August 1997 has
2250*4a711beaSLionel Sambucshown complexities in the file format which slow down
2251*4a711beaSLionel Sambucdecompression and, in retrospect, are unnecessary.  These
2252*4a711beaSLionel Sambucare:</p>
2253*4a711beaSLionel Sambuc<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
2254*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>The run-length encoder, which is the first of the
2255*4a711beaSLionel Sambuc   compression transformations, is entirely irrelevant.  The
2256*4a711beaSLionel Sambuc   original purpose was to protect the sorting algorithm from the
2257*4a711beaSLionel Sambuc   very worst case input: a string of repeated symbols.  But
2258*4a711beaSLionel Sambuc   algorithm steps Q6a and Q6b in the original Burrows-Wheeler
2259*4a711beaSLionel Sambuc   technical report (SRC-124) show how repeats can be handled
2260*4a711beaSLionel Sambuc   without difficulty in block sorting.</p></li>
2261*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc">
2262*4a711beaSLionel Sambuc<p>The randomisation mechanism doesn't really need to be
2263*4a711beaSLionel Sambuc   there.  Udi Manber and Gene Myers published a suffix array
2264*4a711beaSLionel Sambuc   construction algorithm a few years back, which can be employed
2265*4a711beaSLionel Sambuc   to sort any block, no matter how repetitive, in O(N log N)
2266*4a711beaSLionel Sambuc   time.  Subsequent work by Kunihiko Sadakane has produced a
2267*4a711beaSLionel Sambuc   derivative O(N (log N)^2) algorithm which usually outperforms
2268*4a711beaSLionel Sambuc   the Manber-Myers algorithm.</p>
2269*4a711beaSLionel Sambuc<p>I could have changed to Sadakane's algorithm, but I find
2270*4a711beaSLionel Sambuc   it to be slower than <code class="computeroutput">bzip2</code>'s
2271*4a711beaSLionel Sambuc   existing algorithm for most inputs, and the randomisation
2272*4a711beaSLionel Sambuc   mechanism protects adequately against bad cases.  I didn't
2273*4a711beaSLionel Sambuc   think it was a good tradeoff to make.  Partly this is due to
2274*4a711beaSLionel Sambuc   the fact that I was not flooded with email complaints about
2275*4a711beaSLionel Sambuc   <code class="computeroutput">bzip2-0.1</code>'s performance on
2276*4a711beaSLionel Sambuc   repetitive data, so perhaps it isn't a problem for real
2277*4a711beaSLionel Sambuc   inputs.</p>
2278*4a711beaSLionel Sambuc<p>Probably the best long-term solution, and the one I have
2279*4a711beaSLionel Sambuc   incorporated into 0.9.5 and above, is to use the existing
2280*4a711beaSLionel Sambuc   sorting algorithm initially, and fall back to a O(N (log N)^2)
2281*4a711beaSLionel Sambuc   algorithm if the standard algorithm gets into
2282*4a711beaSLionel Sambuc   difficulties.</p>
2283*4a711beaSLionel Sambuc</li>
2284*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>The compressed file format was never designed to be
   handled by a library, and I have had to jump through some hoops
2286*4a711beaSLionel Sambuc   to produce an efficient implementation of decompression.  It's
2287*4a711beaSLionel Sambuc   a bit hairy.  Try passing
2288*4a711beaSLionel Sambuc   <code class="computeroutput">decompress.c</code> through the C
2289*4a711beaSLionel Sambuc   preprocessor and you'll see what I mean.  Much of this
   complexity could have been avoided if the compressed size of
   each block of data were recorded in the data stream.</p></li>
2292*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>An Adler-32 checksum, rather than a CRC32 checksum,
2293*4a711beaSLionel Sambuc   would be faster to compute.</p></li>
2294*4a711beaSLionel Sambuc</ul></div>
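<p>To make the last point concrete, here is a naive Adler-32 sketch.
It needs only two additions and two reductions per byte and no lookup
table, whereas the CRC used by
<code class="computeroutput">bzip2</code> relies on a table
(<code class="computeroutput">crctable.c</code>).  The function name
<code class="computeroutput">adler32_sketch</code> is made up for this
example, and real implementations usually defer the modulo operation
over many bytes rather than applying it on every step.</p>
<pre class="programlisting">#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Naive Adler-32: two running sums modulo 65521, the largest prime
   below 2^16.  The result is (b &lt;&lt; 16) | a. */
uint32_t adler32_sketch ( const unsigned char *data, size_t len )
{
   uint32_t a = 1, b = 0;
   size_t i;
   for (i = 0; i &lt; len; i++) {
      a = (a + data[i]) % 65521u;
      b = (b + a) % 65521u;
   }
   return (b &lt;&lt; 16) | a;
}</pre>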
2295*4a711beaSLionel Sambuc<p>It would be fair to say that the
2296*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> format was frozen before I
2297*4a711beaSLionel Sambucproperly and fully understood the performance consequences of
2298*4a711beaSLionel Sambucdoing so.</p>
2299*4a711beaSLionel Sambuc<p>Improvements which I was able to incorporate into 0.9.0,
2300*4a711beaSLionel Sambucdespite using the same file format, are:</p>
2301*4a711beaSLionel Sambuc<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
2302*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>Single array implementation of the inverse BWT.  This
2303*4a711beaSLionel Sambuc  significantly speeds up decompression, presumably because it
2304*4a711beaSLionel Sambuc  reduces the number of cache misses.</p></li>
2305*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>Faster inverse MTF transform for large MTF values.
2306*4a711beaSLionel Sambuc  The new implementation is based on the notion of sliding blocks
2307*4a711beaSLionel Sambuc  of values.</p></li>
2308*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2-0.9.0</code> now reads
2309*4a711beaSLionel Sambuc  and writes files with <code class="computeroutput">fread</code>
2310*4a711beaSLionel Sambuc  and <code class="computeroutput">fwrite</code>; version 0.1 used
2311*4a711beaSLionel Sambuc  <code class="computeroutput">putc</code> and
2312*4a711beaSLionel Sambuc  <code class="computeroutput">getc</code>.  Duh!  Well, you live
2313*4a711beaSLionel Sambuc  and learn.</p></li>
2314*4a711beaSLionel Sambuc</ul></div>
2315*4a711beaSLionel Sambuc<p>Further ahead, it would be nice to be able to do random
2316*4a711beaSLionel Sambucaccess into files.  This will require some careful design of
2317*4a711beaSLionel Sambuccompressed file formats.</p>
2318*4a711beaSLionel Sambuc</div>
<div class="sect1" title="4.2.&nbsp;Portability issues">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="port-issues"></a>4.2.&nbsp;Portability issues</h2></div></div></div>
2322*4a711beaSLionel Sambuc<p>After some consideration, I have decided not to use GNU
2323*4a711beaSLionel Sambuc<code class="computeroutput">autoconf</code> to configure 0.9.5 or
2324*4a711beaSLionel Sambuc1.0.</p>
2325*4a711beaSLionel Sambuc<p><code class="computeroutput">autoconf</code>, admirable and
2326*4a711beaSLionel Sambucwonderful though it is, mainly assists with portability problems
2327*4a711beaSLionel Sambucbetween Unix-like platforms.  But
2328*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> doesn't have much in the
2329*4a711beaSLionel Sambucway of portability problems on Unix; most of the difficulties
2330*4a711beaSLionel Sambucappear when porting to the Mac, or to Microsoft's operating
2331*4a711beaSLionel Sambucsystems.  <code class="computeroutput">autoconf</code> doesn't help
2332*4a711beaSLionel Sambucin those cases, and brings in a whole load of new
2333*4a711beaSLionel Sambuccomplexity.</p>
2334*4a711beaSLionel Sambuc<p>Most people should be able to compile the library and
2335*4a711beaSLionel Sambucprogram under Unix straight out-of-the-box, so to speak,
2336*4a711beaSLionel Sambucespecially if you have a version of GNU C available.</p>
2337*4a711beaSLionel Sambuc<p>There are a couple of
2338*4a711beaSLionel Sambuc<code class="computeroutput">__inline__</code> directives in the
2339*4a711beaSLionel Sambuccode.  GNU C (<code class="computeroutput">gcc</code>) should be
2340*4a711beaSLionel Sambucable to handle them.  If you're not using GNU C, your C compiler
2341*4a711beaSLionel Sambucshouldn't see them at all.  If your compiler does, for some
2342*4a711beaSLionel Sambucreason, see them and doesn't like them, just
2343*4a711beaSLionel Sambuc<code class="computeroutput">#define</code>
2344*4a711beaSLionel Sambuc<code class="computeroutput">__inline__</code> to be
2345*4a711beaSLionel Sambuc<code class="computeroutput">/* */</code>.  One easy way to do this
2346*4a711beaSLionel Sambucis to compile with the flag
2347*4a711beaSLionel Sambuc<code class="computeroutput">-D__inline__=</code>, which should be
2348*4a711beaSLionel Sambucunderstood by most Unix compilers.</p>
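<p>The same effect can be had in source form.  The fragment below is
a sketch of that approach: on compilers other than GNU C it makes
<code class="computeroutput">__inline__</code> expand to nothing, so
the function is compiled as an ordinary static function.  The helper
<code class="computeroutput">clamp_block_size</code> is hypothetical
and only there to show the keyword in use.</p>
<pre class="programlisting">/* Fallback for compilers other than GNU C: make __inline__ vanish. */
#ifndef __GNUC__
#define __inline__  /* */
#endif

/* Hypothetical helper: clamp a block size to the range 1..9. */
static __inline__ int clamp_block_size ( int k )
{
   if (k &lt; 1) return 1;
   if (k &gt; 9) return 9;
   return k;
}</pre>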
2349*4a711beaSLionel Sambuc<p>If you still have difficulties, try compiling with the
2350*4a711beaSLionel Sambucmacro <code class="computeroutput">BZ_STRICT_ANSI</code> defined.
2351*4a711beaSLionel SambucThis should enable you to build the library in a strictly ANSI
2352*4a711beaSLionel Sambuccompliant environment.  Building the program itself like this is
2353*4a711beaSLionel Sambucdangerous and not supported, since you remove
2354*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code>'s checks against
2355*4a711beaSLionel Sambuccompressing directories, symbolic links, devices, and other
2356*4a711beaSLionel Sambucnot-really-a-file entities.  This could cause filesystem
2357*4a711beaSLionel Sambuccorruption!</p>
2358*4a711beaSLionel Sambuc<p>One other thing: if you create a
2359*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> binary for public distribution,
2360*4a711beaSLionel Sambucplease consider linking it statically (<code class="computeroutput">gcc
2361*4a711beaSLionel Sambuc-static</code>).  This avoids all sorts of library-version
2362*4a711beaSLionel Sambucissues that others may encounter later on.</p>
2363*4a711beaSLionel Sambuc<p>If you build <code class="computeroutput">bzip2</code> on
2364*4a711beaSLionel SambucWin32, you must set <code class="computeroutput">BZ_UNIX</code> to 0
2365*4a711beaSLionel Sambucand <code class="computeroutput">BZ_LCCWIN32</code> to 1, in the
2366*4a711beaSLionel Sambucfile <code class="computeroutput">bzip2.c</code>, before compiling.
2367*4a711beaSLionel SambucOtherwise the resulting binary won't work correctly.</p>
2368*4a711beaSLionel Sambuc</div>
<div class="sect1" title="4.3.&nbsp;Reporting bugs">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="bugs"></a>4.3.&nbsp;Reporting bugs</h2></div></div></div>
2372*4a711beaSLionel Sambuc<p>I tried pretty hard to make sure
2373*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code> is bug free, both by
2374*4a711beaSLionel Sambucdesign and by testing.  Hopefully you'll never need to read this
2375*4a711beaSLionel Sambucsection for real.</p>
2376*4a711beaSLionel Sambuc<p>Nevertheless, if <code class="computeroutput">bzip2</code> dies
2377*4a711beaSLionel Sambucwith a segmentation fault, a bus error or an internal assertion
failure, it will ask you to email me a bug report.  Experience from
years of feedback from bzip2 users indicates that almost all these
2380*4a711beaSLionel Sambucproblems can be traced to either compiler bugs or hardware
2381*4a711beaSLionel Sambucproblems.</p>
2382*4a711beaSLionel Sambuc<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
2383*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc">
2384*4a711beaSLionel Sambuc<p>Recompile the program with no optimisation, and
2385*4a711beaSLionel Sambuc  see if it works.  And/or try a different compiler.  I heard all
2386*4a711beaSLionel Sambuc  sorts of stories about various flavours of GNU C (and other
2387*4a711beaSLionel Sambuc  compilers) generating bad code for
2388*4a711beaSLionel Sambuc  <code class="computeroutput">bzip2</code>, and I've run across two
2389*4a711beaSLionel Sambuc  such examples myself.</p>
2390*4a711beaSLionel Sambuc<p>2.7.X versions of GNU C are known to generate bad code
2391*4a711beaSLionel Sambuc  from time to time, at high optimisation levels.  If you get
2392*4a711beaSLionel Sambuc  problems, try using the flags
2393*4a711beaSLionel Sambuc  <code class="computeroutput">-O2</code>
2394*4a711beaSLionel Sambuc  <code class="computeroutput">-fomit-frame-pointer</code>
2395*4a711beaSLionel Sambuc  <code class="computeroutput">-fno-strength-reduce</code>.  You
2396*4a711beaSLionel Sambuc  should specifically <span class="emphasis"><em>not</em></span> use
2397*4a711beaSLionel Sambuc  <code class="computeroutput">-funroll-loops</code>.</p>
2398*4a711beaSLionel Sambuc<p>You may notice that the Makefile runs six tests as part
2399*4a711beaSLionel Sambuc  of the build process.  If the program passes all of these, it's
2400*4a711beaSLionel Sambuc  a pretty good (but not 100%) indication that the compiler has
2401*4a711beaSLionel Sambuc  done its job correctly.</p>
2402*4a711beaSLionel Sambuc</li>
2403*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc">
2404*4a711beaSLionel Sambuc<p>If <code class="computeroutput">bzip2</code>
2405*4a711beaSLionel Sambuc  crashes randomly, and the crashes are not repeatable, you may
2406*4a711beaSLionel Sambuc  have a flaky memory subsystem.
2407*4a711beaSLionel Sambuc  <code class="computeroutput">bzip2</code> really hammers your
2408*4a711beaSLionel Sambuc  memory hierarchy, and if it's a bit marginal, you may get these
2409*4a711beaSLionel Sambuc  problems.  Ditto if your disk or I/O subsystem is slowly
2410*4a711beaSLionel Sambuc  failing.  Yup, this really does happen.</p>
2411*4a711beaSLionel Sambuc<p>Try using a different machine of the same type, and see
2412*4a711beaSLionel Sambuc  if you can repeat the problem.</p>
2413*4a711beaSLionel Sambuc</li>
2414*4a711beaSLionel Sambuc<li class="listitem" style="list-style-type: disc"><p>This isn't really a bug, but ... If
2415*4a711beaSLionel Sambuc  <code class="computeroutput">bzip2</code> tells you your file is
2416*4a711beaSLionel Sambuc  corrupted on decompression, and you obtained the file via FTP,
2417*4a711beaSLionel Sambuc  there is a possibility that you forgot to tell FTP to do a
2418*4a711beaSLionel Sambuc  binary mode transfer.  That absolutely will cause the file to
2419*4a711beaSLionel Sambuc  be non-decompressible.  You'll have to transfer it
2420*4a711beaSLionel Sambuc  again.</p></li>
2421*4a711beaSLionel Sambuc</ul></div>
2422*4a711beaSLionel Sambuc<p>If you've incorporated
2423*4a711beaSLionel Sambuc<code class="computeroutput">libbzip2</code> into your own program
2424*4a711beaSLionel Sambucand are getting problems, please, please, please, check that the
parameters you are passing in calls to the library are correct,
2426*4a711beaSLionel Sambucand in accordance with what the documentation says is allowable.
2427*4a711beaSLionel SambucI have tried to make the library robust against such problems,
2428*4a711beaSLionel Sambucbut I'm sure I haven't succeeded.</p>
2429*4a711beaSLionel Sambuc<p>Finally, if the above comments don't help, you'll have to
2430*4a711beaSLionel Sambucsend me a bug report.  Now, it's just amazing how many people
2431*4a711beaSLionel Sambucwill send me a bug report saying something like:</p>
2432*4a711beaSLionel Sambuc<pre class="programlisting">bzip2 crashed with segmentation fault on my machine</pre>
<p>and absolutely nothing else.  Needless to say, such a
2434*4a711beaSLionel Sambucreport is <span class="emphasis"><em>totally, utterly, completely and
2435*4a711beaSLionel Sambuccomprehensively 100% useless; a waste of your time, my time, and
2436*4a711beaSLionel Sambucnet bandwidth</em></span>.  With no details at all, there's no way
2437*4a711beaSLionel SambucI can possibly begin to figure out what the problem is.</p>
2438*4a711beaSLionel Sambuc<p>The rules of the game are: facts, facts, facts.  Don't omit
2439*4a711beaSLionel Sambucthem because "oh, they won't be relevant".  At the bare
2440*4a711beaSLionel Sambucminimum:</p>
2441*4a711beaSLionel Sambuc<pre class="programlisting">Machine type.  Operating system version.
2442*4a711beaSLionel SambucExact version of bzip2 (do bzip2 -V).
2443*4a711beaSLionel SambucExact version of the compiler used.
2444*4a711beaSLionel SambucFlags passed to the compiler.</pre>
2445*4a711beaSLionel Sambuc<p>However, the most important single thing that will help me
2446*4a711beaSLionel Sambucis the file that you were trying to compress or decompress at the
2447*4a711beaSLionel Sambuctime the problem happened.  Without that, my ability to do
anything more than speculate about the cause is limited.</p>
2449*4a711beaSLionel Sambuc</div>
<div class="sect1" title="4.4.&nbsp;Did you get the right package?">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="package"></a>4.4.&nbsp;Did you get the right package?</h2></div></div></div>
2453*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2</code> is a resource hog.
2454*4a711beaSLionel SambucIt soaks up large amounts of CPU cycles and memory.  Also, it
2455*4a711beaSLionel Sambucgives very large latencies.  In the worst case, you can feed many
2456*4a711beaSLionel Sambucmegabytes of uncompressed data into the library before getting
2457*4a711beaSLionel Sambucany compressed output, so this probably rules out applications
2458*4a711beaSLionel Sambucrequiring interactive behaviour.</p>
2459*4a711beaSLionel Sambuc<p>These aren't faults of my implementation, I hope, but more
2460*4a711beaSLionel Sambucan intrinsic property of the Burrows-Wheeler transform
2461*4a711beaSLionel Sambuc(unfortunately).  Maybe this isn't what you want.</p>
2462*4a711beaSLionel Sambuc<p>If you want a compressor and/or library which is faster,
2463*4a711beaSLionel Sambucuses less memory but gets pretty good compression, and has
2464*4a711beaSLionel Sambucminimal latency, consider Jean-loup Gailly's and Mark Adler's
2465*4a711beaSLionel Sambucwork, <code class="computeroutput">zlib-1.2.1</code> and
2466*4a711beaSLionel Sambuc<code class="computeroutput">gzip-1.2.4</code>.  Look for them at
2467*4a711beaSLionel Sambuc<a class="ulink" href="http://www.zlib.org" target="_top">http://www.zlib.org</a> and
2468*4a711beaSLionel Sambuc<a class="ulink" href="http://www.gzip.org" target="_top">http://www.gzip.org</a>
2469*4a711beaSLionel Sambucrespectively.</p>
2470*4a711beaSLionel Sambuc<p>For something faster and lighter still, you might try Markus F
2471*4a711beaSLionel SambucX J Oberhumer's <code class="computeroutput">LZO</code> real-time
2472*4a711beaSLionel Sambuccompression/decompression library, at
2473*4a711beaSLionel Sambuc<a class="ulink" href="http://www.oberhumer.com/opensource" target="_top">http://www.oberhumer.com/opensource</a>.</p>
2474*4a711beaSLionel Sambuc</div>
<div class="sect1" title="4.5.&nbsp;Further Reading">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="reading"></a>4.5.&nbsp;Further Reading</h2></div></div></div>
2478*4a711beaSLionel Sambuc<p><code class="computeroutput">bzip2</code> is not research
2479*4a711beaSLionel Sambucwork, in the sense that it doesn't present any new ideas.
2480*4a711beaSLionel SambucRather, it's an engineering exercise based on existing
2481*4a711beaSLionel Sambucideas.</p>
2482*4a711beaSLionel Sambuc<p>Four documents describe essentially all the ideas behind
2483*4a711beaSLionel Sambuc<code class="computeroutput">bzip2</code>:</p>
<div class="literallayout"><p>Michael&nbsp;Burrows&nbsp;and&nbsp;D.&nbsp;J.&nbsp;Wheeler:<br>
&nbsp;&nbsp;"A&nbsp;block-sorting&nbsp;lossless&nbsp;data&nbsp;compression&nbsp;algorithm"<br>
&nbsp;&nbsp;&nbsp;10th&nbsp;May&nbsp;1994.<br>
&nbsp;&nbsp;&nbsp;Digital&nbsp;SRC&nbsp;Research&nbsp;Report&nbsp;124.<br>
&nbsp;&nbsp;&nbsp;ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz<br>
&nbsp;&nbsp;&nbsp;If&nbsp;you&nbsp;have&nbsp;trouble&nbsp;finding&nbsp;it,&nbsp;try&nbsp;searching&nbsp;at&nbsp;the<br>
&nbsp;&nbsp;&nbsp;New&nbsp;Zealand&nbsp;Digital&nbsp;Library,&nbsp;http://www.nzdl.org.<br>
2491*4a711beaSLionel Sambuc<br>
Daniel&nbsp;S.&nbsp;Hirschberg&nbsp;and&nbsp;Debra&nbsp;A.&nbsp;Lelewer<br>
&nbsp;&nbsp;"Efficient&nbsp;Decoding&nbsp;of&nbsp;Prefix&nbsp;Codes"<br>
&nbsp;&nbsp;&nbsp;Communications&nbsp;of&nbsp;the&nbsp;ACM,&nbsp;April&nbsp;1990,&nbsp;Vol&nbsp;33,&nbsp;Number&nbsp;4.<br>
&nbsp;&nbsp;&nbsp;You&nbsp;might&nbsp;be&nbsp;able&nbsp;to&nbsp;get&nbsp;an&nbsp;electronic&nbsp;copy&nbsp;of&nbsp;this<br>
&nbsp;&nbsp;&nbsp;from&nbsp;the&nbsp;ACM&nbsp;Digital&nbsp;Library.<br>
2497*4a711beaSLionel Sambuc<br>
2498*4a711beaSLionel SambucDavid�J.�Wheeler<br>
2499*4a711beaSLionel Sambuc���Program�bred3.c�and�accompanying�document�bred3.ps.<br>
2500*4a711beaSLionel Sambuc���This�contains�the�idea�behind�the�multi-table�Huffman�coding�scheme.<br>
2501*4a711beaSLionel Sambuc���ftp://ftp.cl.cam.ac.uk/users/djw3/<br>
2502*4a711beaSLionel Sambuc<br>
2503*4a711beaSLionel SambucJon�L.�Bentley�and�Robert�Sedgewick<br>
2504*4a711beaSLionel Sambuc��"Fast�Algorithms�for�Sorting�and�Searching�Strings"<br>
2505*4a711beaSLionel Sambuc���Available�from�Sedgewick's�web�page,<br>
2506*4a711beaSLionel Sambuc���www.cs.princeton.edu/~rs<br>
2507*4a711beaSLionel Sambuc</p></div>
<p>The following paper gives valuable additional insights into
the algorithm, but is not directly the basis of any code used
in bzip2.</p>
<div class="literallayout"><p>Peter Fenwick:<br>
&nbsp;&nbsp;&nbsp;Block Sorting Text Compression<br>
&nbsp;&nbsp;&nbsp;Proceedings of the 19th Australasian Computer Science Conference,<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Melbourne, Australia.&nbsp;&nbsp;Jan 31 - Feb 2, 1996.<br>
&nbsp;&nbsp;&nbsp;ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps</p></div>
<p>Kunihiko Sadakane's sorting algorithm, mentioned above, is
available from:</p>
<div class="literallayout"><p>http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz<br>
</p></div>
<p>The Manber-Myers suffix array construction algorithm is
described in a paper available from:</p>
<div class="literallayout"><p>http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps<br>
</p></div>
<p>Finally, the following papers document some
investigations I made into the performance of sorting
and decompression algorithms:</p>
<div class="literallayout"><p>Julian Seward<br>
&nbsp;&nbsp;&nbsp;On the Performance of BWT Sorting Algorithms<br>
&nbsp;&nbsp;&nbsp;Proceedings of the IEEE Data Compression Conference 2000<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Snowbird, Utah.&nbsp;&nbsp;28-30 March 2000.<br>
<br>
Julian Seward<br>
&nbsp;&nbsp;&nbsp;Space-time Tradeoffs in the Inverse B-W Transform<br>
&nbsp;&nbsp;&nbsp;Proceedings of the IEEE Data Compression Conference 2001<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Snowbird, Utah.&nbsp;&nbsp;27-29 March 2001.<br>
</p></div>
</div>
</div>
</div></body>
</html>