<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>bzip2 and libbzip2, version 1.0.6</title>
<meta name="generator" content="DocBook XSL Stylesheets V1.75.2">
<style type="text/css" media="screen">/* Colours:
#74240f dark brown h1, h2, h3, h4
#336699 medium blue links
#339999 turquoise link hover colour
#202020 almost black general text
#761596 purple md5sum text
#626262 dark gray pre border
#eeeeee very light gray pre background
#f2f2f9 very light blue nav table background
#3366cc medium blue nav table border
*/

a, a:link, a:visited, a:active { color: #336699; }
a:hover { color: #339999; }

body { font: 80%/126% sans-serif; }
h1, h2, h3, h4 { color: #74240f; }

dt { color: #336699; font-weight: bold }
dd {
 margin-left: 1.5em;
 padding-bottom: 0.8em;
}

/* -- ruler -- */
div.hr_blue {
 height: 3px;
 background:#ffffff url("/images/hr_blue.png") repeat-x; }
div.hr_blue hr { display:none; }

/* release styles */
#release p { margin-top: 0.4em; }
#release .md5sum { color: #761596; }

/* ------ styles for docs|manuals|howto ------ */
/* -- lists -- */
ul {
 margin: 0px 4px 16px 16px;
 padding: 0px;
 list-style: url("/images/li-blue.png");
}
ul li {
 margin-bottom: 10px;
}
ul ul {
 list-style-type: none;
 list-style-image: none;
 margin-left: 0px;
}

/* header / footer nav tables */
table.nav {
 border: solid 1px #3366cc;
 background: #f2f2f9;
 background-color: #f2f2f9;
 margin-bottom: 0.5em;
}
/* don't have underlined links in chunked nav menus */
table.nav a { text-decoration: none; }
table.nav a:hover { text-decoration: underline; }
table.nav td { font-size: 85%; }

code, tt, pre { font-size: 120%; }
code, tt { color: #761596; }

div.literallayout, pre.programlisting, pre.screen {
 color: #000000;
 padding: 0.5em;
 background: #eeeeee;
 border: 1px solid #626262;
 background-color: #eeeeee;
 margin: 4px 0px 4px 0px;
}
</style>
</head>
<body bgcolor="white" text="black"
link="#0000FF" vlink="#840084" alink="#0000FF"><div lang="en" class="book" title="bzip2 and libbzip2, version 1.0.6">
<div class="titlepage">
<div>
<div><h1 class="title">
<a name="userman"></a>bzip2 and libbzip2, version 1.0.6</h1></div>
<div><h2 class="subtitle">A program and library for data compression</h2></div>
<div><div class="authorgroup"><div class="author">
<h3 class="author">
<span class="firstname">Julian</span> <span class="surname">Seward</span>
</h3>
<div class="affiliation"><span class="orgname">http://www.bzip.org<br></span></div>
</div></div></div>
<div><p class="releaseinfo">Version 1.0.6 of 6 September 2010</p></div>
<div><p class="copyright">Copyright &copy; 1996-2010 Julian Seward</p></div>
<div><div class="legalnotice" title="Legal Notice">
<a name="id537185"></a><p>This program, <code class="computeroutput">bzip2</code>, the
 associated library <code class="computeroutput">libbzip2</code>, and
 all documentation, are copyright &copy; 1996-2010 Julian Seward.
 All rights reserved.</p>
<p>Redistribution and use in source and binary forms, with
 or without modification, are permitted provided that the
 following conditions are met:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p>Redistributions of source code must retain the
 above copyright notice, this list of conditions and the
 following disclaimer.</p></li>
<li class="listitem" style="list-style-type: disc"><p>The origin of this software must not be
 misrepresented; you must not claim that you wrote the original
 software. If you use this software in a product, an
 acknowledgment in the product documentation would be
 appreciated but is not required.</p></li>
<li class="listitem" style="list-style-type: disc"><p>Altered source versions must be plainly marked
 as such, and must not be misrepresented as being the original
 software.</p></li>
<li class="listitem" style="list-style-type: disc"><p>The name of the author may not be used to
 endorse or promote products derived from this software without
 specific prior written permission.</p></li>
</ul></div>
<p>THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY
 EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
 THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
 PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE
 AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
 EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
 TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
 IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
 THE POSSIBILITY OF SUCH DAMAGE.</p>
<p>PATENTS: To the best of my knowledge,
 <code class="computeroutput">bzip2</code> and
 <code class="computeroutput">libbzip2</code> do not use any patented
 algorithms. However, I do not have the resources to carry
 out a patent search. Therefore I cannot give any guarantee of
 the above statement.
 </p>
</div></div>
</div>
<hr>
</div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="chapter"><a href="#intro">1. Introduction</a></span></dt>
<dt><span class="chapter"><a href="#using">2. How to use bzip2</a></span></dt>
<dd><dl>
<dt><span class="sect1"><a href="#name">2.1. NAME</a></span></dt>
<dt><span class="sect1"><a href="#synopsis">2.2. SYNOPSIS</a></span></dt>
<dt><span class="sect1"><a href="#description">2.3.
DESCRIPTION</a></span></dt>
<dt><span class="sect1"><a href="#options">2.4. OPTIONS</a></span></dt>
<dt><span class="sect1"><a href="#memory-management">2.5. MEMORY MANAGEMENT</a></span></dt>
<dt><span class="sect1"><a href="#recovering">2.6. RECOVERING DATA FROM DAMAGED FILES</a></span></dt>
<dt><span class="sect1"><a href="#performance">2.7. PERFORMANCE NOTES</a></span></dt>
<dt><span class="sect1"><a href="#caveats">2.8. CAVEATS</a></span></dt>
<dt><span class="sect1"><a href="#author">2.9. AUTHOR</a></span></dt>
</dl></dd>
<dt><span class="chapter"><a href="#libprog">3.
Programming with <code class="computeroutput">libbzip2</code>
</a></span></dt>
<dd><dl>
<dt><span class="sect1"><a href="#top-level">3.1. Top-level structure</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#ll-summary">3.1.1. Low-level summary</a></span></dt>
<dt><span class="sect2"><a href="#hl-summary">3.1.2. High-level summary</a></span></dt>
<dt><span class="sect2"><a href="#util-fns-summary">3.1.3. Utility functions summary</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#err-handling">3.2. Error handling</a></span></dt>
<dt><span class="sect1"><a href="#low-level">3.3. Low-level interface</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#bzcompress-init">3.3.1. BZ2_bzCompressInit</a></span></dt>
<dt><span class="sect2"><a href="#bzCompress">3.3.2.
BZ2_bzCompress</a></span></dt>
<dt><span class="sect2"><a href="#bzCompress-end">3.3.3. BZ2_bzCompressEnd</a></span></dt>
<dt><span class="sect2"><a href="#bzDecompress-init">3.3.4. BZ2_bzDecompressInit</a></span></dt>
<dt><span class="sect2"><a href="#bzDecompress">3.3.5. BZ2_bzDecompress</a></span></dt>
<dt><span class="sect2"><a href="#bzDecompress-end">3.3.6. BZ2_bzDecompressEnd</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#hl-interface">3.4. High-level interface</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#bzreadopen">3.4.1. BZ2_bzReadOpen</a></span></dt>
<dt><span class="sect2"><a href="#bzread">3.4.2. BZ2_bzRead</a></span></dt>
<dt><span class="sect2"><a href="#bzreadgetunused">3.4.3. BZ2_bzReadGetUnused</a></span></dt>
<dt><span class="sect2"><a href="#bzreadclose">3.4.4. BZ2_bzReadClose</a></span></dt>
<dt><span class="sect2"><a href="#bzwriteopen">3.4.5. BZ2_bzWriteOpen</a></span></dt>
<dt><span class="sect2"><a href="#bzwrite">3.4.6. BZ2_bzWrite</a></span></dt>
<dt><span class="sect2"><a href="#bzwriteclose">3.4.7. BZ2_bzWriteClose</a></span></dt>
<dt><span class="sect2"><a href="#embed">3.4.8. Handling embedded compressed data streams</a></span></dt>
<dt><span class="sect2"><a href="#std-rdwr">3.4.9. Standard file-reading/writing code</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#util-fns">3.5. Utility functions</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#bzbufftobuffcompress">3.5.1.
BZ2_bzBuffToBuffCompress</a></span></dt>
<dt><span class="sect2"><a href="#bzbufftobuffdecompress">3.5.2. BZ2_bzBuffToBuffDecompress</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#zlib-compat">3.6. zlib compatibility functions</a></span></dt>
<dt><span class="sect1"><a href="#stdio-free">3.7. Using the library in a stdio-free environment</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#stdio-bye">3.7.1. Getting rid of stdio</a></span></dt>
<dt><span class="sect2"><a href="#critical-error">3.7.2. Critical error handling</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#win-dll">3.8. Making a Windows DLL</a></span></dt>
</dl></dd>
<dt><span class="chapter"><a href="#misc">4. Miscellanea</a></span></dt>
<dd><dl>
<dt><span class="sect1"><a href="#limits">4.1. Limitations of the compressed file format</a></span></dt>
<dt><span class="sect1"><a href="#port-issues">4.2. Portability issues</a></span></dt>
<dt><span class="sect1"><a href="#bugs">4.3. Reporting bugs</a></span></dt>
<dt><span class="sect1"><a href="#package">4.4. Did you get the right package?</a></span></dt>
<dt><span class="sect1"><a href="#reading">4.5.
Further Reading</a></span></dt>
</dl></dd>
</dl>
</div>
<div class="chapter" title="1.&nbsp;Introduction">
<div class="titlepage"><div><div><h2 class="title">
<a name="intro"></a>1.&nbsp;Introduction</h2></div></div></div>
<p><code class="computeroutput">bzip2</code> compresses files
using the Burrows-Wheeler block-sorting text compression
algorithm, and Huffman coding. Compression is generally
considerably better than that achieved by more conventional
LZ77/LZ78-based compressors, and approaches the performance of
the PPM family of statistical compressors.</p>
<p><code class="computeroutput">bzip2</code> is built on top of
<code class="computeroutput">libbzip2</code>, a flexible library for
handling compressed data in the
<code class="computeroutput">bzip2</code> format. This manual
describes both how to use the program and how to work with the
library interface.
Most of the manual is devoted to this
library, not the program, which is good news if your interest is
only in the program.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p><a class="xref" href="#using" title="2.&nbsp;How to use bzip2">How to use bzip2</a> describes how to use
 <code class="computeroutput">bzip2</code>; this is the only part
 you need to read if you just want to know how to operate the
 program.</p></li>
<li class="listitem" style="list-style-type: disc"><p><a class="xref" href="#libprog" title="3.&nbsp; Programming with libbzip2">Programming with libbzip2</a> describes the
 programming interfaces in detail, and</p></li>
<li class="listitem" style="list-style-type: disc"><p><a class="xref" href="#misc" title="4.&nbsp;Miscellanea">Miscellanea</a> records some
 miscellaneous notes which I thought ought to be recorded
 somewhere.</p></li>
</ul></div>
</div>
<div class="chapter" title="2.&nbsp;How to use bzip2">
<div class="titlepage"><div><div><h2 class="title">
<a name="using"></a>2.&nbsp;How to use bzip2</h2></div></div></div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="sect1"><a href="#name">2.1. NAME</a></span></dt>
<dt><span class="sect1"><a href="#synopsis">2.2. SYNOPSIS</a></span></dt>
<dt><span class="sect1"><a href="#description">2.3.
DESCRIPTION</a></span></dt>
<dt><span class="sect1"><a href="#options">2.4. OPTIONS</a></span></dt>
<dt><span class="sect1"><a href="#memory-management">2.5. MEMORY MANAGEMENT</a></span></dt>
<dt><span class="sect1"><a href="#recovering">2.6. RECOVERING DATA FROM DAMAGED FILES</a></span></dt>
<dt><span class="sect1"><a href="#performance">2.7. PERFORMANCE NOTES</a></span></dt>
<dt><span class="sect1"><a href="#caveats">2.8. CAVEATS</a></span></dt>
<dt><span class="sect1"><a href="#author">2.9. AUTHOR</a></span></dt>
</dl>
</div>
<p>This chapter contains a copy of the
<code class="computeroutput">bzip2</code> man page, and nothing
else.</p>
<div class="sect1" title="2.1.&nbsp;NAME">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="name"></a>2.1.&nbsp;NAME</h2></div></div></div>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2</code>,
 <code class="computeroutput">bunzip2</code> - a block-sorting file
 compressor, v1.0.6</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzcat</code> -
 decompresses files to stdout</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2recover</code> -
 recovers data from damaged bzip2 files</p></li>
</ul></div>
</div>
<div class="sect1" title="2.2.&nbsp;SYNOPSIS">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="synopsis"></a>2.2.&nbsp;SYNOPSIS</h2></div></div></div>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2</code> [
 -cdfkqstvzVL123456789 ] [ filenames ... ]</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bunzip2</code> [
 -fkvsVL ] [ filenames ... ]</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzcat</code> [ -s ] [
 filenames ... ]</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2recover</code>
 filename</p></li>
</ul></div>
</div>
<div class="sect1" title="2.3.&nbsp;DESCRIPTION">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="description"></a>2.3.&nbsp;DESCRIPTION</h2></div></div></div>
<p><code class="computeroutput">bzip2</code> compresses files
using the Burrows-Wheeler block-sorting text compression
algorithm, and Huffman coding.
Compression is generally
considerably better than that achieved by more conventional
LZ77/LZ78-based compressors, and approaches the performance of
the PPM family of statistical compressors.</p>
<p>The command-line options are deliberately very similar to
those of GNU <code class="computeroutput">gzip</code>, but they are
not identical.</p>
<p><code class="computeroutput">bzip2</code> expects a list of
file names to accompany the command-line flags. Each file is
replaced by a compressed version of itself, with the name
<code class="computeroutput">original_name.bz2</code>. Each
compressed file has the same modification date, permissions, and,
when possible, ownership as the corresponding original, so that
these properties can be correctly restored at decompression time.
File name handling is naive in the sense that there is no
mechanism for preserving original file names, permissions,
ownerships or dates in filesystems which lack these concepts, or
have serious file name length restrictions, such as
MS-DOS.</p>
<p><code class="computeroutput">bzip2</code> and
<code class="computeroutput">bunzip2</code> will by default not
overwrite existing files.
If you want this to happen, specify
the <code class="computeroutput">-f</code> flag.</p>
<p>If no file names are specified,
<code class="computeroutput">bzip2</code> compresses from standard
input to standard output. In this case,
<code class="computeroutput">bzip2</code> will decline to write
compressed output to a terminal, as this would be entirely
incomprehensible and therefore pointless.</p>
<p><code class="computeroutput">bunzip2</code> (or
<code class="computeroutput">bzip2 -d</code>) decompresses all
specified files. Files which were not created by
<code class="computeroutput">bzip2</code> will be detected and
ignored, and a warning issued.
<code class="computeroutput">bzip2</code> attempts to guess the
filename for the decompressed file from that of the compressed
file as follows:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">filename.bz2 </code>
 becomes
 <code class="computeroutput">filename</code></p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">filename.bz </code>
 becomes
 <code class="computeroutput">filename</code></p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">filename.tbz2</code>
 becomes
 <code
class="computeroutput">filename.tar</code></p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">filename.tbz </code>
 becomes
 <code class="computeroutput">filename.tar</code></p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">anyothername </code>
 becomes
 <code class="computeroutput">anyothername.out</code></p></li>
</ul></div>
<p>If the file does not end in one of the recognised endings,
<code class="computeroutput">.bz2</code>,
<code class="computeroutput">.bz</code>,
<code class="computeroutput">.tbz2</code> or
<code class="computeroutput">.tbz</code>,
<code class="computeroutput">bzip2</code> complains that it cannot
guess the name of the original file, and uses the original name
with <code class="computeroutput">.out</code> appended.</p>
<p>As with compression, supplying no filenames causes
decompression from standard input to standard output.</p>
<p><code class="computeroutput">bunzip2</code> will correctly
decompress a file which is the concatenation of two or more
compressed files. The result is the concatenation of the
corresponding uncompressed files.
Integrity testing
(<code class="computeroutput">-t</code>) of concatenated compressed
files is also supported.</p>
<p>You can also compress or decompress files to the standard
output by giving the <code class="computeroutput">-c</code> flag.
Multiple files may be compressed and decompressed like this. The
resulting outputs are fed sequentially to stdout. Compression of
multiple files in this manner generates a stream containing
multiple compressed file representations. Such a stream can be
decompressed correctly only by
<code class="computeroutput">bzip2</code> version 0.9.0 or later.
Earlier versions of <code class="computeroutput">bzip2</code> will
stop after decompressing the first file in the stream.</p>
<p><code class="computeroutput">bzcat</code> (or
<code class="computeroutput">bzip2 -dc</code>) decompresses all
specified files to the standard output.</p>
<p><code class="computeroutput">bzip2</code> will read arguments
from the environment variables
<code class="computeroutput">BZIP2</code> and
<code class="computeroutput">BZIP</code>, in that order, and will
process them before any arguments read from the command line.
This gives a convenient way to supply default arguments.</p>
<p>Compression is always performed, even if the compressed
file is slightly larger than the original.
Files of less than
about one hundred bytes tend to get larger, since the compression
mechanism has a constant overhead in the region of 50 bytes.
Random data (including the output of most file compressors) is
coded at about 8.05 bits per byte, giving an expansion of around
0.5%.</p>
<p>As a self-check for your protection,
<code class="computeroutput">bzip2</code> uses 32-bit CRCs to make
sure that the decompressed version of a file is identical to the
original. This guards against corruption of the compressed data,
and against undetected bugs in
<code class="computeroutput">bzip2</code> (hopefully very unlikely).
The chance of data corruption going undetected is microscopic,
about one chance in four billion for each file processed. Be
aware, though, that the check occurs upon decompression, so it
can only tell you that something is wrong. It can't help you
recover the original uncompressed data.
You can use
<code class="computeroutput">bzip2recover</code> to try to recover
data from damaged files.</p>
<p>Return values: 0 for a normal exit, 1 for environmental
problems (file not found, invalid flags, I/O errors, etc.), 2
to indicate a corrupt compressed file, 3 for an internal
consistency error (e.g. a bug) which caused
<code class="computeroutput">bzip2</code> to panic.</p>
</div>
<div class="sect1" title="2.4.&nbsp;OPTIONS">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="options"></a>2.4.&nbsp;OPTIONS</h2></div></div></div>
<div class="variablelist"><dl>
<dt><span class="term"><code class="computeroutput">-c --stdout</code></span></dt>
<dd><p>Compress or decompress to standard
 output.</p></dd>
<dt><span class="term"><code class="computeroutput">-d --decompress</code></span></dt>
<dd><p>Force decompression.
 <code class="computeroutput">bzip2</code>,
 <code class="computeroutput">bunzip2</code> and
 <code class="computeroutput">bzcat</code> are really the same
 program, and the decision about what actions to take is made on
 the basis of which name is used.
This flag overrides that
  mechanism, and forces bzip2 to decompress.</p></dd>
<dt><span class="term"><code class="computeroutput">-z --compress</code></span></dt>
<dd><p>The complement to
  <code class="computeroutput">-d</code>: forces compression,
  regardless of the invocation name.</p></dd>
<dt><span class="term"><code class="computeroutput">-t --test</code></span></dt>
<dd><p>Check integrity of the specified file(s), but
  don't decompress them.  This really performs a trial
  decompression and throws away the result.</p></dd>
<dt><span class="term"><code class="computeroutput">-f --force</code></span></dt>
<dd>
<p>Force overwrite of output files.  Normally,
  <code class="computeroutput">bzip2</code> will not overwrite
  existing output files.  Also forces
  <code class="computeroutput">bzip2</code> to break hard links to
  files, which it otherwise wouldn't do.</p>
<p><code class="computeroutput">bzip2</code> normally declines
  to decompress files which don't have the correct magic header
  bytes. If forced (<code class="computeroutput">-f</code>),
  however, it will pass such files through unmodified.
This is
  how GNU <code class="computeroutput">gzip</code> behaves.</p>
</dd>
<dt><span class="term"><code class="computeroutput">-k --keep</code></span></dt>
<dd><p>Keep (don't delete) input files during
  compression or decompression.</p></dd>
<dt><span class="term"><code class="computeroutput">-s --small</code></span></dt>
<dd>
<p>Reduce memory usage, for compression,
  decompression and testing.  Files are decompressed and tested
  using a modified algorithm which only requires 2.5 bytes per
  block byte.  This means any file can be decompressed in 2300k
  of memory, albeit at about half the normal speed.</p>
<p>During compression, <code class="computeroutput">-s</code>
  selects a block size of 200k, which limits memory use to around
  the same figure, at the expense of your compression ratio.  In
  short, if your machine is low on memory (8 megabytes or less),
  use <code class="computeroutput">-s</code> for everything.  See
  <a class="xref" href="#memory-management" title="2.5.&nbsp;MEMORY MANAGEMENT">MEMORY MANAGEMENT</a> below.</p>
</dd>
<dt><span class="term"><code class="computeroutput">-q --quiet</code></span></dt>
<dd><p>Suppress non-essential warning messages.
  Messages pertaining to I/O errors and other critical events
  will not be suppressed.</p></dd>
<dt><span class="term"><code class="computeroutput">-v --verbose</code></span></dt>
<dd><p>Verbose mode -- show the compression ratio for
  each file processed.  Further
  <code class="computeroutput">-v</code>'s increase the verbosity
  level, spewing out lots of information which is primarily of
  interest for diagnostic purposes.</p></dd>
<dt><span class="term"><code class="computeroutput">-L --license -V --version</code></span></dt>
<dd><p>Display the software version, license terms and
  conditions.</p></dd>
<dt><span class="term"><code class="computeroutput">-1</code> (or
  <code class="computeroutput">--fast</code>) to
  <code class="computeroutput">-9</code> (or
  <code class="computeroutput">--best</code>)</span></dt>
<dd><p>Set the block size to 100 k, 200 k ... 900 k
  when compressing.  Has no effect when decompressing.  See <a class="xref" href="#memory-management" title="2.5.&nbsp;MEMORY MANAGEMENT">MEMORY MANAGEMENT</a> below.  The
  <code class="computeroutput">--fast</code> and
  <code class="computeroutput">--best</code> aliases are primarily
  for GNU <code class="computeroutput">gzip</code> compatibility.
  In particular, <code class="computeroutput">--fast</code> doesn't
  make things significantly faster.
And
  <code class="computeroutput">--best</code> merely selects the
  default behaviour.</p></dd>
<dt><span class="term"><code class="computeroutput">--</code></span></dt>
<dd><p>Treats all subsequent arguments as file names,
  even if they start with a dash.  This is so you can handle
  files with names beginning with a dash, for example:
  <code class="computeroutput">bzip2 -- -myfilename</code>.</p></dd>
<dt>
<span class="term"><code class="computeroutput">--repetitive-fast</code>, </span><span class="term"><code class="computeroutput">--repetitive-best</code></span>
</dt>
<dd><p>These flags are redundant in versions 0.9.5 and
  above.  They provided some coarse control over the behaviour of
  the sorting algorithm in earlier versions, which was sometimes
  useful.  0.9.5 and above have an improved algorithm which
  renders these flags irrelevant.</p></dd>
</dl></div>
</div>
<div class="sect1" title="2.5.&nbsp;MEMORY MANAGEMENT">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="memory-management"></a>2.5.&nbsp;MEMORY MANAGEMENT</h2></div></div></div>
<p><code class="computeroutput">bzip2</code> compresses large
files in blocks.  The block size affects both the compression
ratio achieved, and the amount of memory needed for compression
and decompression.
The flags <code class="computeroutput">-1</code>
through <code class="computeroutput">-9</code> specify the block
size to be 100,000 bytes through 900,000 bytes (the default)
respectively.  At decompression time, the block size used for
compression is read from the header of the compressed file, and
<code class="computeroutput">bunzip2</code> then allocates itself
just enough memory to decompress the file.  Since block sizes are
stored in compressed files, it follows that the flags
<code class="computeroutput">-1</code> to
<code class="computeroutput">-9</code> are irrelevant to and so
ignored during decompression.</p>
<p>Compression and decompression requirements, in bytes, can be
estimated as:</p>
<pre class="programlisting">Compression:   400k + ( 8 x block size )

Decompression: 100k + ( 4 x block size ), or
               100k + ( 2.5 x block size )</pre>
<p>Larger block sizes give rapidly diminishing marginal
returns.  Most of the compression comes from the first two or
three hundred k of block size, a fact worth bearing in mind when
using <code class="computeroutput">bzip2</code> on small machines.
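</p>
<p>The estimates above translate directly into a few lines of C.
This is only a minimal sketch, not code taken from
<code class="computeroutput">bzip2</code>: the function names are
hypothetical, "k" is assumed to mean 1000 bytes, and
<code class="computeroutput">level</code> stands for the digit of the
<code class="computeroutput">-1</code> to
<code class="computeroutput">-9</code> flag.</p>

```c
/* Hypothetical helpers illustrating the manual's memory estimates.
 * The names are illustrative only and are not part of bzip2.
 * `level` is the digit of the -1 .. -9 flag, i.e. the block size in
 * units of 100,000 bytes. Results are in bytes, assuming k = 1000. */

static long block_size_bytes(int level)
{
    return 100000L * level;
}

/* Compression: 400k + ( 8 x block size ) */
long compress_mem_estimate(int level)
{
    return 400000L + 8L * block_size_bytes(level);
}

/* Decompression: 100k + ( 4 x block size ) */
long decompress_mem_estimate(int level)
{
    return 100000L + 4L * block_size_bytes(level);
}

/* Decompression with -s: 100k + ( 2.5 x block size ) */
long decompress_small_mem_estimate(int level)
{
    return 100000L + 5L * block_size_bytes(level) / 2L;
}
```

<p>Under those assumptions, level 9 gives 7,600,000 bytes for
compression, 3,700,000 bytes for normal decompression and 2,350,000
bytes for decompression with
<code class="computeroutput">-s</code>.</p>
<p>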
It is also important to appreciate that the decompression memory
requirement is set at compression time by the choice of block
size.</p>
<p>For files compressed with the default 900k block size,
<code class="computeroutput">bunzip2</code> will require about 3700
kbytes to decompress.  To support decompression of any file on a
4 megabyte machine, <code class="computeroutput">bunzip2</code> has
an option to decompress using approximately half this amount of
memory, about 2300 kbytes.  Decompression speed is also halved,
so you should use this option only where necessary.  The relevant
flag is <code class="computeroutput">-s</code>.</p>
<p>In general, try to use the largest block size memory
constraints allow, since that maximises the compression achieved.
Compression and decompression speed are virtually unaffected by
block size.</p>
<p>Another significant point applies to files which fit in a
single block -- that means most files you'd encounter using a
large block size.  The amount of real memory touched is
proportional to the size of the file, since the file is smaller
than a block.  For example, compressing a file 20,000 bytes long
with the flag <code class="computeroutput">-9</code> will cause the
compressor to allocate around 7600k of memory, but only touch
400k + 20000 * 8 = 560 kbytes of it.
Similarly, the decompressor
will allocate 3700k but only touch 100k + 20000 * 4 = 180
kbytes.</p>
<p>Here is a table which summarises the maximum memory usage
for different block sizes.  Also recorded is the total compressed
size for 14 files of the Calgary Text Compression Corpus
totalling 3,141,622 bytes.  This column gives some feel for how
compression varies with block size.  These figures tend to
understate the advantage of larger block sizes for larger files,
since the Corpus is dominated by smaller files.</p>
<pre class="programlisting">        Compress   Decompress   Decompress   Corpus
Flag     usage      usage       -s usage     Size

 -1      1200k       500k         350k      914704
 -2      2000k       900k         600k      877703
 -3      2800k      1300k         850k      860338
 -4      3600k      1700k        1100k      846899
 -5      4400k      2100k        1350k      845160
 -6      5200k      2500k        1600k      838626
 -7      6100k      2900k        1850k      834096
 -8      6800k      3300k        2100k      828642
 -9      7600k      3700k        2350k      828642</pre>
</div>
<div class="sect1" title="2.6.&nbsp;RECOVERING DATA FROM DAMAGED FILES">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="recovering"></a>2.6.&nbsp;RECOVERING DATA FROM DAMAGED FILES</h2></div></div></div>
<p><code class="computeroutput">bzip2</code> compresses files in
blocks, usually 900 kbytes long.  Each block is handled
independently.
If a media or transmission error causes a
multi-block <code class="computeroutput">.bz2</code> file to become
damaged, it may be possible to recover data from the undamaged
blocks in the file.</p>
<p>The compressed representation of each block is delimited by
a 48-bit pattern, which makes it possible to find the block
boundaries with reasonable certainty.  Each block also carries
its own 32-bit CRC, so damaged blocks can be distinguished from
undamaged ones.</p>
<p><code class="computeroutput">bzip2recover</code> is a simple
program whose purpose is to search for blocks in
<code class="computeroutput">.bz2</code> files, and write each block
out into its own <code class="computeroutput">.bz2</code> file.  You
can then use <code class="computeroutput">bzip2 -t</code> to test
the integrity of the resulting files, and decompress those which
are undamaged.</p>
<p><code class="computeroutput">bzip2recover</code> takes a
single argument, the name of the damaged file, and writes a
number of files <code class="computeroutput">rec0001file.bz2</code>,
<code class="computeroutput">rec0002file.bz2</code>, etc, containing
the extracted blocks.
The output filenames are designed so that
the use of wildcards in subsequent processing -- for example,
<code class="computeroutput">bzip2 -dc rec*file.bz2 &gt; recovered_data</code>
-- lists the files in the correct
order.</p>
<p><code class="computeroutput">bzip2recover</code> should be of
most use dealing with large <code class="computeroutput">.bz2</code>
files, as these will contain many blocks.  It is clearly futile
to use it on damaged single-block files, since a damaged block
cannot be recovered.  If you wish to minimise any potential data
loss through media or transmission errors, you might consider
compressing with a smaller block size.</p>
</div>
<div class="sect1" title="2.7.&nbsp;PERFORMANCE NOTES">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="performance"></a>2.7.&nbsp;PERFORMANCE NOTES</h2></div></div></div>
<p>The sorting phase of compression gathers together similar
strings in the file.  Because of this, files containing very long
runs of repeated symbols, like "aabaabaabaab ..." (repeated
several hundred times) may compress more slowly than normal.
Versions 0.9.5 and above fare much better than previous versions
in this respect.  The ratio between worst-case and average-case
compression time is in the region of 10:1.  For previous
versions, this figure was more like 100:1.
You can use the
<code class="computeroutput">-vvvv</code> option to monitor progress
in great detail, if you want.</p>
<p>Decompression speed is unaffected by these
phenomena.</p>
<p><code class="computeroutput">bzip2</code> usually allocates
several megabytes of memory to operate in, and then charges all
over it in a fairly random fashion.  This means that performance,
both for compressing and decompressing, is largely determined by
the speed at which your machine can service cache misses.
Because of this, small changes to the code to reduce the miss
rate have been observed to give disproportionately large
performance improvements.  I imagine
<code class="computeroutput">bzip2</code> will perform best on
machines with very large caches.</p>
</div>
<div class="sect1" title="2.8.&nbsp;CAVEATS">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="caveats"></a>2.8.&nbsp;CAVEATS</h2></div></div></div>
<p>I/O error messages are not as helpful as they could be.
<code class="computeroutput">bzip2</code> tries hard to detect I/O
errors and exit cleanly, but the details of what the problem is
sometimes seem rather misleading.</p>
<p>This manual page pertains to version 1.0.6 of
<code class="computeroutput">bzip2</code>.
Compressed data created by
this version is entirely forwards and backwards compatible with the
previous public releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0,
1.0.1, 1.0.2 and 1.0.3, with one exception: 0.9.0 and
above can correctly decompress multiple concatenated compressed files.
0.1pl2 cannot do this; it will stop after decompressing just the first
file in the stream.</p>
<p><code class="computeroutput">bzip2recover</code> versions
prior to 1.0.2 used 32-bit integers to represent bit positions in
compressed files, so they could not handle compressed files more
than 512 megabytes long.  Versions 1.0.2 and above use 64-bit ints
on some platforms which support them (GNU supported targets, and
Windows).  To establish whether or not
<code class="computeroutput">bzip2recover</code> was built with such
a limitation, run it without arguments.
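</p>
<p>The 512 megabyte figure follows from the width of the bit
position: 2^32 bit positions at 8 bits per byte cover 2^29 bytes.
The helper below is a hypothetical illustration of that arithmetic,
assuming the positions are treated as unsigned; it is not code from
<code class="computeroutput">bzip2recover</code>.</p>

```c
/* Hypothetical helper, for illustration only: the largest file size in
 * bytes whose every bit position fits in an unsigned integer that is
 * `bits` wide. Valid for 3 <= bits <= 66 with a 64-bit
 * unsigned long long. */
unsigned long long max_file_bytes(unsigned bits)
{
    /* 2^bits bit positions at 8 bits per byte is 2^(bits - 3) bytes. */
    return 1ULL << (bits - 3);
}
```

<p>With 32-bit positions this gives 536,870,912 bytes, which is 512
times 2^20 bytes.</p>
<p>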
In any event you can
build yourself an unlimited version if you can recompile it with
<code class="computeroutput">MaybeUInt64</code> set to be an
unsigned 64-bit integer.</p>
</div>
<div class="sect1" title="2.9.&nbsp;AUTHOR">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="author"></a>2.9.&nbsp;AUTHOR</h2></div></div></div>
<p>Julian Seward,
<code class="computeroutput">jseward@bzip.org</code></p>
<p>The ideas embodied in
<code class="computeroutput">bzip2</code> are due to (at least) the
following people: Michael Burrows and David Wheeler (for the
block sorting transformation), David Wheeler (again, for the
Huffman coder), Peter Fenwick (for the structured coding model in
the original <code class="computeroutput">bzip</code>, and many
refinements), and Alistair Moffat, Radford Neal and Ian Witten
(for the arithmetic coder in the original
<code class="computeroutput">bzip</code>).  I am much indebted for
their help, support and advice.  See the manual in the source
distribution for pointers to sources of documentation.  Christian
von Roques encouraged me to look for faster sorting algorithms,
so as to speed up compression.  Bela Lubkin encouraged me to
improve the worst-case compression performance.
Donna Robinson XMLised the documentation.
Many people sent
patches, helped with portability problems, lent machines, gave
advice and were generally helpful.</p>
</div>
</div>
<div class="chapter" title="3.&nbsp;Programming with libbzip2">
<div class="titlepage"><div><div><h2 class="title">
<a name="libprog"></a>3.&nbsp;
Programming with <code class="computeroutput">libbzip2</code>
</h2></div></div></div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="sect1"><a href="#top-level">3.1. Top-level structure</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#ll-summary">3.1.1. Low-level summary</a></span></dt>
<dt><span class="sect2"><a href="#hl-summary">3.1.2. High-level summary</a></span></dt>
<dt><span class="sect2"><a href="#util-fns-summary">3.1.3. Utility functions summary</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#err-handling">3.2. Error handling</a></span></dt>
<dt><span class="sect1"><a href="#low-level">3.3. Low-level interface</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#bzcompress-init">3.3.1. BZ2_bzCompressInit</a></span></dt>
<dt><span class="sect2"><a href="#bzCompress">3.3.2. BZ2_bzCompress</a></span></dt>
<dt><span class="sect2"><a href="#bzCompress-end">3.3.3. BZ2_bzCompressEnd</a></span></dt>
<dt><span class="sect2"><a href="#bzDecompress-init">3.3.4.
BZ2_bzDecompressInit</a></span></dt>
<dt><span class="sect2"><a href="#bzDecompress">3.3.5. BZ2_bzDecompress</a></span></dt>
<dt><span class="sect2"><a href="#bzDecompress-end">3.3.6. BZ2_bzDecompressEnd</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#hl-interface">3.4. High-level interface</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#bzreadopen">3.4.1. BZ2_bzReadOpen</a></span></dt>
<dt><span class="sect2"><a href="#bzread">3.4.2. BZ2_bzRead</a></span></dt>
<dt><span class="sect2"><a href="#bzreadgetunused">3.4.3. BZ2_bzReadGetUnused</a></span></dt>
<dt><span class="sect2"><a href="#bzreadclose">3.4.4. BZ2_bzReadClose</a></span></dt>
<dt><span class="sect2"><a href="#bzwriteopen">3.4.5. BZ2_bzWriteOpen</a></span></dt>
<dt><span class="sect2"><a href="#bzwrite">3.4.6. BZ2_bzWrite</a></span></dt>
<dt><span class="sect2"><a href="#bzwriteclose">3.4.7. BZ2_bzWriteClose</a></span></dt>
<dt><span class="sect2"><a href="#embed">3.4.8. Handling embedded compressed data streams</a></span></dt>
<dt><span class="sect2"><a href="#std-rdwr">3.4.9. Standard file-reading/writing code</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#util-fns">3.5. Utility functions</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#bzbufftobuffcompress">3.5.1. BZ2_bzBuffToBuffCompress</a></span></dt>
<dt><span class="sect2"><a href="#bzbufftobuffdecompress">3.5.2.
BZ2_bzBuffToBuffDecompress</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#zlib-compat">3.6. zlib compatibility functions</a></span></dt>
<dt><span class="sect1"><a href="#stdio-free">3.7. Using the library in a stdio-free environment</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="#stdio-bye">3.7.1. Getting rid of stdio</a></span></dt>
<dt><span class="sect2"><a href="#critical-error">3.7.2. Critical error handling</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="#win-dll">3.8. Making a Windows DLL</a></span></dt>
</dl>
</div>
<p>This chapter describes the programming interface to
<code class="computeroutput">libbzip2</code>.</p>
<p>For general background information, particularly about
memory use and performance aspects, you'd be well advised to read
<a class="xref" href="#using" title="2.&nbsp;How to use bzip2">How to use bzip2</a> as well.</p>
<div class="sect1" title="3.1.&nbsp;Top-level structure">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="top-level"></a>3.1.&nbsp;Top-level structure</h2></div></div></div>
<p><code class="computeroutput">libbzip2</code> is a flexible
library for compressing and decompressing data in the
<code class="computeroutput">bzip2</code> data format.
Although
packaged as a single entity, it helps to regard the library as
three separate parts: the low-level interface, the high-level
interface, and some utility functions.</p>
<p>The structure of
<code class="computeroutput">libbzip2</code>'s interfaces is similar
to that of Jean-loup Gailly's and Mark Adler's excellent
<code class="computeroutput">zlib</code> library.</p>
<p>All externally visible symbols have names beginning
<code class="computeroutput">BZ2_</code>.  This is new in version
1.0.  The intention is to minimise pollution of the namespaces of
library clients.</p>
<p>To use any part of the library, you need to
<code class="computeroutput">#include &lt;bzlib.h&gt;</code>
into your sources.</p>
<div class="sect2" title="3.1.1.&nbsp;Low-level summary">
<div class="titlepage"><div><div><h3 class="title">
<a name="ll-summary"></a>3.1.1.&nbsp;Low-level summary</h3></div></div></div>
<p>This interface provides services for compressing and
decompressing data in memory.  There's no provision for dealing
with files, streams or any other I/O mechanisms, just straight
memory-to-memory work.
In fact, this part of the library can be
compiled without inclusion of
<code class="computeroutput">stdio.h</code>, which may be helpful
for embedded applications.</p>
<p>The low-level part of the library has no global variables
and is therefore thread-safe.</p>
<p>Six routines make up the low-level interface:
<code class="computeroutput">BZ2_bzCompressInit</code>,
<code class="computeroutput">BZ2_bzCompress</code>, and
<code class="computeroutput">BZ2_bzCompressEnd</code> for
compression, and a corresponding trio
<code class="computeroutput">BZ2_bzDecompressInit</code>,
<code class="computeroutput">BZ2_bzDecompress</code> and
<code class="computeroutput">BZ2_bzDecompressEnd</code> for
decompression.  The <code class="computeroutput">*Init</code>
functions allocate memory for compression/decompression and do
other initialisations, whilst the
<code class="computeroutput">*End</code> functions close down
operations and release memory.</p>
<p>The real work is done by
<code class="computeroutput">BZ2_bzCompress</code> and
<code class="computeroutput">BZ2_bzDecompress</code>.  These
compress and decompress data from a user-supplied input buffer to
a user-supplied output buffer.  These buffers can be any size;
arbitrary quantities of data are handled by making repeated calls
to these functions.
This is a flexible mechanism allowing a
consumer-pull style of activity, or producer-push, or a mixture
of both.</p>
</div>
<div class="sect2" title="3.1.2.&nbsp;High-level summary">
<div class="titlepage"><div><div><h3 class="title">
<a name="hl-summary"></a>3.1.2.&nbsp;High-level summary</h3></div></div></div>
<p>This interface provides some handy wrappers around the
low-level interface to facilitate reading and writing
<code class="computeroutput">bzip2</code> format files
(<code class="computeroutput">.bz2</code> files).  The routines
provide hooks to facilitate reading files in which the
<code class="computeroutput">bzip2</code> data stream is embedded
within some larger-scale file structure, or where there are
multiple <code class="computeroutput">bzip2</code> data streams
concatenated end-to-end.</p>
<p>For reading files,
<code class="computeroutput">BZ2_bzReadOpen</code>,
<code class="computeroutput">BZ2_bzRead</code>,
<code class="computeroutput">BZ2_bzReadClose</code> and
<code class="computeroutput">BZ2_bzReadGetUnused</code> are
supplied.
For writing files,
<code class="computeroutput">BZ2_bzWriteOpen</code>,
<code class="computeroutput">BZ2_bzWrite</code> and
<code class="computeroutput">BZ2_bzWriteClose</code> are
available.</p>
<p>As with the low-level library, no global variables are used
so the library is per se thread-safe. However, if I/O errors
occur whilst reading or writing the underlying compressed files,
you may have to consult <code class="computeroutput">errno</code> to
determine the cause of the error. In that case, you'd need a C
library which correctly supports
<code class="computeroutput">errno</code> in a multithreaded
environment.</p>
<p>To make the library a little simpler and more portable,
<code class="computeroutput">BZ2_bzReadOpen</code> and
<code class="computeroutput">BZ2_bzWriteOpen</code> require you to
pass them file handles (<code class="computeroutput">FILE*</code>s)
which have previously been opened for reading or writing
respectively.
That avoids portability problems associated with
file operations and file attributes, whilst not being much of an
imposition on the programmer.</p>
</div>
<div class="sect2" title="3.1.3.&nbsp;Utility functions summary">
<div class="titlepage"><div><div><h3 class="title">
<a name="util-fns-summary"></a>3.1.3.&nbsp;Utility functions summary</h3></div></div></div>
<p>For very simple needs,
<code class="computeroutput">BZ2_bzBuffToBuffCompress</code> and
<code class="computeroutput">BZ2_bzBuffToBuffDecompress</code> are
provided. These compress and decompress data in memory from one
buffer to another buffer in a single function call. You should assess
whether these functions fulfill your memory-to-memory
compression/decompression requirements before investing effort in
understanding the more general but more complex low-level
interface.</p>
<p>Yoshioka Tsuneo
(<code class="computeroutput">tsuneo@rr.iij4u.or.jp</code>) has
contributed some functions to give better
<code class="computeroutput">zlib</code> compatibility.
These
functions are <code class="computeroutput">BZ2_bzopen</code>,
<code class="computeroutput">BZ2_bzread</code>,
<code class="computeroutput">BZ2_bzwrite</code>,
<code class="computeroutput">BZ2_bzflush</code>,
<code class="computeroutput">BZ2_bzclose</code>,
<code class="computeroutput">BZ2_bzerror</code> and
<code class="computeroutput">BZ2_bzlibVersion</code>. You may find
these functions more convenient for simple file reading and
writing than those in the high-level interface. These functions
are not (yet) officially part of the library, and are minimally
documented here. If they break, you get to keep all the pieces.
I hope to document them properly when time permits.</p>
<p>Yoshioka also contributed modifications to allow the
library to be built as a Windows DLL.</p>
</div>
</div>
<div class="sect1" title="3.2.&nbsp;Error handling">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="err-handling"></a>3.2.&nbsp;Error handling</h2></div></div></div>
<p>The library is designed to recover cleanly in all
situations, including the worst-case situation of decompressing
random data. I'm not 100% sure that it can always do this, so
you might want to add a signal handler to catch segmentation
violations during decompression if you are feeling especially
paranoid.
I would be interested in hearing more about the
robustness of the library to corrupted compressed data.</p>
<p>Version 1.0.3 is more robust in this respect than any
previous version. Investigations with Valgrind (a tool for detecting
problems with memory management) indicate
that, at least for the few files I tested, all single-bit errors
in the compressed data are caught properly, with no
segmentation faults, no uses of uninitialised data, no
out-of-range reads or writes, and no infinite looping in the decompressor.
So it's certainly pretty robust, although
I wouldn't claim it to be totally bombproof.</p>
<p>The file <code class="computeroutput">bzlib.h</code> contains
all definitions needed to use the library. In particular, you
should definitely not include
<code class="computeroutput">bzlib_private.h</code>.</p>
<p>In <code class="computeroutput">bzlib.h</code>, the various
return values are defined. The following list is not intended as
an exhaustive description of the circumstances in which a given
value may be returned -- those descriptions are given later.
Rather, it is intended to convey the rough meaning of each return
value.
The first five values are normal and not intended to
denote an error situation.</p>
<div class="variablelist"><dl>
<dt><span class="term"><code class="computeroutput">BZ_OK</code></span></dt>
<dd><p>The requested action was completed
  successfully.</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_RUN_OK, BZ_FLUSH_OK,
  BZ_FINISH_OK</code></span></dt>
<dd><p>In
  <code class="computeroutput">BZ2_bzCompress</code>, the requested
  flush/finish/nothing-special action was completed
  successfully.</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_STREAM_END</code></span></dt>
<dd><p>Compression of data was completed, or the
  logical stream end was detected during
  decompression.</p></dd>
</dl></div>
<p>The following return values indicate an error of some
kind.</p>
<div class="variablelist"><dl>
<dt><span class="term"><code class="computeroutput">BZ_CONFIG_ERROR</code></span></dt>
<dd><p>Indicates that the library has been improperly
  compiled on your platform -- a major configuration error.
  Specifically, it means that
  <code class="computeroutput">sizeof(char)</code>,
  <code class="computeroutput">sizeof(short)</code> and
  <code class="computeroutput">sizeof(int)</code> are not 1, 2 and
  4 respectively, as they should be.
  Note that the library
  should still work properly on 64-bit platforms which follow
  the LP64 programming model -- that is, where
  <code class="computeroutput">sizeof(long)</code> and
  <code class="computeroutput">sizeof(void*)</code> are 8. Under
  LP64, <code class="computeroutput">sizeof(int)</code> is still 4,
  so <code class="computeroutput">libbzip2</code>, which doesn't
  use the <code class="computeroutput">long</code> type, is
  OK.</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_SEQUENCE_ERROR</code></span></dt>
<dd><p>When using the library, it is important to call
  the functions in the correct sequence and with data structures
  (buffers etc) in the correct states.
  <code class="computeroutput">libbzip2</code> checks as much as it
  can to ensure this is happening, and returns
  <code class="computeroutput">BZ_SEQUENCE_ERROR</code> if not.
  Code which complies precisely with the function semantics, as
  detailed below, should never receive this value; such an event
  denotes buggy code which you should
  investigate.</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_PARAM_ERROR</code></span></dt>
<dd><p>Returned when a parameter to a function call is
  out of range or otherwise manifestly incorrect. As with
  <code class="computeroutput">BZ_SEQUENCE_ERROR</code>, this
  denotes a bug in the client code.
  The distinction between
  <code class="computeroutput">BZ_PARAM_ERROR</code> and
  <code class="computeroutput">BZ_SEQUENCE_ERROR</code> is a bit
  hazy, but still worth making.</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_MEM_ERROR</code></span></dt>
<dd><p>Returned when a request to allocate memory
  failed. Note that the quantity of memory needed to decompress
  a stream cannot be determined until the stream's header has
  been read. So
  <code class="computeroutput">BZ2_bzDecompress</code> and
  <code class="computeroutput">BZ2_bzRead</code> may return
  <code class="computeroutput">BZ_MEM_ERROR</code> even though some
  of the compressed data has been read. The same is not true
  for compression; once
  <code class="computeroutput">BZ2_bzCompressInit</code> or
  <code class="computeroutput">BZ2_bzWriteOpen</code> have
  successfully completed,
  <code class="computeroutput">BZ_MEM_ERROR</code> cannot
  occur.</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_DATA_ERROR</code></span></dt>
<dd><p>Returned when a data integrity error is
  detected during decompression. Most importantly, it is returned
  when the stored and computed CRCs for the data do not match.
  This
  value is also returned upon detection of any other anomaly in
  the compressed data.</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_DATA_ERROR_MAGIC</code></span></dt>
<dd><p>As a special case of
  <code class="computeroutput">BZ_DATA_ERROR</code>, it is
  sometimes useful to know when the compressed stream does not
  start with the correct magic bytes (<code class="computeroutput">'B' 'Z'
  'h'</code>).</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_IO_ERROR</code></span></dt>
<dd><p>Returned by
  <code class="computeroutput">BZ2_bzRead</code> and
  <code class="computeroutput">BZ2_bzWrite</code> when there is an
  error reading or writing in the compressed file, and by
  <code class="computeroutput">BZ2_bzReadOpen</code> and
  <code class="computeroutput">BZ2_bzWriteOpen</code> for attempts
  to use a file for which the error indicator (viz,
  <code class="computeroutput">ferror(f)</code>) is set.
  On
  receipt of <code class="computeroutput">BZ_IO_ERROR</code>, the
  caller should consult <code class="computeroutput">errno</code>
  and/or <code class="computeroutput">perror</code> to acquire
  operating-system specific information about the
  problem.</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_UNEXPECTED_EOF</code></span></dt>
<dd><p>Returned by
  <code class="computeroutput">BZ2_bzRead</code> when the
  compressed file finishes before the logical end of stream is
  detected.</p></dd>
<dt><span class="term"><code class="computeroutput">BZ_OUTBUFF_FULL</code></span></dt>
<dd><p>Returned by
  <code class="computeroutput">BZ2_bzBuffToBuffCompress</code> and
  <code class="computeroutput">BZ2_bzBuffToBuffDecompress</code> to
  indicate that the output data will not fit into the output
  buffer provided.</p></dd>
</dl></div>
</div>
<div class="sect1" title="3.3.&nbsp;Low-level interface">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="low-level"></a>3.3.&nbsp;Low-level interface</h2></div></div></div>
<div class="sect2" title="3.3.1.&nbsp;BZ2_bzCompressInit">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzcompress-init"></a>3.3.1.&nbsp;BZ2_bzCompressInit</h3></div></div></div>
<pre class="programlisting">typedef struct {
  char *next_in;
  unsigned int avail_in;
  unsigned int total_in_lo32;
  unsigned int total_in_hi32;

  char *next_out;
  unsigned int avail_out;
  unsigned int total_out_lo32;
  unsigned int total_out_hi32;

  void *state;

  void *(*bzalloc)(void *,int,int);
  void (*bzfree)(void *,void *);
  void *opaque;
} bz_stream;

int BZ2_bzCompressInit ( bz_stream *strm,
                         int blockSize100k,
                         int verbosity,
                         int workFactor );</pre>
<p>Prepares for compression. The
<code class="computeroutput">bz_stream</code> structure holds all
data pertaining to the compression activity. A
<code class="computeroutput">bz_stream</code> structure should be
allocated and initialised prior to the call. The fields of
<code class="computeroutput">bz_stream</code> comprise the entirety
of the user-visible data. <code class="computeroutput">state</code>
is a pointer to the private data structures required for
compression.</p>
<p>Custom memory allocators are supported, via fields
<code class="computeroutput">bzalloc</code>,
<code class="computeroutput">bzfree</code>, and
<code class="computeroutput">opaque</code>.
The value
<code class="computeroutput">opaque</code> is passed as the first
argument to all calls to <code class="computeroutput">bzalloc</code>
and <code class="computeroutput">bzfree</code>, but is otherwise
ignored by the library. The call <code class="computeroutput">bzalloc (
opaque, n, m )</code> is expected to return a pointer
<code class="computeroutput">p</code> to <code class="computeroutput">n *
m</code> bytes of memory, and <code class="computeroutput">bzfree (
opaque, p )</code> should free that memory.</p>
<p>If you don't want to use a custom memory allocator, set
<code class="computeroutput">bzalloc</code>,
<code class="computeroutput">bzfree</code> and
<code class="computeroutput">opaque</code> to
<code class="computeroutput">NULL</code>, and the library will then
use the standard <code class="computeroutput">malloc</code> /
<code class="computeroutput">free</code> routines.</p>
<p>Before calling
<code class="computeroutput">BZ2_bzCompressInit</code>, fields
<code class="computeroutput">bzalloc</code>,
<code class="computeroutput">bzfree</code> and
<code class="computeroutput">opaque</code> should be filled
appropriately, as just described.
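</p>
<p>As a rough illustration of the allocator hooks just described,
the following sketch shows a pair of functions with the
<code class="computeroutput">bzalloc</code> and
<code class="computeroutput">bzfree</code> signatures that count
allocations through the <code class="computeroutput">opaque</code>
pointer. The names <code class="computeroutput">alloc_stats</code>,
<code class="computeroutput">counting_alloc</code> and
<code class="computeroutput">counting_free</code> are hypothetical
and not part of the library; the overflow check is an extra
precaution rather than something the library requires.</p>

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record, handed to the hooks via `opaque`. */
typedef struct {
    size_t allocs;   /* successful allocations so far */
    size_t frees;    /* non-NULL blocks released so far */
} alloc_stats;

/* Same shape as the bzalloc field: return n * m bytes, or NULL on failure. */
static void *counting_alloc(void *opaque, int n, int m)
{
    alloc_stats *st = opaque;
    size_t count, size, bytes;
    void *p;

    if (n < 0 || m < 0)
        return NULL;
    count = (size_t)n;
    size = (size_t)m;
    /* Refuse requests whose byte count would not fit in size_t. */
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;
    bytes = count * size;
    p = malloc(bytes != 0 ? bytes : 1);
    if (p != NULL && st != NULL)
        st->allocs++;
    return p;
}

/* Same shape as the bzfree field: release a block from counting_alloc. */
static void counting_free(void *opaque, void *p)
{
    alloc_stats *st = opaque;

    if (p == NULL)
        return;
    if (st != NULL)
        st->frees++;
    free(p);
}
```

<p>Under these assumptions, a caller would assign
<code class="computeroutput">counting_alloc</code>,
<code class="computeroutput">counting_free</code> and the address of
an <code class="computeroutput">alloc_stats</code> record to
<code class="computeroutput">bzalloc</code>,
<code class="computeroutput">bzfree</code> and
<code class="computeroutput">opaque</code> before calling
<code class="computeroutput">BZ2_bzCompressInit</code>.</p>
<p>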
Upon return, the internal
state will have been allocated and initialised, and
<code class="computeroutput">total_in_lo32</code>,
<code class="computeroutput">total_in_hi32</code>,
<code class="computeroutput">total_out_lo32</code> and
<code class="computeroutput">total_out_hi32</code> will have been
set to zero. These four fields are used by the library to inform
the caller of the total amount of data passed into and out of the
library, respectively. You should not try to change them. As of
version 1.0, 64-bit counts are maintained, even on 32-bit
platforms, using the <code class="computeroutput">_hi32</code>
fields to store the upper 32 bits of the count. So, for example,
the total amount of data in is <code class="computeroutput">(total_in_hi32
&lt;&lt; 32) + total_in_lo32</code>.</p>
<p>Parameter <code class="computeroutput">blockSize100k</code>
specifies the block size to be used for compression. It should
be a value between 1 and 9 inclusive, and the actual block size
used is 100000 x this figure. 9 gives the best compression but
takes the most memory.</p>
<p>Parameter <code class="computeroutput">verbosity</code> should
be set to a number between 0 and 4 inclusive. 0 is silent, and
greater numbers give increasingly verbose monitoring/debugging
output.
If the library has been compiled with
<code class="computeroutput">-DBZ_NO_STDIO</code>, no such output
will appear for any verbosity setting.</p>
<p>Parameter <code class="computeroutput">workFactor</code>
controls how the compression phase behaves when presented with
worst case, highly repetitive, input data. If compression runs
into difficulties caused by repetitive data, the library switches
from the standard sorting algorithm to a fallback algorithm. The
fallback is slower than the standard algorithm by perhaps a
factor of three, but always behaves reasonably, no matter how bad
the input.</p>
<p>Lower values of <code class="computeroutput">workFactor</code>
reduce the amount of effort the standard algorithm will expend
before resorting to the fallback. You should set this parameter
carefully: too low, and many inputs will be handled by the
fallback algorithm and so compress rather slowly; too high, and
your average-to-worst case compression times can become very
large. The default value of 30 gives reasonable behaviour over a
wide range of circumstances.</p>
<p>Allowable values range from 0 to 250 inclusive.
0 is a
special case, equivalent to using the default value of 30.</p>
<p>Note that the compressed output generated is the same
regardless of whether or not the fallback algorithm is
used.</p>
<p>Be aware also that this parameter may disappear entirely in
future versions of the library. In principle it should be
possible to devise a good way to automatically choose which
algorithm to use. Such a mechanism would render the parameter
obsolete.</p>
<p>Possible return values:</p>
<pre class="programlisting">BZ_CONFIG_ERROR
  if the library has been mis-compiled
BZ_PARAM_ERROR
  if strm is NULL
  or blockSize &lt; 1 or blockSize &gt; 9
  or verbosity &lt; 0 or verbosity &gt; 4
  or workFactor &lt; 0 or workFactor &gt; 250
BZ_MEM_ERROR
  if not enough memory is available
BZ_OK
  otherwise</pre>
<p>Allowable next actions:</p>
<pre class="programlisting">BZ2_bzCompress
  if BZ_OK is returned
  no specific action needed in case of error</pre>
</div>
<div class="sect2" title="3.3.2.&nbsp;BZ2_bzCompress">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzCompress"></a>3.3.2.&nbsp;BZ2_bzCompress</h3></div></div></div>
<pre class="programlisting">int BZ2_bzCompress ( bz_stream *strm, int action );</pre>
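<p>Looking back briefly at
<code class="computeroutput">BZ2_bzCompressInit</code>, its
parameter rules and 64-bit counters can be sketched in plain C
without linking against the library. This is only an illustration:
<code class="computeroutput">sketch_check_init_params</code>,
<code class="computeroutput">sketch_total64</code> and the
<code class="computeroutput">SKETCH_*</code> constants are
hypothetical names, and the two status values are assumed to mirror
<code class="computeroutput">BZ_OK</code> and
<code class="computeroutput">BZ_PARAM_ERROR</code> from
<code class="computeroutput">bzlib.h</code>. The first helper applies
the documented range checks and the
<code class="computeroutput">workFactor</code> default. The second
combines a <code class="computeroutput">_hi32</code> /
<code class="computeroutput">_lo32</code> pair, widening before the
shift because shifting a 32-bit
<code class="computeroutput">unsigned int</code> by 32 is not defined
in C.</p>

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; real code would use
   the BZ_OK and BZ_PARAM_ERROR macros from bzlib.h. */
enum { SKETCH_OK = 0, SKETCH_PARAM_ERROR = -2 };
enum { SKETCH_DEFAULT_WORK_FACTOR = 30 };

/* Apply the documented range checks. A workFactor of 0 is rewritten
   to the default of 30; other values are left unchanged. */
static int sketch_check_init_params(int blockSize100k, int verbosity,
                                    int *workFactor)
{
    if (workFactor == NULL)
        return SKETCH_PARAM_ERROR;
    if (blockSize100k < 1 || blockSize100k > 9)
        return SKETCH_PARAM_ERROR;
    if (verbosity < 0 || verbosity > 4)
        return SKETCH_PARAM_ERROR;
    if (*workFactor < 0 || *workFactor > 250)
        return SKETCH_PARAM_ERROR;
    if (*workFactor == 0)
        *workFactor = SKETCH_DEFAULT_WORK_FACTOR;
    return SKETCH_OK;
}

/* Combine the upper and lower 32 bits of a byte count. The cast
   widens hi32 first so that the shift by 32 is well defined. */
static unsigned long long sketch_total64(unsigned int hi32,
                                         unsigned int lo32)
{
    return ((unsigned long long)hi32 << 32) + lo32;
}
```

<p>With a real stream, the second helper would be called as
<code class="computeroutput">sketch_total64(strm.total_in_hi32,
strm.total_in_lo32)</code>.</p>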
<p>Provides more input and/or output buffer space for the
library. The caller maintains input and output buffers, and
calls <code class="computeroutput">BZ2_bzCompress</code> to transfer
data between them.</p>
<p>Before each call to
<code class="computeroutput">BZ2_bzCompress</code>,
<code class="computeroutput">next_in</code> should point at the data
to be compressed, and <code class="computeroutput">avail_in</code>
should indicate how many bytes the library may read.
<code class="computeroutput">BZ2_bzCompress</code> updates
<code class="computeroutput">next_in</code>,
<code class="computeroutput">avail_in</code> and the
<code class="computeroutput">total_in_lo32</code> /
<code class="computeroutput">total_in_hi32</code> pair to reflect the number
of bytes it has read.</p>
<p>Similarly, <code class="computeroutput">next_out</code> should
point to a buffer in which the compressed data is to be placed,
with <code class="computeroutput">avail_out</code> indicating how
much output space is available.
<code class="computeroutput">BZ2_bzCompress</code> updates
<code class="computeroutput">next_out</code>,
<code class="computeroutput">avail_out</code> and the
<code class="computeroutput">total_out_lo32</code> /
<code class="computeroutput">total_out_hi32</code> pair to reflect the number
of bytes output.</p>
<p>You may provide and remove as little or as much data as you
like on each call of
<code class="computeroutput">BZ2_bzCompress</code>. In the limit,
it is acceptable to supply and remove data one byte at a time,
although this would be terribly inefficient. You should always
ensure that at least one byte of output space is available at
each call.</p>
<p>A second purpose of
<code class="computeroutput">BZ2_bzCompress</code> is to request a
change of mode of the compressed stream.</p>
<p>Conceptually, a compressed stream can be in one of four
states: IDLE, RUNNING, FLUSHING and FINISHING. Before
initialisation
(<code class="computeroutput">BZ2_bzCompressInit</code>) and after
termination (<code class="computeroutput">BZ2_bzCompressEnd</code>),
a stream is regarded as IDLE.</p>
<p>Upon initialisation
(<code class="computeroutput">BZ2_bzCompressInit</code>), the stream
is placed in the RUNNING state.
Subsequent calls to
<code class="computeroutput">BZ2_bzCompress</code> should pass
<code class="computeroutput">BZ_RUN</code> as the requested action;
other actions are illegal and will result in
<code class="computeroutput">BZ_SEQUENCE_ERROR</code>.</p>
<p>At some point, the calling program will have provided all
the input data it wants to. It will then want to finish up -- in
effect, asking the library to process any data it might have
buffered internally. In this state,
<code class="computeroutput">BZ2_bzCompress</code> will no longer
attempt to read data from
<code class="computeroutput">next_in</code>, but it will want to
write data to <code class="computeroutput">next_out</code>. Because
the output buffer supplied by the user can be arbitrarily small,
the finishing-up operation cannot necessarily be done with a
single call of
<code class="computeroutput">BZ2_bzCompress</code>.</p>
<p>Instead, the calling program passes
<code class="computeroutput">BZ_FINISH</code> as an action to
<code class="computeroutput">BZ2_bzCompress</code>. This changes
the stream's state to FINISHING. Any remaining input (i.e.,
<code class="computeroutput">next_in[0 .. avail_in-1]</code>) is
compressed and transferred to the output buffer.
To do this,
<code class="computeroutput">BZ2_bzCompress</code> must be called
repeatedly until all the output has been consumed. At that
point, <code class="computeroutput">BZ2_bzCompress</code> returns
<code class="computeroutput">BZ_STREAM_END</code>, and the stream's
state is set back to IDLE.
<code class="computeroutput">BZ2_bzCompressEnd</code> should then be
called.</p>
<p>Just to make sure the calling program does not cheat, the
library makes a note of <code class="computeroutput">avail_in</code>
at the time of the first call to
<code class="computeroutput">BZ2_bzCompress</code> which has
<code class="computeroutput">BZ_FINISH</code> as an action (i.e., at
the time the program has announced its intention not to supply
any more input). By comparing this value with that of
<code class="computeroutput">avail_in</code> over subsequent calls
to <code class="computeroutput">BZ2_bzCompress</code>, the library
can detect any attempts to slip in more data to compress. Any
calls for which this is detected will return
<code class="computeroutput">BZ_SEQUENCE_ERROR</code>.
This
indicates a programming mistake which should be corrected.</p>
<p>Instead of asking to finish, the calling program may ask
<code class="computeroutput">BZ2_bzCompress</code> to take all the
remaining input, compress it and terminate the current
(Burrows-Wheeler) compression block. This could be useful for
error control purposes. The mechanism is analogous to that for
finishing: call <code class="computeroutput">BZ2_bzCompress</code>
with an action of <code class="computeroutput">BZ_FLUSH</code>,
remove output data, and persist with the
<code class="computeroutput">BZ_FLUSH</code> action until the value
<code class="computeroutput">BZ_RUN_OK</code> is returned. As with
finishing, <code class="computeroutput">BZ2_bzCompress</code>
detects any attempt to provide more input data once the flush has
begun.</p>
<p>Once the flush is complete, the stream returns to the
normal RUNNING state.</p>
<p>This all sounds pretty complex, but isn't really. Here's a
table which shows which actions are allowable in each state, what
action will be taken, what the next state is, and what the
non-error return values are.
Note that you can't explicitly ask
what state the stream is in, but nor do you need to -- it can be
inferred from the values returned by
<code class="computeroutput">BZ2_bzCompress</code>.</p>
<pre class="programlisting">IDLE/any
  Illegal.  IDLE state only exists after BZ2_bzCompressEnd or
  before BZ2_bzCompressInit.
  Return value = BZ_SEQUENCE_ERROR

RUNNING/BZ_RUN
  Compress from next_in to next_out as much as possible.
  Next state = RUNNING
  Return value = BZ_RUN_OK

RUNNING/BZ_FLUSH
  Remember current value of avail_in.  Compress from next_in
  to next_out as much as possible, but do not accept any more input.
  Next state = FLUSHING
  Return value = BZ_FLUSH_OK

RUNNING/BZ_FINISH
  Remember current value of avail_in.  Compress from next_in
  to next_out as much as possible, but do not accept any more input.
  Next state = FINISHING
  Return value = BZ_FINISH_OK

FLUSHING/BZ_FLUSH
  Compress from next_in to next_out as much as possible,
  but do not accept any more input.
  If all the existing input has been used up and all compressed
  output has been removed
    Next state = RUNNING; Return value = BZ_RUN_OK
  else
    Next state = FLUSHING; Return value = BZ_FLUSH_OK

FLUSHING/other
  Illegal.
  Return value = BZ_SEQUENCE_ERROR

FINISHING/BZ_FINISH
  Compress from next_in to next_out as much as possible,
  but do not accept any more input.
  If all the existing input has been used up and all compressed
  output has been removed
    Next state = IDLE; Return value = BZ_STREAM_END
  else
    Next state = FINISHING; Return value = BZ_FINISH_OK

FINISHING/other
  Illegal.
  Return value = BZ_SEQUENCE_ERROR</pre>
<p>That still looks complicated?  Well, fair enough.
The
usual sequence of calls for compressing a load of data is:</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem"><p>Get started with
  <code class="computeroutput">BZ2_bzCompressInit</code>.</p></li>
<li class="listitem"><p>Shovel data in and shlurp out its compressed form
  using zero or more calls of
  <code class="computeroutput">BZ2_bzCompress</code> with action =
  <code class="computeroutput">BZ_RUN</code>.</p></li>
<li class="listitem"><p>Finish up.  Repeatedly call
  <code class="computeroutput">BZ2_bzCompress</code> with action =
  <code class="computeroutput">BZ_FINISH</code>, copying out the
  compressed output, until
  <code class="computeroutput">BZ_STREAM_END</code> is
  returned.</p></li>
<li class="listitem"><p>Close up and go home.  Call
  <code class="computeroutput">BZ2_bzCompressEnd</code>.</p></li>
</ol></div>
<p>If the data you want to compress fits into your input
buffer all at once, you can skip the calls of
<code class="computeroutput">BZ2_bzCompress ( ..., BZ_RUN )</code>
and just do the <code class="computeroutput">BZ2_bzCompress ( ..., BZ_FINISH
)</code> calls.</p>
<p>All required memory is allocated by
<code class="computeroutput">BZ2_bzCompressInit</code>.  The
compression library can accept any data at all (obviously).
So
you shouldn't get any error return values from the
<code class="computeroutput">BZ2_bzCompress</code> calls.  If you
do, they will be
<code class="computeroutput">BZ_SEQUENCE_ERROR</code>, and indicate
a bug in your programming.</p>
<p>Trivial other possible return values:</p>
<pre class="programlisting">BZ_PARAM_ERROR
  if strm is NULL, or strm-&gt;s is NULL</pre>
</div>
<div class="sect2" title="3.3.3.&nbsp;BZ2_bzCompressEnd">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzCompress-end"></a>3.3.3.&nbsp;BZ2_bzCompressEnd</h3></div></div></div>
<pre class="programlisting">int BZ2_bzCompressEnd ( bz_stream *strm );</pre>
<p>Releases all memory associated with a compression
stream.</p>
<p>Possible return values:</p>
<pre class="programlisting">BZ_PARAM_ERROR  if strm is NULL or strm-&gt;s is NULL
BZ_OK           otherwise</pre>
</div>
<div class="sect2" title="3.3.4.&nbsp;BZ2_bzDecompressInit">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzDecompress-init"></a>3.3.4.&nbsp;BZ2_bzDecompressInit</h3></div></div></div>
<pre class="programlisting">int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small );</pre>
<p>Prepares for decompression.
As with
<code class="computeroutput">BZ2_bzCompressInit</code>, a
<code class="computeroutput">bz_stream</code> record should be
allocated and initialised before the call.  Fields
<code class="computeroutput">bzalloc</code>,
<code class="computeroutput">bzfree</code> and
<code class="computeroutput">opaque</code> should be set if a custom
memory allocator is required, or made
<code class="computeroutput">NULL</code> for the normal
<code class="computeroutput">malloc</code> /
<code class="computeroutput">free</code> routines.  Upon return, the
internal state will have been initialised, and
<code class="computeroutput">total_in</code> and
<code class="computeroutput">total_out</code> will be zero.</p>
<p>For the meaning of parameter
<code class="computeroutput">verbosity</code>, see
<code class="computeroutput">BZ2_bzCompressInit</code>.</p>
<p>If <code class="computeroutput">small</code> is nonzero, the
library will use an alternative decompression algorithm which
uses less memory but at the cost of decompressing more slowly
(roughly speaking, half the speed, but the maximum memory
requirement drops to around 2300k).
See <a class="xref" href="#using" title="2.&nbsp;How to use bzip2">How to use bzip2</a>
for more information on memory management.</p>
<p>Note that the amount of memory needed to decompress a
stream cannot be determined until the stream's header has been
read, so even if
<code class="computeroutput">BZ2_bzDecompressInit</code> succeeds, a
subsequent <code class="computeroutput">BZ2_bzDecompress</code>
could fail with
<code class="computeroutput">BZ_MEM_ERROR</code>.</p>
<p>Possible return values:</p>
<pre class="programlisting">BZ_CONFIG_ERROR
  if the library has been mis-compiled
BZ_PARAM_ERROR
  if ( small != 0 &amp;&amp; small != 1 )
  or (verbosity &lt; 0 || verbosity &gt; 4)
BZ_MEM_ERROR
  if insufficient memory is available
BZ_OK
  otherwise</pre>
<p>Allowable next actions:</p>
<pre class="programlisting">BZ2_bzDecompress
  if BZ_OK was returned
  no specific action required in case of error</pre>
</div>
<div class="sect2" title="3.3.5.&nbsp;BZ2_bzDecompress">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzDecompress"></a>3.3.5.&nbsp;BZ2_bzDecompress</h3></div></div></div>
<pre class="programlisting">int BZ2_bzDecompress ( bz_stream *strm );</pre>
<p>Provides more input and/or output buffer space for the
library.
The caller maintains input and output buffers, and uses
<code class="computeroutput">BZ2_bzDecompress</code> to transfer
data between them.</p>
<p>Before each call to
<code class="computeroutput">BZ2_bzDecompress</code>,
<code class="computeroutput">next_in</code> should point at the
compressed data, and <code class="computeroutput">avail_in</code>
should indicate how many bytes the library may read.
<code class="computeroutput">BZ2_bzDecompress</code> updates
<code class="computeroutput">next_in</code>,
<code class="computeroutput">avail_in</code> and
<code class="computeroutput">total_in</code> to reflect the number
of bytes it has read.</p>
<p>Similarly, <code class="computeroutput">next_out</code> should
point to a buffer in which the uncompressed output is to be
placed, with <code class="computeroutput">avail_out</code>
indicating how much output space is available.
<code class="computeroutput">BZ2_bzDecompress</code> updates
<code class="computeroutput">next_out</code>,
<code class="computeroutput">avail_out</code> and
<code class="computeroutput">total_out</code> to reflect the number
of bytes output.</p>
<p>You may provide and remove as little or as much data as you
like on each call of
<code class="computeroutput">BZ2_bzDecompress</code>.
In the limit,
it is acceptable to supply and remove data one byte at a time,
although this would be terribly inefficient.  You should always
ensure that at least one byte of output space is available at
each call.</p>
<p>Use of <code class="computeroutput">BZ2_bzDecompress</code> is
simpler than
<code class="computeroutput">BZ2_bzCompress</code>.</p>
<p>You should provide input and remove output as described
above, and repeatedly call
<code class="computeroutput">BZ2_bzDecompress</code> until
<code class="computeroutput">BZ_STREAM_END</code> is returned.
Appearance of <code class="computeroutput">BZ_STREAM_END</code>
denotes that <code class="computeroutput">BZ2_bzDecompress</code>
has detected the logical end of the compressed stream.
<code class="computeroutput">BZ2_bzDecompress</code> will not
produce <code class="computeroutput">BZ_STREAM_END</code> until all
output data has been placed into the output buffer, so once
<code class="computeroutput">BZ_STREAM_END</code> appears, you are
guaranteed to have available all the decompressed output, and
<code class="computeroutput">BZ2_bzDecompressEnd</code> can safely
be called.</p>
<p>In case of an error return value, you should call
<code class="computeroutput">BZ2_bzDecompressEnd</code> to clean up
and release memory.</p>
<p>Possible return values:</p>
<pre class="programlisting">BZ_PARAM_ERROR
  if strm is NULL or strm-&gt;s is NULL
  or strm-&gt;avail_out &lt; 1
BZ_DATA_ERROR
  if a data integrity error is detected in the compressed stream
BZ_DATA_ERROR_MAGIC
  if the compressed stream doesn't begin with the right magic bytes
BZ_MEM_ERROR
  if there wasn't enough memory available
BZ_STREAM_END
  if the logical end of the data stream was detected and all
  output has been consumed, eg strm-&gt;avail_out &gt; 0
BZ_OK
  otherwise</pre>
<p>Allowable next actions:</p>
<pre class="programlisting">BZ2_bzDecompress
  if BZ_OK was returned
BZ2_bzDecompressEnd
  otherwise</pre>
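<p>The two tables above amount to a three-way decision on the value
returned by <code class="computeroutput">BZ2_bzDecompress</code>.  The
fragment below is a minimal sketch of that decision and is not code from
libbzip2: <code class="computeroutput">next_step</code>,
<code class="computeroutput">next_after_decompress</code> and
<code class="computeroutput">decompress_error_text</code> are hypothetical
names, and the <code class="computeroutput">BZ_*</code> values are repeated
from <code class="computeroutput">bzlib.h</code> only so that the fragment
compiles without the library.  Real code would include
<code class="computeroutput">bzlib.h</code> instead.</p>

```c
#include <assert.h>

/* Return codes as defined in bzlib.h, repeated here (an assumption made
   for this sketch) so that it compiles on its own. */
#ifndef BZ_OK
#define BZ_OK                 0
#define BZ_STREAM_END         4
#define BZ_PARAM_ERROR      (-2)
#define BZ_MEM_ERROR        (-3)
#define BZ_DATA_ERROR       (-4)
#define BZ_DATA_ERROR_MAGIC (-5)
#endif

/* Hypothetical type: what the caller should do after BZ2_bzDecompress. */
typedef enum {
    NEXT_CALL_DECOMPRESS, /* BZ_OK: refill input, drain output, call again */
    NEXT_FINISHED,        /* BZ_STREAM_END: all output delivered, call BZ2_bzDecompressEnd */
    NEXT_FAILED           /* any error: call BZ2_bzDecompressEnd to release memory */
} next_step;

/* Map a BZ2_bzDecompress return value onto the "allowable next actions". */
next_step next_after_decompress(int ret)
{
    switch (ret) {
    case BZ_OK:         return NEXT_CALL_DECOMPRESS;
    case BZ_STREAM_END: return NEXT_FINISHED;
    default:            return NEXT_FAILED;
    }
}

/* Describe an error code in the words of the return-value table. */
const char *decompress_error_text(int ret)
{
    /* Only meaningful for errors; BZ_OK and BZ_STREAM_END are not errors. */
    assert(ret != BZ_OK && ret != BZ_STREAM_END);
    switch (ret) {
    case BZ_PARAM_ERROR:      return "strm or strm->s is NULL, or no output space";
    case BZ_DATA_ERROR:       return "data integrity error in compressed stream";
    case BZ_DATA_ERROR_MAGIC: return "stream does not begin with the magic bytes";
    case BZ_MEM_ERROR:        return "not enough memory available";
    default:                  return "unrecognised error code";
    }
}
```

<p>A caller's loop would typically invoke
<code class="computeroutput">BZ2_bzDecompress</code>, pass the result to a
helper of this kind, and leave the loop on anything other than the
"call again" outcome.  In both remaining cases the next call is
<code class="computeroutput">BZ2_bzDecompressEnd</code>.</p>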
</div>
<div class="sect2" title="3.3.6.&nbsp;BZ2_bzDecompressEnd">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzDecompress-end"></a>3.3.6.&nbsp;BZ2_bzDecompressEnd</h3></div></div></div>
<pre class="programlisting">int BZ2_bzDecompressEnd ( bz_stream *strm );</pre>
<p>Releases all memory associated with a decompression
stream.</p>
<p>Possible return values:</p>
<pre class="programlisting">BZ_PARAM_ERROR
  if strm is NULL or strm-&gt;s is NULL
BZ_OK
  otherwise</pre>
<p>Allowable next actions:</p>
<pre class="programlisting">  None.</pre>
</div>
</div>
<div class="sect1" title="3.4.&nbsp;High-level interface">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="hl-interface"></a>3.4.&nbsp;High-level interface</h2></div></div></div>
<p>This interface provides functions for reading and writing
<code class="computeroutput">bzip2</code> format files.  First, some
general points.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p>All of the functions take an
  <code class="computeroutput">int*</code> first argument,
  <code class="computeroutput">bzerror</code>.
  After each call,
  <code class="computeroutput">bzerror</code> should be consulted
  first to determine the outcome of the call.  If
  <code class="computeroutput">bzerror</code> is
  <code class="computeroutput">BZ_OK</code>, the call completed
  successfully, and only then should the return value of the
  function (if any) be consulted.  If
  <code class="computeroutput">bzerror</code> is
  <code class="computeroutput">BZ_IO_ERROR</code>, there was an
  error reading/writing the underlying compressed file, and you
  should then consult <code class="computeroutput">errno</code> /
  <code class="computeroutput">perror</code> to determine the cause
  of the difficulty.  <code class="computeroutput">bzerror</code>
  may also be set to various other values; precise details are
  given on a per-function basis below.</p></li>
<li class="listitem" style="list-style-type: disc"><p>If <code class="computeroutput">bzerror</code> indicates
  an error (ie, anything except
  <code class="computeroutput">BZ_OK</code> and
  <code class="computeroutput">BZ_STREAM_END</code>), you should
  immediately call
  <code class="computeroutput">BZ2_bzReadClose</code> (or
  <code class="computeroutput">BZ2_bzWriteClose</code>, depending on
  whether you are attempting to read or to write) to free up all
  resources associated with the stream.
  Once an error has been
  indicated, behaviour of all calls except
  <code class="computeroutput">BZ2_bzReadClose</code>
  (<code class="computeroutput">BZ2_bzWriteClose</code>) is
  undefined.  The implication is that (1)
  <code class="computeroutput">bzerror</code> should be checked
  after each call, and (2) if
  <code class="computeroutput">bzerror</code> indicates an error,
  <code class="computeroutput">BZ2_bzReadClose</code>
  (<code class="computeroutput">BZ2_bzWriteClose</code>) should then
  be called to clean up.</p></li>
<li class="listitem" style="list-style-type: disc"><p>The <code class="computeroutput">FILE*</code> arguments
  passed to <code class="computeroutput">BZ2_bzReadOpen</code> /
  <code class="computeroutput">BZ2_bzWriteOpen</code> should be set
  to binary mode.  Most Unix systems will do this by default, but
  other platforms, including Windows and Mac, will not.  If you
  omit this, you may encounter problems when moving code to new
  platforms.</p></li>
<li class="listitem" style="list-style-type: disc"><p>Memory allocation requests are handled by
  <code class="computeroutput">malloc</code> /
  <code class="computeroutput">free</code>.
  At present there is no
  facility for user-defined memory allocators in the file I/O
  functions (could easily be added, though).</p></li>
</ul></div>
<div class="sect2" title="3.4.1.&nbsp;BZ2_bzReadOpen">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzreadopen"></a>3.4.1.&nbsp;BZ2_bzReadOpen</h3></div></div></div>
<pre class="programlisting">typedef void BZFILE;

BZFILE *BZ2_bzReadOpen( int *bzerror, FILE *f,
                        int verbosity, int small,
                        void *unused, int nUnused );</pre>
<p>Prepare to read compressed data from file handle
<code class="computeroutput">f</code>.
<code class="computeroutput">f</code> should refer to a file which
has been opened for reading, and for which the error indicator
(<code class="computeroutput">ferror(f)</code>) is not set.  If
<code class="computeroutput">small</code> is 1, the library will try
to decompress using less memory, at the expense of speed.</p>
<p>For reasons explained below,
<code class="computeroutput">BZ2_bzRead</code> will decompress the
<code class="computeroutput">nUnused</code> bytes starting at
<code class="computeroutput">unused</code>, before starting to read
from the file <code class="computeroutput">f</code>.  At most
<code class="computeroutput">BZ_MAX_UNUSED</code> bytes may be
supplied like this.
If this facility is not required, you should
pass <code class="computeroutput">NULL</code> and
<code class="computeroutput">0</code> for
<code class="computeroutput">unused</code> and
<code class="computeroutput">nUnused</code> respectively.</p>
<p>For the meaning of parameters
<code class="computeroutput">small</code> and
<code class="computeroutput">verbosity</code>, see
<code class="computeroutput">BZ2_bzDecompressInit</code>.</p>
<p>The amount of memory needed to decompress a file cannot be
determined until the file's header has been read.  So it is
possible that <code class="computeroutput">BZ2_bzReadOpen</code>
sets <code class="computeroutput">bzerror</code> to
<code class="computeroutput">BZ_OK</code> but a subsequent
call of <code class="computeroutput">BZ2_bzRead</code> sets it to
<code class="computeroutput">BZ_MEM_ERROR</code>.</p>
<p>Possible assignments to
<code class="computeroutput">bzerror</code>:</p>
<pre class="programlisting">BZ_CONFIG_ERROR
  if the library has been mis-compiled
BZ_PARAM_ERROR
  if f is NULL
  or small is neither 0 nor 1
  or ( unused == NULL &amp;&amp; nUnused != 0 )
  or ( unused != NULL &amp;&amp; !(0 &lt;= nUnused &lt;= BZ_MAX_UNUSED) )
BZ_IO_ERROR
  if ferror(f) is nonzero
BZ_MEM_ERROR
  if insufficient memory is available
BZ_OK
  otherwise.</pre>
<p>Possible return values:</p>
<pre class="programlisting">Pointer to an abstract BZFILE
  if bzerror is BZ_OK
NULL
  otherwise</pre>
<p>Allowable next actions:</p>
<pre class="programlisting">BZ2_bzRead
  if bzerror is BZ_OK
BZ2_bzReadClose
  otherwise</pre>
</div>
<div class="sect2" title="3.4.2.&nbsp;BZ2_bzRead">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzread"></a>3.4.2.&nbsp;BZ2_bzRead</h3></div></div></div>
<pre class="programlisting">int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len );</pre>
<p>Reads up to <code class="computeroutput">len</code>
(uncompressed) bytes from the compressed file
<code class="computeroutput">b</code> into the buffer
<code class="computeroutput">buf</code>.  If the read was
successful, <code class="computeroutput">bzerror</code> is set to
<code class="computeroutput">BZ_OK</code> and the number of bytes
read is returned.  If the logical end-of-stream was detected,
<code class="computeroutput">bzerror</code> will be set to
<code class="computeroutput">BZ_STREAM_END</code>, and the number of
bytes read is returned.
All other
<code class="computeroutput">bzerror</code> values denote an
error.</p>
<p><code class="computeroutput">BZ2_bzRead</code> will supply
<code class="computeroutput">len</code> bytes, unless the logical
stream end is detected or an error occurs.  Because of this, it
is possible to detect the stream end by observing when the number
of bytes returned is less than the number requested.
Nevertheless, this is regarded as inadvisable; you should instead
check <code class="computeroutput">bzerror</code> after every call
and watch out for
<code class="computeroutput">BZ_STREAM_END</code>.</p>
<p>Internally, <code class="computeroutput">BZ2_bzRead</code>
copies data from the compressed file in chunks of size
<code class="computeroutput">BZ_MAX_UNUSED</code> bytes before
decompressing it.  If the file contains more bytes than strictly
needed to reach the logical end-of-stream,
<code class="computeroutput">BZ2_bzRead</code> will almost certainly
read some of the trailing data before signalling
<code class="computeroutput">BZ_STREAM_END</code>.
To collect the
read but unused data once
<code class="computeroutput">BZ_STREAM_END</code> has appeared,
call <code class="computeroutput">BZ2_bzReadGetUnused</code>
immediately before
<code class="computeroutput">BZ2_bzReadClose</code>.</p>
<p>Possible assignments to
<code class="computeroutput">bzerror</code>:</p>
<pre class="programlisting">BZ_PARAM_ERROR
  if b is NULL or buf is NULL or len &lt; 0
BZ_SEQUENCE_ERROR
  if b was opened with BZ2_bzWriteOpen
BZ_IO_ERROR
  if there is an error reading from the compressed file
BZ_UNEXPECTED_EOF
  if the compressed file ended before
  the logical end-of-stream was detected
BZ_DATA_ERROR
  if a data integrity error was detected in the compressed stream
BZ_DATA_ERROR_MAGIC
  if the stream does not begin with the requisite header bytes
  (ie, is not a bzip2 data file).  This is really
  a special case of BZ_DATA_ERROR.
BZ_MEM_ERROR
  if insufficient memory was available
BZ_STREAM_END
  if the logical end of stream was detected.
BZ_OK
  otherwise.</pre>
<p>Possible return values:</p>
<pre class="programlisting">number of bytes read
  if bzerror is BZ_OK or BZ_STREAM_END
undefined
  otherwise</pre>
<p>Allowable next actions:</p>
<pre class="programlisting">collect data from buf, then BZ2_bzRead or BZ2_bzReadClose
  if bzerror is BZ_OK
collect data from buf, then BZ2_bzReadClose or BZ2_bzReadGetUnused
  if bzerror is BZ_STREAM_END
BZ2_bzReadClose
  otherwise</pre>
</div>
<div class="sect2" title="3.4.3.&nbsp;BZ2_bzReadGetUnused">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzreadgetunused"></a>3.4.3.&nbsp;BZ2_bzReadGetUnused</h3></div></div></div>
<pre class="programlisting">void BZ2_bzReadGetUnused( int* bzerror, BZFILE *b,
                          void** unused, int* nUnused );</pre>
<p>Returns data which was read from the compressed file but
was not needed to get to the logical end-of-stream.
<code class="computeroutput">*unused</code> is set to the address of
the data, and <code class="computeroutput">*nUnused</code> to the
number of bytes.
<code class="computeroutput">*nUnused</code> will
be set to a value between <code class="computeroutput">0</code> and
<code class="computeroutput">BZ_MAX_UNUSED</code> inclusive.</p>
<p>This function may only be called after
<code class="computeroutput">BZ2_bzRead</code> has signalled
<code class="computeroutput">BZ_STREAM_END</code> and before
<code class="computeroutput">BZ2_bzReadClose</code>.</p>
<p>Possible assignments to
<code class="computeroutput">bzerror</code>:</p>
<pre class="programlisting">BZ_PARAM_ERROR
  if b is NULL
  or unused is NULL or nUnused is NULL
BZ_SEQUENCE_ERROR
  if BZ_STREAM_END has not been signalled
  or if b was opened with BZ2_bzWriteOpen
BZ_OK
  otherwise</pre>
<p>Allowable next actions:</p>
<pre class="programlisting">BZ2_bzReadClose</pre>
</div>
<div class="sect2" title="3.4.4.&nbsp;BZ2_bzReadClose">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzreadclose"></a>3.4.4.&nbsp;BZ2_bzReadClose</h3></div></div></div>
<pre class="programlisting">void BZ2_bzReadClose ( int *bzerror, BZFILE *b );</pre>
<p>Releases all memory pertaining to the compressed file
<code class="computeroutput">b</code>.
1706*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadClose</code> does not call 1707*4a711beaSLionel Sambuc<code class="computeroutput">fclose</code> on the underlying file 1708*4a711beaSLionel Sambuchandle, so you should do that yourself if appropriate. 1709*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzReadClose</code> should be called 1710*4a711beaSLionel Sambucto clean up after all error situations.</p> 1711*4a711beaSLionel Sambuc<p>Possible assignments to 1712*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code>:</p> 1713*4a711beaSLionel Sambuc<pre class="programlisting">BZ_SEQUENCE_ERROR 1714*4a711beaSLionel Sambuc if b was opened with BZ2_bzOpenWrite 1715*4a711beaSLionel SambucBZ_OK 1716*4a711beaSLionel Sambuc otherwise</pre> 1717*4a711beaSLionel Sambuc<p>Allowable next actions:</p> 1718*4a711beaSLionel Sambuc<pre class="programlisting">none</pre> 1719*4a711beaSLionel Sambuc</div> 1720*4a711beaSLionel Sambuc<div class="sect2" title="3.4.5.�BZ2_bzWriteOpen"> 1721*4a711beaSLionel Sambuc<div class="titlepage"><div><div><h3 class="title"> 1722*4a711beaSLionel Sambuc<a name="bzwriteopen"></a>3.4.5.�BZ2_bzWriteOpen</h3></div></div></div> 1723*4a711beaSLionel Sambuc<pre class="programlisting">BZFILE *BZ2_bzWriteOpen( int *bzerror, FILE *f, 1724*4a711beaSLionel Sambuc int blockSize100k, int verbosity, 1725*4a711beaSLionel Sambuc int workFactor );</pre> 1726*4a711beaSLionel Sambuc<p>Prepare to write compressed data to file handle 1727*4a711beaSLionel Sambuc<code class="computeroutput">f</code>. 
1728*4a711beaSLionel Sambuc<code class="computeroutput">f</code> should refer to a file which 1729*4a711beaSLionel Sambuchas been opened for writing, and for which the error indicator 1730*4a711beaSLionel Sambuc(<code class="computeroutput">ferror(f)</code>)is not set.</p> 1731*4a711beaSLionel Sambuc<p>For the meaning of parameters 1732*4a711beaSLionel Sambuc<code class="computeroutput">blockSize100k</code>, 1733*4a711beaSLionel Sambuc<code class="computeroutput">verbosity</code> and 1734*4a711beaSLionel Sambuc<code class="computeroutput">workFactor</code>, see 1735*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzCompressInit</code>.</p> 1736*4a711beaSLionel Sambuc<p>All required memory is allocated at this stage, so if the 1737*4a711beaSLionel Sambuccall completes successfully, 1738*4a711beaSLionel Sambuc<code class="computeroutput">BZ_MEM_ERROR</code> cannot be signalled 1739*4a711beaSLionel Sambucby a subsequent call to 1740*4a711beaSLionel Sambuc<code class="computeroutput">BZ2_bzWrite</code>.</p> 1741*4a711beaSLionel Sambuc<p>Possible assignments to 1742*4a711beaSLionel Sambuc<code class="computeroutput">bzerror</code>:</p> 1743*4a711beaSLionel Sambuc<pre class="programlisting">BZ_CONFIG_ERROR 1744*4a711beaSLionel Sambuc if the library has been mis-compiled 1745*4a711beaSLionel SambucBZ_PARAM_ERROR 1746*4a711beaSLionel Sambuc if f is NULL 1747*4a711beaSLionel Sambuc or blockSize100k < 1 or blockSize100k > 9 1748*4a711beaSLionel SambucBZ_IO_ERROR 1749*4a711beaSLionel Sambuc if ferror(f) is nonzero 1750*4a711beaSLionel SambucBZ_MEM_ERROR 1751*4a711beaSLionel Sambuc if insufficient memory is available 1752*4a711beaSLionel SambucBZ_OK 1753*4a711beaSLionel Sambuc otherwise</pre> 1754*4a711beaSLionel Sambuc<p>Possible return values:</p> 1755*4a711beaSLionel Sambuc<pre class="programlisting">Pointer to an abstract BZFILE 1756*4a711beaSLionel Sambuc if bzerror is BZ_OK 1757*4a711beaSLionel SambucNULL 1758*4a711beaSLionel Sambuc otherwise</pre> 
<p>Allowable next actions:</p>
<pre class="programlisting">BZ2_bzWrite
  if bzerror is BZ_OK
  (you could go directly to BZ2_bzWriteClose, but this would be pretty pointless)
BZ2_bzWriteClose
  otherwise</pre>
</div>
<div class="sect2" title="3.4.6. BZ2_bzWrite">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzwrite"></a>3.4.6. BZ2_bzWrite</h3></div></div></div>
<pre class="programlisting">void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len );</pre>
<p>Absorbs <code class="computeroutput">len</code> bytes from the
buffer <code class="computeroutput">buf</code>, eventually to be
compressed and written to the file.</p>
<p>Possible assignments to
<code class="computeroutput">bzerror</code>:</p>
<pre class="programlisting">BZ_PARAM_ERROR
  if b is NULL or buf is NULL or len < 0
BZ_SEQUENCE_ERROR
  if b was opened with BZ2_bzReadOpen
BZ_IO_ERROR
  if there is an error writing the compressed file
BZ_OK
  otherwise</pre>
</div>
<div class="sect2" title="3.4.7. BZ2_bzWriteClose">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzwriteclose"></a>3.4.7. BZ2_bzWriteClose</h3></div></div></div>
<pre class="programlisting">void BZ2_bzWriteClose( int *bzerror, BZFILE* b,
                       int abandon,
                       unsigned int* nbytes_in,
                       unsigned int* nbytes_out );

void BZ2_bzWriteClose64( int *bzerror, BZFILE* b,
                         int abandon,
                         unsigned int* nbytes_in_lo32,
                         unsigned int* nbytes_in_hi32,
                         unsigned int* nbytes_out_lo32,
                         unsigned int* nbytes_out_hi32 );</pre>
<p>Compresses and flushes to the compressed file all data so
far supplied by <code class="computeroutput">BZ2_bzWrite</code>.
The logical end-of-stream markers are also written, so subsequent
calls to <code class="computeroutput">BZ2_bzWrite</code> are
illegal. All memory associated with the compressed file
<code class="computeroutput">b</code> is released.
<code class="computeroutput">fflush</code> is called on the
compressed file, but it is not
<code class="computeroutput">fclose</code>'d.</p>
<p>If <code class="computeroutput">BZ2_bzWriteClose</code> is
called to clean up after an error, the only action is to release
the memory.
The library records the error codes issued by
previous calls, so this situation will be detected automatically.
There is no attempt to complete the compression operation, nor to
<code class="computeroutput">fflush</code> the compressed file. You
can force this behaviour to happen even in the case of no error,
by passing a nonzero value to
<code class="computeroutput">abandon</code>.</p>
<p>If <code class="computeroutput">nbytes_in</code> is non-null,
<code class="computeroutput">*nbytes_in</code> will be set to the
total volume of uncompressed data handled. Similarly, if
<code class="computeroutput">nbytes_out</code> is non-null,
<code class="computeroutput">*nbytes_out</code> will be set to the
total volume of compressed data written. For compatibility with
older versions of the library,
<code class="computeroutput">BZ2_bzWriteClose</code> only yields the
lower 32 bits of these counts. Use
<code class="computeroutput">BZ2_bzWriteClose64</code> if you want
the full 64-bit counts.
These two functions are otherwise
absolutely identical.</p>
<p>Possible assignments to
<code class="computeroutput">bzerror</code>:</p>
<pre class="programlisting">BZ_SEQUENCE_ERROR
  if b was opened with BZ2_bzReadOpen
BZ_IO_ERROR
  if there is an error writing the compressed file
BZ_OK
  otherwise</pre>
</div>
<div class="sect2" title="3.4.8. Handling embedded compressed data streams">
<div class="titlepage"><div><div><h3 class="title">
<a name="embed"></a>3.4.8. Handling embedded compressed data streams</h3></div></div></div>
<p>The high-level library facilitates use of
<code class="computeroutput">bzip2</code> data streams which form
some part of a surrounding, larger data stream.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p>For writing, the library takes an open file handle,
    writes compressed data to it,
    <code class="computeroutput">fflush</code>es it but does not
    <code class="computeroutput">fclose</code> it.
The calling
    application can write its own data before and after the
    compressed data stream, using that same file handle.</p></li>
<li class="listitem" style="list-style-type: disc"><p>Reading is more complex, and the facilities are not as
    general as they could be since generality is hard to reconcile
    with efficiency. <code class="computeroutput">BZ2_bzRead</code>
    reads from the compressed file in blocks of size
    <code class="computeroutput">BZ_MAX_UNUSED</code> bytes, and in
    doing so will probably overshoot the logical end of the compressed
    stream. To recover this data once decompression has ended,
    call <code class="computeroutput">BZ2_bzReadGetUnused</code> after
    the last call of <code class="computeroutput">BZ2_bzRead</code>
    (the one returning
    <code class="computeroutput">BZ_STREAM_END</code>) but before
    calling
    <code class="computeroutput">BZ2_bzReadClose</code>.</p></li>
</ul></div>
<p>This mechanism makes it easy to decompress multiple
<code class="computeroutput">bzip2</code> streams placed end-to-end.
At the end of one stream, when
<code class="computeroutput">BZ2_bzRead</code> returns
<code class="computeroutput">BZ_STREAM_END</code>, call
<code class="computeroutput">BZ2_bzReadGetUnused</code> to collect
the unused data (copy it into your own buffer somewhere).
That
data forms the start of the next compressed stream. To start
uncompressing that next stream, call
<code class="computeroutput">BZ2_bzReadOpen</code> again, feeding in
the unused data via the <code class="computeroutput">unused</code> /
<code class="computeroutput">nUnused</code> parameters. Keep doing
this until a <code class="computeroutput">BZ_STREAM_END</code> return
coincides with the physical end of file
(<code class="computeroutput">feof(f)</code>). In this situation
<code class="computeroutput">BZ2_bzReadGetUnused</code> will of
course return no data.</p>
<p>This should give some feel for how the high-level interface
can be used. If you require extra flexibility, you'll have to
bite the bullet and get to grips with the low-level
interface.</p>
</div>
<div class="sect2" title="3.4.9. Standard file-reading/writing code">
<div class="titlepage"><div><div><h3 class="title">
<a name="std-rdwr"></a>3.4.9. Standard file-reading/writing code</h3></div></div></div>
<p>Here's how you'd write data to a compressed file:</p>
<pre class="programlisting">FILE*   f;
BZFILE* b;
int     nBuf;
char    buf[ /* whatever size you like */ ];
int     bzerror;

f = fopen ( "myfile.bz2", "wb" );
if ( !f ) {
  /* handle error */
}
/* blockSize100k = 9, verbosity = 0, workFactor = 0 (library default) */
b = BZ2_bzWriteOpen( &bzerror, f, 9, 0, 0 );
if (bzerror != BZ_OK) {
  /* abandon = 1: only release memory; byte counts not wanted */
  BZ2_bzWriteClose ( &bzerror, b, 1, NULL, NULL );
  /* handle error */
}

while ( /* condition */ ) {
  /* get data to write into buf, and set nBuf appropriately */
  BZ2_bzWrite ( &bzerror, b, buf, nBuf );
  if (bzerror == BZ_IO_ERROR) {
    BZ2_bzWriteClose ( &bzerror, b, 1, NULL, NULL );
    /* handle error */
  }
}

/* abandon = 0: finish the stream and flush it to f */
BZ2_bzWriteClose( &bzerror, b, 0, NULL, NULL );
if (bzerror == BZ_IO_ERROR) {
  /* handle error */
}</pre>
<p>And to read from a compressed file:</p>
<pre class="programlisting">FILE*   f;
BZFILE* b;
int     nBuf;
char    buf[ /* whatever size you like */ ];
int     bzerror;

f = fopen ( "myfile.bz2", "rb" );
if ( !f ) {
  /* handle error */
}
/* verbosity = 0, small = 0, no unused data carried over */
b = BZ2_bzReadOpen ( &bzerror, f, 0, 0, NULL, 0 );
if ( bzerror != BZ_OK ) {
  BZ2_bzReadClose ( &bzerror, b );
  /* handle error */
}

bzerror = BZ_OK;
while ( bzerror == BZ_OK && /* arbitrary other conditions */) {
  nBuf = BZ2_bzRead ( &bzerror, b, buf, /* size of buf */ );
  /* the final call, which signals BZ_STREAM_END, also returns data */
  if ( bzerror == BZ_OK || bzerror == BZ_STREAM_END ) {
    /* do something with buf[0 .. nBuf-1] */
  }
}
if ( bzerror != BZ_STREAM_END ) {
  BZ2_bzReadClose ( &bzerror, b );
  /* handle error */
} else {
  BZ2_bzReadClose ( &bzerror, b );
}</pre>
</div>
</div>
<div class="sect1" title="3.5. Utility functions">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="util-fns"></a>3.5. Utility functions</h2></div></div></div>
<div class="sect2" title="3.5.1. BZ2_bzBuffToBuffCompress">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzbufftobuffcompress"></a>3.5.1. BZ2_bzBuffToBuffCompress</h3></div></div></div>
<pre class="programlisting">int BZ2_bzBuffToBuffCompress( char*         dest,
                              unsigned int* destLen,
                              char*         source,
                              unsigned int  sourceLen,
                              int           blockSize100k,
                              int           verbosity,
                              int           workFactor );</pre>
<p>Attempts to compress the data in <code class="computeroutput">source[0 .. sourceLen-1]</code> into the destination buffer,
<code class="computeroutput">dest[0 .. *destLen-1]</code>.
If the
destination buffer is big enough,
<code class="computeroutput">*destLen</code> is set to the size of
the compressed data, and <code class="computeroutput">BZ_OK</code>
is returned. If the compressed data won't fit,
<code class="computeroutput">*destLen</code> is unchanged, and
<code class="computeroutput">BZ_OUTBUFF_FULL</code> is
returned.</p>
<p>Compression in this manner is a one-shot event, done with a
single call to this function. The resulting compressed data is a
complete <code class="computeroutput">bzip2</code> format data
stream. There is no mechanism for making additional calls to
provide extra input data. If you want that kind of mechanism,
use the low-level interface.</p>
<p>For the meaning of parameters
<code class="computeroutput">blockSize100k</code>,
<code class="computeroutput">verbosity</code> and
<code class="computeroutput">workFactor</code>, see
<code class="computeroutput">BZ2_bzCompressInit</code>.</p>
<p>To guarantee that the compressed data will fit in its
buffer, allocate an output buffer of size 1% larger than the
uncompressed data, plus six hundred extra bytes.</p>
<p><code class="computeroutput">BZ2_bzBuffToBuffCompress</code>
will not write data at or beyond
<code class="computeroutput">dest[*destLen]</code>, even in case of
buffer overflow.</p>
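<p>The buffer-size rule given above (input size, plus 1%, plus six
hundred bytes) is easy to get subtly wrong with integer arithmetic, so
a small sketch may help. The function name
<code class="computeroutput">bz2_worst_case_size</code> is hypothetical
and not part of the library; rounding the 1% upwards and returning
<code class="computeroutput">0</code> on overflow are choices made
here, not behaviour the library defines.</p>

```c
#include <limits.h>

/* Hypothetical helper, not part of libbzip2.
   Computes the output buffer size suggested in this section:
   sourceLen + 1% of sourceLen (rounded up) + 600 bytes.
   The library takes lengths as unsigned int, so 0 is returned when
   the result would not fit in that type. Assumes unsigned long long
   is wider than unsigned int. */
unsigned int bz2_worst_case_size(unsigned int sourceLen)
{
    unsigned long long n = sourceLen;
    unsigned long long bound = n + (n + 99u) / 100u + 600u;
    return bound > UINT_MAX ? 0u : (unsigned int)bound;
}
```

<p>The result would be used both as the allocation size for
<code class="computeroutput">dest</code> and as the initial value of
<code class="computeroutput">*destLen</code>; a return of
<code class="computeroutput">0</code> signals that the input is too
large for a single call.</p>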
<p>Possible return values:</p>
<pre class="programlisting">BZ_CONFIG_ERROR
  if the library has been mis-compiled
BZ_PARAM_ERROR
  if dest is NULL or destLen is NULL
  or blockSize100k < 1 or blockSize100k > 9
  or verbosity < 0 or verbosity > 4
  or workFactor < 0 or workFactor > 250
BZ_MEM_ERROR
  if insufficient memory is available
BZ_OUTBUFF_FULL
  if the size of the compressed data exceeds *destLen
BZ_OK
  otherwise</pre>
</div>
<div class="sect2" title="3.5.2. BZ2_bzBuffToBuffDecompress">
<div class="titlepage"><div><div><h3 class="title">
<a name="bzbufftobuffdecompress"></a>3.5.2. BZ2_bzBuffToBuffDecompress</h3></div></div></div>
<pre class="programlisting">int BZ2_bzBuffToBuffDecompress( char*         dest,
                                unsigned int* destLen,
                                char*         source,
                                unsigned int  sourceLen,
                                int           small,
                                int           verbosity );</pre>
<p>Attempts to decompress the data in <code class="computeroutput">source[0 .. sourceLen-1]</code> into the destination buffer,
<code class="computeroutput">dest[0 .. *destLen-1]</code>.
If the
destination buffer is big enough,
<code class="computeroutput">*destLen</code> is set to the size of
the uncompressed data, and <code class="computeroutput">BZ_OK</code>
is returned. If the uncompressed data won't fit,
<code class="computeroutput">*destLen</code> is unchanged, and
<code class="computeroutput">BZ_OUTBUFF_FULL</code> is
returned.</p>
<p><code class="computeroutput">source</code> is assumed to hold
a complete <code class="computeroutput">bzip2</code> format data
stream.
<code class="computeroutput">BZ2_bzBuffToBuffDecompress</code> tries
to decompress the entirety of the stream into the output
buffer.</p>
<p>For the meaning of parameters
<code class="computeroutput">small</code> and
<code class="computeroutput">verbosity</code>, see
<code class="computeroutput">BZ2_bzDecompressInit</code>.</p>
<p>Because the compression ratio of the compressed data cannot
be known in advance, there is no easy way to guarantee that the
output buffer will be big enough.
You may of course make
arrangements in your code to record the size of the uncompressed
data, but such a mechanism is beyond the scope of this
library.</p>
<p><code class="computeroutput">BZ2_bzBuffToBuffDecompress</code>
will not write data at or beyond
<code class="computeroutput">dest[*destLen]</code>, even in case of
buffer overflow.</p>
<p>Possible return values:</p>
<pre class="programlisting">BZ_CONFIG_ERROR
  if the library has been mis-compiled
BZ_PARAM_ERROR
  if dest is NULL or destLen is NULL
  or small != 0 && small != 1
  or verbosity < 0 or verbosity > 4
BZ_MEM_ERROR
  if insufficient memory is available
BZ_OUTBUFF_FULL
  if the size of the uncompressed data exceeds *destLen
BZ_DATA_ERROR
  if a data integrity error was detected in the compressed data
BZ_DATA_ERROR_MAGIC
  if the compressed data doesn't begin with the right magic bytes
BZ_UNEXPECTED_EOF
  if the compressed data ends unexpectedly
BZ_OK
  otherwise</pre>
</div>
</div>
<div class="sect1" title="3.6. zlib compatibility functions">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="zlib-compat"></a>3.6. zlib compatibility
functions</h2></div></div></div>
<p>Yoshioka Tsuneo has contributed some functions to give
better <code class="computeroutput">zlib</code> compatibility.
These functions are <code class="computeroutput">BZ2_bzopen</code>,
<code class="computeroutput">BZ2_bzdopen</code>,
<code class="computeroutput">BZ2_bzread</code>,
<code class="computeroutput">BZ2_bzwrite</code>,
<code class="computeroutput">BZ2_bzflush</code>,
<code class="computeroutput">BZ2_bzclose</code>,
<code class="computeroutput">BZ2_bzerror</code> and
<code class="computeroutput">BZ2_bzlibVersion</code>. These
functions are not (yet) officially part of the library. If they
break, you get to keep all the pieces. Nevertheless, I think
they work OK.</p>
<pre class="programlisting">typedef void BZFILE;

const char * BZ2_bzlibVersion ( void );</pre>
<p>Returns a string indicating the library version.</p>
<pre class="programlisting">BZFILE * BZ2_bzopen  ( const char *path, const char *mode );
BZFILE * BZ2_bzdopen ( int        fd,    const char *mode );</pre>
<p>Opens a <code class="computeroutput">.bz2</code> file for
reading or writing, using either its name or a pre-existing file
descriptor.
Analogous to <code class="computeroutput">fopen</code>
and <code class="computeroutput">fdopen</code>.</p>
<pre class="programlisting">int BZ2_bzread  ( BZFILE* b, void* buf, int len );
int BZ2_bzwrite ( BZFILE* b, void* buf, int len );</pre>
<p>Reads/writes data from/to a previously opened
<code class="computeroutput">BZFILE</code>. Analogous to
<code class="computeroutput">fread</code> and
<code class="computeroutput">fwrite</code>.</p>
<pre class="programlisting">int  BZ2_bzflush ( BZFILE* b );
void BZ2_bzclose ( BZFILE* b );</pre>
<p>Flushes/closes a <code class="computeroutput">BZFILE</code>.
<code class="computeroutput">BZ2_bzflush</code> doesn't actually do
anything.
Analogous to <code class="computeroutput">fflush</code>
and <code class="computeroutput">fclose</code>.</p>
<pre class="programlisting">const char * BZ2_bzerror ( BZFILE *b, int *errnum );</pre>
<p>Returns a string describing the most recent error status of
<code class="computeroutput">b</code>, and also sets
<code class="computeroutput">*errnum</code> to its numerical
value.</p>
</div>
<div class="sect1" title="3.7. Using the library in a stdio-free environment">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="stdio-free"></a>3.7. Using the library in a stdio-free environment</h2></div></div></div>
<div class="sect2" title="3.7.1. Getting rid of stdio">
<div class="titlepage"><div><div><h3 class="title">
<a name="stdio-bye"></a>3.7.1. Getting rid of stdio</h3></div></div></div>
<p>In a deeply embedded application, you might want to use
just the memory-to-memory functions. You can do this
conveniently by compiling the library with preprocessor symbol
<code class="computeroutput">BZ_NO_STDIO</code> defined.
Doing this
gives you a library containing only the following eight
functions:</p>
<p><code class="computeroutput">BZ2_bzCompressInit</code>,
<code class="computeroutput">BZ2_bzCompress</code>,
<code class="computeroutput">BZ2_bzCompressEnd</code>,
<code class="computeroutput">BZ2_bzDecompressInit</code>,
<code class="computeroutput">BZ2_bzDecompress</code>,
<code class="computeroutput">BZ2_bzDecompressEnd</code>,
<code class="computeroutput">BZ2_bzBuffToBuffCompress</code>,
<code class="computeroutput">BZ2_bzBuffToBuffDecompress</code></p>
<p>When compiled like this, all functions will ignore
<code class="computeroutput">verbosity</code> settings.</p>
</div>
<div class="sect2" title="3.7.2. Critical error handling">
<div class="titlepage"><div><div><h3 class="title">
<a name="critical-error"></a>3.7.2. Critical error handling</h3></div></div></div>
<p><code class="computeroutput">libbzip2</code> contains a number
of internal assertion checks which should, needless to say, never
be activated.
Nevertheless, if an assertion should fail,
behaviour depends on whether or not the library was compiled with
<code class="computeroutput">BZ_NO_STDIO</code> set.</p>
<p>For a normal compile, an assertion failure yields the
message:</p>
<div class="blockquote"><blockquote class="blockquote">
<p>bzip2/libbzip2: internal error number N.</p>
<p>This is a bug in bzip2/libbzip2, 1.0.6 of 6 September 2010.
Please report it to me at: jseward@bzip.org. If this happened
when you were using some program which uses libbzip2 as a
component, you should also report this bug to the author(s)
of that program. Please make an effort to report this bug;
timely and accurate bug reports eventually lead to higher
quality software. Thanks. Julian Seward, 6 September 2010.
</p>
</blockquote></div>
<p>where <code class="computeroutput">N</code> is some error code
number. If <code class="computeroutput">N == 1007</code>, it also
prints some extra text advising the reader that unreliable memory
is often associated with internal error 1007.
(This is a
frequently observed phenomenon with versions 1.0.0/1.0.1.)</p>
<p><code class="computeroutput">exit(3)</code> is then
called.</p>
<p>For a <code class="computeroutput">stdio</code>-free library,
assertion failures result in a call to a function declared
as:</p>
<pre class="programlisting">extern void bz_internal_error ( int errcode );</pre>
<p>The error code is passed as the parameter.  You should
supply such a function.</p>
<p>In either case, once an assertion failure has occurred, any
<code class="computeroutput">bz_stream</code> records involved can
be regarded as invalid.  You should not attempt to resume normal
operation with them.</p>
<p>You may, of course, change critical error handling to suit
your needs.  As I said above, critical errors indicate bugs in
the library and should not occur.
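</p>
<p>One way to supply <code class="computeroutput">bz_internal_error</code>
for a <code class="computeroutput">stdio</code>-free build might look
like the sketch below.  Only the
<code class="computeroutput">bz_internal_error</code> signature comes
from the library; the names
<code class="computeroutput">bz_panic_env</code>,
<code class="computeroutput">bz_panic_armed</code>,
<code class="computeroutput">bz_panic_code</code>,
<code class="computeroutput">bz_run_guarded</code> and
<code class="computeroutput">simulate_assertion_failure</code> are
hypothetical, and the sketch assumes a single-threaded program in
which escaping from the library with
<code class="computeroutput">longjmp</code> is acceptable.</p>

```c
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical bookkeeping, not part of libbzip2. */
static jmp_buf bz_panic_env;            /* recovery point inside bz_run_guarded */
static volatile int bz_panic_armed = 0; /* non-zero while the recovery point is live */
static volatile int bz_panic_code = 0;  /* code of the most recent assertion failure */

/* Called by a BZ_NO_STDIO build of libbzip2 when an internal assertion fails.
   It never returns into the library: it either jumps back to the guarded
   caller or, if no recovery point is live, terminates the program. */
void bz_internal_error(int errcode)
{
    bz_panic_code = errcode;
    if (!bz_panic_armed) {
        abort();
    }
    bz_panic_armed = 0;
    longjmp(bz_panic_env, 1);
}

/* Runs one operation with a recovery point in place.  Returns 0 if the
   operation completed, or the assertion code if bz_internal_error fired
   (the codes reported by the library are assumed to be non-zero). */
int bz_run_guarded(void (*operation)(void *), void *arg)
{
    bz_panic_code = 0;
    if (setjmp(bz_panic_env) != 0) {
        return bz_panic_code;   /* reached only via longjmp */
    }
    bz_panic_armed = 1;
    operation(arg);
    bz_panic_armed = 0;
    return 0;
}

/* Stand-in for a library call that trips an assertion; illustration only. */
void simulate_assertion_failure(void *arg)
{
    bz_internal_error(*(const int *)arg);
}
```

<p>If <code class="computeroutput">bz_run_guarded</code> returns a
non-zero code, every <code class="computeroutput">bz_stream</code> the
operation touched should be abandoned rather than reused, and any
memory the library had allocated for it probably cannot be released
safely.  A handler that simply halts or resets the system is an
equally reasonable choice.</p>
<p>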
All "normal" error situations
are indicated via error return codes from functions, and can be
recovered from.</p>
</div>
</div>
<div class="sect1" title="3.8.&nbsp;Making a Windows DLL">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="win-dll"></a>3.8.&nbsp;Making a Windows DLL</h2></div></div></div>
<p>Everything related to Windows has been contributed by
Yoshioka Tsuneo
(<code class="computeroutput">tsuneo@rr.iij4u.or.jp</code>), so
you should send your queries to him (but perhaps Cc: me,
<code class="computeroutput">jseward@bzip.org</code>).</p>
<p>My vague understanding of what to do is: using Visual C++
5.0, open the project file
<code class="computeroutput">libbz2.dsp</code>, and build.  That's
all.</p>
<p>If you can't open the project file for some reason, make a
new one, naming these files:
<code class="computeroutput">blocksort.c</code>,
<code class="computeroutput">bzlib.c</code>,
<code class="computeroutput">compress.c</code>,
<code class="computeroutput">crctable.c</code>,
<code class="computeroutput">decompress.c</code>,
<code class="computeroutput">huffman.c</code>,
<code class="computeroutput">randtable.c</code> and
<code class="computeroutput">libbz2.def</code>.
You will also need
to name the header files <code class="computeroutput">bzlib.h</code>
and <code class="computeroutput">bzlib_private.h</code>.</p>
<p>If you don't use VC++, you may need to define the
preprocessor symbol
<code class="computeroutput">_WIN32</code>.</p>
<p>Finally, <code class="computeroutput">dlltest.c</code> is a
sample program using the DLL.  It has a project file,
<code class="computeroutput">dlltest.dsp</code>.</p>
<p>If you just want a makefile for Visual C, have a look at
<code class="computeroutput">makefile.msc</code>.</p>
<p>Be aware that if you compile
<code class="computeroutput">bzip2</code> itself on Win32, you must
set <code class="computeroutput">BZ_UNIX</code> to 0 and
<code class="computeroutput">BZ_LCCWIN32</code> to 1, in the file
<code class="computeroutput">bzip2.c</code>, before compiling.
Otherwise the resulting binary won't work correctly.</p>
<p>I haven't tried any of this stuff myself, but it all looks
plausible.</p>
</div>
</div>
<div class="chapter" title="4.&nbsp;Miscellanea">
<div class="titlepage"><div><div><h2 class="title">
<a name="misc"></a>4.&nbsp;Miscellanea</h2></div></div></div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="sect1"><a href="#limits">4.1.
Limitations of the compressed file format</a></span></dt>
<dt><span class="sect1"><a href="#port-issues">4.2. Portability issues</a></span></dt>
<dt><span class="sect1"><a href="#bugs">4.3. Reporting bugs</a></span></dt>
<dt><span class="sect1"><a href="#package">4.4. Did you get the right package?</a></span></dt>
<dt><span class="sect1"><a href="#reading">4.5. Further Reading</a></span></dt>
</dl>
</div>
<p>These are just some random thoughts of mine.  Your mileage
may vary.</p>
<div class="sect1" title="4.1.&nbsp;Limitations of the compressed file format">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="limits"></a>4.1.&nbsp;Limitations of the compressed file format</h2></div></div></div>
<p><code class="computeroutput">bzip2-1.0.X</code>,
<code class="computeroutput">0.9.5</code> and
<code class="computeroutput">0.9.0</code> use exactly the same file
format as the original version,
<code class="computeroutput">bzip2-0.1</code>.  This decision was
made in the interests of stability.  Creating yet another
incompatible compressed file format would create further
confusion and disruption for users.</p>
<p>Nevertheless, this is not a painless decision.
Development
work since the release of
<code class="computeroutput">bzip2-0.1</code> in August 1997 has
shown complexities in the file format which slow down
decompression and, in retrospect, are unnecessary.  These
are:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p>The run-length encoder, which is the first of the
   compression transformations, is entirely irrelevant.  The
   original purpose was to protect the sorting algorithm from the
   very worst case input: a string of repeated symbols.  But
   algorithm steps Q6a and Q6b in the original Burrows-Wheeler
   technical report (SRC-124) show how repeats can be handled
   without difficulty in block sorting.</p></li>
<li class="listitem" style="list-style-type: disc">
<p>The randomisation mechanism doesn't really need to be
   there.  Udi Manber and Gene Myers published a suffix array
   construction algorithm a few years back, which can be employed
   to sort any block, no matter how repetitive, in O(N log N)
   time.
Subsequent work by Kunihiko Sadakane has produced a
   derivative O(N (log N)^2) algorithm which usually outperforms
   the Manber-Myers algorithm.</p>
<p>I could have changed to Sadakane's algorithm, but I find
   it to be slower than <code class="computeroutput">bzip2</code>'s
   existing algorithm for most inputs, and the randomisation
   mechanism protects adequately against bad cases.  I didn't
   think it was a good tradeoff to make.  Partly this is due to
   the fact that I was not flooded with email complaints about
   <code class="computeroutput">bzip2-0.1</code>'s performance on
   repetitive data, so perhaps it isn't a problem for real
   inputs.</p>
<p>Probably the best long-term solution, and the one I have
   incorporated into 0.9.5 and above, is to use the existing
   sorting algorithm initially, and fall back to an O(N (log N)^2)
   algorithm if the standard algorithm gets into
   difficulties.</p>
</li>
<li class="listitem" style="list-style-type: disc"><p>The compressed file format was never designed to be
   handled by a library, and I have had to jump through some hoops
   to produce an efficient implementation of decompression.  It's
   a bit hairy.  Try passing
   <code class="computeroutput">decompress.c</code> through the C
   preprocessor and you'll see what I mean.
Much of this
   complexity could have been avoided if the compressed size of
   each block of data was recorded in the data stream.</p></li>
<li class="listitem" style="list-style-type: disc"><p>An Adler-32 checksum, rather than a CRC32 checksum,
   would be faster to compute.</p></li>
</ul></div>
<p>It would be fair to say that the
<code class="computeroutput">bzip2</code> format was frozen before I
properly and fully understood the performance consequences of
doing so.</p>
<p>Improvements which I was able to incorporate into 0.9.0,
despite using the same file format, are:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc"><p>Single array implementation of the inverse BWT.  This
   significantly speeds up decompression, presumably because it
   reduces the number of cache misses.</p></li>
<li class="listitem" style="list-style-type: disc"><p>Faster inverse MTF transform for large MTF values.
   The new implementation is based on the notion of sliding blocks
   of values.</p></li>
<li class="listitem" style="list-style-type: disc"><p><code class="computeroutput">bzip2-0.9.0</code> now reads
   and writes files with <code class="computeroutput">fread</code>
   and <code class="computeroutput">fwrite</code>; version 0.1 used
   <code class="computeroutput">putc</code> and
   <code class="computeroutput">getc</code>.  Duh!  Well, you live
   and learn.</p></li>
</ul></div>
<p>Further ahead, it would be nice to be able to do random
access into files.  This will require some careful design of
compressed file formats.</p>
</div>
<div class="sect1" title="4.2.&nbsp;Portability issues">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="port-issues"></a>4.2.&nbsp;Portability issues</h2></div></div></div>
<p>After some consideration, I have decided not to use GNU
<code class="computeroutput">autoconf</code> to configure 0.9.5 or
1.0.</p>
<p><code class="computeroutput">autoconf</code>, admirable and
wonderful though it is, mainly assists with portability problems
between Unix-like platforms.
But
<code class="computeroutput">bzip2</code> doesn't have much in the
way of portability problems on Unix; most of the difficulties
appear when porting to the Mac, or to Microsoft's operating
systems.  <code class="computeroutput">autoconf</code> doesn't help
in those cases, and brings in a whole load of new
complexity.</p>
<p>Most people should be able to compile the library and
program under Unix straight out-of-the-box, so to speak,
especially if you have a version of GNU C available.</p>
<p>There are a couple of
<code class="computeroutput">__inline__</code> directives in the
code.  GNU C (<code class="computeroutput">gcc</code>) should be
able to handle them.  If you're not using GNU C, your C compiler
shouldn't see them at all.  If your compiler does, for some
reason, see them and doesn't like them, just
<code class="computeroutput">#define</code>
<code class="computeroutput">__inline__</code> to be
<code class="computeroutput">/* */</code>.  One easy way to do this
is to compile with the flag
<code class="computeroutput">-D__inline__=</code>, which should be
understood by most Unix compilers.</p>
<p>If you still have difficulties, try compiling with the
macro <code class="computeroutput">BZ_STRICT_ANSI</code> defined.
This should enable you to build the library in a strictly ANSI
compliant environment.  Building the program itself like this is
dangerous and not supported, since you remove
<code class="computeroutput">bzip2</code>'s checks against
compressing directories, symbolic links, devices, and other
not-really-a-file entities.  This could cause filesystem
corruption!</p>
<p>One other thing: if you create a
<code class="computeroutput">bzip2</code> binary for public distribution,
please consider linking it statically (<code class="computeroutput">gcc
-static</code>).  This avoids all sorts of library-version
issues that others may encounter later on.</p>
<p>If you build <code class="computeroutput">bzip2</code> on
Win32, you must set <code class="computeroutput">BZ_UNIX</code> to 0
and <code class="computeroutput">BZ_LCCWIN32</code> to 1, in the
file <code class="computeroutput">bzip2.c</code>, before compiling.
Otherwise the resulting binary won't work correctly.</p>
</div>
<div class="sect1" title="4.3.&nbsp;Reporting bugs">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="bugs"></a>4.3.&nbsp;Reporting bugs</h2></div></div></div>
<p>I tried pretty hard to make sure
<code class="computeroutput">bzip2</code> is bug free, both by
design and by testing.
Hopefully you'll never need to read this
section for real.</p>
<p>Nevertheless, if <code class="computeroutput">bzip2</code> dies
with a segmentation fault, a bus error or an internal assertion
failure, it will ask you to email me a bug report.  Experience from
years of feedback from bzip2 users indicates that almost all these
problems can be traced to either compiler bugs or hardware
problems.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="bullet">
<li class="listitem" style="list-style-type: disc">
<p>Recompile the program with no optimisation, and
  see if it works.  And/or try a different compiler.  I heard all
  sorts of stories about various flavours of GNU C (and other
  compilers) generating bad code for
  <code class="computeroutput">bzip2</code>, and I've run across two
  such examples myself.</p>
<p>2.7.X versions of GNU C are known to generate bad code
  from time to time, at high optimisation levels.  If you get
  problems, try using the flags
  <code class="computeroutput">-O2</code>
  <code class="computeroutput">-fomit-frame-pointer</code>
  <code class="computeroutput">-fno-strength-reduce</code>.
You
  should specifically <span class="emphasis"><em>not</em></span> use
  <code class="computeroutput">-funroll-loops</code>.</p>
<p>You may notice that the Makefile runs six tests as part
  of the build process.  If the program passes all of these, it's
  a pretty good (but not 100%) indication that the compiler has
  done its job correctly.</p>
</li>
<li class="listitem" style="list-style-type: disc">
<p>If <code class="computeroutput">bzip2</code>
  crashes randomly, and the crashes are not repeatable, you may
  have a flaky memory subsystem.
  <code class="computeroutput">bzip2</code> really hammers your
  memory hierarchy, and if it's a bit marginal, you may get these
  problems.  Ditto if your disk or I/O subsystem is slowly
  failing.  Yup, this really does happen.</p>
<p>Try using a different machine of the same type, and see
  if you can repeat the problem.</p>
</li>
<li class="listitem" style="list-style-type: disc"><p>This isn't really a bug, but ... If
  <code class="computeroutput">bzip2</code> tells you your file is
  corrupted on decompression, and you obtained the file via FTP,
  there is a possibility that you forgot to tell FTP to do a
  binary mode transfer.  That absolutely will cause the file to
  be non-decompressible.
You'll have to transfer it
  again.</p></li>
</ul></div>
<p>If you've incorporated
<code class="computeroutput">libbzip2</code> into your own program
and are getting problems, please, please, please, check that the
parameters you are passing in calls to the library are correct,
and in accordance with what the documentation says is allowable.
I have tried to make the library robust against such problems,
but I'm sure I haven't succeeded.</p>
<p>Finally, if the above comments don't help, you'll have to
send me a bug report.  Now, it's just amazing how many people
will send me a bug report saying something like:</p>
<pre class="programlisting">bzip2 crashed with segmentation fault on my machine</pre>
<p>and absolutely nothing else.  Needless to say, such a
report is <span class="emphasis"><em>totally, utterly, completely and
comprehensively 100% useless; a waste of your time, my time, and
net bandwidth</em></span>.  With no details at all, there's no way
I can possibly begin to figure out what the problem is.</p>
<p>The rules of the game are: facts, facts, facts.  Don't omit
them because "oh, they won't be relevant".  At the bare
minimum:</p>
<pre class="programlisting">Machine type.  Operating system version.
Exact version of bzip2 (do bzip2 -V).
Exact version of the compiler used.
Flags passed to the compiler.</pre>
<p>However, the most important single thing that will help me
is the file that you were trying to compress or decompress at the
time the problem happened.  Without that, my ability to do
anything more than speculate about the cause is limited.</p>
</div>
<div class="sect1" title="4.4.&nbsp;Did you get the right package?">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="package"></a>4.4.&nbsp;Did you get the right package?</h2></div></div></div>
<p><code class="computeroutput">bzip2</code> is a resource hog.
It soaks up large amounts of CPU cycles and memory.  Also, it
gives very large latencies.  In the worst case, you can feed many
megabytes of uncompressed data into the library before getting
any compressed output, so this probably rules out applications
requiring interactive behaviour.</p>
<p>These aren't faults of my implementation, I hope, but more
an intrinsic property of the Burrows-Wheeler transform
(unfortunately).  Maybe this isn't what you want.</p>
<p>If you want a compressor and/or library which is faster,
uses less memory but gets pretty good compression, and has
minimal latency, consider Jean-loup Gailly's and Mark Adler's
work, <code class="computeroutput">zlib-1.2.1</code> and
<code class="computeroutput">gzip-1.2.4</code>.
Look for them at
<a class="ulink" href="http://www.zlib.org" target="_top">http://www.zlib.org</a> and
<a class="ulink" href="http://www.gzip.org" target="_top">http://www.gzip.org</a>
respectively.</p>
<p>For something faster and lighter still, you might try Markus F
X J Oberhumer's <code class="computeroutput">LZO</code> real-time
compression/decompression library, at
<a class="ulink" href="http://www.oberhumer.com/opensource" target="_top">http://www.oberhumer.com/opensource</a>.</p>
</div>
<div class="sect1" title="4.5.&nbsp;Further Reading">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="reading"></a>4.5.&nbsp;Further Reading</h2></div></div></div>
<p><code class="computeroutput">bzip2</code> is not research
work, in the sense that it doesn't present any new ideas.
Rather, it's an engineering exercise based on existing
ideas.</p>
<p>Four documents describe essentially all the ideas behind
<code class="computeroutput">bzip2</code>:</p>
<div class="literallayout"><p>Michael Burrows and D. J. Wheeler:<br>
&nbsp;&nbsp;"A block-sorting lossless data compression algorithm"<br>
&nbsp;&nbsp;&nbsp;10th May 1994.<br>
&nbsp;&nbsp;&nbsp;Digital SRC Research Report 124.<br>
&nbsp;&nbsp;&nbsp;ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz<br>
&nbsp;&nbsp;&nbsp;If you have trouble finding it, try searching at the<br>
&nbsp;&nbsp;&nbsp;New Zealand Digital Library, http://www.nzdl.org.<br>
<br>
Daniel S. Hirschberg and Debra A. LeLewer<br>
&nbsp;&nbsp;"Efficient Decoding of Prefix Codes"<br>
&nbsp;&nbsp;&nbsp;Communications of the ACM, April 1990, Vol 33, Number 4.<br>
&nbsp;&nbsp;&nbsp;You might be able to get an electronic copy of this<br>
&nbsp;&nbsp;&nbsp;from the ACM Digital Library.<br>
<br>
David J. Wheeler<br>
&nbsp;&nbsp;&nbsp;Program bred3.c and accompanying document bred3.ps.<br>
&nbsp;&nbsp;&nbsp;This contains the idea behind the multi-table Huffman coding scheme.<br>
&nbsp;&nbsp;&nbsp;ftp://ftp.cl.cam.ac.uk/users/djw3/<br>
<br>
Jon L. Bentley and Robert Sedgewick<br>
&nbsp;&nbsp;"Fast Algorithms for Sorting and Searching Strings"<br>
&nbsp;&nbsp;&nbsp;Available from Sedgewick's web page,<br>
&nbsp;&nbsp;&nbsp;www.cs.princeton.edu/~rs<br>
</p></div>
<p>The
following paper gives valuable additional insights into
the algorithm, but is not immediately the basis of any code used
in bzip2.</p>
<div class="literallayout"><p>Peter Fenwick:<br>
&nbsp;&nbsp;&nbsp;Block Sorting Text Compression<br>
&nbsp;&nbsp;&nbsp;Proceedings of the 19th Australasian Computer Science Conference,<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Melbourne, Australia.&nbsp;&nbsp;Jan 31 - Feb 2, 1996.<br>
&nbsp;&nbsp;&nbsp;ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps</p></div>
<p>Kunihiko Sadakane's sorting algorithm, mentioned above, is
available from:</p>
<div class="literallayout"><p>http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz<br>
</p></div>
<p>The Manber-Myers suffix array construction algorithm is
described in a paper available from:</p>
<div class="literallayout"><p>http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps<br>
</p></div>
<p>Finally, the following papers document some
investigations I made into the performance of sorting
and decompression algorithms:</p>
<div class="literallayout"><p>Julian Seward<br>
&nbsp;&nbsp;&nbsp;On the Performance of BWT Sorting Algorithms<br>
&nbsp;&nbsp;&nbsp;Proceedings of the IEEE Data Compression Conference 2000<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Snowbird, Utah.&nbsp;&nbsp;28-30 March 2000.<br>
<br>
Julian Seward<br>
&nbsp;&nbsp;&nbsp;Space-time Tradeoffs in the Inverse B-W Transform<br>
&nbsp;&nbsp;&nbsp;Proceedings of the IEEE Data Compression Conference 2001<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Snowbird, Utah.&nbsp;&nbsp;27-29 March 2001.<br>
</p></div>
</div>
</div>
</div></body>
</html>