.\" Copyright (c) 1986, 1990 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" James A. Woods, derived from original work by Spencer Thomas
.\" and Joseph Orost.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     from: @(#)compress.1	6.10 (Berkeley) 7/30/91
.\"	$Id: compress.1,v 1.2 1993/08/01 07:33:30 mycroft Exp $
.\"
.Dd July 30, 1991
.Dt COMPRESS 1
.Os BSD 4.3
.Sh NAME
.Nm compress ,
.Nm uncompress ,
.Nm zcat
.Nd compress and expand data
.Sh SYNOPSIS
.Nm compress
.Op Fl f
.Op Fl v
.Op Fl c
.Op Fl b Ar bits
.Op Ar
.Nm uncompress
.Op Fl f
.Op Fl v
.Op Fl c
.Ar
.Nm zcat
.Op Ar
.Sh DESCRIPTION
.Nm Compress
reduces the size of the named files using adaptive Lempel-Ziv coding.
Whenever possible,
each
.Ar file
is replaced by one with the extension
.Ar \&.Z ,
while keeping the same ownership, modes, and access and modification times.
If no files are specified, the standard input is compressed to the
standard output.
Compressed files can be restored to their original form using
.Nm uncompress
or
.Nm zcat .
.Bl -tag -width Ds
.It Fl f
Force compression of
.Ar file ,
even if it does not actually shrink
or the corresponding
.Ar file.Z
file already exists.
If
.Fl f
is not given, the user is prompted as to whether an existing
.Ar file.Z
file should be overwritten, except when
.Nm compress
is run in the background under
.Pa /bin/sh .
.It Fl c
(``cat'').
.Nm Compress
or
.Nm uncompress
writes to the standard output; no files are changed.
The nondestructive behavior of
.Nm zcat
is identical to that of
.Nm uncompress
.Fl c .
.It Fl b
Specify the
.Ar bits
code limit (see below).
.It Fl v
Print the percentage reduction of each file.
.El
.Pp
.Nm Compress
uses the modified Lempel-Ziv algorithm popularized in
"A Technique for High Performance Data Compression",
Terry A. Welch,
.Em IEEE Computer ,
vol. 17,
no. 6
(June 1984), pp. 8-19.
Common substrings in the file are first replaced by 9-bit codes 257 and up.
When code 512 is reached, the algorithm switches to 10-bit codes and
continues to use more bits until the
limit specified by the
.Fl b
flag is reached (default 16).
.Ar Bits
must be between 9 and 16.
The default can be changed in the source to allow
.Nm compress
to be run on a smaller machine.
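.Pp
The growth of the code width described above can be sketched in Python.
This is a minimal illustration that follows the wording of this page
rather than the
.Nm compress
source, which may switch widths at slightly different points; the name
.Fn code_width
is hypothetical.
.Bd -literal -offset indent
```python
def code_width(code, maxbits=16):
    """Return the number of bits used to write `code` (a sketch)."""
    if not 9 <= maxbits <= 16:
        raise ValueError("bits must be between 9 and 16")
    # bit_length() is the smallest width that can hold the code.
    # Codes start out 9 bits wide and never exceed the -b limit.
    return min(max(9, code.bit_length()), maxbits)
```
.Ed
Under these assumptions code 511 is still written with 9 bits and
code 512 is the first to need 10.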
.Pp
After the
.Ar bits
limit is attained,
.Nm compress
periodically checks the compression ratio.
If it is increasing,
.Nm compress
continues to use the existing code dictionary.
However, if the compression ratio decreases,
.Nm compress
discards the table of substrings and rebuilds it from scratch.
This allows the algorithm to adapt to the next "block" of the file.
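.Pp
The decision made at each check can be sketched as follows.
This page does not say how often the check runs or how the ratio is
computed, so the input-to-output byte ratio used here and the name
.Fn should_rebuild
are assumptions made for illustration only.
.Bd -literal -offset indent
```python
def should_rebuild(bytes_in, bytes_out, last_ratio):
    """Return (rebuild, ratio) for one periodic check (a sketch)."""
    if bytes_out <= 0:
        raise ValueError("bytes_out must be positive")
    ratio = bytes_in / bytes_out
    if ratio < last_ratio:
        # Compression got worse: discard the table, start afresh.
        return True, 0.0
    # Ratio is holding or improving: keep the existing dictionary.
    return False, ratio
```
.Ed
The second value returned is the ratio to compare against at the
next check.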
.Pp
Note that the
.Fl b
flag is omitted for
.Nm uncompress
since the
.Ar bits
parameter specified during compression
is encoded within the output, along with
a magic number to ensure that neither decompression of random data nor
recompression of compressed data is attempted.
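.Pp
A reader of the header might look like the sketch below.
The byte values are not given on this page; they are those of the
commonly distributed
.Pa .Z
format (magic bytes 0x1f 0x9d, with the
.Ar bits
value in the low five bits of the third byte).
The name
.Fn read_bits
and the
.Va supported
parameter are hypothetical.
.Bd -literal -offset indent
```python
MAGIC = bytes([0x1F, 0x9D])  # first two bytes of a .Z stream
BITS_MASK = 0x1F             # low five bits of the third byte

def read_bits(header, supported=16):
    """Return the bits value stored in a .Z header (a sketch)."""
    if len(header) < 3 or header[:2] != MAGIC:
        raise ValueError("not in compressed format")
    bits = header[2] & BITS_MASK
    if bits > supported:
        raise ValueError(
            "compressed with %d bits, can only handle %d bits"
            % (bits, supported))
    return bits
```
.Ed
The two error strings mirror the messages listed under
.Sx DIAGNOSTICS .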
.Pp
.ne 8
The amount of compression obtained depends on the size of the
input, the number of
.Ar bits
per code, and the distribution of common substrings.
Typically, text such as source code or English
is reduced by 50\-60%.
Compression is generally much better than that achieved by
Huffman coding (as used in the historical command
pack),
or adaptive Huffman coding (as
used in the historical command
compact),
and takes less time to compute.
.Pp
If an error occurs, the exit status is 1;
if the last file was not compressed because it became larger, the status
is 2; otherwise the status is 0.
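.Pp
A script can branch on these statuses.
The sketch below assumes that
.Nm compress
is installed and found on the search path; the helper names
.Fn describe_status
and
.Fn compress_file
are hypothetical.
.Bd -literal -offset indent
```python
import subprocess

def describe_status(status):
    """Map an exit status to the meaning given on this page."""
    if status == 0:
        return "ok"
    if status == 1:
        return "error"
    if status == 2:
        return "last file not compressed: it became larger"
    return "unknown"  # not documented on this page

def compress_file(path):
    """Run compress on `path` and describe the result."""
    result = subprocess.run(["compress", path])
    return describe_status(result.returncode)
```
.Ed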
.Sh DIAGNOSTICS
.Bl -tag -width flag
.It "Usage: compress [-fvc] [-b maxbits] [file ...]"
Invalid options were specified on the command line.
.It "Missing maxbits"
Maxbits must follow
.Fl b .
.It Ar file : No "not in compressed format"
The file specified to
.Nm uncompress
has not been compressed.
.It Xo
.Ar file : No "compressed with"
.Ar \&xx No "bits, can only handle"
.Ar \&yy No bits
.Xc
.Ar File
was compressed by a program that could deal with
more
.Ar bits
than the compress code on this machine.
Recompress the file with smaller
.Ar bits .
.It Ar file : No "already has .Z suffix -- no change"
The file is assumed to be already compressed.
Rename the file and try again.
.It Ar file : No "filename too long to tack on .Z"
The file cannot be compressed because its name is longer than
12 characters.
Rename and try again.
This message does not occur on
.Bx
systems.
.It Ar file No "already exists; do you wish to overwrite (y or n)?"
Respond "y" if you want the output file to be replaced; "n" if not.
.It "uncompress: corrupt input"
A
.Dv SIGSEGV
violation was detected, which usually means that the input file is
corrupted.
.It Compression: Em "xx.xx%"
Percentage of the input saved by compression.
(Relevant only for
.Fl v . )
.It "-- not a regular file: unchanged"
When the input file is not a regular file
(e.g. a directory), it is
left unaltered.
.It "-- has" Ar xx No "other links: unchanged"
The input file has links; it is left unchanged.
See
.Xr ln 1
for more information.
.It "-- file unchanged"
No savings are achieved by
compression.
The input file is left unchanged.
.El
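.Pp
The percentage reported with
.Fl v
can be reproduced approximately as shown below.
The exact arithmetic and rounding used by
.Nm compress
are not documented on this page, so the formula and the names
.Fn percent_saved
and
.Fn format_report
are assumptions.
.Bd -literal -offset indent
```python
def percent_saved(bytes_in, bytes_out):
    """Percentage of the input saved; negative if the file grew."""
    if bytes_in <= 0:
        raise ValueError("bytes_in must be positive")
    return 100.0 * (bytes_in - bytes_out) / bytes_in

def format_report(bytes_in, bytes_out):
    """Format the saving in the style of the -v diagnostic."""
    saved = percent_saved(bytes_in, bytes_out)
    return "Compression: %.2f%%" % saved
```
.Ed
A negative result corresponds to the case in which the file is left
unchanged because compression would not save anything.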
.Sh FILES
.Bl -tag -width file.Z
.It Pa file.Z
compressed file is file.Z
.El
.Sh BUGS
Although compressed files are compatible between machines with large memory,
.Fl b Ns Ar 12
should be used for file transfer to architectures with
a small process data space (64KB or less, as exhibited by the
.Tn DEC PDP
series, the Intel 80286, etc.).
.Pp
.Nm Compress
should be more flexible about the existence of the `.Z' suffix.
.Sh HISTORY
The
.Nm
command appeared in
.Bx 4.3 .