.\"	$OpenBSD: compress.1,v 1.29 2003/10/01 08:43:17 jmc Exp $
.\"	$NetBSD: compress.1,v 1.5 1995/03/26 09:44:34 glass Exp $
.\"
.\" Copyright (c) 1986, 1990, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" James A. Woods, derived from original work by Spencer Thomas
.\" and Joseph Orost.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)compress.1	8.2 (Berkeley) 4/18/94
.\"
.Dd April 18, 1994
.Dt COMPRESS 1
.Os
.Sh NAME
.Nm compress ,
.Nm uncompress ,
.Nm gzip ,
.Nm gunzip
.Nd compress and expand data
.Sh SYNOPSIS
.Nm compress
.Op Fl LV
.Nm compress
.Op Fl cdfghlOnNqrtv123456789
.Op Fl b Ar bits
.Op Fl S Ar suffix
.Op Fl o Ar filename
.Op Ar
.Nm uncompress
.Op Fl cfhlnNqrtv
.Op Fl o Ar filename
.Op Ar
.Pp
.Nm gzip
.Op Fl LV
.Nm gzip
.Op Fl cdfghlnNOqrtv123456789
.Op Fl b Ar bits
.Op Fl S Ar suffix
.Op Fl o Ar filename
.Op Ar
.Nm gunzip
.Op Fl cfhnNqrltv
.Op Fl o Ar filename
.Op Ar
.Pp
.Nm zcat
.Op Fl fghqr
.Op Ar
.Nm gzcat
.Op Fl fhqr
.Op Ar
.Sh DESCRIPTION
The
.Nm compress
and
.Nm gzip
utilities
reduce the size of the named files using adaptive Lempel-Ziv coding.
They are functionally identical, but use different algorithms for compression.
If invoked as
.Nm gzip
or
.Nm compress Fl g ,
the deflate mode of compression is chosen by default;
otherwise the older method of compression
.Pq compress mode
is used.
.Pp
Each
.Ar file
is renamed to the same name plus the extension
.Dq .Z ,
or
.Dq .gz
(in deflate mode).
As many of the modification time, access time, file flags, file mode,
user ID, and group ID as allowed by permissions are retained in the
new file.
If compression would not reduce the size of a
.Ar file ,
the file is ignored (unless
.Fl f
is used).
.Pp
The
.Nm uncompress
and
.Nm gunzip
utilities restore compressed files to their original form, renaming the
files by removing the extension (or by using the stored name if the
.Fl N
flag is specified).
When decompressing, the following extensions are recognized:
.Dq .Z ,
.Dq -Z ,
.Dq _Z ,
.Dq .gz ,
.Dq -gz ,
.Dq _gz ,
.Dq .tgz ,
.Dq -tgz ,
.Dq _tgz ,
.Dq .taz ,
.Dq -taz ,
and
.Dq _taz .
Extensions ending in
.Dq tgz
and
.Dq taz
are not removed when decompressing; instead, they are converted to
.Dq tar .
.Pp
The
.Nm zcat
command is equivalent in functionality to
.Nm uncompress
.Fl c .
The
.Nm gzcat
command is equivalent in functionality to
.Nm gunzip
.Fl c .
.Pp
If renaming the files would cause files to be overwritten and the standard
input device is a terminal, the user is prompted (on the standard error
output) for confirmation.
If prompting is not possible or confirmation is not received, the files
are not overwritten.
.Pp
If no files are specified, the standard input is compressed or uncompressed
to the standard output.
If either the input or output files are not regular files, the checks for
reduction in size and file overwriting are not performed, the input file is
not removed, and the attributes of the input file are not retained.
.Pp
The options are as follows:
.Bl -tag -width Ds
.It Fl V
Display the program version
.Pq RCS IDs of the source files
and exit.
.It Fl b Ar bits
Specify the
.Ar bits
code limit
.Pq see below .
.It Fl c
Compressed or uncompressed output is written to the standard output.
No files are modified (force
.Nm zcat
or
.Nm gzcat
mode).
.It Fl d
Decompress the source files instead of compressing them (force
.Nm uncompress
mode).
.It Fl f
Force compression of
.Ar file ,
even if it is not actually reduced in size.
Additionally, files are overwritten without prompting for confirmation.
If the input data is not in a format recognized by
.Nm
and if the option
.Fl c
is also given, copy the input data without change
to the standard output: let
.Nm zcat
or
.Nm gzcat
behave as
.Xr cat 1 .
.It Fl g
Use the deflate scheme, which reportedly provides better compression rates (force
.Nm gzip
mode).
This flag need not be specified when invoked as
.Nm gzip .
.It Fl h
Print a short help message.
.It Fl l
List information for the specified compressed files.
The following information is listed:
.Bl -tag -width "compression ratio"
.It compressed size
Size of the compressed file.
.It uncompressed size
Size of the file when uncompressed.
.It compression ratio
Ratio of the difference between the uncompressed and compressed
sizes to the uncompressed size.
For example, a 1000-byte file that compresses to 400 bytes has a
ratio of 60%.
.It uncompressed name
Name the file will be saved as when uncompressing.
.El
.Pp
If the
.Fl v
option is specified, the following additional information is printed:
.Bl -tag -width "compression method"
.It compression method
Name of the method used to compress the file.
.It crc
32-bit CRC
.Pq cyclic redundancy code
of the uncompressed file.
.It "time stamp"
Date and time corresponding to the last data modification time
(mtime) of the compressed file (if the
.Fl n
option is specified, the time stamp stored in the compressed file
is printed instead).
.El
.It Fl n
When compressing, do not save the original file name and time stamp.
This information is saved by default when the deflate scheme is used.
When uncompressing, do not restore the original file name and time stamp.
By default, the uncompressed file inherits the time stamp of the
compressed version and the uncompressed file name is generated from
the name of the compressed file as described above.
.It Fl N
When compressing, save the original file name and time stamp in the
compressed file.
This information is saved by default when the deflate scheme is used.
When uncompressing or listing, use the time stamp and file name stored
in the compressed file, if any, for the uncompressed version.
.It Fl 1...9
Use the deflate scheme with a compression factor of
.Fl 1
to
.Fl 9 .
Compression factor
.Fl 1
is the fastest, but provides a poorer level of compression.
Compression factor
.Fl 9
provides the best level of compression, but is relatively slow.
The default is
.Fl 6 .
This option implies
.Fl g .
.It Fl O
Use the old compression method.
.It Fl o Ar filename
Set the output file name.
.It Fl S Ar suffix
Set the suffix for compressed files.
.It Fl t
Test the integrity of each file, leaving any files intact.
.It Fl r
Recursive mode:
.Nm
will descend into specified directories.
.It Fl q
Be quiet: suppress all messages.
.It Fl v
Print the percentage reduction of each file and other information.
.El
.Pp
In normal mode,
.Nm
uses a modified Lempel-Ziv algorithm
.Pq LZW .
Common substrings in the file are first replaced by 9-bit codes 257 and up.
When code 512 is reached, the algorithm switches to 10-bit codes and
continues to use more bits until the
limit specified by the
.Fl b
flag is reached.
.Ar bits
must be between 9 and 16
.Pq the default is 16 .
.Pp
After the
.Ar bits
limit is reached,
.Nm
periodically checks the compression ratio.
If it is increasing,
.Nm
continues to use the existing code dictionary.
However, if the compression ratio decreases,
.Nm
discards the table of substrings and rebuilds it from scratch.
This allows the algorithm to adapt to the next
.Dq block
of the file.
.Pp
.Nm gzip
uses a slightly different version of the Lempel-Ziv algorithm
.Pq LZ77 .
Common substrings are replaced by pointers to previous strings,
and are found using a hash table.
Unique substrings are emitted as a string of literal bytes,
and compressed using Huffman trees.
.Pp
The
.Fl b
flag is omitted for
.Nm uncompress
or
.Nm gunzip
since the
.Ar bits
parameter specified during compression
is encoded within the output, along with
a magic number to ensure that neither decompression of random data nor
recompression of compressed data is attempted.
.Pp
The amount of compression obtained depends on the size of the
input, the number of
.Ar bits
per code, and the distribution of common substrings.
Typically, text such as source code or English is reduced by 50 \- 60% using
.Nm
and by 60 \- 70% using
.Nm gzip .
Compression is generally much better than that achieved by Huffman
coding (as used in the historical command pack), or adaptive Huffman
coding (as used in the historical command compact), and takes less
time to compute.
.Pp
The
.Nm
and
.Nm gzip
utilities exit with 0 on success, 1 if an error occurred, or 2 if one or
more files were not compressed because they would have grown in
size (and
.Fl f
was not specified).
.Sh RETURN VALUES
The
.Nm
utility exits with one of the following values:
.Pp
.Bl -tag -width flag -compact
.It Li 0
The file was compressed successfully.
.It Li 1
An error occurred.
.It Li 2
A warning occurred.
.El
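.Sh EXAMPLES
The following commands are an illustrative sketch based only on the
options described above; they are not taken from the original manual.
The file name
.Pa sample.txt
is hypothetical.
A compressible file is created, compressed to the standard output with
.Nm gzip
.Fl c ,
tested with
.Fl t ,
and decompressed for comparison with the original.
The exit status of the test is examined as described in
.Sx RETURN VALUES .
.Bd -literal -offset indent
# Create a hypothetical, highly repetitive sample file.
i=0
while [ "$i" -lt 200 ]; do
	echo "the quick brown fox jumps over the lazy dog"
	i=$((i + 1))
done > sample.txt

# Compress to standard output; sample.txt itself is not modified.
gzip -c sample.txt > sample.txt.gz

# Test the integrity of the compressed file and branch on the exit status.
gzip -q -t sample.txt.gz
case $? in
0)	echo "ok" ;;
1)	echo "error" ;;
2)	echo "warning" ;;
esac

# Decompress to standard output and compare with the original.
gunzip -c sample.txt.gz | cmp - sample.txt
.Ed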
.Sh SEE ALSO
.Xr compress 3
.Pp
.Rs
.%A Welch, Terry A.
.%D June, 1984
.%T "A Technique for High Performance Data Compression"
.%J "IEEE Computer"
.%V 17:6
.%P pp. 8-19
.Re
.Pp
.Bl -tag -width 12n -compact
.It RFC 1950
ZLIB Compressed Data Format Specification.
.It RFC 1951
DEFLATE Compressed Data Format Specification.
.It RFC 1952
GZIP File Format Specification.
.El
.Sh STANDARDS
The
.Nm
utility is compliant with the
.St -p1003.2-92
specification.
.Pp
The
.Nm gzip
and
.Nm gunzip
utilities are extensions.
.Sh HISTORY
The
.Nm
command appeared in
.Bx 4.3 .
The deflate compression support was added in
.Ox 2.1 .
Full
.Nm gzip
compatibility was added in
.Ox 3.4 .
The
.Sq g
in this version of
.Nm gzip
stands for
.Dq gratis .
421.Dq gratis .
422