.\" $OpenBSD: compress.1,v 1.22 2003/07/20 13:25:52 millert Exp $
.\" $NetBSD: compress.1,v 1.5 1995/03/26 09:44:34 glass Exp $
.\"
.\" Copyright (c) 1986, 1990, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" James A. Woods, derived from original work by Spencer Thomas
.\" and Joseph Orost.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)compress.1 8.2 (Berkeley) 4/18/94
.\"
.Dd April 18, 1994
.Dt COMPRESS 1
.Os
.Sh NAME
.Nm compress ,
.Nm uncompress ,
.Nm gzip ,
.Nm gunzip
.Nd compress and expand data
.Sh SYNOPSIS
.Nm compress
.Op Fl LV
.Nm compress
.Op Fl cdfghlOnNqrtv123456789
.Op Fl b Ar bits
.Op Fl S Ar suffix
.Op Fl o Ar filename
.Op Ar
.Nm uncompress
.Op Fl cfhlnNqrtv
.Op Fl o Ar filename
.Op Ar
.Pp
.Nm gzip
.Op Fl LV
.Nm gzip
.Op Fl cdfghlnNOqrtv123456789
.Op Fl b Ar bits
.Op Fl S Ar suffix
.Op Fl o Ar filename
.Op Ar
.Nm gunzip
.Op Fl cfhnNqrltv
.Op Fl o Ar filename
.Op Ar
.Pp
.Nm zcat
.Op Fl fhqr
.Op Ar
.Sh DESCRIPTION
The
.Nm compress
and
.Nm gzip
utilities
reduce the size of the named files using adaptive Lempel-Ziv coding.
They are functionally identical, but use different algorithms for compression.
If invoked as
.Nm gzip
or
.Nm compress Fl g
the deflate mode of compression is chosen by default;
otherwise the older method of compression
.Pq compress mode
is used.
.Pp
Each
.Ar file
is renamed to the same name plus the extension
.Dq .Z ,
or
.Dq .gz
(in deflate mode).
As many of the modification time, access time, file flags, file mode,
user ID, and group ID as allowed by permissions are retained in the
new file.
If compression would not reduce the size of a
.Ar file ,
the file is ignored (unless
.Fl f
is used).
.Pp
The
.Nm uncompress
and
.Nm gunzip
utilities restore compressed files to their original form, renaming the
files by removing the
.Dq .Z
or
.Dq .gz
extension.
.Pp
The
.Nm zcat
command is equivalent in functionality to
.Nm uncompress
.Fl c .
.Pp
If renaming the files would cause files to be overwritten and the standard
input device is a terminal, the user is prompted (on the standard error
output) for confirmation.
If prompting is not possible or confirmation is not received, the files
are not overwritten.
.Pp
If no files are specified, the standard input is compressed or uncompressed
to the standard output.
If either the input or output files are not regular files, the checks for
reduction in size and file overwriting are not performed, the input file is
not removed, and the attributes of the input file are not retained.
.Pp
The options are as follows:
.Bl -tag -width Ds
.It Fl V
Display the program version (RCS Ids of the source files) and exit.
.It Fl b Ar bits
Specify the
.Ar bits
code limit (see below).
.It Fl c
Compressed or uncompressed output is written to the standard output.
No files are modified (force
.Nm zcat
mode).
.It Fl d
Decompress the source files instead of compressing them (force
.Nm uncompress
mode).
.It Fl f
Force compression of
.Ar file ,
even if it is not actually reduced in size.
Additionally, files are overwritten without prompting for confirmation.
.It Fl g
Use the deflate scheme, which reportedly provides better compression rates
(force
.Nm gzip
mode).
This flag need not be specified when invoked as
.Nm gzip .
.It Fl h
Print a short help message.
.It Fl l
List information for the specified compressed files.
The following information is listed:
.Bl -tag -width Ds -offset indent
.It compressed size
size of the compressed file
.It uncompressed size
size of the file when uncompressed
.It compression ratio
ratio of the difference between the compressed and uncompressed
sizes to the uncompressed size
.It uncompressed name
name the file will be saved as when uncompressing
.El
.Pp
If the
.Fl v
option is specified, the following additional information is printed:
.Bl -tag -width Ds -offset indent
.It compression method
name of the method used to compress the file
.It crc
32-bit CRC of the uncompressed file
.It "time stamp"
date and time corresponding to the last data modification time
(mtime) of the compressed file (if the
.Fl N
option is specified, the time stamp stored in the compressed file
is printed instead).
.El
.It Fl n
When compressing, do not save the original file name and time stamp.
This information is saved by default when the deflate scheme is used.
When uncompressing, do not restore the original file name and time stamp.
By default, the uncompressed file inherits the time stamp of the
compressed version and the uncompressed file name is generated by
stripping the
.Dq .Z
or
.Dq .gz
extension from the compressed file name.
.It Fl N
When compressing, save the original file name and time stamp in the
compressed file.
This information is saved by default when the deflate scheme is used.
When uncompressing or listing, use the time stamp and file name stored
in the compressed file, if any, for the uncompressed version.
.It Fl 1...9
Use the deflate scheme with a compression factor of
.Fl 1
to
.Fl 9 .
Compression factor
.Fl 1
is the fastest, but provides a poorer level of compression.
Compression factor
.Fl 9
provides the best level of compression, but is relatively slow.
The default is
.Fl 6 .
This option implies
.Fl g .
.It Fl O
Use the old compression method.
.It Fl o Ar filename
Set the output file name.
.It Fl S Ar suffix
Set the suffix for compressed files.
.It Fl t
Test the integrity of each file, leaving all files intact.
.It Fl r
Recursive mode:
.Nm
descends into the specified directories.
.It Fl q
Be quiet: suppress all messages.
.It Fl v
Print the percentage reduction of each file and other information.
.El
.Pp
In normal mode,
.Nm
uses a modified Lempel-Ziv algorithm
.Pq LZW .
Common substrings in the file are first replaced by 9-bit codes 257 and up.
When code 512 is reached, the algorithm switches to 10-bit codes and
continues to use more bits until the
limit specified by the
.Fl b
flag is reached.
.Ar bits
must be between 9 and 16 (the default is 16).
.\" XXX - use .br here to work-around an apparent bug in mdoc
.br
.Pp
After the
.Ar bits
limit is reached,
.Nm
periodically checks the compression ratio.
If it is increasing,
.Nm
continues to use the existing code dictionary.
However, if the compression ratio decreases,
.Nm
discards the table of substrings and rebuilds it from scratch.
This allows the algorithm to adapt to the next
.Dq block
of the file.
.Pp
.Nm gzip
uses a slightly different version of the Lempel-Ziv algorithm
.Pq LZ77 .
Common substrings are replaced by pointers to previous strings,
and are found using a hash table.
Unique substrings are emitted as a string of literal bytes,
and compressed using Huffman trees.
.Pp
The
.Fl b
flag is omitted for
.Nm uncompress
or
.Nm gunzip
since the
.Ar bits
parameter specified during compression
is encoded within the output, along with
a magic number to ensure that neither decompression of random data nor
recompression of compressed data is attempted.
.Pp
The amount of compression obtained depends on the size of the
input, the number of
.Ar bits
per code, and the distribution of common substrings.
Typically, text such as source code or English is reduced by 50\-60% using
.Nm
and by 60\-70% using
.Nm gzip .
Compression is generally much better than that achieved by Huffman
coding (as used in the historical command pack), or adaptive Huffman
coding (as used in the historical command compact), and takes less
time to compute.
.Pp
The
.Nm
and
.Nm gzip
utilities exit with 0 on success, 1 if an error occurred, or 2 if one or
more files were not compressed because they would have grown in
size (and
.Fl f
was not specified).
.Sh RETURN VALUES
The
.Nm
utility exits with one of the following values:
.Pp
.Bl -tag -width flag -compact
.It Li 0
The file was compressed successfully.
.It Li 1
An error occurred.
.It Li 2
A warning occurred.
.El
.Sh SEE ALSO
.Rs
.%A Welch, Terry A.
.%D June, 1984
.%T "A Technique for High Performance Data Compression"
.%J "IEEE Computer"
.%V 17:6
.%P pp. 8-19
.Re
.Pp
.Bl -tag -width 12n -compact
.It RFC1950
ZLIB Compressed Data Format Specification
.It RFC1951
DEFLATE Compressed Data Format Specification
.It RFC1952
GZIP File Format Specification
.El
.Sh STANDARDS
The
.Nm
utility is compliant with the
.St -p1003.2-92
specification.
.Pp
The
.Nm gzip
and
.Nm gunzip
utilities are extensions.
.Sh HISTORY
The
.Nm
command appeared in
.Bx 4.3 .
The deflate compression support was added in
.Ox 2.1 .
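.Sh EXAMPLES
The following Python sketch illustrates two rules stated in DESCRIPTION:
how the code width grows from 9 bits up to the
.Ar bits
limit in compress mode, and how the compression ratio listed by
.Fl l
is derived from the two sizes.
It is a simplified illustration of those rules only, not the code used by
.Nm ;
the function names
.Li code_width
and
.Li reduction
are hypothetical.
.Bd -literal -offset indent
```python
# Hypothetical helpers for illustration; they are not part of compress.

def code_width(code, max_bits=16):
    # Bits needed to emit an LZW code: at least 9, growing by one bit
    # whenever the code value reaches the next power of two (512, 1024, ...).
    if not 9 <= max_bits <= 16:
        raise ValueError("bits must be between 9 and 16")
    if not 0 <= code < 2 ** max_bits:
        raise ValueError("code does not fit within the bits limit")
    return max(9, code.bit_length())


def reduction(uncompressed_size, compressed_size):
    # Ratio of the difference between the two sizes to the uncompressed size.
    if uncompressed_size <= 0:
        raise ValueError("uncompressed size must be positive")
    return (uncompressed_size - compressed_size) / uncompressed_size
```
.Ed
.Pp
For instance, code_width(512) yields 10, and reduction(1000, 400)
yields 0.6, that is, a 60% reduction.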