@(#)README 5.2 (Berkeley) 06/07/85

Compress version 3.2 enhancements:

    (a) portability mods for Z8000 and PC/XT
    (b) default to 'quiet' mode
    (c) unification of 'force' flags
    (d) multi-file bug fix for USERMEM code
    (e) decompress() speedup (5-10%)
    (f) manual page overhaul

This is the baseline for both BSD 4.3 and netnews 2.10.3 from Rick Adams.

    --jaw (June 7, 1985)
-----
Enclosed is compress version 3.0 with the following changes:

1.  "Block" compression is performed. After the BITS run out, the
    compression ratio is checked every so often. If it is decreasing,
    the table is cleared and a new set of substrings is generated.

    This makes the output of compress 3.0 incompatible with that of
    compress 2.0. However, compress 3.0 still accepts the output of
    compress 2.0. To generate output that is compatible with compress
    2.0, use the undocumented "-C" flag.

2.  A quiet "-q" flag has been added for use by the news system.

3.  The character chaining has been deleted and the program now uses
    hashing. This boosts speed, especially during compression of
    large files. Other speed improvements have been made, such as
    using putc() instead of fwrite().

4.  A large table is used on large machines when a relatively small
    number of bits is specified. This saves much time when compressing
    for a 16-bit machine on a 32-bit virtual machine.
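
As an illustration of item 1, the ratio check might look like the sketch
below. This is not the code from compress.c: the names ratio_state and
should_clear_table are hypothetical, and how often the check runs is left
to the caller.

```c
/* Illustrative sketch only; not taken from compress.c.
 * The struct and function names are hypothetical. */
struct ratio_state {
    double last_ratio;   /* input/output ratio seen at the previous check */
};

/* Meant to be called "every so often" once the code table is full.
 * Returns 1 if the table should be cleared, 0 otherwise. */
static int should_clear_table(struct ratio_state *st,
                              long bytes_in, long bytes_out)
{
    double ratio;

    if (bytes_out <= 0)
        return 0;                   /* nothing written yet, nothing to judge */

    ratio = (double)bytes_in / (double)bytes_out;
    if (ratio < st->last_ratio) {
        st->last_ratio = 0.0;       /* ratio is decreasing: start over */
        return 1;
    }
    st->last_ratio = ratio;         /* ratio is holding or improving */
    return 0;
}
```

A caller would clear its string table and reset its code size whenever
this returns 1; those steps are omitted here.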

Most of these changes were made by James A. Woods (ames!jaw). Thank you
James!

To compile compress:

    cc -O -DUSERMEM=usermem -o compress compress.c

Where "usermem" is the amount of physical user memory available (in bytes).
If any physical memory is to be reserved for other processes, put in
"-DSACREDMEM=sacredmem", where "sacredmem" is the amount to be reserved.

The difference "usermem-sacredmem" determines the maximum BITS that can be
specified, and the cutoff bits where the large+fast table is used.

memory: at least    BITS    cutoff
------  -- -----    ----    ------
       4,718,592     16       13
       2,621,440     16       12
       1,572,864     16       11
         631,808     16       --
         329,728     15       --
         178,176     14       --
          99,328     13       --
               0     12       --

The default memory size is 750,000 which gives a maximum BITS=16 and no
large+fast table.

The maximum bits can be overridden by specifying "-DBITS=bits" at
compilation time.

If your machine doesn't support unsigned characters, define "NO_UCHAR"
when compiling.

After compilation, move "compress" to a standard executable location, such
as /usr/local. Then:
    cd /usr/local
    ln compress uncompress
    ln compress zcat

On machines that have a fixed stack size (such as Perkin-Elmer), set the
stack to at least 12kb. ("setstack compress 12" on Perkin-Elmer.)
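
To make the memory table above concrete, here is a small sketch that looks
up the maximum BITS and the large+fast cutoff for a given usermem and
sacredmem. The thresholds are copied from the table; the names mem_row and
lookup_mem_row are hypothetical. The "-D" options suggest that compress
makes this choice at compile time, so this runtime lookup is only an
illustration.

```c
#include <stddef.h>

/* Thresholds copied from the README table; names are hypothetical. */
struct mem_row {
    long at_least;   /* minimum of usermem - sacredmem, in bytes */
    int  bits;       /* maximum BITS */
    int  cutoff;     /* large+fast table cutoff, 0 where the table shows "--" */
};

static const struct mem_row mem_table[] = {
    { 4718592L, 16, 13 },
    { 2621440L, 16, 12 },
    { 1572864L, 16, 11 },
    {  631808L, 16,  0 },
    {  329728L, 15,  0 },
    {  178176L, 14,  0 },
    {   99328L, 13,  0 },
    {       0L, 12,  0 }
};

/* Return the first row whose threshold fits in usermem - sacredmem. */
static const struct mem_row *lookup_mem_row(long usermem, long sacredmem)
{
    const size_t rows = sizeof mem_table / sizeof mem_table[0];
    long usable = usermem - sacredmem;
    size_t i;

    for (i = 0; i < rows; i++)
        if (usable >= mem_table[i].at_least)
            return &mem_table[i];

    /* A negative difference falls through to the smallest configuration. */
    return &mem_table[rows - 1];
}
```

With the default memory size, lookup_mem_row(750000L, 0L) selects the
631,808 row: BITS=16 and no large+fast table.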

Next, install the manual (compress.l).
    cp compress.l /usr/man/manl
    cd /usr/man/manl
    ln compress.l uncompress.l
    ln compress.l zcat.l

        - or -

    cp compress.l /usr/man/man1/compress.1
    cd /usr/man/man1
    ln compress.1 uncompress.1
    ln compress.1 zcat.1

                    regards,
                    petsd!joe

Here is a note from the net:

>From hplabs!pesnta!amd!turtlevax!ken Sat Jan 5 03:35:20 1985
Path: ames!hplabs!pesnta!amd!turtlevax!ken
From: ken@turtlevax.UUCP (Ken Turkowski)
Newsgroups: net.sources
Subject: Re: Compress release 3.0 : sample Makefile
Organization: CADLINC, Inc. @ Menlo Park, CA

In the compress 3.0 source recently posted to mod.sources, there is a
#define variable which can be set for optimum performance on a machine
with a large amount of memory. A program (usermem) to calculate the
useable amount of physical user memory is enclosed, as well as a sample
4.2bsd Vax Makefile for compress.

Here is the README file from the previous version of compress (2.0):

>Enclosed is compress.c version 2.0 with the following bugs fixed:
>
>1. The packed files produced by compress are different on different
>   machines and dependent on the vax sysgen option.
>       The bug was in the different byte/bit ordering on the
>       various machines. This has been fixed.
>
>       This version is NOT compatible with the original vax posting
>       unless the '-DCOMPATIBLE' option is specified to the C
>       compiler. The original posting has a bug which I fixed,
>       causing incompatible files. I recommend that you NOT use this
>       option unless you already have a lot of packed files from
>       the original posting by thomas.
>2. The exit status is not well defined (on some machines) causing the
>   scripts to fail.
>       The exit status is now 0, 1, or 2 and is documented in
>       compress.l.
>3. The function getopt() is not available in all C libraries.
>       The function getopt() is no longer referenced by the
>       program.
>4. Error status is not being checked on the fwrite() and fflush() calls.
>       Fixed.
>
>The following enhancements have been made:
>
>1. Added facilities of "compact" into the compress program. "Pack",
>   "Unpack", and "Pcat" are no longer required (no longer supplied).
>2. Installed workaround for C compiler bug with "-O".
>3. Added a magic number header (\037\235). Put the bits specified
>   in the file.
>4. Added "-f" flag to force overwrite of output file.
>5. Added "-c" flag and "zcat" program. 'ln compress zcat' after you
>   compile.
>6. The 'uncompress' script has been deleted; simply
>   'ln compress uncompress' after you compile and it will work.
>7. Removed extra bit masking for machines that support unsigned
>   characters. If your machine doesn't support unsigned characters,
>   define "NO_UCHAR" when compiling.
>
>Compile "compress.c" with "-O -o compress" flags. Move "compress" to a
>standard executable location, such as /usr/local. Then:
>   cd /usr/local
>   ln compress uncompress
>   ln compress zcat
>
>On machines that have a fixed stack size (such as Perkin-Elmer), set the
>stack to at least 12kb. ("setstack compress 12" on Perkin-Elmer.)
>
>Next, install the manual (compress.l).
>   cp compress.l /usr/man/manl     - or -
>   cp compress.l /usr/man/man1/compress.1
>
>Here is the README that I sent with my first posting:
>
>>Enclosed is a modified version of compress.c, along with scripts to make it
>>run identically to pack(1), unpack(1), and pcat(1). Here is what I
>>(petsd!joe) and a colleague (petsd!peora!srd) did:
>>
>>1. Removed VAX dependencies.
>>2. Changed the struct to separate arrays; saves mucho memory.
>>3. Did comparisons in unsigned, where possible. (Faster on Perkin-Elmer.)
>>4. Sorted the character next chain and changed the search to stop
>>prematurely. This saves a lot on the execution time when compressing.
>>
>>This version is totally compatible with the original version. Even though
>>lint(1) -p has no complaints about compress.c, it won't run on a 16-bit
>>machine, due to the size of the arrays.
>>
>>Here is the README file from the original author:
>>
>>>Well, with all this discussion about file compression (for news batching
>>>in particular) going around, I decided to implement the text compression
>>>algorithm described in the June Computer magazine. The author claimed
>>>blinding speed and good compression ratios. It's certainly faster than
>>>compact (but, then, what wouldn't be), but it's also the same speed as
>>>pack, and gets better compression than both of them. On 350K bytes of
>>>unix-wizards, compact took about 8 minutes of CPU, pack took about 80
>>>seconds, and compress (herein) also took 80 seconds. But, compact and
>>>pack got about 30% compression, whereas compress got over 50%. So, I
>>>decided I had something, and that others might be interested, too.
>>>
>>>As is probably true of compact and pack (although I haven't checked),
>>>the byte order within a word is probably relevant here, but as long as
>>>you stay on a single machine type, you should be ok. (Can anybody
>>>elucidate on this?) There are a couple of asm's in the code (extv and
>>>insv instructions), so anyone porting it to another machine will have to
>>>deal with this anyway (and could probably make it compatible with Vax
>>>byte order at the same time). Anyway, I've linted the code (both with
>>>and without -p), so it should run elsewhere. Note the longs in the
>>>code; you can take these out if you reduce BITS to <= 15.
>>>
>>>Have fun, and as always, if you make good enhancements, or bug fixes,
>>>I'd like to see them.
>>>
>>>=Spencer (thomas@utah-20, {harpo,hplabs,arizona}!utah-cs!thomas)
>>
>>                  regards,
>>                  joe
>>
>>--
>>Full-Name:  Joseph M. Orost
>>UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!joe
>>US Mail:    MS 313; Perkin-Elmer; 106 Apple St; Tinton Falls, NJ 07724
>>Phone:      (201) 870-5844