This is a patched version of zlib modified to use
Pentium-optimized assembly code in the deflation algorithm. The files
changed/added by this patch are:

README.586
match.S

The effectiveness of these modifications is a bit marginal, as the
program's bottleneck seems to be mostly L1-cache contention, which
there is no real way to work around without rewriting the basic
algorithm. The speedup on average is around 5-10% (which is generally
less than the amount of variance between subsequent executions).
However, when used at level 9 compression, the cache contention can
drop enough for the assembly version to achieve a 10-20% speedup (and
sometimes more, depending on the amount of overall redundancy in the
files). Even here, though, cache contention can still be the limiting
factor, depending on the nature of the program using the zlib library.
This may also mean that better improvements will be seen on a Pentium
with MMX, which suffers much less from L1-cache contention, but I have
not yet verified this.

Note that this code has been tailored for the Pentium in particular,
and will not perform well on the Pentium Pro (due to the use of a
partial register in the inner loop).

If you are using an assembler other than GNU as, you will have to
translate match.S to use your assembler's syntax. (Have fun.)

Brian Raiter
breadbox@muppetlabs.com
April, 1998


Added for zlib 1.1.3:

The patches come from
http://www.muppetlabs.com/~breadbox/software/assembly.html

To compile zlib with this asm file, copy match.S to the zlib directory
then do:

CFLAGS="-O3 -DASMV" ./configure
make OBJA=match.o
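
As background on what the assembly replaces: building with -DASMV swaps
zlib's C longest_match() routine in deflate.c for the version in match.S.
The sketch below is only a naive illustration of that kind of search, not
zlib's code and not a translation of match.S. The function name
naive_longest_match and all of its parameters are hypothetical, and it
omits the hash chains and other shortcuts a real implementation relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of zlib: brute-force longest-match search.
 * Finds the longest run starting at buf[cur] that also starts at some
 * earlier position no more than `window` bytes back.
 * Returns the match length (0 if none) and stores the distance back to
 * the start of the match in *dist (0 if none). */
static size_t naive_longest_match(const unsigned char *buf, size_t len,
                                  size_t cur, size_t window,
                                  size_t max_len, size_t *dist)
{
    size_t best_len = 0;
    size_t start = cur > window ? cur - window : 0;

    assert(buf != NULL && dist != NULL && cur < len);
    *dist = 0;

    for (size_t cand = start; cand < cur; cand++) {
        size_t n = 0;

        /* A match may run past `cur` and overlap the text being matched. */
        while (n < max_len && cur + n < len && buf[cand + n] == buf[cur + n])
            n++;

        /* Strict comparison: on ties the farthest candidate is kept. */
        if (n > best_len) {
            best_len = n;
            *dist = cur - cand;
        }
    }
    return best_len;
}
```

For deflate-like limits, a caller would pass a window of 32768 and a
max_len of 258. Every candidate position is compared byte by byte against
the current position, which suggests why the inner comparison loop is the
natural target for hand-written assembly.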