Programs and scripts for automated testing of Zstandard
=======================================================

This directory contains the following programs and scripts:
- `datagen` : Synthetic and parameterizable data generator, for tests
- `fullbench`  : Precisely measure speed for each zstd inner function
- `fuzzer`  : Test tool, to check zstd integrity on target platform
- `paramgrill` : parameter tester for zstd
- `test-zstd-speed.py` : script for testing zstd speed difference between commits
- `test-zstd-versions.py` : compatibility test between zstd versions stored on GitHub (v0.1+)
- `zstreamtest` : Fuzzer test tool for zstd streaming API
- `legacy` : Test tool to test decoding of legacy zstd frames
- `decodecorpus` : Tool to generate valid Zstandard frames, for verifying decoder implementations

#### `test-zstd-versions.py` - script for testing zstd interoperability between versions

This script creates a `versionsTest` directory into which the zstd repository is cloned.
Then all tagged (released) versions of zstd are compiled.
In the following step, interoperability between zstd versions is checked.

#### `automated_benchmarking.py` - script for benchmarking zstd PRs to dev

This script benchmarks changes from pull requests made to zstd and compares
them against facebook:dev to detect regressions. It currently runs on a dedicated
desktop machine for every pull request that is made to the zstd repo, but it can also
be run on any machine via the command line interface.

There are three modes of usage for this script: `fastmode` runs a minimal single
build comparison (between facebook:dev and facebook:release); `onetime` pulls all the current
pull requests from the zstd repo and compares facebook:dev to each of them once; `continuous`
continuously gets pull requests from the zstd repo and runs benchmarks against facebook:dev.
The `--mode` help text also lists `current`, which is not described here.

```
Example usage: python automated_benchmarking.py
```

```
usage: automated_benchmarking.py [-h] [--directory DIRECTORY]
                                 [--levels LEVELS] [--iterations ITERATIONS]
                                 [--emails EMAILS] [--frequency FREQUENCY]
                                 [--mode MODE] [--dict DICT]

optional arguments:
  -h, --help            show this help message and exit
  --directory DIRECTORY
                        directory with files to benchmark
  --levels LEVELS       levels to test e.g. ('1,2,3')
  --iterations ITERATIONS
                        number of benchmark iterations to run
  --emails EMAILS       email addresses of people who will be alerted upon
                        regression. Only for continuous mode
  --frequency FREQUENCY
                        specifies the number of seconds to wait before each
                        successive check for new PRs in continuous mode
  --mode MODE           'fastmode', 'onetime', 'current', or 'continuous' (see
                        README.md for details)
  --dict DICT           filename of dictionary to use (when set, this
                        dictionary will be used to compress the files provided
                        inside --directory)
```

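When the script is driven from another tool, the options listed above can be assembled programmatically. The following is a minimal sketch and not part of the zstd repository: `build_benchmark_command` is a hypothetical helper, and only the flag names and mode names are taken from the usage text above. It builds the argument list without running anything.

```python
import sys

# Mode names copied from the --mode help text above.
VALID_MODES = ("fastmode", "onetime", "current", "continuous")


def build_benchmark_command(mode, directory=None, levels=None,
                            iterations=None, dictionary=None,
                            script="automated_benchmarking.py"):
    """Return an argv list for the script (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = [sys.executable, script, "--mode", mode]
    if directory is not None:
        cmd += ["--directory", directory]
    if levels is not None:
        # --levels takes a comma-separated string such as '1,2,3'.
        cmd += ["--levels", ",".join(str(level) for level in levels)]
    if iterations is not None:
        cmd += ["--iterations", str(iterations)]
    if dictionary is not None:
        cmd += ["--dict", dictionary]
    return cmd


if __name__ == "__main__":
    # Only print the command; pass the list to subprocess.run() to launch it.
    print(" ".join(build_benchmark_command("fastmode", levels=[1, 2, 3])))
```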
#### `test-zstd-speed.py` - script for testing zstd speed difference between commits

DEPRECATED

This script creates a `speedTest` directory into which the zstd repository is cloned.
Then it compiles all branches of zstd and performs a speed benchmark for a given list of files (the `testFileNames` parameter).
After `sleepTime` seconds (an optional parameter, default 300), the script checks the repository for new commits.
If a new commit is found, it is compiled and a speed benchmark for this commit is performed.
The results of the speed benchmark are compared to the previous results.
If the compression or decompression speed for any zstd level is lower than `lowerLimit` (an optional parameter, default 0.98), the speed benchmark is restarted.
If the second results are also lower than `lowerLimit`, a warning e-mail is sent to the recipients in the list (the `emails` parameter).

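The compare-then-retry rule described above can be sketched as follows. This is an illustration, not the script's actual code: it assumes `lowerLimit` is a ratio of the new speed to the previous speed, and `is_regression`, `check_commit`, and the `run_benchmark` callable are hypothetical names introduced here.

```python
def is_regression(previous_speed, current_speed, lower_limit=0.98):
    """True when current speed falls below lower_limit times the previous speed.

    Assumption: lower_limit is a ratio of new speed to previous speed.
    """
    return current_speed < lower_limit * previous_speed


def check_commit(previous_speed, run_benchmark, lower_limit=0.98):
    """Benchmark once, repeat on a slow result, and report whether to warn.

    run_benchmark is a hypothetical zero-argument callable returning a
    speed measurement. Returns (should_warn, last_measured_speed).
    """
    speed = run_benchmark()
    if not is_regression(previous_speed, speed, lower_limit):
        return False, speed
    # The first result was slow: rerun the benchmark to rule out noise.
    speed = run_benchmark()
    return is_regression(previous_speed, speed, lower_limit), speed
```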
Additional remarks:
- To be sure that speed results are accurate, the script should be run on a "stable" target system with no other jobs running in parallel
- Using the script with virtual machines can lead to large variations in speed results
- The speed benchmark is not performed until the computer's load average is lower than `maxLoadAvg` (an optional parameter, default 0.75)
- The script sends e-mails using `mutt`; if `mutt` is not available, it sends e-mails without attachments using `mail`; if neither is available, it only prints a warning

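The load-average gate mentioned in the remarks can be approximated with the standard library. This is a sketch under assumptions, not the script's implementation: it uses the 1-minute load average from `os.getloadavg()` (available on Unix-like systems only), and `wait_for_low_load`, `poll_seconds`, `get_load`, and `sleep` are illustrative names rather than options of `test-zstd-speed.py`.

```python
import os
import time


def wait_for_low_load(max_load_avg=0.75, poll_seconds=30,
                      get_load=None, sleep=time.sleep):
    """Block until the 1-minute load average drops below max_load_avg.

    get_load and sleep are injectable so the loop can be exercised
    without real waiting. Returns the number of polls that found the
    load too high.
    """
    if get_load is None:
        get_load = os.getloadavg  # Unix-like systems only
    waits = 0
    # Keep waiting while the load is not yet lower than the limit.
    while get_load()[0] >= max_load_avg:
        waits += 1
        sleep(poll_seconds)
    return waits
```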
Example usage with two test files, one e-mail address, and an additional message:
```
./test-zstd-speed.py "silesia.tar calgary.tar" "email@gmail.com" --message "tested on my laptop" --sleepTime 60
```

To run the script in the background, use:
```
nohup ./test-zstd-speed.py testFileNames emails &
```

The full list of parameters:
```
positional arguments:
  testFileNames         file names list for speed benchmark
  emails                list of e-mail addresses to send warnings

optional arguments:
  -h, --help            show this help message and exit
  --message MESSAGE     attach an additional message to e-mail
  --lowerLimit LOWERLIMIT
                        send email if speed is lower than given limit
  --maxLoadAvg MAXLOADAVG
                        maximum load average to start testing
  --lastCLevel LASTCLEVEL
                        last compression level for testing
  --sleepTime SLEEPTIME
                        frequency of repository checking in seconds
```

#### `decodecorpus` - tool to generate Zstandard frames for decoder testing

Command line tool to generate test .zst files.

This tool generates .zst files with checksums
and can optionally output the corresponding correct uncompressed data for
extra verification.

Example:
```
./decodecorpus -ptestfiles -otestfiles -n10000 -s5
```
will generate 10,000 sample .zst files using a seed of 5 in the `testfiles` directory,
with the zstd checksum field set,
as well as the 10,000 original files for more detailed comparison of decompression results.

```
./decodecorpus -t -T1mn
```
will choose a random seed and, for 1 minute,
generate random test frames and ensure that the
zstd library correctly decompresses them in both simple and streaming modes.

#### `paramgrill` - tool for generating compression table parameters and optimizing parameters on a file given constraints

Full list of arguments:
```
 -T#          : set level 1 speed objective
 -B#          : cut input into blocks of size # (default : single block)
 -S           : benchmarks a single run (example command: -Sl3w10h12)
    w# - windowLog
    h# - hashLog
    c# - chainLog
    s# - searchLog
    l# - minMatch
    t# - targetLength
    S# - strategy
    L# - level
 --zstd=      : Single run, parameter selection syntax same as zstdcli with more parameters
                    (Added forceAttachDictionary / fadt)
                    When invoked with --optimize, this represents the sample to exceed.
 --optimize=  : find parameters to maximize compression ratio given parameters
                    Can use all --zstd= commands to constrain the type of solution found in addition to the following constraints
    cSpeed=   : Minimum compression speed
    dSpeed=   : Minimum decompression speed
    cMem=     : Maximum compression memory
    lvl=      : Searches for solutions which are strictly better than that compression lvl in ratio and cSpeed,
    stc=      : When invoked with lvl=, represents percentage slack in ratio/cSpeed allowed for a solution to be considered (Default 100%)
              : In normal operation, represents percentage slack in choosing viable starting strategy selection in choosing the default parameters
                    (Lower value will begin with stronger strategies) (Default 90%)
    speedRatio=   (accepts decimals)
              : determines value of gains in speed vs gains in ratio
                    when determining overall winner (default 5 (1% ratio = 5% speed)).
    tries=    : Maximum number of random restarts on a single strategy before switching (Default 5)
                    Higher values will make optimizer run longer, more chances to find better solution.
    memLog    : Limits the log of the size of each memotable (1 per strategy). Will use hash tables when state space is larger than max size.
                    Setting memLog = 0 turns off memoization
 --display=   : specify which parameters are included in the output
                    can use all --zstd parameter names and 'cParams' as a shorthand for all parameters used in ZSTD_compressionParameters
                    (Default: display all params available)
 -P#          : generated sample compressibility (when no file is provided)
 -t#          : Caps runtime of operation in seconds (default: 99999 seconds (about 27 hours))
 -v           : Prints Benchmarking output
 -D           : Next argument dictionary file
 -s           : Benchmark all files separately
 -q           : Quiet, repeat for more quiet
                  -q Prints parameters + results whenever a new best is found
                  -qq Only prints parameters whenever a new best is found, prints final parameters + results
                  -qqq Only print final parameters + results
                  -qqqq Only prints final parameter set in the form --zstd=
 -v           : Verbose, cancels quiet, repeat for more volume
                  -v Prints all candidate parameters and results

```
Any inputs afterwards are treated as files to benchmark.