Programs and scripts for automated testing of Zstandard
=======================================================

This directory contains the following programs and scripts:
- `datagen` : Synthetic and parameterizable data generator, for tests
- `fullbench` : Precisely measures the speed of each zstd inner function
- `fuzzer` : Test tool, to check zstd integrity on target platform
- `paramgrill` : parameter tester for zstd
- `test-zstd-speed.py` : script for testing zstd speed difference between commits
- `test-zstd-versions.py` : compatibility test between zstd versions stored on GitHub (v0.1+)
- `zstreamtest` : Fuzzer test tool for zstd streaming API
- `legacy` : Test tool to test decoding of legacy zstd frames
- `decodecorpus` : Tool to generate valid Zstandard frames, for verifying decoder implementations


#### `test-zstd-versions.py` - script for testing zstd interoperability between versions

This script creates a `versionsTest` directory into which the zstd repository is cloned.
Then all tagged (released) versions of zstd are compiled.
In the following step, interoperability between zstd versions is checked.

#### `automated_benchmarking.py` - script for benchmarking zstd PRs to dev

This script benchmarks facebook:dev and the changes from pull requests made to zstd, then compares
the pull requests against facebook:dev to detect regressions. This script currently runs on a dedicated
desktop machine for every pull request that is made to the zstd repo but can also
be run on any machine via the command line interface.

There are three modes of usage for this script: `fastmode` runs only a minimal single
build comparison (between facebook:dev and facebook:release); `onetime` pulls all the current
pull requests from the zstd repo and compares facebook:dev to each of them once; `continuous`
continuously gets pull requests from the zstd repo and runs benchmarks against facebook:dev.

```
Example usage: python automated_benchmarking.py
```

```
usage: automated_benchmarking.py [-h] [--directory DIRECTORY]
                                 [--levels LEVELS] [--iterations ITERATIONS]
                                 [--emails EMAILS] [--frequency FREQUENCY]
                                 [--mode MODE] [--dict DICT]

optional arguments:
  -h, --help            show this help message and exit
  --directory DIRECTORY
                        directory with files to benchmark
  --levels LEVELS       levels to test e.g. ('1,2,3')
  --iterations ITERATIONS
                        number of benchmark iterations to run
  --emails EMAILS       email addresses of people who will be alerted upon
                        regression. Only for continuous mode
  --frequency FREQUENCY
                        specifies the number of seconds to wait before each
                        successive check for new PRs in continuous mode
  --mode MODE           'fastmode', 'onetime', 'current', or 'continuous' (see
                        README.md for details)
  --dict DICT           filename of dictionary to use (when set, this
                        dictionary will be used to compress the files provided
                        inside --directory)
```

#### `test-zstd-speed.py` - script for testing zstd speed difference between commits

DEPRECATED

This script creates a `speedTest` directory into which the zstd repository is cloned.
Then it compiles all branches of zstd and performs a speed benchmark for a given list of files (the `testFileNames` parameter).
After `sleepTime` seconds (an optional parameter, default 300), the script checks the repository for new commits.
If a new commit is found, it is compiled and a speed benchmark for this commit is performed.
The results of the speed benchmark are compared to the previous results.
If compression or decompression speed for one of the zstd levels is lower than `lowerLimit` (an optional parameter, default 0.98), the speed benchmark is restarted.
If the second results are also lower than `lowerLimit`, a warning e-mail is sent to the recipients from the list (the `emails` parameter).
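
The retry-then-warn rule described above can be sketched roughly as follows. This is an illustrative sketch, not the code of `test-zstd-speed.py`: the helper names `slow_levels` and `should_warn` are hypothetical, and it assumes that `lowerLimit` is compared against the ratio of the new speed to the previous speed for each level.

```python
from typing import Callable, Dict, List

# Hypothetical representation: compression level -> measured speed (e.g. MB/s).
Speeds = Dict[int, float]


def slow_levels(previous: Speeds, current: Speeds, lower_limit: float = 0.98) -> List[int]:
    """Return the levels whose current/previous speed ratio is below lower_limit.

    Assumption: lowerLimit is a ratio threshold relative to the previous result.
    """
    return sorted(
        level
        for level, prev in previous.items()
        if level in current and prev > 0 and current[level] / prev < lower_limit
    )


def should_warn(previous: Speeds,
                run_benchmark: Callable[[], Speeds],
                lower_limit: float = 0.98) -> bool:
    """Benchmark once; on a suspected regression, benchmark a second time.

    A warning is due only if the second run is also below the limit.
    """
    first = run_benchmark()
    if not slow_levels(previous, first, lower_limit):
        return False
    second = run_benchmark()
    return bool(slow_levels(previous, second, lower_limit))
```

Repeating the benchmark before warning filters out one-off slowdowns caused by noise on the machine, which is the stated reason for the second run.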

Additional remarks:
- To be sure that speed results are accurate, the script should be run on a "stable" target system with no other jobs running in parallel
- Using the script with virtual machines can lead to large variations of speed results
- The speed benchmark is not performed until the computer's load average is lower than `maxLoadAvg` (an optional parameter, default 0.75)
- The script sends e-mails using `mutt`; if `mutt` is not available, it sends e-mails without attachments using `mail`; if neither is available, it only prints a warning


The example usage with two test files, one e-mail address, and an additional message:
```
./test-zstd-speed.py "silesia.tar calgary.tar" "email@gmail.com" --message "tested on my laptop" --sleepTime 60
```

To run the script in the background, use:
```
nohup ./test-zstd-speed.py testFileNames emails &
```

The full list of parameters:
```
positional arguments:
  testFileNames         file names list for speed benchmark
  emails                list of e-mail addresses to send warnings

optional arguments:
  -h, --help            show this help message and exit
  --message MESSAGE     attach an additional message to e-mail
  --lowerLimit LOWERLIMIT
                        send email if speed is lower than given limit
  --maxLoadAvg MAXLOADAVG
                        maximum load average to start testing
  --lastCLevel LASTCLEVEL
                        last compression level for testing
  --sleepTime SLEEPTIME
                        frequency of repository checking in seconds
```

#### `decodecorpus` - tool to generate Zstandard frames for decoder testing
Command line tool to generate test .zst files.

This tool will generate .zst files with checksums,
as well as optionally output the corresponding correct uncompressed data for
extra verification.

Example:
```
./decodecorpus -ptestfiles -otestfiles -n10000 -s5
```
will generate 10,000 sample .zst files using a seed of 5 in the `testfiles` directory,
with the zstd checksum field set,
as well as the 10,000 original files for more detailed comparison of decompression results.

```
./decodecorpus -t -T1mn
```
will choose a random seed, and for 1 minute,
generate random test frames and ensure that the
zstd library correctly decompresses them in both simple and streaming modes.
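
A generated corpus could be checked against a decoder under test along the following lines. This is a minimal sketch under stated assumptions: `find_mismatches` and the `decode` callable are hypothetical names, and the sketch assumes that each original file has the same name as its `.zst` file without the suffix, which this README does not specify.

```python
from pathlib import Path
from typing import Callable, List, Union

PathLike = Union[str, Path]


def find_mismatches(compressed_dir: PathLike,
                    original_dir: PathLike,
                    decode: Callable[[bytes], bytes]) -> List[str]:
    """Return the names of .zst files whose decoded bytes differ from the original.

    Assumption: the original for `name.zst` is stored as `name` in original_dir.
    `decode` is the decoder under test, supplied by the caller.
    """
    mismatches = []
    for zst_path in sorted(Path(compressed_dir).glob("*.zst")):
        original_path = Path(original_dir) / zst_path.stem
        if not original_path.is_file():
            continue  # no reference data was generated for this frame
        if decode(zst_path.read_bytes()) != original_path.read_bytes():
            mismatches.append(zst_path.name)
    return mismatches
```

Passing the decoder as a callable keeps the comparison loop independent of any particular decoder implementation.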

#### `paramgrill` - tool for generating compression table parameters and optimizing parameters on file given constraints

Full list of arguments
```
 -T#          : set level 1 speed objective
 -B#          : cut input into blocks of size # (default : single block)
 -S           : benchmarks a single run (example command: -Sl3w10h12)
    w# - windowLog
    h# - hashLog
    c# - chainLog
    s# - searchLog
    l# - minMatch
    t# - targetLength
    S# - strategy
    L# - level
 --zstd=      : Single run, parameter selection syntax same as zstdcli with more parameters
                    (Added forceAttachDictionary / fadt)
                    When invoked with --optimize, this represents the sample to exceed.
 --optimize=  : find parameters to maximize compression ratio given parameters
                    Can use all --zstd= commands to constrain the type of solution found in addition to the following constraints
    cSpeed=   : Minimum compression speed
    dSpeed=   : Minimum decompression speed
    cMem=     : Maximum compression memory
    lvl=      : Searches for solutions which are strictly better than that compression lvl in ratio and cSpeed,
    stc=      : When invoked with lvl=, represents percentage slack in ratio/cSpeed allowed for a solution to be considered (Default 100%)
              : In normal operation, represents percentage slack in choosing viable starting strategy selection in choosing the default parameters
                    (Lower value will begin with stronger strategies) (Default 90%)
    speedRatio=   (accepts decimals)
              : determines value of gains in speed vs gains in ratio
                    when determining overall winner (default 5 (1% ratio = 5% speed)).
    tries=    : Maximum number of random restarts on a single strategy before switching (Default 5)
                    Higher values will make optimizer run longer, more chances to find better solution.
    memLog    : Limits the log of the size of each memotable (1 per strategy). Will use hash tables when state space is larger than max size.
                    Setting memLog = 0 turns off memoization
 --display=   : specify which parameters are included in the output
                    can use all --zstd parameter names and 'cParams' as a shorthand for all parameters used in ZSTD_compressionParameters
                    (Default: display all params available)
 -P#          : generated sample compressibility (when no file is provided)
 -t#          : Caps runtime of operation in seconds (default: 99999 seconds (about 27 hours))
 -v           : Prints Benchmarking output
 -D           : Next argument dictionary file
 -s           : Benchmark all files separately
 -q           : Quiet, repeat for more quiet
                  -q Prints parameters + results whenever a new best is found
                  -qq Only prints parameters whenever a new best is found, prints final parameters + results
                  -qqq Only print final parameters + results
                  -qqqq Only prints final parameter set in the form --zstd=
 -v           : Verbose, cancels quiet, repeat for more volume
                  -v Prints all candidate parameters and results

```
Any inputs afterwards are treated as files to benchmark.
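
When driving `paramgrill` from a test script, the compact `-S` argument can be assembled from named parameters. The sketch below is only an illustration: `single_run_flag` is a hypothetical helper, and the letter codes are taken from the `-S` table in the argument list above.

```python
# Letter codes as listed under -S in the argument list above.
SINGLE_RUN_CODES = {
    "windowLog": "w",
    "hashLog": "h",
    "chainLog": "c",
    "searchLog": "s",
    "minMatch": "l",
    "targetLength": "t",
    "strategy": "S",
    "level": "L",
}


def single_run_flag(**params: int) -> str:
    """Build a -S argument such as '-Sl3w10h12' from named parameters.

    Parameters are emitted in the order they are passed.
    """
    parts = []
    for name, value in params.items():
        if name not in SINGLE_RUN_CODES:
            raise ValueError(f"unknown single-run parameter: {name}")
        parts.append(f"{SINGLE_RUN_CODES[name]}{value}")
    return "-S" + "".join(parts)
```

For instance, `single_run_flag(minMatch=3, windowLog=10, hashLog=12)` reproduces the example command `-Sl3w10h12` from the argument list.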