<p align="center"><img src="https://raw.githubusercontent.com/facebook/zstd/dev/doc/images/zstd_logo86.png" alt="Zstandard"></p>

__Zstandard__, or `zstd` for short, is a fast lossless compression algorithm,
targeting real-time compression scenarios at zlib-level and better compression ratios.
It's backed by a very fast entropy stage, provided by [Huff0 and FSE library](https://github.com/Cyan4973/FiniteStateEntropy).

Zstandard's format is stable and documented in [RFC8878](https://datatracker.ietf.org/doc/html/rfc8878). Multiple independent implementations are already available.
This repository represents the reference implementation, provided as an open-source dual [BSD](LICENSE) OR [GPLv2](COPYING) licensed **C** library,
and a command line utility producing and decoding `.zst`, `.gz`, `.xz` and `.lz4` files.
Should your project require another programming language,
a list of known ports and bindings is provided on the [Zstandard homepage](https://facebook.github.io/zstd/#other-languages).

**Development branch status:**

[![Build Status][travisDevBadge]][travisLink]
[![Build status][CircleDevBadge]][CircleLink]
[![Build status][CirrusDevBadge]][CirrusLink]
[![Fuzzing Status][OSSFuzzBadge]][OSSFuzzLink]

[travisDevBadge]: https://api.travis-ci.com/facebook/zstd.svg?branch=dev "Continuous Integration test suite"
[travisLink]: https://travis-ci.com/facebook/zstd
[CircleDevBadge]: https://circleci.com/gh/facebook/zstd/tree/dev.svg?style=shield "Short test suite"
[CircleLink]: https://circleci.com/gh/facebook/zstd
[CirrusDevBadge]: https://api.cirrus-ci.com/github/facebook/zstd.svg?branch=dev
[CirrusLink]: https://cirrus-ci.com/github/facebook/zstd
[OSSFuzzBadge]: https://oss-fuzz-build-logs.storage.googleapis.com/badges/zstd.svg
[OSSFuzzLink]: https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:zstd

## Benchmarks

For reference, several fast compression algorithms were tested and compared
on a desktop running Ubuntu 20.04 (`Linux 5.11.0-41-generic`),
with a Core i7-9700K CPU @ 4.9GHz,
using [lzbench], an open-source in-memory benchmark by @inikep
compiled with [gcc] 9.3.0,
on the [Silesia compression corpus].

[lzbench]: https://github.com/inikep/lzbench
[Silesia compression corpus]: https://sun.aei.polsl.pl//~sdeor/index.php?page=silesia
[gcc]: https://gcc.gnu.org/

| Compressor name         | Ratio | Compression | Decompression |
| ----------------------- | ----- | ----------- | ------------- |
| **zstd 1.5.1 -1**       | 2.887 |   530 MB/s  |   1700 MB/s   |
| [zlib] 1.2.11 -1        | 2.743 |    95 MB/s  |    400 MB/s   |
| brotli 1.0.9 -0         | 2.702 |   395 MB/s  |    450 MB/s   |
| **zstd 1.5.1 --fast=1** | 2.437 |   600 MB/s  |   2150 MB/s   |
| **zstd 1.5.1 --fast=3** | 2.239 |   670 MB/s  |   2250 MB/s   |
| quicklz 1.5.0 -1        | 2.238 |   540 MB/s  |    760 MB/s   |
| **zstd 1.5.1 --fast=4** | 2.148 |   710 MB/s  |   2300 MB/s   |
| lzo1x 2.10 -1           | 2.106 |   660 MB/s  |    845 MB/s   |
| [lz4] 1.9.3             | 2.101 |   740 MB/s  |   4500 MB/s   |
| lzf 3.6 -1              | 2.077 |   410 MB/s  |    830 MB/s   |
| snappy 1.1.9            | 2.073 |   550 MB/s  |   1750 MB/s   |

[zlib]: https://www.zlib.net/
[lz4]: https://lz4.github.io/lz4/

The negative compression levels, specified with `--fast=#`,
offer faster compression and decompression speed
at the cost of compression ratio (compared to level 1).

Zstd can also offer stronger compression ratios at the cost of compression speed.
The speed vs. compression trade-off is configurable in small increments.
Decompression speed is preserved and remains roughly the same at all settings,
a property shared by most LZ compression algorithms, such as [zlib] or lzma.
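
The level trade-off described above can be tried locally with a short script. This is a minimal sketch, assuming the `zstd` CLI is installed and on `PATH`; the working directory and file names are hypothetical, and the sizes it prints depend on your data.

```shell
#!/bin/sh
# Sketch: compress one throwaway file at three levels and check the round trip.
# All paths below are hypothetical; only standard zstd CLI options are used.
set -eu

workdir=$(mktemp -d)
sample="$workdir/sample.txt"

# Generate a compressible sample of 20000 short text records.
i=0
while [ "$i" -lt 20000 ]; do
    echo "record $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$sample"

zstd -q --fast=3 "$sample" -o "$workdir/fast3.zst"    # negative level: trades ratio for speed
zstd -q -1       "$sample" -o "$workdir/level1.zst"   # level 1
zstd -q -19      "$sample" -o "$workdir/level19.zst"  # high level: trades speed for ratio

# Print the byte counts so the outputs can be compared.
wc -c "$sample" "$workdir"/*.zst

# Decompression uses the same command whatever level produced the file.
zstd -q -d "$workdir/level19.zst" -o "$workdir/roundtrip.txt"
cmp "$sample" "$workdir/roundtrip.txt" && echo "round trip OK"
```

A synthetic, highly repetitive sample like this one exaggerates the ratios; run the same commands on representative data before choosing a level.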

The following tests were run
on a server running Linux Debian (`Linux version 4.14.0-3-amd64`)
with a Core i7-6700K CPU @ 4.0GHz,
using [lzbench], an open-source in-memory benchmark by @inikep
compiled with [gcc] 7.3.0,
on the [Silesia compression corpus].

_Charts: Compression Speed vs Ratio | Decompression Speed_

A few other algorithms can produce higher compression ratios at slower speeds, falling outside of the graph.
For a larger picture including slow modes, [click on this link](doc/images/DCspeed5.png).


## The case for Small Data compression

The previous charts provide results applicable to typical file and stream scenarios (several MB). Small data behaves differently.

The smaller the amount of data to compress, the more difficult it is to compress. This problem is common to all compression algorithms. The reason is that compression algorithms learn from past data how to compress future data, but at the beginning of a new data set there is no "past" to build upon.

To solve this situation, Zstd offers a __training mode__, which can be used to tune the algorithm for a selected type of data.
Training Zstandard is achieved by providing it with a few samples (one file per sample). The result of this training is stored in a file called a "dictionary", which must be loaded before compression and decompression.
Using this dictionary, the compression ratio achievable on small data improves dramatically.

The following example uses the `github-users` [sample set](https://github.com/facebook/zstd/releases/tag/v1.1.3), created from the [GitHub public API](https://developer.github.com/v3/users/#get-all-users).
It consists of roughly 10K records weighing about 1KB each.

_Charts: Compression Ratio | Compression Speed | Decompression Speed_


These compression gains are achieved while simultaneously providing _faster_ compression and decompression speeds.

Training works if there is some correlation in a family of small data samples. The more data-specific a dictionary is, the more efficient it is (there is no _universal dictionary_).
Hence, deploying one dictionary per type of data will provide the greatest benefits.
Dictionary gains are mostly effective in the first few KB. Then, the compression algorithm will gradually use previously decoded content to better compress the rest of the file.

### Dictionary compression How To:

1. Create the dictionary

    `zstd --train FullPathToTrainingSet/* -o dictionaryName`

2. Compress with dictionary

    `zstd -D dictionaryName FILE`

3. Decompress with dictionary

    `zstd -D dictionaryName --decompress FILE.zst`


## Build instructions

`make` is the officially maintained build system of this project.
All other build systems are "compatible" and 3rd-party maintained;
they may feature small differences in advanced options.
When your system allows it, prefer using `make` to build `zstd` and `libzstd`.

### Makefile

If your system is compatible with standard `make` (or `gmake`),
invoking `make` in the root directory will generate the `zstd` CLI in the root directory.
It will also create `libzstd` in `lib/`.

Other available options include:
- `make install` : create and install the zstd CLI, library and man pages
- `make check` : create and run `zstd`, and test its behavior on the local platform

The `Makefile` follows the [GNU Standard Makefile conventions](https://www.gnu.org/prep/standards/html_node/Makefile-Conventions.html),
allowing staged install, standard flags, directory variables and command variables.

For advanced use cases, specialized compilation flags which control binary generation
are documented in [`lib/README.md`](lib/README.md#modular-build) for the `libzstd` library
and in [`programs/README.md`](programs/README.md#compilation-variables) for the `zstd` CLI.

### cmake

A `cmake` project generator is provided within `build/cmake`.
It can generate Makefiles or other build scripts
to create the `zstd` binary, and `libzstd` dynamic and static libraries.

By default, `CMAKE_BUILD_TYPE` is set to `Release`.
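
A generic out-of-source build can be sketched as a small shell helper. This is an illustration under assumptions, not an official recipe: `build_zstd_with_cmake` is a hypothetical function name, the build directory is your choice, and the `-S`/`-B` options need CMake 3.13 or newer.

```shell
#!/bin/sh
# Sketch: out-of-source CMake build of zstd.
# build_zstd_with_cmake is a hypothetical helper, not part of the zstd repository.
build_zstd_with_cmake() {
    src_root=$1    # path to a zstd checkout
    build_dir=$2   # directory for generated build files and binaries

    # Configure from build/cmake; Release is stated explicitly for clarity.
    cmake -S "$src_root/build/cmake" -B "$build_dir" -DCMAKE_BUILD_TYPE=Release || return 1

    # Build with whichever generator CMake selected (Makefiles, Ninja, ...).
    cmake --build "$build_dir"
}
```

From the repository root, `build_zstd_with_cmake . build-cmake` would configure and build into a `build-cmake` directory (a hypothetical name).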

#### Support for Fat (Universal2) Output

`zstd` can be built and installed with support for both Apple Silicon (M1/M2) and Intel by using CMake's Universal2 support.
To perform a Fat/Universal2 build and install, use the following commands:

```bash
cmake -B build-cmake-debug -S build/cmake -G Ninja -DCMAKE_OSX_ARCHITECTURES="x86_64;x86_64h;arm64"
cd build-cmake-debug
ninja
sudo ninja install
```

### Meson

A Meson project is provided within [`build/meson`](build/meson). Follow
the build instructions in that directory.

You can also take a look at the [`.travis.yml`](.travis.yml) file for an
example of how Meson is used to build this project.

Note that the default build type is **release**.

### VCPKG

You can build and install zstd using the [vcpkg](https://github.com/Microsoft/vcpkg/) dependency manager:

    git clone https://github.com/Microsoft/vcpkg.git
    cd vcpkg
    ./bootstrap-vcpkg.sh
    ./vcpkg integrate install
    ./vcpkg install zstd

The zstd port in vcpkg is kept up to date by Microsoft team members and community contributors.
If the version is out of date, please [create an issue or pull request](https://github.com/Microsoft/vcpkg) on the vcpkg repository.

### Visual Studio (Windows)

In the `build` directory, you will find additional possibilities:
- Projects for Visual Studio 2005, 2008 and 2010.
  + The VS2010 project is compatible with VS2012, VS2013, VS2015 and VS2017.
- Automated build scripts for the Visual compiler by [@KrzysFR](https://github.com/KrzysFR), in `build/VS_scripts`,
  which will build the `zstd` CLI and `libzstd` library without any need to open a Visual Studio solution.

### Buck

You can build the zstd binary via buck by executing: `buck build programs:zstd` from the root of the repo.
The output binary will be in `buck-out/gen/programs/`.

### Bazel

You can easily integrate zstd into your Bazel project by using the module hosted on the [Bazel Central Repository](https://registry.bazel.build/modules/zstd).

## Testing

You can run quick local smoke tests by running `make check`.
If you can't use `make`, execute the `playTest.sh` script from the `src/tests` directory.
Two environment variables, `$ZSTD_BIN` and `$DATAGEN_BIN`, are needed for the test script to locate the `zstd` and `datagen` binaries.
For information on CI testing, please refer to `TESTING.md`.

## Status

Zstandard is currently deployed within Facebook and many other large cloud infrastructures.
It is run continuously to compress large amounts of data in multiple formats and use cases.
Zstandard is considered safe for production environments.

## License

Zstandard is dual-licensed under [BSD](LICENSE) OR [GPLv2](COPYING).

## Contributing

The `dev` branch is the one where all contributions are merged before reaching `release`.
If you plan to propose a patch, please commit into the `dev` branch, or its own feature branch.
Direct commits to `release` are not permitted.
For more information, please read [CONTRIBUTING](CONTRIBUTING.md).