1# ISA-L Function Overview 2 3ISA-L is logically broken into mostly independent units based on the source 4directories of the same name. 5- erasure_codes 6- crc 7- raid 8- mem 9- igzip 10 11The library can also be built with subsets of available units. For example 12`$ make -f Makefile.unx units=crc` will only build a library with crc 13functions. 14 15## ISA-L Functions 16 17### Erasure Code Functions 18 19Functions pertaining to erasure codes implement a general Reed-Solomon type 20encoding for blocks of data to protect against erasure of whole blocks. 21Individual operations can be described in terms of arithmetic in the Galois 22finite field GF(2^8) with the particular field-defining primitive or reducing 23polynomial \f$ x^8 + x^4 + x^3 + x^2 + 1 \f$ (0x1d). 24 25For example, the function ec_encode_data() will generate a set of parity blocks 26\f$P_i\f$ from the set of k source blocks \f$D_i\f$ and arbitrary encoding 27coefficients \f$a_{i,j}\f$ where each byte in P is calculated from sources as: 28 29\f[ P_i = \sum_{j=1}^k a_{i,j} \cdot D_j \f] 30 31where addition and multiplication \f$\cdot\f$ is defined in GF(2^8). Since any 32arbitrary set of coefficients \f$a_{i,j}\f$ can be supplied, the same 33fundamental function can be used for encoding blocks or decoding from blocks in 34erasure. 35 36#### EC Usage 37 38Various examples are available in examples/ec and unit tests in `erasure_code` 39to show an encode and decode (re-hydrate) cycle or partial update operation. As 40seen in [ec example] the process starts with picking an 41encode matrix, parameters k (source blocks) and m (total parity + source 42blocks), and expanding the necessary coefficients. 43 44~~~c 45 // Initialize g_tbls from encode matrix 46 ec_init_tables(k, p, &encode_matrix[k * k], g_tbls); 47~~~ 48 49In the example, a symmetric encode matrix is used where only the coefficients 50describing the parity blocks are used for encode and the upper matrix is 51initialized as an identity to simplify generation of the corresponding decode 52matrix. Next the parity for all (m - k) blocks are calculated at once. 53 54~~~c 55 // Generate EC parity blocks from sources 56 ec_encode_data(len, k, p, g_tbls, frag_ptrs, &frag_ptrs[k]); 57~~~ 58 59### RAID Functions 60 61Functions in the RAID section calculate and operate on XOR and P+Q parity found 62in common RAID implementations. The mathematics of RAID are based on Galois 63finite-field arithmetic to find one or two parity bytes for each byte in N 64sources such that single or dual disk failures (one or two erasures) can be 65corrected. For RAID5, a block of parity is calculated by the xor across the N 66source arrays. Each parity byte is calculated from N sources by: 67 68\f[ P = D_0 + D_1 + ... + D_{N-1} \f] 69 70where \f$D_n\f$ are elements across each source array [0-(N-1)] and + is the 71bit-wise exclusive or (xor) operation. Elements in GF(2^8) are implemented as 72bytes. 73 74For RAID6, two parity bytes P and Q are calculated from the source array. P is 75calculated as in RAID5 and Q is calculated using the generator g as: 76 77\f[ Q = g^0 D_0 + g^1 D_1 + g^2 D_2 + ... + g^{N-1} D_{N-1} \f] 78 79where g is chosen as {2}, the second field element. Multiplication and the 80field are defined using the primitive polynomial \f$ x^8 + x^4 + x^3 + x^2 + 1 \f$ 81(0x1d). 82 83#### RAID Usage 84 85RAID function usage is similar to erasure code except no coefficient expansion 86step is necessary. As seen in [raid example] the xor_gen() and xor_check() 87functions are used to generate and check parity. 88 89### CRC Functions 90 91Functions in the CRC section include fast implementations of cyclic redundancy 92check using specialized instructions such as PCLMULQDQ, carry-less 93multiplication. Generally, a CRC is the remainder in binary division of a 94message and a CRC polynomial in GF(2). 95 96\f[ CRC(M(x)) = x^{deg(P(x))} \cdot M(x) \, mod \, P(x) \f] 97 98CRC is used in many storage applications to ensure integrity of data by 99appending the CRC to a message. Various standards choose the polynomial P and 100may vary by initial seeding value, bit reversal and inverting the result and 101seed. 102 103#### CRC Usage 104 105CRC functions have a simple interface such as in [crc example]. 106 107~~~c 108 crc64_checksum = crc64_ecma_refl(crc64_checksum, inbuf, avail_in); 109~~~ 110 111Updates with new buffers are possible with subsequent calls. No extra finalize 112step is necessary. An example of combining independent CRC values is found in 113[crc combine example]. 114 115### Compress/Inflate Functions 116 117Functions in the igzip unit perform fast, loss-less data compression and 118decompression within the [deflate](https://www.ietf.org/rfc/rfc1951.txt), 119[zlib](https://www.ietf.org/rfc/rfc1950.txt), and 120[gzip](https://www.ietf.org/rfc/rfc1952.txt) binary standards. Functions for 121stream based (data pieces at a time) and stateless (data all at once) are 122available as well as multiple parameters to change the speed vs. compression 123ratio or other features. In addition, there are functions to fine tune 124compression by pre-computing static Huffman tables and setting for subsequent 125compression runs, parsing compression headers and other specific tasks to give 126more control. 127 128#### Compress/Inflate Usage 129 130The interface for compression and decompression functions is similar to zlib, 131zstd and others where a context structure keeps parameters and internal state to 132render from an input buffer to an output buffer. I/O buffer pointers and size 133are often the only required settings. ISA-L, unlike zlib and others, does not 134allocate new memory and must be done by the user explicitly when required (level 1351 and above). This gives the user more flexibility to when dynamic memory is 136allocated and reused. The minimum code for starting a compression is just 137allocating a stream structure and initializing it. This can be done just once 138for multiple compression runs. 139 140~~~c 141 struct isal_zstream stream; 142 isal_deflate_init(&stream); 143~~~ 144 145Using level 1 compression and above requires an additional, initial allocation 146for an internal intermediate buffer. Suggested sizes are defined in external 147headers. 148 149~~~c 150 stream.level = 1; 151 stream.level_buf = malloc(ISAL_DEF_LVL1_DEFAULT); 152 stream.level_buf_size = ISAL_DEF_LVL1_DEFAULT; 153~~~ 154 155After init, subsequent, multiple compression runs can be performed by supplying 156(or re-using) I/O buffers. 157 158~~~c 159 stream.next_in = inbuf; 160 stream->next_out = outbuf; 161 stream->avail_in = inbuf_size; 162 stream->avail_out = outbuf_size; 163 164 isal_deflate(stream); 165~~~ 166 167See [igzip example] for a simple example program or review the perf or check 168tests for more. 169 170**igzip**: ISA-L also provides a user program *igzip* to compress and decompress 171files. Optionally igzip can be compiled with multi-threaded compression. See 172`man igzip` for details. 173 174## General Library Features 175 176### Multi-Binary Dispatchers 177 178Multibinary support is available for all units in ISA-L. With multibinary 179support functions, an appropriate version is selected at first run and can be 180called instead of architecture-specific versions. This allows users to deploy a 181single binary with multiple function versions and choose at run time based on 182platform features. All functions also have base functions, written in portable 183C, which the multibinary function will call if none of the required instruction 184sets are enabled. 185 186### Threading 187 188All ISA-L library functions are single threaded but reentrant and thread-safe 189making it easy for users to incorporate with any threading library. The igzip 190command line utility has threaded compression but not built by default. To add 191to an automake build do `$ make D="-DHAVE_THREADS"`. 192 193### Included Tests and Utilities 194 195ISA-L source [repo] includes unit tests, performance tests and other utilities. 196 197Examples: 198- [ec example] 199- [raid example] 200- [crc example] 201- [igzip example] 202 203--- 204 205[repo]: https://github.com/intel/isa-l 206[ec example]: https://github.com/intel/isa-l/blob/master/examples/ec/ec_simple_example.c 207[raid example]: https://github.com/intel/isa-l/blob/master/raid/xor_example.c 208[crc example]: https://github.com/intel/isa-l/blob/master/crc/crc64_example.c 209[crc combine example]: https://github.com/intel/isa-l/blob/master/examples/crc/crc_combine_example.c 210[igzip example]: https://github.com/intel/isa-l/blob/master/igzip/igzip_example.c 211