.xz Test Files
----------------

0. Introduction

    This directory contains a bunch of files to test handling of .xz
    files in .xz decoder implementations. Many of the files have been
    created by hand with a hex editor, thus there is no better "source
    code" than the files themselves. All the test files (*.xz) and this
    README have been put into the public domain.


1. File Types

    Good files (good-*.xz) must decode successfully without requiring
    a lot of CPU time or RAM.

    Unsupported files (unsupported-*.xz) are good files, but headers
    indicate features not supported by the current file format
    specification.

    Bad files (bad-*.xz) must cause the decoder to give an error. Like
    with the good files, these files must not require a lot of CPU time
    or RAM before they get detected to be broken.


2. Descriptions of Individual Files

2.1. Good Files

    good-0-empty.xz has one Stream with no Blocks.

    good-0pad-empty.xz has one Stream with no Blocks followed by
    four-byte Stream Padding.

    good-0cat-empty.xz has two zero-Block Streams concatenated without
    Stream Padding.

    good-0catpad-empty.xz has two zero-Block Streams concatenated with
    four-byte Stream Padding between the Streams.

    good-1-check-none.xz has one Stream with one Block with two
    uncompressed LZMA2 chunks and no integrity check.

    good-1-check-crc32.xz has one Stream with one Block with two
    uncompressed LZMA2 chunks and CRC32 check.

    good-1-check-crc64.xz is like good-1-check-crc32.xz but with CRC64.

    good-1-check-sha256.xz is like good-1-check-crc32.xz but with
    SHA-256.

    good-2-lzma2.xz has one Stream with two Blocks with one uncompressed
    LZMA2 chunk in each Block.

    good-1-block_header-1.xz has both Compressed Size and Uncompressed
    Size in the Block Header. This also has four extra bytes of Header
    Padding.

    good-1-block_header-2.xz has known Compressed Size.

    good-1-block_header-3.xz has known Uncompressed Size.

    good-1-delta-lzma2.tiff.xz is an image file that compresses
    better with Delta+LZMA2 than with plain LZMA2.

    good-1-x86-lzma2.xz uses the x86 filter (BCJ) and LZMA2. The
    uncompressed file is compress_prepared_bcj_x86 found in the tests
    directory.

    good-1-sparc-lzma2.xz uses the SPARC filter and LZMA2. The
    uncompressed file is compress_prepared_bcj_sparc found in the tests
    directory.

    good-1-lzma2-1.xz has two LZMA2 chunks, of which the second sets
    new properties.

    good-1-lzma2-2.xz has two LZMA2 chunks, of which the second resets
    the state without specifying new properties.

    good-1-lzma2-3.xz has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets dictionary
    and the second sets new properties.

    good-1-lzma2-4.xz has three LZMA2 chunks: First is LZMA, second is
    uncompressed with dictionary reset, and third is LZMA with new
    properties but without dictionary reset.

    good-1-3delta-lzma2.xz has three Delta filters and LZMA2.


2.2. Unsupported Files

    unsupported-check.xz uses Check ID 0x02 which isn't supported by
    the current version of the file format. It is implementation-defined
    how this file is handled (the decoder may reject it, or decode it,
    possibly with a warning).

    unsupported-block_header.xz has a non-null byte in Header Padding,
    which may indicate presence of a new unsupported field.

    unsupported-filter_flags-1.xz has unsupported Filter ID 0x7F.

    unsupported-filter_flags-2.xz specifies only Delta filter in the
    List of Filter Flags, but Delta isn't allowed as the last filter in
    the chain. It could be a little more correct to detect this file as
    corrupt instead of unsupported, but saying it is unsupported is
    simpler in case of liblzma.

    unsupported-filter_flags-3.xz specifies two LZMA2 filters in the
    List of Filter Flags.
    LZMA2 is allowed only as the last filter in the chain. It could be
    a little more correct to detect this file as corrupt instead of
    unsupported, but saying it is unsupported is simpler in case of
    liblzma.


2.3. Bad Files

    bad-0pad-empty.xz has one Stream with no Blocks followed by
    five-byte Stream Padding. Stream Padding must be a multiple of four
    bytes, thus this file is corrupt.

    bad-0catpad-empty.xz has two zero-Block Streams concatenated with
    five-byte Stream Padding between the Streams.

    bad-0cat-alone.xz is good-0-empty.xz concatenated with an empty
    LZMA_Alone file.

    bad-0cat-header_magic.xz is good-0cat-empty.xz but with one byte
    wrong in the Header Magic Bytes field of the second Stream. liblzma
    gives LZMA_DATA_ERROR for this. (LZMA_FORMAT_ERROR is used only if
    the first Stream of a file has invalid Header Magic Bytes.)

    bad-0-header_magic.xz is good-0-empty.xz but with one byte wrong
    in the Header Magic Bytes field. liblzma gives LZMA_FORMAT_ERROR for
    this.

    bad-0-footer_magic.xz is good-0-empty.xz but with one byte wrong
    in the Footer Magic Bytes field. liblzma gives LZMA_DATA_ERROR for
    this.

    bad-0-empty-truncated.xz is good-0-empty.xz without the last byte
    of the file.

    bad-0-nonempty_index.xz has no Blocks but Index claims that there is
    one Block.

    bad-0-backward_size.xz has wrong Backward Size in Stream Footer.

    bad-1-stream_flags-1.xz has different Stream Flags in Stream Header
    and Stream Footer.

    bad-1-stream_flags-2.xz has wrong CRC32 in Stream Header.

    bad-1-stream_flags-3.xz has wrong CRC32 in Stream Footer.

    bad-1-vli-1.xz has two-byte variable-length integer in the
    Uncompressed Size field in Block Header while one-byte would be enough
    for that value.
    It's important that the file gets rejected due to too big integer
    encoding instead of due to Uncompressed Size not matching the value
    stored in the Block Header. That is, the decoder must not try to
    decode the Compressed Data field.

    bad-1-vli-2.xz has ten-byte variable-length integer as Uncompressed
    Size in Block Header. It's important that the file gets rejected due
    to too big integer encoding instead of due to Uncompressed Size not
    matching the value stored in the Block Header. That is, the decoder
    must not try to decode the Compressed Data field.

    bad-1-block_header-1.xz has Block Header that ends in the middle of
    the Filter Flags field.

    bad-1-block_header-2.xz has Block Header that has Compressed Size and
    Uncompressed Size but no List of Filter Flags field.

    bad-1-block_header-3.xz has wrong CRC32 in Block Header.

    bad-1-block_header-4.xz has too big Compressed Size in Block Header
    (2^63 - 1 bytes while maximum is a little less, because the whole
    Block must stay smaller than 2^63). It's important that the file
    gets rejected due to invalid Compressed Size value; the decoder
    must not try decoding the Compressed Data field.

    bad-1-block_header-5.xz has zero as Compressed Size in Block Header.

    bad-2-index-1.xz has wrong Unpadded Sizes in Index.

    bad-2-index-2.xz has wrong Uncompressed Sizes in Index.

    bad-2-index-3.xz has non-null byte in Index Padding.

    bad-2-index-4.xz has wrong CRC32 in Index.

    bad-2-index-5.xz has zero as Unpadded Size. It is important that the
    file gets rejected specifically due to Unpadded Size having an invalid
    value.

    bad-2-compressed_data_padding.xz has non-null byte in the padding of
    the Compressed Data field of the first Block.

    bad-1-check-crc32.xz has wrong Check (CRC32).

    bad-1-check-crc64.xz has wrong Check (CRC64).

    bad-1-check-sha256.xz has wrong Check (SHA-256).

    bad-1-lzma2-1.xz has LZMA2 stream whose first chunk (uncompressed)
    doesn't reset the dictionary.

    bad-1-lzma2-2.xz has two LZMA2 chunks, of which the second chunk
    indicates dictionary reset, but the LZMA compressed data tries to
    repeat data from the previous chunk.

    bad-1-lzma2-3.xz sets new invalid properties (lc=8, lp=0, pb=0) in
    the middle of Block.

    bad-1-lzma2-4.xz has two LZMA2 chunks, of which the first is
    uncompressed and the second is LZMA. The first chunk resets dictionary
    as it should, but the second chunk tries to reset state without
    specifying properties for LZMA.

    bad-1-lzma2-5.xz is like bad-1-lzma2-4.xz but doesn't try to reset
    anything in the header of the second chunk.

    bad-1-lzma2-6.xz has reserved LZMA2 control byte value (0x03).

    bad-1-lzma2-7.xz has EOPM (end of payload marker) at LZMA level.

    bad-1-lzma2-8.xz is like good-1-lzma2-4.xz but doesn't set new
    properties in the third LZMA2 chunk.
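

    The file-type rules in section 1 can be checked mechanically from
    the file name prefix alone. The following is a minimal sketch in
    Python, not part of these test files. The helper names
    expected_outcome(), decode_outcome() and check_file() are
    hypothetical, and the standard-library lzma module is used only as
    an example of a decoder under test. Since handling of unsupported
    files can be implementation-defined, the sketch accepts either
    result for them.

```python
import lzma


def expected_outcome(filename):
    """Map a test file name to the outcome that its prefix demands."""
    if filename.startswith("good-"):
        return "ok"
    if filename.startswith("bad-"):
        return "error"
    if filename.startswith("unsupported-"):
        # Assumption of this sketch: either result is acceptable here.
        return "any"
    raise ValueError("unrecognized test file name: " + filename)


def decode_outcome(data):
    """Return "ok" if data decodes as .xz, "error" if the decoder rejects it."""
    try:
        lzma.decompress(data, format=lzma.FORMAT_XZ)
    except lzma.LZMAError:
        return "error"
    return "ok"


def check_file(filename, data, decode=decode_outcome):
    """Return True if the decoder behaved as the file name requires."""
    want = expected_outcome(filename)
    return want == "any" or decode(data) == want
```

    A caller would read each *.xz file in this directory and pass its
    name and contents to check_file(). Note that lzma.decompress()
    ignores trailing data that does not form a valid Stream, so it may
    accept some of the bad-0* files. A stricter decoder can be supplied
    through the decode parameter.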