xref: /plan9/sys/src/cmd/gs/jpeg/structure.doc (revision 7dd7cddf99dd7472612f1413b4da293630e6b1bc)
1*7dd7cddfSDavid du ColombierIJG JPEG LIBRARY:  SYSTEM ARCHITECTURE
2*7dd7cddfSDavid du Colombier
3*7dd7cddfSDavid du ColombierCopyright (C) 1991-1995, Thomas G. Lane.
4*7dd7cddfSDavid du ColombierThis file is part of the Independent JPEG Group's software.
5*7dd7cddfSDavid du ColombierFor conditions of distribution and use, see the accompanying README file.
6*7dd7cddfSDavid du Colombier
7*7dd7cddfSDavid du Colombier
8*7dd7cddfSDavid du ColombierThis file provides an overview of the architecture of the IJG JPEG software;
9*7dd7cddfSDavid du Colombierthat is, the functions of the various modules in the system and the interfaces
10*7dd7cddfSDavid du Colombierbetween modules.  For more precise details about any data structure or calling
11*7dd7cddfSDavid du Colombierconvention, see the include files and comments in the source code.
12*7dd7cddfSDavid du Colombier
13*7dd7cddfSDavid du ColombierWe assume that the reader is already somewhat familiar with the JPEG standard.
14*7dd7cddfSDavid du ColombierThe README file includes references for learning about JPEG.  The file
15*7dd7cddfSDavid du Colombierlibjpeg.doc describes the library from the viewpoint of an application
16*7dd7cddfSDavid du Colombierprogrammer using the library; it's best to read that file before this one.
17*7dd7cddfSDavid du ColombierAlso, the file coderules.doc describes the coding style conventions we use.
18*7dd7cddfSDavid du Colombier
19*7dd7cddfSDavid du ColombierIn this document, JPEG-specific terminology follows the JPEG standard:
20*7dd7cddfSDavid du Colombier  A "component" means a color channel, e.g., Red or Luminance.
21*7dd7cddfSDavid du Colombier  A "sample" is a single component value (i.e., one number in the image data).
22*7dd7cddfSDavid du Colombier  A "coefficient" is a frequency coefficient (a DCT transform output number).
23*7dd7cddfSDavid du Colombier  A "block" is an 8x8 group of samples or coefficients.
24*7dd7cddfSDavid du Colombier  An "MCU" (minimum coded unit) is an interleaved set of blocks of size
25*7dd7cddfSDavid du Colombier	determined by the sampling factors, or a single block in a
26*7dd7cddfSDavid du Colombier	noninterleaved scan.
27*7dd7cddfSDavid du ColombierWe do not use the terms "pixel" and "sample" interchangeably.  When we say
28*7dd7cddfSDavid du Colombierpixel, we mean an element of the full-size image, while a sample is an element
29*7dd7cddfSDavid du Colombierof the downsampled image.  Thus the number of samples may vary across
30*7dd7cddfSDavid du Colombiercomponents while the number of pixels does not.  (This terminology is not used
31*7dd7cddfSDavid du Colombierrigorously throughout the code, but it is used in places where confusion would
32*7dd7cddfSDavid du Colombierotherwise result.)
33*7dd7cddfSDavid du Colombier
34*7dd7cddfSDavid du Colombier
35*7dd7cddfSDavid du Colombier*** System features ***
36*7dd7cddfSDavid du Colombier
37*7dd7cddfSDavid du ColombierThe IJG distribution contains two parts:
38*7dd7cddfSDavid du Colombier  * A subroutine library for JPEG compression and decompression.
39*7dd7cddfSDavid du Colombier  * cjpeg/djpeg, two sample applications that use the library to transform
40*7dd7cddfSDavid du Colombier    JFIF JPEG files to and from several other image formats.
41*7dd7cddfSDavid du Colombiercjpeg/djpeg are of no great intellectual complexity: they merely add a simple
42*7dd7cddfSDavid du Colombiercommand-line user interface and I/O routines for several uncompressed image
43*7dd7cddfSDavid du Colombierformats.  This document concentrates on the library itself.
44*7dd7cddfSDavid du Colombier
45*7dd7cddfSDavid du ColombierWe desire the library to be capable of supporting all JPEG baseline, extended
46*7dd7cddfSDavid du Colombiersequential, and progressive DCT processes.  Hierarchical processes are not
47*7dd7cddfSDavid du Colombiersupported.
48*7dd7cddfSDavid du Colombier
49*7dd7cddfSDavid du ColombierThe library does not support the lossless (spatial) JPEG process.  Lossless
50*7dd7cddfSDavid du ColombierJPEG shares little or no code with lossy JPEG, and would normally be used
51*7dd7cddfSDavid du Colombierwithout the extensive pre- and post-processing provided by this library.
52*7dd7cddfSDavid du ColombierWe feel that lossless JPEG is better handled by a separate library.
53*7dd7cddfSDavid du Colombier
54*7dd7cddfSDavid du ColombierWithin these limits, any set of compression parameters allowed by the JPEG
55*7dd7cddfSDavid du Colombierspec should be readable for decompression.  (We can be more restrictive about
56*7dd7cddfSDavid du Colombierwhat formats we can generate.)  Although the system design allows for all
57*7dd7cddfSDavid du Colombierparameter values, some uncommon settings are not yet implemented and may
58*7dd7cddfSDavid du Colombiernever be; nonintegral sampling ratios are the prime example.  Furthermore,
59*7dd7cddfSDavid du Colombierwe treat 8-bit vs. 12-bit data precision as a compile-time switch, not a
60*7dd7cddfSDavid du Colombierrun-time option, because most machines can store 8-bit pixels much more
61*7dd7cddfSDavid du Colombiercompactly than 12-bit.
62*7dd7cddfSDavid du Colombier
63*7dd7cddfSDavid du ColombierFor legal reasons, JPEG arithmetic coding is not currently supported, but
64*7dd7cddfSDavid du Colombierextending the library to include it would be straightforward.
65*7dd7cddfSDavid du Colombier
66*7dd7cddfSDavid du ColombierBy itself, the library handles only interchange JPEG datastreams --- in
67*7dd7cddfSDavid du Colombierparticular the widely used JFIF file format.  The library can be used by
68*7dd7cddfSDavid du Colombiersurrounding code to process interchange or abbreviated JPEG datastreams that
69*7dd7cddfSDavid du Colombierare embedded in more complex file formats.  (For example, libtiff uses this
70*7dd7cddfSDavid du Colombierlibrary to implement JPEG compression within the TIFF file format.)
71*7dd7cddfSDavid du Colombier
72*7dd7cddfSDavid du ColombierThe library includes a substantial amount of code that is not covered by the
73*7dd7cddfSDavid du ColombierJPEG standard but is necessary for typical applications of JPEG.  These
74*7dd7cddfSDavid du Colombierfunctions preprocess the image before JPEG compression or postprocess it after
75*7dd7cddfSDavid du Colombierdecompression.  They include colorspace conversion, downsampling/upsampling,
76*7dd7cddfSDavid du Colombierand color quantization.  This code can be omitted if not needed.
77*7dd7cddfSDavid du Colombier
78*7dd7cddfSDavid du ColombierA wide range of quality vs. speed tradeoffs are possible in JPEG processing,
79*7dd7cddfSDavid du Colombierand even more so in decompression postprocessing.  The decompression library
80*7dd7cddfSDavid du Colombierprovides multiple implementations that cover most of the useful tradeoffs,
81*7dd7cddfSDavid du Colombierranging from very-high-quality down to fast-preview operation.  On the
82*7dd7cddfSDavid du Colombiercompression side we have generally not provided low-quality choices, since
83*7dd7cddfSDavid du Colombiercompression is normally less time-critical.  It should be understood that the
84*7dd7cddfSDavid du Colombierlow-quality modes may not meet the JPEG standard's accuracy requirements;
85*7dd7cddfSDavid du Colombiernonetheless, they are useful for viewers.
86*7dd7cddfSDavid du Colombier
87*7dd7cddfSDavid du Colombier
88*7dd7cddfSDavid du Colombier*** Portability issues ***
89*7dd7cddfSDavid du Colombier
90*7dd7cddfSDavid du ColombierPortability is an essential requirement for the library.  The key portability
91*7dd7cddfSDavid du Colombierissues that show up at the level of system architecture are:
92*7dd7cddfSDavid du Colombier
93*7dd7cddfSDavid du Colombier1.  Memory usage.  We want the code to be able to run on PC-class machines
94*7dd7cddfSDavid du Colombierwith limited memory.  Images should therefore be processed sequentially (in
95*7dd7cddfSDavid du Colombierstrips), to avoid holding the whole image in memory at once.  Where a
96*7dd7cddfSDavid du Colombierfull-image buffer is necessary, we should be able to use either virtual memory
97*7dd7cddfSDavid du Colombieror temporary files.
98*7dd7cddfSDavid du Colombier
99*7dd7cddfSDavid du Colombier2.  Near/far pointer distinction.  To run efficiently on 80x86 machines, the
100*7dd7cddfSDavid du Colombiercode should distinguish "small" objects (kept in near data space) from
101*7dd7cddfSDavid du Colombier"large" ones (kept in far data space).  This is an annoying restriction, but
102*7dd7cddfSDavid du Colombierfortunately it does not impact code quality for less brain-damaged machines,
103*7dd7cddfSDavid du Colombierand the source code clutter turns out to be minimal with sufficient use of
104*7dd7cddfSDavid du Colombierpointer typedefs.
105*7dd7cddfSDavid du Colombier
106*7dd7cddfSDavid du Colombier3. Data precision.  We assume that "char" is at least 8 bits, "short" and
107*7dd7cddfSDavid du Colombier"int" at least 16, "long" at least 32.  The code will work fine with larger
108*7dd7cddfSDavid du Colombierdata sizes, although memory may be used inefficiently in some cases.  However,
109*7dd7cddfSDavid du Colombierthe JPEG compressed datastream must ultimately appear on external storage as a
110*7dd7cddfSDavid du Colombiersequence of 8-bit bytes if it is to conform to the standard.  This may pose a
111*7dd7cddfSDavid du Colombierproblem on machines where char is wider than 8 bits.  The library represents
112*7dd7cddfSDavid du Colombiercompressed data as an array of values of typedef JOCTET.  If no data type
113*7dd7cddfSDavid du Colombierexactly 8 bits wide is available, custom data source and data destination
114*7dd7cddfSDavid du Colombiermodules must be written to unpack and pack the chosen JOCTET datatype into
115*7dd7cddfSDavid du Colombier8-bit external representation.
116*7dd7cddfSDavid du Colombier
117*7dd7cddfSDavid du Colombier
118*7dd7cddfSDavid du Colombier*** System overview ***
119*7dd7cddfSDavid du Colombier
120*7dd7cddfSDavid du ColombierThe compressor and decompressor are each divided into two main sections:
121*7dd7cddfSDavid du Colombierthe JPEG compressor or decompressor proper, and the preprocessing or
122*7dd7cddfSDavid du Colombierpostprocessing functions.  The interface between these two sections is the
123*7dd7cddfSDavid du Colombierimage data that the official JPEG spec regards as its input or output: this
124*7dd7cddfSDavid du Colombierdata is in the colorspace to be used for compression, and it is downsampled
125*7dd7cddfSDavid du Colombierto the sampling factors to be used.  The preprocessing and postprocessing
126*7dd7cddfSDavid du Colombiersteps are responsible for converting a normal image representation to or from
127*7dd7cddfSDavid du Colombierthis form.  (Those few applications that want to deal with YCbCr downsampled
128*7dd7cddfSDavid du Colombierdata can skip the preprocessing or postprocessing step.)
129*7dd7cddfSDavid du Colombier
130*7dd7cddfSDavid du ColombierLooking more closely, the compressor library contains the following main
131*7dd7cddfSDavid du Colombierelements:
132*7dd7cddfSDavid du Colombier
133*7dd7cddfSDavid du Colombier  Preprocessing:
134*7dd7cddfSDavid du Colombier    * Color space conversion (e.g., RGB to YCbCr).
135*7dd7cddfSDavid du Colombier    * Edge expansion and downsampling.  Optionally, this step can do simple
136*7dd7cddfSDavid du Colombier      smoothing --- this is often helpful for low-quality source data.
137*7dd7cddfSDavid du Colombier  JPEG proper:
138*7dd7cddfSDavid du Colombier    * MCU assembly, DCT, quantization.
139*7dd7cddfSDavid du Colombier    * Entropy coding (sequential or progressive, Huffman or arithmetic).
140*7dd7cddfSDavid du Colombier
141*7dd7cddfSDavid du ColombierIn addition to these modules we need overall control, marker generation,
142*7dd7cddfSDavid du Colombierand support code (memory management & error handling).  There is also a
143*7dd7cddfSDavid du Colombiermodule responsible for physically writing the output data --- typically
144*7dd7cddfSDavid du Colombierthis is just an interface to fwrite(), but some applications may need to
145*7dd7cddfSDavid du Colombierdo something else with the data.
146*7dd7cddfSDavid du Colombier
147*7dd7cddfSDavid du ColombierThe decompressor library contains the following main elements:
148*7dd7cddfSDavid du Colombier
149*7dd7cddfSDavid du Colombier  JPEG proper:
150*7dd7cddfSDavid du Colombier    * Entropy decoding (sequential or progressive, Huffman or arithmetic).
151*7dd7cddfSDavid du Colombier    * Dequantization, inverse DCT, MCU disassembly.
152*7dd7cddfSDavid du Colombier  Postprocessing:
153*7dd7cddfSDavid du Colombier    * Upsampling.  Optionally, this step may be able to do more general
154*7dd7cddfSDavid du Colombier      rescaling of the image.
155*7dd7cddfSDavid du Colombier    * Color space conversion (e.g., YCbCr to RGB).  This step may also
156*7dd7cddfSDavid du Colombier      provide gamma adjustment [ currently it does not ].
157*7dd7cddfSDavid du Colombier    * Optional color quantization (e.g., reduction to 256 colors).
158*7dd7cddfSDavid du Colombier    * Optional color precision reduction (e.g., 24-bit to 15-bit color).
159*7dd7cddfSDavid du Colombier      [This feature is not currently implemented.]
160*7dd7cddfSDavid du Colombier
161*7dd7cddfSDavid du ColombierWe also need overall control, marker parsing, and a data source module.
162*7dd7cddfSDavid du ColombierThe support code (memory management & error handling) can be shared with
163*7dd7cddfSDavid du Colombierthe compression half of the library.
164*7dd7cddfSDavid du Colombier
165*7dd7cddfSDavid du ColombierThere may be several implementations of each of these elements, particularly
166*7dd7cddfSDavid du Colombierin the decompressor, where a wide range of speed/quality tradeoffs is very
167*7dd7cddfSDavid du Colombieruseful.  It must be understood that some of the best speedups involve
168*7dd7cddfSDavid du Colombiermerging adjacent steps in the pipeline.  For example, upsampling, color space
169*7dd7cddfSDavid du Colombierconversion, and color quantization might all be done at once when using a
170*7dd7cddfSDavid du Colombierlow-quality ordered-dither technique.  The system architecture is designed to
171*7dd7cddfSDavid du Colombierallow such merging where appropriate.
172*7dd7cddfSDavid du Colombier
173*7dd7cddfSDavid du Colombier
174*7dd7cddfSDavid du ColombierNote: it is convenient to regard edge expansion (padding to block boundaries)
175*7dd7cddfSDavid du Colombieras a preprocessing/postprocessing function, even though the JPEG spec includes
176*7dd7cddfSDavid du Colombierit in compression/decompression.  We do this because downsampling/upsampling
177*7dd7cddfSDavid du Colombiercan be simplified a little if they work on padded data: it's not necessary to
178*7dd7cddfSDavid du Colombierhave special cases at the right and bottom edges.  Therefore the interface
179*7dd7cddfSDavid du Colombierbuffer is always an integral number of blocks wide and high, and we expect
180*7dd7cddfSDavid du Colombiercompression preprocessing to pad the source data properly.  Padding will occur
181*7dd7cddfSDavid du Colombieronly to the next block (8-sample) boundary.  In an interleaved-scan situation,
182*7dd7cddfSDavid du Colombieradditional dummy blocks may be used to fill out MCUs, but the MCU assembly and
183*7dd7cddfSDavid du Colombierdisassembly logic will create or discard these blocks internally.  (This is
184*7dd7cddfSDavid du Colombieradvantageous for speed reasons, since we avoid DCTing the dummy blocks.
185*7dd7cddfSDavid du ColombierIt also permits a small reduction in file size, because the compressor can
186*7dd7cddfSDavid du Colombierchoose dummy block contents so as to minimize their size in compressed form.
187*7dd7cddfSDavid du ColombierFinally, it makes the interface buffer specification independent of whether
188*7dd7cddfSDavid du Colombierthe file is actually interleaved or not.)  Applications that wish to deal
189*7dd7cddfSDavid du Colombierdirectly with the downsampled data must provide similar buffering and padding
190*7dd7cddfSDavid du Colombierfor odd-sized images.
191*7dd7cddfSDavid du Colombier
192*7dd7cddfSDavid du Colombier
193*7dd7cddfSDavid du Colombier*** Poor man's object-oriented programming ***
194*7dd7cddfSDavid du Colombier
195*7dd7cddfSDavid du ColombierIt should be clear by now that we have a lot of quasi-independent processing
196*7dd7cddfSDavid du Colombiersteps, many of which have several possible behaviors.  To avoid cluttering the
197*7dd7cddfSDavid du Colombiercode with lots of switch statements, we use a simple form of object-style
198*7dd7cddfSDavid du Colombierprogramming to separate out the different possibilities.
199*7dd7cddfSDavid du Colombier
200*7dd7cddfSDavid du ColombierFor example, two different color quantization algorithms could be implemented
201*7dd7cddfSDavid du Colombieras two separate modules that present the same external interface; at runtime,
202*7dd7cddfSDavid du Colombierthe calling code will access the proper module indirectly through an "object".
203*7dd7cddfSDavid du Colombier
204*7dd7cddfSDavid du ColombierWe can get the limited features we need while staying within portable C.
205*7dd7cddfSDavid du ColombierThe basic tool is a function pointer.  An "object" is just a struct
206*7dd7cddfSDavid du Colombiercontaining one or more function pointer fields, each of which corresponds to
207*7dd7cddfSDavid du Colombiera method name in real object-oriented languages.  During initialization we
208*7dd7cddfSDavid du Colombierfill in the function pointers with references to whichever module we have
209*7dd7cddfSDavid du Colombierdetermined we need to use in this run.  Then invocation of the module is done
210*7dd7cddfSDavid du Colombierby indirecting through a function pointer; on most machines this is no more
211*7dd7cddfSDavid du Colombierexpensive than a switch statement, which would be the only other way of
212*7dd7cddfSDavid du Colombiermaking the required run-time choice.  The really significant benefit, of
213*7dd7cddfSDavid du Colombiercourse, is keeping the source code clean and well structured.
214*7dd7cddfSDavid du Colombier
215*7dd7cddfSDavid du ColombierWe can also arrange to have private storage that varies between different
216*7dd7cddfSDavid du Colombierimplementations of the same kind of object.  We do this by making all the
217*7dd7cddfSDavid du Colombiermodule-specific object structs be separately allocated entities, which will
218*7dd7cddfSDavid du Colombierbe accessed via pointers in the master compression or decompression struct.
219*7dd7cddfSDavid du ColombierThe "public" fields or methods for a given kind of object are specified by
220*7dd7cddfSDavid du Colombiera commonly known struct.  But a module's initialization code can allocate
221*7dd7cddfSDavid du Colombiera larger struct that contains the common struct as its first member, plus
222*7dd7cddfSDavid du Colombieradditional private fields.  With appropriate pointer casting, the module's
223*7dd7cddfSDavid du Colombierinternal functions can access these private fields.  (For a simple example,
224*7dd7cddfSDavid du Colombiersee jdatadst.c, which implements the external interface specified by struct
225*7dd7cddfSDavid du Colombierjpeg_destination_mgr, but adds extra fields.)
226*7dd7cddfSDavid du Colombier
227*7dd7cddfSDavid du Colombier(Of course this would all be a lot easier if we were using C++, but we are
228*7dd7cddfSDavid du Colombiernot yet prepared to assume that everyone has a C++ compiler.)
229*7dd7cddfSDavid du Colombier
230*7dd7cddfSDavid du ColombierAn important benefit of this scheme is that it is easy to provide multiple
231*7dd7cddfSDavid du Colombierversions of any method, each tuned to a particular case.  While a lot of
232*7dd7cddfSDavid du Colombierprecalculation might be done to select an optimal implementation of a method,
233*7dd7cddfSDavid du Colombierthe cost per invocation is constant.  For example, the upsampling step might
234*7dd7cddfSDavid du Colombierhave a "generic" method, plus one or more "hardwired" methods for the most
235*7dd7cddfSDavid du Colombierpopular sampling factors; the hardwired methods would be faster because they'd
236*7dd7cddfSDavid du Colombieruse straight-line code instead of for-loops.  The cost to determine which
237*7dd7cddfSDavid du Colombiermethod to use is paid only once, at startup, and the selection criteria are
238*7dd7cddfSDavid du Colombierhidden from the callers of the method.
239*7dd7cddfSDavid du Colombier
240*7dd7cddfSDavid du ColombierThis plan differs a little bit from usual object-oriented structures, in that
241*7dd7cddfSDavid du Colombieronly one instance of each object class will exist during execution.  The
242*7dd7cddfSDavid du Colombierreason for having the class structure is that on different runs we may create
243*7dd7cddfSDavid du Colombierdifferent instances (choose to execute different modules).  You can think of
244*7dd7cddfSDavid du Colombierthe term "method" as denoting the common interface presented by a particular
245*7dd7cddfSDavid du Colombierset of interchangeable functions, and "object" as denoting a group of related
246*7dd7cddfSDavid du Colombiermethods, or the total shared interface behavior of a group of modules.
247*7dd7cddfSDavid du Colombier
248*7dd7cddfSDavid du Colombier
249*7dd7cddfSDavid du Colombier*** Overall control structure ***
250*7dd7cddfSDavid du Colombier
251*7dd7cddfSDavid du ColombierWe previously mentioned the need for overall control logic in the compression
252*7dd7cddfSDavid du Colombierand decompression libraries.  In IJG implementations prior to v5, overall
253*7dd7cddfSDavid du Colombiercontrol was mostly provided by "pipeline control" modules, which proved to be
254*7dd7cddfSDavid du Colombierlarge, unwieldy, and hard to understand.  To improve the situation, the
255*7dd7cddfSDavid du Colombiercontrol logic has been subdivided into multiple modules.  The control modules
256*7dd7cddfSDavid du Colombierconsist of:
257*7dd7cddfSDavid du Colombier
258*7dd7cddfSDavid du Colombier1. Master control for module selection and initialization.  This has two
259*7dd7cddfSDavid du Colombierresponsibilities:
260*7dd7cddfSDavid du Colombier
261*7dd7cddfSDavid du Colombier   1A.  Startup initialization at the beginning of image processing.
262*7dd7cddfSDavid du Colombier        The individual processing modules to be used in this run are selected
263*7dd7cddfSDavid du Colombier        and given initialization calls.
264*7dd7cddfSDavid du Colombier
265*7dd7cddfSDavid du Colombier   1B.  Per-pass control.  This determines how many passes will be performed
266*7dd7cddfSDavid du Colombier        and calls each active processing module to configure itself
267*7dd7cddfSDavid du Colombier        appropriately at the beginning of each pass.  End-of-pass processing,
268*7dd7cddfSDavid du Colombier	where necessary, is also invoked from the master control module.
269*7dd7cddfSDavid du Colombier
270*7dd7cddfSDavid du Colombier   Method selection is partially distributed, in that a particular processing
271*7dd7cddfSDavid du Colombier   module may contain several possible implementations of a particular method,
272*7dd7cddfSDavid du Colombier   which it will select among when given its initialization call.  The master
273*7dd7cddfSDavid du Colombier   control code need only be concerned with decisions that affect more than
274*7dd7cddfSDavid du Colombier   one module.
275*7dd7cddfSDavid du Colombier
276*7dd7cddfSDavid du Colombier2. Data buffering control.  A separate control module exists for each
277*7dd7cddfSDavid du Colombier   inter-processing-step data buffer.  This module is responsible for
278*7dd7cddfSDavid du Colombier   invoking the processing steps that write or read that data buffer.
279*7dd7cddfSDavid du Colombier
280*7dd7cddfSDavid du ColombierEach buffer controller sees the world as follows:
281*7dd7cddfSDavid du Colombier
282*7dd7cddfSDavid du Colombierinput data => processing step A => buffer => processing step B => output data
283*7dd7cddfSDavid du Colombier                      |              |               |
284*7dd7cddfSDavid du Colombier              ------------------ controller ------------------
285*7dd7cddfSDavid du Colombier
286*7dd7cddfSDavid du ColombierThe controller knows the dataflow requirements of steps A and B: how much data
287*7dd7cddfSDavid du Colombierthey want to accept in one chunk and how much they output in one chunk.  Its
288*7dd7cddfSDavid du Colombierfunction is to manage its buffer and call A and B at the proper times.
289*7dd7cddfSDavid du Colombier
290*7dd7cddfSDavid du ColombierA data buffer control module may itself be viewed as a processing step by a
291*7dd7cddfSDavid du Colombierhigher-level control module; thus the control modules form a binary tree with
292*7dd7cddfSDavid du Colombierelementary processing steps at the leaves of the tree.
293*7dd7cddfSDavid du Colombier
294*7dd7cddfSDavid du ColombierThe control modules are objects.  A considerable amount of flexibility can
295*7dd7cddfSDavid du Colombierbe had by replacing implementations of a control module.  For example:
296*7dd7cddfSDavid du Colombier* Merging of adjacent steps in the pipeline is done by replacing a control
297*7dd7cddfSDavid du Colombier  module and its pair of processing-step modules with a single processing-
298*7dd7cddfSDavid du Colombier  step module.  (Hence the possible merges are determined by the tree of
299*7dd7cddfSDavid du Colombier  control modules.)
300*7dd7cddfSDavid du Colombier* In some processing modes, a given interstep buffer need only be a "strip"
301*7dd7cddfSDavid du Colombier  buffer large enough to accommodate the desired data chunk sizes.  In other
302*7dd7cddfSDavid du Colombier  modes, a full-image buffer is needed and several passes are required.
303*7dd7cddfSDavid du Colombier  The control module determines which kind of buffer is used and manipulates
304*7dd7cddfSDavid du Colombier  virtual array buffers as needed.  One or both processing steps may be
305*7dd7cddfSDavid du Colombier  unaware of the multi-pass behavior.
306*7dd7cddfSDavid du Colombier
307*7dd7cddfSDavid du ColombierIn theory, we might be able to make all of the data buffer controllers
308*7dd7cddfSDavid du Colombierinterchangeable and provide just one set of implementations for all.  In
309*7dd7cddfSDavid du Colombierpractice, each one contains considerable special-case processing for its
310*7dd7cddfSDavid du Colombierparticular job.  The buffer controller concept should be regarded as an
311*7dd7cddfSDavid du Colombieroverall system structuring principle, not as a complete description of the
312*7dd7cddfSDavid du Colombiertask performed by any one controller.
313*7dd7cddfSDavid du Colombier
314*7dd7cddfSDavid du Colombier
315*7dd7cddfSDavid du Colombier*** Compression object structure ***
316*7dd7cddfSDavid du Colombier
317*7dd7cddfSDavid du ColombierHere is a sketch of the logical structure of the JPEG compression library:
318*7dd7cddfSDavid du Colombier
319*7dd7cddfSDavid du Colombier                                                 |-- Colorspace conversion
320*7dd7cddfSDavid du Colombier                  |-- Preprocessing controller --|
321*7dd7cddfSDavid du Colombier                  |                              |-- Downsampling
322*7dd7cddfSDavid du ColombierMain controller --|
323*7dd7cddfSDavid du Colombier                  |                            |-- Forward DCT, quantize
324*7dd7cddfSDavid du Colombier                  |-- Coefficient controller --|
325*7dd7cddfSDavid du Colombier                                               |-- Entropy encoding
326*7dd7cddfSDavid du Colombier
327*7dd7cddfSDavid du ColombierThis sketch also describes the flow of control (subroutine calls) during
328*7dd7cddfSDavid du Colombiertypical image data processing.  Each of the components shown in the diagram is
329*7dd7cddfSDavid du Colombieran "object" which may have several different implementations available.  One
330*7dd7cddfSDavid du Colombieror more source code files contain the actual implementation(s) of each object.
331*7dd7cddfSDavid du Colombier
332*7dd7cddfSDavid du ColombierThe objects shown above are:
333*7dd7cddfSDavid du Colombier
334*7dd7cddfSDavid du Colombier* Main controller: buffer controller for the subsampled-data buffer, which
335*7dd7cddfSDavid du Colombier  holds the preprocessed input data.  This controller invokes preprocessing to
336*7dd7cddfSDavid du Colombier  fill the subsampled-data buffer, and JPEG compression to empty it.  There is
337*7dd7cddfSDavid du Colombier  usually no need for a full-image buffer here; a strip buffer is adequate.
338*7dd7cddfSDavid du Colombier
339*7dd7cddfSDavid du Colombier* Preprocessing controller: buffer controller for the downsampling input data
340*7dd7cddfSDavid du Colombier  buffer, which lies between colorspace conversion and downsampling.  Note
341*7dd7cddfSDavid du Colombier  that a unified conversion/downsampling module would probably replace this
342*7dd7cddfSDavid du Colombier  controller entirely.
343*7dd7cddfSDavid du Colombier
344*7dd7cddfSDavid du Colombier* Colorspace conversion: converts application image data into the desired
345*7dd7cddfSDavid du Colombier  JPEG color space; also changes the data from pixel-interleaved layout to
346*7dd7cddfSDavid du Colombier  separate component planes.  Processes one pixel row at a time.
347*7dd7cddfSDavid du Colombier
348*7dd7cddfSDavid du Colombier* Downsampling: performs reduction of chroma components as required.
349*7dd7cddfSDavid du Colombier  Optionally may perform pixel-level smoothing as well.  Processes a "row
350*7dd7cddfSDavid du Colombier  group" at a time, where a row group is defined as Vmax pixel rows of each
351*7dd7cddfSDavid du Colombier  component before downsampling, and Vk sample rows afterwards (remember Vk
352*7dd7cddfSDavid du Colombier  differs across components).  Some downsampling or smoothing algorithms may
353*7dd7cddfSDavid du Colombier  require context rows above and below the current row group; the
354*7dd7cddfSDavid du Colombier  preprocessing controller is responsible for supplying these rows via proper
355*7dd7cddfSDavid du Colombier  buffering.  The downsampler is responsible for edge expansion at the right
356*7dd7cddfSDavid du Colombier  edge (i.e., extending each sample row to a multiple of 8 samples); but the
357*7dd7cddfSDavid du Colombier  preprocessing controller is responsible for vertical edge expansion (i.e.,
358*7dd7cddfSDavid du Colombier  duplicating the bottom sample row as needed to make a multiple of 8 rows).
359*7dd7cddfSDavid du Colombier
360*7dd7cddfSDavid du Colombier* Coefficient controller: buffer controller for the DCT-coefficient data.
361*7dd7cddfSDavid du Colombier  This controller handles MCU assembly, including insertion of dummy DCT
362*7dd7cddfSDavid du Colombier  blocks when needed at the right or bottom edge.  When performing
363*7dd7cddfSDavid du Colombier  Huffman-code optimization or emitting a multiscan JPEG file, this
364*7dd7cddfSDavid du Colombier  controller is responsible for buffering the full image.  The equivalent of
365*7dd7cddfSDavid du Colombier  one fully interleaved MCU row of subsampled data is processed per call,
366*7dd7cddfSDavid du Colombier  even when the JPEG file is noninterleaved.
367*7dd7cddfSDavid du Colombier
368*7dd7cddfSDavid du Colombier* Forward DCT and quantization: Perform DCT, quantize, and emit coefficients.
369*7dd7cddfSDavid du Colombier  Works on one or more DCT blocks at a time.  (Note: the coefficients are now
370*7dd7cddfSDavid du Colombier  emitted in normal array order, which the entropy encoder is expected to
371*7dd7cddfSDavid du Colombier  convert to zigzag order as necessary.  Prior versions of the IJG code did
372*7dd7cddfSDavid du Colombier  the conversion to zigzag order within the quantization step.)
373*7dd7cddfSDavid du Colombier
374*7dd7cddfSDavid du Colombier* Entropy encoding: Perform Huffman or arithmetic entropy coding and emit the
375*7dd7cddfSDavid du Colombier  coded data to the data destination module.  Works on one MCU per call.
376*7dd7cddfSDavid du Colombier  For progressive JPEG, the same DCT blocks are fed to the entropy coder
377*7dd7cddfSDavid du Colombier  during each pass, and the coder must emit the appropriate subset of
378*7dd7cddfSDavid du Colombier  coefficients.
379*7dd7cddfSDavid du Colombier
380*7dd7cddfSDavid du ColombierIn addition to the above objects, the compression library includes these
381*7dd7cddfSDavid du Colombierobjects:
382*7dd7cddfSDavid du Colombier
383*7dd7cddfSDavid du Colombier* Master control: determines the number of passes required, controls overall
384*7dd7cddfSDavid du Colombier  and per-pass initialization of the other modules.
385*7dd7cddfSDavid du Colombier
386*7dd7cddfSDavid du Colombier* Marker writing: generates JPEG markers (except for RSTn, which is emitted
387*7dd7cddfSDavid du Colombier  by the entropy encoder when needed).
388*7dd7cddfSDavid du Colombier
389*7dd7cddfSDavid du Colombier* Data destination manager: writes the output JPEG datastream to its final
390*7dd7cddfSDavid du Colombier  destination (e.g., a file).  The destination manager supplied with the
391*7dd7cddfSDavid du Colombier  library knows how to write to a stdio stream; for other behaviors, the
392*7dd7cddfSDavid du Colombier  surrounding application may provide its own destination manager.
393*7dd7cddfSDavid du Colombier
394*7dd7cddfSDavid du Colombier* Memory manager: allocates and releases memory, controls virtual arrays
395*7dd7cddfSDavid du Colombier  (with backing store management, where required).
396*7dd7cddfSDavid du Colombier
397*7dd7cddfSDavid du Colombier* Error handler: performs formatting and output of error and trace messages;
398*7dd7cddfSDavid du Colombier  determines handling of nonfatal errors.  The surrounding application may
399*7dd7cddfSDavid du Colombier  override some or all of this object's methods to change error handling.
400*7dd7cddfSDavid du Colombier
401*7dd7cddfSDavid du Colombier* Progress monitor: supports output of "percent-done" progress reports.
402*7dd7cddfSDavid du Colombier  This object represents an optional callback to the surrounding application:
403*7dd7cddfSDavid du Colombier  if wanted, it must be supplied by the application.
404*7dd7cddfSDavid du Colombier
405*7dd7cddfSDavid du ColombierThe error handler, destination manager, and progress monitor objects are
406*7dd7cddfSDavid du Colombierdefined as separate objects in order to simplify application-specific
407*7dd7cddfSDavid du Colombiercustomization of the JPEG library.  A surrounding application may override
408*7dd7cddfSDavid du Colombierindividual methods or supply its own all-new implementation of one of these
409*7dd7cddfSDavid du Colombierobjects.  The object interfaces for these objects are therefore treated as
410*7dd7cddfSDavid du Colombierpart of the application interface of the library, whereas the other objects
411*7dd7cddfSDavid du Colombierare internal to the library.
412*7dd7cddfSDavid du Colombier
413*7dd7cddfSDavid du ColombierThe error handler and memory manager are shared by JPEG compression and
414*7dd7cddfSDavid du Colombierdecompression; the progress monitor, if used, may be shared as well.
415*7dd7cddfSDavid du Colombier
416*7dd7cddfSDavid du Colombier
417*7dd7cddfSDavid du Colombier*** Decompression object structure ***
418*7dd7cddfSDavid du Colombier
419*7dd7cddfSDavid du ColombierHere is a sketch of the logical structure of the JPEG decompression library:
420*7dd7cddfSDavid du Colombier
421*7dd7cddfSDavid du Colombier                                               |-- Entropy decoding
422*7dd7cddfSDavid du Colombier                  |-- Coefficient controller --|
423*7dd7cddfSDavid du Colombier                  |                            |-- Dequantize, Inverse DCT
424*7dd7cddfSDavid du ColombierMain controller --|
425*7dd7cddfSDavid du Colombier                  |                               |-- Upsampling
426*7dd7cddfSDavid du Colombier                  |-- Postprocessing controller --|   |-- Colorspace conversion
427*7dd7cddfSDavid du Colombier                                                  |-- Color quantization
428*7dd7cddfSDavid du Colombier                                                  |-- Color precision reduction
429*7dd7cddfSDavid du Colombier
430*7dd7cddfSDavid du ColombierAs before, this diagram also represents typical control flow.  The objects
431*7dd7cddfSDavid du Colombiershown are:
432*7dd7cddfSDavid du Colombier
433*7dd7cddfSDavid du Colombier* Main controller: buffer controller for the subsampled-data buffer, which
434*7dd7cddfSDavid du Colombier  holds the output of JPEG decompression proper.  This controller's primary
435*7dd7cddfSDavid du Colombier  task is to feed the postprocessing procedure.  Some upsampling algorithms
436*7dd7cddfSDavid du Colombier  may require context rows above and below the current row group; when this
437*7dd7cddfSDavid du Colombier  is true, the main controller is responsible for managing its buffer so as
438*7dd7cddfSDavid du Colombier  to make context rows available.  In the current design, the main buffer is
439*7dd7cddfSDavid du Colombier  always a strip buffer; a full-image buffer is never required.
440*7dd7cddfSDavid du Colombier
441*7dd7cddfSDavid du Colombier* Coefficient controller: buffer controller for the DCT-coefficient data.
442*7dd7cddfSDavid du Colombier  This controller handles MCU disassembly, including deletion of any dummy
443*7dd7cddfSDavid du Colombier  DCT blocks at the right or bottom edge.  When reading a multiscan JPEG
444*7dd7cddfSDavid du Colombier  file, this controller is responsible for buffering the full image.
445*7dd7cddfSDavid du Colombier  (Buffering DCT coefficients, rather than samples, is necessary to support
446*7dd7cddfSDavid du Colombier  progressive JPEG.)  The equivalent of one fully interleaved MCU row of
447*7dd7cddfSDavid du Colombier  subsampled data is processed per call, even when the source JPEG file is
448*7dd7cddfSDavid du Colombier  noninterleaved.
449*7dd7cddfSDavid du Colombier
450*7dd7cddfSDavid du Colombier* Entropy decoding: Read coded data from the data source module and perform
451*7dd7cddfSDavid du Colombier  Huffman or arithmetic entropy decoding.  Works on one MCU per call.
452*7dd7cddfSDavid du Colombier  For progressive JPEG decoding, the coefficient controller supplies the prior
453*7dd7cddfSDavid du Colombier  coefficients of each MCU (initially all zeroes), which the entropy decoder
454*7dd7cddfSDavid du Colombier  modifies in each scan.
455*7dd7cddfSDavid du Colombier
456*7dd7cddfSDavid du Colombier* Dequantization and inverse DCT: like it says.  Note that the coefficients
457*7dd7cddfSDavid du Colombier  buffered by the coefficient controller have NOT been dequantized; we
458*7dd7cddfSDavid du Colombier  merge dequantization and inverse DCT into a single step for speed reasons.
459*7dd7cddfSDavid du Colombier  When scaled-down output is asked for, simplified DCT algorithms may be used
460*7dd7cddfSDavid du Colombier  that emit only 1x1, 2x2, or 4x4 samples per DCT block, not the full 8x8.
461*7dd7cddfSDavid du Colombier  Works on one DCT block at a time.
462*7dd7cddfSDavid du Colombier
463*7dd7cddfSDavid du Colombier* Postprocessing controller: buffer controller for the color quantization
464*7dd7cddfSDavid du Colombier  input buffer, when quantization is in use.  (Without quantization, this
465*7dd7cddfSDavid du Colombier  controller just calls the upsampler.)  For two-pass quantization, this
466*7dd7cddfSDavid du Colombier  controller is responsible for buffering the full-image data.
467*7dd7cddfSDavid du Colombier
468*7dd7cddfSDavid du Colombier* Upsampling: restores chroma components to full size.  (May support more
469*7dd7cddfSDavid du Colombier  general output rescaling, too.  Note that if undersized DCT outputs have
470*7dd7cddfSDavid du Colombier  been emitted by the DCT module, this module must adjust so that properly
471*7dd7cddfSDavid du Colombier  sized outputs are created.)  Works on one row group at a time.  This module
472*7dd7cddfSDavid du Colombier  also calls the color conversion module, so its top level is effectively a
473*7dd7cddfSDavid du Colombier  buffer controller for the upsampling->color conversion buffer.  However, in
474*7dd7cddfSDavid du Colombier  all but the highest-quality operating modes, upsampling and color
475*7dd7cddfSDavid du Colombier  conversion are likely to be merged into a single step.
476*7dd7cddfSDavid du Colombier
477*7dd7cddfSDavid du Colombier* Colorspace conversion: convert from JPEG color space to output color space,
478*7dd7cddfSDavid du Colombier  and change data layout from separate component planes to pixel-interleaved.
479*7dd7cddfSDavid du Colombier  Works on one pixel row at a time.
480*7dd7cddfSDavid du Colombier
481*7dd7cddfSDavid du Colombier* Color quantization: reduce the data to colormapped form, using either an
482*7dd7cddfSDavid du Colombier  externally specified colormap or an internally generated one.  This module
483*7dd7cddfSDavid du Colombier  is not used for full-color output.  Works on one pixel row at a time; may
484*7dd7cddfSDavid du Colombier  require two passes to generate a color map.  Note that the output will
485*7dd7cddfSDavid du Colombier  always be a single component representing colormap indexes.  In the current
486*7dd7cddfSDavid du Colombier  design, the output values are JSAMPLEs, so an 8-bit compilation cannot
487*7dd7cddfSDavid du Colombier  quantize to more than 256 colors.  This is unlikely to be a problem in
488*7dd7cddfSDavid du Colombier  practice.
489*7dd7cddfSDavid du Colombier
490*7dd7cddfSDavid du Colombier* Color reduction: this module handles color precision reduction, e.g.,
491*7dd7cddfSDavid du Colombier  generating 15-bit color (5 bits/primary) from JPEG's 24-bit output.
492*7dd7cddfSDavid du Colombier  Not quite clear yet how this should be handled... should we merge it with
493*7dd7cddfSDavid du Colombier  colorspace conversion???
494*7dd7cddfSDavid du Colombier
495*7dd7cddfSDavid du ColombierNote that some high-speed operating modes might condense the entire
496*7dd7cddfSDavid du Colombierpostprocessing sequence to a single module (upsample, color convert, and
497*7dd7cddfSDavid du Colombierquantize in one step).
498*7dd7cddfSDavid du Colombier
499*7dd7cddfSDavid du ColombierIn addition to the above objects, the decompression library includes these
500*7dd7cddfSDavid du Colombierobjects:
501*7dd7cddfSDavid du Colombier
502*7dd7cddfSDavid du Colombier* Master control: determines the number of passes required, controls overall
503*7dd7cddfSDavid du Colombier  and per-pass initialization of the other modules.  This is subdivided into
504*7dd7cddfSDavid du Colombier  input and output control: jdinput.c controls only input-side processing,
505*7dd7cddfSDavid du Colombier  while jdmaster.c handles overall initialization and output-side control.
506*7dd7cddfSDavid du Colombier
507*7dd7cddfSDavid du Colombier* Marker reading: decodes JPEG markers (except for RSTn).
508*7dd7cddfSDavid du Colombier
509*7dd7cddfSDavid du Colombier* Data source manager: supplies the input JPEG datastream.  The source
510*7dd7cddfSDavid du Colombier  manager supplied with the library knows how to read from a stdio stream;
511*7dd7cddfSDavid du Colombier  for other behaviors, the surrounding application may provide its own source
512*7dd7cddfSDavid du Colombier  manager.
513*7dd7cddfSDavid du Colombier
514*7dd7cddfSDavid du Colombier* Memory manager: same as for compression library.
515*7dd7cddfSDavid du Colombier
516*7dd7cddfSDavid du Colombier* Error handler: same as for compression library.
517*7dd7cddfSDavid du Colombier
518*7dd7cddfSDavid du Colombier* Progress monitor: same as for compression library.
519*7dd7cddfSDavid du Colombier
520*7dd7cddfSDavid du ColombierAs with compression, the data source manager, error handler, and progress
521*7dd7cddfSDavid du Colombiermonitor are candidates for replacement by a surrounding application.
522*7dd7cddfSDavid du Colombier
523*7dd7cddfSDavid du Colombier
524*7dd7cddfSDavid du Colombier*** Decompression input and output separation ***
525*7dd7cddfSDavid du Colombier
526*7dd7cddfSDavid du ColombierTo support efficient incremental display of progressive JPEG files, the
527*7dd7cddfSDavid du Colombierdecompressor is divided into two sections that can run independently:
528*7dd7cddfSDavid du Colombier
529*7dd7cddfSDavid du Colombier1. Data input includes marker parsing, entropy decoding, and input into the
530*7dd7cddfSDavid du Colombier   coefficient controller's DCT coefficient buffer.  Note that this
531*7dd7cddfSDavid du Colombier   processing is relatively cheap and fast.
532*7dd7cddfSDavid du Colombier
533*7dd7cddfSDavid du Colombier2. Data output reads from the DCT coefficient buffer and performs the IDCT
534*7dd7cddfSDavid du Colombier   and all postprocessing steps.
535*7dd7cddfSDavid du Colombier
536*7dd7cddfSDavid du ColombierFor a progressive JPEG file, the data input processing is allowed to get
537*7dd7cddfSDavid du Colombierarbitrarily far ahead of the data output processing.  (This occurs only
538*7dd7cddfSDavid du Colombierif the application calls jpeg_consume_input(); otherwise input and output
539*7dd7cddfSDavid du Colombierrun in lockstep, since the input section is called only when the output
540*7dd7cddfSDavid du Colombiersection needs more data.)  In this way the application can avoid making
541*7dd7cddfSDavid du Colombierextra display passes when data is arriving faster than the display pass
542*7dd7cddfSDavid du Colombiercan run.  Furthermore, it is possible to abort an output pass without
543*7dd7cddfSDavid du Colombierlosing anything, since the coefficient buffer is read-only as far as the
544*7dd7cddfSDavid du Colombieroutput section is concerned.  See libjpeg.doc for more detail.
545*7dd7cddfSDavid du Colombier
546*7dd7cddfSDavid du ColombierA full-image coefficient array is only created if the JPEG file has multiple
547*7dd7cddfSDavid du Colombierscans (or if the application specifies buffered-image mode anyway).  When
548*7dd7cddfSDavid du Colombierreading a single-scan file, the coefficient controller normally creates only
549*7dd7cddfSDavid du Colombiera one-MCU buffer, so input and output processing must run in lockstep in this
550*7dd7cddfSDavid du Colombiercase.  jpeg_consume_input() is effectively a no-op in this situation.
551*7dd7cddfSDavid du Colombier
552*7dd7cddfSDavid du ColombierThe main impact of dividing the decompressor in this fashion is that we must
553*7dd7cddfSDavid du Colombierbe very careful with shared variables in the cinfo data structure.  Each
554*7dd7cddfSDavid du Colombiervariable that can change during the course of decompression must be
555*7dd7cddfSDavid du Colombierclassified as belonging to data input or data output, and each section must
556*7dd7cddfSDavid du Colombierlook only at its own variables.  For example, the data output section may not
557*7dd7cddfSDavid du Colombierdepend on any of the variables that describe the current scan in the JPEG
558*7dd7cddfSDavid du Colombierfile, because these may change as the data input section advances into a new
559*7dd7cddfSDavid du Colombierscan.
560*7dd7cddfSDavid du Colombier
561*7dd7cddfSDavid du ColombierThe progress monitor is (somewhat arbitrarily) defined to treat input of the
562*7dd7cddfSDavid du Colombierfile as one pass when buffered-image mode is not used, and to ignore data
563*7dd7cddfSDavid du Colombierinput work completely when buffered-image mode is used.  Note that the
564*7dd7cddfSDavid du Colombierlibrary has no reliable way to predict the number of passes when dealing
565*7dd7cddfSDavid du Colombierwith a progressive JPEG file, nor can it predict the number of output passes
566*7dd7cddfSDavid du Colombierin buffered-image mode.  So the work estimate is inherently bogus anyway.
567*7dd7cddfSDavid du Colombier
568*7dd7cddfSDavid du ColombierNo comparable division is currently made in the compression library, because
569*7dd7cddfSDavid du Colombierthere isn't any real need for it.
570*7dd7cddfSDavid du Colombier
571*7dd7cddfSDavid du Colombier
572*7dd7cddfSDavid du Colombier*** Data formats ***
573*7dd7cddfSDavid du Colombier
574*7dd7cddfSDavid du ColombierArrays of pixel sample values use the following data structure:
575*7dd7cddfSDavid du Colombier
576*7dd7cddfSDavid du Colombier    typedef something JSAMPLE;		a pixel component value, 0..MAXJSAMPLE
577*7dd7cddfSDavid du Colombier    typedef JSAMPLE *JSAMPROW;		ptr to a row of samples
578*7dd7cddfSDavid du Colombier    typedef JSAMPROW *JSAMPARRAY;	ptr to a list of rows
579*7dd7cddfSDavid du Colombier    typedef JSAMPARRAY *JSAMPIMAGE;	ptr to a list of color-component arrays
580*7dd7cddfSDavid du Colombier
581*7dd7cddfSDavid du ColombierThe basic element type JSAMPLE will typically be one of unsigned char,
582*7dd7cddfSDavid du Colombier(signed) char, or short.  Short will be used if samples wider than 8 bits are
583*7dd7cddfSDavid du Colombierto be supported (this is a compile-time option).  Otherwise, unsigned char is
584*7dd7cddfSDavid du Colombierused if possible.  If the compiler only supports signed chars, then it is
585*7dd7cddfSDavid du Colombiernecessary to mask off the value when reading.  Thus, all reads of JSAMPLE
586*7dd7cddfSDavid du Colombiervalues must be coded as "GETJSAMPLE(value)", where the macro will be defined
587*7dd7cddfSDavid du Colombieras "((value) & 0xFF)" on signed-char machines and "((int) (value))" elsewhere.
588*7dd7cddfSDavid du Colombier
589*7dd7cddfSDavid du ColombierWith these conventions, JSAMPLE values can be assumed to be >= 0.  This helps
590*7dd7cddfSDavid du Colombiersimplify correct rounding during downsampling, etc.  The JPEG standard's
591*7dd7cddfSDavid du Colombierspecification that sample values run from -128..127 is accommodated by
592*7dd7cddfSDavid du Colombiersubtracting 128 just as the sample value is copied into the source array for
593*7dd7cddfSDavid du Colombierthe DCT step (this will be an array of signed ints).  Similarly, during
594*7dd7cddfSDavid du Colombierdecompression the output of the IDCT step will be immediately shifted back to
595*7dd7cddfSDavid du Colombier0..255.  (NB: different values are required when 12-bit samples are in use.
596*7dd7cddfSDavid du ColombierThe code is written in terms of MAXJSAMPLE and CENTERJSAMPLE, which will be
597*7dd7cddfSDavid du Colombierdefined as 255 and 128 respectively in an 8-bit implementation, and as 4095
598*7dd7cddfSDavid du Colombierand 2048 in a 12-bit implementation.)
599*7dd7cddfSDavid du Colombier
600*7dd7cddfSDavid du ColombierWe use a pointer per row, rather than a two-dimensional JSAMPLE array.  This
601*7dd7cddfSDavid du Colombierchoice costs only a small amount of memory and has several benefits:
602*7dd7cddfSDavid du Colombier* Code using the data structure doesn't need to know the allocated width of
603*7dd7cddfSDavid du Colombier  the rows.  This simplifies edge expansion/compression, since we can work
604*7dd7cddfSDavid du Colombier  in an array that's wider than the logical picture width.
605*7dd7cddfSDavid du Colombier* Indexing doesn't require multiplication; this is a performance win on many
606*7dd7cddfSDavid du Colombier  machines.
607*7dd7cddfSDavid du Colombier* Arrays with more than 64K total elements can be supported even on machines
608*7dd7cddfSDavid du Colombier  where malloc() cannot allocate chunks larger than 64K.
609*7dd7cddfSDavid du Colombier* The rows forming a component array may be allocated at different times
610*7dd7cddfSDavid du Colombier  without extra copying.  This trick allows some speedups in smoothing steps
611*7dd7cddfSDavid du Colombier  that need access to the previous and next rows.
612*7dd7cddfSDavid du Colombier
613*7dd7cddfSDavid du ColombierNote that each color component is stored in a separate array; we don't use the
614*7dd7cddfSDavid du Colombiertraditional layout in which the components of a pixel are stored together.
615*7dd7cddfSDavid du ColombierThis simplifies coding of modules that work on each component independently,
616*7dd7cddfSDavid du Colombierbecause they don't need to know how many components there are.  Furthermore,
617*7dd7cddfSDavid du Colombierwe can read or write each component to a temporary file independently, which
618*7dd7cddfSDavid du Colombieris helpful when dealing with noninterleaved JPEG files.
619*7dd7cddfSDavid du Colombier
620*7dd7cddfSDavid du ColombierIn general, a specific sample value is accessed by code such as
621*7dd7cddfSDavid du Colombier	GETJSAMPLE(image[colorcomponent][row][col])
622*7dd7cddfSDavid du Colombierwhere col is measured from the image left edge, but row is measured from the
623*7dd7cddfSDavid du Colombierfirst sample row currently in memory.  Either of the first two indexings can
624*7dd7cddfSDavid du Colombierbe precomputed by copying the relevant pointer.
625*7dd7cddfSDavid du Colombier
626*7dd7cddfSDavid du Colombier
627*7dd7cddfSDavid du ColombierSince most image-processing applications prefer to work on images in which
628*7dd7cddfSDavid du Colombierthe components of a pixel are stored together, the data passed to or from the
629*7dd7cddfSDavid du Colombiersurrounding application uses the traditional convention: a single pixel is
630*7dd7cddfSDavid du Colombierrepresented by N consecutive JSAMPLE values, and an image row is an array of
631*7dd7cddfSDavid du Colombier(# of color components)*(image width) JSAMPLEs.  One or more rows of data can
632*7dd7cddfSDavid du Colombierbe represented by a pointer of type JSAMPARRAY in this scheme.  This scheme is
633*7dd7cddfSDavid du Colombierconverted to component-wise storage inside the JPEG library.  (Applications
634*7dd7cddfSDavid du Colombierthat want to skip JPEG preprocessing or postprocessing will have to contend
635*7dd7cddfSDavid du Colombierwith component-wise storage.)
636*7dd7cddfSDavid du Colombier
637*7dd7cddfSDavid du Colombier
638*7dd7cddfSDavid du ColombierArrays of DCT-coefficient values use the following data structure:
639*7dd7cddfSDavid du Colombier
640*7dd7cddfSDavid du Colombier    typedef short JCOEF;		a 16-bit signed integer
641*7dd7cddfSDavid du Colombier    typedef JCOEF JBLOCK[DCTSIZE2];	an 8x8 block of coefficients
642*7dd7cddfSDavid du Colombier    typedef JBLOCK *JBLOCKROW;		ptr to one horizontal row of 8x8 blocks
643*7dd7cddfSDavid du Colombier    typedef JBLOCKROW *JBLOCKARRAY;	ptr to a list of such rows
644*7dd7cddfSDavid du Colombier    typedef JBLOCKARRAY *JBLOCKIMAGE;	ptr to a list of color component arrays
645*7dd7cddfSDavid du Colombier
646*7dd7cddfSDavid du ColombierThe underlying type is at least a 16-bit signed integer; while "short" is big
647*7dd7cddfSDavid du Colombierenough on all machines of interest, on some machines it is preferable to use
648*7dd7cddfSDavid du Colombier"int" for speed reasons, despite the storage cost.  Coefficients are grouped
649*7dd7cddfSDavid du Colombierinto 8x8 blocks (but we always use #defines DCTSIZE and DCTSIZE2 rather than
650*7dd7cddfSDavid du Colombier"8" and "64").
651*7dd7cddfSDavid du Colombier
652*7dd7cddfSDavid du ColombierThe contents of a coefficient block may be in either "natural" or zigzagged
653*7dd7cddfSDavid du Colombierorder, and may be true values or divided by the quantization coefficients,
654*7dd7cddfSDavid du Colombierdepending on where the block is in the processing pipeline.  In the current
655*7dd7cddfSDavid du Colombierlibrary, coefficient blocks are kept in natural order everywhere; the entropy
656*7dd7cddfSDavid du Colombiercodecs zigzag or dezigzag the data as it is written or read.  The blocks
657*7dd7cddfSDavid du Colombiercontain quantized coefficients everywhere outside the DCT/IDCT subsystems.
658*7dd7cddfSDavid du Colombier(This latter decision may need to be revisited to support variable
659*7dd7cddfSDavid du Colombierquantization a la JPEG Part 3.)
660*7dd7cddfSDavid du Colombier
661*7dd7cddfSDavid du ColombierNotice that the allocation unit is now a row of 8x8 blocks, corresponding to
662*7dd7cddfSDavid du Colombiereight rows of samples.  Otherwise the structure is much the same as for
663*7dd7cddfSDavid du Colombiersamples, and for the same reasons.
664*7dd7cddfSDavid du Colombier
665*7dd7cddfSDavid du ColombierOn machines where malloc() can't handle a request bigger than 64Kb, this data
666*7dd7cddfSDavid du Colombierstructure limits us to rows of less than 512 JBLOCKs, or a picture width of
667*7dd7cddfSDavid du Colombier4000+ pixels.  This seems an acceptable restriction.
668*7dd7cddfSDavid du Colombier
669*7dd7cddfSDavid du Colombier
670*7dd7cddfSDavid du ColombierOn 80x86 machines, the bottom-level pointer types (JSAMPROW and JBLOCKROW)
671*7dd7cddfSDavid du Colombiermust be declared as "far" pointers, but the upper levels can be "near"
672*7dd7cddfSDavid du Colombier(implying that the pointer lists are allocated in the DS segment).
673*7dd7cddfSDavid du ColombierWe use a #define symbol FAR, which expands to the "far" keyword when
674*7dd7cddfSDavid du Colombiercompiling on 80x86 machines and to nothing elsewhere.
675*7dd7cddfSDavid du Colombier
676*7dd7cddfSDavid du Colombier
677*7dd7cddfSDavid du Colombier*** Suspendable processing ***
678*7dd7cddfSDavid du Colombier
679*7dd7cddfSDavid du ColombierIn some applications it is desirable to use the JPEG library as an
680*7dd7cddfSDavid du Colombierincremental, memory-to-memory filter.  In this situation the data source or
681*7dd7cddfSDavid du Colombierdestination may be a limited-size buffer, and we can't rely on being able to
682*7dd7cddfSDavid du Colombierempty or refill the buffer at arbitrary times.  Instead the application would
683*7dd7cddfSDavid du Colombierlike to have control return from the library at buffer overflow/underrun, and
684*7dd7cddfSDavid du Colombierthen resume compression or decompression at a later time.
685*7dd7cddfSDavid du Colombier
686*7dd7cddfSDavid du ColombierThis scenario is supported for simple cases.  (For anything more complex, we
687*7dd7cddfSDavid du Colombierrecommend that the application "bite the bullet" and develop real multitasking
688*7dd7cddfSDavid du Colombiercapability.)  The libjpeg.doc file goes into more detail about the usage and
689*7dd7cddfSDavid du Colombierlimitations of this capability; here we address the implications for library
690*7dd7cddfSDavid du Colombierstructure.
691*7dd7cddfSDavid du Colombier
692*7dd7cddfSDavid du ColombierThe essence of the problem is that the entropy codec (coder or decoder) must
693*7dd7cddfSDavid du Colombierbe prepared to stop at arbitrary times.  In turn, the controllers that call
694*7dd7cddfSDavid du Colombierthe entropy codec must be able to stop before having produced or consumed all
695*7dd7cddfSDavid du Colombierthe data that they normally would handle in one call.  That part is reasonably
696*7dd7cddfSDavid du Colombierstraightforward: we make the controller call interfaces include "progress
697*7dd7cddfSDavid du Colombiercounters" which indicate the number of data chunks successfully processed, and
698*7dd7cddfSDavid du Colombierwe require callers to test the counter rather than just assume all of the data
699*7dd7cddfSDavid du Colombierwas processed.
700*7dd7cddfSDavid du Colombier
701*7dd7cddfSDavid du ColombierRather than trying to restart at an arbitrary point, the current Huffman
702*7dd7cddfSDavid du Colombiercodecs are designed to restart at the beginning of the current MCU after a
703*7dd7cddfSDavid du Colombiersuspension due to buffer overflow/underrun.  At the start of each call, the
704*7dd7cddfSDavid du Colombiercodec's internal state is loaded from permanent storage (in the JPEG object
705*7dd7cddfSDavid du Colombierstructures) into local variables.  On successful completion of the MCU, the
706*7dd7cddfSDavid du Colombierpermanent state is updated.  (This copying is not very expensive, and may even
707*7dd7cddfSDavid du Colombierlead to *improved* performance if the local variables can be registerized.)
708*7dd7cddfSDavid du ColombierIf a suspension occurs, the codec simply returns without updating the state,
709*7dd7cddfSDavid du Colombierthus effectively reverting to the start of the MCU.  Note that this implies
710*7dd7cddfSDavid du Colombierleaving some data unprocessed in the source/destination buffer (ie, the
711*7dd7cddfSDavid du Colombiercompressed partial MCU).  The data source/destination module interfaces are
712*7dd7cddfSDavid du Colombierspecified so as to make this possible.  This also implies that the data buffer
713*7dd7cddfSDavid du Colombiermust be large enough to hold a worst-case compressed MCU; a couple thousand
714*7dd7cddfSDavid du Colombierbytes should be enough.
715*7dd7cddfSDavid du Colombier
716*7dd7cddfSDavid du ColombierIn a successive-approximation AC refinement scan, the progressive Huffman
717*7dd7cddfSDavid du Colombierdecoder has to be able to undo assignments of newly nonzero coefficients if it
718*7dd7cddfSDavid du Colombiersuspends before the MCU is complete, since decoding requires distinguishing
719*7dd7cddfSDavid du Colombierpreviously-zero and previously-nonzero coefficients.  This is a bit tedious
720*7dd7cddfSDavid du Colombierbut probably won't have much effect on performance.  Other variants of Huffman
721*7dd7cddfSDavid du Colombierdecoding need not worry about this, since they will just store the same values
722*7dd7cddfSDavid du Colombieragain if forced to repeat the MCU.
723*7dd7cddfSDavid du Colombier
724*7dd7cddfSDavid du ColombierThis approach would probably not work for an arithmetic codec, since its
725*7dd7cddfSDavid du Colombiermodifiable state is quite large and couldn't be copied cheaply.  Instead it
726*7dd7cddfSDavid du Colombierwould have to suspend and resume exactly at the point of the buffer end.
727*7dd7cddfSDavid du Colombier
728*7dd7cddfSDavid du ColombierThe JPEG marker reader is designed to cope with suspension at an arbitrary
729*7dd7cddfSDavid du Colombierpoint.  It does so by backing up to the start of the marker parameter segment,
730*7dd7cddfSDavid du Colombierso the data buffer must be big enough to hold the largest marker of interest.
731*7dd7cddfSDavid du ColombierAgain, a couple KB should be adequate.  (A special "skip" convention is used
732*7dd7cddfSDavid du Colombierto bypass COM and APPn markers, so these can be larger than the buffer size
733*7dd7cddfSDavid du Colombierwithout causing problems; otherwise a 64K buffer would be needed in the worst
734*7dd7cddfSDavid du Colombiercase.)
735*7dd7cddfSDavid du Colombier
736*7dd7cddfSDavid du ColombierThe JPEG marker writer currently does *not* cope with suspension.  I feel that
737*7dd7cddfSDavid du Colombierthis is not necessary; it is much easier simply to require the application to
738*7dd7cddfSDavid du Colombierensure there is enough buffer space before starting.  (An empty 2K buffer is
739*7dd7cddfSDavid du Colombiermore than sufficient for the header markers; and ensuring there are a dozen or
740*7dd7cddfSDavid du Colombiertwo bytes available before calling jpeg_finish_compress() will suffice for the
741*7dd7cddfSDavid du Colombiertrailer.)  This would not work for writing multi-scan JPEG files, but
742*7dd7cddfSDavid du Colombierwe simply do not intend to support that capability with suspension.
743*7dd7cddfSDavid du Colombier
744*7dd7cddfSDavid du Colombier
745*7dd7cddfSDavid du Colombier*** Memory manager services ***
746*7dd7cddfSDavid du Colombier
747*7dd7cddfSDavid du ColombierThe JPEG library's memory manager controls allocation and deallocation of
748*7dd7cddfSDavid du Colombiermemory, and it manages large "virtual" data arrays on machines where the
749*7dd7cddfSDavid du Colombieroperating system does not provide virtual memory.  Note that the same
750*7dd7cddfSDavid du Colombiermemory manager serves both compression and decompression operations.
751*7dd7cddfSDavid du Colombier
752*7dd7cddfSDavid du ColombierIn all cases, allocated objects are tied to a particular compression or
753*7dd7cddfSDavid du Colombierdecompression master record, and they will be released when that master
754*7dd7cddfSDavid du Colombierrecord is destroyed.
755*7dd7cddfSDavid du Colombier
756*7dd7cddfSDavid du ColombierThe memory manager does not provide explicit deallocation of objects.
757*7dd7cddfSDavid du ColombierInstead, objects are created in "pools" of free storage, and a whole pool
758*7dd7cddfSDavid du Colombiercan be freed at once.  This approach helps prevent storage-leak bugs, and
759*7dd7cddfSDavid du Colombierit speeds up operations whenever malloc/free are slow (as they often are).
760*7dd7cddfSDavid du ColombierThe pools can be regarded as lifetime identifiers for objects.  Two
761*7dd7cddfSDavid du Colombierpools/lifetimes are defined:
762*7dd7cddfSDavid du Colombier  * JPOOL_PERMANENT	lasts until master record is destroyed
763*7dd7cddfSDavid du Colombier  * JPOOL_IMAGE		lasts until done with image (JPEG datastream)
764*7dd7cddfSDavid du ColombierPermanent lifetime is used for parameters and tables that should be carried
765*7dd7cddfSDavid du Colombieracross from one datastream to another; this includes all application-visible
766*7dd7cddfSDavid du Colombierparameters.  Image lifetime is used for everything else.  (A third lifetime,
767*7dd7cddfSDavid du ColombierJPOOL_PASS = one processing pass, was originally planned.  However it was
768*7dd7cddfSDavid du Colombierdropped as not being worthwhile.  The actual usage patterns are such that the
769*7dd7cddfSDavid du Colombierpeak memory usage would be about the same anyway; and having per-pass storage
770*7dd7cddfSDavid du Colombiersubstantially complicates the virtual memory allocation rules --- see below.)
771*7dd7cddfSDavid du Colombier
772*7dd7cddfSDavid du ColombierThe memory manager deals with three kinds of object:
773*7dd7cddfSDavid du Colombier1. "Small" objects.  Typically these require no more than 10K-20K total.
774*7dd7cddfSDavid du Colombier2. "Large" objects.  These may require tens to hundreds of K depending on
775*7dd7cddfSDavid du Colombier   image size.  Semantically they behave the same as small objects, but we
776*7dd7cddfSDavid du Colombier   distinguish them for two reasons:
777*7dd7cddfSDavid du Colombier     * On MS-DOS machines, large objects are referenced by FAR pointers,
778*7dd7cddfSDavid du Colombier       small objects by NEAR pointers.
779*7dd7cddfSDavid du Colombier     * Pool allocation heuristics may differ for large and small objects.
780*7dd7cddfSDavid du Colombier   Note that individual "large" objects cannot exceed the size allowed by
781*7dd7cddfSDavid du Colombier   type size_t, which may be 64K or less on some machines.
782*7dd7cddfSDavid du Colombier3. "Virtual" objects.  These are large 2-D arrays of JSAMPLEs or JBLOCKs
783*7dd7cddfSDavid du Colombier   (typically large enough for the entire image being processed).  The
784*7dd7cddfSDavid du Colombier   memory manager provides stripwise access to these arrays.  On machines
785*7dd7cddfSDavid du Colombier   without virtual memory, the rest of the array may be swapped out to a
786*7dd7cddfSDavid du Colombier   temporary file.
787*7dd7cddfSDavid du Colombier
788*7dd7cddfSDavid du Colombier(Note: JSAMPARRAY and JBLOCKARRAY data structures are a combination of large
789*7dd7cddfSDavid du Colombierobjects for the data proper and small objects for the row pointers.  For
790*7dd7cddfSDavid du Colombierconvenience and speed, the memory manager provides single routines to create
791*7dd7cddfSDavid du Colombierthese structures.  Similarly, virtual arrays include a small control block
792*7dd7cddfSDavid du Colombierand a JSAMPARRAY or JBLOCKARRAY working buffer, all created with one call.)
793*7dd7cddfSDavid du Colombier
794*7dd7cddfSDavid du ColombierIn the present implementation, virtual arrays are only permitted to have image
795*7dd7cddfSDavid du Colombierlifespan.  (Permanent lifespan would not be reasonable, and pass lifespan is
796*7dd7cddfSDavid du Colombiernot very useful since a virtual array's raison d'etre is to store data for
797*7dd7cddfSDavid du Colombiermultiple passes through the image.)  We also expect that only "small" objects
798*7dd7cddfSDavid du Colombierwill be given permanent lifespan, though this restriction is not required by
799*7dd7cddfSDavid du Colombierthe memory manager.
800*7dd7cddfSDavid du Colombier
801*7dd7cddfSDavid du ColombierIn a non-virtual-memory machine, some performance benefit can be gained by
802*7dd7cddfSDavid du Colombiermaking the in-memory buffers for virtual arrays be as large as possible.
803*7dd7cddfSDavid du Colombier(For small images, the buffers might fit entirely in memory, so blind
804*7dd7cddfSDavid du Colombierswapping would be very wasteful.)  The memory manager will adjust the height
805*7dd7cddfSDavid du Colombierof the buffers to fit within a prespecified maximum memory usage.  In order
806*7dd7cddfSDavid du Colombierto do this in a reasonably optimal fashion, the manager needs to allocate all
807*7dd7cddfSDavid du Colombierof the virtual arrays at once.  Therefore, there isn't a one-step allocation
808*7dd7cddfSDavid du Colombierroutine for virtual arrays; instead, there is a "request" routine that simply
809*7dd7cddfSDavid du Colombierallocates the control block, and a "realize" routine (called just once) that
810*7dd7cddfSDavid du Colombierdetermines space allocation and creates all of the actual buffers.  The
811*7dd7cddfSDavid du Colombierrealize routine must allow for space occupied by non-virtual large objects.
812*7dd7cddfSDavid du Colombier(We don't bother to factor in the space needed for small objects, on the
813*7dd7cddfSDavid du Colombiergrounds that it isn't worth the trouble.)
814*7dd7cddfSDavid du Colombier
815*7dd7cddfSDavid du ColombierTo support all this, we establish the following protocol for doing business
816*7dd7cddfSDavid du Colombierwith the memory manager:
817*7dd7cddfSDavid du Colombier  1. Modules must request virtual arrays (which may have only image lifespan)
818*7dd7cddfSDavid du Colombier     during the initial setup phase, i.e., in their jinit_xxx routines.
819*7dd7cddfSDavid du Colombier  2. All "large" objects (including JSAMPARRAYs and JBLOCKARRAYs) must also be
820*7dd7cddfSDavid du Colombier     allocated during initial setup.
821*7dd7cddfSDavid du Colombier  3. realize_virt_arrays will be called at the completion of initial setup.
822*7dd7cddfSDavid du Colombier     The above conventions ensure that sufficient information is available
823*7dd7cddfSDavid du Colombier     for it to choose a good size for virtual array buffers.
824*7dd7cddfSDavid du ColombierSmall objects of any lifespan may be allocated at any time.  We expect that
825*7dd7cddfSDavid du Colombierthe total space used for small objects will be small enough to be negligible
826*7dd7cddfSDavid du Colombierin the realize_virt_arrays computation.
827*7dd7cddfSDavid du Colombier
828*7dd7cddfSDavid du ColombierIn a virtual-memory machine, we simply pretend that the available space is
829*7dd7cddfSDavid du Colombierinfinite, thus causing realize_virt_arrays to decide that it can allocate all
830*7dd7cddfSDavid du Colombierthe virtual arrays as full-size in-memory buffers.  The overhead of the
831*7dd7cddfSDavid du Colombiervirtual-array access protocol is very small when no swapping occurs.
832*7dd7cddfSDavid du Colombier
833*7dd7cddfSDavid du ColombierA virtual array can be specified to be "pre-zeroed"; when this flag is set,
834*7dd7cddfSDavid du Colombiernever-yet-written sections of the array are set to zero before being made
835*7dd7cddfSDavid du Colombieravailable to the caller.  If this flag is not set, never-written sections
836*7dd7cddfSDavid du Colombierof the array contain garbage.  (This feature exists primarily because the
837*7dd7cddfSDavid du Colombierequivalent logic would otherwise be needed in jdcoefct.c for progressive
838*7dd7cddfSDavid du ColombierJPEG mode; we may as well make it available for possible other uses.)
839*7dd7cddfSDavid du Colombier
840*7dd7cddfSDavid du ColombierThe first write pass on a virtual array is required to occur in top-to-bottom
841*7dd7cddfSDavid du Colombierorder; read passes, as well as any write passes after the first one, may
842*7dd7cddfSDavid du Colombieraccess the array in any order.  This restriction exists partly to simplify
843*7dd7cddfSDavid du Colombierthe virtual array control logic, and partly because some file systems may not
844*7dd7cddfSDavid du Colombiersupport seeking beyond the current end-of-file in a temporary file.  The main
845*7dd7cddfSDavid du Colombierimplication of this restriction is that rearrangement of rows (such as
846*7dd7cddfSDavid du Colombierconverting top-to-bottom data order to bottom-to-top) must be handled while
847*7dd7cddfSDavid du Colombierreading data out of the virtual array, not while putting it in.
848*7dd7cddfSDavid du Colombier
849*7dd7cddfSDavid du Colombier
850*7dd7cddfSDavid du Colombier*** Memory manager internal structure ***
851*7dd7cddfSDavid du Colombier
852*7dd7cddfSDavid du ColombierTo isolate system dependencies as much as possible, we have broken the
853*7dd7cddfSDavid du Colombiermemory manager into two parts.  There is a reasonably system-independent
854*7dd7cddfSDavid du Colombier"front end" (jmemmgr.c) and a "back end" that contains only the code
855*7dd7cddfSDavid du Colombierlikely to change across systems.  All of the memory management methods
856*7dd7cddfSDavid du Colombieroutlined above are implemented by the front end.  The back end provides
857*7dd7cddfSDavid du Colombierthe following routines for use by the front end (none of these routines
858*7dd7cddfSDavid du Colombierare known to the rest of the JPEG code):
859*7dd7cddfSDavid du Colombier
860*7dd7cddfSDavid du Colombierjpeg_mem_init, jpeg_mem_term	system-dependent initialization/shutdown
861*7dd7cddfSDavid du Colombier
862*7dd7cddfSDavid du Colombierjpeg_get_small, jpeg_free_small	interface to malloc and free library routines
863*7dd7cddfSDavid du Colombier				(or their equivalents)
864*7dd7cddfSDavid du Colombier
865*7dd7cddfSDavid du Colombierjpeg_get_large, jpeg_free_large	interface to FAR malloc/free in MSDOS machines;
866*7dd7cddfSDavid du Colombier				else usually the same as
867*7dd7cddfSDavid du Colombier				jpeg_get_small/jpeg_free_small
868*7dd7cddfSDavid du Colombier
869*7dd7cddfSDavid du Colombierjpeg_mem_available		estimate available memory
870*7dd7cddfSDavid du Colombier
871*7dd7cddfSDavid du Colombierjpeg_open_backing_store		create a backing-store object
872*7dd7cddfSDavid du Colombier
873*7dd7cddfSDavid du Colombierread_backing_store,		manipulate a backing-store object
874*7dd7cddfSDavid du Colombierwrite_backing_store,
875*7dd7cddfSDavid du Colombierclose_backing_store
876*7dd7cddfSDavid du Colombier
877*7dd7cddfSDavid du ColombierOn some systems there will be more than one type of backing-store object
878*7dd7cddfSDavid du Colombier(specifically, in MS-DOS a backing store file might be an area of extended
879*7dd7cddfSDavid du Colombiermemory as well as a disk file).  jpeg_open_backing_store is responsible for
880*7dd7cddfSDavid du Colombierchoosing how to implement a given object.  The read/write/close routines
881*7dd7cddfSDavid du Colombierare method pointers in the structure that describes a given object; this
882*7dd7cddfSDavid du Colombierlets them be different for different object types.
883*7dd7cddfSDavid du Colombier
884*7dd7cddfSDavid du ColombierIt may be necessary to ensure that backing store objects are explicitly
885*7dd7cddfSDavid du Colombierreleased upon abnormal program termination.  For example, MS-DOS won't free
886*7dd7cddfSDavid du Colombierextended memory by itself.  To support this, we will expect the main program
887*7dd7cddfSDavid du Colombieror surrounding application to arrange to call self_destruct (typically via
888*7dd7cddfSDavid du Colombierjpeg_destroy) upon abnormal termination.  This may require a SIGINT signal
889*7dd7cddfSDavid du Colombierhandler or equivalent.  We don't want to have the back end module install its
890*7dd7cddfSDavid du Colombierown signal handler, because that would pre-empt the surrounding application's
891*7dd7cddfSDavid du Colombierability to control signal handling.
892*7dd7cddfSDavid du Colombier
893*7dd7cddfSDavid du ColombierThe IJG distribution includes several memory manager back end implementations.
894*7dd7cddfSDavid du ColombierUsually the same back end should be suitable for all applications on a given
895*7dd7cddfSDavid du Colombiersystem, but it is possible for an application to supply its own back end at
896*7dd7cddfSDavid du Colombierneed.
897*7dd7cddfSDavid du Colombier
898*7dd7cddfSDavid du Colombier
899*7dd7cddfSDavid du Colombier*** Implications of DNL marker ***
900*7dd7cddfSDavid du Colombier
901*7dd7cddfSDavid du ColombierSome JPEG files may use a DNL marker to postpone definition of the image
902*7dd7cddfSDavid du Colombierheight (this would be useful for a fax-like scanner's output, for instance).
903*7dd7cddfSDavid du ColombierIn these files the SOF marker claims the image height is 0, and you only
904*7dd7cddfSDavid du Colombierfind out the true image height at the end of the first scan.
905*7dd7cddfSDavid du Colombier
906*7dd7cddfSDavid du ColombierWe could read these files as follows:
907*7dd7cddfSDavid du Colombier1. Upon seeing zero image height, replace it by 65535 (the maximum allowed).
908*7dd7cddfSDavid du Colombier2. When the DNL is found, update the image height in the global image
909*7dd7cddfSDavid du Colombier   descriptor.
910*7dd7cddfSDavid du ColombierThis implies that control modules must avoid making copies of the image
911*7dd7cddfSDavid du Colombierheight, and must re-test for termination after each MCU row.  This would
912*7dd7cddfSDavid du Colombierbe easy enough to do.
913*7dd7cddfSDavid du Colombier
914*7dd7cddfSDavid du ColombierIn cases where image-size data structures are allocated, this approach will
915*7dd7cddfSDavid du Colombierresult in very inefficient use of virtual memory or much-larger-than-necessary
916*7dd7cddfSDavid du Colombiertemporary files.  This seems acceptable for something that probably won't be a
917*7dd7cddfSDavid du Colombiermainstream usage.  People might have to forgo use of memory-hogging options
918*7dd7cddfSDavid du Colombier(such as two-pass color quantization or noninterleaved JPEG files) if they
919*7dd7cddfSDavid du Colombierwant efficient conversion of such files.  (One could improve efficiency by
920*7dd7cddfSDavid du Colombierdemanding a user-supplied upper bound for the height, less than 65536; in most
921*7dd7cddfSDavid du Colombiercases it could be much less.)
922*7dd7cddfSDavid du Colombier
923*7dd7cddfSDavid du ColombierThe standard also permits the SOF marker to overestimate the image height,
924*7dd7cddfSDavid du Colombierwith a DNL to give the true, smaller height at the end of the first scan.
925*7dd7cddfSDavid du ColombierThis would solve the space problems if the overestimate wasn't too great.
926*7dd7cddfSDavid du ColombierHowever, it implies that you don't even know whether DNL will be used.
927*7dd7cddfSDavid du Colombier
928*7dd7cddfSDavid du ColombierThis leads to a couple of very serious objections:
929*7dd7cddfSDavid du Colombier1. Testing for a DNL marker must occur in the inner loop of the decompressor's
930*7dd7cddfSDavid du Colombier   Huffman decoder; this implies a speed penalty whether the feature is used
931*7dd7cddfSDavid du Colombier   or not.
932*7dd7cddfSDavid du Colombier2. There is no way to hide the last-minute change in image height from an
933*7dd7cddfSDavid du Colombier   application using the decoder.  Thus *every* application using the IJG
934*7dd7cddfSDavid du Colombier   library would suffer a complexity penalty whether it cared about DNL or
935*7dd7cddfSDavid du Colombier   not.
936*7dd7cddfSDavid du ColombierWe currently do not support DNL because of these problems.
937*7dd7cddfSDavid du Colombier
938*7dd7cddfSDavid du ColombierA different approach is to insist that DNL-using files be preprocessed by a
939*7dd7cddfSDavid du Colombierseparate program that reads ahead to the DNL, then goes back and fixes the SOF
940*7dd7cddfSDavid du Colombiermarker.  This is a much simpler solution and is probably far more efficient.
941*7dd7cddfSDavid du ColombierEven if one wants piped input, buffering the first scan of the JPEG file needs
942*7dd7cddfSDavid du Colombiera lot smaller temp file than is implied by the maximum-height method.  For
943*7dd7cddfSDavid du Colombierthis approach we'd simply treat DNL as a no-op in the decompressor (at most,
944*7dd7cddfSDavid du Colombiercheck that it matches the SOF image height).
945*7dd7cddfSDavid du Colombier
946*7dd7cddfSDavid du ColombierWe will not worry about making the compressor capable of outputting DNL.
947*7dd7cddfSDavid du ColombierSomething similar to the first scheme above could be applied if anyone ever
948*7dd7cddfSDavid du Colombierwants to make that work.
949