Lines Matching +full:pre +full:-
1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
3 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
6 <!--
15 Copyright (c) 2000-2004 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
16 Copyright (c) 2002-2012 Karl Waclawek <karl@waclawek.net>
17 Copyright (c) 2017-2024 Sebastian Pipping <sebastian@pipping.org>
20 Copyright (c) 2021 Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
44 -->
47 <meta http-equiv="Content-Style-Type" content="text/css" />
63 other open-source XML parsers.</p>
66 groff (an nroff look-alike), Jade (an implementation of ISO's DSSSL
156 <a href="#attack-protection">Attack Protection</a>
195 <p>Expat is a stream-oriented parser. You register callback (or
241 <pre class="eg">
260 </pre>
264 <pre class="eg">
267 Depth--;
269 </pre>
289 <pre class="eg">
301 </pre>
322 cmake -G"Visual Studio 16 2019" -DCMAKE_BUILD_TYPE=RelWithDebInfo .
326 contains the "expat.h" include file and a pre-built DLL.</p>
337 <pre class="eg">
341 </pre>
344 only one we'll mention here is the <code>--prefix</code> option. You
346 the <code>--help</code> option.</p>
351 give the option, <code>--prefix=/home/me/mystuff</code>, then the
356 <h3>Configuring Expat Using the Pre-Processor</h3>
359 pre-processor definitions. The symbols are:</p>
361 <dl class="cpp-symbols">
366 <a href="https://www.w3.org/TR/2006/REC-xml-20060816/#sec-physical-struct">general entities</a>
382 (except the <a href="https://www.w3.org/TR/2006/REC-xml-20060816/#sec-predefined-ent">predefined five</a>:
384 with a self-reference:
390 <dd>Include support for using and reporting DTD-based content. If
404 "https://www.w3.org/TR/REC-xml-names/" >Namespaces in XML</a></cite>
409 encoded in UTF-16 using wide characters of the type
461 usually be done with the <code>-lexpat</code> argument. Otherwise,
467 <p>On a Unix-based system, here's what a Makefile might look like when
470 <pre class="eg">
473 LIBS= -lexpat
475 $(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
476 </pre>
481 <pre class="eg">
483 CFLAGS= -I/home/me/mystuff/include
485 LIBS= -L/home/me/mystuff/lib -lexpat
487 $(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
488 </pre>
504 constructing a parser for a top-level document. The object returned
543 <pre class="eg">
546 info->skip = 0;
547 info->depth = 1;
555 if (! inf->skip) {
557 inf->skip = inf->depth;
563 inf->depth++;
570 inf->depth--;
572 if (! inf->skip)
575 if (inf->skip == inf->depth)
576 inf->skip = 0;
578 </pre>
610 common first-time mistake with any of the event-oriented interfaces to
619 <!-- XXX example needed here -->
625 the value of the <code>version</code> pseudo-attribute in the XML
635 <pre class="eg">
660 </pre>
681 are not well-formed when namespace processing is enabled, and will
687 >XML_SetReturnNSTriplet</a></code> has been called with a non-zero
713 to recognize UTF-8 and UTF-16 (1 and 2 byte encodings of Unicode),
717 <pre>
718 <?xml version="1.0" encoding="ISO-8859-2"?>
719 </pre>
723 <pre>
725 </pre>
732 <p><a name="builtin_encodings"></a>There are four built-in encodings
735 <li>UTF-8</li>
736 <li>UTF-16</li>
737 <li>ISO-8859-1</li>
738 <li>US-ASCII</li>
756 <li>Every ASCII character that can appear in a well-formed XML document
761 equal to 65535 (0xFFFF). <em>This does not apply to the built-in support
762 for UTF-16 and UTF-8</em></li>
771 array. A -1 in this array indicates a malformed byte. If the value is
772 -2, -3, or -4, then the byte is the beginning of a 2, 3, or 4 byte
773 sequence respectively. Multi-byte sequences are sent to the convert
775 function should return the Unicode scalar value for the sequence or -1
780 it passes to the handlers are always encoded in UTF-8 or UTF-16
822 <h3 id="stop-resume">Temporarily Stopping Parsing</h3>
841 if an application-domain error is found in the XML being parsed or if
853 the rough structure (in pseudo-code):</p>
855 <pre class="pseudocode">
871 </pre>
878 function mentioned in the pseudo-code above:</p>
880 <pre class="eg">
885 been an error), or the parse is stopped. Return non-zero when
918 </pre>
926 <pre class="eg">
929 non-zero when the parse is suspended.
947 </pre>
949 <p>Now that we've seen what a mess the top-level parsing loop can
962 <!-- XXX really need more here -->
966 <!-- ================================================================ -->
973 <pre class="fcndec">
976 </pre>
979 Construct a new parser. If encoding is non-<code>NULL</code>, it specifies a
981 encoding declaration. There are four built-in encodings:
984 <li>US-ASCII</li>
985 <li>UTF-8</li>
986 <li>UTF-16</li>
987 <li>ISO-8859-1</li>
995 <pre class="fcndec">
999 </pre>
1007 in XML. For instance, <code>'\xFF'</code> is not legal in UTF-8, and
1008 <code>'\xFFFF'</code> is not legal in UTF-16. There is a special case when
1010 the local part will be concatenated without any separator - this is intended
1019 be ready to receive namespace URIs containing non-URI characters.
1023 <pre class="fcndec">
1028 </pre>
1029 <pre class="signature">
1035 </pre>
1040 non-<code>NULL</code>, then namespace processing is enabled in the created parser
1046 <pre class="fcndec">
1051 </pre>
1063 <pre class="fcndec">
1066 </pre>
1073 <pre class="fcndec">
1077 </pre>
1083 state is re-initialized except for the values of ns and ns_triplets.
1118 <pre class="fcndec">
1124 </pre>
1125 <pre class="signature">
1130 </pre>
1136 that <code>s</code> doesn't have to be null-terminated. It also means that
1181 <pre class="fcndec">
1186 </pre>
1202 <pre class="fcndec">
1206 </pre>
1215 <pre class="eg">
1235 </pre>
1239 <pre class="fcndec">
1243 </pre>
1249 call-back handler, except when aborting (when <code>resumable</code>
1251 call-backs may still follow because they would otherwise get
1259 while making multiple call-backs on a contiguous chunk of characters,</li>
1264 call-backs, except when parsing an external parameter entity and
1280 not being handled appropriately; see <a href= "#stop-resume"
1308 <pre class="fcndec">
1311 </pre>
1315 within a handler call-back. Returns same status codes as <code><a
1334 <pre class="fcndec">
1338 </pre>
1339 <pre class="signature">
1351 </pre>
1378 The former implies UTF-8 encoding, the latter two imply UTF-16 encoding.
1384 <pre class="setter">
1388 </pre>
1389 <pre class="signature">
1394 </pre>
1406 <pre class="setter">
1410 </pre>
1411 <pre class="signature">
1415 </pre>
1422 <pre class="setter">
1427 </pre>
1433 <pre class="setter">
1437 </pre>
1438 <pre class="signature">
1443 </pre>
1445 is <em>NOT null-terminated</em>. You have to use the length argument
1450 may <em>NOT immediately</em> terminate call-backs if the parser is currently
1451 processing such a single block of contiguous markup-free text, as the parser
1457 <pre class="setter">
1461 </pre>
1462 <pre class="signature">
1468 </pre>
1476 <pre class="setter">
1480 </pre>
1481 <pre class="signature">
1485 </pre>
1492 <pre class="setter">
1496 </pre>
1497 <pre class="signature">
1500 </pre>
1506 <pre class="setter">
1510 </pre>
1511 <pre class="signature">
1514 </pre>
1520 <pre class="setter">
1525 </pre>
1531 <pre class="setter">
1535 </pre>
1536 <pre class="signature">
1541 </pre>
1548 that they will be encoded in UTF-8 or UTF-16. Line boundaries are not
1563 <pre class="setter">
1567 </pre>
1568 <pre class="signature">
1573 </pre>
1584 <pre class="setter">
1588 </pre>
1589 <pre class="signature">
1596 </pre>
1638 <pre class="fcndec">
1642 </pre>
1665 <pre class="setter">
1669 </pre>
1670 <pre class="signature">
1675 </pre>
1684 <p>The <code>is_parameter_entity</code> argument will be non-zero for
1693 <pre class="setter">
1698 </pre>
1699 <pre class="signature">
1711 </pre>
1727 value is -1, then that byte is invalid as the initial byte in a sequence.
1728 If the value is -n, where n is an integer > 1, then n is the number of
1730 call to the function pointed at by convert. This function may return -1
1734 string s is <em>NOT</em> null-terminated and points at the sequence of
1743 <pre class="setter">
1747 </pre>
1748 <pre class="signature">
1753 </pre>
1762 <pre class="setter">
1766 </pre>
1767 <pre class="signature">
1771 </pre>
1780 <pre class="setter">
1785 </pre>
1791 <pre class="setter">
1795 </pre>
1796 <pre class="signature">
1802 </pre>
1808 contain -1, 0, or 1 indicating respectively that there was no
1815 <pre class="setter">
1819 </pre>
1820 <pre class="signature">
1827 </pre>
1831 will be non-zero if the DOCTYPE declaration has an internal subset.</p>
1836 <pre class="setter">
1840 </pre>
1841 <pre class="signature">
1844 </pre>
1851 <pre class="setter">
1856 </pre>
1862 <pre class="setter">
1866 </pre>
1867 <pre class="signature">
1872 </pre>
1873 <pre class="signature">
1899 </pre>
1936 <pre class="setter">
1940 </pre>
1941 <pre class="signature">
1949 </pre>
1964 <code>isrequired</code>, but they will have the non-<code>NULL</code> fixed value
1970 <pre class="setter">
1974 </pre>
1975 <pre class="signature">
1986 </pre>
1988 The <code>is_parameter_entity</code> argument will be non-zero in the
1992 <code>value</code> will be non-<code>NULL</code> and <code>systemId</code>,
1994 The value string is <em>not</em> null-terminated; the length is
1997 legal to have zero-length values. Instead check for whether or not
1999 argument will have a non-<code>NULL</code> value only for unparsed entity
2005 <pre class="setter">
2009 </pre>
2010 <pre class="signature">
2018 </pre>
2022 <div id="eg"><pre>
2024 </pre></div>
2032 <pre class="setter">
2036 </pre>
2037 <pre class="signature">
2044 </pre>
2050 <pre class="setter">
2054 </pre>
2055 <pre class="signature">
2058 </pre>
2086 <pre class="fcndec">
2089 </pre>
2095 <pre class="fcndec">
2098 </pre>
2106 <pre class="fcndec">
2109 </pre>
2118 <pre class="fcndec">
2121 </pre>
2128 <pre class="fcndec">
2131 </pre>
2138 <pre class="fcndec">
2141 </pre>
2145 entity and for the end-tag event for empty element tags (the latter can
2146 be used to distinguish empty-element tags from empty elements using
2151 <pre class="fcndec">
2156 </pre>
2177 <h3><a name="attack-protection">Attack Protection</a><a name="billion-laughs"></a></h3>
2180 <pre class="fcndec">
2185 </pre>
2197 <pre>
2199 </pre>
2206 <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without any parent parsers) and</li>
2207 <li><code>maximumAmplificationFactor</code> must be non-<code>NaN</code> and greater than or equal to <code>1.0</code>.</li>
2212 If you ever need to increase this value for non-attack payload,
2230 <pre class="fcndec">
2235 </pre>
2248 <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without any parent parsers).</li>
2253 If you ever need to increase this value for non-attack payload,
2266 <pre class="fcndec">
2270 </pre>
2292 <pre class="fcndec">
2296 </pre>
2308 <pre class="fcndec">
2311 </pre>
2318 <pre class="fcndec">
2321 </pre>
2330 <pre class="fcndec">
2334 </pre>
2343 <pre class="fcndec">
2346 </pre>
2352 <pre class="fcndec">
2355 </pre>
2369 <pre class="fcndec">
2372 </pre>
2376 >XML_StartElementHandler</a></code>, or -1 if there is no ID
2382 <pre class="fcndec">
2385 </pre>
2386 <pre class="signature">
2393 </pre>
2398 in the start-tag rather than defaulted. Each attribute/value pair counts
2404 <pre class="fcndec">
2408 </pre>
2411 passing a non-<code>NULL</code> encoding argument to the parser creation functions.
2420 <pre class="fcndec">
2424 </pre>
2441 <pre class="fcndec">
2445 </pre>
2452 <p><b>Note:</b> This call is optional, as the parser will auto-generate
2461 <pre class="fcndec">
2464 </pre>
2469 external subset in their DOCTYPE declaration, the application-provided
2472 application-provided subset will be parsed, but the
2479 <p>The application-provided external subset is read by calling the
2499 <pre class="fcndec">
2503 </pre>
2511 non-zero, then afterwards namespace qualified names (that is qualified
2522 <pre class="fcndec">
2525 </pre>
2537 <pre class="fcndec">
2540 </pre>
2546 <pre class="fcndec">
2549 </pre>
2550 <pre class="signature">
2556 </pre>
2559 Some macros are also defined that support compile-time tests of the
2571 <pre class="fcndec">
2574 </pre>
2575 <pre class="signature">
2594 </pre>
2605 identifying the feature-test macros Expat was compiled with. Since an
2633 <pre class="fcndec">
2636 </pre>
2646 is especially useful for third-party libraries that interact with a
2653 <pre class="fcndec">
2656 </pre>
2666 <pre class="fcndec">
2669 </pre>
2686 <pre class="fcndec">
2689 </pre>