=head1 NAME

Pod::Simple::Subclassing -- write a formatter as a Pod::Simple subclass

=head1 SYNOPSIS

  package Pod::SomeFormatter;
  use Pod::Simple;
  @ISA = qw(Pod::Simple);
  $VERSION = '1.01';
  use strict;

  sub _handle_element_start {
    my($parser, $element_name, $attr_hash_r) = @_;
    ...
  }

  sub _handle_element_end {
    my($parser, $element_name, $attr_hash_r) = @_;
    # NOTE: $attr_hash_r is only present when $element_name is "over" or "begin"
    # The remaining code excerpts will mostly ignore this $attr_hash_r, as it is
    # mostly useless. It is documented where "over-*" and "begin" events are
    # documented.
    ...
  }

  sub _handle_text {
    my($parser, $text) = @_;
    ...
  }
  1;

=head1 DESCRIPTION

This document is about using Pod::Simple to write a Pod processor,
generally a Pod formatter. If you just want to know about using an
existing Pod formatter, instead see its documentation and see also the
docs in L<Pod::Simple>.

The zeroth step in writing a Pod formatter is to make sure that there
isn't already a decent one in CPAN. See L<http://search.cpan.org/>, and
run a search on the name of the format you want to render to. Also
consider joining the Pod People list
L<http://lists.perl.org/showlist.cgi?name=pod-people> and asking whether
anyone has a formatter for that format -- maybe someone cobbled one
together but just hasn't released it.

The first step in writing a Pod processor is to read L<perlpodspec>,
which contains information on writing a Pod parser (which has been
largely taken care of by Pod::Simple), but also a lot of requirements
and recommendations for writing a formatter.

The second step is to actually learn the format you're planning to
format to -- or at least as much as you need to know to represent Pod,
which probably isn't much.

The third step is to pick which of Pod::Simple's interfaces you want to
use. The basic interface via Pod::Simple or L<Pod::Simple::Methody> is
event-based (sort of like L<HTML::Parser>'s interface, or sort of like
L<XML::Parser>'s "Handlers" interface), but L<Pod::Simple::PullParser>
provides a token-stream interface, sort of like L<HTML::TokeParser>'s
interface; L<Pod::Simple::SimpleTree> provides a simple tree interface,
rather like XML::Parser's "Tree" interface. Users familiar with
XML-handling will find one of these styles relatively familiar; but if
you would be even more at home with XML, there are classes that produce
an XML representation of the Pod stream, notably
L<Pod::Simple::XMLOutStream>; you can feed the output of such a class to
whatever XML parsing system you are most at home with.

The last step is to write your code based on how the events (or tokens,
or tree-nodes, or the XML, or however you're parsing) will map to
constructs in the output format. Also be sure to consider how to escape
text nodes containing arbitrary text, and also what to do with text
nodes that represent preformatted text (from verbatim sections).



=head1 Events

TODO intro... mention that events are supplied for implicits, like for
missing >'s


In the following section, we use XML to represent the event structure
associated with a particular construct. That is, TODO

=over

=item C<< $parser->_handle_element_start( I<element_name>, I<attr_hashref> ) >>

=item C<< $parser->_handle_element_end( I<element_name> ) >>

=item C<< $parser->_handle_text( I<text_string> ) >>

=back

TODO describe


=over

=item events with an element_name of Document

Parsing a document produces this event structure:

  <Document start_line="543">
    ...all events...
  </Document>

The value of the I<start_line> attribute will be the line number of the first
Pod directive in the document.

If there is no Pod in the given document, then the
event structure will be this:

  <Document contentless="1" start_line="543">
  </Document>

In that case, the value of the I<start_line> attribute will not be meaningful;
under current implementations, it will probably be the line number of the
last line in the file.

=item events with an element_name of Para

Parsing a plain (non-verbatim, non-directive, non-data) paragraph in
a Pod document produces this event structure:

  <Para start_line="543">
    ...all events in this paragraph...
  </Para>

The value of the I<start_line> attribute will be the line number of the start
of the paragraph.

For example, parsing this paragraph of Pod:

  The value of the I<start_line> attribute will be the
  line number of the start of the paragraph.

produces this event structure:

  <Para start_line="129">
    The value of the
    <I>
      start_line
    </I>
    attribute will be the line number of the start of the paragraph.
  </Para>

=item events with an element_name of B, C, F, or I.

Parsing a BE<lt>...E<gt> formatting code (or of course any of its
semantically identical syntactic variants
S<BE<lt>E<lt> ... E<gt>E<gt>>,
or S<BE<lt>E<lt>E<lt>E<lt> ... E<gt>E<gt>E<gt>E<gt>>, etc.)
produces this event structure:

  <B>
    ...stuff...
  </B>

Currently, there are no attributes conveyed.

Parsing a C, F, or I code produces the same structure, with only a
different element name.

If your parser object has been set to accept other formatting codes,
then they will be presented like these B/C/F/I codes -- i.e., without
any attributes.
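
A minimal sketch of how a subclass might act on these events, assuming a
hypothetical class name C<My::InlineDemo>, a hypothetical C<demo_out> hash
key for accumulating output, and an arbitrary choice of HTML-like tags for
the four codes (none of these names come from Pod::Simple itself):

  package My::InlineDemo;          # hypothetical class name
  use strict;
  use warnings;
  use parent 'Pod::Simple';

  # Arbitrary mapping, chosen only for illustration.
  my %tag_for = ( B => 'b', C => 'code', F => 'em', I => 'i' );

  sub new {
    my $class = shift;
    my $self  = $class->SUPER::new(@_);
    $self->{'demo_out'} = '';      # hypothetical key holding the output
    return $self;
  }

  sub _handle_element_start {
    my($parser, $element_name, $attr_hash_r) = @_;
    my $tag = $tag_for{$element_name};
    $parser->{'demo_out'} .= "<$tag>" if defined $tag;
  }

  sub _handle_element_end {
    my($parser, $element_name) = @_;
    my $tag = $tag_for{$element_name};
    $parser->{'demo_out'} .= "</$tag>" if defined $tag;
  }

  sub _handle_text {
    my($parser, $text) = @_;
    # Escape the characters that are special in the output format.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $parser->{'demo_out'} .= $text;
  }

  package main;

  my $parser = My::InlineDemo->new;
  $parser->parse_string_document(
    "=pod\n\nSome B<bold>, some I<italic>, and C<< 1 < 2 >> as code.\n"
  );
  print $parser->{'demo_out'}, "\n";

Elements that the sketch has no tag for (Document and Para here) are
simply passed over, so only their text content reaches the output.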

=item events with an element_name of S

Normally, parsing an SE<lt>...E<gt> sequence produces this event
structure, just as if it were a B/C/F/I code:

  <S>
    ...stuff...
  </S>

However, Pod::Simple (and presumably all derived parsers) offers the
C<nbsp_for_S> option which, if enabled, will suppress all S events, and
instead change all spaces in the content to non-breaking spaces. This is
intended for formatters that output to a format that has no code that
means the same as SE<lt>...E<gt>, but which has a code/character that
means non-breaking space.

=item events with an element_name of X

Normally, parsing an XE<lt>...E<gt> sequence produces this event
structure, just as if it were a B/C/F/I code:

  <X>
    ...stuff...
  </X>

However, Pod::Simple (and presumably all derived parsers) offers the
C<nix_X_codes> option which, if enabled, will suppress all X events
and ignore their content. For formatters/processors that don't use
X events, this is presumably quite useful.


=item events with an element_name of L

Because the LE<lt>...E<gt> is the most complex construct in the
language, it should not surprise you that the events it generates are
the most complex in the language. Most of the complexity is hidden away in
the attribute values, so for those of you writing a Pod formatter that
produces a non-hypertextual format, you can just ignore the attributes
and treat an L event structure like a formatting element that
(presumably) doesn't actually produce a change in formatting. That is,
the content of the L event structure (as opposed to its
attributes) is always what text should be displayed.

There are, at first glance, three kinds of L links: URL, man, and pod.
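
Whatever the kind, the I<type> attribute on the L element tells a handler
which one it is looking at. A minimal sketch, assuming a hypothetical class
name C<My::LinkLister> and a hypothetical C<links_seen> hash key; the I<to>
and I<section> values are stringified before use:

  package My::LinkLister;          # hypothetical class name
  use strict;
  use warnings;
  use parent 'Pod::Simple';

  sub _handle_element_start {
    my($parser, $element_name, $attr_hash_r) = @_;
    return unless $element_name eq 'L';

    my $type    = $attr_hash_r->{'type'};
    my $to      = defined $attr_hash_r->{'to'}
                ? "$attr_hash_r->{'to'}"      : '';
    my $section = defined $attr_hash_r->{'section'}
                ? "$attr_hash_r->{'section'}" : '';

    # One line per link; "links_seen" is a hypothetical key.
    push @{ $parser->{'links_seen'} },
      "$type: to=[$to] section=[$section]";
  }

  sub _handle_element_end { }      # nothing to do in this sketch
  sub _handle_text        { }

  package main;

  my $parser = My::LinkLister->new;
  $parser->parse_string_document(join "\n",
    '=pod',
    '',
    'See L<Net::Ping>, L<crontab(5)>, L<http://www.perl.com/CPAN/authors/>,',
    'and L<perlsyn/"Basic BLOCKs and Switch Statements">.',
    '',
  );
  print "$_\n" for @{ $parser->{'links_seen'} || [] };

A formatter for a non-hypertextual format could skip all of this and rely
on the text events alone.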

When an LE<lt>I<some_url>E<gt> code is parsed, it produces this event
structure:

  <L content-implicit="yes" raw="that_url" to="that_url" type="url">
    that_url
  </L>

The C<type="url"> attribute is always specified for this type of
L code.

For example, this Pod source:

  L<http://www.perl.com/CPAN/authors/>

produces this event structure:

  <L content-implicit="yes" raw="http://www.perl.com/CPAN/authors/" to="http://www.perl.com/CPAN/authors/" type="url">
    http://www.perl.com/CPAN/authors/
  </L>

When an LE<lt>I<manpage(section)>E<gt> code is parsed (and these are
fairly rare and not terribly useful), it produces this event structure:

  <L content-implicit="yes" raw="manpage(section)" to="manpage(section)" type="man">
    manpage(section)
  </L>

The C<type="man"> attribute is always specified for this type of
L code.

For example, this Pod source:

  L<crontab(5)>

produces this event structure:

  <L content-implicit="yes" raw="crontab(5)" to="crontab(5)" type="man">
    crontab(5)
  </L>

In the rare cases where a man page link has a section specified, that text
appears in a I<section> attribute.
For example, this Pod source:

  L<crontab(5)/"ENVIRONMENT">

will produce this event structure:

  <L content-implicit="yes" raw="crontab(5)/"ENVIRONMENT"" section="ENVIRONMENT" to="crontab(5)" type="man">
    "ENVIRONMENT" in crontab(5)
  </L>

In the rare case where the Pod document has code like
LE<lt>I<sometext>|I<manpage(section)>E<gt>, then the I<sometext> will appear
as the content of the element, the I<manpage(section)> text will appear
only as the value of the I<to> attribute, and there will be no
C<content-implicit="yes"> attribute (whose presence means that the Pod parser
had to infer what text should appear as the link text -- as opposed to
cases where that attribute is absent, which means that the Pod parser did
I<not> have to infer the link text, because that L code explicitly specified
some link text.)

For example, this Pod source:

  L<hell itself!|crontab(5)>

will produce this event structure:

  <L raw="hell itself!|crontab(5)" to="crontab(5)" type="man">
    hell itself!
  </L>

The last type of L structure is for links to/within Pod documents. It is
the most complex because it can have a I<to> attribute, I<or> a
I<section> attribute, or both. The C<type="pod"> attribute is always
specified for this type of L code.

The most common case, a simple LE<lt>podpageE<gt> code,
produces this event structure:

  <L content-implicit="yes" raw="podpage" to="podpage" type="pod">
    podpage
  </L>

For example, this Pod source:

  L<Net::Ping>

produces this event structure:

  <L content-implicit="yes" raw="Net::Ping" to="Net::Ping" type="pod">
    Net::Ping
  </L>

In cases where there is link-text explicitly specified, it
is to be found in the content of the element (and not the
attributes), just as with the LE<lt>I<sometext>|I<manpage(section)>E<gt>
case discussed above.
For example, this Pod source:

  L<Perl Error Messages|perldiag>

produces this event structure:

  <L raw="Perl Error Messages|perldiag" to="perldiag" type="pod">
    Perl Error Messages
  </L>

In cases of links to a section in the current Pod document,
there is a I<section> attribute instead of a I<to> attribute.
For example, this Pod source:

  L</"Member Data">

produces this event structure:

  <L content-implicit="yes" raw="/"Member Data"" section="Member Data" type="pod">
    "Member Data"
  </L>

As another example, this Pod source:

  L<the various attributes|/"Member Data">

produces this event structure:

  <L raw="the various attributes|/"Member Data"" section="Member Data" type="pod">
    the various attributes
  </L>

In cases of links to a section in a different Pod document,
there are both a I<section> attribute and a I<to> attribute.
For example, this Pod source:

  L<perlsyn/"Basic BLOCKs and Switch Statements">

produces this event structure:

  <L content-implicit="yes" raw="perlsyn/"Basic BLOCKs and Switch Statements"" section="Basic BLOCKs and Switch Statements" to="perlsyn" type="pod">
    "Basic BLOCKs and Switch Statements" in perlsyn
  </L>

As another example, this Pod source:

  L<SWITCH statements|perlsyn/"Basic BLOCKs and Switch Statements">

produces this event structure:

  <L raw="SWITCH statements|perlsyn/"Basic BLOCKs and Switch Statements"" section="Basic BLOCKs and Switch Statements" to="perlsyn" type="pod">
    SWITCH statements
  </L>

Incidentally, note that we do not distinguish between these syntaxes:

  L</"Member Data">
  L<"Member Data">
  L</Member Data>
  L<Member Data>    [deprecated syntax]

That is, they all produce the same event structure (for the most part), namely:

  <L content-implicit="yes" raw="$depends_on_syntax" section="Member Data" type="pod">
    "Member Data"
  </L>

The I<raw> attribute depends on what the raw content of the C<LE<lt>E<gt>> is,
so that is why the event structure is the same "for the most part".

If you have not guessed it yet, the I<raw> attribute contains the raw,
original, unescaped content of the C<LE<lt>E<gt>> formatting code. In addition
to the examples above, take notice of the following event structure produced
by the following C<LE<lt>E<gt>> formatting code.

  L<click B<here>|page/About the C<-M> switch>

  <L raw="click B<here>|page/About the C<-M> switch" section="About the -M switch" to="page" type="pod">
    click B<here>
  </L>

Specifically, notice that the formatting codes are present and unescaped
in I<raw>.

There is a known bug in the I<raw> attribute where any surrounding whitespace
is condensed into a single ' '. For example, given LE<60> linkE<62>, I<raw>
will be " link".

=item events with an element_name of E or Z

While there are Pod codes EE<lt>...E<gt> and ZE<lt>E<gt>, these
I<do not> produce any E or Z events -- that is, there are no such
events as E or Z.

=item events with an element_name of Verbatim

When a Pod verbatim paragraph (AKA "codeblock") is parsed, it
produces this event structure:

  <Verbatim start_line="543" xml:space="preserve">
    ...text...
  </Verbatim>

The value of the I<start_line> attribute will be the line number of the
first line of this verbatim block. The I<xml:space> attribute is always
present, and always has the value "preserve".

The text content will have tabs already expanded.


=item events with an element_name of head1 .. head4

When a "=head1 ..." directive is parsed, it produces this event
structure:

  <head1>
    ...stuff...
  </head1>

For example, a directive consisting of this:

  =head1 Options to C<new> et al.

will produce this event structure:

  <head1 start_line="543">
    Options to
    <C>
      new
    </C>
    et al.
  </head1>

"=head2" through "=head4" directives are the same, except for the element
names in the event structure.

=item events with an element_name of encoding

In the default case, the events corresponding to C<=encoding> directives
are not emitted. They are emitted if C<keep_encoding_directive> is true.
In that case they produce event structures like
L</"events with an element_name of head1 .. head4"> above.

=item events with an element_name of over-bullet

When an "=over ... Z<>=back" block is parsed where the items are
a bulleted list, it will produce this event structure:

  <over-bullet indent="4" start_line="543">
    <item-bullet start_line="545">
      ...Stuff...
    </item-bullet>
    ...more item-bullets...
  </over-bullet fake-closer="1">

The attribute I<fake-closer> is only present if it is a true value; it is not
present if it is a false value. It is shown in the above example to illustrate
where the attribute is (in the B<closing> tag). It signifies that the C<=over>
did not have a matching C<=back>, and thus Pod::Simple had to create a fake
closer.

For example, this Pod source:

  =over

  =item *

  Something

  =back

would produce an event structure that does B<not> have the I<fake-closer>
attribute, whereas this Pod source:

  =over

  =item *

  Gasp! An unclosed =over block!

would. The rest of the over-* examples will not demonstrate this attribute,
but they all can have it. See L<Pod::Checker>'s source for an example of this
attribute being used.

The value of the I<indent> attribute is whatever value is after the
"=over" directive, as in "=over 8". If no such value is specified
in the directive, then the I<indent> attribute has the value "4".
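
A minimal sketch of reading the I<indent> and I<fake-closer> attributes,
assuming a hypothetical class name C<My::ListAudit>, a hypothetical
C<audit> hash key, and a Pod::Simple version recent enough to pass the
attribute hash to C<_handle_element_end> for over-* elements:

  package My::ListAudit;           # hypothetical class name
  use strict;
  use warnings;
  use parent 'Pod::Simple';

  sub _handle_element_start {
    my($parser, $element_name, $attr_hash_r) = @_;
    return unless $element_name =~ m/^over-/;
    push @{ $parser->{'audit'} },
      "$element_name opened with indent $attr_hash_r->{'indent'}";
  }

  sub _handle_element_end {
    my($parser, $element_name, $attr_hash_r) = @_;
    return unless $element_name =~ m/^over-/;
    # The attribute is absent (not false) when a real =back was seen.
    my $how = ($attr_hash_r && $attr_hash_r->{'fake-closer'})
      ? 'closed by a fake closer (no =back in the source)'
      : 'closed by a real =back';
    push @{ $parser->{'audit'} }, "$element_name $how";
  }

  sub _handle_text { }

  package main;

  my $parser = My::ListAudit->new;
  $parser->no_errata_section(1);   # keep errata events out of the way
  $parser->parse_string_document(
    "=pod\n\n=over 8\n\n=item *\n\nGasp! An unclosed =over block!\n"
  );
  print "$_\n" for @{ $parser->{'audit'} || [] };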

For example, this Pod source:

  =over

  =item *

  Stuff

  =item *

  Bar I<baz>!

  =back

produces this event structure:

  <over-bullet indent="4" start_line="10">
    <item-bullet start_line="12">
      Stuff
    </item-bullet>
    <item-bullet start_line="14">
      Bar <I>baz</I>!
    </item-bullet>
  </over-bullet>

=item events with an element_name of over-number

When an "=over ... Z<>=back" block is parsed where the items are
a numbered list, it will produce this event structure:

  <over-number indent="4" start_line="543">
    <item-number number="1" start_line="545">
      ...Stuff...
    </item-number>
    ...more item-number...
  </over-number>

This is like the "over-bullet" event structure; but note that the contents
are "item-number" instead of "item-bullet", and note that they will have
a "number" attribute, which some formatters/processors may ignore
(since, for example, there's no need for it in HTML when producing
an "<OL><LI>...</LI>...</OL>" structure), but which any processor may use.

Note that the values for the I<number> attributes of "item-number"
elements in a given "over-number" area I<will> start at 1 and go up by
one each time. If the Pod source doesn't follow that order (even though
it really should!), whatever numbers it has will be ignored (with
the correct values being put in the I<number> attributes), and an error
message might be issued to the user.

=item events with an element_name of over-text

These events are somewhat unlike the other over-*
structures, as far as what their contents are. When
an "=over ... Z<>=back" block is parsed where the items are
a list of text "subheadings", it will produce this event structure:

  <over-text indent="4" start_line="543">
    <item-text>
      ...stuff...
    </item-text>
    ...stuff (generally Para or Verbatim elements)...
    <item-text>
    ...more item-text and/or stuff...
  </over-text>

The I<indent> and I<fake-closer> attributes are as with the other over-* events.

For example, this Pod source:

  =over

  =item Foo

  Stuff

  =item Bar I<baz>!

  Quux

  =back

produces this event structure:

  <over-text indent="4" start_line="20">
    <item-text start_line="22">
      Foo
    </item-text>
    <Para start_line="24">
      Stuff
    </Para>
    <item-text start_line="26">
      Bar
        <I>
          baz
        </I>
      !
    </item-text>
    <Para start_line="28">
      Quux
    </Para>
  </over-text>



=item events with an element_name of over-block

These events are somewhat unlike the other over-*
structures, as far as what their contents are. When
an "=over ... Z<>=back" block is parsed where there are no items,
it will produce this event structure:

  <over-block indent="4" start_line="543">
    ...stuff (generally Para or Verbatim elements)...
  </over-block>

The I<indent> and I<fake-closer> attributes are as with the other over-* events.

For example, this Pod source:

  =over

  For cutting off our trade with all parts of the world

  For transporting us beyond seas to be tried for pretended offenses

  He is at this time transporting large armies of foreign mercenaries to
  complete the works of death, desolation and tyranny, already begun with
  circumstances of cruelty and perfidy scarcely paralleled in the most
  barbarous ages, and totally unworthy the head of a civilized nation.

  =back

will produce this event structure:

  <over-block indent="4" start_line="2">
    <Para start_line="4">
      For cutting off our trade with all parts of the world
    </Para>
    <Para start_line="6">
      For transporting us beyond seas to be tried for pretended offenses
    </Para>
    <Para start_line="8">
      He is at this time transporting large armies of [...more text...]
    </Para>
  </over-block>

=item events with an element_name of over-empty

B<Note: These events are only triggered if C<parse_empty_lists()> is set to a
true value.>

These events are somewhat unlike the other over-* structures, as far as what
their contents are. When an "=over ... Z<>=back" block is parsed where there
is no content, it will produce this event structure:

  <over-empty indent="4" start_line="543">
  </over-empty>

The I<indent> and I<fake-closer> attributes are as with the other over-* events.

For example, this Pod source:

  =over

  =over

  =back

  =back

will produce this event structure:

  <over-block indent="4" start_line="1">
    <over-empty indent="4" start_line="3">
    </over-empty>
  </over-block>

Note that the outer C<=over> is a block because it has no C<=item>s but still
has content: the inner C<=over>. The inner C<=over>, in turn, is completely
empty, and is treated as such.

=item events with an element_name of item-bullet

See L</"events with an element_name of over-bullet">, above.

=item events with an element_name of item-number

See L</"events with an element_name of over-number">, above.

=item events with an element_name of item-text

See L</"events with an element_name of over-text">, above.

=item events with an element_name of for

TODO...

=item events with an element_name of Data

TODO...

=back



=head1 More Pod::Simple Methods

Pod::Simple provides a lot of methods that aren't generally interesting
to the end user of an existing Pod formatter, but some of which you
might find useful in writing a Pod formatter. They are listed below. The
first several methods (the accept_* methods) are for declaring the
capabilities of your parser, notably what C<=for I<targetname>> sections
it's interested in, and what extra NE<lt>...E<gt> codes it accepts beyond
the ones described in L<perlpod>.

=over

=item C<< $parser->accept_targets( I<SOMEVALUE> ) >>

As the parser sees sections like:

  =for html  <img src="fig1.jpg">

or

  =begin html

    <img src="fig1.jpg">

  =end html

...the parser will ignore these sections unless your subclass has
specified that it wants to see sections targeted to "html" (or whatever
the formatter name is).

If you want to process all sections, even if they're not targeted for you,
call this before you start parsing:

  $parser->accept_targets('*');

=item C<< $parser->accept_targets_as_text( I<SOMEVALUE> ) >>

This is like accept_targets, except that it specifies also that the
content of sections for this target should be treated as Pod text even
if the target name in "=for I<targetname>" doesn't start with a ":".

At time of writing, I don't think you'll need to use this.


=item C<< $parser->accept_codes( I<Codename>, I<Codename>... ) >>

This tells the parser that you accept additional formatting codes,
beyond just the standard ones (I B C L F S X, plus the two weird ones
you don't actually see in the parse tree, Z and E).
For example, to also
accept codes "N", "R", and "W":

  $parser->accept_codes( qw( N R W ) );

B<TODO: document how this interacts with =extend, and long element names>


=item C<< $parser->accept_directive_as_data( I<directive_name> ) >>

=item C<< $parser->accept_directive_as_verbatim( I<directive_name> ) >>

=item C<< $parser->accept_directive_as_processed( I<directive_name> ) >>

In the unlikely situation that you need to tell the parser that you will
accept additional directives ("=foo" things), you need to first set the
parser to treat its content as data (i.e., not really processed at
all), or as verbatim (mostly just expanding tabs), or as processed text
(parsing formatting codes like BE<lt>...E<gt>).

For example, to accept a new directive "=method", you'd presumably
use:

  $parser->accept_directive_as_processed("method");

so that you could have Pod lines like:

  =method I<$whatever> thing B<um>

Making up your own directives breaks compatibility with other Pod
formatters, in a way that using "=for I<target> ..." lines doesn't;
however, you may find this useful if you're making a Pod superset
format where you don't need to worry about compatibility.


=item C<< $parser->nbsp_for_S( I<BOOLEAN> ); >>

Setting this attribute to a true value (and by default it is false) will
turn "SE<lt>...E<gt>" sequences into sequences of words separated by
C<\xA0> (non-breaking space) characters. For example, it will take this:

  I like S<Dutch apple pie>, don't you?

and treat it as if it were:

  I like DutchE<nbsp>appleE<nbsp>pie, don't you?

This is handy for output formats that don't have anything quite like an
"SE<lt>...E<gt>" code, but which do have a code for non-breaking space.

There is currently no method for going the other way; but I can
probably provide one upon request.
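
A minimal sketch of the effect, assuming a hypothetical class name
C<My::TextGrabber> and a hypothetical C<grabbed> hash key that collects
the content of every text event:

  package My::TextGrabber;         # hypothetical class name
  use strict;
  use warnings;
  use parent 'Pod::Simple';

  sub _handle_element_start { }
  sub _handle_element_end   { }

  sub _handle_text {
    my($parser, $text) = @_;
    $parser->{'grabbed'} .= $text; # hypothetical key
  }

  package main;

  my $parser = My::TextGrabber->new;
  $parser->nbsp_for_S(1);          # set before parsing starts
  $parser->parse_string_document(
    "=pod\n\nI like S<Dutch apple pie>, don't you?\n"
  );

  my $text = $parser->{'grabbed'};
  print "non-breaking spaces were substituted\n"
    if $text =~ m/Dutch\xA0apple\xA0pie/;

With the option on, the handlers never see an S element at all, so a
formatter needs only to map C<\xA0> to whatever its output format uses.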


=item C<< $parser->version_report() >>

This returns a string reporting the $VERSION value from your module (and
its classname) as well as the $VERSION value of Pod::Simple. Note that
L<perlpodspec> requires output formats (wherever possible) to note
this detail in a comment in the output format. For example, for
some kind of SGML output format:

  print OUT "<!-- \n", $parser->version_report, "\n -->";


=item C<< $parser->pod_para_count() >>

This returns the count of Pod paragraphs seen so far.


=item C<< $parser->line_count() >>

This is the current line number being parsed. But you might find the
"start_line" event attribute more accurate, when it is present.


=item C<< $parser->nix_X_codes( I<SOMEVALUE> ) >>

This attribute, when set to a true value (and it is false by default)
ignores any "XE<lt>...E<gt>" sequences in the document being parsed.
Many formats don't actually use the content of these codes, so have
no reason to process them.

=item C<< $parser->keep_encoding_directive( I<SOMEVALUE> ) >>

This attribute, when set to a true value (it is false by default)
will keep C<=encoding> and its content in the event structure. Most
formats don't actually need to process the content of an C<=encoding>
directive, even when this directive sets the encoding and the
processor makes use of the encoding information. Indeed, it is
possible to know the encoding without processing the directive
content.

=item C<< $parser->merge_text( I<SOMEVALUE> ) >>

This attribute, when set to a true value (and it is false by default)
makes sure that only one event (or token, or node) will be created
for any single contiguous sequence of text. For example, consider
this somewhat contrived example:

  I just LOVE Z<>hotE<32>apple pie!

When that is parsed and events are about to be called on it, it may
actually seem to be four different text events, one right after another:
one event for "I just LOVE ", one for "hot", one for " ", and one for
"apple pie!". But if you have merge_text on, then you're guaranteed
that it will be fired as one text event: "I just LOVE hot apple pie!".


=item C<< $parser->code_handler( I<CODE_REF> ) >>

This specifies code that should be called when a code line is seen
(i.e., a line outside of the Pod). Normally this is undef, meaning
that no code should be called. If you provide a routine, it should
start out like this:

  sub get_code_line {  # or whatever you'll call it
    my($line, $line_number, $parser) = @_;
    ...
  }

Note, however, that sometimes the Pod events aren't processed in exactly
the same order as the code lines are -- i.e., if you have a file with
Pod, then code, then more Pod, sometimes the code will be processed (via
whatever you have code_handler call) before all of the preceding Pod
has been processed.


=item C<< $parser->cut_handler( I<CODE_REF> ) >>

This is just like the code_handler attribute, except that it's for
"=cut" lines, not code lines. The same caveats apply. "=cut" lines are
unlikely to be interesting, but this is included for completeness.


=item C<< $parser->pod_handler( I<CODE_REF> ) >>

This is just like the code_handler attribute, except that it's for
"=pod" lines, not code lines. The same caveats apply. "=pod" lines are
unlikely to be interesting, but this is included for completeness.


=item C<< $parser->whiteline_handler( I<CODE_REF> ) >>

This is just like the code_handler attribute, except that it's for
lines that are seemingly blank but have whitespace (" " and/or "\t") on them,
not code lines. The same caveats apply.
These lines are unlikely to be
interesting, but this is included for completeness.


=item C<< $parser->whine( I<linenumber>, I<complaint string> ) >>

This notes a problem in the Pod, which will be reported in the "Pod
Errors" section of the document and/or sent to STDERR, depending on the
values of the attributes C<no_whining>, C<no_errata_section>, and
C<complain_stderr>.

=item C<< $parser->scream( I<linenumber>, I<complaint string> ) >>

This notes an error like C<whine> does, except that it is not
suppressible with C<no_whining>. This should be used only for very
serious errors.


=item C<< $parser->source_dead(1) >>

This aborts parsing of the current document, by switching on the flag
that indicates that EOF has been seen. In particularly drastic cases,
you might want to do this. It's rather nicer than just calling
C<die>!

=item C<< $parser->hide_line_numbers( I<SOMEVALUE> ) >>

Some subclasses that indiscriminately dump event attributes (well,
except for ones beginning with "~") can use this object attribute to
refrain from dumping the "start_line" attribute.

=item C<< $parser->no_whining( I<SOMEVALUE> ) >>

This attribute, if set to true, will suppress reports of non-fatal
error messages. The default value is false, meaning that complaints
I<are> reported. How they get reported depends on the values of
the attributes C<no_errata_section> and C<complain_stderr>.

=item C<< $parser->no_errata_section( I<SOMEVALUE> ) >>

This attribute, if set to true, will suppress generation of an errata
section. The default value is false -- i.e., an errata section will be
generated.

=item C<< $parser->complain_stderr( I<SOMEVALUE> ) >>

This attribute, if set to true, will send complaints to STDERR. The
default value is false -- i.e., complaints do not go to STDERR.
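
A minimal sketch of how C<no_errata_section> changes the event stream,
assuming a hypothetical class name C<My::HeadCounter> and a hypothetical
C<head1_count> hash key. It also calls C<any_errata_seen>, a Pod::Simple
method that is not described in this document. The sample Pod contains no
=head1 of its own, so any head1 event that turns up belongs to the
generated errata section:

  package My::HeadCounter;         # hypothetical class name
  use strict;
  use warnings;
  use parent 'Pod::Simple';

  sub _handle_element_start {
    my($parser, $element_name) = @_;
    $parser->{'head1_count'}++ if $element_name eq 'head1';
  }

  sub _handle_element_end { }
  sub _handle_text        { }

  package main;

  # An unclosed =over block, which draws a complaint from the parser.
  my $pod = "=pod\n\n=over\n\n=item *\n\nAn unclosed list\n";

  my $default = My::HeadCounter->new;
  $default->parse_string_document($pod);

  my $quiet = My::HeadCounter->new;
  $quiet->no_errata_section(1);    # complaint is noted but not emitted
  $quiet->parse_string_document($pod);

  printf "default: %d head1 event(s), errors seen: %s\n",
    $default->{'head1_count'} || 0,
    $default->any_errata_seen ? 'yes' : 'no';
  printf "quiet:   %d head1 event(s), errors seen: %s\n",
    $quiet->{'head1_count'} || 0,
    $quiet->any_errata_seen ? 'yes' : 'no';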

=item C<< $parser->bare_output( I<SOMEVALUE> ) >>

Some formatter subclasses use this as a flag for whether output should
have prologue and epilogue code omitted. For example, setting this to
true for an HTML formatter class should omit the
"<html><head><title>...</title><body>..." prologue and the
"</body></html>" epilogue.

If you want to set this to true, you should probably also set
C<no_whining> or at least C<no_errata_section> to true.

=item C<< $parser->preserve_whitespace( I<SOMEVALUE> ) >>

If you set this attribute to a true value, the parser will try to
preserve whitespace in the output. This means that such formatting
conventions as two spaces after periods will be preserved by the parser.
This is primarily useful for output formats that treat whitespace as
significant (such as text or *roff, but not HTML).

=item C<< $parser->parse_empty_lists( I<SOMEVALUE> ) >>

If this attribute is set to true, the parser will not ignore empty
C<=over>/C<=back> blocks. The type of C<=over> will be I<empty>, documented
above in L</"events with an element_name of over-empty">.
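
A minimal sketch of turning this on, assuming a hypothetical class name
C<My::NameLogger> and a hypothetical C<names> hash key that records the
name of every element as it starts:

  package My::NameLogger;          # hypothetical class name
  use strict;
  use warnings;
  use parent 'Pod::Simple';

  sub _handle_element_start {
    my($parser, $element_name) = @_;
    push @{ $parser->{'names'} }, $element_name;
  }

  sub _handle_element_end { }
  sub _handle_text        { }

  package main;

  # An =over block whose only content is another, empty =over block.
  my $pod = "=pod\n\n=over\n\n=over\n\n=back\n\n=back\n";

  my $parser = My::NameLogger->new;
  $parser->parse_empty_lists(1);   # set before parsing starts
  $parser->parse_string_document($pod);

  print join(' ', @{ $parser->{'names'} || [] }), "\n";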

=back

=head1 SEE ALSO

L<Pod::Simple> -- event-based Pod-parsing framework

L<Pod::Simple::Methody> -- like Pod::Simple, but each sort of event
calls its own method (like C<start_head3>)

L<Pod::Simple::PullParser> -- a Pod-parsing framework like Pod::Simple,
but with a token-stream interface

L<Pod::Simple::SimpleTree> -- a Pod-parsing framework like Pod::Simple,
but with a tree interface

L<Pod::Simple::Checker> -- a simple Pod::Simple subclass that reads
documents, and then makes a plaintext report of any errors found in the
document

L<Pod::Simple::DumpAsXML> -- for dumping Pod documents as tidily
indented XML, showing each event on its own line

L<Pod::Simple::XMLOutStream> -- dumps a Pod document as XML (without
introducing extra whitespace as Pod::Simple::DumpAsXML does).

L<Pod::Simple::DumpAsText> -- for dumping Pod documents as tidily
indented text, showing each event on its own line

L<Pod::Simple::LinkSection> -- class for objects representing the values
of the TODO and TODO attributes of LE<lt>...E<gt> elements

L<Pod::Escapes> -- the module that Pod::Simple uses for evaluating
EE<lt>...E<gt> content

L<Pod::Simple::Text> -- a simple plaintext formatter for Pod

L<Pod::Simple::TextContent> -- like Pod::Simple::Text, but
makes no effort to indent or wrap the text being formatted

L<Pod::Simple::HTML> -- a simple HTML formatter for Pod

L<perlpod|perlpod>

L<perlpodspec|perlpodspec>

L<perldoc>

=head1 SUPPORT

Questions or discussion about POD and Pod::Simple should be sent to the
pod-people@perl.org mailing list. Send an empty email to
pod-people-subscribe@perl.org to subscribe.

This module is managed in an open GitHub repository,
L<https://github.com/theory/pod-simple/>.
Feel free to fork and contribute, or
to clone L<git://github.com/theory/pod-simple.git> and send patches!

Patches against Pod::Simple are welcome. Please send bug reports to
<bug-pod-simple@rt.cpan.org>.

=head1 COPYRIGHT AND DISCLAIMERS

Copyright (c) 2002 Sean M. Burke.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

This program is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.

=head1 AUTHOR

Pod::Simple was created by Sean M. Burke <sburke@cpan.org>.
But don't bother him, he's retired.

Pod::Simple is maintained by:

=over

=item * Allison Randal C<allison@perl.org>

=item * Hans Dieter Pearcey C<hdp@cpan.org>

=item * David E. Wheeler C<dwheeler@cpan.org>

=back

=for notes
Hm, my old podchecker version (1.2) says:
 *** WARNING: node 'http://search.cpan.org/' contains non-escaped | or / at line 38 in file Subclassing.pod
 *** WARNING: node 'http://lists.perl.org/showlist.cgi?name=pod-people' contains non-escaped | or / at line 41 in file Subclassing.pod
Yes, L<...> is hard.


=cut