1############################################################################# 2# Pod/Parser.pm -- package which defines a base class for parsing POD docs. 3# 4# Copyright (C) 1996-2000 by Bradford Appleton. All rights reserved. 5# This file is part of "PodParser". PodParser is free software; 6# you can redistribute it and/or modify it under the same terms 7# as Perl itself. 8############################################################################# 9 10package Pod::Parser; 11 12use vars qw($VERSION); 13$VERSION = 1.14; ## Current version of this package 14require 5.005; ## requires this Perl version or later 15 16############################################################################# 17 18=head1 NAME 19 20Pod::Parser - base class for creating POD filters and translators 21 22=head1 SYNOPSIS 23 24 use Pod::Parser; 25 26 package MyParser; 27 @ISA = qw(Pod::Parser); 28 29 sub command { 30 my ($parser, $command, $paragraph, $line_num) = @_; 31 ## Interpret the command and its text; sample actions might be: 32 if ($command eq 'head1') { ... } 33 elsif ($command eq 'head2') { ... } 34 ## ... other commands and their actions 35 my $out_fh = $parser->output_handle(); 36 my $expansion = $parser->interpolate($paragraph, $line_num); 37 print $out_fh $expansion; 38 } 39 40 sub verbatim { 41 my ($parser, $paragraph, $line_num) = @_; 42 ## Format verbatim paragraph; sample actions might be: 43 my $out_fh = $parser->output_handle(); 44 print $out_fh $paragraph; 45 } 46 47 sub textblock { 48 my ($parser, $paragraph, $line_num) = @_; 49 ## Translate/Format this block of text; sample actions might be: 50 my $out_fh = $parser->output_handle(); 51 my $expansion = $parser->interpolate($paragraph, $line_num); 52 print $out_fh $expansion; 53 } 54 55 sub interior_sequence { 56 my ($parser, $seq_command, $seq_argument) = @_; 57 ## Expand an interior sequence; sample actions might be: 58 return "*$seq_argument*" if ($seq_command eq 'B'); 59 return "`$seq_argument'" if ($seq_command eq 'C'); 60 return "_${seq_argument}_'" if ($seq_command eq 'I'); 61 ## ... other sequence commands and their resulting text 62 } 63 64 package main; 65 66 ## Create a parser object and have it parse file whose name was 67 ## given on the command-line (use STDIN if no files were given). 68 $parser = new MyParser(); 69 $parser->parse_from_filehandle(\*STDIN) if (@ARGV == 0); 70 for (@ARGV) { $parser->parse_from_file($_); } 71 72=head1 REQUIRES 73 74perl5.005, Pod::InputObjects, Exporter, Symbol, Carp 75 76=head1 EXPORTS 77 78Nothing. 79 80=head1 DESCRIPTION 81 82B<Pod::Parser> is a base class for creating POD filters and translators. 83It handles most of the effort involved with parsing the POD sections 84from an input stream, leaving subclasses free to be concerned only with 85performing the actual translation of text. 86 87B<Pod::Parser> parses PODs, and makes method calls to handle the various 88components of the POD. Subclasses of B<Pod::Parser> override these methods 89to translate the POD into whatever output format they desire. 90 91=head1 QUICK OVERVIEW 92 93To create a POD filter for translating POD documentation into some other 94format, you create a subclass of B<Pod::Parser> which typically overrides 95just the base class implementation for the following methods: 96 97=over 2 98 99=item * 100 101B<command()> 102 103=item * 104 105B<verbatim()> 106 107=item * 108 109B<textblock()> 110 111=item * 112 113B<interior_sequence()> 114 115=back 116 117You may also want to override the B<begin_input()> and B<end_input()> 118methods for your subclass (to perform any needed per-file and/or 119per-document initialization or cleanup). 120 121If you need to perform any preprocesssing of input before it is parsed 122you may want to override one or more of B<preprocess_line()> and/or 123B<preprocess_paragraph()>. 124 125Sometimes it may be necessary to make more than one pass over the input 126files. If this is the case you have several options. You can make the 127first pass using B<Pod::Parser> and override your methods to store the 128intermediate results in memory somewhere for the B<end_pod()> method to 129process. You could use B<Pod::Parser> for several passes with an 130appropriate state variable to control the operation for each pass. If 131your input source can't be reset to start at the beginning, you can 132store it in some other structure as a string or an array and have that 133structure implement a B<getline()> method (which is all that 134B<parse_from_filehandle()> uses to read input). 135 136Feel free to add any member data fields you need to keep track of things 137like current font, indentation, horizontal or vertical position, or 138whatever else you like. Be sure to read L<"PRIVATE METHODS AND DATA"> 139to avoid name collisions. 140 141For the most part, the B<Pod::Parser> base class should be able to 142do most of the input parsing for you and leave you free to worry about 143how to intepret the commands and translate the result. 144 145Note that all we have described here in this quick overview is the 146simplest most straightforward use of B<Pod::Parser> to do stream-based 147parsing. It is also possible to use the B<Pod::Parser::parse_text> function 148to do more sophisticated tree-based parsing. See L<"TREE-BASED PARSING">. 149 150=head1 PARSING OPTIONS 151 152A I<parse-option> is simply a named option of B<Pod::Parser> with a 153value that corresponds to a certain specified behavior. These various 154behaviors of B<Pod::Parser> may be enabled/disabled by setting 155or unsetting one or more I<parse-options> using the B<parseopts()> method. 156The set of currently accepted parse-options is as follows: 157 158=over 3 159 160=item B<-want_nonPODs> (default: unset) 161 162Normally (by default) B<Pod::Parser> will only provide access to 163the POD sections of the input. Input paragraphs that are not part 164of the POD-format documentation are not made available to the caller 165(not even using B<preprocess_paragraph()>). Setting this option to a 166non-empty, non-zero value will allow B<preprocess_paragraph()> to see 167non-POD sections of the input as well as POD sections. The B<cutting()> 168method can be used to determine if the corresponding paragraph is a POD 169paragraph, or some other input paragraph. 170 171=item B<-process_cut_cmd> (default: unset) 172 173Normally (by default) B<Pod::Parser> handles the C<=cut> POD directive 174by itself and does not pass it on to the caller for processing. Setting 175this option to a non-empty, non-zero value will cause B<Pod::Parser> to 176pass the C<=cut> directive to the caller just like any other POD command 177(and hence it may be processed by the B<command()> method). 178 179B<Pod::Parser> will still interpret the C<=cut> directive to mean that 180"cutting mode" has been (re)entered, but the caller will get a chance 181to capture the actual C<=cut> paragraph itself for whatever purpose 182it desires. 183 184=item B<-warnings> (default: unset) 185 186Normally (by default) B<Pod::Parser> recognizes a bare minimum of 187pod syntax errors and warnings and issues diagnostic messages 188for errors, but not for warnings. (Use B<Pod::Checker> to do more 189thorough checking of POD syntax.) Setting this option to a non-empty, 190non-zero value will cause B<Pod::Parser> to issue diagnostics for 191the few warnings it recognizes as well as the errors. 192 193=back 194 195Please see L<"parseopts()"> for a complete description of the interface 196for the setting and unsetting of parse-options. 197 198=cut 199 200############################################################################# 201 202use vars qw(@ISA); 203use strict; 204#use diagnostics; 205use Pod::InputObjects; 206use Carp; 207use Exporter; 208BEGIN { 209 if ($] < 5.6) { 210 require Symbol; 211 import Symbol; 212 } 213} 214@ISA = qw(Exporter); 215 216## These "variables" are used as local "glob aliases" for performance 217use vars qw(%myData %myOpts @input_stack); 218 219############################################################################# 220 221=head1 RECOMMENDED SUBROUTINE/METHOD OVERRIDES 222 223B<Pod::Parser> provides several methods which most subclasses will probably 224want to override. These methods are as follows: 225 226=cut 227 228##--------------------------------------------------------------------------- 229 230=head1 B<command()> 231 232 $parser->command($cmd,$text,$line_num,$pod_para); 233 234This method should be overridden by subclasses to take the appropriate 235action when a POD command paragraph (denoted by a line beginning with 236"=") is encountered. When such a POD directive is seen in the input, 237this method is called and is passed: 238 239=over 3 240 241=item C<$cmd> 242 243the name of the command for this POD paragraph 244 245=item C<$text> 246 247the paragraph text for the given POD paragraph command. 248 249=item C<$line_num> 250 251the line-number of the beginning of the paragraph 252 253=item C<$pod_para> 254 255a reference to a C<Pod::Paragraph> object which contains further 256information about the paragraph command (see L<Pod::InputObjects> 257for details). 258 259=back 260 261B<Note> that this method I<is> called for C<=pod> paragraphs. 262 263The base class implementation of this method simply treats the raw POD 264command as normal block of paragraph text (invoking the B<textblock()> 265method with the command paragraph). 266 267=cut 268 269sub command { 270 my ($self, $cmd, $text, $line_num, $pod_para) = @_; 271 ## Just treat this like a textblock 272 $self->textblock($pod_para->raw_text(), $line_num, $pod_para); 273} 274 275##--------------------------------------------------------------------------- 276 277=head1 B<verbatim()> 278 279 $parser->verbatim($text,$line_num,$pod_para); 280 281This method may be overridden by subclasses to take the appropriate 282action when a block of verbatim text is encountered. It is passed the 283following parameters: 284 285=over 3 286 287=item C<$text> 288 289the block of text for the verbatim paragraph 290 291=item C<$line_num> 292 293the line-number of the beginning of the paragraph 294 295=item C<$pod_para> 296 297a reference to a C<Pod::Paragraph> object which contains further 298information about the paragraph (see L<Pod::InputObjects> 299for details). 300 301=back 302 303The base class implementation of this method simply prints the textblock 304(unmodified) to the output filehandle. 305 306=cut 307 308sub verbatim { 309 my ($self, $text, $line_num, $pod_para) = @_; 310 my $out_fh = $self->{_OUTPUT}; 311 print $out_fh $text; 312} 313 314##--------------------------------------------------------------------------- 315 316=head1 B<textblock()> 317 318 $parser->textblock($text,$line_num,$pod_para); 319 320This method may be overridden by subclasses to take the appropriate 321action when a normal block of POD text is encountered (although the base 322class method will usually do what you want). It is passed the following 323parameters: 324 325=over 3 326 327=item C<$text> 328 329the block of text for the a POD paragraph 330 331=item C<$line_num> 332 333the line-number of the beginning of the paragraph 334 335=item C<$pod_para> 336 337a reference to a C<Pod::Paragraph> object which contains further 338information about the paragraph (see L<Pod::InputObjects> 339for details). 340 341=back 342 343In order to process interior sequences, subclasses implementations of 344this method will probably want to invoke either B<interpolate()> or 345B<parse_text()>, passing it the text block C<$text>, and the corresponding 346line number in C<$line_num>, and then perform any desired processing upon 347the returned result. 348 349The base class implementation of this method simply prints the text block 350as it occurred in the input stream). 351 352=cut 353 354sub textblock { 355 my ($self, $text, $line_num, $pod_para) = @_; 356 my $out_fh = $self->{_OUTPUT}; 357 print $out_fh $self->interpolate($text, $line_num); 358} 359 360##--------------------------------------------------------------------------- 361 362=head1 B<interior_sequence()> 363 364 $parser->interior_sequence($seq_cmd,$seq_arg,$pod_seq); 365 366This method should be overridden by subclasses to take the appropriate 367action when an interior sequence is encountered. An interior sequence is 368an embedded command within a block of text which appears as a command 369name (usually a single uppercase character) followed immediately by a 370string of text which is enclosed in angle brackets. This method is 371passed the sequence command C<$seq_cmd> and the corresponding text 372C<$seq_arg>. It is invoked by the B<interpolate()> method for each interior 373sequence that occurs in the string that it is passed. It should return 374the desired text string to be used in place of the interior sequence. 375The C<$pod_seq> argument is a reference to a C<Pod::InteriorSequence> 376object which contains further information about the interior sequence. 377Please see L<Pod::InputObjects> for details if you need to access this 378additional information. 379 380Subclass implementations of this method may wish to invoke the 381B<nested()> method of C<$pod_seq> to see if it is nested inside 382some other interior-sequence (and if so, which kind). 383 384The base class implementation of the B<interior_sequence()> method 385simply returns the raw text of the interior sequence (as it occurred 386in the input) to the caller. 387 388=cut 389 390sub interior_sequence { 391 my ($self, $seq_cmd, $seq_arg, $pod_seq) = @_; 392 ## Just return the raw text of the interior sequence 393 return $pod_seq->raw_text(); 394} 395 396############################################################################# 397 398=head1 OPTIONAL SUBROUTINE/METHOD OVERRIDES 399 400B<Pod::Parser> provides several methods which subclasses may want to override 401to perform any special pre/post-processing. These methods do I<not> have to 402be overridden, but it may be useful for subclasses to take advantage of them. 403 404=cut 405 406##--------------------------------------------------------------------------- 407 408=head1 B<new()> 409 410 my $parser = Pod::Parser->new(); 411 412This is the constructor for B<Pod::Parser> and its subclasses. You 413I<do not> need to override this method! It is capable of constructing 414subclass objects as well as base class objects, provided you use 415any of the following constructor invocation styles: 416 417 my $parser1 = MyParser->new(); 418 my $parser2 = new MyParser(); 419 my $parser3 = $parser2->new(); 420 421where C<MyParser> is some subclass of B<Pod::Parser>. 422 423Using the syntax C<MyParser::new()> to invoke the constructor is I<not> 424recommended, but if you insist on being able to do this, then the 425subclass I<will> need to override the B<new()> constructor method. If 426you do override the constructor, you I<must> be sure to invoke the 427B<initialize()> method of the newly blessed object. 428 429Using any of the above invocations, the first argument to the 430constructor is always the corresponding package name (or object 431reference). No other arguments are required, but if desired, an 432associative array (or hash-table) my be passed to the B<new()> 433constructor, as in: 434 435 my $parser1 = MyParser->new( MYDATA => $value1, MOREDATA => $value2 ); 436 my $parser2 = new MyParser( -myflag => 1 ); 437 438All arguments passed to the B<new()> constructor will be treated as 439key/value pairs in a hash-table. The newly constructed object will be 440initialized by copying the contents of the given hash-table (which may 441have been empty). The B<new()> constructor for this class and all of its 442subclasses returns a blessed reference to the initialized object (hash-table). 443 444=cut 445 446sub new { 447 ## Determine if we were called via an object-ref or a classname 448 my $this = shift; 449 my $class = ref($this) || $this; 450 ## Any remaining arguments are treated as initial values for the 451 ## hash that is used to represent this object. 452 my %params = @_; 453 my $self = { %params }; 454 ## Bless ourselves into the desired class and perform any initialization 455 bless $self, $class; 456 $self->initialize(); 457 return $self; 458} 459 460##--------------------------------------------------------------------------- 461 462=head1 B<initialize()> 463 464 $parser->initialize(); 465 466This method performs any necessary object initialization. It takes no 467arguments (other than the object instance of course, which is typically 468copied to a local variable named C<$self>). If subclasses override this 469method then they I<must> be sure to invoke C<$self-E<gt>SUPER::initialize()>. 470 471=cut 472 473sub initialize { 474 #my $self = shift; 475 #return; 476} 477 478##--------------------------------------------------------------------------- 479 480=head1 B<begin_pod()> 481 482 $parser->begin_pod(); 483 484This method is invoked at the beginning of processing for each POD 485document that is encountered in the input. Subclasses should override 486this method to perform any per-document initialization. 487 488=cut 489 490sub begin_pod { 491 #my $self = shift; 492 #return; 493} 494 495##--------------------------------------------------------------------------- 496 497=head1 B<begin_input()> 498 499 $parser->begin_input(); 500 501This method is invoked by B<parse_from_filehandle()> immediately I<before> 502processing input from a filehandle. The base class implementation does 503nothing, however, subclasses may override it to perform any per-file 504initializations. 505 506Note that if multiple files are parsed for a single POD document 507(perhaps the result of some future C<=include> directive) this method 508is invoked for every file that is parsed. If you wish to perform certain 509initializations once per document, then you should use B<begin_pod()>. 510 511=cut 512 513sub begin_input { 514 #my $self = shift; 515 #return; 516} 517 518##--------------------------------------------------------------------------- 519 520=head1 B<end_input()> 521 522 $parser->end_input(); 523 524This method is invoked by B<parse_from_filehandle()> immediately I<after> 525processing input from a filehandle. The base class implementation does 526nothing, however, subclasses may override it to perform any per-file 527cleanup actions. 528 529Please note that if multiple files are parsed for a single POD document 530(perhaps the result of some kind of C<=include> directive) this method 531is invoked for every file that is parsed. If you wish to perform certain 532cleanup actions once per document, then you should use B<end_pod()>. 533 534=cut 535 536sub end_input { 537 #my $self = shift; 538 #return; 539} 540 541##--------------------------------------------------------------------------- 542 543=head1 B<end_pod()> 544 545 $parser->end_pod(); 546 547This method is invoked at the end of processing for each POD document 548that is encountered in the input. Subclasses should override this method 549to perform any per-document finalization. 550 551=cut 552 553sub end_pod { 554 #my $self = shift; 555 #return; 556} 557 558##--------------------------------------------------------------------------- 559 560=head1 B<preprocess_line()> 561 562 $textline = $parser->preprocess_line($text, $line_num); 563 564This method should be overridden by subclasses that wish to perform 565any kind of preprocessing for each I<line> of input (I<before> it has 566been determined whether or not it is part of a POD paragraph). The 567parameter C<$text> is the input line; and the parameter C<$line_num> is 568the line number of the corresponding text line. 569 570The value returned should correspond to the new text to use in its 571place. If the empty string or an undefined value is returned then no 572further processing will be performed for this line. 573 574Please note that the B<preprocess_line()> method is invoked I<before> 575the B<preprocess_paragraph()> method. After all (possibly preprocessed) 576lines in a paragraph have been assembled together and it has been 577determined that the paragraph is part of the POD documentation from one 578of the selected sections, then B<preprocess_paragraph()> is invoked. 579 580The base class implementation of this method returns the given text. 581 582=cut 583 584sub preprocess_line { 585 my ($self, $text, $line_num) = @_; 586 return $text; 587} 588 589##--------------------------------------------------------------------------- 590 591=head1 B<preprocess_paragraph()> 592 593 $textblock = $parser->preprocess_paragraph($text, $line_num); 594 595This method should be overridden by subclasses that wish to perform any 596kind of preprocessing for each block (paragraph) of POD documentation 597that appears in the input stream. The parameter C<$text> is the POD 598paragraph from the input file; and the parameter C<$line_num> is the 599line number for the beginning of the corresponding paragraph. 600 601The value returned should correspond to the new text to use in its 602place If the empty string is returned or an undefined value is 603returned, then the given C<$text> is ignored (not processed). 604 605This method is invoked after gathering up all the lines in a paragraph 606and after determining the cutting state of the paragraph, 607but before trying to further parse or interpret them. After 608B<preprocess_paragraph()> returns, the current cutting state (which 609is returned by C<$self-E<gt>cutting()>) is examined. If it evaluates 610to true then input text (including the given C<$text>) is cut (not 611processed) until the next POD directive is encountered. 612 613Please note that the B<preprocess_line()> method is invoked I<before> 614the B<preprocess_paragraph()> method. After all (possibly preprocessed) 615lines in a paragraph have been assembled together and either it has been 616determined that the paragraph is part of the POD documentation from one 617of the selected sections or the C<-want_nonPODs> option is true, 618then B<preprocess_paragraph()> is invoked. 619 620The base class implementation of this method returns the given text. 621 622=cut 623 624sub preprocess_paragraph { 625 my ($self, $text, $line_num) = @_; 626 return $text; 627} 628 629############################################################################# 630 631=head1 METHODS FOR PARSING AND PROCESSING 632 633B<Pod::Parser> provides several methods to process input text. These 634methods typically won't need to be overridden (and in some cases they 635can't be overridden), but subclasses may want to invoke them to exploit 636their functionality. 637 638=cut 639 640##--------------------------------------------------------------------------- 641 642=head1 B<parse_text()> 643 644 $ptree1 = $parser->parse_text($text, $line_num); 645 $ptree2 = $parser->parse_text({%opts}, $text, $line_num); 646 $ptree3 = $parser->parse_text(\%opts, $text, $line_num); 647 648This method is useful if you need to perform your own interpolation 649of interior sequences and can't rely upon B<interpolate> to expand 650them in simple bottom-up order. 651 652The parameter C<$text> is a string or block of text to be parsed 653for interior sequences; and the parameter C<$line_num> is the 654line number curresponding to the beginning of C<$text>. 655 656B<parse_text()> will parse the given text into a parse-tree of "nodes." 657and interior-sequences. Each "node" in the parse tree is either a 658text-string, or a B<Pod::InteriorSequence>. The result returned is a 659parse-tree of type B<Pod::ParseTree>. Please see L<Pod::InputObjects> 660for more information about B<Pod::InteriorSequence> and B<Pod::ParseTree>. 661 662If desired, an optional hash-ref may be specified as the first argument 663to customize certain aspects of the parse-tree that is created and 664returned. The set of recognized option keywords are: 665 666=over 3 667 668=item B<-expand_seq> =E<gt> I<code-ref>|I<method-name> 669 670Normally, the parse-tree returned by B<parse_text()> will contain an 671unexpanded C<Pod::InteriorSequence> object for each interior-sequence 672encountered. Specifying B<-expand_seq> tells B<parse_text()> to "expand" 673every interior-sequence it sees by invoking the referenced function 674(or named method of the parser object) and using the return value as the 675expanded result. 676 677If a subroutine reference was given, it is invoked as: 678 679 &$code_ref( $parser, $sequence ) 680 681and if a method-name was given, it is invoked as: 682 683 $parser->method_name( $sequence ) 684 685where C<$parser> is a reference to the parser object, and C<$sequence> 686is a reference to the interior-sequence object. 687[I<NOTE>: If the B<interior_sequence()> method is specified, then it is 688invoked according to the interface specified in L<"interior_sequence()">]. 689 690=item B<-expand_text> =E<gt> I<code-ref>|I<method-name> 691 692Normally, the parse-tree returned by B<parse_text()> will contain a 693text-string for each contiguous sequence of characters outside of an 694interior-sequence. Specifying B<-expand_text> tells B<parse_text()> to 695"preprocess" every such text-string it sees by invoking the referenced 696function (or named method of the parser object) and using the return value 697as the preprocessed (or "expanded") result. [Note that if the result is 698an interior-sequence, then it will I<not> be expanded as specified by the 699B<-expand_seq> option; Any such recursive expansion needs to be handled by 700the specified callback routine.] 701 702If a subroutine reference was given, it is invoked as: 703 704 &$code_ref( $parser, $text, $ptree_node ) 705 706and if a method-name was given, it is invoked as: 707 708 $parser->method_name( $text, $ptree_node ) 709 710where C<$parser> is a reference to the parser object, C<$text> is the 711text-string encountered, and C<$ptree_node> is a reference to the current 712node in the parse-tree (usually an interior-sequence object or else the 713top-level node of the parse-tree). 714 715=item B<-expand_ptree> =E<gt> I<code-ref>|I<method-name> 716 717Rather than returning a C<Pod::ParseTree>, pass the parse-tree as an 718argument to the referenced subroutine (or named method of the parser 719object) and return the result instead of the parse-tree object. 720 721If a subroutine reference was given, it is invoked as: 722 723 &$code_ref( $parser, $ptree ) 724 725and if a method-name was given, it is invoked as: 726 727 $parser->method_name( $ptree ) 728 729where C<$parser> is a reference to the parser object, and C<$ptree> 730is a reference to the parse-tree object. 731 732=back 733 734=cut 735 736sub parse_text { 737 my $self = shift; 738 local $_ = ''; 739 740 ## Get options and set any defaults 741 my %opts = (ref $_[0]) ? %{ shift() } : (); 742 my $expand_seq = $opts{'-expand_seq'} || undef; 743 my $expand_text = $opts{'-expand_text'} || undef; 744 my $expand_ptree = $opts{'-expand_ptree'} || undef; 745 746 my $text = shift; 747 my $line = shift; 748 my $file = $self->input_file(); 749 my $cmd = ""; 750 751 ## Convert method calls into closures, for our convenience 752 my $xseq_sub = $expand_seq; 753 my $xtext_sub = $expand_text; 754 my $xptree_sub = $expand_ptree; 755 if (defined $expand_seq and $expand_seq eq 'interior_sequence') { 756 ## If 'interior_sequence' is the method to use, we have to pass 757 ## more than just the sequence object, we also need to pass the 758 ## sequence name and text. 759 $xseq_sub = sub { 760 my ($self, $iseq) = @_; 761 my $args = join("", $iseq->parse_tree->children); 762 return $self->interior_sequence($iseq->name, $args, $iseq); 763 }; 764 } 765 ref $xseq_sub or $xseq_sub = sub { shift()->$expand_seq(@_) }; 766 ref $xtext_sub or $xtext_sub = sub { shift()->$expand_text(@_) }; 767 ref $xptree_sub or $xptree_sub = sub { shift()->$expand_ptree(@_) }; 768 769 ## Keep track of the "current" interior sequence, and maintain a stack 770 ## of "in progress" sequences. 771 ## 772 ## NOTE that we push our own "accumulator" at the very beginning of the 773 ## stack. It's really a parse-tree, not a sequence; but it implements 774 ## the methods we need so we can use it to gather-up all the sequences 775 ## and strings we parse. Thus, by the end of our parsing, it should be 776 ## the only thing left on our stack and all we have to do is return it! 777 ## 778 my $seq = Pod::ParseTree->new(); 779 my @seq_stack = ($seq); 780 my ($ldelim, $rdelim) = ('', ''); 781 782 ## Iterate over all sequence starts text (NOTE: split with 783 ## capturing parens keeps the delimiters) 784 $_ = $text; 785 my @tokens = split /([A-Z]<(?:<+\s)?)/; 786 while ( @tokens ) { 787 $_ = shift @tokens; 788 ## Look for the beginning of a sequence 789 if ( /^([A-Z])(<(?:<+\s)?)$/ ) { 790 ## Push a new sequence onto the stack of those "in-progress" 791 my $ldelim_orig; 792 ($cmd, $ldelim_orig) = ($1, $2); 793 ($ldelim = $ldelim_orig) =~ s/\s+$//; 794 ($rdelim = $ldelim) =~ tr/</>/; 795 $seq = Pod::InteriorSequence->new( 796 -name => $cmd, 797 -ldelim => $ldelim_orig, -rdelim => $rdelim, 798 -file => $file, -line => $line 799 ); 800 (@seq_stack > 1) and $seq->nested($seq_stack[-1]); 801 push @seq_stack, $seq; 802 } 803 ## Look for sequence ending 804 elsif ( @seq_stack > 1 ) { 805 ## Make sure we match the right kind of closing delimiter 806 my ($seq_end, $post_seq) = ("", ""); 807 if ( ($ldelim eq '<' and /\A(.*?)(>)/s) 808 or /\A(.*?)(\s+$rdelim)/s ) 809 { 810 ## Found end-of-sequence, capture the interior and the 811 ## closing the delimiter, and put the rest back on the 812 ## token-list 813 $post_seq = substr($_, length($1) + length($2)); 814 ($_, $seq_end) = ($1, $2); 815 (length $post_seq) and unshift @tokens, $post_seq; 816 } 817 if (length) { 818 ## In the middle of a sequence, append this text to it, and 819 ## dont forget to "expand" it if that's what the caller wanted 820 $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_); 821 $_ .= $seq_end; 822 } 823 if (length $seq_end) { 824 ## End of current sequence, record terminating delimiter 825 $seq->rdelim($seq_end); 826 ## Pop it off the stack of "in progress" sequences 827 pop @seq_stack; 828 ## Append result to its parent in current parse tree 829 $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) 830 : $seq); 831 ## Remember the current cmd-name and left-delimiter 832 if(@seq_stack > 1) { 833 $cmd = $seq_stack[-1]->name; 834 $ldelim = $seq_stack[-1]->ldelim; 835 $rdelim = $seq_stack[-1]->rdelim; 836 } else { 837 $cmd = $ldelim = $rdelim = ''; 838 } 839 } 840 } 841 elsif (length) { 842 ## In the middle of a sequence, append this text to it, and 843 ## dont forget to "expand" it if that's what the caller wanted 844 $seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_); 845 } 846 ## Keep track of line count 847 $line += tr/\n//; 848 ## Remember the "current" sequence 849 $seq = $seq_stack[-1]; 850 } 851 852 ## Handle unterminated sequences 853 my $errorsub = (@seq_stack > 1) ? $self->errorsub() : undef; 854 while (@seq_stack > 1) { 855 ($cmd, $file, $line) = ($seq->name, $seq->file_line); 856 $ldelim = $seq->ldelim; 857 ($rdelim = $ldelim) =~ tr/</>/; 858 $rdelim =~ s/^(\S+)(\s*)$/$2$1/; 859 pop @seq_stack; 860 my $errmsg = "*** ERROR: unterminated ${cmd}${ldelim}...${rdelim}". 861 " at line $line in file $file\n"; 862 (ref $errorsub) and &{$errorsub}($errmsg) 863 or (defined $errorsub) and $self->$errorsub($errmsg) 864 or warn($errmsg); 865 $seq_stack[-1]->append($expand_seq ? &$xseq_sub($self,$seq) : $seq); 866 $seq = $seq_stack[-1]; 867 } 868 869 ## Return the resulting parse-tree 870 my $ptree = (pop @seq_stack)->parse_tree; 871 return $expand_ptree ? &$xptree_sub($self, $ptree) : $ptree; 872} 873 874##--------------------------------------------------------------------------- 875 876=head1 B<interpolate()> 877 878 $textblock = $parser->interpolate($text, $line_num); 879 880This method translates all text (including any embedded interior sequences) 881in the given text string C<$text> and returns the interpolated result. The 882parameter C<$line_num> is the line number corresponding to the beginning 883of C<$text>. 884 885B<interpolate()> merely invokes a private method to recursively expand 886nested interior sequences in bottom-up order (innermost sequences are 887expanded first). If there is a need to expand nested sequences in 888some alternate order, use B<parse_text> instead. 889 890=cut 891 892sub interpolate { 893 my($self, $text, $line_num) = @_; 894 my %parse_opts = ( -expand_seq => 'interior_sequence' ); 895 my $ptree = $self->parse_text( \%parse_opts, $text, $line_num ); 896 return join "", $ptree->children(); 897} 898 899##--------------------------------------------------------------------------- 900 901=begin __PRIVATE__ 902 903=head1 B<parse_paragraph()> 904 905 $parser->parse_paragraph($text, $line_num); 906 907This method takes the text of a POD paragraph to be processed, along 908with its corresponding line number, and invokes the appropriate method 909(one of B<command()>, B<verbatim()>, or B<textblock()>). 910 911For performance reasons, this method is invoked directly without any 912dynamic lookup; Hence subclasses may I<not> override it! 913 914=end __PRIVATE__ 915 916=cut 917 918sub parse_paragraph { 919 my ($self, $text, $line_num) = @_; 920 local *myData = $self; ## alias to avoid deref-ing overhead 921 local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options 922 local $_; 923 924 ## See if we want to preprocess nonPOD paragraphs as well as POD ones. 925 my $wantNonPods = $myOpts{'-want_nonPODs'}; 926 927 ## Update cutting status 928 $myData{_CUTTING} = 0 if $text =~ /^={1,2}\S/; 929 930 ## Perform any desired preprocessing if we wanted it this early 931 $wantNonPods and $text = $self->preprocess_paragraph($text, $line_num); 932 933 ## Ignore up until next POD directive if we are cutting 934 return if $myData{_CUTTING}; 935 936 ## Now we know this is block of text in a POD section! 937 938 ##----------------------------------------------------------------- 939 ## This is a hook (hack ;-) for Pod::Select to do its thing without 940 ## having to override methods, but also without Pod::Parser assuming 941 ## $self is an instance of Pod::Select (if the _SELECTED_SECTIONS 942 ## field exists then we assume there is an is_selected() method for 943 ## us to invoke (calling $self->can('is_selected') could verify this 944 ## but that is more overhead than I want to incur) 945 ##----------------------------------------------------------------- 946 947 ## Ignore this block if it isnt in one of the selected sections 948 if (exists $myData{_SELECTED_SECTIONS}) { 949 $self->is_selected($text) or return ($myData{_CUTTING} = 1); 950 } 951 952 ## If we havent already, perform any desired preprocessing and 953 ## then re-check the "cutting" state 954 unless ($wantNonPods) { 955 $text = $self->preprocess_paragraph($text, $line_num); 956 return 1 unless ((defined $text) and (length $text)); 957 return 1 if ($myData{_CUTTING}); 958 } 959 960 ## Look for one of the three types of paragraphs 961 my ($pfx, $cmd, $arg, $sep) = ('', '', '', ''); 962 my $pod_para = undef; 963 if ($text =~ /^(={1,2})(?=\S)/) { 964 ## Looks like a command paragraph. Capture the command prefix used 965 ## ("=" or "=="), as well as the command-name, its paragraph text, 966 ## and whatever sequence of characters was used to separate them 967 $pfx = $1; 968 $_ = substr($text, length $pfx); 969 ($cmd, $sep, $text) = split /(\s+)/, $_, 2; 970 ## If this is a "cut" directive then we dont need to do anything 971 ## except return to "cutting" mode. 972 if ($cmd eq 'cut') { 973 $myData{_CUTTING} = 1; 974 return unless $myOpts{'-process_cut_cmd'}; 975 } 976 } 977 ## Save the attributes indicating how the command was specified. 978 $pod_para = new Pod::Paragraph( 979 -name => $cmd, 980 -text => $text, 981 -prefix => $pfx, 982 -separator => $sep, 983 -file => $myData{_INFILE}, 984 -line => $line_num 985 ); 986 # ## Invoke appropriate callbacks 987 # if (exists $myData{_CALLBACKS}) { 988 # ## Look through the callback list, invoke callbacks, 989 # ## then see if we need to do the default actions 990 # ## (invoke_callbacks will return true if we do). 991 # return 1 unless $self->invoke_callbacks($cmd, $text, $line_num, $pod_para); 992 # } 993 if (length $cmd) { 994 ## A command paragraph 995 $self->command($cmd, $text, $line_num, $pod_para); 996 } 997 elsif ($text =~ /^\s+/) { 998 ## Indented text - must be a verbatim paragraph 999 $self->verbatim($text, $line_num, $pod_para); 1000 } 1001 else { 1002 ## Looks like an ordinary block of text 1003 $self->textblock($text, $line_num, $pod_para); 1004 } 1005 return 1; 1006} 1007 1008##--------------------------------------------------------------------------- 1009 1010=head1 B<parse_from_filehandle()> 1011 1012 $parser->parse_from_filehandle($in_fh,$out_fh); 1013 1014This method takes an input filehandle (which is assumed to already be 1015opened for reading) and reads the entire input stream looking for blocks 1016(paragraphs) of POD documentation to be processed. If no first argument 1017is given the default input filehandle C<STDIN> is used. 1018 1019The C<$in_fh> parameter may be any object that provides a B<getline()> 1020method to retrieve a single line of input text (hence, an appropriate 1021wrapper object could be used to parse PODs from a single string or an 1022array of strings). 1023 1024Using C<$in_fh-E<gt>getline()>, input is read line-by-line and assembled 1025into paragraphs or "blocks" (which are separated by lines containing 1026nothing but whitespace). For each block of POD documentation 1027encountered it will invoke a method to parse the given paragraph. 1028 1029If a second argument is given then it should correspond to a filehandle where 1030output should be sent (otherwise the default output filehandle is 1031C<STDOUT> if no output filehandle is currently in use). 1032 1033B<NOTE:> For performance reasons, this method caches the input stream at 1034the top of the stack in a local variable. Any attempts by clients to 1035change the stack contents during processing when in the midst executing 1036of this method I<will not affect> the input stream used by the current 1037invocation of this method. 1038 1039This method does I<not> usually need to be overridden by subclasses. 1040 1041=cut 1042 1043sub parse_from_filehandle { 1044 my $self = shift; 1045 my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : (); 1046 my ($in_fh, $out_fh) = @_; 1047 $in_fh = \*STDIN unless ($in_fh); 1048 local *myData = $self; ## alias to avoid deref-ing overhead 1049 local *myOpts = ($myData{_PARSEOPTS} ||= {}); ## get parse-options 1050 local $_; 1051 1052 ## Put this stream at the top of the stack and do beginning-of-input 1053 ## processing. NOTE that $in_fh might be reset during this process. 1054 my $topstream = $self->_push_input_stream($in_fh, $out_fh); 1055 (exists $opts{-cutting}) and $self->cutting( $opts{-cutting} ); 1056 1057 ## Initialize line/paragraph 1058 my ($textline, $paragraph) = ('', ''); 1059 my ($nlines, $plines) = (0, 0); 1060 1061 ## Use <$fh> instead of $fh->getline where possible (for speed) 1062 $_ = ref $in_fh; 1063 my $tied_fh = (/^(?:GLOB|FileHandle|IO::\w+)$/ or tied $in_fh); 1064 1065 ## Read paragraphs line-by-line 1066 while (defined ($textline = $tied_fh ? <$in_fh> : $in_fh->getline)) { 1067 $textline = $self->preprocess_line($textline, ++$nlines); 1068 next unless ((defined $textline) && (length $textline)); 1069 $_ = $paragraph; ## save previous contents 1070 1071 if ((! length $paragraph) && ($textline =~ /^==/)) { 1072 ## '==' denotes a one-line command paragraph 1073 $paragraph = $textline; 1074 $plines = 1; 1075 $textline = ''; 1076 } else { 1077 ## Append this line to the current paragraph 1078 $paragraph .= $textline; 1079 ++$plines; 1080 } 1081 1082 ## See if this line is blank and ends the current paragraph. 1083 ## If it isnt, then keep iterating until it is. 1084 next unless (($textline =~ /^([^\S\r\n]*)[\r\n]*$/) 1085 && (length $paragraph)); 1086 1087 ## Issue a warning about any non-empty blank lines 1088 if (length($1) > 0 and $myOpts{'-warnings'} and ! $myData{_CUTTING}) { 1089 my $errorsub = $self->errorsub(); 1090 my $file = $self->input_file(); 1091 my $errmsg = "*** WARNING: line containing nothing but whitespace". 1092 " in paragraph at line $nlines in file $file\n"; 1093 (ref $errorsub) and &{$errorsub}($errmsg) 1094 or (defined $errorsub) and $self->$errorsub($errmsg) 1095 or warn($errmsg); 1096 } 1097 1098 ## Now process the paragraph 1099 parse_paragraph($self, $paragraph, ($nlines - $plines) + 1); 1100 $paragraph = ''; 1101 $plines = 0; 1102 } 1103 ## Dont forget about the last paragraph in the file 1104 if (length $paragraph) { 1105 parse_paragraph($self, $paragraph, ($nlines - $plines) + 1) 1106 } 1107 1108 ## Now pop the input stream off the top of the input stack. 1109 $self->_pop_input_stream(); 1110} 1111 1112##--------------------------------------------------------------------------- 1113 1114=head1 B<parse_from_file()> 1115 1116 $parser->parse_from_file($filename,$outfile); 1117 1118This method takes a filename and does the following: 1119 1120=over 2 1121 1122=item * 1123 1124opens the input and output files for reading 1125(creating the appropriate filehandles) 1126 1127=item * 1128 1129invokes the B<parse_from_filehandle()> method passing it the 1130corresponding input and output filehandles. 1131 1132=item * 1133 1134closes the input and output files. 1135 1136=back 1137 1138If the special input filename "-" or "<&STDIN" is given then the STDIN 1139filehandle is used for input (and no open or close is performed). If no 1140input filename is specified then "-" is implied. 1141 1142If a second argument is given then it should be the name of the desired 1143output file. If the special output filename "-" or ">&STDOUT" is given 1144then the STDOUT filehandle is used for output (and no open or close is 1145performed). If the special output filename ">&STDERR" is given then the 1146STDERR filehandle is used for output (and no open or close is 1147performed). If no output filehandle is currently in use and no output 1148filename is specified, then "-" is implied. 1149 1150This method does I<not> usually need to be overridden by subclasses. 1151 1152=cut 1153 1154sub parse_from_file { 1155 my $self = shift; 1156 my %opts = (ref $_[0] eq 'HASH') ? %{ shift() } : (); 1157 my ($infile, $outfile) = @_; 1158 my ($in_fh, $out_fh) = (gensym, gensym) if ($] < 5.6); 1159 my ($close_input, $close_output) = (0, 0); 1160 local *myData = $self; 1161 local $_; 1162 1163 ## Is $infile a filename or a (possibly implied) filehandle 1164 $infile = '-' unless ((defined $infile) && (length $infile)); 1165 if (($infile eq '-') || ($infile =~ /^<&(STDIN|0)$/i)) { 1166 ## Not a filename, just a string implying STDIN 1167 $myData{_INFILE} = "<standard input>"; 1168 $in_fh = \*STDIN; 1169 } 1170 elsif (ref $infile) { 1171 ## Must be a filehandle-ref (or else assume its a ref to an object 1172 ## that supports the common IO read operations). 1173 $myData{_INFILE} = ${$infile}; 1174 $in_fh = $infile; 1175 } 1176 else { 1177 ## We have a filename, open it for reading 1178 $myData{_INFILE} = $infile; 1179 open($in_fh, "< $infile") or 1180 croak "Can't open $infile for reading: $!\n"; 1181 $close_input = 1; 1182 } 1183 1184 ## NOTE: we need to be *very* careful when "defaulting" the output 1185 ## file. We only want to use a default if this is the beginning of 1186 ## the entire document (but *not* if this is an included file). We 1187 ## determine this by seeing if the input stream stack has been set-up 1188 ## already 1189 ## 1190 unless ((defined $outfile) && (length $outfile)) { 1191 (defined $myData{_TOP_STREAM}) && ($out_fh = $myData{_OUTPUT}) 1192 || ($outfile = '-'); 1193 } 1194 ## Is $outfile a filename or a (possibly implied) filehandle 1195 if ((defined $outfile) && (length $outfile)) { 1196 if (($outfile eq '-') || ($outfile =~ /^>&?(?:STDOUT|1)$/i)) { 1197 ## Not a filename, just a string implying STDOUT 1198 $myData{_OUTFILE} = "<standard output>"; 1199 $out_fh = \*STDOUT; 1200 } 1201 elsif ($outfile =~ /^>&(STDERR|2)$/i) { 1202 ## Not a filename, just a string implying STDERR 1203 $myData{_OUTFILE} = "<standard error>"; 1204 $out_fh = \*STDERR; 1205 } 1206 elsif (ref $outfile) { 1207 ## Must be a filehandle-ref (or else assume its a ref to an 1208 ## object that supports the common IO write operations). 1209 $myData{_OUTFILE} = ${$outfile}; 1210 $out_fh = $outfile; 1211 } 1212 else { 1213 ## We have a filename, open it for writing 1214 $myData{_OUTFILE} = $outfile; 1215 (-d $outfile) and croak "$outfile is a directory, not POD input!\n"; 1216 open($out_fh, "> $outfile") or 1217 croak "Can't open $outfile for writing: $!\n"; 1218 $close_output = 1; 1219 } 1220 } 1221 1222 ## Whew! That was a lot of work to set up reasonably/robust behavior 1223 ## in the case of a non-filename for reading and writing. Now we just 1224 ## have to parse the input and close the handles when we're finished. 1225 $self->parse_from_filehandle(\%opts, $in_fh, $out_fh); 1226 1227 $close_input and 1228 close($in_fh) || croak "Can't close $infile after reading: $!\n"; 1229 $close_output and 1230 close($out_fh) || croak "Can't close $outfile after writing: $!\n"; 1231} 1232 1233############################################################################# 1234 1235=head1 ACCESSOR METHODS 1236 1237Clients of B<Pod::Parser> should use the following methods to access 1238instance data fields: 1239 1240=cut 1241 1242##--------------------------------------------------------------------------- 1243 1244=head1 B<errorsub()> 1245 1246 $parser->errorsub("method_name"); 1247 $parser->errorsub(\&warn_user); 1248 $parser->errorsub(sub { print STDERR, @_ }); 1249 1250Specifies the method or subroutine to use when printing error messages 1251about POD syntax. The supplied method/subroutine I<must> return TRUE upon 1252successful printing of the message. If C<undef> is given, then the B<warn> 1253builtin is used to issue error messages (this is the default behavior). 1254 1255 my $errorsub = $parser->errorsub() 1256 my $errmsg = "This is an error message!\n" 1257 (ref $errorsub) and &{$errorsub}($errmsg) 1258 or (defined $errorsub) and $parser->$errorsub($errmsg) 1259 or warn($errmsg); 1260 1261Returns a method name, or else a reference to the user-supplied subroutine 1262used to print error messages. Returns C<undef> if the B<warn> builtin 1263is used to issue error messages (this is the default behavior). 1264 1265=cut 1266 1267sub errorsub { 1268 return (@_ > 1) ? ($_[0]->{_ERRORSUB} = $_[1]) : $_[0]->{_ERRORSUB}; 1269} 1270 1271##--------------------------------------------------------------------------- 1272 1273=head1 B<cutting()> 1274 1275 $boolean = $parser->cutting(); 1276 1277Returns the current C<cutting> state: a boolean-valued scalar which 1278evaluates to true if text from the input file is currently being "cut" 1279(meaning it is I<not> considered part of the POD document). 1280 1281 $parser->cutting($boolean); 1282 1283Sets the current C<cutting> state to the given value and returns the 1284result. 1285 1286=cut 1287 1288sub cutting { 1289 return (@_ > 1) ? ($_[0]->{_CUTTING} = $_[1]) : $_[0]->{_CUTTING}; 1290} 1291 1292##--------------------------------------------------------------------------- 1293 1294##--------------------------------------------------------------------------- 1295 1296=head1 B<parseopts()> 1297 1298When invoked with no additional arguments, B<parseopts> returns a hashtable 1299of all the current parsing options. 1300 1301 ## See if we are parsing non-POD sections as well as POD ones 1302 my %opts = $parser->parseopts(); 1303 $opts{'-want_nonPODs}' and print "-want_nonPODs\n"; 1304 1305When invoked using a single string, B<parseopts> treats the string as the 1306name of a parse-option and returns its corresponding value if it exists 1307(returns C<undef> if it doesn't). 1308 1309 ## Did we ask to see '=cut' paragraphs? 1310 my $want_cut = $parser->parseopts('-process_cut_cmd'); 1311 $want_cut and print "-process_cut_cmd\n"; 1312 1313When invoked with multiple arguments, B<parseopts> treats them as 1314key/value pairs and the specified parse-option names are set to the 1315given values. Any unspecified parse-options are unaffected. 1316 1317 ## Set them back to the default 1318 $parser->parseopts(-warnings => 0); 1319 1320When passed a single hash-ref, B<parseopts> uses that hash to completely 1321reset the existing parse-options, all previous parse-option values 1322are lost. 1323 1324 ## Reset all options to default 1325 $parser->parseopts( { } ); 1326 1327See L<"PARSING OPTIONS"> for more information on the name and meaning of each 1328parse-option currently recognized. 1329 1330=cut 1331 1332sub parseopts { 1333 local *myData = shift; 1334 local *myOpts = ($myData{_PARSEOPTS} ||= {}); 1335 return %myOpts if (@_ == 0); 1336 if (@_ == 1) { 1337 local $_ = shift; 1338 return ref($_) ? $myData{_PARSEOPTS} = $_ : $myOpts{$_}; 1339 } 1340 my @newOpts = (%myOpts, @_); 1341 $myData{_PARSEOPTS} = { @newOpts }; 1342} 1343 1344##--------------------------------------------------------------------------- 1345 1346=head1 B<output_file()> 1347 1348 $fname = $parser->output_file(); 1349 1350Returns the name of the output file being written. 1351 1352=cut 1353 1354sub output_file { 1355 return $_[0]->{_OUTFILE}; 1356} 1357 1358##--------------------------------------------------------------------------- 1359 1360=head1 B<output_handle()> 1361 1362 $fhandle = $parser->output_handle(); 1363 1364Returns the output filehandle object. 1365 1366=cut 1367 1368sub output_handle { 1369 return $_[0]->{_OUTPUT}; 1370} 1371 1372##--------------------------------------------------------------------------- 1373 1374=head1 B<input_file()> 1375 1376 $fname = $parser->input_file(); 1377 1378Returns the name of the input file being read. 1379 1380=cut 1381 1382sub input_file { 1383 return $_[0]->{_INFILE}; 1384} 1385 1386##--------------------------------------------------------------------------- 1387 1388=head1 B<input_handle()> 1389 1390 $fhandle = $parser->input_handle(); 1391 1392Returns the current input filehandle object. 1393 1394=cut 1395 1396sub input_handle { 1397 return $_[0]->{_INPUT}; 1398} 1399 1400##--------------------------------------------------------------------------- 1401 1402=begin __PRIVATE__ 1403 1404=head1 B<input_streams()> 1405 1406 $listref = $parser->input_streams(); 1407 1408Returns a reference to an array which corresponds to the stack of all 1409the input streams that are currently in the middle of being parsed. 1410 1411While parsing an input stream, it is possible to invoke 1412B<parse_from_file()> or B<parse_from_filehandle()> to parse a new input 1413stream and then return to parsing the previous input stream. Each input 1414stream to be parsed is pushed onto the end of this input stack 1415before any of its input is read. The input stream that is currently 1416being parsed is always at the end (or top) of the input stack. When an 1417input stream has been exhausted, it is popped off the end of the 1418input stack. 1419 1420Each element on this input stack is a reference to C<Pod::InputSource> 1421object. Please see L<Pod::InputObjects> for more details. 1422 1423This method might be invoked when printing diagnostic messages, for example, 1424to obtain the name and line number of the all input files that are currently 1425being processed. 1426 1427=end __PRIVATE__ 1428 1429=cut 1430 1431sub input_streams { 1432 return $_[0]->{_INPUT_STREAMS}; 1433} 1434 1435##--------------------------------------------------------------------------- 1436 1437=begin __PRIVATE__ 1438 1439=head1 B<top_stream()> 1440 1441 $hashref = $parser->top_stream(); 1442 1443Returns a reference to the hash-table that represents the element 1444that is currently at the top (end) of the input stream stack 1445(see L<"input_streams()">). The return value will be the C<undef> 1446if the input stack is empty. 1447 1448This method might be used when printing diagnostic messages, for example, 1449to obtain the name and line number of the current input file. 1450 1451=end __PRIVATE__ 1452 1453=cut 1454 1455sub top_stream { 1456 return $_[0]->{_TOP_STREAM} || undef; 1457} 1458 1459############################################################################# 1460 1461=head1 PRIVATE METHODS AND DATA 1462 1463B<Pod::Parser> makes use of several internal methods and data fields 1464which clients should not need to see or use. For the sake of avoiding 1465name collisions for client data and methods, these methods and fields 1466are briefly discussed here. Determined hackers may obtain further 1467information about them by reading the B<Pod::Parser> source code. 1468 1469Private data fields are stored in the hash-object whose reference is 1470returned by the B<new()> constructor for this class. The names of all 1471private methods and data-fields used by B<Pod::Parser> begin with a 1472prefix of "_" and match the regular expression C</^_\w+$/>. 1473 1474=cut 1475 1476##--------------------------------------------------------------------------- 1477 1478=begin _PRIVATE_ 1479 1480=head1 B<_push_input_stream()> 1481 1482 $hashref = $parser->_push_input_stream($in_fh,$out_fh); 1483 1484This method will push the given input stream on the input stack and 1485perform any necessary beginning-of-document or beginning-of-file 1486processing. The argument C<$in_fh> is the input stream filehandle to 1487push, and C<$out_fh> is the corresponding output filehandle to use (if 1488it is not given or is undefined, then the current output stream is used, 1489which defaults to standard output if it doesnt exist yet). 1490 1491The value returned will be reference to the hash-table that represents 1492the new top of the input stream stack. I<Please Note> that it is 1493possible for this method to use default values for the input and output 1494file handles. If this happens, you will need to look at the C<INPUT> 1495and C<OUTPUT> instance data members to determine their new values. 1496 1497=end _PRIVATE_ 1498 1499=cut 1500 1501sub _push_input_stream { 1502 my ($self, $in_fh, $out_fh) = @_; 1503 local *myData = $self; 1504 1505 ## Initialize stuff for the entire document if this is *not* 1506 ## an included file. 1507 ## 1508 ## NOTE: we need to be *very* careful when "defaulting" the output 1509 ## filehandle. We only want to use a default value if this is the 1510 ## beginning of the entire document (but *not* if this is an included 1511 ## file). 1512 unless (defined $myData{_TOP_STREAM}) { 1513 $out_fh = \*STDOUT unless (defined $out_fh); 1514 $myData{_CUTTING} = 1; ## current "cutting" state 1515 $myData{_INPUT_STREAMS} = []; ## stack of all input streams 1516 } 1517 1518 ## Initialize input indicators 1519 $myData{_OUTFILE} = '(unknown)' unless (defined $myData{_OUTFILE}); 1520 $myData{_OUTPUT} = $out_fh if (defined $out_fh); 1521 $in_fh = \*STDIN unless (defined $in_fh); 1522 $myData{_INFILE} = '(unknown)' unless (defined $myData{_INFILE}); 1523 $myData{_INPUT} = $in_fh; 1524 my $input_top = $myData{_TOP_STREAM} 1525 = new Pod::InputSource( 1526 -name => $myData{_INFILE}, 1527 -handle => $in_fh, 1528 -was_cutting => $myData{_CUTTING} 1529 ); 1530 local *input_stack = $myData{_INPUT_STREAMS}; 1531 push(@input_stack, $input_top); 1532 1533 ## Perform beginning-of-document and/or beginning-of-input processing 1534 $self->begin_pod() if (@input_stack == 1); 1535 $self->begin_input(); 1536 1537 return $input_top; 1538} 1539 1540##--------------------------------------------------------------------------- 1541 1542=begin _PRIVATE_ 1543 1544=head1 B<_pop_input_stream()> 1545 1546 $hashref = $parser->_pop_input_stream(); 1547 1548This takes no arguments. It will perform any necessary end-of-file or 1549end-of-document processing and then pop the current input stream from 1550the top of the input stack. 1551 1552The value returned will be reference to the hash-table that represents 1553the new top of the input stream stack. 1554 1555=end _PRIVATE_ 1556 1557=cut 1558 1559sub _pop_input_stream { 1560 my ($self) = @_; 1561 local *myData = $self; 1562 local *input_stack = $myData{_INPUT_STREAMS}; 1563 1564 ## Perform end-of-input and/or end-of-document processing 1565 $self->end_input() if (@input_stack > 0); 1566 $self->end_pod() if (@input_stack == 1); 1567 1568 ## Restore cutting state to whatever it was before we started 1569 ## parsing this file. 1570 my $old_top = pop(@input_stack); 1571 $myData{_CUTTING} = $old_top->was_cutting(); 1572 1573 ## Dont forget to reset the input indicators 1574 my $input_top = undef; 1575 if (@input_stack > 0) { 1576 $input_top = $myData{_TOP_STREAM} = $input_stack[-1]; 1577 $myData{_INFILE} = $input_top->name(); 1578 $myData{_INPUT} = $input_top->handle(); 1579 } else { 1580 delete $myData{_TOP_STREAM}; 1581 delete $myData{_INPUT_STREAMS}; 1582 } 1583 1584 return $input_top; 1585} 1586 1587############################################################################# 1588 1589=head1 TREE-BASED PARSING 1590 1591If straightforward stream-based parsing wont meet your needs (as is 1592likely the case for tasks such as translating PODs into structured 1593markup languages like HTML and XML) then you may need to take the 1594tree-based approach. Rather than doing everything in one pass and 1595calling the B<interpolate()> method to expand sequences into text, it 1596may be desirable to instead create a parse-tree using the B<parse_text()> 1597method to return a tree-like structure which may contain an ordered 1598list of children (each of which may be a text-string, or a similar 1599tree-like structure). 1600 1601Pay special attention to L<"METHODS FOR PARSING AND PROCESSING"> and 1602to the objects described in L<Pod::InputObjects>. The former describes 1603the gory details and parameters for how to customize and extend the 1604parsing behavior of B<Pod::Parser>. B<Pod::InputObjects> provides 1605several objects that may all be used interchangeably as parse-trees. The 1606most obvious one is the B<Pod::ParseTree> object. It defines the basic 1607interface and functionality that all things trying to be a POD parse-tree 1608should do. A B<Pod::ParseTree> is defined such that each "node" may be a 1609text-string, or a reference to another parse-tree. Each B<Pod::Paragraph> 1610object and each B<Pod::InteriorSequence> object also supports the basic 1611parse-tree interface. 1612 1613The B<parse_text()> method takes a given paragraph of text, and 1614returns a parse-tree that contains one or more children, each of which 1615may be a text-string, or an InteriorSequence object. There are also 1616callback-options that may be passed to B<parse_text()> to customize 1617the way it expands or transforms interior-sequences, as well as the 1618returned result. These callbacks can be used to create a parse-tree 1619with custom-made objects (which may or may not support the parse-tree 1620interface, depending on how you choose to do it). 1621 1622If you wish to turn an entire POD document into a parse-tree, that process 1623is fairly straightforward. The B<parse_text()> method is the key to doing 1624this successfully. Every paragraph-callback (i.e. the polymorphic methods 1625for B<command()>, B<verbatim()>, and B<textblock()> paragraphs) takes 1626a B<Pod::Paragraph> object as an argument. Each paragraph object has a 1627B<parse_tree()> method that can be used to get or set a corresponding 1628parse-tree. So for each of those paragraph-callback methods, simply call 1629B<parse_text()> with the options you desire, and then use the returned 1630parse-tree to assign to the given paragraph object. 1631 1632That gives you a parse-tree for each paragraph - so now all you need is 1633an ordered list of paragraphs. You can maintain that yourself as a data 1634element in the object/hash. The most straightforward way would be simply 1635to use an array-ref, with the desired set of custom "options" for each 1636invocation of B<parse_text>. Let's assume the desired option-set is 1637given by the hash C<%options>. Then we might do something like the 1638following: 1639 1640 package MyPodParserTree; 1641 1642 @ISA = qw( Pod::Parser ); 1643 1644 ... 1645 1646 sub begin_pod { 1647 my $self = shift; 1648 $self->{'-paragraphs'} = []; ## initialize paragraph list 1649 } 1650 1651 sub command { 1652 my ($parser, $command, $paragraph, $line_num, $pod_para) = @_; 1653 my $ptree = $parser->parse_text({%options}, $paragraph, ...); 1654 $pod_para->parse_tree( $ptree ); 1655 push @{ $self->{'-paragraphs'} }, $pod_para; 1656 } 1657 1658 sub verbatim { 1659 my ($parser, $paragraph, $line_num, $pod_para) = @_; 1660 push @{ $self->{'-paragraphs'} }, $pod_para; 1661 } 1662 1663 sub textblock { 1664 my ($parser, $paragraph, $line_num, $pod_para) = @_; 1665 my $ptree = $parser->parse_text({%options}, $paragraph, ...); 1666 $pod_para->parse_tree( $ptree ); 1667 push @{ $self->{'-paragraphs'} }, $pod_para; 1668 } 1669 1670 ... 1671 1672 package main; 1673 ... 1674 my $parser = new MyPodParserTree(...); 1675 $parser->parse_from_file(...); 1676 my $paragraphs_ref = $parser->{'-paragraphs'}; 1677 1678Of course, in this module-author's humble opinion, I'd be more inclined to 1679use the existing B<Pod::ParseTree> object than a simple array. That way 1680everything in it, paragraphs and sequences, all respond to the same core 1681interface for all parse-tree nodes. The result would look something like: 1682 1683 package MyPodParserTree2; 1684 1685 ... 1686 1687 sub begin_pod { 1688 my $self = shift; 1689 $self->{'-ptree'} = new Pod::ParseTree; ## initialize parse-tree 1690 } 1691 1692 sub parse_tree { 1693 ## convenience method to get/set the parse-tree for the entire POD 1694 (@_ > 1) and $_[0]->{'-ptree'} = $_[1]; 1695 return $_[0]->{'-ptree'}; 1696 } 1697 1698 sub command { 1699 my ($parser, $command, $paragraph, $line_num, $pod_para) = @_; 1700 my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...); 1701 $pod_para->parse_tree( $ptree ); 1702 $parser->parse_tree()->append( $pod_para ); 1703 } 1704 1705 sub verbatim { 1706 my ($parser, $paragraph, $line_num, $pod_para) = @_; 1707 $parser->parse_tree()->append( $pod_para ); 1708 } 1709 1710 sub textblock { 1711 my ($parser, $paragraph, $line_num, $pod_para) = @_; 1712 my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...); 1713 $pod_para->parse_tree( $ptree ); 1714 $parser->parse_tree()->append( $pod_para ); 1715 } 1716 1717 ... 1718 1719 package main; 1720 ... 1721 my $parser = new MyPodParserTree2(...); 1722 $parser->parse_from_file(...); 1723 my $ptree = $parser->parse_tree; 1724 ... 1725 1726Now you have the entire POD document as one great big parse-tree. You 1727can even use the B<-expand_seq> option to B<parse_text> to insert 1728whole different kinds of objects. Just don't expect B<Pod::Parser> 1729to know what to do with them after that. That will need to be in your 1730code. Or, alternatively, you can insert any object you like so long as 1731it conforms to the B<Pod::ParseTree> interface. 1732 1733One could use this to create subclasses of B<Pod::Paragraphs> and 1734B<Pod::InteriorSequences> for specific commands (or to create your own 1735custom node-types in the parse-tree) and add some kind of B<emit()> 1736method to each custom node/subclass object in the tree. Then all you'd 1737need to do is recursively walk the tree in the desired order, processing 1738the children (most likely from left to right) by formatting them if 1739they are text-strings, or by calling their B<emit()> method if they 1740are objects/references. 1741 1742=head1 SEE ALSO 1743 1744L<Pod::InputObjects>, L<Pod::Select> 1745 1746B<Pod::InputObjects> defines POD input objects corresponding to 1747command paragraphs, parse-trees, and interior-sequences. 1748 1749B<Pod::Select> is a subclass of B<Pod::Parser> which provides the ability 1750to selectively include and/or exclude sections of a POD document from being 1751translated based upon the current heading, subheading, subsubheading, etc. 1752 1753=for __PRIVATE__ 1754B<Pod::Callbacks> is a subclass of B<Pod::Parser> which gives its users 1755the ability the employ I<callback functions> instead of, or in addition 1756to, overriding methods of the base class. 1757 1758=for __PRIVATE__ 1759B<Pod::Select> and B<Pod::Callbacks> do not override any 1760methods nor do they define any new methods with the same name. Because 1761of this, they may I<both> be used (in combination) as a base class of 1762the same subclass in order to combine their functionality without 1763causing any namespace clashes due to multiple inheritance. 1764 1765=head1 AUTHOR 1766 1767Please report bugs using L<http://rt.cpan.org>. 1768 1769Brad Appleton E<lt>bradapp@enteract.comE<gt> 1770 1771Based on code for B<Pod::Text> written by 1772Tom Christiansen E<lt>tchrist@mox.perl.comE<gt> 1773 1774=cut 1775 17761; 1777