=head1 NAME

Pod::Simple - framework for parsing Pod

=head1 SYNOPSIS

 TODO

=head1 DESCRIPTION

Pod::Simple is a Perl library for parsing text in the Pod ("plain old
documentation") markup language that is typically used for writing
documentation for Perl and for Perl modules. The Pod format is explained
in L<perlpod>; the most common formatter is called C<perldoc>.

Be sure to read L</ENCODING> if your Pod contains non-ASCII characters.

Pod formatters can use Pod::Simple to parse Pod documents and render them into
plain text, HTML, or any number of other formats. Typically, such formatters
will be subclasses of Pod::Simple, and so they will inherit its methods, like
C<parse_file>.

If you're reading this document just because you have a Pod-processing
subclass that you want to use, this document (plus the documentation for the
subclass) is probably all you need to read.

If you're reading this document because you want to write a formatter
subclass, continue reading it and then read L<Pod::Simple::Subclassing>, and
then possibly even read L<perlpodspec> (some of which is for parser-writers,
but much of which is notes to formatter-writers).

=head1 MAIN METHODS

=over

=item C<< $parser = I<SomeClass>->new(); >>

This returns a new parser object, where I<C<SomeClass>> is a subclass
of Pod::Simple.

=item C<< $parser->output_fh( *OUT ); >>

This sets the filehandle that C<$parser>'s output will be written to.
You can pass C<*STDOUT> or C<*STDERR>; otherwise, you should probably do
something like this:

    my $outfile = "output.txt";
    open TXTOUT, '>', $outfile or die "Can't write to $outfile: $!";
    $parser->output_fh(*TXTOUT);

...before you call one of the C<< $parser->parse_I<whatever> >> methods.

=item C<< $parser->output_string( \$somestring ); >>

This sets the string that C<$parser>'s output will be sent to,
instead of any filehandle.
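
As a minimal sketch of capturing output in a string: the example below
assumes that the L<Pod::Simple::Text> formatter subclass is installed, and it
uses C<parse_string_document> to supply the input. The variable names and the
sample Pod are illustrative only.

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # Illustrative input; any Pod document would do.
    my $pod = "=head1 NAME\n\nExample - a tiny document\n";

    my $rendered = '';
    my $parser   = Pod::Simple::Text->new;

    # Send the formatter's output to $rendered instead of a filehandle.
    # This has to happen before the parse_* call.
    $parser->output_string( \$rendered );
    $parser->parse_string_document($pod);

    print $rendered;
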

=item C<< $parser->parse_file( I<$some_filename> ); >>

=item C<< $parser->parse_file( *INPUT_FH ); >>

This reads the Pod content of the file (or filehandle) that you specify,
and processes it with that C<$parser> object, according to however
C<$parser>'s class works, and according to whatever parser options you
have set up for this C<$parser> object.

=item C<< $parser->parse_string_document( I<$all_content> ); >>

This works just like C<parse_file> except that it reads the Pod
content not from a file, but from a string that you already have
in memory.

=item C<< $parser->parse_lines( I<...@lines...>, undef ); >>

This processes the lines in C<@lines> (where each list item must be a
defined value, and must contain exactly one line of content -- so no
items like C<"foo\nbar"> are allowed). The final C<undef> is used to
indicate the end of the document being parsed.

The other C<parse_I<whatever>> methods are meant to be called only once
per C<$parser> object; but C<parse_lines> can be called as many times per
C<$parser> object as you want, as long as the last call (and only
the last call) ends with an C<undef> value.

=item C<< $parser->content_seen >>

This returns true only if there has been any real content seen for this
document. It returns false in cases where the document contains content,
but does not make use of any Pod markup.

=item C<< I<SomeClass>->filter( I<$filename> ); >>

=item C<< I<SomeClass>->filter( I<*INPUT_FH> ); >>

=item C<< I<SomeClass>->filter( I<\$document_content> ); >>

This is a shortcut method for creating a new parser object, setting the
output handle to STDOUT, and then processing the specified file (or
filehandle, or in-memory document).
This is handy for one-liners like
this:

  perl -MPod::Simple::Text -e "Pod::Simple::Text->filter('thingy.pod')"

=back

=head1 SECONDARY METHODS

Some of these methods might be of interest to general users as
well as to formatter-writers.

Note that the general pattern here is that the accessor-methods
read the attribute's value with C<< $value = $parser->I<attribute> >>
and set the attribute's value with
C<< $parser->I<attribute>(I<newvalue>) >>. For each accessor, I typically
only mention one syntax or another, based on which I think you are actually
most likely to use.

=over

=item C<< $parser->parse_characters( I<SOMEVALUE> ) >>

The Pod parser normally expects to read octets and to convert those octets
to characters based on the C<=encoding> declaration in the Pod source. Set
this option to a true value to indicate that the Pod source is already a Perl
character stream. This tells the parser to ignore any C<=encoding> command
and to skip all the code paths involving decoding octets.

=item C<< $parser->no_whining( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will suppress the
parser's complaints about irregularities in the Pod coding. By default,
this attribute's value is false, meaning that irregularities will
be reported.

Note that setting this attribute to true won't suppress one or two kinds
of complaints about rarely occurring unrecoverable errors.

=item C<< $parser->no_errata_section( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will stop the parser from
generating a "POD ERRORS" section at the end of the document. By
default, this attribute's value is false, meaning that an errata section
will be generated, as necessary.
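
The effect of C<no_errata_section> can be sketched as follows. This example
assumes that the L<Pod::Simple::Text> formatter is installed; the helper
C<render> and the deliberately malformed sample Pod (an C<=over> with no
matching C<=back>) are made up for illustration and are not part of
Pod::Simple.

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # Deliberately malformed: the =over list is never closed with =back.
    my $pod = "=head1 NAME\n\nDemo\n\n=over\n\n=item one\n\nText\n";

    # Hypothetical helper: render $pod, optionally without the errata section.
    sub render {
        my ($source, $suppress) = @_;
        my $out    = '';
        my $parser = Pod::Simple::Text->new;
        $parser->no_errata_section(1) if $suppress;
        $parser->output_string( \$out );
        $parser->parse_string_document($source);
        return $out;
    }

    my $with    = render( $pod, 0 );
    my $without = render( $pod, 1 );

    print "default output has a POD ERRORS section\n"
        if $with =~ /POD ERRORS/;
    print "suppressed output has no POD ERRORS section\n"
        if $without !~ /POD ERRORS/;
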

=item C<< $parser->complain_stderr( I<SOMEVALUE> ) >>

If you set this attribute to a true value, it will send reports of
parsing errors to STDERR. By default, this attribute's value is false,
meaning that no output is sent to STDERR.

Setting C<complain_stderr> also sets C<no_errata_section>.

=item C<< $parser->source_filename >>

This returns the filename that this parser object was set to read from.

=item C<< $parser->doc_has_started >>

This returns true if C<$parser> has read from a source, and has seen
Pod content in it.

=item C<< $parser->source_dead >>

This returns true if C<$parser> has read from a source, and come to the
end of that source.

=item C<< $parser->strip_verbatim_indent( I<SOMEVALUE> ) >>

The perlpod spec for a Verbatim paragraph is "It should be reproduced
exactly...", which means that the whitespace you've used to indent your
verbatim blocks will be preserved in the output. This can be annoying for
outputs such as HTML, where that whitespace will remain in front of every
line. It's an unfortunate case where syntax is turned into semantics.

If the POD you're parsing adheres to a consistent indentation policy, you can
have such indentation stripped from the beginning of every line of your
verbatim blocks. This method tells Pod::Simple what to strip. For two-space
indents, you'd use:

    $parser->strip_verbatim_indent('  ');

For tab indents, you'd use a tab character:

    $parser->strip_verbatim_indent("\t");

If the POD is inconsistent about the indentation of verbatim blocks, but you
have figured out a heuristic to determine how much a particular verbatim block
is indented, you can pass a code reference instead. The code reference will be
executed with one argument, an array reference of all the lines in the
verbatim block, and should return the value to be stripped from each line.
For
example, if you decide that you're fine to use the first line of the verbatim
block to set the standard for indentation of the rest of the block, you can
look at the first line and return the appropriate value, like so:

    $parser->strip_verbatim_indent(sub {
        my $lines = shift;
        # Everything before the first non-whitespace character is the indent.
        (my $indent = $lines->[0]) =~ s/\S.*//;
        return $indent;
    });

If you'd rather treat each line individually, you can do that, too, by just
transforming them in-place in the code reference and returning C<undef>. Say
that you don't want I<any> lines indented. You can do something like this:

    $parser->strip_verbatim_indent(sub {
        my $lines = shift;
        # Strip leading whitespace from every line, in place.
        s/^\s+// for @{ $lines };
        return undef;
    });

=back

=head1 TERTIARY METHODS

=over

=item C<< $parser->abandon_output_fh() >>X<abandon_output_fh>

Cancel output to the file handle. Any POD read by the C<$parser> is not
affected.

=item C<< $parser->abandon_output_string() >>X<abandon_output_string>

Cancel output to the output string. Any POD read by the C<$parser> is not
affected.

=item C<< $parser->accept_code( @codes ) >>X<accept_code>

Alias for L<< accept_codes >>.

=item C<< $parser->accept_codes( @codes ) >>X<accept_codes>

Allows C<$parser> to accept a list of L<perlpod/Formatting Codes>. This can be
used to implement user-defined codes.

=item C<< $parser->accept_directive_as_data( @directives ) >>X<accept_directive_as_data>

Allows C<$parser> to accept a list of directives for data paragraphs. A
directive is the label of a L<perlpod/Command Paragraph>. A data paragraph is
one delimited by C<< =begin/=for/=end >> directives. This can be used to
implement user-defined directives.

=item C<< $parser->accept_directive_as_processed( @directives ) >>X<accept_directive_as_processed>

Allows C<$parser> to accept a list of directives for processed paragraphs.
A
directive is the label of a L<perlpod/Command Paragraph>. A processed
paragraph is also known as an L<perlpod/Ordinary Paragraph>. This can be used
to implement user-defined directives.

=item C<< $parser->accept_directive_as_verbatim( @directives ) >>X<accept_directive_as_verbatim>

Allows C<$parser> to accept a list of directives for L<perlpod/Verbatim
Paragraph>. A directive is the label of a L<perlpod/Command Paragraph>. This
can be used to implement user-defined directives.

=item C<< $parser->accept_target( @targets ) >>X<accept_target>

Alias for L<< accept_targets >>.

=item C<< $parser->accept_target_as_text( @targets ) >>X<accept_target_as_text>

Alias for L<< accept_targets_as_text >>.

=item C<< $parser->accept_targets( @targets ) >>X<accept_targets>

Accepts targets for C<< =begin/=for/=end >> sections of the POD.

=item C<< $parser->accept_targets_as_text( @targets ) >>X<accept_targets_as_text>

Accepts targets for C<< =begin/=for/=end >> sections that should be parsed as
POD. For details, see L<< perlpodspec/About Data Paragraphs >>.

=item C<< $parser->any_errata_seen() >>X<any_errata_seen>

Returns true if any errata were seen.

I<Example:>

    die "too many errors\n" if $parser->any_errata_seen();

=item C<< $parser->errata_seen() >>X<errata_seen>

Returns a hash reference of all errata seen, both whines and screams. The
hash reference's keys are line numbers, and each value is an array reference
of the errors for that line.

I<Example:>

    if ( $parser->any_errata_seen() ) {
        $logger->log( $parser->errata_seen() );
    }

=item C<< $parser->detected_encoding() >>X<detected_encoding>

Returns the encoding corresponding to C<< =encoding >>, but only if the
encoding was recognized and handled.
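
A brief sketch of checking the detected encoding after a parse. It assumes
that the L<Pod::Simple::Text> formatter is installed; the sample Pod and the
variable names are illustrative, and the exact string returned may depend on
how the C<=encoding> line is written.

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # Illustrative document that declares its encoding.
    my $pod = "=encoding UTF-8\n\n=head1 NAME\n\nDemo\n";

    my $out    = '';
    my $parser = Pod::Simple::Text->new;
    $parser->output_string( \$out );
    $parser->parse_string_document($pod);

    # Only defined if the declared encoding was recognized and handled.
    my $detected = $parser->detected_encoding;
    print defined $detected
        ? "Recognized encoding: $detected\n"
        : "No recognized =encoding declaration\n";
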

=item C<< $parser->encoding() >>X<encoding>

Returns the encoding of the document, even if the encoding is not correctly
handled.

=item C<< $parser->parse_from_file( $source, $to ) >>X<parse_from_file>

Parses from C<$source> file to C<$to> file. Similar to L<<
Pod::Parser/parse_from_file >>.

=item C<< $parser->scream( @error_messages ) >>X<scream>

Logs an error that can't be ignored.

=item C<< $parser->unaccept_code( @codes ) >>X<unaccept_code>

Alias for L<< unaccept_codes >>.

=item C<< $parser->unaccept_codes( @codes ) >>X<unaccept_codes>

Removes C<< @codes >> as valid codes for the parse.

=item C<< $parser->unaccept_directive( @directives ) >>X<unaccept_directive>

Alias for L<< unaccept_directives >>.

=item C<< $parser->unaccept_directives( @directives ) >>X<unaccept_directives>

Removes C<< @directives >> as valid directives for the parse.

=item C<< $parser->unaccept_target( @targets ) >>X<unaccept_target>

Alias for L<< unaccept_targets >>.

=item C<< $parser->unaccept_targets( @targets ) >>X<unaccept_targets>

Removes C<< @targets >> as valid targets for the parse.

=item C<< $parser->version_report() >>X<version_report>

Returns a string describing the version.

=item C<< $parser->whine( @error_messages ) >>X<whine>

Logs an error, unless C<no_whining> has been set to a true value.

=back

=head1 ENCODING

The Pod::Simple parser expects to read B<octets>. The parser will decode the
octets into Perl's internal character string representation using the value of
the C<=encoding> declaration in the POD source.

If the POD source does not include an C<=encoding> declaration, the parser will
attempt to guess the encoding (selecting one of UTF-8 or CP 1252) by examining
the first non-ASCII bytes and applying the heuristic described in
L<perlpodspec>.
(If the POD source contains only ASCII bytes, the
encoding is assumed to be ASCII.)

If you set the C<parse_characters> option to a true value, the parser will
expect characters rather than octets, will ignore any C<=encoding>, and will
make no attempt to decode the input.

=head1 SEE ALSO

L<Pod::Simple::Subclassing>

L<perlpod|perlpod>

L<perlpodspec|perlpodspec>

L<Pod::Escapes|Pod::Escapes>

L<perldoc>

=head1 SUPPORT

Questions or discussion about POD and Pod::Simple should be sent to the
pod-people@perl.org mailing list. Send an empty email to
pod-people-subscribe@perl.org to subscribe.

This module is managed in an open GitHub repository,
L<https://github.com/perl-pod/pod-simple/>. Feel free to fork and contribute, or
to clone L<git://github.com/perl-pod/pod-simple.git> and send patches!

Patches against Pod::Simple are welcome. Please send bug reports to
<bug-pod-simple@rt.cpan.org>.

=head1 COPYRIGHT AND DISCLAIMERS

Copyright (c) 2002 Sean M. Burke.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

This program is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.

=head1 AUTHOR

Pod::Simple was created by Sean M. Burke <sburke@cpan.org>.
But don't bother him; he's retired.

Pod::Simple is maintained by:

=over

=item * Allison Randal C<allison@perl.org>

=item * Hans Dieter Pearcey C<hdp@cpan.org>

=item * David E. Wheeler C<dwheeler@cpan.org>

=back

Documentation has been contributed by:

=over

=item * Gabor Szabo C<szabgab@gmail.com>

=item * Shawn H Corey C<SHCOREY at cpan.org>

=back

=cut