=head1 NAME

Pod::Simple - framework for parsing Pod

=head1 SYNOPSIS

 TODO

=head1 DESCRIPTION

Pod::Simple is a Perl library for parsing text in the Pod ("plain old
documentation") markup language that is typically used for writing
documentation for Perl and for Perl modules. The Pod format is explained
in L<perlpod>; the most common formatter is called C<perldoc>.

Be sure to read L</ENCODING> if your Pod contains non-ASCII characters.

Pod formatters can use Pod::Simple to parse Pod documents and render them into
plain text, HTML, or any number of other formats. Typically, such formatters
will be subclasses of Pod::Simple, and so they will inherit its methods, like
C<parse_file>.

If you're reading this document just because you have a Pod-processing
subclass that you want to use, this document (plus the documentation for the
subclass) is probably all you need to read.

If you're reading this document because you want to write a formatter
subclass, continue reading it and then read L<Pod::Simple::Subclassing>, and
then possibly even read L<perlpodspec> (some of which is for parser-writers,
but much of which is notes to formatter-writers).

=head1 MAIN METHODS

=over

=item C<< $parser = I<SomeClass>->new(); >>

This returns a new parser object, where I<C<SomeClass>> is a subclass
of Pod::Simple.

=item C<< $parser->output_fh( *OUT ); >>

This sets the filehandle that C<$parser>'s output will be written to.
You can pass C<*STDOUT> or C<*STDERR>; otherwise, you should probably do
something like this:

    my $outfile = "output.txt";
    open TXTOUT, '>', $outfile or die "Can't write to $outfile: $!";
    $parser->output_fh(*TXTOUT);

...before you call one of the C<< $parser->parse_I<whatever> >> methods.

=item C<< $parser->output_string( \$somestring ); >>

This sets the string that C<$parser>'s output will be sent to,
instead of any filehandle.


=item C<< $parser->parse_file( I<$some_filename> ); >>

=item C<< $parser->parse_file( *INPUT_FH ); >>

This reads the Pod content of the file (or filehandle) that you specify,
and processes it with that C<$parser> object, according to however
C<$parser>'s class works, and according to whatever parser options you
have set up for this C<$parser> object.

=item C<< $parser->parse_string_document( I<$all_content> ); >>

This works just like C<parse_file> except that it reads the Pod
content not from a file, but from a string that you have already
in memory.

=item C<< $parser->parse_lines( I<...@lines...>, undef ); >>

This processes the lines in C<@lines> (where each list item must be a
defined value, and must contain exactly one line of content -- so no
items like C<"foo\nbar"> are allowed).  The final C<undef> is used to
indicate the end of the document being parsed.

The other C<parse_I<whatever>> methods are meant to be called only once
per C<$parser> object; but C<parse_lines> can be called as many times per
C<$parser> object as you want, as long as the last call (and only
the last call) ends with an C<undef> value.


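As a rough sketch of feeding a document to C<parse_lines> in pieces, the
following assumes the C<Pod::Simple::Text> formatter that ships with
Pod::Simple. The sample lines and the C<$out> variable are made up for
illustration:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    my $parser = Pod::Simple::Text->new;
    my $out    = '';
    $parser->output_string( \$out );

    # Each item is exactly one line; an empty item ends a paragraph.
    $parser->parse_lines( '=head1 NAME', '' );

    # Only the last call ends with undef, marking the end of the document.
    $parser->parse_lines( 'Demo - fed to the parser in pieces', '', undef );

    print $out;
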
=item C<< $parser->content_seen >>

This returns true only if there has been any real content seen for this
document. Returns false in cases where the document contains content,
but does not make use of any Pod markup.

=item C<< I<SomeClass>->filter( I<$filename> ); >>

=item C<< I<SomeClass>->filter( I<*INPUT_FH> ); >>

=item C<< I<SomeClass>->filter( I<\$document_content> ); >>

This is a shortcut method for creating a new parser object, setting the
output handle to STDOUT, and then processing the specified file (or
filehandle, or in-memory document). This is handy for one-liners like
this:

  perl -MPod::Simple::Text -e "Pod::Simple::Text->filter('thingy.pod')"

=back



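Taken together, the main methods above might be used roughly as follows.
This is only a sketch: it assumes the C<Pod::Simple::Text> formatter as the
subclass, and the sample document and variable names are invented for the
example:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    my $pod = "=head1 NAME\n\nDemo - a tiny document\n\n=cut\n";

    my $parser = Pod::Simple::Text->new;
    my $text   = '';
    $parser->output_string( \$text );    # choose the output before parsing
    $parser->parse_string_document($pod);

    if ( $parser->content_seen ) {
        print $text;
    }
    else {
        warn "No Pod content found\n";
    }
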
=head1 SECONDARY METHODS

Some of these methods might be of interest to general users, as
well as of interest to formatter-writers.

Note that the general pattern here is that the accessor-methods
read the attribute's value with C<< $value = $parser->I<attribute> >>
and set the attribute's value with
C<< $parser->I<attribute>(I<newvalue>) >>.  For each accessor, I typically
only mention one syntax or another, based on which I think you are actually
most likely to use.


=over

=item C<< $parser->parse_characters( I<SOMEVALUE> ) >>

The Pod parser normally expects to read octets and to convert those octets
to characters based on the C<=encoding> declaration in the Pod source.  Set
this option to a true value to indicate that the Pod source is already a Perl
character stream.  This tells the parser to ignore any C<=encoding> command
and to skip all the code paths involving decoding octets.

=item C<< $parser->no_whining( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will suppress the
parser's complaints about irregularities in the Pod coding. By default,
this attribute's value is false, meaning that irregularities will
be reported.

Note that setting this attribute to true won't suppress one or two kinds
of complaints about rarely occurring unrecoverable errors.


=item C<< $parser->no_errata_section( I<SOMEVALUE> ) >>

If you set this attribute to a true value, you will stop the parser from
generating a "POD ERRORS" section at the end of the document. By
default, this attribute's value is false, meaning that an errata section
will be generated, as necessary.


=item C<< $parser->complain_stderr( I<SOMEVALUE> ) >>

If you set this attribute to a true value, it will send reports of
parsing errors to STDERR. By default, this attribute's value is false,
meaning that no output is sent to STDERR.

Setting C<complain_stderr> also sets C<no_errata_section>.


=item C<< $parser->source_filename >>

This returns the filename that this parser object was set to read from.


=item C<< $parser->doc_has_started >>

This returns true if C<$parser> has read from a source, and has seen
Pod content in it.


=item C<< $parser->source_dead >>

This returns true if C<$parser> has read from a source, and come to the
end of that source.

=item C<< $parser->strip_verbatim_indent( I<SOMEVALUE> ) >>

The perlpod spec for a Verbatim paragraph is "It should be reproduced
exactly...", which means that the whitespace you've used to indent your
verbatim blocks will be preserved in the output. This can be annoying for
outputs such as HTML, where that whitespace will remain in front of every
line. It's an unfortunate case where syntax is turned into semantics.

If the POD you're parsing adheres to a consistent indentation policy, you can
have such indentation stripped from the beginning of every line of your
verbatim blocks. This method tells Pod::Simple what to strip. For two-space
indents, you'd use:

  $parser->strip_verbatim_indent('  ');

For tab indents, you'd use a tab character:

  $parser->strip_verbatim_indent("\t");

If the POD is inconsistent about the indentation of verbatim blocks, but you
have figured out a heuristic to determine how much a particular verbatim block
is indented, you can pass a code reference instead. The code reference will be
executed with one argument, an array reference of all the lines in the
verbatim block, and should return the value to be stripped from each line. For
example, if you decide that you're fine to use the first line of the verbatim
block to set the standard for indentation of the rest of the block, you can
look at the first line and return the appropriate value, like so:

  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      # Everything before the first non-whitespace character is the indent.
      (my $indent = $lines->[0]) =~ s/\S.*//;
      return $indent;
  });

If you'd rather treat each line individually, you can do that, too, by just
transforming them in-place in the code reference and returning C<undef>. Say
that you don't want I<any> lines indented. You can do something like this:

  $parser->strip_verbatim_indent(sub {
      my $lines = shift;
      # Remove leading whitespace from every line, in place.
      s/^\s+// for @{ $lines };
      return undef;
  });

=back

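To try C<strip_verbatim_indent> end to end, something like the following
may help. It is a sketch that assumes the C<Pod::Simple::XHTML> formatter
from the same distribution, with its C<html_header> and C<html_footer>
attributes set to empty strings to keep the output short; the sample
document is invented:

    use strict;
    use warnings;
    use Pod::Simple::XHTML;

    # A document whose verbatim block is indented by four spaces.
    my $pod = join "\n",
        '=head1 EXAMPLE',
        '',
        '    print "hello\n";',
        '    exit 0;',
        '',
        '=cut',
        '';

    my $parser = Pod::Simple::XHTML->new;
    $parser->html_header('');
    $parser->html_footer('');
    $parser->strip_verbatim_indent('    ');    # four spaces

    my $html = '';
    $parser->output_string( \$html );
    $parser->parse_string_document($pod);
    print $html;
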
=head1 TERTIARY METHODS

=over

=item C<< $parser->abandon_output_fh() >>X<abandon_output_fh>

Cancel output to the file handle. Any POD read by the C<$parser> is not
affected.

=item C<< $parser->abandon_output_string() >>X<abandon_output_string>

Cancel output to the output string. Any POD read by the C<$parser> is not
affected.

=item C<< $parser->accept_code( @codes ) >>X<accept_code>

Alias for L<< accept_codes >>.

=item C<< $parser->accept_codes( @codes ) >>X<accept_codes>

Allows C<$parser> to accept a list of L<perlpod/Formatting Codes>. This can be
used to implement user-defined codes.

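For instance, a user-defined code could be enabled along these lines. This
is a sketch only: the C<N> code is hypothetical, and
C<Pod::Simple::DumpAsXML> is used here simply because it makes the parsed
structure visible:

    use strict;
    use warnings;
    use Pod::Simple::DumpAsXML;

    my $parser = Pod::Simple::DumpAsXML->new;
    $parser->accept_codes('N');    # hypothetical user-defined code

    my $xml = '';
    $parser->output_string( \$xml );
    $parser->parse_string_document(
        "=head1 NAME\n\nDemo - with N<a footnote> in it\n"
    );

    # The N<...> span should appear as its own element in the dump.
    print $xml;
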
=item C<< $parser->accept_directive_as_data( @directives ) >>X<accept_directive_as_data>

Allows C<$parser> to accept a list of directives for data paragraphs. A
directive is the label of a L<perlpod/Command Paragraph>. A data paragraph is
one delimited by C<< =begin/=for/=end >> directives. This can be used to
implement user-defined directives.

=item C<< $parser->accept_directive_as_processed( @directives ) >>X<accept_directive_as_processed>

Allows C<$parser> to accept a list of directives for processed paragraphs. A
directive is the label of a L<perlpod/Command Paragraph>. A processed
paragraph is also known as L<perlpod/Ordinary Paragraph>. This can be used to
implement user-defined directives.

=item C<< $parser->accept_directive_as_verbatim( @directives ) >>X<accept_directive_as_verbatim>

Allows C<$parser> to accept a list of directives for L<perlpod/Verbatim
Paragraph>. A directive is the label of a L<perlpod/Command Paragraph>. This
can be used to implement user-defined directives.

=item C<< $parser->accept_target( @targets ) >>X<accept_target>

Alias for L<< accept_targets >>.

=item C<< $parser->accept_target_as_text( @targets ) >>X<accept_target_as_text>

Alias for L<< accept_targets_as_text >>.

=item C<< $parser->accept_targets( @targets ) >>X<accept_targets>

Accepts targets for C<< =begin/=for/=end >> sections of the POD.

=item C<< $parser->accept_targets_as_text( @targets ) >>X<accept_targets_as_text>

Accepts targets for C<< =begin/=for/=end >> sections that should be parsed as
POD. For details, see L<< perlpodspec/About Data Paragraphs >>.

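A sketch of accepting a custom target might look like this. The target name
C<mydata> and the sample document are made up, and
C<Pod::Simple::DumpAsXML> is used only to show what the parser kept:

    use strict;
    use warnings;
    use Pod::Simple::DumpAsXML;

    my $pod = join "\n",
        '=head1 NAME',
        '',
        'Demo - custom target',
        '',
        '=begin mydata',
        '',
        'raw payload for my formatter',
        '',
        '=end mydata',
        '';

    my $parser = Pod::Simple::DumpAsXML->new;
    $parser->accept_targets('mydata');    # hypothetical target name

    my $xml = '';
    $parser->output_string( \$xml );
    $parser->parse_string_document($pod);

    # Regions for targets that were not accepted are skipped by the parser.
    print $xml;
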
=item C<< $parser->any_errata_seen() >>X<any_errata_seen>

Returns true if any errata were seen.

I<Example:>

  die "too many errors\n" if $parser->any_errata_seen();

=item C<< $parser->errata_seen() >>X<errata_seen>

Returns a hash reference of all errata seen, both whines and screams. The
hash reference's keys are line numbers, and each value is an array reference
of the errors for that line.

I<Example:>

  if ( $parser->any_errata_seen() ) {
     $logger->log( $parser->errata_seen() );
  }

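Since the return value is keyed by line number, one way to report the
errata in document order is sketched below. It assumes the
C<Pod::Simple::Text> formatter, and the C<=bogus> directive is there only
to provoke a complaint:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    my $parser = Pod::Simple::Text->new;
    my $text   = '';
    $parser->output_string( \$text );
    $parser->parse_string_document(
        "=head1 NAME\n\nDemo\n\n=bogus directive\n"
    );

    # Walk the errata hash in line-number order.
    my $errata = $parser->errata_seen;
    for my $line ( sort { $a <=> $b } keys %$errata ) {
        print "line $line: $_\n" for @{ $errata->{$line} };
    }
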
=item C<< $parser->detected_encoding() >>X<detected_encoding>

Returns the encoding corresponding to C<< =encoding >>, but only if the
encoding was recognized and handled.

=item C<< $parser->encoding() >>X<encoding>

Returns the encoding of the document, even if the encoding is not correctly
handled.

=item C<< $parser->parse_from_file( $source, $to ) >>X<parse_from_file>

Parses from C<$source> file to C<$to> file. Similar to L<<
Pod::Parser/parse_from_file >>.

=item C<< $parser->scream( @error_messages ) >>X<scream>

Logs an error that can't be ignored.

=item C<< $parser->unaccept_code( @codes ) >>X<unaccept_code>

Alias for L<< unaccept_codes >>.

=item C<< $parser->unaccept_codes( @codes ) >>X<unaccept_codes>

Removes C<< @codes >> as valid codes for the parse.

=item C<< $parser->unaccept_directive( @directives ) >>X<unaccept_directive>

Alias for L<< unaccept_directives >>.

=item C<< $parser->unaccept_directives( @directives ) >>X<unaccept_directives>

Removes C<< @directives >> as valid directives for the parse.

=item C<< $parser->unaccept_target( @targets ) >>X<unaccept_target>

Alias for L<< unaccept_targets >>.

=item C<< $parser->unaccept_targets( @targets ) >>X<unaccept_targets>

Removes C<< @targets >> as valid targets for the parse.

=item C<< $parser->version_report() >>X<version_report>

Returns a string describing the version.

=item C<< $parser->whine( @error_messages ) >>X<whine>

Logs an error, unless C<no_whining> has been set to a true value.

=back

=head1 ENCODING

The Pod::Simple parser expects to read B<octets>.  The parser will decode the
octets into Perl's internal character string representation using the value of
the C<=encoding> declaration in the POD source.

If the POD source does not include an C<=encoding> declaration, the parser will
attempt to guess the encoding (selecting one of UTF-8 or CP 1252) by examining
the first non-ASCII bytes and applying the heuristic described in
L<perlpodspec>.  (If the POD source contains only ASCII bytes, the
encoding is assumed to be ASCII.)

If you set the C<parse_characters> option to a true value, the parser will
expect characters rather than octets, will ignore any C<=encoding>, and will
make no attempt to decode the input.

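As an illustration of the C<parse_characters> case, the sketch below hands
the parser a string that already holds characters. It assumes the
C<Pod::Simple::Text> formatter; the sample text is invented:

    use strict;
    use warnings;
    use Pod::Simple::Text;

    # Already-decoded character data; \x{e9} is e-acute.
    my $pod = "=head1 NAME\n\nCaf\x{e9} - a demo\n";

    my $parser = Pod::Simple::Text->new;
    $parser->parse_characters(1);    # input is characters, not octets

    my $text = '';
    $parser->output_string( \$text );
    $parser->parse_string_document($pod);

    # The result should hold characters too, so encode on the way out.
    binmode STDOUT, ':encoding(UTF-8)';
    print $text;
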
=head1 SEE ALSO

L<Pod::Simple::Subclassing>

L<perlpod|perlpod>

L<perlpodspec|perlpodspec>

L<Pod::Escapes|Pod::Escapes>

L<perldoc>

=head1 SUPPORT

Questions or discussion about POD and Pod::Simple should be sent to the
pod-people@perl.org mailing list. Send an empty email to
pod-people-subscribe@perl.org to subscribe.

This module is managed in an open GitHub repository,
L<https://github.com/perl-pod/pod-simple/>. Feel free to fork and contribute, or
to clone L<git://github.com/perl-pod/pod-simple.git> and send patches!

Patches against Pod::Simple are welcome. Please send bug reports to
<bug-pod-simple@rt.cpan.org>.

=head1 COPYRIGHT AND DISCLAIMERS

Copyright (c) 2002 Sean M. Burke.

This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

This program is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.

=head1 AUTHOR

Pod::Simple was created by Sean M. Burke <sburke@cpan.org>.
But don't bother him, he's retired.

Pod::Simple is maintained by:

=over

=item * Allison Randal C<allison@perl.org>

=item * Hans Dieter Pearcey C<hdp@cpan.org>

=item * David E. Wheeler C<dwheeler@cpan.org>

=back

Documentation has been contributed by:

=over

=item * Gabor Szabo C<szabgab@gmail.com>

=item * Shawn H Corey C<SHCOREY at cpan.org>

=back

=cut

435