=head1 NAME

perliol - C API for Perl's implementation of IO in Layers.

=head1 SYNOPSIS

    /* Defining a layer ... */
    #include <perliol.h>

=head1 DESCRIPTION

This document describes the behavior and implementation of the PerlIO
abstraction described in L<perlapio> when C<USE_PERLIO> is defined.

=head2 History and Background

The PerlIO abstraction was introduced in perl5.003_02 but languished as
just an abstraction until perl5.7.0. However during that time a number
of perl extensions switched to using it, so the API is mostly fixed to
maintain (source) compatibility.

The aim of the implementation is to provide the PerlIO API in a flexible
and platform neutral manner. It is also a trial of an "Object Oriented
C, with vtables" approach which may be applied to Perl 6.

=head2 Basic Structure

PerlIO is a stack of layers.

The low levels of the stack work with the low-level operating system
calls (file descriptors in C) getting bytes in and out, the higher
layers of the stack buffer, filter, and otherwise manipulate the I/O,
and return characters (or bytes) to Perl. The terms I<above> and
I<below> are used to refer to the relative positioning of the stack
layers.

A layer contains a "vtable", the table of I/O operations (at C level
a table of function pointers), and status flags. The functions in the
vtable implement operations like "open", "read", and "write".

When I/O, for example "read", is requested, the request goes from Perl
first down the stack using "read" functions of each layer, then at the
bottom the input is requested from the operating system services, then
the result is returned up the stack, finally being interpreted as Perl
data.

The requests do not necessarily go all the way down to the operating
system: that's where PerlIO buffering comes into play.
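
As a rough illustration of the request flow described above, the
following stand-alone sketch models a two-layer stack in plain C. It
does not use the real PerlIO API: C<toy_layer>, C<toy_os_read>,
C<toy_buf_read> and the 4-byte buffer are hypothetical names and values
chosen for the example. It only shows how a buffering layer can satisfy
some reads without going all the way down.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in perlio.c. */
typedef struct toy_layer toy_layer;
struct toy_layer {
    toy_layer *next;   /* layer below, NULL at the bottom          */
    long (*read)(toy_layer *self, char *out, size_t count);
    const char *src;   /* bottom layer: pretend "file" contents    */
    size_t src_pos;    /* bottom layer: current offset into src    */
    int    os_calls;   /* bottom layer: number of "system calls"   */
    char   buf[4];     /* buffer layer: small read buffer          */
    size_t buf_len;    /* buffer layer: valid bytes in buf         */
    size_t buf_pos;    /* buffer layer: next unread byte in buf    */
};

/* Bottom layer: stands in for read(2) on a file descriptor. */
static long toy_os_read(toy_layer *self, char *out, size_t count)
{
    size_t left = strlen(self->src) - self->src_pos;
    size_t n = count < left ? count : left;
    memcpy(out, self->src + self->src_pos, n);
    self->src_pos += n;
    self->os_calls++;
    return (long)n;
}

/* Buffering layer: refills from the layer below only when empty. */
static long toy_buf_read(toy_layer *self, char *out, size_t count)
{
    size_t done = 0;
    assert(self->next != NULL);
    while (done < count) {
        if (self->buf_pos == self->buf_len) {
            long got = self->next->read(self->next, self->buf,
                                        sizeof self->buf);
            if (got <= 0)
                break;              /* end of data below */
            self->buf_len = (size_t)got;
            self->buf_pos = 0;
        }
        out[done++] = self->buf[self->buf_pos++];
    }
    return (long)done;
}
```

With a buffering layer stacked on the bottom layer, a one-byte read
followed by a three-byte read costs a single call to the bottom layer,
because the second request is served entirely from the buffer.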

When you do an open() and specify extra PerlIO layers to be deployed,
the layers you specify are "pushed" on top of the already existing
default stack. One way to see it is that "operating system is
on the left" and "Perl is on the right".

What exact layers are in this default stack depends on a lot of
things: your operating system, Perl version, Perl compile time
configuration, and Perl runtime configuration. See L<PerlIO>,
L<perlrun/PERLIO>, and L<open> for more information.

binmode() operates similarly to open(): by default the specified
layers are pushed on top of the existing stack.

However, note that even as the specified layers are "pushed on top"
for open() and binmode(), this doesn't mean that the effects are
limited to the "top": PerlIO layers can be very 'active' and inspect
and affect layers deeper in the stack as well. As an example there
is a layer called "raw" which repeatedly "pops" layers until
it reaches the first layer that has declared itself capable of
handling binary data. The "pushed" layers are processed in left-to-right
order.

sysopen() operates (unsurprisingly) at a lower level in the stack than
open(). For example in Unix or Unix-like systems sysopen() operates
directly at the level of file descriptors: in the terms of PerlIO
layers, it uses only the "unix" layer, which is a rather thin wrapper
on top of the Unix file descriptors.

=head2 Layers vs Disciplines

Initial discussion of the ability to modify IO streams' behaviour used
the term "discipline" for the entities which were added. This came (I
believe) from the use of the term in "sfio", which in turn borrowed it
from "line disciplines" on Unix terminals. However, this document (and
the C code) uses the term "layer".

This is, I hope, a natural term given the implementation, and should
avoid connotations that are inherent in earlier uses of "discipline"
for things which are rather different.

=head2 Data Structures

The basic data structure is a PerlIOl:

    typedef struct _PerlIO PerlIOl;
    typedef struct _PerlIO_funcs PerlIO_funcs;
    typedef PerlIOl *PerlIO;

    struct _PerlIO
    {
        PerlIOl *      next;   /* Lower layer */
        PerlIO_funcs * tab;    /* Functions for this layer */
        U32            flags;  /* Various flags for state */
    };

A C<PerlIOl *> is a pointer to the struct, and the I<application>
level C<PerlIO *> is a pointer to a C<PerlIOl *> - i.e. a pointer
to a pointer to the struct. This allows the application level C<PerlIO *>
to remain constant while the actual C<PerlIOl *> underneath
changes. (Compare perl's C<SV *> which remains constant while its
C<sv_any> field changes as the scalar's type changes.) An IO stream is
then in general represented as a pointer to this linked-list of
"layers".

It should be noted that because of the double indirection in a C<PerlIO *>,
a C<< &(perlio->next) >> "is" a C<PerlIO *>, and so to some degree
at least one layer can use the "standard" API on the next layer down.

A "layer" is composed of two parts:

=over 4

=item 1.

The functions and attributes of the "layer class".

=item 2.

The per-instance data for a particular handle.

=back

=head2 Functions and Attributes

The functions and attributes are accessed via the "tab" (for table)
member of C<PerlIOl>. The functions (methods of the layer "class") are
fixed, and are defined by the C<PerlIO_funcs> type.
They are broadly the same as the public C<PerlIO_xxxxx> functions:

    struct _PerlIO_funcs
    {
        Size_t     fsize;
        char *     name;
        Size_t     size;
        IV         kind;
        IV         (*Pushed)(pTHX_ PerlIO *f, const char *mode,
                             SV *arg, PerlIO_funcs *tab);
        IV         (*Popped)(pTHX_ PerlIO *f);
        PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                           PerlIO_list_t *layers, IV n,
                           const char *mode,
                           int fd, int imode, int perm,
                           PerlIO *old,
                           int narg, SV **args);
        IV         (*Binmode)(pTHX_ PerlIO *f);
        SV *       (*Getarg)(pTHX_ PerlIO *f,
                             CLONE_PARAMS *param, int flags);
        IV         (*Fileno)(pTHX_ PerlIO *f);
        PerlIO *   (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                          CLONE_PARAMS *param, int flags);
        /* Unix-like functions - cf sfio line disciplines */
        SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
        SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf,
                             Size_t count);
        SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf,
                            Size_t count);
        IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
        Off_t      (*Tell)(pTHX_ PerlIO *f);
        IV         (*Close)(pTHX_ PerlIO *f);
        /* Stdio-like buffered IO functions */
        IV         (*Flush)(pTHX_ PerlIO *f);
        IV         (*Fill)(pTHX_ PerlIO *f);
        IV         (*Eof)(pTHX_ PerlIO *f);
        IV         (*Error)(pTHX_ PerlIO *f);
        void       (*Clearerr)(pTHX_ PerlIO *f);
        void       (*Setlinebuf)(pTHX_ PerlIO *f);
        /* Perl's snooping functions */
        STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
        Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
        STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
        SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
        void       (*Set_ptrcnt)(pTHX_ PerlIO *f, STDCHAR *ptr,
                                 SSize_t cnt);
    };

The first few members of the struct give a function table size for
compatibility checking, the "name" of the layer, the size to C<malloc>
for the per-instance data, and some flags which are attributes of the
class as a whole (such as whether it is a buffering layer). The
functions then follow, and fall into four basic groups:

=over 4

=item 1.

Opening and setup functions

=item 2.

Basic IO operations

=item 3.

Stdio class buffering options.

=item 4.

Functions to support Perl's traditional "fast" access to the buffer.

=back

A layer does not have to implement all the functions, but the whole
table has to be present. Unimplemented slots can be NULL (which will
result in an error when called) or can be filled in with stubs to
"inherit" behaviour from a "base class". This "inheritance" is fixed
for all instances of the layer, but as the layer chooses which stubs
to populate the table with, limited "multiple inheritance" is possible.

=head2 Per-instance Data

The per-instance data are held in memory beyond the basic PerlIOl
struct, by making a PerlIOl the first member of the layer's struct
thus:

    typedef struct
    {
        struct _PerlIO base;    /* Base "class" info */
        STDCHAR *      buf;     /* Start of buffer */
        STDCHAR *      end;     /* End of valid part of buffer */
        STDCHAR *      ptr;     /* Current position in buffer */
        Off_t          posn;    /* Offset of buf into the file */
        Size_t         bufsiz;  /* Real size of buffer */
        IV             oneword; /* Emergency buffer */
    } PerlIOBuf;

In this way (as for perl's scalars) a pointer to a PerlIOBuf can be
treated as a pointer to a PerlIOl.

=head2 Layers in action.

                 table           perlio          unix
              |           |
              +-----------+    +----------+    +--------+
     PerlIO ->|           |--->|  next    |--->|  NULL  |
              +-----------+    +----------+    +--------+
              |           |    |  buffer  |    |   fd   |
              +-----------+    |          |    +--------+
              |           |    +----------+


The above attempts to show how the layer scheme works in a simple case.
The application's C<PerlIO *> points to an entry in the table(s)
representing open (allocated) handles. For example the first three slots
in the table correspond to C<stdin>, C<stdout> and C<stderr>.
The table in turn points to the current "top" layer for the handle - in
this case an instance of the generic buffering layer "perlio". That
layer in turn points to the next layer down - in this case the
low-level "unix" layer.

The above is roughly equivalent to a "stdio" buffered stream, but with
much more flexibility:

=over 4

=item *

If Unix level C<read>/C<write>/C<lseek> is not appropriate for (say)
sockets then the "unix" layer can be replaced (at open time or even
dynamically) with a "socket" layer.

=item *

Different handles can have different buffering schemes. The "top"
layer could be the "mmap" layer if reading disk files was quicker
using C<mmap> than C<read>. An "unbuffered" stream can be implemented
simply by not having a buffer layer.

=item *

Extra layers can be inserted to process the data as it flows through.
This was the driving need for including the scheme in perl 5.7.0+ - we
needed a mechanism to allow data to be translated between perl's
internal encoding (conceptually at least Unicode as UTF-8), and the
"native" format used by the system. This is provided by the
":encoding(xxxx)" layer which typically sits above the buffering layer.

=item *

A layer can be added that does "\n" to CRLF translation. This layer
can be used on any platform, not just those that normally do such
things.

=back

=head2 Per-instance flag bits

The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced
from the mode string passed to C<PerlIO_open()>, and state bits for
typical buffer layers.

=over 4

=item PERLIO_F_EOF

End of file.

=item PERLIO_F_CANWRITE

Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.

=item PERLIO_F_CANREAD

Reads are permitted i.e. opened "r" or "w+" (or even "a+" - ick).

=item PERLIO_F_ERROR

An error has occurred (for C<PerlIO_error()>).

=item PERLIO_F_TRUNCATE

Truncate file suggested by open mode.

=item PERLIO_F_APPEND

All writes should be appends.

=item PERLIO_F_CRLF

Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF
mapped to "\n" for input. Normally the provided "crlf" layer is the only
layer that need bother about this. C<PerlIO_binmode()> will mess with this
flag rather than add/remove layers if the C<PERLIO_K_CANCRLF> bit is set
for the layer's class.

=item PERLIO_F_UTF8

Data written to this layer should be UTF-8 encoded; data provided
by this layer should be considered UTF-8 encoded. Can be set on any layer
by the ":utf8" dummy layer. Also set on the ":encoding" layer.

=item PERLIO_F_UNBUF

Layer is unbuffered - i.e. write to next layer down should occur for
each write to this layer.

=item PERLIO_F_WRBUF

The buffer for this layer currently holds data written to it but not sent
to next layer.

=item PERLIO_F_RDBUF

The buffer for this layer currently holds unconsumed data read from
layer below.

=item PERLIO_F_LINEBUF

Layer is line buffered. Write data should be passed to next layer down
whenever a "\n" is seen. Any data beyond the "\n" should then be
processed.

=item PERLIO_F_TEMP

File has been C<unlink()>ed, or should be deleted on C<close()>.

=item PERLIO_F_OPEN

Handle is open.

=item PERLIO_F_FASTGETS

This instance of this layer supports the "fast C<gets>" interface.
Normally set based on C<PERLIO_K_FASTGETS> for the class and by the
existence of the function(s) in the table. However a class that
normally provides that interface may need to avoid it on a
particular instance. The "pending" layer needs to do this when
it is pushed above a layer which does not support the interface.
(Perl's C<sv_gets()> does not expect the stream's fast C<gets> behaviour
to change during one "get".)

=back

=head2 Methods in Detail

=over 4

=item fsize

    Size_t fsize;

Size of the function table. This is compared against the value PerlIO
code "knows" as a compatibility check. Future versions I<may> be able
to tolerate layers compiled against an old version of the headers.

=item name

    char * name;

The name of the layer whose open() method Perl should invoke on
open(). For example if the layer is called APR, you will call:

    open $fh, ">:APR", ...

and Perl knows that it has to invoke the PerlIOAPR_open() method
implemented by the APR layer.

=item size

    Size_t size;

The size of the per-instance data structure, e.g.:

    sizeof(PerlIOAPR)

If this field is zero then C<PerlIO_pushed> does not malloc anything
and assumes the layer's Pushed function will do any required layer stack
manipulation - used to avoid malloc/free overhead for dummy layers.
If the field is non-zero it must be at least the size of C<PerlIOl>;
C<PerlIO_pushed> will then allocate memory for the layer's data
structures and link the new layer onto the stream's stack. (If the
layer's Pushed method returns an error indication the layer is popped
again.)

=item kind

    IV kind;

=over 4

=item * PERLIO_K_BUFFERED

The layer is buffered.

=item * PERLIO_K_RAW

The layer is acceptable to have in a binmode(FH) stack - i.e. it does not
(or will configure itself not to) transform bytes passing through it.

=item * PERLIO_K_CANCRLF

Layer can translate between "\n" and CRLF line ends.

=item * PERLIO_K_FASTGETS

Layer allows buffer snooping.

=item * PERLIO_K_MULTIARG

Used when the layer's open() accepts more arguments than usual. The
extra arguments should not come before the C<MODE> argument.
When this flag is used it's up to the layer to validate the args.

=back

=item Pushed

    IV (*Pushed)(pTHX_ PerlIO *f, const char *mode,
                 SV *arg, PerlIO_funcs *tab);

The only absolutely mandatory method. Called when the layer is pushed
onto the stack. The C<mode> argument may be NULL if this occurs
post-open. The C<arg> will be non-C<NULL> if an argument string was
passed. In most cases this should call C<PerlIOBase_pushed()> to
convert C<mode> into the appropriate C<PERLIO_F_XXXXX> flags in
addition to any actions the layer itself takes. If a layer is not
expecting an argument it need neither save the one passed to it, nor
provide C<Getarg()> (it could perhaps C<Perl_warn> that the argument
was unexpected).

Returns 0 on success. On failure returns -1 and should set errno.

=item Popped

    IV (*Popped)(pTHX_ PerlIO *f);

Called when the layer is popped from the stack. A layer will normally
be popped after C<Close()> is called. But a layer can be popped
without being closed if the program is dynamically managing layers on
the stream. In such cases C<Popped()> should free any resources
(buffers, translation tables, ...) not held directly in the layer's
struct. It should also C<Unread()> any unconsumed data that has been
read and buffered from the layer below back to that layer, so that it
can be re-provided to whatever is now above.

Returns 0 on success and failure. If C<Popped()> returns I<true> then
I<perlio.c> assumes that either the layer has popped itself, or the
layer is super special and needs to be retained for other reasons.
In most cases it should return I<false>.

=item Open

    PerlIO * (*Open)(...);

The C<Open()> method has lots of arguments because it combines the
functions of perl's C<open>, C<PerlIO_open>, perl's C<sysopen>,
C<PerlIO_fdopen> and C<PerlIO_reopen>.
The full prototype is as follows:

    PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
                     PerlIO_list_t *layers, IV n,
                     const char *mode,
                     int fd, int imode, int perm,
                     PerlIO *old,
                     int narg, SV **args);

Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate
a slot in the table and associate it with the layers information for
the opened file, by calling C<PerlIO_push>. The I<layers> is an
array of all the layers destined for the C<PerlIO *>, and any
arguments passed to them; I<n> is the index into that array of the
layer being called. The macro C<PerlIOArg> will return a (possibly
C<NULL>) SV * for the argument passed to the layer.

The I<mode> string is an "C<fopen()>-like" string which would match
the regular expression C</^[I#]?[rwa]\+?[bt]?$/>.

The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via
special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is
C<sysopen> and that I<imode> and I<perm> should be passed to
C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and
C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and
writing/appending are permitted. The C<'b'> suffix means file should
be binary, and C<'t'> means it is text. (Almost all layers should do
the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer
should be pushed to handle the distinction.)

If I<old> is not C<NULL> then this is a C<PerlIO_reopen>. Perl itself
does not use this (yet?) and semantics are a little vague.

If I<fd> is not negative then it is a numeric file descriptor which
will already be open in a manner compatible with the supplied mode
string; the call is thus equivalent to C<PerlIO_fdopen>. In this case
I<narg> will be zero.

If I<narg> is greater than zero then it gives the number of arguments
passed to C<open>, otherwise it will be 1 if for example
C<PerlIO_open> was called.
In simple cases SvPV_nolen(*args) is the pathname to open.

If a layer provides C<Open()> it should normally call the C<Open()>
method of next layer down (if any) and then push itself on top if that
succeeds. C<PerlIOBase_open> is provided to do exactly that, so in
most cases you don't have to write your own C<Open()> method. If this
method is not defined, other layers may have difficulty pushing
themselves on top of it during open.

If C<PerlIO_push> was performed and the open then fails, the layer must
C<PerlIO_pop> itself; otherwise it will not be removed and may cause
serious problems.

Returns C<NULL> on failure.

=item Binmode

    IV (*Binmode)(pTHX_ PerlIO *f);

Optional. Used when the C<:raw> layer is pushed (explicitly or as a
result of binmode(FH)). If not present the layer will be popped. If
present it should configure the layer as binary (or pop itself) and
return 0. If it returns -1 for error, C<binmode> will fail with the
layer still on the stack.

=item Getarg

    SV * (*Getarg)(pTHX_ PerlIO *f,
                   CLONE_PARAMS *param, int flags);

Optional. If present should return an SV * representing the string
argument passed to the layer when it was
pushed. e.g. ":encoding(ascii)" would return an SvPV with value
"ascii". (I<param> and I<flags> arguments can be ignored in most
cases)

C<Dup> uses C<Getarg> to retrieve the argument originally passed to
C<Pushed>, so you must implement this function if your layer has an
extra argument to C<Pushed> and will ever be C<Dup>ed.

=item Fileno

    IV (*Fileno)(pTHX_ PerlIO *f);

Returns the Unix/Posix numeric file descriptor for the handle. Normally
C<PerlIOBase_fileno()> (which just asks next layer down) will suffice
for this.

Returns -1 on error, which is considered to include the case where the
layer cannot provide such a file descriptor.
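
The "ask the next layer down" behaviour described for C<Fileno> can be
sketched with a stand-alone toy model. This is not the real
C<PerlIOBase_fileno()>; the names C<toy_layer>, C<toy_fileno>,
C<toy_base_fileno> and C<toy_unix_fileno> are hypothetical and exist
only for this example. A NULL slot falls back to the delegating
default, and running off the bottom of the stack yields -1.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a cut-down layer with just a Fileno slot. */
typedef struct toy_layer toy_layer;
struct toy_layer {
    toy_layer *next;                 /* layer below, NULL at bottom   */
    int (*fileno_fn)(toy_layer *);   /* NULL means "use the default"  */
    int fd;                          /* only meaningful at the bottom */
};

static int toy_fileno(toy_layer *l);

/* Default behaviour: just ask the next layer down. */
static int toy_base_fileno(toy_layer *l)
{
    return toy_fileno(l->next);
}

/* Bottom layer: it owns the descriptor. */
static int toy_unix_fileno(toy_layer *l)
{
    assert(l != NULL);
    return l->fd;
}

/* Dispatch: -1 if there is no layer, else own slot or default. */
static int toy_fileno(toy_layer *l)
{
    if (l == NULL)
        return -1;
    return l->fileno_fn ? l->fileno_fn(l) : toy_base_fileno(l);
}
```

A stack with no descriptor-owning layer at the bottom (for instance an
in-memory stream) therefore reports -1 without any layer having to
handle that case explicitly.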

=item Dup

    PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                    CLONE_PARAMS *param, int flags);

XXX: Needs more docs.

Used as part of the "clone" process when a thread is spawned (in which
case param will be non-NULL) and when a stream is being duplicated via
'&' in the C<open>.

Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure.

=item Read

    SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);

Basic read operation.

Typically will call C<Fill> and manipulate pointers (possibly via the
API). C<PerlIOBuf_read()> may be suitable for derived classes which
provide "fast gets" methods.

Returns actual bytes read, or -1 on an error.

=item Unread

    SSize_t (*Unread)(pTHX_ PerlIO *f,
                      const void *vbuf, Size_t count);

A superset of stdio's C<ungetc()>. Should arrange for future reads to
see the bytes in C<vbuf>. If there is no obviously better implementation
then C<PerlIOBase_unread()> provides the function by pushing a "fake"
"pending" layer above the calling layer.

Returns the number of unread chars.

=item Write

    SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);

Basic write operation.

Returns bytes written or -1 on an error.

=item Seek

    IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);

Position the file pointer. Should normally call its own C<Flush>
method and then the C<Seek> method of next layer down.

Returns 0 on success, -1 on failure.

=item Tell

    Off_t (*Tell)(pTHX_ PerlIO *f);

Return the file pointer. May be based on the layer's cached concept of
position to avoid overhead.

Returns -1 on failure to get the file pointer.

=item Close

    IV (*Close)(pTHX_ PerlIO *f);

Close the stream.
Should normally call C<PerlIOBase_close()> to flush itself and close
layers below, and then deallocate any data structures (buffers,
translation tables, ...) not held directly in the data structure.

Returns 0 on success, -1 on failure.

=item Flush

    IV (*Flush)(pTHX_ PerlIO *f);

Should make stream's state consistent with layers below. That is, any
buffered write data should be written, and file position of lower layers
adjusted for data read from below but not actually consumed.
(Should perhaps C<Unread()> such data to the lower layer.)

Returns 0 on success, -1 on failure.

=item Fill

    IV (*Fill)(pTHX_ PerlIO *f);

The buffer for this layer should be filled (for read) from layer
below. When you "subclass" PerlIOBuf layer, you want to use its
I<_read> method and to supply your own fill method, which fills the
PerlIOBuf's buffer.

Returns 0 on success, -1 on failure.

=item Eof

    IV (*Eof)(pTHX_ PerlIO *f);

Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient.

Returns 1 on end-of-file, 0 if not end-of-file, -1 on error.

=item Error

    IV (*Error)(pTHX_ PerlIO *f);

Return error indicator. C<PerlIOBase_error()> is normally sufficient.

Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set),
0 otherwise.

=item Clearerr

    void (*Clearerr)(pTHX_ PerlIO *f);

Clear end-of-file and error indicators. Should call C<PerlIOBase_clearerr()>
to set the C<PERLIO_F_XXXXX> flags, which may suffice.

=item Setlinebuf

    void (*Setlinebuf)(pTHX_ PerlIO *f);

Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the
PERLIO_F_LINEBUF flag and is normally sufficient.

=item Get_base

    STDCHAR * (*Get_base)(pTHX_ PerlIO *f);

Allocate (if not already done so) the read buffer for this layer and
return pointer to it. Return NULL on failure.
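
The "allocate if not already done" contract of C<Get_base> can be
pictured with the following stand-alone sketch. It is not the code used
by the "perlio" layer: C<toy_buf>, C<toy_get_base> and the
C<TOY_BUFSIZ> value are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: hypothetical buffer state for one layer. */
typedef struct {
    char  *buf;     /* start of buffer, NULL until first use */
    size_t bufsiz;  /* allocated size of buf in bytes        */
} toy_buf;

#define TOY_BUFSIZ 8192   /* arbitrary size chosen for the sketch */

/* Allocate the buffer on first request; return NULL on failure.
 * Later calls return the same pointer without reallocating.     */
static char *toy_get_base(toy_buf *b)
{
    assert(b != NULL);
    if (b->buf == NULL) {
        b->buf = malloc(TOY_BUFSIZ);
        if (b->buf == NULL)
            return NULL;
        b->bufsiz = TOY_BUFSIZ;
    }
    return b->buf;
}
```

Deferring the allocation this way means a handle that is opened but
never read from does not pay for a buffer.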

=item Get_bufsiz

    Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);

Return the number of bytes that last C<Fill()> put in the buffer.

=item Get_ptr

    STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);

Return the current read pointer relative to this layer's buffer.

=item Get_cnt

    SSize_t (*Get_cnt)(pTHX_ PerlIO *f);

Return the number of bytes left to be read in the current buffer.

=item Set_ptrcnt

    void (*Set_ptrcnt)(pTHX_ PerlIO *f,
                       STDCHAR *ptr, SSize_t cnt);

Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>.
The application (or layer above) must ensure they are consistent.
(Checking is allowed by the paranoid.)

=back

=head2 Utilities

To ask for the next layer down use PerlIONext(PerlIO *f).

To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All
this really does is check that the pointer is non-NULL and
that the pointer behind that is non-NULL.)

PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
the C<PerlIOl*> pointer.

PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
calls the I<callback> from the functions of the layer I<f> (just by
the name of the IO function, like "Read") with the I<args>, or if
there is no such callback, calls the I<base> version of the callback
with the same args, or if the f is invalid, sets errno to EBADF and
returns I<failure>.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, sets errno to EINVAL. Or if the f is
invalid, sets errno to EBADF and returns I<failure>.
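
The "callback, else base version, else failure" rule described for
C<Perl_PerlIO_or_Base> can be sketched as follows. This is a
stand-alone toy, not the macro from F<perliol.h>: C<toy_handle>,
C<toy_funcs>, C<TOY_VALID> and C<TOY_OR_BASE> are hypothetical names,
and only a single C<Tell> slot is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model only: a handle is a pointer to a pointer to a layer,
 * mirroring the PerlIO * / PerlIOl * relationship. */
typedef struct toy_layer toy_layer;
typedef toy_layer *toy_handle;
typedef struct {
    long (*Tell)(toy_handle *f);     /* slot may be NULL */
} toy_funcs;
struct toy_layer {
    toy_layer *next;
    const toy_funcs *tab;
    long pos;
};

/* Valid means both levels of the double indirection are non-NULL. */
#define TOY_VALID(f) ((f) != NULL && *(f) != NULL)

/* Use the layer's own method if present, else the base version;
 * an invalid handle sets errno to EBADF and yields `failure`.   */
#define TOY_OR_BASE(f, callback, base, failure)              \
    (!TOY_VALID(f)           ? (errno = EBADF, (failure))    \
     : (*(f))->tab->callback ? (*(f))->tab->callback(f)      \
                             : base(f))

/* Base version used when a layer leaves the slot NULL. */
static long toy_base_tell(toy_handle *f) { (void)f; return 0; }

/* A layer-specific implementation. */
static long toy_own_tell(toy_handle *f)
{
    assert(TOY_VALID(f));
    return (*f)->pos;
}

/* Public entry point built on the dispatch macro. */
static long toy_tell(toy_handle *f)
{
    return TOY_OR_BASE(f, Tell, toy_base_tell, -1L);
}
```

Keeping the validity check, the slot lookup and the fallback in one
place is what lets every public entry point stay a one-liner.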

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, calls the I<base> version of the
callback with the same args, or if the f is invalid, sets errno to
EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
I<callback> of the functions of the layer I<f> with the I<args>, or if
there is no such callback, sets errno to EINVAL. Or if the f is
invalid, sets errno to EBADF.

=head2 Implementing PerlIO Layers

If you find the implementation document unclear or not sufficient,
look at the existing PerlIO layer implementations, which include:

=over

=item * C implementations

The F<perlio.c> and F<perliol.h> in the Perl core implement the
"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
layers, and also the "mmap" and "win32" layers if applicable.
(The "win32" is currently unfinished and unused, to see what is used
instead in Win32, see L<PerlIO/"Querying the layers of filehandles">.)

PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.

PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.

=item * Perl implementations

PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.

=back

If you are creating a PerlIO layer, you may want to be lazy, in other
words, implement only the methods that interest you. The other methods
you can either replace with the "blank" methods

    PerlIOBase_noop_ok
    PerlIOBase_noop_fail

(which do nothing, and return zero and -1, respectively) or for
certain methods you may assume a default behaviour by using a NULL
method. The Open method looks for help in the 'parent' layer.
The following table summarizes the behaviour:

    method      behaviour with NULL

    Clearerr    PerlIOBase_clearerr
    Close       PerlIOBase_close
    Dup         PerlIOBase_dup
    Eof         PerlIOBase_eof
    Error       PerlIOBase_error
    Fileno      PerlIOBase_fileno
    Fill        FAILURE
    Flush       SUCCESS
    Getarg      SUCCESS
    Get_base    FAILURE
    Get_bufsiz  FAILURE
    Get_cnt     FAILURE
    Get_ptr     FAILURE
    Open        INHERITED
    Popped      SUCCESS
    Pushed      SUCCESS
    Read        PerlIOBase_read
    Seek        FAILURE
    Set_cnt     FAILURE
    Set_ptrcnt  FAILURE
    Setlinebuf  PerlIOBase_setlinebuf
    Tell        FAILURE
    Unread      PerlIOBase_unread
    Write       FAILURE

    FAILURE     Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS)
                and return -1 (for numeric return values) or NULL (for
                pointers)
    INHERITED   Inherited from the layer below
    SUCCESS     Return 0 (for numeric return values) or a pointer

=head2 Core Layers

The file C<perlio.c> provides the following layers:

=over 4

=item "unix"

A basic non-buffered layer which calls Unix/POSIX C<read()>, C<write()>,
C<lseek()>, C<close()>. No buffering. Even on platforms that distinguish
between O_TEXT and O_BINARY this layer is always O_BINARY.

=item "perlio"

A very complete generic buffering layer which provides the whole of
PerlIO API. It is also intended to be used as a "base class" for other
layers. (For example its C<Read()> method is implemented in terms of
the C<Get_cnt()>/C<Get_ptr()>/C<Set_ptrcnt()> methods).

"perlio" over "unix" provides a complete replacement for stdio as seen
via PerlIO API. This is the default for USE_PERLIO on systems whose
stdio does not permit perl's "fast gets" access, and which do not
distinguish between C<O_TEXT> and C<O_BINARY>.

=item "stdio"

A layer which provides the PerlIO API via the layer scheme, but
implements it by calling system's stdio.
This is (currently) the default
if system's stdio provides sufficient access to allow perl's "fast gets"
access and which do not distinguish between C<O_TEXT> and C<O_BINARY>.

=item "crlf"

A layer derived using "perlio" as a base class. It provides Win32-like
"\n" to CR,LF translation. Can either be applied above "perlio" or serve
as the buffer layer itself. "crlf" over "unix" is the default if system
distinguishes between C<O_TEXT> and C<O_BINARY> opens. (At some point
"unix" will be replaced by a "native" Win32 IO layer on that platform,
as Win32's read/write layer has various drawbacks.) The "crlf" layer is
a reasonable model for a layer which transforms data in some way.

=item "mmap"

If Configure detects C<mmap()> functions this layer is provided (with
"perlio" as a "base") which does "read" operations by mmap()ing the
file. Performance improvement is marginal on modern systems, so it is
mainly there as a proof of concept. It is likely to be unbundled from
the core at some point. The "mmap" layer is a reasonable model for a
minimalist "derived" layer.

=item "pending"

An "internal" derivative of "perlio" which can be used to provide
Unread() function for layers which have no buffer or cannot be
bothered. (Basically this layer's C<Fill()> pops itself off the stack
and so resumes reading from layer below.)

=item "raw"

A dummy layer which never exists on the layer stack. Instead when
"pushed" it actually pops the stack removing itself, it then calls
Binmode function table entry on all the layers in the stack - normally
this (via PerlIOBase_binmode) removes any layers which do not have
C<PERLIO_K_RAW> bit set. Layers can modify that behaviour by defining
their own Binmode entry.

=item "utf8"

Another dummy layer.
When pushed it pops itself and sets the
C<PERLIO_F_UTF8> flag on the layer which was (and now is once more)
the top of the stack.

=back

In addition F<perlio.c> also provides a number of C<PerlIOBase_xxxx()>
functions which are intended to be used in the table slots of classes
which do not need to do anything special for a particular method.

=head2 Extension Layers

Layers can be made available by extension modules. When an unknown layer
is encountered the PerlIO code will perform the equivalent of:

    use PerlIO 'layer';

Where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to:

    require PerlIO::layer;

If after that process the layer is still not defined then the C<open>
will fail.

The following extension layers are bundled with perl:

=over 4

=item ":encoding"

    use PerlIO::encoding;

makes this layer available, although F<PerlIO.pm> "knows" where to
find it. It is an example of a layer which takes an argument, as it is
called thus:

    open( $fh, "<:encoding(iso-8859-7)", $pathname );

=item ":scalar"

Provides support for reading data from and writing data to a scalar.

    open( $fh, "+<:scalar", \$scalar );

When a handle is so opened, reads get bytes from the string value
of I<$scalar>, and writes change the value. In both cases the position
in I<$scalar> starts at zero but can be altered via C<seek>, and
determined via C<tell>.

Please note that this layer is implied when calling open() thus:

    open( $fh, "+<", \$scalar );

=item ":via"

Provided to allow layers to be implemented as Perl code. For instance:

    use PerlIO::via::StripHTML;
    open( my $fh, "<:via(StripHTML)", "index.html" );

See L<PerlIO::via> for details.

=back

=head1 TODO

Things that need to be done to improve this document.

=over

=item *

Explain how to make a valid fh without going through open() (i.e. apply
a layer). For example, if the file is not opened through perl, but we
want to get back a fh as if it had been opened by Perl.

How does PerlIO_apply_layera fit in, where are its docs, and was it
made public?

Currently the example could be something like this:

    PerlIO *foo_to_PerlIO(pTHX_ const char *mode, ...)
    {
        /* mode is "w", "r", etc */
        const char *layers = ":APR"; /* the layer name */
        PerlIO *f = PerlIO_allocate(aTHX);
        if (!f) {
            return NULL;
        }

        /* PerlIO_apply_layers() returns 0 on success */
        if (PerlIO_apply_layers(aTHX_ f, mode, layers) == 0) {
            PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR);
            /* fill in the st struct, as in _open() */
            st->file = file; /* file comes from the elided arguments */
            PerlIOBase(f)->flags |= PERLIO_F_OPEN;

            return f;
        }
        return NULL;
    }

=item *

Fix/add the documentation in places marked as XXX.

=item *

The handling of errors by the layer is not specified. For example, it
is not stated when $! should be set explicitly and when the error
handling should just be delegated to the top layer.

Probably give some hints on using SETERRNO() or pointers to where they
can be found.

=item *

I think it would help to give some concrete examples to make the API
easier to understand. Of course I agree that the API has to be
concise, but since there is no second document that is more of a
guide, a person who is not a PerlIO guru (yet) would find it easier to
start with an API document that has examples in the places where
things are unclear.

=back

=cut