=head1 NAME

perliol - C API for Perl's implementation of IO in Layers.

=head1 SYNOPSIS

    /* Defining a layer ... */
    #include <perliol.h>

=head1 DESCRIPTION

This document describes the behavior and implementation of the PerlIO
abstraction described in L<perlapio> when C<USE_PERLIO> is defined.

=head2 History and Background

The PerlIO abstraction was introduced in perl5.003_02 but languished as
just an abstraction until perl5.7.0. However during that time a number
of perl extensions switched to using it, so the API is mostly fixed to
maintain (source) compatibility.

The aim of the implementation is to provide the PerlIO API in a flexible
and platform neutral manner. It is also a trial of an "Object Oriented
C, with vtables" approach which may be applied to Perl 6.

=head2 Basic Structure

PerlIO is a stack of layers.

The low levels of the stack work with the low-level operating system
calls (file descriptors in C) getting bytes in and out; the higher
layers of the stack buffer, filter, and otherwise manipulate the I/O,
and return characters (or bytes) to Perl. The terms I<above> and
I<below> are used to refer to the relative positioning of the stack
layers.

A layer contains a "vtable", the table of I/O operations (at C level
a table of function pointers), and status flags. The functions in the
vtable implement operations like "open", "read", and "write".

When I/O, for example "read", is requested, the request goes from Perl
first down the stack using the "read" functions of each layer, then at
the bottom the input is requested from the operating system services,
then the result is returned up the stack, finally being interpreted as
Perl data.

Requests do not necessarily go all the way down to the operating
system: that's where PerlIO buffering comes into play.
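
The flow described above can be sketched with a small self-contained
toy. This is not perl's code and uses none of its headers: every name
beginning with C<toy_> is hypothetical, the "operating system" is
just a string in memory, and the buffer is deliberately tiny. It only
illustrates how a buffering layer on top of a bottom layer keeps most
read requests from reaching the bottom.

```c

    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Toy model only: none of these names exist in perl's headers. */
    typedef struct toy_layer toy_layer;

    typedef struct {
        const char *name;
        /* Read up to count bytes into buf; returns bytes read, 0 at EOF. */
        size_t (*read)(toy_layer *self, char *buf, size_t count);
    } toy_vtable;

    struct toy_layer {
        toy_layer        *next;     /* layer below, NULL at the bottom */
        const toy_vtable *tab;      /* operations for this layer */
        const char       *src;      /* bottom layer: the pretend "file" */
        size_t            src_len;  /* bottom layer: size of the "file" */
        size_t            src_pos;  /* bottom layer: current offset */
        size_t            os_calls; /* bottom layer: pretend system calls */
        char              buf[8];   /* buffering layer: its buffer */
        size_t            buf_len;  /* buffering layer: valid bytes in buf */
        size_t            buf_pos;  /* buffering layer: next unread byte */
    };

    /* Bottom layer: every call counts as one "system call". */
    static size_t toy_bottom_read(toy_layer *self, char *buf, size_t count)
    {
        size_t left = self->src_len - self->src_pos;
        size_t n = count < left ? count : left;
        self->os_calls++;
        memcpy(buf, self->src + self->src_pos, n);
        self->src_pos += n;
        return n;
    }

    /* Buffering layer: asks the layer below only when its buffer is empty. */
    static size_t toy_buffered_read(toy_layer *self, char *buf, size_t count)
    {
        size_t done = 0;
        assert(self->next != NULL);
        while (done < count) {
            size_t avail = self->buf_len - self->buf_pos;
            size_t n;
            if (avail == 0) {
                self->buf_len = self->next->tab->read(self->next, self->buf,
                                                      sizeof self->buf);
                self->buf_pos = 0;
                avail = self->buf_len;
                if (avail == 0)
                    break;              /* end of file below */
            }
            n = count - done < avail ? count - done : avail;
            memcpy(buf + done, self->buf + self->buf_pos, n);
            self->buf_pos += n;
            done += n;
        }
        return done;
    }

    static const toy_vtable toy_bottom_tab   = { "bottom",   toy_bottom_read };
    static const toy_vtable toy_buffered_tab = { "buffered", toy_buffered_read };

    /* Read `want` bytes one at a time through a two-layer stack and
       return how many requests actually reached the bottom layer.
       `out` must have room for want + 1 bytes. */
    size_t toy_demo(char *out, size_t want)
    {
        toy_layer bottom = { .tab = &toy_bottom_tab,
                             .src = "hello, world", .src_len = 12 };
        toy_layer top    = { .next = &bottom, .tab = &toy_buffered_tab };
        size_t i;
        for (i = 0; i < want; i++) {
            if (top.tab->read(&top, out + i, 1) != 1)
                break;
        }
        out[i] = '\0';
        return bottom.os_calls;
    }

```

In this sketch the caller issues one request per byte, but the bottom
layer is only consulted when the upper layer's buffer runs dry.
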

When you do an open() and specify extra PerlIO layers to be deployed,
the layers you specify are "pushed" on top of the already existing
default stack. One way to see it is that "operating system is
on the left" and "Perl is on the right".

What exact layers are in this default stack depends on a lot of
things: your operating system, Perl version, Perl compile time
configuration, and Perl runtime configuration. See L<PerlIO>,
L<perlrun/PERLIO>, and L<open> for more information.

binmode() operates similarly to open(): by default the specified
layers are pushed on top of the existing stack.

However, note that even as the specified layers are "pushed on top"
for open() and binmode(), this doesn't mean that the effects are
limited to the "top": PerlIO layers can be very 'active' and inspect
and affect layers also deeper in the stack. As an example there
is a layer called "raw" which repeatedly "pops" layers until
it reaches the first layer that has declared itself capable of
handling binary data. The "pushed" layers are processed in left-to-right
order.

sysopen() operates (unsurprisingly) at a lower level in the stack than
open(). For example in Unix or Unix-like systems sysopen() operates
directly at the level of file descriptors: in the terms of PerlIO
layers, it uses only the "unix" layer, which is a rather thin wrapper
on top of the Unix file descriptors.

=head2 Layers vs Disciplines

Initial discussion of the ability to modify IO streams behaviour used
the term "discipline" for the entities which were added. This came (I
believe) from the use of the term in "sfio", which in turn borrowed it
from "line disciplines" on Unix terminals. However, this document (and
the C code) uses the term "layer".

This is, I hope, a natural term given the implementation, and should
avoid connotations that are inherent in earlier uses of "discipline"
for things which are rather different.

=head2 Data Structures

The basic data structure is a PerlIOl:

    typedef struct _PerlIO PerlIOl;
    typedef struct _PerlIO_funcs PerlIO_funcs;
    typedef PerlIOl *PerlIO;

    struct _PerlIO
    {
        PerlIOl *      next;   /* Lower layer */
        PerlIO_funcs * tab;    /* Functions for this layer */
        U32            flags;  /* Various flags for state */
    };

A C<PerlIOl *> is a pointer to the struct, and the I<application>
level C<PerlIO *> is a pointer to a C<PerlIOl *> - i.e. a pointer
to a pointer to the struct. This allows the application level C<PerlIO *>
to remain constant while the actual C<PerlIOl *> underneath
changes. (Compare perl's C<SV *> which remains constant while its
C<sv_any> field changes as the scalar's type changes.) An IO stream is
then in general represented as a pointer to this linked-list of
"layers".

It should be noted that because of the double indirection in a C<PerlIO *>,
a C<< &(perlio->next) >> "is" a C<PerlIO *>, and so to some degree
at least one layer can use the "standard" API on the next layer down.

A "layer" is composed of two parts:

=over 4

=item 1.

The functions and attributes of the "layer class".

=item 2.

The per-instance data for a particular handle.

=back

=head2 Functions and Attributes

The functions and attributes are accessed via the "tab" (for table)
member of C<PerlIOl>. The functions (methods of the layer "class") are
fixed, and are defined by the C<PerlIO_funcs> type.
They are broadly the
same as the public C<PerlIO_xxxxx> functions:

    struct _PerlIO_funcs
    {
        Size_t     fsize;
        char *     name;
        Size_t     size;
        IV         kind;
        IV         (*Pushed)(pTHX_ PerlIO *f,
                             const char *mode,
                             SV *arg,
                             PerlIO_funcs *tab);
        IV         (*Popped)(pTHX_ PerlIO *f);
        PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                           PerlIO_list_t *layers, IV n,
                           const char *mode,
                           int fd, int imode, int perm,
                           PerlIO *old,
                           int narg, SV **args);
        IV         (*Binmode)(pTHX_ PerlIO *f);
        SV *       (*Getarg)(pTHX_ PerlIO *f,
                             CLONE_PARAMS *param, int flags);
        IV         (*Fileno)(pTHX_ PerlIO *f);
        PerlIO *   (*Dup)(pTHX_ PerlIO *f,
                          PerlIO *o,
                          CLONE_PARAMS *param,
                          int flags);
        /* Unix-like functions - cf sfio line disciplines */
        SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
        SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
        Off_t      (*Tell)(pTHX_ PerlIO *f);
        IV         (*Close)(pTHX_ PerlIO *f);
        /* Stdio-like buffered IO functions */
        IV         (*Flush)(pTHX_ PerlIO *f);
        IV         (*Fill)(pTHX_ PerlIO *f);
        IV         (*Eof)(pTHX_ PerlIO *f);
        IV         (*Error)(pTHX_ PerlIO *f);
        void       (*Clearerr)(pTHX_ PerlIO *f);
        void       (*Setlinebuf)(pTHX_ PerlIO *f);
        /* Perl's snooping functions */
        STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
        Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
        STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
        SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
        void       (*Set_ptrcnt)(pTHX_ PerlIO *f, STDCHAR *ptr, SSize_t cnt);
    };

The first few members of the struct give a function table size for the
compatibility check, the "name" of the layer, the size to C<malloc> for
the per-instance data, and some flags which are attributes of the class
as a whole (such as whether it is a buffering layer). The functions
follow, and fall into four basic groups:

=over 4

=item 1.

Opening and setup functions

=item 2.

Basic IO operations

=item 3.

Stdio class buffering options.

=item 4.

Functions to support Perl's traditional "fast" access to the buffer.

=back

A layer does not have to implement all the functions, but the whole
table has to be present. Unimplemented slots can be NULL (which will
result in an error when called) or can be filled in with stubs to
"inherit" behaviour from a "base class". This "inheritance" is fixed
for all instances of the layer, but as the layer chooses which stubs
to populate the table, limited "multiple inheritance" is possible.

=head2 Per-instance Data

The per-instance data are held in memory beyond the basic PerlIOl
struct, by making a PerlIOl the first member of the layer's struct
thus:

    typedef struct
    {
        struct _PerlIO base;    /* Base "class" info */
        STDCHAR *      buf;     /* Start of buffer */
        STDCHAR *      end;     /* End of valid part of buffer */
        STDCHAR *      ptr;     /* Current position in buffer */
        Off_t          posn;    /* Offset of buf into the file */
        Size_t         bufsiz;  /* Real size of buffer */
        IV             oneword; /* Emergency buffer */
    } PerlIOBuf;

In this way (as for perl's scalars) a pointer to a PerlIOBuf can be
treated as a pointer to a PerlIOl.

=head2 Layers in action.

                table           perlio          unix
            |           |
            +-----------+    +----------+    +--------+
    PerlIO ->|           |--->|  next    |--->|  NULL  |
            +-----------+    +----------+    +--------+
            |           |    |  buffer  |    |   fd   |
            +-----------+    |          |    +--------+
            |           |    +----------+


The above attempts to show how the layer scheme works in a simple case.
The application's C<PerlIO *> points to an entry in the table(s)
representing open (allocated) handles. For example the first three slots
in the table correspond to C<stdin>, C<stdout> and C<stderr>.
The table
in turn points to the current "top" layer for the handle - in this case
an instance of the generic buffering layer "perlio". That layer in turn
points to the next layer down - in this case the low-level "unix" layer.

The above is roughly equivalent to a "stdio" buffered stream, but with
much more flexibility:

=over 4

=item *

If Unix level C<read>/C<write>/C<lseek> is not appropriate for (say)
sockets then the "unix" layer can be replaced (at open time or even
dynamically) with a "socket" layer.

=item *

Different handles can have different buffering schemes. The "top"
layer could be the "mmap" layer if reading disk files was quicker
using C<mmap> than C<read>. An "unbuffered" stream can be implemented
simply by not having a buffer layer.

=item *

Extra layers can be inserted to process the data as it flows through.
This was the driving need for including the scheme in perl 5.7.0+ - we
needed a mechanism to allow data to be translated between perl's
internal encoding (conceptually at least Unicode as UTF-8), and the
"native" format used by the system. This is provided by the
":encoding(xxxx)" layer which typically sits above the buffering layer.

=item *

A layer can be added that does "\n" to CRLF translation. This layer
can be used on any platform, not just those that normally do such
things.

=back

=head2 Per-instance flag bits

The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced
from the mode string passed to C<PerlIO_open()>, and state bits for
typical buffer layers.

=over 4

=item PERLIO_F_EOF

End of file.

=item PERLIO_F_CANWRITE

Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.

=item PERLIO_F_CANREAD

Reads are permitted, i.e. opened as "r" or "w+" (or even "a+" - ick).

=item PERLIO_F_ERROR

An error has occurred (for C<PerlIO_error()>).

=item PERLIO_F_TRUNCATE

Truncate file suggested by open mode.

=item PERLIO_F_APPEND

All writes should be appends.

=item PERLIO_F_CRLF

Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF
mapped to "\n" for input. Normally the provided "crlf" layer is the only
layer that need bother about this. C<PerlIO_binmode()> will mess with this
flag rather than add/remove layers if the C<PERLIO_K_CANCRLF> bit is set
for the layer's class.

=item PERLIO_F_UTF8

Data written to this layer should be UTF-8 encoded; data provided
by this layer should be considered UTF-8 encoded. Can be set on any layer
by the ":utf8" dummy layer. Also set on the ":encoding" layer.

=item PERLIO_F_UNBUF

Layer is unbuffered - i.e. write to next layer down should occur for
each write to this layer.

=item PERLIO_F_WRBUF

The buffer for this layer currently holds data written to it but not sent
to next layer.

=item PERLIO_F_RDBUF

The buffer for this layer currently holds unconsumed data read from
layer below.

=item PERLIO_F_LINEBUF

Layer is line buffered. Write data should be passed to next layer down
whenever a "\n" is seen. Any data beyond the "\n" should then be
processed.

=item PERLIO_F_TEMP

File has been C<unlink()>ed, or should be deleted on C<close()>.

=item PERLIO_F_OPEN

Handle is open.

=item PERLIO_F_FASTGETS

This instance of this layer supports the "fast C<gets>" interface.
Normally set based on C<PERLIO_K_FASTGETS> for the class and by the
existence of the function(s) in the table. However a class that
normally provides that interface may need to avoid it on a
particular instance. The "pending" layer needs to do this when
it is pushed above a layer which does not support the interface.
(Perl's C<sv_gets()> does not expect the stream's fast C<gets> behaviour
to change during one "get".)

=back

=head2 Methods in Detail

=over 4

=item fsize

    Size_t fsize;

Size of the function table. This is compared against the value PerlIO
code "knows" as a compatibility check. Future versions I<may> be able
to tolerate layers compiled against an old version of the headers.

=item name

    char * name;

The name of the layer whose open() method Perl should invoke on
open(). For example if the layer is called APR, you will call:

    open $fh, ">:APR", ...

and Perl knows that it has to invoke the PerlIOAPR_open() method
implemented by the APR layer.

=item size

    Size_t size;

The size of the per-instance data structure, e.g.:

    sizeof(PerlIOAPR)

If this field is zero then C<PerlIO_push> does not malloc anything
and assumes the layer's Pushed function will do any required layer stack
manipulation - used to avoid malloc/free overhead for dummy layers.
If the field is non-zero it must be at least the size of C<PerlIOl>;
C<PerlIO_push> will allocate memory for the layer's data structures
and link the new layer onto the stream's stack. (If the layer's Pushed
method returns an error indication the layer is popped again.)

=item kind

    IV kind;

=over 4

=item * PERLIO_K_BUFFERED

The layer is buffered.

=item * PERLIO_K_RAW

The layer is acceptable to have in a binmode(FH) stack - i.e. it does not
(or will configure itself not to) transform bytes passing through it.

=item * PERLIO_K_CANCRLF

Layer can translate between "\n" and CRLF line ends.

=item * PERLIO_K_FASTGETS

Layer allows buffer snooping.

=item * PERLIO_K_MULTIARG

Used when the layer's open() accepts more arguments than usual. The
extra arguments should not come before the C<MODE> argument.
When this
flag is used it's up to the layer to validate the args.

=back

=item Pushed

    IV (*Pushed)(pTHX_ PerlIO *f, const char *mode,
                 SV *arg, PerlIO_funcs *tab);

The only absolutely mandatory method. Called when the layer is pushed
onto the stack. The C<mode> argument may be NULL if this occurs
post-open. The C<arg> will be non-C<NULL> if an argument string was
passed. In most cases this should call C<PerlIOBase_pushed()> to
convert C<mode> into the appropriate C<PERLIO_F_XXXXX> flags in
addition to any actions the layer itself takes. If a layer is not
expecting an argument it need neither save the one passed to it, nor
provide C<Getarg()> (it could perhaps C<Perl_warn> that the argument
was unexpected).

Returns 0 on success. On failure returns -1 and should set errno.

=item Popped

    IV (*Popped)(pTHX_ PerlIO *f);

Called when the layer is popped from the stack. A layer will normally
be popped after C<Close()> is called. But a layer can be popped
without being closed if the program is dynamically managing layers on
the stream. In such cases C<Popped()> should free any resources
(buffers, translation tables, ...) not held directly in the layer's
struct. It should also C<Unread()> any unconsumed data that has been
read and buffered from the layer below back to that layer, so that it
can be re-provided to whatever is now above.

Returns 0 on both success and failure. If C<Popped()> returns I<true>
then F<perlio.c> assumes that either the layer has popped itself, or the
layer is super special and needs to be retained for other reasons.
In most cases it should return I<false>.

=item Open

    PerlIO * (*Open)(...);

The C<Open()> method has lots of arguments because it combines the
functions of perl's C<open>, C<PerlIO_open>, perl's C<sysopen>,
C<PerlIO_fdopen> and C<PerlIO_reopen>.
The full prototype is as
follows:

    PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
                     PerlIO_list_t *layers, IV n,
                     const char *mode,
                     int fd, int imode, int perm,
                     PerlIO *old,
                     int narg, SV **args);

Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate
a slot in the table and associate it with the layers information for
the opened file, by calling C<PerlIO_push>. I<layers> is an
array of all the layers destined for the C<PerlIO *>, and any
arguments passed to them; I<n> is the index into that array of the
layer being called. The macro C<PerlIOArg> will return a (possibly
C<NULL>) SV * for the argument passed to the layer.

The I<mode> string is an "C<fopen()>-like" string which would match
the regular expression C</^[I#]?[rwa]\+?[bt]?$/>.

The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via
special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is
C<sysopen> and that I<imode> and I<perm> should be passed to
C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and
C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and
writing/appending are permitted. The C<'b'> suffix means file should
be binary, and C<'t'> means it is text. (Almost all layers should do
the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer
should be pushed to handle the distinction.)

If I<old> is not C<NULL> then this is a C<PerlIO_reopen>. Perl itself
does not use this (yet?) and semantics are a little vague.

If I<fd> is not negative then it is the numeric file descriptor I<fd>,
which will be open in a manner compatible with the supplied mode
string; the call is thus equivalent to C<PerlIO_fdopen>. In this case
I<narg> will be zero.

If I<narg> is greater than zero then it gives the number of arguments
passed to C<open>, otherwise it will be 1 if for example
C<PerlIO_open> was called.
In simple cases C<SvPV_nolen(*args)> is the
pathname to open.

If a layer provides C<Open()> it should normally call the C<Open()>
method of next layer down (if any) and then push itself on top if that
succeeds. C<PerlIOBase_open> is provided to do exactly that, so in
most cases you don't have to write your own C<Open()> method. If this
method is not defined, other layers may have difficulty pushing
themselves on top of it during open.

If the layer performed a C<PerlIO_push> and the open then fails, it
must C<PerlIO_pop> itself; otherwise the layer stays on the stack
and may cause serious problems.

Returns C<NULL> on failure.

=item Binmode

    IV (*Binmode)(pTHX_ PerlIO *f);

Optional. Used when C<:raw> layer is pushed (explicitly or as a result
of binmode(FH)). If not present layer will be popped. If present
should configure layer as binary (or pop itself) and return 0.
If it returns -1 for error C<binmode> will fail with layer
still on the stack.

=item Getarg

    SV * (*Getarg)(pTHX_ PerlIO *f,
                   CLONE_PARAMS *param, int flags);

Optional. If present should return an SV * representing the string
argument passed to the layer when it was
pushed. e.g. ":encoding(ascii)" would return an SvPV with value
"ascii". (The I<param> and I<flags> arguments can be ignored in most
cases.)

C<Dup> uses C<Getarg> to retrieve the argument originally passed to
C<Pushed>, so you must implement this function if your layer has an
extra argument to C<Pushed> and will ever be C<Dup>ed.

=item Fileno

    IV (*Fileno)(pTHX_ PerlIO *f);

Returns the Unix/Posix numeric file descriptor for the handle. Normally
C<PerlIOBase_fileno()> (which just asks next layer down) will suffice
for this.

Returns -1 on error, which is considered to include the case where the
layer cannot provide such a file descriptor.

=item Dup

    PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                    CLONE_PARAMS *param, int flags);

XXX: Needs more docs.

Used as part of the "clone" process when a thread is spawned (in which
case param will be non-NULL) and when a stream is being duplicated via
'&' in the C<open>.

Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure.

=item Read

    SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);

Basic read operation.

Typically will call C<Fill> and manipulate pointers (possibly via the
API). C<PerlIOBuf_read()> may be suitable for derived classes which
provide "fast gets" methods.

Returns actual bytes read, or -1 on an error.

=item Unread

    SSize_t (*Unread)(pTHX_ PerlIO *f,
                      const void *vbuf, Size_t count);

A superset of stdio's C<ungetc()>. Should arrange for future reads to
see the bytes in C<vbuf>. If there is no obviously better implementation
then C<PerlIOBase_unread()> provides the function by pushing a "fake"
"pending" layer above the calling layer.

Returns the number of unread chars.

=item Write

    SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);

Basic write operation.

Returns bytes written or -1 on an error.

=item Seek

    IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);

Position the file pointer. Should normally call its own C<Flush>
method and then the C<Seek> method of next layer down.

Returns 0 on success, -1 on failure.

=item Tell

    Off_t (*Tell)(pTHX_ PerlIO *f);

Return the file pointer. May be based on the layer's cached concept of
position to avoid overhead.

Returns -1 on failure to get the file pointer.

=item Close

    IV (*Close)(pTHX_ PerlIO *f);

Close the stream.
Should normally call C<PerlIOBase_close()> to flush
itself and close layers below, and then deallocate any data structures
(buffers, translation tables, ...) not held directly in the data
structure.

Returns 0 on success, -1 on failure.

=item Flush

    IV (*Flush)(pTHX_ PerlIO *f);

Should make stream's state consistent with layers below. That is, any
buffered write data should be written, and file position of lower layers
adjusted for data read from below but not actually consumed.
(Should perhaps C<Unread()> such data to the lower layer.)

Returns 0 on success, -1 on failure.

=item Fill

    IV (*Fill)(pTHX_ PerlIO *f);

The buffer for this layer should be filled (for read) from layer
below. When you "subclass" PerlIOBuf layer, you want to use its
I<_read> method and to supply your own fill method, which fills the
PerlIOBuf's buffer.

Returns 0 on success, -1 on failure.

=item Eof

    IV (*Eof)(pTHX_ PerlIO *f);

Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient.

Returns 0 on end-of-file, 1 if not end-of-file, -1 on error.

=item Error

    IV (*Error)(pTHX_ PerlIO *f);

Return error indicator. C<PerlIOBase_error()> is normally sufficient.

Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set),
0 otherwise.

=item Clearerr

    void (*Clearerr)(pTHX_ PerlIO *f);

Clear end-of-file and error indicators. Should call C<PerlIOBase_clearerr()>
to set the C<PERLIO_F_XXXXX> flags, which may suffice.

=item Setlinebuf

    void (*Setlinebuf)(pTHX_ PerlIO *f);

Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the
PERLIO_F_LINEBUF flag and is normally sufficient.

=item Get_base

    STDCHAR * (*Get_base)(pTHX_ PerlIO *f);

Allocate (if not already done so) the read buffer for this layer and
return pointer to it. Return NULL on failure.

=item Get_bufsiz

    Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);

Return the number of bytes that last C<Fill()> put in the buffer.

=item Get_ptr

    STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);

Return the current read pointer relative to this layer's buffer.

=item Get_cnt

    SSize_t (*Get_cnt)(pTHX_ PerlIO *f);

Return the number of bytes left to be read in the current buffer.

=item Set_ptrcnt

    void (*Set_ptrcnt)(pTHX_ PerlIO *f,
                       STDCHAR *ptr, SSize_t cnt);

Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>.
The application (or layer above) must ensure they are consistent.
(Checking is allowed by the paranoid.)

=back

=head2 Utilities

To ask for the next layer down use PerlIONext(PerlIO *f).

To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All
this really does is check that the pointer is non-NULL and
that the pointer behind that is non-NULL.)

PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
the C<PerlIOl*> pointer.

PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
calls the I<callback> from the functions of the layer I<f> (just by
the name of the IO function, like "Read") with the I<args>, or if
there is no such callback, calls the I<base> version of the callback
with the same args, or if I<f> is invalid, sets errno to EBADF and
returns I<failure>.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, sets errno to EINVAL. Or if I<f> is
invalid, sets errno to EBADF and returns I<failure>.
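
The dispatch-with-fallback pattern described for Perl_PerlIO_or_Base
and Perl_PerlIO_or_fail can be sketched as ordinary functions. This is
an illustration only, not the real macros: every C<toy_> name is
hypothetical, the handle is modelled as a plain pointer to a pointer,
and returning -1 when a callback is missing is an assumption of the
sketch.

```c

    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>

    /* Toy model only: none of these names exist in perl's headers. */
    typedef struct toy_layer toy_layer;
    typedef toy_layer *toy_handle;      /* plays the role of PerlIO */

    typedef struct {
        int (*Error)(toy_handle *f);
        int (*Seek)(toy_handle *f, long offset);
    } toy_funcs;

    struct toy_layer {
        toy_layer       *next;          /* layer below */
        const toy_funcs *tab;           /* this layer's function table */
        int              error_flag;    /* stands in for an error state bit */
    };

    /* Valid means: the pointer and the pointer behind it are non-NULL. */
    static int toy_valid(toy_handle *f)
    {
        return f != NULL && *f != NULL;
    }

    /* "Base class" behaviour used when a layer leaves the slot NULL. */
    static int toy_base_error(toy_handle *f)
    {
        assert(toy_valid(f));
        return (*f)->error_flag != 0;
    }

    /* "or_Base" style: layer's own callback, else the base version,
       else EBADF for an invalid handle. */
    int toy_error(toy_handle *f)
    {
        if (toy_valid(f)) {
            const toy_funcs *tab = (*f)->tab;
            if (tab != NULL && tab->Error != NULL)
                return tab->Error(f);
            return toy_base_error(f);
        }
        errno = EBADF;
        return -1;
    }

    /* "or_fail" style: layer's own callback, else EINVAL (no base
       fallback), else EBADF for an invalid handle. */
    int toy_seek(toy_handle *f, long offset)
    {
        if (toy_valid(f)) {
            const toy_funcs *tab = (*f)->tab;
            if (tab != NULL && tab->Seek != NULL)
                return tab->Seek(f, offset);
            errno = EINVAL;
            return -1;                  /* assumption of this sketch */
        }
        errno = EBADF;
        return -1;
    }

```

With a layer whose table slots are all NULL, C<toy_error> falls back
to the base behaviour while C<toy_seek> fails with EINVAL; with a NULL
handle both fail with EBADF.
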

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, calls the I<base> version of the
callback with the same args, or if I<f> is invalid, sets errno to
EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
I<callback> of the functions of the layer I<f> with the I<args>, or if
there is no such callback, sets errno to EINVAL. Or if I<f> is
invalid, sets errno to EBADF.

=head2 Implementing PerlIO Layers

If you find the implementation document unclear or not sufficient,
look at the existing PerlIO layer implementations, which include:

=over

=item * C implementations

The F<perlio.c> and F<perliol.h> in the Perl core implement the
"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
layers, and also the "mmap" and "win32" layers if applicable.
(The "win32" is currently unfinished and unused; to see what is used
instead in Win32, see L<PerlIO/"Querying the layers of filehandles">.)

PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.

PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.

=item * Perl implementations

PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.

=back

If you are creating a PerlIO layer, you may want to be lazy, in other
words, implement only the methods that interest you. The other methods
you can either replace with the "blank" methods

    PerlIOBase_noop_ok
    PerlIOBase_noop_fail

(which do nothing, and return zero and -1, respectively) or for
certain methods you may assume a default behaviour by using a NULL
method. The Open method looks for help in the 'parent' layer.
The following table summarizes the behaviour:

    method      behaviour with NULL

    Clearerr    PerlIOBase_clearerr
    Close       PerlIOBase_close
    Dup         PerlIOBase_dup
    Eof         PerlIOBase_eof
    Error       PerlIOBase_error
    Fileno      PerlIOBase_fileno
    Fill        FAILURE
    Flush       SUCCESS
    Getarg      SUCCESS
    Get_base    FAILURE
    Get_bufsiz  FAILURE
    Get_cnt     FAILURE
    Get_ptr     FAILURE
    Open        INHERITED
    Popped      SUCCESS
    Pushed      SUCCESS
    Read        PerlIOBase_read
    Seek        FAILURE
    Set_cnt     FAILURE
    Set_ptrcnt  FAILURE
    Setlinebuf  PerlIOBase_setlinebuf
    Tell        FAILURE
    Unread      PerlIOBase_unread
    Write       FAILURE

    FAILURE     Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS)
                and return -1 (for numeric return values) or NULL (for
                pointers)
    INHERITED   Inherited from the layer below
    SUCCESS     Return 0 (for numeric return values) or a pointer

=head2 Core Layers

The file C<perlio.c> provides the following layers:

=over 4

=item "unix"

A basic non-buffered layer which calls Unix/POSIX C<read()>, C<write()>,
C<lseek()>, C<close()>. No buffering. Even on platforms that distinguish
between O_TEXT and O_BINARY this layer is always O_BINARY.

=item "perlio"

A very complete generic buffering layer which provides the whole of
PerlIO API. It is also intended to be used as a "base class" for other
layers. (For example its C<Read()> method is implemented in terms of
the C<Get_cnt()>/C<Get_ptr()>/C<Set_ptrcnt()> methods).

"perlio" over "unix" provides a complete replacement for stdio as seen
via PerlIO API. This is the default for USE_PERLIO on systems whose
stdio does not permit perl's "fast gets" access, and which do not
distinguish between C<O_TEXT> and C<O_BINARY>.

=item "stdio"

A layer which provides the PerlIO API via the layer scheme, but
implements it by calling system's stdio.
This is (currently) the default
on systems whose stdio provides sufficient access to allow perl's
"fast gets" access and which do not distinguish between C<O_TEXT> and
C<O_BINARY>.

=item "crlf"

A layer derived using "perlio" as a base class. It provides Win32-like
"\n" to CR,LF translation. Can either be applied above "perlio" or serve
as the buffer layer itself. "crlf" over "unix" is the default if system
distinguishes between C<O_TEXT> and C<O_BINARY> opens. (At some point
"unix" will be replaced by a "native" Win32 IO layer on that platform,
as Win32's read/write layer has various drawbacks.) The "crlf" layer is
a reasonable model for a layer which transforms data in some way.

=item "mmap"

If Configure detects C<mmap()> functions this layer is provided (with
"perlio" as a "base") which does "read" operations by mmap()ing the
file. Performance improvement is marginal on modern systems, so it is
mainly there as a proof of concept. It is likely to be unbundled from
the core at some point. The "mmap" layer is a reasonable model for a
minimalist "derived" layer.

=item "pending"

An "internal" derivative of "perlio" which can be used to provide
Unread() function for layers which have no buffer or cannot be
bothered. (Basically this layer's C<Fill()> pops itself off the stack
and so resumes reading from layer below.)

=item "raw"

A dummy layer which never exists on the layer stack. Instead when
"pushed" it actually pops the stack removing itself, it then calls
Binmode function table entry on all the layers in the stack - normally
this (via PerlIOBase_binmode) removes any layers which do not have
C<PERLIO_K_RAW> bit set. Layers can modify that behaviour by defining
their own Binmode entry.

=item "utf8"

Another dummy layer.
When pushed it pops itself and sets the
C<PERLIO_F_UTF8> flag on the layer which was (and now is once more)
the top of the stack.

=back

In addition F<perlio.c> also provides a number of C<PerlIOBase_xxxx()>
functions which are intended to be used in the table slots of classes
which do not need to do anything special for a particular method.

=head2 Extension Layers

Layers can be made available by extension modules. When an unknown layer
is encountered the PerlIO code will perform the equivalent of:

   use PerlIO 'layer';

Where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to:

   require PerlIO::layer;

If after that process the layer is still not defined then the C<open>
will fail.

The following extension layers are bundled with perl:

=over 4

=item ":encoding"

   use PerlIO::encoding;

makes this layer available, although F<PerlIO.pm> "knows" where to
find it. It is an example of a layer which takes an argument, as it is
called thus:

   open( $fh, "<:encoding(iso-8859-7)", $pathname );

=item ":scalar"

Provides support for reading data from and writing data to a scalar.

   open( $fh, "+<:scalar", \$scalar );

When a handle is so opened, reads get bytes from the string value
of I<$scalar>, and writes change the value. In both cases the position
in I<$scalar> starts as zero but can be altered via C<seek>, and
determined via C<tell>.

Please note that this layer is implied when calling open() thus:

   open( $fh, "+<", \$scalar );

=item ":via"

Provided to allow layers to be implemented as Perl code. For instance:

   use PerlIO::via::StripHTML;
   open( my $fh, "<:via(StripHTML)", "index.html" );

See L<PerlIO::via> for details.

=back

=head1 TODO

Things that need to be done to improve this document.

=over

=item *

Explain how to make a valid fh without going through open() (i.e. apply
a layer). For example, if the file is not opened through perl, but we
want to get back a fh as if it had been opened by Perl.

How does PerlIO_apply_layera fit in, where are its docs, and was it made
public?

Currently the example could be something like this:

  PerlIO *foo_to_PerlIO(pTHX_ const char *mode, ...)
  {
      /* mode is "w", "r", etc. */
      const char *layers = ":APR"; /* the layer name */
      PerlIO *f = PerlIO_allocate(aTHX);
      if (!f) {
          return NULL;
      }

      /* PerlIO_apply_layers() returns 0 on success */
      if (PerlIO_apply_layers(aTHX_ f, mode, layers) == 0) {
          PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR);
          /* fill in the st struct, as in _open();
             "file" is assumed to come from the elided arguments */
          st->file = file;
          PerlIOBase(f)->flags |= PERLIO_F_OPEN;

          return f;
      }
      return NULL;
  }

=item *

Fix/add the documentation in places marked as XXX.

=item *

The handling of errors by the layer is not specified, e.g. when $!
should be set explicitly and when the error handling should just be
delegated to the top layer.

Probably give some hints on using SETERRNO() or pointers to where they
can be found.

=item *

I think it would help to give some concrete examples to make the API
easier to understand. Of course I agree that the API has to be
concise, but since there is no second document that is more of a
guide, I think it would be easier for a person who is not a PerlIO
guru (yet) to start with a doc which is an API reference but has
examples in the places where things are unclear.

=back

=cut