=head1 NAME

perliol - C API for Perl's implementation of IO in Layers.

=head1 SYNOPSIS

    /* Defining a layer ... */
    #include <perliol.h>

=head1 DESCRIPTION

This document describes the behavior and implementation of the PerlIO
abstraction described in L<perlapio> when C<USE_PERLIO> is defined (and
C<USE_SFIO> is not).

=head2 History and Background

The PerlIO abstraction was introduced in perl5.003_02 but languished as
just an abstraction until perl5.7.0. However during that time a number
of perl extensions switched to using it, so the API is mostly fixed to
maintain (source) compatibility.

The aim of the implementation is to provide the PerlIO API in a flexible
and platform neutral manner. It is also a trial of an "Object Oriented
C, with vtables" approach which may be applied to Perl 6.

=head2 Basic Structure

PerlIO is a stack of layers.

The low levels of the stack work with the low-level operating system
calls (file descriptors in C) getting bytes in and out, the higher
layers of the stack buffer, filter, and otherwise manipulate the I/O,
and return characters (or bytes) to Perl. Terms I<above> and I<below>
are used to refer to the relative positioning of the stack layers.

A layer contains a "vtable", the table of I/O operations (at C level
a table of function pointers), and status flags. The functions in the
vtable implement operations like "open", "read", and "write".

When I/O, for example "read", is requested, the request goes from Perl
first down the stack using "read" functions of each layer, then at the
bottom the input is requested from the operating system services, then
the result is returned up the stack, finally being interpreted as Perl
data.

The requests do not necessarily go always all the way down to the
operating system: that's where PerlIO buffering comes into play.

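The way a buffering layer shields the bottom of the stack from most
requests can be illustrated with a small stand-alone model. This is only
a sketch under stated assumptions: every name in it (C<ToyLayer>,
C<ToySource>, C<ToyBuffer>, C<toy_count_bottom_calls>) is hypothetical and
is not part of F<perliol.h>, and the real PerlIO structures carry far more
state. It shows a two-layer stack in which single-byte reads at the top
cause only occasional reads at the bottom.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: these names are hypothetical, NOT the perliol.h API. */
typedef struct ToyLayer ToyLayer;
struct ToyLayer {
    ToyLayer *next;                                     /* lower layer, NULL at the bottom */
    long (*read)(ToyLayer *self, char *out, size_t n);  /* one slot of a "vtable" */
};

/* Bottom layer: stands in for the operating system and counts its calls. */
typedef struct {
    ToyLayer base;      /* first member, so a ToySource* is also a ToyLayer* */
    const char *data;
    size_t len, pos;
    int calls;
} ToySource;

static long toy_source_read(ToyLayer *self, char *out, size_t n)
{
    ToySource *s = (ToySource *)self;
    size_t left = s->len - s->pos;
    if (n > left) n = left;
    memcpy(out, s->data + s->pos, n);
    s->pos += n;
    s->calls++;
    return (long)n;
}

/* Buffering layer: asks the layer below only when its buffer is empty. */
typedef struct {
    ToyLayer base;
    char buf[16];
    size_t have, used;
} ToyBuffer;

static long toy_buffer_read(ToyLayer *self, char *out, size_t n)
{
    ToyBuffer *b = (ToyBuffer *)self;
    if (b->used == b->have) {
        long got;
        assert(self->next != NULL);
        got = self->next->read(self->next, b->buf, sizeof b->buf);
        if (got <= 0) return got;
        b->have = (size_t)got;
        b->used = 0;
    }
    if (n > b->have - b->used) n = b->have - b->used;
    memcpy(out, b->buf + b->used, n);
    b->used += n;
    return (long)n;
}

/* Read `total` bytes one at a time through the stack and report how
 * many times the bottom layer was actually consulted. */
int toy_count_bottom_calls(const char *text, size_t total)
{
    ToySource src = { { NULL, toy_source_read }, text, strlen(text), 0, 0 };
    ToyBuffer top = { { &src.base, toy_buffer_read }, {0}, 0, 0 };
    char c;
    size_t i;
    for (i = 0; i < total; i++) {
        if (top.base.read(&top.base, &c, 1) != 1) break;
    }
    return src.calls;
}
```

Calling C<toy_count_bottom_calls()> with a string shorter than the
16-byte toy buffer and reading it byte by byte reaches the bottom layer
once; the remaining reads are served from the buffer.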

When you do an open() and specify extra PerlIO layers to be deployed,
the layers you specify are "pushed" on top of the already existing
default stack. One way to see it is that "operating system is
on the left" and "Perl is on the right".

What exact layers are in this default stack depends on a lot of
things: your operating system, Perl version, Perl compile time
configuration, and Perl runtime configuration. See L<PerlIO>,
L<perlrun/PERLIO>, and L<open> for more information.

binmode() operates similarly to open(): by default the specified
layers are pushed on top of the existing stack.

However, note that even as the specified layers are "pushed on top"
for open() and binmode(), this doesn't mean that the effects are
limited to the "top": PerlIO layers can be very 'active' and inspect
and affect layers also deeper in the stack. As an example there
is a layer called "raw" which repeatedly "pops" layers until
it reaches the first layer that has declared itself capable of
handling binary data. The "pushed" layers are processed in left-to-right
order.

sysopen() operates (unsurprisingly) at a lower level in the stack than
open(). For example in UNIX or UNIX-like systems sysopen() operates
directly at the level of file descriptors: in the terms of PerlIO
layers, it uses only the "unix" layer, which is a rather thin wrapper
on top of the UNIX file descriptors.

=head2 Layers vs Disciplines

Initial discussion of the ability to modify IO streams behaviour used
the term "discipline" for the entities which were added. This came (I
believe) from the use of the term in "sfio", which in turn borrowed it
from "line disciplines" on Unix terminals. However, this document (and
the C code) uses the term "layer".

This is, I hope, a natural term given the implementation, and should
avoid connotations that are inherent in earlier uses of "discipline"
for things which are rather different.


=head2 Data Structures

The basic data structure is a PerlIOl:

    typedef struct _PerlIO PerlIOl;
    typedef struct _PerlIO_funcs PerlIO_funcs;
    typedef PerlIOl *PerlIO;

    struct _PerlIO
    {
        PerlIOl *      next;    /* Lower layer */
        PerlIO_funcs * tab;     /* Functions for this layer */
        IV             flags;   /* Various flags for state */
    };

A C<PerlIOl *> is a pointer to the struct, and the I<application>
level C<PerlIO *> is a pointer to a C<PerlIOl *> - i.e. a pointer
to a pointer to the struct. This allows the application level C<PerlIO *>
to remain constant while the actual C<PerlIOl *> underneath
changes. (Compare perl's C<SV *> which remains constant while its
C<sv_any> field changes as the scalar's type changes.) An IO stream is
then in general represented as a pointer to this linked-list of
"layers".

It should be noted that because of the double indirection in a C<PerlIO *>,
a C<< &(perlio->next) >> "is" a C<PerlIO *>, and so to some degree
at least one layer can use the "standard" API on the next layer down.

A "layer" is composed of two parts:

=over 4

=item 1.

The functions and attributes of the "layer class".

=item 2.

The per-instance data for a particular handle.

=back

=head2 Functions and Attributes

The functions and attributes are accessed via the "tab" (for table)
member of C<PerlIOl>. The functions (methods of the layer "class") are
fixed, and are defined by the C<PerlIO_funcs> type.
They are broadly the
same as the public C<PerlIO_xxxxx> functions:

    struct _PerlIO_funcs
    {
        Size_t     fsize;
        char *     name;
        Size_t     size;
        IV         kind;
        IV         (*Pushed)(pTHX_ PerlIO *f,const char *mode,SV *arg, PerlIO_funcs *tab);
        IV         (*Popped)(pTHX_ PerlIO *f);
        PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                           AV *layers, IV n,
                           const char *mode,
                           int fd, int imode, int perm,
                           PerlIO *old,
                           int narg, SV **args);
        IV         (*Binmode)(pTHX_ PerlIO *f);
        SV *       (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags);
        IV         (*Fileno)(pTHX_ PerlIO *f);
        PerlIO *   (*Dup)(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param, int flags);
        /* Unix-like functions - cf sfio line disciplines */
        SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
        SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
        Off_t      (*Tell)(pTHX_ PerlIO *f);
        IV         (*Close)(pTHX_ PerlIO *f);
        /* Stdio-like buffered IO functions */
        IV         (*Flush)(pTHX_ PerlIO *f);
        IV         (*Fill)(pTHX_ PerlIO *f);
        IV         (*Eof)(pTHX_ PerlIO *f);
        IV         (*Error)(pTHX_ PerlIO *f);
        void       (*Clearerr)(pTHX_ PerlIO *f);
        void       (*Setlinebuf)(pTHX_ PerlIO *f);
        /* Perl's snooping functions */
        STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
        Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
        STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
        SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
        void       (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
    };

The first few members of the struct give a function table size for
compatibility check, a "name" for the layer, the size to C<malloc> for
the per-instance data, and some flags which are attributes of the class
as a whole (such as whether it is a buffering layer). Then follow the
functions, which fall into four basic groups:

=over 4

=item 1.


Opening and setup functions

=item 2.

Basic IO operations

=item 3.

Stdio class buffering options.

=item 4.

Functions to support Perl's traditional "fast" access to the buffer.

=back

A layer does not have to implement all the functions, but the whole
table has to be present. Unimplemented slots can be NULL (which will
result in an error when called) or can be filled in with stubs to
"inherit" behaviour from a "base class". This "inheritance" is fixed
for all instances of the layer, but as the layer chooses which stubs
to populate the table, limited "multiple inheritance" is possible.

=head2 Per-instance Data

The per-instance data are held in memory beyond the basic PerlIOl
struct, by making a PerlIOl the first member of the layer's struct
thus:

    typedef struct
    {
        struct _PerlIO base;    /* Base "class" info */
        STDCHAR *      buf;     /* Start of buffer */
        STDCHAR *      end;     /* End of valid part of buffer */
        STDCHAR *      ptr;     /* Current position in buffer */
        Off_t          posn;    /* Offset of buf into the file */
        Size_t         bufsiz;  /* Real size of buffer */
        IV             oneword; /* Emergency buffer */
    } PerlIOBuf;

In this way (as for perl's scalars) a pointer to a PerlIOBuf can be
treated as a pointer to a PerlIOl.

=head2 Layers in action.

                 table           perlio          unix
             |           |
             +-----------+    +----------+    +--------+
    PerlIO ->|           |--->|  next    |--->|  NULL  |
             +-----------+    +----------+    +--------+
             |           |    |  buffer  |    |   fd   |
             +-----------+    |          |    +--------+
             |           |    +----------+


The above attempts to show how the layer scheme works in a simple case.
The application's C<PerlIO *> points to an entry in the table(s)
representing open (allocated) handles. For example the first three slots
in the table correspond to C<stdin>, C<stdout> and C<stderr>.
The table
in turn points to the current "top" layer for the handle - in this case
an instance of the generic buffering layer "perlio". That layer in turn
points to the next layer down - in this case the low-level "unix" layer.

The above is roughly equivalent to a "stdio" buffered stream, but with
much more flexibility:

=over 4

=item *

If Unix level C<read>/C<write>/C<lseek> is not appropriate for (say)
sockets then the "unix" layer can be replaced (at open time or even
dynamically) with a "socket" layer.

=item *

Different handles can have different buffering schemes. The "top"
layer could be the "mmap" layer if reading disk files was quicker
using C<mmap> than C<read>. An "unbuffered" stream can be implemented
simply by not having a buffer layer.

=item *

Extra layers can be inserted to process the data as it flows through.
This was the driving need for including the scheme in perl 5.7.0+ - we
needed a mechanism to allow data to be translated between perl's
internal encoding (conceptually at least Unicode as UTF-8), and the
"native" format used by the system. This is provided by the
":encoding(xxxx)" layer which typically sits above the buffering layer.

=item *

A layer can be added that does "\n" to CRLF translation. This layer
can be used on any platform, not just those that normally do such
things.

=back

=head2 Per-instance flag bits

The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced
from the mode string passed to C<PerlIO_open()>, and state bits for
typical buffer layers.

=over 4

=item PERLIO_F_EOF

End of file.

=item PERLIO_F_CANWRITE

Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.

=item PERLIO_F_CANREAD

Reads are permitted i.e. opened "r" or "w+" (or even "a+" - ick).


=item PERLIO_F_ERROR

An error has occurred (for C<PerlIO_error()>).

=item PERLIO_F_TRUNCATE

Truncate file suggested by open mode.

=item PERLIO_F_APPEND

All writes should be appends.

=item PERLIO_F_CRLF

Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF
mapped to "\n" for input. Normally the provided "crlf" layer is the only
layer that need bother about this. C<PerlIO_binmode()> will mess with this
flag rather than add/remove layers if the C<PERLIO_K_CANCRLF> bit is set
for the layer's class.

=item PERLIO_F_UTF8

Data written to this layer should be UTF-8 encoded; data provided
by this layer should be considered UTF-8 encoded. Can be set on any layer
by the ":utf8" dummy layer. Also set on the ":encoding" layer.

=item PERLIO_F_UNBUF

Layer is unbuffered - i.e. write to next layer down should occur for
each write to this layer.

=item PERLIO_F_WRBUF

The buffer for this layer currently holds data written to it but not sent
to next layer.

=item PERLIO_F_RDBUF

The buffer for this layer currently holds unconsumed data read from
layer below.

=item PERLIO_F_LINEBUF

Layer is line buffered. Write data should be passed to next layer down
whenever a "\n" is seen. Any data beyond the "\n" should then be
processed.

=item PERLIO_F_TEMP

File has been C<unlink()>ed, or should be deleted on C<close()>.

=item PERLIO_F_OPEN

Handle is open.

=item PERLIO_F_FASTGETS

This instance of this layer supports the "fast C<gets>" interface.
Normally set based on C<PERLIO_K_FASTGETS> for the class and by the
existence of the function(s) in the table. However a class that
normally provides that interface may need to avoid it on a
particular instance. The "pending" layer needs to do this when
it is pushed above a layer which does not support the interface.
(Perl's C<sv_gets()> does not expect the stream's fast C<gets> behaviour
to change during one "get".)

=back

=head2 Methods in Detail

=over 4

=item fsize

    Size_t fsize;

Size of the function table. This is compared against the value PerlIO
code "knows" as a compatibility check. Future versions I<may> be able
to tolerate layers compiled against an old version of the headers.

=item name

    char * name;

The name of the layer whose open() method Perl should invoke on
open(). For example if the layer is called APR, you will call:

    open $fh, ">:APR", ...

and Perl knows that it has to invoke the PerlIOAPR_open() method
implemented by the APR layer.

=item size

    Size_t size;

The size of the per-instance data structure, e.g.:

    sizeof(PerlIOAPR)

If this field is zero then C<PerlIO_pushed> does not malloc anything
and assumes the layer's Pushed function will do any required layer stack
manipulation - used to avoid malloc/free overhead for dummy layers.
If the field is non-zero it must be at least the size of C<PerlIOl>;
C<PerlIO_pushed> will allocate memory for the layer's data structures
and link the new layer onto the stream's stack. (If the layer's Pushed
method returns an error indication the layer is popped again.)

=item kind

    IV kind;

=over 4

=item * PERLIO_K_BUFFERED

The layer is buffered.

=item * PERLIO_K_RAW

The layer is acceptable to have in a binmode(FH) stack - i.e. it does not
(or will configure itself not to) transform bytes passing through it.

=item * PERLIO_K_CANCRLF

Layer can translate between "\n" and CRLF line ends.

=item * PERLIO_K_FASTGETS

Layer allows buffer snooping.

=item * PERLIO_K_MULTIARG

Used when the layer's open() accepts more arguments than usual. The
extra arguments should not come before the C<MODE> argument.
When this
flag is used it's up to the layer to validate the args.

=back

=item Pushed

    IV (*Pushed)(pTHX_ PerlIO *f,const char *mode, SV *arg, PerlIO_funcs *tab);

The only absolutely mandatory method. Called when the layer is pushed
onto the stack. The C<mode> argument may be NULL if this occurs
post-open. The C<arg> will be non-C<NULL> if an argument string was
passed. In most cases this should call C<PerlIOBase_pushed()> to
convert C<mode> into the appropriate C<PERLIO_F_XXXXX> flags in
addition to any actions the layer itself takes. If a layer is not
expecting an argument it need neither save the one passed to it, nor
provide C<Getarg()> (it could perhaps C<Perl_warn> that the argument
was unexpected).

Returns 0 on success. On failure returns -1 and should set errno.

=item Popped

    IV (*Popped)(pTHX_ PerlIO *f);

Called when the layer is popped from the stack. A layer will normally
be popped after C<Close()> is called. But a layer can be popped
without being closed if the program is dynamically managing layers on
the stream. In such cases C<Popped()> should free any resources
(buffers, translation tables, ...) not held directly in the layer's
struct. It should also C<Unread()> any unconsumed data that has been
read and buffered from the layer below back to that layer, so that it
can be re-provided to whatever is now above.

Returns 0 on both success and failure. If C<Popped()> returns I<true> then
I<perlio.c> assumes that either the layer has popped itself, or the
layer is super special and needs to be retained for other reasons.
In most cases it should return I<false>.

=item Open

    PerlIO * (*Open)(...);

The C<Open()> method has lots of arguments because it combines the
functions of perl's C<open>, C<PerlIO_open>, perl's C<sysopen>,
C<PerlIO_fdopen> and C<PerlIO_reopen>.
The full prototype is as
follows:

    PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
                     AV *layers, IV n,
                     const char *mode,
                     int fd, int imode, int perm,
                     PerlIO *old,
                     int narg, SV **args);

Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate
a slot in the table and associate it with the layers information for
the opened file, by calling C<PerlIO_push>. The I<layers> AV is an
array of all the layers destined for the C<PerlIO *>, and any
arguments passed to them; I<n> is the index into that array of the
layer being called. The macro C<PerlIOArg> will return a (possibly
C<NULL>) SV * for the argument passed to the layer.

The I<mode> string is an "C<fopen()>-like" string which would match
the regular expression C</^[I#]?[rwa]\+?[bt]?$/>.

The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via
special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is
C<sysopen> and that I<imode> and I<perm> should be passed to
C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and
C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and
writing/appending are permitted. The C<'b'> suffix means file should
be binary, and C<'t'> means it is text. (Almost all layers should do
the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer
should be pushed to handle the distinction.)

If I<old> is not C<NULL> then this is a C<PerlIO_reopen>. Perl itself
does not use this (yet?) and semantics are a little vague.

If I<fd> is not negative then it is the numeric file descriptor I<fd>,
which will be open in a manner compatible with the supplied mode
string; the call is thus equivalent to C<PerlIO_fdopen>. In this case
I<narg> will be zero.

If I<narg> is greater than zero then it gives the number of arguments
passed to C<open>, otherwise it will be 1 if for example
C<PerlIO_open> was called.
In simple cases SvPV_nolen(*args) is the
pathname to open.

Having said all that, translation-only layers do not need to provide
C<Open()> at all, but rather leave the opening to a lower level layer
and wait to be "pushed". If a layer does provide C<Open()> it should
normally call the C<Open()> method of next layer down (if any) and
then push itself on top if that succeeds.

If C<PerlIO_push> was performed and open has failed, the layer must
C<PerlIO_pop> itself; if it does not, the layer won't be removed
and may cause bad problems.

Returns C<NULL> on failure.

=item Binmode

    IV (*Binmode)(pTHX_ PerlIO *f);

Optional. Used when the C<:raw> layer is pushed (explicitly or as a result
of binmode(FH)). If not present the layer will be popped. If present it
should configure the layer as binary (or pop itself) and return 0.
If it returns -1 for error, C<binmode> will fail with the layer
still on the stack.

=item Getarg

    SV * (*Getarg)(pTHX_ PerlIO *f,
                   CLONE_PARAMS *param, int flags);

Optional. If present it should return an SV * representing the string
argument passed to the layer when it was
pushed. e.g. ":encoding(ascii)" would return an SvPV with value
"ascii". (The I<param> and I<flags> arguments can be ignored in most
cases.)

C<Dup> uses C<Getarg> to retrieve the argument originally passed to
C<Pushed>, so you must implement this function if your layer has an
extra argument to C<Pushed> and will ever be C<Dup>ed.

=item Fileno

    IV (*Fileno)(pTHX_ PerlIO *f);

Returns the Unix/Posix numeric file descriptor for the handle. Normally
C<PerlIOBase_fileno()> (which just asks next layer down) will suffice
for this.

Returns -1 on error, which is considered to include the case where the
layer cannot provide such a file descriptor.


=item Dup

    PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                    CLONE_PARAMS *param, int flags);

XXX: Needs more docs.

Used as part of the "clone" process when a thread is spawned (in which
case param will be non-NULL) and when a stream is being duplicated via
'&' in the C<open>.

Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure.

=item Read

    SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);

Basic read operation.

Typically will call C<Fill> and manipulate pointers (possibly via the
API). C<PerlIOBuf_read()> may be suitable for derived classes which
provide "fast gets" methods.

Returns actual bytes read, or -1 on an error.

=item Unread

    SSize_t (*Unread)(pTHX_ PerlIO *f,
                      const void *vbuf, Size_t count);

A superset of stdio's C<ungetc()>. Should arrange for future reads to
see the bytes in C<vbuf>. If there is no obviously better implementation
then C<PerlIOBase_unread()> provides the function by pushing a "fake"
"pending" layer above the calling layer.

Returns the number of unread chars.

=item Write

    SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);

Basic write operation.

Returns bytes written or -1 on an error.

=item Seek

    IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);

Position the file pointer. Should normally call its own C<Flush>
method and then the C<Seek> method of next layer down.

Returns 0 on success, -1 on failure.

=item Tell

    Off_t (*Tell)(pTHX_ PerlIO *f);

Return the file pointer. May be based on the layer's cached concept of
position to avoid overhead.

Returns -1 on failure to get the file pointer.

=item Close

    IV (*Close)(pTHX_ PerlIO *f);

Close the stream.
Should normally call C<PerlIOBase_close()> to flush
itself and close layers below, and then deallocate any data structures
(buffers, translation tables, ...) not held directly in the data
structure.

Returns 0 on success, -1 on failure.

=item Flush

    IV (*Flush)(pTHX_ PerlIO *f);

Should make stream's state consistent with layers below. That is, any
buffered write data should be written, and file position of lower layers
adjusted for data read from below but not actually consumed.
(Should perhaps C<Unread()> such data to the lower layer.)

Returns 0 on success, -1 on failure.

=item Fill

    IV (*Fill)(pTHX_ PerlIO *f);

The buffer for this layer should be filled (for read) from layer
below. When you "subclass" PerlIOBuf layer, you want to use its
I<_read> method and to supply your own fill method, which fills the
PerlIOBuf's buffer.

Returns 0 on success, -1 on failure.

=item Eof

    IV (*Eof)(pTHX_ PerlIO *f);

Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient.

Returns 0 on end-of-file, 1 if not end-of-file, -1 on error.

=item Error

    IV (*Error)(pTHX_ PerlIO *f);

Return error indicator. C<PerlIOBase_error()> is normally sufficient.

Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set),
0 otherwise.

=item Clearerr

    void (*Clearerr)(pTHX_ PerlIO *f);

Clear end-of-file and error indicators. Should call C<PerlIOBase_clearerr()>
to set the C<PERLIO_F_XXXXX> flags, which may suffice.

=item Setlinebuf

    void (*Setlinebuf)(pTHX_ PerlIO *f);

Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the
PERLIO_F_LINEBUF flag and is normally sufficient.

=item Get_base

    STDCHAR * (*Get_base)(pTHX_ PerlIO *f);

Allocate (if not already done so) the read buffer for this layer and
return pointer to it. Return NULL on failure.


=item Get_bufsiz

    Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);

Return the number of bytes that last C<Fill()> put in the buffer.

=item Get_ptr

    STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);

Return the current read pointer relative to this layer's buffer.

=item Get_cnt

    SSize_t (*Get_cnt)(pTHX_ PerlIO *f);

Return the number of bytes left to be read in the current buffer.

=item Set_ptrcnt

    void (*Set_ptrcnt)(pTHX_ PerlIO *f,
                       STDCHAR *ptr, SSize_t cnt);

Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>.
The application (or layer above) must ensure they are consistent.
(Checking is allowed by the paranoid.)

=back

=head2 Utilities

To ask for the next layer down use PerlIONext(PerlIO *f).

To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All
this does is really just to check that the pointer is non-NULL and
that the pointer behind that is non-NULL.)

PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
the C<PerlIOl*> pointer.

PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
calls the I<callback> from the functions of the layer I<f> (just by
the name of the IO function, like "Read") with the I<args>, or if
there is no such callback, calls the I<base> version of the callback
with the same args, or if the f is invalid, sets errno to EBADF and
returns I<failure>.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, sets errno to EINVAL. Or if the f is
invalid, sets errno to EBADF and returns I<failure>.

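The dispatch rule described for C<Perl_PerlIO_or_Base> and
C<Perl_PerlIO_or_fail> can be modelled in plain C. The following is a
sketch only, not the macros themselves: the types and functions
(C<ToyHandle>, C<ToyFuncs>, C<toy_tell_or_base>, C<toy_tell_or_fail>,
C<toy_dispatch_demo>) are hypothetical stand-ins, a single C<Tell> slot
represents the whole table, and returning the failure value in the
EINVAL case is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model only: hypothetical names, NOT the macros from perliol.h. */
typedef struct ToyHandle ToyHandle;
typedef struct {
    long (*Tell)(ToyHandle *h);     /* may be NULL: slot not implemented */
} ToyFuncs;
struct ToyHandle {
    const ToyFuncs *tab;
    long pos;
};

/* "Base class" behaviour used when the layer leaves the slot empty. */
static long toy_base_tell(ToyHandle *h) { return h->pos; }

/* A layer-specific override. */
static long toy_doubling_tell(ToyHandle *h) { return h->pos * 2; }

/* or_Base style: layer callback, else base version; EBADF if invalid. */
long toy_tell_or_base(ToyHandle *h, long failure)
{
    if (h == NULL || h->tab == NULL) { errno = EBADF; return failure; }
    if (h->tab->Tell != NULL) return h->tab->Tell(h);
    return toy_base_tell(h);
}

/* or_fail style: layer callback, else EINVAL; EBADF if invalid. */
long toy_tell_or_fail(ToyHandle *h, long failure)
{
    if (h == NULL || h->tab == NULL) { errno = EBADF; return failure; }
    if (h->tab->Tell != NULL) return h->tab->Tell(h);
    errno = EINVAL;
    return failure;
}

/* Exercise the cases; returns 1 if they all behave as described. */
int toy_dispatch_demo(void)
{
    static const ToyFuncs with_tell = { toy_doubling_tell };
    static const ToyFuncs without_tell = { NULL };
    ToyHandle a = { &with_tell, 21 };
    ToyHandle b = { &without_tell, 21 };
    int ok = 1;

    ok &= toy_tell_or_base(&a, -1) == 42;   /* layer's own callback */
    ok &= toy_tell_or_base(&b, -1) == 21;   /* falls back to the base version */
    errno = 0;
    ok &= toy_tell_or_fail(&b, -1) == -1 && errno == EINVAL;
    errno = 0;
    ok &= toy_tell_or_base(NULL, -1) == -1 && errno == EBADF;
    return ok;
}
```

The C<_void> variants follow the same pattern but have no failure
value to return.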

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, calls the I<base> version of the
callback with the same args, or if the f is invalid, set errno to
EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
I<callback> of the functions of the layer I<f> with the I<args>, or if
there is no such callback, set errno to EINVAL. Or if the f is
invalid, set errno to EBADF.

=head2 Implementing PerlIO Layers

If you find the implementation document unclear or not sufficient,
look at the existing PerlIO layer implementations, which include:

=over

=item * C implementations

The F<perlio.c> and F<perliol.h> in the Perl core implement the
"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
layers, and also the "mmap" and "win32" layers if applicable.
(The "win32" is currently unfinished and unused, to see what is used
instead in Win32, see L<PerlIO/"Querying the layers of filehandles"> .)

PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.

PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.

=item * Perl implementations

PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.

=back

If you are creating a PerlIO layer, you may want to be lazy, in other
words, implement only the methods that interest you. The other methods
you can either replace with the "blank" methods

    PerlIOBase_noop_ok
    PerlIOBase_noop_fail

(which do nothing, and return zero and -1, respectively) or for
certain methods you may assume a default behaviour by using a NULL
method. The Open method looks for help in the 'parent' layer.
The following table summarizes the behaviour:

    method      behaviour with NULL

    Clearerr    PerlIOBase_clearerr
    Close       PerlIOBase_close
    Dup         PerlIOBase_dup
    Eof         PerlIOBase_eof
    Error       PerlIOBase_error
    Fileno      PerlIOBase_fileno
    Fill        FAILURE
    Flush       SUCCESS
    Getarg      SUCCESS
    Get_base    FAILURE
    Get_bufsiz  FAILURE
    Get_cnt     FAILURE
    Get_ptr     FAILURE
    Open        INHERITED
    Popped      SUCCESS
    Pushed      SUCCESS
    Read        PerlIOBase_read
    Seek        FAILURE
    Set_cnt     FAILURE
    Set_ptrcnt  FAILURE
    Setlinebuf  PerlIOBase_setlinebuf
    Tell        FAILURE
    Unread      PerlIOBase_unread
    Write       FAILURE

    FAILURE     Set errno (to EINVAL in UNIXish, to LIB$_INVARG in VMS) and
                return -1 (for numeric return values) or NULL (for pointers)
    INHERITED   Inherited from the layer below
    SUCCESS     Return 0 (for numeric return values) or a pointer

=head2 Core Layers

The file C<perlio.c> provides the following layers:

=over 4

=item "unix"

A basic non-buffered layer which calls Unix/POSIX C<read()>, C<write()>,
C<lseek()>, C<close()>. No buffering. Even on platforms that distinguish
between O_TEXT and O_BINARY this layer is always O_BINARY.

=item "perlio"

A very complete generic buffering layer which provides the whole of
PerlIO API. It is also intended to be used as a "base class" for other
layers. (For example its C<Read()> method is implemented in terms of
the C<Get_cnt()>/C<Get_ptr()>/C<Set_ptrcnt()> methods).

"perlio" over "unix" provides a complete replacement for stdio as seen
via PerlIO API. This is the default for USE_PERLIO when system's stdio
does not permit perl's "fast gets" access, and which do not
distinguish between C<O_TEXT> and C<O_BINARY>.

=item "stdio"

A layer which provides the PerlIO API via the layer scheme, but
implements it by calling system's stdio.
This is (currently) the default
if system's stdio provides sufficient access to allow perl's "fast gets"
access and which do not distinguish between C<O_TEXT> and C<O_BINARY>.

=item "crlf"

A layer derived using "perlio" as a base class. It provides Win32-like
"\n" to CR,LF translation. Can either be applied above "perlio" or serve
as the buffer layer itself. "crlf" over "unix" is the default if system
distinguishes between C<O_TEXT> and C<O_BINARY> opens. (At some point
"unix" will be replaced by a "native" Win32 IO layer on that platform,
as Win32's read/write layer has various drawbacks.) The "crlf" layer is
a reasonable model for a layer which transforms data in some way.

=item "mmap"

If Configure detects C<mmap()> functions this layer is provided (with
"perlio" as a "base") which does "read" operations by mmap()ing the
file. Performance improvement is marginal on modern systems, so it is
mainly there as a proof of concept. It is likely to be unbundled from
the core at some point. The "mmap" layer is a reasonable model for a
minimalist "derived" layer.

=item "pending"

An "internal" derivative of "perlio" which can be used to provide
Unread() function for layers which have no buffer or cannot be
bothered. (Basically this layer's C<Fill()> pops itself off the stack
and so resumes reading from layer below.)

=item "raw"

A dummy layer which never exists on the layer stack. Instead when
"pushed" it actually pops the stack removing itself, it then calls
Binmode function table entry on all the layers in the stack - normally
this (via PerlIOBase_binmode) removes any layers which do not have
C<PERLIO_K_RAW> bit set. Layers can modify that behaviour by defining
their own Binmode entry.

=item "utf8"

Another dummy layer.
When pushed it pops itself and sets the
C<PERLIO_F_UTF8> flag on the layer which was (and now is once more)
the top of the stack.

=back

In addition F<perlio.c> also provides a number of C<PerlIOBase_xxxx()>
functions which are intended to be used in the table slots of classes
which do not need to do anything special for a particular method.

=head2 Extension Layers

Layers can be made available by extension modules. When an unknown layer
is encountered the PerlIO code will perform the equivalent of:

    use PerlIO 'layer';

Where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to:

    require PerlIO::layer;

If after that process the layer is still not defined then the C<open>
will fail.

The following extension layers are bundled with perl:

=over 4

=item ":encoding"

    use Encoding;

makes this layer available, although F<PerlIO.pm> "knows" where to
find it. It is an example of a layer which takes an argument as it is
called thus:

    open( $fh, "<:encoding(iso-8859-7)", $pathname );

=item ":scalar"

Provides support for reading data from and writing data to a scalar.

    open( $fh, "+<:scalar", \$scalar );

When a handle is so opened, then reads get bytes from the string value
of I<$scalar>, and writes change the value. In both cases the position
in I<$scalar> starts as zero but can be altered via C<seek>, and
determined via C<tell>.

Please note that this layer is implied when calling open() thus:

    open( $fh, "+<", \$scalar );

=item ":via"

Provided to allow layers to be implemented as Perl code. For instance:

    use PerlIO::via::StripHTML;
    open( my $fh, "<:via(StripHTML)", "index.html" );

See L<PerlIO::via> for details.

=back

=head1 TODO

Things that need to be done to improve this document.


=over

=item *

Explain how to make a valid fh without going through open() (i.e. apply
a layer). For example if the file is not opened through perl, but we
want to get back a fh, like it was opened by Perl.

How PerlIO_apply_layera fits in, where its docs are, was it made public?

Currently the example could be something like this:

    PerlIO *foo_to_PerlIO(pTHX_ char *mode, ...)
    {
        /* mode is "w", "r", etc */
        const char *layers = ":APR"; /* the layer name */
        PerlIO *f = PerlIO_allocate(aTHX);
        if (!f) {
            return NULL;
        }

        PerlIO_apply_layers(aTHX_ f, mode, layers);

        if (f) {
            PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR);
            /* fill in the st struct, as in _open() */
            st->file = file;   /* file comes from the elided arguments */
            PerlIOBase(f)->flags |= PERLIO_F_OPEN;

            return f;
        }
        return NULL;
    }

=item *

fix/add the documentation in places marked as XXX.

=item *

The handling of errors by the layer is not specified. e.g. when $!
should be set explicitly, when the error handling should be just
delegated to the top layer.

Probably give some hints on using SETERRNO() or pointers to where they
can be found.

=item *

I think it would help to give some concrete examples to make it easier
to understand the API. Of course I agree that the API has to be
concise, but since there is no second document that is more of a
guide, I think that it'd make it easier to start with the doc which is
an API, but has examples in it in places where things are unclear, to
a person who is not a PerlIO guru (yet).

=back

=cut