.FP lucidasans
.TL
Changes to the Programming Environment
.br
in the
.br
Fourth Release of Plan 9
.AU
Rob Pike
.sp
rob@plan9.bell-labs.com
.SH
Introduction
.PP
The fourth release of Plan 9 includes changes at many levels of the system,
with repercussions in the libraries and program interfaces.
This document summarizes the changes and describes how
existing programs must be modified to run in the new release.
It is not exhaustive, of course; for further detail about any of the
topics refer to the manual pages, as always.
.PP
Programmers new to Plan 9 may find valuable tidbits here, but the
real audience for this paper is those with a need to update applications
and servers written in C for earlier releases of the Plan 9 operating system.
.SH
9P, NAMELEN, and strings
.PP
The underlying file service protocol for Plan 9, 9P, retains its basic form
but has had a number of adjustments to deal with longer file names and error strings,
to support new authentication mechanisms, and to make it more efficient at
evaluating file names.
The change to file names affects a number of system interfaces;
because file name elements are no longer of fixed size, they can
no longer be stored as arrays.
.PP
9P used to be a fixed-format protocol with
.CW NAMELEN -sized
byte arrays representing file name elements.
Now, it is a variable-format protocol, as described in
.I intro (5),
in which strings are represented by a count followed by that many bytes.
Thus, the string
.CW ken
would previously have occupied 28
.CW NAMELEN ) (
bytes in the message; now it occupies 5: a two-byte count followed by the three bytes of
.CW ken
and no terminal zero.
(And of course, a name could now be much longer.)
A similar format change has been made to
.CW stat
buffers: they are no longer
.CW DIRLEN
bytes long but instead have variable size prefixed by a two-byte count.
And in fact the entire 9P message syntax has changed: every message
now begins with a message length field that makes it trivial to break the
byte stream into messages without parsing them, so
.CW aux/fcall
is gone.
A new library entry point,
.CW read9pmsg ,
makes it easy for user-level servers to break the client data stream into 9P messages.
All servers should switch from using
.CW read
(or the now gone
.CW getS )
to using
.CW read9pmsg .
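.PP
The string and framing rules above can be sketched in a few lines of portable C.
This is an illustration only, not code from the system: the helper names
.CW put9pstr
and
.CW firstmsglen
are invented here, and the layout assumed for the length field
(four bytes, little-endian, counting itself) is the one given in
.I intro (5).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Append a 9P-style string to p: a two-byte little-endian count
 * followed by the bytes themselves, with no terminating zero.
 * Returns the number of bytes written, or 0 if it does not fit.
 * (put9pstr is a hypothetical helper, not a library routine.)
 */
size_t
put9pstr(uint8_t *p, size_t room, const char *s)
{
	size_t n = strlen(s);

	if(n > 0xFFFF || room < 2+n)
		return 0;
	p[0] = n & 0xFF;
	p[1] = (n >> 8) & 0xFF;
	memcpy(p+2, s, n);
	return 2+n;
}

/*
 * Report the length of the first complete message in buf, or 0 if
 * more data is needed (or the size field is malformed).  Only the
 * leading size field is examined; the body is never parsed.
 * (firstmsglen is likewise hypothetical.)
 */
size_t
firstmsglen(const uint8_t *buf, size_t have)
{
	uint32_t len;

	if(have < 4)
		return 0;
	len = buf[0] | buf[1]<<8 | buf[2]<<16 | (uint32_t)buf[3]<<24;
	if(len < 4 || len > have)
		return 0;
	return len;
}
```

Encoding
.CW ken
this way yields the five bytes counted in the text; a server loop would call something like
.CW firstmsglen
repeatedly to peel messages off its input, which is the job
.CW read9pmsg
does for real programs.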
.PP
This change to 9P affects the way strings are handled by the kernel and throughout
the system.
The consequences are primarily that fixed-size arrays have been replaced
by pointers and counts in a variety of system interfaces.
Most programs will need at least some adjustment to the new style.
In summary:
.CW NAMELEN
is gone, except as a vestige in the authentication libraries, where it has been
rechristened
.CW ANAMELEN .
.CW DIRLEN
and
.CW ERRLEN
are also gone.
All programs that mention
these constants
will need to be fixed.
.PP
The simplest place to see this change is in the
.CW errstr
system call, which no longer assumes a buffer of length
.CW ERRLEN
but now requires a byte-count argument:
.P1
char buf[...];

errstr(buf, sizeof buf);
.P2
The buffer can be any size you like.
For convenience, the kernel stores error strings internally as 256-byte arrays,
so if you like \(em but it's not required \(em you can use the defined constant
.CW ERRMAX= 256
as a good buffer size.
Unlike the old
.CW ERRLEN
(which had value 64),
.CW ERRMAX
is advisory, not mandatory, and is not part of the 9P specification.
.PP
With names, stat buffers, and directories, there isn't even an echo of a fixed-size array any more.
.SH
Directories and wait messages
.PP
With strings now variable-length, a number of system calls needed to change:
.CW errstr ,
.CW stat ,
.CW fstat ,
.CW wstat ,
.CW fwstat ,
and
.CW wait
are all affected, as is
.CW read
when applied to directories.
.PP
As far as directories are concerned, most programs don't use the system calls
directly anyway, since they operate on the machine-independent form, but
instead call the machine-dependent
.CW Dir
routines
.CW dirstat ,
.CW dirread ,
etc.
These used to fill user-provided fixed-size buffers; now they return objects allocated
by
.CW malloc
(which must therefore be freed after use).
To `stat' a file:
.P1
Dir *d;

d = dirstat(filename);
if(d == nil){
	fprint(2, "can't stat %s: %r\en", filename);
	exits("stat");
}
use(d);
free(d);
.P2
A common new bug is to forget to free a
.CW Dir
returned by
.CW dirstat .
.PP
.CW Dirfstat
and
.CW Dirfwstat
work pretty much as before, but changes to 9P make
it possible to exercise finer-grained control on what fields
of the
.CW Dir
are to be changed; see
.I stat (2)
and
.I stat (5)
for details.
.PP
Reading a directory works in a similar way to
.CW dirstat ,
with
.CW dirread
allocating and filling in an array of
.CW Dir
structures.
The return value is the number of elements of the array.
The arguments to
.CW dirread
now include a pointer to a
.CW Dir*
to be filled in with the address of the allocated array:
.P1
Dir *d;
int i, n;

while((n = dirread(fd, &d)) > 0){
	for(i=0; i<n; i++)
		use(&d[i]);
	free(d);
}
.P2
A new library function,
.CW dirreadall ,
has the same form as
.CW dirread
but returns the entire directory in one call:
.P1
n = dirreadall(fd, &d);
for(i=0; i<n; i++)
	use(&d[i]);
free(d);
.P2
If your program insists on using the underlying
.CW stat
system call or its relatives, or wants to operate directly on the
machine-independent format returned by
.CW stat
or
.CW read ,
it will need to be modified.
Such programs are rare enough that we'll not discuss them here beyond referring to
the man page
.I stat (2)
for details.
Be aware, though, that it used to be possible to regard the buffer returned by
.CW stat
as a byte array that began with the zero-terminated
name of the file; this is no longer true.
With very rare exceptions, programs that call
.CW stat
would be better recast to use the
.CW Dir
routines or, if their goal is just to test the existence of a file,
.CW access .
.PP
Similar changes have affected the
.CW wait
system call.  In fact,
.CW wait
is no longer a system call but a library routine that calls the new
.CW await
system call and returns a newly allocated machine-dependent
.CW Waitmsg
structure:
.P1
Waitmsg *w;

w = wait();
if(w == nil)
	error("wait: %r");
print("pid is %d; exit string %s\en", w->pid, w->msg);
free(w);
.P2
The exit string
.CW w->msg
may be empty but it will never be a nil pointer.
Again, don't forget to free the structure returned by
.CW wait .
If all you need is the pid, you can call
.CW waitpid ,
which reports just the pid and doesn't return an allocated structure:
.P1
int pid;

pid = waitpid();
if(pid < 0)
	error("wait: %r");
print("pid is %d\en", pid);
.P2
.SH
Quoted strings and tokenize
.PP
.CW Wait
gives us a good opportunity to describe how the system copes with all this
free-format data.
Consider the text returned by the
.CW await
system call, which includes a set of integers (pids and times) and a string (the exit status).
This information is formatted free-form; here is the statement in the kernel that
generates the message:
.P1
n = snprint(a, n, "%d %lud %lud %lud %q",
	wq->w.pid,
	wq->w.time[TUser], wq->w.time[TSys], wq->w.time[TReal],
	wq->w.msg);
.P2
Note the use of
.CW %q
to produce a quoted-string representation of the exit status.
The
.CW %q
format is like %s but will wrap
.CW rc -style
single quotes around the string if it contains white space or is otherwise ambiguous.
The library routine
.CW tokenize
can be used to parse data formatted this way: it splits white-space-separated
fields but understands the
.CW %q
quoting conventions.
Here is how the
.CW wait
library routine builds its
.CW Waitmsg
from the data returned by
.CW await :
.P1
Waitmsg*
wait(void)
{
	int n, l;
	char buf[512], *fld[5];
	Waitmsg *w;

	n = await(buf, sizeof buf-1);
	if(n < 0)
		return nil;
	buf[n] = '\0';
	if(tokenize(buf, fld, nelem(fld)) != nelem(fld)){
		werrstr("couldn't parse wait message");
		return nil;
	}
	l = strlen(fld[4])+1;
	w = malloc(sizeof(Waitmsg)+l);
	if(w == nil)
		return nil;
	w->pid = atoi(fld[0]);
	w->time[0] = atoi(fld[1]);
	w->time[1] = atoi(fld[2]);
	w->time[2] = atoi(fld[3]);
	w->msg = (char*)&w[1];
	memmove(w->msg, fld[4], l);
	return w;
}
.P2
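.PP
To make the quoting half of this concrete, here is a sketch in portable C of what an
.CW rc -style
quoter might do.
It is not the library's implementation of
.CW %q ;
the name
.CW rcquote
is invented, and the rule used to decide when quotes are needed
(an empty string, white space, or an embedded quote) is an assumption made for the example.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'd copy of s, wrapped in rc-style single quotes
 * when that is needed to keep it a single token: here, when s is
 * empty or contains white space or a quote.  An embedded quote is
 * written as two quotes.  The caller frees the result.
 * (rcquote is a hypothetical name.)
 */
char*
rcquote(const char *s)
{
	size_t nq;
	const char *p;
	char *out, *q;
	int need;

	need = *s == 0 || strpbrk(s, " \t\n'") != NULL;
	nq = 0;
	for(p = s; *p; p++)
		if(*p == '\'')
			nq++;
	/* room for the text, doubled quotes, two outer quotes, and NUL */
	out = malloc(strlen(s) + nq + 3);
	if(out == NULL)
		return NULL;
	q = out;
	if(need)
		*q++ = '\'';
	for(p = s; *p; p++){
		if(*p == '\'')
			*q++ = '\'';
		*q++ = *p;
	}
	if(need)
		*q++ = '\'';
	*q = 0;
	return out;
}
```

Under these assumptions an exit status such as
.CW "can't open file"
travels as one field in the wait message, while a simple word passes through untouched.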
.PP
This style of quoted-string and
.CW tokenize
is used all through the system now.
In particular, devices now
.CW tokenize
the messages written to their
.CW ctl
files, which means that you can send messages that contain white space, by quoting them,
and that you no longer need to worry about whether or not the device accepts a newline.
In other words, you can say
.P1
echo message > /dev/xx/ctl
.P2
instead of
.CW echo
.CW -n
because
.CW tokenize
treats the newline character as white space and discards it.
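.PP
The parsing side can be sketched the same way.
The routine below is a portable stand-in written for this note, not the library's
.CW tokenize ;
the name
.CW qtokenize
is invented, and the handling of a doubled quote inside a quoted field follows the
.CW rc
convention as an assumption.

```c
#include <string.h>

/* White space for our purposes: blank, tab, newline, return. */
static int
isspc(int c)
{
	return c != 0 && strchr(" \t\r\n", c) != NULL;
}

/*
 * Split s in place into at most max fields separated by white
 * space, honoring rc-style single quotes: text inside quotes is
 * taken literally and '' inside quotes stands for one quote.
 * Returns the number of fields stored in fld.
 * (qtokenize is a hypothetical stand-in for the library routine.)
 */
int
qtokenize(char *s, char **fld, int max)
{
	int n, inq;
	char *w;

	for(n = 0; n < max; n++){
		while(isspc(*s))
			s++;
		if(*s == 0)
			break;
		fld[n] = w = s;
		inq = 0;
		while(*s && (inq || !isspc(*s))){
			if(*s != '\''){
				*w++ = *s++;
			}else if(inq && s[1] == '\''){
				*w++ = '\'';	/* doubled quote: keep one */
				s += 2;
			}else{
				inq = !inq;	/* opening or closing quote: drop it */
				s++;
			}
		}
		if(*s)
			s++;	/* step past the separator */
		*w = 0;		/* terminate the unquoted field */
	}
	return n;
}
```

Because the trailing newline is just more white space, a control message written with plain
.CW echo
parses to the same fields as one written with
.CW echo
.CW -n .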
.PP
While we're on the subject of quotes and strings, note that the implementation of
.CW await
used
.CW snprint
rather than
.CW sprint .
We now deprecate
.CW sprint
because it has no protection against buffer overflow.
We prefer
.CW snprint
or
.CW seprint ,
to constrain the output.
The
.CW %q
format is cleverer than most in this regard:
if the string is too long to be represented in full,
.CW %q
is smart enough to produce a truncated but correctly quoted
string within the available space.
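.PP
For readers porting code that must also build elsewhere, the bounded style can be approximated on top of ANSI
.CW vsnprintf .
The helper below,
.CW xseprint ,
is a hypothetical name; it imitates the start-and-end-pointer calling convention but is only an approximation and knows nothing of Plan 9 verbs such as
.CW %q .

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Format into the space between p and e, never writing at or past
 * e, and return a pointer to the terminating NUL so that calls can
 * be chained.  On truncation the result points at the last byte of
 * the buffer.  (xseprint is a hypothetical helper built on ANSI
 * vsnprintf, not the Plan 9 routine.)
 */
char*
xseprint(char *p, char *e, const char *fmt, ...)
{
	va_list arg;
	int n;

	if(p >= e)
		return p;
	va_start(arg, fmt);
	n = vsnprintf(p, e-p, fmt, arg);
	va_end(arg);
	if(n < 0){
		*p = 0;
		return p;
	}
	if(n >= e-p)		/* output was truncated to fit */
		return e-1;
	return p+n;
}
```

Successive calls pass the previous return value as the new starting point, so a message can be assembled piece by piece without any call being able to run off the end of the buffer.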
.SH
Mount
.PP
Although strings in 9P are now variable-length and not zero-terminated,
this has little direct effect in most of the system interfaces.
File and user names are still zero-terminated strings as always;
the kernel does the work of translating them as necessary for
transport.
And of course, they are now free to be as long as you might want;
the only hard limit is that their length must be represented in 16 bits.
.PP
One example where this matters is that the file system specification in the
.CW mount
system call can now be much longer.
Programs like
.CW rio
that used the specification string in creative ways were limited by the
.CW NAMELEN
restriction; now they can use the string more freely.
.CW Rio
now accepts a simple but less cryptic specification language for the window
to be created by the
.CW mount
call, e.g.:
.P1
% mount $wsys /mnt/wsys 'new -dx 250 -dy 250 -pid 1234'
.P2
In the old system, this sort of control was impossible through the
.CW mount
interface.
.PP
While we're on the subject of
.CW mount ,
note that with the new security architecture
(see
.I factotum (4)),
9P has moved its authentication outside the protocol proper.
(For a full description of this change to 9P, see
.I fauth (2),
.I attach (5),
and the paper
.I "Security in Plan 9" .)
The most explicit effect of this change is that
.CW mount
now takes another argument,
.CW afd ,
a file descriptor for the
authentication file through which the authentication will be made.
For most user-level file servers, which do not require authentication, it is
sufficient to provide
.CW -1
as the value of
.CW afd :
.P1
if(mount(fd, -1, "/mnt/wsys", MREPL,
   "new -dx 250 -dy 250 -pid 1234") < 0)
	error("mount failed: %r");
.P2
To connect to servers that require authentication, use the new
.CW fauth
system call or the reimplemented
.CW amount
(authenticated mount) library call.
In fact, since
.CW amount
handles both authenticating and non-authenticating servers, it is often
easiest just to replace calls to
.CW mount
by calls to
.CW amount ;
see
.I auth (2)
for details.
.SH
Print
.PP
The C library has been heavily reworked in places.
Besides the changes mentioned above, it
now has a much more complete set of routines for handling
.CW Rune
strings (that is, zero-terminated arrays of 16-bit character values).
The most sweeping changes, however, are in the way formatted I/O is performed.
.PP
The
.CW print
routine and all its relatives have been reimplemented to offer a number
of improvements:
.IP (1)
Better buffer management, including the provision of an internal flush
routine, makes it unnecessary to provide large buffers.
For example,
.CW print
uses a much smaller buffer now (reducing stack load) while simultaneously
removing the need to truncate the output string if it doesn't fit in the buffer.
.IP (2)
Global variables have been eliminated so no locking is necessary.
.IP (3)
The combination of (1) and (2) means that the standard implementation of
.CW print
now works fine in threaded programs, and
.CW threadprint
is gone.
.IP (4)
The new routine
.CW smprint
prints into, and returns, storage allocated on demand by
.CW malloc .
.IP (5)
It is now possible to print into a
.CW Rune
string; for instance,
.CW runesmprint
is the
.CW Rune
analog of
.CW smprint .
.IP (6)
There is improved support for custom
print verbs and custom output routines such as error handlers.
The routine
.CW doprint
is gone, but
.CW vseprint
can always be used instead.
However, the new routines
.CW fmtfdinit ,
.CW fmtstrinit ,
.CW fmtprint ,
and friends
are often a better replacement.
The details are too long for exposition here;
.I fmtinstall (2)
explains the new interface and provides examples.
.IP (7)
Two new format flags, space and comma, close somewhat the gap between
Plan 9 and ANSI C.
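.PP
The idea behind item (4), printing into storage allocated on demand, can be sketched in portable C.
The routine below,
.CW xsmprint ,
is a hypothetical approximation built on ANSI
.CW vsnprintf ;
it is not how the library implements
.CW smprint ,
and it does not understand Plan 9 verbs such as
.CW %r .

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into freshly allocated storage sized to fit the result.
 * The caller frees the returned string; NULL means failure.
 * (xsmprint is a hypothetical helper that mimics the smprint
 * interface using ANSI vsnprintf.)
 */
char*
xsmprint(const char *fmt, ...)
{
	va_list arg;
	int n;
	char *s;

	va_start(arg, fmt);
	n = vsnprintf(NULL, 0, fmt, arg);	/* first pass: measure */
	va_end(arg);
	if(n < 0)
		return NULL;
	s = malloc(n+1);
	if(s == NULL)
		return NULL;
	va_start(arg, fmt);
	vsnprintf(s, n+1, fmt, arg);		/* second pass: format */
	va_end(arg);
	return s;
}
```

As with the
.CW Dir
and
.CW Waitmsg
structures discussed earlier, the price of never truncating is that the caller must remember to free the result.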
.PP
Despite these changes, most programs will be unaffected;
.CW print
is still
.CW print .
Don't forget, though, that
you should eliminate calls to
.CW sprint
and use the
.CW %q
format when appropriate.
.SH
Binary compatibility
.PP
The discussion so far has been about changes at the source level.
Existing binaries will probably run without change in the new
environment, since the kernel provides backward-compatible
system calls for
.CW errstr ,
.CW stat ,
.CW wait ,
etc.
The only exceptions are programs that do either a
.CW mount
system call, because of the security changes and because
the file descriptor in
.CW mount
must point to a new 9P connection; or a
.CW read
system call on a directory, since the returned data will
be in the new format.
A moment's reflection will discover that this means old
user-level file servers will need to be fixed to run on the new system.
.SH
File servers
.PP
A full description of what user-level servers must do to provide service with
the new 9P is beyond the scope of this paper.
Your best source of information is section 5 of the manual,
combined with study of a few examples.
.CW /sys/src/cmd/ramfs.c
is a simple example; it has a counterpart
.CW /sys/src/lib9p/ramfs.c
that implements the same service using the new
.I 9p (2)
library.
.PP
That said, it's worth summarizing what to watch for when converting a file server.
The
.CW session
message is gone, and there is now a
.CW version
message that is exchanged at the start of a connection to establish
the version of the protocol to use (there's only one at the moment, identified by
the string
.CW 9P2000 )
and what the maximum message size will be.
This negotiation makes it easier to handle 9P encapsulation, such as with
.CW exportfs ,
and also permits larger message sizes when appropriate.
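.PP
The server's half of this exchange amounts to very little code.
The sketch below is illustrative portable C, not taken from any server in the system:
.CW negotiate
and the limit
.CW Maxmsg
are invented names, and the two rules it encodes (the reply never exceeds the size the client offered, and an unrecognized version is answered with the string
.CW unknown )
are taken from
.I version (5).

```c
#include <stdint.h>
#include <string.h>

enum { Maxmsg = 8192 };	/* hypothetical server limit, for illustration */

/*
 * Decide the server's reply to a version request.  The reply's
 * message size never exceeds what the client offered, and a version
 * string the server does not know is answered with "unknown".
 * Returns the agreed message size, or 0 if the session cannot
 * proceed.  (This sketch accepts only the exact string 9P2000.)
 */
uint32_t
negotiate(const char *cversion, uint32_t cmsize, const char **reply)
{
	if(strcmp(cversion, "9P2000") != 0){
		*reply = "unknown";
		return 0;
	}
	*reply = "9P2000";
	return cmsize < Maxmsg ? cmsize : Maxmsg;
}
```

Every later message on the connection, in either direction, must then fit within the agreed size.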
.PP
If your server wants to authenticate, it will need to implement an authentication file
and implement the
.CW auth
message; otherwise it should return a helpful error string to the
.CW Tauth
request to signal that authentication is not required.
.PP
The handling of
.CW stat
and directory reads will require some changes but they should not be fundamental.
Be aware that seeking on directories is forbidden, so it is fine if you disregard the
file offset when implementing directory reads; this makes it a little easier to handle
the variable-length entries.
You should still never return a partial directory entry; if the I/O count is too small
to return even one entry, you should return two bytes containing the byte count
required to represent the next entry in the directory.
User code can use this value to formulate a retry if it desires.
See the
DIAGNOSTICS section of
.I stat (2)
for a description of this process.
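.PP
One way a server might satisfy a directory read under these rules is sketched below in portable C.
The type
.CW Ent
and the routine
.CW dirfill
are invented for the example; it assumes each entry has already been converted to its machine-independent form, which begins with its own two-byte count, so returning the count is a matter of copying the entry's first two bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One already-marshaled directory entry (hypothetical type). */
typedef struct Ent Ent;
struct Ent {
	const uint8_t *data;	/* begins with a two-byte count */
	size_t len;		/* total bytes in data */
};

/*
 * Fill buf with as many whole entries as fit in count bytes,
 * starting at entry *next, and advance *next past those copied.
 * If not even one entry fits, return only the entry's leading
 * two-byte count so the caller can retry with a larger buffer.
 * Returns the number of bytes placed in buf; 0 means end of
 * directory.  (dirfill is a hypothetical helper.)
 */
size_t
dirfill(uint8_t *buf, size_t count, const Ent *e, int nent, int *next)
{
	size_t n = 0;

	while(*next < nent && n + e[*next].len <= count){
		memcpy(buf+n, e[*next].data, e[*next].len);
		n += e[*next].len;
		(*next)++;
	}
	if(n == 0 && *next < nent && count >= 2){
		memcpy(buf, e[*next].data, 2);
		return 2;
	}
	return n;
}
```

The server keeps its own position in
.CW next
rather than trusting the file offset, which is what makes the variable-length entries easy to handle.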
.PP
The trickiest part of updating a file server is that the
.CW clone
and
.CW walk
messages have been merged into a single message, a sort of `clone-multiwalk'.
The new message, still called
.CW walk ,
proposes a sequence of file name elements to be evaluated using a possibly
cloned fid.
The return message contains the qids of the files reached by
walking to the sequential elements.
If all the elements can be walked, the fid will be cloned if requested.
If a non-zero number of elements are requested, but none
can be walked, an error should be returned.
If only some can be walked, the fid is not cloned, the original fid is left
where it was, and the returned
.CW Rwalk
message should contain the partial list of successfully reached qids.
See
.I walk (5)
for a full description.
605for a full description.
606