.TL
How to Use the Plan 9 C Compiler
.AU
Rob Pike
rob@plan9.att.com
.SH
Introduction
.PP
The C compiler on Plan 9 is a wholly new program; in fact
it was the first piece of software written for what would
eventually become Plan 9 from Bell Labs.
Programmers familiar with existing C compilers will find
a number of differences both in the language the Plan 9 compiler
accepts and in how the compiler is used.
.PP
The compiler is really a set of compilers, one for each
architecture \(em MIPS, SPARC, Motorola 68020, Intel 386, etc. \(em
that accept a dialect of ANSI C and efficiently produce
fairly good code for the target machine.
There is a packaging of the compiler that accepts strict ANSI C for
a POSIX environment, but this document focuses on the
native Plan 9 environment, that in which all the system source and
almost all the utilities are written.
.SH
Source
.PP
The language accepted by the compilers is the core ANSI C language
with some modest extensions,
a greatly simplified preprocessor,
a smaller library that includes system calls and related facilities,
and a completely different structure for include files.
.PP
Official ANSI C accepts the old (K&R) style of declarations for
functions; the Plan 9 compilers
are more demanding.
Without an explicit run-time flag
.CW -B ) (
whose use is discouraged, the compilers insist
on new-style function declarations, that is, prototypes for
function arguments.
The function declarations in the libraries' include files are
all in the new style so the interfaces are checked at compile time.
For C programmers who have not yet switched to function prototypes
the clumsy syntax may seem repellent but the payoff in stronger typing
is substantial.
Those who wish to import existing software to Plan 9 are urged
to use the opportunity to update their code.
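.PP
As a rough illustration of what the new style buys, here is a minimal sketch
in portable ANSI C.
The function
.CW scale
is a hypothetical name chosen for the example; it is not part of any Plan 9 library.

```c
/* New-style declaration: the prototype records the argument types,
 * so a call with a mistyped argument, such as scale("3", 2),
 * is diagnosed at compile time instead of misbehaving at run time. */
long scale(long value, int factor);

/* The old (K&R) definition would have read
 *	long scale(value, factor)
 *	long value; int factor;
 *	{ ... }
 * which tells callers nothing about the arguments. */
long
scale(long value, int factor)
{
	return value * factor;
}
```

Updating imported code is mostly a matter of rewriting each definition in this form
and declaring the prototype in a header the callers include.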
.PP
The compilers include an integrated preprocessor that accepts the familiar
.CW #include ,
.CW #define
for macros both with and without arguments,
.CW #undef ,
.CW #line ,
.CW #ifdef ,
.CW #ifndef ,
and
.CW #endif .
It
supports neither
.CW #if
nor
.CW ##
and honors a single
.CW #pragma .
The
.CW #if
directive was omitted because it greatly complicates the
preprocessor, is never necessary, and is usually abused.
Conditional compilation in general makes code hard to understand;
the Plan 9 source uses it sparingly.
Also, because the compilers remove dead code, regular
.CW if
statements with constant conditions are more readable equivalents to many
.CW #ifs .
To compile imported code ineluctably fouled by
.CW #if
there is a separate command,
.CW /bin/cpp ,
that implements the complete ANSI C preprocessor specification.
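.PP
The idea of replacing
.CW #if
with an ordinary
.CW if
on a constant can be sketched as follows.
The example is written in portable ANSI C with Standard I/O so it compiles anywhere;
the names
.CW Debug
and
.CW sum
are hypothetical.

```c
#include <stdio.h>

enum
{
	Debug = 0,	/* hypothetical switch; set to 1 to enable tracing */
};

/* Sum the first n bytes at p.  The if(Debug) branch is ordinary C:
 * it is always parsed and type-checked, and a compiler that removes
 * dead code can drop it entirely when Debug is 0. */
int
sum(unsigned char *p, int n)
{
	int i, s;

	s = 0;
	for(i = 0; i < n; i++){
		s += p[i];
		if(Debug)
			fprintf(stderr, "sum: i=%d s=%d\n", i, s);
	}
	return s;
}
```

Unlike text hidden behind a false
.CW #if ,
the tracing code here cannot silently rot, because the compiler still checks it on every build.
In native Plan 9 code one would presumably include
.CW <u.h>
and
.CW <libc.h>
and call
.CW fprint
instead.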
.PP
Include files fall into two groups: machine-dependent and machine-independent.
The machine-independent files occupy the directory
.CW /sys/include ;
the others are placed in a directory appropriate to the machine, such as
.CW /mips/include .
The compiler searches for include files
first in the machine-dependent directory and then
in the machine-independent directory.
At the time of writing there are twenty-two machine-independent include
files and three (per machine) machine-dependent ones:
.CW <ureg.h> ,
.CW <stdarg.h> ,
and
.CW <u.h> .
The first describes the layout of registers on the system stack,
for use by the debugger;
the second, as in ANSI C, defines a portable way to declare variadic
functions.
The third defines some
architecture-dependent types such as
.CW jmp_buf
for
.CW setjmp
and
also a set of
.CW typedef
abbreviations for
.CW unsigned
.CW short
and so on.
.PP
Here is an excerpt from
.CW /68020/include/u.h :
.P1
typedef	unsigned short	ushort;
typedef	unsigned char	uchar;
typedef unsigned long	ulong;
typedef unsigned int	uint;
typedef   signed char	schar;
typedef	long		vlong;

typedef long	jmp_buf[2];
#define	JMPBUFSP	0
#define	JMPBUFPC	1
#define	JMPBUFDPC	0
.P2
The type
.CW vlong
is the largest integer type available; on some architectures it
is a 64-bit value.
The
.CW #define
constants permit an architecture-independent (but compiler-dependent)
implementation of stack-switching using
.CW setjmp
and
.CW longjmp .
.PP
Every Plan 9 C program begins
.P1
#include <u.h>
.P2
because all the other installed header files use the
.CW typedefs
declared in
.CW <u.h> .
.PP
In strict ANSI C, include files are grouped to collect related functions
in a single file: one for string functions, one for memory functions,
one for I/O, and none for system calls.
Each include file is protected by an
.CW #ifdef
to guarantee its contents are seen by the compiler only once.
Plan 9 takes a different approach.  Other than a few include
files that define external formats such as archives, the files in
.CW /sys/include
correspond to
.I libraries.
If a program is using a library, it includes the corresponding header.
The default C library comprises string functions, memory functions, and
so on, largely as in ANSI C, some formatted I/O routines,
plus all the system calls and related functions.
To use these functions, one must
.CW #include
the file
.CW <libc.h> ,
which in turn must follow
.CW <u.h> ,
to define their prototypes for the compiler.
Here is the complete source to the traditional first C program:
.P1
#include <u.h>
#include <libc.h>

void
main(void)
{
	print("hello world\en");
	exits(0);
}
.P2
The
.CW print
routine and its relatives
.CW fprint
and
.CW sprint
resemble the similarly-named functions in Standard I/O but are not
attached to a specific I/O library.
In Plan 9
.CW main
is not integer-valued; it should call
.CW exits ,
which takes a string argument (or null; here ANSI C promotes the 0 to a
.CW char* ).
All these functions are, of course, documented in the Programmer's Manual.
.PP
To use
.CW printf ,
.CW <stdio.h>
must be included to define the function prototype for
.CW printf :
.P1
#include <u.h>
#include <libc.h>
#include <stdio.h>

void
main(int argc, char *argv[])
{
	printf("%s: hello world; argc = %d\en", argv[0], argc);
	exits(0);
}
.P2
In practice, Standard I/O is not used much in Plan 9.  I/O libraries are
discussed in a later section of this document.
.PP
There are libraries for handling regular expressions, bitmap graphics,
windows, and so on, and each has an associated include file.
The manual for each library states which include files are needed.
The files are not protected against multiple inclusion and themselves
contain no nested
.CW #includes .
Instead the
programmer is expected to sort out the requirements
and to
.CW #include
the necessary files once at the top of each source file.  In practice this is
trivial: this way of handling include files is so straightforward
that it is rare for a source file to contain more than half a dozen
.CW #includes .
.PP
The compilers do their own register allocation so the
.CW register
keyword is ignored.
For different reasons,
.CW volatile
and
.CW const
are also ignored.
.PP
To make it easier to share code with other systems, Plan 9 has a version
of the compiler,
.CW pcc ,
that provides the standard ANSI C preprocessor, headers, and libraries
with POSIX extensions.
.CW Pcc
is recommended only
when broad external portability is mandated.  It compiles more slowly,
produces slower code (it takes extra work to simulate POSIX on Plan 9),
eliminates those parts of the Plan 9 interface
not related to POSIX, and illustrates the clumsiness of an environment
designed by committee.
.CW Pcc
is described in more detail in
.I
APE\(emThe ANSI/POSIX Environment,
.R
by Howard Trickey.
.SH
Process
.PP
Each CPU architecture supported by Plan 9 is identified by a single,
arbitrary, alphanumeric character:
.CW v
for MIPS,
.CW k
for SPARC,
.CW x
for AT&T DSP3210,
.CW 2
for Motorola 68020 and 68040,
.CW 8
for Intel 386, and
.CW 6
for Intel 960.
The character labels the support tools and files for that architecture.
For instance, for the 68020 the compiler is
.CW 2c ,
the assembler is
.CW 2a ,
the link editor/loader is
.CW 2l ,
the object files are suffixed
.CW \&.2 ,
and the default name for an executable file is
.CW 2.out .
Before we can use the compiler we therefore need to know which
machine we are compiling for.
The next section explains how this decision is made; for the moment
assume we are building 68020 binaries and make the mental substitution for
.CW 2
appropriate to the machine you are actually using.
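.PP
The naming convention is mechanical enough to write down as code.
The following portable C sketch derives the tool and file names from the
architecture character; the type
.CW Tools
and the function
.CW toolnames
are hypothetical, invented only to spell out the rule stated above.

```c
#include <stdio.h>

typedef struct Tools Tools;
struct Tools
{
	char cc[3];	/* compiler, e.g. "2c" */
	char as[3];	/* assembler, e.g. "2a" */
	char ld[3];	/* loader, e.g. "2l" */
	char suffix[3];	/* object file suffix, e.g. ".2" */
	char out[6];	/* default executable name, e.g. "2.out" */
};

/* Fill in t with the names implied by the architecture character. */
void
toolnames(char arch, Tools *t)
{
	snprintf(t->cc, sizeof t->cc, "%cc", arch);
	snprintf(t->as, sizeof t->as, "%ca", arch);
	snprintf(t->ld, sizeof t->ld, "%cl", arch);
	snprintf(t->suffix, sizeof t->suffix, ".%c", arch);
	snprintf(t->out, sizeof t->out, "%c.out", arch);
}
```

Calling
.CW toolnames
with the character for MIPS, for example, fills in the names of the corresponding
compiler, assembler, loader, object suffix, and default executable.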
.PP
Converting source to an executable binary is a two-step process.
First run the compiler,
.CW 2c ,
on the source, say
.CW file.c ,
to generate an object file
.CW file.2 .
Then run the loader,
.CW 2l ,
to generate an executable
.CW 2.out
that may be run (on a 680X0 machine):
.P1
2c file.c
2l file.2
2.out
.P2
The loader automatically links with whatever libraries the program
needs, usually including the standard C library as defined by
.CW <libc.h> .
Of course the compiler and loader have lots of options, both familiar and new;
see the manual for details.
The compiler does not generate an executable automatically;
the output of the compiler must be given to the loader.
Since most compilation is done under the control of
.CW mk
(see below), this is rarely an inconvenience.
.PP
The distribution of work between the compiler and loader is unusual.
The compiler integrates preprocessing, parsing, register allocation,
code generation and some assembly.
Combining these tasks in a single program is part of the reason for
the compiler's efficiency.
The loader does instruction selection, branch folding,
instruction scheduling,
and writes the final executable.
There is no separate C preprocessor and no assembler in the usual pipeline.
Instead the intermediate object file
(here a
.CW \&.2
file) is a type of binary assembly language.
The instructions in the intermediate format are not exactly those in
the machine.  For example, on the 68020 the object file may specify
a MOVE instruction but the loader will decide just which variant of
the MOVE instruction \(em MOVE immediate, MOVE quick, MOVE address,
etc. \(em is most efficient.
.PP
The assembler,
.CW 2a ,
is just a translator between the textual and binary
representations of the object file format.
It is not an assembler in the traditional sense.  It has limited
macro capabilities (the same as the integral C preprocessor in the compiler),
clumsy syntax, and minimal error checking.  For instance, the assembler
will accept an instruction (such as memory-to-memory MOVE on the MIPS) that the
machine does not actually support; only when the output of the assembler
is passed to the loader will the error be discovered.
The assembler is intended only for writing things that need access to instructions
invisible from C,
such as the machine-dependent
part of an operating system;
very little code in Plan 9 is in assembly language.
.PP
The compilers take an option
.CW -S
that causes them to print on their standard output the generated code
in a format acceptable as input to the assemblers.
This is of course merely a formatting of the
data in the object file; therefore the assembler is just
an
ASCII-to-binary converter for this format.
Other than the specific instructions, the input to the assemblers
is largely architecture-independent; see
``A Manual for the Plan 9 Assembler'',
by Rob Pike,
for more information.
.PP
The loader is an integral part of the compilation process.
Each library header file contains a
.CW #pragma
that tells the loader the name of the associated archive; it is
not necessary to tell the loader which libraries a program uses.
The C run-time startup is found, by default, in the C library.
The loader starts with an undefined
symbol,
.CW _main ,
that is resolved by pulling in the run-time startup code from the library.
(The loader undefines
.CW _mainp
when profiling is enabled, to force loading of the profiling start-up
instead.)
.PP
Unlike its counterpart on other systems, the Plan 9 loader rearranges
data to optimize access.  This means the order of variables in the
loaded program is unrelated to its order in the source.
Most programs don't care, but some assume that, for example, the
variables declared by
.P1
int a;
int b;
.P2
will appear at adjacent addresses in memory.  On Plan 9, they won't.
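.PP
Code that needs two values side by side should say so in its types rather than
rely on declaration order.
A minimal sketch in portable C, using the hypothetical names
.CW pair
and
.CW sumpair :

```c
/* Fragile: nothing guarantees that a and b are adjacent, so code
 * that steps from the address of a to reach b is wrong. */
int a;
int b;

/* Robust: the elements of a single array are contiguous by definition,
 * however the loader arranges separate variables. */
int pair[2];

/* Walk the array through a pointer; this is well defined. */
int
sumpair(void)
{
	int *p;

	p = pair;
	return p[0] + p[1];
}
```

A structure serves the same purpose when the values have different types;
its members keep their declared order even though separate variables do not.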
.SH
Heterogeneity
.PP
When the system starts or a user logs in, the environment is configured
so the appropriate binaries are available in
.CW /bin .
The configuration process is controlled by an environment variable,
.CW $cputype ,
with a value such as
.CW mips ,
.CW 68020 ,
or
.CW sparc .
For each architecture there is a directory in the root,
with the appropriate name,
that holds the binary and library files for that architecture.
Thus
.CW /mips/lib
contains the object code libraries for MIPS programs,
.CW /mips/include
holds MIPS-specific include files, and
.CW /mips/bin
has the MIPS binaries.
These binaries are attached to
.CW /bin
at boot time by binding
.CW /$cputype/bin
to
.CW /bin ,
so
.CW /bin
always contains the correct files.
.PP
The MIPS compiler,
.CW vc ,
by definition
produces object files for the MIPS architecture,
regardless of the architecture of the machine on which the compiler is running.
There is a version of
.CW vc
compiled for each architecture:
.CW /mips/bin/vc ,
.CW /68020/bin/vc ,
.CW /sparc/bin/vc ,
and so on,
each capable of producing MIPS object files regardless of the native
instruction set.
If one is running on a SPARC,
.CW /sparc/bin/vc
will compile programs for the MIPS;
if one is running on machine
.CW $cputype ,
.CW /$cputype/bin/vc
will compile programs for the MIPS.
.PP
Because of the bindings that assemble
.CW /bin ,
the shell always looks for a command, say
.CW date ,
in
.CW /bin
and automatically finds the file
.CW /$cputype/bin/date .
Therefore the MIPS compiler is known as just
.CW vc ;
the shell will invoke
.CW /bin/vc
and that is guaranteed to be the version of the MIPS compiler
appropriate for the machine running the command.
Regardless of the architecture of the compiling machine,
.CW /bin/vc
is
.I always
the MIPS compiler.
.PP
Also, the output of
.CW vc
and
.CW vl
is completely independent of the machine type on which they are executed:
.CW \&.v
files compiled (with
.CW vc )
on a SPARC may be linked (with
.CW vl )
on a 386.
(The resulting
.CW v.out
will run, of course, only on a MIPS.)
Similarly, the MIPS libraries in
.CW /mips/lib
are suitable for loading with
.CW vl
on any machine; there is only one set of MIPS libraries, not one
set for each architecture that supports the MIPS compiler.
.SH
Heterogeneity and \f(CWmk\fP
.PP
Most software on Plan 9 is compiled under the control of
.CW mk ,
a descendant of
.CW make
that is documented in the Programmer's Manual.
A convention used throughout the
.CW mkfiles
makes it easy to compile the source into binary suitable for any architecture.
.PP
The variable
.CW $cputype
is advisory: it reports the architecture of the current environment, and should
not be modified.  A second variable,
.CW $objtype ,
is used to set which architecture is being
.I compiled
for.
The value of
.CW $objtype
can be used by a
.CW mkfile
to configure the compilation environment.
.PP
In each machine's root directory there is a short
.CW mkfile
that defines a set of macros for the compiler, loader, etc.
Here is
.CW /mips/mkfile :
.P1
CC=vc
ALEF=val
LD=vl
O=v
AS=va
OS=2kv86x
CPUS=mips 68020 sparc 386
CFLAGS=
LEX=lex
YACC=yacc
MK=/bin/mk
.P2
.CW CC
is obviously the compiler,
.CW AS
the assembler, and
.CW LD
the loader.
.CW ALEF
identifies the Alef compiler, described below.
.CW O
is the suffix for the object files and
.CW CPUS
and
.CW OS
are used in special rules described below.
.PP
Here is a
.CW mkfile
to build the installed source for
.CW sam :
.P1
</$objtype/mkfile
OBJ=sam.$O address.$O buffer.$O cmd.$O disc.$O error.$O \e
	file.$O io.$O list.$O mesg.$O moveto.$O multi.$O \e
	plan9.$O rasp.$O regexp.$O string.$O sys.$O xec.$O

$O.out:	$OBJ
	$LD $OBJ

install:	$O.out
	cp $O.out /$objtype/bin/sam

installall:
	for(objtype in $CPUS) mk install

%.$O:	%.c
	$CC $CFLAGS $stem.c

$OBJ:	sam.h errors.h mesg.h
address.$O cmd.$O parse.$O xec.$O unix.$O:	parse.h

clean:V:
	rm -f [$OS].out *.[$OS] y.tab.?
.P2
(The actual
.CW mkfile
imports most of its rules from other secondary files, but
this example works and is not misleading.)
The first line causes
.CW mk
to include the contents of
.CW /$objtype/mkfile
in the current
.CW mkfile .
If
.CW $objtype
is
.CW mips ,
this inserts the MIPS macro definitions into the
.CW mkfile .
In this case the rule for
.CW $O.out
uses the MIPS tools to build
.CW v.out .
The
.CW %.$O
rule in the file uses
.CW mk 's
pattern matching facilities to convert the source files to the object
files through the compiler.
(The text of the rules is passed directly to the shell,
.CW rc ,
without further translation.
See the
.CW mk
manual if any of this is unfamiliar.)
Because the default rule builds
.CW $O.out
rather than
.CW sam ,
it is possible to maintain binaries for multiple machines in the
same source directory without conflict.
This is also, of course, why the output files from the various
compilers and loaders
have distinct names.
.PP
The rest of the
.CW mkfile
should be easy to follow; notice how the rules for
.CW clean
and
.CW installall
(that is, install versions for all architectures) use other macros
defined in
.CW /$objtype/mkfile .
In Plan 9,
.CW mkfiles
for commands conventionally contain rules to
.CW install
(compile and install the version for
.CW $objtype ),
.CW installall
(compile and install for all
.CW $objtypes ),
and
.CW clean
(remove all object files, binaries, etc.).
.PP
The
.CW mkfile
is easy to use.  To build a MIPS binary,
.CW v.out :
.P1
% objtype=mips
% mk
.P2
To build and install a MIPS binary:
.P1
% objtype=mips
% mk install
.P2
To build and install all versions:
.P1
% mk installall
.P2
These conventions make cross-compilation as easy to manage
as traditional native compilation.
Plan 9 programs compile and run without change on machines from
large multiprocessors to laptops.  For more information about this process, see
``Plan 9 Mkfiles'',
by Bob Flandrena.
.SH
Portability
.PP
Within Plan 9, it is painless to write portable programs, programs whose
source is independent of the machine on which they execute.
The operating system is fixed and the compiler, headers and libraries
are constant so most of the stumbling blocks to portability are removed.
Attention to a few details can avoid those that remain.
.PP
Plan 9 is a heterogeneous environment, so programs must
.I expect
that external files will be written by programs on machines of different
architectures.
The compilers, for instance, must handle without confusion
object files written by other machines.
The traditional approach to this problem is to pepper the source with
.CW #ifdefs
to turn byte-swapping on and off.
Plan 9 takes a different approach: of the handful of machine-dependent
.CW #ifdefs
in all the source, almost all are deep in the libraries.
Instead programs read and write files in a defined format,
either (for low volume applications) as formatted text, or
(for high volume applications) as binary in a known byte order.
If the external data were written with the most significant
byte first, the following code reads a 4-byte integer correctly
regardless of the architecture of the executing machine (assuming
an unsigned long holds 4 bytes):
.P1
ulong
getlong(void)
{
	ulong l;

	l = (getchar()&0xFF)<<24;
	l |= (getchar()&0xFF)<<16;
	l |= (getchar()&0xFF)<<8;
	l |= (getchar()&0xFF)<<0;
	return l;
}
.P2
Note that this code does not `swap' the bytes; instead it just reads
them in the correct order.
Variations of this code will handle any binary format
and also avoid problems
involving how structures are padded, how words are aligned,
and other impediments to portability.
Be aware, though, that extra care is needed to handle floating point data.
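.PP
Writing is symmetric with reading.
The following is a minimal sketch in portable C, not code from the Plan 9 source.
The names
.CW putlong
and
.CW getlongbuf
are invented for the illustration, and the routines work on a byte buffer
rather than on standard input so that they can be exercised in isolation.

```c
#include <stdint.h>

/* Store v in buf, most significant byte first, whatever the host byte order. */
void
putlong(unsigned char *buf, uint32_t v)
{
	buf[0] = (v >> 24) & 0xFF;
	buf[1] = (v >> 16) & 0xFF;
	buf[2] = (v >> 8) & 0xFF;
	buf[3] = v & 0xFF;
}

/* Buffer-based counterpart of getlong: assemble four bytes, high byte first. */
uint32_t
getlongbuf(const unsigned char *buf)
{
	uint32_t l;

	l = (uint32_t)buf[0] << 24;
	l |= (uint32_t)buf[1] << 16;
	l |= (uint32_t)buf[2] << 8;
	l |= (uint32_t)buf[3];
	return l;
}
```

Because both routines name each byte position explicitly, a value written on one
architecture reads back unchanged on another; neither routine inspects the
host's byte order.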
.PP
Efficiency hounds will argue that this method is unnecessarily slow and clumsy
when the executing machine has the same byte order (and padding and alignment)
as the data.
I/O speed is rarely the bottleneck for an application, however,
and the gain in simplicity of porting and maintaining the code greatly outweighs
the minor speed loss from handling data in this general way.
This method is how the Plan 9 compilers, the window system, and even the file
servers transmit data between programs.
.PP
To port programs beyond Plan 9, where the system interface is more variable,
it is probably necessary to use
.CW pcc
and hope that the target machine supports ANSI C and POSIX.
.SH
I/O
.PP
The default C library, defined by the include file
.CW <libc.h> ,
contains no buffered I/O package.
It does have several entry points for printing formatted text:
.CW print
outputs text to the standard output,
.CW fprint
outputs text to a specified integer file descriptor, and
.CW sprint
places text in a character array.
To access library routines for buffered I/O, a program must
explicitly include the header file associated with an appropriate library.
.PP
The recommended I/O library, used by most Plan 9 utilities, is
.CW bio
(buffered I/O), defined by
.CW <bio.h> .
There also exists an implementation of ANSI Standard I/O,
.CW stdio .
.PP
.CW Bio
is small and efficient, particularly for buffer-at-a-time or
line-at-a-time I/O.
Even for character-at-a-time I/O, however, it is significantly faster than
the Standard I/O library,
.CW stdio .
Its interface is compact and regular, although it lacks a few conveniences.
The most noticeable is that one must explicitly define buffers for standard
input and output;
.CW bio
does not predefine them.  Here is a program to copy input to output a character
at a time using
.CW bio :
.P1
#include <u.h>
#include <libc.h>
#include <bio.h>

Biobuf	bin;
Biobuf	bout;

void
main(void)
{
	int c;

	Binit(&bin, 0, OREAD);
	Binit(&bout, 1, OWRITE);

	while((c=Bgetc(&bin)) != Beof)
		Bputc(&bout, c);
	exits(0);
}
.P2
For peak performance, we could replace
.CW Bgetc
and
.CW Bputc
by their equivalent in-line macros
.CW BGETC
and
.CW BPUTC
but
the performance gain would be modest.
For more information on
.CW bio ,
see the Programmer's Manual.
.PP
Perhaps the most dramatic difference in the I/O interface of Plan 9 from other
systems' is that text is not ASCII.
The format for
text in Plan 9 is a byte-stream encoding of 16-bit characters.
The character set is based on the Unicode Standard and is backward compatible with
ASCII:
characters with value 0 through 127 are the same in both sets.
The 16-bit characters, called
.I runes
in Plan 9, are encoded using a representation called
UTF,
an encoding that is becoming accepted as a standard.
(ISO calls it UTF-8;
throughout Plan 9 it's just called
UTF.)
UTF
defines multibyte sequences to
represent character values from 0 to 65535.
In
UTF,
character values up to 127 decimal, 7F hexadecimal, represent themselves,
so straight
ASCII
files are also valid
UTF.
Also,
UTF
guarantees that bytes with values 0 to 127 (NUL to DEL, inclusive)
will appear only when they represent themselves, so programs that read bytes
looking for plain ASCII characters will continue to work.
Any program that expects a one-to-one correspondence between bytes and
characters will, however, need to be modified.
An example is parsing file names.
File names, like all text, are in
UTF,
so it is incorrect to search for a character in a string by
.CW strchr(filename,
.CW c)
because the character might have a multi-byte encoding.
The correct method is to call
.CW utfrune(filename,
.CW c) ,
defined in
.I rune (2),
which interprets the file name as a sequence of encoded characters
rather than bytes.
In fact, even when you know the character is a single byte
that can represent only itself,
it is safer to use
.CW utfrune
because that assumes nothing about the character set
and its representation.
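.PP
To make the difference concrete, here is a rough sketch in portable C of a search
that interprets the string as encoded characters.
It is not the library's implementation of
.CW utfrune ;
the names
.CW decode ,
.CW findrune ,
and
.CW BADRUNE
are invented for the illustration, and the decoder is simplified: it handles only
sequences of one to three bytes and does not reject overlong encodings.

```c
#include <stddef.h>

/* Placeholder value for malformed input; chosen for this sketch only. */
enum { BADRUNE = 0xFFFD };

/*
 * Decode one character (value 0 to 0xFFFF) from the NUL-terminated
 * byte string s into *r and return the number of bytes consumed.
 * Malformed input yields BADRUNE and consumes a single byte.
 */
int
decode(const unsigned char *s, unsigned *r)
{
	if(s[0] < 0x80){
		*r = s[0];
		return 1;
	}
	if((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80){
		*r = (s[0] & 0x1Fu) << 6 | (s[1] & 0x3Fu);
		return 2;
	}
	if((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80
	&& (s[2] & 0xC0) == 0x80){
		*r = (s[0] & 0x0Fu) << 12 | (s[1] & 0x3Fu) << 6 | (s[2] & 0x3Fu);
		return 3;
	}
	*r = BADRUNE;
	return 1;
}

/* Return a pointer to the first occurrence of character c in s, or NULL. */
const char *
findrune(const char *s, unsigned c)
{
	const unsigned char *p = (const unsigned char *)s;
	unsigned r;
	int n;

	for(; *p != '\0'; p += n){
		n = decode(p, &r);
		if(r == c)
			return (const char *)p;
	}
	return NULL;
}
```

Searching the bytes of
.CW abcÿ
for the character value 255 succeeds with this routine, whereas a byte-wise search
for 255 cannot, because the character is stored as the two bytes C3 BF.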
.PP
The library defines several symbols relevant to the representation of characters.
Any byte with unsigned value less than
.CW Runesync
will not appear in any multi-byte encoding of a character.
.CW Utfrune
compares the character being searched against
.CW Runesync
to see if it is sufficient to call
.CW strchr
or if the byte stream must be interpreted.
Any character with value less than
.CW Runeself
is represented by a single byte with the same value.
Finally, when errors are encountered converting
to runes from a byte stream, the library returns the rune value
.CW Runeerror
and advances a single byte.  This permits programs to find runes
embedded in binary data.
.PP
.CW Bio
includes routines
.CW Bgetrune
and
.CW Bputrune
to transform the external byte stream
UTF
format to and from
internal 16-bit runes.
Also, the
.CW %s
format to
.CW print
accepts
UTF;
.CW %c
prints a character after narrowing it to 8 bits.
The
.CW %S
format prints a null-terminated sequence of runes;
.CW %C
prints a character after narrowing it to 16 bits.
For more information, see the Programmer's Manual, in particular
.I utf (6)
and
.I rune (2),
and the paper,
``Hello world, or
Καλημέρα κόσμε, or\
\f(Jpこんにちは 世界\f1'',
by Rob Pike and
Ken Thompson;
there is not room for the full story here.
.PP
These issues affect the compiler in several ways.
First, the C source is in
UTF.
ANSI says C variables are formed from
ASCII
alphanumerics, but comments and literal strings may contain any characters
encoded in the native encoding, here
UTF.
The declaration
.P1
char *cp = "abcÿ";
.P2
initializes the variable
.CW cp
to point to an array of bytes holding the
UTF
representation of the characters
.CW abcÿ .
The type
.CW Rune
is defined in
.CW <u.h>
to be
.CW ushort ,
which is also the `wide character' type in the compiler.
Therefore the declaration
.P1
Rune *rp = L"abcÿ";
.P2
initializes the variable
.CW rp
to point to an array of unsigned short integers holding the 16-bit
values of the characters
.CW abcÿ .
Note that in both these declarations the characters in the source
that represent
.CW "abcÿ"
are the same; what changes is how those characters are represented
in memory in the program.
The following two lines:
.P1
print("%s\en", "abcÿ");
print("%S\en", L"abcÿ");
.P2
produce the same
UTF
string on their output, the first by copying the bytes, the second
by converting from runes to bytes.
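.PP
The conversion from runes to bytes can be sketched in a few lines of portable C.
This is an illustration of the encoding only, not the code behind
.CW %S ;
the names
.CW encode
and
.CW runestoutf
are invented here, and the sketch assumes 16-bit character values as described above.

```c
#include <stddef.h>

/* Encode one 16-bit character value into buf; return the bytes written (1 to 3). */
int
encode(unsigned char *buf, unsigned short r)
{
	if(r < 0x80){
		buf[0] = (unsigned char)r;
		return 1;
	}
	if(r < 0x800){
		buf[0] = 0xC0 | (r >> 6);
		buf[1] = 0x80 | (r & 0x3F);
		return 2;
	}
	buf[0] = 0xE0 | (r >> 12);
	buf[1] = 0x80 | ((r >> 6) & 0x3F);
	buf[2] = 0x80 | (r & 0x3F);
	return 3;
}

/*
 * Convert a zero-terminated array of character values to a
 * NUL-terminated byte string.  The caller must supply room for
 * three bytes per character plus the terminator.
 * Returns the length in bytes, not counting the terminator.
 */
size_t
runestoutf(unsigned char *out, const unsigned short *rp)
{
	size_t n = 0;

	while(*rp != 0)
		n += encode(out + n, *rp++);
	out[n] = '\0';
	return n;
}
```

Given the four values 97, 98, 99, 255 (the characters
.CW abcÿ ),
.CW runestoutf
produces the five bytes 61 62 63 C3 BF in hexadecimal:
four characters in the rune array, five bytes in the byte array.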
.PP
In C, character constants are integers but narrowed through the
.CW char
type.
The Unicode character
.CW ÿ
has value 255, so if the
.CW char
type is signed,
the constant
.CW 'ÿ'
has value \-1 (which is equal to EOF).
On the other hand,
.CW L'ÿ'
narrows through the wide character type,
.CW ushort ,
and therefore has value 255.
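.PP
The arithmetic can be checked in isolation with explicit casts.
The sketch below is portable C with invented function names; it assumes an 8-bit
.CW "signed char" ,
a 16-bit
.CW "unsigned short" ,
and the two's complement conversion that essentially all current compilers use
(the C standard leaves the result of the first cast implementation-defined).

```c
/* Value of character code c after narrowing through a signed 8-bit char. */
int
narrowchar(int c)
{
	return (signed char)c;
}

/* Value of character code c after narrowing through a 16-bit unsigned type. */
int
narrowwide(int c)
{
	return (unsigned short)c;
}
```

Under those assumptions,
.CW narrowchar(255)
is \-1 while
.CW narrowwide(255)
is 255, which is the distinction between the two character constants above.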
.PP
Finally, although it's not ANSI C, the Plan 9 C compilers
assume any character with value above
.CW Runeself
is an alphanumeric,
so α is a legal, if non-portable, variable name.
.SH
Arguments
.PP
Some macros are defined
in
.CW <libc.h>
for parsing the arguments to
.CW main() .
They are described in
.I ARG (2)
but are fairly self-explanatory.
There are four macros:
.CW ARGBEGIN
and
.CW ARGEND
are used to bracket a hidden
.CW switch
statement within which
.CW ARGC
returns the current option character (rune) being processed and
.CW ARGF
returns the argument to the option, as in the loader option
.CW -o
.CW file .
Here, for example, is the code at the beginning of
.CW main()
in
.CW ramfs.c
(see
.I ramfs (1))
that cracks its arguments:
.P1
void
main(int argc, char *argv[])
{
	char *defmnt;
	int p[2];
	int mfd[2];
	int stdio = 0;

	defmnt = "/tmp";
	ARGBEGIN{
	case 'i':
		defmnt = 0;
		stdio = 1;
		mfd[0] = 0;
		mfd[1] = 1;
		break;
	case 's':
		defmnt = 0;
		break;
	case 'm':
		defmnt = ARGF();
		break;
	default:
		usage();
	}ARGEND
.P2
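.PP
For readers without the macros at hand, the loop they hide can be approximated
by hand in portable C.
The sketch below is not the definition of the macros in
.CW <libc.h> ;
.CW "struct Opts"
and
.CW parseargs
are invented for the illustration, option characters are treated as single bytes
rather than runes, and errors are recorded in a flag instead of calling
.CW usage .

```c
#include <stddef.h>

struct Opts
{
	const char *defmnt;	/* mount point, or NULL */
	int stdio;		/* set by -i */
	int bad;		/* set on an unknown option or missing argument */
};

struct Opts
parseargs(int argc, char *argv[])
{
	struct Opts o = { "/tmp", 0, 0 };
	char *s;
	int i;

	for(i = 1; i < argc && argv[i][0] == '-' && argv[i][1] != '\0'; i++){
		if(argv[i][1] == '-' && argv[i][2] == '\0')
			break;		/* "--" ends the options */
		for(s = argv[i]+1; *s != '\0'; s++){
			switch(*s){
			case 'i':
				o.defmnt = NULL;
				o.stdio = 1;
				break;
			case 's':
				o.defmnt = NULL;
				break;
			case 'm':
				/* the argument is the rest of this word or the next word */
				if(s[1] != '\0')
					o.defmnt = s+1;
				else if(i+1 < argc)
					o.defmnt = argv[++i];
				else
					o.bad = 1;
				goto nextword;
			default:
				o.bad = 1;
				break;
			}
		}
	nextword:;
	}
	return o;
}
```

Called with the arguments
.CW "-s -m /n/x" ,
the sketch leaves
.CW defmnt
pointing at
.CW /n/x ;
the same result follows from
.CW -m/n/x
written as one word.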
.SH
Extensions
.PP
The compiler has several extensions to ANSI C, all of which are used
extensively in the system source.
First,
.I structure
.I displays
permit
.CW struct
expressions to be formed dynamically.
Given these declarations:
.P1
typedef struct Point Point;
typedef struct Rectangle Rectangle;

struct Point
{
	int x, y;
};

struct Rectangle
{
	Point min, max;
};

Point	p, q, add(Point, Point);
Rectangle r;
int	x, y;
.P2
this assignment may appear anywhere an assignment is legal:
.P1
r = (Rectangle){add(p, q), (Point){x, y+3}};
.P2
The syntax is the same as for initializing a structure but with
a leading cast.
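.PP
The same notation was later standardized in C99 under the name
.I compound
.I literal ,
so the idea can be tried with any C99 compiler.
The sketch below is portable C; the body of
.CW add
and the wrapper
.CW mkrect
are invented for the illustration.

```c
typedef struct Point Point;
typedef struct Rectangle Rectangle;

struct Point
{
	int x, y;
};

struct Rectangle
{
	Point min, max;
};

/* Component-wise sum, itself returned as a structure display. */
Point
add(Point a, Point b)
{
	return (Point){a.x + b.x, a.y + b.y};
}

/* Build a rectangle with the assignment shown in the text. */
Rectangle
mkrect(Point p, Point q, int x, int y)
{
	Rectangle r;

	r = (Rectangle){add(p, q), (Point){x, y+3}};
	return r;
}
```

The displays appear in a
.CW return
statement, in an assignment, and nested inside one another;
no temporary variables need to be declared.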
.PP
If an
.I anonymous
.I structure
or
.I union
is declared within another structure or union, the members of the internal
structure or union are addressable without prefix in the outer structure.
This feature eliminates the clumsy naming of nested structures and,
particularly, unions.
For example, after these declarations,
.P1
struct Lock
{
	int	locked;
};

struct Node
{
	int	type;
	union{
		double  dval;
		float   fval;
		long    lval;
	};		/* anonymous union */
	struct Lock;	/* anonymous structure */
} *node;

void	lock(struct Lock*);
.P2
one may refer to
.CW node->type ,
.CW node->dval ,
.CW node->fval ,
.CW node->lval ,
and
.CW node->locked .
Moreover, the address of a
.CW struct
.CW Node
may be used without a cast anywhere that the address of a
.CW struct
.CW Lock
is used, such as in argument lists.
The compiler automatically promotes the type and adjusts the address.
Thus one may invoke
.CW lock(node) .
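.PP
For comparison, here is roughly how the same structure looks in standard C.
C11 accepts the anonymous union, but it has no counterpart to the tagged
.CW "struct Lock;"
member or to the automatic pointer promotion, so the sketch names the member
and takes its address explicitly.
The member name
.CW lk
and the function
.CW locknode
are invented for the illustration.

```c
struct Lock
{
	int	locked;
};

struct Node
{
	int	type;
	union{
		double	dval;
		float	fval;
		long	lval;
	};			/* anonymous union: accepted by C11 */
	struct Lock lk;		/* standard C requires a member name here */
};

void
lock(struct Lock *l)
{
	l->locked = 1;
}

/* The explicit &n->lk stands in for the compiler's automatic promotion. */
void
locknode(struct Node *n)
{
	lock(&n->lk);
}
```

The union members are still reached directly, as in
.CW n->lval ,
but the lock must be written
.CW n->lk.locked ,
and every call that wants a
.CW "struct Lock*"
must spell out the member.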
.PP
Anonymous structures and unions may be accessed by type name
if (and only if) they are declared using a
.CW typedef
name.
For example, using the above declaration for
.CW Point ,
one may declare
.P1
struct
{
	int	type;
	Point;
} p;
.P2
and refer to
.CW p.Point .
.PP
In the initialization of arrays, a number in square brackets before an
element sets the index for the initialization.  For example, to initialize
some elements in
a table of function pointers indexed by
ASCII
character,
.P1
void	percent(void), slash(void);

void	(*func[128])(void) =
{
	['%']	percent,
	['/']	slash,
};
.P2
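.PP
For comparison, ISO C99 designated initializers express the same kind of table,
with an
.CW =
after the bracketed index.
The sketch below is standard C rather than the Plan 9 form shown above; the
bodies of
.CW percent
and
.CW slash
and the variable
.CW lastop
are placeholders invented for the example.
.P1
#include <assert.h>

static int lastop;	/* hypothetical: records which handler ran */

void	percent(void) { lastop = '%'; }
void	slash(void) { lastop = '/'; }

void	(*func[128])(void) =
{
	['%'] = percent,
	['/'] = slash,
};

int
main(void)
{
	assert(func['a'] == 0);	/* unlisted elements are null */
	func['%']();
	assert(lastop == '%');
	func['/']();
	assert(lastop == '/');
	return 0;
}
.P2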
.PP
Finally, the declaration
.P1
extern register reg;
.P2
.I this "" (
appearance of the register keyword is not ignored)
allocates a global register to hold the variable
.CW reg .
External registers must be used carefully: they need to be declared in
.I all
source files and libraries in the program to guarantee the register
is not allocated temporarily for other purposes.
Especially on machines with few registers, such as the i386,
it is easy to link accidentally with code that has already usurped
the global registers and there is no diagnostic when this happens.
Used wisely, though, external registers are powerful.
The Plan 9 operating system uses them to access per-process and
per-machine data structures on a multiprocessor.  The storage class they provide
is hard to create in other ways.
.SH
The compile-time environment
.PP
The code generated by the compilers is `optimized' by default:
variables are placed in registers and peephole optimizations are
performed.
The compiler flag
.CW -N
disables these optimizations.
Registerization is done locally rather than throughout a function:
whether a variable occupies a register or
the memory location identified in the symbol
table depends on the activity of the variable and may change
throughout the life of the variable.
The
.CW -N
flag is rarely needed;
its main use is to simplify debugging.
There is no information in the symbol table to identify the
registerization of a variable, so
.CW -N
guarantees the variable is always where the symbol table says it is.
.PP
Another flag,
.CW -w ,
turns
.I on
warnings about portability and problems detected in flow analysis.
Most code in Plan 9 is compiled with warnings enabled;
these warnings plus the type checking offered by function prototypes
provide most of the support of the Unix tool
.CW lint
more accurately and with less chatter.
Two of the warnings,
`used and not set' and `set and not used', are almost always accurate but
may be triggered spuriously by code with invisible control flow,
such as in routines that call
.CW longjmp .
The compiler statements
.P1
SET(v1);
USED(v2);
.P2
decorate the flow graph to silence the compiler.
Either statement accepts a comma-separated list of variables.
Use them carefully: they may silence real errors.
For the common case of unused parameters to a function,
leaving the name off the declaration silences the warnings.
That is, listing the type of a parameter but giving it no
associated variable name does the trick.
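.PP
Code that must also build with other compilers sometimes supplies a stand-in for
.CW USED .
The macro definition below is an assumption for illustration, not the Plan 9
compiler's built-in behavior: it only gives a foreign compiler something
harmless to compile.
The function
.CW total
and its parameters are likewise hypothetical.
.P1
#include <assert.h>

/* hypothetical stand-in: evaluates x and discards the result */
#define	USED(x)	((void)(x))

/* flags is deliberately unused; USED records that fact */
int
total(int *v, int n, int flags)
{
	int i, sum;

	USED(flags);
	sum = 0;
	for(i = 0; i < n; i++)
		sum += v[i];
	return sum;
}

int
main(void)
{
	int v[] = {1, 2, 3};

	assert(total(v, 3, 0) == 6);
	return 0;
}
.P2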
.SH
Debugging
.PP
There are two debuggers available on Plan 9.
The first, and older, is
.CW db ,
a revision of Unix
.CW adb .
The other,
.CW acid ,
is a source-level debugger whose commands are statements in
a true programming language.
.CW Acid
is the preferred debugger, but since it
borrows some elements of
.CW db ,
notably the formats for displaying values, it is worth knowing a little bit about
.CW db .
.PP
Both debuggers support multiple architectures in a single program; that is,
the programs are
.CW db
and
.CW acid ,
not for example
.CW vdb
and
.CW vacid .
They also support cross-architecture debugging comfortably:
one may debug a 68020 binary on a MIPS.
.PP
Imagine a program has crashed mysteriously:
.P1
% X11/X
Fatal server bug!
failed to create default stipple
X 106: suicide: sys: trap: fault read addr=0x0 pc=0x00105fb8
%
.P2
When a process dies on Plan 9 it hangs in the `broken' state
for debugging.
Attach a debugger to the process by naming its process id:
.P1
% acid 106
/proc/106/text:mips plan 9 executable

/sys/lib/acid/port
/sys/lib/acid/mips
acid:
.P2
The
.CW acid
function
.CW stk()
reports the stack traceback:
.P1
acid: stk()
At pc:0x105fb8:abort+0x24 /sys/src/ape/lib/ap/stdio/abort.c:6
abort() /sys/src/ape/lib/ap/stdio/abort.c:4
	called from FatalError+#4e
		/sys/src/X/mit/server/dix/misc.c:421
FatalError(s9=#e02, s8=#4901d200, s7=#2, s6=#72701, s5=#1,
    s4=#7270d, s3=#6, s2=#12, s1=#ff37f1c, s0=#6, f=#7270f)
    /sys/src/X/mit/server/dix/misc.c:416
	called from gnotscreeninit+#4ce
		/sys/src/X/mit/server/ddx/gnot/gnot.c:792
gnotscreeninit(snum=#0, sc=#80db0)
    /sys/src/X/mit/server/ddx/gnot/gnot.c:766
	called from AddScreen+#16e
		/n/bootes/sys/src/X/mit/server/dix/main.c:610
AddScreen(pfnInit=0x0000129c,argc=0x00000001,argv=0x7fffffe4)
    /sys/src/X/mit/server/dix/main.c:530
	called from InitOutput+0x80
		/sys/src/X/mit/server/ddx/brazil/brddx.c:522
InitOutput(argc=0x00000001,argv=0x7fffffe4)
    /sys/src/X/mit/server/ddx/brazil/brddx.c:511
	called from main+0x294
		/sys/src/X/mit/server/dix/main.c:225
main(argc=0x00000001,argv=0x7fffffe4)
    /sys/src/X/mit/server/dix/main.c:136
	called from _main+0x24
		/sys/src/ape/lib/ap/mips/main9.s:8
.P2
The function
.CW lstk()
is similar but
also reports the values of local variables.
Note that the traceback includes full file names; this is a boon to debugging,
although it makes the output much noisier.
.PP
To use
.CW acid
well you will need to learn its input language; see the
``Acid Manual'',
by Phil Winterbottom,
for details.  For simple debugging, however, the information in the manual page is
sufficient.  In particular, it describes the most useful functions
for examining a process.
.PP
The compiler does not place
information describing the types of variables in the executable,
but a compile-time flag provides crude support for symbolic debugging.
The
.CW -a
flag to the compiler suppresses code generation
and instead emits source text in the
.CW acid
language to format and display data structure types defined in the program.
The easiest way to use this feature is to put a rule in the
.CW mkfile :
.P1
syms:   main.$O
        $CC -a main.c > syms
.P2
Then from within
.CW acid ,
.P1
acid: include("sourcedirectory/syms")
.P2
to read in the relevant definitions.
(For multi-file source, you need to be a little fancier;
see
.I 2c (1)).
This text includes, for each defined compound
type, a function with that name that may be called with the address of a structure
of that type to display its contents.
For example, if
.CW rect
is a global variable of type
.CW Rectangle ,
one may execute
.P1
Rectangle(*rect)
.P2
to display it.
The
.CW *
(indirection) operator is necessary because
of the way
.CW acid
works: each global symbol in the program is defined as a variable by
.CW acid ,
with value equal to the
.I address
of the symbol.
.PP
Another common technique is to write by hand special
.CW acid
code to define functions to aid debugging, initialize the debugger, and so on.
Conventionally, this is placed in a file called
.CW acid
in the source directory; it has a line
.P1
include("sourcedirectory/syms");
.P2
to load the compiler-produced symbols.  One may edit the compiler output directly but
it is wiser to keep the hand-generated
.CW acid
separate from the machine-generated.
.PP
There is much more to say here.  See the
.CW acid
manual page, the reference manual, or the paper
``Acid: A Debugger Built From A Language'',
also by Phil Winterbottom.
.SH
Alef
.PP
With minor substitutions, most of this document applies to Alef.
The compilers are
.CW val ,
.CW kal ,
and
.CW 8al ;
they work with the usual assemblers and loaders.
There is no Alef compiler for the 68020.
The directory of machine-independent include files is
.CW /sys/include/alef ;
there are no machine-dependent Alef include files.
The libraries are in
.CW /$objtype/lib/alef .
Alef uses
.CW /bin/cpp ,
which is a full ANSI C preprocessor.
Our style of use, however, is the same as in Plan 9 C.
.PP
The Alef compilers don't have the
.CW USED(v)
and
.CW SET(v)
operators; instead say something like
.P1
if(v);
.P2
for
.CW USED
and just set the variable to something benign to silence `used and not set' warnings.
The compilers also permit leaving unused parameters unnamed.
.PP
The compilers support UTF,
although variable names must be plain alphanumeric.
UTF
strings have syntax
.CW $"string"
rather than
.CW L"string" .
.PP
Finally, when debugging, some helpful
.CW acid
code may be loaded by supplying the flag
.CW -lalef
when starting
.CW acid .
This code defines
functions to help analyze the state of the run-time system.
For example,
.CW pchan(c)
reports the state of a channel.
Because Alef programs are multi-threaded, they have multiple stacks.
To print the stack trace for a
.CW proc ,
do
.P1
setproc(pid);
stk();
.P2
where
.CW pid
is the Plan 9 process id of the
.CW proc .
Printing the stack trace for a task is clumsier.
In the program, get the `task id'
by calling the run-time function
.CW ALEF_tid
in each task and recording it in a global:
.P1
taskid = ALEF_tid();
.P2
When the program is debugged, the task id
may be passed to an
.CW acid
function to print the stack:
.P1
labstk(*taskid);
.P2
This is of course best done in the private, program-specific
.CW acid
code.