.HTML "How to Use the Plan 9 C Compiler
.TL
How to Use the Plan 9 C Compiler*
.AU
Rob Pike
rob@plan9.bell-labs.com
.SH
Introduction
.FS
* This paper has been revised to reflect the move to 21-bit Unicode.
.FE
.PP
The C compiler on Plan 9 is a wholly new program; in fact
it was the first piece of software written for what would
eventually become Plan 9 from Bell Labs.
Programmers familiar with existing C compilers will find
a number of differences in both the language the Plan 9 compiler
accepts and in how the compiler is used.
.PP
The compiler is really a set of compilers, one for each
architecture \(em MIPS, SPARC, Intel 386, Power PC, ARM, etc. \(em
that accept a dialect of ANSI C and efficiently produce
fairly good code for the target machine.
There is a packaging of the compiler that accepts strict ANSI C for
a POSIX environment, but this document focuses on the
native Plan 9 environment, that in which all the system source and
almost all the utilities are written.
.SH
Source
.PP
The language accepted by the compilers is the core 1989 ANSI C language
with some modest extensions,
a greatly simplified preprocessor,
a smaller library that includes system calls and related facilities,
and a completely different structure for include files.
.PP
Official ANSI C accepts the old (K&R) style of declarations for
functions; the Plan 9 compilers
are more demanding.
Without an explicit command-line flag
.CW -B ) (
whose use is discouraged, the compilers insist
on new-style function declarations, that is, prototypes for
function arguments.
The function declarations in the libraries' include files are
all in the new style so the interfaces are checked at compile time.
For C programmers who have not yet switched to function prototypes
the clumsy syntax may seem repellent but the payoff in stronger typing
is substantial.
Those who wish to import existing software to Plan 9 are urged
to use the opportunity to update their code.
.PP
The compilers include an integrated preprocessor that accepts the familiar
.CW #include ,
.CW #define
for macros both with and without arguments,
.CW #undef ,
.CW #line ,
.CW #ifdef ,
.CW #ifndef ,
and
.CW #endif .
It
supports neither
.CW #if
nor
.CW ## ,
although it does
honor a few
.CW #pragmas .
The
.CW #if
directive was omitted because it greatly complicates the
preprocessor, is never necessary, and is usually abused.
Conditional compilation in general makes code hard to understand;
the Plan 9 source uses it sparingly.
Also, because the compilers remove dead code, regular
.CW if
statements with constant conditions are more readable equivalents to many
.CW #ifs .
To compile imported code ineluctably fouled by
.CW #if
there is a separate command,
.CW /bin/cpp ,
that implements the complete ANSI C preprocessor specification.
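.PP
The constant-condition
.CW if
mentioned above can be sketched as follows.
The names
.CW Debug
and
.CW trace
are hypothetical, and the fragment is written in portable ANSI C with
.CW <stdio.h>
rather than against the Plan 9 library, so that it can be tried with any C compiler.

```c
#include <stdio.h>

/* Hypothetical compile-time switch: an enum constant, not a macro. */
enum { Debug = 0 };

/*
 * Print a trace message when Debug is non-zero.
 * Returns 1 if a message was printed, 0 otherwise.
 */
int
trace(char *msg)
{
	if(Debug){
		/* Parsed and type-checked even when Debug is 0. */
		fprintf(stderr, "trace: %s\n", msg);
		return 1;
	}
	return 0;
}
```

With
.CW Debug
set to 0 the body of the
.CW if
is still seen and checked by the compiler, which an
.CW #ifdef
would prevent; a compiler that removes dead code need not emit anything for it.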
.PP
Include files fall into two groups: machine-dependent and machine-independent.
The machine-independent files occupy the directory
.CW /sys/include ;
the others are placed in a directory appropriate to the machine, such as
.CW /mips/include .
The compiler searches for include files
first in the machine-dependent directory and then
in the machine-independent directory.
At the time of writing there are thirty-one machine-independent include
files and two (per machine) machine-dependent ones:
.CW <ureg.h>
and
.CW <u.h> .
The first describes the layout of registers on the system stack,
for use by the debugger.
The second defines some
architecture-dependent types such as
.CW jmp_buf
for
.CW setjmp
and the
.CW va_arg
and
.CW va_list
macros for handling arguments to variadic functions,
as well as a set of
.CW typedef
abbreviations for
.CW unsigned
.CW short
and so on.
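.PP
As an illustration of what the variadic macros are for, here is a minimal sketch of a function that walks its argument list.
It is written in standard C, where the macros come from
.CW <stdarg.h> ;
on Plan 9 they are supplied by
.CW <u.h>
as described above.
The function name
.CW sum
is hypothetical.

```c
#include <stdarg.h>	/* standard C home of va_list, va_start, va_arg, va_end */

/* Add up the n int arguments that follow n. */
int
sum(int n, ...)
{
	va_list arg;
	int i, total;

	total = 0;
	va_start(arg, n);		/* begin after the last named parameter */
	for(i = 0; i < n; i++)
		total += va_arg(arg, int);	/* fetch the next argument as an int */
	va_end(arg);
	return total;
}
```

The caller must say how many arguments follow, here through
.CW n ,
because the macros themselves carry no count or type information.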
.PP
Here is an excerpt from
.CW /386/include/u.h :
.P1
#define nil		((void*)0)
typedef	unsigned short	ushort;
typedef	unsigned char	uchar;
typedef unsigned long	ulong;
typedef unsigned int	uint;
typedef   signed char	schar;
typedef	long long       vlong;

typedef long	jmp_buf[2];
#define	JMPBUFSP	0
#define	JMPBUFPC	1
#define	JMPBUFDPC	0
.P2
Plan 9 programs use
.CW nil
for the name of the zero-valued pointer.
The type
.CW vlong
is the largest integer type available; on most architectures it
is a 64-bit value.
A couple of other types in
.CW <u.h>
are
.CW u32int ,
which is guaranteed to have exactly 32 bits (a possibility on all the supported architectures) and
.CW mpdigit ,
which is used by the multiprecision math package
.CW <mp.h> .
The
.CW #define
constants permit an architecture-independent (but compiler-dependent)
implementation of stack-switching using
.CW setjmp
and
.CW longjmp .
.PP
Every Plan 9 C program begins
.P1
#include <u.h>
.P2
because all the other installed header files use the
.CW typedefs
declared in
.CW <u.h> .
.PP
In strict ANSI C, include files are grouped to collect related functions
in a single file: one for string functions, one for memory functions,
one for I/O, and none for system calls.
Each include file is protected by an
.CW #ifdef
to guarantee its contents are seen by the compiler only once.
Plan 9 takes a different approach.  Other than a few include
files that define external formats such as archives, the files in
.CW /sys/include
correspond to
.I libraries.
If a program is using a library, it includes the corresponding header.
The default C library comprises string functions, memory functions, and
so on, largely as in ANSI C, some formatted I/O routines,
plus all the system calls and related functions.
To use these functions, one must
.CW #include
the file
.CW <libc.h> ,
which in turn must follow
.CW <u.h> ,
to define their prototypes for the compiler.
Here is the complete source to the traditional first C program:
.P1
#include <u.h>
#include <libc.h>

void
main(void)
{
	print("hello world\en");
	exits(0);
}
.P2
The
.CW print
routine and its relatives
.CW fprint
and
.CW sprint
resemble the similarly-named functions in Standard I/O but are not
attached to a specific I/O library.
In Plan 9
.CW main
is not integer-valued; it should call
.CW exits ,
which takes a string argument (or null; here ANSI C converts the 0 to a
.CW char* ).
All these functions are, of course, documented in the Programmer's Manual.
.PP
To use
.CW printf ,
.CW <stdio.h>
must be included to define the function prototype for
.CW printf :
.P1
#include <u.h>
#include <libc.h>
#include <stdio.h>

void
main(int argc, char *argv[])
{
	printf("%s: hello world; argc = %d\en", argv[0], argc);
	exits(0);
}
.P2
In practice, Standard I/O is not used much in Plan 9.  I/O libraries are
discussed in a later section of this document.
.PP
There are libraries for handling regular expressions, raster graphics,
windows, and so on, and each has an associated include file.
The manual for each library states which include files are needed.
The files are not protected against multiple inclusion and themselves
contain no nested
.CW #includes .
Instead the
programmer is expected to sort out the requirements
and to
.CW #include
the necessary files once at the top of each source file.  In practice this is
trivial: this way of handling include files is so straightforward
that it is rare for a source file to contain more than half a dozen
.CW #includes .
.PP
The compilers do their own register allocation so the
.CW register
keyword is ignored.
For different reasons,
.CW volatile
and
.CW const
are also ignored.
.PP
To make it easier to share code with other systems, Plan 9 has a version
of the compiler,
.CW pcc ,
that provides the standard ANSI C preprocessor, headers, and libraries
with POSIX extensions.
.CW Pcc
is recommended only
when broad external portability is mandated.  It compiles slower,
produces slower code (it takes extra work to simulate POSIX on Plan 9),
eliminates those parts of the Plan 9 interface
not related to POSIX, and illustrates the clumsiness of an environment
designed by committee.
.CW Pcc
is described in more detail in
.I
APE\(emThe ANSI/POSIX Environment,
.R
by Howard Trickey.
.SH
Process
.PP
Each CPU architecture supported by Plan 9 is identified by a single,
arbitrary, alphanumeric character:
.CW k
for SPARC,
.CW q
for 32-bit Power PC,
.CW v
for MIPS,
.CW 0
for little-endian MIPS,
.CW 5
for ARM v5 and later 32-bit architectures,
.CW 6
for AMD64,
.CW 8
for Intel 386, and
.CW 9
for 64-bit Power PC.
The character labels the support tools and files for that architecture.
For instance, for the 386 the compiler is
.CW 8c ,
the assembler is
.CW 8a ,
the link editor/loader is
.CW 8l ,
the object files are suffixed
.CW \&.8 ,
and the default name for an executable file is
.CW 8.out .
Before we can use the compiler we therefore need to know which
machine we are compiling for.
The next section explains how this decision is made; for the moment
assume we are building 386 binaries and make the mental substitution for
.CW 8
appropriate to the machine you are actually using.
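.PP
The naming convention is regular enough to be written down as a few format strings.
The sketch below is only an illustration of the convention, not part of any Plan 9 tool; the
.CW Tools
structure and the
.CW toolnames
function are hypothetical, and the code uses the standard C
.CW snprintf
so that it can be compiled anywhere.

```c
#include <stdio.h>

/* Hypothetical record of the names derived from one architecture character. */
typedef struct Tools Tools;
struct Tools
{
	char	cc[3];		/* compiler, e.g. "8c" */
	char	as[3];		/* assembler, e.g. "8a" */
	char	ld[3];		/* loader, e.g. "8l" */
	char	suffix[3];	/* object file suffix, e.g. ".8" */
	char	out[6];		/* default executable name, e.g. "8.out" */
};

/* Fill t with the names implied by architecture character c. */
void
toolnames(int c, Tools *t)
{
	snprintf(t->cc, sizeof t->cc, "%cc", c);
	snprintf(t->as, sizeof t->as, "%ca", c);
	snprintf(t->ld, sizeof t->ld, "%cl", c);
	snprintf(t->suffix, sizeof t->suffix, ".%c", c);
	snprintf(t->out, sizeof t->out, "%c.out", c);
}
```

Calling
.CW toolnames
with the character
.CW 8
yields the 386 names listed above; calling it with
.CW v
yields the corresponding MIPS names.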
.PP
To convert source to an executable binary is a two-step process.
First run the compiler,
.CW 8c ,
on the source, say
.CW file.c ,
to generate an object file
.CW file.8 .
Then run the loader,
.CW 8l ,
to generate an executable
.CW 8.out
that may be run (on a 386 machine):
.P1
8c file.c
8l file.8
8.out
.P2
The loader automatically links with whatever libraries the program
needs, usually including the standard C library as defined by
.CW <libc.h> .
Of course the compiler and loader have lots of options, both familiar and new;
see the manual for details.
The compiler does not generate an executable automatically;
the output of the compiler must be given to the loader.
Since most compilation is done under the control of
.CW mk
(see below), this is rarely an inconvenience.
.PP
The distribution of work between the compiler and loader is unusual.
The compiler integrates preprocessing, parsing, register allocation,
code generation and some assembly.
Combining these tasks in a single program is part of the reason for
the compiler's efficiency.
The loader does instruction selection, branch folding,
instruction scheduling,
and writes the final executable.
There is no separate C preprocessor and no assembler in the usual pipeline.
Instead the intermediate object file
(here a
.CW \&.8
file) is a type of binary assembly language.
The instructions in the intermediate format are not exactly those in
the machine.  For example, on the 68020 the object file may specify
a MOVE instruction but the loader will decide just which variant of
the MOVE instruction \(em MOVE immediate, MOVE quick, MOVE address,
etc. \(em is most efficient.
.PP
The assembler,
.CW 8a ,
is just a translator between the textual and binary
representations of the object file format.
It is not an assembler in the traditional sense.  It has limited
macro capabilities (the same as the integral C preprocessor in the compiler),
clumsy syntax, and minimal error checking.  For instance, the assembler
will accept an instruction (such as memory-to-memory MOVE on the MIPS) that the
machine does not actually support; only when the output of the assembler
is passed to the loader will the error be discovered.
The assembler is intended only for writing things that need access to instructions
invisible from C,
such as the machine-dependent
part of an operating system;
very little code in Plan 9 is in assembly language.
.PP
The compilers take an option
.CW -S
that causes them to print on their standard output the generated code
in a format acceptable as input to the assemblers.
This is of course merely a formatting of the
data in the object file; therefore the assembler is just
an
ASCII-to-binary converter for this format.
Other than the specific instructions, the input to the assemblers
is largely architecture-independent; see
``A Manual for the Plan 9 Assembler'',
by Rob Pike,
for more information.
.PP
The loader is an integral part of the compilation process.
Each library header file contains a
.CW #pragma
that tells the loader the name of the associated archive; it is
not necessary to tell the loader which libraries a program uses.
The C run-time startup is found, by default, in the C library.
The loader starts with an undefined
symbol,
.CW _main ,
that is resolved by pulling in the run-time startup code from the library.
(The loader undefines
.CW _mainp
when profiling is enabled, to force loading of the profiling start-up
instead.)
.PP
Unlike its counterpart on other systems, the Plan 9 loader rearranges
data to optimize access.  This means the order of variables in the
loaded program is unrelated to their order in the source.
Most programs don't care, but some assume that, for example, the
variables declared by
.P1
int a;
int b;
.P2
will appear at adjacent addresses in memory.  On Plan 9, they won't.
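.PP
Code that really needs two values side by side can say so in the language instead of relying on the loader.
The sketch below, with the hypothetical names
.CW pair
and
.CW sumpair ,
puts both values in one array; C guarantees that the elements of an array are contiguous, whatever the loader does with separate variables.

```c
/* Two values that must be contiguous live in a single array. */
int pair[2];

/* Add the two elements by walking a pointer across the array. */
int
sumpair(void)
{
	int *p;

	p = &pair[0];
	/* p[1] is well defined because both elements belong to one object. */
	return p[0] + p[1];
}
```

A structure with two members would serve as well when the values have different types, although padding may then separate the members.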
.SH
Heterogeneity
.PP
When the system starts or a user logs in the environment is configured
so the appropriate binaries are available in
.CW /bin .
The configuration process is controlled by an environment variable,
.CW $cputype ,
with a value such as
.CW mips ,
.CW 386 ,
.CW arm ,
or
.CW sparc .
For each architecture there is a directory in the root,
with the appropriate name,
that holds the binary and library files for that architecture.
Thus
.CW /mips/lib
contains the object code libraries for MIPS programs,
.CW /mips/include
holds MIPS-specific include files, and
.CW /mips/bin
has the MIPS binaries.
These binaries are attached to
.CW /bin
at boot time by binding
.CW /$cputype/bin
to
.CW /bin ,
so
.CW /bin
always contains the correct files.
.PP
The MIPS compiler,
.CW vc ,
by definition
produces object files for the MIPS architecture,
regardless of the architecture of the machine on which the compiler is running.
There is a version of
.CW vc
compiled for each architecture:
.CW /mips/bin/vc ,
.CW /arm/bin/vc ,
.CW /sparc/bin/vc ,
and so on,
each capable of producing MIPS object files regardless of the native
instruction set.
If one is running on a SPARC,
.CW /sparc/bin/vc
will compile programs for the MIPS;
if one is running on machine
.CW $cputype ,
.CW /$cputype/bin/vc
will compile programs for the MIPS.
.PP
Because of the bindings that assemble
.CW /bin ,
the shell always looks for a command, say
.CW date ,
in
.CW /bin
and automatically finds the file
.CW /$cputype/bin/date .
Therefore the MIPS compiler is known as just
.CW vc ;
the shell will invoke
.CW /bin/vc
and that is guaranteed to be the version of the MIPS compiler
appropriate for the machine running the command.
Regardless of the architecture of the compiling machine,
.CW /bin/vc
is
.I always
the MIPS compiler.
.PP
Also, the output of
.CW vc
and
.CW vl
is completely independent of the machine type on which they are executed:
.CW \&.v
files compiled (with
.CW vc )
on a SPARC may be linked (with
.CW vl )
on a 386.
(The resulting
.CW v.out
will run, of course, only on a MIPS.)
Similarly, the MIPS libraries in
.CW /mips/lib
are suitable for loading with
.CW vl
on any machine; there is only one set of MIPS libraries, not one
set for each architecture that supports the MIPS compiler.
5163e12c5d1SDavid du Colombier.SH
5173e12c5d1SDavid du ColombierHeterogeneity and \f(CWmk\fP
5183e12c5d1SDavid du Colombier.PP
5193e12c5d1SDavid du ColombierMost software on Plan 9 is compiled under the control of
5203e12c5d1SDavid du Colombier.CW mk ,
5213e12c5d1SDavid du Colombiera descendant of
5223e12c5d1SDavid du Colombier.CW make
5233e12c5d1SDavid du Colombierthat is documented in the Programmer's Manual.
5243e12c5d1SDavid du ColombierA convention used throughout the
5253e12c5d1SDavid du Colombier.CW mkfiles
5263e12c5d1SDavid du Colombiermakes it easy to compile the source into binary suitable for any architecture.
5273e12c5d1SDavid du Colombier.PP
5283e12c5d1SDavid du ColombierThe variable
5293e12c5d1SDavid du Colombier.CW $cputype
5303e12c5d1SDavid du Colombieris advisory: it reports the architecture of the current environment, and should
5313e12c5d1SDavid du Colombiernot be modified.  A second variable,
5323e12c5d1SDavid du Colombier.CW $objtype ,
5333e12c5d1SDavid du Colombieris used to set which architecture is being
5343e12c5d1SDavid du Colombier.I compiled
5353e12c5d1SDavid du Colombierfor.
5363e12c5d1SDavid du ColombierThe value of
5373e12c5d1SDavid du Colombier.CW $objtype
5383e12c5d1SDavid du Colombiercan be used by a
5393e12c5d1SDavid du Colombier.CW mkfile
5403e12c5d1SDavid du Colombierto configure the compilation environment.
.PP
In each machine's root directory there is a short
.CW mkfile
that defines a set of macros for the compiler, loader, etc.
Here is
.CW /mips/mkfile :
.P1
</sys/src/mkfile.proto

CC=vc
LD=vl
O=v
AS=va
.P2
The line
.P1
</sys/src/mkfile.proto
.P2
causes
.CW mk
to include the file
.CW /sys/src/mkfile.proto ,
which contains general definitions:
.P1
#
# common mkfile parameters shared by all architectures
#

OS=5689qv
CPUS=arm amd64 386 power mips
CFLAGS=-FTVw
LEX=lex
YACC=yacc
MK=/bin/mk
.P2
.CW CC
is obviously the compiler,
.CW AS
the assembler, and
.CW LD
the loader.
.CW O
is the suffix for the object files and
.CW CPUS
and
.CW OS
are used in special rules described below.
.PP
Here is a
.CW mkfile
to build the installed source for
.CW sam :
.P1
</$objtype/mkfile
OBJ=sam.$O address.$O buffer.$O cmd.$O disc.$O error.$O \e
	file.$O io.$O list.$O mesg.$O moveto.$O multi.$O \e
	plan9.$O rasp.$O regexp.$O string.$O sys.$O xec.$O

$O.out:	$OBJ
	$LD $OBJ

install:	$O.out
	cp $O.out /$objtype/bin/sam

installall:
	for(objtype in $CPUS) mk install

%.$O:	%.c
	$CC $CFLAGS $stem.c

$OBJ:	sam.h errors.h mesg.h
address.$O cmd.$O parse.$O xec.$O unix.$O:	parse.h

clean:V:
	rm -f [$OS].out *.[$OS] y.tab.?
.P2
(The actual
.CW mkfile
imports most of its rules from other secondary files, but
this example works and is not misleading.)
The first line causes
.CW mk
to include the contents of
.CW /$objtype/mkfile
in the current
.CW mkfile .
If
.CW $objtype
is
.CW mips ,
this inserts the MIPS macro definitions into the
.CW mkfile .
In this case the rule for
.CW $O.out
uses the MIPS tools to build
.CW v.out .
The
.CW %.$O
rule in the file uses
.CW mk 's
pattern matching facilities to convert the source files to the object
files through the compiler.
(The text of the rules is passed directly to the shell,
.CW rc ,
without further translation.
See the
.CW mk
manual if any of this is unfamiliar.)
Because the default rule builds
.CW $O.out
rather than
.CW sam ,
it is possible to maintain binaries for multiple machines in the
same source directory without conflict.
This is also, of course, why the output files from the various
compilers and loaders
have distinct names.
.PP
The rest of the
.CW mkfile
should be easy to follow; notice how the rules for
.CW clean
and
.CW installall
(that is, install versions for all architectures) use the macros
.CW OS
and
.CW CPUS ,
which
.CW /$objtype/mkfile
obtains by including
.CW /sys/src/mkfile.proto .
In Plan 9,
.CW mkfiles
for commands conventionally contain rules to
.CW install
(compile and install the version for
.CW $objtype ),
.CW installall
(compile and install for all
.CW $objtypes ),
and
.CW clean
(remove all object files, binaries, etc.).
.PP
The
.CW mkfile
is easy to use.  To build a MIPS binary,
.CW v.out :
.P1
% objtype=mips
% mk
.P2
To build and install a MIPS binary:
.P1
% objtype=mips
% mk install
.P2
To build and install all versions:
.P1
% mk installall
.P2
These conventions make cross-compilation as easy to manage
as traditional native compilation.
Plan 9 programs compile and run without change on machines from
large multiprocessors to laptops.  For more information about this process, see
``Plan 9 Mkfiles'',
by Bob Flandrena.
.SH
Portability
.PP
Within Plan 9, it is painless to write portable programs, programs whose
source is independent of the machine on which they execute.
The operating system is fixed and the compiler, headers and libraries
are constant so most of the stumbling blocks to portability are removed.
Attention to a few details can avoid those that remain.
.PP
Plan 9 is a heterogeneous environment, so programs must
.I expect
that external files will be written by programs on machines of different
architectures.
The compilers, for instance, must handle without confusion
object files written by other machines.
The traditional approach to this problem is to pepper the source with
.CW #ifdefs
to turn byte-swapping on and off.
Plan 9 takes a different approach: of the handful of machine-dependent
.CW #ifdefs
in all the source, almost all are deep in the libraries.
Instead programs read and write files in a defined format,
either (for low volume applications) as formatted text, or
(for high volume applications) as binary in a known byte order.
If the external data were written with the most significant
byte first, the following code reads a 4-byte integer correctly
regardless of the architecture of the executing machine (assuming
an unsigned long holds 4 bytes):
.P1
ulong
getlong(void)
{
	ulong l;

	l = (getchar()&0xFF)<<24;
	l |= (getchar()&0xFF)<<16;
	l |= (getchar()&0xFF)<<8;
	l |= (getchar()&0xFF)<<0;
	return l;
}
.P2
Note that this code does not `swap' the bytes; instead it just reads
them in the correct order.
Variations of this code will handle any binary format
and also avoid problems
involving how structures are padded, how words are aligned,
and other impediments to portability.
Be aware, though, that extra care is needed to handle floating point data.
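.PP
The same idea works for writing, and for data that is already sitting in a buffer.
What follows is a minimal sketch in portable ISO C, not code taken from the Plan 9 source;
the names
.CW put_be32
and
.CW get_be32
are hypothetical, and
.CW uint32_t
stands in for a 4-byte
.CW ulong .
Each byte is widened to the unsigned type before it is shifted, so the sketch
does not depend on the width of
.CW int .

```c
#include <stdint.h>

/* Store v in p[0..3], most significant byte first. */
void
put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/* Rebuild the value from p[0..3]; the host byte order never enters into it. */
uint32_t
get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
		| ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

A value stored with
.CW put_be32
on one machine is recovered unchanged by
.CW get_be32
on any other, whatever either machine's native byte order.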
.PP
Efficiency hounds will argue that this method is unnecessarily slow and clumsy
when the executing machine has the same byte order (and padding and alignment)
as the data.
The CPU cost of I/O processing
is rarely the bottleneck for an application, however,
and the gain in simplicity of porting and maintaining the code greatly outweighs
the minor speed loss from handling data in this general way.
This method is how the Plan 9 compilers, the window system, and even the file
servers transmit data between programs.
.PP
To port programs beyond Plan 9, where the system interface is more variable,
it is probably necessary to use
.CW pcc
and hope that the target machine supports ANSI C and POSIX.
.SH
I/O
.PP
The default C library, defined by the include file
.CW <libc.h> ,
contains no buffered I/O package.
It does have several entry points for printing formatted text:
.CW print
outputs text to the standard output,
.CW fprint
outputs text to a specified integer file descriptor, and
.CW sprint
places text in a character array.
To access library routines for buffered I/O, a program must
explicitly include the header file associated with an appropriate library.
.PP
The recommended I/O library, used by most Plan 9 utilities, is
.CW bio
(buffered I/O), defined by
.CW <bio.h> .
There also exists an implementation of ANSI Standard I/O,
.CW stdio .
.PP
.CW Bio
is small and efficient, particularly for buffer-at-a-time or
line-at-a-time I/O.
Even for character-at-a-time I/O, however, it is significantly faster than
the Standard I/O library,
.CW stdio .
Its interface is compact and regular, although it lacks a few conveniences.
The most noticeable is that one must explicitly define buffers for standard
input and output;
.CW bio
does not predefine them.  Here is a program to copy input to output a byte
at a time using
.CW bio :
.P1
#include <u.h>
#include <libc.h>
#include <bio.h>

Biobuf	bin;
Biobuf	bout;

void
main(void)
{
	int c;

	Binit(&bin, 0, OREAD);
	Binit(&bout, 1, OWRITE);

	while((c=Bgetc(&bin)) != Beof)
		Bputc(&bout, c);
	exits(0);
}
.P2
For peak performance, we could replace
.CW Bgetc
and
.CW Bputc
by their equivalent in-line macros
.CW BGETC
and
.CW BPUTC
but
the performance gain would be modest.
For more information on
.CW bio ,
see the Programmer's Manual.
.PP
Perhaps the most dramatic difference in the I/O interface of Plan 9 from other
systems' is that text is not ASCII.
The format for
text in Plan 9 is a byte-stream encoding of 21-bit characters.
The character set is based on the Unicode Standard and is backward compatible with
ASCII:
characters with value 0 through 127 are the same in both sets.
The 21-bit characters, called
.I runes
in Plan 9, are encoded using a representation called
UTF,
an encoding that is becoming accepted as a standard.
(ISO calls it UTF-8;
throughout Plan 9 it's just called
UTF.)
UTF
defines multibyte sequences to
represent character values from 0 to 1,114,111.
In
UTF,
character values up to 127 decimal, 7F hexadecimal, represent themselves,
so straight
ASCII
files are also valid
UTF.
Also,
UTF
guarantees that bytes with values 0 to 127 (NUL to DEL, inclusive)
will appear only when they represent themselves, so programs that read bytes
looking for plain ASCII characters will continue to work.
Any program that expects a one-to-one correspondence between bytes and
characters will, however, need to be modified.
An example is parsing file names.
File names, like all text, are in
UTF,
so it is incorrect to search for a character in a string by
.CW strchr(filename,
.CW c)
because the character might have a multi-byte encoding.
The correct method is to call
.CW utfrune(filename,
.CW c) ,
defined in
.I rune (2),
which interprets the file name as a sequence of encoded characters
rather than bytes.
In fact, even when you know the character is a single byte
that can represent only itself,
it is safer to use
.CW utfrune
because that assumes nothing about the character set
and its representation.
.PP
The library defines several symbols relevant to the representation of characters.
Any byte with unsigned value less than
.CW Runesync
will not appear in any multi-byte encoding of a character.
.CW Utfrune
compares the character being searched against
.CW Runesync
to see if it is sufficient to call
.CW strchr
or if the byte stream must be interpreted.
Any character with value less than
.CW Runeself
is represented by a single byte with the same value.
Finally, when errors are encountered converting
to runes from a byte stream, the library returns the rune value
.CW Runeerror
and advances a single byte.  This permits programs to find runes
embedded in binary data.
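.PP
The behaviour just described can be sketched in portable ISO C.
This is an illustration under stated assumptions, not the Plan 9 library source:
.CW decode_rune
and
.CW find_rune
are hypothetical stand-ins for the routines documented in
.I rune (2),
and the constants 0x80 and 0xFFFD are assumed values for the synchronization
threshold and the error rune.
The decoder rejects overlong forms and values above 0x10FFFF, but for brevity
does not reject surrogate values.

```c
#include <stddef.h>
#include <string.h>

enum {
	RUNE_SYNC  = 0x80,	/* bytes below this never occur inside a multi-byte sequence */
	RUNE_ERROR = 0xFFFD	/* rune reported for a malformed sequence */
};

/*
 * Decode one rune from the NUL-terminated string s into *r and return
 * the number of bytes consumed.  A malformed sequence yields RUNE_ERROR
 * and consumes exactly one byte, so the caller can resynchronize.
 */
int
decode_rune(const unsigned char *s, unsigned long *r)
{
	static const unsigned long min[] = { 0, 0, 0x80, 0x800, 0x10000 };
	unsigned long v;
	int i, n;

	if(s[0] < RUNE_SYNC){
		*r = s[0];
		return 1;
	}
	if((s[0] & 0xE0) == 0xC0){
		n = 2;
		v = s[0] & 0x1F;
	}else if((s[0] & 0xF0) == 0xE0){
		n = 3;
		v = s[0] & 0x0F;
	}else if((s[0] & 0xF8) == 0xF0){
		n = 4;
		v = s[0] & 0x07;
	}else
		goto bad;
	for(i = 1; i < n; i++){
		if((s[i] & 0xC0) != 0x80)	/* also stops at the terminating NUL */
			goto bad;
		v = (v << 6) | (s[i] & 0x3F);
	}
	if(v < min[n] || v > 0x10FFFF)	/* overlong form or out of range */
		goto bad;
	*r = v;
	return n;
bad:
	*r = RUNE_ERROR;
	return 1;
}

/* Return a pointer to the first occurrence of rune c in s, or NULL. */
const char *
find_rune(const char *s, unsigned long c)
{
	unsigned long r;
	int n;

	if(c < RUNE_SYNC)		/* single byte: a plain byte search is enough */
		return strchr(s, (int)c);
	while(*s != 0){
		n = decode_rune((const unsigned char *)s, &r);
		if(r == c)
			return s;
		s += n;
	}
	return NULL;
}
```

Because a malformed byte costs only one step, a loop over
.CW decode_rune
can walk arbitrary binary data and still pick up any well-formed sequences in it.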
.PP
.CW Bio
includes routines
.CW Bgetrune
and
.CW Bputrune
to transform the external byte stream
UTF
format to and from
internal 21-bit runes.
Also, the
.CW %s
format to
.CW print
accepts
UTF;
.CW %c
prints a character after narrowing it to 8 bits.
The
.CW %S
format prints a null-terminated sequence of runes;
.CW %C
prints a character after narrowing it to 21 bits.
For more information, see the Programmer's Manual, in particular
.I utf (6)
and
.I rune (2),
and the paper,
``Hello world, or
Καλημέρα κόσμε, or\
\f(Jpこんにちは 世界\f1'',
by Rob Pike and
Ken Thompson;
there is not room for the full story here.
.PP
These issues affect the compiler in several ways.
First, the C source is in
UTF.
ANSI says C variables are formed from
ASCII
alphanumerics, but comments and literal strings may contain any characters
encoded in the native encoding, here
UTF.
The declaration
.P1
char *cp = "abcÿ";
.P2
initializes the variable
.CW cp
to point to an array of bytes holding the
UTF
representation of the characters
.CW abcÿ .
The type
.CW Rune
is defined in
.CW <u.h>
to be an unsigned integer type wide enough to hold a 21-bit value;
it is also the `wide character' type in the compiler.
Therefore the declaration
.P1
Rune *rp = L"abcÿ";
.P2
initializes the variable
.CW rp
to point to an array of runes holding the 21-bit
values of the characters
.CW abcÿ .
Note that in both these declarations the characters in the source
that represent
.CW "abcÿ"
are the same; what changes is how those characters are represented
in memory in the program.
The following two lines:
.P1
print("%s\en", "abcÿ");
print("%S\en", L"abcÿ");
.P2
produce the same
UTF
string on their output, the first by copying the bytes, the second
by converting from runes to bytes.
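.PP
The conversion from runes to bytes can be sketched as follows.
This is portable ISO C written for illustration, not the library's implementation;
.CW encode_rune
and
.CW runes_to_utf
are hypothetical names,
.CW "unsigned long"
stands in for the rune type, and replacing out-of-range values by 0xFFFD is an
assumption of the sketch.

```c
/*
 * Store the UTF-8 encoding of rune r in out, which must have room for
 * 4 bytes, and return the number of bytes stored.  Values beyond
 * 0x10FFFF are replaced by 0xFFFD (an assumption of this sketch).
 */
int
encode_rune(unsigned long r, unsigned char *out)
{
	if(r > 0x10FFFF)
		r = 0xFFFD;
	if(r < 0x80){
		out[0] = (unsigned char)r;
		return 1;
	}
	if(r < 0x800){
		out[0] = (unsigned char)(0xC0 | (r >> 6));
		out[1] = (unsigned char)(0x80 | (r & 0x3F));
		return 2;
	}
	if(r < 0x10000){
		out[0] = (unsigned char)(0xE0 | (r >> 12));
		out[1] = (unsigned char)(0x80 | ((r >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (r & 0x3F));
		return 3;
	}
	out[0] = (unsigned char)(0xF0 | (r >> 18));
	out[1] = (unsigned char)(0x80 | ((r >> 12) & 0x3F));
	out[2] = (unsigned char)(0x80 | ((r >> 6) & 0x3F));
	out[3] = (unsigned char)(0x80 | (r & 0x3F));
	return 4;
}

/*
 * Convert the zero-terminated rune array rs to a NUL-terminated byte
 * string in out and return its length in bytes.  The caller must supply
 * at least 4 bytes per rune plus one for the terminator.
 */
int
runes_to_utf(const unsigned long *rs, unsigned char *out)
{
	int n;

	n = 0;
	while(*rs != 0)
		n += encode_rune(*rs++, out + n);
	out[n] = 0;
	return n;
}
```

Applied to the four runes of
.CW abcÿ ,
the conversion yields five bytes: the three ASCII letters unchanged, followed by
the two-byte encoding of the rune with value 255.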
.PP
In C, character constants are integers but narrowed through the
.CW char
type.
The Unicode character
.CW ÿ
has value 255, so if the
.CW char
type is signed,
the constant
.CW 'ÿ'
has value \-1 (which is equal to EOF).
On the other hand,
.CW L'ÿ'
narrows through the wide character type,
.CW Rune ,
and therefore has value 255.
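.PP
The arithmetic behind this pitfall can be checked with a few lines of portable ISO C.
The function names are hypothetical, and the sketch assumes an 8-bit
.CW char
and the usual two's-complement result for the out-of-range conversion, which
ISO C leaves to the implementation.

```c
#include <stdio.h>

/* Value of code point v after narrowing through a signed 8-bit char. */
int
narrow_signed(int v)
{
	return (signed char)v;
}

/* Value of code point v after narrowing through an unsigned 8-bit char. */
int
narrow_unsigned(int v)
{
	return (unsigned char)v;
}

/* Nonzero if a signed char holding code point v is indistinguishable from EOF. */
int
collides_with_eof(int v)
{
	return narrow_signed(v) == EOF;
}
```

With these definitions, 255 narrows to \-1 through the signed type and stays 255
through the unsigned one, which is why a loop that stores the result of a
character read in a signed
.CW char
can mistake ÿ for end of file.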
.PP
Finally, although it's not ANSI C, the Plan 9 C compilers
assume any character with value above
.CW Runeself
is an alphanumeric,
so α is a legal, if non-portable, variable name.
.SH
Arguments
.PP
Some macros are defined
in
.CW <libc.h>
for parsing the arguments to
.CW main() .
They are described in
.I ARG (2)
but are fairly self-explanatory.
There are four macros:
.CW ARGBEGIN
and
.CW ARGEND
are used to bracket a hidden
.CW switch
statement within which
.CW ARGC
returns the current option character (rune) being processed and
.CW ARGF
returns the argument to the option, as in the loader option
.CW -o
.CW file .
Here, for example, is the code at the beginning of
.CW main()
in
.CW ramfs.c
(see
.I ramfs (4))
that cracks its arguments:
.P1
void
main(int argc, char *argv[])
{
	char *defmnt;
	int p[2];
	int mfd[2];
	int stdio = 0;

	defmnt = "/tmp";
	ARGBEGIN{
	case 'i':
		defmnt = 0;
		stdio = 1;
		mfd[0] = 0;
		mfd[1] = 1;
		break;
	case 's':
		defmnt = 0;
		break;
	case 'm':
		defmnt = ARGF();
		break;
	default:
		usage();
	}ARGEND
.P2
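.PP
For readers without the Plan 9 headers at hand, the following portable ISO C
sketch shows one plausible shape for the loop that the macros hide.
It is an illustration only, not the text of the macros in
.CW <libc.h> :
the type
.CW Opts
and the function
.CW parse_opts
are hypothetical, errors are recorded in a flag rather than by calling
.CW usage ,
and the handling of bundled options and of
.CW --
is an assumption of the sketch.

```c
#include <stddef.h>

typedef struct Opts Opts;
struct Opts
{
	const char *defmnt;	/* mount point, or NULL when none is wanted */
	int stdio;		/* set by -i */
	int bad;		/* set on an unknown option or a missing argument */
};

/* Parse the options and return the index of the first non-option argument. */
int
parse_opts(int argc, char *argv[], Opts *o)
{
	const char *p;
	int i;

	o->defmnt = "/tmp";
	o->stdio = 0;
	o->bad = 0;
	for(i = 1; i < argc && argv[i][0] == '-' && argv[i][1] != 0; i++){
		if(argv[i][1] == '-' && argv[i][2] == 0){	/* "--" ends the options */
			i++;
			break;
		}
		for(p = argv[i] + 1; *p != 0; p++){
			switch(*p){
			case 'i':
				o->defmnt = NULL;
				o->stdio = 1;
				break;
			case 's':
				o->defmnt = NULL;
				break;
			case 'm':
				/* the argument is the rest of this word, or the next word */
				if(p[1] != 0)
					o->defmnt = p + 1;
				else if(i + 1 < argc)
					o->defmnt = argv[++i];
				else
					o->bad = 1;
				goto next;
			default:
				o->bad = 1;
				break;
			}
		}
	next:
		;
	}
	return i;
}
```

The
.CW case
labels are the only part a caller would normally write; everything around them
is what
.CW ARGBEGIN
and
.CW ARGEND
spare the programmer.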
.SH
Extensions
.PP
The compiler has several extensions to 1989 ANSI C, all of which are used
extensively in the system source.
Some of these have been adopted in later ANSI C standards.
First,
.I structure
.I displays
permit
.CW struct
expressions to be formed dynamically.
Given these declarations:
.P1
typedef struct Point Point;
typedef struct Rectangle Rectangle;

struct Point
{
	int x, y;
};

struct Rectangle
{
	Point min, max;
};

Point	p, q, add(Point, Point);
Rectangle r;
int	x, y;
.P2
this assignment may appear anywhere an assignment is legal:
.P1
r = (Rectangle){add(p, q), (Point){x, y+3}};
.P2
The syntax is the same as for initializing a structure but with
a leading cast.
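.PP
The 1999 C standard adopted this construct under the name
.I compound
.I literal ,
so the assignment can be tried with any C99 compiler.
The sketch below is self-contained; the body of
.CW add
and the wrapper function
.CW mkrect
are assumptions made for the example, since the text above only declares
.CW add .

```c
typedef struct Point Point;
typedef struct Rectangle Rectangle;

struct Point
{
	int x, y;
};

struct Rectangle
{
	Point min, max;
};

/* Component-wise sum, itself returned as a structure display. */
Point
add(Point a, Point b)
{
	return (Point){a.x + b.x, a.y + b.y};
}

/* Build a rectangle with the assignment shown in the text. */
Rectangle
mkrect(Point p, Point q, int x, int y)
{
	Rectangle r;

	r = (Rectangle){add(p, q), (Point){x, y+3}};
	return r;
}
```

No temporary variables are needed: the inner display supplies the second
member of the outer one directly.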
11093e12c5d1SDavid du Colombier.PP
11103e12c5d1SDavid du ColombierIf an
11113e12c5d1SDavid du Colombier.I anonymous
11123e12c5d1SDavid du Colombier.I structure
11133e12c5d1SDavid du Colombieror
11143e12c5d1SDavid du Colombier.I union
11153e12c5d1SDavid du Colombieris declared within another structure or union, the members of the internal
11163e12c5d1SDavid du Colombierstructure or union are addressable without prefix in the outer structure.
11173e12c5d1SDavid du ColombierThis feature eliminates the clumsy naming of nested structures and,
11183e12c5d1SDavid du Colombierparticularly, unions.
11193e12c5d1SDavid du ColombierFor example, after these declarations,
11203e12c5d1SDavid du Colombier.P1
11213e12c5d1SDavid du Colombierstruct Lock
11223e12c5d1SDavid du Colombier{
11233e12c5d1SDavid du Colombier	int	locked;
11243e12c5d1SDavid du Colombier};
11253e12c5d1SDavid du Colombier
11263e12c5d1SDavid du Colombierstruct Node
11273e12c5d1SDavid du Colombier{
11283e12c5d1SDavid du Colombier	int	type;
11293e12c5d1SDavid du Colombier	union{
11303e12c5d1SDavid du Colombier		double  dval;
11313e12c5d1SDavid du Colombier		double  fval;
11323e12c5d1SDavid du Colombier		long    lval;
1133219b2ee8SDavid du Colombier	};		/* anonymous union */
11343e12c5d1SDavid du Colombier	struct Lock;	/* anonymous structure */
11353e12c5d1SDavid du Colombier} *node;
11363e12c5d1SDavid du Colombier
11373e12c5d1SDavid du Colombiervoid	lock(struct Lock*);
11383e12c5d1SDavid du Colombier.P2
one may refer to
.CW node->type ,
.CW node->dval ,
.CW node->fval ,
.CW node->lval ,
and
.CW node->locked .
Moreover, the address of a
.CW struct
.CW Node
may be used without a cast anywhere that the address of a
.CW struct
.CW Lock
is used, such as in argument lists.
The compiler automatically promotes the type and adjusts the address.
Thus one may invoke
.CW lock(node) .
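.PP
As a rough comparison for readers using other compilers, the sketch below
restates the example in ISO C11, which standardized only the untagged form of
anonymous structures and unions.
It is an illustration, not Plan 9 code: the tagged member
.CW "struct Lock;"
and the automatic pointer promotion remain Plan 9 extensions, so the
.CW Lock
is given the hypothetical member name
.CW lk
and its address is passed explicitly.
The functions
.CW mknode
and
.CW checknode
are invented for the example.

```c
#include <assert.h>

struct Lock
{
	int	locked;
};

struct Node
{
	int	type;
	union{
		double	dval;
		float	fval;
		long	lval;
	};			/* anonymous union: standard in C11 */
	struct Lock	lk;	/* named member; Plan 9 would allow "struct Lock;" */
};

/* lock marks the Lock as held */
void
lock(struct Lock *l)
{
	l->locked = 1;
}

/* mknode (hypothetical helper) fills in a Node holding a long */
struct Node
mknode(long v)
{
	struct Node n;

	n.type = 1;
	n.lval = v;		/* union member reached without a prefix */
	n.lk.locked = 0;
	return n;
}

/* checknode (hypothetical helper) locks a node and verifies the result */
int
checknode(void)
{
	struct Node n = mknode(42);

	lock(&n.lk);		/* Plan 9 C would accept lock(&n) directly */
	assert(n.lval == 42);
	assert(n.lk.locked == 1);
	return n.type;
}
```
.PP
With the Plan 9 compiler the member name
.CW lk
disappears and the call becomes
.CW lock(&n) ,
which is the convenience the extension exists to provide.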
.PP
Anonymous structures and unions may be accessed by type name
if (and only if) they are declared using a
.CW typedef
name.
For example, using the above declaration for
.CW Point ,
one may declare
.P1
struct
{
	int	type;
	Point;
} p;
.P2
and refer to
.CW p.Point .
.PP
In the initialization of arrays, a number in square brackets before an
element sets the index for the initialization.  For example, to initialize
some elements in
a table of function pointers indexed by
ASCII
character,
.P1
void	percent(void), slash(void);

void	(*func[128])(void) =
{
	['%']	percent,
	['/']	slash,
};
.P2
.LP
A similar syntax allows one to initialize structure elements:
.P1
Point p =
{
	.y 100,
	.x 200
};
.P2
These initialization syntaxes were later added to ANSI C, with the addition of an
equals sign between the index or tag and the value.
The Plan 9 compiler accepts either form.
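.PP
A minimal sketch of the same two initializations written with the equals sign,
the form that later C standards adopted and that, as noted above, the Plan 9
compiler also accepts.
The
.CW Point
type is assumed to have integer members
.CW x
and
.CW y ;
the bodies of
.CW percent
and
.CW slash ,
the variable
.CW last ,
and the functions
.CW dispatch
and
.CW checkinit
are invented for the example.

```c
#include <assert.h>

typedef struct Point Point;
struct Point
{
	int	x;
	int	y;
};

int	last;	/* records which handler ran most recently (hypothetical) */

void
percent(void)
{
	last = '%';
}

void
slash(void)
{
	last = '/';
}

/* only two slots are set; the remaining entries are null pointers */
void	(*func[128])(void) =
{
	['%'] = percent,
	['/'] = slash,
};

/* members may be named in any order */
Point p =
{
	.y = 100,
	.x = 200
};

/* dispatch calls the handler for c, if any, and reports whether one ran */
int
dispatch(int c)
{
	if(c < 0 || c >= 128 || func[c] == 0)
		return 0;
	func[c]();
	return 1;
}

/* checkinit (hypothetical) verifies that the initializers took effect */
void
checkinit(void)
{
	assert(p.x == 200 && p.y == 100);
	assert(dispatch('%') && last == '%');
	assert(!dispatch('a'));
}
```
.PP
Entries that are not named in the initializer are zero, so the table can be
sparse without listing every index.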
.PP
Finally, the declaration
.P1
extern register reg;
.P2
.I this "" (
appearance of the register keyword is not ignored)
allocates a global register to hold the variable
.CW reg .
External registers must be used carefully: they need to be declared in
.I all
source files and libraries in the program to guarantee the register
is not allocated temporarily for other purposes.
Especially on machines with few registers, such as the i386,
it is easy to link accidentally with code that has already usurped
the global registers and there is no diagnostic when this happens.
Used wisely, though, external registers are powerful.
The Plan 9 operating system uses them to access per-process and
per-machine data structures on a multiprocessor.  The storage class they provide
is hard to create in other ways.
.SH
The compile-time environment
.PP
The code generated by the compilers is `optimized' by default:
variables are placed in registers and peephole optimizations are
performed.
The compiler flag
.CW -N
disables these optimizations.
Registerization is done locally rather than throughout a function:
whether a variable occupies a register or
the memory location identified in the symbol
table depends on the activity of the variable and may change
throughout the life of the variable.
The
.CW -N
flag is rarely needed;
its main use is to simplify debugging.
There is no information in the symbol table to identify the
registerization of a variable, so
.CW -N
guarantees the variable is always where the symbol table says it is.
.PP
Another flag,
.CW -w ,
turns
.I on
warnings about portability and problems detected in flow analysis.
Most code in Plan 9 is compiled with warnings enabled;
these warnings plus the type checking offered by function prototypes
provide most of the support of the Unix tool
.CW lint
more accurately and with less chatter.
Two of the warnings,
`used and not set' and `set and not used', are almost always accurate but
may be triggered spuriously by code with invisible control flow,
such as in routines that call
.CW longjmp .
The compiler statements
.P1
SET(v1);
USED(v2);
.P2
decorate the flow graph to silence the compiler.
Either statement accepts a comma-separated list of variables.
Use them carefully: they may silence real errors.
For the common case of unused parameters to a function,
leaving the name off the declaration silences the warnings.
That is, listing the type of a parameter but giving it no
associated variable name does the trick.
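.PP
A small sketch of where these statements typically go.
.CW SET
and
.CW USED
are built into the Plan 9 compilers, so the two macros at the top are only
stand-ins, assumed here so that the example also compiles elsewhere; they
should be removed when compiling on Plan 9.
The functions
.CW classify ,
.CW handler ,
and
.CW checkflow
are invented for the example.

```c
#include <assert.h>

/* stand-ins for other compilers only; Plan 9 C provides SET and USED itself */
#define SET(x)	((x) = 0)
#define USED(x)	((void)(x))

/* classify assigns kind on every path, which flow analysis may not see */
int
classify(int c)
{
	int kind;

	SET(kind);
	switch(c & 1){
	case 0:
		kind = 10;
		break;
	case 1:
		kind = 20;
		break;
	}
	return kind;
}

/* handler must match a fixed signature but has no use for ctx */
int
handler(int sig, void *ctx)
{
	USED(ctx);
	return sig + 1;
}

/* checkflow (hypothetical) exercises both functions */
void
checkflow(void)
{
	assert(classify(4) == 10);
	assert(classify(7) == 20);
	assert(handler(1, 0) == 2);
}
```
.PP
On Plan 9 the parameter
.CW ctx
could instead be left unnamed in the definition of
.CW handler ,
which avoids the need for
.CW USED
altogether.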
.SH
Debugging
.PP
There are two debuggers available on Plan 9.
The first, and older, is
.CW db ,
a revision of Unix
.CW adb .
The other,
.CW acid ,
is a source-level debugger whose commands are statements in
a true programming language.
.CW Acid
is the preferred debugger, but since it
borrows some elements of
.CW db ,
notably the formats for displaying values, it is worth knowing a little bit about
.CW db .
.PP
Both debuggers support multiple architectures in a single program; that is,
the programs are
.CW db
and
.CW acid ,
not for example
.CW vdb
and
.CW vacid .
They also support cross-architecture debugging comfortably:
one may debug a 386 binary on a MIPS.
.PP
Imagine a program has crashed mysteriously:
.P1
% X11/X
Fatal server bug!
failed to create default stipple
X 106: suicide: sys: trap: fault read addr=0x0 pc=0x00105fb8
%
.P2
When a process dies on Plan 9 it hangs in the `broken' state
for debugging.
Attach a debugger to the process by naming its process id:
.P1
% acid 106
/proc/106/text:mips plan 9 executable

/sys/lib/acid/port
/sys/lib/acid/mips
acid:
.P2
The
.CW acid
function
.CW stk()
reports the stack traceback:
.P1
acid: stk()
At pc:0x105fb8:abort+0x24 /sys/src/ape/lib/ap/stdio/abort.c:6
abort() /sys/src/ape/lib/ap/stdio/abort.c:4
	called from FatalError+#4e
		/sys/src/X/mit/server/dix/misc.c:421
FatalError(s9=#e02, s8=#4901d200, s7=#2, s6=#72701, s5=#1,
    s4=#7270d, s3=#6, s2=#12, s1=#ff37f1c, s0=#6, f=#7270f)
    /sys/src/X/mit/server/dix/misc.c:416
	called from gnotscreeninit+#4ce
		/sys/src/X/mit/server/ddx/gnot/gnot.c:792
gnotscreeninit(snum=#0, sc=#80db0)
    /sys/src/X/mit/server/ddx/gnot/gnot.c:766
	called from AddScreen+#16e
		/n/bootes/sys/src/X/mit/server/dix/main.c:610
AddScreen(pfnInit=0x0000129c,argc=0x00000001,argv=0x7fffffe4)
    /sys/src/X/mit/server/dix/main.c:530
	called from InitOutput+0x80
		/sys/src/X/mit/server/ddx/brazil/brddx.c:522
InitOutput(argc=0x00000001,argv=0x7fffffe4)
    /sys/src/X/mit/server/ddx/brazil/brddx.c:511
	called from main+0x294
		/sys/src/X/mit/server/dix/main.c:225
main(argc=0x00000001,argv=0x7fffffe4)
    /sys/src/X/mit/server/dix/main.c:136
	called from _main+0x24
		/sys/src/ape/lib/ap/mips/main9.s:8
.P2
The function
.CW lstk()
is similar but
also reports the values of local variables.
Note that the traceback includes full file names; this is a boon to debugging,
although it makes the output much noisier.
.PP
To use
.CW acid
well you will need to learn its input language; see the
``Acid Manual'',
by Phil Winterbottom,
for details.  For simple debugging, however, the information in the manual page is
sufficient.  In particular, it describes the most useful functions
for examining a process.
.PP
The compiler does not place
information describing the types of variables in the executable,
but a compile-time flag provides crude support for symbolic debugging.
The
.CW -a
flag to the compiler suppresses code generation
and instead emits source text in the
.CW acid
language to format and display data structure types defined in the program.
The easiest way to use this feature is to put a rule in the
.CW mkfile :
.P1
syms:   main.$O
        $CC -a main.c > syms
.P2
Then from within
.CW acid ,
.P1
acid: include("sourcedirectory/syms")
.P2
to read in the relevant definitions.
(For multi-file source, you need to be a little fancier;
see
.I 8c (1)).
This text includes, for each defined compound
type, a function with that name that may be called with the address of a structure
of that type to display its contents.
For example, if
.CW rect
is a global variable of type
.CW Rectangle ,
one may execute
.P1
Rectangle(*rect)
.P2
to display it.
The
.CW *
(indirection) operator is necessary because
of the way
.CW acid
works: each global symbol in the program is defined as a variable by
.CW acid ,
with value equal to the
.I address
of the symbol.
.PP
Another common technique is to write by hand special
.CW acid
code to define functions to aid debugging, initialize the debugger, and so on.
Conventionally, this is placed in a file called
.CW acid
in the source directory; it has a line
.P1
include("sourcedirectory/syms");
.P2
to load the compiler-produced symbols.  One may edit the compiler output directly but
it is wiser to keep the hand-generated
.CW acid
separate from the machine-generated.
.PP
To make things simple, the default rules in the system
.CW mkfiles
include entries to make
.CW foo.acid
from
.CW foo.c ,
so one may use
.CW mk
to automate the production of
.CW acid
definitions for a given C source file.
.PP
There is much more to say here.  See the
.CW acid
manual page, the reference manual, or the paper
``Acid: A Debugger Built From A Language'',
also by Phil Winterbottom.