.ds TM \u\s-2TM\s+2\d
.nr dT 6
.nr XT 6
.TL
The Styx Architecture for Distributed Systems
.AU
Rob Pike
Dennis M. Ritchie
.AI
Computing Science Research Center
Lucent Technologies, Bell Labs
Murray Hill, New Jersey
USA
.FS
.FA
Originally appeared in
.I "Bell Labs Technical Journal" ,
Vol. 4,
No. 2,
April-June 1999,
pp. 146-152.
.br
Copyright © 1999 Lucent Technologies Inc.  All rights reserved.
.FE
.AB
A distributed system is constructed from a set of relatively
independent components that form a unified, but geographically and
functionally diverse entity.  Examples include networked operating
systems, Internet services, the national telephone
switching system, and in general
all the technology using today's diverse digital
networks.  Nevertheless, distributed systems remain difficult
to design, build, and maintain, primarily because of the lack
of a clean, perspicuous interconnection model for the
components.
.LP
Our experience with two distributed operating systems,
Plan 9 and Inferno, encourages us to propose such a model.
These systems depend on, advocate, and generally push to the
limit a fruitful idea: to present their
resources as files in a hierarchical name space.
The objects appearing as files may represent stored data, but may
also be devices, dynamic information sources, interfaces to services,
and control points.  The approach unifies and provides basic naming,
structuring, and access control mechanisms for all system resources.
A simple underlying network protocol, Styx, forms
the core of the architecture by presenting a common
language for communication within the system.
.LP
Even within non-distributed systems, the presentation of services
as files advantageously extends a familiar scheme for naming, classifying,
and connecting to system resources.
More important, the approach provides a natural way to build
distributed systems, by using well-known technology for attaching
remote file systems.
If resources are represented as files,
and there are remote file systems, one has
a distributed system: resources available in one place
are usable from another.
.AE
.SH
Introduction
.LP
The Styx protocol is a variant of a protocol called
.I 9P
that
was developed for the Plan 9 operating system[9man].
For simplicity, we will use the name
Styx throughout this paper; the difference concerns only the initialization of
a connection.
.LP
The original idea behind Styx was to encode file operations between
client programs and the file system,
to be translated into messages for transmission on a computer network.
Using this technology,
Plan 9 separates the file server\(ema central repository for
permanent file storage\(emboth from the CPU server\(ema large
shared-memory multiprocessor\(emand from the user terminals.
This physical separation of function was central to the original
design of the system;
what was unexpected was how well the model could be used to
solve a wide variety of problems not usually thought of as
file system issues.
.LP
The breakthrough was to realize that by representing
a computing resource as a form of file system,
many of the difficulties of making that resource available
across the network would disappear naturally, because
Styx could export the resource transparently.
For example,
the Plan 9 window system,
.CW 8½
[Pike91],
is implemented as a dynamic file server that publishes
files with names like
.CW /dev/mouse
and
.CW /dev/screen
to provide access to the local hardware.
The
.CW /dev/mouse
file, for instance,
may be opened and read like a regular file, in the manner of UNIX\*(TM device
files, but under
.CW 8½
it is multiplexed: each client program has a private
.CW /dev/mouse
file that returns mouse events only when the client's window
is the active one on the display.
This design provides a clean, simple mechanism for controlling
access to the mouse.
Its real strength, though, is that the representation of the window system's
resources as files allows Styx to make those resources available across the
network.
For example, an interactive graphics program may be run on a CPU server
simply by having
.CW 8½
serve the appropriate files to that machine.
.LP
Note that although the resources published by Styx behave like files\(emthey
have file names, file permissions, and file access methods\(emthey do not
need to exist as standard files on disk.
The
.CW /dev/mouse
file is accessed by standard file I/O mechanisms but is nonetheless a
transient object fabricated dynamically by a running program;
it has no permanent existence.
.LP
By following this approach throughout the system, Plan 9 achieves
a remarkable degree of transparency in the distribution of resources[PPTTW93].
Besides interactive graphics, services such as debugging, maintenance,
file backup, and even access to the underlying network hardware
can be made available across the network using Styx, permitting
the construction of distributed applications and services
using nothing more sophisticated than file I/O.
.SH
The Styx protocol
.LP
Styx's place in the world is analogous to
Sun NFS[RFC][NFS] or Microsoft CIFS[CIFS], although it is simpler and easier to implement
[Welc94].
Furthermore, NFS and CIFS are designed for sharing regular disk files; NFS in particular
is intimately tied to the implementation and caching strategy
of the underlying UNIX file system.
Unlike Styx, NFS and CIFS are clumsy at exporting dynamic device-like
files such as
.CW /dev/mouse .
.LP
Styx provides a view of a hierarchical, tree-shaped
file system name space[Nee89], together with access information about
the files (permissions, sizes, dates) and the means to read and write
the files.
Its users (that is, the people who write application programs)
don't see the protocol itself; instead they see files that they
read and write, and that provide information or change information.
.LP
In use, a Styx
.I client
is an entity on one machine that establishes communication with
another entity, the
.I server ,
on the same or another machine.
The client mechanisms may be built into the operating system, as they
are in Plan 9 or Inferno[INF1][INF2], or into application libraries;
a server may be part of the operating system, or just as often
may be application code on a separate server machine.  In any case, the
client and server entities
communicate by exchanging messages, and the effect is that the client
sees a hierarchical file system that exists on the server.
The Styx protocol is the specification of the messages that are exchanged.
.LP
At one level, Styx consists of messages of 13 types for
.RS
.IP \(bu
Starting communication (attaching to a file system);
.IP \(bu
Navigating the file system (that is, specifying and
gaining a handle for a named file);
.IP \(bu
Reading and writing a file; and
.IP \(bu
Performing file status inquiries and changes
.RE
.LP
However, application writers simply code requests to open, read, or write
files; a library or the operating system translates the requests
into the necessary byte sequences transmitted over a communication
channel.  The Styx protocol proper specifies the interpretation of these
byte sequences.  It fits, approximately, at the OSI Session Layer level
of the ISO standard classification.
Its specification is independent of most details of machine architecture
and it has been successfully used among machines of varying instruction
sets and data layout.
The protocol is summarized in Table 1.
.KF
.TS
center box;
l l
--
lfCW l.
Name	Description
attach	Authenticate user of connection; return FID
clone	Duplicate FID
walk	Advance FID one level of name hierarchy
open	Check permissions for file I/O
create	Create new file
read	Read contents of file
write	Write contents of file
close	Discard FID
remove	Remove file
stat	Report file state: permissions, etc.
wstat	Modify file state
error	Return error condition for failed operation
flush	Disregard outstanding I/O requests
.TE
.ce 100
.ps -1
Table 1. Summary of Styx messages.
.ps
.ce 0
.KE
.LP
In use, an operation such as
.P1
open("/usr/rob/.profile", O_READ);
.P2
is translated by the underlying system into a sequence of Styx messages.
After establishing the initial connection to the
file server, an
.CW attach
message authenticates the user (the person or agent accessing the files) and
returns an object called a
.CW FID
(file ID) that represents the root of the hierarchy on the server.
When the
.CW open()
operation is executed, it proceeds as follows.
.RS
.IP \(bu
A
.CW clone
message duplicates the root
.CW FID ,
returning a new
.CW FID
that can navigate the hierarchy without losing the connection to the root.
.IP \(bu
The new
.CW FID
is then moved to the file
.CW /usr/rob/.profile
by a sequence of
.CW walk
messages that step along, one path component at a time
.CW usr , (
.CW rob ,
.CW .profile ).
.IP \(bu
Finally, an
.CW open
message checks that the user has permission to read the file,
permitting subsequent
.CW read
and
.CW write
operations (messages) on the
.CW FID .
.IP \(bu
Once I/O is completed, the
.CW close
message will release the
.CW FID .
.RE
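.LP
The sequence just described can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function name
.CW styx_open_sequence ,
the FID numbers, and the tuple layout are hypothetical and do not
represent the Styx wire format.
.P1
```python
def styx_open_sequence(path):
    """List the Styx messages a client might send to open `path`.

    Assumes an attach has already produced a root FID. The tuples are an
    illustrative stand-in for real messages, not the wire format.
    """
    root_fid, new_fid = 0, 1  # hypothetical FID numbers
    messages = [("clone", root_fid, new_fid)]
    # One walk message per path component.
    for name in path.strip("/").split("/"):
        messages.append(("walk", new_fid, name))
    messages.append(("open", new_fid, "read"))
    return messages


for message in styx_open_sequence("/usr/rob/.profile"):
    print(message)
```
.P2
For this path the sketch yields one clone, three walks, and one open;
a close on the same FID would follow once I/O is finished.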
.LP
At a lower level, implementations of Styx depend only on a reliable,
byte-stream Transport communications layer. For example, Styx runs over either
TCP/IP, the standard transmission control protocol
and Internet protocol,
or Internet link (IL), which is a sequenced, reliable datagram protocol
using IP packets.
It is worth emphasizing, though, that the model does not require the
existence of a network to join the components; Styx runs fine
over a UNIX pipe or even using shared memory.
The strength of the approach is not so much how it works over a network
as that its behavior over a network is identical to its behavior locally.
.SH
Architectural approach
.LP
Styx, as a file system protocol, is merely a component in a
more encompassing approach
to system design: the presentation of resources as files.
This approach will be discussed using a sequence of examples.
.SH
.I "Example: networking
.LP
As an example, access to a TCP/IP network in Inferno and Plan 9 systems
appears as a piece of a file system, with (abbreviated) structure
as follows[PrWi93]:
.P1
/net/
	dns/
	tcp/
		clone
		stats
		0/
			ctl
			status
			data
			listen
		1/
			...
		...
	ether0/
		0/
			ctl
			status
			...
		1/
			...
	...
.P2
This represents a file system structure in which one can name, read, and write `files' with
names like
.CW /net/dns ,
.CW /net/tcp/clone ,
.CW /net/tcp/0/ctl
and so on;
there are directories of files
.CW /net/tcp
and
.CW /net/ether0 .
On the machine that actually has the network interface, all of these
things that look like files are constructed by the kernel drivers that maintain
the TCP/IP stack; they are not real files on a disk.
Operations on the `files' turn into operations sent to the device drivers.
.LP
Suppose an application wishes to establish a connection over TCP/IP to
.CW www.bell-labs.com .
The first task is to translate the domain name
.CW www.bell-labs.com
to a numerical Internet address; this is a complicated process, generally
involving communicating with local and remote Domain Name Servers.
In the Styx model, this is done by opening the file
.CW /net/dns
and writing the literal string
.CW www.bell-labs.com
on the file; then the same file is read.
It will return the string
.CW 204.178.16.5
as a sequence of 12 characters.
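.LP
The write-then-read exchange on the dns file can be mimicked with a
small in-memory stand-in.
The sketch below is an assumption-laden illustration: the class
.CW FakeDnsFile
and the function
.CW resolve
are hypothetical names, and the fixed lookup table replaces the real
resolver that sits behind the file.
.P1
```python
class FakeDnsFile:
    """In-memory stand-in for the dns file; it answers from a fixed table."""

    def __init__(self, table):
        self.table = table
        self.answer = ""

    def write(self, name):
        # Writing a domain name poses the query.
        self.answer = self.table[name]

    def read(self):
        # Reading the same file returns the address as text.
        return self.answer


def resolve(dns_file, name):
    dns_file.write(name)
    return dns_file.read()


dns = FakeDnsFile({"www.bell-labs.com": "204.178.16.5"})
address = resolve(dns, "www.bell-labs.com")
print(address, len(address))
```
.P2
The client side needs nothing beyond a write and a read; all of the
resolver's complexity stays behind the file.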
.LP
Once the numerical Internet address is acquired, the connection must be established;
this is done by opening
.CW /net/tcp/clone
and reading from it a string that specifies a directory like
.CW /net/tcp/43 ,
which represents a new, unique TCP/IP channel.
To establish the connection,
write a message like
.CW "connect 204.178.16.5
on the control file for that connection,
.CW /net/tcp/43/ctl .
Subsequently, communication with
.CW www.bell-labs.com
is done by reading and
writing on the file
.CW /net/tcp/43/data .
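.LP
The clone, ctl, and data steps can be walked through against a toy model
of the tcp directory.
Everything in this sketch is hypothetical scaffolding: the class
.CW FakeNetTcp ,
its methods, and the function
.CW dial
merely imitate the file operations described above and open no real
connection.
.P1
```python
class FakeNetTcp:
    """Toy model of the tcp directory: clone, then per-connection ctl."""

    def __init__(self, first_free=43):
        self.next_id = first_free
        self.ctl_log = {}

    def read_clone(self):
        # Reading clone reserves a new connection directory and names it.
        directory = "/net/tcp/" + str(self.next_id)
        self.ctl_log[directory] = []
        self.next_id += 1
        return directory

    def write(self, path, text):
        # Record control messages written to <directory>/ctl.
        directory, _, leaf = path.rpartition("/")
        if leaf != "ctl" or directory not in self.ctl_log:
            raise FileNotFoundError(path)
        self.ctl_log[directory].append(text)


def dial(tcp, address):
    directory = tcp.read_clone()
    tcp.write(directory + "/ctl", "connect " + address)
    # The caller reads and writes this file to talk to the peer.
    return directory + "/data"


tcp = FakeNetTcp()
data_path = dial(tcp, "204.178.16.5")
print(data_path)
```
.P2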
.LP
There are several things to note about this approach.
.RS
.IP \(bu
All the interface points look like files, and are
accessed by the same I/O mechanisms already available in
programming languages like C, C++, or Java. However, they do not
correspond to ordinary data files on disk, but instead are creations
of a middleware code layer.
.IP \(bu
Communication across the interface, by convention, uses printable character strings where
feasible instead of binary information.  This means that the syntax
of communication does not depend on CPU architecture or language details.
.IP \(bu
Because the interface, as in this example with
.CW /net
as the interface with networking facilities, looks like a piece of a
hierarchical file system, it can easily and nearly automatically
be exported to a remote machine and used from afar.
.RE
.LP
In particular, the Styx implementation encourages a natural way of providing
controlled access to networks.
Lucent, like many organizations, has an internal network not
accessible to the international Internet, and has a few
gateways between the inside and outside networks.
Only the gateway machines are connected to both, and they implement
the administrative controls for safety and security.
The advantage of the Styx model is the ease with which
the outside Internet can be used from inside.
If the
.CW /net
file tree described above is provided on a gateway machine,
it can be used as a remote file system from machines on the
inside.  This is safe, because this connection is one-way:
inside machines can see the external network interfaces,
but outside machines cannot see the inside.
.SH
.I "Example: debugging
.LP
A similar approach, borrowed and generalized from the UNIX
system [Kill], is useful for controlling and discovering the status
of the running processes in the operating system.
Here a directory
.CW /proc
contains a subdirectory for each process running on the
system; the names of the subdirectories correspond to
process IDs:
.P1
/proc/
	1/
		status
		ctl
		fd
		text
		mem
		...
	2/
		status
		ctl
		...
	...
.P2
The file names in the process directories refer to various aspects
of the corresponding process:
.CW status
contains information about the state of the process;
.CW ctl ,
when written, performs operations like pausing, restarting,
or killing the process;
.CW fd
names and describes the files open in the process;
.CW text
and
.CW mem
represent the program code and the data respectively.
.LP
Where possible, the information and control are again
represented as text strings.  For example, one line
from the
.CW status
file of a typical process might be
.DS
.CW "samterm dmr Read 0 20 2478910 0 0 ...
.DE
which shows the name of the program, the owner, its state, and several numbers
representing CPU time in various categories.
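.LP
Because the status line is plain text, a few lines of code suffice to
take it apart on any machine.
The following Python sketch is illustrative only: the function
.CW parse_status
and its dictionary keys are hypothetical, and it handles just the fields
visible in the sample line, leaving the elided trailing fields aside.
.P1
```python
def parse_status(line):
    """Split a status line into program, owner, state, and numeric fields."""
    fields = line.split()
    program, owner, state = fields[:3]
    numbers = [int(field) for field in fields[3:]]
    return {
        "program": program,
        "owner": owner,
        "state": state,
        "numbers": numbers,
    }


status = parse_status("samterm dmr Read 0 20 2478910 0 0")
print(status["program"], status["owner"], status["state"], status["numbers"])
```
.P2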
.LP
Once again, the approach provides several payoffs.
Because process information is represented in file form,
remote debugging (debugging programs on another machine)
is possible immediately by remote-mounting the
.CW /proc
tree on another machine.
The machine-independent representation of information means
that most operations work properly even if the remote machine
uses a different CPU architecture from the one doing the
debugging.
Most of the programs that deal
with status and control contain no machine-dependent parts
and are completely portable.
(A few are not, however: no attempt is made to render the
memory data or instructions in machine-independent form.)
.SH
.I "Example: PathStar\*(TM Access Server
.LP
The data shelf of Lucent's PathStar Access Server[PATH] uses Styx to connect
the line cards and other devices on the shelf to the control computer.
In fact, Styx is the protocol for high-level communication on the backplane.
.LP
The file system hierarchy served by the control computer includes a structure
like this:
.P1
/trip/
	config
	admin/
		ospfctl
		...
	boot/
		0/
			ctl
			eeprom
			memory
			msg
			pack
			alarm
			...
		1/
			...
/net/
	...
.P2
The directories under
.CW /net
are similar to those in Plan 9 or Inferno; they form the interface to the
external IP network.
The
.CW /trip
hierarchy represents the control structure of the shelf.
.LP
The subdirectories under
.CW /trip/boot
each provide access to one of the line cards or other devices in the shelf.
For example, to initialize a card one writes the text string
.CW reset
to the
.CW ctl
file of the card, while bootstrapping is done by copying the control
software for the card into the
.CW memory
file and writing a
.CW reset
message to
.CW ctl .
Once the line card is running,
the other files present an interface to the higher-level structure of the device:
.CW pack
is the port through which IP packets are transferred to and from the card,
.CW alarm
may be read to discover outstanding conditions on the card, and so on.
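.LP
Since bootstrapping is just two file writes, it can be expressed with
ordinary file I/O.
The Python sketch below is a hedged illustration: the function
.CW bootstrap_card
is a hypothetical name, the image contents are a placeholder, and a
temporary directory stands in for a card directory on a real shelf.
.P1
```python
import os
import tempfile


def bootstrap_card(card_dir, image):
    """Copy control software into the card's memory file, then reset it."""
    with open(os.path.join(card_dir, "memory"), "wb") as memory:
        memory.write(image)
    with open(os.path.join(card_dir, "ctl"), "w") as ctl:
        ctl.write("reset")


# A temporary directory stands in for a card directory such as /trip/boot/0.
card_dir = tempfile.mkdtemp()
bootstrap_card(card_dir, b"placeholder image")
with open(os.path.join(card_dir, "ctl")) as ctl:
    print(ctl.read())
```
.P2
The same function would serve unchanged if the card directory were
reached through a remote mount, since it uses only file operations.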
.LP
All this structure is exported from the shelf using Styx.
The external element management software (EMS) controls and monitors the
shelf using Styx operations.
For example, the EMS may read
.CW /trip/boot/7/alarm
and discover a diagnostic condition.
By reading and writing the other files under
.CW /trip/boot/7/ ,
the card may be taken off line, diagnosed, and perhaps reset or substituted,
all from the system running the EMS, which may be elsewhere in the network.
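.LP
As a rough sketch of the monitoring side, an EMS-like program could scan the
alarm file of every card with a simple loop.
The function name is hypothetical, and the sketch assumes that an empty
alarm file means no outstanding condition; the paper does not specify the
format of the file.

```python
import os
import tempfile


def outstanding_alarms(boot_dir):
    """Return {card name: alarm text} for each card reporting a condition."""
    alarms = {}
    for card in sorted(os.listdir(boot_dir)):
        alarm_path = os.path.join(boot_dir, card, "alarm")
        if not os.path.isfile(alarm_path):
            continue  # not a card directory, or no alarm file exported
        with open(alarm_path) as f:
            text = f.read().strip()
        if text:  # assumption: empty contents mean "nothing to report"
            alarms[card] = text
    return alarms
```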
.LP
Another example is the implementation of SNMP in the PathStar Access Server.
The functionality of SNMP is usually distributed through the various components
of a network, but here it is a straightforward adaptation process,
running anywhere in the network, that translates SNMP requests to Styx
operations in the network element.
Besides dramatically simplifying the implementation, the natural
ability for aggregation permits
a single process to provide SNMP access to an arbitrarily complex network subsystem.
Yet the structure is secure: the file-oriented nature of the operations makes it
easy to establish standard authentication and security controls to guarantee
that only trusted parties have access to the SNMP operations.
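.LP
The translation step can be pictured as a table lookup followed by a file read or write.
The sketch below is an assumption-laden illustration rather than the PathStar adapter:
the object identifiers, the table, and the function names are all invented,
and a local directory stands in for the name space exported by the network element.

```python
import os
import tempfile

# Hypothetical mapping from SNMP object identifiers to exported files.
# Both the OIDs and the choice of files are invented for illustration.
OID_TO_FILE = {
    "1.3.6.1.4.1.99999.1.7.1": "trip/boot/7/alarm",
    "1.3.6.1.4.1.99999.1.7.2": "trip/boot/7/ctl",
}


def snmp_get(root, oid):
    """Serve an SNMP GET by reading the corresponding file."""
    with open(os.path.join(root, OID_TO_FILE[oid])) as f:
        return f.read()


def snmp_set(root, oid, value):
    """Serve an SNMP SET by writing the corresponding file."""
    with open(os.path.join(root, OID_TO_FILE[oid]), "w") as f:
        f.write(value)
```

.LP
A single table of this kind can cover files from many network elements,
which is one way to read the remark about aggregation.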
.LP
There are local benefits to this architecture, as well.
Styx provides a single point in the design where control can be separated
from the details of the underlying fabric, isolating both from changes in the
other.  Components become more adaptable: software can be upgraded
without worrying about hidden dependencies on the hardware,
and new hardware may be installed without updating the control
software above.
.SH
Security issues
.LP
Styx provides several security mechanisms for
discouraging hostile or accidental actions that injure the integrity
of a system.
.LP
The underlying file-communication protocol includes
user and group identifiers that a server may check against
other authentication.
For example, a server may check, on a request to open a file,
that the user ID associated with the request is permitted to
perform the operation.
This mechanism is familiar from general-purpose operating
systems, and its use is well known.
It depends on passwords or stronger mechanisms for authenticating
the identity of clients.
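.LP
A minimal sketch of such a check on open follows.
It assumes Unix-style owner, group, and other permission bits and an
already authenticated user name; the class and function names are hypothetical,
and a real server may apply different precedence rules.

```python
from dataclasses import dataclass

READ, WRITE = 4, 2  # permission bits within one octal digit


@dataclass
class File:
    owner: str
    group: str
    mode: int  # e.g. 0o640: owner read/write, group read, others nothing


def may_open(f, user, groups, want):
    """Return True if `user` may open `f` with the access bits in `want`."""
    if user == f.owner:
        granted = (f.mode >> 6) & 7   # owner bits
    elif f.group in groups:
        granted = (f.mode >> 3) & 7   # group bits
    else:
        granted = f.mode & 7          # bits for everyone else
    return (granted & want) == want
```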
.LP
The Styx approach of providing remote resources
as file systems over a network encourages genuinely secure access
to the resources in a way transparent to applications, so that
authentication transactions need not be provided as part of each.
For example, in Inferno, the negotiation of an initial connection
between client and server may include installation of any of
several encrypting or message-digesting protocols that
supervise the channel.
All application use of the resources provided by the server
is then protected against interference, and the server
has strong assurance that its facilities are being used in
an authorized way.
This is relevant for general-purpose file servers
and, in the telephony field, is especially useful for safe
remote administration.
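.LP
The idea of a message-digesting layer that supervises the channel can be
sketched as follows.
This is not Inferno's protocol: HMAC-SHA256 from the Python standard library
is used as a stand-in, the function names are hypothetical, and the shared
key is assumed to have been agreed during connection setup.

```python
import hashlib
import hmac

DIGEST_LEN = hashlib.sha256().digest_size


def seal(key, message):
    """Append a keyed digest so the receiver can detect tampering."""
    return message + hmac.new(key, message, hashlib.sha256).digest()


def unseal(key, packet):
    """Verify the digest and return the original message."""
    if len(packet) < DIGEST_LEN:
        raise ValueError("packet too short")
    message, tag = packet[:-DIGEST_LEN], packet[-DIGEST_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("digest mismatch: message altered in transit")
    return message
```

.LP
Because the layer sits below the file protocol, applications reading and
writing files need no changes to benefit from it.
The sketch detects alteration only; it does not encrypt the data or guard against replayed messages.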
.SH
Summary
.LP
Presentation of resources as a piece of a possibly remote file system
is an attractive way of creating distributed systems that treads a
path between two extremes:
.IP 1
All communication with other parts of the system is by
explicit messages sent between components.
This communication differs in style from applications' use
of local resources.
.IP 2
All communication is by means of
closely shared resources: the CPU-addressable memory in
various parts is made directly available across a big network;
applications can read and write far-away objects exactly as
they do those on the same motherboard as their own CPU.
.LP
Something like the first of these extremes is usually more evident
in today's systems, although either the operating system or software
layered upon it usually papers over some of the rough spots.
The second remains more difficult to approach, because
networks (especially big ones like the Internet) are not very
reliable, and because
the machines on them are diverse in processor architecture
and in installed software.
.LP
The design plan described and advocated in this paper
lies between the two extremes.
It has these advantages:
.IP \(bu
.I "A simple, familiar programming model for reading and writing named files" .
File systems have well-defined naming, access, and permissions structures.
.IP \(bu
.I "Platform and language independence" .
Underlying access to resources is
at the file level, which is provided nearly everywhere, instead
of depending on facilities available only with particular languages
or operating systems.
C++ or Java classes and C libraries can be constructed
to access the facilities.
.IP \(bu
.I "A hierarchical naming and access control structure" .
This encourages clean
and well-structured design of resource naming and access.
.IP \(bu
.I "Easy testing and debugging" .
By using well-specified, narrow interfaces
at the file level, it is straightforward to observe the communication
between distributed entities.
.IP \(bu
.I "Low cost" .
Support software, at both client and server,
can be written in a few thousand lines
of code, and will occupy only a small amount of space in products.
.LP
This approach to building systems is successful in the general-purpose
systems Plan 9 and Inferno;
it has also been used to construct systems specialized for telephony, such
as Mantra[MAN] and the PathStar Access Server.
It supplies a coherent, extensible structure both to the internal communication
within a single system and to the external communication between heterogeneous
components of a large digital network.
.LP
.SH
References
.nr PS -1
.nr VS -1
.IP [NFS] 11
R. Sandberg, D. Goldberg, S. Kleiman, D. Walsh, and
B. Lyon,
``Design and Implementation of the Sun Network File System'',
.I "Proc. Summer 1985 USENIX Conf." ,
Portland, Oregon, June 1985,
pp. 119-130.
.IP [RFC] 11
Internet RFC 1094.
.IP [9man] 11
.I "Plan 9 Programmer's Manual" ,
Second Edition,
Vol. 1 and 2,
Bell Laboratories,
Murray Hill, N.J.,
1995.
.IP [Kill84] 11
T. J. Killian,
``Processes as Files'',
.I "Proc. Summer 1984 USENIX Conf." ,
Salt Lake City, Utah, June 1984, pp. 203-207.
.IP [Pike91] 11
R. Pike,
``8½, the Plan 9 Window System'',
.I "Proc. Summer 1991 USENIX Conf." ,
Nashville TN, June 1991, pp. 257-265.
.IP "[PPTTW93] " 11
R. Pike, D.L. Presotto, K. Thompson, H. Trickey, and P. Winterbottom, ``The Use of Name Spaces in Plan 9'',
.I "Op. Sys. Rev." ,
Vol. 27, No. 2, April 1993, pp. 72-76.
.IP [PrWi93] 11
D. L. Presotto and P. Winterbottom,
``The Organization of Networks in Plan 9'',
.I "Proc. Winter 1993 USENIX Conf." ,
San Diego, Calif., Jan. 1993, pp. 43-50.
.IP [Nee89] 11
R. Needham, ``Names'', in
.I "Distributed systems" ,
edited by S. Mullender,
Addison-Wesley,
Reading, Mass., 1989, pp. 89-101.
.IP [CIFS]
Paul Leach and Dan Perry, ``CIFS: A Common Internet File System'', Nov. 1996,
.I "http://www.microsoft.com/mind/1196/cifs.htm" .
.IP [INF1]
.I "Inferno Programmer's Manual" ,
Third Edition,
Vol. 1 and 2, Vita Nuova Holdings Limited, York, England, 2000.
.IP [INF2]
S.M. Dorward, R. Pike, D. L. Presotto, D. M. Ritchie, H. Trickey,
and P. Winterbottom, ``The Inferno Operating System'',
.I "Bell Labs Technical Journal" ,
Vol. 2,
No. 1,
Winter 1997.
.IP [MAN]
R. A. Lakshmi-Ratan,
``The Lucent Technologies Softswitch\-Realizing the Promise of Convergence'',
.I "Bell Labs Technical Journal" ,
Vol. 4,
No. 2,
April-June 1999,
pp. 174-196.
.IP [PATH]
J. M. Fossaceca, J. D. Sandoz, and P. Winterbottom,
``The PathStar Access Server: Facilitating Carrier-Scale Packet Telephony'',
.I "Bell Labs Technical Journal" ,
Vol. 3,
No. 4,
October-December 1998,
pp. 86-102.
.IP [Welc94]
B. Welch,
``A Comparison of Three Distributed File System Architectures: Vnode, Sprite, and Plan 9'',
.I "Computing Systems" ,
Vol. 7, No. 2, pp. 175-199 (1994).
.nr PS +1
.nr VS +1