.de SS
.NH 2
..
.EQ
delim $#
.EN
.TL
Security in Plan 9
.AU
Russ Cox, MIT LCS
.br
Eric Grosse, Bell Labs
.br
Rob Pike, Bell Labs
.br
Dave Presotto, Avaya Labs and Bell Labs
.br
Sean Quinlan, Bell Labs
.br
.CW {rsc,ehg,rob,presotto,seanq}@plan9.bell-labs.com
.AB
The security architecture of the Plan 9™
operating system has recently been redesigned
to address some technical shortcomings.
This redesign provided an opportunity also to make the system more
convenient to use securely.
Plan 9 has thus improved in two ways not usually seen together:
it has become more secure
.I and
easier to use.
.LP
The central component of the new architecture is a per-user
self-contained agent called
.CW factotum .
.CW Factotum
securely holds a
copy of the user's keys and negotiates authentication protocols, on
behalf of the user, with secure services around the network.
Concentrating security code in a single program offers several
advantages including: ease of update or repair to broken security
software and protocols; the ability to run secure services at a lower
privilege level; uniform management of keys for all services; and an
opportunity to provide single sign on, even to unchanged legacy
applications.
.CW Factotum
has an unusual architecture: it is implemented
as a Plan 9 file server.
.FS
To appear, in a slightly different form, in
.I
Proc. of the 2002 Usenix Security Symposium,
.R
San Francisco.
.FE
.AE
.NH 1
Introduction
.LP
Secure computing systems face two challenges:
first, they must employ sophisticated technology that is difficult to design
and prove correct; and second,
they must be easy for regular people to use.
The question of ease of use is sometimes neglected, but it is essential:
weak but easy-to-use security can be more effective than strong but
difficult-to-use security if it is more likely to be used.
People lock their front doors when they leave the house, knowing
full well that a burglar is capable of picking the lock (or avoiding
the door altogether); yet few would accept the cost and
awkwardness of a bank vault door on the
house even though that might reduce the probability of a robbery.
A related point is that users need a clear model of how the security
operates (if not how it actually provides security) in order to use it
well; for example, the clarity of a lock icon on a web browser
is offset by the confusing and typically insecure
steps for installing X.509 certificates.
.LP
The security architecture of the Plan 9
operating system
[Pike95]
has recently been redesigned to make it both more secure
and easier to use.
By
.I security
we mean three things:
first, the business of authenticating users and services;
second, the safe handling, deployment, and use of keys
and other secret information; and
third, the use of encryption and integrity checks
to safeguard communications
from prying eyes.
.LP
The old security architecture of Plan 9
had several engineering problems in common with other operating systems.
First, it had an inadequate notion of security domain.
Once a user provided a password to connect to a local file store,
the system required that the same password be used to access all the other file
stores.
That is, the system treated all network services as
belonging to the same security domain.
.LP
Second, the algorithms and protocols used in authentication,
by nature tricky and difficult to get right, were compiled into the
various applications, kernel modules, and file servers.
Changes and fixes to a security protocol
required that all components using that protocol be recompiled,
or at least relinked, and restarted.
.LP
Third, the file transport protocol, 9P
[Pike93],
that forms the core of
the Plan 9 system, had its authentication protocol embedded in its design.
This meant that fixing or changing the authentication used by 9P
required deep changes to the system.
If someone were to find a way to break the protocol, the system would
be wide open and very hard to fix.
.LP
These and a number of lesser problems, combined with a desire
for more widespread use of encryption in the system, spurred us to
rethink the entire security architecture of Plan 9.
.LP
The centerpiece of the new architecture is an agent,
called
.CW factotum ,
that handles the user's keys and negotiates all security
interactions with system services and applications.
Like a trusted assistant with a copy of the owner's keys,
.CW factotum
does all the negotiation for security and authentication.
Programs no longer need to be compiled with cryptographic
code; instead they communicate with
.CW factotum
agents
that represent distinct entities in the cryptographic exchange,
such as a user and server of a secure service.
If a security protocol needs to be added, deleted, or modified,
only
.CW factotum
needs to be updated for all system services
to be kept secure.
.LP
Building on
.CW factotum ,
we modified
secure services in the system to move
user authentication code into
.CW factotum ;
made authentication a separable component of the file server protocol;
deployed new security protocols;
designed a secure file store,
called
.CW secstore ,
to protect our keys but make them easy to get when they are needed;
designed a new kernel module to support transparent use of
Transport Layer Security (TLS)
[RFC2246];
and began using encryption for all communications within the system.
The overall architecture is illustrated in Figure 1a.
.KF
.EQ
gsize 9
.EN
.PS 3i

# Secstore
Sec:  box "Secstore" wid 1.3i ht .5i

# Terminal
Term0: box invis ht .1i with .e at Sec.e + (-1.1i, -.5i)
Term:  box wid 1.1i ht 1i with .nw at Term0.ne
Termlab: "\s-2Terminal\s+2" at Term.s + (0, -.15i)
FT: ellipse "$ F sub  T#" wid .40i ht .30i with .ne at Term.ne + (-.1i, -.1i)
PT: ellipse "$ P sub  T#" wid .6i ht .45i with .sw at Term.sw + (.2i, .2i)

# CPU
Cpu0: box invis ht .1i with .w at Term0.w + (3i, 0)
Cpu:  box wid 1.1i ht 1i with .nw at Cpu0.ne
Cpulab: "\s-2CPU Server\s+2" at Cpu.s + (0, -.15i)
FC: ellipse "$ F sub  C#" wid .40i ht .30i with .nw at Cpu.nw + (.1i, -.1i)
PC: ellipse "$ P sub  C#" wid .6i ht .45i with .se at Cpu.se + (-.2i, .2i)

# Authentication Server
Auth:  box dashed "Auth Server" wid 1.3i ht .5i with .e at Sec.e + (0, -2.3i)

# File Server
File0: box invis ht .1i with .w at Cpu0.w + (0, -1.5i)
File:  box wid 1.1i ht 1i with .nw at File0.ne
Filelab: "\s-2File Server\s+2" at File.s + (0, -.15i)
FF: ellipse "$ F sub  F#" wid .40i ht .30i with .nw at File.nw + (.1i, -.1i)
PF: ellipse "$ P sub  F#" wid .6i ht .45i with .se at File.se + (-.2i, .2i)

# Connections
line from PT.e + (0, +0.05i) to PC.w  + (0, +0.05i)
spline from PT.e + (0, -0.05i) right 1i then down 1.5i right .5i then right to PF.w + (0, -0.05i)
spline from PC.w + (0, -0.05i) left 1.1i then down 1.4i then right to PF.w + (0, 0.05i)
line <-> from FC.se to PC.nw
line <-> from FT.sw to PT.ne
line <-> from FF.se to PF.nw
spline <-> from Sec.e right .5i then down .655i then left to FT.e
#spline from Auth.e + (0, 0.05i) right .5i then up 1i then to FT.se
#spline from Auth.e + (0, 0.00i) right .7i then up 1i then to FC.sw
#spline from Auth.e + (0, -0.05i) right .5i then to FF.w
.PE
.LP
.ps 9
.vs 10
Figure 1a.  Components of the security architecture.
Each box is a (typically) separate machine; each ellipse a process.
The ellipses labeled $F sub X#
are
.CW factotum
processes; those labeled
$P sub X#
are the pieces and proxies of a distributed program.
The authentication server is one of several repositories for users' security information
that
.CW factotum
processes consult as required.
.CW Secstore
is a shared resource for storing private information such as keys;
.CW factotum
consults it for the user during bootstrap.
.sp
.KE
.EQ
gsize 11
.EN
.LP
Secure protocols and algorithms are well understood
and are usually not the weakest link in a system's security.
In practice, most security problems arise from buggy servers,
confusing software, or administrative oversights.
It is these practical problems that we are addressing.
Although this paper describes the algorithms and protocols we are using,
they are included mainly for concreteness.
Our main intent is to present a simple security architecture built
upon a small trusted code base that is easy to verify (whether by manual or
automatic means), easy to understand, and easy to use.
.LP
Although it is a subjective assessment,
we believe we have achieved our goal of ease of use.
That we have achieved
our goal of improved security is supported by our plan to
move our currently private computing environment onto the Internet
outside the corporate firewall.
The rest of this paper explains the architecture and how it is used,
to show why a system that is easy to use securely is also safe
enough to run in the open network.
.NH 1
An Agent for Security
.LP
One of the primary reasons for the redesign of the Plan 9
security infrastructure was to remove the authentication
method both from the applications and from the kernel.
Cryptographic code
is large and intricate, so it should
be packaged as a separate component that can be repaired or
modified without altering or even relinking applications
and services that depend on it.
If a security protocol is broken, it should be trivial to repair,
disable, or replace it on the fly.
Similarly, it should be possible for multiple programs to use
a common security protocol without embedding it in each program.
.LP
Some systems use dynamically linked libraries (DLLs) to address these configuration issues.
The problem with this approach is that it leaves
security code in the same address space as the program using it.
The interactions between the program and the DLL
can therefore accidentally or deliberately violate the interface,
weakening security.
Also, a program using a library to implement secure services
must run at a privilege level necessary to provide the service;
separating the security to a different program makes it possible
to run the services at a weaker privilege level, isolating the
privileged code to a single, more trustworthy component.
.LP
Following the lead of the SSH agent
[Ylon96],
we give each user
an agent process responsible
for holding and using the user's keys.
The agent program is called
.CW factotum
because of its similarity to the proverbial servant with the
power to act on behalf of his master because he holds the
keys to all the master's possessions.  It is essential that
.CW factotum
keep the keys secret and use them only in the owner's interest.
Later we'll discuss some changes to the kernel to reduce the possibility of
.CW factotum
leaking information inadvertently.
.LP
.CW Factotum
is implemented, like most Plan 9 services, as a file server.
It is conventionally mounted upon the directory
.CW /mnt/factotum ,
and the files it serves there are analogous to virtual devices that provide access to,
and control of, the services of the
.CW factotum .
The next few sections describe the design of
.CW factotum
and how it operates with the other pieces of Plan 9 to provide
security services.
.SS
Logging in
.LP
To make the discussions that follow more concrete,
we begin with a couple of examples showing how the
Plan 9 security architecture appears to the user.
These examples both involve a user
.CW gre
logging in after booting a local machine.
The user may or may not have a secure store in which
all his keys are kept.
If he does,
.CW factotum
will prompt him for the password to the secure store
and obtain keys from it, prompting only when a key
isn't found in the store.
Otherwise,
.CW factotum
must prompt for each key.
.LP
In the typescripts, \f6\s9\en\s0\fP
represents a literal newline
character typed to force a default response.
User input is in italics, and
long lines are folded and indented to fit.
.LP
This first example shows a user logging in without
help from the secure store.
First,
.CW factotum
prompts for a user name that the local kernel
will use:
.P1
user[none]: \f6\s9gre\s0\fP
.P2
(Default responses appear in square brackets.)
The kernel then starts accessing local resources
and requests, through
.CW factotum ,
a user/password pair to do so:
.P1
!Adding key: dom=cs.bell-labs.com
    proto=p9sk1
user[gre]: \f6\s9\en\s0\fP
password: \f6****\fP
.P2
Now the user is logged in to the local system, and
the mail client starts up:
.P1
!Adding key: proto=apop
    server=plan9.bell-labs.com
user[gre]: \f6\s9\en\s0\fP
password: \f6****\fP
.P2
.CW Factotum
is doing all the prompting and the applications
being started are not even touching the keys.
Note that it's always clear which key is being requested.
.LP
Now consider the same login sequence, but in the case where
.CW gre
has a secure store account:
.P1
user[none]: \f6\s9gre\s0\fP
secstore password: \f6*********\fP
STA PIN+SecurID: \f6*********\fP
.P2
That's the last
.CW gre
will hear from
.CW factotum
unless an attempt is made to contact
a system for which no key is kept in the secure store.
.SS
The factotum
.LP
Each computer running Plan 9 has one user id that owns all the
resources on that system \(em the scheduler, local disks,
network interfaces, etc.
That user, the
.I "host owner" ,
is the closest analogue in Plan 9 to a Unix
.CW root
account (although it is far weaker;
rather than having special powers, as its name implies the host owner
is just a regular user that happens to own the
resources of the local machine).
On a single-user system, which we call a terminal,
the host owner is the id of the terminal's user.
Shared servers such as CPU servers normally have a pseudo-user
that initially owns all resources.
At boot time, the Plan 9 kernel starts a
.CW factotum
executing as, and therefore with the privileges of,
the host owner.
.LP
New processes run as
the same user as the process which created them.
When a process must take on the identity of a new user,
such as to provide a login shell
on a shared CPU server,
it does so by proving to the host owner's
.CW factotum
that it is
authorized to do so.
This is done by running an
authentication protocol with
.CW factotum
to
prove that the process has access to secret information
which only the new user should possess.
For example, consider the setup in Figure 1a.
If a user on the terminal
wants to log in to the CPU server using the
Plan 9
.CW cpu
service
[Pike93],
then
$P sub T#
might be the
.CW cpu
client program and
$P sub C#
the
.CW cpu
server.
Neither $P sub C# nor $P sub T#
knows the details of the authentication.
They
do need to be able to shuttle messages back and
forth between the two
.CW factotums ,
but this is
a generic function easily performed without
knowing, or being able to extract, secrets in
the messages.
$P sub T#
will make a network connection to $P sub C#.
$P sub T#
and
$P sub C#
will then relay messages between
the
.CW factotum
owned by the user, $F sub T#,
and the one owned by the CPU server, $F sub C#,
until mutual authentication has been established.
Later
sections describe the RPC between
.CW factotum
and
applications and the library functions to support proxy operations.
.LP
The kernel always uses a single local instance of
.CW factotum ,
running as the
host owner, for
its authentication purposes, but
a regular user may start other
.CW factotum
agents.
In fact, the
.CW factotum
representing the user need not be
running on the same machine as its client.
For instance, it is easy for a user on a CPU server,
through standard Plan 9 operations,
to replace the
.CW /mnt/factotum
in the user's private file name space on the server
with a connection to the
.CW factotum
running on the terminal.
(The usual file system permissions prevent interlopers
from doing so maliciously.)
This permits secure operations on the CPU server to be
transparently validated by the user's own
.CW factotum ,
so
secrets need never leave the user's terminal.
The SSH agent
[Ylon96]
does much the
same with special SSH protocol messages, but
an advantage to making our agent a file system
is that we need no new mechanism to access our remote
agent; remote file access is sufficient.
.LP
Within
.CW factotum ,
each protocol is implemented as a state
machine with a generic interface, so protocols are in
essence pluggable modules, easy to add, modify, or drop.
Writing a message to and reading a message from
.CW factotum
each require a separate RPC and result in
a single state transition.
Therefore
.CW factotum
always runs to completion on every RPC and never blocks
waiting for input during any authentication.
Moreover, the number of simultaneous
authentications is limited only by the amount of memory we're
willing to dedicate to representing the state machines.
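.LP
The one-transition-per-RPC structure can be illustrated with a small sketch.
It is written in Python for brevity and is not taken from the
factotum sources; the class name, the state names, the registry, and the
"phase error" reply are all hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: each protocol is a small state machine, and every
# RPC (one write or one read) causes exactly one state transition.
# No call ever blocks waiting for the other side.

class EchoProto:
    """Toy protocol: expects one write, then answers one read."""

    def __init__(self):
        self.state = "need_write"
        self.saved = b""

    def write(self, data):
        if self.state != "need_write":
            return "phase error"
        self.saved = data
        self.state = "have_reply"
        return "ok"

    def read(self):
        if self.state != "have_reply":
            return "phase error"
        self.state = "done"
        return "ok " + self.saved.decode()


# Registry of pluggable modules, keyed by the value of the proto attribute.
PROTOS = {"echo": EchoProto}


def start(proto_name):
    """Begin a conversation; returns a fresh state machine, or None."""
    factory = PROTOS.get(proto_name)
    return factory() if factory else None


conv = start("echo")
replies = [conv.read(), conv.write(b"hello"), conv.read(), conv.read()]
print(replies)
```

Because each conversation is just an object holding its current state,
many conversations can be in progress at once, and adding a protocol amounts to
adding an entry to the registry.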
.LP
Authentication protocols are implemented only
within
.CW factotum ,
but adding and removing
protocols does require relinking the binary, so
.CW factotum
processes (but no others)
need to be restarted in order to take advantage of
new or repaired protocols.
.LP
At the time of writing,
.CW factotum
contains authentication
modules for the Plan 9 shared key protocol (p9sk1),
SSH's RSA authentication, passwords in the clear, APOP, CRAM, PPP's CHAP,
Microsoft PPP's MSCHAP, and VNC's challenge/response.
.SS
Local capabilities
.LP
A capability system, managed by the kernel, is used to empower
.CW factotum
to grant permission to another process to change its user id.
A
kernel device driver
implements two files,
.CW /dev/caphash
and
.CW /dev/capuse .
The write-only file
.CW /dev/caphash
can be opened only by the host owner, and only once.
.CW Factotum
opens this file immediately after booting.
.LP
To use the files,
.CW factotum
creates a string of the form
.I userid1\f(CW@\fPuserid2\f(CW@\fPrandom-string ,
uses SHA1 HMAC to hash
.I userid1\f(CW@\fPuserid2
with key
.I random-string ,
and writes that hash to
.CW /dev/caphash .
.CW Factotum
then passes the original string to another
process on the same machine, running
as user
.I userid1 ,
which
writes the string to
.CW /dev/capuse .
The kernel hashes the string and looks for
a matching hash in its list.
If it finds one,
the writing process's user id changes from
.I userid1
to
.I userid2 .
Once used, or if a timeout expires,
the capability is discarded by the kernel.
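.LP
The capability exchange can be simulated in a few lines.
The following Python sketch models the kernel's list of hashes as an in-memory set
instead of the real
.CW /dev/caphash
and
.CW /dev/capuse
files, and it omits the timeout; the function and class names are hypothetical.

```python
import hashlib
import hmac
import os


def make_capability(userid1, userid2):
    """Factotum side: return (capability string, hash for the kernel)."""
    rand = os.urandom(16).hex()            # the random-string
    users = f"{userid1}@{userid2}"
    # SHA1 HMAC of userid1@userid2, keyed by the random string.
    digest = hmac.new(rand.encode(), users.encode(), hashlib.sha1).digest()
    return f"{users}@{rand}", digest


class Kernel:
    """Toy stand-in for the kernel's capability device (no timeout)."""

    def __init__(self):
        self.hashes = set()

    def caphash(self, digest):
        # Models a write to /dev/caphash by the host owner's factotum.
        self.hashes.add(digest)

    def capuse(self, current_user, cap):
        # Models a write to /dev/capuse; returns the resulting user id.
        users, _, rand = cap.rpartition("@")
        userid1, _, userid2 = users.partition("@")
        digest = hmac.new(rand.encode(), users.encode(), hashlib.sha1).digest()
        if current_user == userid1 and digest in self.hashes:
            self.hashes.discard(digest)    # a capability works only once
            return userid2
        return current_user


kernel = Kernel()
cap, digest = make_capability("none", "gre")
kernel.caphash(digest)
print(kernel.capuse("none", cap))   # first use succeeds
print(kernel.capuse("none", cap))   # second use is rejected
```

Since only the hash is handed to the kernel in advance, a process that cannot
learn the random string cannot construct a string whose hash matches.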
.LP
The capabilities are local to the machine on which they are created.
Hence a
.CW factotum
running on one machine cannot pass capabilities
to processes on another and expect them to work.
.SS
Keys
.LP
We define the word
.I key
to mean not only a secret, but also a description of the
context in which that secret is to be used: the protocol,
server, user, etc. to which it applies.
That is,
a key is a combination of secret and descriptive information
used to authenticate the identities of parties
transmitting or receiving information.
The set of keys used
in any authentication depends both on the protocol and on
parameters passed by the program requesting the authentication.
.LP
Taking a tip from SDSI
[RiLa],
which represents security information as textual S-expressions,
keys in Plan 9 are represented as plain UTF-8 text.
Text is easily
understood and manipulated by users.
By contrast,
a binary or other cryptic format
can actually reduce overall security.
Binary formats are difficult for users to examine and can only be
cracked by special tools, themselves poorly understood by most users.
For example, very few people know or understand what's inside
their X.509 certificates.
Most don't even know where in the system to
find them.
Therefore, they have no idea what they are trusting, and why, and
are powerless to change their trust relationships.
Textual, centrally stored and managed keys are easier to use and safer.
.LP
Plan 9 has historically represented databases as attribute/value pairs,
since they are a good foundation for selection and projection operations.
.CW Factotum
therefore represents
the keys in the format
.I attribute\f(CW=\fPvalue ,
where
.I attribute
is an identifier, possibly with a single-character prefix, and
.I value
is an arbitrary quoted string.
The pairs themselves are separated by white space.
For example, a Plan 9 key and an APOP key
might be represented like this:
.P1
dom=bell-labs.com proto=p9sk1 user=gre
	!password='don''t tell'
proto=apop server=x.y.com user=gre
	!password='open sesame'
.P2
If a value is empty or contains white space or single quotes, it must be quoted;
quotes are represented by doubled single quotes.
Attributes that begin with an exclamation mark
.CW ! ) (
are considered
.I secret .
.CW Factotum
will never let a secret value escape its address space
and will suppress keyboard echo when asking the user to type one.
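.LP
The quoting rules are simple enough to parse by hand.
The following Python sketch is an illustration written for this description,
not the parser used by
.CW factotum ;
the function names are hypothetical, and an unterminated quote is
simply read to the end of the text.

```python
def parse_key(text):
    """Parse 'attr=value attr=value ...' into a list of (attr, value) pairs."""
    pairs = []
    i, n = 0, len(text)
    while i < n:
        if text[i].isspace():              # pairs are separated by white space
            i += 1
            continue
        start = i
        while i < n and text[i] != "=" and not text[i].isspace():
            i += 1
        attr = text[start:i]
        value = ""
        if i < n and text[i] == "=":
            i += 1
            if i < n and text[i] == "'":   # quoted value
                i += 1
                chars = []
                while i < n:
                    if text[i] == "'":
                        if i + 1 < n and text[i + 1] == "'":
                            chars.append("'")   # doubled quote is a literal quote
                            i += 2
                            continue
                        i += 1                  # closing quote
                        break
                    chars.append(text[i])
                    i += 1
                value = "".join(chars)
            else:                          # bare value runs to white space
                start = i
                while i < n and not text[i].isspace():
                    i += 1
                value = text[start:i]
        pairs.append((attr, value))
    return pairs


def public_attrs(pairs):
    """Drop secret attributes, those whose names begin with '!'."""
    return [(a, v) for a, v in pairs if not a.startswith("!")]


key = parse_key("dom=bell-labs.com proto=p9sk1 user=gre\n\t!password='don''t tell'")
print(key)
print(public_attrs(key))
```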
.LP
A program requesting authentication selects a key
by providing a
.I query ,
a list of elements to be matched by the key.
Each element in the list is either an
.I attribute\f(CW=\fPvalue
pair, which is satisfied by keys with
exactly that pair;
or an attribute followed by a question mark,
.I attribute\f(CW? ,
which is satisfied by keys with some pair specifying
the attribute.
A key matches a query if every element in the list
is satisfied.
For instance, to select the APOP key in the previous example,
an APOP client process might specify the query
.P1
server=x.y.com proto=apop
.P2
Internally,
.CW factotum 's
APOP module would add the requirements of
having
.CW user
and
.CW !password
attributes, forming the query
.P1
server=x.y.com proto=apop user? !password?
.P2
when searching for an appropriate key.
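.LP
The matching rule can be written down directly.
In this Python sketch a key is a dictionary from attribute to value, and the
query is assumed to contain no quoted values so that it can be split on white space;
both simplifications, and the function name, are ours.

```python
def matches(key, query):
    """Return True if every element of the query is satisfied by the key."""
    for elem in query.split():
        if elem.endswith("?"):
            # attribute? is satisfied by any pair naming that attribute.
            if elem[:-1] not in key:
                return False
        else:
            # attribute=value is satisfied only by exactly that pair.
            attr, _, value = elem.partition("=")
            if key.get(attr) != value:
                return False
    return True


KEYS = [
    {"dom": "bell-labs.com", "proto": "p9sk1", "user": "gre",
     "!password": "don't tell"},
    {"proto": "apop", "server": "x.y.com", "user": "gre",
     "!password": "open sesame"},
]

query = "server=x.y.com proto=apop user? !password?"
found = [k for k in KEYS if matches(k, query)]
print(len(found), found[0]["proto"])
```

Attributes that appear in the key but not in the query, such as
.CW dom
in the first key, play no part in the match.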
.LP
.CW Factotum
modules expect keys to have some well-known attributes.
For instance, the
.CW proto
attribute specifies the protocol module
responsible for using a particular key,
and protocol modules may expect other well-known attributes
(many expect keys to have
.CW !password
attributes, for example).
Additional attributes can be used as comments or for
further discrimination without intervention by
.CW factotum ;
for example, the APOP and IMAP mail clients conventionally
include a
.CW server
attribute to select an appropriate key for authentication.
.LP
Unlike in SDSI,
keys in Plan 9 have no nested structure.  This design
keeps the representation simple and straightforward.
If necessary, we could add a nested attribute
or, in the manner of relational databases, an attribute that
selects another tuple, but so far the simple design has been sufficient.
.LP
A simple common structure for all keys makes them easy for users
to administer,
but the set of attributes and their interpretation is still
protocol-specific and can be subtle.
Users may still
need to consult a manual to understand all details.
Many attributes
.CW proto , (
.CW user ,
.CW password ,
.CW server )
are self-explanatory and our short experience
has not uncovered any particular difficulty in handling keys.
Things
will likely get messier, however,
when we grapple with public
keys and their myriad components.
.SS
Protecting keys
.LP
Secrets must be prevented from escaping
.CW factotum .
There are a number of ways they could leak:
another process might be able to debug the agent process, the
agent might swap out to disk, or the process might willingly
disclose the key.
The last is the easiest to avoid:
secret information in a key is marked
as such, and
whenever
.CW factotum
prints keys or queries for new
ones, it is careful to avoid displaying secret information.
(The only exception to this is the
``plaintext password'' protocol, which consists
of sending the values of the
.CW user
and
.CW !password
attributes.
Only keys tagged with
.CW proto=pass
can have their passwords disclosed by this mechanism.)
.LP
Preventing the first two forms of leakage
requires help from the kernel.
In Plan 9, every process is
represented by a directory in the
.CW /proc
file system.
Using the files in this directory,
other processes could (with appropriate access permission) examine
.CW factotum 's
memory and registers.
.CW Factotum
is protected from processes of other users
by the default access bits of its
.CW /proc
directory.
However, we'd also like to protect the
agent from other processes owned by the same user,
both to avoid honest mistakes and to prevent
an unattended terminal being
exploited to discover secret passwords.
To do this, we added a control message to
.CW /proc
called
.CW private .
Once the
.CW factotum
process has written
.CW private
to its
.CW /proc/\f2pid\fP/ctl
file, no process can access
.CW factotum 's
memory
through
.CW /proc .
(Plan 9 has no other mechanism, such as
.CW /dev/kmem ,
for accessing a process's memory.)
.LP
Similarly, the agent's address space should not be
swapped out, to prevent discovering unencrypted
keys on the swapping media.
The
.CW noswap
control message in
.CW /proc
prevents this scenario.
Neither
.CW private
nor
.CW noswap
is specific to
.CW factotum .
User-level file servers such as
.CW dossrv ,
which interprets FAT file systems,
could use
.CW noswap
to keep their buffer caches from being
swapped to disk.
.LP
Despite our precautions, attackers might still
find a way to gain access to a process running as the host
owner on a machine.
Although they could not directly
access the keys, attackers could use the local
.CW factotum
to perform authentications for them.
In the case
of some keys, for example those locking bank
accounts, we want a way to disable or at least
detect such access.
That is the role of the
.CW confirm
attribute in a key.
Whenever a key with a
.CW confirm
attribute is accessed, the local user must
confirm use of the key via a local GUI.
The next section describes the actual mechanism.
.LP
We have not addressed leaks possible as a result of
someone rebooting or resetting a machine running
.CW factotum .
For example, someone could reset a machine
and reboot it with a debugger instead of a kernel,
allowing them to examine the contents of memory
and find keys.  We have not found a satisfactory
solution to this problem.
.SS
Factotum transactions
.LP
External programs manage
.CW factotum 's
internal key state
through its file interface,
writing textual
.CW key
and
.CW delkey
commands to the
.CW /mnt/factotum/ctl
file.
Both commands take a list of attributes as an argument.
.CW Key
creates a key with the given attributes, replacing any
extant key with an identical set of public attributes.
.CW Delkey
deletes all keys that match the given set of attributes.
Reading the
.CW ctl
file returns a list of keys, one per line, displaying only public attributes.
The following example illustrates these interactions.
.P1
% cd /mnt/factotum
% ls -l
-lrw------- gre gre 0 Jan 30 22:17 confirm
--rw------- gre gre 0 Jan 30 22:17 ctl
-lr-------- gre gre 0 Jan 30 22:17 log
-lrw------- gre gre 0 Jan 30 22:17 needkey
--r--r--r-- gre gre 0 Jan 30 22:17 proto
--rw-rw-rw- gre gre 0 Jan 30 22:17 rpc
% cat >ctl
key dom=bell-labs.com proto=p9sk1 user=gre
    !password='don''t tell'
key proto=apop server=x.y.com user=gre
    !password='bite me'
^D
% cat ctl
key dom=bell-labs.com proto=p9sk1 user=gre
key proto=apop server=x.y.com user=gre
% echo 'delkey proto=apop' >ctl
% cat ctl
key dom=bell-labs.com proto=p9sk1 user=gre
%
.P2
(A file with the
.CW l
bit set can be opened by only one process at a time.)
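.LP
The semantics of the two commands, and of reading the file back, can be
modeled with a small in-memory key ring.
This Python sketch only imitates the behavior described above;
the class and method names are hypothetical, keys are dictionaries rather than
text, and values are assumed not to need quoting when printed.

```python
class KeyRing:
    """Toy model of the key state managed through the ctl file."""

    def __init__(self):
        self.keys = []                     # each key maps attr -> value

    @staticmethod
    def _public(key):
        return {a: v for a, v in key.items() if not a.startswith("!")}

    def key(self, attrs):
        # Replace any extant key with an identical set of public attributes.
        pub = self._public(attrs)
        self.keys = [k for k in self.keys if self._public(k) != pub]
        self.keys.append(dict(attrs))

    def delkey(self, attrs):
        # Delete every key that carries all of the given attributes.
        self.keys = [k for k in self.keys
                     if not all(k.get(a) == v for a, v in attrs.items())]

    def read_ctl(self):
        # One key per line, public attributes only.
        lines = []
        for k in self.keys:
            pub = self._public(k)
            lines.append("key " + " ".join(f"{a}={v}" for a, v in pub.items()))
        return "\n".join(lines)


ring = KeyRing()
ring.key({"dom": "bell-labs.com", "proto": "p9sk1", "user": "gre",
          "!password": "don't tell"})
ring.key({"proto": "apop", "server": "x.y.com", "user": "gre",
          "!password": "bite me"})
print(ring.read_ctl())
ring.delkey({"proto": "apop"})
print(ring.read_ctl())
```

The two calls to
.CW read_ctl
print the same lines as the two
.CW "cat ctl"
commands in the typescript.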
.LP
The heart of the interface is the
.CW rpc
file.
Programs authenticate with
.CW factotum
by writing a request to the
.CW rpc
file
and reading back the reply; this sequence is called an RPC
.I transaction .
Requests and replies have the same format:
a textual verb possibly followed by arguments,
which may be textual or binary.
The most common reply verb is
.CW ok ,
indicating success.
An RPC session begins with a
.CW start
transaction; the argument is a key query as described
earlier.
Once started, an RPC conversation usually consists of
a sequence of
.CW read
and
.CW write
transactions.
If the conversation is successful, an
.CW authinfo
transaction will return information about
the identities learned during the transaction.
The
.CW attr
transaction returns a list of attributes for the current
conversation; the list includes any attributes given in
the
.CW start
query as well as any public attributes from keys being used.
.LP
As an example of the
.CW rpc
file in action, consider a mail client
connecting to a mail server and authenticating using
the POP3 protocol's APOP challenge-response command.
There are four programs involved: the mail client $P sub C#, the client
.CW factotum
$F sub C#, the mail server $P sub S#, and the server
.CW factotum
$F sub S#.
All authentication computations are handled by the
.CW factotum
processes.
The mail programs' role is just to relay messages.
.LP
At startup, the mail server at
.CW x.y.com
begins an APOP conversation
with its
.CW factotum
to obtain the banner greeting, which
includes a challenge:
.P1
$P sub S -> F sub S#: start proto=apop role=server
$F sub S -> P sub S#: ok
$P sub S -> F sub S#: read
$F sub S -> P sub S#: ok +OK POP3 \f2challenge\fP
.P2
Having obtained the challenge, the server greets the client:
.P1
$P sub S -> P sub C#: +OK POP3 \f2challenge\fP
.P2
The client then uses an APOP conversation with its
.CW factotum
to obtain a response:
.P1
$P sub C -> F sub C#: start proto=apop role=client
            server=x.y.com
$F sub C -> P sub C#: ok
$P sub C -> F sub C#: write +OK POP3 \f2challenge\fP
$F sub C -> P sub C#: ok
$P sub C -> F sub C#: read
$F sub C -> P sub C#: ok APOP gre \f2response\fP
.P2
964.CW Factotum
965requires that
966.CW start
967requests include a
968.CW proto
969attribute, and the APOP module requires an additional
970.CW role
971attribute, but the other attributes are optional and only
972restrict the key space.
973Before responding to the
974.CW start
975transaction, the client
976.CW factotum
977looks for a key to
978use for the rest of the conversation.
979Because of the arguments in the
980.CW start
981request, the key must have public attributes
982.CW proto=apop
983and
984.CW server=x.y.com ;
985as mentioned earlier,
986the APOP module additionally requires that the key have
987.CW user
988and
989.CW !password
990attributes.
991Now that the client has obtained a response
992from its
993.CW factotum ,
994it echoes that response to the server:
995.P1
996$P sub C -> P sub S#: APOP gre \f2response\fP
997.P2
998Similarly, the server passes this message to
999its
1000.CW factotum
1001and obtains another to send back.
1002.P1
1003$P sub S -> F sub S#: write APOP gre \f2response\fP
1004$F sub S -> P sub S#: ok
1005$P sub S -> F sub S#: read
1006$F sub S -> P sub S#: ok +OK welcome
1007
1008$P sub S -> P sub C#: +OK welcome
1009.P2
1010Now the authentication protocol is done, and
1011the server can retrieve information
1012about what the protocol established.
1013.P1
1014$P sub S -> F sub S#: authinfo
1015$F sub S -> P sub S#: ok client=gre
1016            capability=\f2capability\fP
1017.P2
1018The
1019.CW authinfo
1020data is a list of
1021.I attr\f(CW=\fPvalue
1022pairs, here a client user name and a capability.
1023(Protocols that establish shared secrets or provide
1024mutual authentication indicate this by adding
1025appropriate
1026.I attr\f(CW=\fPvalue
1027pairs.)
1028The capability can be used by the server to change its
1029identity to that of the client, as described earlier.
1030Once it has changed its identity, the server can access and serve
1031the client's mailbox.
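.LP
Every reply in the exchange above has the same shape: a verb such as
.CW ok ,
optionally followed by a space and an argument.
As a minimal sketch, written in portable C rather than Plan 9 C,
a program relaying such replies might split them as follows.
The names
.CW Reply ,
.CW splitreply ,
and
.CW replyok
are hypothetical and belong to no Plan 9 library,
and the single-space separator is an assumption read off the transcripts.
Lengths are carried explicitly because the argument may be binary.
.P1
.ta 3n +3n +3n +3n
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One factotum-style reply: a verb, then an optional argument. */
typedef struct Reply Reply;
struct Reply {
	const char *verb;	/* points into buf; not NUL-terminated */
	size_t nverb;
	const char *arg;	/* may be binary; NULL if absent */
	size_t narg;
};

/*
 * Split buf[0..n) at the first space (assumed separator).
 * Nothing is copied; r points into buf.
 */
void
splitreply(const char *buf, size_t n, Reply *r)
{
	const char *sp;

	assert(buf != NULL && r != NULL);
	sp = memchr(buf, ' ', n);
	r->verb = buf;
	if(sp == NULL){
		r->nverb = n;
		r->arg = NULL;
		r->narg = 0;
		return;
	}
	r->nverb = sp - buf;
	r->arg = sp + 1;
	r->narg = n - r->nverb - 1;
}

/* Report whether the reply verb is exactly "ok". */
int
replyok(const Reply *r)
{
	return r->nverb == 2 && memcmp(r->verb, "ok", 2) == 0;
}
```
.P2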
1032.LP
1033Two more files provide hooks for a graphical
1034.CW factotum
1035control interface.
1036The first,
1037.CW confirm ,
1038allows the user detailed control over the use of certain keys.
1039If a key has a
1040.CW confirm=
1041attribute, then the user must approve each use of the key.
1042A separate program with a graphical interface reads from the
1043.CW confirm
1044file to see when a confirmation is necessary.
1045The read blocks until a key usage needs to be approved, whereupon
1046it will return a line of the form
1047.P1
1048confirm tag=1 \f2attributes\fP
1049.P2
1050requesting permission to use the key with those public attributes.
1051The graphical interface then prompts the user for approval
1052and writes back
1053.P1
1054tag=1 answer=yes
1055.P2
1056(or
1057.CW answer=no ).
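.LP
The confirmation exchange is simple enough to sketch in portable C.
The helpers below are hypothetical, not the code of any existing confirmation program:
.CW confirmtag
pulls the tag out of a line read from the
.CW confirm
file, and
.CW confirmreply
formats the answer to write back.
Treating the tag as a non-negative decimal number is an assumption based on the example line.
.P1
.ta 3n +3n +3n +3n
```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the decimal tag from a line of the form
 * "confirm tag=N attributes".  Returns -1 if the line
 * does not have that shape.
 */
long
confirmtag(const char *line)
{
	static const char prefix[] = "confirm tag=";
	char *end;
	long tag;

	if(strncmp(line, prefix, sizeof prefix - 1) != 0)
		return -1;
	line += sizeof prefix - 1;
	tag = strtol(line, &end, 10);
	if(end == line || tag < 0)
		return -1;
	return tag;
}

/*
 * Format the answer to write back: "tag=N answer=yes"
 * or "tag=N answer=no".  Returns what snprintf returns.
 */
int
confirmreply(char *buf, size_t n, long tag, int approve)
{
	assert(buf != NULL && n > 0);
	return snprintf(buf, n, "tag=%ld answer=%s", tag,
		approve ? "yes" : "no");
}
```
.P2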
1058.LP
1059The second file,
1060.CW needkey ,
1061diverts key requests.
1062In the APOP example, if a suitable key had not been found
1063during the
1064.CW start
1065transaction,
1066.CW factotum
would have indicated failure by
returning a response describing
the key it needed:
1070.P1
1071$F sub C -> P sub C#: needkey proto=apop
1072    server=x.y.com user? !password?
1073.P2
1074A typical client would then prompt the user for the desired
1075key information, create a new key via the
1076.CW ctl
1077file, and then reissue the
1078.CW start
1079request.
1080If the
1081.CW needkey
1082file is open,
1083then instead of failing, the transaction
1084will block, and the next read from the
1085.CW /mnt/factotum/needkey
1086file will return a line of the form
1087.P1
needkey tag=1 \f2attributes\fP
1089.P2
1090The graphical interface then prompts the user for the needed
1091key information, creates the key via the
1092.CW ctl
1093file, and writes back
1094.CW tag=1
1095to resume the transaction.
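.LP
A program that answers
.CW needkey
requests must first work out which attributes it has to ask the user for.
The following sketch, in portable C, scans an attribute list and collects the names marked with a trailing question mark, as in the
.CW "user? !password?"
example.
The function name
.CW wanted
and the fixed name length are hypothetical choices for illustration;
the leading
.CW !
is kept so that a caller could avoid echoing secret values.
.P1
.ta 3n +3n +3n +3n
```c
#include <assert.h>
#include <string.h>

enum { Maxname = 32 };

/*
 * Scan a space-separated attribute list and copy out the
 * names that end in '?', which mark values still to be
 * supplied.  A leading '!' (a secret attribute) is kept.
 * Returns the number of names stored, at most max.
 */
int
wanted(const char *attrs, char names[][Maxname], int max)
{
	const char *p, *e;
	size_t n;
	int k;

	assert(attrs != NULL && names != NULL);
	k = 0;
	for(p = attrs; *p != 0 && k < max; p = e){
		while(*p == ' ')
			p++;
		e = p;
		while(*e != 0 && *e != ' ')
			e++;
		n = e - p;
		if(n >= 2 && e[-1] == '?' && n - 1 < Maxname){
			memcpy(names[k], p, n - 1);
			names[k][n - 1] = 0;
			k++;
		}
	}
	return k;
}
```
.P2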
1096.LP
1097The remaining files are informational and used for debugging.
1098The
1099.CW proto
1100file contains a list of supported protocols (to see what protocols the
1101system supports,
1102.CW cat
1103.CW /mnt/factotum/proto ),
1104and the
1105.CW log
1106file contains a log of operations and debugging output
1107enabled by a
1108.CW debug
1109control message.
1110.LP
1111The next few sections explain how
1112.CW factotum
1113is used by system services.
1114.NH 1
1115Authentication in 9P
1116.LP
1117Plan 9 uses a remote file access protocol, 9P
1118[Pike93],
1119to connect to resources such as the
1120file server and remote processes.
1121The original design for 9P included special messages at the start of a conversation
1122to authenticate the user.
1123Multiple users can share a single connection, such as when a CPU server
1124runs processes for many users connected to a single file server,
1125but each must authenticate separately.
1126The authentication protocol, similar to that of Kerberos
1127[Stei88],
1128used a sequence of messages passed between client, file server, and authentication
1129server to verify the identities of the user, calling machine, and serving machine.
1130One major drawback to the design was that the authentication method was defined by 9P
1131itself and could not be changed.
1132Moreover, there was no mechanism to relegate
1133authentication to an external (trusted) agent,
1134so a process implementing 9P needed, besides support for file service,
1135a substantial body of cryptographic code to implement a handful of startup messages
1136in the protocol.
1137.LP
1138A recent redesign of 9P
1139addressed a number of file service issues outside the scope of this paper.
1140On issues of authentication, there were two goals:
1141first, to remove details about authentication from the
1142protocol itself; second, to allow an external program to execute the authentication
1143part of the protocol.
1144In particular, we wanted a way to quickly incorporate
1145ideas found in other systems such as SFS
1146[Mazi99].
1147.LP
1148Since 9P is a file service protocol, the solution involved creating a new type of file
1149to be served: an
1150.I authentication
1151.I file .
1152Connections to a 9P service begin in a state that
1153allows no general file access but permits the client
1154to open an authentication file
1155by sending a special message, generated by the new
1156.CW fauth
1157system call:
1158.P1
1159afd = fauth(int fd, char *servicename);
1160.P2
1161Here
1162.CW fd
1163is the user's file descriptor for the established network connection to the 9P server
1164and
1165.CW servicename
1166is the name of the desired service offered on that server, typically the file subsystem
1167to be accessed.
1168The returned file descriptor,
1169.CW afd ,
1170is a unique handle representing the authentication file
1171created for this connection to authenticate to
1172this service; it is analogous to a capability.
1173The authentication file represented by
1174.CW afd
1175is not otherwise addressable on the server, such as through
1176the file name hierarchy.
1177In all other respects, it behaves like a regular file;
1178most important, it accepts standard read and write operations.
1179.LP
1180To prove its identity, the user process (via
1181.CW factotum )
1182executes the authentication protocol,
1183described in the next section of this paper,
1184over the
1185.CW afd
1186file descriptor with ordinary reads and writes.
1187When client and server have successfully negotiated, the authentication file
1188changes state so it can be used as evidence of authority in
1189.CW mount .
1190.LP
1191Once identity is established, the process presents the (now verified)
1192.CW afd
1193as proof of identity to the
1194.CW mount
1195system call:
1196.P1
int mount(int fd, int afd, char *mountpoint,
          int flag, char *servicename);
1199.P2
1200If the
1201.CW mount
1202succeeds, the user now
1203has appropriate permissions for the file hierarchy made
1204visible at the mount point.
1205.LP
1206This sequence of events has several advantages.
1207First, the actual authentication protocol is implemented using regular reads and writes,
not special 9P messages, so
its messages can be processed, forwarded, proxied, and so on by
1210any 9P agent without special arrangement.
1211Second, the business of negotiating the authentication by reading and writing the
1212authentication file can be delegated to an outside agent, in particular
1213.CW factotum ;
1214the programs that implement the client and server ends of a 9P conversation need
1215no authentication or cryptographic code.
1216Third,
1217since the authentication protocol is not defined by 9P itself, it is easy to change and
1218can even be negotiated dynamically.
1219Finally, since
1220.CW afd
1221acts like a capability, it can be treated like one:
1222handed to another process to give it special permissions;
1223kept around for later use when authentication is again required;
1224or closed to make sure no other process can use it.
1225.LP
1226All these advantages stem from moving the authentication negotiation into
1227reads and writes on a separate file.
1228As is often the case in Plan 9,
1229making a resource (here authentication) accessible with a file-like interface
1230reduces
1231.I a
1232.I priori
1233the need for special interfaces.
1234.LP
1235.SS
1236Plan 9 shared key protocol
1237.LP
1238In addition to the various standard protocols supported by
1239.CW factotum ,
1240we use a shared key protocol for native
1241Plan 9 authentication.
1242This protocol provides backward compatibility with
1243older versions of the system.  One reason for the new
1244architecture is to let us replace such protocols
1245in the near future with more cryptographically secure ones.
1246.LP
1247.I P9sk1
1248is a shared key protocol that uses tickets much like those
1249in the original Kerberos.
1250The difference is that we've
1251replaced the expiration time in Kerberos tickets with
1252a random nonce parameter and a counter.
1253We summarize it here:
1254.P1
1255$C -> S: ~~ "nonce" sub C#
1256$S -> C: ~~ "nonce" sub S , "uid" sub S , "domain" sub S#
1257
1258$C -> A: ~~ "nonce" sub S , "uid" sub S , "domain" sub S , "uid" sub C ,#
1259         $"factotum" sub C#
$A -> C: ~~ K sub C roman "{" "nonce" sub S , "uid" sub C , "uid" sub S , K sub n roman "}",#
         $K sub S roman "{" "nonce" sub S , "uid" sub C , "uid" sub S , K sub n roman "}"#
1262
1263$C -> S: ~~ K sub S roman "{" "nonce" sub S , "uid" sub C , "uid" sub S , K sub n roman "}",#
1264         $K sub n roman "{" "nonce" sub S , "counter" roman "}"#
1265$S -> C: ~~ K sub n roman "{" "nonce" sub C , "counter" roman "}"#
1266.P2
1267(Here $K roman "{" x roman "}"# indicates $x# encrypted with
1268DES key $K#.)
1269The first two messages exchange nonces and server identification.
1270After this initial exchange, the client contacts the authentication
1271server to obtain a pair of encrypted tickets, one encrypted with
1272the client key and one with the server key.
1273The client relays the server ticket to the server.
1274The server believes that the ticket is new
1275because it contains
1276$"nonce" sub S#
1277and that the ticket is from the authentication
1278server because it is encrypted in the server key $K sub S#.
1279The ticket is basically a statement from the authentication
1280server that now $"uid" sub C# and $"uid" sub S# share a
1281secret $K sub n#.
1282The authenticator $K sub n roman "{" "nonce" sub S , "counter" roman "}"#
1283convinces the server that the client knows $K sub n# and thus
1284must be $"uid" sub C#.
1285Similarly, authenticator $K sub n roman "{" "nonce" sub C , "counter" roman "}"#
1286convinces the client that the server knows $K sub n# and thus
1287must be $"uid" sub S#.
1288Tickets can be reused, without contacting the authentication
1289server again, by incrementing the counter before each
1290authenticator is generated.
1291.LP
1292In the future we hope to introduce a public key version of
1293p9sk1,
1294which would allow authentication even
1295when the authentication server is not available.
1296.SS
1297The authentication server
1298.LP
1299Each Plan 9 security domain has an authentication server (AS)
1300that all users trust to keep the complete set of shared keys.
1301It also offers services for users and administrators to manage the
1302keys, create and disable accounts, and so on.
1303It typically runs on
1304a standalone machine with few other services.
1305The AS comprises two services,
1306.CW keyfs
1307and
1308.CW authsrv .
1309.LP
1310.CW Keyfs
1311is a user-level file system that manages an
1312encrypted database of user accounts.
1313Each account is represented by a directory containing the
1314files
1315.CW key ,
1316containing the Plan 9 key for p9sk1;
1317.CW secret
1318for the challenge/response protocols (APOP, VNC, CHAP, MSCHAP,
1319CRAM);
1320.CW log
1321for authentication outcomes;
1322.CW expire
1323for an expiration time; and
1324.CW status .
1325If the expiration time passes,
1326if the number of successive failed authentications
1327exceeds 50, or if
1328.CW disabled
1329is written to the status file,
1330any attempt to access the
1331.CW key
1332or
1333.CW secret
1334files will fail.
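.LP
The three lockout rules can be restated as a small predicate.
This is a sketch in portable C under stated assumptions, not the
.CW keyfs
source: the
.CW Account
structure and the function
.CW keysreadable
are hypothetical, and using an expiration time of zero to mean ``never'' is a convention invented for the example.
.P1
.ta 3n +3n +3n +3n
```c
#include <assert.h>
#include <string.h>
#include <time.h>

enum { Maxbad = 50 };	/* limit on successive failures, from the text */

/* Hypothetical in-memory view of one keyfs account. */
typedef struct Account Account;
struct Account {
	time_t expire;	/* expiration time; 0 here means never */
	int nbad;	/* successive failed authentications */
	char status[16];	/* contents of the status file */
};

/*
 * Report whether reads of the key and secret files should
 * succeed at time now, applying the three rules in the text.
 */
int
keysreadable(const Account *a, time_t now)
{
	assert(a != NULL);
	if(a->expire != 0 && now > a->expire)
		return 0;
	if(a->nbad > Maxbad)
		return 0;
	if(strcmp(a->status, "disabled") == 0)
		return 0;
	return 1;
}
```
.P2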
1335.LP
1336.CW Authsrv
1337is a network service that brokers shared key authentications
1338for the protocols p9sk1, APOP, VNC, CHAP, MSCHAP,
1339and CRAM.  Remote users can also call
1340.CW authsrv
1341to change their passwords.
1342.LP
1343The
1344p9sk1
1345protocol was described in the previous
1346section.
1347The challenge/response protocols differ
1348in detail but all follow the general structure:
1349.P1
1350$C -> S: ~~ "nonce" sub C#
$S -> C: ~~ "nonce" sub S , "uid" sub S , "domain" sub S#
$C -> A: ~~ "nonce" sub S , "uid" sub S , "domain" sub S ,#
         $"hostid" sub C , "uid" sub C#
$A -> C: ~~ K sub C roman "{" "nonce" sub S , "uid" sub C , "uid" sub S , K sub n roman "}",#
         $K sub S roman "{" "nonce" sub S , "uid" sub C , "uid" sub S , K sub n roman "}"#
$C -> S: ~~ K sub S roman "{" "nonce" sub S , "uid" sub C , "uid" sub S , K sub n roman "}",#
1357         $K sub n roman "{" "nonce" sub S roman "}"#
1358$S -> C: ~~ K sub n roman "{" "nonce" sub C roman "}"#
1359.P2
1360The password protocol is:
1361.P1
1362$C -> A: ~~ "uid" sub C#
$A -> C: ~~ K sub C roman "{" K sub n roman "}"#
1364$C -> A: ~~ K sub n roman "{" "password" sub "old" , "password" sub "new" roman "}"#
1365$A -> C: ~~ OK#
1366.P2
1367To avoid replay attacks, the pre-encryption
1368clear text for each of the protocols (as well as for p9sk1) includes
1369a tag indicating the encryption's role in the
1370protocol.  We elided them in these outlines.
1371.SS
1372Protocol negotiation
1373.LP
1374Rather than require particular protocols for particular services,
1375we implemented a negotiation metaprotocol,
1376.I p9any ,
1377which chooses the actual authentication protocol to use.
1378P9any
1379is used now by all native services on Plan 9.
1380.LP
1381The metaprotocol is simple.  The callee sends a
1382null-terminated string of the form:
1383.P1
1384v.$n# $proto sub 1#@$domain sub 1# $proto sub 2#@$domain sub 2# ...
1385.P2
1386where
1387.I n
1388is a decimal version number, $proto sub k#
1389is the name of a protocol for which the
1390.CW factotum
1391has a key, and $domain sub k#
1392is the name of the domain in which the key is
1393valid.
1394The caller then responds
1395.P1
1396\f2proto\fP@\f2domain\fP
1397.P2
1398indicating its choice.
1399Finally the callee responds
1400.P1
1401OK
1402.P2
1403Any other string indicates failure.
1404At this point the chosen protocol commences.
1405The final fixed-length reply is used to make it easy to
1406delimit the I/O stream should the chosen protocol
1407require the caller rather than the callee to send the first message.
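.LP
The caller's side of this negotiation can be sketched in portable C.
The function
.CW p9anychoose
and the
.CW Have
structure are hypothetical, and taking the first offered pair for which a key is held is an assumed policy;
the text does not say how the caller ranks its choices.
.P1
.ta 3n +3n +3n +3n
```c
#include <assert.h>
#include <string.h>

/* Hypothetical description of one key the caller holds. */
typedef struct Have Have;
struct Have {
	const char *proto;
	const char *dom;
};

/*
 * Pick the first proto@domain pair in offer for which the
 * caller has a matching key, and copy it to choice.
 * The leading "v.n" version word is skipped.
 * Returns 0 on success, -1 if nothing is acceptable.
 */
int
p9anychoose(const char *offer, const Have *have, int nhave,
	char *choice, size_t nchoice)
{
	const char *p, *e, *at;
	size_t np, nd;
	int i;

	assert(offer != NULL && choice != NULL);
	if(strncmp(offer, "v.", 2) != 0)
		return -1;
	p = strchr(offer, ' ');
	while(p != NULL && *p != 0){
		while(*p == ' ')
			p++;
		e = p + strcspn(p, " ");
		at = memchr(p, '@', e - p);
		if(at != NULL && (size_t)(e - p) < nchoice){
			np = at - p;
			nd = e - at - 1;
			for(i = 0; i < nhave; i++)
				if(strlen(have[i].proto) == np
				&& strlen(have[i].dom) == nd
				&& memcmp(have[i].proto, p, np) == 0
				&& memcmp(have[i].dom, at + 1, nd) == 0){
					memcpy(choice, p, e - p);
					choice[e - p] = 0;
					return 0;
				}
		}
		p = e;
	}
	return -1;
}
```
.P2
.LP
On success,
.CW choice
holds the
.I proto\f(CW@\fPdomain
string to send back to the callee.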
1408.LP
1409With this negotiation metaprotocol, the underlying
1410authentication protocols used for Plan 9 services
1411can be changed under any application just
1412by changing the keys known by the
1413.CW factotum
1414agents at each end.
1415.LP
1416P9any is vulnerable to man in the middle attacks
1417to the extent that the attacker may constrain the
1418possible choices by changing the stream.  However,
1419we believe this is acceptable since the attacker
1420cannot force either side to choose algorithms
1421that it is unwilling to use.
1422.NH 1
1423Library Interface to Factotum
1424.LP
1425Although programs can access
1426.CW factotum 's
1427services through its file system interface,
1428it is more common to use a C library that
1429packages the interaction.
1430There are a number of routines in the library,
1431not all of which are relevant here, but a few
1432examples should give their flavor.
1433.LP
1434First, consider the problem of mounting a remote file server using 9P.
1435An earlier discussion showed how the
1436.CW fauth
1437and
1438.CW mount
1439system calls use an authentication file,
1440.CW afd ,
1441as a capability,
1442but not how
1443.CW factotum
1444manages
1445.CW afd .
1446The library contains a routine,
1447.CW amount
1448(authenticated mount), that is used by most programs in preference to
1449the raw
1450.CW fauth
1451and
1452.CW mount
1453calls.
1454.CW Amount
1455engages
1456.CW factotum
1457to validate
1458.CW afd ;
1459here is the complete code:
1460.P1
1461.ta 3n +3n +3n +3n
1462int
1463amount(int fd, char *mntpt,
1464	int flags, char *aname)
1465{
1466	int afd, ret;
1467	AuthInfo *ai;
1468
1469	afd = fauth(fd, aname);
1470	if(afd >= 0){
1471		ai = auth_proxy(afd, amount_getkey,
1472			"proto=p9any role=client");
1473		if(ai != NULL)
1474			auth_freeAI(ai);
1475	}
1476	ret = mount(fd, afd, mntpt,
1477		flags, aname);
1478	if(afd >= 0)
1479		close(afd);
1480	return ret;
1481}
1482.P2
1483where parameter
1484.CW fd
1485is a file descriptor returned by
1486.CW open
1487or
1488.CW dial
1489for a new connection to a file server.
1490The conversation with
1491.CW factotum
1492occurs in the call to
1493.CW auth_proxy ,
1494which specifies, as a key query,
1495which authentication protocol to use
1496(here the metaprotocol
1497.CW p9any )
1498and the role being played
1499.CW client ). (
1500.CW Auth_proxy
1501will read and write the
1502.CW factotum
1503files, and the authentication file descriptor
1504.CW afd ,
1505to validate the user's right to access the service.
1506If the call is successful, any auxiliary data, held in an
1507.CW AuthInfo
1508structure, is freed.
1509In any case, the
1510.CW mount
1511is then called with the (perhaps validated)
.CW afd .
1513A 9P server can cause the
1514.CW fauth
1515system call to fail, as an indication that authentication is
1516not required to access the service.
1517.LP
1518The second argument to
1519.CW auth_proxy
1520is a function, here
1521.CW amount_getkey ,
1522to be called if secret information such as a password or
1523response to a challenge is required as part of the authentication.
1524This function, of course, will provide this data to
1525.CW factotum
1526as a
1527.CW key
1528message on the
1529.CW /mnt/factotum/ctl
1530file.
1531.LP
1532Although the final argument to
1533.CW auth_proxy
1534in this example is a simple string, in general
1535it can be a formatted-print specifier in the manner of
1536.CW printf ,
1537to enable the construction of more elaborate key queries.
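.LP
To illustrate what such a format expands to, the sketch below builds the key query from the APOP example using the standard C
.CW snprintf .
The helper
.CW clientquery
is hypothetical and written in portable C;
a Plan 9 program would instead hand the same format string and arguments to
.CW auth_proxy
and would not need an intermediate buffer.
.P1
.ta 3n +3n +3n +3n
```c
#include <assert.h>
#include <stdio.h>

/*
 * Build the key query that the format
 * "proto=%s role=client server=%s" expands to.
 * Returns 0 on success, -1 if buf is too small.
 */
int
clientquery(char *buf, size_t n, const char *proto,
	const char *server)
{
	int len;

	assert(buf != NULL && proto != NULL && server != NULL);
	len = snprintf(buf, n, "proto=%s role=client server=%s",
		proto, server);
	if(len < 0 || (size_t)len >= n)
		return -1;
	return 0;
}
```
.P2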
1538.LP
1539As another example, consider the Plan 9
1540.CW cpu
1541service, which exports local devices to a shell process on
1542a remote machine, typically
1543to connect the local screen and keyboard to a more powerful computer.
1544At heart,
1545.CW cpu
1546is a superset of a service called
1547.CW exportfs
1548[Pike93],
1549which allows one machine to see an arbitrary portion of the file name space
1550of another machine, such as to
1551export the network device to another machine
1552for gatewaying.
1553However,
1554.CW cpu
1555is not just
1556.CW exportfs
1557because it also delivers signals such as interrupt
1558and negotiates the initial environment
1559for the remote shell.
1560.LP
1561To authenticate an instance of
1562.CW cpu
1563requires
1564.CW factotum
1565processes on both ends: the local, client
1566end running as the user on a terminal
1567and the remote, server
1568end running as the host owner of the server machine.
1569Here is schematic code for the two ends:
1570.P1
1571.ta 3n +3n +3n +3n
1572/* client */
1573int
1574p9auth(int fd)
1575{
1576	AuthInfo *ai;
1577
1578	ai = auth_proxy(fd, auth_getkey,
1579		"proto=p9any role=client");
1580	if(ai == NULL)
1581		return -1;
1582
1583	/* start cpu protocol here */
1584}
1585
1586/* server */
1587int
1588srvp9auth(int fd, char *user)
1589{
1590	AuthInfo *ai;
1591
1592	ai = auth_proxy(fd, NULL,
1593		"proto=p9any role=server");
1594	if(ai == NULL)
1595		return -1;
1596	/* set user id for server process */
1597	if(auth_chuid(ai, NULL) < 0)
1598		return -1;
1599
1600	/* start cpu protocol here */
1601}
1602.P2
1603.CW Auth_chuid
1604encapsulates the negotiation to change a user id using the
1605.CW caphash
1606and
1607.CW capuse
1608files of the (server) kernel.
1609Note that although the client process may ask the user for new keys, using
1610.CW auth_getkey ,
1611the server machine, presumably a shared machine with a pseudo-user for
1612the host owner, sets the key-getting function to
1613.CW NULL .
1614.NH 1
1615Secure Store
1616.LP
1617.CW Factotum
1618keeps its keys in volatile memory, which must somehow be
1619initialized at boot time.
1620Therefore,
1621.CW factotum
1622must be
1623supplemented by a persistent store, perhaps
1624a floppy disk containing a key file of commands to be copied into
1625.CW /mnt/factotum/ctl
1626during bootstrap.
1627But removable media are a nuisance to carry and
1628are vulnerable to theft.
1629Keys could be stored encrypted on a shared file system, but
1630only if those keys are not necessary for authenticating to
1631the file system in the first place.
1632Even if the keys are encrypted under a user
1633password, a thief might well succeed with a dictionary attack.
1634Other risks of local storage are loss of the contents
1635through mechanical mishap or dead batteries.
1636Thus for convenience and
1637safety we provide a
1638.CW secstore
1639(secure store) server in the network to hold each user's permanent list of keys, a
1640.I key
1641.I file .
1642.LP
1643.CW Secstore
1644is a file server for encrypted data,
1645used only during bootstrapping.
1646It must provide strong
1647authentication and resistance to passive and active protocol attacks
1648while assuming nothing more from the client than a password.
1649Once
1650.CW factotum
1651has loaded the key file, further encrypted or authenticated
1652file storage can be accomplished by standard mechanisms.
1653.EQ
1654define mod % ~ roman "mod" ~ %
1655define sha1 % "sha1" %
1656.EN
1657.LP
1658The cryptographic technology that enables
1659.CW secstore
1660is a form of encrypted
1661key exchange
1662called PAK
1663[Boyk00],
1664analogous to
1665EKE
1666[Bell93],
1667SRP
1668[Wu98],
1669or
1670SPEKE
1671[Jabl].
1672PAK was chosen
1673because it comes with a proof of equivalence in strength to
1674Diffie-Hellman; subtle flaws in some earlier encrypted key exchange
1675protocols and implementations have encouraged us to take special care.
1676In outline, the PAK protocol is:
1677.P1
1678$C -> S:~ C, g sup x H#
1679$S -> C:~ S, g sup y , hash(g sup xy , C, S)#
1680$C -> S:~ hash(g sup xy , S, C)#
1681.P2
1682where $H# is a preshared secret between client $C# and server $S#.
1683There are several variants of PAK, all presented in papers
1684mainly concerned with proofs of cryptographic properties.
1685To aid implementers, we have distilled a description of the specific
1686version we use into an Appendix to this paper.
1687The Plan 9 open source license provides for use of Lucent's
1688encrypted key exchange patents in this context.
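.LP
The algebra behind the outline can be checked with toy numbers.
The sketch below, in portable C, is not the variant described in the Appendix:
it omits the hashes, uses word-sized integers instead of a cryptographic group, and assumes a prime modulus $p# so that $H# can be divided out with Fermat's little theorem.
All function names are hypothetical.
It shows only that a server knowing $H# recovers $g sup x# from the masked value $g sup x H# and so reaches the same $g sup xy# as the client.
.P1
.ta 3n +3n +3n +3n
```c
#include <assert.h>
#include <stdint.h>

/* Modular exponentiation by repeated squaring: b^e mod m. */
uint64_t
modpow(uint64_t b, uint64_t e, uint64_t m)
{
	uint64_t r;

	/* keep products below 2^64 */
	assert(m > 1 && m < ((uint64_t)1 << 32));
	r = 1;
	b %= m;
	while(e > 0){
		if(e & 1)
			r = r * b % m;
		b = b * b % m;
		e >>= 1;
	}
	return r;
}

/* Client's first message: g^x masked by the shared secret h. */
uint64_t
pakmask(uint64_t g, uint64_t x, uint64_t h, uint64_t p)
{
	return modpow(g, x, p) * (h % p) % p;
}

/* Server side: divide out h, then raise to y to get g^xy. */
uint64_t
paksrvkey(uint64_t m, uint64_t y, uint64_t h, uint64_t p)
{
	uint64_t hinv, gx;

	hinv = modpow(h, p - 2, p);	/* inverse by Fermat; p prime */
	gx = m % p * hinv % p;
	return modpow(gx, y, p);
}

/* Client side: raise the server's g^y to x to get g^xy. */
uint64_t
pakclikey(uint64_t gy, uint64_t x, uint64_t p)
{
	return modpow(gy, x, p);
}
```
.P2
.LP
A server that guesses the wrong $H# divides out the wrong value and, except by coincidence, arrives at a different key, so the hashes exchanged in the second and third messages would not match.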
1689.LP
1690As a further layer of defense against password theft,
1691we provide (within the encrypted channel $C -> S#)
1692information that is validated at a RADIUS server,
1693such as the digits from a hardware token
1694[RFC2138].
1695This provides two-factor authentication, which potentially
1696requires tricking two independent administrators in any attack by
1697social engineering.
1698.LP
1699The key file stored on the server is encrypted with AES (Rijndael) using CBC
1700with a 10-byte initialization vector and trailing authentication padding.
1701All this is invisible to the user of
1702.CW secstore .
1703For that matter, it is invisible to the
1704.CW secstore
1705server as well;
1706if the AES Modes of Operation are standardized and a new encryption format
1707designed, it can be implemented by a client without change to the server.
1708The
1709.CW secstore
1710is deliberately not backed up;  the user is expected to
1711use more than one
1712.CW secstore
1713or save the key file on removable media
1714and lock it away.
1715The user's password is hashed to create the $H# used
1716in the PAK protocol;  a different hash of the password is used as
1717the file encryption key.
1718Finally, there is a command (inside the authenticated,
1719encrypted channel between client and
1720.CW secstore )
1721to change passwords by sending
1722a new $H#;
1723for consistency, the client process must at the same time fetch and re-encrypt all files.
1724.LP
1725When
1726.CW factotum
1727starts, it dials the local
1728.CW secstore
1729and checks whether the user has an account.
1730If so,
1731it prompts for the user's
1732.CW secstore
1733password and fetches the key file.
1734The PAK protocol
1735ensures mutual authentication and prevents dictionary attacks on the password
1736by passive wiretappers or active intermediaries.
1737Passwords saved in
1738the key file can be long random strings suitable for
1739simpler challenge/response authentication protocols.
1740Thus the user need only remember
1741a single, weaker password to enable strong, ``single sign on'' authentication to
1742unchanged legacy applications scattered across multiple authentication domains.
1743.NH 1
1744Transport Layer Security
1745.LP
1746Since the Plan 9 operating system is designed for use in network elements
1747that must withstand direct attack, unguarded by firewall or VPN, we seek
1748to ensure that all applications use channels with appropriate mutual
1749authentication and encryption.
1750A principal tool for this is TLS 1.0
1751[RFC2246].
1752(TLS 1.0 is nearly the same as SSL 3.0,
1753and our software is designed to interoperate
1754with implementations of either standard.)
1755.LP
1756TLS defines a record layer protocol for message integrity and privacy
1757through the use of message digesting and encryption with shared secrets.
1758We implement this service as a kernel device, though it could
1759be performed at slightly higher cost by invoking a separate program.
1760The library interface to the TLS kernel device is:
1761.P1
1762int pushtls(int fd, char *hashalg,
1763    char *cryptalg, int isclient,
1764    char *secret, char *dir);
1765.P2
1766Given a file descriptor, the names of message digest and
1767encryption algorithms, and the shared secret,
1768.CW pushtls
1769returns a new file descriptor for the encrypted connection.
1770(The final argument
1771.CW dir
1772receives the name of the directory in the TLS device that
1773is associated with the new connection.)
1774The function is named by analogy with the ``push'' operation
1775supported by the stream I/O system of Research Unix and the
1776first two editions of Plan 9.
1777Because adding encryption is as simple as replacing one
1778file descriptor with another, adding encryption to a particular
1779network service is usually trivial.
1780.LP
1781The Plan 9 shared key authentication protocols establish a shared 56-bit secret
1782as a side effect.
1783Native Plan 9 network services such as
1784.CW cpu
1785and
1786.CW exportfs
1787use these protocols for authentication and then invoke
1788.CW pushtls
1789with the shared secret.
1790.LP
1791Above the record layer, TLS specifies a handshake protocol using public keys
1792to establish the session secret.
1793This protocol is widely used with HTTP and IMAP4
1794to provide server authentication, though with client certificates it could provide
1795mutual authentication.  The library function
1796.P1
1797int tlsClient(int fd, TLSconn *conn)
1798.P2
1799handles the initial handshake and returns the result of
1800.CW pushtls .
1801On return, it fills the
1802.CW conn
1803structure with the session ID used
1804and the X.509 certificate presented by the
1805server, but makes no effort to verify the certificate.
1806Although the original design intent of X.509 certificates expected
1807that they would be used with a Public Key Infrastructure,
1808reliable deployment has been so long delayed and problematic
1809that we have adopted the simpler policy of just using the
1810X.509 certificate as a representation of the public key,
1811depending on a locally-administered directory of SHA1 thumbprints
1812to allow applications to decide which public keys to trust
1813for which purposes.
1814.NH 1
1815Related Work and Discussion
1816.LP
1817Kerberos, one of the earliest distributed authentication
1818systems, keeps a set of authentication tickets in a temporary file called
1819a ticket cache.  The ticket cache is protected by Unix file permissions.
1820An environment variable containing the file name of the ticket cache
1821allows for different ticket caches in different simultaneous login sessions.
1822A user logs in by typing his or her Kerberos password.
1823The login program uses the Kerberos password to obtain a temporary
1824ticket-granting ticket from the authentication server, initializes the
1825ticket cache with the ticket-granting ticket, and then forgets the password.
1826Other applications can use the ticket-granting ticket to sign tickets
1827for themselves on behalf of the user during the login session.
1828The ticket cache is removed when the user logs out
1829[Stei88].
1830The ticket cache relieves the user from typing a password
1831every time authentication is needed.
1832.LP
1833The secure shell SSH develops this idea further, replacing the
1834temporary file with a named Unix domain socket connected to
1835a user-level program, called an agent.
1836Once the SSH agent is started and initialized with one or
1837more RSA private keys, SSH clients can employ it
1838to perform RSA authentications on their behalf.
1839In the absence of an agent, SSH typically uses RSA keys
1840read from encrypted disk files or uses passphrase-based
1841authentication, both of which would require prompting the user
1842for a passphrase whenever authentication is needed
1843[Ylon96].
1844The self-certifying file system SFS uses a similar agent
1845[Kami00],
1846not only for moderating the use of client authentication keys
1847but also for verifying server public keys
1848[Mazi99].
1849.LP
1850.CW Factotum
1851is a logical continuation of this evolution,
1852replacing the program-specific SSH or SFS agents with
1853a general agent capable of serving a wide variety of programs.
1854Having one agent for all programs removes the need
1855to have one agent for each program.
1856It also allows the programs themselves to be protocol-agnostic,
1857so that, for example, one could build an SSH workalike
1858capable of using any protocol supported by
1859.CW factotum ,
1860without that program knowing anything about the protocols.
1861Traditionally each program needs to implement each
1862authentication protocol for itself, an $O(n sup 2 )# coding
1863problem that
1864.CW factotum
1865reduces to $O(n)#.
1866.LP
1867Previous work on agents has concentrated on their use by clients
1868authenticating to servers.
Looking in the other direction, Sun Microsystems'
1870pluggable authentication module (PAM) is one
1871of the earliest attempts to
1872provide a general authentication mechanism for Unix-like
1873operating systems
1874[Sama96].
1875Without a central authority like PAM, system policy is tied
1876up in the various implementations of network services.
1877For example, on a typical Unix, if a system administrator
1878decides not to allow plaintext passwords for authentication,
1879the configuration files for a half dozen different servers \(em
1880.CW rlogind ,
1881.CW telnetd ,
1882.CW ftpd ,
1883.CW sshd ,
1884and so on \(em
1885need to be edited.
1886PAM solves this problem by hiding the details of a given
1887authentication mechanism behind a common library interface.
1888Directed by a system-wide configuration file,
1889an application selects a particular authentication mechanism
1890by dynamically loading the appropriate shared library.
1891PAM is widely used on Sun's Solaris and some Linux distributions.
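.LP
The idea of policy held in one place can be sketched as follows.
This is not PAM's API: a dictionary stands in for the system-wide
configuration file, a table of functions stands in for the dynamically
loaded shared libraries, and the account data and all names are made up
for illustration.
.P1
```python
import hmac

# Made-up account data for the sketch.
USERS = {"glenda": "rabbit"}


def plaintext(user, credential):
    """Toy mechanism: compare against a stored plaintext password."""
    return hmac.compare_digest(USERS.get(user, ""), credential)


def deny(user, credential):
    """Toy mechanism: refuse everyone."""
    return False


# Common interface: every mechanism maps (user, credential) to True or False.
MECHANISMS = {"plaintext": plaintext, "deny": deny}

# Stand-in for the system-wide configuration file: service -> mechanism name.
POLICY = {"telnetd": "plaintext", "ftpd": "plaintext", "sshd": "plaintext"}


def authenticate(service, user, credential):
    """Entry point used by every server; no server names a mechanism itself."""
    return MECHANISMS[POLICY[service]](user, credential)


def forbid(mechanism):
    """One administrative change that takes effect for all servers."""
    for service, name in POLICY.items():
        if name == mechanism:
            POLICY[service] = "deny"
```
.P2
.LP
After
.CW forbid("plaintext") ,
every service in the table rejects plaintext logins, with no change to
any server.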
1892.LP
1893.CW Factotum
1894achieves the same goals
1895using the agent approach.
1896.CW Factotum
1897is the only process that needs to create
1898capabilities, so all the network servers can run as
1899untrusted users (e.g.,
1900Plan 9's
1901.CW none
1902or Unix's
1903.CW nobody ),
1904which greatly reduces the harm done if a server is buggy
1905and is compromised.
1906In fact, if
1907.CW factotum
1908were implemented on Unix along with
1909an analogue to the Plan 9 capability device, venerable
1910programs like
1911.CW su
1912and
1913.CW login
1914would no longer need to be installed ``setuid root.''
1915.LP
1916Several other systems, such as Password Safe [Schn],
1917store multiple passwords in an encrypted file,
1918so that the user only needs to remember one password.
1919Our
1920.CW secstore
1921solution differs from these by placing the storage in
1922a hardened location in the network, so that the encrypted file is
1923less liable to be stolen for offline dictionary attack and so that
1924it is available even when a user has several computers.
1925In contrast, Microsoft's Passport system
1926[Micr]
1927keeps credentials in
1928the network, but centralized at one extremely-high-value target.
1929The important feature of Passport, setting up trust relationships
1930with e-merchants, is outside our scope.
1931The
1932.CW secstore
1933architecture is almost identical to
1934Perlman and Kaufman's
1935[Perl99]
1936but with newer EKE technology.
1937Like them, we chose to defend mainly against outside attacks
1938on
1939.CW secstore ;
1940if additional defense of the files on the server
1941itself is desired, one can use distributed techniques
1942[Ford00].
1943.LP
1944We made a conscious choice of placing encryption, message integrity,
1945and key management at the application layer
1946(TLS, just above layer 4) rather than at layer 3, as in IPsec.
1947This leads to a simpler structure for the network stack, easier
1948integration with applications and, most important, easier network
1949administration since we can recognize which applications are misbehaving
1950based on TCP port numbers.  TLS does suffer (relative to IPsec) from
1951the possibility of forged TCP Reset, but we feel that this is adequately
1952dealt with by randomized TCP sequence numbers.
1953In contrast with other TLS libraries, Plan 9 does not
1954require the application to change
1955.CW write
1956calls to
1957.CW sslwrite
1958but simply to add a few lines of code at startup
1959[Resc01].
1960.NH 1
1961Conclusion
1962.LP
1963Writing safe code is difficult.
1964Stack attacks,
1965mistakes in logic, and bugs in compilers and operating systems
1966can each make it possible for an attacker
1967to subvert the intended execution sequence of a
1968service.
1969If the server process has the privileges
1970of a powerful user, such as
1971.CW root
1972on Unix, then so does the attacker.
1973.CW Factotum
1974allows us
1975to constrain the privileged execution to a single
1976process whose core is a few thousand lines of code.
1977Verifying such a process, through both manual and automatic means,
1978is much easier and less error prone
1979than requiring it of all servers.
1980.LP
1981An implementation of these ideas is in Plan 9 from Bell Labs, Fourth Edition,
1982freely available from \f(CWhttp://\%plan9.bell-labs.com/\%plan9\fP.
1983.SH
1984Acknowledgments
1985.LP
1986William Josephson contributed to the implementation of password changing in
1987.CW secstore .
1988We thank Phil MacKenzie and Martín Abadi for helpful comments on early parts
1989of the design.
1990Chuck Blake,
1991Peter Bosch,
1992Frans Kaashoek,
1993Sape Mullender,
1994and
1995Lakshman Y. N.,
1996predominantly Dutchmen, gave helpful comments on the paper.
1997Russ Cox is supported by a fellowship from the Fannie and John Hertz Foundation.
1998.SH
1999References
2000.LP
2001[Bell93]
2002S.M. Bellovin and M. Merritt,
2003``Augmented Encrypted Key Exchange,''
2004Proceedings of the 1st ACM Conference on Computer and Communications Security, 1993, pp. 244-250.
2005.LP
2006[Boyk00]
2007Victor Boyko, Philip MacKenzie, and Sarvar Patel,
2008``Provably Secure Password-Authenticated Key Exchange using Diffie-Hellman,''
2009Eurocrypt 2000, pp. 156\-171.
2010... http://www.bell-labs.com/who/philmac/research/pak-final.ps.gz
2011.LP
2012[RFC2246]
2013T. Dierks and C. Allen,
2014``The TLS Protocol, Version 1.0,''
2015RFC 2246.
2016.LP
2017[Ford00]
2018Warwick Ford and Burton S. Kaliski, Jr.,
2019``Server-Assisted Generation of a Strong Secret from a Password,''
2020IEEE Fifth International Workshop on Enterprise Security,
2021National Institute of Standards and Technology (NIST),
2022Gaithersburg MD, June 14 - 16, 2000.
2023.LP
2024[Jabl]
2025David P. Jablon,
2026``Strong Password-Only Authenticated Key Exchange,''
2027\f(CWhttp://\%integritysciences.com/\%speke97.html\fP.
2028.LP
2029[Kami00]
2030Michael Kaminsky,
2031``Flexible Key Management with SFS Agents,''
2032Master's Thesis, MIT, May 2000.
2033.LP
2034[Mack]
2035Philip MacKenzie,
2036private communication.
2037.LP
2038[Mazi99]
2039David Mazières, Michael Kaminsky, M. Frans Kaashoek and Emmett Witchel,
2040``Separating key management from file system security,''
2041Symposium on Operating Systems Principles, 1999, pp. 124-139.
2042.LP
2043[Micr]
2044Microsoft Passport,
2045\f(CWhttp://\%www.passport.com/\fP.
2046.LP
2047[Perl99]
2048Radia Perlman and Charlie Kaufman,
2049``Secure Password-Based Protocol for Downloading a Private Key,''
2050Proc. 1999 Network and Distributed System Security Symposium,
2051Internet Society, January 1999.
2052.LP
2053[Pike95]
2054Rob Pike, Dave Presotto, Sean Dorward, Bob Flandrena, Ken Thompson, Howard Trickey, and Phil Winterbottom,
2055``Plan 9 from Bell Labs,''
2056Computing Systems, \f3\&8\fP, 3, Summer 1995, pp. 221-254.
2057.LP
2058[Pike93]
2059Rob Pike, Dave Presotto, Ken Thompson, Howard Trickey, Phil Winterbottom,
2060``The Use of Name Spaces in Plan 9,''
2061Operating Systems Review, \f3\&27\fP, 2, April 1993, pp. 72-76
2062(reprinted from Proceedings of the 5th ACM SIGOPS European Workshop,
2063Mont Saint-Michel, 1992, Paper nº 34).
2064.LP
2065[Resc01]
2066Eric Rescorla,
2067``SSL and TLS: Designing and Building Secure Systems,''
2068Addison-Wesley, 2001. ISBN 0-201-61598-3, p. 387.
2069.LP
2070[RFC2138]
2071C. Rigney, A. Rubens, W. Simpson, S. Willens,
2072``Remote Authentication Dial In User Service (RADIUS),''
2073RFC 2138, April 1997.
2074.LP
2075[RiLa]
2076Ronald L. Rivest and Butler Lampson,
2077``SDSI\(emA Simple Distributed Security Infrastructure,''
2078\f(CWhttp://\%theory.lcs.mit.edu/\%~rivest/\%sdsi10.ps\fP.
2079.LP
2080[Schn]
2081Bruce Schneier, Password Safe,
2082\f(CWhttp://\%www.counterpane.com/\%passsafe.html\fP.
2083.LP
2084[Sama96]
2085Vipin Samar,
2086``Unified Login with Pluggable Authentication Modules (PAM),''
2087Proceedings of the Third ACM Conference on Computer Communications and Security,
2088March 1996, New Delhi, India.
2089... http://www1.acm.org/pubs/articles/proceedings/commsec/238168/p1-samar/p1-samar.pdf
2090.LP
2091[Stei88]
2092Jennifer G. Steiner, Clifford Neuman, and Jeffrey I. Schiller,
2093``\fIKerberos\fR: An Authentication Service for Open Network Systems,''
2094Proceedings of USENIX Winter Conference, Dallas, Texas, February 1988, pp. 191\-202.
2095... ftp://athena-dist.mit.edu/pub/kerberos/doc/usenix.PS
2096.LP
2097[Wu98]
2098T. Wu,
2099``The Secure Remote Password Protocol,''
2100Proceedings of
2101the 1998 Internet Society Network and Distributed System Security
2102Symposium, San Diego, CA, March 1998, pp. 97-111.
2103.LP
2104[Ylon96]
2105Ylonen, T.,
2106``SSH\(emSecure Login Connections Over the Internet,''
21076th USENIX Security Symposium, pp. 37-42. San Jose, CA, July 1996.
2108.SH
2109Appendix: Summary of the PAK protocol
2110.LP
2111Let $q>2 sup 160# and $p>2 sup 1024# be primes
2112such that $p=rq+1# with $r# not a multiple of $q#.
2113Take $h ∈ Z sub p sup *# such that $g == h sup r# is not 1.
2114These parameters may be chosen by the NIST algorithm for DSA,
2115and are public, fixed values.
2116The client $C# knows a secret $pi#
2117and computes $H == (H sub 1 (C, ~ pi )) sup r# and $H sup -1#,
2118where $H sub 1# is a hash function yielding a random element of $Z sub p sup *#,
2119and $H sup -1# may be computed by gcd.
2120(All arithmetic is modulo $p#.)
2121The client gives $H sup -1# to the server $S# ahead of time by a private channel.
2122To start a new connection, the client generates a random value $x#,
2123computes $m == g sup x H#,
2124then calls the server and sends $C# and $m#.
2125The server checks $m != 0 mod p#,
2126generates random $y#,
2127computes $ mu == g sup y#,
2128$ sigma == (m H sup -1 ) sup y#,
2129and sends $S#, $mu#, $k == sha1 ( roman "\"server\"", C, S, m, mu , sigma , H sup -1 )#.
2130Next the client computes $sigma == mu sup x#,
2131verifies $k#,
2132and sends $k' == sha1 ( roman "\"client\"", C, S, m, mu , sigma , H sup -1 )#.
2133The server then verifies $k'# and both sides begin
2134using session key $K == sha1 ( roman "\"session\"", C, S, m, mu , sigma , H sup -1 )#.
2135In the published version of PAK, the server name $S#
2136is included in the initial
2137hash $H#, but doing so is inconvenient in our application,
2138as the server may be known by various equivalent names.
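.LP
The exchange can be traced in a short sketch that runs both sides in one
function.
Everything beyond the formulas above is an assumption made for
illustration: the toy parameters $q = 1019#, $p = 2039#, $h = 3# are far
too small for real use, the encoding of the hashed fields is arbitrary,
and
.CW h1
is only a stand-in for $H sub 1#.
.P1
```python
import hashlib
import secrets

# Toy public parameters (assumption: illustrative only, far too small).
Q = 1019            # prime q
R = 2               # cofactor r, not a multiple of q
P = R * Q + 1       # prime p = rq + 1 = 2039
G = pow(3, R, P)    # g = h^r with h = 3, an element of order q


def digest(*fields):
    """SHA-1 over an arbitrary but unambiguous encoding of the fields."""
    return hashlib.sha1(repr(fields).encode()).hexdigest()


def h1(client, secret):
    """Stand-in for H1: map (C, pi) to an element of Z_p^*."""
    n = int(hashlib.sha1(repr((client, secret)).encode()).hexdigest(), 16)
    return n % (P - 1) + 1


def blind(client, secret):
    """Return (H, H^-1) with H = H1(C, pi)^r mod p."""
    H = pow(h1(client, secret), R, P)
    return H, pow(H, -1, P)   # modular inverse; needs Python 3.8 or later


def run_pak(client, server, client_secret, stored_hinv):
    """Run one exchange; return (client_key, server_key), or None on failure."""
    # Client: pick x and send C and m = g^x * H.
    H, hinv_c = blind(client, client_secret)
    x = secrets.randbelow(Q - 1) + 1
    m = pow(G, x, P) * H % P

    # Server: check m, pick y, compute mu = g^y and sigma = (m * H^-1)^y.
    if m % P == 0:
        return None
    y = secrets.randbelow(Q - 1) + 1
    mu = pow(G, y, P)
    sigma_s = pow(m * stored_hinv % P, y, P)
    k = digest("server", client, server, m, mu, sigma_s, stored_hinv)

    # Client: compute sigma = mu^x, verify k, and answer with k'.
    sigma_c = pow(mu, x, P)
    if k != digest("server", client, server, m, mu, sigma_c, hinv_c):
        return None
    k2 = digest("client", client, server, m, mu, sigma_c, hinv_c)

    # Server: verify k'.
    if k2 != digest("client", client, server, m, mu, sigma_s, stored_hinv):
        return None

    # Both sides derive the session key K from the same transcript.
    key_c = digest("session", client, server, m, mu, sigma_c, hinv_c)
    key_s = digest("session", client, server, m, mu, sigma_s, stored_hinv)
    return key_c, key_s
```
.P2
.LP
When the client's secret matches the value stored at the server, both
sides compute $sigma == g sup xy# and the two keys agree; otherwise the
client's check of $k# fails and no key is produced.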
2139.LP
2140MacKenzie has shown
2141[Mack]
2142that the
2143equivalence proof [Boyk00]
2144can be adapted to cover our version.
2145