.nr DR 1	\" this is a draft copy
.nr si 3n
.he 'Mail Systems and Addressing in 4.2bsd''%'
.fo 'Version 1.1'DRAFT'Last Mod 02/16/83'
.if n .ls 2
.+c
.(l C
.sz 14
Mail Systems and Addressing
in 4.2bsd
.sz
.sp
Eric Allman\(dg
.sp 0.5
.i
Britton-Lee, Inc.
1919 Addison Street, Suite 105.
Berkeley, California 94704.
.sp 0.5
.r
eric@Berkeley.ARPA
ucbvax!eric
.)l
.sp
.(l F
.ce
ABSTRACT
.sp \n(psu
Routing mail through a heterogeneous internet presents many new
problems.
Among the worst of these is that of address mapping.
Historically, this has been handled on an ad hoc basis.
However,
this approach has become unmanageable as internets grow.
.sp \n(psu
Sendmail acts as a unified
.q "post office"
to which all mail can be
submitted.
Address interpretation is controlled by a production
system,
which can parse both old and new format addresses.
The
new format is
.q "domain-based,"
a flexible technique that can
handle many common situations.
Sendmail is not intended to perform
user interface functions.
.sp \n(psu
Sendmail will replace delivermail in the Berkeley 4.2 distribution.
Several major hosts are now or will soon be running sendmail.
This change will affect any users that route mail through a sendmail
gateway.
The changes that will be user visible are emphasized.
.)l
.sp 2
.(f
\(dgA considerable part of this work
was done while under the employ
of the INGRES Project
at the University of California at Berkeley.
.)f
.pp
The mail system to appear in 4.2bsd
will contain a number of changes.
Most of these changes are based on the replacement of
.i delivermail
with a new module called
.i sendmail.
.i Sendmail
implements a general internetwork mail routing facility,
featuring aliasing and forwarding,
automatic routing to network gateways,
and flexible configuration.
Of key interest to the mail system user
will be the changes in the network addressing structure.
.pp
In a simple network,
each node has an address,
and resources can be identified
with a host-resource pair;
in particular,
the mail system can refer to users
using a host-username pair.
Host names and numbers have to be administered by a central authority,
but usernames can be assigned locally to each host.
.pp
In an internet,
multiple networks with different characteristics
and managements
must communicate.
In particular,
the syntax and semantics of resource identification change.
Certain special cases can be handled trivially
by
.i "ad hoc"
techniques,
such as
providing network names that appear local to hosts
on other networks,
as with the Ethernet at Xerox PARC.
However, the general case is extremely complex.
For example,
some networks require point-to-point routing,
which simplifies the database update problem
since only adjacent hosts must be entered
into the system tables,
while others use end-to-end addressing.
Some networks use a left-associative syntax
and others use a right-associative syntax,
causing ambiguity in mixed addresses.
.pp
Internet standards seek to eliminate these problems.
Initially, these proposed expanding the address pairs
to address triples
of the form
{network, host, username}.
Network numbers must be universally agreed upon,
and hosts can be assigned locally
on each network.
The user-level presentation was changed
to address domains,
composed of a local resource identification
and a hierarchical domain specification
with a common static root.
The domain technique
separates the issue of physical versus logical addressing.
For example,
an address of the form
.q "eric@a.cc.berkeley.arpa"
describes only the logical
organization of the address space.
.pp
.i Sendmail
is intended to help bridge the gap
between the totally
.i "ad hoc"
world
of networks that know nothing of each other
and the clean, tightly-coupled world
of unique network numbers.
It can accept old arbitrary address syntaxes,
resolving ambiguities using heuristics
specified by the system administrator,
as well as domain-based addressing.
It helps guide the conversion of message formats
between disparate networks.
In short,
.i sendmail
is designed to assist a graceful transition
to consistent internetwork addressing schemes.
.sp
.pp
Section 1 defines some of the terms
frequently left fuzzy
when working in mail systems.
Section 2 discusses the design goals for
.i sendmail .
In section 3,
the new address formats
and basic features of
.i sendmail
are described.
Section 4 discusses some of the special problems
of the UUCP network.
The differences between
.i sendmail
and
.i delivermail
are presented in section 5.
.sp
.(l F
.b DISCLAIMER:
A number of examples
in this paper
use names of actual people
and organizations.
This is not intended
to imply a commitment
or even an intellectual agreement
on the part of these people or organizations.
In particular,
Bell Telephone Laboratories (BTL),
Digital Equipment Corporation (DEC),
Lawrence Berkeley Laboratories (LBL),
Britton-Lee Incorporated (BLI),
and the University of California at Berkeley
are not committed to any of these proposals at this time.
Much of this paper
represents no more than
the personal opinions of the author.
.)l
.sh 1 "DEFINITIONS"
.pp
There are four basic concepts
that must be clearly distinguished
when dealing with mail systems:
the user (or the user's agent),
the user's identification,
the user's address,
and the route.
These are distinguished primarily by their position independence.
.sh 2 "User and Identification"
.pp
The user is the being
(a person or program)
that is creating or receiving a message.
An
.i agent
is an entity operating on behalf of the user \*-
such as a secretary who handles my mail,
or a program that automatically returns a
message such as
.q "I am at the UNICOM conference."
.pp
The identification is the tag
that goes along with the particular user.
This tag is completely independent of location.
For example,
my identification is the string
.q "Eric Allman,"
and this identification does not change
whether I am located at U.C. Berkeley,
at Britton-Lee,
or at a scientific institute in Austria.
.pp
Since the identification is frequently ambiguous
(e.g., there are two
.q "Robert Henry" s
at Berkeley)
it is common to add other disambiguating information
that is not strictly part of the identification
(e.g.,
Robert
.q "Code Generator"
Henry
versus
Robert
.q "System Administrator"
Henry).
.sh 2 "Address"
.pp
The address specifies a location.
As I move around,
my address changes.
For example,
my address might change from
.q eric@Berkeley.ARPA
to
.q eric@bli.UUCP
or
.q allman@IIASA.Austria
depending on my current affiliation.
.pp
However,
an address is independent of the location of anyone else.
That is,
my address remains the same to everyone who might be sending me mail.
For example,
a person at MIT and a person at USC
could both send to
.q eric@Berkeley.ARPA
and have it arrive at the same mailbox.
.pp
Ideally a
.q "white pages"
service would be provided to map user identifications
into addresses
(for example, see
[Solomon81]).
Currently this is handled by passing around
scraps of paper
or by calling people on the telephone
to find out their address.
.sh 2 "Route"
.pp
Where an address specifies
.i where
to find a mailbox,
a route specifies
.i how
to find the mailbox.
Specifically,
it specifies a path
from sender to receiver.
As such, the route is potentially different
for every pair of people in the electronic universe.
.pp
Normally the route is hidden from the user
by the software.
However,
some networks put the burden of determining the route
onto the sender.
Although this simplifies the software,
it also greatly impairs the usability
for most users.
The UUCP network is an example of such a network.
.sh 1 "DESIGN GOALS"
.pp
Design goals for
.i sendmail \**
.(f
\**This section makes no distinction between
.i delivermail
and
.i sendmail.
.)f
include:
.np
Compatibility with the existing mail programs,
including Bell version 6 mail,
Bell version 7 mail,
Berkeley
.i Mail
[Shoens79],
BerkNet mail
[Schmidt79],
and hopefully UUCP mail
[Nowitz78].
ARPANET mail
[Crocker82]
was also required.
.np
Reliability, in the sense of guaranteeing
that every message is correctly delivered
or at least brought to the attention of a human
for correct disposal;
no message should ever be completely lost.
This goal was considered essential
because of the emphasis on mail in our environment.
It has turned out to be one of the hardest goals to satisfy,
especially in the face of the many anomalous message formats
produced by various ARPANET sites.
For example,
certain sites generate improperly formatted addresses,
occasionally
causing error-message loops.
Some hosts use blanks in names,
causing problems with
mail programs that assume that an address
is one word.
The semantics of some fields
are interpreted slightly differently
by different sites.
In summary,
the obscure features of the ARPANET mail protocol
really
.i are
used and
are difficult to support,
but must be supported.
.np
Existing software to do actual delivery
should be used whenever possible.
This goal derives as much from political and practical considerations
as technical.
.np
Easy expansion to
fairly complex environments,
including multiple
connections to a single network type
(such as with multiple UUCP or Ethernets).
This goal requires consideration of the contents of an address
as well as its syntax
in order to determine which gateway to use.
.np
Configuration should not be compiled into the code.
A single compiled program should be able to run as is at any site
(barring such basic changes as the CPU type or the operating system).
We have found this seemingly unimportant goal
to be critical in real life.
Besides the simple problems that occur when any program gets recompiled
in a different environment,
many sites like to
.q fiddle
with anything that they will be recompiling anyway.
.np
.i Sendmail
must be able to let various groups maintain their own mailing lists,
and let individuals specify their own forwarding,
without modifying the system alias file.
.np
Each user should be able to specify which mailer to execute
to process mail being delivered for him.
This feature allows users who are using specialized mailers
that use a different format to build their environment
without changing the system,
and facilitates specialized functions
(such as returning an
.q "I am on vacation"
message).
.np
Network traffic should be minimized
by batching addresses to a single host where possible,
without assistance from the user.
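.pp
The batching goal in the last item can be pictured
as grouping recipients by destination host
before any mailer is invoked.
The Python sketch below is illustrative only:
the name batch_by_host
and the rule of splitting on the last at-sign
are assumptions made for the example,
not a description of how
.i sendmail
actually groups addresses.
.(l
```python
from collections import defaultdict


def batch_by_host(recipients):
    """Group user@host recipients by host (hypothetical helper).

    Addresses without an "@" are treated as local and grouped under None.
    """
    batches = defaultdict(list)
    for addr in recipients:
        user, sep, host = addr.rpartition("@")
        if sep:
            # Host names are compared without regard to case.
            batches[host.lower()].append(user)
        else:
            batches[None].append(addr)
    return dict(batches)
```
.)l
.pp
With such a grouping,
one transfer per host could carry every recipient on that host.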
.pp
These goals motivated the architecture illustrated in figure 1.
.(z
.hl
.ie t \
.	sp 18
.el \{\
.(c
+---------+   +---------+   +---------+
| sender1 |   | sender2 |   | sender3 |
+---------+   +---------+   +---------+
     |             |             |
     +----------+  +  +----------+
                |  |  |
                v  v  v
            +-------------+
            |   sendmail  |
            +-------------+
                |  |  |
     +----------+  +  +----------+
     |             |             |
     v             v             v
+---------+   +---------+   +---------+
| mailer1 |   | mailer2 |   | mailer3 |
+---------+   +---------+   +---------+
.)c
.\}

.ce
Figure 1 \*- Sendmail System Structure.
.hl
.)z
The user interacts with a mail generating and sending program.
When the mail is created,
the generator calls
.i sendmail ,
which routes the message to the correct mailer(s).
Since some of the senders may be network servers
and some of the mailers may be network clients,
.i sendmail
may be used as an internet mail gateway.
.sh 1 "USAGE"
.sh 2 "Address Formats"
.pp
Arguments to
.i sendmail
may be flags and addresses.
Flags set various processing options.
Following flag arguments,
address arguments may be given.
Addresses follow the syntax in RFC 822
[Crocker82]
for ARPANET
address formats.
In brief, the format is:
.np
Anything in parentheses is thrown away
(as a comment).
.np
Anything in angle brackets (\c
.q "<\|>" )
is preferred
over anything else.
This rule implements the ARPANET standard that addresses of the form
.(b
user name <machine-address>
.)b
will send to the electronic
.q machine-address
rather than the human
.q "user name."
.np
Double quotes
(\ "\ )
quote phrases;
backslashes quote characters.
Backslashes are more powerful
in that they will cause otherwise equivalent phrases
to compare differently \*- for example,
.i user
and
.i
"user"
.r
are equivalent,
but
.i \euser
is different from either of them.
.pp
Parentheses, angle brackets, and double quotes
must be properly balanced and nested.
The rewriting rules control remaining parsing\**.
.(f
\**Disclaimer: Some special processing is done
after rewriting local names; see below.
.)f
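.pp
The first two rules above
(comments are discarded, angle brackets take precedence)
can be sketched in a few lines of Python.
This is a simplified illustration, not the
.i sendmail
parser:
the function name preparse is invented here,
and quoted phrases and backslash quoting are deliberately left out.
.(l
```python
def preparse(address):
    """Apply the comment and angle-bracket rules to one address string.

    Simplified: quoted strings and backslash escapes are not handled.
    Returns the text that a later rewriting stage would see.
    """
    out = []          # characters outside any comment
    depth = 0         # current nesting depth of parentheses
    for ch in address:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unbalanced parenthesis")
            depth -= 1
        elif depth == 0:
            out.append(ch)
    if depth != 0:
        raise ValueError("unbalanced parenthesis")
    text = "".join(out)

    # Anything in angle brackets is preferred over the rest.
    start = text.find("<")
    end = text.rfind(">")
    if (start == -1) != (end == -1) or end < start:
        raise ValueError("unbalanced angle bracket")
    if start != -1:
        text = text[start + 1:end]
    # Collapse runs of white space left behind by removed comments.
    return " ".join(text.split())
```
.)l
.pp
Under these assumptions,
the string
.q "user name <machine-address>"
reduces to
.q machine-address ,
and a trailing parenthesized comment is dropped.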
.pp
Although old-style addresses are still accepted
in most cases,
the preferred address format
is based on ARPANET-style domain-based addresses
[Su82a].
These addresses are based on a hierarchical, logical decomposition
of the address space.
The addresses are hierarchical in a sense
similar to the U.S. postal addresses:
the messages may first be routed to the correct state,
with no initial consideration of the city
or other addressing details.
The addresses are logical
in that each step in the hierarchy
corresponds to a set of
.q "naming authorities"
rather than a physical network.
.pp
For example,
the address:
.(l
eric@HostA.BigSite.ARPA
.)l
would first look up the domain
BigSite
in the namespace administered by
ARPA.
A query could then be sent to
BigSite
for interpretation of
HostA.
Eventually the mail would arrive at
HostA,
which would then do final delivery
to user
.q eric.
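.pp
The order of lookups in this example
can be made explicit with a small Python sketch.
The function name resolution_steps is hypothetical;
the sketch only splits the address
and lists the domains from the root downward,
and performs no actual name service queries.
.(l
```python
def resolution_steps(address):
    """Return (local_part, [domains from the root downward])."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected local-part@domain")
    labels = domain.split(".")
    # The rightmost label is nearest the root, so reverse the list.
    return local, labels[::-1]
```
.)l
.pp
For the address above the domains come out in the order
ARPA, BigSite, HostA,
which is the order in which the naming authorities would be consulted.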
.sh 2 "Mail to Files and Programs"
.pp
Files and programs are legitimate message recipients.
Files provide archival storage of messages,
useful for project administration and history.
Programs are useful as recipients in a variety of situations,
for example,
to maintain a public repository of systems messages
(such as the Berkeley
.i msgs
program).
.pp
Any address passing through the initial parsing algorithm
as a local address
(i.e., not appearing to be a valid address for another mailer)
is scanned for two special cases.
If prefixed by a vertical bar (\c
.q \^|\^ )
the rest of the address is processed as a shell command.
If the user name begins with a slash mark (\c
.q /\^ )
the name is used as a file name,
instead of a login name.
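.pp
The two special cases can be written as a small classifier.
The Python below is a sketch under stated assumptions:
the function name classify_local
and the kind labels it returns are invented for illustration,
and nothing is executed or written.
.(l
```python
def classify_local(name):
    """Classify a name that has already parsed as local.

    Returns a (kind, value) pair; the kind labels are made up here.
    """
    if name.startswith("|"):
        # The rest of the address is a shell command.
        return ("program", name[1:].strip())
    if name.startswith("/"):
        # The name is a file name rather than a login name.
        return ("file", name)
    return ("user", name)
```
.)l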
.pp
Files that have setuid or setgid bits set
but no execute bits set
have those bits honored if
.i sendmail
is running as root.
.sh 2 "Aliasing, Forwarding, Inclusion"
.pp
.i Sendmail
reroutes mail three ways.
Aliasing applies system wide.
Forwarding allows each user to reroute incoming mail
destined for that account.
Inclusion directs
.i sendmail
to read a file for a list of addresses,
and is normally used
in conjunction with aliasing.
.sh 3 "Aliasing"
.pp
Aliasing maps names to address lists using a system-wide file.
This file is indexed to speed access.
Only names that parse as local
are allowed as aliases;
this guarantees a unique key
(since there are no nicknames for the local host).
.sh 3 "Forwarding"
.pp
After aliasing,
recipients that are local and valid
are checked for the existence of a
.q .forward
file in their home directory.
If it exists,
the message is
.i not
sent to that user,
but rather to the list of users in that file.
Often
this list will contain only one address,
and the feature will be used for network mail forwarding.
.pp
Forwarding also permits a user to specify a private incoming mailer.
For example,
forwarding to:
.(b
"\^|\|/usr/local/newmail myname"
.)b
will use a different incoming mailer.
.sh 3 "Inclusion"
.pp
Inclusion is specified in RFC 733 [Crocker77] syntax:
.(b
:Include: pathname
.)b
An address of this form reads the file specified by
.i pathname
and sends to all users listed in that file.
.pp
The intent is
.i not
to support direct use of this feature,
but rather to use this as a subset of aliasing.
For example,
an alias of the form:
.(b
project: :include:/usr/project/userlist
.)b
is a method of letting a project maintain a mailing list
without interaction with the system administration,
even if the alias file is protected.
.pp
It is not necessary to rebuild the index on the alias database
when a :include: list is changed.
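.pp
The interaction between aliasing and inclusion
can be sketched as a recursive expansion.
In the Python below,
the alias database is modelled as a plain dictionary
and the file system as a callable that returns the addresses in a file;
both, along with the name expand,
are stand-ins chosen for the example
rather than the real data structures.
.(l
```python
def expand(name, aliases, read_list, seen=None):
    """Expand one local name into a sorted list of final recipients.

    aliases   -- dict mapping a local name to a list of addresses
    read_list -- callable returning the addresses stored in a pathname
    Both are stand-ins for the alias database and the file system.
    """
    seen = set() if seen is None else seen
    if name in seen:              # guard against alias loops
        return []
    seen.add(name)
    if name.lower().startswith(":include:"):
        targets = read_list(name[len(":include:"):].strip())
    elif name in aliases:
        targets = aliases[name]
    else:
        return [name]             # not an alias: deliver as is
    result = set()
    for target in targets:
        result.update(expand(target, aliases, read_list, seen))
    return sorted(result)
```
.)l
.pp
Because the included file is read at expansion time,
editing the list changes the result
without touching the alias table,
which mirrors the point that the alias index need not be rebuilt.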
.sh 2 "Message Collection"
.pp
Once all recipient addresses are parsed and verified,
the message is collected.
The message comes in two parts:
a message header and a message body,
separated by a blank line.
The body is an uninterpreted
sequence of text lines.
.pp
The header is formatted as a series of lines
of the form
.(b
	field-name: field-value
.)b
Field-value can be split across lines by starting the following
lines with a space or a tab.
Some header fields have special internal meaning,
and have appropriate special processing.
Other headers are simply passed through.
Some header fields may be added automatically,
such as time stamps.
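.pp
The header layout just described,
including continuation lines,
can be parsed with a short routine.
The Python sketch below assumes the message has already been split into lines;
the name split_message is hypothetical,
and no special processing of individual fields is attempted.
.(l
```python
def split_message(lines):
    """Split a message, given as a list of lines, into (headers, body).

    headers is a list of (field-name, field-value) pairs in original order.
    """
    headers = []
    body_start = len(lines)
    for i, line in enumerate(lines):
        if line == "":                       # blank line ends the header
            body_start = i + 1
            break
        if line[0].isspace() and headers:    # continuation of previous field
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError("malformed header line: " + line)
            headers.append((name.strip(), value.strip()))
    return headers, lines[body_start:]
```
.)l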
.sh 1 "THE UUCP PROBLEM"
.pp
Of particular interest
is the UUCP network.
The explicit routing
used in the UUCP world
causes a number of serious problems.
First,
giving out an address
is impossible
without knowing the address of your potential correspondent.
This is typically handled
by specifying the address
relative to some
.q "well-known"
host
(e.g.,
ucbvax or decvax).
Second,
it is often difficult to compute
the set of addresses
to reply to
without some knowledge
of the topology of the network.
Although it may be easy for a human being
to do this
under many circumstances,
a program does not have equally sophisticated heuristics
built in.
Third,
certain addresses will become painfully and unnecessarily long,
as when a message is routed through many hosts in USENET.
And finally,
certain
.q "mixed domain"
addresses
are impossible to parse unambiguously \*-
e.g.,
.(l
decvax!ucbvax!lbl-h@LBL-CSAM
.)l
might have many possible resolutions,
depending on whether the message was first routed
to decvax
or to LBL-CSAM.
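.pp
The ambiguity can be demonstrated mechanically.
The Python sketch below lists the competing readings of a mixed address
by applying the UUCP convention (leftmost host first)
and the ARPANET convention (host after the last at-sign first).
The function name candidate_first_hops is invented for the example,
and only these two operators are considered.
.(l
```python
def candidate_first_hops(address):
    """List the (first_hop, remainder) readings of a mixed address.

    Only the "!" and "@" operators are considered; not a full parser.
    """
    readings = []
    if "!" in address:
        # UUCP reading: the leftmost host is the first hop.
        host, _, rest = address.partition("!")
        readings.append((host, rest))
    if "@" in address:
        # ARPANET reading: the host after the last "@" is the first hop.
        rest, _, host = address.rpartition("@")
        readings.append((host, rest))
    return readings
```
.)l
.pp
For the address shown above this yields two readings,
one starting at decvax and one starting at LBL-CSAM;
nothing in the syntax selects between them.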
.pp
To solve this problem,
the UUCP syntax
would have to be changed to use addresses
rather than routes.
For example,
the address
.q decvax!ucbvax!eric
might be expressed as
.q eric@ucbvax.UUCP
(with the hop through decvax implied).
This address would itself be a domain-based address;
for example,
an address might be of the form:
.(l
mark@d.cbosg.btl.UUCP
.)l
Hosts outside of Bell Telephone Laboratories
would then only need to know
how to get to a designated BTL relay,
and the BTL topology
would only be maintained inside Bell.
.pp
There are three major problems
associated with turning UUCP addresses
into something reasonable:
defining the namespace,
creating and propagating the necessary software,
and building and maintaining the database.
.sh 2 "Defining the Namespace"
.pp
Making all UUCP hosts
top-level names
is not practical for a number of reasons.
First,
with over 1600 sites already,
and (with the increasing availability of inexpensive microcomputers
and autodialers)
several thousand more coming within a few years,
the database update problem
is simply intractable
if the namespace is flat.
Second,
there are almost certainly name conflicts today.
Third,
as the number of sites grows
the names become ever less mnemonic.
.pp
It seems inevitable
that there be some sort of naming authority
for the set of top-level names
in the UUCP domain,
as unpleasant a possibility
as that may seem.
It will simply not be possible
to have one host resolving all names.
It may however be possible
to handle this
in a fashion similar to that of assigning names of newsgroups
in USENET.
However,
it will be essential to encourage everyone
to become subdomains of an existing domain
whenever possible \*-
even though this will certainly bruise some egos.
For example,
if a new host named
.q blid
were to be added to the UUCP network,
it would probably actually be addressed as
.q d.bli.UUCP
(i.e.,
as host
.q d
in the pseudo-domain
.q bli
rather than as host
.q blid
in the UUCP domain).
.sh 2 "Creating and Propagating the Software"
.pp
The software itself
is relatively trivial.
Two modules are needed,
one to handle incoming mail
and one to handle outgoing mail.
.pp
The incoming module
must be prepared to handle either old- or new-style addresses.
New-style addresses
can be passed through unchanged.
Old-style addresses
must be turned into new-style addresses
where possible.
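.pp
The incoming conversion can be sketched for the simplest case.
The Python below rewrites a pure UUCP route
into the domain form used earlier in this section
(for instance decvax!ucbvax!eric becoming eric@ucbvax.UUCP).
It is a minimal sketch:
the function name bang_path_to_domain is hypothetical,
and dropping the intermediate hops assumes
that the final host is known to the routing database.
.(l
```python
def bang_path_to_domain(address, domain="UUCP"):
    """Rewrite host1!host2!user as user@host2.UUCP where possible.

    Addresses that already contain "@" are treated as new style and
    returned unchanged; the intermediate hops are simply dropped, which
    assumes the final host is known to the routing database.
    """
    if "@" in address or "!" not in address:
        return address
    hops = address.split("!")
    user, host = hops[-1], hops[-2]
    if not user or not host:
        return address            # cannot convert; leave as is
    return user + "@" + host + "." + domain
```
.)l
.pp
Mixed addresses are passed through untouched here,
since resolving them requires the heuristics discussed above.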
787*11077Seric.pp
788*11077SericThe outgoing module
789*11077Sericis slightly trickier.
790*11077SericIt must do a database lookup on the recipient addresses
791*11077Seric(passed on the command line)
792*11077Sericto determine what hosts to send the message to.
793*11077SericIf those hosts do not accept new-style addresses,
794*11077Sericit must transform all addresses in the header of the message
795*11077Sericinto old style using the database lookup.
796*11077Seric.pp
797*11077SericBoth of these modules
798*11077Sericare straightforward
799*11077Sericexcept for the issue of modifying the header.
800*11077SericIt seems prudent to choose one format
801*11077Sericfor the message headers.
802*11077SericFor a number of reasons,
803*11077SericBerkeley has elected to use the ARPANET protocols
804*11077Sericfor message formats.
805*11077SericHowever,
806*11077Sericthis protocol is somewhat difficult to parse.
807*11077Seric.pp
808*11077SericPropagation is somewhat more difficult.
809*11077SericThere are a large number of hosts
810*11077Sericconnected to UUCP
811*11077Sericthat will want to run completely standard systems
812*11077Seric(for very good reasons).
813*11077SericThe strategy is not to convert the entire network \*-
814*11077Sericonly enough of it to alleviate the problem.
815*11077Seric.sh 2 "Building and Maintaining the Database"
816*11077Seric.pp
817*11077SericThis is by far the most difficult problem.
818*11077SericA prototype for this database
819*11077Sericalready exists,
820*11077Sericbut it is maintained by hand
821*11077Sericand does not pretend to be complete.
822*11077Seric.pp
823*11077SericThis problem will be reduced considerably
824*11077Sericif people choose to group their hosts
825*11077Sericinto subdomains.
826*11077SericThis would require a global update
827*11077Sericonly when a new top level domain
828*11077Sericjoined the network.
829*11077SericA message to a host in a subdomain
830*11077Sericcould simply be routed to a known domain gateway
831*11077Sericfor further processing.
832*11077SericFor example,
833*11077Sericthe address
834*11077Seric.q eric@a.bli.UUCP
835*11077Sericmight be routed to the
836*11077Seric.q bli
837*11077Sericgateway
838*11077Sericfor redistribution;
839*11077Sericnew hosts could be added
840*11077Sericwithin BLI
841*11077Sericwithout notifying the rest of the world.
842*11077SericOf course,
843*11077Sericother hosts
844*11077Seric.i could
845*11077Sericbe notified as an efficiency measure.
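.pp
One way to picture routing to a domain gateway
is a lookup that tries the full host name first
and then progressively shorter domain suffixes.
The sketch below assumes a table mapping domain names to gateway hosts;
the table contents and the function name are invented for illustration,
and names are compared exactly,
without any case folding.

```python
def find_gateway(address, gateways):
    """Return the gateway for the longest matching domain suffix, or None."""
    _, _, host = address.rpartition("@")
    labels = host.split(".")
    # Try "a.bli.UUCP", then "bli.UUCP", then "UUCP".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in gateways:
            return gateways[suffix]
    return None


# Hypothetical table: only the subdomain gateway is known to the outside.
GATEWAYS = {"bli.UUCP": "bli"}
```

With this table,
mail for a host added later inside BLI
is still sent to the
.q bli
gateway,
so the table itself need not change.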
846*11077Seric.pp
847*11077SericNor need there be only one domain gateway.
848*11077SericA domain such as BTL,
849*11077Sericfor instance,
850*11077Sericmight have a dozen gateways to the outside world;
851*11077Serica non-BTL site
852*11077Sericcould choose the one that was closest.
853*11077SericThe only restriction
854*11077Sericwould be that all gateways
855*11077Sericmaintain a consistent view of the domain
856*11077Sericthat they represent.
857*11077Seric.sh 2 "Logical Structure"
858*11077Seric.pp
859*11077SericLogically,
860*11077Sericdomains are organized into a tree.
861*11077SericThere need not be a host actually associated
862*11077Sericwith each level in the tree \*-
863*11077Sericfor example,
864*11077Sericthere will be no host associated with the name
865*11077Seric.q UUCP.
866*11077SericSimilarly,
867*11077Serican organization might group names together for administrative reasons;
868*11077Sericfor example,
869*11077Sericthe name
870*11077Seric.(l
871*11077SericCAD.research.BigCorp.UUCP
872*11077Seric.)l
873*11077Sericmight not actually have a host representing
874*11077Seric.q research.
875*11077Seric.pp
876*11077SericHowever,
877*11077Sericit may frequently be convenient to have a host
878*11077Sericor hosts
879*11077Sericthat
880*11077Seric.q represent
881*11077Serica domain.
882*11077SericFor example,
883*11077Sericif a single host exists that
884*11077Sericrepresents
885*11077SericBerkeley,
886*11077Sericthen hosts outside Berkeley
887*11077Sericcan forward mail to that host
888*11077Sericfor further resolution
889*11077Sericwithout knowing Berkeley's
890*11077Seric(rather volatile)
891*11077Serictopology.
892*11077SericThis is not unlike the operation
893*11077Sericof the telephone network.
894*11077Seric.pp
895*11077SericThis may also be useful
896*11077Sericinside certain large domains.
897*11077SericFor example,
898*11077Sericat Berkeley it may be presumed
899*11077Sericthat most hosts know about other hosts
900*11077Sericinside the Berkeley domain.
901*11077SericBut if they process an address
902*11077Sericthat is unknown,
903*11077Sericthey can pass it
904*11077Seric.q upstairs
905*11077Sericfor further examination.
906*11077SericThus as new hosts are added
907*11077Sericonly one host
908*11077Seric(the domain master)
909*11077Seric.i must
910*11077Sericbe updated immediately;
911*11077Sericother hosts can be updated as convenient.
912*11077Seric.pp
913*11077SericIdeally this name resolution
914*11077Sericwould be performed by a name server
915*11077Seric(e.g., [Su82b])
916*11077Sericto avoid unnecessary copying
917*11077Sericof the message.
918*11077SericHowever,
919*11077Sericin a batch network
920*11077Sericsuch as UUCP
921*11077Sericthis could result in unnecessary delays.
922*11077Seric.sh 1 "COMPARISON WITH DELIVERMAIL"
923*11077Seric.pp
924*11077Seric.i Sendmail
925*11077Sericis an outgrowth of
926*11077Seric.i delivermail .
927*11077SericThe primary differences are:
928*11077Seric.np
929*11077SericConfiguration information is not compiled in.
930*11077SericThis change simplifies many of the problems
931*11077Sericof moving to other machines.
932*11077SericIt also allows easy debugging of new mailers.
933*11077Seric.np
934*11077SericAddress parsing is more flexible.
935*11077SericFor example,
936*11077Seric.i delivermail
937*11077Sericonly supported one gateway to any network,
938*11077Sericwhereas
939*11077Seric.i sendmail
940*11077Sericcan be sensitive to host names
941*11077Sericand reroute to different gateways.
942*11077Seric.np
943*11077SericForwarding and
944*11077Seric:include:
945*11077Sericfeatures eliminate the requirement that the system alias file
946*11077Sericbe writable by any user
947*11077Seric(or that an update program be written,
948*11077Sericor that the system administration make all changes).
949*11077Seric.np
950*11077Seric.i Sendmail
951*11077Sericsupports message batching across networks
952*11077Sericwhen a message is being sent to multiple recipients.
953*11077Seric.np
954*11077SericA mail queue is provided in
955*11077Seric.i sendmail .
956*11077SericMail that cannot be delivered immediately
957*11077Sericbut can potentially be delivered later
958*11077Sericis stored in this queue for a later retry.
959*11077SericThe queue also provides a buffer against system crashes;
960*11077Sericafter the message has been collected
961*11077Sericit may be reliably redelivered
962*11077Sericeven if the system crashes during the initial delivery.
963*11077Seric.np
964*11077Seric.i Sendmail
965*11077Sericuses the networking support provided by 4.2BSD
966*11077Sericto provide a direct interface to networks such as the ARPANET
967*11077Sericand/or Ethernet
968*11077Sericusing SMTP (the Simple Mail Transfer Protocol)
969*11077Sericover a TCP/IP connection.
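.pp
The retry behavior of the mail queue mentioned in the list above
can be illustrated with a short Python sketch.
It assumes a delivery function that raises a
.q TemporaryFailure
exception when a later attempt might succeed;
that exception,
the function name,
and the in-memory queue are all assumptions.
A queue that is to survive system crashes
would have to be kept on disk,
which this sketch does not attempt.

```python
from collections import deque


class TemporaryFailure(Exception):
    """Raised by a delivery function when a later retry might succeed."""


def run_queue(queue, deliver):
    """Attempt each queued message once; keep those that fail temporarily."""
    for _ in range(len(queue)):
        message = queue.popleft()
        try:
            deliver(message)
        except TemporaryFailure:
            # Still undelivered: put it back for a later queue run.
            queue.append(message)
    return queue
```

Each call makes one pass over the queue,
so a message that keeps failing is retried once per run
rather than in a tight loop.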
970*11077Seric.+c
971*11077Seric.ce
972*11077SericREFERENCES
973*11077Seric.nr ii 1.5i
974*11077Seric.ip [Crocker77]
975*11077SericCrocker, D. H.,
976*11077SericVittal, J. J.,
977*11077SericPogran, K. T.,
978*11077Sericand
979*11077SericHenderson, D. A. Jr.,
980*11077Seric.ul
981*11077SericStandard for the Format of ARPA Network Text Messages.
982*11077SericRFC 733,
983*11077SericNIC 41952.
984*11077SericIn [Feinler78].
985*11077SericNovember 1977.
986*11077Seric.ip [Crocker82]
987*11077SericCrocker, D. H.,
988*11077Seric.ul
989*11077SericStandard for the Format of ARPA Internet Text Messages.
990*11077SericRFC 822.
991*11077SericNetwork Information Center,
992*11077SericSRI International,
993*11077SericMenlo Park, California.
994*11077SericAugust 1982.
995*11077Seric.ip [Feinler78]
996*11077SericFeinler, E.,
997*11077Sericand
998*11077SericPostel, J.
999*11077Seric(eds.),
1000*11077Seric.ul
1001*11077SericARPANET Protocol Handbook.
1002*11077SericNIC 7104,
1003*11077SericNetwork Information Center,
1004*11077SericSRI International,
1005*11077SericMenlo Park, California.
1006*11077Seric1978.
1007*11077Seric.ip [Nowitz78]
1008*11077SericNowitz, D. A.,
1009*11077Sericand
1010*11077SericLesk, M. E.,
1011*11077Seric.ul
1012*11077SericA Dial-Up Network of UNIX Systems.
1013*11077SericBell Laboratories.
1014*11077SericIn
1015*11077SericUNIX Programmer's Manual, Seventh Edition,
1016*11077SericVolume 2.
1017*11077SericAugust 1978.
1018*11077Seric.ip [Schmidt79]
1019*11077SericSchmidt, E.,
1020*11077Seric.ul
1021*11077SericAn Introduction to the Berkeley Network.
1022*11077SericUniversity of California, Berkeley.
1023*11077Seric1979.
1024*11077Seric.ip [Shoens79]
1025*11077SericShoens, K.,
1026*11077Seric.ul
1027*11077SericMail Reference Manual.
1028*11077SericUniversity of California, Berkeley.
1029*11077SericIn UNIX Programmer's Manual,
1030*11077SericSeventh Edition,
1031*11077SericVolume 2C.
1032*11077SericDecember 1979.
1033*11077Seric.ip [Solomon81]
1034*11077SericSolomon, M.,
1035*11077SericLandweber, L.,
1036*11077Sericand
1037*11077SericNeuhengen, D.,
1038*11077Seric.ul
1039*11077SericThe Design of the CSNET Name Server.
1040*11077SericCS-DN-2.
1041*11077SericUniversity of Wisconsin,
1042*11077SericMadison.
1043*11077SericOctober 1981.
1044*11077Seric.ip [Su82a]
1045*11077SericSu, Zaw-Sing,
1046*11077Sericand
1047*11077SericPostel, Jon,
1048*11077Seric.ul
1049*11077SericThe Domain Naming Convention for Internet User Applications.
1050*11077SericRFC 819.
1051*11077SericNetwork Information Center,
1052*11077SericSRI International,
1053*11077SericMenlo Park, California.
1054*11077SericAugust 1982.
1055*11077Seric.ip [Su82b]
1056*11077SericSu, Zaw-Sing,
1057*11077Seric.ul
1058*11077SericA Distributed System for Internet Name Service.
1059*11077SericRFC 830.
1060*11077SericNetwork Information Center,
1061*11077SericSRI International,
1062*11077SericMenlo Park, California.
1063*11077SericOctober 1982.