.nr si 3n
.he 'Mail Systems and Addressing in 4.2bsd''%'
.fo 'Version 8.2'USENIX \- Jan 83'Last Mod 11/27/93'
.if n .ls 2
.+c
.(l C
.sz 14
Mail Systems and Addressing
in 4.2bsd
.sz
.sp
Eric Allman*
.sp 0.5
.i
Britton-Lee, Inc.
1919 Addison Street, Suite 105.
Berkeley, California 94704.
.sp 0.5
.r
eric@Berkeley.ARPA
ucbvax!eric
.)l
.sp
.(l F
.ce
ABSTRACT
.sp \n(psu
Routing mail through a heterogeneous internet presents many new
problems.
Among the worst of these is that of address mapping.
Historically, this has been handled on an ad hoc basis.
However,
this approach has become unmanageable as internets grow.
.sp \n(psu
Sendmail acts as a unified
.q "post office"
to which all mail can be
submitted.
Address interpretation is controlled by a production
system,
which can parse both old and new format addresses.
The
new format is
.q "domain-based,"
a flexible technique that can
handle many common situations.
Sendmail is not intended to perform
user interface functions.
.sp \n(psu
Sendmail will replace delivermail in the Berkeley 4.2 distribution.
Several major hosts are now or will soon be running sendmail.
This change will affect any users who route mail through a sendmail
gateway.
The changes that will be user visible are emphasized.
.)l
.sp 2
.(f
*A considerable part of this work
was done while under the employ
of the INGRES Project
at the University of California at Berkeley.
.)f
.pp
The mail system to appear in 4.2bsd
will contain a number of changes.
Most of these changes are based on the replacement of
.i delivermail
with a new module called
.i sendmail .
.i Sendmail
implements a general internetwork mail routing facility,
featuring aliasing and forwarding,
automatic routing to network gateways,
and flexible configuration.
Of key interest to the mail system user
will be the changes in the network addressing structure.
.pp
In a simple network,
each node has an address,
and resources can be identified
with a host-resource pair;
in particular,
the mail system can refer to users
using a host-username pair.
Host names and numbers have to be administered by a central authority,
but usernames can be assigned locally to each host.
.pp
In an internet,
multiple networks with different characteristics
and managements
must communicate.
In particular,
the syntax and semantics of resource identification change.
Certain special cases can be handled trivially
by
.i "ad hoc"
techniques,
such as
providing network names that appear local to hosts
on other networks,
as with the Ethernet at Xerox PARC.
However, the general case is extremely complex.
For example,
some networks require that the route the message takes
be explicitly specified by the sender,
simplifying the database update problem
since only adjacent hosts must be entered
into the system tables,
while others use logical addressing,
where the sender specifies the location of the recipient
but not how to get there.
Some networks use a left-associative syntax
and others use a right-associative syntax,
causing ambiguity in mixed addresses.
.pp
Internet standards seek to eliminate these problems.
Initially, these proposed expanding the address pairs
to
{network, host, username}
triples.
Network numbers must be universally agreed upon,
and hosts can be assigned locally
on each network.
The user-level presentation was changed
to address domains,
composed of a local resource identification
and a hierarchical domain specification
with a common static root.
The domain technique
separates the issues of physical and logical addressing.
For example,
an address of the form
.q "eric@a.cc.berkeley.arpa"
describes the logical
organization of the address space
(user
.q eric
on host
.q a
in the Computer Center
at Berkeley)
but not the physical networks used
(for example, this could go over different networks
depending on whether
.q a
were on an ethernet
or a store-and-forward network).
.pp
.i Sendmail
is intended to help bridge the gap
between the totally
.i "ad hoc"
world
of networks that know nothing of each other
and the clean, tightly-coupled world
of unique network numbers.
It can accept old arbitrary address syntaxes,
resolving ambiguities using heuristics
specified by the system administrator,
as well as domain-based addressing.
It helps guide the conversion of message formats
between disparate networks.
In short,
.i sendmail
is designed to assist a graceful transition
to consistent internetwork addressing schemes.
.sp
.pp
Section 1 defines some of the terms
frequently left fuzzy
when working in mail systems.
Section 2 discusses the design goals for
.i sendmail .
In section 3,
the new address formats
and basic features of
.i sendmail
are described.
Section 4 discusses some of the special problems
of the UUCP network.
The differences between
.i sendmail
and
.i delivermail
are presented in section 5.
.sp
.(l F
.b DISCLAIMER:
A number of examples
in this paper
use names of actual people
and organizations.
This is not intended
to imply a commitment
or even an intellectual agreement
on the part of these people or organizations.
In particular,
Bell Telephone Laboratories (BTL),
Digital Equipment Corporation (DEC),
Lawrence Berkeley Laboratories (LBL),
Britton-Lee Incorporated (BLI),
and the University of California at Berkeley
are not committed to any of these proposals at this time.
Much of this paper
represents no more than
the personal opinions of the author.
.)l
.sh 1 "DEFINITIONS"
.pp
There are four basic concepts
that must be clearly distinguished
when dealing with mail systems:
the user (or the user's agent),
the user's identification,
the user's address,
and the route.
These are distinguished primarily by their position independence.
.sh 2 "User and Identification"
.pp
The user is the being
(a person or program)
that is creating or receiving a message.
An
.i agent
is an entity operating on behalf of the user \*-
such as a secretary who handles my mail,
or a program that automatically returns a
message such as
.q "I am at the UNICOM conference."
.pp
The identification is the tag
that goes along with the particular user.
This tag is completely independent of location.
For example,
my identification is the string
.q "Eric Allman,"
and this identification does not change
whether I am located at U.C. Berkeley,
at Britton-Lee,
or at a scientific institute in Austria.
.pp
Since the identification is frequently ambiguous
(e.g., there are two
.q "Robert Henry" s
at Berkeley)
it is common to add other disambiguating information
that is not strictly part of the identification
(e.g.,
Robert
.q "Code Generator"
Henry
versus
Robert
.q "System Administrator"
Henry).
.sh 2 "Address"
.pp
The address specifies a location.
As I move around,
my address changes.
For example,
my address might change from
.q eric@Berkeley.ARPA
to
.q eric@bli.UUCP
or
.q allman@IIASA.Austria
depending on my current affiliation.
.pp
However,
an address is independent of the location of anyone else.
That is,
my address remains the same to everyone who might be sending me mail.
For example,
a person at MIT and a person at USC
could both send to
.q eric@Berkeley.ARPA
and have it arrive at the same mailbox.
.pp
Ideally a
.q "white pages"
service would be provided to map user identifications
into addresses
(for example, see
[Solomon81]).
Currently this is handled by passing around
scraps of paper
or by calling people on the telephone
to find out their address.
.sh 2 "Route"
.pp
While an address specifies
.i where
to find a mailbox,
a route specifies
.i how
to find the mailbox.
Specifically,
it specifies a path
from sender to receiver.
As such, the route is potentially different
for every pair of people in the electronic universe.
.pp
Normally the route is hidden from the user
by the software.
However,
some networks put the burden of determining the route
onto the sender.
Although this simplifies the software,
it also greatly impairs the usability
for most users.
The UUCP network is an example of such a network.
.sh 1 "DESIGN GOALS"
.pp
Design goals for
.i sendmail \**
.(f
\**This section makes no distinction between
.i delivermail
and
.i sendmail .
.)f
include:
.np
Compatibility with the existing mail programs,
including Bell version 6 mail,
Bell version 7 mail,
Berkeley
.i Mail
[Shoens79],
BerkNet mail
[Schmidt79],
and hopefully UUCP mail
[Nowitz78].
ARPANET mail
[Crocker82]
was also required.
.np
Reliability, in the sense of guaranteeing
that every message is correctly delivered
or at least brought to the attention of a human
for correct disposal;
no message should ever be completely lost.
This goal was considered essential
because of the emphasis on mail in our environment.
It has turned out to be one of the hardest goals to satisfy,
especially in the face of the many anomalous message formats
produced by various ARPANET sites.
For example,
certain sites generate improperly formatted addresses,
occasionally
causing error-message loops.
Some hosts use blanks in names,
causing problems with
mail programs that assume that an address
is one word.
The semantics of some fields
are interpreted slightly differently
by different sites.
In summary,
the obscure features of the ARPANET mail protocol
really
.i are
used and
are difficult to support,
but must be supported.
.np
Existing software to do actual delivery
should be used whenever possible.
This goal derives as much from political and practical considerations
as technical.
.np
Easy expansion to
fairly complex environments,
including multiple
connections to a single network type
(such as with multiple UUCP or Ethernet connections).
This goal requires consideration of the contents of an address
as well as its syntax
in order to determine which gateway to use.
.np
Configuration information should not be compiled into the code.
A single compiled program should be able to run as is at any site
(barring such basic changes as the CPU type or the operating system).
We have found this seemingly unimportant goal
to be critical in real life.
Besides the simple problems that occur when any program gets recompiled
in a different environment,
many sites like to
.q fiddle
with anything that they will be recompiling anyway.
.np
.i Sendmail
must be able to let various groups maintain their own mailing lists,
and let individuals specify their own forwarding,
without modifying the system alias file.
.np
Each user should be able to specify which mailer to execute
to process mail being delivered for him.
This feature allows users who are using specialized mailers
that use a different format to build their environment
without changing the system,
and facilitates specialized functions
(such as returning an
.q "I am on vacation"
message).
.np
Network traffic should be minimized
by batching addresses to a single host where possible,
without assistance from the user.
.pp
These goals motivated the architecture illustrated in figure 1.
.(z
.hl
.ie t \
.	sp 18
.el \{\
.(c
+---------+   +---------+   +---------+
| sender1 |   | sender2 |   | sender3 |
+---------+   +---------+   +---------+
     |  	   |             |
     +----------+  +  +----------+
		|  |  |
		v  v  v
            +-------------+
            |   sendmail  |
            +-------------+
		|  |  |
     +----------+  +  +----------+
     |  	   |             |
     v             v             v
+---------+   +---------+   +---------+
| mailer1 |   | mailer2 |   | mailer3 |
+---------+   +---------+   +---------+
.)c
.\}

.ce
Figure 1 \*- Sendmail System Structure.
.hl
.)z
The user interacts with a mail generating and sending program.
When the mail is created,
the generator calls
.i sendmail ,
which routes the message to the correct mailer(s).
Since some of the senders may be network servers
and some of the mailers may be network clients,
.i sendmail
may be used as an internet mail gateway.
.sh 1 "USAGE"
.sh 2 "Address Formats"
.pp
Arguments may be flags or addresses.
Flags set various processing options.
Following flag arguments,
address arguments may be given.
Addresses follow the syntax in RFC822
[Crocker82]
for ARPANET
address formats.
In brief, the format is:
.np
Anything in parentheses is thrown away
(as a comment).
.np
Anything in angle brackets (\c
.q "<\|>" )
is preferred
over anything else.
This rule implements the ARPANET standard that addresses of the form
.(b
user name <machine-address>
.)b
will send to the electronic
.q machine-address
rather than the human
.q "user name."
.np
Double quotes
(\ "\ )
quote phrases;
backslashes quote characters.
Backslashes are more powerful
in that they will cause otherwise equivalent phrases
to compare differently \*- for example,
.i user
and
.i
"user"
.r
are equivalent,
but
.i \euser
is different from either of them.
This might be used
to avoid normal aliasing
or duplicate suppression algorithms.
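.pp
As a rough illustration of the first two rules,
the following Python sketch discards parenthesized comments
and then prefers the contents of angle brackets.
It is not taken from
.i sendmail ;
the function names
.i strip_comments
and
.i preferred_address
are hypothetical,
and the sketch assumes that no parentheses or angle brackets
appear inside quoted phrases
(quoting with double quotes and backslashes is not handled).
.(l
```python
def strip_comments(text):
    """Drop anything inside (possibly nested) parentheses."""
    out = []
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unbalanced parenthesis")
            depth -= 1
        elif depth == 0:
            out.append(ch)
    if depth != 0:
        raise ValueError("unbalanced parenthesis")
    return "".join(out)


def preferred_address(text):
    """Return the angle-bracket part if present, else the whole string."""
    text = strip_comments(text)
    start = text.find("<")
    if start >= 0:
        end = text.find(">", start)
        if end < 0:
            raise ValueError("unbalanced angle bracket")
        return text[start + 1:end].strip()
    # No angle brackets: keep what is left, with whitespace normalized.
    return " ".join(text.split())
```
.)l
Under these assumptions,
both
.q "Eric Allman <eric@Berkeley.ARPA>"
and
.q "eric@Berkeley.ARPA (Eric Allman)"
reduce to the same machine address.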
.pp
Parentheses, angle brackets, and double quotes
must be properly balanced and nested.
The rewriting rules control remaining parsing\**.
.(f
\**Disclaimer: Some special processing is done
after rewriting local names; see below.
.)f
.pp
Although old style addresses are still accepted
in most cases,
the preferred address format
is based on ARPANET-style domain-based addresses
[Su82a].
These addresses are based on a hierarchical, logical decomposition
of the address space.
The addresses are hierarchical in a sense
similar to the U.S. postal addresses:
the messages may first be routed to the correct state,
with no initial consideration of the city
or other addressing details.
The addresses are logical
in that each step in the hierarchy
corresponds to a set of
.q "naming authorities"
rather than a physical network.
.pp
For example,
the address:
.(l
eric@HostA.BigSite.ARPA
.)l
would first look up the domain
BigSite
in the namespace administered by
ARPA.
A query could then be sent to
BigSite
for interpretation of
HostA.
Eventually the mail would arrive at
HostA,
which would then do final delivery
to user
.q eric.
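.pp
The order of interpretation in this example
can be sketched in a few lines of Python.
The function name
.i resolution_order
is hypothetical;
the sketch only splits the string
and performs no lookups of any kind.
.(l
```python
def resolution_order(address):
    """Split user@a.b.c into the user and the domains, root first."""
    user, sep, domain = address.rpartition("@")
    if not sep or not user or not domain:
        raise ValueError("expected user@domain")
    labels = domain.split(".")
    # The rightmost label is the root of the hierarchy,
    # so it is the first one to be interpreted.
    return user, list(reversed(labels))
```
.)l
For the address above,
the domains come out in the order
ARPA, BigSite, HostA,
which is the order in which the naming authorities
would be consulted.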
.sh 2 "Mail to Files and Programs"
.pp
Files and programs are legitimate message recipients.
Files provide archival storage of messages,
useful for project administration and history.
Programs are useful as recipients in a variety of situations,
for example,
to maintain a public repository of systems messages
(such as the Berkeley
.i msgs
program).
.pp
Any address passing through the initial parsing algorithm
as a local address
(i.e., not appearing to be a valid address for another mailer)
is scanned for two special cases.
If prefixed by a vertical bar (\c
.q \^|\^ )
the rest of the address is processed as a shell command.
If the user name begins with a slash mark (\c
.q /\^ )
the name is used as a file name,
instead of a login name.
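.pp
The two special cases amount to a test on the first character
of the local name.
The Python sketch below is illustrative only;
.i classify_local
is a hypothetical name,
and the sketch merely labels the recipient
without running a command or writing a file.
.(l
```python
def classify_local(name):
    """Label a local recipient as a program, a file, or a login name."""
    if name.startswith("|"):
        # The rest of the address is a shell command.
        return ("program", name[1:].strip())
    if name.startswith("/"):
        # The name is used as a file name.
        return ("file", name)
    return ("user", name)
```
.)l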
.sh 2 "Aliasing, Forwarding, Inclusion"
.pp
.i Sendmail
reroutes mail three ways.
Aliasing applies system wide.
Forwarding allows each user to reroute incoming mail
destined for that account.
Inclusion directs
.i sendmail
to read a file for a list of addresses,
and is normally used
in conjunction with aliasing.
.sh 3 "Aliasing"
.pp
Aliasing maps local addresses to address lists using a system-wide file.
This file is hashed to speed access.
Only addresses that parse as local
are allowed as aliases;
this guarantees a unique key
(since there are no nicknames for the local host).
.sh 3 "Forwarding"
.pp
After aliasing,
if a recipient address specifies a local user,
.i sendmail
searches for a
.q .forward
file in the recipient's home directory.
If it exists,
the message is
.i not
sent to that user,
but rather to the list of addresses in that file.
Often
this list will contain only one address,
and the feature will be used for network mail forwarding.
.pp
Forwarding also permits a user to specify a private incoming mailer.
For example,
forwarding to:
.(b
"\^|\|/usr/local/newmail myname"
.)b
will use a different incoming mailer.
.sh 3 "Inclusion"
.pp
Inclusion is specified in RFC 733 [Crocker77] syntax:
.(b
:Include: pathname
.)b
An address of this form reads the file specified by
.i pathname
and sends to all users listed in that file.
.pp
The intent is
.i not
to support direct use of this feature,
but rather to use this as a subset of aliasing.
For example,
an alias of the form:
.(b
project: :include:/usr/project/userlist
.)b
is a method of letting a project maintain a mailing list
without interaction with the system administration,
even if the alias file is protected.
.pp
It is not necessary to rebuild the index on the alias database
when a :include: list is changed.
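.pp
The interaction of aliasing and inclusion
can be sketched as a recursive expansion.
The Python below is a minimal illustration under stated assumptions,
not the algorithm used by
.i sendmail :
the alias table is an ordinary dictionary rather than a hashed file,
the caller supplies a function
.i read_list
that returns the addresses in a list file,
the
.q :include:
keyword is matched without regard to case,
and a set of names already seen
stands in for duplicate suppression.
All of these names are hypothetical.
.(l
```python
INCLUDE = ":include:"


def expand(name, aliases, read_list, seen=None):
    """Expand one local name into a flat, duplicate-free recipient list."""
    if seen is None:
        seen = set()
    if name in seen:
        return []        # already expanded; this also stops alias loops
    seen.add(name)
    if name.lower().startswith(INCLUDE):
        # Inclusion: the addresses come from the named list file.
        targets = read_list(name[len(INCLUDE):].strip())
    elif name in aliases:
        # Aliasing: the addresses come from the system-wide table.
        targets = aliases[name]
    else:
        return [name]    # an ordinary recipient
    result = []
    for target in targets:
        result.extend(expand(target, aliases, read_list, seen))
    return result
```
.)l
Because the list file is read each time the alias is expanded,
a change to the list takes effect
without touching the alias table,
which mirrors the point made above about the alias database index.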
.sh 2 "Message Collection"
.pp
Once all recipient addresses are parsed and verified,
the message is collected.
The message comes in two parts:
a message header and a message body,
separated by a blank line.
The body is an uninterpreted
sequence of text lines.
.pp
The header is formatted as a series of lines
of the form
.(b
	field-name: field-value
.)b
Field-value can be split across lines by starting the following
lines with a space or a tab.
Some header fields have special internal meaning,
and have appropriate special processing.
Other headers are simply passed through.
Some header fields may be added automatically,
such as time stamps.
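.pp
The header layout just described
can be illustrated with a short Python sketch.
It assumes the message has already been broken into lines
with the line terminators removed;
the name
.i split_message
is hypothetical,
and no field is given special processing.
.(l
```python
def split_message(lines):
    """Split message lines into (header fields, body lines)."""
    headers = []
    body_start = len(lines)
    for index, line in enumerate(lines):
        if line == "":
            # The first blank line separates header from body.
            body_start = index + 1
            break
        if line[:1].isspace() and headers:
            # A leading space or tab continues the previous field.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError("malformed header line: " + line)
            headers.append((name.strip(), value.strip()))
    return headers, lines[body_start:]
```
.)l
The body lines are returned untouched,
in keeping with the statement that the body is uninterpreted.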
.sh 1 "THE UUCP PROBLEM"
.pp
Of particular interest
is the UUCP network.
The explicit routing
used in the UUCP environment
causes a number of serious problems.
First,
giving out an address
is impossible
without knowing the address of your potential correspondent.
This is typically handled
by specifying the address
relative to some
.q "well-known"
host
(e.g.,
ucbvax or decvax).
Second,
it is often difficult to compute
the set of addresses
to reply to
without some knowledge
of the topology of the network.
Although it may be easy for a human being
to do this
under many circumstances,
a program does not have equally sophisticated heuristics
built in.
Third,
certain addresses will become painfully and unnecessarily long,
as when a message is routed through many hosts in the USENET.
And finally,
certain
.q "mixed domain"
addresses
are impossible to parse unambiguously \*-
e.g.,
.(l
decvax!ucbvax!lbl-h!user@LBL-CSAM
.)l
might have many possible resolutions,
depending on whether the message was first routed
to decvax
or to LBL-CSAM.
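.pp
The ambiguity can be made concrete
by reading the same string two ways.
In the Python sketch below,
the hypothetical function
.i bang_first
gives the exclamation point precedence
and
.i at_first
gives the at sign precedence;
neither is claimed to be what any particular mailer does.
.(l
```python
def bang_first(address):
    """Read the ! route first: the first hop is the leftmost host."""
    first_hop, _, rest = address.partition("!")
    return first_hop, rest


def at_first(address):
    """Read the @ first: the first hop is the host after the last @."""
    rest, _, first_hop = address.rpartition("@")
    return first_hop, rest
```
.)l
Applied to the address above,
the first reading sends the message to decvax
with the remainder still to be interpreted there,
while the second sends it to LBL-CSAM;
nothing in the string itself says which was intended.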
.pp
To solve this problem,
the UUCP syntax
would have to be changed to use addresses
rather than routes.
For example,
the address
.q decvax!ucbvax!eric
might be expressed as
.q eric@ucbvax.UUCP
(with the hop through decvax implied).
This address would itself be a domain-based address;
for example,
an address might be of the form:
.(l
mark@d.cbosg.btl.UUCP
.)l
Hosts outside of Bell Telephone Laboratories
would then only need to know
how to get to a designated BTL relay,
and the BTL topology
would only be maintained inside Bell.
.pp
There are three major problems
associated with turning UUCP addresses
into something reasonable:
defining the namespace,
creating and propagating the necessary software,
and building and maintaining the database.
.sh 2 "Defining the Namespace"
.pp
Putting all UUCP hosts into a flat namespace
(e.g.,
.q \&...@host.UUCP )
is not practical for a number of reasons.
First,
with over 1600 sites already,
and (with the increasing availability of inexpensive microcomputers
and autodialers)
several thousand more coming within a few years,
the database update problem
is simply intractable
if the namespace is flat.
Second,
there are almost certainly name conflicts today.
Third,
as the number of sites grows
the names become ever less mnemonic.
.pp
It seems inevitable
that there be some sort of naming authority
for the set of top level names
in the UUCP domain,
as unpleasant a possibility
as that may seem.
It will simply not be possible
to have one host resolving all names.
It may however be possible
to handle this
in a fashion similar to that of assigning names of newsgroups
in USENET.
However,
it will be essential to encourage everyone
to become subdomains of an existing domain
whenever possible \*-
even though this will certainly bruise some egos.
For example,
if a new host named
.q blid
were to be added to the UUCP network,
it would probably actually be addressed as
.q d.bli.UUCP
(i.e.,
as host
.q d
in the pseudo-domain
.q bli
rather than as host
.q blid
in the UUCP domain).
.sh 2 "Creating and Propagating the Software"
.pp
The software required to implement a consistent namespace
is relatively trivial.
Two modules are needed,
one to handle incoming mail
and one to handle outgoing mail.
.pp
The incoming module
must be prepared to handle either old or new style addresses.
New-style addresses
can be passed through unchanged.
Old style addresses
must be turned into new style addresses
where possible.
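.pp
A deliberately naive Python sketch of the incoming conversion follows.
It assumes that the last host named in an old style route
is the recipient's host
and that this host sits directly in the UUCP domain,
as in the
.q decvax!ucbvax!eric
example given earlier;
a real module would presumably need a database lookup
to place the host in the correct subdomain.
The name
.i to_new_style
is hypothetical.
.(l
```python
def to_new_style(address):
    """Turn host!...!host!user into user@host.UUCP where possible."""
    if "@" in address or "!" not in address:
        # Already new style, or a plain local name: pass through.
        return address
    hops = address.split("!")
    user = hops[-1]
    host = hops[-2]
    if not user or not host:
        raise ValueError("cannot convert: " + address)
    # Earlier hops are dropped; the route to the host is left implied.
    return user + "@" + host + ".UUCP"
```
.)l
Mixed addresses containing both an exclamation point and an at sign
are passed through untouched here,
since the sketch has no basis for choosing between their readings.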
80011077Seric.pp
80111077SericThe outgoing module
80211077Sericis slightly trickier.
80311077SericIt must do a database lookup on the recipient addresses
80411077Seric(passed on the command line)
80511077Sericto determine what hosts to send the message to.
80611077SericIf those hosts do not accept new-style addresses,
80711077Sericit must transform all addresses in the header of the message
80811077Sericinto old style using the database lookup.
.pp
Both of these modules
are straightforward
except for the issue of modifying the header.
It seems prudent to choose one format
for the message headers.
For a number of reasons,
Berkeley has elected to use the ARPANET protocols
for message formats.
However,
this format is somewhat difficult to parse.
.pp
Propagation is somewhat more difficult.
There are a large number of hosts
connected to UUCP
that will want to run completely standard systems
(for very good reasons).
The strategy is not to convert the entire network \*-
only enough of it to alleviate the problem.
.sh 2 "Building and Maintaining the Database"
.pp
This is by far the most difficult problem.
A prototype for this database
already exists,
but it is maintained by hand
and does not pretend to be complete.
.pp
This problem will be reduced considerably
if people choose to group their hosts
into subdomains.
This would require a global update
only when a new top level domain
joined the network.
A message to a host in a subdomain
could simply be routed to a known domain gateway
for further processing.
For example,
the address
.q eric@a.bli.UUCP
might be routed to the
.q bli
gateway
for redistribution;
new hosts could be added
within BLI
without notifying the rest of the world.
Of course,
other hosts
.i could
be notified as an efficiency measure.
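.pp
Routing to a domain gateway
can be illustrated by a lookup that prefers the most specific
domain suffix a host knows about.
The following Python sketch is one way to do this,
not a description of an existing implementation;
the table contents and the name find_gateway are hypothetical.
.(l
```python
def find_gateway(address, gateways):
    """Return the gateway for the most specific known domain suffix.

    gateways maps domain names to gateway hosts (hypothetical data).
    Returns None if no suffix of the address's domain is known.
    """
    domain = address.partition("@")[2]
    labels = domain.split(".") if domain else []
    # For a.bli.UUCP try a.bli.UUCP, then bli.UUCP, then UUCP.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in gateways:
            return gateways[candidate]
    return None
```
.)l
.pp
A host that knows only about bli.UUCP
sends eric@a.bli.UUCP to the bli gateway;
a host that has also been told about a.bli.UUCP
can deliver more directly,
which corresponds to the optional notification mentioned above.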
.pp
There may be more than one domain gateway.
A domain such as BTL,
for instance,
might have a dozen gateways to the outside world;
a non-BTL site
could choose the closest gateway.
The only restriction
would be that all gateways
maintain a consistent view of the domain
they represent.
.sh 2 "Logical Structure"
.pp
Logically,
domains are organized into a tree.
There need not be a host actually associated
with each level in the tree \*-
for example,
there will be no host associated with the name
.q UUCP.
Similarly,
an organization might group names together for administrative reasons;
for example,
the name
.(l
CAD.research.BigCorp.UUCP
.)l
might not actually have a host representing
.q research.
.pp
However,
it may frequently be convenient to have a host
or hosts
that
.q represent
a domain.
For example,
if a single host exists that
represents
Berkeley,
then hosts outside Berkeley
can forward mail to that host
for further resolution
without knowing Berkeley's
(rather volatile)
topology.
This is not unlike the operation
of the telephone network.
.pp
This may also be useful
inside certain large domains.
For example,
at Berkeley it may be presumed
that most hosts know about other hosts
inside the Berkeley domain.
But if they process an address
that is unknown,
they can pass it
.q upstairs
for further examination.
Thus, as new hosts are added,
only one host
(the domain master)
.i must
be updated immediately;
other hosts can be updated as convenient.
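.pp
The fallback to a domain master amounts to a lookup with a default.
The Python sketch below illustrates the idea only;
the function name next_hop
and the host names used with it are hypothetical.
.(l
```python
def next_hop(address, known_hosts, domain_master):
    """Pick where a host inside a domain forwards a message.

    known_hosts maps the domain names this host knows to next hops;
    domain_master is the host consulted for anything else.
    """
    domain = address.partition("@")[2]
    if domain in known_hosts:
        return known_hosts[domain]
    # Unknown address: pass it "upstairs" for further examination.
    return domain_master
```
.)l
.pp
A host whose table is out of date still delivers correctly,
at the cost of one extra hop through the master,
until its own table is updated.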
.pp
Ideally this name resolution process
would be performed by a name server
(e.g., [Su82b])
to avoid unnecessary copying
of the message.
However,
in a batch network
such as UUCP
this could result in unnecessary delays.
.sh 1 "COMPARISON WITH DELIVERMAIL"
.pp
.i Sendmail
is an outgrowth of
.i delivermail .
The primary differences are:
.np
Configuration information is not compiled in.
This change simplifies many of the problems
of moving to other machines.
It also allows easy debugging of new mailers.
.np
Address parsing is more flexible.
For example,
.i delivermail
only supported one gateway to any network,
whereas
.i sendmail
can be sensitive to host names
and reroute to different gateways.
.np
Forwarding and
:include:
features eliminate the requirement that the system alias file
be writable by any user
(or that an update program be written,
or that the system administrator make all changes).
.np
.i Sendmail
supports message batching across networks
when a message is being sent to multiple recipients.
.np
A mail queue is provided in
.i sendmail .
Mail that cannot be delivered immediately
but can potentially be delivered later
is stored in this queue for a later retry.
The queue also provides a buffer against system crashes;
after the message has been collected
it may be reliably redelivered
even if the system crashes during the initial delivery.
.np
.i Sendmail
uses the networking support provided by 4.2BSD
to provide a direct interface to networks such as the ARPANET
and/or Ethernet
using SMTP (the Simple Mail Transfer Protocol)
over a TCP/IP connection.
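.pp
The retry behavior of the mail queue described in the list above
can be sketched as a single pass over the queued messages.
This Python fragment is an illustration under stated assumptions,
not the queue code in sendmail:
the exception TemporaryFailure,
the function run_queue,
and the caller-supplied deliver function are all hypothetical,
and the queue is held in memory,
so the sketch does not show the protection against crashes
that an on-disk queue provides.
.(l
```python
from collections import deque

class TemporaryFailure(Exception):
    """Delivery failed now but might succeed later (hypothetical)."""

def run_queue(queue, deliver):
    """Attempt each queued message once; requeue temporary failures.

    deliver is a caller-supplied function (an assumption of this
    sketch) that raises TemporaryFailure when a retry makes sense.
    Returns the number of messages delivered on this pass.
    """
    delivered = 0
    for _ in range(len(queue)):
        message = queue.popleft()
        try:
            deliver(message)
            delivered += 1
        except TemporaryFailure:
            queue.append(message)  # keep for a later retry
    return delivered
```
.)l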
.+c
.ce
REFERENCES
.nr ii 1.5i
.ip [Crocker77]
Crocker, D. H.,
Vittal, J. J.,
Pogran, K. T.,
and
Henderson, D. A. Jr.,
.ul
Standard for the Format of ARPA Network Text Messages.
RFC 733,
NIC 41952.
In [Feinler78].
November 1977.
.ip [Crocker82]
Crocker, D. H.,
.ul
Standard for the Format of ARPA Internet Text Messages.
RFC 822.
Network Information Center,
SRI International,
Menlo Park, California.
August 1982.
.ip [Feinler78]
Feinler, E.,
and
Postel, J.
(eds.),
.ul
ARPANET Protocol Handbook.
NIC 7104,
Network Information Center,
SRI International,
Menlo Park, California.
1978.
.ip [Nowitz78]
Nowitz, D. A.,
and
Lesk, M. E.,
.ul
A Dial-Up Network of UNIX Systems.
Bell Laboratories.
In
UNIX Programmer's Manual, Seventh Edition,
Volume 2.
August 1978.
.ip [Schmidt79]
Schmidt, E.,
.ul
An Introduction to the Berkeley Network.
University of California, Berkeley, California.
1979.
.ip [Shoens79]
Shoens, K.,
.ul
Mail Reference Manual.
University of California, Berkeley.
In UNIX Programmer's Manual,
Seventh Edition,
Volume 2C.
December 1979.
.ip [Solomon81]
Solomon, M.,
Landweber, L.,
and
Neuhengen, D.,
.ul
The Design of the CSNET Name Server.
CS-DN-2.
University of Wisconsin,
Madison.
October 1981.
.ip [Su82a]
Su, Zaw-Sing,
and
Postel, Jon,
.ul
The Domain Naming Convention for Internet User Applications.
RFC 819.
Network Information Center,
SRI International,
Menlo Park, California.
August 1982.
.ip [Su82b]
Su, Zaw-Sing,
.ul
A Distributed System for Internet Name Service.
RFC 830.
Network Information Center,
SRI International,
Menlo Park, California.
October 1982.