.nr DR 1	\" this is a draft copy
.nr si 3n
.he 'Mail Systems and Addressing in 4.2bsd''%'
.fo 'Version 1.1'DRAFT'Last Mod 02/16/83'
.if n .ls 2
.+c
.(l C
.sz 14
Mail Systems and Addressing
in 4.2bsd
.sz
.sp
Eric Allman\(dg
.sp 0.5
.i
Britton-Lee, Inc.
1919 Addison Street, Suite 105.
Berkeley, California 94704.
.sp 0.5
.r
eric@Berkeley.ARPA
ucbvax!eric
.)l
.sp
.(l F
.ce
ABSTRACT
.sp \n(psu
Routing mail through a heterogeneous internet presents many new
problems.
Among the worst of these is that of address mapping.
Historically, this has been handled on an ad hoc basis.
However,
this approach has become unmanageable as internets grow.
.sp \n(psu
Sendmail acts as a unified
.q "post office"
to which all mail can be
submitted.
Address interpretation is controlled by a production
system,
which can parse both old and new format addresses.
The
new format is
.q "domain-based,"
a flexible technique that can
handle many common situations.
Sendmail is not intended to perform
user interface functions.
.sp \n(psu
Sendmail will replace delivermail in the Berkeley 4.2 distribution.
Several major hosts are now or will soon be running sendmail.
This change will affect any users that route mail through a sendmail
gateway.
The changes that will be user visible are emphasized.
.)l
.sp 2
.(f
\(dgA considerable part of this work
was done while under the employ
of the INGRES Project
at the University of California at Berkeley.
.)f
.pp
The mail system to appear in 4.2bsd
will contain a number of changes.
Most of these changes are based on the replacement of
.i delivermail
with a new module called
.i sendmail.
.i Sendmail
implements a general internetwork mail routing facility,
featuring aliasing and forwarding,
automatic routing to network gateways,
and flexible configuration.
Of key interest to the mail system user
will be the changes in the network addressing structure.
.pp
In a simple network,
each node has an address,
and resources can be identified
with a host-resource pair;
in particular,
the mail system can refer to users
using a host-username pair.
Host names and numbers have to be administered by a central authority,
but usernames can be assigned locally to each host.
.pp
In an internet,
multiple networks with different characteristics
and managements
must communicate.
In particular,
the syntax and semantics of resource identification change.
Certain special cases can be handled trivially
by
.i "ad hoc"
techniques,
such as
providing network names that appear local to hosts
on other networks,
as with the Ethernet at Xerox PARC.
However, the general case is extremely complex.
For example,
some networks require point-to-point routing,
which simplifies the database update problem
since only adjacent hosts must be entered
into the system tables,
while others use end-to-end addressing.
Some networks use a left-associative syntax
and others use a right-associative syntax,
causing ambiguity in mixed addresses.
.pp
Internet standards seek to eliminate these problems.
Initially, these proposed expanding the address pairs
into address triples
of the form
{network, host, username}.
Network numbers must be universally agreed upon,
and hosts can be assigned locally
on each network.
The user-level presentation was changed
to address domains,
composed of a local resource identification
and a hierarchical domain specification
with a common static root.
The domain technique
separates the issue of physical versus logical addressing.
For example,
an address of the form
.q "eric@a.cc.berkeley.arpa"
describes only the logical
organization of the address space.
.pp
.i Sendmail
is intended to help bridge the gap
between the totally
.i "ad hoc"
world
of networks that know nothing of each other
and the clean, tightly-coupled world
of unique network numbers.
It can accept old arbitrary address syntaxes,
resolving ambiguities using heuristics
specified by the system administrator,
as well as domain-based addressing.
It helps guide the conversion of message formats
between disparate networks.
In short,
.i sendmail
is designed to assist a graceful transition
to consistent internetwork addressing schemes.
.sp
.pp
Section 1 defines some of the terms
frequently left fuzzy
when working in mail systems.
Section 2 discusses the design goals for
.i sendmail .
In section 3,
the new address formats
and basic features of
.i sendmail
are described.
Section 4 discusses some of the special problems
of the UUCP network.
The differences between
.i sendmail
and
.i delivermail
are presented in section 5.
.sp
.(l F
.b DISCLAIMER:
A number of examples
in this paper
use names of actual people
and organizations.
This is not intended
to imply a commitment
or even an intellectual agreement
on the part of these people or organizations.
In particular,
Bell Telephone Laboratories (BTL),
Digital Equipment Corporation (DEC),
Lawrence Berkeley Laboratories (LBL),
Britton-Lee Incorporated (BLI),
and the University of California at Berkeley
are not committed to any of these proposals at this time.
Much of this paper
represents no more than
the personal opinions of the author.
.)l
.sh 1 "DEFINITIONS"
.pp
There are four basic concepts
that must be clearly distinguished
when dealing with mail systems:
the user (or the user's agent),
the user's identification,
the user's address,
and the route.
These are distinguished primarily by their position independence.
.sh 2 "User and Identification"
.pp
The user is the being
(a person or program)
that is creating or receiving a message.
An
.i agent
is an entity operating on behalf of the user \*-
such as a secretary who handles my mail,
or a program that automatically returns a
message such as
.q "I am at the UNICOM conference."
.pp
The identification is the tag
that goes along with the particular user.
This tag is completely independent of location.
For example,
my identification is the string
.q "Eric Allman,"
and this identification does not change
whether I am located at U.C. Berkeley,
at Britton-Lee,
or at a scientific institute in Austria.
.pp
Since the identification is frequently ambiguous
(e.g., there are two
.q "Robert Henry" s
at Berkeley)
it is common to add other disambiguating information
that is not strictly part of the identification
(e.g.,
Robert
.q "Code Generator"
Henry
versus
Robert
.q "System Administrator"
Henry).
.sh 2 "Address"
.pp
The address specifies a location.
As I move around,
my address changes.
For example,
my address might change from
.q eric@Berkeley.ARPA
to
.q eric@bli.UUCP
or
.q allman@IIASA.Austria
depending on my current affiliation.
.pp
However,
an address is independent of the location of anyone else.
That is,
my address remains the same to everyone who might be sending me mail.
For example,
a person at MIT and a person at USC
could both send to
.q eric@Berkeley.ARPA
and have it arrive at the same mailbox.
.pp
Ideally a
.q "white pages"
service would be provided to map user identifications
into addresses
(for example, see
[Solomon81]).
Currently this is handled by passing around
scraps of paper
or by calling people on the telephone
to find out their address.
.sh 2 "Route"
.pp
Where an address specifies
.i where
to find a mailbox,
a route specifies
.i how
to find the mailbox.
Specifically,
it specifies a path
from sender to receiver.
As such, the route is potentially different
for every pair of people in the electronic universe.
.pp
Normally the route is hidden from the user
by the software.
However,
some networks put the burden of determining the route
onto the sender.
Although this simplifies the software,
it also greatly impairs the usability
for most users.
The UUCP network is an example of such a network.
.sh 1 "DESIGN GOALS"
.pp
Design goals for
.i sendmail \**
.(f
\**This section makes no distinction between
.i delivermail
and
.i sendmail.
.)f
include:
.np
Compatibility with the existing mail programs,
including Bell version 6 mail,
Bell version 7 mail,
Berkeley
.i Mail
[Shoens79],
BerkNet mail
[Schmidt79],
and hopefully UUCP mail
[Nowitz78].
ARPANET mail
[Crocker82]
was also required.
.np
Reliability, in the sense of guaranteeing
that every message is correctly delivered
or at least brought to the attention of a human
for correct disposal;
no message should ever be completely lost.
This goal was considered essential
because of the emphasis on mail in our environment.
It has turned out to be one of the hardest goals to satisfy,
especially in the face of the many anomalous message formats
produced by various ARPANET sites.
For example,
certain sites generate improperly formatted addresses,
occasionally
causing error-message loops.
Some hosts use blanks in names,
causing problems with
mail programs that assume that an address
is one word.
The semantics of some fields
are interpreted slightly differently
by different sites.
In summary,
the obscure features of the ARPANET mail protocol
really
.i are
used and
are difficult to support,
but must be supported.
.np
Existing software to do actual delivery
should be used whenever possible.
This goal derives as much from political and practical considerations
as from technical ones.
.np
Easy expansion to
fairly complex environments,
including multiple
connections to a single network type
(such as with multiple UUCP or Ethernets).
This goal requires consideration of the contents of an address
as well as its syntax
in order to determine which gateway to use.
.np
Configuration should not be compiled into the code.
A single compiled program should be able to run as is at any site
(barring such basic changes as the CPU type or the operating system).
We have found this seemingly unimportant goal
to be critical in real life.
Besides the simple problems that occur when any program gets recompiled
in a different environment,
many sites like to
.q fiddle
with anything that they will be recompiling anyway.
.np
.i Sendmail
must be able to let various groups maintain their own mailing lists,
and let individuals specify their own forwarding,
without modifying the system alias file.
.np
Each user should be able to specify which mailer to execute
to process mail being delivered for him.
This feature allows users who are using specialized mailers
that use a different format to build their environment
without changing the system,
and facilitates specialized functions
(such as returning an
.q "I am on vacation"
message).
.np
Network traffic should be minimized
by batching addresses to a single host where possible,
without assistance from the user.
.pp
These goals motivated the architecture illustrated in figure 1.
.(z
.hl
.ie t \
.	sp 18
.el \{\
.(c
+---------+   +---------+   +---------+
| sender1 |   | sender2 |   | sender3 |
+---------+   +---------+   +---------+
     |             |             |
     +----------+  +  +----------+
                |  |  |
                v  v  v
             +-------------+
             |   sendmail  |
             +-------------+
                |  |  |
     +----------+  +  +----------+
     |             |             |
     v             v             v
+---------+   +---------+   +---------+
| mailer1 |   | mailer2 |   | mailer3 |
+---------+   +---------+   +---------+
.)c
.\}

.ce
Figure 1 \*- Sendmail System Structure.
.hl
.)z
The user interacts with a mail generating and sending program.
When the mail is created,
the generator calls
.i sendmail ,
which routes the message to the correct mailer(s).
Since some of the senders may be network servers
and some of the mailers may be network clients,
.i sendmail
may be used as an internet mail gateway.
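.pp
The structure in figure 1,
together with the goal of batching addresses to a single host,
can be illustrated with a small sketch.
It is written in Python purely for illustration and is not taken from
.i sendmail
itself.
The function names, the three mailer names,
and the crude syntax test that stands in for the rewriting rules
are all assumptions made for this example.
.(l
```python
from collections import defaultdict


def pick_mailer(address):
    """Return a (mailer, host, user) triple for one recipient address.

    Hypothetical classification only: the real system is driven by
    configurable rewriting rules, not by this fixed test.
    """
    if "!" in address:
        # UUCP style: the first hop is the host, the rest is left alone.
        host, _, user = address.partition("!")
        return ("uucp", host, user)
    if "@" in address:
        # ARPANET style: everything after the last "@" is the host.
        user, _, host = address.rpartition("@")
        return ("arpanet", host, user)
    return ("local", "localhost", address)


def route(recipients):
    """Group recipients so that each (mailer, host) pair is invoked once."""
    batches = defaultdict(list)
    for address in recipients:
        mailer, host, user = pick_mailer(address)
        batches[(mailer, host)].append(user)
    return dict(batches)
```
.)l
.pp
Under these assumptions,
two recipients at the same ARPANET host
end up in one batch for one mailer invocation,
which is the traffic reduction that the last goal asks for.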
.sh 1 "USAGE"
.sh 2 "Address Formats"
.pp
Arguments may be flags and addresses.
Flags set various processing options.
Following flag arguments,
address arguments may be given.
Addresses follow the syntax in RFC822
[Crocker82]
for ARPANET
address formats.
In brief, the format is:
.np
Anything in parentheses is thrown away
(as a comment).
.np
Anything in angle brackets (\c
.q "<\|>" )
is preferred
over anything else.
This rule implements the ARPANET standard that addresses of the form
.(b
user name <machine-address>
.)b
will send to the electronic
.q machine-address
rather than the human
.q "user name."
.np
Double quotes
(\ "\ )
quote phrases;
backslashes quote characters.
Backslashes are more powerful
in that they will cause otherwise equivalent phrases
to compare differently \*- for example,
.i user
and
.i
"user"
.r
are equivalent,
but
.i \euser
is different from either of them.
.pp
Parentheses, angle brackets, and double quotes
must be properly balanced and nested.
The rewriting rules control remaining parsing\**.
.(f
\**Disclaimer: Some special processing is done
after rewriting local names; see below.
.)f
.pp
Although old style addresses are still accepted
in most cases,
the preferred address format
is based on ARPANET-style domain-based addresses
[Su82a].
These addresses are based on a hierarchical, logical decomposition
of the address space.
The addresses are hierarchical in a sense
similar to the U.S. postal addresses:
the messages may first be routed to the correct state,
with no initial consideration of the city
or other addressing details.
The addresses are logical
in that each step in the hierarchy
corresponds to a set of
.q "naming authorities"
rather than a physical network.
.pp
For example,
the address:
.(l
eric@HostA.BigSite.ARPA
.)l
would first look up the domain
BigSite
in the namespace administered by
ARPA.
A query could then be sent to
BigSite
for interpretation of
HostA.
Eventually the mail would arrive at
HostA,
which would then do final delivery
to user
.q eric.
.sh 2 "Mail to Files and Programs"
.pp
Files and programs are legitimate message recipients.
Files provide archival storage of messages,
useful for project administration and history.
Programs are useful as recipients in a variety of situations,
for example,
to maintain a public repository of systems messages
(such as the Berkeley
.i msgs
program).
.pp
Any address passing through the initial parsing algorithm
as a local address
(i.e., not appearing to be a valid address for another mailer)
is scanned for two special cases.
If prefixed by a vertical bar (\c
.q \^|\^ )
the rest of the address is processed as a shell command.
If the user name begins with a slash mark (\c
.q /\^ )
the name is used as a file name,
instead of a login name.
.pp
Files that have setuid or setgid bits set
but no execute bits set
have those bits honored if
.i sendmail
is running as root.
.sh 2 "Aliasing, Forwarding, Inclusion"
.pp
.i Sendmail
reroutes mail three ways.
Aliasing applies system-wide.
Forwarding allows each user to reroute incoming mail
destined for that account.
Inclusion directs
.i sendmail
to read a file for a list of addresses,
and is normally used
in conjunction with aliasing.
.sh 3 "Aliasing"
.pp
Aliasing maps names to address lists using a system-wide file.
This file is indexed to speed access.
Only names that parse as local
are allowed as aliases;
this guarantees a unique key
(since there are no nicknames for the local host).
.sh 3 "Forwarding"
.pp
After aliasing,
recipients that are local and valid
are checked for the existence of a
.q .forward
file in their home directory.
If it exists,
the message is
.i not
sent to that user,
but rather to the list of users in that file.
Often
this list will contain only one address,
and the feature will be used for network mail forwarding.
.pp
Forwarding also permits a user to specify a private incoming mailer.
For example,
forwarding to:
.(b
"\^|\|/usr/local/newmail myname"
.)b
will use a different incoming mailer.
.sh 3 "Inclusion"
.pp
Inclusion is specified in RFC 733 [Crocker77] syntax:
.(b
:Include: pathname
.)b
An address of this form reads the file specified by
.i pathname
and sends to all users listed in that file.
.pp
The intent is
.i not
to support direct use of this feature,
but rather to use this as a subset of aliasing.
For example,
an alias of the form:
.(b
project: :include:/usr/project/userlist
.)b
is a method of letting a project maintain a mailing list
without interaction with the system administration,
even if the alias file is protected.
.pp
It is not necessary to rebuild the index on the alias database
when a :include: list is changed.
.sh 2 "Message Collection"
.pp
Once all recipient addresses are parsed and verified,
the message is collected.
The message comes in two parts:
a message header and a message body,
separated by a blank line.
The body is an uninterpreted
sequence of text lines.
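.pp
The division of a collected message into its two parts
can be sketched as follows.
This is a minimal Python illustration,
not the collection code used by
.i sendmail ;
the function name and the decision to treat a message
with no blank line as all header are assumptions of the example.
.(l
```python
def split_message(text):
    """Split a message into (header_lines, body_lines).

    The first blank line separates the two parts and belongs to
    neither.  The body is returned untouched, line by line.
    """
    lines = text.splitlines()
    try:
        blank = lines.index("")
    except ValueError:
        # Assumption: with no separator, everything is header.
        return lines, []
    return lines[:blank], lines[blank + 1:]
```
.)l
.pp
Only the first blank line is significant;
later blank lines are ordinary body text and are preserved.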
.pp
The header is formatted as a series of lines
of the form
.(b
	field-name: field-value
.)b
Field-value can be split across lines by starting the following
lines with a space or a tab.
Some header fields have special internal meaning,
and have appropriate special processing.
Other headers are simply passed through.
Some header fields may be added automatically,
such as time stamps.
.sh 1 "THE UUCP PROBLEM"
.pp
Of particular interest
is the UUCP network.
The explicit routing
used in the UUCP world
causes a number of serious problems.
First,
giving out an address
is impossible
without knowing the address of your potential correspondent.
This is typically handled
by specifying the address
relative to some
.q "well-known"
host
(e.g.,
ucbvax or decvax).
Second,
it is often difficult to compute
the set of addresses
to reply to
without some knowledge
of the topology of the network.
Although it may be easy for a human being
to do this
under many circumstances,
a program does not have equally sophisticated heuristics
built in.
Third,
certain addresses will become painfully and unnecessarily long,
as when a message is routed through many hosts in USENET.
And finally,
certain
.q "mixed domain"
addresses
are impossible to parse unambiguously \*-
e.g.,
.(l
decvax!ucbvax!lbl-h@LBL-CSAM
.)l
might have many possible resolutions,
depending on whether the message was first routed
to decvax
or to LBL-CSAM.
.pp
To solve this problem,
the UUCP syntax
would have to be changed to use addresses
rather than routes.
For example,
the address
.q decvax!ucbvax!eric
might be expressed as
.q eric@ucbvax.UUCP
(with the hop through decvax implied).
This address would itself be a domain-based address;
for example,
an address might be of the form:
.(l
mark@d.cbosg.btl.UUCP
.)l
Hosts outside of Bell Telephone Laboratories
would then only need to know
how to get to a designated BTL relay,
and the BTL topology
would only be maintained inside Bell.
.pp
There are three major problems
associated with turning UUCP addresses
into something reasonable:
defining the namespace,
creating and propagating the necessary software,
and building and maintaining the database.
.sh 2 "Defining the Namespace"
.pp
Making all UUCP hosts
top-level names
is not practical for a number of reasons.
First,
with over 1600 sites already,
and (with the increasing availability of inexpensive microcomputers
and autodialers)
several thousand more coming within a few years,
the database update problem
is simply intractable
if the namespace is flat.
Second,
there are almost certainly name conflicts today.
Third,
as the number of sites grows
the names become ever less mnemonic.
.pp
It seems inevitable
that there be some sort of naming authority
for the set of top level names
in the UUCP domain,
as unpleasant a possibility
as that may seem.
It will simply not be possible
to have one host resolving all names.
It may however be possible
to handle this
in a fashion similar to that of assigning names of newsgroups
in USENET.
However,
it will be essential to encourage everyone
to become subdomains of an existing domain
whenever possible \*-
even though this will certainly bruise some egos.
For example,
if a new host named
.q blid
were to be added to the UUCP network,
it would probably actually be addressed as
.q d.bli.UUCP
(i.e.,
as host
.q d
in the pseudo-domain
.q bli
rather than as host
.q blid
in the UUCP domain).
.sh 2 "Creating and Propagating the Software"
.pp
The software itself
is relatively trivial.
Two modules are needed,
one to handle incoming mail
and one to handle outgoing mail.
.pp
The incoming module
must be prepared to handle either old or new style addresses.
New-style addresses
can be passed through unchanged.
Old-style addresses
must be turned into new-style addresses
where possible.
.pp
The outgoing module
is slightly trickier.
It must do a database lookup on the recipient addresses
(passed on the command line)
to determine what hosts to send the message to.
If those hosts do not accept new-style addresses,
it must transform all addresses in the header of the message
into old style using the database lookup.
.pp
Both of these modules
are straightforward
except for the issue of modifying the header.
It seems prudent to choose one format
for the message headers.
For a number of reasons,
Berkeley has elected to use the ARPANET protocols
for message formats.
However,
this protocol is somewhat difficult to parse.
.pp
Propagation is somewhat more difficult.
There are a large number of hosts
connected to UUCP
that will want to run completely standard systems
(for very good reasons).
The strategy is not to convert the entire network \*-
only enough of it to alleviate the problem.
.sh 2 "Building and Maintaining the Database"
.pp
This is by far the most difficult problem.
818*11077SericA prototype for this database 819*11077Sericalready exists, 820*11077Sericbut it is maintained by hand 821*11077Sericand does not pretend to be complete. 822*11077Seric.pp 823*11077SericThis problem will be reduced considerably 824*11077Sericif people choose to group their hosts 825*11077Sericinto subdomains. 826*11077SericThis would require a global update 827*11077Sericonly when a new top level domain 828*11077Sericjoined the network. 829*11077SericA message to a host in a subdomain 830*11077Sericcould simply be routed to a known domain gateway 831*11077Sericfor further processing. 832*11077SericFor example, 833*11077Sericthe address 834*11077Seric.q eric@a.bli.UUCP 835*11077Sericmight be routed to the 836*11077Seric.q bli 837*11077Sericgateway 838*11077Sericfor redistribution; 839*11077Sericnew hosts could be added 840*11077Sericwithin BLI 841*11077Sericwithout notifying the rest of the world. 842*11077SericOf course, 843*11077Sericother hosts 844*11077Seric.i could 845*11077Sericbe notified as an efficiency measure. 846*11077Seric.pp 847*11077SericNor need there be only one domain gateway. 848*11077SericA domain such as BTL, 849*11077Sericfor instance, 850*11077Sericmight have a dozen gateways to the outside world; 851*11077Serica non-BTL site 852*11077Sericcould choose the one that was closest. 853*11077SericThe only restriction 854*11077Sericwould be that all gateways 855*11077Sericmaintain a consistent view of the domain 856*11077Sericthat they represent. 857*11077Seric.sh 2 "Logical Structure" 858*11077Seric.pp 859*11077SericLogically, 860*11077Sericdomains are organized into a tree. 861*11077SericThere need not be a host actually associated 862*11077Sericwith each level in the tree \*- 863*11077Sericfor example, 864*11077Sericthere will be no host associated with the name 865*11077Seric.q UUCP. 
866*11077SericSimilarly, 867*11077Serican organization might group names together for administrative reasons; 868*11077Sericfor example, 869*11077Sericthe name 870*11077Seric.(l 871*11077SericCAD.research.BigCorp.UUCP 872*11077Seric.)l 873*11077Sericmight not actually have a host representing 874*11077Seric.q research. 875*11077Seric.pp 876*11077SericHowever, 877*11077Sericit may frequently be convenient to have a host 878*11077Sericor hosts 879*11077Sericthat 880*11077Seric.q represent 881*11077Serica domain. 882*11077SericFor example, 883*11077Sericif a single host exists that 884*11077Sericrepresents 885*11077SericBerkeley, 886*11077Sericthen hosts outside Berkeley 887*11077Sericcan forward mail to that host 888*11077Sericfor further resolution 889*11077Sericwithout knowing Berkeley's 890*11077Seric(rather volatile) 891*11077Serictopology. 892*11077SericThis is not unlike the operation 893*11077Sericof the telephone network. 894*11077Seric.pp 895*11077SericThis may also be useful 896*11077Sericinside certain large domains. 897*11077SericFor example, 898*11077Sericat Berkeley it may be presumed 899*11077Sericthat most hosts know about other hosts 900*11077Sericinside the Berkeley domain. 901*11077SericBut if they process an address 902*11077Sericthat is unknown, 903*11077Sericthey can pass it 904*11077Seric.q upstairs 905*11077Sericfor further examination. 906*11077SericThus as new hosts are added 907*11077Sericonly one host 908*11077Seric(the domain master) 909*11077Seric.i must 910*11077Sericbe updated immediately; 911*11077Sericother hosts can be updated as convenient. 912*11077Seric.pp 913*11077SericIdeally this name resolution 914*11077Sericwould be performed by a name server 915*11077Seric(e.g., [Su82b]) 916*11077Sericto avoid unnecessary copying 917*11077Sericof the message. 918*11077SericHowever, 919*11077Sericin a batch network 920*11077Sericsuch as UUCP 921*11077Sericthis could result in unnecessary delays. 
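.pp
The routing rule described in this section
(send to the most specific domain gateway that is known,
otherwise pass the message upstairs)
can be illustrated with a minimal sketch.
It is written in Python purely for illustration and is not taken from any actual mailer.
The function names, the table layout, and the gateway names
bli-gw and domain-master
are all assumptions made for the example.

```python
# Illustrative sketch only: the table format and the host names
# "bli-gw" and "domain-master" are hypothetical.

def split_address(address):
    """Split 'user@a.b.c' into (user, ['a', 'b', 'c']), lowercasing the domain."""
    user, sep, domain = address.rpartition("@")
    if not sep or not user or not domain:
        raise ValueError("not a domain-style address: %r" % (address,))
    return user, domain.lower().split(".")


def next_hop(address, known, parent=None):
    """Return the host to which a message for `address` should be forwarded.

    `known` maps lowercase domain suffixes (e.g. 'bli.uucp') to a gateway
    host. The longest (most specific) matching suffix wins. If no suffix
    is known, the message is passed "upstairs" to `parent`, if one is given.
    """
    _, labels = split_address(address)
    # Try 'a.bli.uucp', then 'bli.uucp', then 'uucp'.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in known:
            return known[suffix]
    if parent is None:
        raise LookupError("no route for %r" % (address,))
    return parent


if __name__ == "__main__":
    known = {"bli.uucp": "bli-gw"}
    print(next_hop("eric@a.bli.UUCP", known))
    print(next_hop("user@CAD.research.BigCorp.UUCP", known,
                   parent="domain-master"))
```

Under these assumptions,
a site that knows only the
bli
gateway needs no update when a host is added inside BLI;
adding a more specific entry to the table
corresponds to notifying a host as an efficiency measure.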
922*11077Seric.sh 1 "COMPARISON WITH DELIVERMAIL" 923*11077Seric.pp 924*11077Seric.i Sendmail 925*11077Sericis an outgrowth of 926*11077Seric.i delivermail . 927*11077SericThe primary differences are: 928*11077Seric.np 929*11077SericConfiguration information is not compiled in. 930*11077SericThis change simplifies many of the problems 931*11077Sericof moving to other machines. 932*11077SericIt also allows easy debugging of new mailers. 933*11077Seric.np 934*11077SericAddress parsing is more flexible. 935*11077SericFor example, 936*11077Seric.i delivermail 937*11077Sericonly supported one gateway to any network, 938*11077Sericwhereas 939*11077Seric.i sendmail 940*11077Sericcan be sensitive to host names 941*11077Sericand reroute to different gateways. 942*11077Seric.np 943*11077SericForwarding and 944*11077Seric:include: 945*11077Sericfeatures eliminate the requirement that the system alias file 946*11077Sericbe writable by any user 947*11077Seric(or that an update program be written, 948*11077Sericor that the system administration make all changes). 949*11077Seric.np 950*11077Seric.i Sendmail 951*11077Sericsupports message batching across networks 952*11077Sericwhen a message is being sent to multiple recipients. 953*11077Seric.np 954*11077SericA mail queue is provided in 955*11077Seric.i sendmail. 956*11077SericMail that cannot be delivered immediately 957*11077Sericbut can potentially be delivered later 958*11077Sericis stored in this queue for a later retry. 959*11077SericThe queue also provides a buffer against system crashes; 960*11077Sericafter the message has been collected 961*11077Sericit may be reliably redelivered 962*11077Sericeven if the system crashes during the initial delivery. 
963*11077Seric.np 964*11077Seric.i Sendmail 965*11077Sericuses the networking support provided by 4.2BSD 966*11077Sericto provide a direct interface to networks such as the ARPANET 967*11077Sericand/or Ethernet 968*11077Sericusing SMTP (the Simple Mail Transfer Protocol) 969*11077Sericover a TCP/IP connection. 970*11077Seric.+c 971*11077Seric.ce 972*11077SericREFERENCES 973*11077Seric.nr ii 1.5i 974*11077Seric.ip [Crocker77] 975*11077SericCrocker, D. H., 976*11077SericVittal, J. J., 977*11077SericPogran, K. T., 978*11077Sericand 979*11077SericHenderson, D. A. Jr., 980*11077Seric.ul 981*11077SericStandard for the Format of ARPA Network Text Messages. 982*11077SericRFC 733, 983*11077SericNIC 41952. 984*11077SericIn [Feinler78]. 985*11077SericNovember 1977. 986*11077Seric.ip [Crocker82] 987*11077SericCrocker, D. H., 988*11077Seric.ul 989*11077SericStandard for the Format of ARPA Internet Text Messages. 990*11077SericRFC 822. 991*11077SericNetwork Information Center, 992*11077SericSRI International, 993*11077SericMenlo Park, California. 994*11077SericAugust 1982. 995*11077Seric.ip [Feinler78] 996*11077SericFeinler, E., 997*11077Sericand 998*11077SericPostel, J. 999*11077Seric(eds.), 1000*11077Seric.ul 1001*11077SericARPANET Protocol Handbook. 1002*11077SericNIC 7104, 1003*11077SericNetwork Information Center, 1004*11077SericSRI International, 1005*11077SericMenlo Park, California. 1006*11077Seric1978. 1007*11077Seric.ip [Nowitz78] 1008*11077SericNowitz, D. A., 1009*11077Sericand 1010*11077SericLesk, M. E., 1011*11077Seric.ul 1012*11077SericA Dial-Up Network of UNIX Systems. 1013*11077SericBell Laboratories. 1014*11077SericIn 1015*11077SericUNIX Programmer's Manual, Seventh Edition, 1016*11077SericVolume 2. 1017*11077SericAugust, 1978. 1018*11077Seric.ip [Schmidt79] 1019*11077SericSchmidt, E., 1020*11077Seric.ul 1021*11077SericAn Introduction to the Berkeley Network. 1022*11077SericUniversity of California, Berkeley, California. 1023*11077Seric1979. 
1024*11077Seric.ip [Shoens79] 1025*11077SericShoens, K., 1026*11077Seric.ul 1027*11077SericMail Reference Manual. 1028*11077SericUniversity of California, Berkeley. 1029*11077SericIn UNIX Programmer's Manual, 1030*11077SericSeventh Edition, 1031*11077SericVolume 2C. 1032*11077SericDecember 1979. 1033*11077Seric.ip [Solomon81] 1034*11077SericSolomon, M., 1035*11077SericLandweber, L., 1036*11077Sericand 1037*11077SericNeuhengen, D., 1038*11077Seric.ul 1039*11077SericThe Design of the CSNET Name Server. 1040*11077SericCS-DN-2. 1041*11077SericUniversity of Wisconsin, 1042*11077SericMadison. 1043*11077SericOctober 1981. 1044*11077Seric.ip [Su82a] 1045*11077SericSu, Zaw-Sing, 1046*11077Sericand 1047*11077SericPostel, Jon, 1048*11077Seric.ul 1049*11077SericThe Domain Naming Convention for Internet User Applications. 1050*11077SericRFC 819. 1051*11077SericNetwork Information Center, 1052*11077SericSRI International, 1053*11077SericMenlo Park, California. 1054*11077SericAugust 1982. 1055*11077Seric.ip [Su82b] 1056*11077SericSu, Zaw-Sing, 1057*11077Seric.ul 1058*11077SericA Distributed System for Internet Name Service. 1059*11077SericRFC 830. 1060*11077SericNetwork Information Center, 1061*11077SericSRI International, 1062*11077SericMenlo Park, California. 1063*11077SericOctober 1982. 1064