1426d2b71SDavid du Colombier.HTML "The Text Editor sam 23e12c5d1SDavid du Colombier.Vx 17 11 November 87 1 32 "ROB PIKE" "THE TEXT EDITOR SAM" 33e12c5d1SDavid du Colombier.ds DY "31 May 1987 43e12c5d1SDavid du Colombier.ds DR "Revised 1 July 1987 53e12c5d1SDavid du Colombier.de CW \" puts first arg in CW font, same as UL; maintains font 63e12c5d1SDavid du Colombier\%\&\\$3\f(CW\\$1\fP\&\\$2 73e12c5d1SDavid du Colombier.. 83e12c5d1SDavid du Colombier.de Cs 93e12c5d1SDavid du Colombier.br 103e12c5d1SDavid du Colombier.fi 113e12c5d1SDavid du Colombier.ft 2 123e12c5d1SDavid du Colombier.ps -2 133e12c5d1SDavid du Colombier.vs -2 143e12c5d1SDavid du Colombier.. 153e12c5d1SDavid du Colombier.de Ce 163e12c5d1SDavid du Colombier.br 173e12c5d1SDavid du Colombier.nf 183e12c5d1SDavid du Colombier.ft 1 193e12c5d1SDavid du Colombier.ps 203e12c5d1SDavid du Colombier.vs 213e12c5d1SDavid du Colombier.sp 223e12c5d1SDavid du Colombier.. 23426d2b71SDavid du Colombier.de XP 24*5d535f58SDavid du Colombier.ie h .html - <center><img src="\\$1.gif" /></center> 25426d2b71SDavid du Colombier.el .BP \\$1.ps \\$2 26426d2b71SDavid du Colombier.. 273e12c5d1SDavid du Colombier.TL 283e12c5d1SDavid du ColombierThe Text Editor \&\f(CWsam\fP 293e12c5d1SDavid du Colombier.AU 30219b2ee8SDavid du ColombierRob Pike 317dd7cddfSDavid du Colombierrob@plan9.bell-labs.com 323e12c5d1SDavid du Colombier.AB 333e12c5d1SDavid du Colombier.LP 343e12c5d1SDavid du Colombier.CW Sam 353e12c5d1SDavid du Colombieris an interactive multi-file text editor intended for 363e12c5d1SDavid du Colombierbitmap displays. 373e12c5d1SDavid du ColombierA textual command language 383e12c5d1SDavid du Colombiersupplements the mouse-driven, cut-and-paste interface 393e12c5d1SDavid du Colombierto make complex or 403e12c5d1SDavid du Colombierrepetitive editing tasks easy to specify. 413e12c5d1SDavid du ColombierThe language is characterized by the composition of regular expressions 423e12c5d1SDavid du Colombierto describe the structure of the text being modified. 433e12c5d1SDavid du ColombierThe treatment of files as a database, with changes logged 443e12c5d1SDavid du Colombieras atomic transactions, guides the implementation and 453e12c5d1SDavid du Colombiermakes a general `undo' mechanism straightforward. 463e12c5d1SDavid du Colombier.PP 473e12c5d1SDavid du Colombier.CW Sam 483e12c5d1SDavid du Colombieris implemented as two processes connected by a low-bandwidth stream, 493e12c5d1SDavid du Colombierone process handling the display and the other the editing 503e12c5d1SDavid du Colombieralgorithms. Therefore it can run with the display process 513e12c5d1SDavid du Colombierin a bitmap terminal and the editor on a local host, 523e12c5d1SDavid du Colombierwith both processes on a bitmap-equipped host, or with 533e12c5d1SDavid du Colombierthe display process in the terminal and the editor in a 543e12c5d1SDavid du Colombierremote host. 553e12c5d1SDavid du ColombierBy suppressing the display process, 563e12c5d1SDavid du Colombierit can even run without a bitmap terminal. 573e12c5d1SDavid du Colombier.PP 583e12c5d1SDavid du ColombierThis paper is reprinted from Software\(emPractice and Experience, 59219b2ee8SDavid du ColombierVol 17, number 11, pp. 813-845, November 1987. 60219b2ee8SDavid du ColombierThe paper has not been updated for the Plan 9 manuals. Although 61219b2ee8SDavid du Colombier.CW Sam 62219b2ee8SDavid du Colombierhas not changed much since the paper was written, the system around it certainly has. 63219b2ee8SDavid du ColombierNonetheless, the description here still stands as the best introduction to the editor. 643e12c5d1SDavid du Colombier.AE 653e12c5d1SDavid du Colombier.SH 663e12c5d1SDavid du ColombierIntroduction 673e12c5d1SDavid du Colombier.LP 683e12c5d1SDavid du Colombier.CW Sam 693e12c5d1SDavid du Colombieris an interactive text editor that combines cut-and-paste interactive editing with 703e12c5d1SDavid du Colombieran unusual command language based on the composition of regular expressions. 71219b2ee8SDavid du ColombierIt is written as two programs: one, the `host part,' runs on a UNIX system 723e12c5d1SDavid du Colombierand implements the command language and provides file access; the other, the 733e12c5d1SDavid du Colombier`terminal part,' runs asynchronously 743e12c5d1SDavid du Colombieron a machine with a mouse and bitmap display 753e12c5d1SDavid du Colombierand supports the display and interactive editing. 763e12c5d1SDavid du ColombierThe host part may be even run in isolation on an ordinary terminal 773e12c5d1SDavid du Colombierto edit text using the command 783e12c5d1SDavid du Colombierlanguage, much like a traditional line editor, 793e12c5d1SDavid du Colombierwithout assistance from a mouse or display. 803e12c5d1SDavid du ColombierMost often, 813e12c5d1SDavid du Colombierthe terminal part runs on a Blit\u\s-4\&1\s+4\d terminal 823e12c5d1SDavid du Colombier(actually on a Teletype DMD 5620, the production version of the Blit), whose 833e12c5d1SDavid du Colombierhost connection is an ordinary 9600 bps RS232 link; 84219b2ee8SDavid du Colombieron the SUN computer the host and display processes run on a single machine, 853e12c5d1SDavid du Colombierconnected by a pipe. 863e12c5d1SDavid du Colombier.PP 873e12c5d1SDavid du Colombier.CW Sam 883e12c5d1SDavid du Colombieredits uninterpreted 893e12c5d1SDavid du ColombierASCII text. 903e12c5d1SDavid du ColombierIt has no facilities for multiple fonts, graphics or tables, 913e12c5d1SDavid du Colombierunlike MacWrite,\u\s-4\&2\s+4\d Bravo,\u\s-4\&3\s+4\d Tioga\u\s-4\&4\s+4\d 923e12c5d1SDavid du Colombieror Lara.\u\s-4\&5\s+4\d 933e12c5d1SDavid du ColombierAlso unlike them, it has a rich command language. 943e12c5d1SDavid du Colombier(Throughout this paper, the phrase 953e12c5d1SDavid du Colombier.I 963e12c5d1SDavid du Colombiercommand language 973e12c5d1SDavid du Colombier.R 983e12c5d1SDavid du Colombierrefers to 993e12c5d1SDavid du Colombiertextual commands; commands activated from the mouse form the 1003e12c5d1SDavid du Colombier.I mouse 1013e12c5d1SDavid du Colombier.I language. ) 1023e12c5d1SDavid du Colombier.CW Sam 1033e12c5d1SDavid du Colombierdeveloped as an editor for use by programmers, and tries to join 104219b2ee8SDavid du Colombierthe styles of the UNIX text editor 1053e12c5d1SDavid du Colombier.CW ed \u\s-4\&6,7\s+4\d 1063e12c5d1SDavid du Colombierwith that of interactive cut-and-paste editors by 1073e12c5d1SDavid du Colombierproviding a comfortable mouse-driven interface 1083e12c5d1SDavid du Colombierto a program with a solid command language driven by regular expressions. 1093e12c5d1SDavid du ColombierThe command language developed more than the mouse language, and 1103e12c5d1SDavid du Colombieracquired a notation for describing the structure of files 1113e12c5d1SDavid du Colombiermore richly than as a sequence of lines, 1123e12c5d1SDavid du Colombierusing a dataflow-like syntax for specifying changes. 1133e12c5d1SDavid du Colombier.PP 1143e12c5d1SDavid du ColombierThe interactive style was influenced by 1153e12c5d1SDavid du Colombier.CW jim ,\u\s-4\&1\s+4\d 1163e12c5d1SDavid du Colombieran early cut-and-paste editor for the Blit, and by 1173e12c5d1SDavid du Colombier.CW mux ,\u\s-4\&8\s+4\d 1183e12c5d1SDavid du Colombierthe Blit window system. 1193e12c5d1SDavid du Colombier.CW Mux 1203e12c5d1SDavid du Colombiermerges the original Blit window system, 1213e12c5d1SDavid du Colombier.CW mpx ,\u\s-4\&1\s+4\d 1223e12c5d1SDavid du Colombierwith cut-and-paste editing, forming something like a 1233e12c5d1SDavid du Colombiermultiplexed version of 1243e12c5d1SDavid du Colombier.CW jim 1253e12c5d1SDavid du Colombierthat edits the output of (and input to) command sessions rather than files. 1263e12c5d1SDavid du Colombier.PP 1273e12c5d1SDavid du ColombierThe first part of this paper describes the command language, then the mouse 1283e12c5d1SDavid du Colombierlanguage, and explains how they interact. 1293e12c5d1SDavid du ColombierThat is followed by a description of the implementation, 1303e12c5d1SDavid du Colombierfirst of the host part, then of the terminal part. 1313e12c5d1SDavid du ColombierA principle that influenced the design of 1323e12c5d1SDavid du Colombier.CW sam 1333e12c5d1SDavid du Colombieris that it should have no explicit limits, such as upper limits on 1343e12c5d1SDavid du Colombierfile size or line length. 1353e12c5d1SDavid du ColombierA secondary consideration is that it be efficient. 1363e12c5d1SDavid du ColombierTo honor these two goals together requires a method for efficiently 1373e12c5d1SDavid du Colombiermanipulating 1383e12c5d1SDavid du Colombierhuge strings (files) without breaking them into lines, 1393e12c5d1SDavid du Colombierperhaps while making thousands of changes 1403e12c5d1SDavid du Colombierunder control of the command language. 1413e12c5d1SDavid du Colombier.CW Sam 's 1423e12c5d1SDavid du Colombiermethod is to 1433e12c5d1SDavid du Colombiertreat the file as a transaction database, implementing changes as atomic 1443e12c5d1SDavid du Colombierupdates. These updates may be unwound easily to `undo' changes. 1453e12c5d1SDavid du ColombierEfficiency is achieved through a collection of caches that minimizes 1463e12c5d1SDavid du Colombierdisc traffic and data motion, both within the two parts of the program 1473e12c5d1SDavid du Colombierand between them. 1483e12c5d1SDavid du Colombier.PP 1493e12c5d1SDavid du ColombierThe terminal part of 1503e12c5d1SDavid du Colombier.CW sam 1513e12c5d1SDavid du Colombieris fairly straightforward. 1523e12c5d1SDavid du ColombierMore interesting is how the two halves of the editor stay 1533e12c5d1SDavid du Colombiersynchronized when either half may initiate a change. 1543e12c5d1SDavid du ColombierThis is achieved through a data structure that organizes the 1553e12c5d1SDavid du Colombiercommunications and is maintained in parallel by both halves. 1563e12c5d1SDavid du Colombier.PP 1573e12c5d1SDavid du ColombierThe last part of the paper chronicles the writing of 1583e12c5d1SDavid du Colombier.CW sam 1593e12c5d1SDavid du Colombierand discusses the lessons that were learned through its development and use. 1603e12c5d1SDavid du Colombier.PP 1613e12c5d1SDavid du ColombierThe paper is long, but is composed largely of two papers of reasonable length: 1623e12c5d1SDavid du Colombiera description of the user interface of 1633e12c5d1SDavid du Colombier.CW sam 1643e12c5d1SDavid du Colombierand a discussion of its implementation. 1653e12c5d1SDavid du ColombierThey are combined because the implementation is strongly influenced by 1663e12c5d1SDavid du Colombierthe user interface, and vice versa. 1673e12c5d1SDavid du Colombier.SH 1683e12c5d1SDavid du ColombierThe Interface 1693e12c5d1SDavid du Colombier.LP 1703e12c5d1SDavid du Colombier.CW Sam 1713e12c5d1SDavid du Colombieris a text editor for multiple files. 1723e12c5d1SDavid du ColombierFile names may be provided when it is invoked: 1733e12c5d1SDavid du Colombier.P1 1743e12c5d1SDavid du Colombiersam file1 file2 ... 1753e12c5d1SDavid du Colombier.P2 1763e12c5d1SDavid du Colombierand there are commands 1773e12c5d1SDavid du Colombierto add new files and discard unneeded ones. 1783e12c5d1SDavid du ColombierFiles are not read until necessary 1793e12c5d1SDavid du Colombierto complete some command. 1803e12c5d1SDavid du ColombierEditing operations apply to an internal copy 181219b2ee8SDavid du Colombiermade when the file is read; the UNIX file associated with the copy 1823e12c5d1SDavid du Colombieris changed only by an explicit command. 1833e12c5d1SDavid du ColombierTo simplify the discussion, the internal copy is here called a 1843e12c5d1SDavid du Colombier.I file , 1853e12c5d1SDavid du Colombierwhile the disc-resident original is called a 1863e12c5d1SDavid du Colombier.I 1873e12c5d1SDavid du Colombierdisc file. 1883e12c5d1SDavid du Colombier.R 1893e12c5d1SDavid du Colombier.PP 1903e12c5d1SDavid du Colombier.CW Sam 1913e12c5d1SDavid du Colombieris usually connected to a bitmap display that presents a cut-and-paste 1923e12c5d1SDavid du Colombiereditor driven by the mouse. 1933e12c5d1SDavid du ColombierIn this mode, the command language is still available: 1943e12c5d1SDavid du Colombiertext typed in a special window, called the 1953e12c5d1SDavid du Colombier.CW sam 1963e12c5d1SDavid du Colombier.I window, 1973e12c5d1SDavid du Colombieris interpreted 1983e12c5d1SDavid du Colombieras commands to be executed in the current file. 1993e12c5d1SDavid du ColombierCut-and-paste editing may be used in any window \(em even in the 2003e12c5d1SDavid du Colombier.CW sam 2013e12c5d1SDavid du Colombierwindow to construct commands. 2023e12c5d1SDavid du ColombierThe other mode of operation, invoked by starting 2033e12c5d1SDavid du Colombier.CW sam 2043e12c5d1SDavid du Colombierwith the option 2053e12c5d1SDavid du Colombier.CW -d 2063e12c5d1SDavid du Colombier(for `no download'), 2073e12c5d1SDavid du Colombierdoes not use the mouse or bitmap display, but still permits 2083e12c5d1SDavid du Colombierediting using the textual command language, even on an ordinary terminal, 2093e12c5d1SDavid du Colombierinteractively or from a script. 2103e12c5d1SDavid du Colombier.PP 2113e12c5d1SDavid du ColombierThe following sections describe first the command language (under 2123e12c5d1SDavid du Colombier.CW sam\ -d 2133e12c5d1SDavid du Colombierand in the 2143e12c5d1SDavid du Colombier.CW sam 2153e12c5d1SDavid du Colombierwindow), and then the mouse interface. 2163e12c5d1SDavid du ColombierThese two languages are nearly independent, but connect through the 2173e12c5d1SDavid du Colombier.I current 2183e12c5d1SDavid du Colombier.I text, 2193e12c5d1SDavid du Colombierdescribed below. 2203e12c5d1SDavid du Colombier.SH 2 2213e12c5d1SDavid du ColombierThe Command Language 2223e12c5d1SDavid du Colombier.LP 2233e12c5d1SDavid du ColombierA file consists of its contents, which are an array of characters 2243e12c5d1SDavid du Colombier(that is, a string); the 2253e12c5d1SDavid du Colombier.I name 2263e12c5d1SDavid du Colombierof the associated disc file; the 2273e12c5d1SDavid du Colombier.I 2283e12c5d1SDavid du Colombiermodified bit 2293e12c5d1SDavid du Colombier.R 2303e12c5d1SDavid du Colombierthat states whether the contents match those of 2313e12c5d1SDavid du Colombierthe disc file; 2323e12c5d1SDavid du Colombierand a substring of the contents, called the 2333e12c5d1SDavid du Colombier.I 2343e12c5d1SDavid du Colombiercurrent text 2353e12c5d1SDavid du Colombier.R 2363e12c5d1SDavid du Colombieror 2373e12c5d1SDavid du Colombier.I dot 2383e12c5d1SDavid du Colombier(see Figures 1 and 2). 2393e12c5d1SDavid du ColombierIf the current text is a null string, dot falls between characters. 2403e12c5d1SDavid du ColombierThe 2413e12c5d1SDavid du Colombier.I value 2423e12c5d1SDavid du Colombierof dot is the location of the current text; the 2433e12c5d1SDavid du Colombier.I contents 2443e12c5d1SDavid du Colombierof dot are the characters it contains. 2453e12c5d1SDavid du Colombier.CW Sam 2463e12c5d1SDavid du Colombierimparts to the text no two-dimensional interpretation such as columns 2473e12c5d1SDavid du Colombieror fields; text is always one-dimensional. 248219b2ee8SDavid du ColombierEven the idea of a `line' of text as understood by most UNIX programs 2493e12c5d1SDavid du Colombier\(em a sequence of characters terminated by a newline character \(em 2503e12c5d1SDavid du Colombieris only weakly supported. 2513e12c5d1SDavid du Colombier.PP 2523e12c5d1SDavid du ColombierThe 2533e12c5d1SDavid du Colombier.I 2543e12c5d1SDavid du Colombiercurrent file 2553e12c5d1SDavid du Colombier.R 2563e12c5d1SDavid du Colombieris the file to which editing commands refer. 2573e12c5d1SDavid du ColombierThe current text is therefore dot in the current file. 2583e12c5d1SDavid du ColombierIf a command doesn't explicitly name a particular file or piece of text, 2593e12c5d1SDavid du Colombierthe command is assumed to apply to the current text. 2603e12c5d1SDavid du ColombierFor the moment, ignore the presence of multiple files and consider 2613e12c5d1SDavid du Colombierediting a single file. 2623e12c5d1SDavid du Colombier.KF L 263426d2b71SDavid du Colombier.XP fig1 3.5i 2643e12c5d1SDavid du Colombier.Cs 2653e12c5d1SDavid du ColombierFigure 1. A typical 2663e12c5d1SDavid du Colombier.CW sam 2673e12c5d1SDavid du Colombierscreen, with the editing menu presented. 2683e12c5d1SDavid du ColombierThe 2693e12c5d1SDavid du Colombier.CW sam 2703e12c5d1SDavid du Colombier(command language) window is in the middle, with file windows above and below. 2713e12c5d1SDavid du Colombier(The user interface makes it easy to create these abutting windows.) 2723e12c5d1SDavid du ColombierThe partially obscured window is a third file window. 2733e12c5d1SDavid du ColombierThe uppermost window is that to which typing and mouse operations apply, 2743e12c5d1SDavid du Colombieras indicated by its heavy border. 2753e12c5d1SDavid du ColombierEach window has its current text highlighted in reverse video. 2763e12c5d1SDavid du ColombierThe 2773e12c5d1SDavid du Colombier.CW sam 2783e12c5d1SDavid du Colombierwindow's current text is the null string on the last visible line, 2793e12c5d1SDavid du Colombierindicated by a vertical bar. 2803e12c5d1SDavid du ColombierSee also Figure 2. 2813e12c5d1SDavid du Colombier.Ce 2823e12c5d1SDavid du Colombier.KE 2833e12c5d1SDavid du Colombier.PP 2843e12c5d1SDavid du ColombierCommands have one-letter names. 2853e12c5d1SDavid du ColombierExcept for non-editing commands such as writing 2863e12c5d1SDavid du Colombierthe file to disc, most commands make some change 2873e12c5d1SDavid du Colombierto the text in dot and leave dot set to the text resulting from the change. 2883e12c5d1SDavid du ColombierFor example, the delete command, 2893e12c5d1SDavid du Colombier.CW d , 2903e12c5d1SDavid du Colombierdeletes the text in dot, replacing it by the null string and setting dot 2913e12c5d1SDavid du Colombierto the result. 2923e12c5d1SDavid du ColombierThe change command, 2933e12c5d1SDavid du Colombier.CW c , 2943e12c5d1SDavid du Colombierreplaces dot by text delimited by an arbitrary punctuation character, 2953e12c5d1SDavid du Colombierconventionally 2963e12c5d1SDavid du Colombiera slash. Thus, 2973e12c5d1SDavid du Colombier.P1 2983e12c5d1SDavid du Colombierc/Peter/ 2993e12c5d1SDavid du Colombier.P2 3003e12c5d1SDavid du Colombierreplaces the text in dot by the string 3013e12c5d1SDavid du Colombier.CW Peter . 3023e12c5d1SDavid du ColombierSimilarly, 3033e12c5d1SDavid du Colombier.P1 3043e12c5d1SDavid du Colombiera/Peter/ 3053e12c5d1SDavid du Colombier.P2 3063e12c5d1SDavid du Colombier(append) adds the string after dot, and 3073e12c5d1SDavid du Colombier.P1 3083e12c5d1SDavid du Colombieri/Peter/ 3093e12c5d1SDavid du Colombier.P2 3103e12c5d1SDavid du Colombier(insert) inserts before dot. 3113e12c5d1SDavid du ColombierAll three leave dot set to the new text, 3123e12c5d1SDavid du Colombier.CW Peter . 3133e12c5d1SDavid du Colombier.PP 3143e12c5d1SDavid du ColombierNewlines are part of the syntax of commands: 3153e12c5d1SDavid du Colombierthe newline character lexically terminates a command. 3163e12c5d1SDavid du ColombierWithin the inserted text, however, newlines are never implicit. 3173e12c5d1SDavid du ColombierBut since it is often convenient to insert multiple lines of text, 3183e12c5d1SDavid du Colombier.CW sam 3193e12c5d1SDavid du Colombierhas a special 3203e12c5d1SDavid du Colombiersyntax for that case: 3213e12c5d1SDavid du Colombier.P1 3223e12c5d1SDavid du Colombiera 3233e12c5d1SDavid du Colombiersome lines of text 3243e12c5d1SDavid du Colombierto be inserted in the file, 3253e12c5d1SDavid du Colombierterminated by a period 3263e12c5d1SDavid du Colombieron a line by itself 3273e12c5d1SDavid du Colombier\&. 3283e12c5d1SDavid du Colombier.P2 3293e12c5d1SDavid du ColombierIn the one-line syntax, a newline character may be specified by a C-like 3303e12c5d1SDavid du Colombierescape, so 3313e12c5d1SDavid du Colombier.P1 3323e12c5d1SDavid du Colombierc/\en/ 3333e12c5d1SDavid du Colombier.P2 3343e12c5d1SDavid du Colombierreplaces dot by a single newline character. 3353e12c5d1SDavid du Colombier.PP 3363e12c5d1SDavid du Colombier.CW Sam 3373e12c5d1SDavid du Colombieralso has a substitute command, 3383e12c5d1SDavid du Colombier.CW s : 3393e12c5d1SDavid du Colombier.P1 3403e12c5d1SDavid du Colombiers/\f2expression\fP/\f2replacement\fP/ 3413e12c5d1SDavid du Colombier.P2 3423e12c5d1SDavid du Colombiersubstitutes the replacement text for the first match, in dot, 3433e12c5d1SDavid du Colombierof the regular expression. 3443e12c5d1SDavid du ColombierThus, if dot is the string 3453e12c5d1SDavid du Colombier.CW Peter , 3463e12c5d1SDavid du Colombierthe command 3473e12c5d1SDavid du Colombier.P1 3483e12c5d1SDavid du Colombiers/t/st/ 3493e12c5d1SDavid du Colombier.P2 3503e12c5d1SDavid du Colombierchanges it to 3513e12c5d1SDavid du Colombier.CW Pester . 3523e12c5d1SDavid du ColombierIn general, 3533e12c5d1SDavid du Colombier.CW s 3543e12c5d1SDavid du Colombieris unnecessary, but it was inherited from 3553e12c5d1SDavid du Colombier.CW ed 3563e12c5d1SDavid du Colombierand it has some convenient variations. 3573e12c5d1SDavid du ColombierFor instance, the replacement text may include the matched text, 3583e12c5d1SDavid du Colombierspecified by 3593e12c5d1SDavid du Colombier.CW & : 3603e12c5d1SDavid du Colombier.P1 3613e12c5d1SDavid du Colombiers/Peter/Oh, &, &, &, &!/ 3623e12c5d1SDavid du Colombier.P2 3633e12c5d1SDavid du Colombier.PP 3643e12c5d1SDavid du ColombierThere are also three commands that apply programs 3653e12c5d1SDavid du Colombierto text: 3663e12c5d1SDavid du Colombier.P1 367219b2ee8SDavid du Colombier< \f2UNIX program\fP 3683e12c5d1SDavid du Colombier.P2 369219b2ee8SDavid du Colombierreplaces dot by the output of the UNIX program. 3703e12c5d1SDavid du ColombierSimilarly, the 3713e12c5d1SDavid du Colombier.CW > 3723e12c5d1SDavid du Colombiercommand 3733e12c5d1SDavid du Colombierruns the program with dot as its standard input, and 3743e12c5d1SDavid du Colombier.CW | 3753e12c5d1SDavid du Colombierdoes both. For example, 3763e12c5d1SDavid du Colombier.P1 3773e12c5d1SDavid du Colombier| sort 3783e12c5d1SDavid du Colombier.P2 3793e12c5d1SDavid du Colombierreplaces dot by the result of applying the standard sorting utility to it. 3803e12c5d1SDavid du ColombierAgain, newlines have no special significance for these 3813e12c5d1SDavid du Colombier.CW sam 3823e12c5d1SDavid du Colombiercommands. 3833e12c5d1SDavid du ColombierThe text acted upon and resulting from these commands is not necessarily 384219b2ee8SDavid du Colombierbounded by newlines, although for connection with UNIX programs, 3853e12c5d1SDavid du Colombiernewlines may be necessary to obey conventions. 3863e12c5d1SDavid du Colombier.PP 3873e12c5d1SDavid du ColombierOne more command: 3883e12c5d1SDavid du Colombier.CW p 3893e12c5d1SDavid du Colombierprints the contents of dot. 3903e12c5d1SDavid du ColombierTable I summarizes 3913e12c5d1SDavid du Colombier.CW sam 's 3923e12c5d1SDavid du Colombiercommands. 3933e12c5d1SDavid du Colombier.KF 3943e12c5d1SDavid du Colombier.TS 3953e12c5d1SDavid du Colombiercenter; 3963e12c5d1SDavid du Colombierc s 3973e12c5d1SDavid du ColombierlfCW l. 3983e12c5d1SDavid du ColombierTable I. \f(CWSam\fP commands 3993e12c5d1SDavid du Colombier.sp .4 4003e12c5d1SDavid du Colombier.ft CW 4013e12c5d1SDavid du Colombier_ 4023e12c5d1SDavid du Colombier.ft 4033e12c5d1SDavid du Colombier.sp .4 4043e12c5d1SDavid du Colombier\f1Text commands\fP 4053e12c5d1SDavid du Colombier.sp .4 4063e12c5d1SDavid du Colombier_ 4073e12c5d1SDavid du Colombier.sp .4 4083e12c5d1SDavid du Colombiera/\f2text\fP/ Append text after dot 4093e12c5d1SDavid du Colombierc/\f2text\fP/ Change text in dot 4103e12c5d1SDavid du Colombieri/\f2text\fP/ Insert text before dot 4113e12c5d1SDavid du Colombierd Delete text in dot 4123e12c5d1SDavid du Colombiers/\f2regexp\fP/\f2text\fP/ Substitute text for match of regular expression in dot 4133e12c5d1SDavid du Colombierm \f2address\fP Move text in dot after address 4143e12c5d1SDavid du Colombiert \f2address\fP Copy text in dot after address 4153e12c5d1SDavid du Colombier.sp .4 4163e12c5d1SDavid du Colombier_ 4173e12c5d1SDavid du Colombier.sp .4 4183e12c5d1SDavid du Colombier\f1Display commands\fP 4193e12c5d1SDavid du Colombier.sp .4 4203e12c5d1SDavid du Colombier_ 4213e12c5d1SDavid du Colombier.sp .2 4223e12c5d1SDavid du Colombierp Print contents of dot 4233e12c5d1SDavid du Colombier\&= Print value (line numbers and character numbers) of dot 4243e12c5d1SDavid du Colombier.sp .4 4253e12c5d1SDavid du Colombier_ 4263e12c5d1SDavid du Colombier.sp .4 4273e12c5d1SDavid du Colombier\f1File commands\fP 4283e12c5d1SDavid du Colombier.sp .4 4293e12c5d1SDavid du Colombier_ 4303e12c5d1SDavid du Colombier.sp .2 4313e12c5d1SDavid du Colombierb \f2file-list\fP Set current file to first file in list that \f(CWsam\fP has in menu 4323e12c5d1SDavid du ColombierB \f2file-list\fP Same as \f(CWb\fP, but load new files 4333e12c5d1SDavid du Colombiern Print menu lines of all files 4343e12c5d1SDavid du ColombierD \f2file-list\fP Delete named files from \f(CWsam\fP 4353e12c5d1SDavid du Colombier.sp .4 4363e12c5d1SDavid du Colombier_ 4373e12c5d1SDavid du Colombier.sp .4 4383e12c5d1SDavid du Colombier\f1I/O commands\fP 4393e12c5d1SDavid du Colombier.sp .4 4403e12c5d1SDavid du Colombier_ 4413e12c5d1SDavid du Colombier.sp .2 4423e12c5d1SDavid du Colombiere \f2filename\fP Replace file with named disc file 4433e12c5d1SDavid du Colombierr \f2filename\fP Replace dot by contents of named disc file 4443e12c5d1SDavid du Colombierw \f2filename\fP Write file to named disc file 4453e12c5d1SDavid du Colombierf \f2filename\fP Set file name and print new menu line 446219b2ee8SDavid du Colombier< \f2UNIX-command\fP Replace dot by standard output of command 447219b2ee8SDavid du Colombier> \f2UNIX-command\fP Send dot to standard input of command 448219b2ee8SDavid du Colombier| \f2UNIX-command\fP Replace dot by result of command applied to dot 449219b2ee8SDavid du Colombier! \f2UNIX-command\fP Run the command 4503e12c5d1SDavid du Colombier.sp .4 4513e12c5d1SDavid du Colombier_ 4523e12c5d1SDavid du Colombier.sp .4 4533e12c5d1SDavid du Colombier\f1Loops and conditionals\fP 4543e12c5d1SDavid du Colombier.sp .4 4553e12c5d1SDavid du Colombier_ 4563e12c5d1SDavid du Colombier.sp .2 4573e12c5d1SDavid du Colombierx/\f2regexp\fP/ \f2command\fP For each match of regexp, set dot and run command 4583e12c5d1SDavid du Colombiery/\f2regexp\fP/ \f2command\fP Between adjacent matches of regexp, set dot and run command 4593e12c5d1SDavid du ColombierX/\f2regexp\fP/ \f2command\fP Run command in each file whose menu line matches regexp 4603e12c5d1SDavid du ColombierY/\f2regexp\fP/ \f2command\fP Run command in each file whose menu line does not match 4613e12c5d1SDavid du Colombierg/\f2regexp\fP/ \f2command\fP If dot contains a match of regexp, run command 4623e12c5d1SDavid du Colombierv/\f2regexp\fP/ \f2command\fP If dot does not contain a match of regexp, run command 4633e12c5d1SDavid du Colombier.sp .4 4643e12c5d1SDavid du Colombier_ 4653e12c5d1SDavid du Colombier.sp .4 4663e12c5d1SDavid du Colombier\f1Miscellany\fP 4673e12c5d1SDavid du Colombier.sp .4 4683e12c5d1SDavid du Colombier_ 4693e12c5d1SDavid du Colombier.sp .2 4703e12c5d1SDavid du Colombierk Set address mark to value of dot 4713e12c5d1SDavid du Colombierq Quit 4723e12c5d1SDavid du Colombieru \f2n\fP Undo last \f2n\fP (default 1) changes 4733e12c5d1SDavid du Colombier{ } Braces group commands 4743e12c5d1SDavid du Colombier.sp .3 4753e12c5d1SDavid du Colombier.ft CW 4763e12c5d1SDavid du Colombier_ 4773e12c5d1SDavid du Colombier.ft 4783e12c5d1SDavid du Colombier.TE 4793e12c5d1SDavid du Colombier.sp 4803e12c5d1SDavid du Colombier.KE 4813e12c5d1SDavid du Colombier.PP 4823e12c5d1SDavid du ColombierThe value of dot may be changed by 4833e12c5d1SDavid du Colombierspecifying an 4843e12c5d1SDavid du Colombier.I address 4853e12c5d1SDavid du Colombierfor the command. 4863e12c5d1SDavid du ColombierThe simplest address is a line number: 4873e12c5d1SDavid du Colombier.P1 4883e12c5d1SDavid du Colombier3 4893e12c5d1SDavid du Colombier.P2 4903e12c5d1SDavid du Colombierrefers to the third line of the file, so 4913e12c5d1SDavid du Colombier.P1 4923e12c5d1SDavid du Colombier3d 4933e12c5d1SDavid du Colombier.P2 4943e12c5d1SDavid du Colombierdeletes the third line of the file, and implicitly renumbers 4953e12c5d1SDavid du Colombierthe lines so the old line 4 is now numbered 3. 4963e12c5d1SDavid du Colombier(This is one of the few places where 4973e12c5d1SDavid du Colombier.CW sam 4983e12c5d1SDavid du Colombierdeals with lines directly.) 4993e12c5d1SDavid du ColombierLine 5003e12c5d1SDavid du Colombier.CW 0 5013e12c5d1SDavid du Colombieris the null string at the beginning of the file. 5023e12c5d1SDavid du ColombierIf a command consists of only an address, a 5033e12c5d1SDavid du Colombier.CW p 5043e12c5d1SDavid du Colombiercommand is assumed, so typing an unadorned 5053e12c5d1SDavid du Colombier.CW 3 5063e12c5d1SDavid du Colombierprints line 3 on the terminal. 5073e12c5d1SDavid du ColombierThere are a couple of other basic addresses: 5083e12c5d1SDavid du Colombiera period addresses dot itself; and 5093e12c5d1SDavid du Colombiera dollar sign 5103e12c5d1SDavid du Colombier.CW $ ) ( 5113e12c5d1SDavid du Colombieraddresses the null string at the end of the file. 5123e12c5d1SDavid du Colombier.PP 5133e12c5d1SDavid du ColombierAn address is always a single substring of the file. 5143e12c5d1SDavid du ColombierThus, the address 5153e12c5d1SDavid du Colombier.CW 3 5163e12c5d1SDavid du Colombieraddresses the characters 5173e12c5d1SDavid du Colombierafter the second newline of 5183e12c5d1SDavid du Colombierthe file through the third newline of the file. 5193e12c5d1SDavid du ColombierA 5203e12c5d1SDavid du Colombier.I 5213e12c5d1SDavid du Colombiercompound address 5223e12c5d1SDavid du Colombier.R 5233e12c5d1SDavid du Colombieris constructed by the comma operator 5243e12c5d1SDavid du Colombier.P1 5253e12c5d1SDavid du Colombier\f2address1\fP,\f2address2\fP 5263e12c5d1SDavid du Colombier.P2 5273e12c5d1SDavid du Colombierand addresses the substring of the file from the beginning of 5283e12c5d1SDavid du Colombier.I address1 5293e12c5d1SDavid du Colombierto the end of 5303e12c5d1SDavid du Colombier.I address2 . 5313e12c5d1SDavid du ColombierFor example, the command 5323e12c5d1SDavid du Colombier.CW 3,5p 5333e12c5d1SDavid du Colombierprints the third through fifth lines of the file and 5343e12c5d1SDavid du Colombier.CW .,$d 5353e12c5d1SDavid du Colombierdeletes the text from the beginning of dot to the end of the file. 5363e12c5d1SDavid du Colombier.PP 5373e12c5d1SDavid du ColombierThese addresses are all absolute positions in the file, but 5383e12c5d1SDavid du Colombier.CW sam 5393e12c5d1SDavid du Colombieralso has relative addresses, indicated by 5403e12c5d1SDavid du Colombier.CW + 5413e12c5d1SDavid du Colombieror 5423e12c5d1SDavid du Colombier.CW - . 5433e12c5d1SDavid du ColombierFor example, 5443e12c5d1SDavid du Colombier.P1 5453e12c5d1SDavid du Colombier$-3 5463e12c5d1SDavid du Colombier.P2 5473e12c5d1SDavid du Colombieris the third line before the end of the file and 5483e12c5d1SDavid du Colombier.P1 5493e12c5d1SDavid du Colombier\&.+1 5503e12c5d1SDavid du Colombier.P2 5513e12c5d1SDavid du Colombieris the line after dot. 5523e12c5d1SDavid du ColombierIf no address appears to the left of the 5533e12c5d1SDavid du Colombier.CW + 5543e12c5d1SDavid du Colombieror 5553e12c5d1SDavid du Colombier.CW - , 5563e12c5d1SDavid du Colombierdot is assumed; 5573e12c5d1SDavid du Colombierif nothing appears to the right, 5583e12c5d1SDavid du Colombier.CW 1 5593e12c5d1SDavid du Colombieris assumed. 5603e12c5d1SDavid du ColombierTherefore, 5613e12c5d1SDavid du Colombier.CW .+1 5623e12c5d1SDavid du Colombiermay be abbreviated to just a plus sign. 5633e12c5d1SDavid du Colombier.PP 5643e12c5d1SDavid du ColombierThe 5653e12c5d1SDavid du Colombier.CW + 5663e12c5d1SDavid du Colombieroperator acts relative to the end of its first argument, while the 5673e12c5d1SDavid du Colombier.CW - 5683e12c5d1SDavid du Colombieroperator acts relative to the beginning. Thus 5693e12c5d1SDavid du Colombier.CW .+1 5703e12c5d1SDavid du Colombieraddresses the first line after dot, 5713e12c5d1SDavid du Colombier.CW .- 5723e12c5d1SDavid du Colombieraddresses the first line before dot, and 5733e12c5d1SDavid du Colombier.CW +- 5743e12c5d1SDavid du Colombierrefers to the line containing the end of dot. (Dot may span multiple lines, and 5753e12c5d1SDavid du Colombier.CW + 5763e12c5d1SDavid du Colombierselects the line after the end of dot, then 5773e12c5d1SDavid du Colombier.CW - 5783e12c5d1SDavid du Colombierbacks up one line.) 5793e12c5d1SDavid du Colombier.PP 5803e12c5d1SDavid du ColombierThe final type of address is a regular expression, which addresses the 5813e12c5d1SDavid du Colombiertext matched by the expression. The expression is enclosed in slashes, as in 5823e12c5d1SDavid du Colombier.P1 5833e12c5d1SDavid du Colombier/\f2expression\fP/ 5843e12c5d1SDavid du Colombier.P2 585219b2ee8SDavid du ColombierThe expressions are the same as those in the UNIX program 5863e12c5d1SDavid du Colombier.CW egrep ,\u\s-4\&6,7\s+4\d 5873e12c5d1SDavid du Colombierand include closures, alternations, and so on. 5883e12c5d1SDavid du ColombierThey find the 5893e12c5d1SDavid du Colombier.I 5903e12c5d1SDavid du Colombierleftmost longest 5913e12c5d1SDavid du Colombier.R 5923e12c5d1SDavid du Colombierstring that matches the expression, that is, 5933e12c5d1SDavid du Colombierthe first match after the point where the search is started, 5943e12c5d1SDavid du Colombierand if more than one match begins at the same spot, the longest such match. 595219b2ee8SDavid du Colombier(I assume familiarity with the syntax for regular expressions in UNIX programs.\u\s-4\&9\s+4\d) 5963e12c5d1SDavid du ColombierFor example, 5973e12c5d1SDavid du Colombier.P1 5983e12c5d1SDavid du Colombier/x/ 5993e12c5d1SDavid du Colombier.P2 6003e12c5d1SDavid du Colombiermatches the next 6013e12c5d1SDavid du Colombier.CW x 6023e12c5d1SDavid du Colombiercharacter in the file, 6033e12c5d1SDavid du Colombier.P1 6043e12c5d1SDavid du Colombier/xx*/ 6053e12c5d1SDavid du Colombier.P2 6063e12c5d1SDavid du Colombiermatches the next run of one or more 6073e12c5d1SDavid du Colombier.CW x 's, 6083e12c5d1SDavid du Colombierand 6093e12c5d1SDavid du Colombier.P1 6103e12c5d1SDavid du Colombier/x|Peter/ 6113e12c5d1SDavid du Colombier.P2 6123e12c5d1SDavid du Colombiermatches the next 6133e12c5d1SDavid du Colombier.CW x 6143e12c5d1SDavid du Colombieror 6153e12c5d1SDavid du Colombier.CW Peter . 616219b2ee8SDavid du ColombierFor compatibility with other UNIX programs, the `any character' operator, 6173e12c5d1SDavid du Colombiera period, 6183e12c5d1SDavid du Colombierdoes not match a newline, so 6193e12c5d1SDavid du Colombier.P1 6203e12c5d1SDavid du Colombier/.*/ 6213e12c5d1SDavid du Colombier.P2 6223e12c5d1SDavid du Colombiermatches the text from dot to the end of the line, but excludes the newline 6233e12c5d1SDavid du Colombierand so will not match across 6243e12c5d1SDavid du Colombierthe line boundary. 6253e12c5d1SDavid du Colombier.PP 6263e12c5d1SDavid du ColombierRegular expressions are always relative addresses. 6273e12c5d1SDavid du ColombierThe direction is forwards by default, 6283e12c5d1SDavid du Colombierso 6293e12c5d1SDavid du Colombier.CW /Peter/ 6303e12c5d1SDavid du Colombieris really an abbreviation for 6313e12c5d1SDavid du Colombier.CW +/Peter/ . 6323e12c5d1SDavid du ColombierThe search can be reversed with a minus sign, so 6333e12c5d1SDavid du Colombier.P1 6343e12c5d1SDavid du Colombier.CW -/Peter/ 6353e12c5d1SDavid du Colombier.P2 6363e12c5d1SDavid du Colombierfinds the first 6373e12c5d1SDavid du Colombier.CW Peter 6383e12c5d1SDavid du Colombierbefore dot. 6393e12c5d1SDavid du ColombierRegular expressions may be used with other address forms, so 6403e12c5d1SDavid du Colombier.CW 0+/Peter/ 6413e12c5d1SDavid du Colombierfinds the first 6423e12c5d1SDavid du Colombier.CW Peter 6433e12c5d1SDavid du Colombierin the file and 6443e12c5d1SDavid du Colombier.CW $-/Peter/ 6453e12c5d1SDavid du Colombierfinds the last. 6463e12c5d1SDavid du ColombierTable II summarizes 6473e12c5d1SDavid du Colombier.CW sam 's 6483e12c5d1SDavid du Colombieraddresses. 6493e12c5d1SDavid du Colombier.KF 6503e12c5d1SDavid du Colombier.TS 6513e12c5d1SDavid du Colombiercenter; 6523e12c5d1SDavid du Colombierc s 6533e12c5d1SDavid du ColombierlfCW l. 6543e12c5d1SDavid du ColombierTable II. \f(CWSam\fP addresses 6553e12c5d1SDavid du Colombier.sp .4 6563e12c5d1SDavid du Colombier.ft CW 6573e12c5d1SDavid du Colombier_ 6583e12c5d1SDavid du Colombier.ft 6593e12c5d1SDavid du Colombier.sp .4 6603e12c5d1SDavid du Colombier\f1Simple addresses\fP 6613e12c5d1SDavid du Colombier.sp .4 6623e12c5d1SDavid du Colombier_ 6633e12c5d1SDavid du Colombier.sp .2 6643e12c5d1SDavid du Colombier#\f2n\fP The empty string after character \f2n\fP 6653e12c5d1SDavid du Colombier\f2n\fP Line \f2n\fP. 6663e12c5d1SDavid du Colombier/\f2regexp\fP/ The first following match of the regular expression 6673e12c5d1SDavid du Colombier-/\f2regexp\fP/ The first previous match of the regular expression 6683e12c5d1SDavid du Colombier$ The null string at the end of the file 6693e12c5d1SDavid du Colombier\&. Dot 6703e12c5d1SDavid du Colombier\&' The address mark, set by \f(CWk\fP command 6713e12c5d1SDavid du Colombier"\f2regexp\fP" Dot in the file whose menu line matches regexp 6723e12c5d1SDavid du Colombier.sp .4 6733e12c5d1SDavid du Colombier_ 6743e12c5d1SDavid du Colombier.sp .4 6753e12c5d1SDavid du Colombier\f1Compound addresses\fP 6763e12c5d1SDavid du Colombier.sp .4 6773e12c5d1SDavid du Colombier_ 6783e12c5d1SDavid du Colombier.sp .2 6793e12c5d1SDavid du Colombier\f2a1\fP+\f2a2\fP The address \f2a2\fP evaluated starting at right of \f2a1\fP 6803e12c5d1SDavid du Colombier\f2a1\fP-\f2a2\fP \f2a2\fP evaluated in the reverse direction starting at left of \f2a1\fP 6813e12c5d1SDavid du Colombier\f2a1\fP,\f2a2\fP From the left of \f2a1\fP to the right of \f2a2\fP (default \f(CW0,$\fP) 6823e12c5d1SDavid du Colombier\f2a1\fP;\f2a2\fP Like \f(CW,\fP but sets dot after evaluating \f2a1\fP 6833e12c5d1SDavid du Colombier.sp .4 6843e12c5d1SDavid du Colombier_ 6853e12c5d1SDavid du Colombier.sp .4 6863e12c5d1SDavid du Colombier.T& 6873e12c5d1SDavid du Colombierc s. 6883e12c5d1SDavid du ColombierT{ 6893e12c5d1SDavid du ColombierThe operators 6903e12c5d1SDavid du Colombier.CW + 6913e12c5d1SDavid du Colombierand 6923e12c5d1SDavid du Colombier.CW - 6933e12c5d1SDavid du Colombierare high precedence, while 6943e12c5d1SDavid du Colombier.CW , 6953e12c5d1SDavid du Colombierand 6963e12c5d1SDavid du Colombier.CW ; 6973e12c5d1SDavid du Colombierare low precedence. 6983e12c5d1SDavid du ColombierIn both 6993e12c5d1SDavid du Colombier.CW + 7003e12c5d1SDavid du Colombierand 7013e12c5d1SDavid du Colombier.CW - 7023e12c5d1SDavid du Colombierforms, 7033e12c5d1SDavid du Colombier.I a2 7043e12c5d1SDavid du Colombierdefaults to 1 and 7053e12c5d1SDavid du Colombier.I a1 7063e12c5d1SDavid du Colombierdefaults to dot. 7073e12c5d1SDavid du ColombierIf both 7083e12c5d1SDavid du Colombier.I a1 7093e12c5d1SDavid du Colombierand 7103e12c5d1SDavid du Colombier.I a2 7113e12c5d1SDavid du Colombierare present, 7123e12c5d1SDavid du Colombier.CW + 7133e12c5d1SDavid du Colombiermay be elided. 7143e12c5d1SDavid du ColombierT} 7153e12c5d1SDavid du Colombier.sp .5 7163e12c5d1SDavid du Colombier.ft CW 7173e12c5d1SDavid du Colombier_ 7183e12c5d1SDavid du Colombier.ft 7193e12c5d1SDavid du Colombier.TE 7203e12c5d1SDavid du Colombier.sp 7213e12c5d1SDavid du Colombier.KE 7223e12c5d1SDavid du Colombier.PP 7233e12c5d1SDavid du ColombierThe language discussed so far will not seem novel 724219b2ee8SDavid du Colombierto people who use UNIX text editors 7253e12c5d1SDavid du Colombiersuch as 7263e12c5d1SDavid du Colombier.CW ed 7273e12c5d1SDavid du Colombieror 7283e12c5d1SDavid du Colombier.CW vi .\u\s-4\&9\s+4\d 7293e12c5d1SDavid du ColombierMoreover, the kinds of editing operations these commands allow, with the exception 7303e12c5d1SDavid du Colombierof regular expressions and line numbers, 7313e12c5d1SDavid du Colombierare clearly more conveniently handled by a mouse-based interface. 7323e12c5d1SDavid du ColombierIndeed, 7333e12c5d1SDavid du Colombier.CW sam 's 7343e12c5d1SDavid du Colombiermouse language (discussed at length below) is the means by which 7353e12c5d1SDavid du Colombiersimple changes are usually made. 7363e12c5d1SDavid du ColombierFor large or repetitive changes, however, a textual language 7373e12c5d1SDavid du Colombieroutperforms a manual interface. 7383e12c5d1SDavid du Colombier.PP 7393e12c5d1SDavid du ColombierImagine that, instead of deleting just one occurrence of the string 7403e12c5d1SDavid du Colombier.CW Peter , 7413e12c5d1SDavid du Colombierwe wanted to eliminate every 7423e12c5d1SDavid du Colombier.CW Peter . 7433e12c5d1SDavid du ColombierWhat's needed is an iterator that runs a command for each occurrence of some 7443e12c5d1SDavid du Colombiertext. 7453e12c5d1SDavid du Colombier.CW Sam 's 7463e12c5d1SDavid du Colombieriterator is called 7473e12c5d1SDavid du Colombier.CW x , 7483e12c5d1SDavid du Colombierfor extract: 7493e12c5d1SDavid du Colombier.P1 7503e12c5d1SDavid du Colombierx/\f2expression\fP/ \f2command\fP 7513e12c5d1SDavid du Colombier.P2 7523e12c5d1SDavid du Colombierfinds all matches in dot of the specified expression, and for each 7533e12c5d1SDavid du Colombiersuch match, sets dot to the text matched and runs the command. 7543e12c5d1SDavid du ColombierSo to delete all the 7553e12c5d1SDavid du Colombier.CW Peters: 7563e12c5d1SDavid du Colombier.P1 7573e12c5d1SDavid du Colombier0,$ x/Peter/ d 7583e12c5d1SDavid du Colombier.P2 7593e12c5d1SDavid du Colombier(Blanks in these examples are to improve readability; 7603e12c5d1SDavid du Colombier.CW sam 7613e12c5d1SDavid du Colombierneither requires nor interprets them.) 7623e12c5d1SDavid du ColombierThis searches the entire file 7633e12c5d1SDavid du Colombier.CW 0,$ ) ( 7643e12c5d1SDavid du Colombierfor occurrences of the string 7653e12c5d1SDavid du Colombier.CW Peter , 7663e12c5d1SDavid du Colombierand runs the 7673e12c5d1SDavid du Colombier.CW d 7683e12c5d1SDavid du Colombiercommand with dot set to each such occurrence. 7693e12c5d1SDavid du Colombier(By contrast, the comparable 7703e12c5d1SDavid du Colombier.CW ed 7713e12c5d1SDavid du Colombiercommand would delete all 7723e12c5d1SDavid du Colombier.I lines 7733e12c5d1SDavid du Colombiercontaining 7743e12c5d1SDavid du Colombier.CW Peter ; 7753e12c5d1SDavid du Colombier.CW sam 7763e12c5d1SDavid du Colombierdeletes only the 7773e12c5d1SDavid du Colombier.CW Peters .) 7783e12c5d1SDavid du ColombierThe address 7793e12c5d1SDavid du Colombier.CW 0,$ 7803e12c5d1SDavid du Colombieris commonly used, and may be abbreviated to just a comma. 7813e12c5d1SDavid du ColombierAs another example, 7823e12c5d1SDavid du Colombier.P1 7833e12c5d1SDavid du Colombier, x/Peter/ p 7843e12c5d1SDavid du Colombier.P2 7853e12c5d1SDavid du Colombierprints a list of 7863e12c5d1SDavid du Colombier.CW Peters, 7873e12c5d1SDavid du Colombierone for each appearance in the file, with no intervening text (not even newlines 7883e12c5d1SDavid du Colombierto separate the instances). 7893e12c5d1SDavid du Colombier.PP 7903e12c5d1SDavid du ColombierOf course, the text extracted by 7913e12c5d1SDavid du Colombier.CW x 7923e12c5d1SDavid du Colombiermay be selected by a regular expression, 7933e12c5d1SDavid du Colombierwhich complicates deciding what set of matches is chosen \(em 7943e12c5d1SDavid du Colombiermatches may overlap. This is resolved by generating the matches 7953e12c5d1SDavid du Colombierstarting from the beginning of dot using the leftmost-longest rule, 7963e12c5d1SDavid du Colombierand searching for each match starting from the end of the previous one. 7973e12c5d1SDavid du ColombierRegular expressions may also match null strings, but a null match 7983e12c5d1SDavid du Colombieradjacent to a non-null match is never selected; at least one character 7993e12c5d1SDavid du Colombiermust intervene. 8003e12c5d1SDavid du ColombierFor example, 8013e12c5d1SDavid du Colombier.P1 8023e12c5d1SDavid du Colombier, c/AAA/ 8033e12c5d1SDavid du Colombierx/B*/ c/-/ 8043e12c5d1SDavid du Colombier, p 8053e12c5d1SDavid du Colombier.P2 8063e12c5d1SDavid du Colombierproduces as output 8073e12c5d1SDavid du Colombier.P1 8083e12c5d1SDavid du Colombier-A-A-A- 8093e12c5d1SDavid du Colombier.P2 8103e12c5d1SDavid du Colombierbecause the pattern 8113e12c5d1SDavid du Colombier.CW B* 8123e12c5d1SDavid du Colombiermatches the null strings separating the 8133e12c5d1SDavid du Colombier.CW A 's. 8143e12c5d1SDavid du Colombier.PP 8153e12c5d1SDavid du ColombierThe 8163e12c5d1SDavid du Colombier.CW x 8173e12c5d1SDavid du Colombiercommand has a complement, 8183e12c5d1SDavid du Colombier.CW y , 8193e12c5d1SDavid du Colombierwith similar syntax, that executes the command with dot set to the text 8203e12c5d1SDavid du Colombier.I between 8213e12c5d1SDavid du Colombierthe matches of the expression. 8223e12c5d1SDavid du ColombierFor example, 8233e12c5d1SDavid du Colombier.P1 8243e12c5d1SDavid du Colombier, c/AAA/ 8253e12c5d1SDavid du Colombiery/A/ c/-/ 8263e12c5d1SDavid du Colombier, p 8273e12c5d1SDavid du Colombier.P2 8283e12c5d1SDavid du Colombierproduces the same result as the example above. 8293e12c5d1SDavid du Colombier.PP 8303e12c5d1SDavid du ColombierThe 8313e12c5d1SDavid du Colombier.CW x 8323e12c5d1SDavid du Colombierand 8333e12c5d1SDavid du Colombier.CW y 8343e12c5d1SDavid du Colombiercommands are looping constructs, and 8353e12c5d1SDavid du Colombier.CW sam 8363e12c5d1SDavid du Colombierhas a pair of conditional commands to go with them. 8373e12c5d1SDavid du ColombierThey have similar syntax: 8383e12c5d1SDavid du Colombier.P1 8393e12c5d1SDavid du Colombierg/\f2expression\fP/ \f2command\fP 8403e12c5d1SDavid du Colombier.P2 8413e12c5d1SDavid du Colombier(guard) 8423e12c5d1SDavid du Colombierruns the command exactly once if dot contains a match of the expression. 8433e12c5d1SDavid du ColombierThis is different from 8443e12c5d1SDavid du Colombier.CW x , 8453e12c5d1SDavid du Colombierwhich runs the command for 8463e12c5d1SDavid du Colombier.I each 8473e12c5d1SDavid du Colombiermatch: 8483e12c5d1SDavid du Colombier.CW x 8493e12c5d1SDavid du Colombierloops; 8503e12c5d1SDavid du Colombier.CW g 8513e12c5d1SDavid du Colombiermerely tests, without changing the value of dot. 8523e12c5d1SDavid du ColombierThus, 8533e12c5d1SDavid du Colombier.P1 8543e12c5d1SDavid du Colombier, x/Peter/ d 8553e12c5d1SDavid du Colombier.P2 8563e12c5d1SDavid du Colombierdeletes all occurrences of 8573e12c5d1SDavid du Colombier.CW Peter , 8583e12c5d1SDavid du Colombierbut 8593e12c5d1SDavid du Colombier.P1 8603e12c5d1SDavid du Colombier, g/Peter/ d 8613e12c5d1SDavid du Colombier.P2 8623e12c5d1SDavid du Colombierdeletes the whole file (reduces it to a null string) if 8633e12c5d1SDavid du Colombier.CW Peter 8643e12c5d1SDavid du Colombieroccurs anywhere in the text. 8653e12c5d1SDavid du ColombierThe complementary conditional is 8663e12c5d1SDavid du Colombier.CW v , 8673e12c5d1SDavid du Colombierwhich runs the command if there is 8683e12c5d1SDavid du Colombier.I no 8693e12c5d1SDavid du Colombiermatch of the expression. 8703e12c5d1SDavid du Colombier.PP 8713e12c5d1SDavid du ColombierThese control-structure-like commands may be composed to construct more 8723e12c5d1SDavid du Colombierinvolved operations. For example, to print those lines of text that 8733e12c5d1SDavid du Colombiercontain the string 8743e12c5d1SDavid du Colombier.CW Peter : 8753e12c5d1SDavid du Colombier.P1 8763e12c5d1SDavid du Colombier, x/.*\en/ g/Peter/ p 8773e12c5d1SDavid du Colombier.P2 8783e12c5d1SDavid du ColombierThe 8793e12c5d1SDavid du Colombier.CW x 8803e12c5d1SDavid du Colombierbreaks the file into lines, the 8813e12c5d1SDavid du Colombier.CW g 8823e12c5d1SDavid du Colombierselects those lines containing 8833e12c5d1SDavid du Colombier.CW Peter , 8843e12c5d1SDavid du Colombierand the 8853e12c5d1SDavid du Colombier.CW p 8863e12c5d1SDavid du Colombierprints them. 8873e12c5d1SDavid du ColombierThis command gives an address for the 8883e12c5d1SDavid du Colombier.CW x 8893e12c5d1SDavid du Colombiercommand (the whole file), but because 8903e12c5d1SDavid du Colombier.CW g 8913e12c5d1SDavid du Colombierdoes not have an explicit address, it applies to the value of 8923e12c5d1SDavid du Colombierdot produced by the 8933e12c5d1SDavid du Colombier.CW x 8943e12c5d1SDavid du Colombiercommand, that is, to each line. 8953e12c5d1SDavid du ColombierAll commands in 8963e12c5d1SDavid du Colombier.CW sam 8973e12c5d1SDavid du Colombierexcept for the command to write a file to disc use dot for the 8983e12c5d1SDavid du Colombierdefault address. 8993e12c5d1SDavid du Colombier.PP 9003e12c5d1SDavid du ColombierComposition may be continued indefinitely. 9013e12c5d1SDavid du Colombier.P1 9023e12c5d1SDavid du Colombier, x/.*\en/ g/Peter/ v/SaltPeter/ p 9033e12c5d1SDavid du Colombier.P2 9043e12c5d1SDavid du Colombierprints those lines containing 9053e12c5d1SDavid du Colombier.CW Peter 9063e12c5d1SDavid du Colombierbut 9073e12c5d1SDavid du Colombier.I not 9083e12c5d1SDavid du Colombierthose containing 9093e12c5d1SDavid du Colombier.CW SaltPeter . 9103e12c5d1SDavid du Colombier.SH 2 9113e12c5d1SDavid du ColombierStructural Regular Expressions 9123e12c5d1SDavid du Colombier.LP 913219b2ee8SDavid du ColombierUnlike other UNIX text editors, 9143e12c5d1SDavid du Colombierincluding the non-interactive ones such as 9153e12c5d1SDavid du Colombier.CW sed 9163e12c5d1SDavid du Colombierand 9173e12c5d1SDavid du Colombier.CW awk ,\u\s-4\&7\s+4\d 9183e12c5d1SDavid du Colombier.CW sam 9193e12c5d1SDavid du Colombieris good for manipulating files with multi-line `records.' 9203e12c5d1SDavid du ColombierAn example is an on-line phone book composed of records, 9213e12c5d1SDavid du Colombierseparated by blank lines, of the form 9223e12c5d1SDavid du Colombier.P1 9233e12c5d1SDavid du ColombierHerbert Tic 9243e12c5d1SDavid du Colombier44 Turnip Ave., Endive, NJ 9253e12c5d1SDavid du Colombier201-5555642 9263e12c5d1SDavid du Colombier 9273e12c5d1SDavid du ColombierNorbert Twinge 9283e12c5d1SDavid du Colombier16 Potato St., Cabbagetown, NJ 9293e12c5d1SDavid du Colombier201-5553145 9303e12c5d1SDavid du Colombier 9313e12c5d1SDavid du Colombier\&... 9323e12c5d1SDavid du Colombier.P2 9333e12c5d1SDavid du ColombierThe format may be encoded as a regular expression: 9343e12c5d1SDavid du Colombier.P1 9353e12c5d1SDavid du Colombier(.+\en)+ 9363e12c5d1SDavid du Colombier.P2 9373e12c5d1SDavid du Colombierthat is, a sequence of one or more non-blank lines. 9383e12c5d1SDavid du ColombierThe command to print Mr. Tic's entire record is then 9393e12c5d1SDavid du Colombier.P1 9403e12c5d1SDavid du Colombier, x/(.+\en)+/ g/^Herbert Tic$/ p 9413e12c5d1SDavid du Colombier.P2 9423e12c5d1SDavid du Colombierand that to extract just the phone number is 9433e12c5d1SDavid du Colombier.P1 9443e12c5d1SDavid du Colombier, x/(.+\en)+/ g/^Herbert Tic$/ x/^[0-9]*-[0-9]*\en/ p 9453e12c5d1SDavid du Colombier.P2 9463e12c5d1SDavid du ColombierThe latter command breaks the file into records, 9473e12c5d1SDavid du Colombierchooses Mr. Tic's record, 9483e12c5d1SDavid du Colombierextracts the phone number from the record, 9493e12c5d1SDavid du Colombierand finally prints the number. 9503e12c5d1SDavid du Colombier.PP 9513e12c5d1SDavid du ColombierA more involved problem is that of 9523e12c5d1SDavid du Colombierrenaming a particular variable, say 9533e12c5d1SDavid du Colombier.CW n , 9543e12c5d1SDavid du Colombierto 9553e12c5d1SDavid du Colombier.CW num 9563e12c5d1SDavid du Colombierin a C program. 9573e12c5d1SDavid du ColombierThe obvious first attempt, 9583e12c5d1SDavid du Colombier.P1 9593e12c5d1SDavid du Colombier, x/n/ c/num/ 9603e12c5d1SDavid du Colombier.P2 9613e12c5d1SDavid du Colombieris badly flawed: it changes not only the variable 9623e12c5d1SDavid du Colombier.CW n 9633e12c5d1SDavid du Colombierbut any letter 9643e12c5d1SDavid du Colombier.CW n 9653e12c5d1SDavid du Colombierthat appears. 9663e12c5d1SDavid du ColombierWe need to extract all the variables, and select those that match 9673e12c5d1SDavid du Colombier.CW n 9683e12c5d1SDavid du Colombierand only 9693e12c5d1SDavid du Colombier.CW n : 9703e12c5d1SDavid du Colombier.P1 9713e12c5d1SDavid du Colombier, x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/ 9723e12c5d1SDavid du Colombier.P2 9733e12c5d1SDavid du ColombierThe pattern 9743e12c5d1SDavid du Colombier.CW [A-Za-z_][A-Za-z_0-9]* 9753e12c5d1SDavid du Colombiermatches C identifiers. 9763e12c5d1SDavid du ColombierNext 9773e12c5d1SDavid du Colombier.CW g/n/ 9783e12c5d1SDavid du Colombierselects those containing an 9793e12c5d1SDavid du Colombier.CW n . 9803e12c5d1SDavid du ColombierThen 9813e12c5d1SDavid du Colombier.CW v/../ 9823e12c5d1SDavid du Colombierrejects those containing two (or more) characters, and finally 9833e12c5d1SDavid du Colombier.CW c/num/ 9843e12c5d1SDavid du Colombierchanges the remainder (identifiers 9853e12c5d1SDavid du Colombier.CW n ) 9863e12c5d1SDavid du Colombierto 9873e12c5d1SDavid du Colombier.CW num . 9883e12c5d1SDavid du ColombierThis version clearly works much better, but there may still be problems. 9893e12c5d1SDavid du ColombierFor example, in C character and string constants, the sequence 9903e12c5d1SDavid du Colombier.CW \en 9913e12c5d1SDavid du Colombieris interpreted as a newline character, and we don't want to change it to 9923e12c5d1SDavid du Colombier.CW \enum. 9933e12c5d1SDavid du ColombierThis problem can be forestalled with a 9943e12c5d1SDavid du Colombier.CW y 9953e12c5d1SDavid du Colombiercommand: 9963e12c5d1SDavid du Colombier.P1 9973e12c5d1SDavid du Colombier, y/\e\en/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/ 9983e12c5d1SDavid du Colombier.P2 9993e12c5d1SDavid du Colombier(the second 10003e12c5d1SDavid du Colombier.CW \e 10013e12c5d1SDavid du Colombieris necessary because of lexical conventions in regular expressions), 10023e12c5d1SDavid du Colombieror we could even reject character constants and strings outright: 1003219b2ee8SDavid du Colombier.P1 0 10043e12c5d1SDavid du Colombier,y/'[^']*'/ y/"[^"]*"/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/ 10053e12c5d1SDavid du Colombier.P2 10063e12c5d1SDavid du ColombierThe 10073e12c5d1SDavid du Colombier.CW y 10083e12c5d1SDavid du Colombiercommands in this version exclude from consideration all character constants 10093e12c5d1SDavid du Colombierand strings. 10103e12c5d1SDavid du ColombierThe only remaining problem is to deal with the possible occurrence of 10113e12c5d1SDavid du Colombier.CW \e' 10123e12c5d1SDavid du Colombieror 10133e12c5d1SDavid du Colombier.CW \e" 10143e12c5d1SDavid du Colombierwithin these sequences, but it's easy to see how to resolve this difficulty. 10153e12c5d1SDavid du Colombier.PP 10163e12c5d1SDavid du ColombierThe point of these composed commands is successive refinement. 10173e12c5d1SDavid du ColombierA simple version of the command is tried, and if it's not good enough, 10183e12c5d1SDavid du Colombierit can be honed by adding a clause or two. 10193e12c5d1SDavid du Colombier(Mistakes can be undone; see below. 10203e12c5d1SDavid du ColombierAlso, the mouse language makes it unnecessary to retype the command each time.) 10213e12c5d1SDavid du ColombierThe resulting chains of commands are somewhat reminiscent of 10223e12c5d1SDavid du Colombiershell pipelines.\u\s-4\&7\s+4\d 10233e12c5d1SDavid du ColombierUnlike pipelines, though, which pass along modified 10243e12c5d1SDavid du Colombier.I data , 10253e12c5d1SDavid du Colombier.CW sam 10263e12c5d1SDavid du Colombiercommands pass a 10273e12c5d1SDavid du Colombier.I view 10283e12c5d1SDavid du Colombierof the data. 10293e12c5d1SDavid du ColombierThe text at each step of the command is the same, but which pieces 10303e12c5d1SDavid du Colombierare selected is refined step by step until the correct piece is 10313e12c5d1SDavid du Colombieravailable to the final step of the command line, which ultimately makes the change. 10323e12c5d1SDavid du Colombier.PP 1033219b2ee8SDavid du ColombierIn other UNIX programs, regular expressions are used only for selection, 10343e12c5d1SDavid du Colombieras in the 10353e12c5d1SDavid du Colombier.CW sam 10363e12c5d1SDavid du Colombier.CW g 10373e12c5d1SDavid du Colombiercommand, never for extraction as in the 10383e12c5d1SDavid du Colombier.CW x 10393e12c5d1SDavid du Colombieror 10403e12c5d1SDavid du Colombier.CW y 10413e12c5d1SDavid du Colombiercommand. 10423e12c5d1SDavid du ColombierFor example, patterns in 10433e12c5d1SDavid du Colombier.CW awk \u\s-4\&7\s+4\d 10443e12c5d1SDavid du Colombierare used to select lines to be operated on, but cannot be used 10453e12c5d1SDavid du Colombierto describe the format of the input text, or to handle newline-free text. 10463e12c5d1SDavid du ColombierThe use of regular expressions to describe the structure of a piece 10473e12c5d1SDavid du Colombierof text rather than its contents, as in the 10483e12c5d1SDavid du Colombier.CW x 10493e12c5d1SDavid du Colombiercommand, 10503e12c5d1SDavid du Colombierhas been given a name: 10513e12c5d1SDavid du Colombier.I 10523e12c5d1SDavid du Colombierstructural regular expressions. 10533e12c5d1SDavid du Colombier.R 10543e12c5d1SDavid du ColombierWhen they are composed, as in the above example, 10553e12c5d1SDavid du Colombierthey are pleasantly expressive. 10563e12c5d1SDavid du ColombierTheir use is discussed at greater length elsewhere.\u\s-4\&10\s+4\d 10573e12c5d1SDavid du Colombier.PP 10583e12c5d1SDavid du Colombier.SH 2 10593e12c5d1SDavid du ColombierMultiple files 10603e12c5d1SDavid du Colombier.LP 10613e12c5d1SDavid du Colombier.CW Sam 10623e12c5d1SDavid du Colombierhas a few other commands, mostly relating to input and output. 10633e12c5d1SDavid du Colombier.P1 10643e12c5d1SDavid du Colombiere discfilename 10653e12c5d1SDavid du Colombier.P2 10663e12c5d1SDavid du Colombierreplaces the contents and name of the current file with those of the named 10673e12c5d1SDavid du Colombierdisc file; 10683e12c5d1SDavid du Colombier.P1 10693e12c5d1SDavid du Colombierw discfilename 10703e12c5d1SDavid du Colombier.P2 10713e12c5d1SDavid du Colombierwrites the contents to the named disc file; and 10723e12c5d1SDavid du Colombier.P1 10733e12c5d1SDavid du Colombierr discfilename 10743e12c5d1SDavid du Colombier.P2 10753e12c5d1SDavid du Colombierreplaces dot with the contents of the named disc file. 10763e12c5d1SDavid du ColombierAll these commands use the current file's name if none is specified. 10773e12c5d1SDavid du ColombierFinally, 10783e12c5d1SDavid du Colombier.P1 10793e12c5d1SDavid du Colombierf discfilename 10803e12c5d1SDavid du Colombier.P2 10813e12c5d1SDavid du Colombierchanges the name associated with the file and displays the result: 10823e12c5d1SDavid du Colombier.P1 10833e12c5d1SDavid du Colombier\&'-. discfilename 10843e12c5d1SDavid du Colombier.P2 10853e12c5d1SDavid du ColombierThis output is called the file's 10863e12c5d1SDavid du Colombier.I 10873e12c5d1SDavid du Colombiermenu line, 10883e12c5d1SDavid du Colombier.R 10893e12c5d1SDavid du Colombierbecause it is the contents of the file's line in the button 3 menu (described 10903e12c5d1SDavid du Colombierin the 10913e12c5d1SDavid du Colombiernext section). 10923e12c5d1SDavid du ColombierThe first three characters are a concise notation for the state of the file. 10933e12c5d1SDavid du ColombierThe apostrophe signifies that the file is modified. 10943e12c5d1SDavid du ColombierThe minus sign indicates the number of windows 10953e12c5d1SDavid du Colombieropen on the file (see the next section): 10963e12c5d1SDavid du Colombier.CW - 10973e12c5d1SDavid du Colombiermeans none, 10983e12c5d1SDavid du Colombier.CW + 10993e12c5d1SDavid du Colombiermeans one, and 11003e12c5d1SDavid du Colombier.CW * 11013e12c5d1SDavid du Colombiermeans more than one. 11023e12c5d1SDavid du ColombierFinally, the period indicates that this is the current file. 11033e12c5d1SDavid du ColombierThese characters are useful for controlling the 11043e12c5d1SDavid du Colombier.CW X 11053e12c5d1SDavid du Colombiercommand, described shortly. 11063e12c5d1SDavid du Colombier.PP 11073e12c5d1SDavid du Colombier.CW Sam 11083e12c5d1SDavid du Colombiermay be started with a set of disc files (such as all the source for 11093e12c5d1SDavid du Colombiera program) by invoking it with a list of file names as arguments, and 11103e12c5d1SDavid du Colombiermore may be added or deleted on demand. 11113e12c5d1SDavid du Colombier.P1 11123e12c5d1SDavid du ColombierB discfile1 discfile2 ... 11133e12c5d1SDavid du Colombier.P2 11143e12c5d1SDavid du Colombieradds the named files to 11153e12c5d1SDavid du Colombier.CW sam 's 11163e12c5d1SDavid du Colombierlist, and 11173e12c5d1SDavid du Colombier.P1 11183e12c5d1SDavid du ColombierD discfile1 discfile2 ... 11193e12c5d1SDavid du Colombier.P2 11203e12c5d1SDavid du Colombierremoves them from 11213e12c5d1SDavid du Colombier.CW sam 's 11223e12c5d1SDavid du Colombiermemory (without effect on associated disc files). 11233e12c5d1SDavid du ColombierBoth these commands have a syntax for using the shell\u\s-4\&7\s+4\d 1124219b2ee8SDavid du Colombier(the UNIX command interpreter) to generate the lists: 11253e12c5d1SDavid du Colombier.P1 11263e12c5d1SDavid du ColombierB <echo *.c 11273e12c5d1SDavid du Colombier.P2 11283e12c5d1SDavid du Colombierwill add all C source files, and 11293e12c5d1SDavid du Colombier.P1 11303e12c5d1SDavid du ColombierB <grep -l variable *.c 11313e12c5d1SDavid du Colombier.P2 11323e12c5d1SDavid du Colombierwill add all C source files referencing a particular variable 1133219b2ee8SDavid du Colombier(the UNIX command 11343e12c5d1SDavid du Colombier.CW grep\ -l 11353e12c5d1SDavid du Colombierlists all files in its arguments that contain matches of 11363e12c5d1SDavid du Colombierthe specified regular expression). 11373e12c5d1SDavid du ColombierFinally, 11383e12c5d1SDavid du Colombier.CW D 11393e12c5d1SDavid du Colombierwithout arguments deletes the current file. 11403e12c5d1SDavid du Colombier.PP 11413e12c5d1SDavid du ColombierThere are two ways to change which file is current: 11423e12c5d1SDavid du Colombier.P1 11433e12c5d1SDavid du Colombierb filename 11443e12c5d1SDavid du Colombier.P2 11453e12c5d1SDavid du Colombiermakes the named file current. 11463e12c5d1SDavid du ColombierThe 11473e12c5d1SDavid du Colombier.CW B 11483e12c5d1SDavid du Colombiercommand 11493e12c5d1SDavid du Colombierdoes the same, but also adds any new files to 11503e12c5d1SDavid du Colombier.CW sam 's 11513e12c5d1SDavid du Colombierlist. 11523e12c5d1SDavid du Colombier(In practice, of course, the current file 11533e12c5d1SDavid du Colombieris usually chosen by mouse actions, not by textual commands.) 11543e12c5d1SDavid du ColombierThe other way is to use a form of address that refers to files: 11553e12c5d1SDavid du Colombier.P1 11563e12c5d1SDavid du Colombier"\f2expression\fP" \f2address\fP 11573e12c5d1SDavid du Colombier.P2 11583e12c5d1SDavid du Colombierrefers to the address evaluated in the file whose menu line 11593e12c5d1SDavid du Colombiermatches the expression (there must be exactly one match). 11603e12c5d1SDavid du ColombierFor example, 11613e12c5d1SDavid du Colombier.P1 11623e12c5d1SDavid du Colombier"peter.c" 3 11633e12c5d1SDavid du Colombier.P2 11643e12c5d1SDavid du Colombierrefers to the third line of the file whose name matches 11653e12c5d1SDavid du Colombier.CW peter.c . 11663e12c5d1SDavid du ColombierThis is most useful in the move 11673e12c5d1SDavid du Colombier.CW m ) ( 11683e12c5d1SDavid du Colombierand copy 11693e12c5d1SDavid du Colombier.CW t ) ( 11703e12c5d1SDavid du Colombiercommands: 11713e12c5d1SDavid du Colombier.P1 11723e12c5d1SDavid du Colombier0,$ t "peter.c" 0 11733e12c5d1SDavid du Colombier.P2 11743e12c5d1SDavid du Colombiermakes a copy of the current file at the beginning of 11753e12c5d1SDavid du Colombier.CW peter.c . 11763e12c5d1SDavid du Colombier.PP 11773e12c5d1SDavid du ColombierThe 11783e12c5d1SDavid du Colombier.CW X 11793e12c5d1SDavid du Colombiercommand 11803e12c5d1SDavid du Colombieris a looping construct, like 11813e12c5d1SDavid du Colombier.CW x , 11823e12c5d1SDavid du Colombierthat refers to files instead of strings: 11833e12c5d1SDavid du Colombier.P1 11843e12c5d1SDavid du ColombierX/\f2expression\fP/ \f2command\fP 11853e12c5d1SDavid du Colombier.P2 11863e12c5d1SDavid du Colombierruns the command in all 11873e12c5d1SDavid du Colombierfiles whose menu lines match the expression. The best example is 11883e12c5d1SDavid du Colombier.P1 11893e12c5d1SDavid du ColombierX/'/ w 11903e12c5d1SDavid du Colombier.P2 11913e12c5d1SDavid du Colombierwhich writes to disc all modified files. 11923e12c5d1SDavid du Colombier.CW Y 11933e12c5d1SDavid du Colombieris the complement of 11943e12c5d1SDavid du Colombier.CW X : 11953e12c5d1SDavid du Colombierit runs the command on all files whose menu lines don't match the expression: 11963e12c5d1SDavid du Colombier.P1 11973e12c5d1SDavid du ColombierY/\e.c/ D 11983e12c5d1SDavid du Colombier.P2 11993e12c5d1SDavid du Colombierdeletes all files that don't have 12003e12c5d1SDavid du Colombier.CW \&.c 12013e12c5d1SDavid du Colombierin their names, that is, it keeps all C source files and deletes the rest. 12023e12c5d1SDavid du Colombier.PP 12033e12c5d1SDavid du ColombierBraces allow commands to be grouped, so 12043e12c5d1SDavid du Colombier.P1 12053e12c5d1SDavid du Colombier{ 12063e12c5d1SDavid du Colombier \f2command1\fP 12073e12c5d1SDavid du Colombier \f2command2\fP 12083e12c5d1SDavid du Colombier} 12093e12c5d1SDavid du Colombier.P2 12103e12c5d1SDavid du Colombieris syntactically a single command that runs two commands. 12113e12c5d1SDavid du ColombierThus, 12123e12c5d1SDavid du Colombier.P1 12133e12c5d1SDavid du ColombierX/\e.c/ ,g/variable/ { 12143e12c5d1SDavid du Colombier f 12153e12c5d1SDavid du Colombier , x/.*\en/ g/variable/ p 12163e12c5d1SDavid du Colombier} 12173e12c5d1SDavid du Colombier.P2 12183e12c5d1SDavid du Colombierfinds all occurrences of 12193e12c5d1SDavid du Colombier.CW variable 12203e12c5d1SDavid du Colombierin C source files, and prints 12213e12c5d1SDavid du Colombierout the file names and lines of each match. 12223e12c5d1SDavid du ColombierThe precise semantics of compound operations is discussed in the implementation 12233e12c5d1SDavid du Colombiersections below. 12243e12c5d1SDavid du Colombier.PP 12253e12c5d1SDavid du ColombierFinally, 12263e12c5d1SDavid du Colombierthe undo command, 12273e12c5d1SDavid du Colombier.CW u , 12283e12c5d1SDavid du Colombierundoes the last command, 12293e12c5d1SDavid du Colombierno matter how many files were affected. 12303e12c5d1SDavid du ColombierMultiple undo operations move further back in time, so 12313e12c5d1SDavid du Colombier.P1 12323e12c5d1SDavid du Colombieru 12333e12c5d1SDavid du Colombieru 12343e12c5d1SDavid du Colombier.P2 12353e12c5d1SDavid du Colombier(which may be abbreviated 12363e12c5d1SDavid du Colombier.CW u2 ) 12373e12c5d1SDavid du Colombierundoes the last two commands. An undo may not be undone, however, nor 12383e12c5d1SDavid du Colombiermay any command that adds or deletes files. 12393e12c5d1SDavid du ColombierEverything else is undoable, though, including for example 12403e12c5d1SDavid du Colombier.CW e 12413e12c5d1SDavid du Colombiercommands: 12423e12c5d1SDavid du Colombier.P1 12433e12c5d1SDavid du Colombiere filename 12443e12c5d1SDavid du Colombieru 12453e12c5d1SDavid du Colombier.P2 12463e12c5d1SDavid du Colombierrestores the state of the file completely, including its name, dot, 12473e12c5d1SDavid du Colombierand modified bit. Because of the undo, potentially dangerous commands 12483e12c5d1SDavid du Colombierare not guarded by confirmations. Only 12493e12c5d1SDavid du Colombier.CW D , 12503e12c5d1SDavid du Colombierwhich destroys the information necessary to restore itself, is protected. 12513e12c5d1SDavid du ColombierIt will not delete a modified file, but a second 12523e12c5d1SDavid du Colombier.CW D 12533e12c5d1SDavid du Colombierof the same file will succeed regardless. 12543e12c5d1SDavid du ColombierThe 12553e12c5d1SDavid du Colombier.CW q 12563e12c5d1SDavid du Colombiercommand, which exits 12573e12c5d1SDavid du Colombier.CW sam , 12583e12c5d1SDavid du Colombieris similarly guarded. 12593e12c5d1SDavid du Colombier.SH 2 12603e12c5d1SDavid du ColombierMouse Interface 12613e12c5d1SDavid du Colombier.LP 12623e12c5d1SDavid du Colombier.CW Sam 12633e12c5d1SDavid du Colombieris most commonly run 12643e12c5d1SDavid du Colombierconnected to a bitmap display and mouse for interactive editing. 12653e12c5d1SDavid du ColombierThe only difference in the command language 12663e12c5d1SDavid du Colombierbetween regular, mouse-driven 12673e12c5d1SDavid du Colombier.CW sam 12683e12c5d1SDavid du Colombierand 12693e12c5d1SDavid du Colombier.CW sam\ -d 12703e12c5d1SDavid du Colombieris that if an address 12713e12c5d1SDavid du Colombieris provided without a command, 12723e12c5d1SDavid du Colombier.CW sam\ -d 12733e12c5d1SDavid du Colombierwill print the text referenced by the address, but 12743e12c5d1SDavid du Colombierregular 12753e12c5d1SDavid du Colombier.CW sam 12763e12c5d1SDavid du Colombierwill highlight it on the screen \(em in fact, 12773e12c5d1SDavid du Colombierdot is always highlighted (see Figure 2). 12783e12c5d1SDavid du Colombier.WS 1 12793e12c5d1SDavid du Colombier.KF 1280426d2b71SDavid du Colombier.XP fig3 2.04i 12813e12c5d1SDavid du Colombier.Cs 12823e12c5d1SDavid du ColombierFigure 2. A 12833e12c5d1SDavid du Colombier.CW sam 12843e12c5d1SDavid du Colombierwindow. The scroll bar down the left 12853e12c5d1SDavid du Colombierrepresents the file, with the bubble showing the fraction 12863e12c5d1SDavid du Colombiervisible in the window. 12873e12c5d1SDavid du ColombierThe scroll bar may be manipulated by the mouse for convenient browsing. 12883e12c5d1SDavid du ColombierThe current text, 12893e12c5d1SDavid du Colombierwhich is highlighted, need not fit on a line. Here it consists of one partial 12903e12c5d1SDavid du Colombierline, one complete line, and final partial line. 12913e12c5d1SDavid du Colombier.Ce 12923e12c5d1SDavid du Colombier.KE 12933e12c5d1SDavid du Colombier.PP 12943e12c5d1SDavid du ColombierEach file may have zero or more windows open on the display. 12953e12c5d1SDavid du ColombierAt any time, only one window in all of 12963e12c5d1SDavid du Colombier.CW sam 12973e12c5d1SDavid du Colombieris the 12983e12c5d1SDavid du Colombier.I 12993e12c5d1SDavid du Colombiercurrent window, 13003e12c5d1SDavid du Colombier.R 13013e12c5d1SDavid du Colombierthat is, the window to which typing and mouse actions refer; 13023e12c5d1SDavid du Colombierthis may be the 13033e12c5d1SDavid du Colombier.CW sam 13043e12c5d1SDavid du Colombierwindow (that in which commands may be typed) 13053e12c5d1SDavid du Colombieror one of the file windows. 13063e12c5d1SDavid du ColombierWhen a file has multiple windows, the image of the file in each window 13073e12c5d1SDavid du Colombieris always kept up to date. 13083e12c5d1SDavid du ColombierThe current file is the last file affected by a command, 13093e12c5d1SDavid du Colombierso if the 13103e12c5d1SDavid du Colombier.CW sam 13113e12c5d1SDavid du Colombierwindow is current, 13123e12c5d1SDavid du Colombierthe current window is not a window on the current file. 13133e12c5d1SDavid du ColombierHowever, each window on a file has its own value of dot, 13143e12c5d1SDavid du Colombierand when switching between windows on a single file, 13153e12c5d1SDavid du Colombierthe file's value of dot is changed to that of the window. 13163e12c5d1SDavid du ColombierThus, flipping between windows behaves in the obvious, convenient way. 13173e12c5d1SDavid du Colombier.PP 13183e12c5d1SDavid du ColombierThe mouse on the Blit has three buttons, numbered left to right. 13193e12c5d1SDavid du ColombierButton 3 has a list of commands to manipulate windows, 13203e12c5d1SDavid du Colombierfollowed by a list of `menu lines' exactly as printed by the 13213e12c5d1SDavid du Colombier.CW f 13223e12c5d1SDavid du Colombiercommand, one per file (not one per window). 13233e12c5d1SDavid du ColombierThese menu lines are sorted by file name. 13243e12c5d1SDavid du ColombierIf the list is long, the Blit menu software will make it more manageable 13253e12c5d1SDavid du Colombierby generating a scrolling menu instead of an unwieldy long list. 13263e12c5d1SDavid du ColombierUsing the menu to select a file from the list makes that file the current 13273e12c5d1SDavid du Colombierfile, and the most recently current window in that file the current window. 13283e12c5d1SDavid du ColombierBut if that file is already current, selecting it in the menu cycles through 13293e12c5d1SDavid du Colombierthe windows on the file; this simple trick avoids a special menu to 13303e12c5d1SDavid du Colombierchoose windows on a file. 13313e12c5d1SDavid du ColombierIf there is no window open on the file, 13323e12c5d1SDavid du Colombier.CW sam 13333e12c5d1SDavid du Colombierchanges the mouse cursor to prompt the user to create one. 13343e12c5d1SDavid du Colombier.PP 13353e12c5d1SDavid du ColombierThe commands on the button 3 menu are straightforward (see Figure 3), and 13363e12c5d1SDavid du Colombierare like the commands to manipulate windows in 13373e12c5d1SDavid du Colombier.CW mux ,\u\s-4\&8\s+4\d 13383e12c5d1SDavid du Colombierthe Blit's window system. 13393e12c5d1SDavid du Colombier.CW New 13403e12c5d1SDavid du Colombiermakes a new file, and gives it one empty window, whose size is determined 13413e12c5d1SDavid du Colombierby a rectangle swept by the mouse. 1342219b2ee8SDavid du Colombier.CW Zerox 13433e12c5d1SDavid du Colombierprompts for a window to be selected, and 13443e12c5d1SDavid du Colombiermakes a clone of that window; this is how multiple windows are created on one file. 13453e12c5d1SDavid du Colombier.CW Reshape 13463e12c5d1SDavid du Colombierchanges the size of the indicated window, and 13473e12c5d1SDavid du Colombier.CW close 13483e12c5d1SDavid du Colombierdeletes it. If that is the last window open on the file, 13493e12c5d1SDavid du Colombier.CW close 13503e12c5d1SDavid du Colombierfirst does a 13513e12c5d1SDavid du Colombier.CW D 13523e12c5d1SDavid du Colombiercommand on the file. 13533e12c5d1SDavid du Colombier.CW Write 13543e12c5d1SDavid du Colombieris identical to a 13553e12c5d1SDavid du Colombier.CW w 13563e12c5d1SDavid du Colombiercommand on the file; it is in the menu purely for convenience. 13573e12c5d1SDavid du ColombierFinally, 13583e12c5d1SDavid du Colombier.CW ~~sam~~ 13593e12c5d1SDavid du Colombieris a menu item that appears between the commands and the file names. 13603e12c5d1SDavid du ColombierSelecting it makes the 13613e12c5d1SDavid du Colombier.CW sam 13623e12c5d1SDavid du Colombierwindow the current window, 13633e12c5d1SDavid du Colombiercausing subsequent typing to be interpreted as commands. 13643e12c5d1SDavid du Colombier.KF 1365426d2b71SDavid du Colombier.XP fig2 2.74i 13663e12c5d1SDavid du Colombier.Cs 13673e12c5d1SDavid du ColombierFigure 3. The menu on button 3. 13683e12c5d1SDavid du ColombierThe black rectangle on the left is a scroll bar; the menu is limited to 13693e12c5d1SDavid du Colombierthe length shown to prevent its becoming unwieldy. 13703e12c5d1SDavid du ColombierAbove the 13713e12c5d1SDavid du Colombier.CW ~~sam~~ 13723e12c5d1SDavid du Colombierline is a list of commands; 13733e12c5d1SDavid du Colombierbeneath it is a list of files, presented exactly as with the 13743e12c5d1SDavid du Colombier.CW f 13753e12c5d1SDavid du Colombiercommand. 13763e12c5d1SDavid du Colombier.Ce 13773e12c5d1SDavid du Colombier.KE 13783e12c5d1SDavid du Colombier.PP 13793e12c5d1SDavid du ColombierWhen 13803e12c5d1SDavid du Colombier.CW sam 13813e12c5d1SDavid du Colombierrequests that a window be swept, in response to 13823e12c5d1SDavid du Colombier.CW new , 1383219b2ee8SDavid du Colombier.CW zerox 13843e12c5d1SDavid du Colombieror 13853e12c5d1SDavid du Colombier.CW reshape , 13863e12c5d1SDavid du Colombierit changes the mouse cursor from the usual arrow to a box with 13873e12c5d1SDavid du Colombiera small arrow. 13883e12c5d1SDavid du ColombierIn this state, the mouse may be used to indicate an arbitrary rectangle by 13893e12c5d1SDavid du Colombierpressing button 3 at one corner and releasing it at the opposite corner. 13903e12c5d1SDavid du ColombierMore conveniently, 13913e12c5d1SDavid du Colombierbutton 3 may simply be clicked, 13923e12c5d1SDavid du Colombierwhereupon 13933e12c5d1SDavid du Colombier.CW sam 13943e12c5d1SDavid du Colombiercreates the maximal rectangle that contains the cursor 13953e12c5d1SDavid du Colombierand abuts the 13963e12c5d1SDavid du Colombier.CW sam 13973e12c5d1SDavid du Colombierwindow. 13983e12c5d1SDavid du ColombierBy placing the 13993e12c5d1SDavid du Colombier.CW sam 14003e12c5d1SDavid du Colombierwindow in the middle of the screen, the user can define two regions (one above, 14013e12c5d1SDavid du Colombierone below) in which stacked fully-overlapping 14023e12c5d1SDavid du Colombierwindows can be created with minimal fuss (see Figure 1). 14033e12c5d1SDavid du ColombierThis simple user interface trick makes window creation noticeably easier. 14043e12c5d1SDavid du Colombier.PP 14053e12c5d1SDavid du ColombierThe cut-and-paste editor is essentially the same as that in Smalltalk-80.\u\s-4\&11\s+4\d 14063e12c5d1SDavid du ColombierThe text in dot is always highlighted on the screen. 14073e12c5d1SDavid du ColombierWhen a character is typed it replaces dot, and sets dot to the null 14083e12c5d1SDavid du Colombierstring after the character. Thus, ordinary typing inserts text. 14093e12c5d1SDavid du ColombierButton 1 is used for selection: 14103e12c5d1SDavid du Colombierpressing the button, moving the mouse, and lifting the button 14113e12c5d1SDavid du Colombierselects (sets dot to) the text between the points where the 14123e12c5d1SDavid du Colombierbutton was pressed and released. 14133e12c5d1SDavid du ColombierPressing and releasing at the same point selects a null string; this 14143e12c5d1SDavid du Colombieris called clicking. Clicking twice quickly, or 14153e12c5d1SDavid du Colombier.I 14163e12c5d1SDavid du Colombierdouble clicking, 14173e12c5d1SDavid du Colombier.R 14183e12c5d1SDavid du Colombierselects larger objects; 14193e12c5d1SDavid du Colombierfor example, double clicking in a word selects the word, 14203e12c5d1SDavid du Colombierdouble clicking just inside an opening bracket selects the text 14213e12c5d1SDavid du Colombiercontained in the brackets (handling nested brackets correctly), 14223e12c5d1SDavid du Colombierand similarly for 14233e12c5d1SDavid du Colombierparentheses, quotes, and so on. 14243e12c5d1SDavid du ColombierThe double-clicking rules reflect a bias toward 14253e12c5d1SDavid du Colombierprogrammers. 14263e12c5d1SDavid du ColombierIf 14273e12c5d1SDavid du Colombier.CW sam 14283e12c5d1SDavid du Colombierwere intended more for word processing, double-clicks would probably 14293e12c5d1SDavid du Colombierselect linguistic structures such as sentences. 14303e12c5d1SDavid du Colombier.PP 14313e12c5d1SDavid du ColombierIf button 1 is pressed outside the current window, it makes the indicated 14323e12c5d1SDavid du Colombierwindow current. 14333e12c5d1SDavid du ColombierThis is the easiest way to switch between windows and files. 14343e12c5d1SDavid du Colombier.PP 14353e12c5d1SDavid du ColombierPressing button 2 brings up a menu of editing functions (see Figure 4). 14363e12c5d1SDavid du ColombierThese mostly apply to the selected text: 14373e12c5d1SDavid du Colombier.CW cut 14383e12c5d1SDavid du Colombierdeletes the selected text, and remembers it in a hidden buffer called the 14393e12c5d1SDavid du Colombier.I 14403e12c5d1SDavid du Colombiersnarf buffer, 14413e12c5d1SDavid du Colombier.R 14423e12c5d1SDavid du Colombier.CW paste 14433e12c5d1SDavid du Colombierreplaces the selected text by the contents of the snarf buffer, 14443e12c5d1SDavid du Colombier.CW snarf 14453e12c5d1SDavid du Colombierjust copies the selected text to the snarf buffer, 14463e12c5d1SDavid du Colombier.CW look 14473e12c5d1SDavid du Colombiersearches forward for the next literal occurrence of the selected text, and 14483e12c5d1SDavid du Colombier.CW <mux> 14493e12c5d1SDavid du Colombierexchanges snarf buffers with the window system in which 14503e12c5d1SDavid du Colombier.CW sam 14513e12c5d1SDavid du Colombieris running. 14523e12c5d1SDavid du ColombierFinally, the last regular expression used appears as a menu entry 14533e12c5d1SDavid du Colombierto search 14543e12c5d1SDavid du Colombierforward for the next occurrence of a match for the expression. 14553e12c5d1SDavid du Colombier.WS 1 14563e12c5d1SDavid du Colombier.KF 1457426d2b71SDavid du Colombier.XP fig4 1.20i 14583e12c5d1SDavid du Colombier.Cs 14593e12c5d1SDavid du ColombierFigure 4. The menu on button 2. 14603e12c5d1SDavid du ColombierThe bottom entry tracks the most recently used regular expression, which may 14613e12c5d1SDavid du Colombierbe literal text. 14623e12c5d1SDavid du Colombier.Ce 14633e12c5d1SDavid du Colombier.KE 14643e12c5d1SDavid du Colombier.PP 14653e12c5d1SDavid du ColombierThe relationship between the command language and the mouse language is 14663e12c5d1SDavid du Colombierentirely due to the equality of dot and the selected text chosen 14673e12c5d1SDavid du Colombierwith button 1 on the mouse. 14683e12c5d1SDavid du ColombierFor example, to make a set of changes in a C subroutine, dot can be 14693e12c5d1SDavid du Colombierset by double clicking on the left brace that begins the subroutine, 14703e12c5d1SDavid du Colombierwhich sets dot for the command language. 14713e12c5d1SDavid du ColombierAn address-free command then typed in the 14723e12c5d1SDavid du Colombier.CW sam 14733e12c5d1SDavid du Colombierwindow will apply only to the text between the opening and closing 14743e12c5d1SDavid du Colombierbraces of the function. 14753e12c5d1SDavid du ColombierThe idea is to select what you want, and then say what you want 14763e12c5d1SDavid du Colombierto do with it, whether invoked by a menu selection or by a typed command. 14773e12c5d1SDavid du ColombierAnd of course, the value of dot is highlighted on 14783e12c5d1SDavid du Colombierthe display after the command completes. 14793e12c5d1SDavid du ColombierThis relationship between mouse interface and command language 14803e12c5d1SDavid du Colombieris clumsy to explain, but comfortable, even natural, in practice. 14813e12c5d1SDavid du Colombier.SH 14823e12c5d1SDavid du ColombierThe Implementation 14833e12c5d1SDavid du Colombier.LP 14843e12c5d1SDavid du ColombierThe next few sections describe how 14853e12c5d1SDavid du Colombier.CW sam 14863e12c5d1SDavid du Colombieris put together, first the host part, 14873e12c5d1SDavid du Colombierthen the inter-component communication, 14883e12c5d1SDavid du Colombierthen the terminal part. 14893e12c5d1SDavid du ColombierAfter explaining how the command language is implemented, 14903e12c5d1SDavid du Colombierthe discussion follows (roughly) the path of a character 14913e12c5d1SDavid du Colombierfrom the temporary file on disc to the screen. 14923e12c5d1SDavid du ColombierThe presentation centers on the data structures, 14933e12c5d1SDavid du Colombierbecause that is how the program was designed and because 14943e12c5d1SDavid du Colombierthe algorithms are easy to provide, given the right data 14953e12c5d1SDavid du Colombierstructures. 14963e12c5d1SDavid du Colombier.SH 2 14973e12c5d1SDavid du ColombierParsing and execution 14983e12c5d1SDavid du Colombier.LP 14993e12c5d1SDavid du ColombierThe command language is interpreted by parsing each command with a 15003e12c5d1SDavid du Colombiertable-driven recursive 15013e12c5d1SDavid du Colombierdescent parser, and when a complete command is assembled, invoking a top-down 15023e12c5d1SDavid du Colombierexecutor. 15033e12c5d1SDavid du ColombierMost editors instead employ a simple character-at-a-time 15043e12c5d1SDavid du Colombierlexical scanner. 15053e12c5d1SDavid du ColombierUse of a parser makes it 15063e12c5d1SDavid du Colombiereasy and unambiguous to detect when a command is complete, 15073e12c5d1SDavid du Colombierwhich has two advantages. 15083e12c5d1SDavid du ColombierFirst, escape conventions such as backslashes to quote 15093e12c5d1SDavid du Colombiermultiple-line commands are unnecessary; if the command isn't finished, 15103e12c5d1SDavid du Colombierthe parser keeps reading. For example, a multiple-line append driven by an 15113e12c5d1SDavid du Colombier.CW x 15123e12c5d1SDavid du Colombiercommand is straightforward: 15133e12c5d1SDavid du Colombier.P1 15143e12c5d1SDavid du Colombierx/.*\en/ g/Peter/ a 15153e12c5d1SDavid du Colombierone line about Peter 15163e12c5d1SDavid du Colombieranother line about Peter 15173e12c5d1SDavid du Colombier\&. 15183e12c5d1SDavid du Colombier.P2 1519219b2ee8SDavid du ColombierOther UNIX editors would require a backslash after all but the last line. 15203e12c5d1SDavid du Colombier.PP 15213e12c5d1SDavid du ColombierThe other advantage is specific to the two-process structure of 15223e12c5d1SDavid du Colombier.CW sam . 15233e12c5d1SDavid du ColombierThe host process must decide when a command is completed so the 15243e12c5d1SDavid du Colombiercommand interpreter can be called. This problem is easily resolved 15253e12c5d1SDavid du Colombierby having the lexical analyzer read the single stream of events from the 15263e12c5d1SDavid du Colombierterminal, directly executing all typing and mouse commands, 15273e12c5d1SDavid du Colombierbut passing to the parser characters typed to the 15283e12c5d1SDavid du Colombier.CW sam 15293e12c5d1SDavid du Colombiercommand window. 15303e12c5d1SDavid du ColombierThis scheme is slightly complicated by the availability of cut-and-paste 15313e12c5d1SDavid du Colombierediting in the 15323e12c5d1SDavid du Colombier.CW sam 15333e12c5d1SDavid du Colombierwindow, but that difficulty is resolved by applying the rules 15343e12c5d1SDavid du Colombierused in 15353e12c5d1SDavid du Colombier.CW mux : 15363e12c5d1SDavid du Colombierwhen a newline is typed to the 15373e12c5d1SDavid du Colombier.CW sam 15383e12c5d1SDavid du Colombierwindow, all text between the newline and the previously typed newline 15393e12c5d1SDavid du Colombieris made available to the parser. 15403e12c5d1SDavid du ColombierThis permits arbitrary editing to be done to a command before 15413e12c5d1SDavid du Colombiertyping newline and thereby requesting execution. 15423e12c5d1SDavid du Colombier.PP 15433e12c5d1SDavid du ColombierThe parser is driven by a table because the syntax of addresses 15443e12c5d1SDavid du Colombierand commands is regular enough 15453e12c5d1SDavid du Colombierto be encoded compactly. There are few special cases, such as the 15463e12c5d1SDavid du Colombierreplacement text in a substitution, so the syntax of almost all commands 15473e12c5d1SDavid du Colombiercan be encoded with a few flags. 15483e12c5d1SDavid du ColombierThese include whether the command allows an address (for example, 15493e12c5d1SDavid du Colombier.CW e 15503e12c5d1SDavid du Colombierdoes not), whether it takes a regular expression (as in 15513e12c5d1SDavid du Colombier.CW x 15523e12c5d1SDavid du Colombierand 15533e12c5d1SDavid du Colombier.CW s ), 15543e12c5d1SDavid du Colombierwhether it takes replacement text (as in 15553e12c5d1SDavid du Colombier.CW c 15563e12c5d1SDavid du Colombieror 15573e12c5d1SDavid du Colombier.CW i ), 15583e12c5d1SDavid du Colombierwhich may be multi-line, and so on. 15593e12c5d1SDavid du ColombierThe internal syntax of regular expressions is handled by a separate 15603e12c5d1SDavid du Colombierparser; a regular expression is a leaf of the command parse tree. 15613e12c5d1SDavid du ColombierRegular expressions are discussed fully in the next section. 15623e12c5d1SDavid du Colombier.PP 15633e12c5d1SDavid du ColombierThe parser table also has information about defaults, so the interpreter 15643e12c5d1SDavid du Colombieris always called with a complete tree. For example, the parser fills in 15653e12c5d1SDavid du Colombierthe implicit 15663e12c5d1SDavid du Colombier.CW 0 15673e12c5d1SDavid du Colombierand 15683e12c5d1SDavid du Colombier.CW $ 15693e12c5d1SDavid du Colombierin the abbreviated address 15703e12c5d1SDavid du Colombier.CW , 15713e12c5d1SDavid du Colombier(comma), 15723e12c5d1SDavid du Colombierinserts a 15733e12c5d1SDavid du Colombier.CW + 15743e12c5d1SDavid du Colombierto the left of an unadorned regular expression in an address, 15753e12c5d1SDavid du Colombierand provides the usual default address 15763e12c5d1SDavid du Colombier.CW . 15773e12c5d1SDavid du Colombier(dot) for commands that expect an address but are not given one. 15783e12c5d1SDavid du Colombier.PP 15793e12c5d1SDavid du ColombierOnce a complete command is parsed, the evaluation is easy. 15803e12c5d1SDavid du ColombierThe address is evaluated left-to-right starting from the value of dot, 15813e12c5d1SDavid du Colombierwith a mostly ordinary expression evaluator. 15823e12c5d1SDavid du ColombierAddresses, like many of the data structures in 15833e12c5d1SDavid du Colombier.CW sam , 15843e12c5d1SDavid du Colombierare held in a C structure and passed around by value: 15853e12c5d1SDavid du Colombier.P1 15863e12c5d1SDavid du Colombiertypedef long Posn; /* Position in a file */ 15873e12c5d1SDavid du Colombiertypedef struct Range{ 15883e12c5d1SDavid du Colombier Posn p1, p2; 15893e12c5d1SDavid du Colombier}Range; 15903e12c5d1SDavid du Colombiertypedef struct Address{ 15913e12c5d1SDavid du Colombier Range r; 15923e12c5d1SDavid du Colombier File *f; 15933e12c5d1SDavid du Colombier}Address; 15943e12c5d1SDavid du Colombier.P2 15953e12c5d1SDavid du ColombierAn address is encoded as a substring (character positions 15963e12c5d1SDavid du Colombier.CW p1 15973e12c5d1SDavid du Colombierto 15983e12c5d1SDavid du Colombier.CW p2 ) 15993e12c5d1SDavid du Colombierin a file 16003e12c5d1SDavid du Colombier.CW f . 16013e12c5d1SDavid du Colombier(The data type 16023e12c5d1SDavid du Colombier.CW File 16033e12c5d1SDavid du Colombieris described in detail below.) 16043e12c5d1SDavid du Colombier.PP 16053e12c5d1SDavid du ColombierThe address interpreter is an 16063e12c5d1SDavid du Colombier.CW Address -valued 16073e12c5d1SDavid du Colombierfunction that traverses the parse tree describing an address (the 16083e12c5d1SDavid du Colombierparse tree for the address has type 16093e12c5d1SDavid du Colombier.CW Addrtree ): 16103e12c5d1SDavid du Colombier.P1 16113e12c5d1SDavid du ColombierAddress 16123e12c5d1SDavid du Colombieraddress(ap, a, sign) 16133e12c5d1SDavid du Colombier Addrtree *ap; 16143e12c5d1SDavid du Colombier Address a; 16153e12c5d1SDavid du Colombier int sign; 16163e12c5d1SDavid du Colombier{ 16173e12c5d1SDavid du Colombier Address a2; 16183e12c5d1SDavid du Colombier do 16193e12c5d1SDavid du Colombier switch(ap->type){ 16203e12c5d1SDavid du Colombier case '.': 16213e12c5d1SDavid du Colombier a=a.f->dot; 16223e12c5d1SDavid du Colombier break; 16233e12c5d1SDavid du Colombier case '$': 16243e12c5d1SDavid du Colombier a.r.p1=a.r.p2=a.f->nbytes; 16253e12c5d1SDavid du Colombier break; 16263e12c5d1SDavid du Colombier case '"': 16273e12c5d1SDavid du Colombier a=matchfile(a, ap->aregexp)->dot; 16283e12c5d1SDavid du Colombier break; 16293e12c5d1SDavid du Colombier case ',': 16303e12c5d1SDavid du Colombier a2=address(ap->right, a, 0); 16313e12c5d1SDavid du Colombier a=address(ap->left, a, 0); 16323e12c5d1SDavid du Colombier if(a.f!=a2.f || a2.r.p2<a.r.p1) 16333e12c5d1SDavid du Colombier error(Eorder); 16343e12c5d1SDavid du Colombier a.r.p2=a2.r.p2; 16353e12c5d1SDavid du Colombier return a; 16363e12c5d1SDavid du Colombier /* and so on */ 16373e12c5d1SDavid du Colombier } 16383e12c5d1SDavid du Colombier while((ap=ap->right)!=0); 16393e12c5d1SDavid du Colombier return a; 16403e12c5d1SDavid du Colombier} 16413e12c5d1SDavid du Colombier.P2 16423e12c5d1SDavid du Colombier.PP 16433e12c5d1SDavid du ColombierThroughout, errors are handled by a non-local 16443e12c5d1SDavid du Colombier.CW goto 16453e12c5d1SDavid du Colombier(a 16463e12c5d1SDavid du Colombier.CW setjmp/longjmp 16473e12c5d1SDavid du Colombierin C terminology) 16483e12c5d1SDavid du Colombierhidden in a routine called 16493e12c5d1SDavid du Colombier.CW error 16503e12c5d1SDavid du Colombierthat immediately aborts the execution, retracts any 16513e12c5d1SDavid du Colombierpartially made changes (see the section below on `undoing'), and 16523e12c5d1SDavid du Colombierreturns to the top level of the parser. 16533e12c5d1SDavid du ColombierThe argument to 16543e12c5d1SDavid du Colombier.CW error 16553e12c5d1SDavid du Colombieris an enumeration type that 16563e12c5d1SDavid du Colombieris translated to a terse but possibly helpful 16573e12c5d1SDavid du Colombiermessage such as `?addresses out of order.' 16583e12c5d1SDavid du ColombierVery common messages are kept short; for example the message for 16593e12c5d1SDavid du Colombiera failed regular expression search is `?search.' 16603e12c5d1SDavid du Colombier.PP 16613e12c5d1SDavid du ColombierCharacter addresses such as 16623e12c5d1SDavid du Colombier.CW #3 16633e12c5d1SDavid du Colombierare trivial to implement, as the 16643e12c5d1SDavid du Colombier.CW File 16653e12c5d1SDavid du Colombierdata structure is accessible by character number. 16663e12c5d1SDavid du ColombierHowever, 16673e12c5d1SDavid du Colombier.CW sam 16683e12c5d1SDavid du Colombierkeeps no information about the position of newlines \(em it is too 16693e12c5d1SDavid du Colombierexpensive to track dynamically \(em so line addresses are computed by reading 16703e12c5d1SDavid du Colombierthe file, counting newlines. Except in very large files, this has proven 16713e12c5d1SDavid du Colombieracceptable: file access is fast enough to make the technique practical, 16723e12c5d1SDavid du Colombierand lines are not central to the structure of the command language. 16733e12c5d1SDavid du Colombier.PP 16743e12c5d1SDavid du ColombierThe command interpreter, called 16753e12c5d1SDavid du Colombier.CW cmdexec , 16763e12c5d1SDavid du Colombieris also straightforward. The parse table includes a 16773e12c5d1SDavid du Colombierfunction to call to interpret a particular command. That function 16783e12c5d1SDavid du Colombierreceives as arguments 16793e12c5d1SDavid du Colombierthe calculated address 16803e12c5d1SDavid du Colombierfor the command 16813e12c5d1SDavid du Colombierand the command tree (of type 16823e12c5d1SDavid du Colombier.CW Cmdtree ), 16833e12c5d1SDavid du Colombierwhich may contain information such as the subtree for compound commands. 16843e12c5d1SDavid du ColombierHere, for example, is the function for the 16853e12c5d1SDavid du Colombier.CW g 16863e12c5d1SDavid du Colombierand 16873e12c5d1SDavid du Colombier.CW v 16883e12c5d1SDavid du Colombiercommands: 16893e12c5d1SDavid du Colombier.P1 16903e12c5d1SDavid du Colombierint 16913e12c5d1SDavid du Colombierg_cmd(a, cp) 16923e12c5d1SDavid du Colombier Address a; 16933e12c5d1SDavid du Colombier Cmdtree *cp; 16943e12c5d1SDavid du Colombier{ 16953e12c5d1SDavid du Colombier compile(cp->regexp); 16963e12c5d1SDavid du Colombier if(execute(a.f, a.r.p1, a.r.p2)!=(cp->cmdchar=='v')){ 16973e12c5d1SDavid du Colombier a.f->dot=a; 16983e12c5d1SDavid du Colombier return cmdexec(a, cp->subcmd); 16993e12c5d1SDavid du Colombier } 1700219b2ee8SDavid du Colombier return TRUE; /* cause execution to continue */ 17013e12c5d1SDavid du Colombier} 17023e12c5d1SDavid du Colombier.P2 17033e12c5d1SDavid du Colombier.CW Compile "" ( 17043e12c5d1SDavid du Colombierand 17053e12c5d1SDavid du Colombier.CW execute 17063e12c5d1SDavid du Colombierare part of the regular expression code, described in the next section.) 17073e12c5d1SDavid du ColombierBecause the parser and the 17083e12c5d1SDavid du Colombier.CW File 17093e12c5d1SDavid du Colombierdata structure do most of the work, most commands 17103e12c5d1SDavid du Colombierare similarly brief. 17113e12c5d1SDavid du Colombier.SH 2 17123e12c5d1SDavid du ColombierRegular expressions 17133e12c5d1SDavid du Colombier.LP 17143e12c5d1SDavid du ColombierThe regular expression code in 17153e12c5d1SDavid du Colombier.CW sam 17163e12c5d1SDavid du Colombieris an interpreted, rather than compiled on-the-fly, implementation of Thompson's 17173e12c5d1SDavid du Colombiernon-deterministic finite automaton algorithm.\u\s-4\&12\s+4\d 1718219b2ee8SDavid du ColombierThe syntax and semantics of the expressions are as in the UNIX program 17193e12c5d1SDavid du Colombier.CW egrep , 17203e12c5d1SDavid du Colombierincluding alternation, closures, character classes, and so on. 17213e12c5d1SDavid du ColombierThe only changes in the notation are two additions: 17223e12c5d1SDavid du Colombier.CW \en 17233e12c5d1SDavid du Colombieris translated to, and matches, a newline character, and 17243e12c5d1SDavid du Colombier.CW @ 17253e12c5d1SDavid du Colombiermatches any character. In 17263e12c5d1SDavid du Colombier.CW egrep , 17273e12c5d1SDavid du Colombierthe character 17283e12c5d1SDavid du Colombier.CW \&. 17293e12c5d1SDavid du Colombiermatches any character except newline, and in 17303e12c5d1SDavid du Colombier.CW sam 17313e12c5d1SDavid du Colombierthe same rule seemed safest, to prevent idioms like 17323e12c5d1SDavid du Colombier.CW \&.* 17333e12c5d1SDavid du Colombierfrom spanning newlines. 17343e12c5d1SDavid du Colombier.CW Egrep 17353e12c5d1SDavid du Colombierexpressions are arguably too complicated for an interactive editor \(em 17363e12c5d1SDavid du Colombiercertainly it would make sense if all the special characters were two-character 17373e12c5d1SDavid du Colombiersequences, so that most of the punctuation characters wouldn't have 17383e12c5d1SDavid du Colombierpeculiar meanings \(em but for an interesting command language, full 17393e12c5d1SDavid du Colombierregular expressions are necessary, and 17403e12c5d1SDavid du Colombier.CW egrep 1741219b2ee8SDavid du Colombierdefines the full regular expression syntax for UNIX programs. 1742219b2ee8SDavid du ColombierAlso, it seemed superfluous to define a new syntax, since various UNIX programs 17433e12c5d1SDavid du Colombier.CW ed , ( 17443e12c5d1SDavid du Colombier.CW egrep 17453e12c5d1SDavid du Colombierand 17463e12c5d1SDavid du Colombier.CW vi ) 17473e12c5d1SDavid du Colombierdefine too many already. 17483e12c5d1SDavid du Colombier.PP 17493e12c5d1SDavid du ColombierThe expressions are compiled by a routine, 17503e12c5d1SDavid du Colombier.CW compile , 17513e12c5d1SDavid du Colombierthat generates the description of the non-deterministic finite state machine. 17523e12c5d1SDavid du ColombierA second routine, 17533e12c5d1SDavid du Colombier.CW execute , 17543e12c5d1SDavid du Colombierinterprets the machine to generate the leftmost-longest match of the 17553e12c5d1SDavid du Colombierexpression in a substring of the file. 17563e12c5d1SDavid du ColombierThe algorithm is described elsewhere.\u\s-4\&12,13\s+4\d 17573e12c5d1SDavid du Colombier.CW Execute 17583e12c5d1SDavid du Colombierreports 17593e12c5d1SDavid du Colombierwhether a match was found, and sets a global variable, 17603e12c5d1SDavid du Colombierof type 17613e12c5d1SDavid du Colombier.CW Range , 17623e12c5d1SDavid du Colombierto the substring matched. 17633e12c5d1SDavid du Colombier.PP 17643e12c5d1SDavid du ColombierA trick is required to evaluate the expression in reverse, such as when 17653e12c5d1SDavid du Colombiersearching backwards for an expression. 17663e12c5d1SDavid du ColombierFor example, 17673e12c5d1SDavid du Colombier.P1 17683e12c5d1SDavid du Colombier-/P.*r/ 17693e12c5d1SDavid du Colombier.P2 17703e12c5d1SDavid du Colombierlooks backwards through the file for a match of the expression. 17713e12c5d1SDavid du ColombierThe expression, however, is defined for a forward search. 17723e12c5d1SDavid du ColombierThe solution is to construct a machine identical to the machine 17733e12c5d1SDavid du Colombierfor a forward search except for a reversal of all the concatenation 17743e12c5d1SDavid du Colombieroperators (the other operators are symmetric under direction reversal), 17753e12c5d1SDavid du Colombierto exchange the meaning of the operators 17763e12c5d1SDavid du Colombier.CW ^ 17773e12c5d1SDavid du Colombierand 17783e12c5d1SDavid du Colombier.CW $ , 17793e12c5d1SDavid du Colombierand then to read the file backwards, looking for the 17803e12c5d1SDavid du Colombierusual earliest longest match. 17813e12c5d1SDavid du Colombier.PP 17823e12c5d1SDavid du Colombier.CW Execute 17833e12c5d1SDavid du Colombiergenerates only one match each time it is called. 17843e12c5d1SDavid du ColombierTo interpret looping constructs such as the 17853e12c5d1SDavid du Colombier.CW x 17863e12c5d1SDavid du Colombiercommand, 17873e12c5d1SDavid du Colombier.CW sam 17883e12c5d1SDavid du Colombiermust therefore synchronize between 17893e12c5d1SDavid du Colombiercalls of 17903e12c5d1SDavid du Colombier.CW execute 17913e12c5d1SDavid du Colombierto avoid 17923e12c5d1SDavid du Colombierproblems with null matches. 17933e12c5d1SDavid du ColombierFor example, even given the leftmost-longest rule, 17943e12c5d1SDavid du Colombierthe expression 17953e12c5d1SDavid du Colombier.CW a* 17963e12c5d1SDavid du Colombiermatches three times in the string 17973e12c5d1SDavid du Colombier.CW ab 17983e12c5d1SDavid du Colombier(the character 17993e12c5d1SDavid du Colombier.CW a , 18003e12c5d1SDavid du Colombierthe null string between the 18013e12c5d1SDavid du Colombier.CW a 18023e12c5d1SDavid du Colombierand 18033e12c5d1SDavid du Colombier.CW b , 18043e12c5d1SDavid du Colombierand the final null string). 18053e12c5d1SDavid du ColombierAfter returning a match for the 18063e12c5d1SDavid du Colombier.CW a , 18073e12c5d1SDavid du Colombier.CW sam 18083e12c5d1SDavid du Colombiermust not match the null string before the 18093e12c5d1SDavid du Colombier.CW b . 18103e12c5d1SDavid du ColombierThe algorithm starts 18113e12c5d1SDavid du Colombier.CW execute 18123e12c5d1SDavid du Colombierat the end of its previous match, and 18133e12c5d1SDavid du Colombierif the match it returns 18143e12c5d1SDavid du Colombieris null and abuts the previous match, rejects the match and advances 18153e12c5d1SDavid du Colombierthe initial position one character. 18163e12c5d1SDavid du Colombier.SH 2 18173e12c5d1SDavid du ColombierMemory allocation 18183e12c5d1SDavid du Colombier.LP 18193e12c5d1SDavid du ColombierThe C language has no memory allocation primitives, although a standard 18203e12c5d1SDavid du Colombierlibrary routine, 18213e12c5d1SDavid du Colombier.CW malloc , 18223e12c5d1SDavid du Colombierprovides adequate service for simple programs. 18233e12c5d1SDavid du ColombierFor specific uses, however, 18243e12c5d1SDavid du Colombierit can be better to write a custom allocator. 18253e12c5d1SDavid du ColombierThe allocator (or rather, pair of allocators) described here 18263e12c5d1SDavid du Colombierwork in both the terminal and host parts of 18273e12c5d1SDavid du Colombier.CW sam . 18283e12c5d1SDavid du ColombierThey are designed for efficient manipulation of strings, 18293e12c5d1SDavid du Colombierwhich are allocated and freed frequently and vary in length from essentially 18303e12c5d1SDavid du Colombierzero to 32 Kbytes (very large strings are written to disc). 18313e12c5d1SDavid du ColombierMore important, strings may be large and change size often, 18323e12c5d1SDavid du Colombierso to minimize memory usage it is helpful to reclaim and to coalesce the 18333e12c5d1SDavid du Colombierunused portions of strings when they are truncated. 18343e12c5d1SDavid du Colombier.PP 18353e12c5d1SDavid du ColombierObjects to be allocated in 18363e12c5d1SDavid du Colombier.CW sam 18373e12c5d1SDavid du Colombierare of two flavors: 18383e12c5d1SDavid du Colombierthe first is C 18393e12c5d1SDavid du Colombier.CW structs , 18403e12c5d1SDavid du Colombierwhich are small and often addressed by pointer variables; 18413e12c5d1SDavid du Colombierthe second is variable-sized arrays of characters 18423e12c5d1SDavid du Colombieror integers whose 18433e12c5d1SDavid du Colombierbase pointer is always used to access them. 18443e12c5d1SDavid du ColombierThe memory allocator in 18453e12c5d1SDavid du Colombier.CW sam 18463e12c5d1SDavid du Colombieris therefore in two parts: 18473e12c5d1SDavid du Colombierfirst, a traditional first-fit allocator that provides fixed storage for 18483e12c5d1SDavid du Colombier.CW structs ; 18493e12c5d1SDavid du Colombierand second, a garbage-compacting allocator that reduces storage 18503e12c5d1SDavid du Colombieroverhead for variable-sized objects, at the cost of some bookkeeping. 18513e12c5d1SDavid du ColombierThe two types of objects are allocated from adjoining arenas, with 18523e12c5d1SDavid du Colombierthe garbage-compacting allocator controlling the arena with higher addresses. 18533e12c5d1SDavid du ColombierSeparating into two arenas simplifies compaction and prevents fragmentation due 18543e12c5d1SDavid du Colombierto immovable objects. 18553e12c5d1SDavid du ColombierThe access rules for garbage-compactable objects 18563e12c5d1SDavid du Colombier(discussed in the next paragraph) allow them to be relocated, so when 18573e12c5d1SDavid du Colombierthe first-fit arena needs space, it moves the garbage-compacted arena 18583e12c5d1SDavid du Colombierto higher addresses to make room. Storage is therefore created only 18593e12c5d1SDavid du Colombierat successively higher addresses, either when more garbage-compacted 18603e12c5d1SDavid du Colombierspace is needed or when the first-fit arena pushes up the other arena. 18613e12c5d1SDavid du Colombier.PP 18623e12c5d1SDavid du ColombierObjects that may be compacted declare to the 18633e12c5d1SDavid du Colombierallocator a cell that is guaranteed to be the sole repository of the 18643e12c5d1SDavid du Colombieraddress of the object whenever a compaction can occur. 18653e12c5d1SDavid du ColombierThe compactor can then update the address when the object is moved. 18663e12c5d1SDavid du ColombierFor example, the implementation of type 18673e12c5d1SDavid du Colombier.CW List 18683e12c5d1SDavid du Colombier(really a variable-length array) 18693e12c5d1SDavid du Colombieris: 18703e12c5d1SDavid du Colombier.P1 18713e12c5d1SDavid du Colombiertypedef struct List{ 18723e12c5d1SDavid du Colombier int nused; 18733e12c5d1SDavid du Colombier long *ptr; 18743e12c5d1SDavid du Colombier}List; 18753e12c5d1SDavid du Colombier.P2 18763e12c5d1SDavid du ColombierThe 18773e12c5d1SDavid du Colombier.CW ptr 18783e12c5d1SDavid du Colombiercell must always be used directly, and never copied. When a 18793e12c5d1SDavid du Colombier.CW List 18803e12c5d1SDavid du Colombieris to be created the 18813e12c5d1SDavid du Colombier.CW List 18823e12c5d1SDavid du Colombierstructure is allocated in the ordinary first-fit arena 18833e12c5d1SDavid du Colombierand its 18843e12c5d1SDavid du Colombier.CW ptr 18853e12c5d1SDavid du Colombieris allocated in the garbage-compacted arena. 18863e12c5d1SDavid du ColombierA similar data type for strings, called 18873e12c5d1SDavid du Colombier.CW String , 18883e12c5d1SDavid du Colombierstores variable-length character arrays of up to 32767 elements. 18893e12c5d1SDavid du Colombier.PP 18903e12c5d1SDavid du ColombierA related matter of programming style: 18913e12c5d1SDavid du Colombier.CW sam 18923e12c5d1SDavid du Colombierfrequently passes structures by value, which 18933e12c5d1SDavid du Colombiersimplifies the code. 18943e12c5d1SDavid du ColombierTraditionally, C programs have 18953e12c5d1SDavid du Colombierpassed structures by reference, but implicit allocation on 18963e12c5d1SDavid du Colombierthe stack is easier to use. 18973e12c5d1SDavid du ColombierStructure passing is a relatively new feature of C 18983e12c5d1SDavid du Colombier(it is not in the 18993e12c5d1SDavid du Colombierstandard reference manual for C\u\s-4\&14\s+4\d), and is poorly supported in most 19003e12c5d1SDavid du Colombiercommercial C compilers. 19013e12c5d1SDavid du ColombierIt's convenient and expressive, though, 19023e12c5d1SDavid du Colombierand simplifies memory management by 19033e12c5d1SDavid du Colombieravoiding the allocator altogether 19043e12c5d1SDavid du Colombierand eliminating pointer aliases. 19053e12c5d1SDavid du Colombier.SH 2 19063e12c5d1SDavid du ColombierData structures for manipulating files 19073e12c5d1SDavid du Colombier.LP 19083e12c5d1SDavid du ColombierExperience with 19093e12c5d1SDavid du Colombier.CW jim 19103e12c5d1SDavid du Colombiershowed that the requirements 19113e12c5d1SDavid du Colombierof the file data structure were few, but strict. 19123e12c5d1SDavid du ColombierFirst, files need to be read and written quickly; 19133e12c5d1SDavid du Colombieradding a fresh file must be painless. 19143e12c5d1SDavid du ColombierSecond, the implementation must place no arbitrary upper limit on 19153e12c5d1SDavid du Colombierthe number or sizes of files. (It should be practical to edit many files, 19163e12c5d1SDavid du Colombierand files up to megabytes in length should be handled gracefully.) 19173e12c5d1SDavid du ColombierThis implies that files be stored on disc, not in main memory. 19183e12c5d1SDavid du Colombier(Aficionados of virtual memory may argue otherwise, but the 19193e12c5d1SDavid du Colombierimplementation of virtual 19203e12c5d1SDavid du Colombiermemory in our system is not something to depend on 19213e12c5d1SDavid du Colombierfor good performance.) 19223e12c5d1SDavid du ColombierThird, changes to files need be made by only two primitives: 19233e12c5d1SDavid du Colombierdeletion and insertion. 19243e12c5d1SDavid du ColombierThese are inverses of each other, 19253e12c5d1SDavid du Colombierwhich simplifies the implementation of the undo operation. 19263e12c5d1SDavid du ColombierFinally, 19273e12c5d1SDavid du Colombierit must be easy and efficient to access the file, either 19283e12c5d1SDavid du Colombierforwards or backwards, a byte at a time. 19293e12c5d1SDavid du Colombier.PP 19303e12c5d1SDavid du ColombierThe 19313e12c5d1SDavid du Colombier.CW File 19323e12c5d1SDavid du Colombierdata type is constructed from three simpler data structures that hold arrays 19333e12c5d1SDavid du Colombierof characters. 19343e12c5d1SDavid du ColombierEach of these types has an insertion and deletion operator, and the 19353e12c5d1SDavid du Colombierinsertion and deletion operators of the 19363e12c5d1SDavid du Colombier.CW File 19373e12c5d1SDavid du Colombiertype itself are constructed from them. 19383e12c5d1SDavid du Colombier.PP 19393e12c5d1SDavid du ColombierThe simplest type is the 19403e12c5d1SDavid du Colombier.CW String , 19413e12c5d1SDavid du Colombierwhich is used to hold strings in main memory. 19423e12c5d1SDavid du ColombierThe code that manages 19433e12c5d1SDavid du Colombier.CW Strings 19443e12c5d1SDavid du Colombierguarantees that they will never be longer 19453e12c5d1SDavid du Colombierthan some moderate size, and in practice they are rarely larger than 8 Kbytes. 19463e12c5d1SDavid du Colombier.CW Strings 19473e12c5d1SDavid du Colombierhave two purposes: they hold short strings like file names with little overhead, 19483e12c5d1SDavid du Colombierand because they are deliberately small, they are efficient to modify. 19493e12c5d1SDavid du ColombierThey are therefore used as the data structure for in-memory caches. 19503e12c5d1SDavid du Colombier.PP 19513e12c5d1SDavid du ColombierThe disc copy of the file is managed by a data structure called a 19523e12c5d1SDavid du Colombier.CW Disc , 19533e12c5d1SDavid du Colombierwhich corresponds to a temporary file. A 19543e12c5d1SDavid du Colombier.CW Disc 19553e12c5d1SDavid du Colombierhas no storage in main memory other than bookkeeping information; 19563e12c5d1SDavid du Colombierthe actual data being held is all on the disc. 19573e12c5d1SDavid du ColombierTo reduce the number of open files needed, 19583e12c5d1SDavid du Colombier.CW sam 1959219b2ee8SDavid du Colombieropens a dozen temporary UNIX files and multiplexes the 19603e12c5d1SDavid du Colombier.CW Discs 19613e12c5d1SDavid du Colombierupon them. 19623e12c5d1SDavid du ColombierThis permits many files to 19633e12c5d1SDavid du Colombierbe edited; the entire 19643e12c5d1SDavid du Colombier.CW sam 19653e12c5d1SDavid du Colombiersource (48 files) may be edited comfortably with a single 19663e12c5d1SDavid du Colombierinstance of 19673e12c5d1SDavid du Colombier.CW sam . 19683e12c5d1SDavid du ColombierAllocating one temporary file per 19693e12c5d1SDavid du Colombier.CW Disc 19703e12c5d1SDavid du Colombierwould strain the operating system's limit on the number of open files. 19713e12c5d1SDavid du ColombierAlso, spreading the traffic among temporary files keeps the files shorter, 1972219b2ee8SDavid du Colombierand shorter files are more efficiently implemented by the UNIX 19733e12c5d1SDavid du ColombierI/O subsystem. 19743e12c5d1SDavid du Colombier.PP 19753e12c5d1SDavid du ColombierA 19763e12c5d1SDavid du Colombier.CW Disc 19773e12c5d1SDavid du Colombieris an array of fixed-length blocks, each of which contains 19783e12c5d1SDavid du Colombierbetween 1 and 4096 characters of active data. 1979219b2ee8SDavid du Colombier(The block size of our UNIX file system is 4096 bytes.) 19803e12c5d1SDavid du ColombierThe block addresses within the temporary file and the length of each 19813e12c5d1SDavid du Colombierblock are stored in a 19823e12c5d1SDavid du Colombier.CW List . 19833e12c5d1SDavid du ColombierWhen changes are made the live part of blocks may change size. 19843e12c5d1SDavid du ColombierBlocks are created and coalesced when necessary to try to keep the sizes 19853e12c5d1SDavid du Colombierbetween 2048 and 4096 bytes. 19863e12c5d1SDavid du ColombierAn actively changing part of the 19873e12c5d1SDavid du Colombier.CW Disc 19883e12c5d1SDavid du Colombiertherefore typically has about a kilobyte of slop that can be 19893e12c5d1SDavid du Colombierinserted or deleted 19903e12c5d1SDavid du Colombierwithout changing more than one block or affecting the block order. 19913e12c5d1SDavid du ColombierWhen an insertion would overflow a block, the block is split, a new one 19923e12c5d1SDavid du Colombieris allocated to receive the overflow, and the memory-resident list of blocks 19933e12c5d1SDavid du Colombieris rearranged to reflect the insertion of the new block. 19943e12c5d1SDavid du Colombier.PP 19953e12c5d1SDavid du ColombierObviously, going to the disc for every modification to the file is 19963e12c5d1SDavid du Colombierprohibitively expensive. 19973e12c5d1SDavid du ColombierThe data type 19983e12c5d1SDavid du Colombier.CW Buffer 19993e12c5d1SDavid du Colombierconsists of a 20003e12c5d1SDavid du Colombier.CW Disc 20013e12c5d1SDavid du Colombierto hold the data and a 20023e12c5d1SDavid du Colombier.CW String 20033e12c5d1SDavid du Colombierthat acts as a cache. 20043e12c5d1SDavid du ColombierThis is the first of a series of caches throughout the data structures in 20053e12c5d1SDavid du Colombier.CW sam. 20063e12c5d1SDavid du ColombierThe caches not only improve performance, they provide a way to organize 20073e12c5d1SDavid du Colombierthe flow of data, particularly in the communication between the host 20083e12c5d1SDavid du Colombierand terminal. 20093e12c5d1SDavid du ColombierThis idea is developed below, in the section on communications. 20103e12c5d1SDavid du Colombier.PP 20113e12c5d1SDavid du ColombierTo reduce disc traffic, changes to a 20123e12c5d1SDavid du Colombier.CW Buffer 20133e12c5d1SDavid du Colombierare mediated by a variable-length string, in memory, that acts as a cache. 20143e12c5d1SDavid du ColombierWhen an insertion or deletion is made to a 20153e12c5d1SDavid du Colombier.CW Buffer , 20163e12c5d1SDavid du Colombierif the change can be accommodated by the cache, it is done there. 20173e12c5d1SDavid du ColombierIf the cache becomes bigger than a block because of an insertion, 20183e12c5d1SDavid du Colombiersome of it is written to the 20193e12c5d1SDavid du Colombier.CW Disc 20203e12c5d1SDavid du Colombierand deleted from the cache. 20213e12c5d1SDavid du ColombierIf the change does not intersect the cache, the cache is flushed. 20223e12c5d1SDavid du ColombierThe cache is only loaded at the new position if the change is smaller than a block; 20233e12c5d1SDavid du Colombierotherwise, it is sent directly to the 20243e12c5d1SDavid du Colombier.CW Disc . 20253e12c5d1SDavid du ColombierThis is because 20263e12c5d1SDavid du Colombierlarge changes are typically sequential, 20273e12c5d1SDavid du Colombierwhereupon the next change is unlikely to overlap the current one. 20283e12c5d1SDavid du Colombier.PP 20293e12c5d1SDavid du ColombierA 20303e12c5d1SDavid du Colombier.CW File 20313e12c5d1SDavid du Colombiercomprises a 20323e12c5d1SDavid du Colombier.CW String 20333e12c5d1SDavid du Colombierto hold the file name and some ancillary data such as dot and the modified bit. 20343e12c5d1SDavid du ColombierThe most important components, though, are a pair of 20353e12c5d1SDavid du Colombier.CW Buffers , 20363e12c5d1SDavid du Colombierone called the transcript and the other the contents. 20373e12c5d1SDavid du ColombierTheir use is described in the next section. 20383e12c5d1SDavid du Colombier.PP 20393e12c5d1SDavid du ColombierThe overall structure is shown in Figure 5. 20403e12c5d1SDavid du ColombierAlthough it may seem that the data is touched many times on its 20413e12c5d1SDavid du Colombierway from the 20423e12c5d1SDavid du Colombier.CW Disc , 2043219b2ee8SDavid du Colombierit is read (by one UNIX system call) directly into the cache of the 20443e12c5d1SDavid du Colombierassociated 20453e12c5d1SDavid du Colombier.CW Buffer ; 20463e12c5d1SDavid du Colombierno extra copy is done. 20473e12c5d1SDavid du ColombierSimilarly, when flushing the cache, the text is written 20483e12c5d1SDavid du Colombierdirectly from the cache to disc. 20493e12c5d1SDavid du ColombierMost operations act directly on the text in the cache. 20503e12c5d1SDavid du ColombierA principle applied throughout 20513e12c5d1SDavid du Colombier.CW sam 20523e12c5d1SDavid du Colombieris that the fewer times the data is copied, the faster the program will run 20533e12c5d1SDavid du Colombier(see also the paper by Waite\u\s-4\&15\s+4\d). 20543e12c5d1SDavid du Colombier.KF 20553e12c5d1SDavid du Colombier.PS 20563e12c5d1SDavid du Colombiercopy "fig5.pic" 20573e12c5d1SDavid du Colombier.PE 20583e12c5d1SDavid du Colombier.Cs 20593e12c5d1SDavid du ColombierFigure 5. File data structures. 20603e12c5d1SDavid du ColombierThe temporary files are stored in the standard repository for such files 20613e12c5d1SDavid du Colombieron the host system. 20623e12c5d1SDavid du Colombier.Ce 20633e12c5d1SDavid du Colombier.KE 20643e12c5d1SDavid du Colombier.PP 20653e12c5d1SDavid du ColombierThe contents of a 20663e12c5d1SDavid du Colombier.CW File 20673e12c5d1SDavid du Colombierare accessed by a routine that 20683e12c5d1SDavid du Colombiercopies to a buffer a substring of a file starting at a specified offset. 20693e12c5d1SDavid du ColombierTo read a byte at a time, a 20703e12c5d1SDavid du Colombier.CW File "" per- 20713e12c5d1SDavid du Colombierarray is loaded starting from a specified initial position, 20723e12c5d1SDavid du Colombierand bytes may then be read from the array. 20733e12c5d1SDavid du ColombierThe implementation is done by a macro similar to the C standard I/O 20743e12c5d1SDavid du Colombier.CW getc 20753e12c5d1SDavid du Colombiermacro.\u\s-4\&14\s+4\d 20763e12c5d1SDavid du ColombierBecause the reading may be done at any address, a minor change to the 20773e12c5d1SDavid du Colombiermacro allows the file to be read backwards. 20783e12c5d1SDavid du ColombierThis array is read-only; there is no 20793e12c5d1SDavid du Colombier.CW putc . 20803e12c5d1SDavid du Colombier.SH 2 20813e12c5d1SDavid du ColombierDoing and undoing 20823e12c5d1SDavid du Colombier.LP 20833e12c5d1SDavid du Colombier.CW Sam 20843e12c5d1SDavid du Colombierhas an unusual method for managing changes to files. 20853e12c5d1SDavid du ColombierThe command language makes it easy to specify multiple variable-length changes 20863e12c5d1SDavid du Colombierto a file millions of bytes long, and such changes 20873e12c5d1SDavid du Colombiermust be made efficiently if the editor is to be practical. 20883e12c5d1SDavid du ColombierThe usual techniques for inserting and deleting strings 20893e12c5d1SDavid du Colombierare inadequate under these conditions. 20903e12c5d1SDavid du ColombierThe 20913e12c5d1SDavid du Colombier.CW Buffer 20923e12c5d1SDavid du Colombierand 20933e12c5d1SDavid du Colombier.CW Disc 20943e12c5d1SDavid du Colombierdata structures are designed for efficient random access to long strings, 20953e12c5d1SDavid du Colombierbut care must be taken to avoid super-linear behavior when making 20963e12c5d1SDavid du Colombiermany changes simultaneously. 20973e12c5d1SDavid du Colombier.PP 20983e12c5d1SDavid du Colombier.CW Sam 20993e12c5d1SDavid du Colombieruses a two-pass algorithm for making changes, and treats each file as a database 21003e12c5d1SDavid du Colombieragainst which transactions are registered. 21013e12c5d1SDavid du ColombierChanges are not made directly to the contents. 21023e12c5d1SDavid du ColombierInstead, when a command is started, a `mark' containing 21033e12c5d1SDavid du Colombiera sequence number is placed in the transcript 21043e12c5d1SDavid du Colombier.CW Buffer , 21053e12c5d1SDavid du Colombierand each change made to the file, either an insertion or deletion 21063e12c5d1SDavid du Colombieror a change to the file name, 21073e12c5d1SDavid du Colombieris appended to the end of the transcript. 21083e12c5d1SDavid du ColombierWhen the command is complete, the transcript is rewound to the 21093e12c5d1SDavid du Colombiermark and applied to the contents. 21103e12c5d1SDavid du Colombier.PP 21113e12c5d1SDavid du ColombierOne reason for separating evaluation from 21123e12c5d1SDavid du Colombierapplication in this way is to simplify tracking the addresses of changes 21133e12c5d1SDavid du Colombiermade in the middle of a long sequence. 21143e12c5d1SDavid du ColombierThe two-pass algorithm also allows all changes to apply to the 21153e12c5d1SDavid du Colombier.I original 21163e12c5d1SDavid du Colombierdata: no change can affect another change made in the same command. 21173e12c5d1SDavid du ColombierThis is particularly important when evaluating an 21183e12c5d1SDavid du Colombier.CW x 21193e12c5d1SDavid du Colombiercommand because it prevents regular expression matches 21203e12c5d1SDavid du Colombierfrom stumbling over changes made earlier in the execution. 21213e12c5d1SDavid du ColombierAlso, the two-pass 2122219b2ee8SDavid du Colombieralgorithm is cleaner than the way other UNIX editors allow changes to 21233e12c5d1SDavid du Colombieraffect each other; 21243e12c5d1SDavid du Colombierfor example, 21253e12c5d1SDavid du Colombier.CW ed 's 21263e12c5d1SDavid du Colombieridioms to do things like delete every other line 21273e12c5d1SDavid du Colombierdepend critically on the implementation. 21283e12c5d1SDavid du ColombierInstead, 21293e12c5d1SDavid du Colombier.CW sam 's 21303e12c5d1SDavid du Colombiersimple model, in which all changes in a command occur effectively 21313e12c5d1SDavid du Colombiersimultaneously, is easy to explain and to understand. 21323e12c5d1SDavid du Colombier.PP 21333e12c5d1SDavid du ColombierThe records in the transcript are of the form ``delete substring from 21343e12c5d1SDavid du Colombierlocations 21353e12c5d1SDavid du Colombier123 to 456'' and ``insert 11 characters `hello there' at location 789.'' 21363e12c5d1SDavid du Colombier(It is an error if the changes are not at monotonically greater 21373e12c5d1SDavid du Colombierpositions through the file.) 21383e12c5d1SDavid du ColombierWhile the update is occurring, these numbers must be 21393e12c5d1SDavid du Colombieroffset by earlier changes, but that is straightforward and 21403e12c5d1SDavid du Colombierlocal to the update routine; 21413e12c5d1SDavid du Colombiermoreover, all the numbers have been computed 21423e12c5d1SDavid du Colombierbefore the first is examined. 21433e12c5d1SDavid du Colombier.PP 21443e12c5d1SDavid du ColombierTreating the file as a transaction system has another advantage: 21453e12c5d1SDavid du Colombierundo is trivial. 21463e12c5d1SDavid du ColombierAll it takes is to invert the transcript after it has been 21473e12c5d1SDavid du Colombierimplemented, converting insertions 21483e12c5d1SDavid du Colombierinto deletions and vice versa, and saving them in a holding 21493e12c5d1SDavid du Colombier.CW Buffer . 21503e12c5d1SDavid du ColombierThe `do' transcript can then be deleted from 21513e12c5d1SDavid du Colombierthe transcript 21523e12c5d1SDavid du Colombier.CW Buffer 21533e12c5d1SDavid du Colombierand replaced by the `undo' transcript. 21543e12c5d1SDavid du ColombierIf an undo is requested, the transcript is rewound and the undo transcript 21553e12c5d1SDavid du Colombierexecuted. 21563e12c5d1SDavid du ColombierBecause the transcript 21573e12c5d1SDavid du Colombier.CW Buffer 21583e12c5d1SDavid du Colombieris not truncated after each command, it accumulates 21593e12c5d1SDavid du Colombiersuccessive changes. 21603e12c5d1SDavid du ColombierA sequence of undo commands 21613e12c5d1SDavid du Colombiercan therefore back up the file arbitrarily, 21623e12c5d1SDavid du Colombierwhich is more helpful than the more commonly implemented self-inverse form of undo. 21633e12c5d1SDavid du Colombier.CW Sam "" ( 21643e12c5d1SDavid du Colombierprovides no way to undo an undo, but if it were desired, 21653e12c5d1SDavid du Colombierit would be easy to provide by re-interpreting the `do' transcript.) 21663e12c5d1SDavid du ColombierEach mark in the transcript contains a sequence number and the offset into 21673e12c5d1SDavid du Colombierthe transcript of the previous mark, to aid in unwinding the transcript. 21683e12c5d1SDavid du ColombierMarks also contain the value of dot and the modified bit so these can be 21693e12c5d1SDavid du Colombierrestored easily. 21703e12c5d1SDavid du ColombierUndoing multiple files is easy; it merely demands undoing all files whose 21713e12c5d1SDavid du Colombierlatest change has the same sequence number as the current file. 21723e12c5d1SDavid du Colombier.PP 21733e12c5d1SDavid du ColombierAnother benefit of having a transcript is that errors encountered in the middle 21743e12c5d1SDavid du Colombierof a complicated command need not leave the files in an intermediate state. 21753e12c5d1SDavid du ColombierBy rewinding the transcript to the mark beginning the command, 21763e12c5d1SDavid du Colombierthe partial command can be trivially undone. 21773e12c5d1SDavid du Colombier.PP 21783e12c5d1SDavid du ColombierWhen the update algorithm was first implemented, it was unacceptably slow, 21793e12c5d1SDavid du Colombierso a cache was added to coalesce nearby changes, 21803e12c5d1SDavid du Colombierreplacing multiple small changes by a single larger one. 21813e12c5d1SDavid du ColombierThis reduced the number 21823e12c5d1SDavid du Colombierof insertions into the transaction 21833e12c5d1SDavid du Colombier.CW Buffer , 21843e12c5d1SDavid du Colombierand made a dramatic improvement in performance, 21853e12c5d1SDavid du Colombierbut made it impossible 21863e12c5d1SDavid du Colombierto handle changes in non-monotonic order in the file; the caching method 21873e12c5d1SDavid du Colombieronly works if changes don't overlap. 21883e12c5d1SDavid du ColombierBefore the cache was added, the transaction could in principle be sorted 21893e12c5d1SDavid du Colombierif the changes were out of order, although 21903e12c5d1SDavid du Colombierthis was never done. 21913e12c5d1SDavid du ColombierThe current status is therefore acceptable performance with a minor 21923e12c5d1SDavid du Colombierrestriction on global changes, which is sometimes, but rarely, an annoyance. 21933e12c5d1SDavid du Colombier.PP 21943e12c5d1SDavid du ColombierThe update algorithm obviously paws the data more than simpler 21953e12c5d1SDavid du Colombieralgorithms, but it is not prohibitively expensive; 21963e12c5d1SDavid du Colombierthe caches help. 21973e12c5d1SDavid du Colombier(The principle of avoiding copying the data is still honored here, 21983e12c5d1SDavid du Colombieralthough not as piously: 21993e12c5d1SDavid du Colombierthe data is moved from contents' cache to 22003e12c5d1SDavid du Colombierthe transcript's all at once and through only one internal buffer.) 22013e12c5d1SDavid du ColombierPerformance figures confirm the efficiency. 22023e12c5d1SDavid du ColombierTo read from a dead start a hundred kilobyte file on a VAX-11/750 22033e12c5d1SDavid du Colombiertakes 1.4 seconds of user time, 2.5 seconds of system time, 22043e12c5d1SDavid du Colombierand 5 seconds of real time. 22053e12c5d1SDavid du ColombierReading the same file in 22063e12c5d1SDavid du Colombier.CW ed 22073e12c5d1SDavid du Colombiertakes 6.0 seconds of user time, 1.7 seconds of system time, 22083e12c5d1SDavid du Colombierand 8 seconds of real time. 22093e12c5d1SDavid du Colombier.CW Sam 22103e12c5d1SDavid du Colombieruses about half the CPU time. 22113e12c5d1SDavid du ColombierA more interesting example is the one stated above: 22123e12c5d1SDavid du Colombierinserting a character between every pair of characters in the file. 22133e12c5d1SDavid du ColombierThe 22143e12c5d1SDavid du Colombier.CW sam 22153e12c5d1SDavid du Colombiercommand is 22163e12c5d1SDavid du Colombier.P1 22173e12c5d1SDavid du Colombier,y/@/ a/x/ 22183e12c5d1SDavid du Colombier.P2 22193e12c5d1SDavid du Colombierand takes 3 CPU seconds per kilobyte of input file, of which 22203e12c5d1SDavid du Colombierabout a third is spent in the regular expression code. 22213e12c5d1SDavid du ColombierThis translates to about 500 changes per second. 22223e12c5d1SDavid du Colombier.CW Ed 22233e12c5d1SDavid du Colombiertakes 1.5 seconds per kilobyte to make a similar change (ignoring newlines), 22243e12c5d1SDavid du Colombierbut cannot undo it. 22253e12c5d1SDavid du ColombierThe same example in 22263e12c5d1SDavid du Colombier.CW ex ,\u\s-4\&9\s+4\d 22273e12c5d1SDavid du Colombiera variant of 22283e12c5d1SDavid du Colombier.CW ed 22293e12c5d1SDavid du Colombierdone at the University of California at Berkeley, 22303e12c5d1SDavid du Colombierwhich allows one level of undoing, again takes 3 seconds. 22313e12c5d1SDavid du ColombierIn summary, 22323e12c5d1SDavid du Colombier.CW sam 's 2233219b2ee8SDavid du Colombierperformance is comparable to that of other UNIX editors, although it solves 22343e12c5d1SDavid du Colombiera harder problem. 22353e12c5d1SDavid du Colombier.SH 2 22363e12c5d1SDavid du ColombierCommunications 22373e12c5d1SDavid du Colombier.LP 22383e12c5d1SDavid du ColombierThe discussion so far has described the implementation of the host part of 22393e12c5d1SDavid du Colombier.CW sam ; 22403e12c5d1SDavid du Colombierthe next few sections explain how a machine with mouse and bitmap display 22413e12c5d1SDavid du Colombiercan be engaged to improve interaction. 22423e12c5d1SDavid du Colombier.CW Sam 22433e12c5d1SDavid du Colombieris not the first editor to be written as two processes,\u\s-4\&16\s+4\d 22443e12c5d1SDavid du Colombierbut its implementation 22453e12c5d1SDavid du Colombierhas some unusual aspects. 22463e12c5d1SDavid du Colombier.PP 22473e12c5d1SDavid du ColombierThere are several ways 22483e12c5d1SDavid du Colombier.CW sam 's 22493e12c5d1SDavid du Colombierhost and terminal parts may be connected. 22503e12c5d1SDavid du ColombierThe first and simplest is to forgo the terminal part and use the host 22513e12c5d1SDavid du Colombierpart's command language to edit text on an ordinary terminal. 22523e12c5d1SDavid du ColombierThis mode is invoked by starting 22533e12c5d1SDavid du Colombier.CW sam 22543e12c5d1SDavid du Colombierwith the 22553e12c5d1SDavid du Colombier.CW -d 22563e12c5d1SDavid du Colombieroption. 22573e12c5d1SDavid du ColombierWith no options, 22583e12c5d1SDavid du Colombier.CW sam 22593e12c5d1SDavid du Colombierruns separate host and terminal programs, 22603e12c5d1SDavid du Colombiercommunicating with a message protocol over the physical 22613e12c5d1SDavid du Colombierconnection that joins them. 22623e12c5d1SDavid du ColombierTypically, the connection is an RS-232 link between a Blit 22633e12c5d1SDavid du Colombier(the prototypical display for 22643e12c5d1SDavid du Colombier.CW sam ) 22653e12c5d1SDavid du Colombierand a host running 2266219b2ee8SDavid du Colombierthe Ninth Edition of the UNIX operating system.\u\s-4\&8\s+4\d 22673e12c5d1SDavid du Colombier(This is the version of the system used in the Computing Sciences Research 22687dd7cddfSDavid du ColombierCenter at AT&T Bell Laboratories [now Lucent Technologies, Bell Labs], where I work. Its relevant 22693e12c5d1SDavid du Colombieraspects are discussed in the Blit paper.\u\s-4\&1\s+4\d) 22703e12c5d1SDavid du ColombierThe implementation of 22713e12c5d1SDavid du Colombier.CW sam 2272219b2ee8SDavid du Colombierfor the SUN computer runs both processes on the same machine and 22733e12c5d1SDavid du Colombierconnects them by a pipe. 22743e12c5d1SDavid du Colombier.PP 22753e12c5d1SDavid du ColombierThe low bandwidth of an RS-232 link 22763e12c5d1SDavid du Colombiernecessitated the split between 22773e12c5d1SDavid du Colombierthe two programs. 22783e12c5d1SDavid du ColombierThe division is a mixed blessing: 22793e12c5d1SDavid du Colombiera program in two parts is much harder to write and to debug 22803e12c5d1SDavid du Colombierthan a self-contained one, 22813e12c5d1SDavid du Colombierbut the split makes several unusual configurations possible. 22823e12c5d1SDavid du ColombierThe terminal may be physically separated from the host, allowing the conveniences 22833e12c5d1SDavid du Colombierof a mouse and bitmap display to be taken home while leaving the files at work. 22843e12c5d1SDavid du ColombierIt is also possible to run the host part on a remote machine: 22853e12c5d1SDavid du Colombier.P1 22863e12c5d1SDavid du Colombiersam -r host 22873e12c5d1SDavid du Colombier.P2 22883e12c5d1SDavid du Colombierconnects to the terminal in the usual way, and then makes a call 22893e12c5d1SDavid du Colombieracross the network to establish the host part of 22903e12c5d1SDavid du Colombier.CW sam 22913e12c5d1SDavid du Colombieron the named machine. 22923e12c5d1SDavid du ColombierFinally, it cross-connects the I/O to join the two parts. 22933e12c5d1SDavid du ColombierThis allows 22943e12c5d1SDavid du Colombier.CW sam 22953e12c5d1SDavid du Colombierto be run on machines that do not support bitmap displays; 22963e12c5d1SDavid du Colombierfor example, 22973e12c5d1SDavid du Colombier.CW sam 22983e12c5d1SDavid du Colombieris the editor of choice on our Cray X-MP/24. 22993e12c5d1SDavid du Colombier.CW Sam 23003e12c5d1SDavid du Colombier.CW -r 23013e12c5d1SDavid du Colombierinvolves 23023e12c5d1SDavid du Colombier.I three 23033e12c5d1SDavid du Colombiermachines: the remote host, the terminal, and the local host. 23043e12c5d1SDavid du ColombierThe local host's job is simple but vital: it passes the data 23053e12c5d1SDavid du Colombierbetween the remote host and terminal. 23063e12c5d1SDavid du Colombier.PP 23073e12c5d1SDavid du ColombierThe host and terminal exchange messages asynchronously 23083e12c5d1SDavid du Colombier(rather than, say, as remote procedure calls) but there is no 23093e12c5d1SDavid du Colombiererror detection or correction 23103e12c5d1SDavid du Colombierbecause, whatever the configuration, the connection is reliable. 23113e12c5d1SDavid du ColombierBecause the terminal handles mundane interaction tasks such as 23123e12c5d1SDavid du Colombierpopping up menus and interpreting the responses, the messages are about 23133e12c5d1SDavid du Colombierdata, not actions. 23143e12c5d1SDavid du ColombierFor example, the host knows nothing about what is displayed on the screen, 23153e12c5d1SDavid du Colombierand when the user types a character, the message sent to the host says 23163e12c5d1SDavid du Colombier``insert a one-byte string at location 123 in file 7,'' not ``a character 23173e12c5d1SDavid du Colombierwas typed at the current position in the current file.'' 23183e12c5d1SDavid du ColombierIn other words, the messages look very much like the transaction records 23193e12c5d1SDavid du Colombierin the transcripts. 23203e12c5d1SDavid du Colombier.PP 23213e12c5d1SDavid du ColombierEither the host or terminal part of 23223e12c5d1SDavid du Colombier.CW sam 23233e12c5d1SDavid du Colombiermay initiate a change to a file. 23243e12c5d1SDavid du ColombierThe command language operates on the host, while typing and some 23253e12c5d1SDavid du Colombiermouse operations are executed directly in the terminal to optimize response. 23263e12c5d1SDavid du ColombierChanges initiated by the host program must be transmitted to the terminal, 23273e12c5d1SDavid du Colombierand 23283e12c5d1SDavid du Colombiervice versa. 23293e12c5d1SDavid du Colombier(A token is exchanged to determine which end is in control, 23303e12c5d1SDavid du Colombierwhich means that characters typed while a time-consuming command runs 23313e12c5d1SDavid du Colombiermust be buffered and do not appear until the command is complete.) 23323e12c5d1SDavid du ColombierTo maintain consistent information, 23333e12c5d1SDavid du Colombierthe host and terminal track changes through a per-file 23343e12c5d1SDavid du Colombierdata structure that records what portions of the file 23353e12c5d1SDavid du Colombierthe terminal has received. 23363e12c5d1SDavid du ColombierThe data structure, called a 23373e12c5d1SDavid du Colombier.CW Rasp 23383e12c5d1SDavid du Colombier(a weak pun: it's a file with holes) 23393e12c5d1SDavid du Colombieris held and updated by both the host and terminal. 23403e12c5d1SDavid du ColombierA 23413e12c5d1SDavid du Colombier.CW Rasp 23423e12c5d1SDavid du Colombieris a list of 23433e12c5d1SDavid du Colombier.CW Strings 23443e12c5d1SDavid du Colombierholding those parts of the file known to the terminal, 23453e12c5d1SDavid du Colombierseparated by counts of the number of bytes in the interstices. 23463e12c5d1SDavid du ColombierOf course, the host doesn't keep a separate copy of the data (it only needs 23473e12c5d1SDavid du Colombierthe lengths of the various pieces), 23483e12c5d1SDavid du Colombierbut the structure is the same on both ends. 23493e12c5d1SDavid du Colombier.PP 23503e12c5d1SDavid du ColombierThe 23513e12c5d1SDavid du Colombier.CW Rasp 23523e12c5d1SDavid du Colombierin the terminal doubles as a cache. 23533e12c5d1SDavid du ColombierSince the terminal keeps the text for portions of the file it has displayed, 23543e12c5d1SDavid du Colombierit need not request data from the host when revisiting old parts of the file 23553e12c5d1SDavid du Colombieror redrawing obscured windows, which speeds things up considerably 23563e12c5d1SDavid du Colombierover low-speed links. 23573e12c5d1SDavid du Colombier.PP 23583e12c5d1SDavid du ColombierIt's trivial for the terminal to maintain its 23593e12c5d1SDavid du Colombier.CW Rasp , 23603e12c5d1SDavid du Colombierbecause all changes made on the terminal apply to parts of the file 23613e12c5d1SDavid du Colombieralready loaded there. 23623e12c5d1SDavid du ColombierChanges made by the host are compared against the 23633e12c5d1SDavid du Colombier.CW Rasp 23643e12c5d1SDavid du Colombierduring the update sequence after each command. 23653e12c5d1SDavid du ColombierSmall changes to pieces of the file loaded in the terminal 23663e12c5d1SDavid du Colombierare sent in their entirety. 23673e12c5d1SDavid du ColombierLarger changes, and changes that fall entirely in the holes, 23683e12c5d1SDavid du Colombierare transmitted as messages without literal data: 23693e12c5d1SDavid du Colombieronly the lengths of the deleted and inserted strings are transmitted. 23703e12c5d1SDavid du ColombierWhen a command is completed, the terminal examines its visible 23713e12c5d1SDavid du Colombierwindows to see if any holes in their 23723e12c5d1SDavid du Colombier.CW Rasps 23733e12c5d1SDavid du Colombierintersect the visible portion of the file. 23743e12c5d1SDavid du ColombierIt then requests the missing data from the host, 23753e12c5d1SDavid du Colombieralong with up to 512 bytes of surrounding data, to minimize 23763e12c5d1SDavid du Colombierthe number of messages when visiting a new portion of the file. 23773e12c5d1SDavid du ColombierThis technique provides a kind of two-level lazy evaluation for the terminal. 23783e12c5d1SDavid du ColombierThe first level sends a minimum of information about 23793e12c5d1SDavid du Colombierparts of the file not being edited interactively; 23803e12c5d1SDavid du Colombierthe second level waits until a change is displayed before 23813e12c5d1SDavid du Colombiertransmitting the new data. 23823e12c5d1SDavid du ColombierOf course, 23833e12c5d1SDavid du Colombierperformance is also helped by having the terminal respond immediately to typing 23843e12c5d1SDavid du Colombierand simple mouse requests. 23853e12c5d1SDavid du ColombierExcept for small changes to active pieces of the file, which are 23863e12c5d1SDavid du Colombiertransmitted to the terminal without negotiation, 23873e12c5d1SDavid du Colombierthe terminal is wholly responsible for deciding what is displayed; 23883e12c5d1SDavid du Colombierthe host uses the 23893e12c5d1SDavid du Colombier.CW Rasp 23903e12c5d1SDavid du Colombieronly to tell the terminal what might be relevant. 23913e12c5d1SDavid du Colombier.PP 23923e12c5d1SDavid du ColombierWhen a change is initiated by the host, 23933e12c5d1SDavid du Colombierthe messages to the terminal describing the change 23943e12c5d1SDavid du Colombierare generated by the routine that applies the transcript of the changes 23953e12c5d1SDavid du Colombierto the contents of the 23963e12c5d1SDavid du Colombier.CW File . 23973e12c5d1SDavid du ColombierSince changes are undone by the same update routine, 23983e12c5d1SDavid du Colombierundoing requires 23993e12c5d1SDavid du Colombierno extra code in the communications; 24003e12c5d1SDavid du Colombierthe usual messages describing changes to the file are sufficient 24013e12c5d1SDavid du Colombierto back up the screen image. 24023e12c5d1SDavid du Colombier.PP 24033e12c5d1SDavid du ColombierThe 24043e12c5d1SDavid du Colombier.CW Rasp 24053e12c5d1SDavid du Colombieris a particularly good example of the way caches are used in 24063e12c5d1SDavid du Colombier.CW sam . 24073e12c5d1SDavid du ColombierFirst, it facilitates access to the active portion of the text by placing 24083e12c5d1SDavid du Colombierthe busy text in main memory. 24093e12c5d1SDavid du ColombierIn so doing, it provides efficient access 24103e12c5d1SDavid du Colombierto a large data structure that does not fit in memory. 24113e12c5d1SDavid du ColombierSince the form of data is to be imposed by the user, not by the program, 24123e12c5d1SDavid du Colombierand because characters will frequently be scanned sequentially, 24133e12c5d1SDavid du Colombierfiles are stored as flat objects. 24143e12c5d1SDavid du ColombierCaches help keep performance good and linear when working with such 24153e12c5d1SDavid du Colombierdata. 24163e12c5d1SDavid du Colombier.PP 24173e12c5d1SDavid du ColombierSecond, the 24183e12c5d1SDavid du Colombier.CW Rasp 24193e12c5d1SDavid du Colombierand several of the other caches have some 24203e12c5d1SDavid du Colombier.I read-ahead; 24213e12c5d1SDavid du Colombierthat is, the cache is loaded with more information than is needed for 24223e12c5d1SDavid du Colombierthe job immediately at hand. 24233e12c5d1SDavid du ColombierWhen manipulating linear structures, the accesses are usually sequential, 24243e12c5d1SDavid du Colombierand read-ahead can significantly reduce the average time to access the 24253e12c5d1SDavid du Colombiernext element of the object. 24263e12c5d1SDavid du ColombierSequential access is a common mode for people as well as programs; 24273e12c5d1SDavid du Colombierconsider scrolling through a document while looking for something. 24283e12c5d1SDavid du Colombier.PP 24293e12c5d1SDavid du ColombierFinally, like any good data structure, 24303e12c5d1SDavid du Colombierthe cache guides the algorithm, or at least the implementation. 24313e12c5d1SDavid du ColombierThe 24323e12c5d1SDavid du Colombier.CW Rasp 24333e12c5d1SDavid du Colombierwas actually invented to control the communications between the host and 24343e12c5d1SDavid du Colombierterminal parts, but I realized very early that it was also a form of 24353e12c5d1SDavid du Colombiercache. Other caches were more explicitly intended to serve a double 24363e12c5d1SDavid du Colombierpurpose: for example, the caches in 24373e12c5d1SDavid du Colombier.CW Files 24383e12c5d1SDavid du Colombierthat coalesce updates not only reduce traffic to the 24393e12c5d1SDavid du Colombiertranscript and contents 24403e12c5d1SDavid du Colombier.CW Buffers , 24413e12c5d1SDavid du Colombierthey also clump screen updates so that complicated changes to the 24423e12c5d1SDavid du Colombierscreen are achieved in 24433e12c5d1SDavid du Colombierjust a few messages to the terminal. 24443e12c5d1SDavid du ColombierThis saved me considerable work: I did not need to write special 24453e12c5d1SDavid du Colombiercode to optimize the message traffic to the 24463e12c5d1SDavid du Colombierterminal. 24473e12c5d1SDavid du ColombierCaches pay off in surprising ways. 24483e12c5d1SDavid du ColombierAlso, they tend to be independent, so their performance improvements 24493e12c5d1SDavid du Colombierare multiplicative. 24503e12c5d1SDavid du Colombier.SH 2 24513e12c5d1SDavid du ColombierData structures in the terminal 24523e12c5d1SDavid du Colombier.LP 24533e12c5d1SDavid du ColombierThe terminal's job is to display and to maintain a consistent image of 24543e12c5d1SDavid du Colombierpieces of the files being edited. 24553e12c5d1SDavid du ColombierBecause the text is always in memory, the data structures are 24563e12c5d1SDavid du Colombierconsiderably simpler than those in the host part. 24573e12c5d1SDavid du Colombier.PP 24583e12c5d1SDavid du Colombier.CW Sam 24593e12c5d1SDavid du Colombiertypically has far more windows than does 24603e12c5d1SDavid du Colombier.CW mux , 24613e12c5d1SDavid du Colombierthe window system within which its Blit implementation runs. 24623e12c5d1SDavid du Colombier.CW Mux 24633e12c5d1SDavid du Colombierhas a fairly small number of asynchronously updated windows; 24643e12c5d1SDavid du Colombier.CW sam 24653e12c5d1SDavid du Colombierneeds a large number of synchronously updated windows that are 24663e12c5d1SDavid du Colombierusually static and often fully obscured. 24673e12c5d1SDavid du ColombierThe different tradeoffs guided 24683e12c5d1SDavid du Colombier.CW sam 24693e12c5d1SDavid du Colombieraway from the memory-intensive implementation of windows, called 24703e12c5d1SDavid du Colombier.CW Layers ,\u\s-4\&17\s+4\d 24713e12c5d1SDavid du Colombierused in 24723e12c5d1SDavid du Colombier.CW mux. 24733e12c5d1SDavid du ColombierRather than depending on a complete bitmap image of the display for each window, 24743e12c5d1SDavid du Colombier.CW sam 24753e12c5d1SDavid du Colombierregenerates the image from its in-memory text 24763e12c5d1SDavid du Colombier(stored in the 24773e12c5d1SDavid du Colombier.CW Rasp ) 24783e12c5d1SDavid du Colombierwhen necessary, although it will use such an image if it is available. 24793e12c5d1SDavid du ColombierLike 24803e12c5d1SDavid du Colombier.CW Layers , 24813e12c5d1SDavid du Colombierthough, 24823e12c5d1SDavid du Colombier.CW sam 24833e12c5d1SDavid du Colombieruses the screen bitmap as active storage in which to update the image using 24843e12c5d1SDavid du Colombier.CW bitblt .\u\s-4\&18,19\s+4\d 24853e12c5d1SDavid du ColombierThe resulting organization, pictured in Figure 6, 24863e12c5d1SDavid du Colombierhas a global array of windows, called 24873e12c5d1SDavid du Colombier.CW Flayers , 24883e12c5d1SDavid du Colombiereach of which holds an image of a piece of text held in a data structure 24893e12c5d1SDavid du Colombiercalled a 24903e12c5d1SDavid du Colombier.CW Frame , 24913e12c5d1SDavid du Colombierwhich in turn represents 24923e12c5d1SDavid du Colombiera rectangular window full of text displayed in some 24933e12c5d1SDavid du Colombier.CW Bitmap . 24943e12c5d1SDavid du ColombierEach 24953e12c5d1SDavid du Colombier.CW Flayer 24963e12c5d1SDavid du Colombierappears in a global list that orders them all front-to-back 24973e12c5d1SDavid du Colombieron the display, and simultaneously as an element of a per-file array 24983e12c5d1SDavid du Colombierthat holds all the open windows for that file. 24993e12c5d1SDavid du ColombierThe complement in the terminal of the 25003e12c5d1SDavid du Colombier.CW File 25013e12c5d1SDavid du Colombieron the host is called a 25023e12c5d1SDavid du Colombier.CW Text ; 25033e12c5d1SDavid du Colombiereach connects its 25043e12c5d1SDavid du Colombier.CW Flayers 25053e12c5d1SDavid du Colombierto the associated 25063e12c5d1SDavid du Colombier.CW Rasp . 25073e12c5d1SDavid du Colombier.KF 25083e12c5d1SDavid du Colombier.PS 25093e12c5d1SDavid du Colombiercopy "fig6.pic" 25103e12c5d1SDavid du Colombier.PE 25113e12c5d1SDavid du Colombier.Cs 25123e12c5d1SDavid du ColombierFigure 6. Data structures in the terminal. 25133e12c5d1SDavid du Colombier.CW Flayers 25143e12c5d1SDavid du Colombierare also linked together into a front-to-back list. 25153e12c5d1SDavid du Colombier.CW Boxes 25163e12c5d1SDavid du Colombierare discussed in the next section. 25173e12c5d1SDavid du Colombier.Ce 25183e12c5d1SDavid du Colombier.KE 25193e12c5d1SDavid du Colombier.PP 25203e12c5d1SDavid du ColombierThe 25213e12c5d1SDavid du Colombier.CW Bitmap 25223e12c5d1SDavid du Colombierfor a 25233e12c5d1SDavid du Colombier.CW Frame 25243e12c5d1SDavid du Colombiercontains the image of the text. 25253e12c5d1SDavid du ColombierFor a fully visible window, the 25263e12c5d1SDavid du Colombier.CW Bitmap 25273e12c5d1SDavid du Colombierwill be the screen (or at least the 25283e12c5d1SDavid du Colombier.CW Layer 25293e12c5d1SDavid du Colombierin which 25303e12c5d1SDavid du Colombier.CW sam 25313e12c5d1SDavid du Colombieris being run), 25323e12c5d1SDavid du Colombierwhile for partially obscured windows the 25333e12c5d1SDavid du Colombier.CW Bitmap 25343e12c5d1SDavid du Colombierwill be off-screen. 25353e12c5d1SDavid du ColombierIf the window is fully obscured, the 25363e12c5d1SDavid du Colombier.CW Bitmap 25373e12c5d1SDavid du Colombierwill be null. 25383e12c5d1SDavid du Colombier.PP 25393e12c5d1SDavid du ColombierThe 25403e12c5d1SDavid du Colombier.CW Bitmap 25413e12c5d1SDavid du Colombieris a kind of cache. 25423e12c5d1SDavid du ColombierWhen making changes to the display, most of the original image will 25433e12c5d1SDavid du Colombierlook the same in the final image, and the update algorithms exploit this. 25443e12c5d1SDavid du ColombierThe 25453e12c5d1SDavid du Colombier.CW Frame 25463e12c5d1SDavid du Colombiersoftware updates the image in the 25473e12c5d1SDavid du Colombier.CW Bitmap 25483e12c5d1SDavid du Colombierincrementally; the 25493e12c5d1SDavid du Colombier.CW Bitmap 25503e12c5d1SDavid du Colombieris not just an image, it is a data structure.\u\s-4\&18,19\s+4\d 25513e12c5d1SDavid du ColombierThe job of the software that updates the display is therefore 25523e12c5d1SDavid du Colombierto use as much as possible of the existing image (converting the 25533e12c5d1SDavid du Colombiertext from ASCII characters to pixels is expensive) in a sort of two-dimensional 25543e12c5d1SDavid du Colombierstring insertion algorithm. 25553e12c5d1SDavid du ColombierThe details of this process are described in the next section. 25563e12c5d1SDavid du Colombier.PP 25573e12c5d1SDavid du ColombierThe 25583e12c5d1SDavid du Colombier.CW Frame 25593e12c5d1SDavid du Colombiersoftware has no code to support overlapping windows; 25603e12c5d1SDavid du Colombierits job is to keep a single 25613e12c5d1SDavid du Colombier.CW Bitmap 25623e12c5d1SDavid du Colombierup to date. 25633e12c5d1SDavid du ColombierIt falls to the 25643e12c5d1SDavid du Colombier.CW Flayer 25653e12c5d1SDavid du Colombiersoftware to multiplex the various 25663e12c5d1SDavid du Colombier.CW Bitmaps 25673e12c5d1SDavid du Colombieronto the screen. 25683e12c5d1SDavid du ColombierThe problem of maintaining overlapping 25693e12c5d1SDavid du Colombier.CW Flayers 25703e12c5d1SDavid du Colombieris easier than for 25713e12c5d1SDavid du Colombier.CW Layers \u\s-4\&17\s+4\d 25723e12c5d1SDavid du Colombierbecause changes are made synchronously and because the contents of the window 25733e12c5d1SDavid du Colombiercan be reconstructed from the data stored in the 25743e12c5d1SDavid du Colombier.CW Frame ; 25753e12c5d1SDavid du Colombierthe 25763e12c5d1SDavid du Colombier.CW Layers 25773e12c5d1SDavid du Colombiersoftware 25783e12c5d1SDavid du Colombiermakes no such assumptions. 25793e12c5d1SDavid du ColombierIn 25803e12c5d1SDavid du Colombier.CW sam , 25813e12c5d1SDavid du Colombierthe window being changed is almost always fully visible, because the current 25823e12c5d1SDavid du Colombierwindow is always fully visible, by construction. 25833e12c5d1SDavid du ColombierHowever, when multi-file changes are being made, or when 25843e12c5d1SDavid du Colombiermore than one window is open on a file, 25853e12c5d1SDavid du Colombierit may be necessary to update partially obscured windows. 25863e12c5d1SDavid du Colombier.PP 25873e12c5d1SDavid du ColombierThere are three cases: the window is 25883e12c5d1SDavid du Colombierfully visible, invisible (fully obscured), or partially visible. 25893e12c5d1SDavid du ColombierIf fully visible, the 25903e12c5d1SDavid du Colombier.CW Bitmap 25913e12c5d1SDavid du Colombieris part of the screen, so when the 25923e12c5d1SDavid du Colombier.CW Flayer 25933e12c5d1SDavid du Colombierupdate routine calls the 25943e12c5d1SDavid du Colombier.CW Frame 25953e12c5d1SDavid du Colombierupdate routine, the screen will be updated directly. 25963e12c5d1SDavid du ColombierIf the window is invisible, 25973e12c5d1SDavid du Colombierthere is no associated 25983e12c5d1SDavid du Colombier.CW Bitmap , 25993e12c5d1SDavid du Colombierand all that is necessary is to update the 26003e12c5d1SDavid du Colombier.CW Frame 26013e12c5d1SDavid du Colombierdata structure, not the image. 26023e12c5d1SDavid du ColombierIf the window is partially visible, the 26033e12c5d1SDavid du Colombier.CW Frame 26043e12c5d1SDavid du Colombierroutine is called to update the image in the off-screen 26053e12c5d1SDavid du Colombier.CW Bitmap , 26063e12c5d1SDavid du Colombierwhich may require regenerating it from the text of the window. 26073e12c5d1SDavid du ColombierThe 26083e12c5d1SDavid du Colombier.CW Flayer 26093e12c5d1SDavid du Colombiercode then clips this 26103e12c5d1SDavid du Colombier.CW Bitmap 26113e12c5d1SDavid du Colombieragainst the 26123e12c5d1SDavid du Colombier.CW Bitmaps 26133e12c5d1SDavid du Colombierof all 26143e12c5d1SDavid du Colombier.CW Frames 26153e12c5d1SDavid du Colombierin front of the 26163e12c5d1SDavid du Colombier.CW Frame 26173e12c5d1SDavid du Colombierbeing modified, and the remainder is copied to the display. 26183e12c5d1SDavid du Colombier.PP 26193e12c5d1SDavid du ColombierThis is much faster than recreating the image off-screen 26203e12c5d1SDavid du Colombierfor every change, or clipping all the changes made to the image 26213e12c5d1SDavid du Colombierduring its update. 26223e12c5d1SDavid du ColombierUnfortunately, these caches can also consume prohibitive amounts of 26233e12c5d1SDavid du Colombiermemory, so they are freed fairly liberally \(em after every change to the 26243e12c5d1SDavid du Colombierfront-to-back order of the 26253e12c5d1SDavid du Colombier.CW Flayers . 26263e12c5d1SDavid du ColombierThe result is that 26273e12c5d1SDavid du Colombierthe off-screen 26283e12c5d1SDavid du Colombier.CW Bitmaps 26293e12c5d1SDavid du Colombierexist only while multi-window changes are occurring, 26303e12c5d1SDavid du Colombierwhich is the only time the performance improvement they provide is needed. 26313e12c5d1SDavid du ColombierAlso, the user interface causes fully-obscured windows to be the 26323e12c5d1SDavid du Colombiereasiest to make \(em 26333e12c5d1SDavid du Colombiercreating a canonically sized and placed window requires only a button click 26343e12c5d1SDavid du Colombier\(em which reduces the need for caching still further. 26353e12c5d1SDavid du Colombier.PP 26363e12c5d1SDavid du Colombier.SH 2 26373e12c5d1SDavid du ColombierScreen update 26383e12c5d1SDavid du Colombier.LP 26393e12c5d1SDavid du ColombierOnly two low-level primitives are needed for incremental update: 26403e12c5d1SDavid du Colombier.CW bitblt , 26413e12c5d1SDavid du Colombierwhich copies rectangles of pixels, and 26423e12c5d1SDavid du Colombier.CW string 26433e12c5d1SDavid du Colombier(which in turn calls 26443e12c5d1SDavid du Colombier.CW bitblt ), 26453e12c5d1SDavid du Colombierwhich draws a null-terminated character string in a 26463e12c5d1SDavid du Colombier.CW Bitmap . 26473e12c5d1SDavid du ColombierA 26483e12c5d1SDavid du Colombier.CW Frame 26493e12c5d1SDavid du Colombiercontains a list of 26503e12c5d1SDavid du Colombier.CW Boxes , 26513e12c5d1SDavid du Colombiereach of which defines a horizontal strip of text in the window 26523e12c5d1SDavid du Colombier(see Figure 7). 26533e12c5d1SDavid du ColombierA 26543e12c5d1SDavid du Colombier.CW Box 26553e12c5d1SDavid du Colombierhas a character string 26563e12c5d1SDavid du Colombier.CW str , 26573e12c5d1SDavid du Colombierand a 26583e12c5d1SDavid du Colombier.CW Rectangle 26593e12c5d1SDavid du Colombier.CW rect 26603e12c5d1SDavid du Colombierthat defines the location of the strip in the window. 26613e12c5d1SDavid du Colombier(The text in 26623e12c5d1SDavid du Colombier.CW str 26633e12c5d1SDavid du Colombieris stored in the 26643e12c5d1SDavid du Colombier.CW Box 26653e12c5d1SDavid du Colombierseparately from the 26663e12c5d1SDavid du Colombier.CW Rasp 26673e12c5d1SDavid du Colombierassociated with the window's file, so 26683e12c5d1SDavid du Colombier.CW Boxes 26693e12c5d1SDavid du Colombierare self-contained.) 26703e12c5d1SDavid du ColombierThe invariant is that 26713e12c5d1SDavid du Colombierthe image of the 26723e12c5d1SDavid du Colombier.CW Box 26733e12c5d1SDavid du Colombiercan be reproduced by calling 26743e12c5d1SDavid du Colombier.CW string 26753e12c5d1SDavid du Colombierwith argument 26763e12c5d1SDavid du Colombier.CW str 26773e12c5d1SDavid du Colombierto draw the string in 26783e12c5d1SDavid du Colombier.CW rect , 26793e12c5d1SDavid du Colombierand the resulting picture fits perfectly within 26803e12c5d1SDavid du Colombier.CW rect . 26813e12c5d1SDavid du ColombierIn other words, the 26823e12c5d1SDavid du Colombier.CW Boxes 26833e12c5d1SDavid du Colombierdefine the tiling of the window. 26843e12c5d1SDavid du ColombierThe tiling may be complicated by long lines of text, which 26853e12c5d1SDavid du Colombierare folded onto the next line. 26863e12c5d1SDavid du ColombierSome editors use horizontal scrolling to avoid this complication, 26873e12c5d1SDavid du Colombierbut to be comfortable this technique requires that lines not be 26883e12c5d1SDavid du Colombier.I too 26893e12c5d1SDavid du Colombierlong; 26903e12c5d1SDavid du Colombier.CW sam 26913e12c5d1SDavid du Colombierhas no such restriction. 2692219b2ee8SDavid du ColombierAlso, and perhaps more importantly, UNIX programs and terminals traditionally fold 26933e12c5d1SDavid du Colombierlong lines to make their contents fully visible. 26943e12c5d1SDavid du Colombier.PP 26953e12c5d1SDavid du ColombierTwo special kinds of 26963e12c5d1SDavid du Colombier.CW Boxes 26973e12c5d1SDavid du Colombiercontain a single 26983e12c5d1SDavid du Colombiercharacter: either a newline or a tab. 26993e12c5d1SDavid du ColombierNewlines and tabs are white space. 27003e12c5d1SDavid du ColombierA newline 27013e12c5d1SDavid du Colombier.CW Box 27023e12c5d1SDavid du Colombieralways extends to the right edge of the window, 27033e12c5d1SDavid du Colombierforcing the following 27043e12c5d1SDavid du Colombier.CW Box 27053e12c5d1SDavid du Colombierto the next line. 27063e12c5d1SDavid du ColombierThe width of a tab depends on where it is located: 27073e12c5d1SDavid du Colombierit forces the next 27083e12c5d1SDavid du Colombier.CW Box 27093e12c5d1SDavid du Colombierto begin at a tab location. 27103e12c5d1SDavid du ColombierTabs also 27113e12c5d1SDavid du Colombierhave a minimum width equivalent to a blank (blanks are 27123e12c5d1SDavid du Colombierdrawn by 27133e12c5d1SDavid du Colombier.CW string 27143e12c5d1SDavid du Colombierand are not treated specially); newlines have a minimum width of zero. 27153e12c5d1SDavid du Colombier.KF 27163e12c5d1SDavid du Colombier.PS 27173e12c5d1SDavid du Colombiercopy "fig7.pic" 27183e12c5d1SDavid du Colombier.PE 27193e12c5d1SDavid du Colombier.sp .5 27203e12c5d1SDavid du Colombier.Cs 27213e12c5d1SDavid du ColombierFigure 7. A line of text showing its 27223e12c5d1SDavid du Colombier.CW Boxes . 27233e12c5d1SDavid du ColombierThe first two blank 27243e12c5d1SDavid du Colombier.CW Boxes 27253e12c5d1SDavid du Colombiercontain tabs; the last contains a newline. 27263e12c5d1SDavid du ColombierSpaces are handled as ordinary characters. 27273e12c5d1SDavid du Colombier.Ce 27283e12c5d1SDavid du Colombier.KE 27293e12c5d1SDavid du Colombier.PP 27303e12c5d1SDavid du ColombierThe update algorithms always use the 27313e12c5d1SDavid du Colombier.CW Bitmap 27323e12c5d1SDavid du Colombierimage of the text (either the display or cache 27333e12c5d1SDavid du Colombier.CW Bitmap ); 27343e12c5d1SDavid du Colombierthey never examine the characters within a 27353e12c5d1SDavid du Colombier.CW Box 27363e12c5d1SDavid du Colombierexcept when the 27373e12c5d1SDavid du Colombier.CW Box 27383e12c5d1SDavid du Colombierneeds to be split in two. 27393e12c5d1SDavid du ColombierBefore a change, the window consists of a tiling of 27403e12c5d1SDavid du Colombier.CW Boxes ; 27413e12c5d1SDavid du Colombierafter the change the window is tiled differently. 27423e12c5d1SDavid du ColombierThe update algorithms rearrange the tiles in place, without 27433e12c5d1SDavid du Colombierbackup storage. 27443e12c5d1SDavid du ColombierThe algorithms are not strictly optimal \(em for example, they can 27453e12c5d1SDavid du Colombierclear a pixel that is later going to be written upon \(em 27463e12c5d1SDavid du Colombierbut they never move a tile that doesn't need to be moved, 27473e12c5d1SDavid du Colombierand they move each tile at most once. 27483e12c5d1SDavid du Colombier.CW Frinsert 27493e12c5d1SDavid du Colombieron a Blit can absorb over a thousand characters a second if the strings 27503e12c5d1SDavid du Colombierbeing inserted are a few tens of characters long. 27513e12c5d1SDavid du Colombier.PP 27523e12c5d1SDavid du ColombierConsider 27533e12c5d1SDavid du Colombier.CW frdelete . 27543e12c5d1SDavid du ColombierIts job is to delete a substring from a 27553e12c5d1SDavid du Colombier.CW Frame 27563e12c5d1SDavid du Colombierand restore the image of the 27573e12c5d1SDavid du Colombier.CW Frame . 27583e12c5d1SDavid du ColombierThe image of a substring has a peculiar shape (see Figure 2) comprising 27593e12c5d1SDavid du Colombierpossibly a partial line, 27603e12c5d1SDavid du Colombierzero or more full lines, 27613e12c5d1SDavid du Colombierand possibly a final partial line. 27623e12c5d1SDavid du ColombierFor reference, call this the 27633e12c5d1SDavid du Colombier.I 27643e12c5d1SDavid du ColombierZ-shape. 27653e12c5d1SDavid du Colombier.R 27663e12c5d1SDavid du Colombier.CW Frdelete 27673e12c5d1SDavid du Colombierbegins by splitting, if necessary, the 27683e12c5d1SDavid du Colombier.CW Boxes 27693e12c5d1SDavid du Colombiercontaining the ends of 27703e12c5d1SDavid du Colombierthe substring so the substring begins and ends on 27713e12c5d1SDavid du Colombier.CW Box 27723e12c5d1SDavid du Colombierboundaries. 27733e12c5d1SDavid du ColombierBecause the substring is being deleted, its image is not needed, 27743e12c5d1SDavid du Colombierso the Z-shape is then cleared. 27753e12c5d1SDavid du ColombierThen, tiles (that is, the images of 27763e12c5d1SDavid du Colombier.CW Boxes ) 27773e12c5d1SDavid du Colombierare copied, using 27783e12c5d1SDavid du Colombier.CW bitblt , 27793e12c5d1SDavid du Colombierfrom immediately after the Z-shape to 27803e12c5d1SDavid du Colombierthe beginning of the Z-shape, 27813e12c5d1SDavid du Colombierresulting in a new Z-shape. 27823e12c5d1SDavid du Colombier.CW Boxes "" ( 27833e12c5d1SDavid du Colombierwhose contents would span two lines in the new position must first be split.) 27843e12c5d1SDavid du Colombier.PP 27853e12c5d1SDavid du ColombierCopying the remainder of the 27863e12c5d1SDavid du Colombier.CW Frame 27873e12c5d1SDavid du Colombiertile by tile 27883e12c5d1SDavid du Colombierthis way will clearly accomplish the deletion but eventually, 27893e12c5d1SDavid du Colombiertypically when the copying algorithm encounters a tab or newline, 27903e12c5d1SDavid du Colombierthe old and new 27913e12c5d1SDavid du Colombier.CW x 27923e12c5d1SDavid du Colombiercoordinates of the tile 27933e12c5d1SDavid du Colombierto be copied are the same. 27943e12c5d1SDavid du ColombierThis correspondence implies 27953e12c5d1SDavid du Colombierthat the Z-shape has its beginning and ending edges aligned 27963e12c5d1SDavid du Colombiervertically, and a sequence of at most two 27973e12c5d1SDavid du Colombier.CW bitblts 27983e12c5d1SDavid du Colombiercan be used to copy the remaining tiles. 27993e12c5d1SDavid du ColombierThe last step is to clear out the resulting empty space at the bottom 28003e12c5d1SDavid du Colombierof the window; 28013e12c5d1SDavid du Colombierthe number of lines to be cleared is the number of complete lines in the 28023e12c5d1SDavid du ColombierZ-shape closed by the final 28033e12c5d1SDavid du Colombier.CW bitblts. 28043e12c5d1SDavid du ColombierThe final step is to merge horizontally adjacent 28053e12c5d1SDavid du Colombier.CW Boxes 28063e12c5d1SDavid du Colombierof plain text. 28073e12c5d1SDavid du ColombierThe complete source to 28083e12c5d1SDavid du Colombier.CW frdelete 28093e12c5d1SDavid du Colombieris less than 100 lines of C. 28103e12c5d1SDavid du Colombier.PP 28113e12c5d1SDavid du Colombier.CW frinsert 28123e12c5d1SDavid du Colombieris more complicated because it must do four passes: 28133e12c5d1SDavid du Colombierone to construct the 28143e12c5d1SDavid du Colombier.CW Box 28153e12c5d1SDavid du Colombierlist for the inserted string, 28163e12c5d1SDavid du Colombierone to reconnoitre, 28173e12c5d1SDavid du Colombierone to copy (in opposite order to 28183e12c5d1SDavid du Colombier.CW frdelete ) 28193e12c5d1SDavid du Colombierthe 28203e12c5d1SDavid du Colombier.CW Boxes 28213e12c5d1SDavid du Colombierto make the hole for the new text, 28223e12c5d1SDavid du Colombierand finally one to copy the new text into place. 28233e12c5d1SDavid du ColombierOverall, though, 28243e12c5d1SDavid du Colombier.CW frinsert 28253e12c5d1SDavid du Colombierhas a similar flavor to 28263e12c5d1SDavid du Colombier.CW frdelete , 28273e12c5d1SDavid du Colombierand needn't be described further. 28283e12c5d1SDavid du Colombier.CW Frinsert 28293e12c5d1SDavid du Colombierand its subsidiary routines comprise 211 lines of C. 28303e12c5d1SDavid du Colombier.PP 28313e12c5d1SDavid du ColombierThe terminal source code is 3024 lines of C, 28323e12c5d1SDavid du Colombierand the host source is 5797 lines. 28333e12c5d1SDavid du Colombier.SH 28343e12c5d1SDavid du ColombierDiscussion 28353e12c5d1SDavid du Colombier.SH 2 28363e12c5d1SDavid du ColombierHistory 28373e12c5d1SDavid du Colombier.LP 28383e12c5d1SDavid du ColombierThe immediate ancestor of 28393e12c5d1SDavid du Colombier.CW sam 28403e12c5d1SDavid du Colombierwas the original text editor for the Blit, called 28413e12c5d1SDavid du Colombier.CW jim . 28423e12c5d1SDavid du Colombier.CW Sam 28433e12c5d1SDavid du Colombierinherited 28443e12c5d1SDavid du Colombier.CW jim 's 28453e12c5d1SDavid du Colombiertwo-process structure and mouse language almost unchanged, but 28463e12c5d1SDavid du Colombier.CW jim 28473e12c5d1SDavid du Colombiersuffered from several drawbacks that were addressed in the design of 28483e12c5d1SDavid du Colombier.CW sam . 28493e12c5d1SDavid du ColombierThe most important of these was the lack of a command language. 28503e12c5d1SDavid du ColombierAlthough 28513e12c5d1SDavid du Colombier.CW jim 28523e12c5d1SDavid du Colombierwas easy to use for simple editing, it provided no direct help with 28533e12c5d1SDavid du Colombierlarge or repetitive editing tasks. Instead, it provided a command to pass 28543e12c5d1SDavid du Colombierselected text through a shell pipeline, 28553e12c5d1SDavid du Colombierbut this was no more satisfactory than could be expected of a stopgap measure. 28563e12c5d1SDavid du Colombier.PP 28573e12c5d1SDavid du Colombier.CW Jim 28583e12c5d1SDavid du Colombierwas written primarily as a vehicle for experimenting with a mouse-based 28593e12c5d1SDavid du Colombierinterface to text, and the experiment was successful. 28603e12c5d1SDavid du Colombier.CW Jim 28613e12c5d1SDavid du Colombierhad some spin-offs: 28623e12c5d1SDavid du Colombier.CW mux , 28633e12c5d1SDavid du Colombierthe second window system for the Blit, is essentially a multiplexed 28643e12c5d1SDavid du Colombierversion of the terminal part of 28653e12c5d1SDavid du Colombier.CW jim ; 28663e12c5d1SDavid du Colombierand the debugger 28673e12c5d1SDavid du Colombier.CW pi 's 28683e12c5d1SDavid du Colombieruser interface\u\s-4\&20\s+4\d was closely modeled on 28693e12c5d1SDavid du Colombier.CW jim 's. 28703e12c5d1SDavid du ColombierBut after a couple of years, 28713e12c5d1SDavid du Colombier.CW jim 28723e12c5d1SDavid du Colombierhad become difficult to maintain and limiting to use, 28733e12c5d1SDavid du Colombierand its replacement was overdue. 28743e12c5d1SDavid du Colombier.PP 28753e12c5d1SDavid du ColombierI began the design of 28763e12c5d1SDavid du Colombier.CW sam 28773e12c5d1SDavid du Colombierby asking 28783e12c5d1SDavid du Colombier.CW jim 28793e12c5d1SDavid du Colombiercustomers what they wanted. 28803e12c5d1SDavid du ColombierThis was probably a mistake; the answers were essentially a list of features 28813e12c5d1SDavid du Colombierto be found in other editors, which did not provide any of the 28823e12c5d1SDavid du Colombierguiding principles I was seeking. 28833e12c5d1SDavid du ColombierFor instance, one common request was for a ``global substitute,'' 28843e12c5d1SDavid du Colombierbut no one suggested how to provide it within a cut-and-paste editor. 28853e12c5d1SDavid du ColombierI was looking for a scheme that would 28863e12c5d1SDavid du Colombiersupport such specialized features comfortably in the context of some 28873e12c5d1SDavid du Colombiergeneral command language. 28883e12c5d1SDavid du ColombierIdeas were not forthcoming, though, particularly given my insistence 28893e12c5d1SDavid du Colombieron removing all limits on file sizes, line lengths and so on. 28903e12c5d1SDavid du ColombierEven worse, I recognized that, since the mouse could easily 28913e12c5d1SDavid du Colombierindicate a region of the screen that was not an integral number of lines, 28923e12c5d1SDavid du Colombierthe command language would best forget about newlines altogether, 28933e12c5d1SDavid du Colombierand that meant the command language had to treat the file as a single 28943e12c5d1SDavid du Colombierstring, not an array of lines. 28953e12c5d1SDavid du Colombier.PP 28963e12c5d1SDavid du ColombierEventually, I decided that thinking was not getting me very far and it was 28973e12c5d1SDavid du Colombiertime to try building. 28983e12c5d1SDavid du ColombierI knew that the terminal part could be built easily \(em 28993e12c5d1SDavid du Colombierthat part of 29003e12c5d1SDavid du Colombier.CW jim 29013e12c5d1SDavid du Colombierbehaved acceptably well \(em and that most of the hard work was going 29023e12c5d1SDavid du Colombierto be in the host part: the file interface, command interpreter and so on. 29033e12c5d1SDavid du ColombierMoreover, I had some ideas about how the architecture of 29043e12c5d1SDavid du Colombier.CW jim 29053e12c5d1SDavid du Colombiercould be improved without destroying its basic structure, which I liked 29063e12c5d1SDavid du Colombierin principle but which hadn't worked out as well as I had hoped. 29073e12c5d1SDavid du ColombierSo I began by designing the file data structure, 29083e12c5d1SDavid du Colombierstarting with the way 29093e12c5d1SDavid du Colombier.CW jim 29103e12c5d1SDavid du Colombierworked \(em comparable to a single structure merging 29113e12c5d1SDavid du Colombier.CW Disc 29123e12c5d1SDavid du Colombierand 29133e12c5d1SDavid du Colombier.CW Buffer , 29143e12c5d1SDavid du Colombierwhich I split to make the cache more general 29153e12c5d1SDavid du Colombier\(em and thinking about how global substitute could be implemented. 29163e12c5d1SDavid du ColombierThe answer was clearly that it had to be done in two passes, 29173e12c5d1SDavid du Colombierand the transcript-oriented implementation fell out naturally. 29183e12c5d1SDavid du Colombier.PP 29193e12c5d1SDavid du Colombier.CW Sam 29203e12c5d1SDavid du Colombierwas written bottom-up, 29213e12c5d1SDavid du Colombierstarting from the data structures and algorithms for manipulating text, 29223e12c5d1SDavid du Colombierthrough the command language and up to the code for maintaining 29233e12c5d1SDavid du Colombierthe display. 29243e12c5d1SDavid du ColombierIn retrospect, it turned out well, but this implementation method is 29253e12c5d1SDavid du Colombiernot recommended in general. 29263e12c5d1SDavid du ColombierThere were several times when I had a large body of interesting code 29273e12c5d1SDavid du Colombierassembled and no clue how to proceed with it. 29283e12c5d1SDavid du ColombierThe command language, in particular, took almost a year to figure out, 29293e12c5d1SDavid du Colombierbut can be implemented (given what was there at the beginning of that year) 29303e12c5d1SDavid du Colombierin a day or two. Similarly, inventing the 29313e12c5d1SDavid du Colombier.CW Rasp 29323e12c5d1SDavid du Colombierdata structure delayed the 29333e12c5d1SDavid du Colombierconnection of the host and terminal pieces by another few months. 29343e12c5d1SDavid du Colombier.CW Sam 29353e12c5d1SDavid du Colombiertook about two years to write, although only about four months were 29363e12c5d1SDavid du Colombierspent actually working on it. 29373e12c5d1SDavid du Colombier.PP 29383e12c5d1SDavid du ColombierPart of the design process was unusual: 29393e12c5d1SDavid du Colombierthe subset of the protocol that maintains the 29403e12c5d1SDavid du Colombier.CW Rasp 29413e12c5d1SDavid du Colombierwas simulated, debugged 29423e12c5d1SDavid du Colombierand verified by an automatic protocol analyzer,\u\s-4\&21\s+4\d and was bug-free 29433e12c5d1SDavid du Colombierfrom the start. 29443e12c5d1SDavid du ColombierThe rest of the protocol, concerned mostly 29453e12c5d1SDavid du Colombierwith keeping menus up to date, 29463e12c5d1SDavid du Colombierwas unfortunately too unwieldy for such analysis, 29473e12c5d1SDavid du Colombierand was debugged by more traditional methods, primarily 29483e12c5d1SDavid du Colombierby logging in a file all messages in and out of the host. 29493e12c5d1SDavid du Colombier.SH 2 29503e12c5d1SDavid du ColombierReflections 29513e12c5d1SDavid du Colombier.LP 29523e12c5d1SDavid du Colombier.CW Sam 29533e12c5d1SDavid du Colombieris essentially the only interactive editor used by the sixty or so members of 29543e12c5d1SDavid du Colombierthe computing science research center in which I work. 29553e12c5d1SDavid du ColombierThe same could not be said of 29563e12c5d1SDavid du Colombier.CW jim ; 29573e12c5d1SDavid du Colombierthe lack of a command language kept some people from adopting it. 29583e12c5d1SDavid du ColombierThe union of a user interface as comfortable as 29593e12c5d1SDavid du Colombier.CW jim 's 29603e12c5d1SDavid du Colombierwith a command language as powerful as 2961219b2ee8SDavid du Colombier.CW ed 's† 29623e12c5d1SDavid du Colombier.FS 29633e12c5d1SDavid du Colombier.vs 9 2964219b2ee8SDavid du Colombier†The people who criticize 29653e12c5d1SDavid du Colombier.CW ed 29663e12c5d1SDavid du Colombieras an interactive program often forget that it and its close relative 29673e12c5d1SDavid du Colombier.CW sed \u\s-4\&7\s+4\d 29683e12c5d1SDavid du Colombierstill thrive as programmable editors. The strength of these programs is 29693e12c5d1SDavid du Colombierindependent of their convenience for interactive editing. 29703e12c5d1SDavid du Colombier.br 29713e12c5d1SDavid du Colombier.vs 29723e12c5d1SDavid du Colombier.FE 29733e12c5d1SDavid du Colombieris essential to 29743e12c5d1SDavid du Colombier.CW sam 's 29753e12c5d1SDavid du Colombiersuccess. 29763e12c5d1SDavid du ColombierWhen 29773e12c5d1SDavid du Colombier.CW sam 29783e12c5d1SDavid du Colombierwas first made available to the 29793e12c5d1SDavid du Colombier.CW jim 29803e12c5d1SDavid du Colombiercommunity, 29813e12c5d1SDavid du Colombieralmost everyone switched to it within two or three days. 29823e12c5d1SDavid du ColombierIn the months that followed, even people who had never adopted 29833e12c5d1SDavid du Colombier.CW jim 29843e12c5d1SDavid du Colombierstarted using 29853e12c5d1SDavid du Colombier.CW sam 29863e12c5d1SDavid du Colombierexclusively. 29873e12c5d1SDavid du Colombier.PP 29883e12c5d1SDavid du ColombierTo be honest, 29893e12c5d1SDavid du Colombier.CW ed 29903e12c5d1SDavid du Colombierstill gets occasional use, but usually when 29913e12c5d1SDavid du Colombiersomething quick needs to be done and the overhead of 29923e12c5d1SDavid du Colombierdownloading the terminal part of 29933e12c5d1SDavid du Colombier.CW sam 29943e12c5d1SDavid du Colombierisn't worth the trouble. 29953e12c5d1SDavid du ColombierAlso, as a `line' editor, 29963e12c5d1SDavid du Colombier.CW sam 29973e12c5d1SDavid du Colombier.CW -d 29983e12c5d1SDavid du Colombieris a bit odd; 29993e12c5d1SDavid du Colombierwhen using a good old ASCII terminal, it's comforting to have 30003e12c5d1SDavid du Colombiera true line editor. 30013e12c5d1SDavid du ColombierBut it is fair to say that 30023e12c5d1SDavid du Colombier.CW sam 's 30033e12c5d1SDavid du Colombiercommand language has displaced 30043e12c5d1SDavid du Colombier.CW ed 's 30053e12c5d1SDavid du Colombierfor most of the complicated editing that has kept line editors 30063e12c5d1SDavid du Colombier(that is, command-driven editors) with us. 30073e12c5d1SDavid du Colombier.PP 30083e12c5d1SDavid du Colombier.CW Sam 's 30093e12c5d1SDavid du Colombiercommand language is even fancier than 30103e12c5d1SDavid du Colombier.CW ed 's, 30113e12c5d1SDavid du Colombierand most 30123e12c5d1SDavid du Colombier.CW sam 30133e12c5d1SDavid du Colombiercustomers don't come near to using all its capabilities. 30143e12c5d1SDavid du ColombierDoes it need to be so sophisticated? 30153e12c5d1SDavid du ColombierI think the answer is yes, for two reasons. 30163e12c5d1SDavid du Colombier.PP 30173e12c5d1SDavid du ColombierFirst, the 30183e12c5d1SDavid du Colombier.I model 30193e12c5d1SDavid du Colombierfor 30203e12c5d1SDavid du Colombier.CW sam 's 30213e12c5d1SDavid du Colombiercommand language is really relatively simple, and certainly simpler than that of 30223e12c5d1SDavid du Colombier.CW ed . 30233e12c5d1SDavid du ColombierFor instance, there is only one kind of textual loop in 30243e12c5d1SDavid du Colombier.CW sam 30253e12c5d1SDavid du Colombier\(em the 30263e12c5d1SDavid du Colombier.CW x 30273e12c5d1SDavid du Colombiercommand \(em 30283e12c5d1SDavid du Colombierwhile 30293e12c5d1SDavid du Colombier.CW ed 30303e12c5d1SDavid du Colombierhas three (the 30313e12c5d1SDavid du Colombier.CW g 30323e12c5d1SDavid du Colombiercommand, the global flag on substitutions, and the implicit loop over 30333e12c5d1SDavid du Colombierlines in multi-line substitutions). 30343e12c5d1SDavid du ColombierAlso, 30353e12c5d1SDavid du Colombier.CW ed 's 30363e12c5d1SDavid du Colombiersubstitute command is necessary to make changes within lines, but in 30373e12c5d1SDavid du Colombier.CW sam 30383e12c5d1SDavid du Colombierthe 30393e12c5d1SDavid du Colombier.CW s 30403e12c5d1SDavid du Colombiercommand is more of a familiar convenience than a necessity; 30413e12c5d1SDavid du Colombier.CW c 30423e12c5d1SDavid du Colombierand 30433e12c5d1SDavid du Colombier.CW t 30443e12c5d1SDavid du Colombiercan do all the work. 30453e12c5d1SDavid du Colombier.PP 30463e12c5d1SDavid du ColombierSecond, 30473e12c5d1SDavid du Colombiergiven a community that expects an editor to be about as powerful as 30483e12c5d1SDavid du Colombier.CW ed , 30493e12c5d1SDavid du Colombierit's hard to see how 30503e12c5d1SDavid du Colombier.CW sam 30513e12c5d1SDavid du Colombiercould really be much simpler and still satisfy that expectation. 30523e12c5d1SDavid du ColombierPeople want to do ``global substitutes,'' and most are content 30533e12c5d1SDavid du Colombierto have the recipe for that and a few other fancy changes. 30543e12c5d1SDavid du ColombierThe sophistication of the command language is really just a veneer 30553e12c5d1SDavid du Colombierover a design that makes it possible to do global substitutes 30563e12c5d1SDavid du Colombierin a screen editor. 30573e12c5d1SDavid du ColombierSome people will always want something more, however, and it's gratifying to 30583e12c5d1SDavid du Colombierbe able to provide it. 30593e12c5d1SDavid du ColombierThe real power of 30603e12c5d1SDavid du Colombier.CW sam 's 30613e12c5d1SDavid du Colombiercommand language comes from composability of the operators, which is by 30623e12c5d1SDavid du Colombiernature orthogonal to the underlying model. 30633e12c5d1SDavid du ColombierIn other words, 30643e12c5d1SDavid du Colombier.CW sam 30653e12c5d1SDavid du Colombieris not itself complex, but it makes complex things possible. 30663e12c5d1SDavid du ColombierIf you don't want to do anything complex, you can ignore the 30673e12c5d1SDavid du Colombiercomplexity altogether, and many people do so. 30683e12c5d1SDavid du Colombier.PP 30693e12c5d1SDavid du ColombierSometimes I am asked the opposite question: why didn't I just make 30703e12c5d1SDavid du Colombier.CW sam 30713e12c5d1SDavid du Colombiera real programmable editor, with macros and variables and so on? 30723e12c5d1SDavid du ColombierThe main reason is a matter of taste: I like the editor 30733e12c5d1SDavid du Colombierto be the same every time I use it. 30743e12c5d1SDavid du ColombierThere is one technical reason, though: 30753e12c5d1SDavid du Colombierprogrammability in editors is largely a workaround for insufficient 30763e12c5d1SDavid du Colombierinteractivity. 30773e12c5d1SDavid du ColombierProgrammable editors are used to make particular, usually short-term, 30783e12c5d1SDavid du Colombierthings easy to do, such as by providing shorthands for common actions. 30793e12c5d1SDavid du ColombierIf things are generally easy to do in the first place, 30803e12c5d1SDavid du Colombiershorthands are not as helpful. 30813e12c5d1SDavid du Colombier.CW Sam 30823e12c5d1SDavid du Colombiermakes common editing operations very easy, and the solutions to 30833e12c5d1SDavid du Colombiercomplex editing problems seem commensurate with the problems themselves. 30843e12c5d1SDavid du ColombierAlso, the ability to edit the 30853e12c5d1SDavid du Colombier.CW sam 30863e12c5d1SDavid du Colombierwindow makes it easy to repeat commands \(em it only takes a mouse button click 30873e12c5d1SDavid du Colombierto execute a command again. 30883e12c5d1SDavid du Colombier.SH 2 30893e12c5d1SDavid du ColombierPros and cons 30903e12c5d1SDavid du Colombier.LP 30913e12c5d1SDavid du Colombier.CW Sam 30923e12c5d1SDavid du Colombierhas several other good points, 30933e12c5d1SDavid du Colombierand its share of problems. 30943e12c5d1SDavid du ColombierAmong the good things is the idea of 30953e12c5d1SDavid du Colombierstructural regular expressions, 30963e12c5d1SDavid du Colombierwhose usefulness has only begun to be explored. 30973e12c5d1SDavid du ColombierThey were arrived at serendipitously when I attempted to distill the essence of 30983e12c5d1SDavid du Colombier.CW ed 's 30993e12c5d1SDavid du Colombierway of doing global substitution and recognized that the looping command in 31003e12c5d1SDavid du Colombier.CW ed 31013e12c5d1SDavid du Colombierwas implicitly imposing a structure (an array of lines) on the file. 31023e12c5d1SDavid du Colombier.PP 31033e12c5d1SDavid du ColombierAnother of 31043e12c5d1SDavid du Colombier.CW sam 's 31053e12c5d1SDavid du Colombiergood things is its undo capability. 31063e12c5d1SDavid du ColombierI had never before used an editor with a true undo, 31073e12c5d1SDavid du Colombierbut I would never go back now. 31083e12c5d1SDavid du ColombierUndo 31093e12c5d1SDavid du Colombier.I must 31103e12c5d1SDavid du Colombierbe done well, but if it is, it can be relied on. 31113e12c5d1SDavid du ColombierFor example, 31123e12c5d1SDavid du Colombierit's safe to experiment if you're not sure how to write some intricate command, 31133e12c5d1SDavid du Colombierbecause if you make a mistake, it can be fixed simply and reliably. 31143e12c5d1SDavid du ColombierI learned two things about undo from writing 31153e12c5d1SDavid du Colombier.CW sam : 31163e12c5d1SDavid du Colombierfirst, it's easy to provide if you design it in from the beginning, and 31173e12c5d1SDavid du Colombiersecond, it's necessary, particularly if the system has some subtle 31183e12c5d1SDavid du Colombierproperties that may be unfamiliar or error-prone for users. 31193e12c5d1SDavid du Colombier.PP 31203e12c5d1SDavid du Colombier.CW Sam 's 31213e12c5d1SDavid du Colombierlack of internal limits and sizes is a virtue. 31223e12c5d1SDavid du ColombierBecause it avoids all fixed-size tables and data structures, 31233e12c5d1SDavid du Colombier.CW sam 31243e12c5d1SDavid du Colombieris able to make global changes to files that some of our other 31253e12c5d1SDavid du Colombiertools cannot even read. 31263e12c5d1SDavid du ColombierMoreover, the design keeps the performance linear when doing such 31273e12c5d1SDavid du Colombieroperations, although I must admit 31283e12c5d1SDavid du Colombier.CW sam 31293e12c5d1SDavid du Colombierdoes get slow when editing a huge file. 31303e12c5d1SDavid du Colombier.PP 31313e12c5d1SDavid du ColombierNow, the problems. 31323e12c5d1SDavid du ColombierExternally, the most obvious is that it is poorly integrated into the 31333e12c5d1SDavid du Colombiersurrounding window system. 31343e12c5d1SDavid du ColombierBy design, the user interface in 31353e12c5d1SDavid du Colombier.CW sam 31363e12c5d1SDavid du Colombierfeels almost identical to that of 31373e12c5d1SDavid du Colombier.CW mux , 31383e12c5d1SDavid du Colombierbut a thick wall separates text in 31393e12c5d1SDavid du Colombier.CW sam 31403e12c5d1SDavid du Colombierfrom the programs running in 31413e12c5d1SDavid du Colombier.CW mux . 31423e12c5d1SDavid du ColombierFor instance, the `snarf buffer' in 31433e12c5d1SDavid du Colombier.CW sam 31443e12c5d1SDavid du Colombiermust be maintained separately from that in 31453e12c5d1SDavid du Colombier.CW mux . 31463e12c5d1SDavid du ColombierThis is regrettable, but probably necessary given the unusual configuration 31473e12c5d1SDavid du Colombierof the system, with a programmable terminal on the far end of an RS-232 link. 31483e12c5d1SDavid du Colombier.PP 31493e12c5d1SDavid du Colombier.CW Sam 31503e12c5d1SDavid du Colombieris reliable; otherwise, people wouldn't use it. 31513e12c5d1SDavid du ColombierBut it was written over such a long time, and has so many new (to me) 31523e12c5d1SDavid du Colombierideas in it, that I would like to see it done over again to clean 31533e12c5d1SDavid du Colombierup the code and remove many of the lingering problems in the implementation. 31543e12c5d1SDavid du ColombierThe worst part is in the interconnection of the host and terminal parts, 31553e12c5d1SDavid du Colombierwhich might even be able to go away in a redesign for a more 31563e12c5d1SDavid du Colombierconventional window system. 31573e12c5d1SDavid du ColombierThe program must be split in two to use the terminal effectively, 31583e12c5d1SDavid du Colombierbut the low bandwidth of the connection forces the separation to 31593e12c5d1SDavid du Colombieroccur in an inconvenient part of the design if performance is to be acceptable. 31603e12c5d1SDavid du ColombierA simple remote procedure call 31613e12c5d1SDavid du Colombierprotocol driven by the host, emitting only graphics 31623e12c5d1SDavid du Colombiercommands, would be easy to write but wouldn't have nearly the 31633e12c5d1SDavid du Colombiernecessary responsiveness. On the other hand, if the terminal were in control 31643e12c5d1SDavid du Colombierand requested much simpler file services from the host, regular expression 31653e12c5d1SDavid du Colombiersearches would require that the terminal read the entire file over its RS-232 31663e12c5d1SDavid du Colombierlink, which would be unreasonably slow. 31673e12c5d1SDavid du ColombierA compromise in which either end can take control is necessary. 31683e12c5d1SDavid du ColombierIn retrospect, the communications protocol should have been 31693e12c5d1SDavid du Colombierdesigned and verified formally, although I do not know of any tool 31703e12c5d1SDavid du Colombierthat can adequately relate the protocol to 31713e12c5d1SDavid du Colombierits implementation. 31723e12c5d1SDavid du Colombier.PP 31733e12c5d1SDavid du ColombierNot all of 31743e12c5d1SDavid du Colombier.CW sam 's 31753e12c5d1SDavid du Colombierusers are comfortable with its command language, and few are adept. 31763e12c5d1SDavid du ColombierSome (venerable) people use a sort of 31773e12c5d1SDavid du Colombier.CW ed \& `` 31783e12c5d1SDavid du Colombiersubset'' of 31793e12c5d1SDavid du Colombier.CW sam 's 31803e12c5d1SDavid du Colombiercommand language, 31813e12c5d1SDavid du Colombierand even ask why 31823e12c5d1SDavid du Colombier.CW sam 's 31833e12c5d1SDavid du Colombiercommand language is not exactly 31843e12c5d1SDavid du Colombier.CW ed 's. 31853e12c5d1SDavid du Colombier(The reason, of course, is that 31863e12c5d1SDavid du Colombier.CW sam 's 31873e12c5d1SDavid du Colombiermodel for text does not include newlines, which are central to 31883e12c5d1SDavid du Colombier.CW ed . 31893e12c5d1SDavid du ColombierMaking the text an array of newlines to the command language would 31903e12c5d1SDavid du Colombierbe too much of a break from the seamless model provided by the mouse. 31913e12c5d1SDavid du ColombierSome editors, such as 31923e12c5d1SDavid du Colombier.CW vi , 31933e12c5d1SDavid du Colombierare willing to make this break, though.) 31943e12c5d1SDavid du ColombierThe difficulty is that 31953e12c5d1SDavid du Colombier.CW sam 's 31963e12c5d1SDavid du Colombiersyntax is so close to 31973e12c5d1SDavid du Colombier.CW ed 's 31983e12c5d1SDavid du Colombierthat people believe it 31993e12c5d1SDavid du Colombier.I should 32003e12c5d1SDavid du Colombierbe the same. 32013e12c5d1SDavid du ColombierI thought, with some justification in hindsight, 32023e12c5d1SDavid du Colombierthat making 32033e12c5d1SDavid du Colombier.CW sam 32043e12c5d1SDavid du Colombiersimilar to 32053e12c5d1SDavid du Colombier.CW ed 32063e12c5d1SDavid du Colombierwould make it easier to learn and to accept. 32073e12c5d1SDavid du ColombierBut I may have overstepped and raised the users' 32083e12c5d1SDavid du Colombierexpectations too much. 32093e12c5d1SDavid du ColombierIt's hard to decide which way to resolve this problem. 32103e12c5d1SDavid du Colombier.PP 32113e12c5d1SDavid du ColombierFinally, there is a tradeoff in 32123e12c5d1SDavid du Colombier.CW sam 32133e12c5d1SDavid du Colombierthat was decided by the environment in which it runs: 32143e12c5d1SDavid du Colombier.CW sam 32153e12c5d1SDavid du Colombieris a multi-file editor, although in a different system there might instead be 32163e12c5d1SDavid du Colombiermultiple single-file editors. 32173e12c5d1SDavid du ColombierThe decision was made primarily because starting a new program in a Blit is 32183e12c5d1SDavid du Colombiertime-consuming. 32193e12c5d1SDavid du ColombierIf the choice could be made freely, however, I would 32203e12c5d1SDavid du Colombierstill choose the multi-file architecture, because it allows 32213e12c5d1SDavid du Colombiergroups of files to be handled as a unit; 32223e12c5d1SDavid du Colombierthe usefulness of the multi-file commands is incontrovertible. 32233e12c5d1SDavid du ColombierIt is delightful to have the source to an entire program 32243e12c5d1SDavid du Colombieravailable at your fingertips. 32253e12c5d1SDavid du Colombier.SH 32263e12c5d1SDavid du ColombierAcknowledgements 32273e12c5d1SDavid du Colombier.LP 32283e12c5d1SDavid du ColombierTom Cargill suggested the idea behind the 32293e12c5d1SDavid du Colombier.CW Rasp 32303e12c5d1SDavid du Colombierdata structure. 32313e12c5d1SDavid du ColombierNorman Wilson and Ken Thompson influenced the command language. 32323e12c5d1SDavid du ColombierThis paper was improved by comments from 32333e12c5d1SDavid du ColombierAl Aho, 32343e12c5d1SDavid du ColombierJon Bentley, 32353e12c5d1SDavid du ColombierChris Fraser, 32363e12c5d1SDavid du ColombierGerard Holzmann, 32373e12c5d1SDavid du ColombierBrian Kernighan, 32383e12c5d1SDavid du ColombierTed Kowalski, 32393e12c5d1SDavid du ColombierDoug McIlroy 32403e12c5d1SDavid du Colombierand 32413e12c5d1SDavid du ColombierDennis Ritchie. 3242