xref: /plan9/sys/doc/sam/sam.ms (revision 5d535f588fe0a7c7f04081512974c9c20b5d87a8)
1426d2b71SDavid du Colombier.HTML "The Text Editor sam
23e12c5d1SDavid du Colombier.Vx 17 11 November 87 1 32 "ROB PIKE" "THE TEXT EDITOR SAM"
33e12c5d1SDavid du Colombier.ds DY "31 May 1987
43e12c5d1SDavid du Colombier.ds DR "Revised 1 July 1987
53e12c5d1SDavid du Colombier.de CW		\" puts first arg in CW font, same as UL; maintains font
63e12c5d1SDavid du Colombier\%\&\\$3\f(CW\\$1\fP\&\\$2
73e12c5d1SDavid du Colombier..
83e12c5d1SDavid du Colombier.de Cs
93e12c5d1SDavid du Colombier.br
103e12c5d1SDavid du Colombier.fi
113e12c5d1SDavid du Colombier.ft 2
123e12c5d1SDavid du Colombier.ps -2
133e12c5d1SDavid du Colombier.vs -2
143e12c5d1SDavid du Colombier..
153e12c5d1SDavid du Colombier.de Ce
163e12c5d1SDavid du Colombier.br
173e12c5d1SDavid du Colombier.nf
183e12c5d1SDavid du Colombier.ft 1
193e12c5d1SDavid du Colombier.ps
203e12c5d1SDavid du Colombier.vs
213e12c5d1SDavid du Colombier.sp
223e12c5d1SDavid du Colombier..
23426d2b71SDavid du Colombier.de XP
24*5d535f58SDavid du Colombier.ie h .html - <center><img src="\\$1.gif" /></center>
25426d2b71SDavid du Colombier.el .BP \\$1.ps \\$2
26426d2b71SDavid du Colombier..
273e12c5d1SDavid du Colombier.TL
283e12c5d1SDavid du ColombierThe Text Editor \&\f(CWsam\fP
293e12c5d1SDavid du Colombier.AU
30219b2ee8SDavid du ColombierRob Pike
317dd7cddfSDavid du Colombierrob@plan9.bell-labs.com
323e12c5d1SDavid du Colombier.AB
333e12c5d1SDavid du Colombier.LP
343e12c5d1SDavid du Colombier.CW Sam
353e12c5d1SDavid du Colombieris an interactive multi-file text editor intended for
363e12c5d1SDavid du Colombierbitmap displays.
373e12c5d1SDavid du ColombierA textual command language
383e12c5d1SDavid du Colombiersupplements the mouse-driven, cut-and-paste interface
393e12c5d1SDavid du Colombierto make complex or
403e12c5d1SDavid du Colombierrepetitive editing tasks easy to specify.
413e12c5d1SDavid du ColombierThe language is characterized by the composition of regular expressions
423e12c5d1SDavid du Colombierto describe the structure of the text being modified.
433e12c5d1SDavid du ColombierThe treatment of files as a database, with changes logged
443e12c5d1SDavid du Colombieras atomic transactions, guides the implementation and
453e12c5d1SDavid du Colombiermakes a general `undo' mechanism straightforward.
463e12c5d1SDavid du Colombier.PP
473e12c5d1SDavid du Colombier.CW Sam
483e12c5d1SDavid du Colombieris implemented as two processes connected by a low-bandwidth stream,
493e12c5d1SDavid du Colombierone process handling the display and the other the editing
503e12c5d1SDavid du Colombieralgorithms.  Therefore it can run with the display process
513e12c5d1SDavid du Colombierin a bitmap terminal and the editor on a local host,
523e12c5d1SDavid du Colombierwith both processes on a bitmap-equipped host, or with
533e12c5d1SDavid du Colombierthe display process in the terminal and the editor in a
543e12c5d1SDavid du Colombierremote host.
553e12c5d1SDavid du ColombierBy suppressing the display process,
563e12c5d1SDavid du Colombierit can even run without a bitmap terminal.
573e12c5d1SDavid du Colombier.PP
583e12c5d1SDavid du ColombierThis paper is reprinted from Software\(emPractice and Experience,
59219b2ee8SDavid du ColombierVol 17, number 11, pp. 813-845, November 1987.
60219b2ee8SDavid du ColombierThe paper has not been updated for the Plan 9 manuals.  Although
61219b2ee8SDavid du Colombier.CW Sam
62219b2ee8SDavid du Colombierhas not changed much since the paper was written, the system around it certainly has.
63219b2ee8SDavid du ColombierNonetheless, the description here still stands as the best introduction to the editor.
643e12c5d1SDavid du Colombier.AE
653e12c5d1SDavid du Colombier.SH
663e12c5d1SDavid du ColombierIntroduction
673e12c5d1SDavid du Colombier.LP
683e12c5d1SDavid du Colombier.CW Sam
693e12c5d1SDavid du Colombieris an interactive text editor that combines cut-and-paste interactive editing with
703e12c5d1SDavid du Colombieran unusual command language based on the composition of regular expressions.
71219b2ee8SDavid du ColombierIt is written as two programs: one, the `host part,' runs on a UNIX system
723e12c5d1SDavid du Colombierand implements the command language and provides file access; the other, the
733e12c5d1SDavid du Colombier`terminal part,' runs asynchronously
743e12c5d1SDavid du Colombieron a machine with a mouse and bitmap display
753e12c5d1SDavid du Colombierand supports the display and interactive editing.
763e12c5d1SDavid du ColombierThe host part may be even run in isolation on an ordinary terminal
773e12c5d1SDavid du Colombierto edit text using the command
783e12c5d1SDavid du Colombierlanguage, much like a traditional line editor,
793e12c5d1SDavid du Colombierwithout assistance from a mouse or display.
803e12c5d1SDavid du ColombierMost often,
813e12c5d1SDavid du Colombierthe terminal part runs on a Blit\u\s-4\&1\s+4\d terminal
823e12c5d1SDavid du Colombier(actually on a Teletype DMD 5620, the production version of the Blit), whose
833e12c5d1SDavid du Colombierhost connection is an ordinary 9600 bps RS232 link;
84219b2ee8SDavid du Colombieron the SUN computer the host and display processes run on a single machine,
853e12c5d1SDavid du Colombierconnected by a pipe.
863e12c5d1SDavid du Colombier.PP
873e12c5d1SDavid du Colombier.CW Sam
883e12c5d1SDavid du Colombieredits uninterpreted
893e12c5d1SDavid du ColombierASCII text.
903e12c5d1SDavid du ColombierIt has no facilities for multiple fonts, graphics or tables,
913e12c5d1SDavid du Colombierunlike MacWrite,\u\s-4\&2\s+4\d Bravo,\u\s-4\&3\s+4\d Tioga\u\s-4\&4\s+4\d
923e12c5d1SDavid du Colombieror Lara.\u\s-4\&5\s+4\d
933e12c5d1SDavid du ColombierAlso unlike them, it has a rich command language.
943e12c5d1SDavid du Colombier(Throughout this paper, the phrase
953e12c5d1SDavid du Colombier.I
963e12c5d1SDavid du Colombiercommand language
973e12c5d1SDavid du Colombier.R
983e12c5d1SDavid du Colombierrefers to
993e12c5d1SDavid du Colombiertextual commands; commands activated from the mouse form the
1003e12c5d1SDavid du Colombier.I mouse
1013e12c5d1SDavid du Colombier.I language. )
1023e12c5d1SDavid du Colombier.CW Sam
1033e12c5d1SDavid du Colombierdeveloped as an editor for use by programmers, and tries to join
104219b2ee8SDavid du Colombierthe styles of the UNIX text editor
1053e12c5d1SDavid du Colombier.CW ed \u\s-4\&6,7\s+4\d
1063e12c5d1SDavid du Colombierwith that of interactive cut-and-paste editors by
1073e12c5d1SDavid du Colombierproviding a comfortable mouse-driven interface
1083e12c5d1SDavid du Colombierto a program with a solid command language driven by regular expressions.
1093e12c5d1SDavid du ColombierThe command language developed more than the mouse language, and
1103e12c5d1SDavid du Colombieracquired a notation for describing the structure of files
1113e12c5d1SDavid du Colombiermore richly than as a sequence of lines,
1123e12c5d1SDavid du Colombierusing a dataflow-like syntax for specifying changes.
1133e12c5d1SDavid du Colombier.PP
1143e12c5d1SDavid du ColombierThe interactive style was influenced by
1153e12c5d1SDavid du Colombier.CW jim ,\u\s-4\&1\s+4\d
1163e12c5d1SDavid du Colombieran early cut-and-paste editor for the Blit, and by
1173e12c5d1SDavid du Colombier.CW mux ,\u\s-4\&8\s+4\d
1183e12c5d1SDavid du Colombierthe Blit window system.
1193e12c5d1SDavid du Colombier.CW Mux
1203e12c5d1SDavid du Colombiermerges the original Blit window system,
1213e12c5d1SDavid du Colombier.CW mpx ,\u\s-4\&1\s+4\d
1223e12c5d1SDavid du Colombierwith cut-and-paste editing, forming something like a
1233e12c5d1SDavid du Colombiermultiplexed version of
1243e12c5d1SDavid du Colombier.CW jim
1253e12c5d1SDavid du Colombierthat edits the output of (and input to) command sessions rather than files.
1263e12c5d1SDavid du Colombier.PP
1273e12c5d1SDavid du ColombierThe first part of this paper describes the command language, then the mouse
1283e12c5d1SDavid du Colombierlanguage, and explains how they interact.
1293e12c5d1SDavid du ColombierThat is followed by a description of the implementation,
1303e12c5d1SDavid du Colombierfirst of the host part, then of the terminal part.
1313e12c5d1SDavid du ColombierA principle that influenced the design of
1323e12c5d1SDavid du Colombier.CW sam
1333e12c5d1SDavid du Colombieris that it should have no explicit limits, such as upper limits on
1343e12c5d1SDavid du Colombierfile size or line length.
1353e12c5d1SDavid du ColombierA secondary consideration is that it be efficient.
1363e12c5d1SDavid du ColombierTo honor these two goals together requires a method for efficiently
1373e12c5d1SDavid du Colombiermanipulating
1383e12c5d1SDavid du Colombierhuge strings (files) without breaking them into lines,
1393e12c5d1SDavid du Colombierperhaps while making thousands of changes
1403e12c5d1SDavid du Colombierunder control of the command language.
1413e12c5d1SDavid du Colombier.CW Sam 's
1423e12c5d1SDavid du Colombiermethod is to
1433e12c5d1SDavid du Colombiertreat the file as a transaction database, implementing changes as atomic
1443e12c5d1SDavid du Colombierupdates.  These updates may be unwound easily to `undo' changes.
1453e12c5d1SDavid du ColombierEfficiency is achieved through a collection of caches that minimizes
1463e12c5d1SDavid du Colombierdisc traffic and data motion, both within the two parts of the program
1473e12c5d1SDavid du Colombierand between them.
1483e12c5d1SDavid du Colombier.PP
1493e12c5d1SDavid du ColombierThe terminal part of
1503e12c5d1SDavid du Colombier.CW sam
1513e12c5d1SDavid du Colombieris fairly straightforward.
1523e12c5d1SDavid du ColombierMore interesting is how the two halves of the editor stay
1533e12c5d1SDavid du Colombiersynchronized when either half may initiate a change.
1543e12c5d1SDavid du ColombierThis is achieved through a data structure that organizes the
1553e12c5d1SDavid du Colombiercommunications and is maintained in parallel by both halves.
1563e12c5d1SDavid du Colombier.PP
1573e12c5d1SDavid du ColombierThe last part of the paper chronicles the writing of
1583e12c5d1SDavid du Colombier.CW sam
1593e12c5d1SDavid du Colombierand discusses the lessons that were learned through its development and use.
1603e12c5d1SDavid du Colombier.PP
1613e12c5d1SDavid du ColombierThe paper is long, but is composed largely of two papers of reasonable length:
1623e12c5d1SDavid du Colombiera description of the user interface of
1633e12c5d1SDavid du Colombier.CW sam
1643e12c5d1SDavid du Colombierand a discussion of its implementation.
1653e12c5d1SDavid du ColombierThey are combined because the implementation is strongly influenced by
1663e12c5d1SDavid du Colombierthe user interface, and vice versa.
1673e12c5d1SDavid du Colombier.SH
1683e12c5d1SDavid du ColombierThe Interface
1693e12c5d1SDavid du Colombier.LP
1703e12c5d1SDavid du Colombier.CW Sam
1713e12c5d1SDavid du Colombieris a text editor for multiple files.
1723e12c5d1SDavid du ColombierFile names may be provided when it is invoked:
1733e12c5d1SDavid du Colombier.P1
1743e12c5d1SDavid du Colombiersam file1 file2 ...
1753e12c5d1SDavid du Colombier.P2
1763e12c5d1SDavid du Colombierand there are commands
1773e12c5d1SDavid du Colombierto add new files and discard unneeded ones.
1783e12c5d1SDavid du ColombierFiles are not read until necessary
1793e12c5d1SDavid du Colombierto complete some command.
1803e12c5d1SDavid du ColombierEditing operations apply to an internal copy
181219b2ee8SDavid du Colombiermade when the file is read; the UNIX file associated with the copy
1823e12c5d1SDavid du Colombieris changed only by an explicit command.
1833e12c5d1SDavid du ColombierTo simplify the discussion, the internal copy is here called a
1843e12c5d1SDavid du Colombier.I file ,
1853e12c5d1SDavid du Colombierwhile the disc-resident original is called a
1863e12c5d1SDavid du Colombier.I
1873e12c5d1SDavid du Colombierdisc file.
1883e12c5d1SDavid du Colombier.R
1893e12c5d1SDavid du Colombier.PP
1903e12c5d1SDavid du Colombier.CW Sam
1913e12c5d1SDavid du Colombieris usually connected to a bitmap display that presents a cut-and-paste
1923e12c5d1SDavid du Colombiereditor driven by the mouse.
1933e12c5d1SDavid du ColombierIn this mode, the command language is still available:
1943e12c5d1SDavid du Colombiertext typed in a special window, called the
1953e12c5d1SDavid du Colombier.CW sam
1963e12c5d1SDavid du Colombier.I window,
1973e12c5d1SDavid du Colombieris interpreted
1983e12c5d1SDavid du Colombieras commands to be executed in the current file.
1993e12c5d1SDavid du ColombierCut-and-paste editing may be used in any window \(em even in the
2003e12c5d1SDavid du Colombier.CW sam
2013e12c5d1SDavid du Colombierwindow to construct commands.
2023e12c5d1SDavid du ColombierThe other mode of operation, invoked by starting
2033e12c5d1SDavid du Colombier.CW sam
2043e12c5d1SDavid du Colombierwith the option
2053e12c5d1SDavid du Colombier.CW -d
2063e12c5d1SDavid du Colombier(for `no download'),
2073e12c5d1SDavid du Colombierdoes not use the mouse or bitmap display, but still permits
2083e12c5d1SDavid du Colombierediting using the textual command language, even on an ordinary terminal,
2093e12c5d1SDavid du Colombierinteractively or from a script.
2103e12c5d1SDavid du Colombier.PP
2113e12c5d1SDavid du ColombierThe following sections describe first the command language (under
2123e12c5d1SDavid du Colombier.CW sam\ -d
2133e12c5d1SDavid du Colombierand in the
2143e12c5d1SDavid du Colombier.CW sam
2153e12c5d1SDavid du Colombierwindow), and then the mouse interface.
2163e12c5d1SDavid du ColombierThese two languages are nearly independent, but connect through the
2173e12c5d1SDavid du Colombier.I current
2183e12c5d1SDavid du Colombier.I text,
2193e12c5d1SDavid du Colombierdescribed below.
2203e12c5d1SDavid du Colombier.SH 2
2213e12c5d1SDavid du ColombierThe Command Language
2223e12c5d1SDavid du Colombier.LP
2233e12c5d1SDavid du ColombierA file consists of its contents, which are an array of characters
2243e12c5d1SDavid du Colombier(that is, a string); the
2253e12c5d1SDavid du Colombier.I name
2263e12c5d1SDavid du Colombierof the associated disc file; the
2273e12c5d1SDavid du Colombier.I
2283e12c5d1SDavid du Colombiermodified bit
2293e12c5d1SDavid du Colombier.R
2303e12c5d1SDavid du Colombierthat states whether the contents match those of
2313e12c5d1SDavid du Colombierthe disc file;
2323e12c5d1SDavid du Colombierand a substring of the contents, called the
2333e12c5d1SDavid du Colombier.I
2343e12c5d1SDavid du Colombiercurrent text
2353e12c5d1SDavid du Colombier.R
2363e12c5d1SDavid du Colombieror
2373e12c5d1SDavid du Colombier.I dot
2383e12c5d1SDavid du Colombier(see Figures 1 and 2).
2393e12c5d1SDavid du ColombierIf the current text is a null string, dot falls between characters.
2403e12c5d1SDavid du ColombierThe
2413e12c5d1SDavid du Colombier.I value
2423e12c5d1SDavid du Colombierof dot is the location of the current text; the
2433e12c5d1SDavid du Colombier.I contents
2443e12c5d1SDavid du Colombierof dot are the characters it contains.
2453e12c5d1SDavid du Colombier.CW Sam
2463e12c5d1SDavid du Colombierimparts to the text no two-dimensional interpretation such as columns
2473e12c5d1SDavid du Colombieror fields; text is always one-dimensional.
248219b2ee8SDavid du ColombierEven the idea of a `line' of text as understood by most UNIX programs
2493e12c5d1SDavid du Colombier\(em a sequence of characters terminated by a newline character \(em
2503e12c5d1SDavid du Colombieris only weakly supported.
2513e12c5d1SDavid du Colombier.PP
2523e12c5d1SDavid du ColombierThe
2533e12c5d1SDavid du Colombier.I
2543e12c5d1SDavid du Colombiercurrent file
2553e12c5d1SDavid du Colombier.R
2563e12c5d1SDavid du Colombieris the file to which editing commands refer.
2573e12c5d1SDavid du ColombierThe current text is therefore dot in the current file.
2583e12c5d1SDavid du ColombierIf a command doesn't explicitly name a particular file or piece of text,
2593e12c5d1SDavid du Colombierthe command is assumed to apply to the current text.
2603e12c5d1SDavid du ColombierFor the moment, ignore the presence of multiple files and consider
2613e12c5d1SDavid du Colombierediting a single file.
2623e12c5d1SDavid du Colombier.KF L
263426d2b71SDavid du Colombier.XP fig1 3.5i
2643e12c5d1SDavid du Colombier.Cs
2653e12c5d1SDavid du ColombierFigure 1. A typical
2663e12c5d1SDavid du Colombier.CW sam
2673e12c5d1SDavid du Colombierscreen, with the editing menu presented.
2683e12c5d1SDavid du ColombierThe
2693e12c5d1SDavid du Colombier.CW sam
2703e12c5d1SDavid du Colombier(command language) window is in the middle, with file windows above and below.
2713e12c5d1SDavid du Colombier(The user interface makes it easy to create these abutting windows.)
2723e12c5d1SDavid du ColombierThe partially obscured window is a third file window.
2733e12c5d1SDavid du ColombierThe uppermost window is that to which typing and mouse operations apply,
2743e12c5d1SDavid du Colombieras indicated by its heavy border.
2753e12c5d1SDavid du ColombierEach window has its current text highlighted in reverse video.
2763e12c5d1SDavid du ColombierThe
2773e12c5d1SDavid du Colombier.CW sam
2783e12c5d1SDavid du Colombierwindow's current text is the null string on the last visible line,
2793e12c5d1SDavid du Colombierindicated by a vertical bar.
2803e12c5d1SDavid du ColombierSee also Figure 2.
2813e12c5d1SDavid du Colombier.Ce
2823e12c5d1SDavid du Colombier.KE
2833e12c5d1SDavid du Colombier.PP
2843e12c5d1SDavid du ColombierCommands have one-letter names.
2853e12c5d1SDavid du ColombierExcept for non-editing commands such as writing
2863e12c5d1SDavid du Colombierthe file to disc, most commands make some change
2873e12c5d1SDavid du Colombierto the text in dot and leave dot set to the text resulting from the change.
2883e12c5d1SDavid du ColombierFor example, the delete command,
2893e12c5d1SDavid du Colombier.CW d ,
2903e12c5d1SDavid du Colombierdeletes the text in dot, replacing it by the null string and setting dot
2913e12c5d1SDavid du Colombierto the result.
2923e12c5d1SDavid du ColombierThe change command,
2933e12c5d1SDavid du Colombier.CW c ,
2943e12c5d1SDavid du Colombierreplaces dot by text delimited by an arbitrary punctuation character,
2953e12c5d1SDavid du Colombierconventionally
2963e12c5d1SDavid du Colombiera slash.  Thus,
2973e12c5d1SDavid du Colombier.P1
2983e12c5d1SDavid du Colombierc/Peter/
2993e12c5d1SDavid du Colombier.P2
3003e12c5d1SDavid du Colombierreplaces the text in dot by the string
3013e12c5d1SDavid du Colombier.CW Peter .
3023e12c5d1SDavid du ColombierSimilarly,
3033e12c5d1SDavid du Colombier.P1
3043e12c5d1SDavid du Colombiera/Peter/
3053e12c5d1SDavid du Colombier.P2
3063e12c5d1SDavid du Colombier(append) adds the string after dot, and
3073e12c5d1SDavid du Colombier.P1
3083e12c5d1SDavid du Colombieri/Peter/
3093e12c5d1SDavid du Colombier.P2
3103e12c5d1SDavid du Colombier(insert) inserts before dot.
3113e12c5d1SDavid du ColombierAll three leave dot set to the new text,
3123e12c5d1SDavid du Colombier.CW Peter .
3133e12c5d1SDavid du Colombier.PP
3143e12c5d1SDavid du ColombierNewlines are part of the syntax of commands:
3153e12c5d1SDavid du Colombierthe newline character lexically terminates a command.
3163e12c5d1SDavid du ColombierWithin the inserted text, however, newlines are never implicit.
3173e12c5d1SDavid du ColombierBut since it is often convenient to insert multiple lines of text,
3183e12c5d1SDavid du Colombier.CW sam
3193e12c5d1SDavid du Colombierhas a special
3203e12c5d1SDavid du Colombiersyntax for that case:
3213e12c5d1SDavid du Colombier.P1
3223e12c5d1SDavid du Colombiera
3233e12c5d1SDavid du Colombiersome lines of text
3243e12c5d1SDavid du Colombierto be inserted in the file,
3253e12c5d1SDavid du Colombierterminated by a period
3263e12c5d1SDavid du Colombieron a line by itself
3273e12c5d1SDavid du Colombier\&.
3283e12c5d1SDavid du Colombier.P2
3293e12c5d1SDavid du ColombierIn the one-line syntax, a newline character may be specified by a C-like
3303e12c5d1SDavid du Colombierescape, so
3313e12c5d1SDavid du Colombier.P1
3323e12c5d1SDavid du Colombierc/\en/
3333e12c5d1SDavid du Colombier.P2
3343e12c5d1SDavid du Colombierreplaces dot by a single newline character.
3353e12c5d1SDavid du Colombier.PP
3363e12c5d1SDavid du Colombier.CW Sam
3373e12c5d1SDavid du Colombieralso has a substitute command,
3383e12c5d1SDavid du Colombier.CW s :
3393e12c5d1SDavid du Colombier.P1
3403e12c5d1SDavid du Colombiers/\f2expression\fP/\f2replacement\fP/
3413e12c5d1SDavid du Colombier.P2
3423e12c5d1SDavid du Colombiersubstitutes the replacement text for the first match, in dot,
3433e12c5d1SDavid du Colombierof the regular expression.
3443e12c5d1SDavid du ColombierThus, if dot is the string
3453e12c5d1SDavid du Colombier.CW Peter ,
3463e12c5d1SDavid du Colombierthe command
3473e12c5d1SDavid du Colombier.P1
3483e12c5d1SDavid du Colombiers/t/st/
3493e12c5d1SDavid du Colombier.P2
3503e12c5d1SDavid du Colombierchanges it to
3513e12c5d1SDavid du Colombier.CW Pester .
3523e12c5d1SDavid du ColombierIn general,
3533e12c5d1SDavid du Colombier.CW s
3543e12c5d1SDavid du Colombieris unnecessary, but it was inherited from
3553e12c5d1SDavid du Colombier.CW ed
3563e12c5d1SDavid du Colombierand it has some convenient variations.
3573e12c5d1SDavid du ColombierFor instance, the replacement text may include the matched text,
3583e12c5d1SDavid du Colombierspecified by
3593e12c5d1SDavid du Colombier.CW & :
3603e12c5d1SDavid du Colombier.P1
3613e12c5d1SDavid du Colombiers/Peter/Oh, &, &, &, &!/
3623e12c5d1SDavid du Colombier.P2
3633e12c5d1SDavid du Colombier.PP
3643e12c5d1SDavid du ColombierThere are also three commands that apply programs
3653e12c5d1SDavid du Colombierto text:
3663e12c5d1SDavid du Colombier.P1
367219b2ee8SDavid du Colombier< \f2UNIX program\fP
3683e12c5d1SDavid du Colombier.P2
369219b2ee8SDavid du Colombierreplaces dot by the output of the UNIX program.
3703e12c5d1SDavid du ColombierSimilarly, the
3713e12c5d1SDavid du Colombier.CW >
3723e12c5d1SDavid du Colombiercommand
3733e12c5d1SDavid du Colombierruns the program with dot as its standard input, and
3743e12c5d1SDavid du Colombier.CW |
3753e12c5d1SDavid du Colombierdoes both.  For example,
3763e12c5d1SDavid du Colombier.P1
3773e12c5d1SDavid du Colombier| sort
3783e12c5d1SDavid du Colombier.P2
3793e12c5d1SDavid du Colombierreplaces dot by the result of applying the standard sorting utility to it.
3803e12c5d1SDavid du ColombierAgain, newlines have no special significance for these
3813e12c5d1SDavid du Colombier.CW sam
3823e12c5d1SDavid du Colombiercommands.
3833e12c5d1SDavid du ColombierThe text acted upon and resulting from these commands is not necessarily
384219b2ee8SDavid du Colombierbounded by newlines, although for connection with UNIX programs,
3853e12c5d1SDavid du Colombiernewlines may be necessary to obey conventions.
3863e12c5d1SDavid du Colombier.PP
3873e12c5d1SDavid du ColombierOne more command:
3883e12c5d1SDavid du Colombier.CW p
3893e12c5d1SDavid du Colombierprints the contents of dot.
3903e12c5d1SDavid du ColombierTable I summarizes
3913e12c5d1SDavid du Colombier.CW sam 's
3923e12c5d1SDavid du Colombiercommands.
3933e12c5d1SDavid du Colombier.KF
3943e12c5d1SDavid du Colombier.TS
3953e12c5d1SDavid du Colombiercenter;
3963e12c5d1SDavid du Colombierc s
3973e12c5d1SDavid du ColombierlfCW l.
3983e12c5d1SDavid du ColombierTable I. \f(CWSam\fP commands
3993e12c5d1SDavid du Colombier.sp .4
4003e12c5d1SDavid du Colombier.ft CW
4013e12c5d1SDavid du Colombier_
4023e12c5d1SDavid du Colombier.ft
4033e12c5d1SDavid du Colombier.sp .4
4043e12c5d1SDavid du Colombier\f1Text commands\fP
4053e12c5d1SDavid du Colombier.sp .4
4063e12c5d1SDavid du Colombier_
4073e12c5d1SDavid du Colombier.sp .4
4083e12c5d1SDavid du Colombiera/\f2text\fP/	Append text after dot
4093e12c5d1SDavid du Colombierc/\f2text\fP/	Change text in dot
4103e12c5d1SDavid du Colombieri/\f2text\fP/	Insert text before dot
4113e12c5d1SDavid du Colombierd	Delete text in dot
4123e12c5d1SDavid du Colombiers/\f2regexp\fP/\f2text\fP/	Substitute text for match of regular expression in dot
4133e12c5d1SDavid du Colombierm \f2address\fP	Move text in dot after address
4143e12c5d1SDavid du Colombiert \f2address\fP	Copy text in dot after address
4153e12c5d1SDavid du Colombier.sp .4
4163e12c5d1SDavid du Colombier_
4173e12c5d1SDavid du Colombier.sp .4
4183e12c5d1SDavid du Colombier\f1Display commands\fP
4193e12c5d1SDavid du Colombier.sp .4
4203e12c5d1SDavid du Colombier_
4213e12c5d1SDavid du Colombier.sp .2
4223e12c5d1SDavid du Colombierp	Print contents of dot
4233e12c5d1SDavid du Colombier\&=	Print value (line numbers and character numbers) of dot
4243e12c5d1SDavid du Colombier.sp .4
4253e12c5d1SDavid du Colombier_
4263e12c5d1SDavid du Colombier.sp .4
4273e12c5d1SDavid du Colombier\f1File commands\fP
4283e12c5d1SDavid du Colombier.sp .4
4293e12c5d1SDavid du Colombier_
4303e12c5d1SDavid du Colombier.sp .2
4313e12c5d1SDavid du Colombierb \f2file-list\fP	Set current file to first file in list that \f(CWsam\fP has in menu
4323e12c5d1SDavid du ColombierB \f2file-list\fP	Same as \f(CWb\fP, but load new files
4333e12c5d1SDavid du Colombiern	Print menu lines of all files
4343e12c5d1SDavid du ColombierD \f2file-list\fP	Delete named files from \f(CWsam\fP
4353e12c5d1SDavid du Colombier.sp .4
4363e12c5d1SDavid du Colombier_
4373e12c5d1SDavid du Colombier.sp .4
4383e12c5d1SDavid du Colombier\f1I/O commands\fP
4393e12c5d1SDavid du Colombier.sp .4
4403e12c5d1SDavid du Colombier_
4413e12c5d1SDavid du Colombier.sp .2
4423e12c5d1SDavid du Colombiere \f2filename\fP	Replace file with named disc file
4433e12c5d1SDavid du Colombierr \f2filename\fP	Replace dot by contents of named disc file
4443e12c5d1SDavid du Colombierw \f2filename\fP	Write file to named disc file
4453e12c5d1SDavid du Colombierf \f2filename\fP	Set file name and print new menu line
446219b2ee8SDavid du Colombier< \f2UNIX-command\fP	Replace dot by standard output of command
447219b2ee8SDavid du Colombier> \f2UNIX-command\fP	Send dot to standard input of command
448219b2ee8SDavid du Colombier| \f2UNIX-command\fP	Replace dot by result of command applied to dot
449219b2ee8SDavid du Colombier! \f2UNIX-command\fP	Run the command
4503e12c5d1SDavid du Colombier.sp .4
4513e12c5d1SDavid du Colombier_
4523e12c5d1SDavid du Colombier.sp .4
4533e12c5d1SDavid du Colombier\f1Loops and conditionals\fP
4543e12c5d1SDavid du Colombier.sp .4
4553e12c5d1SDavid du Colombier_
4563e12c5d1SDavid du Colombier.sp .2
4573e12c5d1SDavid du Colombierx/\f2regexp\fP/ \f2command\fP	For each match of regexp, set dot and run command
4583e12c5d1SDavid du Colombiery/\f2regexp\fP/ \f2command\fP	Between adjacent matches of regexp, set dot and run command
4593e12c5d1SDavid du ColombierX/\f2regexp\fP/ \f2command\fP	Run command in each file whose menu line matches regexp
4603e12c5d1SDavid du ColombierY/\f2regexp\fP/ \f2command\fP	Run command in each file whose menu line does not match
4613e12c5d1SDavid du Colombierg/\f2regexp\fP/ \f2command\fP	If dot contains a match of regexp, run command
4623e12c5d1SDavid du Colombierv/\f2regexp\fP/ \f2command\fP	If dot does not contain a match of regexp, run command
4633e12c5d1SDavid du Colombier.sp .4
4643e12c5d1SDavid du Colombier_
4653e12c5d1SDavid du Colombier.sp .4
4663e12c5d1SDavid du Colombier\f1Miscellany\fP
4673e12c5d1SDavid du Colombier.sp .4
4683e12c5d1SDavid du Colombier_
4693e12c5d1SDavid du Colombier.sp .2
4703e12c5d1SDavid du Colombierk	Set address mark to value of dot
4713e12c5d1SDavid du Colombierq	Quit
4723e12c5d1SDavid du Colombieru \f2n\fP	Undo last \f2n\fP (default 1) changes
4733e12c5d1SDavid du Colombier{ }	Braces group commands
4743e12c5d1SDavid du Colombier.sp .3
4753e12c5d1SDavid du Colombier.ft CW
4763e12c5d1SDavid du Colombier_
4773e12c5d1SDavid du Colombier.ft
4783e12c5d1SDavid du Colombier.TE
4793e12c5d1SDavid du Colombier.sp
4803e12c5d1SDavid du Colombier.KE
4813e12c5d1SDavid du Colombier.PP
4823e12c5d1SDavid du ColombierThe value of dot may be changed by
4833e12c5d1SDavid du Colombierspecifying an
4843e12c5d1SDavid du Colombier.I address
4853e12c5d1SDavid du Colombierfor the command.
4863e12c5d1SDavid du ColombierThe simplest address is a line number:
4873e12c5d1SDavid du Colombier.P1
4883e12c5d1SDavid du Colombier3
4893e12c5d1SDavid du Colombier.P2
4903e12c5d1SDavid du Colombierrefers to the third line of the file, so
4913e12c5d1SDavid du Colombier.P1
4923e12c5d1SDavid du Colombier3d
4933e12c5d1SDavid du Colombier.P2
4943e12c5d1SDavid du Colombierdeletes the third line of the file, and implicitly renumbers
4953e12c5d1SDavid du Colombierthe lines so the old line 4 is now numbered 3.
4963e12c5d1SDavid du Colombier(This is one of the few places where
4973e12c5d1SDavid du Colombier.CW sam
4983e12c5d1SDavid du Colombierdeals with lines directly.)
4993e12c5d1SDavid du ColombierLine
5003e12c5d1SDavid du Colombier.CW 0
5013e12c5d1SDavid du Colombieris the null string at the beginning of the file.
5023e12c5d1SDavid du ColombierIf a command consists of only an address, a
5033e12c5d1SDavid du Colombier.CW p
5043e12c5d1SDavid du Colombiercommand is assumed, so typing an unadorned
5053e12c5d1SDavid du Colombier.CW 3
5063e12c5d1SDavid du Colombierprints line 3 on the terminal.
5073e12c5d1SDavid du ColombierThere are a couple of other basic addresses:
5083e12c5d1SDavid du Colombiera period addresses dot itself; and
5093e12c5d1SDavid du Colombiera dollar sign
5103e12c5d1SDavid du Colombier.CW $ ) (
5113e12c5d1SDavid du Colombieraddresses the null string at the end of the file.
5123e12c5d1SDavid du Colombier.PP
5133e12c5d1SDavid du ColombierAn address is always a single substring of the file.
5143e12c5d1SDavid du ColombierThus, the address
5153e12c5d1SDavid du Colombier.CW 3
5163e12c5d1SDavid du Colombieraddresses the characters
5173e12c5d1SDavid du Colombierafter the second newline of
5183e12c5d1SDavid du Colombierthe file through the third newline of the file.
5193e12c5d1SDavid du ColombierA
5203e12c5d1SDavid du Colombier.I
5213e12c5d1SDavid du Colombiercompound address
5223e12c5d1SDavid du Colombier.R
5233e12c5d1SDavid du Colombieris constructed by the comma operator
5243e12c5d1SDavid du Colombier.P1
5253e12c5d1SDavid du Colombier\f2address1\fP,\f2address2\fP
5263e12c5d1SDavid du Colombier.P2
5273e12c5d1SDavid du Colombierand addresses the substring of the file from the beginning of
5283e12c5d1SDavid du Colombier.I address1
5293e12c5d1SDavid du Colombierto the end of
5303e12c5d1SDavid du Colombier.I address2 .
5313e12c5d1SDavid du ColombierFor example, the command
5323e12c5d1SDavid du Colombier.CW 3,5p
5333e12c5d1SDavid du Colombierprints the third through fifth lines of the file and
5343e12c5d1SDavid du Colombier.CW .,$d
5353e12c5d1SDavid du Colombierdeletes the text from the beginning of dot to the end of the file.
5363e12c5d1SDavid du Colombier.PP
5373e12c5d1SDavid du ColombierThese addresses are all absolute positions in the file, but
5383e12c5d1SDavid du Colombier.CW sam
5393e12c5d1SDavid du Colombieralso has relative addresses, indicated by
5403e12c5d1SDavid du Colombier.CW +
5413e12c5d1SDavid du Colombieror
5423e12c5d1SDavid du Colombier.CW - .
5433e12c5d1SDavid du ColombierFor example,
5443e12c5d1SDavid du Colombier.P1
5453e12c5d1SDavid du Colombier$-3
5463e12c5d1SDavid du Colombier.P2
5473e12c5d1SDavid du Colombieris the third line before the end of the file and
5483e12c5d1SDavid du Colombier.P1
5493e12c5d1SDavid du Colombier\&.+1
5503e12c5d1SDavid du Colombier.P2
5513e12c5d1SDavid du Colombieris the line after dot.
5523e12c5d1SDavid du ColombierIf no address appears to the left of the
5533e12c5d1SDavid du Colombier.CW +
5543e12c5d1SDavid du Colombieror
5553e12c5d1SDavid du Colombier.CW - ,
5563e12c5d1SDavid du Colombierdot is assumed;
5573e12c5d1SDavid du Colombierif nothing appears to the right,
5583e12c5d1SDavid du Colombier.CW 1
5593e12c5d1SDavid du Colombieris assumed.
5603e12c5d1SDavid du ColombierTherefore,
5613e12c5d1SDavid du Colombier.CW .+1
5623e12c5d1SDavid du Colombiermay be abbreviated to just a plus sign.
5633e12c5d1SDavid du Colombier.PP
5643e12c5d1SDavid du ColombierThe
5653e12c5d1SDavid du Colombier.CW +
5663e12c5d1SDavid du Colombieroperator acts relative to the end of its first argument, while the
5673e12c5d1SDavid du Colombier.CW -
5683e12c5d1SDavid du Colombieroperator acts relative to the beginning.  Thus
5693e12c5d1SDavid du Colombier.CW .+1
5703e12c5d1SDavid du Colombieraddresses the first line after dot,
5713e12c5d1SDavid du Colombier.CW .-
5723e12c5d1SDavid du Colombieraddresses the first line before dot, and
5733e12c5d1SDavid du Colombier.CW +-
5743e12c5d1SDavid du Colombierrefers to the line containing the end of dot.  (Dot may span multiple lines, and
5753e12c5d1SDavid du Colombier.CW +
5763e12c5d1SDavid du Colombierselects the line after the end of dot, then
5773e12c5d1SDavid du Colombier.CW -
5783e12c5d1SDavid du Colombierbacks up one line.)
5793e12c5d1SDavid du Colombier.PP
5803e12c5d1SDavid du ColombierThe final type of address is a regular expression, which addresses the
5813e12c5d1SDavid du Colombiertext matched by the expression.  The expression is enclosed in slashes, as in
5823e12c5d1SDavid du Colombier.P1
5833e12c5d1SDavid du Colombier/\f2expression\fP/
5843e12c5d1SDavid du Colombier.P2
585219b2ee8SDavid du ColombierThe expressions are the same as those in the UNIX program
5863e12c5d1SDavid du Colombier.CW egrep ,\u\s-4\&6,7\s+4\d
5873e12c5d1SDavid du Colombierand include closures, alternations, and so on.
5883e12c5d1SDavid du ColombierThey find the
5893e12c5d1SDavid du Colombier.I
5903e12c5d1SDavid du Colombierleftmost longest
5913e12c5d1SDavid du Colombier.R
5923e12c5d1SDavid du Colombierstring that matches the expression, that is,
5933e12c5d1SDavid du Colombierthe first match after the point where the search is started,
5943e12c5d1SDavid du Colombierand if more than one match begins at the same spot, the longest such match.
595219b2ee8SDavid du Colombier(I assume familiarity with the syntax for regular expressions in UNIX programs.\u\s-4\&9\s+4\d)
5963e12c5d1SDavid du ColombierFor example,
5973e12c5d1SDavid du Colombier.P1
5983e12c5d1SDavid du Colombier/x/
5993e12c5d1SDavid du Colombier.P2
6003e12c5d1SDavid du Colombiermatches the next
6013e12c5d1SDavid du Colombier.CW x
6023e12c5d1SDavid du Colombiercharacter in the file,
6033e12c5d1SDavid du Colombier.P1
6043e12c5d1SDavid du Colombier/xx*/
6053e12c5d1SDavid du Colombier.P2
6063e12c5d1SDavid du Colombiermatches the next run of one or more
6073e12c5d1SDavid du Colombier.CW x 's,
6083e12c5d1SDavid du Colombierand
6093e12c5d1SDavid du Colombier.P1
6103e12c5d1SDavid du Colombier/x|Peter/
6113e12c5d1SDavid du Colombier.P2
6123e12c5d1SDavid du Colombiermatches the next
6133e12c5d1SDavid du Colombier.CW x
6143e12c5d1SDavid du Colombieror
6153e12c5d1SDavid du Colombier.CW Peter .
616219b2ee8SDavid du ColombierFor compatibility with other UNIX programs, the `any character' operator,
6173e12c5d1SDavid du Colombiera period,
6183e12c5d1SDavid du Colombierdoes not match a newline, so
6193e12c5d1SDavid du Colombier.P1
6203e12c5d1SDavid du Colombier/.*/
6213e12c5d1SDavid du Colombier.P2
6223e12c5d1SDavid du Colombiermatches the text from dot to the end of the line, but excludes the newline
6233e12c5d1SDavid du Colombierand so will not match across
6243e12c5d1SDavid du Colombierthe line boundary.
6253e12c5d1SDavid du Colombier.PP
6263e12c5d1SDavid du ColombierRegular expressions are always relative addresses.
6273e12c5d1SDavid du ColombierThe direction is forwards by default,
6283e12c5d1SDavid du Colombierso
6293e12c5d1SDavid du Colombier.CW /Peter/
6303e12c5d1SDavid du Colombieris really an abbreviation for
6313e12c5d1SDavid du Colombier.CW +/Peter/ .
6323e12c5d1SDavid du ColombierThe search can be reversed with a minus sign, so
6333e12c5d1SDavid du Colombier.P1
6343e12c5d1SDavid du Colombier.CW -/Peter/
6353e12c5d1SDavid du Colombier.P2
6363e12c5d1SDavid du Colombierfinds the first
6373e12c5d1SDavid du Colombier.CW Peter
6383e12c5d1SDavid du Colombierbefore dot.
6393e12c5d1SDavid du ColombierRegular expressions may be used with other address forms, so
6403e12c5d1SDavid du Colombier.CW 0+/Peter/
6413e12c5d1SDavid du Colombierfinds the first
6423e12c5d1SDavid du Colombier.CW Peter
6433e12c5d1SDavid du Colombierin the file and
6443e12c5d1SDavid du Colombier.CW $-/Peter/
6453e12c5d1SDavid du Colombierfinds the last.
6463e12c5d1SDavid du ColombierTable II summarizes
6473e12c5d1SDavid du Colombier.CW sam 's
6483e12c5d1SDavid du Colombieraddresses.
6493e12c5d1SDavid du Colombier.KF
6503e12c5d1SDavid du Colombier.TS
6513e12c5d1SDavid du Colombiercenter;
6523e12c5d1SDavid du Colombierc s
6533e12c5d1SDavid du ColombierlfCW l.
6543e12c5d1SDavid du ColombierTable II. \f(CWSam\fP addresses
6553e12c5d1SDavid du Colombier.sp .4
6563e12c5d1SDavid du Colombier.ft CW
6573e12c5d1SDavid du Colombier_
6583e12c5d1SDavid du Colombier.ft
6593e12c5d1SDavid du Colombier.sp .4
6603e12c5d1SDavid du Colombier\f1Simple addresses\fP
6613e12c5d1SDavid du Colombier.sp .4
6623e12c5d1SDavid du Colombier_
6633e12c5d1SDavid du Colombier.sp .2
6643e12c5d1SDavid du Colombier#\f2n\fP	The empty string after character \f2n\fP
6653e12c5d1SDavid du Colombier\f2n\fP	Line \f2n\fP.
6663e12c5d1SDavid du Colombier/\f2regexp\fP/	The first following match of the regular expression
6673e12c5d1SDavid du Colombier-/\f2regexp\fP/	The first previous match of the regular expression
6683e12c5d1SDavid du Colombier$	The null string at the end of the file
6693e12c5d1SDavid du Colombier\&.	Dot
6703e12c5d1SDavid du Colombier\&'	The address mark, set by \f(CWk\fP command
6713e12c5d1SDavid du Colombier"\f2regexp\fP"	Dot in the file whose menu line matches regexp
6723e12c5d1SDavid du Colombier.sp .4
6733e12c5d1SDavid du Colombier_
6743e12c5d1SDavid du Colombier.sp .4
6753e12c5d1SDavid du Colombier\f1Compound addresses\fP
6763e12c5d1SDavid du Colombier.sp .4
6773e12c5d1SDavid du Colombier_
6783e12c5d1SDavid du Colombier.sp .2
6793e12c5d1SDavid du Colombier\f2a1\fP+\f2a2\fP	The address \f2a2\fP evaluated starting at right of \f2a1\fP
6803e12c5d1SDavid du Colombier\f2a1\fP-\f2a2\fP	\f2a2\fP evaluated in the reverse direction starting at left of \f2a1\fP
6813e12c5d1SDavid du Colombier\f2a1\fP,\f2a2\fP	From the left of \f2a1\fP to the right of \f2a2\fP (default \f(CW0,$\fP)
6823e12c5d1SDavid du Colombier\f2a1\fP;\f2a2\fP	Like \f(CW,\fP but sets dot after evaluating \f2a1\fP
6833e12c5d1SDavid du Colombier.sp .4
6843e12c5d1SDavid du Colombier_
6853e12c5d1SDavid du Colombier.sp .4
6863e12c5d1SDavid du Colombier.T&
6873e12c5d1SDavid du Colombierc s.
6883e12c5d1SDavid du ColombierT{
6893e12c5d1SDavid du ColombierThe operators
6903e12c5d1SDavid du Colombier.CW +
6913e12c5d1SDavid du Colombierand
6923e12c5d1SDavid du Colombier.CW -
6933e12c5d1SDavid du Colombierare high precedence, while
6943e12c5d1SDavid du Colombier.CW ,
6953e12c5d1SDavid du Colombierand
6963e12c5d1SDavid du Colombier.CW ;
6973e12c5d1SDavid du Colombierare low precedence.
6983e12c5d1SDavid du ColombierIn both
6993e12c5d1SDavid du Colombier.CW +
7003e12c5d1SDavid du Colombierand
7013e12c5d1SDavid du Colombier.CW -
7023e12c5d1SDavid du Colombierforms,
7033e12c5d1SDavid du Colombier.I a2
7043e12c5d1SDavid du Colombierdefaults to 1 and
7053e12c5d1SDavid du Colombier.I a1
7063e12c5d1SDavid du Colombierdefaults to dot.
7073e12c5d1SDavid du ColombierIf both
7083e12c5d1SDavid du Colombier.I a1
7093e12c5d1SDavid du Colombierand
7103e12c5d1SDavid du Colombier.I a2
7113e12c5d1SDavid du Colombierare present,
7123e12c5d1SDavid du Colombier.CW +
7133e12c5d1SDavid du Colombiermay be elided.
7143e12c5d1SDavid du ColombierT}
7153e12c5d1SDavid du Colombier.sp .5
7163e12c5d1SDavid du Colombier.ft CW
7173e12c5d1SDavid du Colombier_
7183e12c5d1SDavid du Colombier.ft
7193e12c5d1SDavid du Colombier.TE
7203e12c5d1SDavid du Colombier.sp
7213e12c5d1SDavid du Colombier.KE
7223e12c5d1SDavid du Colombier.PP
7233e12c5d1SDavid du ColombierThe language discussed so far will not seem novel
724219b2ee8SDavid du Colombierto people who use UNIX text editors
7253e12c5d1SDavid du Colombiersuch as
7263e12c5d1SDavid du Colombier.CW ed
7273e12c5d1SDavid du Colombieror
7283e12c5d1SDavid du Colombier.CW vi .\u\s-4\&9\s+4\d
7293e12c5d1SDavid du ColombierMoreover, the kinds of editing operations these commands allow, with the exception
7303e12c5d1SDavid du Colombierof regular expressions and line numbers,
7313e12c5d1SDavid du Colombierare clearly more conveniently handled by a mouse-based interface.
7323e12c5d1SDavid du ColombierIndeed,
7333e12c5d1SDavid du Colombier.CW sam 's
7343e12c5d1SDavid du Colombiermouse language (discussed at length below) is the means by which
7353e12c5d1SDavid du Colombiersimple changes are usually made.
7363e12c5d1SDavid du ColombierFor large or repetitive changes, however, a textual language
7373e12c5d1SDavid du Colombieroutperforms a manual interface.
7383e12c5d1SDavid du Colombier.PP
7393e12c5d1SDavid du ColombierImagine that, instead of deleting just one occurrence of the string
7403e12c5d1SDavid du Colombier.CW Peter ,
7413e12c5d1SDavid du Colombierwe wanted to eliminate every
7423e12c5d1SDavid du Colombier.CW Peter .
7433e12c5d1SDavid du ColombierWhat's needed is an iterator that runs a command for each occurrence of some
7443e12c5d1SDavid du Colombiertext.
7453e12c5d1SDavid du Colombier.CW Sam 's
7463e12c5d1SDavid du Colombieriterator is called
7473e12c5d1SDavid du Colombier.CW x ,
7483e12c5d1SDavid du Colombierfor extract:
7493e12c5d1SDavid du Colombier.P1
7503e12c5d1SDavid du Colombierx/\f2expression\fP/ \f2command\fP
7513e12c5d1SDavid du Colombier.P2
7523e12c5d1SDavid du Colombierfinds all matches in dot of the specified expression, and for each
7533e12c5d1SDavid du Colombiersuch match, sets dot to the text matched and runs the command.
7543e12c5d1SDavid du ColombierSo to delete all the
7553e12c5d1SDavid du Colombier.CW Peters:
7563e12c5d1SDavid du Colombier.P1
7573e12c5d1SDavid du Colombier0,$ x/Peter/ d
7583e12c5d1SDavid du Colombier.P2
7593e12c5d1SDavid du Colombier(Blanks in these examples are to improve readability;
7603e12c5d1SDavid du Colombier.CW sam
7613e12c5d1SDavid du Colombierneither requires nor interprets them.)
7623e12c5d1SDavid du ColombierThis searches the entire file
7633e12c5d1SDavid du Colombier.CW 0,$ ) (
7643e12c5d1SDavid du Colombierfor occurrences of the string
7653e12c5d1SDavid du Colombier.CW Peter ,
7663e12c5d1SDavid du Colombierand runs the
7673e12c5d1SDavid du Colombier.CW d
7683e12c5d1SDavid du Colombiercommand with dot set to each such occurrence.
7693e12c5d1SDavid du Colombier(By contrast, the comparable
7703e12c5d1SDavid du Colombier.CW ed
7713e12c5d1SDavid du Colombiercommand would delete all
7723e12c5d1SDavid du Colombier.I lines
7733e12c5d1SDavid du Colombiercontaining
7743e12c5d1SDavid du Colombier.CW Peter ;
7753e12c5d1SDavid du Colombier.CW sam
7763e12c5d1SDavid du Colombierdeletes only the
7773e12c5d1SDavid du Colombier.CW Peters .)
7783e12c5d1SDavid du ColombierThe address
7793e12c5d1SDavid du Colombier.CW 0,$
7803e12c5d1SDavid du Colombieris commonly used, and may be abbreviated to just a comma.
7813e12c5d1SDavid du ColombierAs another example,
7823e12c5d1SDavid du Colombier.P1
7833e12c5d1SDavid du Colombier, x/Peter/ p
7843e12c5d1SDavid du Colombier.P2
7853e12c5d1SDavid du Colombierprints a list of
7863e12c5d1SDavid du Colombier.CW Peters,
7873e12c5d1SDavid du Colombierone for each appearance in the file, with no intervening text (not even newlines
7883e12c5d1SDavid du Colombierto separate the instances).
7893e12c5d1SDavid du Colombier.PP
7903e12c5d1SDavid du ColombierOf course, the text extracted by
7913e12c5d1SDavid du Colombier.CW x
7923e12c5d1SDavid du Colombiermay be selected by a regular expression,
7933e12c5d1SDavid du Colombierwhich complicates deciding what set of matches is chosen \(em
7943e12c5d1SDavid du Colombiermatches may overlap.  This is resolved by generating the matches
7953e12c5d1SDavid du Colombierstarting from the beginning of dot using the leftmost-longest rule,
7963e12c5d1SDavid du Colombierand searching for each match starting from the end of the previous one.
7973e12c5d1SDavid du ColombierRegular expressions may also match null strings, but a null match
7983e12c5d1SDavid du Colombieradjacent to a non-null match is never selected; at least one character
7993e12c5d1SDavid du Colombiermust intervene.
8003e12c5d1SDavid du ColombierFor example,
8013e12c5d1SDavid du Colombier.P1
8023e12c5d1SDavid du Colombier, c/AAA/
8033e12c5d1SDavid du Colombierx/B*/ c/-/
8043e12c5d1SDavid du Colombier, p
8053e12c5d1SDavid du Colombier.P2
8063e12c5d1SDavid du Colombierproduces as output
8073e12c5d1SDavid du Colombier.P1
8083e12c5d1SDavid du Colombier-A-A-A-
8093e12c5d1SDavid du Colombier.P2
8103e12c5d1SDavid du Colombierbecause the pattern
8113e12c5d1SDavid du Colombier.CW B*
8123e12c5d1SDavid du Colombiermatches the null strings separating the
8133e12c5d1SDavid du Colombier.CW A 's.
8143e12c5d1SDavid du Colombier.PP
8153e12c5d1SDavid du ColombierThe
8163e12c5d1SDavid du Colombier.CW x
8173e12c5d1SDavid du Colombiercommand has a complement,
8183e12c5d1SDavid du Colombier.CW y ,
8193e12c5d1SDavid du Colombierwith similar syntax, that executes the command with dot set to the text
8203e12c5d1SDavid du Colombier.I between
8213e12c5d1SDavid du Colombierthe matches of the expression.
8223e12c5d1SDavid du ColombierFor example,
8233e12c5d1SDavid du Colombier.P1
8243e12c5d1SDavid du Colombier, c/AAA/
8253e12c5d1SDavid du Colombiery/A/ c/-/
8263e12c5d1SDavid du Colombier, p
8273e12c5d1SDavid du Colombier.P2
8283e12c5d1SDavid du Colombierproduces the same result as the example above.
8293e12c5d1SDavid du Colombier.PP
8303e12c5d1SDavid du ColombierThe
8313e12c5d1SDavid du Colombier.CW x
8323e12c5d1SDavid du Colombierand
8333e12c5d1SDavid du Colombier.CW y
8343e12c5d1SDavid du Colombiercommands are looping constructs, and
8353e12c5d1SDavid du Colombier.CW sam
8363e12c5d1SDavid du Colombierhas a pair of conditional commands to go with them.
8373e12c5d1SDavid du ColombierThey have similar syntax:
8383e12c5d1SDavid du Colombier.P1
8393e12c5d1SDavid du Colombierg/\f2expression\fP/ \f2command\fP
8403e12c5d1SDavid du Colombier.P2
8413e12c5d1SDavid du Colombier(guard)
8423e12c5d1SDavid du Colombierruns the command exactly once if dot contains a match of the expression.
8433e12c5d1SDavid du ColombierThis is different from
8443e12c5d1SDavid du Colombier.CW x ,
8453e12c5d1SDavid du Colombierwhich runs the command for
8463e12c5d1SDavid du Colombier.I each
8473e12c5d1SDavid du Colombiermatch:
8483e12c5d1SDavid du Colombier.CW x
8493e12c5d1SDavid du Colombierloops;
8503e12c5d1SDavid du Colombier.CW g
8513e12c5d1SDavid du Colombiermerely tests, without changing the value of dot.
8523e12c5d1SDavid du ColombierThus,
8533e12c5d1SDavid du Colombier.P1
8543e12c5d1SDavid du Colombier, x/Peter/ d
8553e12c5d1SDavid du Colombier.P2
8563e12c5d1SDavid du Colombierdeletes all occurrences of
8573e12c5d1SDavid du Colombier.CW Peter ,
8583e12c5d1SDavid du Colombierbut
8593e12c5d1SDavid du Colombier.P1
8603e12c5d1SDavid du Colombier, g/Peter/ d
8613e12c5d1SDavid du Colombier.P2
8623e12c5d1SDavid du Colombierdeletes the whole file (reduces it to a null string) if
8633e12c5d1SDavid du Colombier.CW Peter
8643e12c5d1SDavid du Colombieroccurs anywhere in the text.
8653e12c5d1SDavid du ColombierThe complementary conditional is
8663e12c5d1SDavid du Colombier.CW v ,
8673e12c5d1SDavid du Colombierwhich runs the command if there is
8683e12c5d1SDavid du Colombier.I no
8693e12c5d1SDavid du Colombiermatch of the expression.
8703e12c5d1SDavid du Colombier.PP
8713e12c5d1SDavid du ColombierThese control-structure-like commands may be composed to construct more
8723e12c5d1SDavid du Colombierinvolved operations.  For example, to print those lines of text that
8733e12c5d1SDavid du Colombiercontain the string
8743e12c5d1SDavid du Colombier.CW Peter :
8753e12c5d1SDavid du Colombier.P1
8763e12c5d1SDavid du Colombier, x/.*\en/ g/Peter/ p
8773e12c5d1SDavid du Colombier.P2
8783e12c5d1SDavid du ColombierThe
8793e12c5d1SDavid du Colombier.CW x
8803e12c5d1SDavid du Colombierbreaks the file into lines, the
8813e12c5d1SDavid du Colombier.CW g
8823e12c5d1SDavid du Colombierselects those lines containing
8833e12c5d1SDavid du Colombier.CW Peter ,
8843e12c5d1SDavid du Colombierand the
8853e12c5d1SDavid du Colombier.CW p
8863e12c5d1SDavid du Colombierprints them.
8873e12c5d1SDavid du ColombierThis command gives an address for the
8883e12c5d1SDavid du Colombier.CW x
8893e12c5d1SDavid du Colombiercommand (the whole file), but because
8903e12c5d1SDavid du Colombier.CW g
8913e12c5d1SDavid du Colombierdoes not have an explicit address, it applies to the value of
8923e12c5d1SDavid du Colombierdot produced by the
8933e12c5d1SDavid du Colombier.CW x
8943e12c5d1SDavid du Colombiercommand, that is, to each line.
8953e12c5d1SDavid du ColombierAll commands in
8963e12c5d1SDavid du Colombier.CW sam
8973e12c5d1SDavid du Colombierexcept for the command to write a file to disc use dot for the
8983e12c5d1SDavid du Colombierdefault address.
8993e12c5d1SDavid du Colombier.PP
9003e12c5d1SDavid du ColombierComposition may be continued indefinitely.
9013e12c5d1SDavid du Colombier.P1
9023e12c5d1SDavid du Colombier, x/.*\en/ g/Peter/ v/SaltPeter/ p
9033e12c5d1SDavid du Colombier.P2
9043e12c5d1SDavid du Colombierprints those lines containing
9053e12c5d1SDavid du Colombier.CW Peter
9063e12c5d1SDavid du Colombierbut
9073e12c5d1SDavid du Colombier.I not
9083e12c5d1SDavid du Colombierthose containing
9093e12c5d1SDavid du Colombier.CW SaltPeter .
9103e12c5d1SDavid du Colombier.SH 2
9113e12c5d1SDavid du ColombierStructural Regular Expressions
9123e12c5d1SDavid du Colombier.LP
913219b2ee8SDavid du ColombierUnlike other UNIX text editors,
9143e12c5d1SDavid du Colombierincluding the non-interactive ones such as
9153e12c5d1SDavid du Colombier.CW sed
9163e12c5d1SDavid du Colombierand
9173e12c5d1SDavid du Colombier.CW awk ,\u\s-4\&7\s+4\d
9183e12c5d1SDavid du Colombier.CW sam
9193e12c5d1SDavid du Colombieris good for manipulating files with multi-line `records.'
9203e12c5d1SDavid du ColombierAn example is an on-line phone book composed of records,
9213e12c5d1SDavid du Colombierseparated by blank lines, of the form
9223e12c5d1SDavid du Colombier.P1
9233e12c5d1SDavid du ColombierHerbert Tic
9243e12c5d1SDavid du Colombier44 Turnip Ave., Endive, NJ
9253e12c5d1SDavid du Colombier201-5555642
9263e12c5d1SDavid du Colombier
9273e12c5d1SDavid du ColombierNorbert Twinge
9283e12c5d1SDavid du Colombier16 Potato St., Cabbagetown, NJ
9293e12c5d1SDavid du Colombier201-5553145
9303e12c5d1SDavid du Colombier
9313e12c5d1SDavid du Colombier\&...
9323e12c5d1SDavid du Colombier.P2
9333e12c5d1SDavid du ColombierThe format may be encoded as a regular expression:
9343e12c5d1SDavid du Colombier.P1
9353e12c5d1SDavid du Colombier(.+\en)+
9363e12c5d1SDavid du Colombier.P2
9373e12c5d1SDavid du Colombierthat is, a sequence of one or more non-blank lines.
9383e12c5d1SDavid du ColombierThe command to print Mr. Tic's entire record is then
9393e12c5d1SDavid du Colombier.P1
9403e12c5d1SDavid du Colombier, x/(.+\en)+/ g/^Herbert Tic$/ p
9413e12c5d1SDavid du Colombier.P2
9423e12c5d1SDavid du Colombierand that to extract just the phone number is
9433e12c5d1SDavid du Colombier.P1
9443e12c5d1SDavid du Colombier, x/(.+\en)+/ g/^Herbert Tic$/ x/^[0-9]*-[0-9]*\en/ p
9453e12c5d1SDavid du Colombier.P2
9463e12c5d1SDavid du ColombierThe latter command breaks the file into records,
9473e12c5d1SDavid du Colombierchooses Mr. Tic's record,
9483e12c5d1SDavid du Colombierextracts the phone number from the record,
9493e12c5d1SDavid du Colombierand finally prints the number.
9503e12c5d1SDavid du Colombier.PP
9513e12c5d1SDavid du ColombierA more involved problem is that of
9523e12c5d1SDavid du Colombierrenaming a particular variable, say
9533e12c5d1SDavid du Colombier.CW n ,
9543e12c5d1SDavid du Colombierto
9553e12c5d1SDavid du Colombier.CW num
9563e12c5d1SDavid du Colombierin a C program.
9573e12c5d1SDavid du ColombierThe obvious first attempt,
9583e12c5d1SDavid du Colombier.P1
9593e12c5d1SDavid du Colombier, x/n/ c/num/
9603e12c5d1SDavid du Colombier.P2
9613e12c5d1SDavid du Colombieris badly flawed: it changes not only the variable
9623e12c5d1SDavid du Colombier.CW n
9633e12c5d1SDavid du Colombierbut any letter
9643e12c5d1SDavid du Colombier.CW n
9653e12c5d1SDavid du Colombierthat appears.
9663e12c5d1SDavid du ColombierWe need to extract all the variables, and select those that match
9673e12c5d1SDavid du Colombier.CW n
9683e12c5d1SDavid du Colombierand only
9693e12c5d1SDavid du Colombier.CW n :
9703e12c5d1SDavid du Colombier.P1
9713e12c5d1SDavid du Colombier, x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/
9723e12c5d1SDavid du Colombier.P2
9733e12c5d1SDavid du ColombierThe pattern
9743e12c5d1SDavid du Colombier.CW [A-Za-z_][A-Za-z_0-9]*
9753e12c5d1SDavid du Colombiermatches C identifiers.
9763e12c5d1SDavid du ColombierNext
9773e12c5d1SDavid du Colombier.CW g/n/
9783e12c5d1SDavid du Colombierselects those containing an
9793e12c5d1SDavid du Colombier.CW n .
9803e12c5d1SDavid du ColombierThen
9813e12c5d1SDavid du Colombier.CW v/../
9823e12c5d1SDavid du Colombierrejects those containing two (or more) characters, and finally
9833e12c5d1SDavid du Colombier.CW c/num/
9843e12c5d1SDavid du Colombierchanges the remainder (identifiers
9853e12c5d1SDavid du Colombier.CW n )
9863e12c5d1SDavid du Colombierto
9873e12c5d1SDavid du Colombier.CW num .
9883e12c5d1SDavid du ColombierThis version clearly works much better, but there may still be problems.
9893e12c5d1SDavid du ColombierFor example, in C character and string constants, the sequence
9903e12c5d1SDavid du Colombier.CW \en
9913e12c5d1SDavid du Colombieris interpreted as a newline character, and we don't want to change it to
9923e12c5d1SDavid du Colombier.CW \enum.
9933e12c5d1SDavid du ColombierThis problem can be forestalled with a
9943e12c5d1SDavid du Colombier.CW y
9953e12c5d1SDavid du Colombiercommand:
9963e12c5d1SDavid du Colombier.P1
9973e12c5d1SDavid du Colombier, y/\e\en/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/
9983e12c5d1SDavid du Colombier.P2
9993e12c5d1SDavid du Colombier(the second
10003e12c5d1SDavid du Colombier.CW \e
10013e12c5d1SDavid du Colombieris necessary because of lexical conventions in regular expressions),
10023e12c5d1SDavid du Colombieror we could even reject character constants and strings outright:
1003219b2ee8SDavid du Colombier.P1 0
10043e12c5d1SDavid du Colombier,y/'[^']*'/ y/"[^"]*"/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/
10053e12c5d1SDavid du Colombier.P2
10063e12c5d1SDavid du ColombierThe
10073e12c5d1SDavid du Colombier.CW y
10083e12c5d1SDavid du Colombiercommands in this version exclude from consideration all character constants
10093e12c5d1SDavid du Colombierand strings.
10103e12c5d1SDavid du ColombierThe only remaining problem is to deal with the possible occurrence of
10113e12c5d1SDavid du Colombier.CW \e'
10123e12c5d1SDavid du Colombieror
10133e12c5d1SDavid du Colombier.CW \e"
10143e12c5d1SDavid du Colombierwithin these sequences, but it's easy to see how to resolve this difficulty.
10153e12c5d1SDavid du Colombier.PP
10163e12c5d1SDavid du ColombierThe point of these composed commands is successive refinement.
10173e12c5d1SDavid du ColombierA simple version of the command is tried, and if it's not good enough,
10183e12c5d1SDavid du Colombierit can be honed by adding a clause or two.
10193e12c5d1SDavid du Colombier(Mistakes can be undone; see below.
10203e12c5d1SDavid du ColombierAlso, the mouse language makes it unnecessary to retype the command each time.)
10213e12c5d1SDavid du ColombierThe resulting chains of commands are somewhat reminiscent of
10223e12c5d1SDavid du Colombiershell pipelines.\u\s-4\&7\s+4\d
10233e12c5d1SDavid du ColombierUnlike pipelines, though, which pass along modified
10243e12c5d1SDavid du Colombier.I data ,
10253e12c5d1SDavid du Colombier.CW sam
10263e12c5d1SDavid du Colombiercommands pass a
10273e12c5d1SDavid du Colombier.I view
10283e12c5d1SDavid du Colombierof the data.
10293e12c5d1SDavid du ColombierThe text at each step of the command is the same, but which pieces
10303e12c5d1SDavid du Colombierare selected is refined step by step until the correct piece is
10313e12c5d1SDavid du Colombieravailable to the final step of the command line, which ultimately makes the change.
10323e12c5d1SDavid du Colombier.PP
1033219b2ee8SDavid du ColombierIn other UNIX programs, regular expressions are used only for selection,
10343e12c5d1SDavid du Colombieras in the
10353e12c5d1SDavid du Colombier.CW sam
10363e12c5d1SDavid du Colombier.CW g
10373e12c5d1SDavid du Colombiercommand, never for extraction as in the
10383e12c5d1SDavid du Colombier.CW x
10393e12c5d1SDavid du Colombieror
10403e12c5d1SDavid du Colombier.CW y
10413e12c5d1SDavid du Colombiercommand.
10423e12c5d1SDavid du ColombierFor example, patterns in
10433e12c5d1SDavid du Colombier.CW awk \u\s-4\&7\s+4\d
10443e12c5d1SDavid du Colombierare used to select lines to be operated on, but cannot be used
10453e12c5d1SDavid du Colombierto describe the format of the input text, or to handle newline-free text.
10463e12c5d1SDavid du ColombierThe use of regular expressions to describe the structure of a piece
10473e12c5d1SDavid du Colombierof text rather than its contents, as in the
10483e12c5d1SDavid du Colombier.CW x
10493e12c5d1SDavid du Colombiercommand,
10503e12c5d1SDavid du Colombierhas been given a name:
10513e12c5d1SDavid du Colombier.I
10523e12c5d1SDavid du Colombierstructural regular expressions.
10533e12c5d1SDavid du Colombier.R
10543e12c5d1SDavid du ColombierWhen they are composed, as in the above example,
10553e12c5d1SDavid du Colombierthey are pleasantly expressive.
10563e12c5d1SDavid du ColombierTheir use is discussed at greater length elsewhere.\u\s-4\&10\s+4\d
10573e12c5d1SDavid du Colombier.PP
10583e12c5d1SDavid du Colombier.SH 2
10593e12c5d1SDavid du ColombierMultiple files
10603e12c5d1SDavid du Colombier.LP
10613e12c5d1SDavid du Colombier.CW Sam
10623e12c5d1SDavid du Colombierhas a few other commands, mostly relating to input and output.
10633e12c5d1SDavid du Colombier.P1
10643e12c5d1SDavid du Colombiere discfilename
10653e12c5d1SDavid du Colombier.P2
10663e12c5d1SDavid du Colombierreplaces the contents and name of the current file with those of the named
10673e12c5d1SDavid du Colombierdisc file;
10683e12c5d1SDavid du Colombier.P1
10693e12c5d1SDavid du Colombierw discfilename
10703e12c5d1SDavid du Colombier.P2
10713e12c5d1SDavid du Colombierwrites the contents to the named disc file; and
10723e12c5d1SDavid du Colombier.P1
10733e12c5d1SDavid du Colombierr discfilename
10743e12c5d1SDavid du Colombier.P2
10753e12c5d1SDavid du Colombierreplaces dot with the contents of the named disc file.
10763e12c5d1SDavid du ColombierAll these commands use the current file's name if none is specified.
10773e12c5d1SDavid du ColombierFinally,
10783e12c5d1SDavid du Colombier.P1
10793e12c5d1SDavid du Colombierf discfilename
10803e12c5d1SDavid du Colombier.P2
10813e12c5d1SDavid du Colombierchanges the name associated with the file and displays the result:
10823e12c5d1SDavid du Colombier.P1
10833e12c5d1SDavid du Colombier\&'-. discfilename
10843e12c5d1SDavid du Colombier.P2
10853e12c5d1SDavid du ColombierThis output is called the file's
10863e12c5d1SDavid du Colombier.I
10873e12c5d1SDavid du Colombiermenu line,
10883e12c5d1SDavid du Colombier.R
10893e12c5d1SDavid du Colombierbecause it is the contents of the file's line in the button 3 menu (described
10903e12c5d1SDavid du Colombierin the
10913e12c5d1SDavid du Colombiernext section).
10923e12c5d1SDavid du ColombierThe first three characters are a concise notation for the state of the file.
10933e12c5d1SDavid du ColombierThe apostrophe signifies that the file is modified.
10943e12c5d1SDavid du ColombierThe minus sign indicates the number of windows
10953e12c5d1SDavid du Colombieropen on the file (see the next section):
10963e12c5d1SDavid du Colombier.CW -
10973e12c5d1SDavid du Colombiermeans none,
10983e12c5d1SDavid du Colombier.CW +
10993e12c5d1SDavid du Colombiermeans one, and
11003e12c5d1SDavid du Colombier.CW *
11013e12c5d1SDavid du Colombiermeans more than one.
11023e12c5d1SDavid du ColombierFinally, the period indicates that this is the current file.
11033e12c5d1SDavid du ColombierThese characters are useful for controlling the
11043e12c5d1SDavid du Colombier.CW X
11053e12c5d1SDavid du Colombiercommand, described shortly.
11063e12c5d1SDavid du Colombier.PP
11073e12c5d1SDavid du Colombier.CW Sam
11083e12c5d1SDavid du Colombiermay be started with a set of disc files (such as all the source for
11093e12c5d1SDavid du Colombiera program) by invoking it with a list of file names as arguments, and
11103e12c5d1SDavid du Colombiermore may be added or deleted on demand.
11113e12c5d1SDavid du Colombier.P1
11123e12c5d1SDavid du ColombierB discfile1 discfile2 ...
11133e12c5d1SDavid du Colombier.P2
11143e12c5d1SDavid du Colombieradds the named files to
11153e12c5d1SDavid du Colombier.CW sam 's
11163e12c5d1SDavid du Colombierlist, and
11173e12c5d1SDavid du Colombier.P1
11183e12c5d1SDavid du ColombierD discfile1 discfile2 ...
11193e12c5d1SDavid du Colombier.P2
11203e12c5d1SDavid du Colombierremoves them from
11213e12c5d1SDavid du Colombier.CW sam 's
11223e12c5d1SDavid du Colombiermemory (without effect on associated disc files).
11233e12c5d1SDavid du ColombierBoth these commands have a syntax for using the shell\u\s-4\&7\s+4\d
1124219b2ee8SDavid du Colombier(the UNIX command interpreter) to generate the lists:
11253e12c5d1SDavid du Colombier.P1
11263e12c5d1SDavid du ColombierB <echo *.c
11273e12c5d1SDavid du Colombier.P2
11283e12c5d1SDavid du Colombierwill add all C source files, and
11293e12c5d1SDavid du Colombier.P1
11303e12c5d1SDavid du ColombierB <grep -l variable *.c
11313e12c5d1SDavid du Colombier.P2
11323e12c5d1SDavid du Colombierwill add all C source files referencing a particular variable
1133219b2ee8SDavid du Colombier(the UNIX command
11343e12c5d1SDavid du Colombier.CW grep\ -l
11353e12c5d1SDavid du Colombierlists all files in its arguments that contain matches of
11363e12c5d1SDavid du Colombierthe specified regular expression).
11373e12c5d1SDavid du ColombierFinally,
11383e12c5d1SDavid du Colombier.CW D
11393e12c5d1SDavid du Colombierwithout arguments deletes the current file.
11403e12c5d1SDavid du Colombier.PP
11413e12c5d1SDavid du ColombierThere are two ways to change which file is current:
11423e12c5d1SDavid du Colombier.P1
11433e12c5d1SDavid du Colombierb filename
11443e12c5d1SDavid du Colombier.P2
11453e12c5d1SDavid du Colombiermakes the named file current.
11463e12c5d1SDavid du ColombierThe
11473e12c5d1SDavid du Colombier.CW B
11483e12c5d1SDavid du Colombiercommand
11493e12c5d1SDavid du Colombierdoes the same, but also adds any new files to
11503e12c5d1SDavid du Colombier.CW sam 's
11513e12c5d1SDavid du Colombierlist.
11523e12c5d1SDavid du Colombier(In practice, of course, the current file
11533e12c5d1SDavid du Colombieris usually chosen by mouse actions, not by textual commands.)
11543e12c5d1SDavid du ColombierThe other way is to use a form of address that refers to files:
11553e12c5d1SDavid du Colombier.P1
11563e12c5d1SDavid du Colombier"\f2expression\fP" \f2address\fP
11573e12c5d1SDavid du Colombier.P2
11583e12c5d1SDavid du Colombierrefers to the address evaluated in the file whose menu line
11593e12c5d1SDavid du Colombiermatches the expression (there must be exactly one match).
11603e12c5d1SDavid du ColombierFor example,
11613e12c5d1SDavid du Colombier.P1
11623e12c5d1SDavid du Colombier"peter.c" 3
11633e12c5d1SDavid du Colombier.P2
11643e12c5d1SDavid du Colombierrefers to the third line of the file whose name matches
11653e12c5d1SDavid du Colombier.CW peter.c .
11663e12c5d1SDavid du ColombierThis is most useful in the move
11673e12c5d1SDavid du Colombier.CW m ) (
11683e12c5d1SDavid du Colombierand copy
11693e12c5d1SDavid du Colombier.CW t ) (
11703e12c5d1SDavid du Colombiercommands:
11713e12c5d1SDavid du Colombier.P1
11723e12c5d1SDavid du Colombier0,$ t "peter.c" 0
11733e12c5d1SDavid du Colombier.P2
11743e12c5d1SDavid du Colombiermakes a copy of the current file at the beginning of
11753e12c5d1SDavid du Colombier.CW peter.c .
11763e12c5d1SDavid du Colombier.PP
11773e12c5d1SDavid du ColombierThe
11783e12c5d1SDavid du Colombier.CW X
11793e12c5d1SDavid du Colombiercommand
11803e12c5d1SDavid du Colombieris a looping construct, like
11813e12c5d1SDavid du Colombier.CW x ,
11823e12c5d1SDavid du Colombierthat refers to files instead of strings:
11833e12c5d1SDavid du Colombier.P1
11843e12c5d1SDavid du ColombierX/\f2expression\fP/ \f2command\fP
11853e12c5d1SDavid du Colombier.P2
11863e12c5d1SDavid du Colombierruns the command in all
11873e12c5d1SDavid du Colombierfiles whose menu lines match the expression.  The best example is
11883e12c5d1SDavid du Colombier.P1
11893e12c5d1SDavid du ColombierX/'/ w
11903e12c5d1SDavid du Colombier.P2
11913e12c5d1SDavid du Colombierwhich writes to disc all modified files.
11923e12c5d1SDavid du Colombier.CW Y
11933e12c5d1SDavid du Colombieris the complement of
11943e12c5d1SDavid du Colombier.CW X :
11953e12c5d1SDavid du Colombierit runs the command on all files whose menu lines don't match the expression:
11963e12c5d1SDavid du Colombier.P1
11973e12c5d1SDavid du ColombierY/\e.c/ D
11983e12c5d1SDavid du Colombier.P2
11993e12c5d1SDavid du Colombierdeletes all files that don't have
12003e12c5d1SDavid du Colombier.CW \&.c
12013e12c5d1SDavid du Colombierin their names, that is, it keeps all C source files and deletes the rest.
12023e12c5d1SDavid du Colombier.PP
12033e12c5d1SDavid du ColombierBraces allow commands to be grouped, so
12043e12c5d1SDavid du Colombier.P1
12053e12c5d1SDavid du Colombier{
12063e12c5d1SDavid du Colombier	\f2command1\fP
12073e12c5d1SDavid du Colombier	\f2command2\fP
12083e12c5d1SDavid du Colombier}
12093e12c5d1SDavid du Colombier.P2
12103e12c5d1SDavid du Colombieris syntactically a single command that runs two commands.
12113e12c5d1SDavid du ColombierThus,
12123e12c5d1SDavid du Colombier.P1
12133e12c5d1SDavid du ColombierX/\e.c/ ,g/variable/ {
12143e12c5d1SDavid du Colombier	f
12153e12c5d1SDavid du Colombier	, x/.*\en/ g/variable/ p
12163e12c5d1SDavid du Colombier}
12173e12c5d1SDavid du Colombier.P2
12183e12c5d1SDavid du Colombierfinds all occurrences of
12193e12c5d1SDavid du Colombier.CW variable
12203e12c5d1SDavid du Colombierin C source files, and prints
12213e12c5d1SDavid du Colombierout the file names and lines of each match.
12223e12c5d1SDavid du ColombierThe precise semantics of compound operations is discussed in the implementation
12233e12c5d1SDavid du Colombiersections below.
12243e12c5d1SDavid du Colombier.PP
12253e12c5d1SDavid du ColombierFinally,
12263e12c5d1SDavid du Colombierthe undo command,
12273e12c5d1SDavid du Colombier.CW u ,
12283e12c5d1SDavid du Colombierundoes the last command,
12293e12c5d1SDavid du Colombierno matter how many files were affected.
12303e12c5d1SDavid du ColombierMultiple undo operations move further back in time, so
12313e12c5d1SDavid du Colombier.P1
12323e12c5d1SDavid du Colombieru
12333e12c5d1SDavid du Colombieru
12343e12c5d1SDavid du Colombier.P2
12353e12c5d1SDavid du Colombier(which may be abbreviated
12363e12c5d1SDavid du Colombier.CW u2 )
12373e12c5d1SDavid du Colombierundoes the last two commands.  An undo may not be undone, however, nor
12383e12c5d1SDavid du Colombiermay any command that adds or deletes files.
12393e12c5d1SDavid du ColombierEverything else is undoable, though, including for example
12403e12c5d1SDavid du Colombier.CW e
12413e12c5d1SDavid du Colombiercommands:
12423e12c5d1SDavid du Colombier.P1
12433e12c5d1SDavid du Colombiere filename
12443e12c5d1SDavid du Colombieru
12453e12c5d1SDavid du Colombier.P2
12463e12c5d1SDavid du Colombierrestores the state of the file completely, including its name, dot,
12473e12c5d1SDavid du Colombierand modified bit.  Because of the undo, potentially dangerous commands
12483e12c5d1SDavid du Colombierare not guarded by confirmations.  Only
12493e12c5d1SDavid du Colombier.CW D ,
12503e12c5d1SDavid du Colombierwhich destroys the information necessary to restore itself, is protected.
12513e12c5d1SDavid du ColombierIt will not delete a modified file, but a second
12523e12c5d1SDavid du Colombier.CW D
12533e12c5d1SDavid du Colombierof the same file will succeed regardless.
12543e12c5d1SDavid du ColombierThe
12553e12c5d1SDavid du Colombier.CW q
12563e12c5d1SDavid du Colombiercommand, which exits
12573e12c5d1SDavid du Colombier.CW sam ,
12583e12c5d1SDavid du Colombieris similarly guarded.
12593e12c5d1SDavid du Colombier.SH 2
12603e12c5d1SDavid du ColombierMouse Interface
12613e12c5d1SDavid du Colombier.LP
12623e12c5d1SDavid du Colombier.CW Sam
12633e12c5d1SDavid du Colombieris most commonly run
12643e12c5d1SDavid du Colombierconnected to a bitmap display and mouse for interactive editing.
12653e12c5d1SDavid du ColombierThe only difference in the command language
12663e12c5d1SDavid du Colombierbetween regular, mouse-driven
12673e12c5d1SDavid du Colombier.CW sam
12683e12c5d1SDavid du Colombierand
12693e12c5d1SDavid du Colombier.CW sam\ -d
12703e12c5d1SDavid du Colombieris that if an address
12713e12c5d1SDavid du Colombieris provided without a command,
12723e12c5d1SDavid du Colombier.CW sam\ -d
12733e12c5d1SDavid du Colombierwill print the text referenced by the address, but
12743e12c5d1SDavid du Colombierregular
12753e12c5d1SDavid du Colombier.CW sam
12763e12c5d1SDavid du Colombierwill highlight it on the screen \(em in fact,
12773e12c5d1SDavid du Colombierdot is always highlighted (see Figure 2).
12783e12c5d1SDavid du Colombier.WS 1
12793e12c5d1SDavid du Colombier.KF
1280426d2b71SDavid du Colombier.XP fig3 2.04i
12813e12c5d1SDavid du Colombier.Cs
12823e12c5d1SDavid du ColombierFigure 2. A
12833e12c5d1SDavid du Colombier.CW sam
12843e12c5d1SDavid du Colombierwindow.  The scroll bar down the left
12853e12c5d1SDavid du Colombierrepresents the file, with the bubble showing the fraction
12863e12c5d1SDavid du Colombiervisible in the window.
12873e12c5d1SDavid du ColombierThe scroll bar may be manipulated by the mouse for convenient browsing.
12883e12c5d1SDavid du ColombierThe current text,
12893e12c5d1SDavid du Colombierwhich is highlighted, need not fit on a line.  Here it consists of one partial
12903e12c5d1SDavid du Colombierline, one complete line, and final partial line.
12913e12c5d1SDavid du Colombier.Ce
12923e12c5d1SDavid du Colombier.KE
12933e12c5d1SDavid du Colombier.PP
12943e12c5d1SDavid du ColombierEach file may have zero or more windows open on the display.
12953e12c5d1SDavid du ColombierAt any time, only one window in all of
12963e12c5d1SDavid du Colombier.CW sam
12973e12c5d1SDavid du Colombieris the
12983e12c5d1SDavid du Colombier.I
12993e12c5d1SDavid du Colombiercurrent window,
13003e12c5d1SDavid du Colombier.R
13013e12c5d1SDavid du Colombierthat is, the window to which typing and mouse actions refer;
13023e12c5d1SDavid du Colombierthis may be the
13033e12c5d1SDavid du Colombier.CW sam
13043e12c5d1SDavid du Colombierwindow (that in which commands may be typed)
13053e12c5d1SDavid du Colombieror one of the file windows.
13063e12c5d1SDavid du ColombierWhen a file has multiple windows, the image of the file in each window
13073e12c5d1SDavid du Colombieris always kept up to date.
13083e12c5d1SDavid du ColombierThe current file is the last file affected by a command,
13093e12c5d1SDavid du Colombierso if the
13103e12c5d1SDavid du Colombier.CW sam
13113e12c5d1SDavid du Colombierwindow is current,
13123e12c5d1SDavid du Colombierthe current window is not a window on the current file.
13133e12c5d1SDavid du ColombierHowever, each window on a file has its own value of dot,
13143e12c5d1SDavid du Colombierand when switching between windows on a single file,
13153e12c5d1SDavid du Colombierthe file's value of dot is changed to that of the window.
13163e12c5d1SDavid du ColombierThus, flipping between windows behaves in the obvious, convenient way.
13173e12c5d1SDavid du Colombier.PP
13183e12c5d1SDavid du ColombierThe mouse on the Blit has three buttons, numbered left to right.
13193e12c5d1SDavid du ColombierButton 3 has a list of commands to manipulate windows,
13203e12c5d1SDavid du Colombierfollowed by a list of `menu lines' exactly as printed by the
13213e12c5d1SDavid du Colombier.CW f
13223e12c5d1SDavid du Colombiercommand, one per file (not one per window).
13233e12c5d1SDavid du ColombierThese menu lines are sorted by file name.
13243e12c5d1SDavid du ColombierIf the list is long, the Blit menu software will make it more manageable
13253e12c5d1SDavid du Colombierby generating a scrolling menu instead of an unwieldy long list.
13263e12c5d1SDavid du ColombierUsing the menu to select a file from the list makes that file the current
13273e12c5d1SDavid du Colombierfile, and the most recently current window in that file the current window.
13283e12c5d1SDavid du ColombierBut if that file is already current, selecting it in the menu cycles through
13293e12c5d1SDavid du Colombierthe windows on the file; this simple trick avoids a special menu to
13303e12c5d1SDavid du Colombierchoose windows on a file.
13313e12c5d1SDavid du ColombierIf there is no window open on the file,
13323e12c5d1SDavid du Colombier.CW sam
13333e12c5d1SDavid du Colombierchanges the mouse cursor to prompt the user to create one.
13343e12c5d1SDavid du Colombier.PP
13353e12c5d1SDavid du ColombierThe commands on the button 3 menu are straightforward (see Figure 3), and
13363e12c5d1SDavid du Colombierare like the commands to manipulate windows in
13373e12c5d1SDavid du Colombier.CW mux ,\u\s-4\&8\s+4\d
13383e12c5d1SDavid du Colombierthe Blit's window system.
13393e12c5d1SDavid du Colombier.CW New
13403e12c5d1SDavid du Colombiermakes a new file, and gives it one empty window, whose size is determined
13413e12c5d1SDavid du Colombierby a rectangle swept by the mouse.
1342219b2ee8SDavid du Colombier.CW Zerox
13433e12c5d1SDavid du Colombierprompts for a window to be selected, and
13443e12c5d1SDavid du Colombiermakes a clone of that window; this is how multiple windows are created on one file.
13453e12c5d1SDavid du Colombier.CW Reshape
13463e12c5d1SDavid du Colombierchanges the size of the indicated window, and
13473e12c5d1SDavid du Colombier.CW close
13483e12c5d1SDavid du Colombierdeletes it.  If that is the last window open on the file,
13493e12c5d1SDavid du Colombier.CW close
13503e12c5d1SDavid du Colombierfirst does a
13513e12c5d1SDavid du Colombier.CW D
13523e12c5d1SDavid du Colombiercommand on the file.
13533e12c5d1SDavid du Colombier.CW Write
13543e12c5d1SDavid du Colombieris identical to a
13553e12c5d1SDavid du Colombier.CW w
13563e12c5d1SDavid du Colombiercommand on the file; it is in the menu purely for convenience.
13573e12c5d1SDavid du ColombierFinally,
13583e12c5d1SDavid du Colombier.CW ~~sam~~
13593e12c5d1SDavid du Colombieris a menu item that appears between the commands and the file names.
13603e12c5d1SDavid du ColombierSelecting it makes the
13613e12c5d1SDavid du Colombier.CW sam
13623e12c5d1SDavid du Colombierwindow the current window,
13633e12c5d1SDavid du Colombiercausing subsequent typing to be interpreted as commands.
13643e12c5d1SDavid du Colombier.KF
1365426d2b71SDavid du Colombier.XP fig2 2.74i
13663e12c5d1SDavid du Colombier.Cs
13673e12c5d1SDavid du ColombierFigure 3. The menu on button 3.
13683e12c5d1SDavid du ColombierThe black rectangle on the left is a scroll bar; the menu is limited to
13693e12c5d1SDavid du Colombierthe length shown to prevent its becoming unwieldy.
13703e12c5d1SDavid du ColombierAbove the
13713e12c5d1SDavid du Colombier.CW ~~sam~~
13723e12c5d1SDavid du Colombierline is a list of commands;
13733e12c5d1SDavid du Colombierbeneath it is a list of files, presented exactly as with the
13743e12c5d1SDavid du Colombier.CW f
13753e12c5d1SDavid du Colombiercommand.
13763e12c5d1SDavid du Colombier.Ce
13773e12c5d1SDavid du Colombier.KE
13783e12c5d1SDavid du Colombier.PP
13793e12c5d1SDavid du ColombierWhen
13803e12c5d1SDavid du Colombier.CW sam
13813e12c5d1SDavid du Colombierrequests that a window be swept, in response to
13823e12c5d1SDavid du Colombier.CW new ,
1383219b2ee8SDavid du Colombier.CW zerox
13843e12c5d1SDavid du Colombieror
13853e12c5d1SDavid du Colombier.CW reshape ,
13863e12c5d1SDavid du Colombierit changes the mouse cursor from the usual arrow to a box with
13873e12c5d1SDavid du Colombiera small arrow.
13883e12c5d1SDavid du ColombierIn this state, the mouse may be used to indicate an arbitrary rectangle by
13893e12c5d1SDavid du Colombierpressing button 3 at one corner and releasing it at the opposite corner.
13903e12c5d1SDavid du ColombierMore conveniently,
13913e12c5d1SDavid du Colombierbutton 3 may simply be clicked,
13923e12c5d1SDavid du Colombierwhereupon
13933e12c5d1SDavid du Colombier.CW sam
13943e12c5d1SDavid du Colombiercreates the maximal rectangle that contains the cursor
13953e12c5d1SDavid du Colombierand abuts the
13963e12c5d1SDavid du Colombier.CW sam
13973e12c5d1SDavid du Colombierwindow.
13983e12c5d1SDavid du ColombierBy placing the
13993e12c5d1SDavid du Colombier.CW sam
14003e12c5d1SDavid du Colombierwindow in the middle of the screen, the user can define two regions (one above,
14013e12c5d1SDavid du Colombierone below) in which stacked fully-overlapping
14023e12c5d1SDavid du Colombierwindows can be created with minimal fuss (see Figure 1).
14033e12c5d1SDavid du ColombierThis simple user interface trick makes window creation noticeably easier.
14043e12c5d1SDavid du Colombier.PP
14053e12c5d1SDavid du ColombierThe cut-and-paste editor is essentially the same as that in Smalltalk-80.\u\s-4\&11\s+4\d
14063e12c5d1SDavid du ColombierThe text in dot is always highlighted on the screen.
14073e12c5d1SDavid du ColombierWhen a character is typed it replaces dot, and sets dot to the null
14083e12c5d1SDavid du Colombierstring after the character.  Thus, ordinary typing inserts text.
14093e12c5d1SDavid du ColombierButton 1 is used for selection:
14103e12c5d1SDavid du Colombierpressing the button, moving the mouse, and lifting the button
14113e12c5d1SDavid du Colombierselects (sets dot to) the text between the points where the
14123e12c5d1SDavid du Colombierbutton was pressed and released.
14133e12c5d1SDavid du ColombierPressing and releasing at the same point selects a null string; this
14143e12c5d1SDavid du Colombieris called clicking.  Clicking twice quickly, or
14153e12c5d1SDavid du Colombier.I
14163e12c5d1SDavid du Colombierdouble clicking,
14173e12c5d1SDavid du Colombier.R
14183e12c5d1SDavid du Colombierselects larger objects;
14193e12c5d1SDavid du Colombierfor example, double clicking in a word selects the word,
14203e12c5d1SDavid du Colombierdouble clicking just inside an opening bracket selects the text
14213e12c5d1SDavid du Colombiercontained in the brackets (handling nested brackets correctly),
14223e12c5d1SDavid du Colombierand similarly for
14233e12c5d1SDavid du Colombierparentheses, quotes, and so on.
14243e12c5d1SDavid du ColombierThe double-clicking rules reflect a bias toward
14253e12c5d1SDavid du Colombierprogrammers.
14263e12c5d1SDavid du ColombierIf
14273e12c5d1SDavid du Colombier.CW sam
14283e12c5d1SDavid du Colombierwere intended more for word processing, double-clicks would probably
14293e12c5d1SDavid du Colombierselect linguistic structures such as sentences.
14303e12c5d1SDavid du Colombier.PP
14313e12c5d1SDavid du ColombierIf button 1 is pressed outside the current window, it makes the indicated
14323e12c5d1SDavid du Colombierwindow current.
14333e12c5d1SDavid du ColombierThis is the easiest way to switch between windows and files.
14343e12c5d1SDavid du Colombier.PP
14353e12c5d1SDavid du ColombierPressing button 2 brings up a menu of editing functions (see Figure 4).
14363e12c5d1SDavid du ColombierThese mostly apply to the selected text:
14373e12c5d1SDavid du Colombier.CW cut
14383e12c5d1SDavid du Colombierdeletes the selected text, and remembers it in a hidden buffer called the
14393e12c5d1SDavid du Colombier.I
14403e12c5d1SDavid du Colombiersnarf buffer,
14413e12c5d1SDavid du Colombier.R
14423e12c5d1SDavid du Colombier.CW paste
14433e12c5d1SDavid du Colombierreplaces the selected text by the contents of the snarf buffer,
14443e12c5d1SDavid du Colombier.CW snarf
14453e12c5d1SDavid du Colombierjust copies the selected text to the snarf buffer,
14463e12c5d1SDavid du Colombier.CW look
14473e12c5d1SDavid du Colombiersearches forward for the next literal occurrence of the selected text, and
14483e12c5d1SDavid du Colombier.CW <mux>
14493e12c5d1SDavid du Colombierexchanges snarf buffers with the window system in which
14503e12c5d1SDavid du Colombier.CW sam
14513e12c5d1SDavid du Colombieris running.
14523e12c5d1SDavid du ColombierFinally, the last regular expression used appears as a menu entry
14533e12c5d1SDavid du Colombierto search
14543e12c5d1SDavid du Colombierforward for the next occurrence of a match for the expression.
14553e12c5d1SDavid du Colombier.WS 1
14563e12c5d1SDavid du Colombier.KF
1457426d2b71SDavid du Colombier.XP fig4 1.20i
14583e12c5d1SDavid du Colombier.Cs
14593e12c5d1SDavid du ColombierFigure 4. The menu on button 2.
14603e12c5d1SDavid du ColombierThe bottom entry tracks the most recently used regular expression, which may
14613e12c5d1SDavid du Colombierbe literal text.
14623e12c5d1SDavid du Colombier.Ce
14633e12c5d1SDavid du Colombier.KE
14643e12c5d1SDavid du Colombier.PP
14653e12c5d1SDavid du ColombierThe relationship between the command language and the mouse language is
14663e12c5d1SDavid du Colombierentirely due to the equality of dot and the selected text chosen
14673e12c5d1SDavid du Colombierwith button 1 on the mouse.
14683e12c5d1SDavid du ColombierFor example, to make a set of changes in a C subroutine, dot can be
14693e12c5d1SDavid du Colombierset by double clicking on the left brace that begins the subroutine,
14703e12c5d1SDavid du Colombierwhich sets dot for the command language.
14713e12c5d1SDavid du ColombierAn address-free command then typed in the
14723e12c5d1SDavid du Colombier.CW sam
14733e12c5d1SDavid du Colombierwindow will apply only to the text between the opening and closing
14743e12c5d1SDavid du Colombierbraces of the function.
14753e12c5d1SDavid du ColombierThe idea is to select what you want, and then say what you want
14763e12c5d1SDavid du Colombierto do with it, whether invoked by a menu selection or by a typed command.
14773e12c5d1SDavid du ColombierAnd of course, the value of dot is highlighted on
14783e12c5d1SDavid du Colombierthe display after the command completes.
14793e12c5d1SDavid du ColombierThis relationship between mouse interface and command language
14803e12c5d1SDavid du Colombieris clumsy to explain, but comfortable, even natural, in practice.
14813e12c5d1SDavid du Colombier.SH
14823e12c5d1SDavid du ColombierThe Implementation
14833e12c5d1SDavid du Colombier.LP
14843e12c5d1SDavid du ColombierThe next few sections describe how
14853e12c5d1SDavid du Colombier.CW sam
14863e12c5d1SDavid du Colombieris put together, first the host part,
14873e12c5d1SDavid du Colombierthen the inter-component communication,
14883e12c5d1SDavid du Colombierthen the terminal part.
14893e12c5d1SDavid du ColombierAfter explaining how the command language is implemented,
14903e12c5d1SDavid du Colombierthe discussion follows (roughly) the path of a character
14913e12c5d1SDavid du Colombierfrom the temporary file on disc to the screen.
14923e12c5d1SDavid du ColombierThe presentation centers on the data structures,
14933e12c5d1SDavid du Colombierbecause that is how the program was designed and because
14943e12c5d1SDavid du Colombierthe algorithms are easy to provide, given the right data
14953e12c5d1SDavid du Colombierstructures.
14963e12c5d1SDavid du Colombier.SH 2
14973e12c5d1SDavid du ColombierParsing and execution
14983e12c5d1SDavid du Colombier.LP
14993e12c5d1SDavid du ColombierThe command language is interpreted by parsing each command with a
15003e12c5d1SDavid du Colombiertable-driven recursive
15013e12c5d1SDavid du Colombierdescent parser, and when a complete command is assembled, invoking a top-down
15023e12c5d1SDavid du Colombierexecutor.
15033e12c5d1SDavid du ColombierMost editors instead employ a simple character-at-a-time
15043e12c5d1SDavid du Colombierlexical scanner.
15053e12c5d1SDavid du ColombierUse of a parser makes it
15063e12c5d1SDavid du Colombiereasy and unambiguous to detect when a command is complete,
15073e12c5d1SDavid du Colombierwhich has two advantages.
15083e12c5d1SDavid du ColombierFirst, escape conventions such as backslashes to quote
15093e12c5d1SDavid du Colombiermultiple-line commands are unnecessary;  if the command isn't finished,
15103e12c5d1SDavid du Colombierthe parser keeps reading.  For example, a multiple-line append driven by an
15113e12c5d1SDavid du Colombier.CW x
15123e12c5d1SDavid du Colombiercommand is straightforward:
15133e12c5d1SDavid du Colombier.P1
15143e12c5d1SDavid du Colombierx/.*\en/ g/Peter/ a
15153e12c5d1SDavid du Colombierone line about Peter
15163e12c5d1SDavid du Colombieranother line about Peter
15173e12c5d1SDavid du Colombier\&.
15183e12c5d1SDavid du Colombier.P2
1519219b2ee8SDavid du ColombierOther UNIX editors would require a backslash after all but the last line.
15203e12c5d1SDavid du Colombier.PP
15213e12c5d1SDavid du ColombierThe other advantage is specific to the two-process structure of
15223e12c5d1SDavid du Colombier.CW sam .
15233e12c5d1SDavid du ColombierThe host process must decide when a command is completed so the
15243e12c5d1SDavid du Colombiercommand interpreter can be called.  This problem is easily resolved
15253e12c5d1SDavid du Colombierby having the lexical analyzer read the single stream of events from the
15263e12c5d1SDavid du Colombierterminal, directly executing all typing and mouse commands,
15273e12c5d1SDavid du Colombierbut passing to the parser characters typed to the
15283e12c5d1SDavid du Colombier.CW sam
15293e12c5d1SDavid du Colombiercommand window.
15303e12c5d1SDavid du ColombierThis scheme is slightly complicated by the availability of cut-and-paste
15313e12c5d1SDavid du Colombierediting in the
15323e12c5d1SDavid du Colombier.CW sam
15333e12c5d1SDavid du Colombierwindow, but that difficulty is resolved by applying the rules
15343e12c5d1SDavid du Colombierused in
15353e12c5d1SDavid du Colombier.CW mux :
15363e12c5d1SDavid du Colombierwhen a newline is typed to the
15373e12c5d1SDavid du Colombier.CW sam
15383e12c5d1SDavid du Colombierwindow, all text between the newline and the previously typed newline
15393e12c5d1SDavid du Colombieris made available to the parser.
15403e12c5d1SDavid du ColombierThis permits arbitrary editing to be done to a command before
15413e12c5d1SDavid du Colombiertyping newline and thereby requesting execution.
15423e12c5d1SDavid du Colombier.PP
15433e12c5d1SDavid du ColombierThe parser is driven by a table because the syntax of addresses
15443e12c5d1SDavid du Colombierand commands is regular enough
15453e12c5d1SDavid du Colombierto be encoded compactly.  There are few special cases, such as the
15463e12c5d1SDavid du Colombierreplacement text in a substitution, so the syntax of almost all commands
15473e12c5d1SDavid du Colombiercan be encoded with a few flags.
15483e12c5d1SDavid du ColombierThese include whether the command allows an address (for example,
15493e12c5d1SDavid du Colombier.CW e
15503e12c5d1SDavid du Colombierdoes not), whether it takes a regular expression (as in
15513e12c5d1SDavid du Colombier.CW x
15523e12c5d1SDavid du Colombierand
15533e12c5d1SDavid du Colombier.CW s ),
15543e12c5d1SDavid du Colombierwhether it takes replacement text (as in
15553e12c5d1SDavid du Colombier.CW c
15563e12c5d1SDavid du Colombieror
15573e12c5d1SDavid du Colombier.CW i ),
15583e12c5d1SDavid du Colombierwhich may be multi-line, and so on.
15593e12c5d1SDavid du ColombierThe internal syntax of regular expressions is handled by a separate
15603e12c5d1SDavid du Colombierparser; a regular expression is a leaf of the command parse tree.
15613e12c5d1SDavid du ColombierRegular expressions are discussed fully in the next section.
15623e12c5d1SDavid du Colombier.PP
15633e12c5d1SDavid du ColombierThe parser table also has information about defaults, so the interpreter
15643e12c5d1SDavid du Colombieris always called with a complete tree.  For example, the parser fills in
15653e12c5d1SDavid du Colombierthe implicit
15663e12c5d1SDavid du Colombier.CW 0
15673e12c5d1SDavid du Colombierand
15683e12c5d1SDavid du Colombier.CW $
15693e12c5d1SDavid du Colombierin the abbreviated address
15703e12c5d1SDavid du Colombier.CW ,
15713e12c5d1SDavid du Colombier(comma),
15723e12c5d1SDavid du Colombierinserts a
15733e12c5d1SDavid du Colombier.CW +
15743e12c5d1SDavid du Colombierto the left of an unadorned regular expression in an address,
15753e12c5d1SDavid du Colombierand provides the usual default address
15763e12c5d1SDavid du Colombier.CW .
15773e12c5d1SDavid du Colombier(dot) for commands that expect an address but are not given one.
15783e12c5d1SDavid du Colombier.PP
15793e12c5d1SDavid du ColombierOnce a complete command is parsed, the evaluation is easy.
15803e12c5d1SDavid du ColombierThe address is evaluated left-to-right starting from the value of dot,
15813e12c5d1SDavid du Colombierwith a mostly ordinary expression evaluator.
15823e12c5d1SDavid du ColombierAddresses, like many of the data structures in
15833e12c5d1SDavid du Colombier.CW sam ,
15843e12c5d1SDavid du Colombierare held in a C structure and passed around by value:
15853e12c5d1SDavid du Colombier.P1
15863e12c5d1SDavid du Colombiertypedef long Posn;    /* Position in a file */
15873e12c5d1SDavid du Colombiertypedef struct Range{
15883e12c5d1SDavid du Colombier        Posn    p1, p2;
15893e12c5d1SDavid du Colombier}Range;
15903e12c5d1SDavid du Colombiertypedef struct Address{
15913e12c5d1SDavid du Colombier        Range   r;
15923e12c5d1SDavid du Colombier        File    *f;
15933e12c5d1SDavid du Colombier}Address;
15943e12c5d1SDavid du Colombier.P2
15953e12c5d1SDavid du ColombierAn address is encoded as a substring (character positions
15963e12c5d1SDavid du Colombier.CW p1
15973e12c5d1SDavid du Colombierto
15983e12c5d1SDavid du Colombier.CW p2 )
15993e12c5d1SDavid du Colombierin a file
16003e12c5d1SDavid du Colombier.CW f .
16013e12c5d1SDavid du Colombier(The data type
16023e12c5d1SDavid du Colombier.CW File
16033e12c5d1SDavid du Colombieris described in detail below.)
16043e12c5d1SDavid du Colombier.PP
16053e12c5d1SDavid du ColombierThe address interpreter is an
16063e12c5d1SDavid du Colombier.CW Address -valued
16073e12c5d1SDavid du Colombierfunction that traverses the parse tree describing an address (the
16083e12c5d1SDavid du Colombierparse tree for the address has type
16093e12c5d1SDavid du Colombier.CW Addrtree ):
16103e12c5d1SDavid du Colombier.P1
16113e12c5d1SDavid du ColombierAddress
16123e12c5d1SDavid du Colombieraddress(ap, a, sign)
16133e12c5d1SDavid du Colombier	Addrtree *ap;
16143e12c5d1SDavid du Colombier	Address a;
16153e12c5d1SDavid du Colombier	int sign;
16163e12c5d1SDavid du Colombier{
16173e12c5d1SDavid du Colombier	Address a2;
16183e12c5d1SDavid du Colombier	do
16193e12c5d1SDavid du Colombier		switch(ap->type){
16203e12c5d1SDavid du Colombier		case '.':
16213e12c5d1SDavid du Colombier			a=a.f->dot;
16223e12c5d1SDavid du Colombier			break;
16233e12c5d1SDavid du Colombier		case '$':
16243e12c5d1SDavid du Colombier			a.r.p1=a.r.p2=a.f->nbytes;
16253e12c5d1SDavid du Colombier			break;
16263e12c5d1SDavid du Colombier		case '"':
16273e12c5d1SDavid du Colombier			a=matchfile(a, ap->aregexp)->dot;
16283e12c5d1SDavid du Colombier			break;
16293e12c5d1SDavid du Colombier		case ',':
16303e12c5d1SDavid du Colombier			a2=address(ap->right, a, 0);
16313e12c5d1SDavid du Colombier			a=address(ap->left, a, 0);
16323e12c5d1SDavid du Colombier			if(a.f!=a2.f || a2.r.p2<a.r.p1)
16333e12c5d1SDavid du Colombier				error(Eorder);
16343e12c5d1SDavid du Colombier			a.r.p2=a2.r.p2;
16353e12c5d1SDavid du Colombier			return a;
16363e12c5d1SDavid du Colombier		/* and so on */
16373e12c5d1SDavid du Colombier		}
16383e12c5d1SDavid du Colombier	while((ap=ap->right)!=0);
16393e12c5d1SDavid du Colombier	return a;
16403e12c5d1SDavid du Colombier}
16413e12c5d1SDavid du Colombier.P2
16423e12c5d1SDavid du Colombier.PP
16433e12c5d1SDavid du ColombierThroughout, errors are handled by a non-local
16443e12c5d1SDavid du Colombier.CW goto
16453e12c5d1SDavid du Colombier(a
16463e12c5d1SDavid du Colombier.CW setjmp/longjmp
16473e12c5d1SDavid du Colombierin C terminology)
16483e12c5d1SDavid du Colombierhidden in a routine called
16493e12c5d1SDavid du Colombier.CW error
16503e12c5d1SDavid du Colombierthat immediately aborts the execution, retracts any
16513e12c5d1SDavid du Colombierpartially made changes (see the section below on `undoing'), and
16523e12c5d1SDavid du Colombierreturns to the top level of the parser.
16533e12c5d1SDavid du ColombierThe argument to
16543e12c5d1SDavid du Colombier.CW error
16553e12c5d1SDavid du Colombieris an enumeration type that
16563e12c5d1SDavid du Colombieris translated to a terse but possibly helpful
16573e12c5d1SDavid du Colombiermessage such as `?addresses out of order.'
16583e12c5d1SDavid du ColombierVery common messages are kept short; for example the message for
16593e12c5d1SDavid du Colombiera failed regular expression search is `?search.'
16603e12c5d1SDavid du Colombier.PP
16613e12c5d1SDavid du ColombierCharacter addresses such as
16623e12c5d1SDavid du Colombier.CW #3
16633e12c5d1SDavid du Colombierare trivial to implement, as the
16643e12c5d1SDavid du Colombier.CW File
16653e12c5d1SDavid du Colombierdata structure is accessible by character number.
16663e12c5d1SDavid du ColombierHowever,
16673e12c5d1SDavid du Colombier.CW sam
16683e12c5d1SDavid du Colombierkeeps no information about the position of newlines \(em it is too
16693e12c5d1SDavid du Colombierexpensive to track dynamically \(em so line addresses are computed by reading
16703e12c5d1SDavid du Colombierthe file, counting newlines.  Except in very large files, this has proven
16713e12c5d1SDavid du Colombieracceptable: file access is fast enough to make the technique practical,
16723e12c5d1SDavid du Colombierand lines are not central to the structure of the command language.
16733e12c5d1SDavid du Colombier.PP
16743e12c5d1SDavid du ColombierThe command interpreter, called
16753e12c5d1SDavid du Colombier.CW cmdexec ,
16763e12c5d1SDavid du Colombieris also straightforward.  The parse table includes a
16773e12c5d1SDavid du Colombierfunction to call to interpret a particular command.  That function
16783e12c5d1SDavid du Colombierreceives as arguments
16793e12c5d1SDavid du Colombierthe calculated address
16803e12c5d1SDavid du Colombierfor the command
16813e12c5d1SDavid du Colombierand the command tree (of type
16823e12c5d1SDavid du Colombier.CW Cmdtree ),
16833e12c5d1SDavid du Colombierwhich may contain information such as the subtree for compound commands.
16843e12c5d1SDavid du ColombierHere, for example, is the function for the
16853e12c5d1SDavid du Colombier.CW g
16863e12c5d1SDavid du Colombierand
16873e12c5d1SDavid du Colombier.CW v
16883e12c5d1SDavid du Colombiercommands:
16893e12c5d1SDavid du Colombier.P1
16903e12c5d1SDavid du Colombierint
16913e12c5d1SDavid du Colombierg_cmd(a, cp)
16923e12c5d1SDavid du Colombier	Address a;
16933e12c5d1SDavid du Colombier	Cmdtree *cp;
16943e12c5d1SDavid du Colombier{
16953e12c5d1SDavid du Colombier	compile(cp->regexp);
16963e12c5d1SDavid du Colombier	if(execute(a.f, a.r.p1, a.r.p2)!=(cp->cmdchar=='v')){
16973e12c5d1SDavid du Colombier		a.f->dot=a;
16983e12c5d1SDavid du Colombier		return cmdexec(a, cp->subcmd);
16993e12c5d1SDavid du Colombier	}
1700219b2ee8SDavid du Colombier	return TRUE;	/* cause execution to continue */
17013e12c5d1SDavid du Colombier}
17023e12c5d1SDavid du Colombier.P2
17033e12c5d1SDavid du Colombier.CW Compile "" (
17043e12c5d1SDavid du Colombierand
17053e12c5d1SDavid du Colombier.CW execute
17063e12c5d1SDavid du Colombierare part of the regular expression code, described in the next section.)
17073e12c5d1SDavid du ColombierBecause the parser and the
17083e12c5d1SDavid du Colombier.CW File
17093e12c5d1SDavid du Colombierdata structure do most of the work, most commands
17103e12c5d1SDavid du Colombierare similarly brief.
17113e12c5d1SDavid du Colombier.SH 2
17123e12c5d1SDavid du ColombierRegular expressions
17133e12c5d1SDavid du Colombier.LP
17143e12c5d1SDavid du ColombierThe regular expression code in
17153e12c5d1SDavid du Colombier.CW sam
17163e12c5d1SDavid du Colombieris an interpreted, rather than compiled on-the-fly, implementation of Thompson's
17173e12c5d1SDavid du Colombiernon-deterministic finite automaton algorithm.\u\s-4\&12\s+4\d
1718219b2ee8SDavid du ColombierThe syntax and semantics of the expressions are as in the UNIX program
17193e12c5d1SDavid du Colombier.CW egrep ,
17203e12c5d1SDavid du Colombierincluding alternation, closures, character classes, and so on.
17213e12c5d1SDavid du ColombierThe only changes in the notation are two additions:
17223e12c5d1SDavid du Colombier.CW \en
17233e12c5d1SDavid du Colombieris translated to, and matches, a newline character, and
17243e12c5d1SDavid du Colombier.CW @
17253e12c5d1SDavid du Colombiermatches any character.  In
17263e12c5d1SDavid du Colombier.CW egrep ,
17273e12c5d1SDavid du Colombierthe character
17283e12c5d1SDavid du Colombier.CW \&.
17293e12c5d1SDavid du Colombiermatches any character except newline, and in
17303e12c5d1SDavid du Colombier.CW sam
17313e12c5d1SDavid du Colombierthe same rule seemed safest, to prevent idioms like
17323e12c5d1SDavid du Colombier.CW \&.*
17333e12c5d1SDavid du Colombierfrom spanning newlines.
17343e12c5d1SDavid du Colombier.CW Egrep
17353e12c5d1SDavid du Colombierexpressions are arguably too complicated for an interactive editor \(em
17363e12c5d1SDavid du Colombiercertainly it would make sense if all the special characters were two-character
17373e12c5d1SDavid du Colombiersequences, so that most of the punctuation characters wouldn't have
17383e12c5d1SDavid du Colombierpeculiar meanings \(em but for an interesting command language, full
17393e12c5d1SDavid du Colombierregular expressions are necessary, and
17403e12c5d1SDavid du Colombier.CW egrep
1741219b2ee8SDavid du Colombierdefines the full regular expression syntax for UNIX programs.
1742219b2ee8SDavid du ColombierAlso, it seemed superfluous to define a new syntax, since various UNIX programs
17433e12c5d1SDavid du Colombier.CW ed , (
17443e12c5d1SDavid du Colombier.CW egrep
17453e12c5d1SDavid du Colombierand
17463e12c5d1SDavid du Colombier.CW vi )
17473e12c5d1SDavid du Colombierdefine too many already.
17483e12c5d1SDavid du Colombier.PP
17493e12c5d1SDavid du ColombierThe expressions are compiled by a routine,
17503e12c5d1SDavid du Colombier.CW compile ,
17513e12c5d1SDavid du Colombierthat generates the description of the non-deterministic finite state machine.
17523e12c5d1SDavid du ColombierA second routine,
17533e12c5d1SDavid du Colombier.CW execute ,
17543e12c5d1SDavid du Colombierinterprets the machine to generate the leftmost-longest match of the
17553e12c5d1SDavid du Colombierexpression in a substring of the file.
17563e12c5d1SDavid du ColombierThe algorithm is described elsewhere.\u\s-4\&12,13\s+4\d
17573e12c5d1SDavid du Colombier.CW Execute
17583e12c5d1SDavid du Colombierreports
17593e12c5d1SDavid du Colombierwhether a match was found, and sets a global variable,
17603e12c5d1SDavid du Colombierof type
17613e12c5d1SDavid du Colombier.CW Range ,
17623e12c5d1SDavid du Colombierto the substring matched.
17633e12c5d1SDavid du Colombier.PP
17643e12c5d1SDavid du ColombierA trick is required to evaluate the expression in reverse, such as when
17653e12c5d1SDavid du Colombiersearching backwards for an expression.
17663e12c5d1SDavid du ColombierFor example,
17673e12c5d1SDavid du Colombier.P1
17683e12c5d1SDavid du Colombier-/P.*r/
17693e12c5d1SDavid du Colombier.P2
17703e12c5d1SDavid du Colombierlooks backwards through the file for a match of the expression.
17713e12c5d1SDavid du ColombierThe expression, however, is defined for a forward search.
17723e12c5d1SDavid du ColombierThe solution is to construct a machine identical to the machine
17733e12c5d1SDavid du Colombierfor a forward search except for a reversal of all the concatenation
17743e12c5d1SDavid du Colombieroperators (the other operators are symmetric under direction reversal),
17753e12c5d1SDavid du Colombierto exchange the meaning of the operators
17763e12c5d1SDavid du Colombier.CW ^
17773e12c5d1SDavid du Colombierand
17783e12c5d1SDavid du Colombier.CW $ ,
17793e12c5d1SDavid du Colombierand then to read the file backwards, looking for the
17803e12c5d1SDavid du Colombierusual earliest longest match.
17813e12c5d1SDavid du Colombier.PP
17823e12c5d1SDavid du Colombier.CW Execute
17833e12c5d1SDavid du Colombiergenerates only one match each time it is called.
17843e12c5d1SDavid du ColombierTo interpret looping constructs such as the
17853e12c5d1SDavid du Colombier.CW x
17863e12c5d1SDavid du Colombiercommand,
17873e12c5d1SDavid du Colombier.CW sam
17883e12c5d1SDavid du Colombiermust therefore synchronize between
17893e12c5d1SDavid du Colombiercalls of
17903e12c5d1SDavid du Colombier.CW execute
17913e12c5d1SDavid du Colombierto avoid
17923e12c5d1SDavid du Colombierproblems with null matches.
17933e12c5d1SDavid du ColombierFor example, even given the leftmost-longest rule,
17943e12c5d1SDavid du Colombierthe expression
17953e12c5d1SDavid du Colombier.CW a*
17963e12c5d1SDavid du Colombiermatches three times in the string
17973e12c5d1SDavid du Colombier.CW ab
17983e12c5d1SDavid du Colombier(the character
17993e12c5d1SDavid du Colombier.CW a ,
18003e12c5d1SDavid du Colombierthe null string between the
18013e12c5d1SDavid du Colombier.CW a
18023e12c5d1SDavid du Colombierand
18033e12c5d1SDavid du Colombier.CW b ,
18043e12c5d1SDavid du Colombierand the final null string).
18053e12c5d1SDavid du ColombierAfter returning a match for the
18063e12c5d1SDavid du Colombier.CW a ,
18073e12c5d1SDavid du Colombier.CW sam
18083e12c5d1SDavid du Colombiermust not match the null string before the
18093e12c5d1SDavid du Colombier.CW b .
18103e12c5d1SDavid du ColombierThe algorithm starts
18113e12c5d1SDavid du Colombier.CW execute
18123e12c5d1SDavid du Colombierat the end of its previous match, and
18133e12c5d1SDavid du Colombierif the match it returns
18143e12c5d1SDavid du Colombieris null and abuts the previous match, rejects the match and advances
18153e12c5d1SDavid du Colombierthe initial position one character.
18163e12c5d1SDavid du Colombier.SH 2
18173e12c5d1SDavid du ColombierMemory allocation
18183e12c5d1SDavid du Colombier.LP
18193e12c5d1SDavid du ColombierThe C language has no memory allocation primitives, although a standard
18203e12c5d1SDavid du Colombierlibrary routine,
18213e12c5d1SDavid du Colombier.CW malloc ,
18223e12c5d1SDavid du Colombierprovides adequate service for simple programs.
18233e12c5d1SDavid du ColombierFor specific uses, however,
18243e12c5d1SDavid du Colombierit can be better to write a custom allocator.
18253e12c5d1SDavid du ColombierThe allocator (or rather, pair of allocators) described here
18263e12c5d1SDavid du Colombierwork in both the terminal and host parts of
18273e12c5d1SDavid du Colombier.CW sam .
18283e12c5d1SDavid du ColombierThey are designed for efficient manipulation of strings,
18293e12c5d1SDavid du Colombierwhich are allocated and freed frequently and vary in length from essentially
18303e12c5d1SDavid du Colombierzero to 32 Kbytes (very large strings are written to disc).
18313e12c5d1SDavid du ColombierMore important, strings may be large and change size often,
18323e12c5d1SDavid du Colombierso to minimize memory usage it is helpful to reclaim and to coalesce the
18333e12c5d1SDavid du Colombierunused portions of strings when they are truncated.
18343e12c5d1SDavid du Colombier.PP
18353e12c5d1SDavid du ColombierObjects to be allocated in
18363e12c5d1SDavid du Colombier.CW sam
18373e12c5d1SDavid du Colombierare of two flavors:
18383e12c5d1SDavid du Colombierthe first is C
18393e12c5d1SDavid du Colombier.CW structs ,
18403e12c5d1SDavid du Colombierwhich are small and often addressed by pointer variables;
18413e12c5d1SDavid du Colombierthe second is variable-sized arrays of characters
18423e12c5d1SDavid du Colombieror integers whose
18433e12c5d1SDavid du Colombierbase pointer is always used to access them.
18443e12c5d1SDavid du ColombierThe memory allocator in
18453e12c5d1SDavid du Colombier.CW sam
18463e12c5d1SDavid du Colombieris therefore in two parts:
18473e12c5d1SDavid du Colombierfirst, a traditional first-fit allocator that provides fixed storage for
18483e12c5d1SDavid du Colombier.CW structs ;
18493e12c5d1SDavid du Colombierand second, a garbage-compacting allocator that reduces storage
18503e12c5d1SDavid du Colombieroverhead for variable-sized objects, at the cost of some bookkeeping.
18513e12c5d1SDavid du ColombierThe two types of objects are allocated from adjoining arenas, with
18523e12c5d1SDavid du Colombierthe garbage-compacting allocator controlling the arena with higher addresses.
18533e12c5d1SDavid du ColombierSeparating into two arenas simplifies compaction and prevents fragmentation due
18543e12c5d1SDavid du Colombierto immovable objects.
18553e12c5d1SDavid du ColombierThe access rules for garbage-compactable objects
18563e12c5d1SDavid du Colombier(discussed in the next paragraph) allow them to be relocated, so when
18573e12c5d1SDavid du Colombierthe first-fit arena needs space, it moves the garbage-compacted arena
18583e12c5d1SDavid du Colombierto higher addresses to make room.  Storage is therefore created only
18593e12c5d1SDavid du Colombierat successively higher addresses, either when more garbage-compacted
18603e12c5d1SDavid du Colombierspace is needed or when the first-fit arena pushes up the other arena.
18613e12c5d1SDavid du Colombier.PP
18623e12c5d1SDavid du ColombierObjects that may be compacted declare to the
18633e12c5d1SDavid du Colombierallocator a cell that is guaranteed to be the sole repository of the
18643e12c5d1SDavid du Colombieraddress of the object whenever a compaction can occur.
18653e12c5d1SDavid du ColombierThe compactor can then update the address when the object is moved.
18663e12c5d1SDavid du ColombierFor example, the implementation of type
18673e12c5d1SDavid du Colombier.CW List
18683e12c5d1SDavid du Colombier(really a variable-length array)
18693e12c5d1SDavid du Colombieris:
18703e12c5d1SDavid du Colombier.P1
18713e12c5d1SDavid du Colombiertypedef struct List{
18723e12c5d1SDavid du Colombier        int     nused;
18733e12c5d1SDavid du Colombier        long    *ptr;
18743e12c5d1SDavid du Colombier}List;
18753e12c5d1SDavid du Colombier.P2
18763e12c5d1SDavid du ColombierThe
18773e12c5d1SDavid du Colombier.CW ptr
18783e12c5d1SDavid du Colombiercell must always be used directly, and never copied.  When a
18793e12c5d1SDavid du Colombier.CW List
18803e12c5d1SDavid du Colombieris to be created the
18813e12c5d1SDavid du Colombier.CW List
18823e12c5d1SDavid du Colombierstructure is allocated in the ordinary first-fit arena
18833e12c5d1SDavid du Colombierand its
18843e12c5d1SDavid du Colombier.CW ptr
18853e12c5d1SDavid du Colombieris allocated in the garbage-compacted arena.
18863e12c5d1SDavid du ColombierA similar data type for strings, called
18873e12c5d1SDavid du Colombier.CW String ,
18883e12c5d1SDavid du Colombierstores variable-length character arrays of up to 32767 elements.
18893e12c5d1SDavid du Colombier.PP
18903e12c5d1SDavid du ColombierA related matter of programming style:
18913e12c5d1SDavid du Colombier.CW sam
18923e12c5d1SDavid du Colombierfrequently passes structures by value, which
18933e12c5d1SDavid du Colombiersimplifies the code.
18943e12c5d1SDavid du ColombierTraditionally, C programs have
18953e12c5d1SDavid du Colombierpassed structures by reference, but implicit allocation on
18963e12c5d1SDavid du Colombierthe stack is easier to use.
18973e12c5d1SDavid du ColombierStructure passing is a relatively new feature of C
18983e12c5d1SDavid du Colombier(it is not in the
18993e12c5d1SDavid du Colombierstandard reference manual for C\u\s-4\&14\s+4\d), and is poorly supported in most
19003e12c5d1SDavid du Colombiercommercial C compilers.
19013e12c5d1SDavid du ColombierIt's convenient and expressive, though,
19023e12c5d1SDavid du Colombierand simplifies memory management by
19033e12c5d1SDavid du Colombieravoiding the allocator altogether
19043e12c5d1SDavid du Colombierand eliminating pointer aliases.
19053e12c5d1SDavid du Colombier.SH 2
19063e12c5d1SDavid du ColombierData structures for manipulating files
19073e12c5d1SDavid du Colombier.LP
19083e12c5d1SDavid du ColombierExperience with
19093e12c5d1SDavid du Colombier.CW jim
19103e12c5d1SDavid du Colombiershowed that the requirements
19113e12c5d1SDavid du Colombierof the file data structure were few, but strict.
19123e12c5d1SDavid du ColombierFirst, files need to be read and written quickly;
19133e12c5d1SDavid du Colombieradding a fresh file must be painless.
19143e12c5d1SDavid du ColombierSecond, the implementation must place no arbitrary upper limit on
19153e12c5d1SDavid du Colombierthe number or sizes of files.  (It should be practical to edit many files,
19163e12c5d1SDavid du Colombierand files up to megabytes in length should be handled gracefully.)
19173e12c5d1SDavid du ColombierThis implies that files be stored on disc, not in main memory.
19183e12c5d1SDavid du Colombier(Aficionados of virtual memory may argue otherwise, but the
19193e12c5d1SDavid du Colombierimplementation of virtual
19203e12c5d1SDavid du Colombiermemory in our system is not something to depend on
19213e12c5d1SDavid du Colombierfor good performance.)
19223e12c5d1SDavid du ColombierThird, changes to files need be made by only two primitives:
19233e12c5d1SDavid du Colombierdeletion and insertion.
19243e12c5d1SDavid du ColombierThese are inverses of each other,
19253e12c5d1SDavid du Colombierwhich simplifies the implementation of the undo operation.
19263e12c5d1SDavid du ColombierFinally,
19273e12c5d1SDavid du Colombierit must be easy and efficient to access the file, either
19283e12c5d1SDavid du Colombierforwards or backwards, a byte at a time.
19293e12c5d1SDavid du Colombier.PP
19303e12c5d1SDavid du ColombierThe
19313e12c5d1SDavid du Colombier.CW File
19323e12c5d1SDavid du Colombierdata type is constructed from three simpler data structures that hold arrays
19333e12c5d1SDavid du Colombierof characters.
19343e12c5d1SDavid du ColombierEach of these types has an insertion and deletion operator, and the
19353e12c5d1SDavid du Colombierinsertion and deletion operators of the
19363e12c5d1SDavid du Colombier.CW File
19373e12c5d1SDavid du Colombiertype itself are constructed from them.
19383e12c5d1SDavid du Colombier.PP
19393e12c5d1SDavid du ColombierThe simplest type is the
19403e12c5d1SDavid du Colombier.CW String ,
19413e12c5d1SDavid du Colombierwhich is used to hold strings in main memory.
19423e12c5d1SDavid du ColombierThe code that manages
19433e12c5d1SDavid du Colombier.CW Strings
19443e12c5d1SDavid du Colombierguarantees that they will never be longer
19453e12c5d1SDavid du Colombierthan some moderate size, and in practice they are rarely larger than 8 Kbytes.
19463e12c5d1SDavid du Colombier.CW Strings
19473e12c5d1SDavid du Colombierhave two purposes: they hold short strings like file names with little overhead,
19483e12c5d1SDavid du Colombierand because they are deliberately small, they are efficient to modify.
19493e12c5d1SDavid du ColombierThey are therefore used as the data structure for in-memory caches.
19503e12c5d1SDavid du Colombier.PP
19513e12c5d1SDavid du ColombierThe disc copy of the file is managed by a data structure called a
19523e12c5d1SDavid du Colombier.CW Disc ,
19533e12c5d1SDavid du Colombierwhich corresponds to a temporary file.  A
19543e12c5d1SDavid du Colombier.CW Disc
19553e12c5d1SDavid du Colombierhas no storage in main memory other than bookkeeping information;
19563e12c5d1SDavid du Colombierthe actual data being held is all on the disc.
19573e12c5d1SDavid du ColombierTo reduce the number of open files needed,
19583e12c5d1SDavid du Colombier.CW sam
1959219b2ee8SDavid du Colombieropens a dozen temporary UNIX files and multiplexes the
19603e12c5d1SDavid du Colombier.CW Discs
19613e12c5d1SDavid du Colombierupon them.
19623e12c5d1SDavid du ColombierThis permits many files to
19633e12c5d1SDavid du Colombierbe edited; the entire
19643e12c5d1SDavid du Colombier.CW sam
19653e12c5d1SDavid du Colombiersource (48 files) may be edited comfortably with a single
19663e12c5d1SDavid du Colombierinstance of
19673e12c5d1SDavid du Colombier.CW sam .
19683e12c5d1SDavid du ColombierAllocating one temporary file per
19693e12c5d1SDavid du Colombier.CW Disc
19703e12c5d1SDavid du Colombierwould strain the operating system's limit on the number of open files.
19713e12c5d1SDavid du ColombierAlso, spreading the traffic among temporary files keeps the files shorter,
1972219b2ee8SDavid du Colombierand shorter files are more efficiently implemented by the UNIX
19733e12c5d1SDavid du ColombierI/O subsystem.
19743e12c5d1SDavid du Colombier.PP
19753e12c5d1SDavid du ColombierA
19763e12c5d1SDavid du Colombier.CW Disc
19773e12c5d1SDavid du Colombieris an array of fixed-length blocks, each of which contains
19783e12c5d1SDavid du Colombierbetween 1 and 4096 characters of active data.
1979219b2ee8SDavid du Colombier(The block size of our UNIX file system is 4096 bytes.)
19803e12c5d1SDavid du ColombierThe block addresses within the temporary file and the length of each
19813e12c5d1SDavid du Colombierblock are stored in a
19823e12c5d1SDavid du Colombier.CW List .
19833e12c5d1SDavid du ColombierWhen changes are made the live part of blocks may change size.
19843e12c5d1SDavid du ColombierBlocks are created and coalesced when necessary to try to keep the sizes
19853e12c5d1SDavid du Colombierbetween 2048 and 4096 bytes.
19863e12c5d1SDavid du ColombierAn actively changing part of the
19873e12c5d1SDavid du Colombier.CW Disc
19883e12c5d1SDavid du Colombiertherefore typically has about a kilobyte of slop that can be
19893e12c5d1SDavid du Colombierinserted or deleted
19903e12c5d1SDavid du Colombierwithout changing more than one block or affecting the block order.
19913e12c5d1SDavid du ColombierWhen an insertion would overflow a block, the block is split, a new one
19923e12c5d1SDavid du Colombieris allocated to receive the overflow, and the memory-resident list of blocks
19933e12c5d1SDavid du Colombieris rearranged to reflect the insertion of the new block.
19943e12c5d1SDavid du Colombier.PP
19953e12c5d1SDavid du ColombierObviously, going to the disc for every modification to the file is
19963e12c5d1SDavid du Colombierprohibitively expensive.
19973e12c5d1SDavid du ColombierThe data type
19983e12c5d1SDavid du Colombier.CW Buffer
19993e12c5d1SDavid du Colombierconsists of a
20003e12c5d1SDavid du Colombier.CW Disc
20013e12c5d1SDavid du Colombierto hold the data and a
20023e12c5d1SDavid du Colombier.CW String
20033e12c5d1SDavid du Colombierthat acts as a cache.
20043e12c5d1SDavid du ColombierThis is the first of a series of caches throughout the data structures in
20053e12c5d1SDavid du Colombier.CW sam.
20063e12c5d1SDavid du ColombierThe caches not only improve performance, they provide a way to organize
20073e12c5d1SDavid du Colombierthe flow of data, particularly in the communication between the host
20083e12c5d1SDavid du Colombierand terminal.
20093e12c5d1SDavid du ColombierThis idea is developed below, in the section on communications.
20103e12c5d1SDavid du Colombier.PP
20113e12c5d1SDavid du ColombierTo reduce disc traffic, changes to a
20123e12c5d1SDavid du Colombier.CW Buffer
20133e12c5d1SDavid du Colombierare mediated by a variable-length string, in memory, that acts as a cache.
20143e12c5d1SDavid du ColombierWhen an insertion or deletion is made to a
20153e12c5d1SDavid du Colombier.CW Buffer ,
20163e12c5d1SDavid du Colombierif the change can be accommodated by the cache, it is done there.
20173e12c5d1SDavid du ColombierIf the cache becomes bigger than a block because of an insertion,
20183e12c5d1SDavid du Colombiersome of it is written to the
20193e12c5d1SDavid du Colombier.CW Disc
20203e12c5d1SDavid du Colombierand deleted from the cache.
20213e12c5d1SDavid du ColombierIf the change does not intersect the cache, the cache is flushed.
20223e12c5d1SDavid du ColombierThe cache is only loaded at the new position if the change is smaller than a block;
20233e12c5d1SDavid du Colombierotherwise, it is sent directly to the
20243e12c5d1SDavid du Colombier.CW Disc .
20253e12c5d1SDavid du ColombierThis is because
20263e12c5d1SDavid du Colombierlarge changes are typically sequential,
20273e12c5d1SDavid du Colombierwhereupon the next change is unlikely to overlap the current one.
20283e12c5d1SDavid du Colombier.PP
20293e12c5d1SDavid du ColombierA
20303e12c5d1SDavid du Colombier.CW File
20313e12c5d1SDavid du Colombiercomprises a
20323e12c5d1SDavid du Colombier.CW String
20333e12c5d1SDavid du Colombierto hold the file name and some ancillary data such as dot and the modified bit.
20343e12c5d1SDavid du ColombierThe most important components, though, are a pair of
20353e12c5d1SDavid du Colombier.CW Buffers ,
20363e12c5d1SDavid du Colombierone called the transcript and the other the contents.
20373e12c5d1SDavid du ColombierTheir use is described in the next section.
20383e12c5d1SDavid du Colombier.PP
20393e12c5d1SDavid du ColombierThe overall structure is shown in Figure 5.
20403e12c5d1SDavid du ColombierAlthough it may seem that the data is touched many times on its
20413e12c5d1SDavid du Colombierway from the
20423e12c5d1SDavid du Colombier.CW Disc ,
2043219b2ee8SDavid du Colombierit is read (by one UNIX system call) directly into the cache of the
20443e12c5d1SDavid du Colombierassociated
20453e12c5d1SDavid du Colombier.CW Buffer ;
20463e12c5d1SDavid du Colombierno extra copy is done.
20473e12c5d1SDavid du ColombierSimilarly, when flushing the cache, the text is written
20483e12c5d1SDavid du Colombierdirectly from the cache to disc.
20493e12c5d1SDavid du ColombierMost operations act directly on the text in the cache.
20503e12c5d1SDavid du ColombierA principle applied throughout
20513e12c5d1SDavid du Colombier.CW sam
20523e12c5d1SDavid du Colombieris that the fewer times the data is copied, the faster the program will run
20533e12c5d1SDavid du Colombier(see also the paper by Waite\u\s-4\&15\s+4\d).
20543e12c5d1SDavid du Colombier.KF
20553e12c5d1SDavid du Colombier.PS
20563e12c5d1SDavid du Colombiercopy "fig5.pic"
20573e12c5d1SDavid du Colombier.PE
20583e12c5d1SDavid du Colombier.Cs
20593e12c5d1SDavid du ColombierFigure 5. File data structures.
20603e12c5d1SDavid du ColombierThe temporary files are stored in the standard repository for such files
20613e12c5d1SDavid du Colombieron the host system.
20623e12c5d1SDavid du Colombier.Ce
20633e12c5d1SDavid du Colombier.KE
20643e12c5d1SDavid du Colombier.PP
20653e12c5d1SDavid du ColombierThe contents of a
20663e12c5d1SDavid du Colombier.CW File
20673e12c5d1SDavid du Colombierare accessed by a routine that
20683e12c5d1SDavid du Colombiercopies to a buffer a substring of a file starting at a specified offset.
20693e12c5d1SDavid du ColombierTo read a byte at a time, a
20703e12c5d1SDavid du Colombier.CW File "" per-
20713e12c5d1SDavid du Colombierarray is loaded starting from a specified initial position,
20723e12c5d1SDavid du Colombierand bytes may then be read from the array.
20733e12c5d1SDavid du ColombierThe implementation is done by a macro similar to the C standard I/O
20743e12c5d1SDavid du Colombier.CW getc
20753e12c5d1SDavid du Colombiermacro.\u\s-4\&14\s+4\d
20763e12c5d1SDavid du ColombierBecause the reading may be done at any address, a minor change to the
20773e12c5d1SDavid du Colombiermacro allows the file to be read backwards.
20783e12c5d1SDavid du ColombierThis array is read-only; there is no
20793e12c5d1SDavid du Colombier.CW putc .
20803e12c5d1SDavid du Colombier.SH 2
20813e12c5d1SDavid du ColombierDoing and undoing
20823e12c5d1SDavid du Colombier.LP
20833e12c5d1SDavid du Colombier.CW Sam
20843e12c5d1SDavid du Colombierhas an unusual method for managing changes to files.
20853e12c5d1SDavid du ColombierThe command language makes it easy to specify multiple variable-length changes
20863e12c5d1SDavid du Colombierto a file millions of bytes long, and such changes
20873e12c5d1SDavid du Colombiermust be made efficiently if the editor is to be practical.
20883e12c5d1SDavid du ColombierThe usual techniques for inserting and deleting strings
20893e12c5d1SDavid du Colombierare inadequate under these conditions.
20903e12c5d1SDavid du ColombierThe
20913e12c5d1SDavid du Colombier.CW Buffer
20923e12c5d1SDavid du Colombierand
20933e12c5d1SDavid du Colombier.CW Disc
20943e12c5d1SDavid du Colombierdata structures are designed for efficient random access to long strings,
20953e12c5d1SDavid du Colombierbut care must be taken to avoid super-linear behavior when making
20963e12c5d1SDavid du Colombiermany changes simultaneously.
20973e12c5d1SDavid du Colombier.PP
20983e12c5d1SDavid du Colombier.CW Sam
20993e12c5d1SDavid du Colombieruses a two-pass algorithm for making changes, and treats each file as a database
21003e12c5d1SDavid du Colombieragainst which transactions are registered.
21013e12c5d1SDavid du ColombierChanges are not made directly to the contents.
21023e12c5d1SDavid du ColombierInstead, when a command is started, a `mark' containing
21033e12c5d1SDavid du Colombiera sequence number is placed in the transcript
21043e12c5d1SDavid du Colombier.CW Buffer ,
21053e12c5d1SDavid du Colombierand each change made to the file, either an insertion or deletion
21063e12c5d1SDavid du Colombieror a change to the file name,
21073e12c5d1SDavid du Colombieris appended to the end of the transcript.
21083e12c5d1SDavid du ColombierWhen the command is complete, the transcript is rewound to the
21093e12c5d1SDavid du Colombiermark and applied to the contents.
21103e12c5d1SDavid du Colombier.PP
21113e12c5d1SDavid du ColombierOne reason for separating evaluation from
21123e12c5d1SDavid du Colombierapplication in this way is to simplify tracking the addresses of changes
21133e12c5d1SDavid du Colombiermade in the middle of a long sequence.
21143e12c5d1SDavid du ColombierThe two-pass algorithm also allows all changes to apply to the
21153e12c5d1SDavid du Colombier.I original
21163e12c5d1SDavid du Colombierdata: no change can affect another change made in the same command.
21173e12c5d1SDavid du ColombierThis is particularly important when evaluating an
21183e12c5d1SDavid du Colombier.CW x
21193e12c5d1SDavid du Colombiercommand because it prevents regular expression matches
21203e12c5d1SDavid du Colombierfrom stumbling over changes made earlier in the execution.
21213e12c5d1SDavid du ColombierAlso, the two-pass
2122219b2ee8SDavid du Colombieralgorithm is cleaner than the way other UNIX editors allow changes to
21233e12c5d1SDavid du Colombieraffect each other;
21243e12c5d1SDavid du Colombierfor example,
21253e12c5d1SDavid du Colombier.CW ed 's
21263e12c5d1SDavid du Colombieridioms to do things like delete every other line
21273e12c5d1SDavid du Colombierdepend critically on the implementation.
21283e12c5d1SDavid du ColombierInstead,
21293e12c5d1SDavid du Colombier.CW sam 's
21303e12c5d1SDavid du Colombiersimple model, in which all changes in a command occur effectively
21313e12c5d1SDavid du Colombiersimultaneously, is easy to explain and to understand.
21323e12c5d1SDavid du Colombier.PP
21333e12c5d1SDavid du ColombierThe records in the transcript are of the form ``delete substring from
21343e12c5d1SDavid du Colombierlocations
21353e12c5d1SDavid du Colombier123 to 456'' and ``insert 11 characters `hello there' at location 789.''
21363e12c5d1SDavid du Colombier(It is an error if the changes are not at monotonically greater
21373e12c5d1SDavid du Colombierpositions through the file.)
21383e12c5d1SDavid du ColombierWhile the update is occurring, these numbers must be
21393e12c5d1SDavid du Colombieroffset by earlier changes, but that is straightforward and
21403e12c5d1SDavid du Colombierlocal to the update routine;
21413e12c5d1SDavid du Colombiermoreover, all the numbers have been computed
21423e12c5d1SDavid du Colombierbefore the first is examined.
21433e12c5d1SDavid du Colombier.PP
21443e12c5d1SDavid du ColombierTreating the file as a transaction system has another advantage:
21453e12c5d1SDavid du Colombierundo is trivial.
21463e12c5d1SDavid du ColombierAll it takes is to invert the transcript after it has been
21473e12c5d1SDavid du Colombierimplemented, converting insertions
21483e12c5d1SDavid du Colombierinto deletions and vice versa, and saving them in a holding
21493e12c5d1SDavid du Colombier.CW Buffer .
21503e12c5d1SDavid du ColombierThe `do' transcript can then be deleted from
21513e12c5d1SDavid du Colombierthe transcript
21523e12c5d1SDavid du Colombier.CW Buffer
21533e12c5d1SDavid du Colombierand replaced by the `undo' transcript.
21543e12c5d1SDavid du ColombierIf an undo is requested, the transcript is rewound and the undo transcript
21553e12c5d1SDavid du Colombierexecuted.
21563e12c5d1SDavid du ColombierBecause the transcript
21573e12c5d1SDavid du Colombier.CW Buffer
21583e12c5d1SDavid du Colombieris not truncated after each command, it accumulates
21593e12c5d1SDavid du Colombiersuccessive changes.
21603e12c5d1SDavid du ColombierA sequence of undo commands
21613e12c5d1SDavid du Colombiercan therefore back up the file arbitrarily,
21623e12c5d1SDavid du Colombierwhich is more helpful than the more commonly implemented self-inverse form of undo.
21633e12c5d1SDavid du Colombier.CW Sam "" (
21643e12c5d1SDavid du Colombierprovides no way to undo an undo, but if it were desired,
21653e12c5d1SDavid du Colombierit would be easy to provide by re-interpreting the `do' transcript.)
21663e12c5d1SDavid du ColombierEach mark in the transcript contains a sequence number and the offset into
21673e12c5d1SDavid du Colombierthe transcript of the previous mark, to aid in unwinding the transcript.
21683e12c5d1SDavid du ColombierMarks also contain the value of dot and the modified bit so these can be
21693e12c5d1SDavid du Colombierrestored easily.
21703e12c5d1SDavid du ColombierUndoing multiple files is easy; it merely demands undoing all files whose
21713e12c5d1SDavid du Colombierlatest change has the same sequence number as the current file.
21723e12c5d1SDavid du Colombier.PP
21733e12c5d1SDavid du ColombierAnother benefit of having a transcript is that errors encountered in the middle
21743e12c5d1SDavid du Colombierof a complicated command need not leave the files in an intermediate state.
21753e12c5d1SDavid du ColombierBy rewinding the transcript to the mark beginning the command,
21763e12c5d1SDavid du Colombierthe partial command can be trivially undone.
21773e12c5d1SDavid du Colombier.PP
21783e12c5d1SDavid du ColombierWhen the update algorithm was first implemented, it was unacceptably slow,
21793e12c5d1SDavid du Colombierso a cache was added to coalesce nearby changes,
21803e12c5d1SDavid du Colombierreplacing multiple small changes by a single larger one.
21813e12c5d1SDavid du ColombierThis reduced the number
21823e12c5d1SDavid du Colombierof insertions into the transaction
21833e12c5d1SDavid du Colombier.CW Buffer ,
21843e12c5d1SDavid du Colombierand made a dramatic improvement in performance,
21853e12c5d1SDavid du Colombierbut made it impossible
21863e12c5d1SDavid du Colombierto handle changes in non-monotonic order in the file; the caching method
21873e12c5d1SDavid du Colombieronly works if changes don't overlap.
21883e12c5d1SDavid du ColombierBefore the cache was added, the transaction could in principle be sorted
21893e12c5d1SDavid du Colombierif the changes were out of order, although
21903e12c5d1SDavid du Colombierthis was never done.
21913e12c5d1SDavid du ColombierThe current status is therefore acceptable performance with a minor
21923e12c5d1SDavid du Colombierrestriction on global changes, which is sometimes, but rarely, an annoyance.
21933e12c5d1SDavid du Colombier.PP
21943e12c5d1SDavid du ColombierThe update algorithm obviously paws the data more than simpler
21953e12c5d1SDavid du Colombieralgorithms, but it is not prohibitively expensive;
21963e12c5d1SDavid du Colombierthe caches help.
21973e12c5d1SDavid du Colombier(The principle of avoiding copying the data is still honored here,
21983e12c5d1SDavid du Colombieralthough not as piously:
21993e12c5d1SDavid du Colombierthe data is moved from contents' cache to
22003e12c5d1SDavid du Colombierthe transcript's all at once and through only one internal buffer.)
22013e12c5d1SDavid du ColombierPerformance figures confirm the efficiency.
22023e12c5d1SDavid du ColombierTo read from a dead start a hundred kilobyte file on a VAX-11/750
22033e12c5d1SDavid du Colombiertakes 1.4 seconds of user time, 2.5 seconds of system time,
22043e12c5d1SDavid du Colombierand 5 seconds of real time.
22053e12c5d1SDavid du ColombierReading the same file in
22063e12c5d1SDavid du Colombier.CW ed
22073e12c5d1SDavid du Colombiertakes 6.0 seconds of user time, 1.7 seconds of system time,
22083e12c5d1SDavid du Colombierand 8 seconds of real time.
22093e12c5d1SDavid du Colombier.CW Sam
22103e12c5d1SDavid du Colombieruses about half the CPU time.
22113e12c5d1SDavid du ColombierA more interesting example is the one stated above:
22123e12c5d1SDavid du Colombierinserting a character between every pair of characters in the file.
22133e12c5d1SDavid du ColombierThe
22143e12c5d1SDavid du Colombier.CW sam
22153e12c5d1SDavid du Colombiercommand is
22163e12c5d1SDavid du Colombier.P1
22173e12c5d1SDavid du Colombier,y/@/ a/x/
22183e12c5d1SDavid du Colombier.P2
22193e12c5d1SDavid du Colombierand takes 3 CPU seconds per kilobyte of input file, of which
22203e12c5d1SDavid du Colombierabout a third is spent in the regular expression code.
22213e12c5d1SDavid du ColombierThis translates to about 500 changes per second.
22223e12c5d1SDavid du Colombier.CW Ed
22233e12c5d1SDavid du Colombiertakes 1.5 seconds per kilobyte to make a similar change (ignoring newlines),
22243e12c5d1SDavid du Colombierbut cannot undo it.
22253e12c5d1SDavid du ColombierThe same example in
22263e12c5d1SDavid du Colombier.CW ex ,\u\s-4\&9\s+4\d
22273e12c5d1SDavid du Colombiera variant of
22283e12c5d1SDavid du Colombier.CW ed
22293e12c5d1SDavid du Colombierdone at the University of California at Berkeley,
22303e12c5d1SDavid du Colombierwhich allows one level of undoing, again takes 3 seconds.
22313e12c5d1SDavid du ColombierIn summary,
22323e12c5d1SDavid du Colombier.CW sam 's
2233219b2ee8SDavid du Colombierperformance is comparable to that of other UNIX editors, although it solves
22343e12c5d1SDavid du Colombiera harder problem.
22353e12c5d1SDavid du Colombier.SH 2
22363e12c5d1SDavid du ColombierCommunications
22373e12c5d1SDavid du Colombier.LP
22383e12c5d1SDavid du ColombierThe discussion so far has described the implementation of the host part of
22393e12c5d1SDavid du Colombier.CW sam ;
22403e12c5d1SDavid du Colombierthe next few sections explain how a machine with mouse and bitmap display
22413e12c5d1SDavid du Colombiercan be engaged to improve interaction.
22423e12c5d1SDavid du Colombier.CW Sam
22433e12c5d1SDavid du Colombieris not the first editor to be written as two processes,\u\s-4\&16\s+4\d
22443e12c5d1SDavid du Colombierbut its implementation
22453e12c5d1SDavid du Colombierhas some unusual aspects.
22463e12c5d1SDavid du Colombier.PP
22473e12c5d1SDavid du ColombierThere are several ways
22483e12c5d1SDavid du Colombier.CW sam 's
22493e12c5d1SDavid du Colombierhost and terminal parts may be connected.
22503e12c5d1SDavid du ColombierThe first and simplest is to forgo the terminal part and use the host
22513e12c5d1SDavid du Colombierpart's command language to edit text on an ordinary terminal.
22523e12c5d1SDavid du ColombierThis mode is invoked by starting
22533e12c5d1SDavid du Colombier.CW sam
22543e12c5d1SDavid du Colombierwith the
22553e12c5d1SDavid du Colombier.CW -d
22563e12c5d1SDavid du Colombieroption.
22573e12c5d1SDavid du ColombierWith no options,
22583e12c5d1SDavid du Colombier.CW sam
22593e12c5d1SDavid du Colombierruns separate host and terminal programs,
22603e12c5d1SDavid du Colombiercommunicating with a message protocol over the physical
22613e12c5d1SDavid du Colombierconnection that joins them.
22623e12c5d1SDavid du ColombierTypically, the connection is an RS-232 link between a Blit
22633e12c5d1SDavid du Colombier(the prototypical display for
22643e12c5d1SDavid du Colombier.CW sam )
22653e12c5d1SDavid du Colombierand a host running
2266219b2ee8SDavid du Colombierthe Ninth Edition of the UNIX operating system.\u\s-4\&8\s+4\d
22673e12c5d1SDavid du Colombier(This is the version of the system used in the Computing Sciences Research
22687dd7cddfSDavid du ColombierCenter at AT&T Bell Laboratories [now Lucent Technologies, Bell Labs], where I work.  Its relevant
22693e12c5d1SDavid du Colombieraspects are discussed in the Blit paper.\u\s-4\&1\s+4\d)
22703e12c5d1SDavid du ColombierThe implementation of
22713e12c5d1SDavid du Colombier.CW sam
2272219b2ee8SDavid du Colombierfor the SUN computer runs both processes on the same machine and
22733e12c5d1SDavid du Colombierconnects them by a pipe.
22743e12c5d1SDavid du Colombier.PP
22753e12c5d1SDavid du ColombierThe low bandwidth of an RS-232 link
22763e12c5d1SDavid du Colombiernecessitated the split between
22773e12c5d1SDavid du Colombierthe two programs.
22783e12c5d1SDavid du ColombierThe division is a mixed blessing:
22793e12c5d1SDavid du Colombiera program in two parts is much harder to write and to debug
22803e12c5d1SDavid du Colombierthan a self-contained one,
22813e12c5d1SDavid du Colombierbut the split makes several unusual configurations possible.
22823e12c5d1SDavid du ColombierThe terminal may be physically separated from the host, allowing the conveniences
22833e12c5d1SDavid du Colombierof a mouse and bitmap display to be taken home while leaving the files at work.
22843e12c5d1SDavid du ColombierIt is also possible to run the host part on a remote machine:
22853e12c5d1SDavid du Colombier.P1
22863e12c5d1SDavid du Colombiersam -r host
22873e12c5d1SDavid du Colombier.P2
22883e12c5d1SDavid du Colombierconnects to the terminal in the usual way, and then makes a call
22893e12c5d1SDavid du Colombieracross the network to establish the host part of
22903e12c5d1SDavid du Colombier.CW sam
22913e12c5d1SDavid du Colombieron the named machine.
22923e12c5d1SDavid du ColombierFinally, it cross-connects the I/O to join the two parts.
22933e12c5d1SDavid du ColombierThis allows
22943e12c5d1SDavid du Colombier.CW sam
22953e12c5d1SDavid du Colombierto be run on machines that do not support bitmap displays;
22963e12c5d1SDavid du Colombierfor example,
22973e12c5d1SDavid du Colombier.CW sam
22983e12c5d1SDavid du Colombieris the editor of choice on our Cray X-MP/24.
22993e12c5d1SDavid du Colombier.CW Sam
23003e12c5d1SDavid du Colombier.CW -r
23013e12c5d1SDavid du Colombierinvolves
23023e12c5d1SDavid du Colombier.I three
23033e12c5d1SDavid du Colombiermachines: the remote host, the terminal, and the local host.
23043e12c5d1SDavid du ColombierThe local host's job is simple but vital: it passes the data
23053e12c5d1SDavid du Colombierbetween the remote host and terminal.
23063e12c5d1SDavid du Colombier.PP
23073e12c5d1SDavid du ColombierThe host and terminal exchange messages asynchronously
23083e12c5d1SDavid du Colombier(rather than, say, as remote procedure calls) but there is no
23093e12c5d1SDavid du Colombiererror detection or correction
23103e12c5d1SDavid du Colombierbecause, whatever the configuration, the connection is reliable.
23113e12c5d1SDavid du ColombierBecause the terminal handles mundane interaction tasks such as
23123e12c5d1SDavid du Colombierpopping up menus and interpreting the responses, the messages are about
23133e12c5d1SDavid du Colombierdata, not actions.
23143e12c5d1SDavid du ColombierFor example, the host knows nothing about what is displayed on the screen,
23153e12c5d1SDavid du Colombierand when the user types a character, the message sent to the host says
23163e12c5d1SDavid du Colombier``insert a one-byte string at location 123 in file 7,'' not ``a character
23173e12c5d1SDavid du Colombierwas typed at the current position in the current file.''
23183e12c5d1SDavid du ColombierIn other words, the messages look very much like the transaction records
23193e12c5d1SDavid du Colombierin the transcripts.
23203e12c5d1SDavid du Colombier.PP
23213e12c5d1SDavid du ColombierEither the host or terminal part of
23223e12c5d1SDavid du Colombier.CW sam
23233e12c5d1SDavid du Colombiermay initiate a change to a file.
23243e12c5d1SDavid du ColombierThe command language operates on the host, while typing and some
23253e12c5d1SDavid du Colombiermouse operations are executed directly in the terminal to optimize response.
23263e12c5d1SDavid du ColombierChanges initiated by the host program must be transmitted to the terminal,
23273e12c5d1SDavid du Colombierand
23283e12c5d1SDavid du Colombiervice versa.
23293e12c5d1SDavid du Colombier(A token is exchanged to determine which end is in control,
23303e12c5d1SDavid du Colombierwhich means that characters typed while a time-consuming command runs
23313e12c5d1SDavid du Colombiermust be buffered and do not appear until the command is complete.)
23323e12c5d1SDavid du ColombierTo maintain consistent information,
23333e12c5d1SDavid du Colombierthe host and terminal track changes through a per-file
23343e12c5d1SDavid du Colombierdata structure that records what portions of the file
23353e12c5d1SDavid du Colombierthe terminal has received.
23363e12c5d1SDavid du ColombierThe data structure, called a
23373e12c5d1SDavid du Colombier.CW Rasp
23383e12c5d1SDavid du Colombier(a weak pun: it's a file with holes)
23393e12c5d1SDavid du Colombieris held and updated by both the host and terminal.
23403e12c5d1SDavid du ColombierA
23413e12c5d1SDavid du Colombier.CW Rasp
23423e12c5d1SDavid du Colombieris a list of
23433e12c5d1SDavid du Colombier.CW Strings
23443e12c5d1SDavid du Colombierholding those parts of the file known to the terminal,
23453e12c5d1SDavid du Colombierseparated by counts of the number of bytes in the interstices.
23463e12c5d1SDavid du ColombierOf course, the host doesn't keep a separate copy of the data (it only needs
23473e12c5d1SDavid du Colombierthe lengths of the various pieces),
23483e12c5d1SDavid du Colombierbut the structure is the same on both ends.
23493e12c5d1SDavid du Colombier.PP
23503e12c5d1SDavid du ColombierThe
23513e12c5d1SDavid du Colombier.CW Rasp
23523e12c5d1SDavid du Colombierin the terminal doubles as a cache.
23533e12c5d1SDavid du ColombierSince the terminal keeps the text for portions of the file it has displayed,
23543e12c5d1SDavid du Colombierit need not request data from the host when revisiting old parts of the file
23553e12c5d1SDavid du Colombieror redrawing obscured windows, which speeds things up considerably
23563e12c5d1SDavid du Colombierover low-speed links.
23573e12c5d1SDavid du Colombier.PP
23583e12c5d1SDavid du ColombierIt's trivial for the terminal to maintain its
23593e12c5d1SDavid du Colombier.CW Rasp ,
23603e12c5d1SDavid du Colombierbecause all changes made on the terminal apply to parts of the file
23613e12c5d1SDavid du Colombieralready loaded there.
23623e12c5d1SDavid du ColombierChanges made by the host are compared against the
23633e12c5d1SDavid du Colombier.CW Rasp
23643e12c5d1SDavid du Colombierduring the update sequence after each command.
23653e12c5d1SDavid du ColombierSmall changes to pieces of the file loaded in the terminal
23663e12c5d1SDavid du Colombierare sent in their entirety.
23673e12c5d1SDavid du ColombierLarger changes, and changes that fall entirely in the holes,
23683e12c5d1SDavid du Colombierare transmitted as messages without literal data:
23693e12c5d1SDavid du Colombieronly the lengths of the deleted and inserted strings are transmitted.
23703e12c5d1SDavid du ColombierWhen a command is completed, the terminal examines its visible
23713e12c5d1SDavid du Colombierwindows to see if any holes in their
23723e12c5d1SDavid du Colombier.CW Rasps
23733e12c5d1SDavid du Colombierintersect the visible portion of the file.
23743e12c5d1SDavid du ColombierIt then requests the missing data from the host,
23753e12c5d1SDavid du Colombieralong with up to 512 bytes of surrounding data, to minimize
23763e12c5d1SDavid du Colombierthe number of messages when visiting a new portion of the file.
23773e12c5d1SDavid du ColombierThis technique provides a kind of two-level lazy evaluation for the terminal.
23783e12c5d1SDavid du ColombierThe first level sends a minimum of information about
23793e12c5d1SDavid du Colombierparts of the file not being edited interactively;
23803e12c5d1SDavid du Colombierthe second level waits until a change is displayed before
23813e12c5d1SDavid du Colombiertransmitting the new data.
23823e12c5d1SDavid du ColombierOf course,
23833e12c5d1SDavid du Colombierperformance is also helped by having the terminal respond immediately to typing
23843e12c5d1SDavid du Colombierand simple mouse requests.
23853e12c5d1SDavid du ColombierExcept for small changes to active pieces of the file, which are
23863e12c5d1SDavid du Colombiertransmitted to the terminal without negotiation,
23873e12c5d1SDavid du Colombierthe terminal is wholly responsible for deciding what is displayed;
23883e12c5d1SDavid du Colombierthe host uses the
23893e12c5d1SDavid du Colombier.CW Rasp
23903e12c5d1SDavid du Colombieronly to tell the terminal what might be relevant.
23913e12c5d1SDavid du Colombier.PP
23923e12c5d1SDavid du ColombierWhen a change is initiated by the host,
23933e12c5d1SDavid du Colombierthe messages to the terminal describing the change
23943e12c5d1SDavid du Colombierare generated by the routine that applies the transcript of the changes
23953e12c5d1SDavid du Colombierto the contents of the
23963e12c5d1SDavid du Colombier.CW File .
23973e12c5d1SDavid du ColombierSince changes are undone by the same update routine,
23983e12c5d1SDavid du Colombierundoing requires
23993e12c5d1SDavid du Colombierno extra code in the communications;
24003e12c5d1SDavid du Colombierthe usual messages describing changes to the file are sufficient
24013e12c5d1SDavid du Colombierto back up the screen image.
24023e12c5d1SDavid du Colombier.PP
24033e12c5d1SDavid du ColombierThe
24043e12c5d1SDavid du Colombier.CW Rasp
24053e12c5d1SDavid du Colombieris a particularly good example of the way caches are used in
24063e12c5d1SDavid du Colombier.CW sam .
24073e12c5d1SDavid du ColombierFirst, it facilitates access to the active portion of the text by placing
24083e12c5d1SDavid du Colombierthe busy text in main memory.
24093e12c5d1SDavid du ColombierIn so doing, it provides efficient access
24103e12c5d1SDavid du Colombierto a large data structure that does not fit in memory.
24113e12c5d1SDavid du ColombierSince the form of data is to be imposed by the user, not by the program,
24123e12c5d1SDavid du Colombierand because characters will frequently be scanned sequentially,
24133e12c5d1SDavid du Colombierfiles are stored as flat objects.
24143e12c5d1SDavid du ColombierCaches help keep performance good and linear when working with such
24153e12c5d1SDavid du Colombierdata.
24163e12c5d1SDavid du Colombier.PP
24173e12c5d1SDavid du ColombierSecond, the
24183e12c5d1SDavid du Colombier.CW Rasp
24193e12c5d1SDavid du Colombierand several of the other caches have some
24203e12c5d1SDavid du Colombier.I read-ahead;
24213e12c5d1SDavid du Colombierthat is, the cache is loaded with more information than is needed for
24223e12c5d1SDavid du Colombierthe job immediately at hand.
24233e12c5d1SDavid du ColombierWhen manipulating linear structures, the accesses are usually sequential,
24243e12c5d1SDavid du Colombierand read-ahead can significantly reduce the average time to access the
24253e12c5d1SDavid du Colombiernext element of the object.
24263e12c5d1SDavid du ColombierSequential access is a common mode for people as well as programs;
24273e12c5d1SDavid du Colombierconsider scrolling through a document while looking for something.
24283e12c5d1SDavid du Colombier.PP
24293e12c5d1SDavid du ColombierFinally, like any good data structure,
24303e12c5d1SDavid du Colombierthe cache guides the algorithm, or at least the implementation.
24313e12c5d1SDavid du ColombierThe
24323e12c5d1SDavid du Colombier.CW Rasp
24333e12c5d1SDavid du Colombierwas actually invented to control the communications between the host and
24343e12c5d1SDavid du Colombierterminal parts, but I realized very early that it was also a form of
24353e12c5d1SDavid du Colombiercache.  Other caches were more explicitly intended to serve a double
24363e12c5d1SDavid du Colombierpurpose: for example, the caches in
24373e12c5d1SDavid du Colombier.CW Files
24383e12c5d1SDavid du Colombierthat coalesce updates not only reduce traffic to the
24393e12c5d1SDavid du Colombiertranscript and contents
24403e12c5d1SDavid du Colombier.CW Buffers ,
24413e12c5d1SDavid du Colombierthey also clump screen updates so that complicated changes to the
24423e12c5d1SDavid du Colombierscreen are achieved in
24433e12c5d1SDavid du Colombierjust a few messages to the terminal.
24443e12c5d1SDavid du ColombierThis saved me considerable work: I did not need to write special
24453e12c5d1SDavid du Colombiercode to optimize the message traffic to the
24463e12c5d1SDavid du Colombierterminal.
24473e12c5d1SDavid du ColombierCaches pay off in surprising ways.
24483e12c5d1SDavid du ColombierAlso, they tend to be independent, so their performance improvements
24493e12c5d1SDavid du Colombierare multiplicative.
24503e12c5d1SDavid du Colombier.SH 2
24513e12c5d1SDavid du ColombierData structures in the terminal
24523e12c5d1SDavid du Colombier.LP
24533e12c5d1SDavid du ColombierThe terminal's job is to display and to maintain a consistent image of
24543e12c5d1SDavid du Colombierpieces of the files being edited.
24553e12c5d1SDavid du ColombierBecause the text is always in memory, the data structures are
24563e12c5d1SDavid du Colombierconsiderably simpler than those in the host part.
24573e12c5d1SDavid du Colombier.PP
24583e12c5d1SDavid du Colombier.CW Sam
24593e12c5d1SDavid du Colombiertypically has far more windows than does
24603e12c5d1SDavid du Colombier.CW mux ,
24613e12c5d1SDavid du Colombierthe window system within which its Blit implementation runs.
24623e12c5d1SDavid du Colombier.CW Mux
24633e12c5d1SDavid du Colombierhas a fairly small number of asynchronously updated windows;
24643e12c5d1SDavid du Colombier.CW sam
24653e12c5d1SDavid du Colombierneeds a large number of synchronously updated windows that are
24663e12c5d1SDavid du Colombierusually static and often fully obscured.
24673e12c5d1SDavid du ColombierThe different tradeoffs guided
24683e12c5d1SDavid du Colombier.CW sam
24693e12c5d1SDavid du Colombieraway from the memory-intensive implementation of windows, called
24703e12c5d1SDavid du Colombier.CW Layers ,\u\s-4\&17\s+4\d
24713e12c5d1SDavid du Colombierused in
24723e12c5d1SDavid du Colombier.CW mux.
24733e12c5d1SDavid du ColombierRather than depending on a complete bitmap image of the display for each window,
24743e12c5d1SDavid du Colombier.CW sam
24753e12c5d1SDavid du Colombierregenerates the image from its in-memory text
24763e12c5d1SDavid du Colombier(stored in the
24773e12c5d1SDavid du Colombier.CW Rasp )
24783e12c5d1SDavid du Colombierwhen necessary, although it will use such an image if it is available.
24793e12c5d1SDavid du ColombierLike
24803e12c5d1SDavid du Colombier.CW Layers ,
24813e12c5d1SDavid du Colombierthough,
24823e12c5d1SDavid du Colombier.CW sam
24833e12c5d1SDavid du Colombieruses the screen bitmap as active storage in which to update the image using
24843e12c5d1SDavid du Colombier.CW bitblt .\u\s-4\&18,19\s+4\d
24853e12c5d1SDavid du ColombierThe resulting organization, pictured in Figure 6,
24863e12c5d1SDavid du Colombierhas a global array of windows, called
24873e12c5d1SDavid du Colombier.CW Flayers ,
24883e12c5d1SDavid du Colombiereach of which holds an image of a piece of text held in a data structure
24893e12c5d1SDavid du Colombiercalled a
24903e12c5d1SDavid du Colombier.CW Frame ,
24913e12c5d1SDavid du Colombierwhich in turn represents
24923e12c5d1SDavid du Colombiera rectangular window full of text displayed in some
24933e12c5d1SDavid du Colombier.CW Bitmap .
24943e12c5d1SDavid du ColombierEach
24953e12c5d1SDavid du Colombier.CW Flayer
24963e12c5d1SDavid du Colombierappears in a global list that orders them all front-to-back
24973e12c5d1SDavid du Colombieron the display, and simultaneously as an element of a per-file array
24983e12c5d1SDavid du Colombierthat holds all the open windows for that file.
24993e12c5d1SDavid du ColombierThe complement in the terminal of the
25003e12c5d1SDavid du Colombier.CW File
25013e12c5d1SDavid du Colombieron the host is called a
25023e12c5d1SDavid du Colombier.CW Text ;
25033e12c5d1SDavid du Colombiereach connects its
25043e12c5d1SDavid du Colombier.CW Flayers
25053e12c5d1SDavid du Colombierto the associated
25063e12c5d1SDavid du Colombier.CW Rasp .
25073e12c5d1SDavid du Colombier.KF
25083e12c5d1SDavid du Colombier.PS
25093e12c5d1SDavid du Colombiercopy "fig6.pic"
25103e12c5d1SDavid du Colombier.PE
25113e12c5d1SDavid du Colombier.Cs
25123e12c5d1SDavid du ColombierFigure 6. Data structures in the terminal.
25133e12c5d1SDavid du Colombier.CW Flayers
25143e12c5d1SDavid du Colombierare also linked together into a front-to-back list.
25153e12c5d1SDavid du Colombier.CW Boxes
25163e12c5d1SDavid du Colombierare discussed in the next section.
25173e12c5d1SDavid du Colombier.Ce
25183e12c5d1SDavid du Colombier.KE
25193e12c5d1SDavid du Colombier.PP
25203e12c5d1SDavid du ColombierThe
25213e12c5d1SDavid du Colombier.CW Bitmap
25223e12c5d1SDavid du Colombierfor a
25233e12c5d1SDavid du Colombier.CW Frame
25243e12c5d1SDavid du Colombiercontains the image of the text.
25253e12c5d1SDavid du ColombierFor a fully visible window, the
25263e12c5d1SDavid du Colombier.CW Bitmap
25273e12c5d1SDavid du Colombierwill be the screen (or at least the
25283e12c5d1SDavid du Colombier.CW Layer
25293e12c5d1SDavid du Colombierin which
25303e12c5d1SDavid du Colombier.CW sam
25313e12c5d1SDavid du Colombieris being run),
25323e12c5d1SDavid du Colombierwhile for partially obscured windows the
25333e12c5d1SDavid du Colombier.CW Bitmap
25343e12c5d1SDavid du Colombierwill be off-screen.
25353e12c5d1SDavid du ColombierIf the window is fully obscured, the
25363e12c5d1SDavid du Colombier.CW Bitmap
25373e12c5d1SDavid du Colombierwill be null.
25383e12c5d1SDavid du Colombier.PP
25393e12c5d1SDavid du ColombierThe
25403e12c5d1SDavid du Colombier.CW Bitmap
25413e12c5d1SDavid du Colombieris a kind of cache.
25423e12c5d1SDavid du ColombierWhen making changes to the display, most of the original image will
25433e12c5d1SDavid du Colombierlook the same in the final image, and the update algorithms exploit this.
25443e12c5d1SDavid du ColombierThe
25453e12c5d1SDavid du Colombier.CW Frame
25463e12c5d1SDavid du Colombiersoftware updates the image in the
25473e12c5d1SDavid du Colombier.CW Bitmap
25483e12c5d1SDavid du Colombierincrementally; the
25493e12c5d1SDavid du Colombier.CW Bitmap
25503e12c5d1SDavid du Colombieris not just an image, it is a data structure.\u\s-4\&18,19\s+4\d
25513e12c5d1SDavid du ColombierThe job of the software that updates the display is therefore
25523e12c5d1SDavid du Colombierto use as much as possible of the existing image (converting the
25533e12c5d1SDavid du Colombiertext from ASCII characters to pixels is expensive) in a sort of two-dimensional
25543e12c5d1SDavid du Colombierstring insertion algorithm.
25553e12c5d1SDavid du ColombierThe details of this process are described in the next section.
25563e12c5d1SDavid du Colombier.PP
25573e12c5d1SDavid du ColombierThe
25583e12c5d1SDavid du Colombier.CW Frame
25593e12c5d1SDavid du Colombiersoftware has no code to support overlapping windows;
25603e12c5d1SDavid du Colombierits job is to keep a single
25613e12c5d1SDavid du Colombier.CW Bitmap
25623e12c5d1SDavid du Colombierup to date.
25633e12c5d1SDavid du ColombierIt falls to the
25643e12c5d1SDavid du Colombier.CW Flayer
25653e12c5d1SDavid du Colombiersoftware to multiplex the various
25663e12c5d1SDavid du Colombier.CW Bitmaps
25673e12c5d1SDavid du Colombieronto the screen.
25683e12c5d1SDavid du ColombierThe problem of maintaining overlapping
25693e12c5d1SDavid du Colombier.CW Flayers
25703e12c5d1SDavid du Colombieris easier than for
25713e12c5d1SDavid du Colombier.CW Layers \u\s-4\&17\s+4\d
25723e12c5d1SDavid du Colombierbecause changes are made synchronously and because the contents of the window
25733e12c5d1SDavid du Colombiercan be reconstructed from the data stored in the
25743e12c5d1SDavid du Colombier.CW Frame ;
25753e12c5d1SDavid du Colombierthe
25763e12c5d1SDavid du Colombier.CW Layers
25773e12c5d1SDavid du Colombiersoftware
25783e12c5d1SDavid du Colombiermakes no such assumptions.
25793e12c5d1SDavid du ColombierIn
25803e12c5d1SDavid du Colombier.CW sam ,
25813e12c5d1SDavid du Colombierthe window being changed is almost always fully visible, because the current
25823e12c5d1SDavid du Colombierwindow is always fully visible, by construction.
25833e12c5d1SDavid du ColombierHowever, when multi-file changes are being made, or when
25843e12c5d1SDavid du Colombiermore than one window is open on a file,
25853e12c5d1SDavid du Colombierit may be necessary to update partially obscured windows.
25863e12c5d1SDavid du Colombier.PP
25873e12c5d1SDavid du ColombierThere are three cases: the window is
25883e12c5d1SDavid du Colombierfully visible, invisible (fully obscured), or partially visible.
25893e12c5d1SDavid du ColombierIf fully visible, the
25903e12c5d1SDavid du Colombier.CW Bitmap
25913e12c5d1SDavid du Colombieris part of the screen, so when the
25923e12c5d1SDavid du Colombier.CW Flayer
25933e12c5d1SDavid du Colombierupdate routine calls the
25943e12c5d1SDavid du Colombier.CW Frame
25953e12c5d1SDavid du Colombierupdate routine, the screen will be updated directly.
25963e12c5d1SDavid du ColombierIf the window is invisible,
25973e12c5d1SDavid du Colombierthere is no associated
25983e12c5d1SDavid du Colombier.CW Bitmap ,
25993e12c5d1SDavid du Colombierand all that is necessary is to update the
26003e12c5d1SDavid du Colombier.CW Frame
26013e12c5d1SDavid du Colombierdata structure, not the image.
26023e12c5d1SDavid du ColombierIf the window is partially visible, the
26033e12c5d1SDavid du Colombier.CW Frame
26043e12c5d1SDavid du Colombierroutine is called to update the image in the off-screen
26053e12c5d1SDavid du Colombier.CW Bitmap ,
26063e12c5d1SDavid du Colombierwhich may require regenerating it from the text of the window.
26073e12c5d1SDavid du ColombierThe
26083e12c5d1SDavid du Colombier.CW Flayer
26093e12c5d1SDavid du Colombiercode then clips this
26103e12c5d1SDavid du Colombier.CW Bitmap
26113e12c5d1SDavid du Colombieragainst the
26123e12c5d1SDavid du Colombier.CW Bitmaps
26133e12c5d1SDavid du Colombierof all
26143e12c5d1SDavid du Colombier.CW Frames
26153e12c5d1SDavid du Colombierin front of the
26163e12c5d1SDavid du Colombier.CW Frame
26173e12c5d1SDavid du Colombierbeing modified, and the remainder is copied to the display.
26183e12c5d1SDavid du Colombier.PP
26193e12c5d1SDavid du ColombierThis is much faster than recreating the image off-screen
26203e12c5d1SDavid du Colombierfor every change, or clipping all the changes made to the image
26213e12c5d1SDavid du Colombierduring its update.
26223e12c5d1SDavid du ColombierUnfortunately, these caches can also consume prohibitive amounts of
26233e12c5d1SDavid du Colombiermemory, so they are freed fairly liberally \(em after every change to the
26243e12c5d1SDavid du Colombierfront-to-back order of the
26253e12c5d1SDavid du Colombier.CW Flayers .
26263e12c5d1SDavid du ColombierThe result is that
26273e12c5d1SDavid du Colombierthe off-screen
26283e12c5d1SDavid du Colombier.CW Bitmaps
26293e12c5d1SDavid du Colombierexist only while multi-window changes are occurring,
26303e12c5d1SDavid du Colombierwhich is the only time the performance improvement they provide is needed.
26313e12c5d1SDavid du ColombierAlso, the user interface causes fully-obscured windows to be the
26323e12c5d1SDavid du Colombiereasiest to make \(em
26333e12c5d1SDavid du Colombiercreating a canonically sized and placed window requires only a button click
26343e12c5d1SDavid du Colombier\(em which reduces the need for caching still further.
26353e12c5d1SDavid du Colombier.PP
26363e12c5d1SDavid du Colombier.SH 2
26373e12c5d1SDavid du ColombierScreen update
26383e12c5d1SDavid du Colombier.LP
26393e12c5d1SDavid du ColombierOnly two low-level primitives are needed for incremental update:
26403e12c5d1SDavid du Colombier.CW bitblt ,
26413e12c5d1SDavid du Colombierwhich copies rectangles of pixels, and
26423e12c5d1SDavid du Colombier.CW string
26433e12c5d1SDavid du Colombier(which in turn calls
26443e12c5d1SDavid du Colombier.CW bitblt ),
26453e12c5d1SDavid du Colombierwhich draws a null-terminated character string in a
26463e12c5d1SDavid du Colombier.CW Bitmap .
26473e12c5d1SDavid du ColombierA
26483e12c5d1SDavid du Colombier.CW Frame
26493e12c5d1SDavid du Colombiercontains a list of
26503e12c5d1SDavid du Colombier.CW Boxes ,
26513e12c5d1SDavid du Colombiereach of which defines a horizontal strip of text in the window
26523e12c5d1SDavid du Colombier(see Figure 7).
26533e12c5d1SDavid du ColombierA
26543e12c5d1SDavid du Colombier.CW Box
26553e12c5d1SDavid du Colombierhas a character string
26563e12c5d1SDavid du Colombier.CW str ,
26573e12c5d1SDavid du Colombierand a
26583e12c5d1SDavid du Colombier.CW Rectangle
26593e12c5d1SDavid du Colombier.CW rect
26603e12c5d1SDavid du Colombierthat defines the location of the strip in the window.
26613e12c5d1SDavid du Colombier(The text in
26623e12c5d1SDavid du Colombier.CW str
26633e12c5d1SDavid du Colombieris stored in the
26643e12c5d1SDavid du Colombier.CW Box
26653e12c5d1SDavid du Colombierseparately from the
26663e12c5d1SDavid du Colombier.CW Rasp
26673e12c5d1SDavid du Colombierassociated with the window's file, so
26683e12c5d1SDavid du Colombier.CW Boxes
26693e12c5d1SDavid du Colombierare self-contained.)
26703e12c5d1SDavid du ColombierThe invariant is that
26713e12c5d1SDavid du Colombierthe image of the
26723e12c5d1SDavid du Colombier.CW Box
26733e12c5d1SDavid du Colombiercan be reproduced by calling
26743e12c5d1SDavid du Colombier.CW string
26753e12c5d1SDavid du Colombierwith argument
26763e12c5d1SDavid du Colombier.CW str
26773e12c5d1SDavid du Colombierto draw the string in
26783e12c5d1SDavid du Colombier.CW rect ,
26793e12c5d1SDavid du Colombierand the resulting picture fits perfectly within
26803e12c5d1SDavid du Colombier.CW rect .
26813e12c5d1SDavid du ColombierIn other words, the
26823e12c5d1SDavid du Colombier.CW Boxes
26833e12c5d1SDavid du Colombierdefine the tiling of the window.
26843e12c5d1SDavid du ColombierThe tiling may be complicated by long lines of text, which
26853e12c5d1SDavid du Colombierare folded onto the next line.
26863e12c5d1SDavid du ColombierSome editors use horizontal scrolling to avoid this complication,
26873e12c5d1SDavid du Colombierbut to be comfortable this technique requires that lines not be
26883e12c5d1SDavid du Colombier.I too
26893e12c5d1SDavid du Colombierlong;
26903e12c5d1SDavid du Colombier.CW sam
26913e12c5d1SDavid du Colombierhas no such restriction.
2692219b2ee8SDavid du ColombierAlso, and perhaps more importantly, UNIX programs and terminals traditionally fold
26933e12c5d1SDavid du Colombierlong lines to make their contents fully visible.
26943e12c5d1SDavid du Colombier.PP
26953e12c5d1SDavid du ColombierTwo special kinds of
26963e12c5d1SDavid du Colombier.CW Boxes
26973e12c5d1SDavid du Colombiercontain a single
26983e12c5d1SDavid du Colombiercharacter: either a newline or a tab.
26993e12c5d1SDavid du ColombierNewlines and tabs are white space.
27003e12c5d1SDavid du ColombierA newline
27013e12c5d1SDavid du Colombier.CW Box
27023e12c5d1SDavid du Colombieralways extends to the right edge of the window,
27033e12c5d1SDavid du Colombierforcing the following
27043e12c5d1SDavid du Colombier.CW Box
27053e12c5d1SDavid du Colombierto the next line.
27063e12c5d1SDavid du ColombierThe width of a tab depends on where it is located:
27073e12c5d1SDavid du Colombierit forces the next
27083e12c5d1SDavid du Colombier.CW Box
27093e12c5d1SDavid du Colombierto begin at a tab location.
27103e12c5d1SDavid du ColombierTabs also
27113e12c5d1SDavid du Colombierhave a minimum width equivalent to a blank (blanks are
27123e12c5d1SDavid du Colombierdrawn by
27133e12c5d1SDavid du Colombier.CW string
27143e12c5d1SDavid du Colombierand are not treated specially); newlines have a minimum width of zero.
27153e12c5d1SDavid du Colombier.KF
27163e12c5d1SDavid du Colombier.PS
27173e12c5d1SDavid du Colombiercopy "fig7.pic"
27183e12c5d1SDavid du Colombier.PE
27193e12c5d1SDavid du Colombier.sp .5
27203e12c5d1SDavid du Colombier.Cs
27213e12c5d1SDavid du ColombierFigure 7. A line of text showing its
27223e12c5d1SDavid du Colombier.CW Boxes .
27233e12c5d1SDavid du ColombierThe first two blank
27243e12c5d1SDavid du Colombier.CW Boxes
27253e12c5d1SDavid du Colombiercontain tabs; the last contains a newline.
27263e12c5d1SDavid du ColombierSpaces are handled as ordinary characters.
27273e12c5d1SDavid du Colombier.Ce
27283e12c5d1SDavid du Colombier.KE
27293e12c5d1SDavid du Colombier.PP
27303e12c5d1SDavid du ColombierThe update algorithms always use the
27313e12c5d1SDavid du Colombier.CW Bitmap
27323e12c5d1SDavid du Colombierimage of the text (either the display or cache
27333e12c5d1SDavid du Colombier.CW Bitmap );
27343e12c5d1SDavid du Colombierthey never examine the characters within a
27353e12c5d1SDavid du Colombier.CW Box
27363e12c5d1SDavid du Colombierexcept when the
27373e12c5d1SDavid du Colombier.CW Box
27383e12c5d1SDavid du Colombierneeds to be split in two.
27393e12c5d1SDavid du ColombierBefore a change, the window consists of a tiling of
27403e12c5d1SDavid du Colombier.CW Boxes ;
27413e12c5d1SDavid du Colombierafter the change the window is tiled differently.
27423e12c5d1SDavid du ColombierThe update algorithms rearrange the tiles in place, without
27433e12c5d1SDavid du Colombierbackup storage.
27443e12c5d1SDavid du ColombierThe algorithms are not strictly optimal \(em for example, they can
27453e12c5d1SDavid du Colombierclear a pixel that is later going to be written upon \(em
27463e12c5d1SDavid du Colombierbut they never move a tile that doesn't need to be moved,
27473e12c5d1SDavid du Colombierand they move each tile at most once.
27483e12c5d1SDavid du Colombier.CW Frinsert
27493e12c5d1SDavid du Colombieron a Blit can absorb over a thousand characters a second if the strings
27503e12c5d1SDavid du Colombierbeing inserted are a few tens of characters long.
27513e12c5d1SDavid du Colombier.PP
27523e12c5d1SDavid du ColombierConsider
27533e12c5d1SDavid du Colombier.CW frdelete .
27543e12c5d1SDavid du ColombierIts job is to delete a substring from a
27553e12c5d1SDavid du Colombier.CW Frame
27563e12c5d1SDavid du Colombierand restore the image of the
27573e12c5d1SDavid du Colombier.CW Frame .
27583e12c5d1SDavid du ColombierThe image of a substring has a peculiar shape (see Figure 2) comprising
27593e12c5d1SDavid du Colombierpossibly a partial line,
27603e12c5d1SDavid du Colombierzero or more full lines,
27613e12c5d1SDavid du Colombierand possibly a final partial line.
27623e12c5d1SDavid du ColombierFor reference, call this the
27633e12c5d1SDavid du Colombier.I
27643e12c5d1SDavid du ColombierZ-shape.
27653e12c5d1SDavid du Colombier.R
27663e12c5d1SDavid du Colombier.CW Frdelete
27673e12c5d1SDavid du Colombierbegins by splitting, if necessary, the
27683e12c5d1SDavid du Colombier.CW Boxes
27693e12c5d1SDavid du Colombiercontaining the ends of
27703e12c5d1SDavid du Colombierthe substring so the substring begins and ends on
27713e12c5d1SDavid du Colombier.CW Box
27723e12c5d1SDavid du Colombierboundaries.
27733e12c5d1SDavid du ColombierBecause the substring is being deleted, its image is not needed,
27743e12c5d1SDavid du Colombierso the Z-shape is then cleared.
27753e12c5d1SDavid du ColombierThen, tiles (that is, the images of
27763e12c5d1SDavid du Colombier.CW Boxes )
27773e12c5d1SDavid du Colombierare copied, using
27783e12c5d1SDavid du Colombier.CW bitblt ,
27793e12c5d1SDavid du Colombierfrom immediately after the Z-shape to
27803e12c5d1SDavid du Colombierthe beginning of the Z-shape,
27813e12c5d1SDavid du Colombierresulting in a new Z-shape.
27823e12c5d1SDavid du Colombier.CW Boxes "" (
27833e12c5d1SDavid du Colombierwhose contents would span two lines in the new position must first be split.)
27843e12c5d1SDavid du Colombier.PP
27853e12c5d1SDavid du ColombierCopying the remainder of the
27863e12c5d1SDavid du Colombier.CW Frame
27873e12c5d1SDavid du Colombiertile by tile
27883e12c5d1SDavid du Colombierthis way will clearly accomplish the deletion but eventually,
27893e12c5d1SDavid du Colombiertypically when the copying algorithm encounters a tab or newline,
27903e12c5d1SDavid du Colombierthe old and new
27913e12c5d1SDavid du Colombier.CW x
27923e12c5d1SDavid du Colombiercoordinates of the tile
27933e12c5d1SDavid du Colombierto be copied are the same.
27943e12c5d1SDavid du ColombierThis correspondence implies
27953e12c5d1SDavid du Colombierthat the Z-shape has its beginning and ending edges aligned
27963e12c5d1SDavid du Colombiervertically, and a sequence of at most two
27973e12c5d1SDavid du Colombier.CW bitblts
27983e12c5d1SDavid du Colombiercan be used to copy the remaining tiles.
27993e12c5d1SDavid du ColombierThe last step is to clear out the resulting empty space at the bottom
28003e12c5d1SDavid du Colombierof the window;
28013e12c5d1SDavid du Colombierthe number of lines to be cleared is the number of complete lines in the
28023e12c5d1SDavid du ColombierZ-shape closed by the final
28033e12c5d1SDavid du Colombier.CW bitblts.
28043e12c5d1SDavid du ColombierThe final step is to merge horizontally adjacent
28053e12c5d1SDavid du Colombier.CW Boxes
28063e12c5d1SDavid du Colombierof plain text.
28073e12c5d1SDavid du ColombierThe complete source to
28083e12c5d1SDavid du Colombier.CW frdelete
28093e12c5d1SDavid du Colombieris less than 100 lines of C.
28103e12c5d1SDavid du Colombier.PP
28113e12c5d1SDavid du Colombier.CW frinsert
28123e12c5d1SDavid du Colombieris more complicated because it must do four passes:
28133e12c5d1SDavid du Colombierone to construct the
28143e12c5d1SDavid du Colombier.CW Box
28153e12c5d1SDavid du Colombierlist for the inserted string,
28163e12c5d1SDavid du Colombierone to reconnoitre,
28173e12c5d1SDavid du Colombierone to copy (in opposite order to
28183e12c5d1SDavid du Colombier.CW frdelete )
28193e12c5d1SDavid du Colombierthe
28203e12c5d1SDavid du Colombier.CW Boxes
28213e12c5d1SDavid du Colombierto make the hole for the new text,
28223e12c5d1SDavid du Colombierand finally one to copy the new text into place.
28233e12c5d1SDavid du ColombierOverall, though,
28243e12c5d1SDavid du Colombier.CW frinsert
28253e12c5d1SDavid du Colombierhas a similar flavor to
28263e12c5d1SDavid du Colombier.CW frdelete ,
28273e12c5d1SDavid du Colombierand needn't be described further.
28283e12c5d1SDavid du Colombier.CW Frinsert
28293e12c5d1SDavid du Colombierand its subsidiary routines comprise 211 lines of C.
28303e12c5d1SDavid du Colombier.PP
28313e12c5d1SDavid du ColombierThe terminal source code is 3024 lines of C,
28323e12c5d1SDavid du Colombierand the host source is 5797 lines.
28333e12c5d1SDavid du Colombier.SH
28343e12c5d1SDavid du ColombierDiscussion
28353e12c5d1SDavid du Colombier.SH 2
28363e12c5d1SDavid du ColombierHistory
28373e12c5d1SDavid du Colombier.LP
28383e12c5d1SDavid du ColombierThe immediate ancestor of
28393e12c5d1SDavid du Colombier.CW sam
28403e12c5d1SDavid du Colombierwas the original text editor for the Blit, called
28413e12c5d1SDavid du Colombier.CW jim .
28423e12c5d1SDavid du Colombier.CW Sam
28433e12c5d1SDavid du Colombierinherited
28443e12c5d1SDavid du Colombier.CW jim 's
28453e12c5d1SDavid du Colombiertwo-process structure and mouse language almost unchanged, but
28463e12c5d1SDavid du Colombier.CW jim
28473e12c5d1SDavid du Colombiersuffered from several drawbacks that were addressed in the design of
28483e12c5d1SDavid du Colombier.CW sam .
28493e12c5d1SDavid du ColombierThe most important of these was the lack of a command language.
28503e12c5d1SDavid du ColombierAlthough
28513e12c5d1SDavid du Colombier.CW jim
28523e12c5d1SDavid du Colombierwas easy to use for simple editing, it provided no direct help with
28533e12c5d1SDavid du Colombierlarge or repetitive editing tasks.  Instead, it provided a command to pass
28543e12c5d1SDavid du Colombierselected text through a shell pipeline,
28553e12c5d1SDavid du Colombierbut this was no more satisfactory than could be expected of a stopgap measure.
28563e12c5d1SDavid du Colombier.PP
28573e12c5d1SDavid du Colombier.CW Jim
28583e12c5d1SDavid du Colombierwas written primarily as a vehicle for experimenting with a mouse-based
28593e12c5d1SDavid du Colombierinterface to text, and the experiment was successful.
28603e12c5d1SDavid du Colombier.CW Jim
28613e12c5d1SDavid du Colombierhad some spin-offs:
28623e12c5d1SDavid du Colombier.CW mux ,
28633e12c5d1SDavid du Colombierthe second window system for the Blit, is essentially a multiplexed
28643e12c5d1SDavid du Colombierversion of the terminal part of
28653e12c5d1SDavid du Colombier.CW jim ;
28663e12c5d1SDavid du Colombierand the debugger
28673e12c5d1SDavid du Colombier.CW pi 's
28683e12c5d1SDavid du Colombieruser interface\u\s-4\&20\s+4\d was closely modeled on
28693e12c5d1SDavid du Colombier.CW jim 's.
28703e12c5d1SDavid du ColombierBut after a couple of years,
28713e12c5d1SDavid du Colombier.CW jim
28723e12c5d1SDavid du Colombierhad become difficult to maintain and limiting to use,
28733e12c5d1SDavid du Colombierand its replacement was overdue.
28743e12c5d1SDavid du Colombier.PP
28753e12c5d1SDavid du ColombierI began the design of
28763e12c5d1SDavid du Colombier.CW sam
28773e12c5d1SDavid du Colombierby asking
28783e12c5d1SDavid du Colombier.CW jim
28793e12c5d1SDavid du Colombiercustomers what they wanted.
28803e12c5d1SDavid du ColombierThis was probably a mistake; the answers were essentially a list of features
28813e12c5d1SDavid du Colombierto be found in other editors, which did not provide any of the
28823e12c5d1SDavid du Colombierguiding principles I was seeking.
28833e12c5d1SDavid du ColombierFor instance, one common request was for a ``global substitute,''
28843e12c5d1SDavid du Colombierbut no one suggested how to provide it within a cut-and-paste editor.
28853e12c5d1SDavid du ColombierI was looking for a scheme that would
28863e12c5d1SDavid du Colombiersupport such specialized features comfortably in the context of some
28873e12c5d1SDavid du Colombiergeneral command language.
28883e12c5d1SDavid du ColombierIdeas were not forthcoming, though, particularly given my insistence
28893e12c5d1SDavid du Colombieron removing all limits on file sizes, line lengths and so on.
28903e12c5d1SDavid du ColombierEven worse, I recognized that, since the mouse could easily
28913e12c5d1SDavid du Colombierindicate a region of the screen that was not an integral number of lines,
28923e12c5d1SDavid du Colombierthe command language would best forget about newlines altogether,
28933e12c5d1SDavid du Colombierand that meant the command language had to treat the file as a single
28943e12c5d1SDavid du Colombierstring, not an array of lines.
28953e12c5d1SDavid du Colombier.PP
28963e12c5d1SDavid du ColombierEventually, I decided that thinking was not getting me very far and it was
28973e12c5d1SDavid du Colombiertime to try building.
28983e12c5d1SDavid du ColombierI knew that the terminal part could be built easily \(em
28993e12c5d1SDavid du Colombierthat part of
29003e12c5d1SDavid du Colombier.CW jim
29013e12c5d1SDavid du Colombierbehaved acceptably well \(em and that most of the hard work was going
29023e12c5d1SDavid du Colombierto be in the host part: the file interface, command interpreter and so on.
29033e12c5d1SDavid du ColombierMoreover, I had some ideas about how the architecture of
29043e12c5d1SDavid du Colombier.CW jim
29053e12c5d1SDavid du Colombiercould be improved without destroying its basic structure, which I liked
29063e12c5d1SDavid du Colombierin principle but which hadn't worked out as well as I had hoped.
29073e12c5d1SDavid du ColombierSo I began by designing the file data structure,
29083e12c5d1SDavid du Colombierstarting with the way
29093e12c5d1SDavid du Colombier.CW jim
29103e12c5d1SDavid du Colombierworked \(em comparable to a single structure merging
29113e12c5d1SDavid du Colombier.CW Disc
29123e12c5d1SDavid du Colombierand
29133e12c5d1SDavid du Colombier.CW Buffer ,
29143e12c5d1SDavid du Colombierwhich I split to make the cache more general
29153e12c5d1SDavid du Colombier\(em and thinking about how global substitute could be implemented.
29163e12c5d1SDavid du ColombierThe answer was clearly that it had to be done in two passes,
29173e12c5d1SDavid du Colombierand the transcript-oriented implementation fell out naturally.
29183e12c5d1SDavid du Colombier.PP
29193e12c5d1SDavid du Colombier.CW Sam
29203e12c5d1SDavid du Colombierwas written bottom-up,
29213e12c5d1SDavid du Colombierstarting from the data structures and algorithms for manipulating text,
29223e12c5d1SDavid du Colombierthrough the command language and up to the code for maintaining
29233e12c5d1SDavid du Colombierthe display.
29243e12c5d1SDavid du ColombierIn retrospect, it turned out well, but this implementation method is
29253e12c5d1SDavid du Colombiernot recommended in general.
29263e12c5d1SDavid du ColombierThere were several times when I had a large body of interesting code
29273e12c5d1SDavid du Colombierassembled and no clue how to proceed with it.
29283e12c5d1SDavid du ColombierThe command language, in particular, took almost a year to figure out,
29293e12c5d1SDavid du Colombierbut can be implemented (given what was there at the beginning of that year)
29303e12c5d1SDavid du Colombierin a day or two.  Similarly, inventing the
29313e12c5d1SDavid du Colombier.CW Rasp
29323e12c5d1SDavid du Colombierdata structure delayed the
29333e12c5d1SDavid du Colombierconnection of the host and terminal pieces by another few months.
29343e12c5d1SDavid du Colombier.CW Sam
29353e12c5d1SDavid du Colombiertook about two years to write, although only about four months were
29363e12c5d1SDavid du Colombierspent actually working on it.
29373e12c5d1SDavid du Colombier.PP
29383e12c5d1SDavid du ColombierPart of the design process was unusual:
29393e12c5d1SDavid du Colombierthe subset of the protocol that maintains the
29403e12c5d1SDavid du Colombier.CW Rasp
29413e12c5d1SDavid du Colombierwas simulated, debugged
29423e12c5d1SDavid du Colombierand verified by an automatic protocol analyzer,\u\s-4\&21\s+4\d and was bug-free
29433e12c5d1SDavid du Colombierfrom the start.
29443e12c5d1SDavid du ColombierThe rest of the protocol, concerned mostly
29453e12c5d1SDavid du Colombierwith keeping menus up to date,
29463e12c5d1SDavid du Colombierwas unfortunately too unwieldy for such analysis,
29473e12c5d1SDavid du Colombierand was debugged by more traditional methods, primarily
29483e12c5d1SDavid du Colombierby logging in a file all messages in and out of the host.
29493e12c5d1SDavid du Colombier.SH 2
29503e12c5d1SDavid du ColombierReflections
29513e12c5d1SDavid du Colombier.LP
29523e12c5d1SDavid du Colombier.CW Sam
29533e12c5d1SDavid du Colombieris essentially the only interactive editor used by the sixty or so members of
29543e12c5d1SDavid du Colombierthe computing science research center in which I work.
29553e12c5d1SDavid du ColombierThe same could not be said of
29563e12c5d1SDavid du Colombier.CW jim ;
29573e12c5d1SDavid du Colombierthe lack of a command language kept some people from adopting it.
29583e12c5d1SDavid du ColombierThe union of a user interface as comfortable as
29593e12c5d1SDavid du Colombier.CW jim 's
29603e12c5d1SDavid du Colombierwith a command language as powerful as
2961219b2ee8SDavid du Colombier.CW ed 's†
29623e12c5d1SDavid du Colombier.FS
29633e12c5d1SDavid du Colombier.vs 9
2964219b2ee8SDavid du Colombier†The people who criticize
29653e12c5d1SDavid du Colombier.CW ed
29663e12c5d1SDavid du Colombieras an interactive program often forget that it and its close relative
29673e12c5d1SDavid du Colombier.CW sed \u\s-4\&7\s+4\d
29683e12c5d1SDavid du Colombierstill thrive as programmable editors.  The strength of these programs is
29693e12c5d1SDavid du Colombierindependent of their convenience for interactive editing.
29703e12c5d1SDavid du Colombier.br
29713e12c5d1SDavid du Colombier.vs
29723e12c5d1SDavid du Colombier.FE
29733e12c5d1SDavid du Colombieris essential to
29743e12c5d1SDavid du Colombier.CW sam 's
29753e12c5d1SDavid du Colombiersuccess.
29763e12c5d1SDavid du ColombierWhen
29773e12c5d1SDavid du Colombier.CW sam
29783e12c5d1SDavid du Colombierwas first made available to the
29793e12c5d1SDavid du Colombier.CW jim
29803e12c5d1SDavid du Colombiercommunity,
29813e12c5d1SDavid du Colombieralmost everyone switched to it within two or three days.
29823e12c5d1SDavid du ColombierIn the months that followed, even people who had never adopted
29833e12c5d1SDavid du Colombier.CW jim
29843e12c5d1SDavid du Colombierstarted using
29853e12c5d1SDavid du Colombier.CW sam
29863e12c5d1SDavid du Colombierexclusively.
29873e12c5d1SDavid du Colombier.PP
29883e12c5d1SDavid du ColombierTo be honest,
29893e12c5d1SDavid du Colombier.CW ed
29903e12c5d1SDavid du Colombierstill gets occasional use, but usually when
29913e12c5d1SDavid du Colombiersomething quick needs to be done and the overhead of
29923e12c5d1SDavid du Colombierdownloading the terminal part of
29933e12c5d1SDavid du Colombier.CW sam
29943e12c5d1SDavid du Colombierisn't worth the trouble.
29953e12c5d1SDavid du ColombierAlso, as a `line' editor,
29963e12c5d1SDavid du Colombier.CW sam
29973e12c5d1SDavid du Colombier.CW -d
29983e12c5d1SDavid du Colombieris a bit odd;
29993e12c5d1SDavid du Colombierwhen using a good old ASCII terminal, it's comforting to have
30003e12c5d1SDavid du Colombiera true line editor.
30013e12c5d1SDavid du ColombierBut it is fair to say that
30023e12c5d1SDavid du Colombier.CW sam 's
30033e12c5d1SDavid du Colombiercommand language has displaced
30043e12c5d1SDavid du Colombier.CW ed 's
30053e12c5d1SDavid du Colombierfor most of the complicated editing that has kept line editors
30063e12c5d1SDavid du Colombier(that is, command-driven editors) with us.
30073e12c5d1SDavid du Colombier.PP
30083e12c5d1SDavid du Colombier.CW Sam 's
30093e12c5d1SDavid du Colombiercommand language is even fancier than
30103e12c5d1SDavid du Colombier.CW ed 's,
30113e12c5d1SDavid du Colombierand most
30123e12c5d1SDavid du Colombier.CW sam
30133e12c5d1SDavid du Colombiercustomers don't come near to using all its capabilities.
30143e12c5d1SDavid du ColombierDoes it need to be so sophisticated?
30153e12c5d1SDavid du ColombierI think the answer is yes, for two reasons.
30163e12c5d1SDavid du Colombier.PP
30173e12c5d1SDavid du ColombierFirst, the
30183e12c5d1SDavid du Colombier.I model
30193e12c5d1SDavid du Colombierfor
30203e12c5d1SDavid du Colombier.CW sam 's
30213e12c5d1SDavid du Colombiercommand language is really relatively simple, and certainly simpler than that of
30223e12c5d1SDavid du Colombier.CW ed .
30233e12c5d1SDavid du ColombierFor instance, there is only one kind of textual loop in
30243e12c5d1SDavid du Colombier.CW sam
30253e12c5d1SDavid du Colombier\(em the
30263e12c5d1SDavid du Colombier.CW x
30273e12c5d1SDavid du Colombiercommand \(em
30283e12c5d1SDavid du Colombierwhile
30293e12c5d1SDavid du Colombier.CW ed
30303e12c5d1SDavid du Colombierhas three (the
30313e12c5d1SDavid du Colombier.CW g
30323e12c5d1SDavid du Colombiercommand, the global flag on substitutions, and the implicit loop over
30333e12c5d1SDavid du Colombierlines in multi-line substitutions).
30343e12c5d1SDavid du ColombierAlso,
30353e12c5d1SDavid du Colombier.CW ed 's
30363e12c5d1SDavid du Colombiersubstitute command is necessary to make changes within lines, but in
30373e12c5d1SDavid du Colombier.CW sam
30383e12c5d1SDavid du Colombierthe
30393e12c5d1SDavid du Colombier.CW s
30403e12c5d1SDavid du Colombiercommand is more of a familiar convenience than a necessity;
30413e12c5d1SDavid du Colombier.CW c
30423e12c5d1SDavid du Colombierand
30433e12c5d1SDavid du Colombier.CW t
30443e12c5d1SDavid du Colombiercan do all the work.
30453e12c5d1SDavid du Colombier.PP
30463e12c5d1SDavid du ColombierSecond,
30473e12c5d1SDavid du Colombiergiven a community that expects an editor to be about as powerful as
30483e12c5d1SDavid du Colombier.CW ed ,
30493e12c5d1SDavid du Colombierit's hard to see how
30503e12c5d1SDavid du Colombier.CW sam
30513e12c5d1SDavid du Colombiercould really be much simpler and still satisfy that expectation.
30523e12c5d1SDavid du ColombierPeople want to do ``global substitutes,'' and most are content
30533e12c5d1SDavid du Colombierto have the recipe for that and a few other fancy changes.
30543e12c5d1SDavid du ColombierThe sophistication of the command language is really just a veneer
30553e12c5d1SDavid du Colombierover a design that makes it possible to do global substitutes
30563e12c5d1SDavid du Colombierin a screen editor.
30573e12c5d1SDavid du ColombierSome people will always want something more, however, and it's gratifying to
30583e12c5d1SDavid du Colombierbe able to provide it.
30593e12c5d1SDavid du ColombierThe real power of
30603e12c5d1SDavid du Colombier.CW sam 's
30613e12c5d1SDavid du Colombiercommand language comes from composability of the operators, which is by
30623e12c5d1SDavid du Colombiernature orthogonal to the underlying model.
30633e12c5d1SDavid du ColombierIn other words,
30643e12c5d1SDavid du Colombier.CW sam
30653e12c5d1SDavid du Colombieris not itself complex, but it makes complex things possible.
30663e12c5d1SDavid du ColombierIf you don't want to do anything complex, you can ignore the
30673e12c5d1SDavid du Colombiercomplexity altogether, and many people do so.
30683e12c5d1SDavid du Colombier.PP
30693e12c5d1SDavid du ColombierSometimes I am asked the opposite question: why didn't I just make
30703e12c5d1SDavid du Colombier.CW sam
30713e12c5d1SDavid du Colombiera real programmable editor, with macros and variables and so on?
30723e12c5d1SDavid du ColombierThe main reason is a matter of taste: I like the editor
30733e12c5d1SDavid du Colombierto be the same every time I use it.
30743e12c5d1SDavid du ColombierThere is one technical reason, though:
30753e12c5d1SDavid du Colombierprogrammability in editors is largely a workaround for insufficient
30763e12c5d1SDavid du Colombierinteractivity.
30773e12c5d1SDavid du ColombierProgrammable editors are used to make particular, usually short-term,
30783e12c5d1SDavid du Colombierthings easy to do, such as by providing shorthands for common actions.
30793e12c5d1SDavid du ColombierIf things are generally easy to do in the first place,
30803e12c5d1SDavid du Colombiershorthands are not as helpful.
30813e12c5d1SDavid du Colombier.CW Sam
30823e12c5d1SDavid du Colombiermakes common editing operations very easy, and the solutions to
30833e12c5d1SDavid du Colombiercomplex editing problems seem commensurate with the problems themselves.
30843e12c5d1SDavid du ColombierAlso, the ability to edit the
30853e12c5d1SDavid du Colombier.CW sam
30863e12c5d1SDavid du Colombierwindow makes it easy to repeat commands \(em it only takes a mouse button click
30873e12c5d1SDavid du Colombierto execute a command again.
30883e12c5d1SDavid du Colombier.SH 2
30893e12c5d1SDavid du ColombierPros and cons
30903e12c5d1SDavid du Colombier.LP
30913e12c5d1SDavid du Colombier.CW Sam
30923e12c5d1SDavid du Colombierhas several other good points,
30933e12c5d1SDavid du Colombierand its share of problems.
30943e12c5d1SDavid du ColombierAmong the good things is the idea of
30953e12c5d1SDavid du Colombierstructural regular expressions,
30963e12c5d1SDavid du Colombierwhose usefulness has only begun to be explored.
30973e12c5d1SDavid du ColombierThey were arrived at serendipitously when I attempted to distill the essence of
30983e12c5d1SDavid du Colombier.CW ed 's
30993e12c5d1SDavid du Colombierway of doing global substitution and recognized that the looping command in
31003e12c5d1SDavid du Colombier.CW ed
31013e12c5d1SDavid du Colombierwas implicitly imposing a structure (an array of lines) on the file.
31023e12c5d1SDavid du Colombier.PP
31033e12c5d1SDavid du ColombierAnother of
31043e12c5d1SDavid du Colombier.CW sam 's
31053e12c5d1SDavid du Colombiergood things is its undo capability.
31063e12c5d1SDavid du ColombierI had never before used an editor with a true undo,
31073e12c5d1SDavid du Colombierbut I would never go back now.
31083e12c5d1SDavid du ColombierUndo
31093e12c5d1SDavid du Colombier.I must
31103e12c5d1SDavid du Colombierbe done well, but if it is, it can be relied on.
31113e12c5d1SDavid du ColombierFor example,
31123e12c5d1SDavid du Colombierit's safe to experiment if you're not sure how to write some intricate command,
31133e12c5d1SDavid du Colombierbecause if you make a mistake, it can be fixed simply and reliably.
31143e12c5d1SDavid du ColombierI learned two things about undo from writing
31153e12c5d1SDavid du Colombier.CW sam :
31163e12c5d1SDavid du Colombierfirst, it's easy to provide if you design it in from the beginning, and
31173e12c5d1SDavid du Colombiersecond, it's necessary, particularly if the system has some subtle
31183e12c5d1SDavid du Colombierproperties that may be unfamiliar or error-prone for users.
31193e12c5d1SDavid du Colombier.PP
31203e12c5d1SDavid du Colombier.CW Sam 's
31213e12c5d1SDavid du Colombierlack of internal limits and sizes is a virtue.
31223e12c5d1SDavid du ColombierBecause it avoids all fixed-size tables and data structures,
31233e12c5d1SDavid du Colombier.CW sam
31243e12c5d1SDavid du Colombieris able to make global changes to files that some of our other
31253e12c5d1SDavid du Colombiertools cannot even read.
31263e12c5d1SDavid du ColombierMoreover, the design keeps the performance linear when doing such
31273e12c5d1SDavid du Colombieroperations, although I must admit
31283e12c5d1SDavid du Colombier.CW sam
31293e12c5d1SDavid du Colombierdoes get slow when editing a huge file.
31303e12c5d1SDavid du Colombier.PP
31313e12c5d1SDavid du ColombierNow, the problems.
31323e12c5d1SDavid du ColombierExternally, the most obvious is that it is poorly integrated into the
31333e12c5d1SDavid du Colombiersurrounding window system.
31343e12c5d1SDavid du ColombierBy design, the user interface in
31353e12c5d1SDavid du Colombier.CW sam
31363e12c5d1SDavid du Colombierfeels almost identical to that of
31373e12c5d1SDavid du Colombier.CW mux ,
31383e12c5d1SDavid du Colombierbut a thick wall separates text in
31393e12c5d1SDavid du Colombier.CW sam
31403e12c5d1SDavid du Colombierfrom the programs running in
31413e12c5d1SDavid du Colombier.CW mux .
31423e12c5d1SDavid du ColombierFor instance, the `snarf buffer' in
31433e12c5d1SDavid du Colombier.CW sam
31443e12c5d1SDavid du Colombiermust be maintained separately from that in
31453e12c5d1SDavid du Colombier.CW mux .
31463e12c5d1SDavid du ColombierThis is regrettable, but probably necessary given the unusual configuration
31473e12c5d1SDavid du Colombierof the system, with a programmable terminal on the far end of an RS-232 link.
31483e12c5d1SDavid du Colombier.PP
31493e12c5d1SDavid du Colombier.CW Sam
31503e12c5d1SDavid du Colombieris reliable; otherwise, people wouldn't use it.
31513e12c5d1SDavid du ColombierBut it was written over such a long time, and has so many new (to me)
31523e12c5d1SDavid du Colombierideas in it, that I would like to see it done over again to clean
31533e12c5d1SDavid du Colombierup the code and remove many of the lingering problems in the implementation.
31543e12c5d1SDavid du ColombierThe worst part is in the interconnection of the host and terminal parts,
31553e12c5d1SDavid du Colombierwhich might even be able to go away in a redesign for a more
31563e12c5d1SDavid du Colombierconventional window system.
31573e12c5d1SDavid du ColombierThe program must be split in two to use the terminal effectively,
31583e12c5d1SDavid du Colombierbut the low bandwidth of the connection forces the separation to
31593e12c5d1SDavid du Colombieroccur in an inconvenient part of the design if performance is to be acceptable.
31603e12c5d1SDavid du ColombierA simple remote procedure call
31613e12c5d1SDavid du Colombierprotocol driven by the host, emitting only graphics
31623e12c5d1SDavid du Colombiercommands, would be easy to write but wouldn't have nearly the
31633e12c5d1SDavid du Colombiernecessary responsiveness.  On the other hand, if the terminal were in control
31643e12c5d1SDavid du Colombierand requested much simpler file services from the host, regular expression
31653e12c5d1SDavid du Colombiersearches would require that the terminal read the entire file over its RS-232
31663e12c5d1SDavid du Colombierlink, which would be unreasonably slow.
31673e12c5d1SDavid du ColombierA compromise in which either end can take control is necessary.
31683e12c5d1SDavid du ColombierIn retrospect, the communications protocol should have been
31693e12c5d1SDavid du Colombierdesigned and verified formally, although I do not know of any tool
31703e12c5d1SDavid du Colombierthat can adequately relate the protocol to
31713e12c5d1SDavid du Colombierits implementation.
31723e12c5d1SDavid du Colombier.PP
31733e12c5d1SDavid du ColombierNot all of
31743e12c5d1SDavid du Colombier.CW sam 's
31753e12c5d1SDavid du Colombierusers are comfortable with its command language, and few are adept.
31763e12c5d1SDavid du ColombierSome (venerable) people use a sort of
31773e12c5d1SDavid du Colombier.CW ed \& ``
31783e12c5d1SDavid du Colombiersubset'' of
31793e12c5d1SDavid du Colombier.CW sam 's
31803e12c5d1SDavid du Colombiercommand language,
31813e12c5d1SDavid du Colombierand even ask why
31823e12c5d1SDavid du Colombier.CW sam 's
31833e12c5d1SDavid du Colombiercommand language is not exactly
31843e12c5d1SDavid du Colombier.CW ed 's.
31853e12c5d1SDavid du Colombier(The reason, of course, is that
31863e12c5d1SDavid du Colombier.CW sam 's
31873e12c5d1SDavid du Colombiermodel for text does not include newlines, which are central to
31883e12c5d1SDavid du Colombier.CW ed .
31893e12c5d1SDavid du ColombierMaking the text an array of newlines to the command language would
31903e12c5d1SDavid du Colombierbe too much of a break from the seamless model provided by the mouse.
31913e12c5d1SDavid du ColombierSome editors, such as
31923e12c5d1SDavid du Colombier.CW vi ,
31933e12c5d1SDavid du Colombierare willing to make this break, though.)
31943e12c5d1SDavid du ColombierThe difficulty is that
31953e12c5d1SDavid du Colombier.CW sam 's
31963e12c5d1SDavid du Colombiersyntax is so close to
31973e12c5d1SDavid du Colombier.CW ed 's
31983e12c5d1SDavid du Colombierthat people believe it
31993e12c5d1SDavid du Colombier.I should
32003e12c5d1SDavid du Colombierbe the same.
32013e12c5d1SDavid du ColombierI thought, with some justification in hindsight,
32023e12c5d1SDavid du Colombierthat making
32033e12c5d1SDavid du Colombier.CW sam
32043e12c5d1SDavid du Colombiersimilar to
32053e12c5d1SDavid du Colombier.CW ed
32063e12c5d1SDavid du Colombierwould make it easier to learn and to accept.
32073e12c5d1SDavid du ColombierBut I may have overstepped and raised the users'
32083e12c5d1SDavid du Colombierexpectations too much.
32093e12c5d1SDavid du ColombierIt's hard to decide which way to resolve this problem.
32103e12c5d1SDavid du Colombier.PP
32113e12c5d1SDavid du ColombierFinally, there is a tradeoff in
32123e12c5d1SDavid du Colombier.CW sam
32133e12c5d1SDavid du Colombierthat was decided by the environment in which it runs:
32143e12c5d1SDavid du Colombier.CW sam
32153e12c5d1SDavid du Colombieris a multi-file editor, although in a different system there might instead be
32163e12c5d1SDavid du Colombiermultiple single-file editors.
32173e12c5d1SDavid du ColombierThe decision was made primarily because starting a new program in a Blit is
32183e12c5d1SDavid du Colombiertime-consuming.
32193e12c5d1SDavid du ColombierIf the choice could be made freely, however, I would
32203e12c5d1SDavid du Colombierstill choose the multi-file architecture, because it allows
32213e12c5d1SDavid du Colombiergroups of files to be handled as a unit;
32223e12c5d1SDavid du Colombierthe usefulness of the multi-file commands is incontrovertible.
32233e12c5d1SDavid du ColombierIt is delightful to have the source to an entire program
32243e12c5d1SDavid du Colombieravailable at your fingertips.
32253e12c5d1SDavid du Colombier.SH
32263e12c5d1SDavid du ColombierAcknowledgements
32273e12c5d1SDavid du Colombier.LP
32283e12c5d1SDavid du ColombierTom Cargill suggested the idea behind the
32293e12c5d1SDavid du Colombier.CW Rasp
32303e12c5d1SDavid du Colombierdata structure.
32313e12c5d1SDavid du ColombierNorman Wilson and Ken Thompson influenced the command language.
32323e12c5d1SDavid du ColombierThis paper was improved by comments from
32333e12c5d1SDavid du ColombierAl Aho,
32343e12c5d1SDavid du ColombierJon Bentley,
32353e12c5d1SDavid du ColombierChris Fraser,
32363e12c5d1SDavid du ColombierGerard Holzmann,
32373e12c5d1SDavid du ColombierBrian Kernighan,
32383e12c5d1SDavid du ColombierTed Kowalski,
32393e12c5d1SDavid du ColombierDoug McIlroy
32403e12c5d1SDavid du Colombierand
32413e12c5d1SDavid du ColombierDennis Ritchie.
3242