xref: /plan9/sys/doc/sam/sam.ms (revision 219b2ee8daee37f4aad58d63f21287faa8e4ffdc)
13e12c5d1SDavid du Colombier.Vx 17 11 November 87 1 32 "ROB PIKE" "THE TEXT EDITOR SAM"
23e12c5d1SDavid du Colombier.ds DY "31 May 1987
33e12c5d1SDavid du Colombier.ds DR "Revised 1 July 1987
43e12c5d1SDavid du Colombier.de CW		\" puts first arg in CW font, same as UL; maintains font
53e12c5d1SDavid du Colombier\%\&\\$3\f(CW\\$1\fP\&\\$2
63e12c5d1SDavid du Colombier..
73e12c5d1SDavid du Colombier.de Cs
83e12c5d1SDavid du Colombier.br
93e12c5d1SDavid du Colombier.fi
103e12c5d1SDavid du Colombier.ft 2
113e12c5d1SDavid du Colombier.ps -2
123e12c5d1SDavid du Colombier.vs -2
133e12c5d1SDavid du Colombier..
143e12c5d1SDavid du Colombier.de Ce
153e12c5d1SDavid du Colombier.br
163e12c5d1SDavid du Colombier.nf
173e12c5d1SDavid du Colombier.ft 1
183e12c5d1SDavid du Colombier.ps
193e12c5d1SDavid du Colombier.vs
203e12c5d1SDavid du Colombier.sp
213e12c5d1SDavid du Colombier..
223e12c5d1SDavid du Colombier.TL
233e12c5d1SDavid du ColombierThe Text Editor \&\f(CWsam\fP
243e12c5d1SDavid du Colombier.AU
25*219b2ee8SDavid du ColombierRob Pike
26*219b2ee8SDavid du Colombierrob@plan9.att.com
273e12c5d1SDavid du Colombier.AB
283e12c5d1SDavid du Colombier.LP
293e12c5d1SDavid du Colombier.CW Sam
303e12c5d1SDavid du Colombieris an interactive multi-file text editor intended for
313e12c5d1SDavid du Colombierbitmap displays.
323e12c5d1SDavid du ColombierA textual command language
333e12c5d1SDavid du Colombiersupplements the mouse-driven, cut-and-paste interface
343e12c5d1SDavid du Colombierto make complex or
353e12c5d1SDavid du Colombierrepetitive editing tasks easy to specify.
363e12c5d1SDavid du ColombierThe language is characterized by the composition of regular expressions
373e12c5d1SDavid du Colombierto describe the structure of the text being modified.
383e12c5d1SDavid du ColombierThe treatment of files as a database, with changes logged
393e12c5d1SDavid du Colombieras atomic transactions, guides the implementation and
403e12c5d1SDavid du Colombiermakes a general `undo' mechanism straightforward.
413e12c5d1SDavid du Colombier.PP
423e12c5d1SDavid du Colombier.CW Sam
433e12c5d1SDavid du Colombieris implemented as two processes connected by a low-bandwidth stream,
443e12c5d1SDavid du Colombierone process handling the display and the other the editing
453e12c5d1SDavid du Colombieralgorithms.  Therefore it can run with the display process
463e12c5d1SDavid du Colombierin a bitmap terminal and the editor on a local host,
473e12c5d1SDavid du Colombierwith both processes on a bitmap-equipped host, or with
483e12c5d1SDavid du Colombierthe display process in the terminal and the editor in a
493e12c5d1SDavid du Colombierremote host.
503e12c5d1SDavid du ColombierBy suppressing the display process,
513e12c5d1SDavid du Colombierit can even run without a bitmap terminal.
523e12c5d1SDavid du Colombier.PP
533e12c5d1SDavid du ColombierThis paper is reprinted from Software\(emPractice and Experience,
54*219b2ee8SDavid du ColombierVol 17, number 11, pp. 813-845, November 1987.
55*219b2ee8SDavid du ColombierThe paper has not been updated for the Plan 9 manuals.  Although
56*219b2ee8SDavid du Colombier.CW Sam
57*219b2ee8SDavid du Colombierhas not changed much since the paper was written, the system around it certainly has.
58*219b2ee8SDavid du ColombierNonetheless, the description here still stands as the best introduction to the editor.
593e12c5d1SDavid du Colombier.AE
603e12c5d1SDavid du Colombier.SH
613e12c5d1SDavid du ColombierIntroduction
623e12c5d1SDavid du Colombier.LP
633e12c5d1SDavid du Colombier.CW Sam
643e12c5d1SDavid du Colombieris an interactive text editor that combines cut-and-paste interactive editing with
653e12c5d1SDavid du Colombieran unusual command language based on the composition of regular expressions.
66*219b2ee8SDavid du ColombierIt is written as two programs: one, the `host part,' runs on a UNIX system
673e12c5d1SDavid du Colombierand implements the command language and provides file access; the other, the
683e12c5d1SDavid du Colombier`terminal part,' runs asynchronously
693e12c5d1SDavid du Colombieron a machine with a mouse and bitmap display
703e12c5d1SDavid du Colombierand supports the display and interactive editing.
713e12c5d1SDavid du ColombierThe host part may be even run in isolation on an ordinary terminal
723e12c5d1SDavid du Colombierto edit text using the command
733e12c5d1SDavid du Colombierlanguage, much like a traditional line editor,
743e12c5d1SDavid du Colombierwithout assistance from a mouse or display.
753e12c5d1SDavid du ColombierMost often,
763e12c5d1SDavid du Colombierthe terminal part runs on a Blit\u\s-4\&1\s+4\d terminal
773e12c5d1SDavid du Colombier(actually on a Teletype DMD 5620, the production version of the Blit), whose
783e12c5d1SDavid du Colombierhost connection is an ordinary 9600 bps RS232 link;
79*219b2ee8SDavid du Colombieron the SUN computer the host and display processes run on a single machine,
803e12c5d1SDavid du Colombierconnected by a pipe.
813e12c5d1SDavid du Colombier.PP
823e12c5d1SDavid du Colombier.CW Sam
833e12c5d1SDavid du Colombieredits uninterpreted
843e12c5d1SDavid du ColombierASCII text.
853e12c5d1SDavid du ColombierIt has no facilities for multiple fonts, graphics or tables,
863e12c5d1SDavid du Colombierunlike MacWrite,\u\s-4\&2\s+4\d Bravo,\u\s-4\&3\s+4\d Tioga\u\s-4\&4\s+4\d
873e12c5d1SDavid du Colombieror Lara.\u\s-4\&5\s+4\d
883e12c5d1SDavid du ColombierAlso unlike them, it has a rich command language.
893e12c5d1SDavid du Colombier(Throughout this paper, the phrase
903e12c5d1SDavid du Colombier.I
913e12c5d1SDavid du Colombiercommand language
923e12c5d1SDavid du Colombier.R
933e12c5d1SDavid du Colombierrefers to
943e12c5d1SDavid du Colombiertextual commands; commands activated from the mouse form the
953e12c5d1SDavid du Colombier.I mouse
963e12c5d1SDavid du Colombier.I language. )
973e12c5d1SDavid du Colombier.CW Sam
983e12c5d1SDavid du Colombierdeveloped as an editor for use by programmers, and tries to join
99*219b2ee8SDavid du Colombierthe styles of the UNIX text editor
1003e12c5d1SDavid du Colombier.CW ed \u\s-4\&6,7\s+4\d
1013e12c5d1SDavid du Colombierwith that of interactive cut-and-paste editors by
1023e12c5d1SDavid du Colombierproviding a comfortable mouse-driven interface
1033e12c5d1SDavid du Colombierto a program with a solid command language driven by regular expressions.
1043e12c5d1SDavid du ColombierThe command language developed more than the mouse language, and
1053e12c5d1SDavid du Colombieracquired a notation for describing the structure of files
1063e12c5d1SDavid du Colombiermore richly than as a sequence of lines,
1073e12c5d1SDavid du Colombierusing a dataflow-like syntax for specifying changes.
1083e12c5d1SDavid du Colombier.PP
1093e12c5d1SDavid du ColombierThe interactive style was influenced by
1103e12c5d1SDavid du Colombier.CW jim ,\u\s-4\&1\s+4\d
1113e12c5d1SDavid du Colombieran early cut-and-paste editor for the Blit, and by
1123e12c5d1SDavid du Colombier.CW mux ,\u\s-4\&8\s+4\d
1133e12c5d1SDavid du Colombierthe Blit window system.
1143e12c5d1SDavid du Colombier.CW Mux
1153e12c5d1SDavid du Colombiermerges the original Blit window system,
1163e12c5d1SDavid du Colombier.CW mpx ,\u\s-4\&1\s+4\d
1173e12c5d1SDavid du Colombierwith cut-and-paste editing, forming something like a
1183e12c5d1SDavid du Colombiermultiplexed version of
1193e12c5d1SDavid du Colombier.CW jim
1203e12c5d1SDavid du Colombierthat edits the output of (and input to) command sessions rather than files.
1213e12c5d1SDavid du Colombier.PP
1223e12c5d1SDavid du ColombierThe first part of this paper describes the command language, then the mouse
1233e12c5d1SDavid du Colombierlanguage, and explains how they interact.
1243e12c5d1SDavid du ColombierThat is followed by a description of the implementation,
1253e12c5d1SDavid du Colombierfirst of the host part, then of the terminal part.
1263e12c5d1SDavid du ColombierA principle that influenced the design of
1273e12c5d1SDavid du Colombier.CW sam
1283e12c5d1SDavid du Colombieris that it should have no explicit limits, such as upper limits on
1293e12c5d1SDavid du Colombierfile size or line length.
1303e12c5d1SDavid du ColombierA secondary consideration is that it be efficient.
1313e12c5d1SDavid du ColombierTo honor these two goals together requires a method for efficiently
1323e12c5d1SDavid du Colombiermanipulating
1333e12c5d1SDavid du Colombierhuge strings (files) without breaking them into lines,
1343e12c5d1SDavid du Colombierperhaps while making thousands of changes
1353e12c5d1SDavid du Colombierunder control of the command language.
1363e12c5d1SDavid du Colombier.CW Sam 's
1373e12c5d1SDavid du Colombiermethod is to
1383e12c5d1SDavid du Colombiertreat the file as a transaction database, implementing changes as atomic
1393e12c5d1SDavid du Colombierupdates.  These updates may be unwound easily to `undo' changes.
1403e12c5d1SDavid du ColombierEfficiency is achieved through a collection of caches that minimizes
1413e12c5d1SDavid du Colombierdisc traffic and data motion, both within the two parts of the program
1423e12c5d1SDavid du Colombierand between them.
1433e12c5d1SDavid du Colombier.PP
1443e12c5d1SDavid du ColombierThe terminal part of
1453e12c5d1SDavid du Colombier.CW sam
1463e12c5d1SDavid du Colombieris fairly straightforward.
1473e12c5d1SDavid du ColombierMore interesting is how the two halves of the editor stay
1483e12c5d1SDavid du Colombiersynchronized when either half may initiate a change.
1493e12c5d1SDavid du ColombierThis is achieved through a data structure that organizes the
1503e12c5d1SDavid du Colombiercommunications and is maintained in parallel by both halves.
1513e12c5d1SDavid du Colombier.PP
1523e12c5d1SDavid du ColombierThe last part of the paper chronicles the writing of
1533e12c5d1SDavid du Colombier.CW sam
1543e12c5d1SDavid du Colombierand discusses the lessons that were learned through its development and use.
1553e12c5d1SDavid du Colombier.PP
1563e12c5d1SDavid du ColombierThe paper is long, but is composed largely of two papers of reasonable length:
1573e12c5d1SDavid du Colombiera description of the user interface of
1583e12c5d1SDavid du Colombier.CW sam
1593e12c5d1SDavid du Colombierand a discussion of its implementation.
1603e12c5d1SDavid du ColombierThey are combined because the implementation is strongly influenced by
1613e12c5d1SDavid du Colombierthe user interface, and vice versa.
1623e12c5d1SDavid du Colombier.SH
1633e12c5d1SDavid du ColombierThe Interface
1643e12c5d1SDavid du Colombier.LP
1653e12c5d1SDavid du Colombier.CW Sam
1663e12c5d1SDavid du Colombieris a text editor for multiple files.
1673e12c5d1SDavid du ColombierFile names may be provided when it is invoked:
1683e12c5d1SDavid du Colombier.P1
1693e12c5d1SDavid du Colombiersam file1 file2 ...
1703e12c5d1SDavid du Colombier.P2
1713e12c5d1SDavid du Colombierand there are commands
1723e12c5d1SDavid du Colombierto add new files and discard unneeded ones.
1733e12c5d1SDavid du ColombierFiles are not read until necessary
1743e12c5d1SDavid du Colombierto complete some command.
1753e12c5d1SDavid du ColombierEditing operations apply to an internal copy
176*219b2ee8SDavid du Colombiermade when the file is read; the UNIX file associated with the copy
1773e12c5d1SDavid du Colombieris changed only by an explicit command.
1783e12c5d1SDavid du ColombierTo simplify the discussion, the internal copy is here called a
1793e12c5d1SDavid du Colombier.I file ,
1803e12c5d1SDavid du Colombierwhile the disc-resident original is called a
1813e12c5d1SDavid du Colombier.I
1823e12c5d1SDavid du Colombierdisc file.
1833e12c5d1SDavid du Colombier.R
1843e12c5d1SDavid du Colombier.PP
1853e12c5d1SDavid du Colombier.CW Sam
1863e12c5d1SDavid du Colombieris usually connected to a bitmap display that presents a cut-and-paste
1873e12c5d1SDavid du Colombiereditor driven by the mouse.
1883e12c5d1SDavid du ColombierIn this mode, the command language is still available:
1893e12c5d1SDavid du Colombiertext typed in a special window, called the
1903e12c5d1SDavid du Colombier.CW sam
1913e12c5d1SDavid du Colombier.I window,
1923e12c5d1SDavid du Colombieris interpreted
1933e12c5d1SDavid du Colombieras commands to be executed in the current file.
1943e12c5d1SDavid du ColombierCut-and-paste editing may be used in any window \(em even in the
1953e12c5d1SDavid du Colombier.CW sam
1963e12c5d1SDavid du Colombierwindow to construct commands.
1973e12c5d1SDavid du ColombierThe other mode of operation, invoked by starting
1983e12c5d1SDavid du Colombier.CW sam
1993e12c5d1SDavid du Colombierwith the option
2003e12c5d1SDavid du Colombier.CW -d
2013e12c5d1SDavid du Colombier(for `no download'),
2023e12c5d1SDavid du Colombierdoes not use the mouse or bitmap display, but still permits
2033e12c5d1SDavid du Colombierediting using the textual command language, even on an ordinary terminal,
2043e12c5d1SDavid du Colombierinteractively or from a script.
2053e12c5d1SDavid du Colombier.PP
2063e12c5d1SDavid du ColombierThe following sections describe first the command language (under
2073e12c5d1SDavid du Colombier.CW sam\ -d
2083e12c5d1SDavid du Colombierand in the
2093e12c5d1SDavid du Colombier.CW sam
2103e12c5d1SDavid du Colombierwindow), and then the mouse interface.
2113e12c5d1SDavid du ColombierThese two languages are nearly independent, but connect through the
2123e12c5d1SDavid du Colombier.I current
2133e12c5d1SDavid du Colombier.I text,
2143e12c5d1SDavid du Colombierdescribed below.
2153e12c5d1SDavid du Colombier.SH 2
2163e12c5d1SDavid du ColombierThe Command Language
2173e12c5d1SDavid du Colombier.LP
2183e12c5d1SDavid du ColombierA file consists of its contents, which are an array of characters
2193e12c5d1SDavid du Colombier(that is, a string); the
2203e12c5d1SDavid du Colombier.I name
2213e12c5d1SDavid du Colombierof the associated disc file; the
2223e12c5d1SDavid du Colombier.I
2233e12c5d1SDavid du Colombiermodified bit
2243e12c5d1SDavid du Colombier.R
2253e12c5d1SDavid du Colombierthat states whether the contents match those of
2263e12c5d1SDavid du Colombierthe disc file;
2273e12c5d1SDavid du Colombierand a substring of the contents, called the
2283e12c5d1SDavid du Colombier.I
2293e12c5d1SDavid du Colombiercurrent text
2303e12c5d1SDavid du Colombier.R
2313e12c5d1SDavid du Colombieror
2323e12c5d1SDavid du Colombier.I dot
2333e12c5d1SDavid du Colombier(see Figures 1 and 2).
2343e12c5d1SDavid du ColombierIf the current text is a null string, dot falls between characters.
2353e12c5d1SDavid du ColombierThe
2363e12c5d1SDavid du Colombier.I value
2373e12c5d1SDavid du Colombierof dot is the location of the current text; the
2383e12c5d1SDavid du Colombier.I contents
2393e12c5d1SDavid du Colombierof dot are the characters it contains.
2403e12c5d1SDavid du Colombier.CW Sam
2413e12c5d1SDavid du Colombierimparts to the text no two-dimensional interpretation such as columns
2423e12c5d1SDavid du Colombieror fields; text is always one-dimensional.
243*219b2ee8SDavid du ColombierEven the idea of a `line' of text as understood by most UNIX programs
2443e12c5d1SDavid du Colombier\(em a sequence of characters terminated by a newline character \(em
2453e12c5d1SDavid du Colombieris only weakly supported.
2463e12c5d1SDavid du Colombier.PP
2473e12c5d1SDavid du ColombierThe
2483e12c5d1SDavid du Colombier.I
2493e12c5d1SDavid du Colombiercurrent file
2503e12c5d1SDavid du Colombier.R
2513e12c5d1SDavid du Colombieris the file to which editing commands refer.
2523e12c5d1SDavid du ColombierThe current text is therefore dot in the current file.
2533e12c5d1SDavid du ColombierIf a command doesn't explicitly name a particular file or piece of text,
2543e12c5d1SDavid du Colombierthe command is assumed to apply to the current text.
2553e12c5d1SDavid du ColombierFor the moment, ignore the presence of multiple files and consider
2563e12c5d1SDavid du Colombierediting a single file.
2573e12c5d1SDavid du Colombier.KF L
2583e12c5d1SDavid du Colombier.BP fig1.ps 3.5i
2593e12c5d1SDavid du Colombier.Cs
2603e12c5d1SDavid du ColombierFigure 1. A typical
2613e12c5d1SDavid du Colombier.CW sam
2623e12c5d1SDavid du Colombierscreen, with the editing menu presented.
2633e12c5d1SDavid du ColombierThe
2643e12c5d1SDavid du Colombier.CW sam
2653e12c5d1SDavid du Colombier(command language) window is in the middle, with file windows above and below.
2663e12c5d1SDavid du Colombier(The user interface makes it easy to create these abutting windows.)
2673e12c5d1SDavid du ColombierThe partially obscured window is a third file window.
2683e12c5d1SDavid du ColombierThe uppermost window is that to which typing and mouse operations apply,
2693e12c5d1SDavid du Colombieras indicated by its heavy border.
2703e12c5d1SDavid du ColombierEach window has its current text highlighted in reverse video.
2713e12c5d1SDavid du ColombierThe
2723e12c5d1SDavid du Colombier.CW sam
2733e12c5d1SDavid du Colombierwindow's current text is the null string on the last visible line,
2743e12c5d1SDavid du Colombierindicated by a vertical bar.
2753e12c5d1SDavid du ColombierSee also Figure 2.
2763e12c5d1SDavid du Colombier.Ce
2773e12c5d1SDavid du Colombier.KE
2783e12c5d1SDavid du Colombier.PP
2793e12c5d1SDavid du ColombierCommands have one-letter names.
2803e12c5d1SDavid du ColombierExcept for non-editing commands such as writing
2813e12c5d1SDavid du Colombierthe file to disc, most commands make some change
2823e12c5d1SDavid du Colombierto the text in dot and leave dot set to the text resulting from the change.
2833e12c5d1SDavid du ColombierFor example, the delete command,
2843e12c5d1SDavid du Colombier.CW d ,
2853e12c5d1SDavid du Colombierdeletes the text in dot, replacing it by the null string and setting dot
2863e12c5d1SDavid du Colombierto the result.
2873e12c5d1SDavid du ColombierThe change command,
2883e12c5d1SDavid du Colombier.CW c ,
2893e12c5d1SDavid du Colombierreplaces dot by text delimited by an arbitrary punctuation character,
2903e12c5d1SDavid du Colombierconventionally
2913e12c5d1SDavid du Colombiera slash.  Thus,
2923e12c5d1SDavid du Colombier.P1
2933e12c5d1SDavid du Colombierc/Peter/
2943e12c5d1SDavid du Colombier.P2
2953e12c5d1SDavid du Colombierreplaces the text in dot by the string
2963e12c5d1SDavid du Colombier.CW Peter .
2973e12c5d1SDavid du ColombierSimilarly,
2983e12c5d1SDavid du Colombier.P1
2993e12c5d1SDavid du Colombiera/Peter/
3003e12c5d1SDavid du Colombier.P2
3013e12c5d1SDavid du Colombier(append) adds the string after dot, and
3023e12c5d1SDavid du Colombier.P1
3033e12c5d1SDavid du Colombieri/Peter/
3043e12c5d1SDavid du Colombier.P2
3053e12c5d1SDavid du Colombier(insert) inserts before dot.
3063e12c5d1SDavid du ColombierAll three leave dot set to the new text,
3073e12c5d1SDavid du Colombier.CW Peter .
3083e12c5d1SDavid du Colombier.PP
3093e12c5d1SDavid du ColombierNewlines are part of the syntax of commands:
3103e12c5d1SDavid du Colombierthe newline character lexically terminates a command.
3113e12c5d1SDavid du ColombierWithin the inserted text, however, newlines are never implicit.
3123e12c5d1SDavid du ColombierBut since it is often convenient to insert multiple lines of text,
3133e12c5d1SDavid du Colombier.CW sam
3143e12c5d1SDavid du Colombierhas a special
3153e12c5d1SDavid du Colombiersyntax for that case:
3163e12c5d1SDavid du Colombier.P1
3173e12c5d1SDavid du Colombiera
3183e12c5d1SDavid du Colombiersome lines of text
3193e12c5d1SDavid du Colombierto be inserted in the file,
3203e12c5d1SDavid du Colombierterminated by a period
3213e12c5d1SDavid du Colombieron a line by itself
3223e12c5d1SDavid du Colombier\&.
3233e12c5d1SDavid du Colombier.P2
3243e12c5d1SDavid du ColombierIn the one-line syntax, a newline character may be specified by a C-like
3253e12c5d1SDavid du Colombierescape, so
3263e12c5d1SDavid du Colombier.P1
3273e12c5d1SDavid du Colombierc/\en/
3283e12c5d1SDavid du Colombier.P2
3293e12c5d1SDavid du Colombierreplaces dot by a single newline character.
3303e12c5d1SDavid du Colombier.PP
3313e12c5d1SDavid du Colombier.CW Sam
3323e12c5d1SDavid du Colombieralso has a substitute command,
3333e12c5d1SDavid du Colombier.CW s :
3343e12c5d1SDavid du Colombier.P1
3353e12c5d1SDavid du Colombiers/\f2expression\fP/\f2replacement\fP/
3363e12c5d1SDavid du Colombier.P2
3373e12c5d1SDavid du Colombiersubstitutes the replacement text for the first match, in dot,
3383e12c5d1SDavid du Colombierof the regular expression.
3393e12c5d1SDavid du ColombierThus, if dot is the string
3403e12c5d1SDavid du Colombier.CW Peter ,
3413e12c5d1SDavid du Colombierthe command
3423e12c5d1SDavid du Colombier.P1
3433e12c5d1SDavid du Colombiers/t/st/
3443e12c5d1SDavid du Colombier.P2
3453e12c5d1SDavid du Colombierchanges it to
3463e12c5d1SDavid du Colombier.CW Pester .
3473e12c5d1SDavid du ColombierIn general,
3483e12c5d1SDavid du Colombier.CW s
3493e12c5d1SDavid du Colombieris unnecessary, but it was inherited from
3503e12c5d1SDavid du Colombier.CW ed
3513e12c5d1SDavid du Colombierand it has some convenient variations.
3523e12c5d1SDavid du ColombierFor instance, the replacement text may include the matched text,
3533e12c5d1SDavid du Colombierspecified by
3543e12c5d1SDavid du Colombier.CW & :
3553e12c5d1SDavid du Colombier.P1
3563e12c5d1SDavid du Colombiers/Peter/Oh, &, &, &, &!/
3573e12c5d1SDavid du Colombier.P2
3583e12c5d1SDavid du Colombier.PP
3593e12c5d1SDavid du ColombierThere are also three commands that apply programs
3603e12c5d1SDavid du Colombierto text:
3613e12c5d1SDavid du Colombier.P1
362*219b2ee8SDavid du Colombier< \f2UNIX program\fP
3633e12c5d1SDavid du Colombier.P2
364*219b2ee8SDavid du Colombierreplaces dot by the output of the UNIX program.
3653e12c5d1SDavid du ColombierSimilarly, the
3663e12c5d1SDavid du Colombier.CW >
3673e12c5d1SDavid du Colombiercommand
3683e12c5d1SDavid du Colombierruns the program with dot as its standard input, and
3693e12c5d1SDavid du Colombier.CW |
3703e12c5d1SDavid du Colombierdoes both.  For example,
3713e12c5d1SDavid du Colombier.P1
3723e12c5d1SDavid du Colombier| sort
3733e12c5d1SDavid du Colombier.P2
3743e12c5d1SDavid du Colombierreplaces dot by the result of applying the standard sorting utility to it.
3753e12c5d1SDavid du ColombierAgain, newlines have no special significance for these
3763e12c5d1SDavid du Colombier.CW sam
3773e12c5d1SDavid du Colombiercommands.
3783e12c5d1SDavid du ColombierThe text acted upon and resulting from these commands is not necessarily
379*219b2ee8SDavid du Colombierbounded by newlines, although for connection with UNIX programs,
3803e12c5d1SDavid du Colombiernewlines may be necessary to obey conventions.
3813e12c5d1SDavid du Colombier.PP
3823e12c5d1SDavid du ColombierOne more command:
3833e12c5d1SDavid du Colombier.CW p
3843e12c5d1SDavid du Colombierprints the contents of dot.
3853e12c5d1SDavid du ColombierTable I summarizes
3863e12c5d1SDavid du Colombier.CW sam 's
3873e12c5d1SDavid du Colombiercommands.
3883e12c5d1SDavid du Colombier.KF
3893e12c5d1SDavid du Colombier.TS
3903e12c5d1SDavid du Colombiercenter;
3913e12c5d1SDavid du Colombierc s
3923e12c5d1SDavid du ColombierlfCW l.
3933e12c5d1SDavid du ColombierTable I. \f(CWSam\fP commands
3943e12c5d1SDavid du Colombier.sp .4
3953e12c5d1SDavid du Colombier.ft CW
3963e12c5d1SDavid du Colombier_
3973e12c5d1SDavid du Colombier.ft
3983e12c5d1SDavid du Colombier.sp .4
3993e12c5d1SDavid du Colombier\f1Text commands\fP
4003e12c5d1SDavid du Colombier.sp .4
4013e12c5d1SDavid du Colombier_
4023e12c5d1SDavid du Colombier.sp .4
4033e12c5d1SDavid du Colombiera/\f2text\fP/	Append text after dot
4043e12c5d1SDavid du Colombierc/\f2text\fP/	Change text in dot
4053e12c5d1SDavid du Colombieri/\f2text\fP/	Insert text before dot
4063e12c5d1SDavid du Colombierd	Delete text in dot
4073e12c5d1SDavid du Colombiers/\f2regexp\fP/\f2text\fP/	Substitute text for match of regular expression in dot
4083e12c5d1SDavid du Colombierm \f2address\fP	Move text in dot after address
4093e12c5d1SDavid du Colombiert \f2address\fP	Copy text in dot after address
4103e12c5d1SDavid du Colombier.sp .4
4113e12c5d1SDavid du Colombier_
4123e12c5d1SDavid du Colombier.sp .4
4133e12c5d1SDavid du Colombier\f1Display commands\fP
4143e12c5d1SDavid du Colombier.sp .4
4153e12c5d1SDavid du Colombier_
4163e12c5d1SDavid du Colombier.sp .2
4173e12c5d1SDavid du Colombierp	Print contents of dot
4183e12c5d1SDavid du Colombier\&=	Print value (line numbers and character numbers) of dot
4193e12c5d1SDavid du Colombier.sp .4
4203e12c5d1SDavid du Colombier_
4213e12c5d1SDavid du Colombier.sp .4
4223e12c5d1SDavid du Colombier\f1File commands\fP
4233e12c5d1SDavid du Colombier.sp .4
4243e12c5d1SDavid du Colombier_
4253e12c5d1SDavid du Colombier.sp .2
4263e12c5d1SDavid du Colombierb \f2file-list\fP	Set current file to first file in list that \f(CWsam\fP has in menu
4273e12c5d1SDavid du ColombierB \f2file-list\fP	Same as \f(CWb\fP, but load new files
4283e12c5d1SDavid du Colombiern	Print menu lines of all files
4293e12c5d1SDavid du ColombierD \f2file-list\fP	Delete named files from \f(CWsam\fP
4303e12c5d1SDavid du Colombier.sp .4
4313e12c5d1SDavid du Colombier_
4323e12c5d1SDavid du Colombier.sp .4
4333e12c5d1SDavid du Colombier\f1I/O commands\fP
4343e12c5d1SDavid du Colombier.sp .4
4353e12c5d1SDavid du Colombier_
4363e12c5d1SDavid du Colombier.sp .2
4373e12c5d1SDavid du Colombiere \f2filename\fP	Replace file with named disc file
4383e12c5d1SDavid du Colombierr \f2filename\fP	Replace dot by contents of named disc file
4393e12c5d1SDavid du Colombierw \f2filename\fP	Write file to named disc file
4403e12c5d1SDavid du Colombierf \f2filename\fP	Set file name and print new menu line
441*219b2ee8SDavid du Colombier< \f2UNIX-command\fP	Replace dot by standard output of command
442*219b2ee8SDavid du Colombier> \f2UNIX-command\fP	Send dot to standard input of command
443*219b2ee8SDavid du Colombier| \f2UNIX-command\fP	Replace dot by result of command applied to dot
444*219b2ee8SDavid du Colombier! \f2UNIX-command\fP	Run the command
4453e12c5d1SDavid du Colombier.sp .4
4463e12c5d1SDavid du Colombier_
4473e12c5d1SDavid du Colombier.sp .4
4483e12c5d1SDavid du Colombier\f1Loops and conditionals\fP
4493e12c5d1SDavid du Colombier.sp .4
4503e12c5d1SDavid du Colombier_
4513e12c5d1SDavid du Colombier.sp .2
4523e12c5d1SDavid du Colombierx/\f2regexp\fP/ \f2command\fP	For each match of regexp, set dot and run command
4533e12c5d1SDavid du Colombiery/\f2regexp\fP/ \f2command\fP	Between adjacent matches of regexp, set dot and run command
4543e12c5d1SDavid du ColombierX/\f2regexp\fP/ \f2command\fP	Run command in each file whose menu line matches regexp
4553e12c5d1SDavid du ColombierY/\f2regexp\fP/ \f2command\fP	Run command in each file whose menu line does not match
4563e12c5d1SDavid du Colombierg/\f2regexp\fP/ \f2command\fP	If dot contains a match of regexp, run command
4573e12c5d1SDavid du Colombierv/\f2regexp\fP/ \f2command\fP	If dot does not contain a match of regexp, run command
4583e12c5d1SDavid du Colombier.sp .4
4593e12c5d1SDavid du Colombier_
4603e12c5d1SDavid du Colombier.sp .4
4613e12c5d1SDavid du Colombier\f1Miscellany\fP
4623e12c5d1SDavid du Colombier.sp .4
4633e12c5d1SDavid du Colombier_
4643e12c5d1SDavid du Colombier.sp .2
4653e12c5d1SDavid du Colombierk	Set address mark to value of dot
4663e12c5d1SDavid du Colombierq	Quit
4673e12c5d1SDavid du Colombieru \f2n\fP	Undo last \f2n\fP (default 1) changes
4683e12c5d1SDavid du Colombier{ }	Braces group commands
4693e12c5d1SDavid du Colombier.sp .3
4703e12c5d1SDavid du Colombier.ft CW
4713e12c5d1SDavid du Colombier_
4723e12c5d1SDavid du Colombier.ft
4733e12c5d1SDavid du Colombier.TE
4743e12c5d1SDavid du Colombier.sp
4753e12c5d1SDavid du Colombier.KE
4763e12c5d1SDavid du Colombier.PP
4773e12c5d1SDavid du ColombierThe value of dot may be changed by
4783e12c5d1SDavid du Colombierspecifying an
4793e12c5d1SDavid du Colombier.I address
4803e12c5d1SDavid du Colombierfor the command.
4813e12c5d1SDavid du ColombierThe simplest address is a line number:
4823e12c5d1SDavid du Colombier.P1
4833e12c5d1SDavid du Colombier3
4843e12c5d1SDavid du Colombier.P2
4853e12c5d1SDavid du Colombierrefers to the third line of the file, so
4863e12c5d1SDavid du Colombier.P1
4873e12c5d1SDavid du Colombier3d
4883e12c5d1SDavid du Colombier.P2
4893e12c5d1SDavid du Colombierdeletes the third line of the file, and implicitly renumbers
4903e12c5d1SDavid du Colombierthe lines so the old line 4 is now numbered 3.
4913e12c5d1SDavid du Colombier(This is one of the few places where
4923e12c5d1SDavid du Colombier.CW sam
4933e12c5d1SDavid du Colombierdeals with lines directly.)
4943e12c5d1SDavid du ColombierLine
4953e12c5d1SDavid du Colombier.CW 0
4963e12c5d1SDavid du Colombieris the null string at the beginning of the file.
4973e12c5d1SDavid du ColombierIf a command consists of only an address, a
4983e12c5d1SDavid du Colombier.CW p
4993e12c5d1SDavid du Colombiercommand is assumed, so typing an unadorned
5003e12c5d1SDavid du Colombier.CW 3
5013e12c5d1SDavid du Colombierprints line 3 on the terminal.
5023e12c5d1SDavid du ColombierThere are a couple of other basic addresses:
5033e12c5d1SDavid du Colombiera period addresses dot itself; and
5043e12c5d1SDavid du Colombiera dollar sign
5053e12c5d1SDavid du Colombier.CW $ ) (
5063e12c5d1SDavid du Colombieraddresses the null string at the end of the file.
5073e12c5d1SDavid du Colombier.PP
5083e12c5d1SDavid du ColombierAn address is always a single substring of the file.
5093e12c5d1SDavid du ColombierThus, the address
5103e12c5d1SDavid du Colombier.CW 3
5113e12c5d1SDavid du Colombieraddresses the characters
5123e12c5d1SDavid du Colombierafter the second newline of
5133e12c5d1SDavid du Colombierthe file through the third newline of the file.
5143e12c5d1SDavid du ColombierA
5153e12c5d1SDavid du Colombier.I
5163e12c5d1SDavid du Colombiercompound address
5173e12c5d1SDavid du Colombier.R
5183e12c5d1SDavid du Colombieris constructed by the comma operator
5193e12c5d1SDavid du Colombier.P1
5203e12c5d1SDavid du Colombier\f2address1\fP,\f2address2\fP
5213e12c5d1SDavid du Colombier.P2
5223e12c5d1SDavid du Colombierand addresses the substring of the file from the beginning of
5233e12c5d1SDavid du Colombier.I address1
5243e12c5d1SDavid du Colombierto the end of
5253e12c5d1SDavid du Colombier.I address2 .
5263e12c5d1SDavid du ColombierFor example, the command
5273e12c5d1SDavid du Colombier.CW 3,5p
5283e12c5d1SDavid du Colombierprints the third through fifth lines of the file and
5293e12c5d1SDavid du Colombier.CW .,$d
5303e12c5d1SDavid du Colombierdeletes the text from the beginning of dot to the end of the file.
5313e12c5d1SDavid du Colombier.PP
5323e12c5d1SDavid du ColombierThese addresses are all absolute positions in the file, but
5333e12c5d1SDavid du Colombier.CW sam
5343e12c5d1SDavid du Colombieralso has relative addresses, indicated by
5353e12c5d1SDavid du Colombier.CW +
5363e12c5d1SDavid du Colombieror
5373e12c5d1SDavid du Colombier.CW - .
5383e12c5d1SDavid du ColombierFor example,
5393e12c5d1SDavid du Colombier.P1
5403e12c5d1SDavid du Colombier$-3
5413e12c5d1SDavid du Colombier.P2
5423e12c5d1SDavid du Colombieris the third line before the end of the file and
5433e12c5d1SDavid du Colombier.P1
5443e12c5d1SDavid du Colombier\&.+1
5453e12c5d1SDavid du Colombier.P2
5463e12c5d1SDavid du Colombieris the line after dot.
5473e12c5d1SDavid du ColombierIf no address appears to the left of the
5483e12c5d1SDavid du Colombier.CW +
5493e12c5d1SDavid du Colombieror
5503e12c5d1SDavid du Colombier.CW - ,
5513e12c5d1SDavid du Colombierdot is assumed;
5523e12c5d1SDavid du Colombierif nothing appears to the right,
5533e12c5d1SDavid du Colombier.CW 1
5543e12c5d1SDavid du Colombieris assumed.
5553e12c5d1SDavid du ColombierTherefore,
5563e12c5d1SDavid du Colombier.CW .+1
5573e12c5d1SDavid du Colombiermay be abbreviated to just a plus sign.
5583e12c5d1SDavid du Colombier.PP
5593e12c5d1SDavid du ColombierThe
5603e12c5d1SDavid du Colombier.CW +
5613e12c5d1SDavid du Colombieroperator acts relative to the end of its first argument, while the
5623e12c5d1SDavid du Colombier.CW -
5633e12c5d1SDavid du Colombieroperator acts relative to the beginning.  Thus
5643e12c5d1SDavid du Colombier.CW .+1
5653e12c5d1SDavid du Colombieraddresses the first line after dot,
5663e12c5d1SDavid du Colombier.CW .-
5673e12c5d1SDavid du Colombieraddresses the first line before dot, and
5683e12c5d1SDavid du Colombier.CW +-
5693e12c5d1SDavid du Colombierrefers to the line containing the end of dot.  (Dot may span multiple lines, and
5703e12c5d1SDavid du Colombier.CW +
5713e12c5d1SDavid du Colombierselects the line after the end of dot, then
5723e12c5d1SDavid du Colombier.CW -
5733e12c5d1SDavid du Colombierbacks up one line.)
5743e12c5d1SDavid du Colombier.PP
5753e12c5d1SDavid du ColombierThe final type of address is a regular expression, which addresses the
5763e12c5d1SDavid du Colombiertext matched by the expression.  The expression is enclosed in slashes, as in
5773e12c5d1SDavid du Colombier.P1
5783e12c5d1SDavid du Colombier/\f2expression\fP/
5793e12c5d1SDavid du Colombier.P2
580*219b2ee8SDavid du ColombierThe expressions are the same as those in the UNIX program
5813e12c5d1SDavid du Colombier.CW egrep ,\u\s-4\&6,7\s+4\d
5823e12c5d1SDavid du Colombierand include closures, alternations, and so on.
5833e12c5d1SDavid du ColombierThey find the
5843e12c5d1SDavid du Colombier.I
5853e12c5d1SDavid du Colombierleftmost longest
5863e12c5d1SDavid du Colombier.R
5873e12c5d1SDavid du Colombierstring that matches the expression, that is,
5883e12c5d1SDavid du Colombierthe first match after the point where the search is started,
5893e12c5d1SDavid du Colombierand if more than one match begins at the same spot, the longest such match.
590*219b2ee8SDavid du Colombier(I assume familiarity with the syntax for regular expressions in UNIX programs.\u\s-4\&9\s+4\d)
5913e12c5d1SDavid du ColombierFor example,
5923e12c5d1SDavid du Colombier.P1
5933e12c5d1SDavid du Colombier/x/
5943e12c5d1SDavid du Colombier.P2
5953e12c5d1SDavid du Colombiermatches the next
5963e12c5d1SDavid du Colombier.CW x
5973e12c5d1SDavid du Colombiercharacter in the file,
5983e12c5d1SDavid du Colombier.P1
5993e12c5d1SDavid du Colombier/xx*/
6003e12c5d1SDavid du Colombier.P2
6013e12c5d1SDavid du Colombiermatches the next run of one or more
6023e12c5d1SDavid du Colombier.CW x 's,
6033e12c5d1SDavid du Colombierand
6043e12c5d1SDavid du Colombier.P1
6053e12c5d1SDavid du Colombier/x|Peter/
6063e12c5d1SDavid du Colombier.P2
6073e12c5d1SDavid du Colombiermatches the next
6083e12c5d1SDavid du Colombier.CW x
6093e12c5d1SDavid du Colombieror
6103e12c5d1SDavid du Colombier.CW Peter .
611*219b2ee8SDavid du ColombierFor compatibility with other UNIX programs, the `any character' operator,
6123e12c5d1SDavid du Colombiera period,
6133e12c5d1SDavid du Colombierdoes not match a newline, so
6143e12c5d1SDavid du Colombier.P1
6153e12c5d1SDavid du Colombier/.*/
6163e12c5d1SDavid du Colombier.P2
6173e12c5d1SDavid du Colombiermatches the text from dot to the end of the line, but excludes the newline
6183e12c5d1SDavid du Colombierand so will not match across
6193e12c5d1SDavid du Colombierthe line boundary.
6203e12c5d1SDavid du Colombier.PP
6213e12c5d1SDavid du ColombierRegular expressions are always relative addresses.
6223e12c5d1SDavid du ColombierThe direction is forwards by default,
6233e12c5d1SDavid du Colombierso
6243e12c5d1SDavid du Colombier.CW /Peter/
6253e12c5d1SDavid du Colombieris really an abbreviation for
6263e12c5d1SDavid du Colombier.CW +/Peter/ .
6273e12c5d1SDavid du ColombierThe search can be reversed with a minus sign, so
6283e12c5d1SDavid du Colombier.P1
6293e12c5d1SDavid du Colombier.CW -/Peter/
6303e12c5d1SDavid du Colombier.P2
6313e12c5d1SDavid du Colombierfinds the first
6323e12c5d1SDavid du Colombier.CW Peter
6333e12c5d1SDavid du Colombierbefore dot.
6343e12c5d1SDavid du ColombierRegular expressions may be used with other address forms, so
6353e12c5d1SDavid du Colombier.CW 0+/Peter/
6363e12c5d1SDavid du Colombierfinds the first
6373e12c5d1SDavid du Colombier.CW Peter
6383e12c5d1SDavid du Colombierin the file and
6393e12c5d1SDavid du Colombier.CW $-/Peter/
6403e12c5d1SDavid du Colombierfinds the last.
6413e12c5d1SDavid du ColombierTable II summarizes
6423e12c5d1SDavid du Colombier.CW sam 's
6433e12c5d1SDavid du Colombieraddresses.
6443e12c5d1SDavid du Colombier.KF
6453e12c5d1SDavid du Colombier.TS
6463e12c5d1SDavid du Colombiercenter;
6473e12c5d1SDavid du Colombierc s
6483e12c5d1SDavid du ColombierlfCW l.
6493e12c5d1SDavid du ColombierTable II. \f(CWSam\fP addresses
6503e12c5d1SDavid du Colombier.sp .4
6513e12c5d1SDavid du Colombier.ft CW
6523e12c5d1SDavid du Colombier_
6533e12c5d1SDavid du Colombier.ft
6543e12c5d1SDavid du Colombier.sp .4
6553e12c5d1SDavid du Colombier\f1Simple addresses\fP
6563e12c5d1SDavid du Colombier.sp .4
6573e12c5d1SDavid du Colombier_
6583e12c5d1SDavid du Colombier.sp .2
6593e12c5d1SDavid du Colombier#\f2n\fP	The empty string after character \f2n\fP
6603e12c5d1SDavid du Colombier\f2n\fP	Line \f2n\fP.
6613e12c5d1SDavid du Colombier/\f2regexp\fP/	The first following match of the regular expression
6623e12c5d1SDavid du Colombier-/\f2regexp\fP/	The first previous match of the regular expression
6633e12c5d1SDavid du Colombier$	The null string at the end of the file
6643e12c5d1SDavid du Colombier\&.	Dot
6653e12c5d1SDavid du Colombier\&'	The address mark, set by \f(CWk\fP command
6663e12c5d1SDavid du Colombier"\f2regexp\fP"	Dot in the file whose menu line matches regexp
6673e12c5d1SDavid du Colombier.sp .4
6683e12c5d1SDavid du Colombier_
6693e12c5d1SDavid du Colombier.sp .4
6703e12c5d1SDavid du Colombier\f1Compound addresses\fP
6713e12c5d1SDavid du Colombier.sp .4
6723e12c5d1SDavid du Colombier_
6733e12c5d1SDavid du Colombier.sp .2
6743e12c5d1SDavid du Colombier\f2a1\fP+\f2a2\fP	The address \f2a2\fP evaluated starting at right of \f2a1\fP
6753e12c5d1SDavid du Colombier\f2a1\fP-\f2a2\fP	\f2a2\fP evaluated in the reverse direction starting at left of \f2a1\fP
6763e12c5d1SDavid du Colombier\f2a1\fP,\f2a2\fP	From the left of \f2a1\fP to the right of \f2a2\fP (default \f(CW0,$\fP)
6773e12c5d1SDavid du Colombier\f2a1\fP;\f2a2\fP	Like \f(CW,\fP but sets dot after evaluating \f2a1\fP
6783e12c5d1SDavid du Colombier.sp .4
6793e12c5d1SDavid du Colombier_
6803e12c5d1SDavid du Colombier.sp .4
6813e12c5d1SDavid du Colombier.T&
6823e12c5d1SDavid du Colombierc s.
6833e12c5d1SDavid du ColombierT{
6843e12c5d1SDavid du ColombierThe operators
6853e12c5d1SDavid du Colombier.CW +
6863e12c5d1SDavid du Colombierand
6873e12c5d1SDavid du Colombier.CW -
6883e12c5d1SDavid du Colombierare high precedence, while
6893e12c5d1SDavid du Colombier.CW ,
6903e12c5d1SDavid du Colombierand
6913e12c5d1SDavid du Colombier.CW ;
6923e12c5d1SDavid du Colombierare low precedence.
6933e12c5d1SDavid du ColombierIn both
6943e12c5d1SDavid du Colombier.CW +
6953e12c5d1SDavid du Colombierand
6963e12c5d1SDavid du Colombier.CW -
6973e12c5d1SDavid du Colombierforms,
6983e12c5d1SDavid du Colombier.I a2
6993e12c5d1SDavid du Colombierdefaults to 1 and
7003e12c5d1SDavid du Colombier.I a1
7013e12c5d1SDavid du Colombierdefaults to dot.
7023e12c5d1SDavid du ColombierIf both
7033e12c5d1SDavid du Colombier.I a1
7043e12c5d1SDavid du Colombierand
7053e12c5d1SDavid du Colombier.I a2
7063e12c5d1SDavid du Colombierare present,
7073e12c5d1SDavid du Colombier.CW +
7083e12c5d1SDavid du Colombiermay be elided.
7093e12c5d1SDavid du ColombierT}
7103e12c5d1SDavid du Colombier.sp .5
7113e12c5d1SDavid du Colombier.ft CW
7123e12c5d1SDavid du Colombier_
7133e12c5d1SDavid du Colombier.ft
7143e12c5d1SDavid du Colombier.TE
7153e12c5d1SDavid du Colombier.sp
7163e12c5d1SDavid du Colombier.KE
7173e12c5d1SDavid du Colombier.PP
7183e12c5d1SDavid du ColombierThe language discussed so far will not seem novel
719*219b2ee8SDavid du Colombierto people who use UNIX text editors
7203e12c5d1SDavid du Colombiersuch as
7213e12c5d1SDavid du Colombier.CW ed
7223e12c5d1SDavid du Colombieror
7233e12c5d1SDavid du Colombier.CW vi .\u\s-4\&9\s+4\d
7243e12c5d1SDavid du ColombierMoreover, the kinds of editing operations these commands allow, with the exception
7253e12c5d1SDavid du Colombierof regular expressions and line numbers,
7263e12c5d1SDavid du Colombierare clearly more conveniently handled by a mouse-based interface.
7273e12c5d1SDavid du ColombierIndeed,
7283e12c5d1SDavid du Colombier.CW sam 's
7293e12c5d1SDavid du Colombiermouse language (discussed at length below) is the means by which
7303e12c5d1SDavid du Colombiersimple changes are usually made.
7313e12c5d1SDavid du ColombierFor large or repetitive changes, however, a textual language
7323e12c5d1SDavid du Colombieroutperforms a manual interface.
7333e12c5d1SDavid du Colombier.PP
7343e12c5d1SDavid du ColombierImagine that, instead of deleting just one occurrence of the string
7353e12c5d1SDavid du Colombier.CW Peter ,
7363e12c5d1SDavid du Colombierwe wanted to eliminate every
7373e12c5d1SDavid du Colombier.CW Peter .
7383e12c5d1SDavid du ColombierWhat's needed is an iterator that runs a command for each occurrence of some
7393e12c5d1SDavid du Colombiertext.
7403e12c5d1SDavid du Colombier.CW Sam 's
7413e12c5d1SDavid du Colombieriterator is called
7423e12c5d1SDavid du Colombier.CW x ,
7433e12c5d1SDavid du Colombierfor extract:
7443e12c5d1SDavid du Colombier.P1
7453e12c5d1SDavid du Colombierx/\f2expression\fP/ \f2command\fP
7463e12c5d1SDavid du Colombier.P2
7473e12c5d1SDavid du Colombierfinds all matches in dot of the specified expression, and for each
7483e12c5d1SDavid du Colombiersuch match, sets dot to the text matched and runs the command.
7493e12c5d1SDavid du ColombierSo to delete all the
7503e12c5d1SDavid du Colombier.CW Peters:
7513e12c5d1SDavid du Colombier.P1
7523e12c5d1SDavid du Colombier0,$ x/Peter/ d
7533e12c5d1SDavid du Colombier.P2
7543e12c5d1SDavid du Colombier(Blanks in these examples are to improve readability;
7553e12c5d1SDavid du Colombier.CW sam
7563e12c5d1SDavid du Colombierneither requires nor interprets them.)
7573e12c5d1SDavid du ColombierThis searches the entire file
7583e12c5d1SDavid du Colombier.CW 0,$ ) (
7593e12c5d1SDavid du Colombierfor occurrences of the string
7603e12c5d1SDavid du Colombier.CW Peter ,
7613e12c5d1SDavid du Colombierand runs the
7623e12c5d1SDavid du Colombier.CW d
7633e12c5d1SDavid du Colombiercommand with dot set to each such occurrence.
7643e12c5d1SDavid du Colombier(By contrast, the comparable
7653e12c5d1SDavid du Colombier.CW ed
7663e12c5d1SDavid du Colombiercommand would delete all
7673e12c5d1SDavid du Colombier.I lines
7683e12c5d1SDavid du Colombiercontaining
7693e12c5d1SDavid du Colombier.CW Peter ;
7703e12c5d1SDavid du Colombier.CW sam
7713e12c5d1SDavid du Colombierdeletes only the
7723e12c5d1SDavid du Colombier.CW Peters .)
7733e12c5d1SDavid du ColombierThe address
7743e12c5d1SDavid du Colombier.CW 0,$
7753e12c5d1SDavid du Colombieris commonly used, and may be abbreviated to just a comma.
7763e12c5d1SDavid du ColombierAs another example,
7773e12c5d1SDavid du Colombier.P1
7783e12c5d1SDavid du Colombier, x/Peter/ p
7793e12c5d1SDavid du Colombier.P2
7803e12c5d1SDavid du Colombierprints a list of
7813e12c5d1SDavid du Colombier.CW Peters,
7823e12c5d1SDavid du Colombierone for each appearance in the file, with no intervening text (not even newlines
7833e12c5d1SDavid du Colombierto separate the instances).
7843e12c5d1SDavid du Colombier.PP
7853e12c5d1SDavid du ColombierOf course, the text extracted by
7863e12c5d1SDavid du Colombier.CW x
7873e12c5d1SDavid du Colombiermay be selected by a regular expression,
7883e12c5d1SDavid du Colombierwhich complicates deciding what set of matches is chosen \(em
7893e12c5d1SDavid du Colombiermatches may overlap.  This is resolved by generating the matches
7903e12c5d1SDavid du Colombierstarting from the beginning of dot using the leftmost-longest rule,
7913e12c5d1SDavid du Colombierand searching for each match starting from the end of the previous one.
7923e12c5d1SDavid du ColombierRegular expressions may also match null strings, but a null match
7933e12c5d1SDavid du Colombieradjacent to a non-null match is never selected; at least one character
7943e12c5d1SDavid du Colombiermust intervene.
7953e12c5d1SDavid du ColombierFor example,
7963e12c5d1SDavid du Colombier.P1
7973e12c5d1SDavid du Colombier, c/AAA/
7983e12c5d1SDavid du Colombierx/B*/ c/-/
7993e12c5d1SDavid du Colombier, p
8003e12c5d1SDavid du Colombier.P2
8013e12c5d1SDavid du Colombierproduces as output
8023e12c5d1SDavid du Colombier.P1
8033e12c5d1SDavid du Colombier-A-A-A-
8043e12c5d1SDavid du Colombier.P2
8053e12c5d1SDavid du Colombierbecause the pattern
8063e12c5d1SDavid du Colombier.CW B*
8073e12c5d1SDavid du Colombiermatches the null strings separating the
8083e12c5d1SDavid du Colombier.CW A 's.
8093e12c5d1SDavid du Colombier.PP
8103e12c5d1SDavid du ColombierThe
8113e12c5d1SDavid du Colombier.CW x
8123e12c5d1SDavid du Colombiercommand has a complement,
8133e12c5d1SDavid du Colombier.CW y ,
8143e12c5d1SDavid du Colombierwith similar syntax, that executes the command with dot set to the text
8153e12c5d1SDavid du Colombier.I between
8163e12c5d1SDavid du Colombierthe matches of the expression.
8173e12c5d1SDavid du ColombierFor example,
8183e12c5d1SDavid du Colombier.P1
8193e12c5d1SDavid du Colombier, c/AAA/
8203e12c5d1SDavid du Colombiery/A/ c/-/
8213e12c5d1SDavid du Colombier, p
8223e12c5d1SDavid du Colombier.P2
8233e12c5d1SDavid du Colombierproduces the same result as the example above.
8243e12c5d1SDavid du Colombier.PP
8253e12c5d1SDavid du ColombierThe
8263e12c5d1SDavid du Colombier.CW x
8273e12c5d1SDavid du Colombierand
8283e12c5d1SDavid du Colombier.CW y
8293e12c5d1SDavid du Colombiercommands are looping constructs, and
8303e12c5d1SDavid du Colombier.CW sam
8313e12c5d1SDavid du Colombierhas a pair of conditional commands to go with them.
8323e12c5d1SDavid du ColombierThey have similar syntax:
8333e12c5d1SDavid du Colombier.P1
8343e12c5d1SDavid du Colombierg/\f2expression\fP/ \f2command\fP
8353e12c5d1SDavid du Colombier.P2
8363e12c5d1SDavid du Colombier(guard)
8373e12c5d1SDavid du Colombierruns the command exactly once if dot contains a match of the expression.
8383e12c5d1SDavid du ColombierThis is different from
8393e12c5d1SDavid du Colombier.CW x ,
8403e12c5d1SDavid du Colombierwhich runs the command for
8413e12c5d1SDavid du Colombier.I each
8423e12c5d1SDavid du Colombiermatch:
8433e12c5d1SDavid du Colombier.CW x
8443e12c5d1SDavid du Colombierloops;
8453e12c5d1SDavid du Colombier.CW g
8463e12c5d1SDavid du Colombiermerely tests, without changing the value of dot.
8473e12c5d1SDavid du ColombierThus,
8483e12c5d1SDavid du Colombier.P1
8493e12c5d1SDavid du Colombier, x/Peter/ d
8503e12c5d1SDavid du Colombier.P2
8513e12c5d1SDavid du Colombierdeletes all occurrences of
8523e12c5d1SDavid du Colombier.CW Peter ,
8533e12c5d1SDavid du Colombierbut
8543e12c5d1SDavid du Colombier.P1
8553e12c5d1SDavid du Colombier, g/Peter/ d
8563e12c5d1SDavid du Colombier.P2
8573e12c5d1SDavid du Colombierdeletes the whole file (reduces it to a null string) if
8583e12c5d1SDavid du Colombier.CW Peter
8593e12c5d1SDavid du Colombieroccurs anywhere in the text.
8603e12c5d1SDavid du ColombierThe complementary conditional is
8613e12c5d1SDavid du Colombier.CW v ,
8623e12c5d1SDavid du Colombierwhich runs the command if there is
8633e12c5d1SDavid du Colombier.I no
8643e12c5d1SDavid du Colombiermatch of the expression.
8653e12c5d1SDavid du Colombier.PP
8663e12c5d1SDavid du ColombierThese control-structure-like commands may be composed to construct more
8673e12c5d1SDavid du Colombierinvolved operations.  For example, to print those lines of text that
8683e12c5d1SDavid du Colombiercontain the string
8693e12c5d1SDavid du Colombier.CW Peter :
8703e12c5d1SDavid du Colombier.P1
8713e12c5d1SDavid du Colombier, x/.*\en/ g/Peter/ p
8723e12c5d1SDavid du Colombier.P2
8733e12c5d1SDavid du ColombierThe
8743e12c5d1SDavid du Colombier.CW x
8753e12c5d1SDavid du Colombierbreaks the file into lines, the
8763e12c5d1SDavid du Colombier.CW g
8773e12c5d1SDavid du Colombierselects those lines containing
8783e12c5d1SDavid du Colombier.CW Peter ,
8793e12c5d1SDavid du Colombierand the
8803e12c5d1SDavid du Colombier.CW p
8813e12c5d1SDavid du Colombierprints them.
8823e12c5d1SDavid du ColombierThis command gives an address for the
8833e12c5d1SDavid du Colombier.CW x
8843e12c5d1SDavid du Colombiercommand (the whole file), but because
8853e12c5d1SDavid du Colombier.CW g
8863e12c5d1SDavid du Colombierdoes not have an explicit address, it applies to the value of
8873e12c5d1SDavid du Colombierdot produced by the
8883e12c5d1SDavid du Colombier.CW x
8893e12c5d1SDavid du Colombiercommand, that is, to each line.
8903e12c5d1SDavid du ColombierAll commands in
8913e12c5d1SDavid du Colombier.CW sam
8923e12c5d1SDavid du Colombierexcept for the command to write a file to disc use dot for the
8933e12c5d1SDavid du Colombierdefault address.
8943e12c5d1SDavid du Colombier.PP
8953e12c5d1SDavid du ColombierComposition may be continued indefinitely.
8963e12c5d1SDavid du Colombier.P1
8973e12c5d1SDavid du Colombier, x/.*\en/ g/Peter/ v/SaltPeter/ p
8983e12c5d1SDavid du Colombier.P2
8993e12c5d1SDavid du Colombierprints those lines containing
9003e12c5d1SDavid du Colombier.CW Peter
9013e12c5d1SDavid du Colombierbut
9023e12c5d1SDavid du Colombier.I not
9033e12c5d1SDavid du Colombierthose containing
9043e12c5d1SDavid du Colombier.CW SaltPeter .
9053e12c5d1SDavid du Colombier.SH 2
9063e12c5d1SDavid du ColombierStructural Regular Expressions
9073e12c5d1SDavid du Colombier.LP
908*219b2ee8SDavid du ColombierUnlike other UNIX text editors,
9093e12c5d1SDavid du Colombierincluding the non-interactive ones such as
9103e12c5d1SDavid du Colombier.CW sed
9113e12c5d1SDavid du Colombierand
9123e12c5d1SDavid du Colombier.CW awk ,\u\s-4\&7\s+4\d
9133e12c5d1SDavid du Colombier.CW sam
9143e12c5d1SDavid du Colombieris good for manipulating files with multi-line `records.'
9153e12c5d1SDavid du ColombierAn example is an on-line phone book composed of records,
9163e12c5d1SDavid du Colombierseparated by blank lines, of the form
9173e12c5d1SDavid du Colombier.P1
9183e12c5d1SDavid du ColombierHerbert Tic
9193e12c5d1SDavid du Colombier44 Turnip Ave., Endive, NJ
9203e12c5d1SDavid du Colombier201-5555642
9213e12c5d1SDavid du Colombier
9223e12c5d1SDavid du ColombierNorbert Twinge
9233e12c5d1SDavid du Colombier16 Potato St., Cabbagetown, NJ
9243e12c5d1SDavid du Colombier201-5553145
9253e12c5d1SDavid du Colombier
9263e12c5d1SDavid du Colombier\&...
9273e12c5d1SDavid du Colombier.P2
9283e12c5d1SDavid du ColombierThe format may be encoded as a regular expression:
9293e12c5d1SDavid du Colombier.P1
9303e12c5d1SDavid du Colombier(.+\en)+
9313e12c5d1SDavid du Colombier.P2
9323e12c5d1SDavid du Colombierthat is, a sequence of one or more non-blank lines.
9333e12c5d1SDavid du ColombierThe command to print Mr. Tic's entire record is then
9343e12c5d1SDavid du Colombier.P1
9353e12c5d1SDavid du Colombier, x/(.+\en)+/ g/^Herbert Tic$/ p
9363e12c5d1SDavid du Colombier.P2
9373e12c5d1SDavid du Colombierand that to extract just the phone number is
9383e12c5d1SDavid du Colombier.P1
9393e12c5d1SDavid du Colombier, x/(.+\en)+/ g/^Herbert Tic$/ x/^[0-9]*-[0-9]*\en/ p
9403e12c5d1SDavid du Colombier.P2
9413e12c5d1SDavid du ColombierThe latter command breaks the file into records,
9423e12c5d1SDavid du Colombierchooses Mr. Tic's record,
9433e12c5d1SDavid du Colombierextracts the phone number from the record,
9443e12c5d1SDavid du Colombierand finally prints the number.
9453e12c5d1SDavid du Colombier.PP
9463e12c5d1SDavid du ColombierA more involved problem is that of
9473e12c5d1SDavid du Colombierrenaming a particular variable, say
9483e12c5d1SDavid du Colombier.CW n ,
9493e12c5d1SDavid du Colombierto
9503e12c5d1SDavid du Colombier.CW num
9513e12c5d1SDavid du Colombierin a C program.
9523e12c5d1SDavid du ColombierThe obvious first attempt,
9533e12c5d1SDavid du Colombier.P1
9543e12c5d1SDavid du Colombier, x/n/ c/num/
9553e12c5d1SDavid du Colombier.P2
9563e12c5d1SDavid du Colombieris badly flawed: it changes not only the variable
9573e12c5d1SDavid du Colombier.CW n
9583e12c5d1SDavid du Colombierbut any letter
9593e12c5d1SDavid du Colombier.CW n
9603e12c5d1SDavid du Colombierthat appears.
9613e12c5d1SDavid du ColombierWe need to extract all the variables, and select those that match
9623e12c5d1SDavid du Colombier.CW n
9633e12c5d1SDavid du Colombierand only
9643e12c5d1SDavid du Colombier.CW n :
9653e12c5d1SDavid du Colombier.P1
9663e12c5d1SDavid du Colombier, x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/
9673e12c5d1SDavid du Colombier.P2
9683e12c5d1SDavid du ColombierThe pattern
9693e12c5d1SDavid du Colombier.CW [A-Za-z_][A-Za-z_0-9]*
9703e12c5d1SDavid du Colombiermatches C identifiers.
9713e12c5d1SDavid du ColombierNext
9723e12c5d1SDavid du Colombier.CW g/n/
9733e12c5d1SDavid du Colombierselects those containing an
9743e12c5d1SDavid du Colombier.CW n .
9753e12c5d1SDavid du ColombierThen
9763e12c5d1SDavid du Colombier.CW v/../
9773e12c5d1SDavid du Colombierrejects those containing two (or more) characters, and finally
9783e12c5d1SDavid du Colombier.CW c/num/
9793e12c5d1SDavid du Colombierchanges the remainder (identifiers
9803e12c5d1SDavid du Colombier.CW n )
9813e12c5d1SDavid du Colombierto
9823e12c5d1SDavid du Colombier.CW num .
9833e12c5d1SDavid du ColombierThis version clearly works much better, but there may still be problems.
9843e12c5d1SDavid du ColombierFor example, in C character and string constants, the sequence
9853e12c5d1SDavid du Colombier.CW \en
9863e12c5d1SDavid du Colombieris interpreted as a newline character, and we don't want to change it to
9873e12c5d1SDavid du Colombier.CW \enum.
9883e12c5d1SDavid du ColombierThis problem can be forestalled with a
9893e12c5d1SDavid du Colombier.CW y
9903e12c5d1SDavid du Colombiercommand:
9913e12c5d1SDavid du Colombier.P1
9923e12c5d1SDavid du Colombier, y/\e\en/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/
9933e12c5d1SDavid du Colombier.P2
9943e12c5d1SDavid du Colombier(the second
9953e12c5d1SDavid du Colombier.CW \e
9963e12c5d1SDavid du Colombieris necessary because of lexical conventions in regular expressions),
9973e12c5d1SDavid du Colombieror we could even reject character constants and strings outright:
998*219b2ee8SDavid du Colombier.P1 0
9993e12c5d1SDavid du Colombier,y/'[^']*'/ y/"[^"]*"/ x/[A-Za-z_][A-Za-z_0-9]*/ g/n/ v/../ c/num/
10003e12c5d1SDavid du Colombier.P2
10013e12c5d1SDavid du ColombierThe
10023e12c5d1SDavid du Colombier.CW y
10033e12c5d1SDavid du Colombiercommands in this version exclude from consideration all character constants
10043e12c5d1SDavid du Colombierand strings.
10053e12c5d1SDavid du ColombierThe only remaining problem is to deal with the possible occurrence of
10063e12c5d1SDavid du Colombier.CW \e'
10073e12c5d1SDavid du Colombieror
10083e12c5d1SDavid du Colombier.CW \e"
10093e12c5d1SDavid du Colombierwithin these sequences, but it's easy to see how to resolve this difficulty.
10103e12c5d1SDavid du Colombier.PP
10113e12c5d1SDavid du ColombierThe point of these composed commands is successive refinement.
10123e12c5d1SDavid du ColombierA simple version of the command is tried, and if it's not good enough,
10133e12c5d1SDavid du Colombierit can be honed by adding a clause or two.
10143e12c5d1SDavid du Colombier(Mistakes can be undone; see below.
10153e12c5d1SDavid du ColombierAlso, the mouse language makes it unnecessary to retype the command each time.)
10163e12c5d1SDavid du ColombierThe resulting chains of commands are somewhat reminiscent of
10173e12c5d1SDavid du Colombiershell pipelines.\u\s-4\&7\s+4\d
10183e12c5d1SDavid du ColombierUnlike pipelines, though, which pass along modified
10193e12c5d1SDavid du Colombier.I data ,
10203e12c5d1SDavid du Colombier.CW sam
10213e12c5d1SDavid du Colombiercommands pass a
10223e12c5d1SDavid du Colombier.I view
10233e12c5d1SDavid du Colombierof the data.
10243e12c5d1SDavid du ColombierThe text at each step of the command is the same, but which pieces
10253e12c5d1SDavid du Colombierare selected is refined step by step until the correct piece is
10263e12c5d1SDavid du Colombieravailable to the final step of the command line, which ultimately makes the change.
10273e12c5d1SDavid du Colombier.PP
1028*219b2ee8SDavid du ColombierIn other UNIX programs, regular expressions are used only for selection,
10293e12c5d1SDavid du Colombieras in the
10303e12c5d1SDavid du Colombier.CW sam
10313e12c5d1SDavid du Colombier.CW g
10323e12c5d1SDavid du Colombiercommand, never for extraction as in the
10333e12c5d1SDavid du Colombier.CW x
10343e12c5d1SDavid du Colombieror
10353e12c5d1SDavid du Colombier.CW y
10363e12c5d1SDavid du Colombiercommand.
10373e12c5d1SDavid du ColombierFor example, patterns in
10383e12c5d1SDavid du Colombier.CW awk \u\s-4\&7\s+4\d
10393e12c5d1SDavid du Colombierare used to select lines to be operated on, but cannot be used
10403e12c5d1SDavid du Colombierto describe the format of the input text, or to handle newline-free text.
10413e12c5d1SDavid du ColombierThe use of regular expressions to describe the structure of a piece
10423e12c5d1SDavid du Colombierof text rather than its contents, as in the
10433e12c5d1SDavid du Colombier.CW x
10443e12c5d1SDavid du Colombiercommand,
10453e12c5d1SDavid du Colombierhas been given a name:
10463e12c5d1SDavid du Colombier.I
10473e12c5d1SDavid du Colombierstructural regular expressions.
10483e12c5d1SDavid du Colombier.R
10493e12c5d1SDavid du ColombierWhen they are composed, as in the above example,
10503e12c5d1SDavid du Colombierthey are pleasantly expressive.
10513e12c5d1SDavid du ColombierTheir use is discussed at greater length elsewhere.\u\s-4\&10\s+4\d
10523e12c5d1SDavid du Colombier.PP
10533e12c5d1SDavid du Colombier.SH 2
10543e12c5d1SDavid du ColombierMultiple files
10553e12c5d1SDavid du Colombier.LP
10563e12c5d1SDavid du Colombier.CW Sam
10573e12c5d1SDavid du Colombierhas a few other commands, mostly relating to input and output.
10583e12c5d1SDavid du Colombier.P1
10593e12c5d1SDavid du Colombiere discfilename
10603e12c5d1SDavid du Colombier.P2
10613e12c5d1SDavid du Colombierreplaces the contents and name of the current file with those of the named
10623e12c5d1SDavid du Colombierdisc file;
10633e12c5d1SDavid du Colombier.P1
10643e12c5d1SDavid du Colombierw discfilename
10653e12c5d1SDavid du Colombier.P2
10663e12c5d1SDavid du Colombierwrites the contents to the named disc file; and
10673e12c5d1SDavid du Colombier.P1
10683e12c5d1SDavid du Colombierr discfilename
10693e12c5d1SDavid du Colombier.P2
10703e12c5d1SDavid du Colombierreplaces dot with the contents of the named disc file.
10713e12c5d1SDavid du ColombierAll these commands use the current file's name if none is specified.
10723e12c5d1SDavid du ColombierFinally,
10733e12c5d1SDavid du Colombier.P1
10743e12c5d1SDavid du Colombierf discfilename
10753e12c5d1SDavid du Colombier.P2
10763e12c5d1SDavid du Colombierchanges the name associated with the file and displays the result:
10773e12c5d1SDavid du Colombier.P1
10783e12c5d1SDavid du Colombier\&'-. discfilename
10793e12c5d1SDavid du Colombier.P2
10803e12c5d1SDavid du ColombierThis output is called the file's
10813e12c5d1SDavid du Colombier.I
10823e12c5d1SDavid du Colombiermenu line,
10833e12c5d1SDavid du Colombier.R
10843e12c5d1SDavid du Colombierbecause it is the contents of the file's line in the button 3 menu (described
10853e12c5d1SDavid du Colombierin the
10863e12c5d1SDavid du Colombiernext section).
10873e12c5d1SDavid du ColombierThe first three characters are a concise notation for the state of the file.
10883e12c5d1SDavid du ColombierThe apostrophe signifies that the file is modified.
10893e12c5d1SDavid du ColombierThe minus sign indicates the number of windows
10903e12c5d1SDavid du Colombieropen on the file (see the next section):
10913e12c5d1SDavid du Colombier.CW -
10923e12c5d1SDavid du Colombiermeans none,
10933e12c5d1SDavid du Colombier.CW +
10943e12c5d1SDavid du Colombiermeans one, and
10953e12c5d1SDavid du Colombier.CW *
10963e12c5d1SDavid du Colombiermeans more than one.
10973e12c5d1SDavid du ColombierFinally, the period indicates that this is the current file.
10983e12c5d1SDavid du ColombierThese characters are useful for controlling the
10993e12c5d1SDavid du Colombier.CW X
11003e12c5d1SDavid du Colombiercommand, described shortly.
11013e12c5d1SDavid du Colombier.PP
11023e12c5d1SDavid du Colombier.CW Sam
11033e12c5d1SDavid du Colombiermay be started with a set of disc files (such as all the source for
11043e12c5d1SDavid du Colombiera program) by invoking it with a list of file names as arguments, and
11053e12c5d1SDavid du Colombiermore may be added or deleted on demand.
11063e12c5d1SDavid du Colombier.P1
11073e12c5d1SDavid du ColombierB discfile1 discfile2 ...
11083e12c5d1SDavid du Colombier.P2
11093e12c5d1SDavid du Colombieradds the named files to
11103e12c5d1SDavid du Colombier.CW sam 's
11113e12c5d1SDavid du Colombierlist, and
11123e12c5d1SDavid du Colombier.P1
11133e12c5d1SDavid du ColombierD discfile1 discfile2 ...
11143e12c5d1SDavid du Colombier.P2
11153e12c5d1SDavid du Colombierremoves them from
11163e12c5d1SDavid du Colombier.CW sam 's
11173e12c5d1SDavid du Colombiermemory (without effect on associated disc files).
11183e12c5d1SDavid du ColombierBoth these commands have a syntax for using the shell\u\s-4\&7\s+4\d
1119*219b2ee8SDavid du Colombier(the UNIX command interpreter) to generate the lists:
11203e12c5d1SDavid du Colombier.P1
11213e12c5d1SDavid du ColombierB <echo *.c
11223e12c5d1SDavid du Colombier.P2
11233e12c5d1SDavid du Colombierwill add all C source files, and
11243e12c5d1SDavid du Colombier.P1
11253e12c5d1SDavid du ColombierB <grep -l variable *.c
11263e12c5d1SDavid du Colombier.P2
11273e12c5d1SDavid du Colombierwill add all C source files referencing a particular variable
1128*219b2ee8SDavid du Colombier(the UNIX command
11293e12c5d1SDavid du Colombier.CW grep\ -l
11303e12c5d1SDavid du Colombierlists all files in its arguments that contain matches of
11313e12c5d1SDavid du Colombierthe specified regular expression).
11323e12c5d1SDavid du ColombierFinally,
11333e12c5d1SDavid du Colombier.CW D
11343e12c5d1SDavid du Colombierwithout arguments deletes the current file.
11353e12c5d1SDavid du Colombier.PP
11363e12c5d1SDavid du ColombierThere are two ways to change which file is current:
11373e12c5d1SDavid du Colombier.P1
11383e12c5d1SDavid du Colombierb filename
11393e12c5d1SDavid du Colombier.P2
11403e12c5d1SDavid du Colombiermakes the named file current.
11413e12c5d1SDavid du ColombierThe
11423e12c5d1SDavid du Colombier.CW B
11433e12c5d1SDavid du Colombiercommand
11443e12c5d1SDavid du Colombierdoes the same, but also adds any new files to
11453e12c5d1SDavid du Colombier.CW sam 's
11463e12c5d1SDavid du Colombierlist.
11473e12c5d1SDavid du Colombier(In practice, of course, the current file
11483e12c5d1SDavid du Colombieris usually chosen by mouse actions, not by textual commands.)
11493e12c5d1SDavid du ColombierThe other way is to use a form of address that refers to files:
11503e12c5d1SDavid du Colombier.P1
11513e12c5d1SDavid du Colombier"\f2expression\fP" \f2address\fP
11523e12c5d1SDavid du Colombier.P2
11533e12c5d1SDavid du Colombierrefers to the address evaluated in the file whose menu line
11543e12c5d1SDavid du Colombiermatches the expression (there must be exactly one match).
11553e12c5d1SDavid du ColombierFor example,
11563e12c5d1SDavid du Colombier.P1
11573e12c5d1SDavid du Colombier"peter.c" 3
11583e12c5d1SDavid du Colombier.P2
11593e12c5d1SDavid du Colombierrefers to the third line of the file whose name matches
11603e12c5d1SDavid du Colombier.CW peter.c .
11613e12c5d1SDavid du ColombierThis is most useful in the move
11623e12c5d1SDavid du Colombier.CW m ) (
11633e12c5d1SDavid du Colombierand copy
11643e12c5d1SDavid du Colombier.CW t ) (
11653e12c5d1SDavid du Colombiercommands:
11663e12c5d1SDavid du Colombier.P1
11673e12c5d1SDavid du Colombier0,$ t "peter.c" 0
11683e12c5d1SDavid du Colombier.P2
11693e12c5d1SDavid du Colombiermakes a copy of the current file at the beginning of
11703e12c5d1SDavid du Colombier.CW peter.c .
11713e12c5d1SDavid du Colombier.PP
11723e12c5d1SDavid du ColombierThe
11733e12c5d1SDavid du Colombier.CW X
11743e12c5d1SDavid du Colombiercommand
11753e12c5d1SDavid du Colombieris a looping construct, like
11763e12c5d1SDavid du Colombier.CW x ,
11773e12c5d1SDavid du Colombierthat refers to files instead of strings:
11783e12c5d1SDavid du Colombier.P1
11793e12c5d1SDavid du ColombierX/\f2expression\fP/ \f2command\fP
11803e12c5d1SDavid du Colombier.P2
11813e12c5d1SDavid du Colombierruns the command in all
11823e12c5d1SDavid du Colombierfiles whose menu lines match the expression.  The best example is
11833e12c5d1SDavid du Colombier.P1
11843e12c5d1SDavid du ColombierX/'/ w
11853e12c5d1SDavid du Colombier.P2
11863e12c5d1SDavid du Colombierwhich writes to disc all modified files.
11873e12c5d1SDavid du Colombier.CW Y
11883e12c5d1SDavid du Colombieris the complement of
11893e12c5d1SDavid du Colombier.CW X :
11903e12c5d1SDavid du Colombierit runs the command on all files whose menu lines don't match the expression:
11913e12c5d1SDavid du Colombier.P1
11923e12c5d1SDavid du ColombierY/\e.c/ D
11933e12c5d1SDavid du Colombier.P2
11943e12c5d1SDavid du Colombierdeletes all files that don't have
11953e12c5d1SDavid du Colombier.CW \&.c
11963e12c5d1SDavid du Colombierin their names, that is, it keeps all C source files and deletes the rest.
11973e12c5d1SDavid du Colombier.PP
11983e12c5d1SDavid du ColombierBraces allow commands to be grouped, so
11993e12c5d1SDavid du Colombier.P1
12003e12c5d1SDavid du Colombier{
12013e12c5d1SDavid du Colombier	\f2command1\fP
12023e12c5d1SDavid du Colombier	\f2command2\fP
12033e12c5d1SDavid du Colombier}
12043e12c5d1SDavid du Colombier.P2
12053e12c5d1SDavid du Colombieris syntactically a single command that runs two commands.
12063e12c5d1SDavid du ColombierThus,
12073e12c5d1SDavid du Colombier.P1
12083e12c5d1SDavid du ColombierX/\e.c/ ,g/variable/ {
12093e12c5d1SDavid du Colombier	f
12103e12c5d1SDavid du Colombier	, x/.*\en/ g/variable/ p
12113e12c5d1SDavid du Colombier}
12123e12c5d1SDavid du Colombier.P2
12133e12c5d1SDavid du Colombierfinds all occurrences of
12143e12c5d1SDavid du Colombier.CW variable
12153e12c5d1SDavid du Colombierin C source files, and prints
12163e12c5d1SDavid du Colombierout the file names and lines of each match.
12173e12c5d1SDavid du ColombierThe precise semantics of compound operations is discussed in the implementation
12183e12c5d1SDavid du Colombiersections below.
12193e12c5d1SDavid du Colombier.PP
12203e12c5d1SDavid du ColombierFinally,
12213e12c5d1SDavid du Colombierthe undo command,
12223e12c5d1SDavid du Colombier.CW u ,
12233e12c5d1SDavid du Colombierundoes the last command,
12243e12c5d1SDavid du Colombierno matter how many files were affected.
12253e12c5d1SDavid du ColombierMultiple undo operations move further back in time, so
12263e12c5d1SDavid du Colombier.P1
12273e12c5d1SDavid du Colombieru
12283e12c5d1SDavid du Colombieru
12293e12c5d1SDavid du Colombier.P2
12303e12c5d1SDavid du Colombier(which may be abbreviated
12313e12c5d1SDavid du Colombier.CW u2 )
12323e12c5d1SDavid du Colombierundoes the last two commands.  An undo may not be undone, however, nor
12333e12c5d1SDavid du Colombiermay any command that adds or deletes files.
12343e12c5d1SDavid du ColombierEverything else is undoable, though, including for example
12353e12c5d1SDavid du Colombier.CW e
12363e12c5d1SDavid du Colombiercommands:
12373e12c5d1SDavid du Colombier.P1
12383e12c5d1SDavid du Colombiere filename
12393e12c5d1SDavid du Colombieru
12403e12c5d1SDavid du Colombier.P2
12413e12c5d1SDavid du Colombierrestores the state of the file completely, including its name, dot,
12423e12c5d1SDavid du Colombierand modified bit.  Because of the undo, potentially dangerous commands
12433e12c5d1SDavid du Colombierare not guarded by confirmations.  Only
12443e12c5d1SDavid du Colombier.CW D ,
12453e12c5d1SDavid du Colombierwhich destroys the information necessary to restore itself, is protected.
12463e12c5d1SDavid du ColombierIt will not delete a modified file, but a second
12473e12c5d1SDavid du Colombier.CW D
12483e12c5d1SDavid du Colombierof the same file will succeed regardless.
12493e12c5d1SDavid du ColombierThe
12503e12c5d1SDavid du Colombier.CW q
12513e12c5d1SDavid du Colombiercommand, which exits
12523e12c5d1SDavid du Colombier.CW sam ,
12533e12c5d1SDavid du Colombieris similarly guarded.
12543e12c5d1SDavid du Colombier.SH 2
12553e12c5d1SDavid du ColombierMouse Interface
12563e12c5d1SDavid du Colombier.LP
12573e12c5d1SDavid du Colombier.CW Sam
12583e12c5d1SDavid du Colombieris most commonly run
12593e12c5d1SDavid du Colombierconnected to a bitmap display and mouse for interactive editing.
12603e12c5d1SDavid du ColombierThe only difference in the command language
12613e12c5d1SDavid du Colombierbetween regular, mouse-driven
12623e12c5d1SDavid du Colombier.CW sam
12633e12c5d1SDavid du Colombierand
12643e12c5d1SDavid du Colombier.CW sam\ -d
12653e12c5d1SDavid du Colombieris that if an address
12663e12c5d1SDavid du Colombieris provided without a command,
12673e12c5d1SDavid du Colombier.CW sam\ -d
12683e12c5d1SDavid du Colombierwill print the text referenced by the address, but
12693e12c5d1SDavid du Colombierregular
12703e12c5d1SDavid du Colombier.CW sam
12713e12c5d1SDavid du Colombierwill highlight it on the screen \(em in fact,
12723e12c5d1SDavid du Colombierdot is always highlighted (see Figure 2).
12733e12c5d1SDavid du Colombier.WS 1
12743e12c5d1SDavid du Colombier.KF
12753e12c5d1SDavid du Colombier.BP fig3.ps 2.04i
12763e12c5d1SDavid du Colombier.Cs
12773e12c5d1SDavid du ColombierFigure 2. A
12783e12c5d1SDavid du Colombier.CW sam
12793e12c5d1SDavid du Colombierwindow.  The scroll bar down the left
12803e12c5d1SDavid du Colombierrepresents the file, with the bubble showing the fraction
12813e12c5d1SDavid du Colombiervisible in the window.
12823e12c5d1SDavid du ColombierThe scroll bar may be manipulated by the mouse for convenient browsing.
12833e12c5d1SDavid du ColombierThe current text,
12843e12c5d1SDavid du Colombierwhich is highlighted, need not fit on a line.  Here it consists of one partial
12853e12c5d1SDavid du Colombierline, one complete line, and final partial line.
12863e12c5d1SDavid du Colombier.Ce
12873e12c5d1SDavid du Colombier.KE
12883e12c5d1SDavid du Colombier.PP
12893e12c5d1SDavid du ColombierEach file may have zero or more windows open on the display.
12903e12c5d1SDavid du ColombierAt any time, only one window in all of
12913e12c5d1SDavid du Colombier.CW sam
12923e12c5d1SDavid du Colombieris the
12933e12c5d1SDavid du Colombier.I
12943e12c5d1SDavid du Colombiercurrent window,
12953e12c5d1SDavid du Colombier.R
12963e12c5d1SDavid du Colombierthat is, the window to which typing and mouse actions refer;
12973e12c5d1SDavid du Colombierthis may be the
12983e12c5d1SDavid du Colombier.CW sam
12993e12c5d1SDavid du Colombierwindow (that in which commands may be typed)
13003e12c5d1SDavid du Colombieror one of the file windows.
13013e12c5d1SDavid du ColombierWhen a file has multiple windows, the image of the file in each window
13023e12c5d1SDavid du Colombieris always kept up to date.
13033e12c5d1SDavid du ColombierThe current file is the last file affected by a command,
13043e12c5d1SDavid du Colombierso if the
13053e12c5d1SDavid du Colombier.CW sam
13063e12c5d1SDavid du Colombierwindow is current,
13073e12c5d1SDavid du Colombierthe current window is not a window on the current file.
13083e12c5d1SDavid du ColombierHowever, each window on a file has its own value of dot,
13093e12c5d1SDavid du Colombierand when switching between windows on a single file,
13103e12c5d1SDavid du Colombierthe file's value of dot is changed to that of the window.
13113e12c5d1SDavid du ColombierThus, flipping between windows behaves in the obvious, convenient way.
13123e12c5d1SDavid du Colombier.PP
13133e12c5d1SDavid du ColombierThe mouse on the Blit has three buttons, numbered left to right.
13143e12c5d1SDavid du ColombierButton 3 has a list of commands to manipulate windows,
13153e12c5d1SDavid du Colombierfollowed by a list of `menu lines' exactly as printed by the
13163e12c5d1SDavid du Colombier.CW f
13173e12c5d1SDavid du Colombiercommand, one per file (not one per window).
13183e12c5d1SDavid du ColombierThese menu lines are sorted by file name.
13193e12c5d1SDavid du ColombierIf the list is long, the Blit menu software will make it more manageable
13203e12c5d1SDavid du Colombierby generating a scrolling menu instead of an unwieldy long list.
13213e12c5d1SDavid du ColombierUsing the menu to select a file from the list makes that file the current
13223e12c5d1SDavid du Colombierfile, and the most recently current window in that file the current window.
13233e12c5d1SDavid du ColombierBut if that file is already current, selecting it in the menu cycles through
13243e12c5d1SDavid du Colombierthe windows on the file; this simple trick avoids a special menu to
13253e12c5d1SDavid du Colombierchoose windows on a file.
13263e12c5d1SDavid du ColombierIf there is no window open on the file,
13273e12c5d1SDavid du Colombier.CW sam
13283e12c5d1SDavid du Colombierchanges the mouse cursor to prompt the user to create one.
13293e12c5d1SDavid du Colombier.PP
13303e12c5d1SDavid du ColombierThe commands on the button 3 menu are straightforward (see Figure 3), and
13313e12c5d1SDavid du Colombierare like the commands to manipulate windows in
13323e12c5d1SDavid du Colombier.CW mux ,\u\s-4\&8\s+4\d
13333e12c5d1SDavid du Colombierthe Blit's window system.
13343e12c5d1SDavid du Colombier.CW New
13353e12c5d1SDavid du Colombiermakes a new file, and gives it one empty window, whose size is determined
13363e12c5d1SDavid du Colombierby a rectangle swept by the mouse.
1337*219b2ee8SDavid du Colombier.CW Zerox
13383e12c5d1SDavid du Colombierprompts for a window to be selected, and
13393e12c5d1SDavid du Colombiermakes a clone of that window; this is how multiple windows are created on one file.
13403e12c5d1SDavid du Colombier.CW Reshape
13413e12c5d1SDavid du Colombierchanges the size of the indicated window, and
13423e12c5d1SDavid du Colombier.CW close
13433e12c5d1SDavid du Colombierdeletes it.  If that is the last window open on the file,
13443e12c5d1SDavid du Colombier.CW close
13453e12c5d1SDavid du Colombierfirst does a
13463e12c5d1SDavid du Colombier.CW D
13473e12c5d1SDavid du Colombiercommand on the file.
13483e12c5d1SDavid du Colombier.CW Write
13493e12c5d1SDavid du Colombieris identical to a
13503e12c5d1SDavid du Colombier.CW w
13513e12c5d1SDavid du Colombiercommand on the file; it is in the menu purely for convenience.
13523e12c5d1SDavid du ColombierFinally,
13533e12c5d1SDavid du Colombier.CW ~~sam~~
13543e12c5d1SDavid du Colombieris a menu item that appears between the commands and the file names.
13553e12c5d1SDavid du ColombierSelecting it makes the
13563e12c5d1SDavid du Colombier.CW sam
13573e12c5d1SDavid du Colombierwindow the current window,
13583e12c5d1SDavid du Colombiercausing subsequent typing to be interpreted as commands.
13593e12c5d1SDavid du Colombier.KF
13603e12c5d1SDavid du Colombier.BP fig2.ps 2.74i
13613e12c5d1SDavid du Colombier.Cs
13623e12c5d1SDavid du ColombierFigure 3. The menu on button 3.
13633e12c5d1SDavid du ColombierThe black rectangle on the left is a scroll bar; the menu is limited to
13643e12c5d1SDavid du Colombierthe length shown to prevent its becoming unwieldy.
13653e12c5d1SDavid du ColombierAbove the
13663e12c5d1SDavid du Colombier.CW ~~sam~~
13673e12c5d1SDavid du Colombierline is a list of commands;
13683e12c5d1SDavid du Colombierbeneath it is a list of files, presented exactly as with the
13693e12c5d1SDavid du Colombier.CW f
13703e12c5d1SDavid du Colombiercommand.
13713e12c5d1SDavid du Colombier.Ce
13723e12c5d1SDavid du Colombier.KE
13733e12c5d1SDavid du Colombier.PP
13743e12c5d1SDavid du ColombierWhen
13753e12c5d1SDavid du Colombier.CW sam
13763e12c5d1SDavid du Colombierrequests that a window be swept, in response to
13773e12c5d1SDavid du Colombier.CW new ,
1378*219b2ee8SDavid du Colombier.CW zerox
13793e12c5d1SDavid du Colombieror
13803e12c5d1SDavid du Colombier.CW reshape ,
13813e12c5d1SDavid du Colombierit changes the mouse cursor from the usual arrow to a box with
13823e12c5d1SDavid du Colombiera small arrow.
13833e12c5d1SDavid du ColombierIn this state, the mouse may be used to indicate an arbitrary rectangle by
13843e12c5d1SDavid du Colombierpressing button 3 at one corner and releasing it at the opposite corner.
13853e12c5d1SDavid du ColombierMore conveniently,
13863e12c5d1SDavid du Colombierbutton 3 may simply be clicked,
13873e12c5d1SDavid du Colombierwhereupon
13883e12c5d1SDavid du Colombier.CW sam
13893e12c5d1SDavid du Colombiercreates the maximal rectangle that contains the cursor
13903e12c5d1SDavid du Colombierand abuts the
13913e12c5d1SDavid du Colombier.CW sam
13923e12c5d1SDavid du Colombierwindow.
13933e12c5d1SDavid du ColombierBy placing the
13943e12c5d1SDavid du Colombier.CW sam
13953e12c5d1SDavid du Colombierwindow in the middle of the screen, the user can define two regions (one above,
13963e12c5d1SDavid du Colombierone below) in which stacked fully-overlapping
13973e12c5d1SDavid du Colombierwindows can be created with minimal fuss (see Figure 1).
13983e12c5d1SDavid du ColombierThis simple user interface trick makes window creation noticeably easier.
13993e12c5d1SDavid du Colombier.PP
14003e12c5d1SDavid du ColombierThe cut-and-paste editor is essentially the same as that in Smalltalk-80.\u\s-4\&11\s+4\d
14013e12c5d1SDavid du ColombierThe text in dot is always highlighted on the screen.
14023e12c5d1SDavid du ColombierWhen a character is typed it replaces dot, and sets dot to the null
14033e12c5d1SDavid du Colombierstring after the character.  Thus, ordinary typing inserts text.
14043e12c5d1SDavid du ColombierButton 1 is used for selection:
14053e12c5d1SDavid du Colombierpressing the button, moving the mouse, and lifting the button
14063e12c5d1SDavid du Colombierselects (sets dot to) the text between the points where the
14073e12c5d1SDavid du Colombierbutton was pressed and released.
14083e12c5d1SDavid du ColombierPressing and releasing at the same point selects a null string; this
14093e12c5d1SDavid du Colombieris called clicking.  Clicking twice quickly, or
14103e12c5d1SDavid du Colombier.I
14113e12c5d1SDavid du Colombierdouble clicking,
14123e12c5d1SDavid du Colombier.R
14133e12c5d1SDavid du Colombierselects larger objects;
14143e12c5d1SDavid du Colombierfor example, double clicking in a word selects the word,
14153e12c5d1SDavid du Colombierdouble clicking just inside an opening bracket selects the text
14163e12c5d1SDavid du Colombiercontained in the brackets (handling nested brackets correctly),
14173e12c5d1SDavid du Colombierand similarly for
14183e12c5d1SDavid du Colombierparentheses, quotes, and so on.
14193e12c5d1SDavid du ColombierThe double-clicking rules reflect a bias toward
14203e12c5d1SDavid du Colombierprogrammers.
14213e12c5d1SDavid du ColombierIf
14223e12c5d1SDavid du Colombier.CW sam
14233e12c5d1SDavid du Colombierwere intended more for word processing, double-clicks would probably
14243e12c5d1SDavid du Colombierselect linguistic structures such as sentences.
14253e12c5d1SDavid du Colombier.PP
14263e12c5d1SDavid du ColombierIf button 1 is pressed outside the current window, it makes the indicated
14273e12c5d1SDavid du Colombierwindow current.
14283e12c5d1SDavid du ColombierThis is the easiest way to switch between windows and files.
14293e12c5d1SDavid du Colombier.PP
14303e12c5d1SDavid du ColombierPressing button 2 brings up a menu of editing functions (see Figure 4).
14313e12c5d1SDavid du ColombierThese mostly apply to the selected text:
14323e12c5d1SDavid du Colombier.CW cut
14333e12c5d1SDavid du Colombierdeletes the selected text, and remembers it in a hidden buffer called the
14343e12c5d1SDavid du Colombier.I
14353e12c5d1SDavid du Colombiersnarf buffer,
14363e12c5d1SDavid du Colombier.R
14373e12c5d1SDavid du Colombier.CW paste
14383e12c5d1SDavid du Colombierreplaces the selected text by the contents of the snarf buffer,
14393e12c5d1SDavid du Colombier.CW snarf
14403e12c5d1SDavid du Colombierjust copies the selected text to the snarf buffer,
14413e12c5d1SDavid du Colombier.CW look
14423e12c5d1SDavid du Colombiersearches forward for the next literal occurrence of the selected text, and
14433e12c5d1SDavid du Colombier.CW <mux>
14443e12c5d1SDavid du Colombierexchanges snarf buffers with the window system in which
14453e12c5d1SDavid du Colombier.CW sam
14463e12c5d1SDavid du Colombieris running.
14473e12c5d1SDavid du ColombierFinally, the last regular expression used appears as a menu entry
14483e12c5d1SDavid du Colombierto search
14493e12c5d1SDavid du Colombierforward for the next occurrence of a match for the expression.
14503e12c5d1SDavid du Colombier.WS 1
14513e12c5d1SDavid du Colombier.KF
14523e12c5d1SDavid du Colombier.BP fig4.ps 1.20i
14533e12c5d1SDavid du Colombier.Cs
14543e12c5d1SDavid du ColombierFigure 4. The menu on button 2.
14553e12c5d1SDavid du ColombierThe bottom entry tracks the most recently used regular expression, which may
14563e12c5d1SDavid du Colombierbe literal text.
14573e12c5d1SDavid du Colombier.Ce
14583e12c5d1SDavid du Colombier.KE
14593e12c5d1SDavid du Colombier.PP
14603e12c5d1SDavid du ColombierThe relationship between the command language and the mouse language is
14613e12c5d1SDavid du Colombierentirely due to the equality of dot and the selected text chosen
14623e12c5d1SDavid du Colombierwith button 1 on the mouse.
14633e12c5d1SDavid du ColombierFor example, to make a set of changes in a C subroutine, dot can be
14643e12c5d1SDavid du Colombierset by double clicking on the left brace that begins the subroutine,
14653e12c5d1SDavid du Colombierwhich sets dot for the command language.
14663e12c5d1SDavid du ColombierAn address-free command then typed in the
14673e12c5d1SDavid du Colombier.CW sam
14683e12c5d1SDavid du Colombierwindow will apply only to the text between the opening and closing
14693e12c5d1SDavid du Colombierbraces of the function.
14703e12c5d1SDavid du ColombierThe idea is to select what you want, and then say what you want
14713e12c5d1SDavid du Colombierto do with it, whether invoked by a menu selection or by a typed command.
14723e12c5d1SDavid du ColombierAnd of course, the value of dot is highlighted on
14733e12c5d1SDavid du Colombierthe display after the command completes.
14743e12c5d1SDavid du ColombierThis relationship between mouse interface and command language
14753e12c5d1SDavid du Colombieris clumsy to explain, but comfortable, even natural, in practice.
14763e12c5d1SDavid du Colombier.SH
14773e12c5d1SDavid du ColombierThe Implementation
14783e12c5d1SDavid du Colombier.LP
14793e12c5d1SDavid du ColombierThe next few sections describe how
14803e12c5d1SDavid du Colombier.CW sam
14813e12c5d1SDavid du Colombieris put together, first the host part,
14823e12c5d1SDavid du Colombierthen the inter-component communication,
14833e12c5d1SDavid du Colombierthen the terminal part.
14843e12c5d1SDavid du ColombierAfter explaining how the command language is implemented,
14853e12c5d1SDavid du Colombierthe discussion follows (roughly) the path of a character
14863e12c5d1SDavid du Colombierfrom the temporary file on disc to the screen.
14873e12c5d1SDavid du ColombierThe presentation centers on the data structures,
14883e12c5d1SDavid du Colombierbecause that is how the program was designed and because
14893e12c5d1SDavid du Colombierthe algorithms are easy to provide, given the right data
14903e12c5d1SDavid du Colombierstructures.
14913e12c5d1SDavid du Colombier.SH 2
14923e12c5d1SDavid du ColombierParsing and execution
14933e12c5d1SDavid du Colombier.LP
14943e12c5d1SDavid du ColombierThe command language is interpreted by parsing each command with a
14953e12c5d1SDavid du Colombiertable-driven recursive
14963e12c5d1SDavid du Colombierdescent parser, and when a complete command is assembled, invoking a top-down
14973e12c5d1SDavid du Colombierexecutor.
14983e12c5d1SDavid du ColombierMost editors instead employ a simple character-at-a-time
14993e12c5d1SDavid du Colombierlexical scanner.
15003e12c5d1SDavid du ColombierUse of a parser makes it
15013e12c5d1SDavid du Colombiereasy and unambiguous to detect when a command is complete,
15023e12c5d1SDavid du Colombierwhich has two advantages.
15033e12c5d1SDavid du ColombierFirst, escape conventions such as backslashes to quote
15043e12c5d1SDavid du Colombiermultiple-line commands are unnecessary;  if the command isn't finished,
15053e12c5d1SDavid du Colombierthe parser keeps reading.  For example, a multiple-line append driven by an
15063e12c5d1SDavid du Colombier.CW x
15073e12c5d1SDavid du Colombiercommand is straightforward:
15083e12c5d1SDavid du Colombier.P1
15093e12c5d1SDavid du Colombierx/.*\en/ g/Peter/ a
15103e12c5d1SDavid du Colombierone line about Peter
15113e12c5d1SDavid du Colombieranother line about Peter
15123e12c5d1SDavid du Colombier\&.
15133e12c5d1SDavid du Colombier.P2
1514*219b2ee8SDavid du ColombierOther UNIX editors would require a backslash after all but the last line.
15153e12c5d1SDavid du Colombier.PP
15163e12c5d1SDavid du ColombierThe other advantage is specific to the two-process structure of
15173e12c5d1SDavid du Colombier.CW sam .
15183e12c5d1SDavid du ColombierThe host process must decide when a command is completed so the
15193e12c5d1SDavid du Colombiercommand interpreter can be called.  This problem is easily resolved
15203e12c5d1SDavid du Colombierby having the lexical analyzer read the single stream of events from the
15213e12c5d1SDavid du Colombierterminal, directly executing all typing and mouse commands,
15223e12c5d1SDavid du Colombierbut passing to the parser characters typed to the
15233e12c5d1SDavid du Colombier.CW sam
15243e12c5d1SDavid du Colombiercommand window.
15253e12c5d1SDavid du ColombierThis scheme is slightly complicated by the availability of cut-and-paste
15263e12c5d1SDavid du Colombierediting in the
15273e12c5d1SDavid du Colombier.CW sam
15283e12c5d1SDavid du Colombierwindow, but that difficulty is resolved by applying the rules
15293e12c5d1SDavid du Colombierused in
15303e12c5d1SDavid du Colombier.CW mux :
15313e12c5d1SDavid du Colombierwhen a newline is typed to the
15323e12c5d1SDavid du Colombier.CW sam
15333e12c5d1SDavid du Colombierwindow, all text between the newline and the previously typed newline
15343e12c5d1SDavid du Colombieris made available to the parser.
15353e12c5d1SDavid du ColombierThis permits arbitrary editing to be done to a command before
15363e12c5d1SDavid du Colombiertyping newline and thereby requesting execution.
15373e12c5d1SDavid du Colombier.PP
15383e12c5d1SDavid du ColombierThe parser is driven by a table because the syntax of addresses
15393e12c5d1SDavid du Colombierand commands is regular enough
15403e12c5d1SDavid du Colombierto be encoded compactly.  There are few special cases, such as the
15413e12c5d1SDavid du Colombierreplacement text in a substitution, so the syntax of almost all commands
15423e12c5d1SDavid du Colombiercan be encoded with a few flags.
15433e12c5d1SDavid du ColombierThese include whether the command allows an address (for example,
15443e12c5d1SDavid du Colombier.CW e
15453e12c5d1SDavid du Colombierdoes not), whether it takes a regular expression (as in
15463e12c5d1SDavid du Colombier.CW x
15473e12c5d1SDavid du Colombierand
15483e12c5d1SDavid du Colombier.CW s ),
15493e12c5d1SDavid du Colombierwhether it takes replacement text (as in
15503e12c5d1SDavid du Colombier.CW c
15513e12c5d1SDavid du Colombieror
15523e12c5d1SDavid du Colombier.CW i ),
15533e12c5d1SDavid du Colombierwhich may be multi-line, and so on.
15543e12c5d1SDavid du ColombierThe internal syntax of regular expressions is handled by a separate
15553e12c5d1SDavid du Colombierparser; a regular expression is a leaf of the command parse tree.
15563e12c5d1SDavid du ColombierRegular expressions are discussed fully in the next section.
15573e12c5d1SDavid du Colombier.PP
15583e12c5d1SDavid du ColombierThe parser table also has information about defaults, so the interpreter
15593e12c5d1SDavid du Colombieris always called with a complete tree.  For example, the parser fills in
15603e12c5d1SDavid du Colombierthe implicit
15613e12c5d1SDavid du Colombier.CW 0
15623e12c5d1SDavid du Colombierand
15633e12c5d1SDavid du Colombier.CW $
15643e12c5d1SDavid du Colombierin the abbreviated address
15653e12c5d1SDavid du Colombier.CW ,
15663e12c5d1SDavid du Colombier(comma),
15673e12c5d1SDavid du Colombierinserts a
15683e12c5d1SDavid du Colombier.CW +
15693e12c5d1SDavid du Colombierto the left of an unadorned regular expression in an address,
15703e12c5d1SDavid du Colombierand provides the usual default address
15713e12c5d1SDavid du Colombier.CW .
15723e12c5d1SDavid du Colombier(dot) for commands that expect an address but are not given one.
15733e12c5d1SDavid du Colombier.PP
15743e12c5d1SDavid du ColombierOnce a complete command is parsed, the evaluation is easy.
15753e12c5d1SDavid du ColombierThe address is evaluated left-to-right starting from the value of dot,
15763e12c5d1SDavid du Colombierwith a mostly ordinary expression evaluator.
15773e12c5d1SDavid du ColombierAddresses, like many of the data structures in
15783e12c5d1SDavid du Colombier.CW sam ,
15793e12c5d1SDavid du Colombierare held in a C structure and passed around by value:
15803e12c5d1SDavid du Colombier.P1
15813e12c5d1SDavid du Colombiertypedef long Posn;    /* Position in a file */
15823e12c5d1SDavid du Colombiertypedef struct Range{
15833e12c5d1SDavid du Colombier        Posn    p1, p2;
15843e12c5d1SDavid du Colombier}Range;
15853e12c5d1SDavid du Colombiertypedef struct Address{
15863e12c5d1SDavid du Colombier        Range   r;
15873e12c5d1SDavid du Colombier        File    *f;
15883e12c5d1SDavid du Colombier}Address;
15893e12c5d1SDavid du Colombier.P2
15903e12c5d1SDavid du ColombierAn address is encoded as a substring (character positions
15913e12c5d1SDavid du Colombier.CW p1
15923e12c5d1SDavid du Colombierto
15933e12c5d1SDavid du Colombier.CW p2 )
15943e12c5d1SDavid du Colombierin a file
15953e12c5d1SDavid du Colombier.CW f .
15963e12c5d1SDavid du Colombier(The data type
15973e12c5d1SDavid du Colombier.CW File
15983e12c5d1SDavid du Colombieris described in detail below.)
15993e12c5d1SDavid du Colombier.PP
16003e12c5d1SDavid du ColombierThe address interpreter is an
16013e12c5d1SDavid du Colombier.CW Address -valued
16023e12c5d1SDavid du Colombierfunction that traverses the parse tree describing an address (the
16033e12c5d1SDavid du Colombierparse tree for the address has type
16043e12c5d1SDavid du Colombier.CW Addrtree ):
16053e12c5d1SDavid du Colombier.P1
16063e12c5d1SDavid du ColombierAddress
16073e12c5d1SDavid du Colombieraddress(ap, a, sign)
16083e12c5d1SDavid du Colombier	Addrtree *ap;
16093e12c5d1SDavid du Colombier	Address a;
16103e12c5d1SDavid du Colombier	int sign;
16113e12c5d1SDavid du Colombier{
16123e12c5d1SDavid du Colombier	Address a2;
16133e12c5d1SDavid du Colombier	do
16143e12c5d1SDavid du Colombier		switch(ap->type){
16153e12c5d1SDavid du Colombier		case '.':
16163e12c5d1SDavid du Colombier			a=a.f->dot;
16173e12c5d1SDavid du Colombier			break;
16183e12c5d1SDavid du Colombier		case '$':
16193e12c5d1SDavid du Colombier			a.r.p1=a.r.p2=a.f->nbytes;
16203e12c5d1SDavid du Colombier			break;
16213e12c5d1SDavid du Colombier		case '"':
16223e12c5d1SDavid du Colombier			a=matchfile(a, ap->aregexp)->dot;
16233e12c5d1SDavid du Colombier			break;
16243e12c5d1SDavid du Colombier		case ',':
16253e12c5d1SDavid du Colombier			a2=address(ap->right, a, 0);
16263e12c5d1SDavid du Colombier			a=address(ap->left, a, 0);
16273e12c5d1SDavid du Colombier			if(a.f!=a2.f || a2.r.p2<a.r.p1)
16283e12c5d1SDavid du Colombier				error(Eorder);
16293e12c5d1SDavid du Colombier			a.r.p2=a2.r.p2;
16303e12c5d1SDavid du Colombier			return a;
16313e12c5d1SDavid du Colombier		/* and so on */
16323e12c5d1SDavid du Colombier		}
16333e12c5d1SDavid du Colombier	while((ap=ap->right)!=0);
16343e12c5d1SDavid du Colombier	return a;
16353e12c5d1SDavid du Colombier}
16363e12c5d1SDavid du Colombier.P2
16373e12c5d1SDavid du Colombier.PP
16383e12c5d1SDavid du ColombierThroughout, errors are handled by a non-local
16393e12c5d1SDavid du Colombier.CW goto
16403e12c5d1SDavid du Colombier(a
16413e12c5d1SDavid du Colombier.CW setjmp/longjmp
16423e12c5d1SDavid du Colombierin C terminology)
16433e12c5d1SDavid du Colombierhidden in a routine called
16443e12c5d1SDavid du Colombier.CW error
16453e12c5d1SDavid du Colombierthat immediately aborts the execution, retracts any
16463e12c5d1SDavid du Colombierpartially made changes (see the section below on `undoing'), and
16473e12c5d1SDavid du Colombierreturns to the top level of the parser.
16483e12c5d1SDavid du ColombierThe argument to
16493e12c5d1SDavid du Colombier.CW error
16503e12c5d1SDavid du Colombieris an enumeration type that
16513e12c5d1SDavid du Colombieris translated to a terse but possibly helpful
16523e12c5d1SDavid du Colombiermessage such as `?addresses out of order.'
16533e12c5d1SDavid du ColombierVery common messages are kept short; for example the message for
16543e12c5d1SDavid du Colombiera failed regular expression search is `?search.'
16553e12c5d1SDavid du Colombier.PP
16563e12c5d1SDavid du ColombierCharacter addresses such as
16573e12c5d1SDavid du Colombier.CW #3
16583e12c5d1SDavid du Colombierare trivial to implement, as the
16593e12c5d1SDavid du Colombier.CW File
16603e12c5d1SDavid du Colombierdata structure is accessible by character number.
16613e12c5d1SDavid du ColombierHowever,
16623e12c5d1SDavid du Colombier.CW sam
16633e12c5d1SDavid du Colombierkeeps no information about the position of newlines \(em it is too
16643e12c5d1SDavid du Colombierexpensive to track dynamically \(em so line addresses are computed by reading
16653e12c5d1SDavid du Colombierthe file, counting newlines.  Except in very large files, this has proven
16663e12c5d1SDavid du Colombieracceptable: file access is fast enough to make the technique practical,
16673e12c5d1SDavid du Colombierand lines are not central to the structure of the command language.
16683e12c5d1SDavid du Colombier.PP
16693e12c5d1SDavid du ColombierThe command interpreter, called
16703e12c5d1SDavid du Colombier.CW cmdexec ,
16713e12c5d1SDavid du Colombieris also straightforward.  The parse table includes a
16723e12c5d1SDavid du Colombierfunction to call to interpret a particular command.  That function
16733e12c5d1SDavid du Colombierreceives as arguments
16743e12c5d1SDavid du Colombierthe calculated address
16753e12c5d1SDavid du Colombierfor the command
16763e12c5d1SDavid du Colombierand the command tree (of type
16773e12c5d1SDavid du Colombier.CW Cmdtree ),
16783e12c5d1SDavid du Colombierwhich may contain information such as the subtree for compound commands.
16793e12c5d1SDavid du ColombierHere, for example, is the function for the
16803e12c5d1SDavid du Colombier.CW g
16813e12c5d1SDavid du Colombierand
16823e12c5d1SDavid du Colombier.CW v
16833e12c5d1SDavid du Colombiercommands:
16843e12c5d1SDavid du Colombier.P1
16853e12c5d1SDavid du Colombierint
16863e12c5d1SDavid du Colombierg_cmd(a, cp)
16873e12c5d1SDavid du Colombier	Address a;
16883e12c5d1SDavid du Colombier	Cmdtree *cp;
16893e12c5d1SDavid du Colombier{
16903e12c5d1SDavid du Colombier	compile(cp->regexp);
16913e12c5d1SDavid du Colombier	if(execute(a.f, a.r.p1, a.r.p2)!=(cp->cmdchar=='v')){
16923e12c5d1SDavid du Colombier		a.f->dot=a;
16933e12c5d1SDavid du Colombier		return cmdexec(a, cp->subcmd);
16943e12c5d1SDavid du Colombier	}
1695*219b2ee8SDavid du Colombier	return TRUE;	/* cause execution to continue */
16963e12c5d1SDavid du Colombier}
16973e12c5d1SDavid du Colombier.P2
16983e12c5d1SDavid du Colombier.CW Compile "" (
16993e12c5d1SDavid du Colombierand
17003e12c5d1SDavid du Colombier.CW execute
17013e12c5d1SDavid du Colombierare part of the regular expression code, described in the next section.)
17023e12c5d1SDavid du ColombierBecause the parser and the
17033e12c5d1SDavid du Colombier.CW File
17043e12c5d1SDavid du Colombierdata structure do most of the work, most commands
17053e12c5d1SDavid du Colombierare similarly brief.
17063e12c5d1SDavid du Colombier.SH 2
17073e12c5d1SDavid du ColombierRegular expressions
17083e12c5d1SDavid du Colombier.LP
17093e12c5d1SDavid du ColombierThe regular expression code in
17103e12c5d1SDavid du Colombier.CW sam
17113e12c5d1SDavid du Colombieris an interpreted, rather than compiled on-the-fly, implementation of Thompson's
17123e12c5d1SDavid du Colombiernon-deterministic finite automaton algorithm.\u\s-4\&12\s+4\d
1713*219b2ee8SDavid du ColombierThe syntax and semantics of the expressions are as in the UNIX program
17143e12c5d1SDavid du Colombier.CW egrep ,
17153e12c5d1SDavid du Colombierincluding alternation, closures, character classes, and so on.
17163e12c5d1SDavid du ColombierThe only changes in the notation are two additions:
17173e12c5d1SDavid du Colombier.CW \en
17183e12c5d1SDavid du Colombieris translated to, and matches, a newline character, and
17193e12c5d1SDavid du Colombier.CW @
17203e12c5d1SDavid du Colombiermatches any character.  In
17213e12c5d1SDavid du Colombier.CW egrep ,
17223e12c5d1SDavid du Colombierthe character
17233e12c5d1SDavid du Colombier.CW \&.
17243e12c5d1SDavid du Colombiermatches any character except newline, and in
17253e12c5d1SDavid du Colombier.CW sam
17263e12c5d1SDavid du Colombierthe same rule seemed safest, to prevent idioms like
17273e12c5d1SDavid du Colombier.CW \&.*
17283e12c5d1SDavid du Colombierfrom spanning newlines.
17293e12c5d1SDavid du Colombier.CW Egrep
17303e12c5d1SDavid du Colombierexpressions are arguably too complicated for an interactive editor \(em
17313e12c5d1SDavid du Colombiercertainly it would make sense if all the special characters were two-character
17323e12c5d1SDavid du Colombiersequences, so that most of the punctuation characters wouldn't have
17333e12c5d1SDavid du Colombierpeculiar meanings \(em but for an interesting command language, full
17343e12c5d1SDavid du Colombierregular expressions are necessary, and
17353e12c5d1SDavid du Colombier.CW egrep
1736*219b2ee8SDavid du Colombierdefines the full regular expression syntax for UNIX programs.
1737*219b2ee8SDavid du ColombierAlso, it seemed superfluous to define a new syntax, since various UNIX programs
17383e12c5d1SDavid du Colombier.CW ed , (
17393e12c5d1SDavid du Colombier.CW egrep
17403e12c5d1SDavid du Colombierand
17413e12c5d1SDavid du Colombier.CW vi )
17423e12c5d1SDavid du Colombierdefine too many already.
17433e12c5d1SDavid du Colombier.PP
17443e12c5d1SDavid du ColombierThe expressions are compiled by a routine,
17453e12c5d1SDavid du Colombier.CW compile ,
17463e12c5d1SDavid du Colombierthat generates the description of the non-deterministic finite state machine.
17473e12c5d1SDavid du ColombierA second routine,
17483e12c5d1SDavid du Colombier.CW execute ,
17493e12c5d1SDavid du Colombierinterprets the machine to generate the leftmost-longest match of the
17503e12c5d1SDavid du Colombierexpression in a substring of the file.
17513e12c5d1SDavid du ColombierThe algorithm is described elsewhere.\u\s-4\&12,13\s+4\d
17523e12c5d1SDavid du Colombier.CW Execute
17533e12c5d1SDavid du Colombierreports
17543e12c5d1SDavid du Colombierwhether a match was found, and sets a global variable,
17553e12c5d1SDavid du Colombierof type
17563e12c5d1SDavid du Colombier.CW Range ,
17573e12c5d1SDavid du Colombierto the substring matched.
17583e12c5d1SDavid du Colombier.PP
17593e12c5d1SDavid du ColombierA trick is required to evaluate the expression in reverse, such as when
17603e12c5d1SDavid du Colombiersearching backwards for an expression.
17613e12c5d1SDavid du ColombierFor example,
17623e12c5d1SDavid du Colombier.P1
17633e12c5d1SDavid du Colombier-/P.*r/
17643e12c5d1SDavid du Colombier.P2
17653e12c5d1SDavid du Colombierlooks backwards through the file for a match of the expression.
17663e12c5d1SDavid du ColombierThe expression, however, is defined for a forward search.
17673e12c5d1SDavid du ColombierThe solution is to construct a machine identical to the machine
17683e12c5d1SDavid du Colombierfor a forward search except for a reversal of all the concatenation
17693e12c5d1SDavid du Colombieroperators (the other operators are symmetric under direction reversal),
17703e12c5d1SDavid du Colombierto exchange the meaning of the operators
17713e12c5d1SDavid du Colombier.CW ^
17723e12c5d1SDavid du Colombierand
17733e12c5d1SDavid du Colombier.CW $ ,
17743e12c5d1SDavid du Colombierand then to read the file backwards, looking for the
17753e12c5d1SDavid du Colombierusual earliest longest match.
17763e12c5d1SDavid du Colombier.PP
17773e12c5d1SDavid du Colombier.CW Execute
17783e12c5d1SDavid du Colombiergenerates only one match each time it is called.
17793e12c5d1SDavid du ColombierTo interpret looping constructs such as the
17803e12c5d1SDavid du Colombier.CW x
17813e12c5d1SDavid du Colombiercommand,
17823e12c5d1SDavid du Colombier.CW sam
17833e12c5d1SDavid du Colombiermust therefore synchronize between
17843e12c5d1SDavid du Colombiercalls of
17853e12c5d1SDavid du Colombier.CW execute
17863e12c5d1SDavid du Colombierto avoid
17873e12c5d1SDavid du Colombierproblems with null matches.
17883e12c5d1SDavid du ColombierFor example, even given the leftmost-longest rule,
17893e12c5d1SDavid du Colombierthe expression
17903e12c5d1SDavid du Colombier.CW a*
17913e12c5d1SDavid du Colombiermatches three times in the string
17923e12c5d1SDavid du Colombier.CW ab
17933e12c5d1SDavid du Colombier(the character
17943e12c5d1SDavid du Colombier.CW a ,
17953e12c5d1SDavid du Colombierthe null string between the
17963e12c5d1SDavid du Colombier.CW a
17973e12c5d1SDavid du Colombierand
17983e12c5d1SDavid du Colombier.CW b ,
17993e12c5d1SDavid du Colombierand the final null string).
18003e12c5d1SDavid du ColombierAfter returning a match for the
18013e12c5d1SDavid du Colombier.CW a ,
18023e12c5d1SDavid du Colombier.CW sam
18033e12c5d1SDavid du Colombiermust not match the null string before the
18043e12c5d1SDavid du Colombier.CW b .
18053e12c5d1SDavid du ColombierThe algorithm starts
18063e12c5d1SDavid du Colombier.CW execute
18073e12c5d1SDavid du Colombierat the end of its previous match, and
18083e12c5d1SDavid du Colombierif the match it returns
18093e12c5d1SDavid du Colombieris null and abuts the previous match, rejects the match and advances
18103e12c5d1SDavid du Colombierthe initial position one character.
18113e12c5d1SDavid du Colombier.SH 2
18123e12c5d1SDavid du ColombierMemory allocation
18133e12c5d1SDavid du Colombier.LP
18143e12c5d1SDavid du ColombierThe C language has no memory allocation primitives, although a standard
18153e12c5d1SDavid du Colombierlibrary routine,
18163e12c5d1SDavid du Colombier.CW malloc ,
18173e12c5d1SDavid du Colombierprovides adequate service for simple programs.
18183e12c5d1SDavid du ColombierFor specific uses, however,
18193e12c5d1SDavid du Colombierit can be better to write a custom allocator.
18203e12c5d1SDavid du ColombierThe allocator (or rather, pair of allocators) described here
18213e12c5d1SDavid du Colombierwork in both the terminal and host parts of
18223e12c5d1SDavid du Colombier.CW sam .
18233e12c5d1SDavid du ColombierThey are designed for efficient manipulation of strings,
18243e12c5d1SDavid du Colombierwhich are allocated and freed frequently and vary in length from essentially
18253e12c5d1SDavid du Colombierzero to 32 Kbytes (very large strings are written to disc).
18263e12c5d1SDavid du ColombierMore important, strings may be large and change size often,
18273e12c5d1SDavid du Colombierso to minimize memory usage it is helpful to reclaim and to coalesce the
18283e12c5d1SDavid du Colombierunused portions of strings when they are truncated.
18293e12c5d1SDavid du Colombier.PP
18303e12c5d1SDavid du ColombierObjects to be allocated in
18313e12c5d1SDavid du Colombier.CW sam
18323e12c5d1SDavid du Colombierare of two flavors:
18333e12c5d1SDavid du Colombierthe first is C
18343e12c5d1SDavid du Colombier.CW structs ,
18353e12c5d1SDavid du Colombierwhich are small and often addressed by pointer variables;
18363e12c5d1SDavid du Colombierthe second is variable-sized arrays of characters
18373e12c5d1SDavid du Colombieror integers whose
18383e12c5d1SDavid du Colombierbase pointer is always used to access them.
18393e12c5d1SDavid du ColombierThe memory allocator in
18403e12c5d1SDavid du Colombier.CW sam
18413e12c5d1SDavid du Colombieris therefore in two parts:
18423e12c5d1SDavid du Colombierfirst, a traditional first-fit allocator that provides fixed storage for
18433e12c5d1SDavid du Colombier.CW structs ;
18443e12c5d1SDavid du Colombierand second, a garbage-compacting allocator that reduces storage
18453e12c5d1SDavid du Colombieroverhead for variable-sized objects, at the cost of some bookkeeping.
18463e12c5d1SDavid du ColombierThe two types of objects are allocated from adjoining arenas, with
18473e12c5d1SDavid du Colombierthe garbage-compacting allocator controlling the arena with higher addresses.
18483e12c5d1SDavid du ColombierSeparating into two arenas simplifies compaction and prevents fragmentation due
18493e12c5d1SDavid du Colombierto immovable objects.
18503e12c5d1SDavid du ColombierThe access rules for garbage-compactable objects
18513e12c5d1SDavid du Colombier(discussed in the next paragraph) allow them to be relocated, so when
18523e12c5d1SDavid du Colombierthe first-fit arena needs space, it moves the garbage-compacted arena
18533e12c5d1SDavid du Colombierto higher addresses to make room.  Storage is therefore created only
18543e12c5d1SDavid du Colombierat successively higher addresses, either when more garbage-compacted
18553e12c5d1SDavid du Colombierspace is needed or when the first-fit arena pushes up the other arena.
18563e12c5d1SDavid du Colombier.PP
18573e12c5d1SDavid du ColombierObjects that may be compacted declare to the
18583e12c5d1SDavid du Colombierallocator a cell that is guaranteed to be the sole repository of the
18593e12c5d1SDavid du Colombieraddress of the object whenever a compaction can occur.
18603e12c5d1SDavid du ColombierThe compactor can then update the address when the object is moved.
18613e12c5d1SDavid du ColombierFor example, the implementation of type
18623e12c5d1SDavid du Colombier.CW List
18633e12c5d1SDavid du Colombier(really a variable-length array)
18643e12c5d1SDavid du Colombieris:
18653e12c5d1SDavid du Colombier.P1
18663e12c5d1SDavid du Colombiertypedef struct List{
18673e12c5d1SDavid du Colombier        int     nused;
18683e12c5d1SDavid du Colombier        long    *ptr;
18693e12c5d1SDavid du Colombier}List;
18703e12c5d1SDavid du Colombier.P2
18713e12c5d1SDavid du ColombierThe
18723e12c5d1SDavid du Colombier.CW ptr
18733e12c5d1SDavid du Colombiercell must always be used directly, and never copied.  When a
18743e12c5d1SDavid du Colombier.CW List
18753e12c5d1SDavid du Colombieris to be created the
18763e12c5d1SDavid du Colombier.CW List
18773e12c5d1SDavid du Colombierstructure is allocated in the ordinary first-fit arena
18783e12c5d1SDavid du Colombierand its
18793e12c5d1SDavid du Colombier.CW ptr
18803e12c5d1SDavid du Colombieris allocated in the garbage-compacted arena.
18813e12c5d1SDavid du ColombierA similar data type for strings, called
18823e12c5d1SDavid du Colombier.CW String ,
18833e12c5d1SDavid du Colombierstores variable-length character arrays of up to 32767 elements.
18843e12c5d1SDavid du Colombier.PP
18853e12c5d1SDavid du ColombierA related matter of programming style:
18863e12c5d1SDavid du Colombier.CW sam
18873e12c5d1SDavid du Colombierfrequently passes structures by value, which
18883e12c5d1SDavid du Colombiersimplifies the code.
18893e12c5d1SDavid du ColombierTraditionally, C programs have
18903e12c5d1SDavid du Colombierpassed structures by reference, but implicit allocation on
18913e12c5d1SDavid du Colombierthe stack is easier to use.
18923e12c5d1SDavid du ColombierStructure passing is a relatively new feature of C
18933e12c5d1SDavid du Colombier(it is not in the
18943e12c5d1SDavid du Colombierstandard reference manual for C\u\s-4\&14\s+4\d), and is poorly supported in most
18953e12c5d1SDavid du Colombiercommercial C compilers.
18963e12c5d1SDavid du ColombierIt's convenient and expressive, though,
18973e12c5d1SDavid du Colombierand simplifies memory management by
18983e12c5d1SDavid du Colombieravoiding the allocator altogether
18993e12c5d1SDavid du Colombierand eliminating pointer aliases.
19003e12c5d1SDavid du Colombier.SH 2
19013e12c5d1SDavid du ColombierData structures for manipulating files
19023e12c5d1SDavid du Colombier.LP
19033e12c5d1SDavid du ColombierExperience with
19043e12c5d1SDavid du Colombier.CW jim
19053e12c5d1SDavid du Colombiershowed that the requirements
19063e12c5d1SDavid du Colombierof the file data structure were few, but strict.
19073e12c5d1SDavid du ColombierFirst, files need to be read and written quickly;
19083e12c5d1SDavid du Colombieradding a fresh file must be painless.
19093e12c5d1SDavid du ColombierSecond, the implementation must place no arbitrary upper limit on
19103e12c5d1SDavid du Colombierthe number or sizes of files.  (It should be practical to edit many files,
19113e12c5d1SDavid du Colombierand files up to megabytes in length should be handled gracefully.)
19123e12c5d1SDavid du ColombierThis implies that files be stored on disc, not in main memory.
19133e12c5d1SDavid du Colombier(Aficionados of virtual memory may argue otherwise, but the
19143e12c5d1SDavid du Colombierimplementation of virtual
19153e12c5d1SDavid du Colombiermemory in our system is not something to depend on
19163e12c5d1SDavid du Colombierfor good performance.)
19173e12c5d1SDavid du ColombierThird, changes to files need be made by only two primitives:
19183e12c5d1SDavid du Colombierdeletion and insertion.
19193e12c5d1SDavid du ColombierThese are inverses of each other,
19203e12c5d1SDavid du Colombierwhich simplifies the implementation of the undo operation.
19213e12c5d1SDavid du ColombierFinally,
19223e12c5d1SDavid du Colombierit must be easy and efficient to access the file, either
19233e12c5d1SDavid du Colombierforwards or backwards, a byte at a time.
19243e12c5d1SDavid du Colombier.PP
19253e12c5d1SDavid du ColombierThe
19263e12c5d1SDavid du Colombier.CW File
19273e12c5d1SDavid du Colombierdata type is constructed from three simpler data structures that hold arrays
19283e12c5d1SDavid du Colombierof characters.
19293e12c5d1SDavid du ColombierEach of these types has an insertion and deletion operator, and the
19303e12c5d1SDavid du Colombierinsertion and deletion operators of the
19313e12c5d1SDavid du Colombier.CW File
19323e12c5d1SDavid du Colombiertype itself are constructed from them.
19333e12c5d1SDavid du Colombier.PP
19343e12c5d1SDavid du ColombierThe simplest type is the
19353e12c5d1SDavid du Colombier.CW String ,
19363e12c5d1SDavid du Colombierwhich is used to hold strings in main memory.
19373e12c5d1SDavid du ColombierThe code that manages
19383e12c5d1SDavid du Colombier.CW Strings
19393e12c5d1SDavid du Colombierguarantees that they will never be longer
19403e12c5d1SDavid du Colombierthan some moderate size, and in practice they are rarely larger than 8 Kbytes.
19413e12c5d1SDavid du Colombier.CW Strings
19423e12c5d1SDavid du Colombierhave two purposes: they hold short strings like file names with little overhead,
19433e12c5d1SDavid du Colombierand because they are deliberately small, they are efficient to modify.
19443e12c5d1SDavid du ColombierThey are therefore used as the data structure for in-memory caches.
19453e12c5d1SDavid du Colombier.PP
19463e12c5d1SDavid du ColombierThe disc copy of the file is managed by a data structure called a
19473e12c5d1SDavid du Colombier.CW Disc ,
19483e12c5d1SDavid du Colombierwhich corresponds to a temporary file.  A
19493e12c5d1SDavid du Colombier.CW Disc
19503e12c5d1SDavid du Colombierhas no storage in main memory other than bookkeeping information;
19513e12c5d1SDavid du Colombierthe actual data being held is all on the disc.
19523e12c5d1SDavid du ColombierTo reduce the number of open files needed,
19533e12c5d1SDavid du Colombier.CW sam
1954*219b2ee8SDavid du Colombieropens a dozen temporary UNIX files and multiplexes the
19553e12c5d1SDavid du Colombier.CW Discs
19563e12c5d1SDavid du Colombierupon them.
19573e12c5d1SDavid du ColombierThis permits many files to
19583e12c5d1SDavid du Colombierbe edited; the entire
19593e12c5d1SDavid du Colombier.CW sam
19603e12c5d1SDavid du Colombiersource (48 files) may be edited comfortably with a single
19613e12c5d1SDavid du Colombierinstance of
19623e12c5d1SDavid du Colombier.CW sam .
19633e12c5d1SDavid du ColombierAllocating one temporary file per
19643e12c5d1SDavid du Colombier.CW Disc
19653e12c5d1SDavid du Colombierwould strain the operating system's limit on the number of open files.
19663e12c5d1SDavid du ColombierAlso, spreading the traffic among temporary files keeps the files shorter,
1967*219b2ee8SDavid du Colombierand shorter files are more efficiently implemented by the UNIX
19683e12c5d1SDavid du ColombierI/O subsystem.
19693e12c5d1SDavid du Colombier.PP
19703e12c5d1SDavid du ColombierA
19713e12c5d1SDavid du Colombier.CW Disc
19723e12c5d1SDavid du Colombieris an array of fixed-length blocks, each of which contains
19733e12c5d1SDavid du Colombierbetween 1 and 4096 characters of active data.
1974*219b2ee8SDavid du Colombier(The block size of our UNIX file system is 4096 bytes.)
19753e12c5d1SDavid du ColombierThe block addresses within the temporary file and the length of each
19763e12c5d1SDavid du Colombierblock are stored in a
19773e12c5d1SDavid du Colombier.CW List .
19783e12c5d1SDavid du ColombierWhen changes are made the live part of blocks may change size.
19793e12c5d1SDavid du ColombierBlocks are created and coalesced when necessary to try to keep the sizes
19803e12c5d1SDavid du Colombierbetween 2048 and 4096 bytes.
19813e12c5d1SDavid du ColombierAn actively changing part of the
19823e12c5d1SDavid du Colombier.CW Disc
19833e12c5d1SDavid du Colombiertherefore typically has about a kilobyte of slop that can be
19843e12c5d1SDavid du Colombierinserted or deleted
19853e12c5d1SDavid du Colombierwithout changing more than one block or affecting the block order.
19863e12c5d1SDavid du ColombierWhen an insertion would overflow a block, the block is split, a new one
19873e12c5d1SDavid du Colombieris allocated to receive the overflow, and the memory-resident list of blocks
19883e12c5d1SDavid du Colombieris rearranged to reflect the insertion of the new block.
19893e12c5d1SDavid du Colombier.PP
19903e12c5d1SDavid du ColombierObviously, going to the disc for every modification to the file is
19913e12c5d1SDavid du Colombierprohibitively expensive.
19923e12c5d1SDavid du ColombierThe data type
19933e12c5d1SDavid du Colombier.CW Buffer
19943e12c5d1SDavid du Colombierconsists of a
19953e12c5d1SDavid du Colombier.CW Disc
19963e12c5d1SDavid du Colombierto hold the data and a
19973e12c5d1SDavid du Colombier.CW String
19983e12c5d1SDavid du Colombierthat acts as a cache.
19993e12c5d1SDavid du ColombierThis is the first of a series of caches throughout the data structures in
20003e12c5d1SDavid du Colombier.CW sam.
20013e12c5d1SDavid du ColombierThe caches not only improve performance, they provide a way to organize
20023e12c5d1SDavid du Colombierthe flow of data, particularly in the communication between the host
20033e12c5d1SDavid du Colombierand terminal.
20043e12c5d1SDavid du ColombierThis idea is developed below, in the section on communications.
20053e12c5d1SDavid du Colombier.PP
20063e12c5d1SDavid du ColombierTo reduce disc traffic, changes to a
20073e12c5d1SDavid du Colombier.CW Buffer
20083e12c5d1SDavid du Colombierare mediated by a variable-length string, in memory, that acts as a cache.
20093e12c5d1SDavid du ColombierWhen an insertion or deletion is made to a
20103e12c5d1SDavid du Colombier.CW Buffer ,
20113e12c5d1SDavid du Colombierif the change can be accommodated by the cache, it is done there.
20123e12c5d1SDavid du ColombierIf the cache becomes bigger than a block because of an insertion,
20133e12c5d1SDavid du Colombiersome of it is written to the
20143e12c5d1SDavid du Colombier.CW Disc
20153e12c5d1SDavid du Colombierand deleted from the cache.
20163e12c5d1SDavid du ColombierIf the change does not intersect the cache, the cache is flushed.
20173e12c5d1SDavid du ColombierThe cache is only loaded at the new position if the change is smaller than a block;
20183e12c5d1SDavid du Colombierotherwise, it is sent directly to the
20193e12c5d1SDavid du Colombier.CW Disc .
20203e12c5d1SDavid du ColombierThis is because
20213e12c5d1SDavid du Colombierlarge changes are typically sequential,
20223e12c5d1SDavid du Colombierwhereupon the next change is unlikely to overlap the current one.
20233e12c5d1SDavid du Colombier.PP
20243e12c5d1SDavid du ColombierA
20253e12c5d1SDavid du Colombier.CW File
20263e12c5d1SDavid du Colombiercomprises a
20273e12c5d1SDavid du Colombier.CW String
20283e12c5d1SDavid du Colombierto hold the file name and some ancillary data such as dot and the modified bit.
20293e12c5d1SDavid du ColombierThe most important components, though, are a pair of
20303e12c5d1SDavid du Colombier.CW Buffers ,
20313e12c5d1SDavid du Colombierone called the transcript and the other the contents.
20323e12c5d1SDavid du ColombierTheir use is described in the next section.
20333e12c5d1SDavid du Colombier.PP
20343e12c5d1SDavid du ColombierThe overall structure is shown in Figure 5.
20353e12c5d1SDavid du ColombierAlthough it may seem that the data is touched many times on its
20363e12c5d1SDavid du Colombierway from the
20373e12c5d1SDavid du Colombier.CW Disc ,
2038*219b2ee8SDavid du Colombierit is read (by one UNIX system call) directly into the cache of the
20393e12c5d1SDavid du Colombierassociated
20403e12c5d1SDavid du Colombier.CW Buffer ;
20413e12c5d1SDavid du Colombierno extra copy is done.
20423e12c5d1SDavid du ColombierSimilarly, when flushing the cache, the text is written
20433e12c5d1SDavid du Colombierdirectly from the cache to disc.
20443e12c5d1SDavid du ColombierMost operations act directly on the text in the cache.
20453e12c5d1SDavid du ColombierA principle applied throughout
20463e12c5d1SDavid du Colombier.CW sam
20473e12c5d1SDavid du Colombieris that the fewer times the data is copied, the faster the program will run
20483e12c5d1SDavid du Colombier(see also the paper by Waite\u\s-4\&15\s+4\d).
20493e12c5d1SDavid du Colombier.KF
20503e12c5d1SDavid du Colombier.PS
20513e12c5d1SDavid du Colombiercopy "fig5.pic"
20523e12c5d1SDavid du Colombier.PE
20533e12c5d1SDavid du Colombier.Cs
20543e12c5d1SDavid du ColombierFigure 5. File data structures.
20553e12c5d1SDavid du ColombierThe temporary files are stored in the standard repository for such files
20563e12c5d1SDavid du Colombieron the host system.
20573e12c5d1SDavid du Colombier.Ce
20583e12c5d1SDavid du Colombier.KE
20593e12c5d1SDavid du Colombier.PP
20603e12c5d1SDavid du ColombierThe contents of a
20613e12c5d1SDavid du Colombier.CW File
20623e12c5d1SDavid du Colombierare accessed by a routine that
20633e12c5d1SDavid du Colombiercopies to a buffer a substring of a file starting at a specified offset.
20643e12c5d1SDavid du ColombierTo read a byte at a time, a
20653e12c5d1SDavid du Colombier.CW File "" per-
20663e12c5d1SDavid du Colombierarray is loaded starting from a specified initial position,
20673e12c5d1SDavid du Colombierand bytes may then be read from the array.
20683e12c5d1SDavid du ColombierThe implementation is done by a macro similar to the C standard I/O
20693e12c5d1SDavid du Colombier.CW getc
20703e12c5d1SDavid du Colombiermacro.\u\s-4\&14\s+4\d
20713e12c5d1SDavid du ColombierBecause the reading may be done at any address, a minor change to the
20723e12c5d1SDavid du Colombiermacro allows the file to be read backwards.
20733e12c5d1SDavid du ColombierThis array is read-only; there is no
20743e12c5d1SDavid du Colombier.CW putc .
20753e12c5d1SDavid du Colombier.SH 2
20763e12c5d1SDavid du ColombierDoing and undoing
20773e12c5d1SDavid du Colombier.LP
20783e12c5d1SDavid du Colombier.CW Sam
20793e12c5d1SDavid du Colombierhas an unusual method for managing changes to files.
20803e12c5d1SDavid du ColombierThe command language makes it easy to specify multiple variable-length changes
20813e12c5d1SDavid du Colombierto a file millions of bytes long, and such changes
20823e12c5d1SDavid du Colombiermust be made efficiently if the editor is to be practical.
20833e12c5d1SDavid du ColombierThe usual techniques for inserting and deleting strings
20843e12c5d1SDavid du Colombierare inadequate under these conditions.
20853e12c5d1SDavid du ColombierThe
20863e12c5d1SDavid du Colombier.CW Buffer
20873e12c5d1SDavid du Colombierand
20883e12c5d1SDavid du Colombier.CW Disc
20893e12c5d1SDavid du Colombierdata structures are designed for efficient random access to long strings,
20903e12c5d1SDavid du Colombierbut care must be taken to avoid super-linear behavior when making
20913e12c5d1SDavid du Colombiermany changes simultaneously.
20923e12c5d1SDavid du Colombier.PP
20933e12c5d1SDavid du Colombier.CW Sam
20943e12c5d1SDavid du Colombieruses a two-pass algorithm for making changes, and treats each file as a database
20953e12c5d1SDavid du Colombieragainst which transactions are registered.
20963e12c5d1SDavid du ColombierChanges are not made directly to the contents.
20973e12c5d1SDavid du ColombierInstead, when a command is started, a `mark' containing
20983e12c5d1SDavid du Colombiera sequence number is placed in the transcript
20993e12c5d1SDavid du Colombier.CW Buffer ,
21003e12c5d1SDavid du Colombierand each change made to the file, either an insertion or deletion
21013e12c5d1SDavid du Colombieror a change to the file name,
21023e12c5d1SDavid du Colombieris appended to the end of the transcript.
21033e12c5d1SDavid du ColombierWhen the command is complete, the transcript is rewound to the
21043e12c5d1SDavid du Colombiermark and applied to the contents.
21053e12c5d1SDavid du Colombier.PP
21063e12c5d1SDavid du ColombierOne reason for separating evaluation from
21073e12c5d1SDavid du Colombierapplication in this way is to simplify tracking the addresses of changes
21083e12c5d1SDavid du Colombiermade in the middle of a long sequence.
21093e12c5d1SDavid du ColombierThe two-pass algorithm also allows all changes to apply to the
21103e12c5d1SDavid du Colombier.I original
21113e12c5d1SDavid du Colombierdata: no change can affect another change made in the same command.
21123e12c5d1SDavid du ColombierThis is particularly important when evaluating an
21133e12c5d1SDavid du Colombier.CW x
21143e12c5d1SDavid du Colombiercommand because it prevents regular expression matches
21153e12c5d1SDavid du Colombierfrom stumbling over changes made earlier in the execution.
21163e12c5d1SDavid du ColombierAlso, the two-pass
2117*219b2ee8SDavid du Colombieralgorithm is cleaner than the way other UNIX editors allow changes to
21183e12c5d1SDavid du Colombieraffect each other;
21193e12c5d1SDavid du Colombierfor example,
21203e12c5d1SDavid du Colombier.CW ed 's
21213e12c5d1SDavid du Colombieridioms to do things like delete every other line
21223e12c5d1SDavid du Colombierdepend critically on the implementation.
21233e12c5d1SDavid du ColombierInstead,
21243e12c5d1SDavid du Colombier.CW sam 's
21253e12c5d1SDavid du Colombiersimple model, in which all changes in a command occur effectively
21263e12c5d1SDavid du Colombiersimultaneously, is easy to explain and to understand.
21273e12c5d1SDavid du Colombier.PP
21283e12c5d1SDavid du ColombierThe records in the transcript are of the form ``delete substring from
21293e12c5d1SDavid du Colombierlocations
21303e12c5d1SDavid du Colombier123 to 456'' and ``insert 11 characters `hello there' at location 789.''
21313e12c5d1SDavid du Colombier(It is an error if the changes are not at monotonically greater
21323e12c5d1SDavid du Colombierpositions through the file.)
21333e12c5d1SDavid du ColombierWhile the update is occurring, these numbers must be
21343e12c5d1SDavid du Colombieroffset by earlier changes, but that is straightforward and
21353e12c5d1SDavid du Colombierlocal to the update routine;
21363e12c5d1SDavid du Colombiermoreover, all the numbers have been computed
21373e12c5d1SDavid du Colombierbefore the first is examined.
21383e12c5d1SDavid du Colombier.PP
21393e12c5d1SDavid du ColombierTreating the file as a transaction system has another advantage:
21403e12c5d1SDavid du Colombierundo is trivial.
21413e12c5d1SDavid du ColombierAll it takes is to invert the transcript after it has been
21423e12c5d1SDavid du Colombierimplemented, converting insertions
21433e12c5d1SDavid du Colombierinto deletions and vice versa, and saving them in a holding
21443e12c5d1SDavid du Colombier.CW Buffer .
21453e12c5d1SDavid du ColombierThe `do' transcript can then be deleted from
21463e12c5d1SDavid du Colombierthe transcript
21473e12c5d1SDavid du Colombier.CW Buffer
21483e12c5d1SDavid du Colombierand replaced by the `undo' transcript.
21493e12c5d1SDavid du ColombierIf an undo is requested, the transcript is rewound and the undo transcript
21503e12c5d1SDavid du Colombierexecuted.
21513e12c5d1SDavid du ColombierBecause the transcript
21523e12c5d1SDavid du Colombier.CW Buffer
21533e12c5d1SDavid du Colombieris not truncated after each command, it accumulates
21543e12c5d1SDavid du Colombiersuccessive changes.
21553e12c5d1SDavid du ColombierA sequence of undo commands
21563e12c5d1SDavid du Colombiercan therefore back up the file arbitrarily,
21573e12c5d1SDavid du Colombierwhich is more helpful than the more commonly implemented self-inverse form of undo.
21583e12c5d1SDavid du Colombier.CW Sam "" (
21593e12c5d1SDavid du Colombierprovides no way to undo an undo, but if it were desired,
21603e12c5d1SDavid du Colombierit would be easy to provide by re-interpreting the `do' transcript.)
21613e12c5d1SDavid du ColombierEach mark in the transcript contains a sequence number and the offset into
21623e12c5d1SDavid du Colombierthe transcript of the previous mark, to aid in unwinding the transcript.
21633e12c5d1SDavid du ColombierMarks also contain the value of dot and the modified bit so these can be
21643e12c5d1SDavid du Colombierrestored easily.
21653e12c5d1SDavid du ColombierUndoing multiple files is easy; it merely demands undoing all files whose
21663e12c5d1SDavid du Colombierlatest change has the same sequence number as the current file.
21673e12c5d1SDavid du Colombier.PP
21683e12c5d1SDavid du ColombierAnother benefit of having a transcript is that errors encountered in the middle
21693e12c5d1SDavid du Colombierof a complicated command need not leave the files in an intermediate state.
21703e12c5d1SDavid du ColombierBy rewinding the transcript to the mark beginning the command,
21713e12c5d1SDavid du Colombierthe partial command can be trivially undone.
21723e12c5d1SDavid du Colombier.PP
21733e12c5d1SDavid du ColombierWhen the update algorithm was first implemented, it was unacceptably slow,
21743e12c5d1SDavid du Colombierso a cache was added to coalesce nearby changes,
21753e12c5d1SDavid du Colombierreplacing multiple small changes by a single larger one.
21763e12c5d1SDavid du ColombierThis reduced the number
21773e12c5d1SDavid du Colombierof insertions into the transaction
21783e12c5d1SDavid du Colombier.CW Buffer ,
21793e12c5d1SDavid du Colombierand made a dramatic improvement in performance,
21803e12c5d1SDavid du Colombierbut made it impossible
21813e12c5d1SDavid du Colombierto handle changes in non-monotonic order in the file; the caching method
21823e12c5d1SDavid du Colombieronly works if changes don't overlap.
21833e12c5d1SDavid du ColombierBefore the cache was added, the transaction could in principle be sorted
21843e12c5d1SDavid du Colombierif the changes were out of order, although
21853e12c5d1SDavid du Colombierthis was never done.
21863e12c5d1SDavid du ColombierThe current status is therefore acceptable performance with a minor
21873e12c5d1SDavid du Colombierrestriction on global changes, which is sometimes, but rarely, an annoyance.
21883e12c5d1SDavid du Colombier.PP
21893e12c5d1SDavid du ColombierThe update algorithm obviously paws the data more than simpler
21903e12c5d1SDavid du Colombieralgorithms, but it is not prohibitively expensive;
21913e12c5d1SDavid du Colombierthe caches help.
21923e12c5d1SDavid du Colombier(The principle of avoiding copying the data is still honored here,
21933e12c5d1SDavid du Colombieralthough not as piously:
21943e12c5d1SDavid du Colombierthe data is moved from contents' cache to
21953e12c5d1SDavid du Colombierthe transcript's all at once and through only one internal buffer.)
21963e12c5d1SDavid du ColombierPerformance figures confirm the efficiency.
21973e12c5d1SDavid du ColombierTo read from a dead start a hundred kilobyte file on a VAX-11/750
21983e12c5d1SDavid du Colombiertakes 1.4 seconds of user time, 2.5 seconds of system time,
21993e12c5d1SDavid du Colombierand 5 seconds of real time.
22003e12c5d1SDavid du ColombierReading the same file in
22013e12c5d1SDavid du Colombier.CW ed
22023e12c5d1SDavid du Colombiertakes 6.0 seconds of user time, 1.7 seconds of system time,
22033e12c5d1SDavid du Colombierand 8 seconds of real time.
22043e12c5d1SDavid du Colombier.CW Sam
22053e12c5d1SDavid du Colombieruses about half the CPU time.
22063e12c5d1SDavid du ColombierA more interesting example is the one stated above:
22073e12c5d1SDavid du Colombierinserting a character between every pair of characters in the file.
22083e12c5d1SDavid du ColombierThe
22093e12c5d1SDavid du Colombier.CW sam
22103e12c5d1SDavid du Colombiercommand is
22113e12c5d1SDavid du Colombier.P1
22123e12c5d1SDavid du Colombier,y/@/ a/x/
22133e12c5d1SDavid du Colombier.P2
22143e12c5d1SDavid du Colombierand takes 3 CPU seconds per kilobyte of input file, of which
22153e12c5d1SDavid du Colombierabout a third is spent in the regular expression code.
22163e12c5d1SDavid du ColombierThis translates to about 500 changes per second.
22173e12c5d1SDavid du Colombier.CW Ed
22183e12c5d1SDavid du Colombiertakes 1.5 seconds per kilobyte to make a similar change (ignoring newlines),
22193e12c5d1SDavid du Colombierbut cannot undo it.
22203e12c5d1SDavid du ColombierThe same example in
22213e12c5d1SDavid du Colombier.CW ex ,\u\s-4\&9\s+4\d
22223e12c5d1SDavid du Colombiera variant of
22233e12c5d1SDavid du Colombier.CW ed
22243e12c5d1SDavid du Colombierdone at the University of California at Berkeley,
22253e12c5d1SDavid du Colombierwhich allows one level of undoing, again takes 3 seconds.
22263e12c5d1SDavid du ColombierIn summary,
22273e12c5d1SDavid du Colombier.CW sam 's
2228*219b2ee8SDavid du Colombierperformance is comparable to that of other UNIX editors, although it solves
22293e12c5d1SDavid du Colombiera harder problem.
22303e12c5d1SDavid du Colombier.SH 2
22313e12c5d1SDavid du ColombierCommunications
22323e12c5d1SDavid du Colombier.LP
22333e12c5d1SDavid du ColombierThe discussion so far has described the implementation of the host part of
22343e12c5d1SDavid du Colombier.CW sam ;
22353e12c5d1SDavid du Colombierthe next few sections explain how a machine with mouse and bitmap display
22363e12c5d1SDavid du Colombiercan be engaged to improve interaction.
22373e12c5d1SDavid du Colombier.CW Sam
22383e12c5d1SDavid du Colombieris not the first editor to be written as two processes,\u\s-4\&16\s+4\d
22393e12c5d1SDavid du Colombierbut its implementation
22403e12c5d1SDavid du Colombierhas some unusual aspects.
22413e12c5d1SDavid du Colombier.PP
22423e12c5d1SDavid du ColombierThere are several ways
22433e12c5d1SDavid du Colombier.CW sam 's
22443e12c5d1SDavid du Colombierhost and terminal parts may be connected.
22453e12c5d1SDavid du ColombierThe first and simplest is to forgo the terminal part and use the host
22463e12c5d1SDavid du Colombierpart's command language to edit text on an ordinary terminal.
22473e12c5d1SDavid du ColombierThis mode is invoked by starting
22483e12c5d1SDavid du Colombier.CW sam
22493e12c5d1SDavid du Colombierwith the
22503e12c5d1SDavid du Colombier.CW -d
22513e12c5d1SDavid du Colombieroption.
22523e12c5d1SDavid du ColombierWith no options,
22533e12c5d1SDavid du Colombier.CW sam
22543e12c5d1SDavid du Colombierruns separate host and terminal programs,
22553e12c5d1SDavid du Colombiercommunicating with a message protocol over the physical
22563e12c5d1SDavid du Colombierconnection that joins them.
22573e12c5d1SDavid du ColombierTypically, the connection is an RS-232 link between a Blit
22583e12c5d1SDavid du Colombier(the prototypical display for
22593e12c5d1SDavid du Colombier.CW sam )
22603e12c5d1SDavid du Colombierand a host running
2261*219b2ee8SDavid du Colombierthe Ninth Edition of the UNIX operating system.\u\s-4\&8\s+4\d
22623e12c5d1SDavid du Colombier(This is the version of the system used in the Computing Sciences Research
22633e12c5d1SDavid du ColombierCenter at AT&T Bell Laboratories, where I work.  Its relevant
22643e12c5d1SDavid du Colombieraspects are discussed in the Blit paper.\u\s-4\&1\s+4\d)
22653e12c5d1SDavid du ColombierThe implementation of
22663e12c5d1SDavid du Colombier.CW sam
2267*219b2ee8SDavid du Colombierfor the SUN computer runs both processes on the same machine and
22683e12c5d1SDavid du Colombierconnects them by a pipe.
22693e12c5d1SDavid du Colombier.PP
22703e12c5d1SDavid du ColombierThe low bandwidth of an RS-232 link
22713e12c5d1SDavid du Colombiernecessitated the split between
22723e12c5d1SDavid du Colombierthe two programs.
22733e12c5d1SDavid du ColombierThe division is a mixed blessing:
22743e12c5d1SDavid du Colombiera program in two parts is much harder to write and to debug
22753e12c5d1SDavid du Colombierthan a self-contained one,
22763e12c5d1SDavid du Colombierbut the split makes several unusual configurations possible.
22773e12c5d1SDavid du ColombierThe terminal may be physically separated from the host, allowing the conveniences
22783e12c5d1SDavid du Colombierof a mouse and bitmap display to be taken home while leaving the files at work.
22793e12c5d1SDavid du ColombierIt is also possible to run the host part on a remote machine:
22803e12c5d1SDavid du Colombier.P1
22813e12c5d1SDavid du Colombiersam -r host
22823e12c5d1SDavid du Colombier.P2
22833e12c5d1SDavid du Colombierconnects to the terminal in the usual way, and then makes a call
22843e12c5d1SDavid du Colombieracross the network to establish the host part of
22853e12c5d1SDavid du Colombier.CW sam
22863e12c5d1SDavid du Colombieron the named machine.
22873e12c5d1SDavid du ColombierFinally, it cross-connects the I/O to join the two parts.
22883e12c5d1SDavid du ColombierThis allows
22893e12c5d1SDavid du Colombier.CW sam
22903e12c5d1SDavid du Colombierto be run on machines that do not support bitmap displays;
22913e12c5d1SDavid du Colombierfor example,
22923e12c5d1SDavid du Colombier.CW sam
22933e12c5d1SDavid du Colombieris the editor of choice on our Cray X-MP/24.
22943e12c5d1SDavid du Colombier.CW Sam
22953e12c5d1SDavid du Colombier.CW -r
22963e12c5d1SDavid du Colombierinvolves
22973e12c5d1SDavid du Colombier.I three
22983e12c5d1SDavid du Colombiermachines: the remote host, the terminal, and the local host.
22993e12c5d1SDavid du ColombierThe local host's job is simple but vital: it passes the data
23003e12c5d1SDavid du Colombierbetween the remote host and terminal.
23013e12c5d1SDavid du Colombier.PP
23023e12c5d1SDavid du ColombierThe host and terminal exchange messages asynchronously
23033e12c5d1SDavid du Colombier(rather than, say, as remote procedure calls) but there is no
23043e12c5d1SDavid du Colombiererror detection or correction
23053e12c5d1SDavid du Colombierbecause, whatever the configuration, the connection is reliable.
23063e12c5d1SDavid du ColombierBecause the terminal handles mundane interaction tasks such as
23073e12c5d1SDavid du Colombierpopping up menus and interpreting the responses, the messages are about
23083e12c5d1SDavid du Colombierdata, not actions.
23093e12c5d1SDavid du ColombierFor example, the host knows nothing about what is displayed on the screen,
23103e12c5d1SDavid du Colombierand when the user types a character, the message sent to the host says
23113e12c5d1SDavid du Colombier``insert a one-byte string at location 123 in file 7,'' not ``a character
23123e12c5d1SDavid du Colombierwas typed at the current position in the current file.''
23133e12c5d1SDavid du ColombierIn other words, the messages look very much like the transaction records
23143e12c5d1SDavid du Colombierin the transcripts.
23153e12c5d1SDavid du Colombier.PP
23163e12c5d1SDavid du ColombierEither the host or terminal part of
23173e12c5d1SDavid du Colombier.CW sam
23183e12c5d1SDavid du Colombiermay initiate a change to a file.
23193e12c5d1SDavid du ColombierThe command language operates on the host, while typing and some
23203e12c5d1SDavid du Colombiermouse operations are executed directly in the terminal to optimize response.
23213e12c5d1SDavid du ColombierChanges initiated by the host program must be transmitted to the terminal,
23223e12c5d1SDavid du Colombierand
23233e12c5d1SDavid du Colombiervice versa.
23243e12c5d1SDavid du Colombier(A token is exchanged to determine which end is in control,
23253e12c5d1SDavid du Colombierwhich means that characters typed while a time-consuming command runs
23263e12c5d1SDavid du Colombiermust be buffered and do not appear until the command is complete.)
23273e12c5d1SDavid du ColombierTo maintain consistent information,
23283e12c5d1SDavid du Colombierthe host and terminal track changes through a per-file
23293e12c5d1SDavid du Colombierdata structure that records what portions of the file
23303e12c5d1SDavid du Colombierthe terminal has received.
23313e12c5d1SDavid du ColombierThe data structure, called a
23323e12c5d1SDavid du Colombier.CW Rasp
23333e12c5d1SDavid du Colombier(a weak pun: it's a file with holes)
23343e12c5d1SDavid du Colombieris held and updated by both the host and terminal.
23353e12c5d1SDavid du ColombierA
23363e12c5d1SDavid du Colombier.CW Rasp
23373e12c5d1SDavid du Colombieris a list of
23383e12c5d1SDavid du Colombier.CW Strings
23393e12c5d1SDavid du Colombierholding those parts of the file known to the terminal,
23403e12c5d1SDavid du Colombierseparated by counts of the number of bytes in the interstices.
23413e12c5d1SDavid du ColombierOf course, the host doesn't keep a separate copy of the data (it only needs
23423e12c5d1SDavid du Colombierthe lengths of the various pieces),
23433e12c5d1SDavid du Colombierbut the structure is the same on both ends.
23443e12c5d1SDavid du Colombier.PP
23453e12c5d1SDavid du ColombierThe
23463e12c5d1SDavid du Colombier.CW Rasp
23473e12c5d1SDavid du Colombierin the terminal doubles as a cache.
23483e12c5d1SDavid du ColombierSince the terminal keeps the text for portions of the file it has displayed,
23493e12c5d1SDavid du Colombierit need not request data from the host when revisiting old parts of the file
23503e12c5d1SDavid du Colombieror redrawing obscured windows, which speeds things up considerably
23513e12c5d1SDavid du Colombierover low-speed links.
23523e12c5d1SDavid du Colombier.PP
23533e12c5d1SDavid du ColombierIt's trivial for the terminal to maintain its
23543e12c5d1SDavid du Colombier.CW Rasp ,
23553e12c5d1SDavid du Colombierbecause all changes made on the terminal apply to parts of the file
23563e12c5d1SDavid du Colombieralready loaded there.
23573e12c5d1SDavid du ColombierChanges made by the host are compared against the
23583e12c5d1SDavid du Colombier.CW Rasp
23593e12c5d1SDavid du Colombierduring the update sequence after each command.
23603e12c5d1SDavid du ColombierSmall changes to pieces of the file loaded in the terminal
23613e12c5d1SDavid du Colombierare sent in their entirety.
23623e12c5d1SDavid du ColombierLarger changes, and changes that fall entirely in the holes,
23633e12c5d1SDavid du Colombierare transmitted as messages without literal data:
23643e12c5d1SDavid du Colombieronly the lengths of the deleted and inserted strings are transmitted.
23653e12c5d1SDavid du ColombierWhen a command is completed, the terminal examines its visible
23663e12c5d1SDavid du Colombierwindows to see if any holes in their
23673e12c5d1SDavid du Colombier.CW Rasps
23683e12c5d1SDavid du Colombierintersect the visible portion of the file.
23693e12c5d1SDavid du ColombierIt then requests the missing data from the host,
23703e12c5d1SDavid du Colombieralong with up to 512 bytes of surrounding data, to minimize
23713e12c5d1SDavid du Colombierthe number of messages when visiting a new portion of the file.
23723e12c5d1SDavid du ColombierThis technique provides a kind of two-level lazy evaluation for the terminal.
23733e12c5d1SDavid du ColombierThe first level sends a minimum of information about
23743e12c5d1SDavid du Colombierparts of the file not being edited interactively;
23753e12c5d1SDavid du Colombierthe second level waits until a change is displayed before
23763e12c5d1SDavid du Colombiertransmitting the new data.
23773e12c5d1SDavid du ColombierOf course,
23783e12c5d1SDavid du Colombierperformance is also helped by having the terminal respond immediately to typing
23793e12c5d1SDavid du Colombierand simple mouse requests.
23803e12c5d1SDavid du ColombierExcept for small changes to active pieces of the file, which are
23813e12c5d1SDavid du Colombiertransmitted to the terminal without negotiation,
23823e12c5d1SDavid du Colombierthe terminal is wholly responsible for deciding what is displayed;
23833e12c5d1SDavid du Colombierthe host uses the
23843e12c5d1SDavid du Colombier.CW Rasp
23853e12c5d1SDavid du Colombieronly to tell the terminal what might be relevant.
23863e12c5d1SDavid du Colombier.PP
23873e12c5d1SDavid du ColombierWhen a change is initiated by the host,
23883e12c5d1SDavid du Colombierthe messages to the terminal describing the change
23893e12c5d1SDavid du Colombierare generated by the routine that applies the transcript of the changes
23903e12c5d1SDavid du Colombierto the contents of the
23913e12c5d1SDavid du Colombier.CW File .
23923e12c5d1SDavid du ColombierSince changes are undone by the same update routine,
23933e12c5d1SDavid du Colombierundoing requires
23943e12c5d1SDavid du Colombierno extra code in the communications;
23953e12c5d1SDavid du Colombierthe usual messages describing changes to the file are sufficient
23963e12c5d1SDavid du Colombierto back up the screen image.
23973e12c5d1SDavid du Colombier.PP
23983e12c5d1SDavid du ColombierThe
23993e12c5d1SDavid du Colombier.CW Rasp
24003e12c5d1SDavid du Colombieris a particularly good example of the way caches are used in
24013e12c5d1SDavid du Colombier.CW sam .
24023e12c5d1SDavid du ColombierFirst, it facilitates access to the active portion of the text by placing
24033e12c5d1SDavid du Colombierthe busy text in main memory.
24043e12c5d1SDavid du ColombierIn so doing, it provides efficient access
24053e12c5d1SDavid du Colombierto a large data structure that does not fit in memory.
24063e12c5d1SDavid du ColombierSince the form of data is to be imposed by the user, not by the program,
24073e12c5d1SDavid du Colombierand because characters will frequently be scanned sequentially,
24083e12c5d1SDavid du Colombierfiles are stored as flat objects.
24093e12c5d1SDavid du ColombierCaches help keep performance good and linear when working with such
24103e12c5d1SDavid du Colombierdata.
24113e12c5d1SDavid du Colombier.PP
24123e12c5d1SDavid du ColombierSecond, the
24133e12c5d1SDavid du Colombier.CW Rasp
24143e12c5d1SDavid du Colombierand several of the other caches have some
24153e12c5d1SDavid du Colombier.I read-ahead;
24163e12c5d1SDavid du Colombierthat is, the cache is loaded with more information than is needed for
24173e12c5d1SDavid du Colombierthe job immediately at hand.
24183e12c5d1SDavid du ColombierWhen manipulating linear structures, the accesses are usually sequential,
24193e12c5d1SDavid du Colombierand read-ahead can significantly reduce the average time to access the
24203e12c5d1SDavid du Colombiernext element of the object.
24213e12c5d1SDavid du ColombierSequential access is a common mode for people as well as programs;
24223e12c5d1SDavid du Colombierconsider scrolling through a document while looking for something.
24233e12c5d1SDavid du Colombier.PP
24243e12c5d1SDavid du ColombierFinally, like any good data structure,
24253e12c5d1SDavid du Colombierthe cache guides the algorithm, or at least the implementation.
24263e12c5d1SDavid du ColombierThe
24273e12c5d1SDavid du Colombier.CW Rasp
24283e12c5d1SDavid du Colombierwas actually invented to control the communications between the host and
24293e12c5d1SDavid du Colombierterminal parts, but I realized very early that it was also a form of
24303e12c5d1SDavid du Colombiercache.  Other caches were more explicitly intended to serve a double
24313e12c5d1SDavid du Colombierpurpose: for example, the caches in
24323e12c5d1SDavid du Colombier.CW Files
24333e12c5d1SDavid du Colombierthat coalesce updates not only reduce traffic to the
24343e12c5d1SDavid du Colombiertranscript and contents
24353e12c5d1SDavid du Colombier.CW Buffers ,
24363e12c5d1SDavid du Colombierthey also clump screen updates so that complicated changes to the
24373e12c5d1SDavid du Colombierscreen are achieved in
24383e12c5d1SDavid du Colombierjust a few messages to the terminal.
24393e12c5d1SDavid du ColombierThis saved me considerable work: I did not need to write special
24403e12c5d1SDavid du Colombiercode to optimize the message traffic to the
24413e12c5d1SDavid du Colombierterminal.
24423e12c5d1SDavid du ColombierCaches pay off in surprising ways.
24433e12c5d1SDavid du ColombierAlso, they tend to be independent, so their performance improvements
24443e12c5d1SDavid du Colombierare multiplicative.
24453e12c5d1SDavid du Colombier.SH 2
24463e12c5d1SDavid du ColombierData structures in the terminal
24473e12c5d1SDavid du Colombier.LP
24483e12c5d1SDavid du ColombierThe terminal's job is to display and to maintain a consistent image of
24493e12c5d1SDavid du Colombierpieces of the files being edited.
24503e12c5d1SDavid du ColombierBecause the text is always in memory, the data structures are
24513e12c5d1SDavid du Colombierconsiderably simpler than those in the host part.
24523e12c5d1SDavid du Colombier.PP
24533e12c5d1SDavid du Colombier.CW Sam
24543e12c5d1SDavid du Colombiertypically has far more windows than does
24553e12c5d1SDavid du Colombier.CW mux ,
24563e12c5d1SDavid du Colombierthe window system within which its Blit implementation runs.
24573e12c5d1SDavid du Colombier.CW Mux
24583e12c5d1SDavid du Colombierhas a fairly small number of asynchronously updated windows;
24593e12c5d1SDavid du Colombier.CW sam
24603e12c5d1SDavid du Colombierneeds a large number of synchronously updated windows that are
24613e12c5d1SDavid du Colombierusually static and often fully obscured.
24623e12c5d1SDavid du ColombierThe different tradeoffs guided
24633e12c5d1SDavid du Colombier.CW sam
24643e12c5d1SDavid du Colombieraway from the memory-intensive implementation of windows, called
24653e12c5d1SDavid du Colombier.CW Layers ,\u\s-4\&17\s+4\d
24663e12c5d1SDavid du Colombierused in
24673e12c5d1SDavid du Colombier.CW mux.
24683e12c5d1SDavid du ColombierRather than depending on a complete bitmap image of the display for each window,
24693e12c5d1SDavid du Colombier.CW sam
24703e12c5d1SDavid du Colombierregenerates the image from its in-memory text
24713e12c5d1SDavid du Colombier(stored in the
24723e12c5d1SDavid du Colombier.CW Rasp )
24733e12c5d1SDavid du Colombierwhen necessary, although it will use such an image if it is available.
24743e12c5d1SDavid du ColombierLike
24753e12c5d1SDavid du Colombier.CW Layers ,
24763e12c5d1SDavid du Colombierthough,
24773e12c5d1SDavid du Colombier.CW sam
24783e12c5d1SDavid du Colombieruses the screen bitmap as active storage in which to update the image using
24793e12c5d1SDavid du Colombier.CW bitblt .\u\s-4\&18,19\s+4\d
24803e12c5d1SDavid du ColombierThe resulting organization, pictured in Figure 6,
24813e12c5d1SDavid du Colombierhas a global array of windows, called
24823e12c5d1SDavid du Colombier.CW Flayers ,
24833e12c5d1SDavid du Colombiereach of which holds an image of a piece of text held in a data structure
24843e12c5d1SDavid du Colombiercalled a
24853e12c5d1SDavid du Colombier.CW Frame ,
24863e12c5d1SDavid du Colombierwhich in turn represents
24873e12c5d1SDavid du Colombiera rectangular window full of text displayed in some
24883e12c5d1SDavid du Colombier.CW Bitmap .
24893e12c5d1SDavid du ColombierEach
24903e12c5d1SDavid du Colombier.CW Flayer
24913e12c5d1SDavid du Colombierappears in a global list that orders them all front-to-back
24923e12c5d1SDavid du Colombieron the display, and simultaneously as an element of a per-file array
24933e12c5d1SDavid du Colombierthat holds all the open windows for that file.
24943e12c5d1SDavid du ColombierThe complement in the terminal of the
24953e12c5d1SDavid du Colombier.CW File
24963e12c5d1SDavid du Colombieron the host is called a
24973e12c5d1SDavid du Colombier.CW Text ;
24983e12c5d1SDavid du Colombiereach connects its
24993e12c5d1SDavid du Colombier.CW Flayers
25003e12c5d1SDavid du Colombierto the associated
25013e12c5d1SDavid du Colombier.CW Rasp .
25023e12c5d1SDavid du Colombier.KF
25033e12c5d1SDavid du Colombier.PS
25043e12c5d1SDavid du Colombiercopy "fig6.pic"
25053e12c5d1SDavid du Colombier.PE
25063e12c5d1SDavid du Colombier.Cs
25073e12c5d1SDavid du ColombierFigure 6. Data structures in the terminal.
25083e12c5d1SDavid du Colombier.CW Flayers
25093e12c5d1SDavid du Colombierare also linked together into a front-to-back list.
25103e12c5d1SDavid du Colombier.CW Boxes
25113e12c5d1SDavid du Colombierare discussed in the next section.
25123e12c5d1SDavid du Colombier.Ce
25133e12c5d1SDavid du Colombier.KE
25143e12c5d1SDavid du Colombier.PP
25153e12c5d1SDavid du ColombierThe
25163e12c5d1SDavid du Colombier.CW Bitmap
25173e12c5d1SDavid du Colombierfor a
25183e12c5d1SDavid du Colombier.CW Frame
25193e12c5d1SDavid du Colombiercontains the image of the text.
25203e12c5d1SDavid du ColombierFor a fully visible window, the
25213e12c5d1SDavid du Colombier.CW Bitmap
25223e12c5d1SDavid du Colombierwill be the screen (or at least the
25233e12c5d1SDavid du Colombier.CW Layer
25243e12c5d1SDavid du Colombierin which
25253e12c5d1SDavid du Colombier.CW sam
25263e12c5d1SDavid du Colombieris being run),
25273e12c5d1SDavid du Colombierwhile for partially obscured windows the
25283e12c5d1SDavid du Colombier.CW Bitmap
25293e12c5d1SDavid du Colombierwill be off-screen.
25303e12c5d1SDavid du ColombierIf the window is fully obscured, the
25313e12c5d1SDavid du Colombier.CW Bitmap
25323e12c5d1SDavid du Colombierwill be null.
25333e12c5d1SDavid du Colombier.PP
25343e12c5d1SDavid du ColombierThe
25353e12c5d1SDavid du Colombier.CW Bitmap
25363e12c5d1SDavid du Colombieris a kind of cache.
25373e12c5d1SDavid du ColombierWhen making changes to the display, most of the original image will
25383e12c5d1SDavid du Colombierlook the same in the final image, and the update algorithms exploit this.
25393e12c5d1SDavid du ColombierThe
25403e12c5d1SDavid du Colombier.CW Frame
25413e12c5d1SDavid du Colombiersoftware updates the image in the
25423e12c5d1SDavid du Colombier.CW Bitmap
25433e12c5d1SDavid du Colombierincrementally; the
25443e12c5d1SDavid du Colombier.CW Bitmap
25453e12c5d1SDavid du Colombieris not just an image, it is a data structure.\u\s-4\&18,19\s+4\d
25463e12c5d1SDavid du ColombierThe job of the software that updates the display is therefore
25473e12c5d1SDavid du Colombierto use as much as possible of the existing image (converting the
25483e12c5d1SDavid du Colombiertext from ASCII characters to pixels is expensive) in a sort of two-dimensional
25493e12c5d1SDavid du Colombierstring insertion algorithm.
25503e12c5d1SDavid du ColombierThe details of this process are described in the next section.
25513e12c5d1SDavid du Colombier.PP
25523e12c5d1SDavid du ColombierThe
25533e12c5d1SDavid du Colombier.CW Frame
25543e12c5d1SDavid du Colombiersoftware has no code to support overlapping windows;
25553e12c5d1SDavid du Colombierits job is to keep a single
25563e12c5d1SDavid du Colombier.CW Bitmap
25573e12c5d1SDavid du Colombierup to date.
25583e12c5d1SDavid du ColombierIt falls to the
25593e12c5d1SDavid du Colombier.CW Flayer
25603e12c5d1SDavid du Colombiersoftware to multiplex the various
25613e12c5d1SDavid du Colombier.CW Bitmaps
25623e12c5d1SDavid du Colombieronto the screen.
25633e12c5d1SDavid du ColombierThe problem of maintaining overlapping
25643e12c5d1SDavid du Colombier.CW Flayers
25653e12c5d1SDavid du Colombieris easier than for
25663e12c5d1SDavid du Colombier.CW Layers \u\s-4\&17\s+4\d
25673e12c5d1SDavid du Colombierbecause changes are made synchronously and because the contents of the window
25683e12c5d1SDavid du Colombiercan be reconstructed from the data stored in the
25693e12c5d1SDavid du Colombier.CW Frame ;
25703e12c5d1SDavid du Colombierthe
25713e12c5d1SDavid du Colombier.CW Layers
25723e12c5d1SDavid du Colombiersoftware
25733e12c5d1SDavid du Colombiermakes no such assumptions.
25743e12c5d1SDavid du ColombierIn
25753e12c5d1SDavid du Colombier.CW sam ,
25763e12c5d1SDavid du Colombierthe window being changed is almost always fully visible, because the current
25773e12c5d1SDavid du Colombierwindow is always fully visible, by construction.
25783e12c5d1SDavid du ColombierHowever, when multi-file changes are being made, or when
25793e12c5d1SDavid du Colombiermore than one window is open on a file,
25803e12c5d1SDavid du Colombierit may be necessary to update partially obscured windows.
25813e12c5d1SDavid du Colombier.PP
25823e12c5d1SDavid du ColombierThere are three cases: the window is
25833e12c5d1SDavid du Colombierfully visible, invisible (fully obscured), or partially visible.
25843e12c5d1SDavid du ColombierIf fully visible, the
25853e12c5d1SDavid du Colombier.CW Bitmap
25863e12c5d1SDavid du Colombieris part of the screen, so when the
25873e12c5d1SDavid du Colombier.CW Flayer
25883e12c5d1SDavid du Colombierupdate routine calls the
25893e12c5d1SDavid du Colombier.CW Frame
25903e12c5d1SDavid du Colombierupdate routine, the screen will be updated directly.
25913e12c5d1SDavid du ColombierIf the window is invisible,
25923e12c5d1SDavid du Colombierthere is no associated
25933e12c5d1SDavid du Colombier.CW Bitmap ,
25943e12c5d1SDavid du Colombierand all that is necessary is to update the
25953e12c5d1SDavid du Colombier.CW Frame
25963e12c5d1SDavid du Colombierdata structure, not the image.
25973e12c5d1SDavid du ColombierIf the window is partially visible, the
25983e12c5d1SDavid du Colombier.CW Frame
25993e12c5d1SDavid du Colombierroutine is called to update the image in the off-screen
26003e12c5d1SDavid du Colombier.CW Bitmap ,
26013e12c5d1SDavid du Colombierwhich may require regenerating it from the text of the window.
26023e12c5d1SDavid du ColombierThe
26033e12c5d1SDavid du Colombier.CW Flayer
26043e12c5d1SDavid du Colombiercode then clips this
26053e12c5d1SDavid du Colombier.CW Bitmap
26063e12c5d1SDavid du Colombieragainst the
26073e12c5d1SDavid du Colombier.CW Bitmaps
26083e12c5d1SDavid du Colombierof all
26093e12c5d1SDavid du Colombier.CW Frames
26103e12c5d1SDavid du Colombierin front of the
26113e12c5d1SDavid du Colombier.CW Frame
26123e12c5d1SDavid du Colombierbeing modified, and the remainder is copied to the display.
26133e12c5d1SDavid du Colombier.PP
26143e12c5d1SDavid du ColombierThis is much faster than recreating the image off-screen
26153e12c5d1SDavid du Colombierfor every change, or clipping all the changes made to the image
26163e12c5d1SDavid du Colombierduring its update.
26173e12c5d1SDavid du ColombierUnfortunately, these caches can also consume prohibitive amounts of
26183e12c5d1SDavid du Colombiermemory, so they are freed fairly liberally \(em after every change to the
26193e12c5d1SDavid du Colombierfront-to-back order of the
26203e12c5d1SDavid du Colombier.CW Flayers .
26213e12c5d1SDavid du ColombierThe result is that
26223e12c5d1SDavid du Colombierthe off-screen
26233e12c5d1SDavid du Colombier.CW Bitmaps
26243e12c5d1SDavid du Colombierexist only while multi-window changes are occurring,
26253e12c5d1SDavid du Colombierwhich is the only time the performance improvement they provide is needed.
26263e12c5d1SDavid du ColombierAlso, the user interface causes fully-obscured windows to be the
26273e12c5d1SDavid du Colombiereasiest to make \(em
26283e12c5d1SDavid du Colombiercreating a canonically sized and placed window requires only a button click
26293e12c5d1SDavid du Colombier\(em which reduces the need for caching still further.
26303e12c5d1SDavid du Colombier.PP
26313e12c5d1SDavid du Colombier.SH 2
26323e12c5d1SDavid du ColombierScreen update
26333e12c5d1SDavid du Colombier.LP
26343e12c5d1SDavid du ColombierOnly two low-level primitives are needed for incremental update:
26353e12c5d1SDavid du Colombier.CW bitblt ,
26363e12c5d1SDavid du Colombierwhich copies rectangles of pixels, and
26373e12c5d1SDavid du Colombier.CW string
26383e12c5d1SDavid du Colombier(which in turn calls
26393e12c5d1SDavid du Colombier.CW bitblt ),
26403e12c5d1SDavid du Colombierwhich draws a null-terminated character string in a
26413e12c5d1SDavid du Colombier.CW Bitmap .
26423e12c5d1SDavid du ColombierA
26433e12c5d1SDavid du Colombier.CW Frame
26443e12c5d1SDavid du Colombiercontains a list of
26453e12c5d1SDavid du Colombier.CW Boxes ,
26463e12c5d1SDavid du Colombiereach of which defines a horizontal strip of text in the window
26473e12c5d1SDavid du Colombier(see Figure 7).
26483e12c5d1SDavid du ColombierA
26493e12c5d1SDavid du Colombier.CW Box
26503e12c5d1SDavid du Colombierhas a character string
26513e12c5d1SDavid du Colombier.CW str ,
26523e12c5d1SDavid du Colombierand a
26533e12c5d1SDavid du Colombier.CW Rectangle
26543e12c5d1SDavid du Colombier.CW rect
26553e12c5d1SDavid du Colombierthat defines the location of the strip in the window.
26563e12c5d1SDavid du Colombier(The text in
26573e12c5d1SDavid du Colombier.CW str
26583e12c5d1SDavid du Colombieris stored in the
26593e12c5d1SDavid du Colombier.CW Box
26603e12c5d1SDavid du Colombierseparately from the
26613e12c5d1SDavid du Colombier.CW Rasp
26623e12c5d1SDavid du Colombierassociated with the window's file, so
26633e12c5d1SDavid du Colombier.CW Boxes
26643e12c5d1SDavid du Colombierare self-contained.)
26653e12c5d1SDavid du ColombierThe invariant is that
26663e12c5d1SDavid du Colombierthe image of the
26673e12c5d1SDavid du Colombier.CW Box
26683e12c5d1SDavid du Colombiercan be reproduced by calling
26693e12c5d1SDavid du Colombier.CW string
26703e12c5d1SDavid du Colombierwith argument
26713e12c5d1SDavid du Colombier.CW str
26723e12c5d1SDavid du Colombierto draw the string in
26733e12c5d1SDavid du Colombier.CW rect ,
26743e12c5d1SDavid du Colombierand the resulting picture fits perfectly within
26753e12c5d1SDavid du Colombier.CW rect .
26763e12c5d1SDavid du ColombierIn other words, the
26773e12c5d1SDavid du Colombier.CW Boxes
26783e12c5d1SDavid du Colombierdefine the tiling of the window.
26793e12c5d1SDavid du ColombierThe tiling may be complicated by long lines of text, which
26803e12c5d1SDavid du Colombierare folded onto the next line.
26813e12c5d1SDavid du ColombierSome editors use horizontal scrolling to avoid this complication,
26823e12c5d1SDavid du Colombierbut to be comfortable this technique requires that lines not be
26833e12c5d1SDavid du Colombier.I too
26843e12c5d1SDavid du Colombierlong;
26853e12c5d1SDavid du Colombier.CW sam
26863e12c5d1SDavid du Colombierhas no such restriction.
2687*219b2ee8SDavid du ColombierAlso, and perhaps more importantly, UNIX programs and terminals traditionally fold
26883e12c5d1SDavid du Colombierlong lines to make their contents fully visible.
26893e12c5d1SDavid du Colombier.PP
26903e12c5d1SDavid du ColombierTwo special kinds of
26913e12c5d1SDavid du Colombier.CW Boxes
26923e12c5d1SDavid du Colombiercontain a single
26933e12c5d1SDavid du Colombiercharacter: either a newline or a tab.
26943e12c5d1SDavid du ColombierNewlines and tabs are white space.
26953e12c5d1SDavid du ColombierA newline
26963e12c5d1SDavid du Colombier.CW Box
26973e12c5d1SDavid du Colombieralways extends to the right edge of the window,
26983e12c5d1SDavid du Colombierforcing the following
26993e12c5d1SDavid du Colombier.CW Box
27003e12c5d1SDavid du Colombierto the next line.
27013e12c5d1SDavid du ColombierThe width of a tab depends on where it is located:
27023e12c5d1SDavid du Colombierit forces the next
27033e12c5d1SDavid du Colombier.CW Box
27043e12c5d1SDavid du Colombierto begin at a tab location.
27053e12c5d1SDavid du ColombierTabs also
27063e12c5d1SDavid du Colombierhave a minimum width equivalent to a blank (blanks are
27073e12c5d1SDavid du Colombierdrawn by
27083e12c5d1SDavid du Colombier.CW string
27093e12c5d1SDavid du Colombierand are not treated specially); newlines have a minimum width of zero.
27103e12c5d1SDavid du Colombier.KF
27113e12c5d1SDavid du Colombier.PS
27123e12c5d1SDavid du Colombiercopy "fig7.pic"
27133e12c5d1SDavid du Colombier.PE
27143e12c5d1SDavid du Colombier.sp .5
27153e12c5d1SDavid du Colombier.Cs
27163e12c5d1SDavid du ColombierFigure 7. A line of text showing its
27173e12c5d1SDavid du Colombier.CW Boxes .
27183e12c5d1SDavid du ColombierThe first two blank
27193e12c5d1SDavid du Colombier.CW Boxes
27203e12c5d1SDavid du Colombiercontain tabs; the last contains a newline.
27213e12c5d1SDavid du ColombierSpaces are handled as ordinary characters.
27223e12c5d1SDavid du Colombier.Ce
27233e12c5d1SDavid du Colombier.KE
27243e12c5d1SDavid du Colombier.PP
27253e12c5d1SDavid du ColombierThe update algorithms always use the
27263e12c5d1SDavid du Colombier.CW Bitmap
27273e12c5d1SDavid du Colombierimage of the text (either the display or cache
27283e12c5d1SDavid du Colombier.CW Bitmap );
27293e12c5d1SDavid du Colombierthey never examine the characters within a
27303e12c5d1SDavid du Colombier.CW Box
27313e12c5d1SDavid du Colombierexcept when the
27323e12c5d1SDavid du Colombier.CW Box
27333e12c5d1SDavid du Colombierneeds to be split in two.
27343e12c5d1SDavid du ColombierBefore a change, the window consists of a tiling of
27353e12c5d1SDavid du Colombier.CW Boxes ;
27363e12c5d1SDavid du Colombierafter the change the window is tiled differently.
27373e12c5d1SDavid du ColombierThe update algorithms rearrange the tiles in place, without
27383e12c5d1SDavid du Colombierbackup storage.
27393e12c5d1SDavid du ColombierThe algorithms are not strictly optimal \(em for example, they can
27403e12c5d1SDavid du Colombierclear a pixel that is later going to be written upon \(em
27413e12c5d1SDavid du Colombierbut they never move a tile that doesn't need to be moved,
27423e12c5d1SDavid du Colombierand they move each tile at most once.
27433e12c5d1SDavid du Colombier.CW Frinsert
27443e12c5d1SDavid du Colombieron a Blit can absorb over a thousand characters a second if the strings
27453e12c5d1SDavid du Colombierbeing inserted are a few tens of characters long.
27463e12c5d1SDavid du Colombier.PP
27473e12c5d1SDavid du ColombierConsider
27483e12c5d1SDavid du Colombier.CW frdelete .
27493e12c5d1SDavid du ColombierIts job is to delete a substring from a
27503e12c5d1SDavid du Colombier.CW Frame
27513e12c5d1SDavid du Colombierand restore the image of the
27523e12c5d1SDavid du Colombier.CW Frame .
27533e12c5d1SDavid du ColombierThe image of a substring has a peculiar shape (see Figure 2) comprising
27543e12c5d1SDavid du Colombierpossibly a partial line,
27553e12c5d1SDavid du Colombierzero or more full lines,
27563e12c5d1SDavid du Colombierand possibly a final partial line.
27573e12c5d1SDavid du ColombierFor reference, call this the
27583e12c5d1SDavid du Colombier.I
27593e12c5d1SDavid du ColombierZ-shape.
27603e12c5d1SDavid du Colombier.R
27613e12c5d1SDavid du Colombier.CW Frdelete
27623e12c5d1SDavid du Colombierbegins by splitting, if necessary, the
27633e12c5d1SDavid du Colombier.CW Boxes
27643e12c5d1SDavid du Colombiercontaining the ends of
27653e12c5d1SDavid du Colombierthe substring so the substring begins and ends on
27663e12c5d1SDavid du Colombier.CW Box
27673e12c5d1SDavid du Colombierboundaries.
27683e12c5d1SDavid du ColombierBecause the substring is being deleted, its image is not needed,
27693e12c5d1SDavid du Colombierso the Z-shape is then cleared.
27703e12c5d1SDavid du ColombierThen, tiles (that is, the images of
27713e12c5d1SDavid du Colombier.CW Boxes )
27723e12c5d1SDavid du Colombierare copied, using
27733e12c5d1SDavid du Colombier.CW bitblt ,
27743e12c5d1SDavid du Colombierfrom immediately after the Z-shape to
27753e12c5d1SDavid du Colombierthe beginning of the Z-shape,
27763e12c5d1SDavid du Colombierresulting in a new Z-shape.
27773e12c5d1SDavid du Colombier.CW Boxes "" (
27783e12c5d1SDavid du Colombierwhose contents would span two lines in the new position must first be split.)
27793e12c5d1SDavid du Colombier.PP
27803e12c5d1SDavid du ColombierCopying the remainder of the
27813e12c5d1SDavid du Colombier.CW Frame
27823e12c5d1SDavid du Colombiertile by tile
27833e12c5d1SDavid du Colombierthis way will clearly accomplish the deletion but eventually,
27843e12c5d1SDavid du Colombiertypically when the copying algorithm encounters a tab or newline,
27853e12c5d1SDavid du Colombierthe old and new
27863e12c5d1SDavid du Colombier.CW x
27873e12c5d1SDavid du Colombiercoordinates of the tile
27883e12c5d1SDavid du Colombierto be copied are the same.
27893e12c5d1SDavid du ColombierThis correspondence implies
27903e12c5d1SDavid du Colombierthat the Z-shape has its beginning and ending edges aligned
27913e12c5d1SDavid du Colombiervertically, and a sequence of at most two
27923e12c5d1SDavid du Colombier.CW bitblts
27933e12c5d1SDavid du Colombiercan be used to copy the remaining tiles.
27943e12c5d1SDavid du ColombierThe last step is to clear out the resulting empty space at the bottom
27953e12c5d1SDavid du Colombierof the window;
27963e12c5d1SDavid du Colombierthe number of lines to be cleared is the number of complete lines in the
27973e12c5d1SDavid du ColombierZ-shape closed by the final
27983e12c5d1SDavid du Colombier.CW bitblts.
27993e12c5d1SDavid du ColombierThe final step is to merge horizontally adjacent
28003e12c5d1SDavid du Colombier.CW Boxes
28013e12c5d1SDavid du Colombierof plain text.
28023e12c5d1SDavid du ColombierThe complete source to
28033e12c5d1SDavid du Colombier.CW frdelete
28043e12c5d1SDavid du Colombieris less than 100 lines of C.
28053e12c5d1SDavid du Colombier.PP
28063e12c5d1SDavid du Colombier.CW frinsert
28073e12c5d1SDavid du Colombieris more complicated because it must do four passes:
28083e12c5d1SDavid du Colombierone to construct the
28093e12c5d1SDavid du Colombier.CW Box
28103e12c5d1SDavid du Colombierlist for the inserted string,
28113e12c5d1SDavid du Colombierone to reconnoitre,
28123e12c5d1SDavid du Colombierone to copy (in opposite order to
28133e12c5d1SDavid du Colombier.CW frdelete )
28143e12c5d1SDavid du Colombierthe
28153e12c5d1SDavid du Colombier.CW Boxes
28163e12c5d1SDavid du Colombierto make the hole for the new text,
28173e12c5d1SDavid du Colombierand finally one to copy the new text into place.
28183e12c5d1SDavid du ColombierOverall, though,
28193e12c5d1SDavid du Colombier.CW frinsert
28203e12c5d1SDavid du Colombierhas a similar flavor to
28213e12c5d1SDavid du Colombier.CW frdelete ,
28223e12c5d1SDavid du Colombierand needn't be described further.
28233e12c5d1SDavid du Colombier.CW Frinsert
28243e12c5d1SDavid du Colombierand its subsidiary routines comprise 211 lines of C.
28253e12c5d1SDavid du Colombier.PP
28263e12c5d1SDavid du ColombierThe terminal source code is 3024 lines of C,
28273e12c5d1SDavid du Colombierand the host source is 5797 lines.
28283e12c5d1SDavid du Colombier.SH
28293e12c5d1SDavid du ColombierDiscussion
28303e12c5d1SDavid du Colombier.SH 2
28313e12c5d1SDavid du ColombierHistory
28323e12c5d1SDavid du Colombier.LP
28333e12c5d1SDavid du ColombierThe immediate ancestor of
28343e12c5d1SDavid du Colombier.CW sam
28353e12c5d1SDavid du Colombierwas the original text editor for the Blit, called
28363e12c5d1SDavid du Colombier.CW jim .
28373e12c5d1SDavid du Colombier.CW Sam
28383e12c5d1SDavid du Colombierinherited
28393e12c5d1SDavid du Colombier.CW jim 's
28403e12c5d1SDavid du Colombiertwo-process structure and mouse language almost unchanged, but
28413e12c5d1SDavid du Colombier.CW jim
28423e12c5d1SDavid du Colombiersuffered from several drawbacks that were addressed in the design of
28433e12c5d1SDavid du Colombier.CW sam .
28443e12c5d1SDavid du ColombierThe most important of these was the lack of a command language.
28453e12c5d1SDavid du ColombierAlthough
28463e12c5d1SDavid du Colombier.CW jim
28473e12c5d1SDavid du Colombierwas easy to use for simple editing, it provided no direct help with
28483e12c5d1SDavid du Colombierlarge or repetitive editing tasks.  Instead, it provided a command to pass
28493e12c5d1SDavid du Colombierselected text through a shell pipeline,
28503e12c5d1SDavid du Colombierbut this was no more satisfactory than could be expected of a stopgap measure.
28513e12c5d1SDavid du Colombier.PP
28523e12c5d1SDavid du Colombier.CW Jim
28533e12c5d1SDavid du Colombierwas written primarily as a vehicle for experimenting with a mouse-based
28543e12c5d1SDavid du Colombierinterface to text, and the experiment was successful.
28553e12c5d1SDavid du Colombier.CW Jim
28563e12c5d1SDavid du Colombierhad some spin-offs:
28573e12c5d1SDavid du Colombier.CW mux ,
28583e12c5d1SDavid du Colombierthe second window system for the Blit, is essentially a multiplexed
28593e12c5d1SDavid du Colombierversion of the terminal part of
28603e12c5d1SDavid du Colombier.CW jim ;
28613e12c5d1SDavid du Colombierand the debugger
28623e12c5d1SDavid du Colombier.CW pi 's
28633e12c5d1SDavid du Colombieruser interface\u\s-4\&20\s+4\d was closely modeled on
28643e12c5d1SDavid du Colombier.CW jim 's.
28653e12c5d1SDavid du ColombierBut after a couple of years,
28663e12c5d1SDavid du Colombier.CW jim
28673e12c5d1SDavid du Colombierhad become difficult to maintain and limiting to use,
28683e12c5d1SDavid du Colombierand its replacement was overdue.
28693e12c5d1SDavid du Colombier.PP
28703e12c5d1SDavid du ColombierI began the design of
28713e12c5d1SDavid du Colombier.CW sam
28723e12c5d1SDavid du Colombierby asking
28733e12c5d1SDavid du Colombier.CW jim
28743e12c5d1SDavid du Colombiercustomers what they wanted.
28753e12c5d1SDavid du ColombierThis was probably a mistake; the answers were essentially a list of features
28763e12c5d1SDavid du Colombierto be found in other editors, which did not provide any of the
28773e12c5d1SDavid du Colombierguiding principles I was seeking.
28783e12c5d1SDavid du ColombierFor instance, one common request was for a ``global substitute,''
28793e12c5d1SDavid du Colombierbut no one suggested how to provide it within a cut-and-paste editor.
28803e12c5d1SDavid du ColombierI was looking for a scheme that would
28813e12c5d1SDavid du Colombiersupport such specialized features comfortably in the context of some
28823e12c5d1SDavid du Colombiergeneral command language.
28833e12c5d1SDavid du ColombierIdeas were not forthcoming, though, particularly given my insistence
28843e12c5d1SDavid du Colombieron removing all limits on file sizes, line lengths and so on.
28853e12c5d1SDavid du ColombierEven worse, I recognized that, since the mouse could easily
28863e12c5d1SDavid du Colombierindicate a region of the screen that was not an integral number of lines,
28873e12c5d1SDavid du Colombierthe command language would best forget about newlines altogether,
28883e12c5d1SDavid du Colombierand that meant the command language had to treat the file as a single
28893e12c5d1SDavid du Colombierstring, not an array of lines.
28903e12c5d1SDavid du Colombier.PP
28913e12c5d1SDavid du ColombierEventually, I decided that thinking was not getting me very far and it was
28923e12c5d1SDavid du Colombiertime to try building.
28933e12c5d1SDavid du ColombierI knew that the terminal part could be built easily \(em
28943e12c5d1SDavid du Colombierthat part of
28953e12c5d1SDavid du Colombier.CW jim
28963e12c5d1SDavid du Colombierbehaved acceptably well \(em and that most of the hard work was going
28973e12c5d1SDavid du Colombierto be in the host part: the file interface, command interpreter and so on.
28983e12c5d1SDavid du ColombierMoreover, I had some ideas about how the architecture of
28993e12c5d1SDavid du Colombier.CW jim
29003e12c5d1SDavid du Colombiercould be improved without destroying its basic structure, which I liked
29013e12c5d1SDavid du Colombierin principle but which hadn't worked out as well as I had hoped.
29023e12c5d1SDavid du ColombierSo I began by designing the file data structure,
29033e12c5d1SDavid du Colombierstarting with the way
29043e12c5d1SDavid du Colombier.CW jim
29053e12c5d1SDavid du Colombierworked \(em comparable to a single structure merging
29063e12c5d1SDavid du Colombier.CW Disc
29073e12c5d1SDavid du Colombierand
29083e12c5d1SDavid du Colombier.CW Buffer ,
29093e12c5d1SDavid du Colombierwhich I split to make the cache more general
29103e12c5d1SDavid du Colombier\(em and thinking about how global substitute could be implemented.
29113e12c5d1SDavid du ColombierThe answer was clearly that it had to be done in two passes,
29123e12c5d1SDavid du Colombierand the transcript-oriented implementation fell out naturally.
29133e12c5d1SDavid du Colombier.PP
29143e12c5d1SDavid du Colombier.CW Sam
29153e12c5d1SDavid du Colombierwas written bottom-up,
29163e12c5d1SDavid du Colombierstarting from the data structures and algorithms for manipulating text,
29173e12c5d1SDavid du Colombierthrough the command language and up to the code for maintaining
29183e12c5d1SDavid du Colombierthe display.
29193e12c5d1SDavid du ColombierIn retrospect, it turned out well, but this implementation method is
29203e12c5d1SDavid du Colombiernot recommended in general.
29213e12c5d1SDavid du ColombierThere were several times when I had a large body of interesting code
29223e12c5d1SDavid du Colombierassembled and no clue how to proceed with it.
29233e12c5d1SDavid du ColombierThe command language, in particular, took almost a year to figure out,
29243e12c5d1SDavid du Colombierbut can be implemented (given what was there at the beginning of that year)
29253e12c5d1SDavid du Colombierin a day or two.  Similarly, inventing the
29263e12c5d1SDavid du Colombier.CW Rasp
29273e12c5d1SDavid du Colombierdata structure delayed the
29283e12c5d1SDavid du Colombierconnection of the host and terminal pieces by another few months.
29293e12c5d1SDavid du Colombier.CW Sam
29303e12c5d1SDavid du Colombiertook about two years to write, although only about four months were
29313e12c5d1SDavid du Colombierspent actually working on it.
29323e12c5d1SDavid du Colombier.PP
29333e12c5d1SDavid du ColombierPart of the design process was unusual:
29343e12c5d1SDavid du Colombierthe subset of the protocol that maintains the
29353e12c5d1SDavid du Colombier.CW Rasp
29363e12c5d1SDavid du Colombierwas simulated, debugged
29373e12c5d1SDavid du Colombierand verified by an automatic protocol analyzer,\u\s-4\&21\s+4\d and was bug-free
29383e12c5d1SDavid du Colombierfrom the start.
29393e12c5d1SDavid du ColombierThe rest of the protocol, concerned mostly
29403e12c5d1SDavid du Colombierwith keeping menus up to date,
29413e12c5d1SDavid du Colombierwas unfortunately too unwieldy for such analysis,
29423e12c5d1SDavid du Colombierand was debugged by more traditional methods, primarily
29433e12c5d1SDavid du Colombierby logging in a file all messages in and out of the host.
29443e12c5d1SDavid du Colombier.SH 2
29453e12c5d1SDavid du ColombierReflections
29463e12c5d1SDavid du Colombier.LP
29473e12c5d1SDavid du Colombier.CW Sam
29483e12c5d1SDavid du Colombieris essentially the only interactive editor used by the sixty or so members of
29493e12c5d1SDavid du Colombierthe computing science research center in which I work.
29503e12c5d1SDavid du ColombierThe same could not be said of
29513e12c5d1SDavid du Colombier.CW jim ;
29523e12c5d1SDavid du Colombierthe lack of a command language kept some people from adopting it.
29533e12c5d1SDavid du ColombierThe union of a user interface as comfortable as
29543e12c5d1SDavid du Colombier.CW jim 's
29553e12c5d1SDavid du Colombierwith a command language as powerful as
2956*219b2ee8SDavid du Colombier.CW ed 's†
29573e12c5d1SDavid du Colombier.FS
29583e12c5d1SDavid du Colombier.vs 9
2959*219b2ee8SDavid du Colombier†The people who criticize
29603e12c5d1SDavid du Colombier.CW ed
29613e12c5d1SDavid du Colombieras an interactive program often forget that it and its close relative
29623e12c5d1SDavid du Colombier.CW sed \u\s-4\&7\s+4\d
29633e12c5d1SDavid du Colombierstill thrive as programmable editors.  The strength of these programs is
29643e12c5d1SDavid du Colombierindependent of their convenience for interactive editing.
29653e12c5d1SDavid du Colombier.br
29663e12c5d1SDavid du Colombier.vs
29673e12c5d1SDavid du Colombier.FE
29683e12c5d1SDavid du Colombieris essential to
29693e12c5d1SDavid du Colombier.CW sam 's
29703e12c5d1SDavid du Colombiersuccess.
29713e12c5d1SDavid du ColombierWhen
29723e12c5d1SDavid du Colombier.CW sam
29733e12c5d1SDavid du Colombierwas first made available to the
29743e12c5d1SDavid du Colombier.CW jim
29753e12c5d1SDavid du Colombiercommunity,
29763e12c5d1SDavid du Colombieralmost everyone switched to it within two or three days.
29773e12c5d1SDavid du ColombierIn the months that followed, even people who had never adopted
29783e12c5d1SDavid du Colombier.CW jim
29793e12c5d1SDavid du Colombierstarted using
29803e12c5d1SDavid du Colombier.CW sam
29813e12c5d1SDavid du Colombierexclusively.
29823e12c5d1SDavid du Colombier.PP
29833e12c5d1SDavid du ColombierTo be honest,
29843e12c5d1SDavid du Colombier.CW ed
29853e12c5d1SDavid du Colombierstill gets occasional use, but usually when
29863e12c5d1SDavid du Colombiersomething quick needs to be done and the overhead of
29873e12c5d1SDavid du Colombierdownloading the terminal part of
29883e12c5d1SDavid du Colombier.CW sam
29893e12c5d1SDavid du Colombierisn't worth the trouble.
29903e12c5d1SDavid du ColombierAlso, as a `line' editor,
29913e12c5d1SDavid du Colombier.CW sam
29923e12c5d1SDavid du Colombier.CW -d
29933e12c5d1SDavid du Colombieris a bit odd;
29943e12c5d1SDavid du Colombierwhen using a good old ASCII terminal, it's comforting to have
29953e12c5d1SDavid du Colombiera true line editor.
29963e12c5d1SDavid du ColombierBut it is fair to say that
29973e12c5d1SDavid du Colombier.CW sam 's
29983e12c5d1SDavid du Colombiercommand language has displaced
29993e12c5d1SDavid du Colombier.CW ed 's
30003e12c5d1SDavid du Colombierfor most of the complicated editing that has kept line editors
30013e12c5d1SDavid du Colombier(that is, command-driven editors) with us.
30023e12c5d1SDavid du Colombier.PP
30033e12c5d1SDavid du Colombier.CW Sam 's
30043e12c5d1SDavid du Colombiercommand language is even fancier than
30053e12c5d1SDavid du Colombier.CW ed 's,
30063e12c5d1SDavid du Colombierand most
30073e12c5d1SDavid du Colombier.CW sam
30083e12c5d1SDavid du Colombiercustomers don't come near to using all its capabilities.
30093e12c5d1SDavid du ColombierDoes it need to be so sophisticated?
30103e12c5d1SDavid du ColombierI think the answer is yes, for two reasons.
30113e12c5d1SDavid du Colombier.PP
30123e12c5d1SDavid du ColombierFirst, the
30133e12c5d1SDavid du Colombier.I model
30143e12c5d1SDavid du Colombierfor
30153e12c5d1SDavid du Colombier.CW sam 's
30163e12c5d1SDavid du Colombiercommand language is really relatively simple, and certainly simpler than that of
30173e12c5d1SDavid du Colombier.CW ed .
30183e12c5d1SDavid du ColombierFor instance, there is only one kind of textual loop in
30193e12c5d1SDavid du Colombier.CW sam
30203e12c5d1SDavid du Colombier\(em the
30213e12c5d1SDavid du Colombier.CW x
30223e12c5d1SDavid du Colombiercommand \(em
30233e12c5d1SDavid du Colombierwhile
30243e12c5d1SDavid du Colombier.CW ed
30253e12c5d1SDavid du Colombierhas three (the
30263e12c5d1SDavid du Colombier.CW g
30273e12c5d1SDavid du Colombiercommand, the global flag on substitutions, and the implicit loop over
30283e12c5d1SDavid du Colombierlines in multi-line substitutions).
30293e12c5d1SDavid du ColombierAlso,
30303e12c5d1SDavid du Colombier.CW ed 's
30313e12c5d1SDavid du Colombiersubstitute command is necessary to make changes within lines, but in
30323e12c5d1SDavid du Colombier.CW sam
30333e12c5d1SDavid du Colombierthe
30343e12c5d1SDavid du Colombier.CW s
30353e12c5d1SDavid du Colombiercommand is more of a familiar convenience than a necessity;
30363e12c5d1SDavid du Colombier.CW c
30373e12c5d1SDavid du Colombierand
30383e12c5d1SDavid du Colombier.CW t
30393e12c5d1SDavid du Colombiercan do all the work.
30403e12c5d1SDavid du Colombier.PP
30413e12c5d1SDavid du ColombierSecond,
30423e12c5d1SDavid du Colombiergiven a community that expects an editor to be about as powerful as
30433e12c5d1SDavid du Colombier.CW ed ,
30443e12c5d1SDavid du Colombierit's hard to see how
30453e12c5d1SDavid du Colombier.CW sam
30463e12c5d1SDavid du Colombiercould really be much simpler and still satisfy that expectation.
30473e12c5d1SDavid du ColombierPeople want to do ``global substitutes,'' and most are content
30483e12c5d1SDavid du Colombierto have the recipe for that and a few other fancy changes.
30493e12c5d1SDavid du ColombierThe sophistication of the command language is really just a veneer
30503e12c5d1SDavid du Colombierover a design that makes it possible to do global substitutes
30513e12c5d1SDavid du Colombierin a screen editor.
30523e12c5d1SDavid du ColombierSome people will always want something more, however, and it's gratifying to
30533e12c5d1SDavid du Colombierbe able to provide it.
30543e12c5d1SDavid du ColombierThe real power of
30553e12c5d1SDavid du Colombier.CW sam 's
30563e12c5d1SDavid du Colombiercommand language comes from composability of the operators, which is by
30573e12c5d1SDavid du Colombiernature orthogonal to the underlying model.
30583e12c5d1SDavid du ColombierIn other words,
30593e12c5d1SDavid du Colombier.CW sam
30603e12c5d1SDavid du Colombieris not itself complex, but it makes complex things possible.
30613e12c5d1SDavid du ColombierIf you don't want to do anything complex, you can ignore the
30623e12c5d1SDavid du Colombiercomplexity altogether, and many people do so.
30633e12c5d1SDavid du Colombier.PP
30643e12c5d1SDavid du ColombierSometimes I am asked the opposite question: why didn't I just make
30653e12c5d1SDavid du Colombier.CW sam
30663e12c5d1SDavid du Colombiera real programmable editor, with macros and variables and so on?
30673e12c5d1SDavid du ColombierThe main reason is a matter of taste: I like the editor
30683e12c5d1SDavid du Colombierto be the same every time I use it.
30693e12c5d1SDavid du ColombierThere is one technical reason, though:
30703e12c5d1SDavid du Colombierprogrammability in editors is largely a workaround for insufficient
30713e12c5d1SDavid du Colombierinteractivity.
30723e12c5d1SDavid du ColombierProgrammable editors are used to make particular, usually short-term,
30733e12c5d1SDavid du Colombierthings easy to do, such as by providing shorthands for common actions.
30743e12c5d1SDavid du ColombierIf things are generally easy to do in the first place,
30753e12c5d1SDavid du Colombiershorthands are not as helpful.
30763e12c5d1SDavid du Colombier.CW Sam
30773e12c5d1SDavid du Colombiermakes common editing operations very easy, and the solutions to
30783e12c5d1SDavid du Colombiercomplex editing problems seem commensurate with the problems themselves.
30793e12c5d1SDavid du ColombierAlso, the ability to edit the
30803e12c5d1SDavid du Colombier.CW sam
30813e12c5d1SDavid du Colombierwindow makes it easy to repeat commands \(em it only takes a mouse button click
30823e12c5d1SDavid du Colombierto execute a command again.
30833e12c5d1SDavid du Colombier.SH 2
30843e12c5d1SDavid du ColombierPros and cons
30853e12c5d1SDavid du Colombier.LP
30863e12c5d1SDavid du Colombier.CW Sam
30873e12c5d1SDavid du Colombierhas several other good points,
30883e12c5d1SDavid du Colombierand its share of problems.
30893e12c5d1SDavid du ColombierAmong the good things is the idea of
30903e12c5d1SDavid du Colombierstructural regular expressions,
30913e12c5d1SDavid du Colombierwhose usefulness has only begun to be explored.
30923e12c5d1SDavid du ColombierThey were arrived at serendipitously when I attempted to distill the essence of
30933e12c5d1SDavid du Colombier.CW ed 's
30943e12c5d1SDavid du Colombierway of doing global substitution and recognized that the looping command in
30953e12c5d1SDavid du Colombier.CW ed
30963e12c5d1SDavid du Colombierwas implicitly imposing a structure (an array of lines) on the file.
30973e12c5d1SDavid du Colombier.PP
30983e12c5d1SDavid du ColombierAnother of
30993e12c5d1SDavid du Colombier.CW sam 's
31003e12c5d1SDavid du Colombiergood things is its undo capability.
31013e12c5d1SDavid du ColombierI had never before used an editor with a true undo,
31023e12c5d1SDavid du Colombierbut I would never go back now.
31033e12c5d1SDavid du ColombierUndo
31043e12c5d1SDavid du Colombier.I must
31053e12c5d1SDavid du Colombierbe done well, but if it is, it can be relied on.
31063e12c5d1SDavid du ColombierFor example,
31073e12c5d1SDavid du Colombierit's safe to experiment if you're not sure how to write some intricate command,
31083e12c5d1SDavid du Colombierbecause if you make a mistake, it can be fixed simply and reliably.
31093e12c5d1SDavid du ColombierI learned two things about undo from writing
31103e12c5d1SDavid du Colombier.CW sam :
31113e12c5d1SDavid du Colombierfirst, it's easy to provide if you design it in from the beginning, and
31123e12c5d1SDavid du Colombiersecond, it's necessary, particularly if the system has some subtle
31133e12c5d1SDavid du Colombierproperties that may be unfamiliar or error-prone for users.
31143e12c5d1SDavid du Colombier.PP
31153e12c5d1SDavid du Colombier.CW Sam 's
31163e12c5d1SDavid du Colombierlack of internal limits and sizes is a virtue.
31173e12c5d1SDavid du ColombierBecause it avoids all fixed-size tables and data structures,
31183e12c5d1SDavid du Colombier.CW sam
31193e12c5d1SDavid du Colombieris able to make global changes to files that some of our other
31203e12c5d1SDavid du Colombiertools cannot even read.
31213e12c5d1SDavid du ColombierMoreover, the design keeps the performance linear when doing such
31223e12c5d1SDavid du Colombieroperations, although I must admit
31233e12c5d1SDavid du Colombier.CW sam
31243e12c5d1SDavid du Colombierdoes get slow when editing a huge file.
31253e12c5d1SDavid du Colombier.PP
31263e12c5d1SDavid du ColombierNow, the problems.
31273e12c5d1SDavid du ColombierExternally, the most obvious is that it is poorly integrated into the
31283e12c5d1SDavid du Colombiersurrounding window system.
31293e12c5d1SDavid du ColombierBy design, the user interface in
31303e12c5d1SDavid du Colombier.CW sam
31313e12c5d1SDavid du Colombierfeels almost identical to that of
31323e12c5d1SDavid du Colombier.CW mux ,
31333e12c5d1SDavid du Colombierbut a thick wall separates text in
31343e12c5d1SDavid du Colombier.CW sam
31353e12c5d1SDavid du Colombierfrom the programs running in
31363e12c5d1SDavid du Colombier.CW mux .
31373e12c5d1SDavid du ColombierFor instance, the `snarf buffer' in
31383e12c5d1SDavid du Colombier.CW sam
31393e12c5d1SDavid du Colombiermust be maintained separately from that in
31403e12c5d1SDavid du Colombier.CW mux .
31413e12c5d1SDavid du ColombierThis is regrettable, but probably necessary given the unusual configuration
31423e12c5d1SDavid du Colombierof the system, with a programmable terminal on the far end of an RS-232 link.
31433e12c5d1SDavid du Colombier.PP
31443e12c5d1SDavid du Colombier.CW Sam
31453e12c5d1SDavid du Colombieris reliable; otherwise, people wouldn't use it.
31463e12c5d1SDavid du ColombierBut it was written over such a long time, and has so many new (to me)
31473e12c5d1SDavid du Colombierideas in it, that I would like to see it done over again to clean
31483e12c5d1SDavid du Colombierup the code and remove many of the lingering problems in the implementation.
31493e12c5d1SDavid du ColombierThe worst part is in the interconnection of the host and terminal parts,
31503e12c5d1SDavid du Colombierwhich might even be able to go away in a redesign for a more
31513e12c5d1SDavid du Colombierconventional window system.
31523e12c5d1SDavid du ColombierThe program must be split in two to use the terminal effectively,
31533e12c5d1SDavid du Colombierbut the low bandwidth of the connection forces the separation to
31543e12c5d1SDavid du Colombieroccur in an inconvenient part of the design if performance is to be acceptable.
31553e12c5d1SDavid du ColombierA simple remote procedure call
31563e12c5d1SDavid du Colombierprotocol driven by the host, emitting only graphics
31573e12c5d1SDavid du Colombiercommands, would be easy to write but wouldn't have nearly the
31583e12c5d1SDavid du Colombiernecessary responsiveness.  On the other hand, if the terminal were in control
31593e12c5d1SDavid du Colombierand requested much simpler file services from the host, regular expression
31603e12c5d1SDavid du Colombiersearches would require that the terminal read the entire file over its RS-232
31613e12c5d1SDavid du Colombierlink, which would be unreasonably slow.
31623e12c5d1SDavid du ColombierA compromise in which either end can take control is necessary.
31633e12c5d1SDavid du ColombierIn retrospect, the communications protocol should have been
31643e12c5d1SDavid du Colombierdesigned and verified formally, although I do not know of any tool
31653e12c5d1SDavid du Colombierthat can adequately relate the protocol to
31663e12c5d1SDavid du Colombierits implementation.
31673e12c5d1SDavid du Colombier.PP
31683e12c5d1SDavid du ColombierNot all of
31693e12c5d1SDavid du Colombier.CW sam 's
31703e12c5d1SDavid du Colombierusers are comfortable with its command language, and few are adept.
31713e12c5d1SDavid du ColombierSome (venerable) people use a sort of
31723e12c5d1SDavid du Colombier.CW ed \& ``
31733e12c5d1SDavid du Colombiersubset'' of
31743e12c5d1SDavid du Colombier.CW sam 's
31753e12c5d1SDavid du Colombiercommand language,
31763e12c5d1SDavid du Colombierand even ask why
31773e12c5d1SDavid du Colombier.CW sam 's
31783e12c5d1SDavid du Colombiercommand language is not exactly
31793e12c5d1SDavid du Colombier.CW ed 's.
31803e12c5d1SDavid du Colombier(The reason, of course, is that
31813e12c5d1SDavid du Colombier.CW sam 's
31823e12c5d1SDavid du Colombiermodel for text does not include newlines, which are central to
31833e12c5d1SDavid du Colombier.CW ed .
31843e12c5d1SDavid du ColombierMaking the text an array of newlines to the command language would
31853e12c5d1SDavid du Colombierbe too much of a break from the seamless model provided by the mouse.
31863e12c5d1SDavid du ColombierSome editors, such as
31873e12c5d1SDavid du Colombier.CW vi ,
31883e12c5d1SDavid du Colombierare willing to make this break, though.)
31893e12c5d1SDavid du ColombierThe difficulty is that
31903e12c5d1SDavid du Colombier.CW sam 's
31913e12c5d1SDavid du Colombiersyntax is so close to
31923e12c5d1SDavid du Colombier.CW ed 's
31933e12c5d1SDavid du Colombierthat people believe it
31943e12c5d1SDavid du Colombier.I should
31953e12c5d1SDavid du Colombierbe the same.
31963e12c5d1SDavid du ColombierI thought, with some justification in hindsight,
31973e12c5d1SDavid du Colombierthat making
31983e12c5d1SDavid du Colombier.CW sam
31993e12c5d1SDavid du Colombiersimilar to
32003e12c5d1SDavid du Colombier.CW ed
32013e12c5d1SDavid du Colombierwould make it easier to learn and to accept.
32023e12c5d1SDavid du ColombierBut I may have overstepped and raised the users'
32033e12c5d1SDavid du Colombierexpectations too much.
32043e12c5d1SDavid du ColombierIt's hard to decide which way to resolve this problem.
32053e12c5d1SDavid du Colombier.PP
32063e12c5d1SDavid du ColombierFinally, there is a tradeoff in
32073e12c5d1SDavid du Colombier.CW sam
32083e12c5d1SDavid du Colombierthat was decided by the environment in which it runs:
32093e12c5d1SDavid du Colombier.CW sam
32103e12c5d1SDavid du Colombieris a multi-file editor, although in a different system there might instead be
32113e12c5d1SDavid du Colombiermultiple single-file editors.
32123e12c5d1SDavid du ColombierThe decision was made primarily because starting a new program in a Blit is
32133e12c5d1SDavid du Colombiertime-consuming.
32143e12c5d1SDavid du ColombierIf the choice could be made freely, however, I would
32153e12c5d1SDavid du Colombierstill choose the multi-file architecture, because it allows
32163e12c5d1SDavid du Colombiergroups of files to be handled as a unit;
32173e12c5d1SDavid du Colombierthe usefulness of the multi-file commands is incontrovertible.
32183e12c5d1SDavid du ColombierIt is delightful to have the source to an entire program
32193e12c5d1SDavid du Colombieravailable at your fingertips.
32203e12c5d1SDavid du Colombier.SH
32213e12c5d1SDavid du ColombierAcknowledgements
32223e12c5d1SDavid du Colombier.LP
32233e12c5d1SDavid du ColombierTom Cargill suggested the idea behind the
32243e12c5d1SDavid du Colombier.CW Rasp
32253e12c5d1SDavid du Colombierdata structure.
32263e12c5d1SDavid du ColombierNorman Wilson and Ken Thompson influenced the command language.
32273e12c5d1SDavid du ColombierThis paper was improved by comments from
32283e12c5d1SDavid du ColombierAl Aho,
32293e12c5d1SDavid du ColombierJon Bentley,
32303e12c5d1SDavid du ColombierChris Fraser,
32313e12c5d1SDavid du ColombierGerard Holzmann,
32323e12c5d1SDavid du ColombierBrian Kernighan,
32333e12c5d1SDavid du ColombierTed Kowalski,
32343e12c5d1SDavid du ColombierDoug McIlroy
32353e12c5d1SDavid du Colombierand
32363e12c5d1SDavid du ColombierDennis Ritchie.
3237