1.de P1
2.KS
3.DS
4.ft CW
5.ta 5n 10n 15n 20n 25n 30n 35n 40n 45n 50n 55n 60n 65n 70n 75n 80n
6..
7.de P2
8.ft 1
9.DE
10.KE
11..
12.de CW
13.lg 0
14\%\&\\$3\f(CW\\$1\fP\&\\$2
15.lg
16..
17.de WC
18.lg 0
19\%\&\\$3\f(CI\\$1\fP\&\\$2
20.lg
21..
22.TL
23A tutorial for the
24.CW sam
25.B
26command language
27.AU
28Rob Pike
29.AI
30.MH
31.AB
32.CW sam
33is an interactive text editor with a command language that makes heavy use
34of regular expressions.
35Although the language is syntactically similar to
36.CW ed (1),
37the details are interestingly different.
38This tutorial introduces the command language, but does not discuss
39the screen and mouse interface.
40With apologies to those unfamiliar with the Ninth Edition Blit software,
41it is assumed that the similarity of
42.CW sam
43to
44.CW mux (9)
45at this level makes
46.CW sam 's
47mouse language easy to learn.
48.PP
49The
50.CW sam
51command language applies identically to two environments:
52when running
53.CW sam
54on an ordinary terminal
55(\f2via\f1\f1
56.CW sam\ -d ),
57and in the command window of a
58.I downloaded
59.CW sam ,
60that is, one using the bitmap display and mouse.
61.AE
62.SH
63Introduction
64.PP
65This tutorial describes the command language of
66.CW sam ,
67an interactive text editor that runs on Blits and
68some computers with bitmap displays.
69For most editing tasks, the mouse-based editing features
70are sufficient, and they are easy to use and to learn.
71.PP
72The command language is often useful, however, particularly
73when making global changes.
74Unlike the commands in
75.CW ed ,
76which are necessary to make changes,
77.CW sam
78commands tend to be used
79only for complicated or repetitive editing tasks.
80It is in these more involved uses that
81the differences between
82.CW sam
83and other text editors are most evident.
84.PP
85.CW sam 's
86language makes it easy to do some things that other editors,
87including programs like
88.CW sed
89and
90.CW awk ,
91do not handle gracefully, so this tutorial serves partly as a
92lesson in
93.CW sam 's
94manner of manipulating text.
95The examples below therefore concentrate entirely on the language,
96assuming that facility with the use of the mouse in
97.CW sam
98is at worst easy to pick up.
99In fact,
100.CW sam
101can be run without the mouse at all (not
102.I downloaded ),
103by specifying the
104.CW -d
flag, and it is this domain that the tutorial
occupies; the command language is identical in both modes.
108.PP
109A word to the Unix adept:
110although
111.CW sam
112is syntactically very similar to
113.CW ed ,
114it is fundamentally and deliberately different in design and detailed semantics.
115You might use knowledge of
116.CW ed
117to predict how the substitute command works,
118but you'd only be right if you had used some understanding of
119.CW sam 's
120workings to influence your prediction.
121Be particularly careful about idioms.
122Idioms form in curious nooks of languages and depend on
123undependable peculiarities.
124.CW ed
125idioms simply don't work in
126.CW sam :
127.CW 1,$s/a/b/
128makes one substitution in the whole file, not one per line.
129.CW sam
130has its own idioms.
131Much of the purpose of this tutorial is to publish them
132and make fluency in
133.CW sam
134a matter of learning, not cunning.
135.PP
The tutorial assumes familiarity with regular expressions;
some experience with a more traditional Unix editor may also be helpful.
138To aid readers familiar with
139.CW ed ,
140I have pointed out in square brackets [] some of
141the relevant differences between
142.CW ed
143and
144.CW sam .
145Read these comments only if you wish
146to understand the differences; the lesson is about
147.CW sam ,
148not
149.CW sam
150.I vs.
151.CW ed .
152Another typographic convention is that output appears in
153.CW "this font,
154while typed input appears as
155.WC "slanty text.
156.PP
157Nomenclature:
158.CW sam
159keeps a copy of the text it is editing.
160This copy is called a
161.I file .
162To avoid confusion, I have called the permanent storage on disc a
163.I
164Unix file.
165.R
166.SH
167Text
168.PP
169To get started, we need some text to play with.
170Any text will do; try something from
171James Gosling's Emacs manual:
172.P1
173$ \f(CIsam -d
174a
175This manual is organized in a rather haphazard manner.  The first
176several sections were written hastily in an attempt to provide a
177general introduction to the commands in Emacs and to try to show
178the method in the madness that is the Emacs command structure.
179\&.
180.ft
181.P2
182.WC "sam -d
183starts
184.CW sam
185running.
186The
187.CW a
188command adds text until a line containing just a period, and sets the
189.I
190current text
191.R
192(also called
193.I dot )
194to what was typed \(em everything between the
195.CW a
196and the period.
197.CW ed "" [
198would leave dot set to only the last line.]
199The
200.CW p
201command prints the current text:
202.P1
203.WC p
204This manual is organized in a rather haphazard manner.  The first
205several sections were written hastily in an attempt to provide a
206general introduction to the commands in Emacs and to try to show
207the method in the madness that is the Emacs command structure.
208.P2
209[Again,
210.CW ed
211would print only the last line.]
212The
213.CW a
214command adds its text
215.I after
216dot; the
217.CW i
218command is like
.CW a ,
220but adds the text
221.I before
222dot.
223.P1
224.ft CI
225i
226Introduction
227\&.
228p
229.ft
230Introduction
231.P2
232There is also a
233.CW c
234command that changes (replaces) the current text,
235and
236.CW d
237that deletes it; these are illustrated below.
238.PP
239To see all the text, we can specify what text to print;
240for the moment, suffice it to say that
241.WC 0,$
242specifies the entire file.
243.CW ed "" [
244users would probably type
245.WC 1,$ ,
246which in practice is the same thing, but see below.]
247.P1
248.WC 0,$p
249Introduction
250This manual is organized in a rather haphazard manner.  The first
251several sections were written hastily in an attempt to provide a
252general introduction to the commands in Emacs and to try to show
253the method in the madness that is the Emacs command structure.
254.P2
255Except for the
256.CW w
257command described below,
258.I all
259commands,
260including
261.CW p ,
262set dot to the text they touch.
263Thus,
264.CW a
265and
266.CW i
267set dot to the new text,
268.CW p
269to the text printed, and so on.
270Similarly, all commands
271(except
272.CW w )
273by default operate on the current
274text [unlike
275.CW ed ,
276for which some commands (such as
277.CW g )
278default to the entire file].
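.PP
The rule that every command leaves dot on the text it touched can be
sketched as a toy model in Python.
This is an illustration only, not a description of how
.CW sam
is implemented: the class
.CW Buffer
and its method names are hypothetical, and dot is assumed to be a pair
of character offsets into one string.

```python
class Buffer:
    """Toy model of a sam file: the text plus dot as a half-open offset range."""

    def __init__(self, text=""):
        self.text = text
        self.dot = (0, 0)

    def _replace(self, start, end, new):
        # Splice in the new text, then leave dot on what was just written.
        self.text = self.text[:start] + new + self.text[end:]
        self.dot = (start, start + len(new))

    def a(self, new):
        """Append after dot."""
        self._replace(self.dot[1], self.dot[1], new)

    def i(self, new):
        """Insert before dot."""
        self._replace(self.dot[0], self.dot[0], new)

    def c(self, new):
        """Change (replace) dot."""
        self._replace(self.dot[0], self.dot[1], new)

    def d(self):
        """Delete dot, leaving an empty dot where the text was."""
        self._replace(self.dot[0], self.dot[1], "")

    def p(self):
        """Return the current text; dot stays put because p touched exactly dot."""
        return self.text[self.dot[0]:self.dot[1]]


buf = Buffer()
buf.a("This manual is organized\nin a rather haphazard manner.\n")
whole = buf.p()        # dot covers everything just appended
buf.i("Introduction\n")
heading = buf.p()      # dot is now only the inserted heading
```

After these calls,
.CW whole
holds all the appended text and
.CW heading
holds only the inserted line, which mirrors the session shown above.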
279.PP
280Things are not going to get very interesting until we can
281set dot arbitrarily.
282This is done by
283.I addresses ,
284which specify a piece of the file.
285The address
286.CW 1 ,
287for example, sets dot to the first line of the file.
288.P1
289.WC 1p
290Introduction
291.WC c
292.WC Preamble
293.WC .
294.P2
295The
296.CW c
297command didn't need to specify dot; the
298.CW p
299left it on line one.
300It's therefore easy to delete the first line utterly;
301the last command left dot set to line one:
302.P1
303.WC d
304.WC 1p
305This manual is organized in a rather haphazard manner.  The first
306.P2
307(Line numbers change
308to reflect changes to the file.)
309.PP
310The address \f(CW/\f2text\f(CW/\f1
311sets dot to the first appearance of
312.I text ,
313after dot.
314.CW ed "" [
315matches the first line containing
316.I text .]
317If
318.I text
is not found before the end of the file, the search wraps around to
the beginning of the file and continues as far as dot.
321.P1
322.WC /Emacs/p
323Emacs
324.P2
325It's difficult to indicate typographically, but in this example no newline appears
326after
327.CW Emacs :
328the text to be printed is the string
329.CW Emacs ', `
330exactly.
331(The final
332.CW p
333may be left off \(em it is the default command.
334When downloaded, however, the default is instead to select the text,
335to highlight it,
336and to make it visible by moving the window on the file if necessary.
337Thus,
338.CW /Emacs/
339indicates on the display the next occurrence of the text.)
340.PP
341Imagine we wanted to change the word
342.CW haphazard
343to
344.CW thoughtless .
345Obviously, what's needed is another
346.CW c
347command, but the method used so far to insert text includes a newline.
348The syntax for including text without newlines is to surround the
349text with slashes (which is the same as the syntax for
350text searches, but what is going on should be clear from context).
351The text must appear immediately after the
352.CW c
353(or
354.CW a
355or
356.CW i ).
357Given this, it is easy to make the required change:
358.P1
359.WC /haphazard/c/thoughtless/
360.WC 1p
361This manual is organized in a rather thoughtless manner.  The first
362.P2
363[Changes can always be done with a
364.CW c
365command, even if the text is smaller than a line].
366You'll find that this way of providing text to commands is much
367more common than is the multiple-lines syntax.
368If you want to include a slash
369.CW /
370in the text, just precede it with a backslash
371.CW \e ,
372and use a backslash to protect a backslash itself.
373.P1
374.WC /Emacs/c/Emacs\e\e360/
.WC 3p
376general introduction to the commands in Emacs\e360 and to try to show
377.P2
378We could also make this particular change by
379.P1
380.WC /Emacs/a/\e\e360/
381.P2
382.PP
383This is as good a place as any to introduce the
384.CW u
385command, which undoes the last command.
386A second
387.CW u
388will undo the penultimate command, and so on.
389.P1
390.WC u
.WC 3p
392general introduction to the commands in Emacs and to try to show
393.WC u
.WC 1p
395This manual is organized in a rather haphazard manner.  The first
396.P2
397Undoing can only back up; there is no way to undo a previous
398.CW u .
399.SH
400Addresses
401.PP
402We've seen the simplest forms of addresses, but there is more
403to learn before we can get too much further.
404An address selects a region in the file \(em a substring \(em
405and therefore must define the beginning and the end of a region.
406Thus, the address
407.CW 13
408selects from the beginning of line thirteen to the end of line thirteen, and
409.CW /Emacs/
410selects from the beginning of the word
411.CW Emacs ' `
412to the end.
413.PP
414Addresses may be combined with a comma:
415.P1
41613,15
417.P2
418selects lines thirteen through fifteen.  The definition of the comma
419operator is to select from the beginning of the left hand address (the
420beginning of line 13) to the end of the right hand address (the
421end of line 15).
422.PP
423A few special simple addresses come in handy:
424.CW .
425(a period) represents dot, the current text,
426.CW 0
427(line zero) selects the null string at the beginning of the file, and
428.CW $
429selects the null string at the end of the file
430[not the last line of the file].
431Therefore,
432.P1
4330,13
434.P2
435selects from the beginning of the file to the end of line thirteen,
436.P1
437\&.,$
438.P2
439selects from the beginning of the current text to the end of the file, and
440.P1
4410,$
442.P2
443selects the whole file [that is, a single string containing the whole file,
444not a list of all the lines in the file].
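.PP
As a rough sketch of these definitions, the absolute addresses can be
modelled in Python as functions that return a pair of character offsets.
The names
.CW line ,
.CW end_of_file
and
.CW comma
are invented for this illustration, and each line is assumed to include
its terminating newline.

```python
def line(text, n):
    """Range of line n (1-based); line 0 is the null string at offset 0."""
    if n == 0:
        return (0, 0)
    start = 0
    for _ in range(n - 1):
        # Step over one complete line at a time.
        nl = text.find("\n", start)
        if nl < 0:
            raise ValueError("address out of range")
        start = nl + 1
    if start >= len(text):
        raise ValueError("address out of range")
    nl = text.find("\n", start)
    end = len(text) if nl < 0 else nl + 1
    return (start, end)


def end_of_file(text):
    """Model of $: the null string at the end of the file."""
    return (len(text), len(text))


def comma(left, right):
    """Model of a,b: from the beginning of a to the end of b."""
    return (left[0], right[1])


sample = "one\ntwo\nthree\n"
whole_file = comma(line(sample, 0), end_of_file(sample))   # 0,$
first_two = comma(line(sample, 1), line(sample, 2))        # 1,2
```

Note that
.CW whole_file
is a single range covering the whole string, not a list of lines,
which is the point made in the bracketed remark above.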
445.PP
446These are all
447.I absolute
448addresses: they refer to specific places in the file.
449.CW sam
450also has relative addresses, which depend
451on the value of dot,
452and in fact we have already seen one form:
453.CW /Emacs/
454finds the first occurrence of
455.CW Emacs
456searching forwards from dot.
457Which occurrence of
458.CW Emacs
459it finds depends on the value of dot.
460What if you wanted the first occurrence
.I before
462dot?  Just precede the pattern with a minus sign, which reverses the direction
463of the search:
464.P1
465-/Emacs/
466.P2
467In fact, the complete syntax for forward searching is
468.P1
469+/Emacs/
470.P2
471but the plus sign is the default, and in practice is rarely used.
472Here is an example that includes it for clarity:
473.P1
4740+/Emacs/
475.P2
476selects the first occurrence of
477.CW Emacs
478in the file; read it as ``go to line 0, then search forwards for
479.CW Emacs .''
480Since the
481.CW +
482is optional, this can be written
483.CW 0/Emacs/ .
484Similarly,
485.P1
486$-/Emacs/
487.P2
488finds the last occurrence in the file, so
489.P1
4900/Emacs/,$-/Emacs/
491.P2
492selects the text from the first to last
493.CW Emacs ,
494inclusive.
495Slightly more interesting:
496.P1
497/Emacs/+/Emacs/
498.P2
499(there is an implicit
500.CW .+
501at the beginning) selects the second
502.CW Emacs
503following dot.
504.PP
505Line numbers may also be relative.
506.P1
507-2
508.P2
509selects the second previous line, and
510.P1
511+5
512.P2
513selects the fifth following line (here the plus sign is obligatory).
514.PP
515Since addresses may select (and dot may be) more than one line,
516we need a definition of `previous' and `following:'
517`previous' means
518.I
519before the beginning
520.R
521of dot, and `following'
522means
523.I
524after the end
525.R
526of dot.
527For example, if the file contains \f(CWA\f(CIAA\f(CWA\f1,
528with dot set to the middle two
529.CW A 's
530(the slanting characters),
531.CW -/A/
532sets dot to the first
533.CW A ,
534and
535.CW +/A/
536sets dot to the last
537.CW A .
538Except under odd circumstances (such as when the only occurrence of the
539text in the file is already the current text), the text selected by a
540search will be disjoint from dot.
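.PP
The definitions of `previous' and `following' can be illustrated with a
small Python model.
It is a sketch under simplifying assumptions: the function names are
hypothetical, Python's
.CW re
syntax stands in for
.CW sam 's
regular expressions, and a backward search is approximated by taking the
last of the non-overlapping matches that lie wholly before dot, which is
adequate for literal patterns such as those used here.

```python
import re


def search_forward(text, dot, pattern):
    """Model of +/pattern/: first match after the end of dot, wrapping around."""
    rx = re.compile(pattern)
    m = rx.search(text, dot[1]) or rx.search(text)
    return m.span() if m else None


def search_backward(text, dot, pattern):
    """Model of -/pattern/: last match lying wholly before the beginning of dot."""
    rx = re.compile(pattern)
    spans = [m.span() for m in rx.finditer(text, 0, dot[0])]
    if not spans:
        # Nothing before dot: wrap around and take the last match in the file.
        spans = [m.span() for m in rx.finditer(text)]
    return spans[-1] if spans else None


# The file AAAA with dot set to the middle two characters.
middle = (1, 3)
before = search_backward("AAAA", middle, "A")   # the first A
after = search_forward("AAAA", middle, "A")     # the last A
```

Applying
.CW search_forward
twice, feeding the first result back in as dot, models the address
.CW /Emacs/+/Emacs/ .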
541.PP
542To select the
543.CW "troff -ms
544paragraph containing dot, however long it is, use
545.P1
546-/.PP/,/.PP/-1
547.P2
548which will include the
549.CW .PP
550that begins the paragraph, and exclude the one that ends it.
551.PP
552When typing relative line number addresses, the default number is
553.CW 1 ,
554so the above could be written slightly more simply:
555.P1
556-/.PP/,/.PP/-
557.P2
558.PP
559What does the address
560.CW +1-1
561or the equivalent
562.CW +-
563mean?  It looks like it does nothing, but recall that dot need not be a
564complete line of text.
565.CW +1
566selects the line after the end of the current text, and
567.CW -1
568selects the line before the beginning.  Therefore
569.CW +1-1
570selects the line before the line after the end of dot, that is,
571the complete line containing the end of dot.
572We can use this construction to expand a selection to include a complete line,
573say the first line in the file containing
574.CW Emacs :
575.P1
576.WC 0/Emacs/+-p
577general introduction to the commands in Emacs and to try to show
578.P2
579The address
580.CW +-
581is an idiom.
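.PP
To see why
.CW +-
yields the complete line containing the end of dot, the two steps can be
modelled in Python.
This is a sketch of the behaviour described above, not
.CW sam 's
own code; the function names are hypothetical and dot is assumed to be a
pair of character offsets.

```python
TEXT = (
    "This manual is organized in a rather haphazard manner.  The first\n"
    "several sections were written hastily in an attempt to provide a\n"
    "general introduction to the commands in Emacs and to try to show\n"
    "the method in the madness that is the Emacs command structure.\n"
)


def line_after(text, end):
    """Model of +1: the complete line that follows offset `end`."""
    if end > 0 and text[end - 1] != "\n":
        # Dot stops in mid-line: skip the remainder of that line first.
        nl = text.find("\n", end)
        if nl < 0:
            raise ValueError("address out of range")
        end = nl + 1
    nl = text.find("\n", end)
    return (end, len(text) if nl < 0 else nl + 1)


def line_before(text, start):
    """Model of -1: the complete line that precedes offset `start`."""
    stop = text.rfind("\n", 0, start) + 1   # start of the line holding `start`
    if stop == 0:
        return (0, 0)                       # nothing earlier: line 0
    return (text.rfind("\n", 0, stop - 1) + 1, stop)


def plus_minus(text, dot):
    """Model of +-: the line before the line after the end of dot."""
    following = line_after(text, dot[1])
    return line_before(text, following[0])


first_emacs = (TEXT.index("Emacs"), TEXT.index("Emacs") + 5)
start, end = plus_minus(TEXT, first_emacs)
containing_line = TEXT[start:end]
```

Here
.CW containing_line
is the third line of the sample text, the same line printed by
.CW 0/Emacs/+-p
above.
When dot is already a complete line, the model returns that same line.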
582.SH
583Loops
584.PP
585Above, we changed one occurrence of
586.CW Emacs
587to
588.CW Emacs\e360 ,
589but if the name of the editor is really changing, it would be useful
590to change
591.I all
592instances of the name in a single command.
593.CW sam
594provides a command,
595.CW x
596(extract), for just that job.
597The syntax is
598\f(CWx/\f2pattern\f(CW/\f2command\f1.
599For each occurrence of the pattern in the selected text,
600.CW x
601sets dot to the occurrence and runs command.
602For example, to change
603.CW Emacs
604to
.CW vi ,
606.P1
607.WC 0,$x/Emacs/c/vi/
608.WC 0,$p
609This manual is organized in a rather haphazard manner.  The first
610several sections were written hastily in an attempt to provide a
611general introduction to the commands in vi and to try to show
612the method in the madness that is the vi command structure.
613.P2
614This
615works by subdividing the current text
616.CW 0,$ "" (
617\(em the whole file) into appearances of its textual argument
618.CW Emacs ), (
619and then running the command that follows
620.CW c/vi/ ) (
621with dot set to the text.
622We can read this example as, ``find all occurrences of
623.CW Emacs
624in the file, and for each one,
625set the current text to the occurrence and run the command
626.CW c/vi/ ,
627which will replace the current text by
.CW vi .''
629[This command is somewhat similar to
630.CW ed 's
631.CW g
632command.  The differences will develop below, but note that the
633default address, as always, is dot rather than the whole file.]
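.PP
The loop performed by
.CW x
can be sketched in Python.
The function
.CW x
below is a hypothetical model: it takes dot as a pair of offsets, uses
Python's
.CW re
syntax rather than
.CW sam 's,
and represents the command as a function from the matched text to its
replacement.

```python
import re

TEXT = (
    "This manual is organized in a rather haphazard manner.  The first\n"
    "several sections were written hastily in an attempt to provide a\n"
    "general introduction to the commands in Emacs and to try to show\n"
    "the method in the madness that is the Emacs command structure.\n"
)


def x(text, dot, pattern, command):
    """Model of x/pattern/command: run command on each match inside dot."""
    start, end = dot
    # re.sub visits the non-overlapping matches from left to right.
    changed = re.sub(pattern, lambda m: command(m.group(0)), text[start:end])
    return text[:start] + changed + text[end:]


whole = (0, len(TEXT))
as_vi = x(TEXT, whole, "Emacs", lambda match: "vi")            # 0,$x/Emacs/c/vi/
marked = x(TEXT, whole, "Emacs", lambda match: match + "{TM}")  # 0,$x/Emacs/a/{TM}/
```

Passing a smaller range as dot restricts the loop to that range, which
is why the default address matters.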
634.PP
635A single
636.CW u
637command is sufficient to undo an
638.CW x
639command, regardless of how many individual changes the
640.CW x
641makes.
642.P1
643.WC u
644.WC 0,$p
645This manual is organized in a rather haphazard manner.  The first
646several sections were written hastily in an attempt to provide a
647general introduction to the commands in Emacs and to try to show
648the method in the madness that is the Emacs command structure.
649.P2
650.PP
651Of course,
652.CW c
653is not the only command
654.CW x
655can run.  An
656.CW a
657command can be used to put proprietary markings on
658.CW Emacs :
659.P1
660.WC 0,$x/Emacs/a/{TM}/
661.WC /Emacs/+-p
662general introduction to the commands in Emacs{TM} and to try to show
663.P2
664[There is no way to see the changes as they happen, as in
665.CW ed 's
666.CW g/Emacs/s//&{TM}/p ;
667see the section on Multiple Changes, below.]
668.PP
669The
670.CW p
671command is also useful when driven by an
672.CW x ,
673but be careful that you say what you mean;
674.P1
675.WC 0,$x/Emacs/p
676EmacsEmacs
677.P2
678since
679.CW x
680sets dot to the text in the slashes, printing only that text
681is not going to be very
682informative.  But the command that
683.CW x
684runs can contain addresses.  For example, if we want to print all
685lines containing
686.CW Emacs ,
687just use
688.CW +- :
689.P1
690.WC 0,$x/Emacs/+-p
691general introduction to the commands in Emacs{TM} and to try to show
692the method in the madness that is the Emacs{TM} command structure.
693.P2
694Finally, let's restore the state of the file with another
695.CW x
696command, and make use of a handy shorthand:
697a comma in an address has its left side default to
698.CW 0 ,
699and its right side default to
700.CW $ ,
701so the easy-to-type address
702.CW ,
703refers to the whole file:
704.P1
705.WC ",x/Emacs/ /{TM}/d
706.WC ,p
707This manual is organized in a rather haphazard manner.  The first
708several sections were written hastily in an attempt to provide a
709general introduction to the commands in Emacs and to try to show
710the method in the madness that is the Emacs command structure.
711.P2
712Notice what this
713.CW x
714does: for each occurrence of Emacs,
715find the
716.CW {TM}
717that follows, and delete it.
718.PP
719The `text'
720.CW sam
721accepts
722for searches in addresses and in
723.CW x
724commands is not simple text, but rather
725.I regular\ expressions.
726Unix has several distinct interpretations of regular expressions.
727The form used by
728.CW sam
729is that of
730.CW egrep (1),
731including parentheses
732.CW ()
733for grouping and an `or' operator
734.CW |
735for matching strings in parallel.
736.CW sam
737makes two extensions:
738although
739.CW .
740(the most overloaded character in Unix) matches any character
741.I except
742newline, the regular expression
743.CW @
744(think of it as a big dot) matches any character, even newlines;
745and the character sequence
746.CW \en
747matches a newline character.
748Replacement text, such as used in the
749.CW a
750and
751.CW c
752commands, is still plain text, but the sequence
753.CW \en
754represents newline in that context, too.
755.PP
756Here is an example.  Say we wanted to double space the document, that is,
757turn every newline into two newlines.
758The following all do the job:
759.P1
760.WC ",x/\en/ a/\en/
761.WC ",x/\en/ c/\en\en/
762.WC ",x/$/ a/\en/
763.WC ",x/^/ i/\en/
764.P2
765The last example is slightly different, because it puts a newline
766.I before
767each line; the other examples place it after.
768The first two examples manipulate newlines directly
769[something outside
770.CW ed 's
771ken]; the last two
772use regular expressions:
773.CW $
774is the empty string at the end of a line, while
775.CW ^
776is the empty string at the beginning.
777.PP
778These solutions all have a possible drawback: if there is already a blank line
779(that is, two consecutive newlines), they make it much larger (four
780consecutive newlines).
781A better method is to extend every group of newlines by one:
782.P1
783.WC ",x/\en+/ a/\en/
784.P2
785The regular expression operator
786.CW +
787means `one or more;'
788.CW \en+
789is identical to
790.CW \en\en* .
791Thus, this example
792takes every sequence of newlines and adds another
793to the end.
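.PP
The difference between the two approaches is easy to check with a short
Python sketch.
The function names are invented for the illustration, and Python's
.CW re.sub
stands in for the
.CW x
loop.

```python
import re


def double_space_naive(text):
    """Model of ,x/\\n/ a/\\n/ : every newline becomes two."""
    return re.sub(r"\n", "\n\n", text)


def double_space(text):
    """Model of ,x/\\n+/ a/\\n/ : every run of newlines grows by one."""
    return re.sub(r"\n+", lambda m: m.group(0) + "\n", text)


sample = "a\nb\n\nc\n"               # already has one blank line after b
naive = double_space_naive(sample)   # the blank line becomes three blank lines
better = double_space(sample)        # the blank line becomes two blank lines
```

With the naive version the existing pair of newlines turns into four;
with the second version it turns into three.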
794.PP
795A more common example is indenting a block of text by a tab stop.
796The following all work,
797although the first is arguably the cleanest (the blank text in slashes is a tab):
798.P1
.WC ",x/^/a/	/
.WC ",x/^/c/	/
.WC ",x/.*\en/i/	/
802.P2
803The last example uses the pattern (idiom, really)
804.CW .*\en
805to match lines:
806.CW .*
807matches the longest possible string of non-newline characters.
808Taking initial tabs away is just as easy:
809.P1
.WC ",x/^	/d
811.P2
812In these examples I have specified an address (the whole file), but
813in practice commands like these are more likely to be run without
814an address, using the value of dot set by selecting text with the mouse.
815.SH
816Conditionals
817.PP
818The
819.CW x
820command is a looping construct:
821for each match of a regular expression,
822it extracts (sets dot to) the match and runs a command.
823.CW sam
824also has a conditional,
825.CW g :
826\f(CWg/\f2pattern\f(CW/\f2command\f1
827runs the command if dot contains a match of the pattern
828.I
829without changing the value of dot.
830.R
831The inverse,
832.CW v ,
833runs the command if dot does
834.I not
835contain a match of the pattern.
836(The letters
837.CW g
838and
839.CW v
840are historical and have no mnemonic significance.  You might
841think of
842.CW g
843as `guard.')
844.CW ed "" [
845users should read the above definitions very carefully; the
846.CW g
847command in
848.CW sam
849is fundamentally different from that in
850.CW ed .]
851Here is an example of the difference between
852.CW x
853and
.CW g :
855.P1
856,x/Emacs/c/vi/
857.P2
858changes each occurrence of the word
859.CW Emacs
860in the file to the word
861.CW vi ,
862but
863.P1
864,g/Emacs/c/vi/
865.P2
866changes the
867.I "whole file
868to
869.CW vi
870if there is the word
871.CW Emacs
872anywhere in the file.
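.PP
The contrast can be made concrete with a small Python model.
The three functions are hypothetical stand-ins for
.CW x ,
.CW g
and
.CW v
followed by a
.CW c
command; dot is assumed to be a pair of character offsets.

```python
import re


def x_change(text, dot, pattern, new):
    """Model of x/pattern/c/new/: replace each match found inside dot."""
    start, end = dot
    return text[:start] + re.sub(pattern, lambda m: new, text[start:end]) + text[end:]


def g_change(text, dot, pattern, new):
    """Model of g/pattern/c/new/: replace all of dot if it contains a match."""
    start, end = dot
    if re.search(pattern, text[start:end]):
        return text[:start] + new + text[end:]
    return text


def v_change(text, dot, pattern, new):
    """Model of v/pattern/c/new/: replace all of dot if it contains no match."""
    start, end = dot
    if not re.search(pattern, text[start:end]):
        return text[:start] + new + text[end:]
    return text


sample = "Emacs and more Emacs\n"
whole = (0, len(sample))
looped = x_change(sample, whole, "Emacs", "vi")    # each occurrence changes
guarded = g_change(sample, whole, "Emacs", "vi")   # the whole of dot changes
```

In this model
.CW looped
keeps the surrounding words, while
.CW guarded
is just the two characters
.CW vi .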
873.PP
874Neither of these commands is particularly interesting in isolation,
875but they are valuable when combined with
876.CW x
877and with themselves.
878.SH
879Composition
880.PP
881One way to think about the
882.CW x
883command is that, given a selection (a value of dot)
884it iterates through interesting subselections (values of dot within).
885In other words, it takes a piece of text and cuts it into smaller pieces.
886But the text that it cuts up may already be a piece cut by a previous
887.CW x
888command or selected by a
889.CW g .
890.CW sam 's
891most interesting property is the ability to define a sequence of commands
892to perform a particular task.\(dg
893.FS
894\(dg
895The obvious analogy with shell pipelines is only partially valid,
896because the individual
897.CW sam
898commands are all working on the same text; it is only how the text is
899sliced up that is changing.
900.FE
901A simple example is to change all occurrences of
902.CW Emacs
903to
904.CW emacs ;
905certainly the command
906.P1
907.WC ",x/Emacs/ c/emacs/
908.P2
909will work, but we can use an
910.CW x
911command to save retyping most of the word
912.CW Emacs :
913.P1
914.WC ",x/Emacs/ x/E/ c/e/
915.P2
916(Blanks can be used
917to separate commands on a line to make them easier to read.)
918What this command does is find all occurrences of
919.CW Emacs
920.CW ,x/Emacs/ ), (
921and then
922.I
923with dot set to that text,
924.R
925find all occurrences of the letter
926.CW E
927.CW x/E/ ), (
928and then
929.I
930with dot set to that text,
931.R
932run the command
933.CW c/e/
934to change the character to lower case.
935Note that the address for the command \(em the whole file, specified by a comma
936\(em is only given to the leftmost
937piece of the command; the rest of the pieces have dot set for them by
938the execution of the pieces to their left.
939.PP
940As another simple example, consider a problem
941solved above: printing all lines in the file containing the word
.CW Emacs :
943.P1
944.WC ",x/.*\en/ g/Emacs/p
945general introduction to the commands in Emacs and to try to show
946the method in the madness that is the Emacs command structure.
947.P2
948This command says to break the file into lines
949.CW ,x/.*\en/ ), (
950and for each line that contains the string
951.CW Emacs
952.CW g/Emacs/ ), (
953run the command
954.CW p
955with dot set to the line (not the match of
956.CW Emacs ),
957which prints the line.
958To save typing, because
959.CW .*\en
960is a common pattern in
961.CW x
962commands,
963if the
964.CW x
965is followed immediately by a space, the pattern
966.CW .*\en
967is assumed.
968Therefore, the above could be written more succinctly:
969.P1
970.WC ",x g/Emacs/p
971.P2
972The solution we used before was
973.P1
974.WC ,x/Emacs/+-p
975.P2
976which runs the command
977.CW +-p
978with dot set to each match of
979.CW Emacs
980in the file (recall that the idiom
981.CW +-p
982prints the line containing the end of dot).
983.PP
984The two commands usually produce the same result
985(the
986.CW +-p
987form will print a line twice if it contains
988.CW Emacs
989twice).  Which is better?
990.CW ,x/Emacs/+-p
991is easier to type and will be much faster if the file is large and
992there are few occurrences of the string, but it is really an odd special case.
993.CW ",x/.*\en/ g/Emacs/p
994is slower \(em it breaks each line out separately, then examines
995it for a match \(em but is conceptually cleaner, and generalizes more easily.
996For example, consider the following piece of the Emacs manual:
997.P1
998command name="append-to-file", key="[unbound]"
999Takes the contents of the current buffer and appends it to the
1000named file. If the files doesn't exist, it will be created.
1001
1002command name="apropos", key="ESC-?"
1003Prompts for a keyword and then prints a list of those commands
1004whose short description contains that keyword.  For example,
1005if you forget which commands deal with windows, just type
1006"@b[ESC-?]@t[window]@b[ESC]".
1007
1008\&\f2and so on\f(CW
1009.P2
1010This text consists of groups of non-empty lines, with a simple format
1011for the text within each group.
1012Imagine that we wanted to find the description of the `apropos'
1013command.
1014The problem is to break the file into individual descriptions,
1015and then to find the description of `apropos' and to print it.
1016The solution is straightforward:
1017.P1
1018.WC ,x/(.+\en)+/\ g/command\ name="apropos"/p
1019command name="apropos", key="ESC-?"
1020Prompts for a keyword and then prints a list of those commands
1021whose short description contains that keyword.  For example,
1022if you forget which commands deal with windows, just type
1023"@b[ESC-?]@t[window]@b[ESC]".
1024.P2
1025The regular expression
1026.CW (.+\en)+
1027matches one or more lines with one or more characters each, that is,
1028the text between blank lines, so
1029.CW ,x/(.+\en)+/
1030extracts each description; then
1031.CW g/command\ name="apropos"/
1032selects the description for `apropos' and
1033.CW p
1034prints it.
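.PP
The same two-stage selection can be sketched in Python: first cut the
text into groups of non-empty lines, then keep the groups that pass the
guard.
The function
.CW descriptions
is hypothetical, the sample text is abbreviated from the excerpt above,
and a plain substring test is used for the guard to avoid quoting the
name as a regular expression.

```python
import re

MANUAL = (
    'command name="append-to-file", key="[unbound]"\n'
    "Takes the contents of the current buffer and appends it to the\n"
    "named file.\n"
    "\n"
    'command name="apropos", key="ESC-?"\n'
    "Prompts for a keyword and then prints a list of those commands\n"
    "whose short description contains that keyword.\n"
)


def descriptions(text, name):
    """Model of ,x/(.+\\n)+/ g/command name="NAME"/p ."""
    wanted = 'command name="%s"' % name
    found = []
    # Each match is one run of non-empty lines, i.e. one description.
    for group in re.finditer(r"(?:.+\n)+", text):
        if wanted in group.group(0):
            found.append(group.group(0))
    return found


apropos = descriptions(MANUAL, "apropos")
```

The guard sees each description as a whole, so the entire description is
returned, not just the line that contains the name.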
1035.PP
1036Imagine that we had a C program containing the variable
1037.CW n ,
1038but we wanted to change it to
1039.CW num .
1040This command is a first cut:
1041.P1
1042.WC ",x/n/ c/num/
1043.P2
1044but is obviously flawed: it will change all
1045.CW n 's
1046in the file, not just the
1047.I identifier
1048.CW n .
1049A better solution is to use an
1050.CW x
1051command to extract the identifiers, and then use
1052.CW g
1053to find the
1054.CW n 's:
1055.P1
1056.WC ",x/[a-zA-Z_][a-zA-Z_0-9]*/ g/n/ v/../ c/num/
1057.P2
1058It looks awful, but it's fairly easy to understand when read
1059left to right.
1060A C identifier is an alphabetic or underscore followed by zero or more
1061alphanumerics or underscores, that is, matches of the regular expression
1062.CW [a-zA-Z_][a-zA-Z_0-9]* .
1063The
1064.CW g
1065command selects those identifiers containing
1066.CW n ,
1067and the
1068.CW v
1069is a trick: it rejects those identifiers containing more than one
1070character.  Hence the
1071.CW c/num/
1072applies only to free-standing
1073.CW n 's.
1074.PP
1075There is still a problem here:
1076we don't want to change
1077.CW n 's
1078that are part of the character constant
1079.CW \en .
1080There is a command
1081.CW y ,
1082complementary to
1083.CW x ,
1084that is just what we need:
1085\f(CWy/\f2pattern\f(CW/\f2command\f1
1086runs the command on the pieces of text
1087.I between
1088matches of the pattern;
1089if
1090.CW x
1091selects,
1092.CW y
1093rejects.
1094Here is the final command:
1095.P1
1096.WC ",y/\e\en/ x/[a-zA-Z_][a-zA-Z_0-9]*/ g/n/ v/../ c/num/
1097.P2
1098The
1099.CW y/\e\en/
1100(with backslash doubled to make it a literal character)
1101removes the two-character sequence
1102.CW \en
1103from consideration, so the rest of the command will not touch it.
1104There is more we could do here; for example, another
1105.CW y
1106could be prefixed to protect comments in the code.
1107I won't elaborate the example any further, but you should have
1108an idea of the way in which the looping and conditional commands
1109in
1110.CW sam
1111may be composed to do interesting things.
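.PP
For comparison, the final command can be approximated in Python.
This is a sketch, not an equivalent:
.CW rename
is a hypothetical name, the pair
.CW g/n/
.CW v/../
is modelled as a test that the identifier is exactly the old name, and
comments and string contents other than the two-character sequence are
not protected.

```python
import re

IDENT = re.compile(r"[a-zA-Z_][a-zA-Z_0-9]*")


def rename(source, old="n", new="num"):
    """Rename free-standing identifiers, leaving backslash-n sequences alone."""
    out = []
    # y: split on a literal backslash followed by n.  The capturing group
    # makes re.split keep each separator at an odd index of the result.
    for index, piece in enumerate(re.split(r"(\\n)", source)):
        if index % 2 == 1:
            out.append(piece)   # rejected by y: copied through untouched
        else:
            # x over identifiers, then keep only those equal to `old`.
            out.append(IDENT.sub(
                lambda m: new if m.group(0) == old else m.group(0), piece))
    return "".join(out)


code = 'n = count; printf("%d\\n", n);'
renamed = rename(code)
```

Without the splitting step, the
.CW n
in the character constant would be seen as a one-letter identifier and
changed along with the variable.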
1112.SH
1113Grouping
1114.PP
1115There is another way to arrange commands.
1116By enclosing them in brace brackets
1117.CW {} ,
1118commands may be applied in parallel.
1119This example uses the
1120.CW =
1121command, which reports the line and character numbers of dot,
1122together with
1123.CW p ,
1124to report on appearances of
1125.CW Emacs
1126in our original file:
1127.P1
1128.WC ,p
1129This manual is organized in a rather haphazard manner.  The first
1130several sections were written hastily in an attempt to provide a
1131general introduction to the commands in Emacs and to try to show
1132the method in the madness that is the Emacs command structure.
1133.ft CI
1134,x/Emacs/{
1135	=
1136	+-p
1137}
1138.ft
11393; #171,#176
1140general introduction to the commands in Emacs and to try to show
11414; #234,#239
1142the method in the madness that is the Emacs command structure.
1143.P2
1144(The number before the semicolon is the line number;
1145the numbers beginning with
1146.CW #
1147are character numbers.)
1148As a more interesting example, consider changing all occurrences of
1149.CW Emacs
1150to
1151.CW vi
1152and vice versa.  We can type
1153.P1
1154.ft CI
1155,x/Emacs|vi/{
1156	g/Emacs/ c/vi/
1157	g/vi/ c/Emacs/
1158}
1159.ft
1160.P2
1161or even
1162.P1
1163.ft CI
1164,x/[a-zA-Z]+/{
1165	g/Emacs/ v/....../ c/vi/
1166	g/vi/ v/.../ c/Emacs/
1167}
1168.ft
1169.P2
1170to make sure we don't change strings embedded in words.
1171.SH
1172Multiple Changes
1173.PP
1174You might wonder why, once
1175.CW Emacs
1176has been changed to
1177.CW vi
1178in the above example,
1179the second command in the braces doesn't put it back again.
1180The reason is that the commands are run in parallel:
1181within any top-level
1182.CW sam
1183command, all changes to the file refer to the state of the file
1184before any of the changes in that command are made.
1185After all the changes have been determined, they are all applied
1186simultaneously.
1187.PP
1188This means, as mentioned, that commands within a compound
1189command see the state of the file before any of the changes apply.
1190This method of evaluation makes some things easier (such as the exchange of
1191.CW Emacs
1192and
1193.CW vi ),
1194and some things harder.
1195For instance, it is impossible to use a
1196.CW p
1197command to print the changes as they happen,
1198because they haven't happened when the
1199.CW p
1200is executed.
1201An indirect ramification is that changes must occur in forward
1202order through the file,
1203and must not overlap.
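.PP
The evaluation rule can be modelled outside
.CW sam .
The following is a minimal sketch in Python, not
.CW sam 's
implementation: the function names are hypothetical, Python's
.CW re
module stands in for
.CW sam 's
regular expressions, and the two
.CW if
tests play the role of the
.CW g
guards in the exchange example.
Every change is decided against the original text, and only then are
the changes applied, in forward order and without overlap.
.P1
```python
import re


def parallel_changes(text, changes):
    """Apply (start, end, replacement) changes, all addressed against `text`."""
    out = []
    pos = 0  # end of the previous change, in original coordinates
    for start, end, replacement in changes:
        if start < pos:
            raise ValueError("changes must be in forward order and must not overlap")
        out.append(text[pos:start])  # untouched text between changes
        out.append(replacement)
        pos = end
    out.append(text[pos:])
    return "".join(out)


def swap_words(text, a, b):
    """Exchange a and b: decide every change first, then apply them together."""
    changes = []
    for m in re.finditer(re.escape(a) + "|" + re.escape(b), text):
        # Both tests look at the original match, never at already-changed text.
        if a in m.group():
            changes.append((m.start(), m.end(), b))
        if b in m.group():
            changes.append((m.start(), m.end(), a))
    return parallel_changes(text, changes)


print(swap_words("Emacs beats vi", "Emacs", "vi"))
```
.P2
Because the second test never sees the output of the first,
nothing is put back; a pair of changes that touched the same text
would be rejected as overlapping instead.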
1204.SH
1205Unix
1206.PP
1207.CW sam
1208has a few commands to connect to Unix processes.
1209The simplest is
1210.CW ! ,
1211which runs the command with input and output connected to the terminal.
1212.P1
1213.WC !date
1214Wed May 28 23:25:21 EDT 1986
1215!
1216.P2
1217(When downloaded, the input is connected to
1218.CW /dev/null
1219and only the first few lines of output are printed;
1220any overflow is stored in
1221.CW $HOME/sam.err .)
1222The final
1223.CW !
1224is a prompt to indicate when the command completes.
1225.PP
1226Slightly more interesting is
1227.CW > ,
1228which provides the current text as standard input to the Unix command:
1229.P1
1230.WC "1,2 >wc
1231      2       22      131
1232!
1233.P2
1234The complement of
1235.CW >
1236is, naturally,
1237.CW < :
1238it replaces the current text with the standard output of the Unix command:
1239.P1
1240.WC "1 <date
1241!
1242.WC 1p
1243Wed May 28 23:26:44 EDT 1986
1244.P2
1245The last command is
1246.CW | ,
1247which is a combination of
1248.CW <
1249and
1250.CW > :
1251the current text is provided as standard input to the Unix command,
1252and the Unix command's standard output is collected and used to
1253replace the original text.
1254For example,
1255.P1
1256.WC ",| sort
1257.P2
1258runs
1259.CW sort (1)
1260on the file, sorting the lines of the text lexicographically.
1261Note that
1262.CW < ,
1263.CW >
1264and
1265.CW |
1266are
1267.CW sam
1268commands, not Unix shell operators.
1269.PP
1270The next example converts all appearances of
1271.CW Emacs
1272to upper case using
1273.CW tr (1):
1274.P1
1275.WC ",x/Emacs/ | tr a-z A-Z
1276.P2
1277.CW tr
1278is run once for each occurrence of
1279.CW Emacs .
1280Of course, you could do this example more efficiently with a simple
1281.CW c
1282command, but here's a trickier one:
1283given a Unix mail box as input,
1284convert all the
1285.CW Subject
1286headers to distinct fortunes:
1287.P1
1288.WC ",x/^Subject:.*\en/ x/[^:]*\en/ < /usr/games/fortune
1289.P2
1290(The regular expression
1291.CW [^:]
1292refers to any character
1293.I except
1294.CW :
1295and newline; the negation operator
1296.CW ^
1297excludes newline from the list of characters.)
1298Again,
1299.CW /usr/games/fortune
1300is run once for each
1301.CW Subject
1302line, so each
1303.CW Subject
1304line is changed to a different fortune.
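.PP
The once-per-match behaviour of
.CW x
combined with
.CW |
can be imitated in a few lines of Python.
This is only a sketch under stated assumptions:
.CW pipe_each
and
.CW UPPER
are hypothetical names, Python's
.CW re
module stands in for
.CW sam 's
regular expressions, and a small Python child process takes the place of
.CW tr
so that the example does not depend on which Unix tools are installed.
.P1
```python
import re
import subprocess
import sys


def pipe_each(text, pattern, argv):
    """Run argv once per match; each match is replaced by the command's output."""
    def run(match):
        result = subprocess.run(
            argv,
            input=match.group(),   # the matched text becomes standard input
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout       # standard output replaces the match

    return re.sub(pattern, run, text)


# Stand-in for `tr a-z A-Z`: a child process that upper-cases its input.
UPPER = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

print(pipe_each("commands in Emacs and more Emacs", "Emacs", UPPER))
```
.P2
On a Unix system the list
.CW UPPER
could be replaced by the argument list for
.CW tr
itself; the point is only that a fresh process runs for every match.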
1305.SH
1306A few other text commands
1307.PP
1308For completeness, I should mention three other commands that
1309manipulate text.  The
1310.CW m
1311command moves the current text to after the text specified by the
1312(obligatory) address after the command.
1313Thus
1314.P1
1315.WC "/Emacs/+- m 0
1316.P2
1317moves the next line containing
1318.CW Emacs
1319to the beginning of the file.
1320Similarly,
1321.CW t
1322(another historic character) copies the text:
1323.P1
1324.WC "/Emacs/+- t 0
1325.P2
1326would make, at the beginning of the file, a copy of the next line
1327containing
1328.CW Emacs .
1329.PP
1330The third command is more interesting: it makes substitutions.
1331Its syntax is
1332\f(CWs/\f2pattern\f(CW/\f2replacement\f(CW/\f1.
1333Within the current text, it finds the first occurrence of
1334the pattern and replaces it by the replacement text,
1335leaving dot set to the entire address of the substitution.
1336.P1
1337.WC 1p
1338This manual is organized in a rather haphazard manner.  The first
1339.WC s/haphazard/thoughtless/
1340.WC p
1341This manual is organized in a rather thoughtless manner.  The first
1342.P2
1343Occurrences of the character
1344.CW &
1345in the replacement text stand for the text matching the pattern.
1346.P1
1347.WC s/T/"&&&&"/
1348.WC p
1349"TTTT"his manual is organized in a rather thoughtless manner.  The first
1350.P2
1351There are two variants.  The first is that a number may be specified
1352after the
1353.CW s ,
1354to indicate which occurrence of the pattern to substitute; the default
1355is the first.
1356.P1
1357.WC s2/is/was/
1358.WC p
1359"TTTT"his manual was organized in a rather thoughtless manner.  The first
1360.P2
1361The second is that suffixing a
1362.CW g
1363(global) causes replacement of all occurrences, not just the first.
1364.P1
1365.WC s/[a-zA-Z]/x/g
1366.WC p
1367"xxxx"xxx xxxxxx xxx xxxxxxxxx xx x xxxxxx xxxxxxxxxxx xxxxxx.  xxx xxxxx
1368.P2
1369Notice that in all these examples
1370dot is left
1371set to the entire line.
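.PP
The three forms of the substitute command can be summarized by a small model.
The sketch below is in Python and is not how
.CW sam
does it:
.CW substitute
is a hypothetical helper, Python's
.CW re
module stands in for
.CW sam 's
regular expressions, and any escaping of
.CW &
is ignored.
.P1
```python
import re


def substitute(dot, pattern, replacement, n=1, every=False):
    """Replace the n-th match of pattern in dot, or every match if every=True."""
    count = 0

    def expand(match):
        nonlocal count
        count += 1
        if every or count == n:
            # In this model '&' in the replacement stands for the matched text.
            return replacement.replace("&", match.group())
        return match.group()  # other occurrences are left untouched

    return re.sub(pattern, expand, dot)


line = "This manual is organized in a rather haphazard manner.  The first"
line = substitute(line, "haphazard", "thoughtless")   # plain s
line = substitute(line, "T", '"&&&&"')                # & repeats the match
line = substitute(line, "is", "was", n=2)             # s2: second occurrence
print(line)
print(substitute(line, "[a-zA-Z]", "x", every=True))  # g suffix: all matches
```
.P2
The second occurrence of
.CW is
is the standalone word, because the first one is inside
.CW his .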
1372.PP
1373[The substitute command is vital to
1374.CW ed ,
1375because it is the only way to make changes within a line.
1376It is less valuable in
1377.CW sam ,
1378in which the concept of a line is much less important.
1379For example, many
1380.CW ed
1381substitution idioms are handled well by
1382.CW sam 's
1383basic commands. Consider the commands
1384.P1
1385s/good/bad/
1386s/good//
1387s/good/& bye/
1388.P2
1389which are equivalent in
1390.CW sam
1391to
1392.P1
1393/good/c/bad/
1394/good/d
1395/good/a/ bye/
1396.P2
1397and for which the context search is likely unnecessary because the desired
1398text is already dot.
1399Also, beware this
1400.CW ed
1401idiom:
1402.P1
14031,$s/good/bad/
1404.P2
1405which changes the first
1406.CW good
1407on each line; the same command in
1408.CW sam
1409will only change the first one in the whole file.
1410The correct
1411.CW sam
1412version is
1413.P1
1414,x s/good/bad/
1415.P2
1416but what is more likely meant is
1417.P1
1418,x/good/ c/bad/
1419.P2
1420.CW sam
1421operates under different rules.]
1422.SH
1423Files
1424.PP
1425So far, we have only been working with a single file,
1426but
1427.CW sam
1428is a multi-file editor.
1429Only one file may be edited at a time, but
1430it is easy to change which file is the `current' file for editing.
1431To see how, we need a
1432.CW sam
1433with a few files;
1434the easiest way to get one is to start it
1435with a list of Unix file names to edit.
1436.P1
1437$ \f(CIecho *.ms\f(CW
1438conquest.ms death.ms emacs.ms famine.ms slaughter.ms
1439$ \f(CIsam -d *.ms\f(CW
1440 -. conquest.ms
1441.P2
1442(I'm sorry the Horsemen don't appear in liturgical order.)
1443The line printed by
1444.CW sam
1445is an indication that the Unix file
1446.CW conquest.ms
1447has been read, and is now the current file.
1448.CW sam
1449does not read the Unix file until
1450the associated
1451.CW sam
1452file becomes current.
1453.PP
1454The
1455.CW n
1456command prints the names of all the files:
1457.P1
1458.WC n
1459 -. conquest.ms
1460 -  death.ms
1461 -  emacs.ms
1462 -  famine.ms
1463 -  slaughter.ms
1464.P2
1465This list is also available in the menu on mouse button 3.
1466The command
1467.CW f
1468tells the name of just the current file:
1469.P1
1470.WC f
1471 -. conquest.ms
1472.P2
1473The characters to the left of the file name encode helpful information about
1474the file.
1475The minus sign becomes a plus sign if the file has a window open, and an
1476asterisk if more than one is open.
1477The period (another meaning of dot) identifies the current file.
1478The leading blank changes to an apostrophe if the file is different
1479from the contents of the associated Unix file, as far as
1480.CW sam
1481knows.
1482This becomes evident if we make a change.
1483.P1
1484.WC 1d
1485.WC f
1486\&'-. conquest.ms
1487.P2
1488If the file is restored by an undo command, the apostrophe disappears.
1489.P1
1490.WC u
1491.WC f
1492 -. conquest.ms
1493.P2
1494The file name may be changed by providing a new name with the
1495.CW f
1496command:
1497.P1
1498.WC "f pestilence.ms
1499\&'-. pestilence.ms
1500.P2
1501.CW f
1502prints the new status of the file,
1503that is, it changes the name if one is provided, and prints the
1504name regardless.
1505A file name change may also be undone.
1506.P1
1507.WC u
1508.WC f
1509 -. conquest.ms
1510.P2
1511.PP
1512When
1513.CW sam
1514is downloaded, the current file may be changed simply by selecting
1515the desired file from the menu (selecting the same file subsequently
1516cycles through the windows opened on the file).
1517Otherwise, the
1518.CW b
1519command can be used to choose the desired file:\(dg
1520.FS
1521\(dg A bug prevents the
1522.CW b
1523command from working when downloaded.
1524Because the menu is more convenient anyway, and
1525because the method
1526of choosing files from the command language is slated to change,
1527the bug hasn't been fixed.
1528.FE
1529.P1
1530.WC "b emacs.ms
1531 -. emacs.ms
1532.P2
1533Again,
1534.CW sam
1535prints the name (actually, executes an implicit
1536.CW f
1537command) because the Unix file
1538.CW emacs.ms
1539is being read for the first time.
1540It is an error to ask for a file
1541.CW sam
1542doesn't know about, but the
1543.CW B
1544command will prime
1545.CW sam 's
1546menu with a new file, and make it current.
1547.P1
1548.WC "b flood.pic
1549?no such file `flood.pic'
1550.WC "B flood.pic
1551 -. flood.pic
1552.WC n
1553 -  conquest.ms
1554 -  death.ms
1555 -  emacs.ms
1556 -  famine.ms
1557 -. flood.pic
1558 -  slaughter.ms
1559.P2
1560Both
1561.CW b
1562and
1563.CW B
1564will accept a list of file names.
1565.CW b
1566simply takes the first file in the list, but
1567.CW B
1568loads them all.
1569The list may be typed on one line \(em
1570.P1
1571.WC "B devil.tex satan.tex 666.tex emacs.tex
1572.P2
1573\(em or generated by a Unix command \(em
1574.P1
1575.WC "B <echo *.tex
1576.P2
1577The latter form requires a Unix command;
1578.CW sam
1579does not understand the shell file name metacharacters, so
1580.CW "B *.tex
1581attempts to load a single file named
1582.CW *.tex .
1583(The
1584.CW <
1585form is of course derived from
1586.CW sam 's
1587.CW <
1588command.)
1589.CW echo
1590is not the only useful command to run subservient to
1591.CW B ;
1592for example,
1593.P1
1594.WC "B <grep -l Emacs *
1595.P2
1596will load only those files containing the string
1597.CW Emacs .
1598Finally, a special case: a
1599.CW B
1600with no arguments creates an empty, nameless file within
1601.CW sam .
1602.PP
1603The complement of
1604.CW B
1605is
1606.CW D :
1607.P1
1608.WC "D devil.tex satan.tex 666.tex emacs.tex
1609.P2
1610eradicates the files from
1611.CW sam 's
1612memory (not from the Unix machine's disc).
1613.CW D
1614without any file names removes the current file from
1615.CW sam .
1616.PP
1617There are three other commands that relate the current file
1618to Unix files.
1619The
1620.CW w
1621command writes the file to disc;
1622without arguments, it writes the entire file to the Unix file associated
1623with the current file in
1624.CW sam
1625(it is the only command whose default address is not dot).
1626Of course, you can specify an address to be written,
1627and a different file name, with the obvious syntax:
1628.P1
1629.WC "1,2w /tmp/revelations
1630/tmp/revelations: #44
1631.P2
1632.CW sam
1633responds with the file name and the number of characters written to the file.
1634The
1635.CW write
1636command on the button 3 menu is identical in function to an unadorned
1637.CW w
1638command.
1639.PP
1640The other two commands,
1641.CW e
1642and
1643.CW r ,
1644read data from Unix files.
1645The
1646.CW e
1647command clears out the current file,
1648reads the data from the named file (or uses the current file's old name if
1649none is explicitly provided), and sets the file name.
1650It's much like a
1651.CW B
1652command, but puts the information in the current file instead of a new one.
1653.CW e
1654without any file name is therefore an easy way to refresh
1655.CW sam 's
1656copy of a Unix file.
1657[Unlike in
1658.CW ed ,
1659.CW e
1660doesn't complain if the file is modified.  The principle is not
1661to protect against things that can be undone if wrong.]
1662Since its job is to replace the whole text,
1663.CW e
1664never takes an address.
1665.PP
1666The
1667.CW r
1668command is like
1669.CW e ,
1670but it doesn't clear the file:
1671the text in the Unix file replaces dot, or the specified text if an
1672address is given.
1673.P1
1674.WC "r emacs.ms
1675.P2
1676has essentially the effect of
1677.P1
1678.WC "<cat emacs.ms
1679.P2
1680The commands
1681.CW r
1682and
1683.CW w
1684will set the name of the file if the current file has no name already defined;
1685.CW e
1686sets the name even if the file already has one.
1687.PP
1688There is a command, analogous to
1689.CW x ,
1690that iterates over files instead of pieces of text:
1691.CW X
1692(capital
1693.CW x ).
1694The syntax is easy; it's just like that of
1695.CW x
1696\(em \f(CWX/\f2pattern\f(CW/\f2command\f1.
1697(The complementary command is
1698.CW Y ,
1699analogous to
1700.CW y .)
1701The effect is to run the command in each file whose menu entry
1702(that is, whose line printed by an
1703.CW f
1704command) matches the pattern.
1705For example, since an apostrophe identifies modified files,
1706.P1
1707.WC "X/'/ w
1708.P2
1709writes the changed files out to disc.
1710Here is a longer example: find all uses of a particular variable
1711in the C source files:
1712.P1
1713.WC "X/\e.c$/ ,x/variable/+-p
1714.P2
1715We can use an
1716.CW f
1717command to identify which file the variable appears in:
1718.P1
1719.ft CI
1720X/\e.c$/ ,g/variable/ {
1721	f
1722	,x/variable/+-{
1723		=
1724		p
1725	}
1726}
1727.ft
1728.P2
1729Here, the
1730.CW g
1731command guarantees that only the names of files containing the variable
1732will be printed (but beware that
1733.CW sam
1734may confuse matters by printing the names of files it reads in during
1735the command).
1736The
1737.CW =
1738command shows where in the file the variable appears, and the
1739.CW p
1740command prints the line.
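.PP
Since
.CW X
matches its pattern against the menu line rather than the bare file name,
it may help to see how such a line is put together.
The following Python sketch is an illustration only: the
.CW File
record and the functions
.CW menu_line
and
.CW select
are hypothetical, the line layout is inferred from the examples shown earlier,
and Python's
.CW re
module stands in for
.CW sam 's
regular expressions.
.P1
```python
import re
from dataclasses import dataclass


@dataclass
class File:
    name: str
    modified: bool = False   # differs from the Unix file, as far as we know
    windows: int = 0         # number of windows open on the file
    current: bool = False


def menu_line(f):
    """Build the status line that the f and n commands print for a file."""
    mod = "'" if f.modified else " "
    win = "-" if f.windows == 0 else "+" if f.windows == 1 else "*"
    cur = "." if f.current else " "
    return f"{mod}{win}{cur} {f.name}"


def select(files, pattern):
    """Model of X/pattern/: keep the files whose menu line matches."""
    return [f for f in files if re.search(pattern, menu_line(f))]


files = [
    File("conquest.ms", current=True),
    File("death.ms", modified=True),
    File("main.c"),
]
for f in files:
    print(menu_line(f))
print([f.name for f in select(files, "'")])      # modified files
print([f.name for f in select(files, "[.]c$")])  # C source files
```
.P2
In this model, as the description of
.CW X
suggests, a file whose name happens to contain an apostrophe is also selected
by the first pattern, whether or not it has been modified.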
1741.PP
1742The
1743.CW D
1744command is handy as the target of an
1745.CW X .
1746This example deletes from the menu all C files that do not contain
1747a particular variable:
1748.P1
1749.WC "X/\e.c$/ ,v/variable/ D
1750.P2
1751If no pattern is provided for the
1752.CW X ,
1753the command (which defaults to
1754.CW f )
1755is run in all files, so
1756.P1
1757.WC "X D
1758.P2
1759cleans
1760.CW sam
1761up for a fresh start.
1762.PP
1763But rather than working any further, let's stop now:
1764.P1
1765.WC q
1766$
1767.P2
1768.fi
1769.PP
1770Some of the file manipulating commands can be undone:
1771undoing an
1772.CW f ,
1773.CW e ,
1774or
1775.CW r
1776restores the previous state of the file,
1777but
1778.CW w ,
1779.CW B
1780and
1781.CW D
1782are irrevocable.
1783And, of course, so is
1784.CW q .
1785