#	@(#)quoting	5.5 (Berkeley) 11/12/94

QUOTING IN EX/VI:

There are four escape characters in historic ex/vi:

	\ (backslashes)
	^V
	^Q (assuming it wasn't used for IXON/IXOFF)
	The terminal literal next character.

Vi did not use the lnext character; it always used ^V (or ^Q).
^V and ^Q were equivalent in all cases for vi.

There are four different areas in ex/vi where escaping characters
is interesting:

	1: In vi text input mode.
	2: In vi command mode.
	3: In ex command and text input modes.
	4: In the ex commands themselves.

1: Vi text input mode (a, i, o, :colon commands, etc.):

   The set of characters that users might want to escape is as follows.
   As ^L and ^Z were not special in input mode, they are not listed.

	carriage return (^M)
	escape		(^[)
	autoindents	(^D, 0, ^, ^T)
	erase		(^H)
	word erase	(^W)
	line erase	(^U)
	newline		(^J)		(not historic practice)

   Historic practice was that ^V was the only way to escape any
   of these characters, and that whatever character followed
   the ^V was taken literally, e.g. ^V^V is a single ^V.  I
   don't see any strong reason to make it possible to escape
   ^J, so I'm going to leave that alone.
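
   The ^V rule above can be sketched as a toy input loop.  This is
   Python for illustration only, not nvi's C code; the function name
   input_line and the handful of special keys it knows about are
   hypothetical, and keeping a trailing ^V as a literal character is
   an assumption of the sketch.

```python
CTRL_V = "\x16"      # ^V, the vi literal-next character
ERASE = "\x08"       # ^H
LINE_ERASE = "\x15"  # ^U
ESC = "\x1b"         # ^[


def input_line(keys):
    """Return the text produced by typing `keys` in a toy input mode."""
    text = []
    it = iter(keys)
    for ch in it:
        if ch == CTRL_V:
            # Whatever follows ^V is taken literally, including another ^V.
            nxt = next(it, None)
            # Assumption: a ^V with nothing after it is kept as itself.
            text.append(ch if nxt is None else nxt)
        elif ch == ERASE:
            if text:
                text.pop()
        elif ch == LINE_ERASE:
            text.clear()
        elif ch == ESC:
            break  # leave input mode
        else:
            text.append(ch)
    return "".join(text)
```

   For example, input_line("ab\x16\x08c") keeps the ^H in the text,
   while input_line("ab\x08c") erases the "b".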

   One comment regarding the autoindent characters.  In historic
   vi, if you entered "^V0^D" autoindent erasure was still
   triggered, although it wasn't if you entered "0^V^D".  In
   nvi, if you escape either character, autoindent erasure is
   not triggered.

   Abbreviations were not performed if the non-word character
   that triggered the abbreviation was escaped by a ^V.  Input
   maps were not triggered if any part of the map was escaped
   by a ^V.

   The historic vi implementation for the 'r' command requires
   two leading ^V's to replace a character with a literal
   character.  This is obviously a bug, and should be fixed.

2: Vi command mode

   Command maps were not triggered if the second or later
   character of a map was escaped by a ^V.

   The obvious extension is that ^V should keep the next command
   character from being mapped, so you can do ":map x xxx" and
   then enter ^Vx to delete a single character.
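
   A minimal sketch of that extension, in Python for illustration
   only.  The name expand_maps is hypothetical, and the sketch
   assumes single-character map names, which is narrower than what
   :map really allows.

```python
CTRL_V = "\x16"  # ^V


def expand_maps(keys, maps):
    """Expand single-character command maps, except after a ^V."""
    out = []
    it = iter(keys)
    for ch in it:
        if ch == CTRL_V:
            nxt = next(it, None)
            if nxt is not None:
                # Escaped: the character is passed through unmapped.
                out.append(nxt)
            # Assumption: a ^V with nothing after it is dropped.
        else:
            out.append(maps.get(ch, ch))
    return "".join(out)
```

   With maps = {"x": "xxx"}, typing x yields "xxx", while typing
   ^Vx yields the single command character "x".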

3: Ex command and text input modes.

   As ex ran in canonical mode, there was little work that it
   needed to do for quoting.  The notable differences between
   ex and vi are that it was possible to escape a <newline> in
   the ex command and text input modes, and ex used the "literal
   next" character, not control-V/control-Q.

4: The ex commands:

   Ex commands are delimited by '|' or newline characters.
   Within the commands, whitespace characters delimit the
   arguments.  Backslash will generally escape any following
   character.  In the abbreviate, unabbreviate, map and unmap
   commands, control-V escapes the next character instead.

   This is historic behavior in vi, although there are special
   cases where it's impossible to escape a character, generally
   a whitespace character.

   Escaping characters in file names in ex commands:

	:cd [directory]				(directory)
	:chdir [directory]			(directory)
	:edit [+cmd] [file]			(file)
	:ex [+cmd] [file]			(file)
	:file [file]				(file)
	:next [file ...]			(file ...)
	:read [!cmd | file]			(file)
	:source [file]				(file)
	:write [!cmd | file]			(file)
	:wq [file]				(file)
	:xit [file]				(file)

   Since file names are also subject to word expansion, the
   underlying shell had better be doing the correct backslash
   escaping.  This is NOT historic behavior in vi, which made it
   impossible to insert a whitespace, newline or carriage return
   character into a file name.

   Escaping characters in non-file arguments in ex commands:

	:abbreviate word string			(word, string)
*	:edit [+cmd] [file]			(+cmd)
*	:ex [+cmd] [file]			(+cmd)
	:map word string			(word, string)
*	:set [option ...]			(option)
*	:tag string				(string)
	:unabbreviate word			(word)
	:unmap word				(word)

   These commands use whitespace to delimit their arguments, and use
   ^V to escape those characters.  The exceptions are starred in the
   above list, and are discussed below.

   In general, I intend to treat a ^V in any argument, followed by
   any character, as that literal character.  This will permit
   editing of a file named "foo|", for example, by using the string
   "foo\^V|", where the literal next character protects the pipe
   from the ex command parser and the backslash protects it from the
   shell expansion.
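
   A sketch of how the ex command parser side of that rule might
   look, in Python for illustration only.  The name
   split_ex_commands is hypothetical.  The sketch handles only ^V
   and '|'; it ignores the regular expression and +cmd special
   cases, and it leaves the backslash alone for the later shell
   expansion step.

```python
CTRL_V = "\x16"  # ^V


def split_ex_commands(line):
    """Split an ex command line on '|' characters not escaped by ^V."""
    commands, current = [], []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == CTRL_V and i + 1 < len(line):
            # ^V followed by any character: keep that character literally.
            current.append(line[i + 1])
            i += 2
            continue
        if ch == "|":
            commands.append("".join(current))
            current = []
        else:
            # Includes a ^V at the very end, which is used as itself.
            current.append(ch)
        i += 1
    commands.append("".join(current))
    return commands
```

   Given "edit foo\^V|", the result is the single command
   "edit foo\|", whose backslash is still there to protect the pipe
   from the shell.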

   This is backward compatible with historical vi, although there
   were a number of special cases where vi wasn't consistent.

4.1: The edit/ex commands:

   The edit/ex commands are a special case because | symbols may
   occur in the "+cmd" field, for example:

	:edit +10|s/abc/ABC/ file.c

   In addition, the edit and ex commands have historically
   ignored literal next characters in the +cmd string, so that
   the following command won't work.

	:edit +10|s/X/^V / file.c

   I intend to handle the literal next character in edit/ex consistently
   with how it is handled in other commands.

   More fun facts to know and tell:
	The acid test for the ex/edit commands:

		date > file1; date > file2
		vi
		:edit +1|s/./XXX/|w file1| e file2|1 | s/./XXX/|wq

	No version of vi of which I'm aware handles it.

4.2: The set command:

   The set command treats ^V's as literal characters, so the
   following command won't work.  Backslashes do work in this
   case, though, so the second version of the command does work.

	set tags=tags_file1^V tags_file2
	set tags=tags_file1\ tags_file2

   I intend to continue permitting backslashes in set commands,
   but to also permit literal next characters to work as well.
   This is backward compatible, but will also make set
   consistent with the other commands.  I think it's unlikely
   to break any historic .exrc's, given that there are probably
   very few files with ^V's in their names.
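
   A sketch of the intended behavior, where either a backslash or
   a ^V protects whitespace inside a set argument.  This is Python
   for illustration only; the name split_set_args is hypothetical
   and the sketch does no option parsing.

```python
CTRL_V = "\x16"  # ^V


def split_set_args(args):
    """Split set arguments on whitespace not escaped by '\\' or ^V."""
    words, current, in_word = [], [], False
    i = 0
    while i < len(args):
        ch = args[i]
        if ch in ("\\", CTRL_V) and i + 1 < len(args):
            # Escape character: the next character is taken literally.
            current.append(args[i + 1])
            in_word = True
            i += 2
            continue
        if ch in " \t":
            if in_word:
                words.append("".join(current))
                current, in_word = [], False
        else:
            # Includes an escape character at the very end of the input.
            current.append(ch)
            in_word = True
        i += 1
    if in_word:
        words.append("".join(current))
    return words
```

   Under this sketch both forms of the tags example above produce
   the single argument "tags=tags_file1 tags_file2".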

4.3: The tag command:

   The tag command ignores ^V's and backslashes; there's no way to
   get a space into a tag name.

   I think this is a don't care, and I don't intend to fix it.

5: Regular expressions:

	:global /pattern/ command
	:substitute /pattern/replace/
	:vglobal /pattern/ command

   I intend to treat a backslash in the pattern, followed by the
   delimiter character or a backslash, as that literal character.

   This is historic behavior in vi.  It would get rid of a fairly
   hard-to-explain special case if we could just use the character
   immediately following the backslash in all cases, or if we
   changed nvi to permit using the literal next character as a
   pattern escape character, but either change would probably break
   historic scripts.

   There is an additional escaping issue for regular expressions.
   Within the pattern and replacement, the '|' character did not
   delimit ex commands.  For example, the following is legal.

	:substitute /|/PIPE/|s/P/XXX/

   This is a special case that I will support.
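
   A sketch of field parsing that follows both rules in this
   section, in Python for illustration only.  The names read_field
   and parse_substitute are hypothetical.  One design choice of the
   sketch, not stated above: a backslash before the delimiter is
   dropped, while any other backslash pair (including a doubled
   backslash) is copied through unchanged so that a regular
   expression engine would still see it as an escape.

```python
def read_field(text, start, delim):
    """Read one field ending at an unescaped `delim`.

    Returns (field, index just past the closing delimiter or end of text).
    """
    out = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            # Backslash + delimiter is the literal delimiter; any other
            # pair is kept whole and cannot end the field.
            out.append(delim if nxt == delim else ch + nxt)
            i += 2
            continue
        if ch == delim:
            return "".join(out), i + 1
        out.append(ch)  # '|' is ordinary text inside a field
        i += 1
    return "".join(out), i


def parse_substitute(args):
    """Split '/pattern/replace/rest' into (pattern, replace, rest)."""
    delim = args[0]
    pattern, i = read_field(args, 1, delim)
    replacement, i = read_field(args, i, delim)
    return pattern, replacement, args[i:]
```

   For the example above, parse_substitute("/|/PIPE/|s/P/XXX/")
   gives the pattern "|", the replacement "PIPE", and the remainder
   "|s/P/XXX/", which is where command splitting on '|' would resume.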

6: Ending anything with an escape character:

   In all of the above rules, an escape character (either ^V or a
   backslash) at the end of an argument or file name is not handled
   specially, but used as a literal character.
