#	$OpenBSD: quoting,v 1.3 2001/01/29 01:58:39 niklas Exp $

#	@(#)quoting	5.5 (Berkeley) 11/12/94

QUOTING IN EX/VI:

There are four escape characters in historic ex/vi:

	\ (backslashes)
	^V
	^Q (assuming it wasn't used for IXON/IXOFF)
	The terminal literal next character.

Vi did not use the lnext character; it always used ^V (or ^Q).
^V and ^Q were equivalent in all cases for vi.

There are four different areas in ex/vi where escaping characters
is interesting:

	1: In vi text input mode.
	2: In vi command mode.
	3: In ex command and text input modes.
	4: In the ex commands themselves.

1: Vi text input mode (a, i, o, :colon commands, etc.):

   The characters that users might want to escape are as follows.
   As ^L and ^Z were not special in input mode, they are not listed.

	carriage return (^M)
	escape		(^[)
	autoindents	(^D, 0, ^, ^T)
	erase		(^H)
	word erase	(^W)
	line erase	(^U)
	newline		(^J)		(not historic practice)

   Historic practice was that ^V was the only way to escape any
   of these characters, and that whatever character followed
   the ^V was taken literally, e.g. ^V^V is a single ^V.  I
   don't see any strong reason to make it possible to escape
   ^J, so I'm going to leave that alone.
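
   A minimal sketch of the ^V rule above, in Python.  It is not
   taken from any vi source: the names classify_input and SPECIAL
   are made up for illustration.  The 0 and ^ autoindent prefixes
   and the ^J case are not modeled, and a ^V with nothing after it
   is simply dropped.

```python
CTRL_V = "\x16"  # ^V, the escape character used by vi in text input mode

# Hypothetical table of keys that are special in text input mode,
# following the list above (0, ^ and ^J are deliberately left out).
SPECIAL = {
    "\r": "carriage-return",   # ^M
    "\x1b": "escape",          # ^[
    "\x04": "autoindent",      # ^D
    "\x14": "autoindent",      # ^T
    "\x08": "erase",           # ^H
    "\x17": "word-erase",      # ^W
    "\x15": "line-erase",      # ^U
}


def classify_input(keys):
    """Return a list of (kind, char) pairs for a string of typed keys.

    kind is "literal" for text to insert, or the name of an editing action.
    """
    events = []
    quoted = False
    for ch in keys:
        if quoted:
            # Whatever follows ^V is taken literally, including ^V itself.
            events.append(("literal", ch))
            quoted = False
        elif ch == CTRL_V:
            quoted = True
        elif ch in SPECIAL:
            events.append((SPECIAL[ch], ch))
        else:
            events.append(("literal", ch))
    return events
```

   With this sketch, classify_input("\x16\x1b") yields one literal
   escape character, while classify_input("\x1b") yields the escape
   action.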

   One comment regarding the autoindent characters.  In historic
   vi, if you entered "^V0^D" autoindent erasure was still
   triggered, although it wasn't if you entered "0^V^D".  In
   nvi, if you escape either character, autoindent erasure is
   not triggered.

   Abbreviations were not performed if the non-word character
   that triggered the abbreviation was escaped by a ^V.  Input
   maps were not triggered if any part of the map was escaped
   by a ^V.
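
   The abbreviation rule might look roughly like the following
   Python sketch.  The function insert_with_abbrevs is hypothetical,
   and it assumes that word characters are letters, digits and
   underscore, which is a simplification.  Input maps are not
   modeled.

```python
CTRL_V = "\x16"  # ^V


def insert_with_abbrevs(keys, abbrevs):
    """Return the text inserted for keys, expanding abbreviations.

    An abbreviation is expanded when a non-word character is typed
    right after it, unless that character was escaped with ^V.
    """
    text = []      # characters inserted so far
    word = []      # word characters typed since the last non-word character
    quoted = False
    for ch in keys:
        if not quoted and ch == CTRL_V:
            quoted = True
            continue
        if ch.isalnum() or ch == "_":
            word.append(ch)
        else:
            candidate = "".join(word)
            if not quoted and candidate in abbrevs:
                # Replace the word just typed with its expansion.
                del text[len(text) - len(candidate):]
                text.extend(abbrevs[candidate])
            word = []
        text.append(ch)
        quoted = False
    return "".join(text)
```

   Given the abbreviation "teh" for "the", typing "teh cat" inserts
   "the cat", but typing "teh^V cat" leaves "teh cat" alone.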

   The historic vi implementation for the 'r' command requires
   two leading ^V's to replace a character with a literal
   character.  This is obviously a bug, and should be fixed.

2: Vi command mode

   Command maps were not triggered if the second or later
   character of a map was escaped by a ^V.

   The obvious extension is that ^V should keep the next command
   character from being mapped, so you can do ":map x xxx" and
   then enter ^Vx to delete a single character.
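
   A small Python sketch of that extension.  It assumes maps with
   single-character names only and does not re-expand the result
   of a map; expand_command_keys is a hypothetical name, not an
   nvi function.

```python
CTRL_V = "\x16"  # ^V


def expand_command_keys(keys, maps):
    """Expand single-character command maps, letting ^V quote the next key."""
    out = []
    quoted = False
    for ch in keys:
        if quoted:
            out.append(ch)              # a quoted key bypasses the map table
            quoted = False
        elif ch == CTRL_V:
            quoted = True
        else:
            out.append(maps.get(ch, ch))
    return "".join(out)
```

   After the equivalent of ":map x xxx", the key x expands to xxx,
   while ^Vx stays a single x.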

3: Ex command and text input modes.

   As ex ran in canonical mode, there was little work that it
   needed to do for quoting.  The notable differences between
   ex and vi are that it was possible to escape a <newline> in
   the ex command and text input modes, and ex used the "literal
   next" character, not control-V/control-Q.

4: The ex commands:

   Ex commands are delimited by '|' or newline characters.
   Within the commands, whitespace characters delimit the
   arguments.  Backslash will generally escape any following
   character.  In the abbreviate, unabbreviate, map and unmap
   commands, control-V escapes the next character, instead.

   This is historic behavior in vi, although there are special
   cases where it's impossible to escape a character, generally
   a whitespace character.
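
   The '|' delimiting rule can be sketched in Python as follows.
   This is an illustration under simple assumptions, not the nvi
   parser: split_ex_commands is a made-up name, the caller picks
   the escape character (backslash, or ^V for the map and
   abbreviate family), and the escape is left in place for a later
   stage to remove.

```python
def split_ex_commands(line, escape="\\"):
    """Split one line of ex input into commands at each unescaped '|'."""
    commands = []
    current = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            # Keep the escape and the protected character together so
            # that the protected character can never act as a delimiter.
            current.append(ch)
            current.append(line[i + 1])
            i += 2
            continue
        if ch == "|":
            commands.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    commands.append("".join(current))
    return commands
```

   For example, "w file1|e file2" splits into two commands, while
   a '|' preceded by the escape character stays inside its command.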

   Escaping characters in file names in ex commands:

	:cd [directory]				(directory)
	:chdir [directory]			(directory)
	:edit [+cmd] [file]			(file)
	:ex [+cmd] [file]			(file)
	:file [file]				(file)
	:next [file ...]			(file ...)
	:read [!cmd | file]			(file)
	:source [file]				(file)
	:write [!cmd | file]			(file)
	:wq [file]				(file)
	:xit [file]				(file)

   Since file names are also subject to word expansion, the
   underlying shell had better be doing the correct backslash
   escaping.  Historic vi did NOT behave this way, which made it
   impossible to insert a whitespace, newline or carriage return
   character into a file name.

   Escaping characters in non-file arguments in ex commands:

	:abbreviate word string			(word, string)
*	:edit [+cmd] [file]			(+cmd)
*	:ex [+cmd] [file]			(+cmd)
	:map word string			(word, string)
*	:set [option ...]			(option)
*	:tag string				(string)
	:unabbreviate word			(word)
	:unmap word				(word)

   These commands use whitespace to delimit their arguments, and use
   ^V to escape those characters.  The exceptions are starred in the
   above list, and are discussed below.

   In general, I intend to treat a ^V in any argument, followed by
   any character, as that literal character.  This will permit
   editing of files named "foo|", for example, by using the string
   "foo\^V|", where the literal next character protects the pipe
   from the ex command parser and the backslash protects it from the
   shell expansion.
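
   The two layers of protection in "foo\^V|" can be traced with a
   short Python sketch.  The helper strip_escape is hypothetical,
   and the "shell" step here only removes backslashes, which is far
   simpler than real shell expansion.

```python
CTRL_V = "\x16"  # ^V


def strip_escape(arg, esc):
    """Remove each esc character and keep the character after it.

    An esc at the very end of arg has nothing to protect and is kept.
    """
    out = []
    i = 0
    while i < len(arg):
        if arg[i] == esc and i + 1 < len(arg):
            out.append(arg[i + 1])
            i += 2
        else:
            out.append(arg[i])
            i += 1
    return "".join(out)


typed = "foo\\" + CTRL_V + "|"               # what the user types: foo\^V|
after_ex = strip_escape(typed, CTRL_V)       # the ex parser consumes the ^V
after_shell = strip_escape(after_ex, "\\")   # shell expansion consumes the backslash
print(repr(after_ex), repr(after_shell))
```

   The ex layer leaves foo\| and the shell layer then leaves foo|,
   which is the file name that was wanted.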

   This is backward compatible with historical vi, although there
   were a number of special cases where vi wasn't consistent.

4.1: The edit/ex commands:

   The edit/ex commands are a special case because | symbols may
   occur in the "+cmd" field, for example:

	:edit +10|s/abc/ABC/ file.c

   In addition, the edit and ex commands have historically
   ignored literal next characters in the +cmd string, so that
   the following command won't work.

	:edit +10|s/X/^V / file.c

   I intend to handle the literal next character in edit/ex consistently
   with how it is handled in other commands.

   More fun facts to know and tell:
	The acid test for the ex/edit commands:

		date > file1; date > file2
		vi
		:edit +1|s/./XXX/|w file1| e file2|1 | s/./XXX/|wq

	No version of vi, of which I'm aware, handles it.

4.2: The set command:

   The set command treats ^V's as literal characters, so the
   following command won't work.  Backslashes do work in this
   case, though, so the second version of the command does work.

	set tags=tags_file1^V tags_file2
	set tags=tags_file1\ tags_file2

   I intend to continue permitting backslashes in set commands,
   but to also permit literal next characters to work as well.
   This is backward compatible, but will also make set
   consistent with the other commands.  I think it's unlikely
   to break any historic .exrc's, given that there are probably
   very few files with ^V's in their name.
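
   A Python sketch of the intended behavior, where either a
   backslash or a ^V protects the following character.  The name
   split_set_args is made up, and this shows the proposed rule
   rather than what historic vi did.

```python
CTRL_V = "\x16"            # ^V
ESCAPES = ("\\", CTRL_V)   # both are accepted under the proposed rule


def split_set_args(args):
    """Split the argument string of a set command on unescaped whitespace."""
    words = []
    current = []
    i = 0
    while i < len(args):
        ch = args[i]
        if ch in ESCAPES and i + 1 < len(args):
            current.append(args[i + 1])   # the escaped character is kept literally
            i += 2
        elif ch in " \t":
            if current:
                words.append("".join(current))
                current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if current:
        words.append("".join(current))
    return words
```

   Both forms of the tags example above then produce the single
   word "tags=tags_file1 tags_file2".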

4.3: The tag command:

   The tag command ignores ^V's and backslashes; there's no way to
   get a space into a tag name.

   I think this is a don't care, and I don't intend to fix it.

5: Regular expressions:

	:global /pattern/ command
	:substitute /pattern/replace/
	:vglobal /pattern/ command

   I intend to treat a backslash in the pattern, followed by the
   delimiter character or a backslash, as that literal character.

   This is historic behavior in vi.  It would get rid of a fairly
   hard-to-explain special case if we could just use the character
   immediately following the backslash in all cases, or if we
   changed nvi to permit using the literal next character as a
   pattern escape character, but that would probably break historic
   scripts.

   There is an additional escaping issue for regular expressions.
   Within the pattern and replacement, the '|' character did not
   delimit ex commands.  For example, the following is legal.

	:substitute /|/PIPE/|s/P/XXX/

   This is a special case that I will support.
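
   One way to picture this is to scan the pattern and replacement
   by delimiter first, and only then look for a '|'.  The Python
   sketch below does that.  It is an illustration only: the name
   split_substitute_args is hypothetical, the text passed in must
   be non-empty and start with the delimiter, and backslash pairs
   are left untouched for a later regular expression stage.

```python
def split_substitute_args(text):
    """Split '<d>pattern<d>replacement<d>rest' into its three parts.

    The first character of text is the delimiter <d>.  A backslash
    protects the character after it, so an escaped delimiter never
    ends a field.
    """
    delim = text[0]
    fields = []
    current = []
    i = 1
    while i < len(text) and len(fields) < 2:
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])   # keep the pair for the regex stage
            i += 2
        elif ch == delim:
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if len(fields) < 2:
        fields.append("".join(current))     # closing delimiter was omitted
    while len(fields) < 2:
        fields.append("")
    return fields[0], fields[1], text[i:]
```

   For the example above, the pattern is "|", the replacement is
   "PIPE", and the remaining text "|s/P/XXX/" is where command
   splitting on '|' would start.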

6: Ending anything with an escape character:

   In all of the above rules, an escape character (either ^V or a
   backslash) at the end of an argument or file name is not handled
   specially, but used as a literal character.