HardenedBSD/contrib/nvi/docs/internals/quoting
Peter Wemm b8ba871bd9 Import of nvi-1.79, minus a few bits that we dont need (eg: postscript
files, curses, db, regex etc that we already have).  The other glue will
follow shortly.

Obtained from: Keith Bostic <bostic@bostic.com>
1996-11-01 06:45:43 +00:00

# @(#)quoting 5.5 (Berkeley) 11/12/94
QUOTING IN EX/VI:
There are four escape characters in historic ex/vi:
\ (backslashes)
^V
^Q (assuming it wasn't used for IXON/IXOFF)
The terminal literal next character.
Vi did not use the lnext character; it always used ^V (or ^Q).
^V and ^Q were equivalent in all cases for vi.
There are four different areas in ex/vi where escaping characters
is interesting:
1: In vi text input mode.
2: In vi command mode.
3: In ex command and text input modes.
4: In the ex commands themselves.
1: Vi text input mode (a, i, o, :colon commands, etc.):
The set of characters that users might want to escape are as follows.
As ^L and ^Z were not special in input mode, they are not listed.
carriage return (^M)
escape (^[)
autoindents (^D, 0, ^, ^T)
erase (^H)
word erase (^W)
line erase (^U)
newline (^J) (not historic practice)
Historic practice was that ^V was the only way to escape any
of these characters, and that whatever character followed
the ^V was taken literally, e.g. ^V^V is a single ^V. I
don't see any strong reason to make it possible to escape
^J, so I'm going to leave that alone.
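The ^V rule above can be sketched in a few lines of Python. This is
only an illustration, not nvi's input code: the name read_input_line
is hypothetical, only ^H, ^U and escape are modelled as special, and
keeping a ^V that ends the input is my assumption.

```python
CTRL_V = "\x16"  # ^V, the vi quoting character
CTRL_H = "\x08"  # ^H, erase
CTRL_U = "\x15"  # ^U, line erase
ESC = "\x1b"     # ^[, ends text input


def read_input_line(keys):
    """Return the text produced by a sequence of keystrokes.

    Hypothetical helper: whatever follows ^V is taken literally,
    otherwise ^H, ^U and escape keep their special meaning.
    """
    out = []
    it = iter(keys)
    for ch in it:
        if ch == CTRL_V:
            nxt = next(it, None)
            # Assumption: a ^V with nothing after it is kept as itself.
            out.append(CTRL_V if nxt is None else nxt)
        elif ch == CTRL_H:
            if out:
                out.pop()          # erase the previous character
        elif ch == CTRL_U:
            out.clear()            # erase the whole line
        elif ch == ESC:
            break                  # leave input mode
        else:
            out.append(ch)
    return "".join(out)
```

For example, the keystrokes a, b, ^V, ^H, c insert a literal ^H
between "ab" and "c" instead of erasing the "b".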
One comment regarding the autoindent characters. In historic
vi, if you entered "^V0^D" autoindent erasure was still
triggered, although it wasn't if you entered "0^V^D". In
nvi, if you escape either character, autoindent erasure is
not triggered.
Abbreviations were not performed if the non-word character
that triggered the abbreviation was escaped by a ^V. Input
maps were not triggered if any part of the map was escaped
by a ^V.
The historic vi implementation for the 'r' command requires
two leading ^V's to replace a character with a literal
character. This is obviously a bug, and should be fixed.
2: Vi command mode
Command maps were not triggered if the second or later
character of a map was escaped by a ^V.
The obvious extension is that ^V should keep the next command
character from being mapped, so you can do ":map x xxx" and
then enter ^Vx to delete a single character.
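A minimal Python sketch of that extension follows. It assumes
single-character maps only, and the name expand_command_keys is
hypothetical; real map handling is more involved.

```python
CTRL_V = "\x16"  # ^V


def expand_command_keys(keys, maps):
    """Expand single-character command maps, except after ^V.

    Hypothetical helper: 'maps' is a dict such as {"x": "xxx"},
    standing in for the result of ":map x xxx".
    """
    out = []
    it = iter(keys)
    for ch in it:
        if ch == CTRL_V:
            nxt = next(it, None)
            if nxt is not None:
                out.append(nxt)            # quoted, so never mapped
        else:
            out.append(maps.get(ch, ch))   # mapped if a map exists
    return "".join(out)
```

With maps = {"x": "xxx"}, typing x yields "xxx", while typing ^Vx
yields a single "x".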
3: Ex command and text input modes.
As ex ran in canonical mode, there was little work that it
needed to do for quoting. The notable differences between
ex and vi are that it was possible to escape a <newline> in
the ex command and text input modes, and ex used the "literal
next" character, not control-V/control-Q.
4: The ex commands:
Ex commands are delimited by '|' or newline characters.
Within the commands, whitespace characters delimit the
arguments. Backslash will generally escape any following
character. In the abbreviate, unabbreviate, map and unmap
commands, control-V escapes the next character, instead.
This is historic behavior in vi, although there are special
cases where it's impossible to escape a character, generally
a whitespace character.
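The delimiting rule can be sketched as below. This is a simplified
Python illustration, not the ex parser: split_ex_commands is a
hypothetical name, the caller chooses the escape character
(backslash, or ^V for the map and abbreviate family), and the
special handling of '|' inside regular expressions is ignored.

```python
CTRL_V = "\x16"  # ^V


def split_ex_commands(line, escape="\\"):
    """Split a line into ex commands at unescaped '|' or newline.

    Hypothetical helper. The escape character and the character it
    protects are both kept, so a later stage can still see them.
    """
    commands, cur = [], []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            cur.append(line[i:i + 2])   # keep the escaped pair intact
            i += 2
            continue
        if ch in "|\n":
            commands.append("".join(cur))
            cur = []
        else:
            cur.append(ch)              # includes a trailing escape
        i += 1
    commands.append("".join(cur))
    return commands
```

Passing escape=CTRL_V models the map and abbreviate commands, where
a backslash before '|' would not protect it.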
Escaping characters in file names in ex commands:
:cd [directory] (directory)
:chdir [directory] (directory)
:edit [+cmd] [file] (file)
:ex [+cmd] [file] (file)
:file [file] (file)
:next [file ...] (file ...)
:read [!cmd | file] (file)
:source [file] (file)
:write [!cmd | file] (file)
:wq [file] (file)
:xit [file] (file)
Since file names are also subject to word expansion, the
underlying shell had better be doing the correct backslash
escaping. This is NOT historic behavior in vi, which made it
impossible to insert a whitespace, newline or carriage return
character into a file name.
Escaping characters in non-file arguments in ex commands:
:abbreviate word string (word, string)
* :edit [+cmd] [file] (+cmd)
* :ex [+cmd] [file] (+cmd)
:map word string (word, string)
* :set [option ...] (option)
* :tag string (string)
:unabbreviate word (word)
:unmap word (word)
These commands use whitespace to delimit their arguments, and use
^V to escape those characters. The exceptions are starred in the
above list, and are discussed below.
In general, I intend to treat a ^V in any argument, followed by
any character, as that literal character. This will permit
editing of a file named "foo|", for example, by using the string
"foo\^V|", where the literal next character protects the pipe
from the ex command parser and the backslash protects it from the
shell expansion.
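The two layers of protection in "foo\^V|" can be traced with a short
Python sketch. It is an illustration under assumptions: the name
strip_literal_next is hypothetical, and Python's shlex.split stands
in for the real shell's word expansion.

```python
import shlex

CTRL_V = "\x16"  # ^V, standing in for the literal next character


def strip_literal_next(arg):
    """Replace each ^V plus following character with that character.

    Hypothetical helper. A ^V at the very end of the argument is
    kept as a literal character.
    """
    out = []
    i = 0
    while i < len(arg):
        if arg[i] == CTRL_V and i + 1 < len(arg):
            out.append(arg[i + 1])   # taken literally, ^V dropped
            i += 2
        else:
            out.append(arg[i])
            i += 1
    return "".join(out)


typed = "foo\\" + CTRL_V + "|"         # what the user enters: foo\^V|
after_ex = strip_literal_next(typed)   # ex parser stage: foo\|
after_shell = shlex.split(after_ex)    # shell stage: one word, foo|
```

The ex stage removes the ^V without treating '|' as a command
delimiter, and the shell stage then removes the backslash.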
This is backward compatible with historical vi, although there
were a number of special cases where vi wasn't consistent.
4.1: The edit/ex commands:
The edit/ex commands are a special case because | symbols may
occur in the "+cmd" field, for example:
:edit +10|s/abc/ABC/ file.c
In addition, the edit and ex commands have historically
ignored literal next characters in the +cmd string, so that
the following command won't work.
:edit +10|s/X/^V / file.c
I intend to handle the literal next character in edit/ex consistently
with how it is handled in other commands.
More fun facts to know and tell:
The acid test for the ex/edit commands:
date > file1; date > file2
vi
:edit +1|s/./XXX/|w file1| e file2|1 | s/./XXX/|wq
No version of vi of which I'm aware handles it.
4.2: The set command:
The set command treats ^V's as literal characters, so the
following command won't work. Backslashes do work in this
case, though, so the second version of the command does work.
set tags=tags_file1^V tags_file2
set tags=tags_file1\ tags_file2
I intend to continue permitting backslashes in set commands,
but to also permit literal next characters to work as well.
This is backward compatible, but will also make set
consistent with the other commands. I think it's unlikely
to break any historic .exrc's, given that there are probably
very few files with ^V's in their name.
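The intended set behavior, with both escapes accepted, might look
like the following Python sketch. The name split_set_args is
hypothetical and this is not the nvi option parser; it only shows
how either a backslash or ^V could keep whitespace inside one
argument.

```python
CTRL_V = "\x16"  # ^V


def split_set_args(args):
    """Split set arguments at unescaped blanks and tabs.

    Hypothetical helper: a backslash or ^V makes the next character
    literal; either one at the very end is kept as itself.
    """
    words, cur = [], []
    in_word = False
    i = 0
    while i < len(args):
        ch = args[i]
        if ch in ("\\", CTRL_V) and i + 1 < len(args):
            cur.append(args[i + 1])   # escaped, joins the current word
            in_word = True
            i += 2
            continue
        if ch in " \t":
            if in_word:
                words.append("".join(cur))
                cur = []
                in_word = False
        else:
            cur.append(ch)
            in_word = True
        i += 1
    if in_word:
        words.append("".join(cur))
    return words
```

Both "tags=tags_file1\ tags_file2" and "tags=tags_file1^V tags_file2"
then come out as the single argument "tags=tags_file1 tags_file2".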
4.3: The tag command:
The tag command ignores ^V's and backslashes; there's no way to
get a space into a tag name.
I think this is a don't care, and I don't intend to fix it.
5: Regular expressions:
:global /pattern/ command
:substitute /pattern/replace/
:vglobal /pattern/ command
I intend to treat a backslash in the pattern, followed by the
delimiter character or a backslash, as that literal character.
This is historic behavior in vi. It would get rid of a fairly
hard-to-explain special case if we could just use the character
immediately following the backslash in all cases, or if we
changed nvi to permit using the literal next character as a
pattern escape character, but either change would probably break
historic scripts.
There is an additional escaping issue for regular expressions.
Within the pattern and replacement, the '|' character did not
delimit ex commands. For example, the following is legal.
:substitute /|/PIPE/|s/P/XXX/
This is a special case that I will support.
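Both regular expression rules can be sketched together in Python.
This is an illustration only: split_substitute is a hypothetical
name, and the sketch merely locates the field boundaries, leaving
the escapes in place for whatever regular expression code runs
later.

```python
def split_substitute(cmd):
    """Split "/pattern/replace/rest" into its three parts.

    Hypothetical helper. The first character is the delimiter. A
    backslash followed by the delimiter or by another backslash
    never ends a field, and '|' inside the pattern or replacement
    is ordinary text.
    """
    delim = cmd[0]
    fields, cur = [], []
    i = 1
    while i < len(cmd) and len(fields) < 2:
        ch = cmd[i]
        if ch == "\\" and i + 1 < len(cmd) and cmd[i + 1] in (delim, "\\"):
            cur.append(cmd[i:i + 2])   # escaped pair, kept as typed
            i += 2
            continue
        if ch == delim:
            fields.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
        i += 1
    if len(fields) < 2:
        fields.append("".join(cur))    # field with no closing delimiter
    while len(fields) < 2:
        fields.append("")
    return fields[0], fields[1], cmd[i:]
```

For the example above, split_substitute("/|/PIPE/|s/P/XXX/") gives
the pattern "|", the replacement "PIPE", and the remainder
"|s/P/XXX/", where the '|' finally acts as a command delimiter.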
6: Ending anything with an escape character:
In all of the above rules, an escape character (either ^V or a
backslash) at the end of an argument or file name is not handled
specially, but used as a literal character.