.\"	$NetBSD: vis.3,v 1.23 2009/02/10 23:06:31 christos Exp $
.\"
.\" Copyright (c) 1989, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)vis.3	8.1 (Berkeley) 6/9/93
.\"
.Dd February 10, 2009
.Dt VIS 3
.Os
.Sh NAME
.Nm vis ,
.Nm strvis ,
.Nm strvisx ,
.Nm svis ,
.Nm strsvis ,
.Nm strsvisx
.Nd visually encode characters
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In vis.h
.Ft char *
.Fn vis "char *dst" "int c" "int flag" "int nextc"
.Ft int
.Fn strvis "char *dst" "const char *src" "int flag"
.Ft int
.Fn strvisx "char *dst" "const char *src" "size_t len" "int flag"
.Ft char *
.Fn svis "char *dst" "int c" "int flag" "int nextc" "const char *extra"
.Ft int
.Fn strsvis "char *dst" "const char *src" "int flag" "const char *extra"
.Ft int
.Fn strsvisx "char *dst" "const char *src" "size_t len" "int flag" "const char *extra"
.Sh DESCRIPTION
The
.Fn vis
function
copies into
.Fa dst
a string which represents the character
.Fa c .
If
.Fa c
needs no encoding, it is copied in unaltered.
The string is
.Dv NUL
terminated, and a pointer to the end of the string is
returned.
The maximum length of any encoding is four
characters (not including the trailing
.Dv NUL ) ;
thus, when
encoding a set of characters into a buffer, the size of the buffer should
be four times the number of characters encoded, plus one for the trailing
.Dv NUL .
The
.Fa flag
parameter is used for altering the default range of
characters considered for encoding and for altering the visual
representation.
The additional character,
.Fa nextc ,
is only used when selecting the
.Dv VIS_CSTYLE
encoding format (explained below).
.Pp
The
.Fn strvis
and
.Fn strvisx
functions copy into
.Fa dst
a visual representation of
the string
.Fa src .
The
.Fn strvis
function encodes characters from
.Fa src
up to the
first
.Dv NUL .
The
.Fn strvisx
function encodes exactly
.Fa len
characters from
.Fa src
(this
is useful for encoding a block of data that may contain
.Dv NUL Ns 's ) .
Both forms
.Dv NUL
terminate
.Fa dst .
The size of
.Fa dst
must be four times the number
of characters encoded from
.Fa src
(plus one for the
.Dv NUL ) .
Both
forms return the number of characters in
.Fa dst
(not including
the trailing
.Dv NUL ) .
.Pp
The functions
.Fn svis ,
.Fn strsvis ,
and
.Fn strsvisx
correspond to
.Fn vis ,
.Fn strvis ,
and
.Fn strvisx
but have an additional argument
.Fa extra ,
pointing to a
.Dv NUL
terminated list of characters.
These characters are copied into
.Fa dst
either encoded or backslash-escaped.
These functions are useful, for example, for removing the special meaning
of certain characters to shells.
.Pp
The encoding is a unique, invertible representation composed entirely of
graphic characters; it can be decoded back into the original form using
the
.Xr unvis 3
or
.Xr strunvis 3
functions.
.Pp
There are two parameters that can be controlled: the range of
characters that are encoded (applies only to
.Fn vis ,
.Fn strvis ,
and
.Fn strvisx ) ,
and the type of representation used.
By default, all non-graphic characters,
except space, tab, and newline are encoded.
(See
.Xr isgraph 3 . )
The following flags
alter this:
.Bl -tag -width VIS_WHITEX
.It Dv VIS_SP
Also encode space.
.It Dv VIS_TAB
Also encode tab.
.It Dv VIS_NL
Also encode newline.
.It Dv VIS_WHITE
Synonym for
.Dv VIS_SP
\&|
.Dv VIS_TAB
\&|
.Dv VIS_NL .
.It Dv VIS_SAFE
Only encode "unsafe" characters.
Unsafe means control characters which may cause common terminals to perform
unexpected functions.
Currently this form allows space, tab, newline, backspace, bell, and
return - in addition to all graphic characters - unencoded.
.El
.Pp
(The above flags have no effect for
.Fn svis ,
.Fn strsvis ,
and
.Fn strsvisx .
When using these functions, place all graphic characters to be
encoded in an array pointed to by
.Fa extra .
In general, the backslash character should be included in this array; see the
warning on the use of the
.Dv VIS_NOSLASH
flag below.)
.Pp
There are five forms of encoding.
All forms use the backslash character
.Ql \e
to introduce a special
sequence; two backslashes are used to represent a real backslash,
except
.Dv VIS_HTTPSTYLE ,
which uses
.Ql % ,
and
.Dv VIS_MIMESTYLE ,
which uses
.Ql = .
These are the visual formats:
.Bl -tag -width VIS_CSTYLE
.It (default)
Use an
.Ql M
to represent meta characters (characters with the 8th
bit set), and use caret
.Ql ^
to represent control characters (see
.Xr iscntrl 3 ) .
The following formats are used:
.Bl -tag -width xxxxx
.It Dv \e^C
Represents the control character
.Ql C .
Spans characters
.Ql \e000
through
.Ql \e037 ,
and
.Ql \e177
(as
.Ql \e^? ) .
.It Dv \eM-C
Represents character
.Ql C
with the 8th bit set.
Spans characters
.Ql \e241
through
.Ql \e376 .
.It Dv \eM^C
Represents control character
.Ql C
with the 8th bit set.
Spans characters
.Ql \e200
through
.Ql \e237 ,
and
.Ql \e377
(as
.Ql \eM^? ) .
.It Dv \e040
Represents
.Tn ASCII
space.
.It Dv \e240
Represents Meta-space.
.El
.Pp
.It Dv VIS_CSTYLE
Use C-style backslash sequences to represent standard non-printable
characters.
The following sequences are used to represent the indicated characters:
.Bd -unfilled -offset indent
.Li \ea Tn  - BEL No (007)
.Li \eb Tn  - BS No (010)
.Li \ef Tn  - NP No (014)
.Li \en Tn  - NL No (012)
.Li \er Tn  - CR No (015)
.Li \es Tn  - SP No (040)
.Li \et Tn  - HT No (011)
.Li \ev Tn  - VT No (013)
.Li \e0 Tn  - NUL No (000)
.Ed
.Pp
When using this format, the
.Fa nextc
parameter is examined to determine
whether a
.Dv NUL
character can be encoded as
.Ql \e0
instead of
.Ql \e000 .
If
.Fa nextc
is an octal digit, the latter representation is used to
avoid ambiguity.
.It Dv VIS_OCTAL
Use a three digit octal sequence.
The form is
.Ql \eddd
where
.Em d
represents an octal digit.
.It Dv VIS_HTTPSTYLE
Use URI encoding as described in RFC 1738.
The form is
.Ql %xx
where
.Em x
represents a lower case hexadecimal digit.
.It Dv VIS_MIMESTYLE
Use MIME Quoted-Printable encoding as described in RFC 2045, except that
lines are not broken and CRLF is not handled.
The form is
.Ql =XX
where
.Em X
represents an upper case hexadecimal digit.
.El
.Pp
There is one additional flag,
.Dv VIS_NOSLASH ,
which inhibits the
doubling of backslashes and the backslash before the default
format (that is, control characters are represented by
.Ql ^C
and
meta characters as
.Ql M-C ) .
With this flag set, the encoding is
ambiguous and non-invertible.
.Sh SEE ALSO
.Xr unvis 1 ,
.Xr vis 1 ,
.Xr unvis 3
.Rs
.%A T. Berners-Lee
.%T Uniform Resource Locators (URL)
.%O RFC1738
.Re
.Sh HISTORY
The
.Fn vis ,
.Fn strvis ,
and
.Fn strvisx
functions first appeared in
.Bx 4.4 .
The
.Fn svis ,
.Fn strsvis ,
and
.Fn strsvisx
functions appeared in
.Nx 1.5 .