.\"	$NetBSD: unvis.3,v 1.27 2012/12/15 07:34:36 wiz Exp $
.\"
.\" Copyright (c) 1989, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)unvis.3	8.2 (Berkeley) 12/11/93
.\"
.Dd March 12, 2011
.Dt UNVIS 3
.Os
.Sh NAME
.Nm unvis ,
.Nm strunvis
.Nd decode a visual representation of characters
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In vis.h
.Ft int
.Fn unvis "char *cp" "int c" "int *astate" "int flag"
.Ft int
.Fn strunvis "char *dst" "const char *src"
.Ft int
.Fn strnunvis "char *dst" "size_t dlen" "const char *src"
.Ft int
.Fn strunvisx "char *dst" "const char *src" "int flag"
.Ft int
.Fn strnunvisx "char *dst" "size_t dlen" "const char *src" "int flag"
.Sh DESCRIPTION
The
.Fn unvis ,
.Fn strunvis
and
.Fn strunvisx
functions
are used to decode a visual representation of characters, as produced
by the
.Xr vis 3
function, back into
the original form.
.Pp
The
.Fn unvis
function is called with successive characters in
.Ar c
until a valid sequence is recognized, at which time the decoded
character is available at the character pointed to by
.Ar cp .
.Pp
The
.Fn strunvis
function decodes the characters pointed to by
.Ar src
into the buffer pointed to by
.Ar dst .
The
.Fn strunvis
function simply copies
.Ar src
to
.Ar dst ,
decoding any escape sequences along the way,
and returns the number of characters placed into
.Ar dst ,
or \-1 if an
invalid escape sequence was detected.
The size of
.Ar dst
should be equal to the size of
.Ar src
(that is, no expansion takes place during decoding).
.Pp
The
.Fn strunvisx
function does the same as the
.Fn strunvis
function,
but it allows you to add a flag that specifies the style the string
.Ar src
is encoded with.
Currently, the supported flags are:
.Dv VIS_HTTPSTYLE
and
.Dv VIS_MIMESTYLE .
.Pp
The
.Fn unvis
function implements a state machine that can be used to decode an
arbitrary stream of bytes.
All state associated with the bytes being decoded is stored outside the
.Fn unvis
function (that is, a pointer to the state is passed in), so
calls decoding different streams can be freely intermixed.
To start decoding a stream of bytes, first initialize an integer to zero.
Call
.Fn unvis
with each successive byte, along with a pointer
to this integer, and a pointer to a destination character.
The
.Fn unvis
function has several return codes that must be handled properly.
They are:
.Bl -tag -width UNVIS_VALIDPUSH
.It Li \&0 No (zero)
Another character is necessary; nothing has been recognized yet.
.It Dv UNVIS_VALID
A valid character has been recognized and is available at the location
pointed to by
.Fa cp .
.It Dv UNVIS_VALIDPUSH
A valid character has been recognized and is available at the location
pointed to by
.Fa cp ;
however, the character currently passed in should be passed in again.
.It Dv UNVIS_NOCHAR
A valid sequence was detected, but no character was produced.
This return code is necessary to indicate a logical break between characters.
.It Dv UNVIS_SYNBAD
An invalid escape sequence was detected, or the decoder is in an unknown state.
The decoder is placed into the starting state.
.El
.Pp
When all bytes in the stream have been processed, call
.Fn unvis
one more time with flag set to
.Dv UNVIS_END
to extract any remaining character (the character passed in is ignored).
.Pp
The
.Fa flag
argument is also used to specify the encoding style of the source.
If set to
.Dv VIS_HTTPSTYLE
or
.Dv VIS_HTTP1808 ,
.Fn unvis
will decode URI strings as specified in RFC 1808.
If set to
.Dv VIS_HTTP1866 ,
.Fn unvis
will decode entity references and numeric character references
as specified in RFC 1866.
If set to
.Dv VIS_MIMESTYLE ,
.Fn unvis
will decode MIME Quoted-Printable strings as specified in RFC 2045.
If set to
.Dv VIS_NOESCAPE ,
.Fn unvis
will not decode
.Ql \e
quoted characters.
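.Pp
To make the input format expected with
.Dv VIS_HTTPSTYLE
concrete, the following standalone sketch decodes percent-encoded text by hand.
It does not call
.Fn unvis
and is not the libc implementation; the names
.Fn pct_decode
and
.Fn hex_value
are hypothetical, and returning \-1 for a malformed escape is a choice of
this sketch rather than documented behavior.
.Bd -literal -offset indent
```c
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if c is not one. */
static int
hex_value(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: decode %XX escapes from src into dst, which
 * holds dlen bytes.  Returns the decoded length, or -1 if an escape
 * is malformed or dst is too small.
 */
int
pct_decode(char *dst, size_t dlen, const char *src)
{
	size_t n = 0;

	while (*src != 0) {
		int c = (unsigned char)*src++;

		if (c == '%') {
			int hi = hex_value((unsigned char)src[0]);
			int lo = hi < 0 ? -1 : hex_value((unsigned char)src[1]);

			if (lo < 0)
				return -1;
			c = hi * 16 + lo;
			src += 2;
		}
		if (n + 1 >= dlen)	/* keep room for the terminator */
			return -1;
		dst[n++] = (char)c;
	}
	if (dlen == 0)
		return -1;
	dst[n] = 0;
	return (int)n;
}
```
.Ed
With this sketch, the input
.Ql a%20b
decodes to the three bytes
.Ql a b .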
.Pp
The following code fragment illustrates a proper use of
.Fn unvis .
.Bd -literal -offset indent
int state = 0;
int ch;
char out;

while ((ch = getchar()) != EOF) {
again:
	switch (unvis(\*[Am]out, ch, \*[Am]state, 0)) {
	case 0:
	case UNVIS_NOCHAR:
		break;
	case UNVIS_VALID:
		(void)putchar(out);
		break;
	case UNVIS_VALIDPUSH:
		(void)putchar(out);
		goto again;
	case UNVIS_SYNBAD:
		errx(EXIT_FAILURE, "Bad character sequence!");
	}
}
if (unvis(\*[Am]out, '\e0', \*[Am]state, UNVIS_END) == UNVIS_VALID)
	(void)putchar(out);
.Ed
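.Pp
As a rough illustration of why
.Dv UNVIS_VALIDPUSH
and
.Dv UNVIS_END
exist, the following standalone sketch implements a toy decoder that
understands only backslash-octal escapes.
It is not the libc implementation: every name in it, such as
.Fn toy_unvis ,
.Fn toy_decode
and the
.Dv TOY_
constants, is hypothetical, and the backslash is written as its ASCII code
so that the listing needs no escaping.
.Bd -literal -offset indent
```c
/* Hypothetical return codes, modeled on the list above. */
enum { TOY_SYNBAD = -1, TOY_MORE = 0, TOY_VALID, TOY_VALIDPUSH, TOY_NOCHAR };
/* Hypothetical flag, playing the role of UNVIS_END. */
#define TOY_END 1
/* ASCII code of the backslash character. */
#define TOY_ESC 0x5c

/* Decoder states: ground, escape seen, one or two octal digits seen. */
enum { S_GROUND = 0, S_START, S_OCT2, S_OCT3 };

static int
is_octal(int c)
{
	return c >= '0' && c <= '7';
}

/* Feed one byte; all state lives in *cp and *state, owned by the caller. */
int
toy_unvis(char *cp, int c, int *state, int flag)
{
	int st = *state;

	if (flag & TOY_END) {
		/* End of input: a partial octal escape is complete. */
		*state = S_GROUND;
		if (st == S_OCT2 || st == S_OCT3)
			return TOY_VALID;
		return st == S_GROUND ? TOY_NOCHAR : TOY_SYNBAD;
	}
	switch (st) {
	case S_GROUND:
		if (c == TOY_ESC) {
			*state = S_START;
			return TOY_MORE;
		}
		*cp = (char)c;
		return TOY_VALID;
	case S_START:
		if (is_octal(c)) {
			*cp = (char)(c - '0');
			*state = S_OCT2;
			return TOY_MORE;
		}
		if (c == TOY_ESC) {	/* doubled escape is a literal one */
			*cp = (char)c;
			*state = S_GROUND;
			return TOY_VALID;
		}
		break;
	case S_OCT2:
	case S_OCT3:
		if (!is_octal(c)) {
			/* Escape ended early; c itself is still unread. */
			*state = S_GROUND;
			return TOY_VALIDPUSH;
		}
		*cp = (char)((*cp << 3) + (c - '0'));
		*state = (st == S_OCT2) ? S_OCT3 : S_GROUND;
		return (st == S_OCT2) ? TOY_MORE : TOY_VALID;
	}
	*state = S_GROUND;	/* bad sequence: back to the start */
	return TOY_SYNBAD;
}

/* Drive toy_unvis over a string; returns bytes written or -1. */
int
toy_decode(const char *src, char *dst)
{
	char *start = dst;
	int state = S_GROUND, rc;

	for (; *src != 0; src++) {
again:
		rc = toy_unvis(dst, (unsigned char)*src, &state, 0);
		if (rc == TOY_SYNBAD)
			return -1;
		if (rc == TOY_VALID || rc == TOY_VALIDPUSH)
			dst++;
		if (rc == TOY_VALIDPUSH)
			goto again;
	}
	rc = toy_unvis(dst, 0, &state, TOY_END);
	if (rc == TOY_SYNBAD)
		return -1;
	if (rc == TOY_VALID)
		dst++;
	*dst = 0;
	return (int)(dst - start);
}
```
.Ed
In this sketch, a short octal escape followed by an ordinary character
produces the push-back code, because the decoder only learns that the
escape has ended when it sees the character after it.
A short octal escape at the very end of the input is only delivered by the
final call with the end flag.
The destination passed to
.Fn toy_decode
must hold at least as many bytes as the source string, including its
terminator.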
.Sh ERRORS
The functions
.Fn strunvis ,
.Fn strnunvis ,
.Fn strunvisx ,
and
.Fn strnunvisx
will return \-1 on error and set
.Va errno
to:
.Bl -tag -width Er
.It Bq Er EINVAL
An invalid escape sequence was detected, or the decoder is in an unknown state.
.El
.Pp
In addition, the functions
.Fn strnunvis
and
.Fn strnunvisx
can also set
.Va errno
on error to:
.Bl -tag -width Er
.It Bq Er ENOSPC
Not enough space to perform the conversion.
.El
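.Pp
A caller may want to turn these two
.Va errno
values into diagnostics.
The following is a minimal sketch of one way to do so; the helper name
.Fn unvis_errno_text
and the message strings are hypothetical and not part of the library.
.Bd -literal -offset indent
```c
#include <errno.h>

/*
 * Hypothetical helper: describe the errno value left behind by a
 * failed strunvis-family call.
 */
const char *
unvis_errno_text(int err)
{
	switch (err) {
	case EINVAL:
		return "invalid escape sequence";
	case ENOSPC:
		return "destination buffer too small";
	default:
		return "unexpected error";
	}
}
```
.Ed
When one of the functions returns \-1, the caller would save
.Va errno
immediately and pass the saved value to the helper, since later library
calls may overwrite it.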
.Sh SEE ALSO
.Xr unvis 1 ,
.Xr vis 1 ,
.Xr vis 3
.Rs
.%A R. Fielding
.%T Relative Uniform Resource Locators
.%O RFC1808
.Re
.Sh HISTORY
The
.Fn unvis
function
first appeared in
.Bx 4.4 .
The
.Fn strnunvis
and
.Fn strnunvisx
functions appeared in
.Nx 6.0 .
.Sh BUGS
The names
.Dv VIS_HTTP1808
and
.Dv VIS_HTTP1866
are wrong.
Percent-encoding was defined in RFC 1738, the original RFC for URLs.
RFC 1866 defines HTML 2.0, an application of SGML, from which it
inherits the concepts of numeric character references and entity
references.