.\" $NetBSD: unvis.3,v 1.27 2012/12/15 07:34:36 wiz Exp $
.\"
.\" Copyright (c) 1989, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)unvis.3	8.2 (Berkeley) 12/11/93
.\"
.Dd March 12, 2011
.Dt UNVIS 3
.Os
.Sh NAME
.Nm unvis ,
.Nm strunvis
.Nd decode a visual representation of characters
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In vis.h
.Ft int
.Fn unvis "char *cp" "int c" "int *astate" "int flag"
.Ft int
.Fn strunvis "char *dst" "const char *src"
.Ft int
.Fn strnunvis "char *dst" "size_t dlen" "const char *src"
.Ft int
.Fn strunvisx "char *dst" "const char *src" "int flag"
.Ft int
.Fn strnunvisx "char *dst" "size_t dlen" "const char *src" "int flag"
.Sh DESCRIPTION
The
.Fn unvis ,
.Fn strunvis
and
.Fn strunvisx
functions
are used to decode a visual representation of characters, as produced
by the
.Xr vis 3
function, back into
the original form.
.Pp
The
.Fn unvis
function is called with successive characters in
.Fa c
until a valid sequence is recognized, at which time the decoded
character is available at the character pointed to by
.Fa cp .
.Pp
The
.Fn strunvis
function decodes the characters pointed to by
.Fa src
into the buffer pointed to by
.Fa dst .
It copies
.Fa src
to
.Fa dst ,
decoding any escape sequences along the way,
and returns the number of characters placed into
.Fa dst ,
or \-1 if an
invalid escape sequence was detected.
The size of
.Fa dst
should be at least the size of
.Fa src
(that is, no expansion takes place during decoding).
.Pp
The
.Fn strunvisx
function does the same as the
.Fn strunvis
function,
but it allows you to add a flag that specifies the style the string
.Fa src
is encoded with.
Currently, the supported flags are:
.Dv VIS_HTTPSTYLE
and
.Dv VIS_MIMESTYLE .
.Pp
The
.Fn strnunvis
and
.Fn strnunvisx
functions do the same as
.Fn strunvis
and
.Fn strunvisx ,
respectively, but take the size of
.Fa dst
in
.Fa dlen
and fail if there is not enough space to perform the conversion.
.Pp
The
.Fn unvis
function implements a state machine that can be used to decode an
arbitrary stream of bytes.
All state associated with the bytes being decoded is stored outside the
.Fn unvis
function (that is, a pointer to the state is passed in), so
calls decoding different streams can be freely intermixed.
To start decoding a stream of bytes, first initialize an integer to zero.
Call
.Fn unvis
with each successive byte, along with a pointer
to this integer, and a pointer to a destination character.
The
.Fn unvis
function has several return codes that must be handled properly.
They are:
.Bl -tag -width UNVIS_VALIDPUSH
.It Li \&0 No (zero)
Another character is necessary; nothing has been recognized yet.
.It Dv UNVIS_VALID
A valid character has been recognized and is available at the location
pointed to by
.Fa cp .
.It Dv UNVIS_VALIDPUSH
A valid character has been recognized and is available at the location
pointed to by
.Fa cp ;
however, the character currently passed in should be passed in again.
.It Dv UNVIS_NOCHAR
A valid sequence was detected, but no character was produced.
This return code is necessary to indicate a logical break between characters.
.It Dv UNVIS_SYNBAD
An invalid escape sequence was detected, or the decoder is in an unknown state.
The decoder is placed into the starting state.
.El
.Pp
When all bytes in the stream have been processed, call
.Fn unvis
one more time with
.Fa flag
set to
.Dv UNVIS_END
to extract any remaining character (the character passed in is ignored).
.Pp
The
.Fa flag
argument is also used to specify the encoding style of the source.
If set to
.Dv VIS_HTTPSTYLE
or
.Dv VIS_HTTP1808 ,
.Fn unvis
will decode URI strings as specified in RFC 1808.
If set to
.Dv VIS_HTTP1866 ,
.Fn unvis
will decode entity references and numeric character references
as specified in RFC 1866.
If set to
.Dv VIS_MIMESTYLE ,
.Fn unvis
will decode MIME Quoted-Printable strings as specified in RFC 2045.
If set to
.Dv VIS_NOESCAPE ,
.Fn unvis
will not decode
.Ql \e
quoted characters.
.Pp
The following code fragment illustrates a proper use of
.Fn unvis .
.Bd -literal -offset indent
int state = 0;
int ch;
char out;

while ((ch = getchar()) != EOF) {
again:
	switch (unvis(\*[Am]out, ch, \*[Am]state, 0)) {
	case 0:
	case UNVIS_NOCHAR:
		break;
	case UNVIS_VALID:
		(void)putchar(out);
		break;
	case UNVIS_VALIDPUSH:
		(void)putchar(out);
		goto again;
	case UNVIS_SYNBAD:
		errx(EXIT_FAILURE, "Bad character sequence!");
	}
}
if (unvis(\*[Am]out, '\e0', \*[Am]state, UNVIS_END) == UNVIS_VALID)
	(void)putchar(out);
.Ed
.Sh ERRORS
The functions
.Fn strunvis ,
.Fn strnunvis ,
.Fn strunvisx ,
and
.Fn strnunvisx
will return \-1 on error and set
.Va errno
to:
.Bl -tag -width Er
.It Bq Er EINVAL
An invalid escape sequence was detected, or the decoder is in an unknown state.
.El
.Pp
In addition, the functions
.Fn strnunvis
and
.Fn strnunvisx
can set
.Va errno
on error to:
.Bl -tag -width Er
.It Bq Er ENOSPC
Not enough space to perform the conversion.
.El
.Sh SEE ALSO
.Xr unvis 1 ,
.Xr vis 1 ,
.Xr vis 3
.Rs
.%A R. Fielding
.%T Relative Uniform Resource Locators
.%O RFC1808
.Re
.Sh HISTORY
The
.Fn unvis
function
first appeared in
.Bx 4.4 .
The
.Fn strnunvis
and
.Fn strnunvisx
functions appeared in
.Nx 6.0 .
.Sh BUGS
The names
.Dv VIS_HTTP1808
and
.Dv VIS_HTTP1866
are wrong.
Percent-encoding was defined in RFC 1738, the original RFC for URLs.
RFC 1866 defines HTML 2.0, an application of SGML, from which it
inherits the concepts of numeric character references and entity
references.