.\" Id: preconv.1,v 1.7 2013/07/13 19:41:16 schwarze Exp
.\"
.\" Copyright (c) 2011 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd July 13, 2013
.Dt PRECONV 1
.Os
.Sh NAME
.Nm preconv
.Nd recode multibyte UNIX manuals
.Sh SYNOPSIS
.Nm preconv
.Op Fl D Ar enc
.Op Fl e Ar enc
.Op Ar file
.Sh DESCRIPTION
The
.Nm
utility recodes multibyte
.Ux
manual files into
.Xr mandoc 1
.Po
or other troff system supporting the
.Sq \e[uNNNN]
escape sequence
.Pc
input.
.Pp
By default, it parses from standard input, determining encoding as
described in
.Sx Algorithm .
.Pp
Its arguments are as follows:
.Bl -tag -width Ds
.It Fl D Ar enc
The default encoding.
.It Fl e Ar enc
The document's encoding.
.It Ar file
The input file.
.El
.Pp
The recoded input is written to standard output: Unicode characters in
the ASCII range are printed as regular ASCII characters, while those
above this range are printed using the
.Sq \e[uNNNN]
format documented in
.Xr mandoc_char 7 .
.Pp
If input bytes are improperly formed in the current encoding, they're
passed unmodified to standard output.
For some encodings, such as UTF-8, unrecoverable input sequences will
cause
.Nm
to stop processing and exit.
.Ss Algorithm
An encoding is chosen according to the following steps:
.Bl -enum
.It
From the argument passed to
.Fl e Ar enc .
.It
If a BOM exists, UTF\-8 encoding is selected.
.It
From the coding tags parsed from
.Qq File Variables
on the first two lines of input.
A file variable is an input line of the form
.Pp
.Dl \%.\e\(dq -*- key: val [; key: val ]* -*-
.Pp
A coding tag variable is where
.Cm key
is
.Qq coding
and
.Cm val
is the name of the encoding.
A typical file variable with a coding tag is
.Pp
.Dl \%.\e\(dq -*- mode: troff; coding: utf-8 -*-
.It
From the argument passed to
.Fl D Ar enc .
.It
If all else fails, Latin\-1 is used.
.El
.Pp
The
.Nm
utility recognises the UTF\-8, us\-ascii, and latin\-1 encodings as
passed to the
.Fl e
and
.Fl D
arguments, or as coding tags.
Encodings are matched case-insensitively.
.\" .Sh IMPLEMENTATION NOTES
.\" Not used in OpenBSD.
.\" .Sh RETURN VALUES
.\" For sections 2, 3, & 9 only.
.\" .Sh ENVIRONMENT
.\" For sections 1, 6, 7, & 8 only.
.\" .Sh FILES
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
Explicitly page a UTF\-8 manual
.Pa foo.1
in the current locale:
.Pp
.Dl $ preconv \-e utf\-8 foo.1 | mandoc -Tlocale | less
.\" .Sh DIAGNOSTICS
.\" For sections 1, 4, 6, 7, & 8 only.
.\" .Sh ERRORS
.\" For sections 2, 3, & 9 only.
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mandoc_char 7
.Sh STANDARDS
The
.Nm
utility references the US-ASCII character set standard, ANSI_X3.4\-1968;
the Latin\-1 character set standard, ISO/IEC 8859\-1:1998; the UTF\-8
character set standard; and UCS (Unicode), ISO/IEC 10646.
.Sh HISTORY
The
.Nm
utility first appeared in the GNU troff
.Pq Dq groff
system in December 2005, authored by Tomohiro Kubota and Werner
Lemberg.
The implementation that is part of the
.Xr mandoc 1
utility appeared in May 2011.
.Sh AUTHORS
The
.Nm
utility was written by
.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .
.\" .Sh CAVEATS
.\" .Sh BUGS
.\" .Sh SECURITY CONSIDERATIONS
.\" Not used in OpenBSD.