.\"	Id: preconv.1,v 1.7 2013/07/13 19:41:16 schwarze Exp
.\"
.\" Copyright (c) 2011 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd July 13, 2013
.Dt PRECONV 1
.Os
.Sh NAME
.Nm preconv
.Nd recode multibyte UNIX manuals
.Sh SYNOPSIS
.Nm preconv
.Op Fl D Ar enc
.Op Fl e Ar enc
.Op Ar file
.Sh DESCRIPTION
The
.Nm
utility recodes multibyte
.Ux
manual files into
.Xr mandoc 1
.Po
or other troff system supporting the
.Sq \e[uNNNN]
escape sequence
.Pc
input.
.Pp
By default, it reads from standard input, determining the encoding as
described in
.Sx Algorithm .
.Pp
Its arguments are as follows:
.Bl -tag -width Ds
.It Fl D Ar enc
The default encoding.
.It Fl e Ar enc
The document's encoding.
.It Ar file
The input file.
.El
.Pp
The recoded input is written to standard output: Unicode characters in
the ASCII range are printed as regular ASCII characters, while those
above this range are printed using the
.Sq \e[uNNNN]
format documented in
.Xr mandoc_char 7 .
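.Pp
As an illustration of this output format (a sketch, assuming a shell whose
.Xr printf 1
accepts octal escapes), the two UTF\-8 bytes that encode U+00E9 should be
recoded into a single escape sequence, while the ASCII letters pass
through unchanged:
.Bd -literal -offset indent
$ printf 'caf\e303\e251\en' | preconv \-e utf\-8
caf\e[u00E9]
.Ed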
.Pp
If input bytes are improperly formed in the current encoding, they are
passed unmodified to standard output.
For some encodings, such as UTF-8, unrecoverable input sequences will
cause
.Nm
to stop processing and exit.
.Ss Algorithm
The encoding is chosen by the first of the following steps that succeeds:
.Bl -enum
.It
From the argument passed to
.Fl e Ar enc .
.It
If a BOM exists, UTF\-8 encoding is selected.
.It
From the coding tags parsed from
.Qq File Variables
on the first two lines of input.
A file variable is an input line of the form
.Pp
.Dl \%.\e\(dq -*- key: val [; key: val ]* -*-
.Pp
A coding tag variable is where
.Cm key
is
.Qq coding
and
.Cm val
is the name of the encoding.
A typical file variable with a coding tag is
.Pp
.Dl \%.\e\(dq -*- mode: troff; coding: utf-8 -*-
.It
From the argument passed to
.Fl D Ar enc .
.It
If all else fails, Latin\-1 is used.
.El
.Pp
The
.Nm
utility recognises the UTF\-8, us\-ascii, and latin\-1 encodings as
passed to the
.Fl e
and
.Fl D
arguments, or as coding tags.
Encodings are matched case-insensitively.
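.Pp
The following invocations sketch how this precedence plays out; the
file name
.Pa foo.1
is hypothetical.
An encoding given with
.Fl e
is used regardless of any BOM or coding tag in the input, whereas one
given with
.Fl D
only takes effect if the input has neither:
.Bd -literal -offset indent
$ preconv \-e latin\-1 foo.1
$ preconv \-D utf\-8 foo.1
.Ed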
.\" .Sh IMPLEMENTATION NOTES
.\" Not used in OpenBSD.
.\" .Sh RETURN VALUES
.\" For sections 2, 3, & 9 only.
.\" .Sh ENVIRONMENT
.\" For sections 1, 6, 7, & 8 only.
.\" .Sh FILES
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
Explicitly page a UTF\-8 manual
.Pa foo.1
in the current locale:
.Pp
.Dl $ preconv \-e utf\-8 foo.1 | mandoc \-Tlocale | less
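.Pp
As a further sketch, suppose
.Pa foo.1
instead declares its encoding with a coding tag on one of its first two
lines, for instance
.Pp
.Dl \%.\e\(dq -*- coding: utf-8 -*-
.Pp
In that case no option should be needed, since the tag is consulted
when
.Fl e
is absent:
.Pp
.Dl $ preconv foo.1 | mandoc \-Tlocale | less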
.\" .Sh DIAGNOSTICS
.\" For sections 1, 4, 6, 7, & 8 only.
.\" .Sh ERRORS
.\" For sections 2, 3, & 9 only.
.Sh SEE ALSO
.Xr mandoc 1 ,
.Xr mandoc_char 7
.Sh STANDARDS
The
.Nm
utility references the US-ASCII character set standard, ANSI_X3.4\-1968;
the Latin\-1 character set standard, ISO/IEC 8859\-1:1998; the UTF\-8
character set standard; and UCS (Unicode), ISO/IEC 10646.
.Sh HISTORY
The
.Nm
utility first appeared in the GNU troff
.Pq Dq groff
system in December 2005, authored by Tomohiro Kubota and Werner
Lemberg.
The implementation that is part of the
.Xr mandoc 1
utility appeared in May 2011.
.Sh AUTHORS
The
.Nm
utility was written by
.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .
.\" .Sh CAVEATS
.\" .Sh BUGS
.\" .Sh SECURITY CONSIDERATIONS
.\" Not used in OpenBSD.