Annotation of mandoc/preconv.1, Revision 1.4
1.4 ! kristaps 1: .\" $Id: preconv.1,v 1.3 2011/05/26 14:43:07 kristaps Exp $
1.1 kristaps 2: .\"
3: .\" Copyright (c) 2011 Kristaps Dzonsons <kristaps@bsd.lv>
4: .\"
5: .\" Permission to use, copy, modify, and distribute this software for any
6: .\" purpose with or without fee is hereby granted, provided that the above
7: .\" copyright notice and this permission notice appear in all copies.
8: .\"
9: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
10: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
11: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
12: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
13: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
14: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
15: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
16: .\"
1.2 kristaps 17: .Dd $Mdocdate: May 26 2011 $
1.1 kristaps 18: .Dt PRECONV 1
19: .Os
20: .Sh NAME
21: .Nm preconv
1.2 kristaps 22: .Nd recode multibyte UNIX manuals
1.1 kristaps 23: .Sh SYNOPSIS
24: .Nm preconv
25: .Op Fl D Ar enc
26: .Op Fl e Ar enc
27: .Op Ar file
28: .Sh DESCRIPTION
29: The
30: .Nm
31: utility recodes multibyte
32: .Ux
33: manual files into
34: .Xr mandoc 1
1.2 kristaps 35: .Po
36: or other troff system supporting the
37: .Sq \e[uNNNN]
38: escape sequence
39: .Pc
1.1 kristaps 40: input.
41: Its arguments are as follows:
42: .Bl -tag -width Ds
43: .It Fl D Ar enc
44: The default encoding.
45: .It Fl e Ar enc
46: The document's encoding.
47: .It Ar file
48: The input file.
49: .El
50: .Pp
51: If
52: .Ar file
53: is not provided,
54: .Nm
55: accepts standard input.
1.3 kristaps 56: See
57: .Sx Algorithm
58: for encoding choice.
59: .Pp
60: The recoded input is written to standard output: Unicode characters in
61: the ASCII range are printed as regular ASCII characters, while those
62: above this range are printed using the
1.1 kristaps 63: .Sq \e[uNNNN]
64: format documented in
65: .Xr mandoc_char 7 .
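The output rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the utility's actual code: the function name recode is hypothetical, and the choice of uppercase hexadecimal padded to at least four digits is an assumption based on the \[uNNNN] notation.

```python
def recode(text: str) -> str:
    """Return text with every non-ASCII character written as a \\[uNNNN] escape."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # ASCII range: printed as a regular ASCII character.
            out.append(ch)
        else:
            # Above ASCII: hexadecimal code point, padded to four digits
            # (assumed formatting; longer for code points above U+FFFF).
            out.append("\\[u%04X]" % cp)
    return "".join(out)


if __name__ == "__main__":
    print(recode("na\u00efve"))
```

The input here is already-decoded text; decoding the bytes of the chosen encoding is a separate step that this sketch leaves out.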
66: .Pp
67: If input bytes are improperly formed in the current encoding, they are
68: passed unmodified to standard output.
1.3 kristaps 69: For some encodings, such as UTF-8, unrecoverable input sequences will
70: cause
1.1 kristaps 71: .Nm
1.3 kristaps 72: to stop processing and exit.
1.1 kristaps 73: .Ss Algorithm
74: An encoding is chosen according to the following steps:
75: .Bl -enum
76: .It
77: From the argument passed to
78: .Fl e Ar enc .
79: .It
1.3 kristaps 80: If the input begins with a byte-order mark (BOM), UTF\-8 encoding is selected.
81: .It
82: From the coding tags parsed from
83: .Qq File Variables
84: on the first two lines of input.
85: A file variable is an input line of the form
86: .Pp
87: .Dl \%.\e\(dq -*- key: val [; key: val ]* -*-
88: .Pp
1.4 ! kristaps 89: A coding tag is a file variable where
1.3 kristaps 90: .Cm key
91: is
92: .Qq coding
93: and
94: .Cm val
95: is the name of the encoding.
1.4 ! kristaps 96: A typical file variable with a coding tag is
1.3 kristaps 97: .Pp
98: .Dl \%.\e\(dq -*- mode: troff; coding: utf-8 -*-
1.1 kristaps 99: .It
100: From the argument passed to
101: .Fl D Ar enc .
102: .It
103: If all else fails, Latin\-1 is used.
104: .El
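The precedence of the steps above can be sketched as a short Python function. It is a hedged illustration rather than the utility's implementation: choose_encoding and its parameter names are hypothetical, the coding tag is assumed to have been parsed already, and the BOM is assumed to be the three-byte UTF-8 signature EF BB BF, which the text above does not spell out.

```python
# Assumption: "a BOM" means the UTF-8 byte-order mark.
UTF8_BOM = b"\xef\xbb\xbf"


def choose_encoding(data: bytes, e_arg=None, coding_tag=None, d_arg=None) -> str:
    """Pick an encoding name in the order listed in the Algorithm steps."""
    if e_arg:                      # 1. argument passed to -e
        return e_arg
    if data.startswith(UTF8_BOM):  # 2. a BOM selects UTF-8
        return "utf-8"
    if coding_tag:                 # 3. coding tag from the file variables
        return coding_tag
    if d_arg:                      # 4. argument passed to -D
        return d_arg
    return "latin-1"               # 5. fallback


if __name__ == "__main__":
    print(choose_encoding(b".Dd today\n", d_arg="us-ascii"))
```

Passing each source as an optional argument keeps the ordering visible in one place: the first source that is present wins, and later ones are never consulted.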
1.3 kristaps 105: .Pp
106: The
107: .Nm
108: utility recognises the UTF\-8, us\-ascii, and latin\-1 encodings as
109: passed to the
110: .Fl e
111: and
112: .Fl D
113: arguments, or as coding tags.
114: Encodings are matched case-insensitively.
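As a rough illustration of how a coding tag might be read from the file variables and matched against the recognised names, consider the Python sketch below. The names KNOWN, FILE_VARS and coding_tag are hypothetical, the regular expression is only an approximation of the file-variable form shown earlier, and returning nothing for an unrecognised name is an assumption about behaviour the text does not describe.

```python
import re

# Encoding names the manual lists as recognised, in lower case.
KNOWN = ("utf-8", "us-ascii", "latin-1")

# Approximate form of a file-variable line:  .\" -*- key: val [; key: val ]* -*-
FILE_VARS = re.compile(r'^\.\\"\s*-\*-(.*)-\*-\s*$')


def coding_tag(text: str):
    """Return the recognised encoding named by a coding tag, or None."""
    # Only the first two lines of input are examined.
    for line in text.splitlines()[:2]:
        m = FILE_VARS.match(line)
        if not m:
            continue
        for pair in m.group(1).split(";"):
            key, sep, val = pair.partition(":")
            if sep and key.strip() == "coding":
                # Encoding names are matched case-insensitively.
                name = val.strip().lower()
                return name if name in KNOWN else None
    return None


if __name__ == "__main__":
    print(coding_tag('.\\" -*- mode: troff; coding: UTF-8 -*-\n.Dd today\n'))
```

Lower-casing the value before the lookup is what makes the match case-insensitive; the key comparison is left exact because the text only describes case-insensitivity for encoding names.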
1.1 kristaps 115: .\" .Sh IMPLEMENTATION NOTES
116: .\" Not used in OpenBSD.
117: .\" .Sh RETURN VALUES
118: .\" For sections 2, 3, & 9 only.
119: .\" .Sh ENVIRONMENT
120: .\" For sections 1, 6, 7, & 8 only.
121: .\" .Sh FILES
122: .Sh EXIT STATUS
123: .Ex -std
1.3 kristaps 124: .Sh EXAMPLES
125: Explicitly page a UTF\-8 manual
126: .Pa foo.1
127: in the current locale:
128: .Pp
129: .Dl $ preconv \-e utf\-8 foo.1 | mandoc \-Tlocale | less
1.1 kristaps 130: .\" .Sh DIAGNOSTICS
131: .\" For sections 1, 4, 6, 7, & 8 only.
132: .\" .Sh ERRORS
133: .\" For sections 2, 3, & 9 only.
134: .Sh SEE ALSO
135: .Xr mandoc 1 ,
136: .Xr mandoc_char 7
137: .Sh STANDARDS
138: The
139: .Nm
140: utility references the US-ASCII character set standard, ANSI_X3.4\-1968;
141: the Latin\-1 character set standard, ISO/IEC 8859\-1:1998; the UTF\-8
142: character set standard; and UCS (Unicode), ISO/IEC 10646.
143: .Sh HISTORY
144: The
145: .Nm
146: utility first appeared in the GNU troff
147: .Pq Dq groff
148: system in December 2005, authored by Tomohiro Kubota and Werner
149: Lemberg.
150: The implementation that is part of the
151: .Xr mandoc 1
152: utility appeared in May 2011.
153: .Sh AUTHORS
154: The
155: .Nm
156: utility was written by
157: .An Kristaps Dzonsons Aq kristaps@bsd.lv .
158: .\" .Sh CAVEATS
159: .\" .Sh BUGS
160: .\" .Sh SECURITY CONSIDERATIONS
161: .\" Not used in OpenBSD.