version 1.2, 2011/05/26 12:14:46
version 1.4, 2011/05/26 14:45:04
Its arguments are as follows:
.Bl -tag -width Ds
.It Fl D Ar enc
The default encoding.
This is case-insensitive.
See
.Sx Algorithm
and
.Sx Encodings .
.It Fl e Ar enc
The document's encoding.
This is case-insensitive.
See
.Sx Algorithm
and
.Sx Encodings .
.It Ar file
The input file.
.El
.Pp
If
.Ar file
is not provided,
.Nm
accepts standard input.
See
.Sx Algorithm
for encoding choice.
.Pp
The recoded input is written to standard output: Unicode characters in
the ASCII range are printed as regular ASCII characters, while those
above this range are printed using the
.Sq \e[uNNNN]
format documented in
.Xr mandoc_char 7 .
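.Pp
As a rough illustration of the output format only, the mapping described
above can be sketched in Python.
The function name
.Fn to_mandoc_escapes
is hypothetical, the input is assumed to be already decoded to Unicode,
and padding to at least four uppercase hexadecimal digits is an
assumption; this is not the
.Nm
implementation.

```python
def to_mandoc_escapes(text):
    """Keep ASCII characters as they are and write every other
    character as a \\[uNNNN] escape (hypothetical helper)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # ASCII range: printed as a regular ASCII character.
            out.append(ch)
        else:
            # Above ASCII: hexadecimal code point, zero-padded to at
            # least four digits (assumed padding and letter case).
            out.append("\\[u%04X]" % cp)
    return "".join(out)


if __name__ == "__main__":
    print(to_mandoc_escapes("na\u00efve"))
```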
.Pp
If input bytes are improperly formed in the current encoding, they're
passed unmodified to standard output.
For some encodings, such as UTF-8, unrecoverable input sequences will
cause
.Nm
to stop processing and exit.
.Ss Encodings
The
.Nm
utility accepts the
.Ar utf\-8 ,
.Ar us\-ascii ,
and
.Ar latin\-1
encodings as arguments to
.Fl D Ar enc
or
.Fl e Ar enc .
.Ss Algorithm
An encoding is chosen according to the following steps:
.Bl -enum
.It
From the argument passed to
.Fl e Ar enc .
.It
If a BOM exists, UTF\-8 encoding is selected.
.It
From the coding tags parsed from
.Qq File Variables
on the first two lines of input.
A file variable is an input line of the form
.Pp
.Dl \%.\e\(dq -*- key: val [; key: val ]* -*-
.Pp
A coding tag is a file variable whose
.Cm key
is
.Qq coding
and whose
.Cm val
is the name of the encoding.
A typical file variable with a coding tag is
.Pp
.Dl \%.\e\(dq -*- mode: troff; coding: utf-8 -*-
.It
From the argument passed to
.Fl D Ar enc .
.It
If all else fails, Latin\-1 is used.
.El
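.Pp
The steps above can be sketched in Python as follows.
This is a minimal sketch under stated assumptions, not the
.Nm
implementation: the names
.Fn choose_encoding
and
.Fn coding_tag
are hypothetical, option values are not validated, the
.Qq coding
key is matched exactly, and a coding tag naming an unknown encoding is
assumed to fall through to the next step.

```python
import re

# Encoding names taken from the manual text.
KNOWN = {"utf-8", "us-ascii", "latin-1"}
UTF8_BOM = b"\xef\xbb\xbf"
# Captures the variable list between the two "-*-" markers.
FILE_VARS = re.compile(rb"-\*-(.*)-\*-")


def coding_tag(data):
    """Return the lower-cased value of a "coding" file variable found
    on the first two lines of input, or None (hypothetical helper)."""
    for line in data.split(b"\n")[:2]:
        # A file variable line starts with the roff comment marker.
        if not line.startswith(b'.\\"'):
            continue
        found = FILE_VARS.search(line)
        if found is None:
            continue
        for pair in found.group(1).split(b";"):
            key, sep, val = pair.partition(b":")
            if sep and key.strip() == b"coding":
                return val.strip().decode("ascii", "replace").lower()
    return None


def choose_encoding(data, e_opt=None, d_opt=None):
    """Pick an encoding name following the documented order."""
    if e_opt:                          # 1. argument to -e
        return e_opt.lower()
    if data.startswith(UTF8_BOM):      # 2. byte order mark
        return "utf-8"
    tag = coding_tag(data)             # 3. coding tag in file variables
    if tag in KNOWN:
        return tag
    if d_opt:                          # 4. argument to -D
        return d_opt.lower()
    return "latin-1"                   # 5. last resort
```

Lower-casing the names before comparison is one simple way to get
case-insensitive matching of encodings.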
.Pp
The
.Nm
utility recognises the UTF\-8, us\-ascii, and latin\-1 encodings as
passed to the
.Fl e
and
.Fl D
arguments, or as coding tags.
Encodings are matched case-insensitively.
.\" .Sh IMPLEMENTATION NOTES
.\" Not used in OpenBSD.
.\" .Sh RETURN VALUES
.\" .Sh FILES
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
Explicitly page a UTF\-8 manual
.Pa foo.1
in the current locale:
.Pp
.Dl $ preconv \-e utf\-8 foo.1 | mandoc \-Tlocale | less
.\" .Sh DIAGNOSTICS
.\" For sections 1, 4, 6, 7, & 8 only.
.\" .Sh ERRORS