===================================================================
RCS file: /cvs/mandoc/mandoc.1,v
retrieving revision 1.11
retrieving revision 1.102
diff -u -p -r1.11 -r1.102
--- mandoc/mandoc.1 2009/03/26 16:44:22 1.11
+++ mandoc/mandoc.1 2013/03/06 08:08:24 1.102
@@ -1,147 +1,127 @@
-.\" $Id: mandoc.1,v 1.11 2009/03/26 16:44:22 kristaps Exp $
+.\" $Id: mandoc.1,v 1.102 2013/03/06 08:08:24 schwarze Exp $
 .\"
-.\" Copyright (c) 2009 Kristaps Dzonsons
+.\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons
+.\" Copyright (c) 2012 Ingo Schwarze
 .\"
 .\" Permission to use, copy, modify, and distribute this software for any
-.\" purpose with or without fee is hereby granted, provided that the
-.\" above copyright notice and this permission notice appear in all
-.\" copies.
+.\" purpose with or without fee is hereby granted, provided that the above
+.\" copyright notice and this permission notice appear in all copies.
 .\"
-.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
-.\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
-.\" WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
-.\" AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
-.\" DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
-.\" PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
-.\" TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
-.\" PERFORMANCE OF THIS SOFTWARE.
+.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: March 26 2009 $
-.Dt mandoc 1
+.Dd $Mdocdate: March 6 2013 $
+.Dt MANDOC 1
 .Os
-.\" SECTION
 .Sh NAME
 .Nm mandoc
 .Nd format and display UNIX manuals
-.\" SECTION
 .Sh SYNOPSIS
 .Nm mandoc
 .Op Fl V
-.Op Fl f Ns Ar option...
+.Sm off
+.Op Fl I Cm os Li = Ar name
+.Sm on
 .Op Fl m Ns Ar format
-.Op Fl W Ns Ar err...
+.Op Fl O Ns Ar option
 .Op Fl T Ns Ar output
-.Op Ar infile...
-.\" SECTION
+.Op Fl W Ns Ar level
+.Op Ar
 .Sh DESCRIPTION
 The
 .Nm
-utility formats
+utility formats
 .Ux
-manual pages for display. The arguments are as follows:
-.Bl -tag -width XXXXXXXXXXXX
-.\" ITEM
-.It Fl f Ns Ar option...
-Override default compiler behaviour. See
-.Sx Compiler Options
-for details.
-.\" ITEM
-.It Fl m
-Input format. See
-.Sx Input Formats
-for available formats. Defaults to
-.Fl m Ns Ar doc .
-.\" ITEM
-.It Fl T
-Output format. See
-.Sx Output Formats
-for available formats. Defaults to
-.Fl T Ns Ar ascii .
-.\" ITEM
-.It Fl V
-Print version and exit.
-.\" ITEM
-.It Fl W Ns Ar err...
-Print warning messages. May be set to
-.Fl W Ns Ar all
-for all warnings,
-.Ar compat
-for groff/troff-compatibility warnings, or
-.Ar syntax
-for syntax warnings. If
-.Fl W Ns Ar error
-is specified, warnings are considered errors and cause utility
-termination. Multiple
-.Fl W
-arguments may be comma-separated, such as
-.Fl W Ns Ar error,all .
-.\" ITEM
-.It Ar infile...
-Read input from zero or more
-.Ar infile .
-If unspecified, reads from stdin. If multiple files are specified,
-.Nm
-will halt with the first failed parse.
-.El
-.\" PARAGRAPH
+manual pages for display.
 .Pp
-By default,
-.Nm
-reads
+By default,
+.Nm
+reads
 .Xr mdoc 7
+or
+.Xr man 7
 text from stdin, implying
-.Fl m Ns Ar mdoc ,
-and prints 78-column backspace-encoded output to stdout as if
-.Fl T Ns Ar ascii
-were provided.
-.\" PARAGRAPH
+.Fl m Ns Cm andoc ,
+and produces
+.Fl T Ns Cm ascii
+output.
 .Pp
-.Ex -std mandoc
-.\" SUB-SECTION
-.Ss Reserved Words (mdoc only)
-The reserved words described in
+The arguments are as follows:
+.Bl -tag -width Ds
+.Sm off
+.It Fl I Cm os Li = Ar name
+.Sm on
+Override the default operating system
+.Ar name
+for the
 .Xr mdoc 7
-are handled according to the following rules:
-.Bl -enum -offset XXX
-.It
-Opening delimiters
-.Po
-.Sq \&( ,
-.Sq \&[ ,
+.Sq \&Os
+macro.
+.It Fl m Ns Ar format
+Input format.
+See
+.Sx Input Formats
+for available formats.
+Defaults to
+.Fl m Ns Cm andoc .
+.It Fl O Ns Ar option
+Comma-separated output options.
+.It Fl T Ns Ar output
+Output format.
+See
+.Sx Output Formats
+for available formats.
+Defaults to
+.Fl T Ns Cm ascii .
+.It Fl V
+Print version and exit.
+.It Fl W Ns Ar level
+Specify the minimum message
+.Ar level
+to be reported on the standard error output and to affect the exit status.
+The
+.Ar level
+can be
+.Cm warning ,
+.Cm error ,
+or
+.Cm fatal .
+The default is
+.Fl W Ns Cm fatal ;
+.Fl W Ns Cm all
+is an alias for
+.Fl W Ns Cm warning .
+See
+.Sx EXIT STATUS
 and
-.Sq \&{
-.Pc are not followed by whitespace.
-.It
-Closing delimiters
-.Po
-.Sq \&. ,
-.Sq \&, ,
-.Sq \&; ,
-.Sq \&: ,
-.Sq \&? ,
-.Sq \&! ,
-.Sq \&) ,
-.Sq \&]
+.Sx DIAGNOSTICS
+for details.
+.Pp
+The special option
+.Fl W Ns Cm stop
+tells
+.Nm
+to exit after parsing a file that causes warnings or errors of at least
+the requested level.
+No formatted output will be produced from that file.
+If both a
+.Ar level
 and
-.Sq \&}
-.Pc are not preceeded by whitespace.
+.Cm stop
+are requested, they can be joined with a comma, for example
+.Fl W Ns Cm error , Ns Cm stop .
+.It Ar file
+Read input from zero or more files.
+If unspecified, reads from stdin.
+If multiple files are specified,
+.Nm
+will halt with the first failed parse.
 .El
-.\" PARAGRAPH
-.Pp
-Note that reserved words only register as such as if they appear as
-standalone tokens, either in parsed lines or streams of text. Thus, the
-following fragment:
-.Bd -literal -offset XXXX
-this self is not that of the waking , empirically real man
-.Ed
-.\" PARAGRAPH
-.Pp
-\&...correctly adjusts the comma spacing to
-.Dq this self is not that of the waking , empirically real man .
-However, if the comma were part of
-.Dq ,empirically ,
-it would not.
-.\" SUB-SECTION
 .Ss Input Formats
 The
 .Nm
@@ -150,118 +130,553 @@ utility accepts
 and
 .Xr man 7
 input with
-.Fl m Ns Ar doc
+.Fl m Ns Cm doc
 and
-.Fl m Ns Ar an ,
-respectively. The
+.Fl m Ns Cm an ,
+respectively.
+The
 .Xr mdoc 7
 format is
 .Em strongly
-recommended;
+recommended;
 .Xr man 7
 should only be used for legacy manuals.
 .Pp
-The following escape sequences are recognised, although the per-format
-compiler may not allow certain sequences.
-.Bl -tag -width Ds -offset XXXX
-.It \efX
-sets the font mode to X (B, I, R or P, where P resets the font)
-.It \eX, \e(XX, \e[XN]
-queries the special-character table for a corresponding symbol
-.It \e*X, \e*(XX, \e*[XN]
-deprecated special-character format
-.El
-.\" SUB-SECTION
+A third option,
+.Fl m Ns Cm andoc ,
+which is also the default, determines encoding on-the-fly: if the first
+non-comment macro is
+.Sq \&Dd
+or
+.Sq \&Dt ,
+the
+.Xr mdoc 7
+parser is used; otherwise, the
+.Xr man 7
+parser is used.
+.Pp
+If multiple
+files are specified with
+.Fl m Ns Cm andoc ,
+each has its file-type determined this way.
+If multiple files are
+specified and
+.Fl m Ns Cm doc
+or
+.Fl m Ns Cm an
+is specified, then this format is used exclusively.
 .Ss Output Formats
 The
 .Nm
 utility accepts the following
 .Fl T
-arguments:
-.Bl -tag -width XXXXXXXXXXXX -offset XXXX
-.It Ar ascii
-Produce 7-bit ASCII output, backspace-encoded for bold and underline
-styles. This is the default.
-.It Ar tree
-Produce an indented parse tree.
-.It Ar lint
+arguments, which correspond to output modes:
+.Bl -tag -width "-Tlocale"
+.It Fl T Ns Cm ascii
+Produce 7-bit ASCII output.
+This is the default.
+See
+.Sx ASCII Output .
+.It Fl T Ns Cm html
+Produce strict CSS1/HTML-4.01 output.
+See
+.Sx HTML Output .
+.It Fl T Ns Cm lint
 Parse only: produce no output.
+Implies
+.Fl W Ns Cm warning .
+.It Fl T Ns Cm locale
+Encode output using the current locale.
+See
+.Sx Locale Output .
+.It Fl T Ns Cm man
+Produce
+.Xr man 7
+format output.
+See
+.Sx Man Output .
+.It Fl T Ns Cm pdf
+Produce PDF output.
+See
+.Sx PDF Output .
+.It Fl T Ns Cm ps
+Produce PostScript output.
+See
+.Sx PostScript Output .
+.It Fl T Ns Cm tree
+Produce an indented parse tree.
+.It Fl T Ns Cm utf8
+Encode output in the UTF\-8 multi-byte format.
+See
+.Sx UTF\-8 Output .
+.It Fl T Ns Cm xhtml
+Produce strict CSS1/XHTML-1.0 output.
+See
+.Sx XHTML Output .
 .El
-.\" SUB-SECTION
-.Ss Compiler Options
-Default compiler behaviour may be overriden with the
-.Fl f
-flag.
-.Bl -tag -width XXXXXXXXXXXX -offset XXXX
-.It Fl f Ns Ar ign-scope
-When rewinding the scope of a block macro, forces the compiler to ignore
-scope violations. This can seriously mangle the resulting tree.
-.Pq mdoc only
-.It Fl f Ns Ar ign-escape
-Ignore invalid escape sequences.
-.It Fl f Ns Ar ign-macro
-Ignore unknown macros at the start of input lines.
+.Pp
+If multiple input files are specified, these will be processed by the
+corresponding filter in-order.
+.Ss ASCII Output
+Output produced by
+.Fl T Ns Cm ascii ,
+which is the default, is rendered in standard 7-bit ASCII documented in
+.Xr ascii 7 .
+.Pp
+Font styles are applied by using back-spaced encoding such that an
+underlined character
+.Sq c
+is rendered as
+.Sq _ Ns \e[bs] Ns c ,
+where
+.Sq \e[bs]
+is the back-space character number 8.
+Emboldened characters are rendered as
+.Sq c Ns \e[bs] Ns c .
+.Pp
+The special characters documented in
+.Xr mandoc_char 7
+are rendered best-effort in an ASCII equivalent.
+If no equivalent is found,
+.Sq \&?
+is used instead.
+.Pp
+Output width is limited to 78 visible columns unless literal input lines
+exceed this limit.
+.Pp
+The following
+.Fl O
+arguments are accepted:
+.Bl -tag -width Ds
+.It Cm indent Ns = Ns Ar indent
+The left margin for normal text is set to
+.Ar indent
+blank characters instead of the default of five for
+.Xr mdoc 7
+and seven for
+.Xr man 7 .
+Increasing this is not recommended; it may result in degraded formatting,
+for example overfull lines or ugly line breaks.
+.It Cm width Ns = Ns Ar width
+The output width is set to
+.Ar width ,
+which will normalise to \(>=60.
 .El
-.\" PARAGRAPH
+.Ss HTML Output
+Output produced by
+.Fl T Ns Cm html
+conforms to HTML-4.01 strict.
 .Pp
-As with the
-.Fl W
-flag, multiple
-.Fl f
-options may be grouped and delimited with a comma. Using
-.Fl f Ns Ar ign-scope,ign-escape ,
-for example, will try to ignore scope and character-escape errors.
-.\" SECTION
-.Sh EXAMPLES
-To page this manual page on the terminal:
-.\" PARAGRAPH
+The
+.Pa example.style.css
+file documents style-sheet classes available for customising output.
+If a style-sheet is not specified with
+.Fl O Ns Ar style ,
+.Fl T Ns Cm html
+defaults to simple output readable in any graphical or text-based web
+browser.
 .Pp
-.D1 % mandoc \-Wall,error mandoc.1 2>&1 | less
-.\" SECTION
-.Sh SEE ALSO
-.Xr mdoc 7 ,
+Special characters are rendered in decimal-encoded UTF\-8.
+.Pp
+The following
+.Fl O
+arguments are accepted:
+.Bl -tag -width Ds
+.It Cm fragment
+Omit the
+.Aq !DOCTYPE
+declaration and the
+.Aq html ,
+.Aq head ,
+and
+.Aq body
+elements and only emit the subtree below the
+.Aq body
+element.
+The
+.Cm style
+argument will be ignored.
+This is useful when embedding manual content within existing documents.
+.It Cm includes Ns = Ns Ar fmt
+The string
+.Ar fmt ,
+for example,
+.Ar ../src/%I.html ,
+is used as a template for linked header files (usually via the
+.Sq \&In
+macro).
+Instances of
+.Sq \&%I
+are replaced with the include filename.
+The default is not to present a
+hyperlink.
+.It Cm man Ns = Ns Ar fmt
+The string
+.Ar fmt ,
+for example,
+.Ar ../html%S/%N.%S.html ,
+is used as a template for linked manuals (usually via the
+.Sq \&Xr
+macro).
+Instances of
+.Sq \&%N
+and
+.Sq %S
+are replaced with the linked manual's name and section, respectively.
+If no section is included, section 1 is assumed.
+The default is not to
+present a hyperlink.
+.It Cm style Ns = Ns Ar style.css
+The file
+.Ar style.css
+is used for an external style-sheet.
+This must be a valid absolute or
+relative URI.
+.El
+.Ss Locale Output
+Locale-depending output encoding is triggered with
+.Fl T Ns Cm locale .
+This option is not available on all systems: systems without locale
+support, or those whose internal representation is not natively UCS-4,
+will fall back to
+.Fl T Ns Cm ascii .
+See
+.Sx ASCII Output
+for font style specification and available command-line arguments.
+.Ss Man Output
+Translate input format into
 .Xr man 7
-.\"
-.Sh AUTHORS
+output format.
+This is useful for distributing manual sources to legacy systems
+lacking
+.Xr mdoc 7
+formatters.
+.Pp
+If
+.Xr mdoc 7
+is passed as input, it is translated into
+.Xr man 7 .
+If the input format is
+.Xr man 7 ,
+the input is copied to the output, expanding any
+.Xr roff 7
+.Sq so
+requests.
+The parser is also run, and as usual, the
+.Fl W
+level controls which
+.Sx DIAGNOSTICS
+are displayed before copying the input to the output.
+.Ss PDF Output
+PDF-1.1 output may be generated by
+.Fl T Ns Cm pdf .
+See
+.Sx PostScript Output
+for
+.Fl O
+arguments and defaults.
+.Ss PostScript Output
+PostScript
+.Qq Adobe-3.0
+Level-2 pages may be generated by
+.Fl T Ns Cm ps .
+Output pages default to letter sized and are rendered in the Times font
+family, 11-point.
+Margins are calculated as 1/9 the page length and width.
+Line-height is 1.4m.
+.Pp
+Special characters are rendered as in
+.Sx ASCII Output .
+.Pp
+The following
+.Fl O
+arguments are accepted:
+.Bl -tag -width Ds
+.It Cm paper Ns = Ns Ar name
+The paper size
+.Ar name
+may be one of
+.Ar a3 ,
+.Ar a4 ,
+.Ar a5 ,
+.Ar legal ,
+or
+.Ar letter .
+You may also manually specify dimensions as
+.Ar NNxNN ,
+width by height in millimetres.
+If an unknown value is encountered,
+.Ar letter
+is used.
+.El
+.Ss UTF\-8 Output
+Use
+.Fl T Ns Cm utf8
+to force a UTF\-8 locale.
+See
+.Sx Locale Output
+for details and options.
+.Ss XHTML Output
+Output produced by
+.Fl T Ns Cm xhtml
+conforms to XHTML-1.0 strict.
+.Pp
+See
+.Sx HTML Output
+for details; beyond generating XHTML tags instead of HTML tags, these
+output modes are identical.
+.Sh EXIT STATUS
 The
 .Nm
-utility was written by
-.An Kristaps Dzonsons Aq kristaps@openbsd.org .
-.\" SECTION
-.Sh CAVEATS
-The
+utility exits with one of the following values, controlled by the message
+.Ar level
+associated with the
+.Fl W
+option:
+.Pp
+.Bl -tag -width Ds -compact
+.It 0
+No warnings or errors occurred, or those that did were ignored because
+they were lower than the requested
+.Ar level .
+.It 2
+At least one warning occurred, but no error, and
+.Fl W Ns Cm warning
+was specified.
+.It 3
+At least one parsing error occurred, but no fatal error, and
+.Fl W Ns Cm error
+or
+.Fl W Ns Cm warning
+was specified.
+.It 4
+A fatal parsing error occurred.
+.It 5
+Invalid command line arguments were specified.
+No input files have been read.
+.It 6
+An operating system error occurred, for example memory exhaustion or an
+error accessing input files.
+Such errors cause
 .Nm
-utility in
-.Fl T Ns Ar ascii
-mode doesn't yet know how to display the following:
+to exit at once, possibly in the middle of parsing or formatting a file.
+.El
 .Pp
-.Bl -bullet -compact
-.It
-The \-hang
-.Sq \&.Bl
-list is not yet supported.
+Note that selecting
+.Fl T Ns Cm lint
+output mode implies
+.Fl W Ns Cm warning .
+.Sh EXAMPLES
+To page manuals to the terminal:
+.Pp
+.Dl $ mandoc \-Wall,stop mandoc.1 2\*(Gt&1 | less
+.Dl $ mandoc mandoc.1 mdoc.3 mdoc.7 | less
+.Pp
+To produce HTML manuals with
+.Ar style.css
+as the style-sheet:
+.Pp
+.Dl $ mandoc \-Thtml -Ostyle=style.css mdoc.7 \*(Gt mdoc.7.html
+.Pp
+To check over a large set of manuals:
+.Pp
+.Dl $ mandoc \-Tlint `find /usr/src -name \e*\e.[1-9]`
+.Pp
+To produce a series of PostScript manuals for A4 paper:
+.Pp
+.Dl $ mandoc \-Tps \-Opaper=a4 mdoc.7 man.7 \*(Gt manuals.ps
+.Pp
+Convert a modern
+.Xr mdoc 7
+manual to the older
+.Xr man 7
+format, for use on systems lacking an
+.Xr mdoc 7
+parser:
+.Pp
+.Dl $ mandoc \-Tman foo.mdoc \*(Gt foo.man
+.Sh DIAGNOSTICS
+Standard error messages reporting parsing errors are prefixed by
+.Pp
+.Sm off
+.D1 Ar file : line : column : \ level :
+.Sm on
+.Pp
+where the fields have the following meanings:
+.Bl -tag -width "column"
+.It Ar file
+The name of the input file causing the message.
+.It Ar line
+The line number in that input file.
+Line numbering starts at 1.
+.It Ar column
+The column number in that input file.
+Column numbering starts at 1.
+If the issue is caused by a word, the column number usually
+points to the first character of the word.
+.It Ar level
+The message level, printed in capital letters.
 .El
 .Pp
-Other macros still aren't supported by virtue of nobody complaining
-about their absence. Please report any omissions: this is a work in
-progress.
+Message levels have the following meanings:
+.Bl -tag -width "warning"
+.It Cm fatal
+The parser is unable to parse a given input file at all.
+No formatted output is produced from that input file.
+.It Cm error
+An input file contains syntax that cannot be safely interpreted,
+either because it is invalid or because
+.Nm
+does not implement it yet.
+By discarding part of the input or inserting missing tokens,
+the parser is able to continue, and the error does not prevent
+generation of formatted output, but typically, preparing that
+output involves information loss, broken document structure
+or unintended formatting.
+.It Cm warning
+An input file uses obsolete, discouraged or non-portable syntax.
+All the same, the meaning of the input is unambiguous and a correct
+rendering can be produced.
+Documents causing warnings may render poorly when using other
+formatting tools instead of
+.Nm .
+.El
 .Pp
-The following list documents differences between traditional
-.Xr nroff 1
-output and
-.Nm :
+Messages of the
+.Cm warning
+and
+.Cm error
+levels are hidden unless their level, or a lower level, is requested using a
+.Fl W
+option or
+.Fl T Ns Cm lint
+output mode.
 .Pp
+The
+.Nm
+utility may also print messages related to invalid command line arguments
+or operating system errors, for example when memory is exhausted or
+input files cannot be read.
+Such messages do not carry the prefix described above.
+.Sh COMPATIBILITY
+This section summarises
+.Nm
+compatibility with GNU troff.
+Each input and output format is separately noted.
+.Ss ASCII Compatibility
 .Bl -bullet -compact
-.It
-A list of display following
-.Sq \&.Ss
+.It
+Unrenderable unicode codepoints specified with
+.Sq \e[uNNNN]
+escapes are printed as
+.Sq \&?
+in mandoc.
+In GNU troff, these raise an error.
+.It
+The
+.Sq \&Bd \-literal
+and
+.Sq \&Bd \-unfilled
+macros of
+.Xr mdoc 7
+in
+.Fl T Ns Cm ascii
+are synonyms, as are \-filled and \-ragged.
+.It
+In historic GNU troff, the
+.Sq \&Pa
+.Xr mdoc 7
+macro does not underline when scoped under an
+.Sq \&It
+in the FILES section.
+This behaves correctly in
+.Nm .
+.It
+A list or display following the
+.Sq \&Ss
+.Xr mdoc 7
+macro in
+.Fl T Ns Cm ascii
 does not assert a prior vertical break, just as it doesn't with
-.Sq \&.Sh .
+.Sq \&Sh .
 .It
-Special characters don't follow the current font style.
-.\" LIST-ITEM
+The
+.Sq \&na
+.Xr man 7
+macro in
+.Fl T Ns Cm ascii
+has no effect.
 .It
-The \-literal and \-unfilled
-.Sq \&.Bd
-displays types are synonyms, as are \-filled and \-ragged.
+Words aren't hyphenated.
 .El
+.Ss HTML/XHTML Compatibility
+.Bl -bullet -compact
+.It
+The
+.Sq \efP
+escape will revert the font to the previous
+.Sq \ef
+escape, not to the last rendered decoration, which is now dictated by
+CSS instead of hard-coded.
+It also will not span past the current scope,
+for the same reason.
+Note that in
+.Sx ASCII Output
+mode, this will work fine.
+.It
+The
+.Xr mdoc 7
+.Sq \&Bl \-hang
+and
+.Sq \&Bl \-tag
+list types render similarly (no break following overreached left-hand
+side) due to the expressive constraints of HTML.
+.It
+The
+.Xr man 7
+.Sq IP
+and
+.Sq TP
+lists render similarly.
+.El
+.Sh SEE ALSO
+.Xr eqn 7 ,
+.Xr man 7 ,
+.Xr mandoc_char 7 ,
+.Xr mdoc 7 ,
+.Xr roff 7 ,
+.Xr tbl 7
+.Sh AUTHORS
+The
+.Nm
+utility was written by
+.An Kristaps Dzonsons ,
+.Mt kristaps@bsd.lv .
+.Sh CAVEATS
+In
+.Fl T Ns Cm html
+and
+.Fl T Ns Cm xhtml ,
+the maximum size of an element attribute is determined by
+.Dv BUFSIZ ,
+which is usually 1024 bytes.
+Be aware of this when setting long link
+formats such as
+.Fl O Ns Cm style Ns = Ns Ar really/long/link .
+.Pp
+Nesting elements within next-line element scopes of
+.Fl m Ns Cm an ,
+such as
+.Sq br
+within an empty
+.Sq B ,
+will confuse
+.Fl T Ns Cm html
+and
+.Fl T Ns Cm xhtml
+and cause them to forget the formatting of the prior next-line scope.
+.Pp
+The
+.Sq \(aq
+control character is an alias for the standard macro control character
+and does not emit a line-break as stipulated in GNU troff.
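
Reviewer note, not part of the patch: the EXIT STATUS and DIAGNOSTICS sections added in revision 1.102 describe a machine-readable contract, so a wrapper script could consume it roughly as follows. This is a minimal sketch, not code from mandoc. It assumes the documented prefix renders as `file:line:column: LEVEL: message` with the level in capital letters; the function names, the regular expression, and the sample message are all hypothetical.

```python
import re

# Exit codes as listed under EXIT STATUS in the revision 1.102 text.
# The short descriptions paraphrase the manual; they are not mandoc output.
EXIT_STATUS = {
    0: "no warnings or errors at or above the requested level",
    2: "at least one warning, but no error",
    3: "at least one parsing error, but no fatal error",
    4: "a fatal parsing error",
    5: "invalid command line arguments",
    6: "operating system error",
}

# Assumed rendering of the documented prefix "file:line:column: level:".
# The manual says the level is printed in capital letters.
PREFIX = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+):(?P<column>\d+): "
    r"(?P<level>FATAL|ERROR|WARNING): (?P<message>.*)$"
)


def parse_diagnostic(text):
    """Split one stderr line into its fields, or return None when the
    line lacks the prefix (for example an operating system error)."""
    match = PREFIX.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Line and column numbering both start at 1 according to the manual.
    fields["line"] = int(fields["line"])
    fields["column"] = int(fields["column"])
    return fields


def describe_exit_status(code):
    """Map a process exit code to the meaning given in EXIT STATUS."""
    return EXIT_STATUS.get(code, "undocumented exit status")
```

A caller would typically run the utility with `-Tlint`, feed each line of standard error to `parse_diagnostic`, and pass the process return code to `describe_exit_status`. Lines that come back as `None` are the unprefixed messages the DIAGNOSTICS section mentions.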