===================================================================
RCS file: /cvs/mandoc/mandoc.1,v
retrieving revision 1.18
retrieving revision 1.50
diff -u -p -r1.18 -r1.50
--- mandoc/mandoc.1 2009/06/11 07:26:35 1.18
+++ mandoc/mandoc.1 2010/01/29 14:39:38 1.50
@@ -1,4 +1,4 @@
-.\" $Id: mandoc.1,v 1.18 2009/06/11 07:26:35 kristaps Exp $
+.\" $Id: mandoc.1,v 1.50 2010/01/29 14:39:38 kristaps Exp $
 .\"
 .\" Copyright (c) 2009 Kristaps Dzonsons
 .\"
@@ -14,66 +14,71 @@
 .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: June 11 2009 $
+.Dd $Mdocdate: January 29 2010 $
 .Dt MANDOC 1
 .Os
-.\" SECTION
+.
+.
 .Sh NAME
 .Nm mandoc
 .Nd format and display UNIX manuals
-.\" SECTION
+.
+.
 .Sh SYNOPSIS
 .Nm mandoc
-.Op Fl V
 .Op Fl f Ns Ar option...
 .Op Fl m Ns Ar format
-.Op Fl W Ns Ar err...
+.Op Fl O Ns Ar option...
 .Op Fl T Ns Ar output
+.Op Fl V
+.Op Fl W Ns Ar err...
 .Op Ar infile...
-.\" SECTION
+.
+.
 .Sh DESCRIPTION
 The
 .Nm
-utility formats
+utility formats
 .Ux
 manual pages for display. The arguments are as follows:
-.Bl -tag -width XXXXXXXXXXXX
-.\" ITEM
+.
+.Bl -tag -width Ds
 .It Fl f Ns Ar option...
-Override default compiler behaviour. See
+Comma-separated compiler options. See
 .Sx Compiler Options
 for details.
-.\" ITEM
-.It Fl m
+.
+.It Fl m Ns Ar format
 Input format. See
 .Sx Input Formats
 for available formats. Defaults to
 .Fl m Ns Ar andoc .
-.\" ITEM
-.It Fl T
+.
+.It Fl O Ns Ar option...
+Comma-separated output options. See
+.Sx Output Options
+for details.
+.
+.It Fl T Ns Ar output
 Output format. See
 .Sx Output Formats
 for available formats. Defaults to
 .Fl T Ns Ar ascii .
-.\" ITEM
+.
 .It Fl V
 Print version and exit.
-.\" ITEM
+.
 .It Fl W Ns Ar err...
-Print warning messages. May be set to
+Comma-separated warning options. Use
 .Fl W Ns Ar all
-for all warnings,
-.Ar compat
-for groff/troff-compatibility warnings, or
-.Ar syntax
-for syntax warnings. If
-.Fl W Ns Ar error
-is specified, warnings are considered errors and cause utility
-termination. Multiple
+to print warnings,
+.Fl W Ns Ar error
+for warnings to be considered errors and cause utility
+termination. Multiple
 .Fl W
 arguments may be comma-separated, such as
 .Fl W Ns Ar error,all .
-.\" ITEM
+.
 .It Ar infile...
 Read input from zero or more
 .Ar infile .
@@ -81,54 +86,24 @@ If unspecified, reads from stdin. If multiple files a
 .Nm
 will halt with the first failed parse.
 .El
-.\" PARAGRAPH
+.
 .Pp
-By default,
-.Nm
-reads
+By default,
+.Nm
+reads
 .Xr mdoc 7
 or
 .Xr man 7
 text from stdin, implying
 .Fl m Ns Ar andoc ,
-and prints 78-column backspace-encoded output to stdout as if
+and produces
 .Fl T Ns Ar ascii
-were provided.
-.\" PARAGRAPH
+output.
+.
 .Pp
 .Ex -std mandoc
-.\" SUB-SECTION
-.Ss Punctuation
-If punctuation is set apart from words, such as in the phrase
-.Dq to be \&, or not to be ,
-it's processed by
-.Nm
-according to the following rules. Opening punctuation
-.Po
-.Sq \&( ,
-.Sq \&[ ,
-and
-.Sq \&{
-.Pc
-is not followed by a space. Closing punctuation
-.Po
-.Sq \&. ,
-.Sq \&, ,
-.Sq \&; ,
-.Sq \&: ,
-.Sq \&? ,
-.Sq \&! ,
-.Sq \&) ,
-.Sq \&]
-and
-.Sq \&}
-.Pc
-is not preceded by whitespace.
-.Pp
-If the input is
-.Xr mdoc 7 ,
-these rules are also applied to macro arguments when appropriate.
-.\" SUB-SECTION
+.
+.
 .Ss Input Formats
 The
 .Nm
@@ -144,146 +119,409 @@ respectively. The
 .Xr mdoc 7
 format is
 .Em strongly
-recommended;
+recommended;
 .Xr man 7
 should only be used for legacy manuals.
+.
 .Pp
 A third option,
 .Fl m Ns Ar andoc ,
 which is also the default, determines encoding on-the-fly: if the first
-non-comment macro is
-.Sq \&.Dd
+non-comment macro is
+.Sq \&Dd
 or
-.Sq \&.Dt ,
-the
+.Sq \&Dt ,
+the
 .Xr mdoc 7
 parser is used; otherwise, the
 .Xr man 7
 parser is used.
+.
 .Pp
 If multiple
-files are specified with
-.Fl m Ns Ar andoc ,
+files are specified with
+.Fl m Ns Ar andoc ,
 each has its file-type determined this way. If multiple files are
 specified and
 .Fl m Ns Ar doc
 or
 .Fl m Ns Ar an
 is specified, then this format is used exclusively.
-.\" .Pp
-.\" The following escape sequences are recognised, although the per-format
-.\" compiler may not allow certain sequences.
-.\" .Bl -tag -width Ds -offset XXXX
-.\" .It \efX
-.\" sets the font mode to X (B, I, R or P, where P resets the font)
-.\" .It \eX, \e(XX, \e[XN]
-.\" queries the special-character table for a corresponding symbol
-.\" .It \e*X, \e*(XX, \e*[XN]
-.\" deprecated special-character format
-.\" .El
-.\" SUB-SECTION
+.
+.
 .Ss Output Formats
 The
 .Nm
 utility accepts the following
 .Fl T
-arguments:
-.Bl -tag -width XXXXXXXXXXXX
+arguments (see
+.Sx OUTPUT ) :
+.
+.Bl -tag -width Ds
 .It Fl T Ns Ar ascii
 Produce 7-bit ASCII output, backspace-encoded for bold and underline
-styles. This is the default.
+styles. This is the default. See
+.Sx ASCII Output .
+.
+.It Fl T Ns Ar html
+Produce strict HTML-4.01 output, with a sane default style. See
+.Sx HTML Output .
+.
+.It Fl T Ns Ar xhtml
+Produce strict XHTML-1.0 output, with a sane default style. See
+.Sx XHTML Output .
+.
 .It Fl T Ns Ar tree
 Produce an indented parse tree.
+.
 .It Fl T Ns Ar lint
 Parse only: produce no output.
 .El
+.
 .Pp
 If multiple input files are specified, these will be processed by the
 corresponding filter in-order.
-.\" SUB-SECTION
+.
+.
 .Ss Compiler Options
-Default compiler behaviour may be overriden with the
+Default compiler behaviour may be overridden with the
 .Fl f
 flag.
-.Bl -tag -width XXXXXXXXXXXXXX
+.
+.Bl -tag -width Ds
 .It Fl f Ns Ar ign-scope
 When rewinding the scope of a block macro, forces the compiler to ignore
 scope violations. This can seriously mangle the resulting tree.
 .Pq mdoc only
+.
 .It Fl f Ns Ar ign-escape
 Ignore invalid escape sequences.
-.It Fl f Ns Ar ign-macro
-Ignore unknown macros at the start of input lines (default for
-.Xr man 7
-parsing).
+This is the default, but the option can be used to override an earlier
+.Fl f Ns Ar strict .
+.
+.It Fl f Ns Ar no-ign-escape
+Don't ignore invalid escape sequences.
+.
 .It Fl f Ns Ar no-ign-macro
-Do not ignore unknown macros at the start of input lines (default for
-.Xr mdoc 7
-parsing).
+Do not ignore unknown macros at the start of input lines.
+.
+.It Fl f Ns Ar no-ign-chars
+Do not ignore disallowed characters.
+.
+.It Fl f Ns Ar strict
+Implies
+.Fl f Ns Ar no-ign-escape ,
+.Fl f Ns Ar no-ign-macro
+and
+.Fl f Ns Ar no-ign-chars .
+.
+.It Fl f Ns Ar ign-errors
+Don't halt when encountering parse errors. Useful with
+.Fl T Ns Ar lint
+over a large set of manuals passed on the command line.
 .El
-.\" PARAGRAPH
+.
+.
+.Ss Output Options
+For the time being, only
+.Fl T Ns Ar html
+accepts output options:
+.Bl -tag -width Ds
+.It Fl O Ns Ar style=style.css
+The file
+.Ar style.css
+is used for an external style-sheet. This must be a valid absolute or
+relative URI.
+.It Fl O Ns Ar includes=fmt
+The string
+.Ar fmt ,
+for example,
+.Ar ../src/%I.html ,
+is used as a template for linked header files (usually via the
+.Sq \&In
+macro). Instances of
+.Sq \&%I
+are replaced with the include filename. The default is not to present a
+hyperlink.
+.It Fl O Ns Ar man=fmt
+The string
+.Ar fmt ,
+for example,
+.Ar ../html%S/%N.%S.html ,
+is used as a template for linked manuals (usually via the
+.Sq \&Xr
+macro). Instances of
+.Sq \&%N
+and
+.Sq %S
+are replaced with the linked manual's name and section, respectively.
+If no section is included, section 1 is assumed. The default is not to
+present a hyperlink.
+.El
+.
+.
+.Sh OUTPUT
+This section documents output details of
+.Nm .
+In general, output conforms to the traditional manual style of a header,
+a body composed of sections and sub-sections, and a footer.
 .Pp
-As with the
-.Fl W
-flag, multiple
-.Fl f
-options may be grouped and delimited with a comma. Using
-.Fl f Ns Ar ign-scope,ign-escape ,
-for example, will try to ignore scope and character-escape errors.
-.\" SECTION
+The text style of output characters (non-macro characters, punctuation,
+and white-space) is dictated by context.
+.Pp
+White-space is generally stripped from input. This can be changed with
+character escapes (specified in
+.Xr mandoc_char 7 )
+or literal modes (specified in
+.Xr mdoc 7
+and
+.Xr man 7 ) .
+.Pp
+If non-macro punctuation is set apart from words, such as in the phrase
+.Dq to be \&, or not to be ,
+it's processed by
+.Nm ,
+regardless of output format, according to the following rules: opening
+punctuation
+.Po
+.Sq \&( ,
+.Sq \&[ ,
+and
+.Sq \&{
+.Pc
+is not followed by a space; closing punctuation
+.Po
+.Sq \&. ,
+.Sq \&, ,
+.Sq \&; ,
+.Sq \&: ,
+.Sq \&? ,
+.Sq \&! ,
+.Sq \&) ,
+.Sq \&]
+and
+.Sq \&}
+.Pc
+is not preceded by white-space.
+.
+.Pp
+If the input is
+.Xr mdoc 7 ,
+however, these rules are also applied to macro arguments when appropriate.
+.
+.
+.Ss ASCII Output
+Output produced by
+.Fl T Ns Ar ascii ,
+which is the default, is rendered in standard 7-bit ASCII documented in
+.Xr ascii 7 .
+.Pp
+Font styles are applied by using back-spaced encoding such that an
+underlined character
+.Sq c
+is rendered as
+.Sq _ Ns \e[bs] Ns c ,
+where
+.Sq \e[bs]
+is the back-space character number 8. Emboldened characters are rendered as
+.Sq c Ns \e[bs] Ns c .
+.Pp
+The special characters documented in
+.Xr mandoc_char 7
+are rendered best-effort in an ASCII equivalent.
+.Pp
+Output width is limited to 78 visible columns unless literal input lines
+exceed this limit.
+.
+.
+.Ss HTML Output
+Output produced by
+.Fl T Ns Ar html
+conforms to HTML-4.01 strict.
+.Pp
+Font styles and page structure are applied using CSS2. By default, no
+font style is applied to any text, although CSS2 is hard-coded to format
+the basic structure of output.
+.Pp
+The
+.Pa example.style.css
+file documents the range of styles applied to output and, if used, will
+cause rendered documents to appear as they do in
+.Fl T Ns Ar ascii .
+.Pp
+Special characters are rendered in decimal-encoded UTF-8.
+.
+.
+.Ss XHTML Output
+Output produced by
+.Fl T Ns Ar xhtml
+conforms to XHTML-1.0 strict.
+.Pp
+See
+.Sx HTML Output
+for details; beyond generating XHTML tags instead of HTML tags, these
+output modes are identical.
+.
+.
 .Sh EXAMPLES
 To page manuals to the terminal:
-.\" PARAGRAPH
+.
 .Pp
-.D1 % mandoc \-Wall,error mandoc.1 2>&1 | less
-.Pp
+.D1 % mandoc \-Wall,error \-fstrict mandoc.1 2>&1 | less
 .D1 % mandoc mandoc.1 mdoc.3 mdoc.7 | less
-.\" SECTION
+.
+.Pp
+To produce HTML manuals with
+.Ar style.css
+as the style-sheet:
+.Pp
+.D1 % mandoc \-Thtml -Ostyle=style.css mdoc.7 > mdoc.7.html
+.Pp
+To check over a large set of manuals:
+.
+.Pp
+.Dl % mandoc \-Tlint \-fign-errors `find /usr/src -name \e*\e.[1-9]`
+.
+.
+.Sh COMPATIBILITY
+This section summarises
+.Nm
+compatibility with
+.Xr groff 1 .
+Each input and output format is separately noted.
+.
+.
+.Ss ASCII Compatibility
+.Bl -bullet -compact
+.It
+The
+.Sq \e~
+special character doesn't produce expected behaviour in
+.Fl T Ns Ar ascii .
+.
+.It
+The
+.Sq \&Bd \-literal
+and
+.Sq \&Bd \-unfilled
+macros of
+.Xr mdoc 7
+in
+.Fl T Ns Ar ascii
+are synonyms, as are \-filled and \-ragged.
+.
+.It
+In
+.Xr groff 1 ,
+the
+.Sq \&Pa
+.Xr mdoc 7
+macro does not underline when scoped under an
+.Sq \&It
+in the FILES section. This behaves correctly in
+.Nm .
+.
+.It
+A list or display following
+.Sq \&Ss
+.Xr mdoc 7
+macro in
+.Fl T Ns Ar ascii
+does not assert a prior vertical break, just as it doesn't with
+.Sq \&Sh .
+.
+.It
+The
+.Sq \&na
+.Xr man 7
+macro in
+.Fl T Ns Ar ascii
+has no effect.
+.
+.It
+Words aren't hyphenated.
+.
+.It
+In normal mode (not a literal block), blocks of spaces aren't preserved,
+so double spaces following sentence closure are reduced to a single space;
+.Xr groff 1
+retains spaces.
+.
+.It
+Sentences are unilaterally monospaced.
+.El
+.
+.
+.Ss HTML/XHTML Compatibility
+.Bl -bullet -compact
+.It
+The
+.Sq \efP
+escape will revert the font to the previous
+.Sq \ef
+escape, not to the last rendered decoration, which is now dictated by
+CSS instead of hard-coded. It also will not span past the current
+scope, for the same reason. Note that in
+.Sx ASCII Output
+mode, this will work fine.
+.It
+The
+.Xr mdoc 7
+.Sq \&Bl \-hang
+and
+.Sq \&Bl \-tag
+list types render similarly (no break following overreached left-hand
+side) due to the expressive constraints of HTML.
+.
+.It
+The
+.Xr man 7
+.Sq IP
+and
+.Sq TP
+lists render similarly.
+.El
+.
+.
 .Sh SEE ALSO
 .Xr mandoc_char 7 ,
 .Xr mdoc 7 ,
 .Xr man 7
-.\"
+.
 .Sh AUTHORS
 The
 .Nm
-utility was written by
+utility was written by
 .An Kristaps Dzonsons Aq kristaps@kth.se .
-.\" SECTION
+.
+.
 .Sh CAVEATS
-The
-.Nm
-utility in
-.Fl T Ns Ar ascii
-mode doesn't yet know how to display the following:
+The
+.Fl T Ns Ar html
+and
+.Fl T Ns Ar xhtml
+CSS2 styling used for
+.Fl m Ns Ar doc
+input lists does not render properly in brain-dead browsers, such as
+Internet Explorer 6 and earlier.
 .Pp
-.Bl -bullet -compact
-.It
-The \-hang
-.Sq \&.Bl
-list is not yet supported.
-.El
+In
+.Fl T Ns Ar html
+and
+.Fl T Ns Ar xhtml ,
+the maximum size of an element attribute is determined by
+.Dv BUFSIZ ,
+which is usually 1024 bytes. Be aware of this when setting long link
+formats, e.g.,
+.Fl O Ns Ar style=really/long/link .
 .Pp
-Other macros still aren't supported by virtue of nobody complaining
-about their absence. Please report any omissions: this is a work in
-progress.
-.Pp
-The following list documents differences between traditional
-.Xr nroff 1
-output and
-.Nm :
-.Pp
-.Bl -bullet -compact
-.It
-A list of display following
-.Sq \&.Ss
-does not assert a prior vertical break, just as it doesn't with
-.Sq \&.Sh .
-.It
-Special characters don't follow the current font style.
-.\" LIST-ITEM
-.It
-The \-literal and \-unfilled
-.Sq \&.Bd
-displays types are synonyms, as are \-filled and \-ragged.
-.El
+The
+.Fl T Ns Ar html
+and
+.Fl T Ns Ar xhtml
+output modes don't render the
+.Sq \es
+font size escape documented in
+.Xr mdoc 7
+and
+.Xr man 7 .