===================================================================
RCS file: /cvs/mandoc/Attic/mdoc.3,v
retrieving revision 1.6
retrieving revision 1.11
diff -u -p -r1.6 -r1.11
--- mandoc/Attic/mdoc.3 2009/02/23 09:33:34 1.6
+++ mandoc/Attic/mdoc.3 2009/02/25 17:02:47 1.11
@@ -1,4 +1,4 @@
-.\" $Id: mdoc.3,v 1.6 2009/02/23 09:33:34 kristaps Exp $
+.\" $Id: mdoc.3,v 1.11 2009/02/25 17:02:47 kristaps Exp $
 .\"
 .\" Copyright (c) 2009 Kristaps Dzonsons
 .\"
@@ -16,7 +16,7 @@
 .\" TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
 .\" PERFORMANCE OF THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: February 23 2009 $
+.Dd $Mdocdate: February 25 2009 $
 .Dt mdoc 3
 .Os
 .\" SECTION
@@ -50,16 +50,19 @@
 The
 .Nm mdoc
 library parses lines of mdoc input into an abstract syntax tree.
-.Dq mdoc
-is a macro package of the
+.Dq mdoc ,
+which is used to format BSD manual pages, is a macro package of the
 .Dq roff
-language, which is used to format BSD manual pages. The
+language. The
 .Nm
 library implements only those macros documented in the
 .Xr mdoc 7
 and
 .Xr mdoc.samples 7
-manuals.
+manuals. Documents with
+.Xr refer 1 ,
+.Xr eqn 1
+and other pre-processor sections aren't accomodated.
 .\" PARAGRAPH
 .Pp
 .Nm
@@ -89,7 +92,9 @@ This section further defines the
 .Sx Functions
 and
 .Sx Variables
-available to programmers. The last sub-section,
+available to programmers. Following that,
+.Sx Character Encoding
+describes input format. Lastly,
 .Sx Abstract Syntax Tree ,
 documents the output tree.
 .\" SUBSECTION
@@ -99,7 +104,7 @@ Both functions (see
 and variables (see
 .Sx Variables )
 may use the following types:
-.Bl -ohang
+.Bl -ohang -offset "XXXX"
 .\" LIST-ITEM
 .It Vt struct mdoc
 An opaque type defined in
@@ -120,7 +125,7 @@ for details.
 .\" SUBSECTION
 .Ss Functions
 Function descriptions follow:
-.Bl -ohang
+.Bl -ohang -offset "XXXX"
 .\" LIST-ITEM
 .It Fn mdoc_alloc
 Allocates a parsing structure. The
@@ -165,7 +170,7 @@ return 0, the data will be incomplete.
 .\" SUBSECTION
 .Ss Variables
 The following variables are also defined:
-.Bl -ohang
+.Bl -ohang -offset "XXXX"
 .\" LIST-ITEM
 .It Va mdoc_macronames
 An array of string-ified token names.
@@ -174,6 +179,21 @@ An array of string-ified token names.
 An array of string-ified token argument names.
 .El
 .\" SUBSECTION
+.Ss Character Encoding
+The
+.Xr mdoc 3
+library accepts only printable ASCII characters as defined by
+.Xr isprint 3 .
+Non-ASCII character sequences are escaped with an escape character
+.Sq \\
+and followed by either an open-parenthesis
+.Sq \&(
+for two-character sequences; an open-bracket
+.Sq \&[
+for n-character sequences (terminated at a close-bracket
+.Sq \&] ) ;
+or one of a small set of single characters for other escapes.
+.\" SUBSECTION
 .Ss Abstract Syntax Tree
 The
 .Nm
@@ -213,7 +233,7 @@ field).
 The tree itself is arranged according to the following normal form,
 where capitalised non-terminals represent nodes.
 .Pp
-.Bl -tag -width "ELEMENTXX" -compact
+.Bl -tag -width "ELEMENTXX" -compact -offset "XXXX"
 .\" LIST-ITEM
 .It ROOT
 \(<- mnode+
@@ -238,31 +258,31 @@ where capitalised non-terminals represent nodes.
 .Pp
 Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
 the BLOCK production. These refer to punctuation marks. Furthermore,
-although a TEXT node will generally have a non-zero-length string, it
-certain cases, such as
-.Dq \&.Bd \-literal ,
+although a TEXT node will generally have a non-zero-length string, in
+the specific case of
+.Sq \&.Bd \-literal ,
 an empty line will produce a zero-length string.
 .\" PARAGRAPH
 .Pp
-The rule-of-thumb for mapping node types to macros follows: in-line
+The rule-of-thumb for mapping node types to macros follows. In-line
 elements, such as
-.Dq \&.Em foo ,
+.Sq \&.Em foo ,
 are classified as ELEMENT nodes, which can only contain text.
-Multi-line elements such as
-.Dq \&.Sh
+Multi-line elements, such as
+.Sq \&.Sh ,
 are BLOCK elements, where the HEAD constitutes line contents and the
 BODY constitutes subsequent lines. In-line elements with matching
 pairs, such as
-.Dq \&.So
+.Sq \&.So
 and
-.Dq \&.Sc ,
+.Sq \&.Sc ,
 are BLOCK elements with no HEAD tag. The only exception to this is
-.Dq \&.Eo
+.Sq \&.Eo
 and
-.Dq \&.Ec ,
+.Sq \&.Ec ,
 which has a HEAD and TAIL node corresponding to the enclosure string.
-TEXT nodes, obviously, constitute text; the ROOT node is the document's
-root.
+TEXT nodes, obviously, constitute text, and the ROOT node is the
+document's root.
 .\" SECTION
 .Sh EXAMPLES
 The following example reads lines from stdin and parses them, operating
@@ -272,7 +292,7 @@ Note that, if the last line of the file isn't newline-
 will truncate the file's last character (see
 .Xr fgetln 3 ) .
 Further, this example does not error-check nor free memory upon failure.
-.Bd -literal
+.Bd -literal -offset "XXXX"
 struct mdoc *mdoc;
 struct mdoc_node *node;
 char *buf;
@@ -318,6 +338,7 @@ is the default
 .Xr groff 1
 system bundled with
 .Ox .
+.\" PARAGRAPH
 .Pp
 Un-implemented: the
 .Sq \&Xc
@@ -327,19 +348,27 @@ macros aren't handled when used to span lines for the
 .Sq \&It
 macro. Such usage is specifically discouraged in
 .Xr mdoc.samples 7 .
+.\" PARAGRAPH
 .Pp
 Bugs: when
 .Sq \&It \-column
 is invoked, whitespace is not stripped around
 .Sq \&Ta
 or tab-character separators.
+.\" PARAGRAPH
 .Pp
+Bugs: elements within columns for
+.Sq \&It \-column
+are not yet supported.
+.\" PARAGRAPH
+.Pp
 Incompatible: the
 .Sq \&At
 macro only accepts a single parameter. Furthermore, several macros
 .Pf ( Sq \&Pp ,
 .Sq \&It ,
 and possibly others) accept multiple arguments with a warning.
+.\" PARAGRAPH
 .Pp
 Incompatible: only those macros specified by
 .Xr mdoc.samples 7