===================================================================
RCS file: /cvs/mandoc/Attic/mdoc.3,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -p -r1.5 -r1.6
--- mandoc/Attic/mdoc.3 2009/01/20 15:05:01 1.5
+++ mandoc/Attic/mdoc.3 2009/02/23 09:33:34 1.6
@@ -1,8 +1,25 @@
+.\" $Id: mdoc.3,v 1.6 2009/02/23 09:33:34 kristaps Exp $
+.\"
+.\" Copyright (c) 2009 Kristaps Dzonsons
+.\"
+.\" Permission to use, copy, modify, and distribute this software for any
+.\" purpose with or without fee is hereby granted, provided that the
+.\" above copyright notice and this permission notice appear in all
+.\" copies.
+.\"
+.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
+.\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
+.\" WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
+.\" AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
+.\" DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
+.\" PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
+.\" TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
+.\" PERFORMANCE OF THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: January 20 2009 $
+.Dd $Mdocdate: February 23 2009 $
 .Dt mdoc 3
 .Os
-.\"
+.\" SECTION
 .Sh NAME
 .Nm mdoc_alloc ,
 .Nm mdoc_parseln ,
@@ -11,7 +28,7 @@
 .Nm mdoc_meta ,
 .Nm mdoc_free
 .Nd mdoc macro compiler library
-.\"
+.\" SECTION
 .Sh SYNOPSIS
 .Fd #include
 .Vt extern const char * const * mdoc_macronames;
@@ -28,11 +45,28 @@
 .Fn mdoc_meta "struct mdoc *mdoc"
 .Ft int
 .Fn mdoc_endparse "struct mdoc *mdoc"
-.\"
+.\" SECTION
 .Sh DESCRIPTION
 The
 .Nm mdoc
-library parses lines of mdoc-macro text into an abstract syntax tree.
+library parses lines of mdoc input into an abstract syntax tree.
+.Dq mdoc
+is a macro package of the
+.Dq roff
+language, which is used to format BSD manual pages. The
+.Nm
+library implements only those macros documented in the
+.Xr mdoc 7
+and
+.Xr mdoc.samples 7
+manuals.
+.\" PARAGRAPH
+.Pp
+.Nm
+is
+.Ud
+.\" PARAGRAPH
+.Pp
 In general, applications initiate a parsing sequence with
 .Fn mdoc_alloc ,
 parse each line in a document with
@@ -48,10 +82,46 @@ then free all allocated memory with
 See the
 .Sx EXAMPLES
 section for a full example.
+.\" PARAGRAPH
 .Pp
-.\" Function descriptions.
+This section further defines the
+.Sx Types ,
+.Sx Functions
+and
+.Sx Variables
+available to programmers. The last sub-section,
+.Sx Abstract Syntax Tree ,
+documents the output tree.
+.\" SUBSECTION
+.Ss Types
+Both functions (see
+.Sx Functions )
+and variables (see
+.Sx Variables )
+may use the following types:
+.Bl -ohang
+.\" LIST-ITEM
+.It Vt struct mdoc
+An opaque type defined in
+.Pa mdoc.c .
+Its values are only used privately within the library.
+.\" LIST-ITEM
+.It Vt struct mdoc_cb
+A set of message callbacks defined in
+.Pa mdoc.h .
+.\" LIST-ITEM
+.It Vt struct mdoc_node
+A parsed node. Defined in
+.Pa mdoc.h .
+See
+.Sx Abstract Syntax Tree
+for details.
+.El
+.\" SUBSECTION
+.Ss Functions
 Function descriptions follow:
-.Bl -ohang -offset indent
+.Bl -ohang
+.\" LIST-ITEM
 .It Fn mdoc_alloc
 Allocates a parsing structure. The
 .Fa data
@@ -60,20 +130,24 @@ pointer is passed to callbacks in
 which are documented further in the header file. Returns NULL on
 failure. If non-NULL, the pointer must be freed with
 .Fn mdoc_free .
+.\" LIST-ITEM
 .It Fn mdoc_free
 Free all resources of a parser. The pointer is no longer valid after
 invocation.
+.\" LIST-ITEM
 .It Fn mdoc_parseln
 Parse a nil-terminated line of input. This line should not contain the
 trailing newline. Returns 0 on failure, 1 on success. The input buffer
 .Fa buf
 is modified by this function.
+.\" LIST-ITEM
 .It Fn mdoc_endparse
 Signals that the parse is complete. Note that if
 .Fn mdoc_endparse
 is called subsequent to
 .Fn mdoc_node ,
 the resulting tree is incomplete. Returns 0 on failure, 1 on success.
+.\" LIST-ITEM
 .It Fn mdoc_node
 Returns the first node of the parse. Note that if
 .Fn mdoc_parseln
@@ -88,20 +162,108 @@ or
 .Fn mdoc_endparse
 return 0, the data will be incomplete.
 .El
-.Pp
-.\" Variable descriptions.
+.\" SUBSECTION
+.Ss Variables
 The following variables are also defined:
-.Bl -ohang -offset indent
+.Bl -ohang
+.\" LIST-ITEM
 .It Va mdoc_macronames
 An array of string-ified token names.
+.\" LIST-ITEM
 .It Va mdoc_argnames
 An array of string-ified token argument names.
 .El
-.Pp
+.\" SUBSECTION
+.Ss Abstract Syntax Tree
+The
 .Nm
-is
-.Ud
-.\"
+functions produce an abstract syntax tree (AST) describing the input
+lines in a regular form. It may be reviewed at any time with
+.Fn mdoc_nodes ;
+however, if called before
+.Fn mdoc_endparse ,
+or after
+.Fn mdoc_endparse
+or
+.Fn mdoc_parseln
+fail, it may be incomplete.
+.\" PARAGRAPH
+.Pp
+The AST is composed of
+.Vt struct mdoc_node
+nodes with block, head, body, element, root and text types as declared
+by the
+.Va type
+field. Each node also provides its parse point (the
+.Va line ,
+.Va sec ,
+and
+.Va pos
+fields), its position in the tree (the
+.Va parent ,
+.Va child ,
+.Va next
+and
+.Va prev
+fields) and type-specific data (the
+.Va data
+field).
+.\" PARAGRAPH
+.Pp
+The tree itself is arranged according to the following normal form,
+where capitalised non-terminals represent nodes.
+.Pp
+.Bl -tag -width "ELEMENTXX" -compact
+.\" LIST-ITEM
+.It ROOT
+\(<- mnode+
+.It mnode
+\(<- BLOCK | ELEMENT | TEXT
+.It BLOCK
+\(<- (HEAD [TEXT])+ [BODY [TEXT]] [TAIL [TEXT]]
+.It BLOCK
+\(<- BODY [TEXT] [TAIL [TEXT]]
+.It ELEMENT
+\(<- TEXT*
+.It HEAD
+\(<- mnode+
+.It BODY
+\(<- mnode+
+.It TAIL
+\(<- mnode+
+.It TEXT
+\(<- [[:alpha:]]*
+.El
+.\" PARAGRAPH
+.Pp
+Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
+the BLOCK production. These refer to punctuation marks. Furthermore,
+although a TEXT node will generally have a non-zero-length string, it
+certain cases, such as
+.Dq \&.Bd \-literal ,
+an empty line will produce a zero-length string.
+.\" PARAGRAPH
+.Pp
+The rule-of-thumb for mapping node types to macros follows: in-line
+elements, such as
+.Dq \&.Em foo ,
+are classified as ELEMENT nodes, which can only contain text.
+Multi-line elements such as
+.Dq \&.Sh
+are BLOCK elements, where the HEAD constitutes line contents and the
+BODY constitutes subsequent lines. In-line elements with matching
+pairs, such as
+.Dq \&.So
+and
+.Dq \&.Sc ,
+are BLOCK elements with no HEAD tag. The only exception to this is
+.Dq \&.Eo
+and
+.Dq \&.Ec ,
+which has a HEAD and TAIL node corresponding to the enclosure string.
+TEXT nodes, obviously, constitute text; the ROOT node is the document's
+root.
+.\" SECTION
 .Sh EXAMPLES
 The following example reads lines from stdin and parses them, operating
 on the finished parse tree with
@@ -135,21 +297,19 @@ if (NULL == (node = mdoc_node(mdoc)))
 parsed(mdoc, node);
 mdoc_free(mdoc);
 .Ed
-.\"
+.\" SECTION
 .Sh SEE ALSO
 .Xr mdoc 7 ,
 .Xr mdoc.samples 7 ,
 .Xr groff 1 ,
 .Xr mdocml 1
-.\"
-.\"
+.\" SECTION
 .Sh AUTHORS
 The
 .Nm
 utility was written by
 .An Kristaps Dzonsons Aq kristaps@kth.se .
-.\"
-.\"
+.\" SECTION
 .Sh BUGS
 Bugs, un-implemented macros and incompabilities are documented in this
 section. The baseline for determining whether macro parsing is
@@ -189,4 +349,6 @@ for
 .Ox
 are supported; support for
 .Nx
-and other BSD systems is in progress.
+and other
+.Bx
+systems is in progress.
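
Reviewer note on the new "Abstract Syntax Tree" sub-section: the linkage fields it names (parent, child, next and prev) imply a first-child/next-sibling tree that can be walked depth-first. The following is only a minimal sketch of that shape in C. The type struct demo_node, the enum demo_type and the helpers demo_append() and demo_count() are hypothetical names invented for illustration; they are not declared in mdoc.h, and the real struct mdoc_node carries further fields (parse point, type-specific data) that may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds, named after the non-terminals in the normal form. */
enum demo_type {
	DEMO_ROOT, DEMO_BLOCK, DEMO_HEAD, DEMO_BODY,
	DEMO_TAIL, DEMO_ELEM, DEMO_TEXT
};

/*
 * Hypothetical stand-in for struct mdoc_node: only a type tag and the
 * four linkage fields mentioned in the manual text.
 */
struct demo_node {
	enum demo_type	  type;
	struct demo_node *parent;
	struct demo_node *child;	/* first child, or NULL */
	struct demo_node *next;		/* next sibling, or NULL */
	struct demo_node *prev;		/* previous sibling, or NULL */
};

/* Append n as the last child of parent, keeping all four links consistent. */
void
demo_append(struct demo_node *parent, struct demo_node *n)
{
	struct demo_node *last;

	assert(NULL != parent && NULL != n);

	n->parent = parent;
	n->next = NULL;
	n->prev = NULL;

	if (NULL == parent->child) {
		parent->child = n;
		return;
	}
	for (last = parent->child; last->next; last = last->next)
		continue;
	last->next = n;
	n->prev = last;
}

/*
 * Depth-first count of nodes with the given type among n, its following
 * siblings and all of their descendants.
 */
int
demo_count(const struct demo_node *n, enum demo_type type)
{
	int total = 0;

	for ( ; n; n = n->next) {
		if (type == n->type)
			total++;
		total += demo_count(n->child, type);
	}
	return total;
}
```

Calling demo_count() on a root node visits the whole tree, because a root has no siblings. Against the real library, the starting node would presumably be the pointer returned by mdoc_node() after a successful mdoc_endparse().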