version 1.5, 2009/01/20 15:05:01
version 1.6, 2009/02/23 09:33:34

.\" $Id$
.\"
.\" Copyright (c) 2009 Kristaps Dzonsons <kristaps@kth.se>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the
.\" above copyright notice and this permission notice appear in all
.\" copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
.\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
.\" AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
.\" DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
.\" PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
.\" TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
.\" PERFORMANCE OF THIS SOFTWARE.
.\" |
.\" |
.Dd $Mdocdate$ |
.Dd $Mdocdate$ |
.Dt mdoc 3 |
.Dt mdoc 3 |
.Os |
.Os |
.\" |
.\" SECTION |
.Sh NAME |
.Sh NAME |
.Nm mdoc_alloc , |
.Nm mdoc_alloc , |
.Nm mdoc_parseln , |
.Nm mdoc_parseln , |
|
|
.Nm mdoc_meta , |
.Nm mdoc_meta , |
.Nm mdoc_free |
.Nm mdoc_free |
.Nd mdoc macro compiler library |
.Nd mdoc macro compiler library |
.\" |
.\" SECTION |
.Sh SYNOPSIS |
.Sh SYNOPSIS |
.Fd #include <mdoc.h> |
.Fd #include <mdoc.h> |
.Vt extern const char * const * mdoc_macronames; |
.Vt extern const char * const * mdoc_macronames; |
|
|
.Fn mdoc_meta "struct mdoc *mdoc" |
.Fn mdoc_meta "struct mdoc *mdoc" |
.Ft int |
.Ft int |
.Fn mdoc_endparse "struct mdoc *mdoc" |
.Fn mdoc_endparse "struct mdoc *mdoc" |
.\" |
.\" SECTION |
.Sh DESCRIPTION |
.Sh DESCRIPTION |
The |
The |
.Nm mdoc |
.Nm mdoc |
library parses lines of mdoc-macro text into an abstract syntax tree. |
library parses lines of mdoc input into an abstract syntax tree. |
|
.Dq mdoc |
|
is a macro package of the |
|
.Dq roff |
|
language, which is used to format BSD manual pages. The |
|
.Nm |
|
library implements only those macros documented in the |
|
.Xr mdoc 7 |
|
and |
|
.Xr mdoc.samples 7 |
|
manuals. |
|
.\" PARAGRAPH |
|
.Pp |
|
.Nm |
|
is |
|
.Ud |
|
.\" PARAGRAPH |
|
.Pp |
In general, applications initiate a parsing sequence with |
In general, applications initiate a parsing sequence with |
.Fn mdoc_alloc , |
.Fn mdoc_alloc , |
parse each line in a document with |
parse each line in a document with |
Line 48
then free all allocated memory with

Line 82
then free all allocated memory with
|
See the |
See the |
.Sx EXAMPLES |
.Sx EXAMPLES |
section for a full example. |
section for a full example. |
|
.\" PARAGRAPH |
.Pp |
.Pp |
.\" Function descriptions. |
This section further defines the |
|
.Sx Types , |
|
.Sx Functions |
|
and |
|
.Sx Variables |
|
available to programmers. The last sub-section, |
|
.Sx Abstract Syntax Tree , |
|
documents the output tree. |
|
.\" SUBSECTION |
|
.Ss Types |
|
Both functions (see |
|
.Sx Functions ) |
|
and variables (see |
|
.Sx Variables ) |
|
may use the following types: |
|
.Bl -ohang |
|
.\" LIST-ITEM |
|
.It Vt struct mdoc |
|
An opaque type defined in |
|
.Pa mdoc.c . |
|
Its values are only used privately within the library. |
|
.\" LIST-ITEM |
|
.It Vt struct mdoc_cb |
|
A set of message callbacks defined in |
|
.Pa mdoc.h . |
|
.\" LIST-ITEM |
|
.It Vt struct mdoc_node |
|
A parsed node. Defined in |
|
.Pa mdoc.h . |
|
See |
|
.Sx Abstract Syntax Tree |
|
for details. |
|
.El |
|
.\" SUBSECTION |
|
.Ss Functions |
Function descriptions follow: |
Function descriptions follow: |
.Bl -ohang -offset indent |
.Bl -ohang |
|
.\" LIST-ITEM |
.It Fn mdoc_alloc |
.It Fn mdoc_alloc |
Allocates a parsing structure. The |
Allocates a parsing structure. The |
.Fa data |
.Fa data |
Line 60
pointer is passed to callbacks in

Line 130
pointer is passed to callbacks in
|
which are documented further in the header file. Returns NULL on |
which are documented further in the header file. Returns NULL on |
failure. If non-NULL, the pointer must be freed with |
failure. If non-NULL, the pointer must be freed with |
.Fn mdoc_free . |
.Fn mdoc_free . |
|
.\" LIST-ITEM |
.It Fn mdoc_free |
.It Fn mdoc_free |
Free all resources of a parser. The pointer is no longer valid after |
Free all resources of a parser. The pointer is no longer valid after |
invocation. |
invocation. |
|
.\" LIST-ITEM |
.It Fn mdoc_parseln |
.It Fn mdoc_parseln |
Parse a nil-terminated line of input. This line should not contain the |
Parse a nil-terminated line of input. This line should not contain the |
trailing newline. Returns 0 on failure, 1 on success. The input buffer |
trailing newline. Returns 0 on failure, 1 on success. The input buffer |
.Fa buf |
.Fa buf |
is modified by this function. |
is modified by this function. |
|
.\" LIST-ITEM |
.It Fn mdoc_endparse |
.It Fn mdoc_endparse |
Signals that the parse is complete. Note that if |
Signals that the parse is complete. Note that if |
.Fn mdoc_endparse |
.Fn mdoc_endparse |
is called subsequent to |
is called subsequent to |
.Fn mdoc_node , |
.Fn mdoc_node , |
the resulting tree is incomplete. Returns 0 on failure, 1 on success. |
the resulting tree is incomplete. Returns 0 on failure, 1 on success. |
|
.\" LIST-ITEM |
.It Fn mdoc_node |
.It Fn mdoc_node |
Returns the first node of the parse. Note that if |
Returns the first node of the parse. Note that if |
.Fn mdoc_parseln |
.Fn mdoc_parseln |
|
|
.Fn mdoc_endparse |
.Fn mdoc_endparse |
return 0, the data will be incomplete. |
return 0, the data will be incomplete. |
.El |
.El |
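Because
.Fn mdoc_parseln
expects a nil-terminated line without its trailing newline, a caller
reading input with
.Xr fgets 3
would need to strip that newline first. A minimal sketch, assuming a
hypothetical helper named strip_newline that is not part of the
library; the call to
.Fn mdoc_parseln
itself is left out because its full prototype is not shown here:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the mdoc library: remove a single
 * trailing newline from buf in place and return the resulting length.
 */
size_t
strip_newline(char *buf)
{
	size_t len = strlen(buf);

	if (len > 0 && buf[len - 1] == '\n')
		buf[--len] = '\0';
	return len;
}
```

The buffer is edited in place, which suits a parser that also modifies
its input buffer; a caller needing the original text would have to copy
it beforehand.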
.Pp |
.\" SUBSECTION |
.\" Variable descriptions. |
.Ss Variables |
The following variables are also defined: |
The following variables are also defined: |
.Bl -ohang -offset indent |
.Bl -ohang |
|
.\" LIST-ITEM |
.It Va mdoc_macronames |
.It Va mdoc_macronames |
An array of string-ified token names. |
An array of string-ified token names. |
|
.\" LIST-ITEM |
.It Va mdoc_argnames |
.It Va mdoc_argnames |
An array of string-ified token argument names. |
An array of string-ified token argument names. |
.El |
.El |
.Pp |
.\" SUBSECTION |
|
.Ss Abstract Syntax Tree |
|
The
.Nm
functions produce an abstract syntax tree (AST) describing the input
lines in a regular form.  It may be reviewed at any time with
.Fn mdoc_node ;
|
however, if called before |
|
.Fn mdoc_endparse , |
|
or after |
|
.Fn mdoc_endparse |
|
or |
|
.Fn mdoc_parseln |
|
fail, it may be incomplete. |
|
.\" PARAGRAPH |
|
.Pp |
|
The AST is composed of |
|
.Vt struct mdoc_node |
|
nodes with block, head, body, element, root and text types as declared |
|
by the |
|
.Va type |
|
field. Each node also provides its parse point (the |
|
.Va line , |
|
.Va sec , |
|
and |
|
.Va pos |
|
fields), its position in the tree (the |
|
.Va parent , |
|
.Va child , |
|
.Va next |
|
and |
|
.Va prev |
|
fields) and type-specific data (the |
|
.Va data |
|
field). |
|
.\" PARAGRAPH |
|
.Pp |
|
The tree itself is arranged according to the following normal form, |
|
where capitalised non-terminals represent nodes. |
|
.Pp |
|
.Bl -tag -width "ELEMENTXX" -compact |
|
.\" LIST-ITEM |
|
.It ROOT |
|
\(<- mnode+ |
|
.It mnode |
|
\(<- BLOCK | ELEMENT | TEXT |
|
.It BLOCK |
|
\(<- (HEAD [TEXT])+ [BODY [TEXT]] [TAIL [TEXT]] |
|
.It BLOCK |
|
\(<- BODY [TEXT] [TAIL [TEXT]] |
|
.It ELEMENT |
|
\(<- TEXT* |
|
.It HEAD |
|
\(<- mnode+ |
|
.It BODY |
|
\(<- mnode+ |
|
.It TAIL |
|
\(<- mnode+ |
|
.It TEXT |
|
\(<- [[:alpha:]]* |
|
.El |
|
.\" PARAGRAPH |
|
.Pp |
|
Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of |
|
the BLOCK production. These refer to punctuation marks. Furthermore, |
|
although a TEXT node will generally have a non-zero-length string, in
|
certain cases, such as |
|
.Dq \&.Bd \-literal , |
|
an empty line will produce a zero-length string. |
|
.\" PARAGRAPH |
|
.Pp |
|
The rule-of-thumb for mapping node types to macros follows: in-line |
|
elements, such as |
|
.Dq \&.Em foo , |
|
are classified as ELEMENT nodes, which can only contain text. |
|
Multi-line elements such as |
|
.Dq \&.Sh |
|
are BLOCK elements, where the HEAD constitutes line contents and the |
|
BODY constitutes subsequent lines. In-line elements with matching |
|
pairs, such as |
|
.Dq \&.So |
|
and |
|
.Dq \&.Sc , |
|
are BLOCK elements with no HEAD tag. The only exception to this is |
|
.Dq \&.Eo |
|
and |
|
.Dq \&.Ec , |
|
which have HEAD and TAIL nodes corresponding to the enclosure string.
|
TEXT nodes, obviously, constitute text; the ROOT node is the document's |
|
root. |
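
The tree layout described above lends itself to a depth-first walk over
the child and next links. A minimal sketch, assuming stand-in types
(enum xtype, struct xnode) and a hypothetical function xnode_count; the
real definitions are in
.Pa mdoc.h
and may differ, and treating child as the first child of a node is also
an assumption:

```c
#include <stddef.h>

/* Stand-in node types for illustration only; not the library's own. */
enum xtype { X_ROOT, X_BLOCK, X_HEAD, X_BODY, X_TAIL, X_ELEM, X_TEXT };

/* Stand-in node carrying only the tree-position fields named above. */
struct xnode {
	enum xtype	 type;
	struct xnode	*parent;
	struct xnode	*child;	/* assumed: first child */
	struct xnode	*next;	/* next sibling */
	struct xnode	*prev;	/* previous sibling */
};

/*
 * Count the nodes of the given type among n, its following siblings
 * and all of their descendants.
 */
size_t
xnode_count(const struct xnode *n, enum xtype type)
{
	size_t total = 0;

	for ( ; n != NULL; n = n->next) {
		if (n->type == type)
			total++;
		total += xnode_count(n->child, type);
	}
	return total;
}
```

Recursing into children while iterating over siblings keeps the
recursion depth bounded by the nesting depth of the tree rather than by
the number of nodes.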
|
.\" SECTION |
.Sh EXAMPLES |
.Sh EXAMPLES |
The following example reads lines from stdin and parses them, operating |
The following example reads lines from stdin and parses them, operating |
on the finished parse tree with |
on the finished parse tree with |
Line 135
if (NULL == (node = mdoc_node(mdoc)))

Line 297
if (NULL == (node = mdoc_node(mdoc)))
|
parsed(mdoc, node); |
parsed(mdoc, node); |
mdoc_free(mdoc); |
mdoc_free(mdoc); |
.Ed |
.Ed |
.\" |
.\" SECTION |
.Sh SEE ALSO |
.Sh SEE ALSO |
.Xr mdoc 7 , |
.Xr mdoc 7 , |
.Xr mdoc.samples 7 , |
.Xr mdoc.samples 7 , |
.Xr groff 1 , |
.Xr groff 1 , |
.Xr mdocml 1 |
.Xr mdocml 1 |
.\" |
.\" SECTION |
.\" |
|
.Sh AUTHORS |
.Sh AUTHORS |
The |
The |
.Nm |
.Nm |
library was written by
library was written by
.An Kristaps Dzonsons Aq kristaps@kth.se . |
.An Kristaps Dzonsons Aq kristaps@kth.se . |
.\" |
.\" SECTION |
.\" |
|
.Sh BUGS |
.Sh BUGS |
Bugs, un-implemented macros and incompatibilities are documented in this
Bugs, un-implemented macros and incompatibilities are documented in this
section. The baseline for determining whether macro parsing is |
section. The baseline for determining whether macro parsing is |
|
|
.Ox |
.Ox |
are supported; support for |
are supported; support for |
.Nx |
.Nx |
and other BSD systems is in progress. |
and other |
|
.Bx |
|
systems is in progress. |