version 1.14, 2009/03/12 15:55:11 |
version 1.27, 2009/04/12 19:19:57 |
|
|
.\" $Id$ |
.\" $Id$ |
.\" |
.\" |
.\" Copyright (c) 2009 Kristaps Dzonsons <kristaps@kth.se> |
.\" Copyright (c) 2009 Kristaps Dzonsons <kristaps@openbsd.org> |
.\" |
.\" |
.\" Permission to use, copy, modify, and distribute this software for any |
.\" Permission to use, copy, modify, and distribute this software for any |
.\" purpose with or without fee is hereby granted, provided that the |
.\" purpose with or without fee is hereby granted, provided that the |
|
|
.\" PERFORMANCE OF THIS SOFTWARE. |
.\" PERFORMANCE OF THIS SOFTWARE. |
.\" |
.\" |
.Dd $Mdocdate$ |
.Dd $Mdocdate$ |
.Dt mdoc 3 |
.Dt MDOC 3 |
.Os |
.Os |
.\" SECTION |
.\" SECTION |
.Sh NAME |
.Sh NAME |
|
|
.Nm mdoc_endparse , |
.Nm mdoc_endparse , |
.Nm mdoc_node , |
.Nm mdoc_node , |
.Nm mdoc_meta , |
.Nm mdoc_meta , |
.Nm mdoc_free |
.Nm mdoc_free , |
|
.Nm mdoc_reset |
.Nd mdoc macro compiler library |
.Nd mdoc macro compiler library |
.\" SECTION |
.\" SECTION |
.Sh SYNOPSIS |
.Sh SYNOPSIS |
|
|
.Vt extern const char * const * mdoc_macronames; |
.Vt extern const char * const * mdoc_macronames; |
.Vt extern const char * const * mdoc_argnames; |
.Vt extern const char * const * mdoc_argnames; |
.Ft "struct mdoc *" |
.Ft "struct mdoc *" |
.Fn mdoc_alloc "void *data" "const struct mdoc_cb *cb" |
.Fn mdoc_alloc "void *data" "int pflags" "const struct mdoc_cb *cb" |
|
.Ft int |
|
.Fn mdoc_reset "struct mdoc *mdoc" |
.Ft void |
.Ft void |
.Fn mdoc_free "struct mdoc *mdoc" |
.Fn mdoc_free "struct mdoc *mdoc" |
.Ft int |
.Ft int |
|
|
.Sh DESCRIPTION |
.Sh DESCRIPTION |
The |
The |
.Nm mdoc |
.Nm mdoc |
library parses lines of mdoc input into an abstract syntax tree. |
library parses lines of |
.Dq mdoc , |
|
which is used to format BSD manual pages, is a macro package of the |
|
.Dq roff |
|
language. The |
|
.Nm |
|
library implements only those macros documented in the |
|
.Xr mdoc 7 |
.Xr mdoc 7 |
and |
input (and |
.Xr mdoc.samples 7 |
.Em only |
manuals. Documents with |
mdoc) into an abstract syntax tree (AST). |
.Xr refer 1 , |
|
.Xr eqn 1 |
|
and other pre-processor sections aren't accommodated. |
|
.\" PARAGRAPH |
.\" PARAGRAPH |
.Pp |
.Pp |
.Nm |
|
is |
|
.Ud |
|
.\" PARAGRAPH |
|
.Pp |
|
In general, applications initiate a parsing sequence with |
In general, applications initiate a parsing sequence with |
.Fn mdoc_alloc , |
.Fn mdoc_alloc , |
parse each line in a document with |
parse each line in a document with |
|
|
.Fn mdoc_meta , |
.Fn mdoc_meta , |
then free all allocated memory with |
then free all allocated memory with |
.Fn mdoc_free . |
.Fn mdoc_free . |
See the |
The |
|
.Fn mdoc_reset |
|
function may be used in order to reset the parser for another input |
|
sequence. See the |
.Sx EXAMPLES |
.Sx EXAMPLES |
section for a full example. |
section for a full example. |
.\" PARAGRAPH |
.\" PARAGRAPH |
Line 92 This section further defines the |
|
Line 84 This section further defines the |
|
.Sx Functions |
.Sx Functions |
and |
and |
.Sx Variables |
.Sx Variables |
available to programmers. Following that, |
available to programmers. Following that, the |
.Sx Character Encoding |
.Sx Abstract Syntax Tree |
describes input format. Lastly, |
section documents the output tree. |
.Sx Abstract Syntax Tree , |
|
documents the output tree. |
|
.\" SUBSECTION |
.\" SUBSECTION |
.Ss Types |
.Ss Types |
Both functions (see |
Both functions (see |
Line 132 Allocates a parsing structure. The |
|
Line 122 Allocates a parsing structure. The |
|
.Fa data |
.Fa data |
pointer is passed to callbacks in |
pointer is passed to callbacks in |
.Fa cb , |
.Fa cb , |
which are documented further in the header file. Returns NULL on |
which are documented further in the header file. |
failure. If non-NULL, the pointer must be freed with |
The |
|
.Fa pflags |
|
arguments are defined in |
|
.Pa mdoc.h . |
|
Returns NULL on failure. If non-NULL, the pointer must be freed with |
.Fn mdoc_free . |
.Fn mdoc_free . |
.\" LIST-ITEM |
.\" LIST-ITEM |
|
.It Fn mdoc_reset |
|
Reset the parser for another parse routine. After its use, |
|
.Fn mdoc_parseln |
|
behaves as if invoked for the first time. If it returns 0, memory could |
|
not be allocated. |
|
.\" LIST-ITEM |
.It Fn mdoc_free |
.It Fn mdoc_free |
Free all resources of a parser. The pointer is no longer valid after |
Free all resources of a parser. The pointer is no longer valid after |
invocation. |
invocation. |
Line 179 An array of string-ified token names. |
|
Line 179 An array of string-ified token names. |
|
An array of string-ified token argument names. |
An array of string-ified token argument names. |
.El |
.El |
.\" SUBSECTION |
.\" SUBSECTION |
.Ss Character Encoding |
|
The |
|
.Xr mdoc 3 |
|
library accepts only printable ASCII characters as defined by |
|
.Xr isprint 3 . |
|
Non-ASCII character sequences are delimited in various ways. All are |
|
preceded by an escape character |
|
.Sq \\ |
|
and followed by either an open-parenthesis |
|
.Sq \&( |
|
for two-character sequences; an open-bracket |
|
.Sq \&[ |
|
for n-character sequences (terminated at a close-bracket |
|
.Sq \&] ) ; |
|
an asterisk and open-parenthesis |
|
.Sq \&*( |
|
for two-character sequences; |
|
an asterisk and non-open-parenthesis |
|
.Sq \&* |
|
for single-character sequences; or one of a small set of standalone |
|
single characters for other escapes. |
|
.\" PARAGRAPH |
|
.Pp |
|
Examples: |
|
.Pp |
|
.Bl -tag -width "XXXXXXXX" -offset "XXXX" -compact |
|
.\" LIST-ITEM |
|
.It \\*(<= |
|
prints |
|
.Dq \*(<= |
|
.Pq less-equal |
|
.\" LIST-ITEM |
|
.It \\(<- |
|
prints |
|
.Dq \(<- |
|
.Pq left-arrow |
|
.\" LIST-ITEM |
|
.It \\[<-] |
|
also prints |
|
.Dq \(<- |
|
.Pq left-arrow |
|
.\" LIST-ITEM |
|
.It \\*(Ba |
|
prints |
|
.Dq \*(Ba |
|
.Pq bar |
|
.\" LIST-ITEM |
|
.It \\*q |
|
prints |
|
.Dq \*q |
|
.Pq double-quote |
|
.El |
|
.\" PARAGRAPH |
|
.Pp |
|
All escaped sequences are syntax-checked, but it's up to the front-end |
|
system to correctly render them to the output device. |
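
The delimiting rules above can be sketched as a small length classifier. This is only an illustration under stated assumptions: `escape_len` is a hypothetical helper and is not declared in `mdoc.h`, and because the text does not enumerate the standalone single-character escapes, the sketch assumes any other character after the backslash forms a two-byte escape.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the total length in bytes of the escape sequence starting at
 * s, including the leading backslash, or 0 if s does not begin a
 * well-formed sequence.  Hypothetical helper, not part of mdoc.h.
 */
size_t
escape_len(const char *s)
{
	const char *end;

	assert(s != NULL);

	if (s[0] != '\\' || s[1] == '\0')
		return 0;

	switch (s[1]) {
	case '(':	/* \(xx: two-character sequence */
		return (s[2] != '\0' && s[3] != '\0') ? 4 : 0;
	case '[':	/* \[name]: n-character sequence, closed by ']' */
		end = strchr(s + 2, ']');
		if (end == NULL || end == s + 2)
			return 0;
		return (size_t)(end - s) + 1;
	case '*':
		if (s[2] == '(')	/* \*(xx: two-character sequence */
			return (s[3] != '\0' && s[4] != '\0') ? 5 : 0;
		/* \*x: single-character sequence */
		return (s[2] != '\0') ? 3 : 0;
	default:
		/* Assumption: any other character is a standalone escape. */
		return 2;
	}
}
```

A caller would invoke this at each backslash in an input line and treat a return value of 0 as a syntax error; rendering the sequence remains the front-end's job, as noted above.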
|
.\" SUBSECTION |
|
.Ss Abstract Syntax Tree |
.Ss Abstract Syntax Tree |
The |
The |
.Nm |
.Nm |
functions produce an abstract syntax tree (AST) describing the input |
functions produce an abstract syntax tree (AST) describing input in a |
lines in a regular form. It may be reviewed at any time with |
regular form. It may be reviewed at any time with |
.Fn mdoc_nodes ; |
.Fn mdoc_nodes ; |
however, if called before |
however, if called before |
.Fn mdoc_endparse , |
.Fn mdoc_endparse , |
|
|
.Fn mdoc_endparse |
.Fn mdoc_endparse |
or |
or |
.Fn mdoc_parseln |
.Fn mdoc_parseln |
fail, it may be incomplete. |
fail, it may be incomplete. |
.\" PARAGRAPH |
.\" PARAGRAPH |
.Pp |
.Pp |
|
This AST is governed by the ontological |
|
rules dictated in |
|
.Xr mdoc 7 |
|
and derives its terminology accordingly. |
|
.Qq In-line |
|
elements described in |
|
.Xr mdoc 7 |
|
are described simply as |
|
.Qq elements . |
|
.\" PARAGRAPH |
|
.Pp |
The AST is composed of |
The AST is composed of |
.Vt struct mdoc_node |
.Vt struct mdoc_node |
nodes with block, head, body, element, root and text types as declared |
nodes with block, head, body, element, root and text types as declared |
Line 267 fields), its position in the tree (the |
|
Line 221 fields), its position in the tree (the |
|
.Va next |
.Va next |
and |
and |
.Va prev |
.Va prev |
fields) and type-specific data (the |
fields) and some type-specific data. |
.Va data |
|
field). |
|
.\" PARAGRAPH |
.\" PARAGRAPH |
.Pp |
.Pp |
The tree itself is arranged according to the following normal form, |
The tree itself is arranged according to the following normal form, |
Line 304 although a TEXT node will generally have a non-zero-le |
|
Line 256 although a TEXT node will generally have a non-zero-le |
|
the specific case of |
the specific case of |
.Sq \&.Bd \-literal , |
.Sq \&.Bd \-literal , |
an empty line will produce a zero-length string. |
an empty line will produce a zero-length string. |
.\" PARAGRAPH |
|
.Pp |
|
The rule-of-thumb for mapping node types to macros follows. In-line |
|
elements, such as |
|
.Sq \&.Em foo , |
|
are classified as ELEMENT nodes, which can only contain text. |
|
Multi-line elements, such as |
|
.Sq \&.Sh , |
|
are BLOCK elements, where the HEAD constitutes line contents and the |
|
BODY constitutes subsequent lines. In-line elements with matching |
|
pairs, such as |
|
.Sq \&.So |
|
and |
|
.Sq \&.Sc , |
|
are BLOCK elements with no HEAD tag. The only exception to this is |
|
.Sq \&.Eo |
|
and |
|
.Sq \&.Ec , |
|
which have a HEAD and TAIL node corresponding to the enclosure string. |
|
TEXT nodes, obviously, constitute text, and the ROOT node is the |
|
document's root. |
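
To make the node-type mapping concrete, here is a minimal sketch of walking such a tree. The `struct node` layout, the `NODE_*` names and `count_type` are all hypothetical stand-ins chosen for illustration; the real `struct mdoc_node` and its type constants are declared in `mdoc.h` and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node types mirroring the ones named in the text. */
enum node_type {
	NODE_ROOT,
	NODE_BLOCK,
	NODE_HEAD,
	NODE_BODY,
	NODE_TAIL,
	NODE_ELEM,
	NODE_TEXT
};

/* Hypothetical, simplified node: first child plus next sibling. */
struct node {
	enum node_type	 type;
	struct node	*child;
	struct node	*next;
};

/*
 * Count the nodes of the given type in the sibling list starting at n,
 * descending into each node's children.
 */
size_t
count_type(const struct node *n, enum node_type type)
{
	size_t total = 0;

	assert((int)type >= 0 && (int)type <= (int)NODE_TEXT);

	for (; n != NULL; n = n->next) {
		if (n->type == type)
			total++;
		total += count_type(n->child, type);
	}
	return total;
}
```

Under this sketch, a section header followed by an emphasised word would appear as ROOT, then BLOCK, whose children are a HEAD holding one TEXT node and a BODY holding an ELEMENT with its own TEXT child.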
|
.\" SECTION |
.\" SECTION |
.Sh EXAMPLES |
.Sh EXAMPLES |
The following example reads lines from stdin and parses them, operating |
The following example reads lines from stdin and parses them, operating |
|
|
int line; |
int line; |
|
|
line = 1; |
line = 1; |
mdoc = mdoc_alloc(NULL, NULL); |
mdoc = mdoc_alloc(NULL, 0, NULL); |
|
|
while ((buf = fgetln(fp, &len))) { |
while ((buf = fgetln(fp, &len))) { |
buf[len - 1] = '\\0'; |
buf[len - 1] = '\\0'; |
Line 360 parsed(mdoc, node); |
|
Line 291 parsed(mdoc, node); |
|
mdoc_free(mdoc); |
mdoc_free(mdoc); |
.Ed |
.Ed |
.\" SECTION |
.\" SECTION |
.Sh COMPATIBILITY |
.Sh SEE ALSO |
In general, only those macros specified by |
.Xr mandoc 1 , |
.Xr mdoc.samples 7 |
|
and |
|
.Xr mdoc 7 |
.Xr mdoc 7 |
for |
|
.Ox |
|
and |
|
.Nx |
|
are supported; support for other |
|
.Bx |
|
systems is in progress. |
|
.Bl -bullet |
|
.\" LIST-ITEM |
|
.It |
|
NetBSD |
|
.Sq \&It \-nested |
|
is assumed for all lists: any list may be nested and |
|
.Sq \-enum |
|
lists will restart the sequence only for the sub-list. |
|
.\" LIST-ITEM |
|
.It |
|
Newer NetBSD-style |
|
.Sq \&It \-column |
|
syntax, where column widths may be preceded by other arguments (instead |
|
of followed), is not supported. |
|
.\" LIST-ITEM |
|
.It |
|
The |
|
.Sq \&At |
|
macro only accepts a single parameter. |
|
.El |
|
.\" SECTION |
.\" SECTION |
.Sh SEE ALSO |
|
.Xr mdoc 7 , |
|
.Xr mdoc.samples 7 , |
|
.Xr groff 1 , |
|
.Xr mdocml 1 |
|
.\" SECTION |
|
.Sh AUTHORS |
.Sh AUTHORS |
The |
The |
.Nm |
.Nm |
library was written by |
library was written by |
.An Kristaps Dzonsons Aq kristaps@kth.se . |
.An Kristaps Dzonsons Aq kristaps@openbsd.org . |
.\" SECTION |
.\" SECTION |
.Sh CAVEATS |
.Sh CAVEATS |
.Bl -bullet |
.Bl -dash -compact |
.\" LIST-ITEM |
.\" LIST-ITEM |
.It |
.It |
The |
The |
.Sq \&Xc |
.Sq \&.Xc |
and |
and |
.Sq \&Xo |
.Sq \&.Xo |
macros aren't handled when used to span lines for the |
macros aren't handled when used to span lines for the |
.Sq \&It |
.Sq \&.It |
macro. Such usage is specifically discouraged in |
macro. |
.Xr mdoc.samples 7 . |
.\" LIST-ITEM |
|
.It |
|
The |
|
.Sq \&.Bsx |
|
macro family doesn't yet understand version arguments. |
|
.\" LIST-ITEM |
|
.It |
|
If not given a value, the \-offset argument to |
|
.Sq \&.Bd |
|
and |
|
.Sq \&.Bl |
|
should be the width of |
|
.Qq <string> ; |
|
instead, a value of |
|
.Li 10n |
|
is provided. |
|
.\" LIST-ITEM |
|
.It |
|
Column widths in |
|
.Sq \&.Bl \-column |
|
should default to width |
|
.Qq <stringx> |
|
if not included. |
|
.\" LIST-ITEM |
|
.It |
|
List-width suffix |
|
.Qq m |
|
isn't handled. |
|
.\" LIST-ITEM |
|
.It |
|
Contents of the SYNOPSIS section aren't checked. |
.El |
.El |