Annotation of mandoc/mdoc.3, Revision 1.50
1.50 ! kristaps 1: .\" $Id: mdoc.3,v 1.49 2010/08/20 01:02:07 schwarze Exp $
1.6 kristaps 2: .\"
1.47 schwarze 3: .\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv>
4: .\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
1.6 kristaps 5: .\"
6: .\" Permission to use, copy, modify, and distribute this software for any
1.28 kristaps 7: .\" purpose with or without fee is hereby granted, provided that the above
8: .\" copyright notice and this permission notice appear in all copies.
1.6 kristaps 9: .\"
1.28 kristaps 10: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
11: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
12: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
13: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
14: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
15: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
16: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
1.33 kristaps 17: .\"
1.50 ! kristaps 18: .Dd $Mdocdate: August 20 2010 $
1.27 kristaps 19: .Dt MDOC 3
1.1 kristaps 20: .Os
21: .Sh NAME
1.39 kristaps 22: .Nm mdoc ,
1.1 kristaps 23: .Nm mdoc_alloc ,
24: .Nm mdoc_endparse ,
1.38 kristaps 25: .Nm mdoc_free ,
26: .Nm mdoc_meta ,
1.4 kristaps 27: .Nm mdoc_node ,
1.38 kristaps 28: .Nm mdoc_parseln ,
1.20 kristaps 29: .Nm mdoc_reset
1.2 kristaps 30: .Nd mdoc macro compiler library
1.1 kristaps 31: .Sh SYNOPSIS
1.38 kristaps 32: .In mandoc.h
1.35 kristaps 33: .In mdoc.h
1.4 kristaps 34: .Vt extern const char * const * mdoc_macronames;
35: .Vt extern const char * const * mdoc_argnames;
1.1 kristaps 36: .Ft "struct mdoc *"
1.43 kristaps 37: .Fo mdoc_alloc
1.44 kristaps 38: .Fa "struct regset *regs"
1.43 kristaps 39: .Fa "void *data"
40: .Fa "mandocmsg msgs"
41: .Fc
1.26 kristaps 42: .Ft int
1.38 kristaps 43: .Fn mdoc_endparse "struct mdoc *mdoc"
1.1 kristaps 44: .Ft void
1.2 kristaps 45: .Fn mdoc_free "struct mdoc *mdoc"
1.38 kristaps 46: .Ft "const struct mdoc_meta *"
47: .Fn mdoc_meta "const struct mdoc *mdoc"
48: .Ft "const struct mdoc_node *"
49: .Fn mdoc_node "const struct mdoc *mdoc"
1.1 kristaps 50: .Ft int
1.42 kristaps 51: .Fo mdoc_parseln
52: .Fa "struct mdoc *mdoc"
53: .Fa "int line"
54: .Fa "char *buf"
55: .Fc
1.1 kristaps 56: .Ft int
1.38 kristaps 57: .Fn mdoc_reset "struct mdoc *mdoc"
1.1 kristaps 58: .Sh DESCRIPTION
59: The
60: .Nm mdoc
1.33 kristaps 61: library parses lines of
1.17 kristaps 62: .Xr mdoc 7
1.38 kristaps 63: input
64: into an abstract syntax tree (AST).
1.6 kristaps 65: .Pp
1.1 kristaps 66: In general, applications initiate a parsing sequence with
67: .Fn mdoc_alloc ,
1.33 kristaps 68: parse each line in a document with
1.1 kristaps 69: .Fn mdoc_parseln ,
70: close the parsing session with
71: .Fn mdoc_endparse ,
72: operate over the syntax tree returned by
1.33 kristaps 73: .Fn mdoc_node
1.4 kristaps 74: and
75: .Fn mdoc_meta ,
1.1 kristaps 76: then free all allocated memory with
77: .Fn mdoc_free .
1.20 kristaps 78: The
79: .Fn mdoc_reset
80: function may be used in order to reset the parser for another input
1.38 kristaps 81: sequence.
1.6 kristaps 82: .Ss Types
1.37 kristaps 83: .Bl -ohang
1.6 kristaps 84: .It Vt struct mdoc
1.50 ! kristaps 85: An opaque type.
1.6 kristaps 86: Its values are only used privately within the library.
87: .It Vt struct mdoc_node
1.38 kristaps 88: A parsed node.
1.33 kristaps 89: See
1.6 kristaps 90: .Sx Abstract Syntax Tree
91: for details.
92: .El
93: .Ss Functions
1.37 kristaps 94: .Bl -ohang
1.2 kristaps 95: .It Fn mdoc_alloc
1.38 kristaps 96: Allocates a parsing structure.
97: The
1.2 kristaps 98: .Fa data
1.40 kristaps 99: pointer is passed to
100: .Fa msgs .
1.38 kristaps 101: Returns NULL on failure.
102: If non-NULL, the pointer must be freed with
1.2 kristaps 103: .Fn mdoc_free .
1.20 kristaps 104: .It Fn mdoc_reset
1.38 kristaps 105: Reset the parser for another parse routine.
106: After its use,
1.20 kristaps 107: .Fn mdoc_parseln
1.38 kristaps 108: behaves as if invoked for the first time.
109: If it returns 0, memory could not be allocated.
1.2 kristaps 110: .It Fn mdoc_free
1.38 kristaps 111: Free all resources of a parser.
112: The pointer is no longer valid after invocation.
1.2 kristaps 113: .It Fn mdoc_parseln
1.38 kristaps 114: Parse a nil-terminated line of input.
115: This line should not contain the trailing newline.
116: Returns 0 on failure, 1 on success.
117: The input buffer
1.2 kristaps 118: .Fa buf
119: is modified by this function.
120: .It Fn mdoc_endparse
1.38 kristaps 121: Signals that the parse is complete.
122: Note that if
1.2 kristaps 123: .Fn mdoc_endparse
124: is called subsequent to
1.4 kristaps 125: .Fn mdoc_node ,
1.38 kristaps 126: the resulting tree is incomplete.
127: Returns 0 on failure, 1 on success.
1.4 kristaps 128: .It Fn mdoc_node
1.38 kristaps 129: Returns the first node of the parse.
130: Note that if
1.2 kristaps 131: .Fn mdoc_parseln
132: or
133: .Fn mdoc_endparse
134: return 0, the tree will be incomplete.
1.4 kristaps 135: .It Fn mdoc_meta
1.38 kristaps 136: Returns the document's parsed meta-data.
137: If this information has not yet been supplied or
1.4 kristaps 138: .Fn mdoc_parseln
139: or
140: .Fn mdoc_endparse
141: return 0, the data will be incomplete.
142: .El
1.6 kristaps 143: .Ss Variables
1.37 kristaps 144: .Bl -ohang
1.4 kristaps 145: .It Va mdoc_macronames
146: An array of string-ified token names.
147: .It Va mdoc_argnames
148: An array of string-ified token argument names.
1.2 kristaps 149: .El
1.6 kristaps 150: .Ss Abstract Syntax Tree
1.33 kristaps 151: The
1.6 kristaps 152: .Nm
1.17 kristaps 153: functions produce an abstract syntax tree (AST) describing input in a
1.38 kristaps 154: regular form.
155: It may be reviewed at any time with
1.6 kristaps 156: .Fn mdoc_node ;
157: however, if called before
158: .Fn mdoc_endparse ,
159: or after
1.33 kristaps 160: .Fn mdoc_endparse
1.6 kristaps 161: or
162: .Fn mdoc_parseln
1.33 kristaps 163: fail, it may be incomplete.
1.18 kristaps 164: .Pp
165: This AST is governed by the ontological
1.17 kristaps 166: rules dictated in
167: .Xr mdoc 7
1.33 kristaps 168: and derives its terminology accordingly.
1.17 kristaps 169: .Qq In-line
170: elements described in
171: .Xr mdoc 7
1.33 kristaps 172: are described simply as
1.17 kristaps 173: .Qq elements .
1.6 kristaps 174: .Pp
1.33 kristaps 175: The AST is composed of
1.6 kristaps 176: .Vt struct mdoc_node
177: nodes with block, head, body, element, root and text types as declared
178: by the
179: .Va type
1.38 kristaps 180: field.
181: Each node also provides its parse point (the
1.6 kristaps 182: .Va line ,
183: .Va sec ,
184: and
185: .Va pos
186: fields), its position in the tree (the
187: .Va parent ,
188: .Va child ,
1.45 schwarze 189: .Va nchild ,
1.33 kristaps 190: .Va next
1.6 kristaps 191: and
1.33 kristaps 192: .Va prev
1.45 schwarze 193: fields) and some type-specific data, in particular, for nodes generated
194: from macros, the generating macro in the
195: .Va tok
196: field.
1.6 kristaps 197: .Pp
198: The tree itself is arranged according to the following normal form,
199: where capitalised non-terminals represent nodes.
200: .Pp
1.37 kristaps 201: .Bl -tag -width "ELEMENTXX" -compact
1.6 kristaps 202: .It ROOT
203: \(<- mnode+
204: .It mnode
205: \(<- BLOCK | ELEMENT | TEXT
206: .It BLOCK
1.41 kristaps 207: \(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
1.6 kristaps 208: .It ELEMENT
209: \(<- TEXT*
210: .It HEAD
1.45 schwarze 211: \(<- mnode*
1.6 kristaps 212: .It BODY
1.45 schwarze 213: \(<- mnode* [ENDBODY mnode*]
1.6 kristaps 214: .It TAIL
1.45 schwarze 215: \(<- mnode*
1.6 kristaps 216: .It TEXT
1.38 kristaps 217: \(<- [[:printable:],0x1e]*
1.6 kristaps 218: .El
1.2 kristaps 219: .Pp
1.6 kristaps 220: Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
1.41 kristaps 221: the BLOCK production: these refer to punctuation marks.
1.38 kristaps 222: Furthermore, although a TEXT node will generally have a non-zero-length
223: string, in the specific case of
1.8 kristaps 224: .Sq \&.Bd \-literal ,
1.6 kristaps 225: an empty line will produce a zero-length string.
1.41 kristaps 226: Multiple body parts are only found in invocations of
227: .Sq \&Bl \-column ,
228: where a new body introduces a new phrase.
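The normal form above lends itself to a recursive, depth-first walk.
The following is only a minimal sketch: struct sketch_node and enum sketch_type are hypothetical stand-ins, not the declarations from mdoc.h, and the sketch assumes that child points at a node's first child and next at its following sibling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the node types named in the text. */
enum sketch_type {
    SKETCH_ROOT, SKETCH_BLOCK, SKETCH_HEAD, SKETCH_BODY,
    SKETCH_TAIL, SKETCH_ELEM, SKETCH_TEXT
};

/* Hypothetical, reduced node: only the type and tree-position fields. */
struct sketch_node {
    enum sketch_type type;
    struct sketch_node *parent;
    struct sketch_node *child;  /* assumed: first child */
    struct sketch_node *next;   /* assumed: next sibling */
};

/* Count the nodes of one type in the subtree rooted at n, depth-first. */
static size_t
sketch_count(const struct sketch_node *n, enum sketch_type type)
{
    const struct sketch_node *c;
    size_t total;

    if (n == NULL)
        return 0;
    total = (n->type == type) ? 1 : 0;
    for (c = n->child; c != NULL; c = c->next) {
        /* Every child is expected to point back at its parent. */
        assert(c->parent == n);
        total += sketch_count(c, type);
    }
    return total;
}
```

A real front-end would walk the tree returned by mdoc_node in the same fashion, switching on the type field of each node rather than merely counting.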
1.46 kristaps 229: .Ss Badly-nested Blocks
230: The ENDBODY node is available to end the formatting associated
231: with a given block before the physical end of that block.
232: It has a non-null
1.45 schwarze 233: .Va end
234: field, is of the BODY
235: .Va type ,
236: has the same
237: .Va tok
238: as the BLOCK it is ending, and has a
239: .Va pending
240: field pointing to that BLOCK's BODY node.
241: It is an indirect child of that BODY node
242: and has no children of its own.
243: .Pp
244: An ENDBODY node is generated when a block ends while one of its child
245: blocks is still open, like in the following example:
246: .Bd -literal -offset indent
247: \&.Ao ao
248: \&.Bo bo ac
249: \&.Ac bc
250: \&.Bc end
251: .Ed
252: .Pp
253: This example results in the following block structure:
254: .Bd -literal -offset indent
255: BLOCK Ao
256:     HEAD Ao
257:     BODY Ao
258:         TEXT ao
259:         BLOCK Bo, pending -> Ao
260:             HEAD Bo
261:             BODY Bo
262:                 TEXT bo
263:                 TEXT ac
264:                 ENDBODY Ao, pending -> Ao
265:                 TEXT bc
266: TEXT end
267: .Ed
268: .Pp
1.46 kristaps 269: Here, the formatting of the
270: .Sq \&Ao
271: block extends from TEXT ao to TEXT ac,
272: while the formatting of the
273: .Sq \&Bo
274: block extends from TEXT bo to TEXT bc.
275: It renders as follows in
1.45 schwarze 276: .Fl T Ns Cm ascii
277: mode:
1.46 kristaps 278: .Pp
1.45 schwarze 279: .Dl <ao [bo ac> bc] end
1.46 kristaps 280: .Pp
281: Support for badly-nested blocks is only provided for backward
1.45 schwarze 282: compatibility with some older
283: .Xr mdoc 7
284: implementations.
1.46 kristaps 285: Using badly-nested blocks is
286: .Em strongly discouraged :
287: the
288: .Fl T Ns Cm html
289: and
290: .Fl T Ns Cm xhtml
291: front-ends are unable to render them in any meaningful way.
292: Furthermore, behaviour when encountering badly-nested blocks is not
293: consistent across troff implementations, especially when using multiple
294: levels of badly-nested blocks.
1.2 kristaps 295: .Sh EXAMPLES
296: The following example reads lines from stdin and parses them, operating
1.33 kristaps 297: on the finished parse tree with
1.2 kristaps 298: .Fn parsed .
1.37 kristaps 299: This example does not error-check nor free memory upon failure.
300: .Bd -literal -offset indent
1.44 kristaps 301: struct regset regs;
1.2 kristaps 302: struct mdoc *mdoc;
1.31 kristaps 303: const struct mdoc_node *node;
1.2 kristaps 304: char *buf;
305: size_t alloc_len;
306: ssize_t len;
307: int line;
1.44 kristaps 308: bzero(&regs, sizeof(struct regset));
1.2 kristaps 309: line = 1;
1.49 schwarze 310: mdoc = mdoc_alloc(&regs, NULL, NULL);
1.37 kristaps 311: buf = NULL;
312: alloc_len = 0;
1.2 kristaps 313:
1.37 kristaps 314: while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
315:     if (len && buf[len - 1] == '\en')
316:         buf[len - 1] = '\e0';
317:     if ( ! mdoc_parseln(mdoc, line, buf))
318:         errx(1, "mdoc_parseln");
319:     line++;
1.2 kristaps 320: }
321:
322: if ( ! mdoc_endparse(mdoc))
1.37 kristaps 323:     errx(1, "mdoc_endparse");
1.4 kristaps 324: if (NULL == (node = mdoc_node(mdoc)))
1.37 kristaps 325:     errx(1, "mdoc_node");
1.2 kristaps 326:
327: parsed(mdoc, node);
328: mdoc_free(mdoc);
329: .Ed
1.38 kristaps 330: .Pp
1.50 ! kristaps 331: To compile this, execute
! 332: .Pp
! 333: .D1 % cc main.c libmdoc.a libmandoc.a
! 334: .Pp
! 335: where
1.38 kristaps 336: .Pa main.c
1.50 ! kristaps 337: is the example file.
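Because mdoc_parseln expects a line without its trailing newline, the stripping step in the loop above may be factored into a small helper.
This is a sketch only: strip_newline is a hypothetical name and not part of the library.
It takes the buffer and the length reported by the line reader and returns the length after stripping.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the mdoc library.
 * Remove a single trailing newline from buf, if present,
 * and return the resulting length.
 */
static size_t
strip_newline(char *buf, size_t len)
{
    assert(buf != NULL);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        len--;
    }
    return len;
}
```

The final line of a file may lack a newline, in which case the buffer is left untouched and the length is returned unchanged.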
1.17 kristaps 338: .Sh SEE ALSO
1.20 kristaps 339: .Xr mandoc 1 ,
1.14 kristaps 340: .Xr mdoc 7
1.2 kristaps 341: .Sh AUTHORS
342: The
343: .Nm
1.38 kristaps 344: library was written by
1.37 kristaps 345: .An Kristaps Dzonsons Aq kristaps@bsd.lv .