Annotation of mandoc/mdoc.3, Revision 1.57
1.57 ! kristaps 1: .\" $Id: mdoc.3,v 1.56 2011/02/09 09:05:52 kristaps Exp $
1.6 kristaps 2: .\"
1.47 schwarze 3: .\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv>
4: .\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
1.6 kristaps 5: .\"
6: .\" Permission to use, copy, modify, and distribute this software for any
1.28 kristaps 7: .\" purpose with or without fee is hereby granted, provided that the above
8: .\" copyright notice and this permission notice appear in all copies.
1.6 kristaps 9: .\"
1.28 kristaps 10: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
11: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
12: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
13: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
14: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
15: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
16: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
1.33 kristaps 17: .\"
1.57 ! kristaps 18: .Dd $Mdocdate: February 9 2011 $
1.27 kristaps 19: .Dt MDOC 3
1.1 kristaps 20: .Os
21: .Sh NAME
1.39 kristaps 22: .Nm mdoc ,
1.57 ! kristaps 23: .Nm mdoc_addeqn ,
! 24: .Nm mdoc_addspan ,
1.1 kristaps 25: .Nm mdoc_alloc ,
26: .Nm mdoc_endparse ,
1.38 kristaps 27: .Nm mdoc_free ,
28: .Nm mdoc_meta ,
1.4 kristaps 29: .Nm mdoc_node ,
1.38 kristaps 30: .Nm mdoc_parseln ,
1.20 kristaps 31: .Nm mdoc_reset
1.2 kristaps 32: .Nd mdoc macro compiler library
1.1 kristaps 33: .Sh SYNOPSIS
1.38 kristaps 34: .In mandoc.h
1.35 kristaps 35: .In mdoc.h
1.4 kristaps 36: .Vt extern const char * const * mdoc_macronames;
37: .Vt extern const char * const * mdoc_argnames;
1.52 kristaps 38: .Ft int
1.56 kristaps 39: .Fo mdoc_addeqn
40: .Fa "struct mdoc *mdoc"
41: .Fa "const struct eqn *eqn"
42: .Fc
43: .Ft int
1.55 kristaps 44: .Fo mdoc_addspan
1.52 kristaps 45: .Fa "struct mdoc *mdoc"
46: .Fa "const struct tbl_span *span"
47: .Fc
1.1 kristaps 48: .Ft "struct mdoc *"
1.43 kristaps 49: .Fo mdoc_alloc
1.44 kristaps 50: .Fa "struct regset *regs"
1.43 kristaps 51: .Fa "void *data"
52: .Fa "mandocmsg msgs"
53: .Fc
1.26 kristaps 54: .Ft int
1.38 kristaps 55: .Fn mdoc_endparse "struct mdoc *mdoc"
1.1 kristaps 56: .Ft void
1.2 kristaps 57: .Fn mdoc_free "struct mdoc *mdoc"
1.38 kristaps 58: .Ft "const struct mdoc_meta *"
59: .Fn mdoc_meta "const struct mdoc *mdoc"
60: .Ft "const struct mdoc_node *"
61: .Fn mdoc_node "const struct mdoc *mdoc"
1.1 kristaps 62: .Ft int
1.42 kristaps 63: .Fo mdoc_parseln
64: .Fa "struct mdoc *mdoc"
65: .Fa "int line"
66: .Fa "char *buf"
67: .Fc
1.1 kristaps 68: .Ft int
1.38 kristaps 69: .Fn mdoc_reset "struct mdoc *mdoc"
1.1 kristaps 70: .Sh DESCRIPTION
71: The
72: .Nm mdoc
1.33 kristaps 73: library parses lines of
1.17 kristaps 74: .Xr mdoc 7
1.38 kristaps 75: input
76: into an abstract syntax tree (AST).
1.6 kristaps 77: .Pp
1.1 kristaps 78: In general, applications initiate a parsing sequence with
79: .Fn mdoc_alloc ,
1.33 kristaps 80: parse each line in a document with
1.1 kristaps 81: .Fn mdoc_parseln ,
82: close the parsing session with
83: .Fn mdoc_endparse ,
84: operate over the syntax tree returned by
1.33 kristaps 85: .Fn mdoc_node
1.4 kristaps 86: and
87: .Fn mdoc_meta ,
1.1 kristaps 88: then free all allocated memory with
89: .Fn mdoc_free .
1.20 kristaps 90: The
91: .Fn mdoc_reset
92: function may be used to reset the parser for another input
1.38 kristaps 93: sequence.
1.6 kristaps 94: .Ss Types
1.37 kristaps 95: .Bl -ohang
1.6 kristaps 96: .It Vt struct mdoc
1.50 kristaps 97: An opaque type.
1.6 kristaps 98: Its values are only used privately within the library.
99: .It Vt struct mdoc_node
1.38 kristaps 100: A parsed node.
1.33 kristaps 101: See
1.6 kristaps 102: .Sx Abstract Syntax Tree
103: for details.
104: .El
105: .Ss Functions
1.53 kristaps 106: If
1.56 kristaps 107: .Fn mdoc_addeqn ,
1.53 kristaps 108: .Fn mdoc_addspan ,
109: .Fn mdoc_parseln ,
110: or
111: .Fn mdoc_endparse
112: returns 0, calls to any function other than
113: .Fn mdoc_reset
114: or
115: .Fn mdoc_free
116: will raise an assertion.
1.37 kristaps 117: .Bl -ohang
1.56 kristaps 118: .It Fn mdoc_addeqn
119: Add an equation to the parsing stream.
120: Returns 0 on failure, 1 on success.
1.52 kristaps 121: .It Fn mdoc_addspan
122: Add a table span to the parsing stream.
123: Returns 0 on failure, 1 on success.
1.2 kristaps 124: .It Fn mdoc_alloc
1.38 kristaps 125: Allocates a parsing structure.
126: The
1.2 kristaps 127: .Fa data
1.40 kristaps 128: pointer is passed to
129: .Fa msgs .
1.53 kristaps 130: Always returns a valid pointer.
131: The pointer must be freed with
1.2 kristaps 132: .Fn mdoc_free .
1.20 kristaps 133: .It Fn mdoc_reset
1.38 kristaps 134: Reset the parser for another parse routine.
135: After its use,
1.20 kristaps 136: .Fn mdoc_parseln
1.38 kristaps 137: behaves as if invoked for the first time.
138: If it returns 0, memory could not be allocated.
1.2 kristaps 139: .It Fn mdoc_free
1.38 kristaps 140: Free all resources of a parser.
141: The pointer is no longer valid after invocation.
1.2 kristaps 142: .It Fn mdoc_parseln
1.38 kristaps 143: Parse a NUL-terminated line of input.
144: This line should not contain the trailing newline.
145: Returns 0 on failure, 1 on success.
146: The input buffer
1.2 kristaps 147: .Fa buf
148: is modified by this function.
149: .It Fn mdoc_endparse
1.38 kristaps 150: Signals that the parse is complete.
151: Returns 0 on failure, 1 on success.
1.4 kristaps 152: .It Fn mdoc_node
1.38 kristaps 153: Returns the first node of the parse.
1.4 kristaps 154: .It Fn mdoc_meta
1.38 kristaps 155: Returns the document's parsed meta-data.
1.4 kristaps 156: .El
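The input contract for mdoc_parseln described above (a terminated, writable buffer without its trailing newline) can be met with a small preparation step. The following is a minimal sketch only: chomp_line is a hypothetical helper name, not part of the library, and it assumes the caller owns a modifiable buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libmdoc): strip a single trailing
 * newline from a NUL-terminated buffer in place and return the new
 * length, so the buffer can be handed to a line-based parser.
 */
static size_t
chomp_line(char *buf)
{
	size_t len = strlen(buf);

	/* Only the final character is examined; embedded newlines stay. */
	if (len > 0 && buf[len - 1] == '\n')
		buf[--len] = '\0';
	return len;
}
```

A caller would run this on each buffer returned by its line reader before passing the buffer on; the buffer must not be a string literal, since it is modified in place.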
1.6 kristaps 157: .Ss Variables
1.37 kristaps 158: .Bl -ohang
1.4 kristaps 159: .It Va mdoc_macronames
160: An array of string-ified token names.
161: .It Va mdoc_argnames
162: An array of string-ified token argument names.
1.2 kristaps 163: .El
1.6 kristaps 164: .Ss Abstract Syntax Tree
1.33 kristaps 165: The
1.6 kristaps 166: .Nm
1.17 kristaps 167: functions produce an abstract syntax tree (AST) describing input in a
1.38 kristaps 168: regular form.
169: It may be reviewed at any time with
1.6 kristaps 170: .Fn mdoc_node ;
171: however, if called before
172: .Fn mdoc_endparse ,
173: or after
1.33 kristaps 174: .Fn mdoc_endparse
1.6 kristaps 175: or
176: .Fn mdoc_parseln
1.33 kristaps 177: fail, it may be incomplete.
1.18 kristaps 178: .Pp
179: This AST is governed by the ontological
1.17 kristaps 180: rules dictated in
181: .Xr mdoc 7
1.33 kristaps 182: and derives its terminology accordingly.
1.17 kristaps 183: .Qq In-line
184: elements described in
185: .Xr mdoc 7
1.33 kristaps 186: are described simply as
1.17 kristaps 187: .Qq elements .
1.6 kristaps 188: .Pp
1.33 kristaps 189: The AST is composed of
1.6 kristaps 190: .Vt struct mdoc_node
191: nodes with block, head, body, tail, element, root and text types as declared
192: by the
193: .Va type
1.38 kristaps 194: field.
195: Each node also provides its parse point (the
1.6 kristaps 196: .Va line ,
197: .Va sec ,
198: and
199: .Va pos
200: fields), its position in the tree (the
201: .Va parent ,
202: .Va child ,
1.45 schwarze 203: .Va nchild ,
1.33 kristaps 204: .Va next
1.6 kristaps 205: and
1.33 kristaps 206: .Va prev
1.45 schwarze 207: fields) and some type-specific data, in particular, for nodes generated
208: from macros, the generating macro in the
209: .Va tok
210: field.
1.6 kristaps 211: .Pp
212: The tree itself is arranged according to the following normal form,
213: where capitalised non-terminals represent nodes.
214: .Pp
1.37 kristaps 215: .Bl -tag -width "ELEMENTXX" -compact
1.6 kristaps 216: .It ROOT
217: \(<- mnode+
218: .It mnode
219: \(<- BLOCK | ELEMENT | TEXT
220: .It BLOCK
1.41 kristaps 221: \(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
1.6 kristaps 222: .It ELEMENT
223: \(<- TEXT*
224: .It HEAD
1.45 schwarze 225: \(<- mnode*
1.6 kristaps 226: .It BODY
1.45 schwarze 227: \(<- mnode* [ENDBODY mnode*]
1.6 kristaps 228: .It TAIL
1.45 schwarze 229: \(<- mnode*
1.6 kristaps 230: .It TEXT
1.38 kristaps 231: \(<- [[:printable:],0x1e]*
1.6 kristaps 232: .El
1.2 kristaps 233: .Pp
1.6 kristaps 234: Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
1.41 kristaps 235: the BLOCK production: these refer to punctuation marks.
1.38 kristaps 236: Furthermore, although a TEXT node will generally have a non-zero-length
237: string, in the specific case of
1.8 kristaps 238: .Sq \&.Bd \-literal ,
1.6 kristaps 239: an empty line will produce a zero-length string.
1.41 kristaps 240: Multiple body parts are only found in invocations of
241: .Sq \&.Bl \-column ,
242: where a new body introduces a new phrase.
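The child and next linkage described above lends itself to a simple recursive pre-order walk. The sketch below is illustrative only: struct sketch_node and enum sketch_type are stand-ins invented for this example, mirroring just the linkage fields named in the text; the real declarations live in mdoc.h and may differ.

```c
#include <stddef.h>

/* Stand-in node types; the real enumeration in mdoc.h may differ. */
enum sketch_type {
	SK_ROOT, SK_BLOCK, SK_HEAD, SK_BODY, SK_TAIL, SK_ELEM, SK_TEXT
};

/* Stand-in for struct mdoc_node: only the linkage fields named above. */
struct sketch_node {
	enum sketch_type	 type;
	struct sketch_node	*parent;
	struct sketch_node	*child;	/* first child, or NULL */
	struct sketch_node	*next;	/* next sibling, or NULL */
};

/*
 * Count nodes of the given type among n, its following siblings,
 * and all of their descendants, visiting them in pre-order.
 */
static int
count_type(const struct sketch_node *n, enum sketch_type type)
{
	int total = 0;

	for ( ; n != NULL; n = n->next) {
		if (n->type == type)
			total++;
		total += count_type(n->child, type);
	}
	return total;
}
```

Called on the root node, which has no siblings, this covers the whole tree. Recursing on children and iterating over siblings keeps the recursion depth bounded by the nesting depth of the document rather than by its length.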
1.46 kristaps 243: .Ss Badly-nested Blocks
244: The ENDBODY node is available to end the formatting associated
245: with a given block before the physical end of that block.
246: It has a non-null
1.45 schwarze 247: .Va end
248: field, is of the BODY
249: .Va type ,
250: has the same
251: .Va tok
252: as the BLOCK it is ending, and has a
253: .Va pending
254: field pointing to that BLOCK's BODY node.
255: It is an indirect child of that BODY node
256: and has no children of its own.
257: .Pp
258: An ENDBODY node is generated when a block ends while one of its child
259: blocks is still open, as in the following example:
260: .Bd -literal -offset indent
261: \&.Ao ao
262: \&.Bo bo ac
263: \&.Ac bc
264: \&.Bc end
265: .Ed
266: .Pp
267: This example results in the following block structure:
268: .Bd -literal -offset indent
269: BLOCK Ao
270:     HEAD Ao
271:     BODY Ao
272:         TEXT ao
273:         BLOCK Bo, pending -> Ao
274:             HEAD Bo
275:             BODY Bo
276:                 TEXT bo
277:                 TEXT ac
278:                 ENDBODY Ao, pending -> Ao
279:                 TEXT bc
280: TEXT end
281: .Ed
282: .Pp
1.46 kristaps 283: Here, the formatting of the
284: .Sq \&Ao
285: block extends from TEXT ao to TEXT ac,
286: while the formatting of the
287: .Sq \&Bo
288: block extends from TEXT bo to TEXT bc.
289: It renders as follows in
1.45 schwarze 290: .Fl T Ns Cm ascii
291: mode:
1.46 kristaps 292: .Pp
1.45 schwarze 293: .Dl <ao [bo ac> bc] end
1.46 kristaps 294: .Pp
295: Support for badly-nested blocks is only provided for backward
1.45 schwarze 296: compatibility with some older
297: .Xr mdoc 7
298: implementations.
1.46 kristaps 299: Using badly-nested blocks is
300: .Em strongly discouraged :
301: the
302: .Fl T Ns Cm html
303: and
304: .Fl T Ns Cm xhtml
305: front-ends are unable to render them in any meaningful way.
306: Furthermore, behaviour when encountering badly-nested blocks is not
307: consistent across troff implementations, especially when using multiple
308: levels of badly-nested blocks.
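A front-end that wants to detect badly-nested input could look for ENDBODY nodes using the criteria given above (BODY type with a set end field). The following is a sketch under stated assumptions: struct bn_node is a stand-in invented for this example, the integer type of its end member is a guess, and find_endbody is a hypothetical name rather than a library function.

```c
#include <stddef.h>

enum bn_type { BN_BLOCK, BN_HEAD, BN_BODY, BN_TEXT };

/*
 * Stand-in for struct mdoc_node.  Field names follow the text above;
 * the layout and the type of `end` are assumptions.
 */
struct bn_node {
	enum bn_type	 type;
	int		 end;		/* non-zero only on ENDBODY nodes */
	struct bn_node	*pending;	/* BODY whose formatting ends here */
	struct bn_node	*child;		/* first child, or NULL */
	struct bn_node	*next;		/* next sibling, or NULL */
};

/*
 * Return the first ENDBODY node in pre-order among n, its following
 * siblings and their descendants, or NULL if there is none.
 */
static const struct bn_node *
find_endbody(const struct bn_node *n)
{
	const struct bn_node *hit;

	for ( ; n != NULL; n = n->next) {
		if (n->type == BN_BODY && n->end != 0)
			return n;
		if ((hit = find_endbody(n->child)) != NULL)
			return hit;
	}
	return NULL;
}
```

For the Ao/Bo example above, such a search started at the Ao block would find the ENDBODY inside the body of Bo, and its pending pointer would lead back to the body of Ao.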
1.2 kristaps 309: .Sh EXAMPLES
310: The following example reads lines from stdin and parses them, operating
1.33 kristaps 311: on the finished parse tree with
1.2 kristaps 312: .Fn parsed .
1.37 kristaps 313: This example does not error-check nor free memory upon failure.
314: .Bd -literal -offset indent
1.44 kristaps 315: struct regset regs;
1.2 kristaps 316: struct mdoc *mdoc;
1.31 kristaps 317: const struct mdoc_node *node;
1.2 kristaps 318: char *buf;
319: size_t alloc_len;
320: ssize_t len;
321: int line = 1;
322:
1.44 kristaps 323: bzero(&regs, sizeof(struct regset));
1.49 schwarze 324: mdoc = mdoc_alloc(&regs, NULL, NULL);
1.37 kristaps 325: buf = NULL;
326: alloc_len = 0;
1.2 kristaps 327:
1.37 kristaps 328: while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
329:     if (len && buf[len - 1] == '\en')
330:         buf[len - 1] = '\e0';
331:     if ( ! mdoc_parseln(mdoc, line, buf))
332:         errx(1, "mdoc_parseln");
333:     line++;
1.2 kristaps 334: }
335:
336: if ( ! mdoc_endparse(mdoc))
1.37 kristaps 337:     errx(1, "mdoc_endparse");
1.4 kristaps 338: if (NULL == (node = mdoc_node(mdoc)))
1.37 kristaps 339:     errx(1, "mdoc_node");
1.2 kristaps 340:
341: parsed(mdoc, node);
342: mdoc_free(mdoc);
343: .Ed
1.38 kristaps 344: .Pp
1.50 kristaps 345: To compile this, execute
346: .Pp
1.51 kristaps 347: .Dl % cc main.c libmdoc.a libmandoc.a
1.50 kristaps 348: .Pp
349: where
1.38 kristaps 350: .Pa main.c
1.50 kristaps 351: is the example file.
1.17 kristaps 352: .Sh SEE ALSO
1.20 kristaps 353: .Xr mandoc 1 ,
1.14 kristaps 354: .Xr mdoc 7
1.2 kristaps 355: .Sh AUTHORS
356: The
357: .Nm
1.38 kristaps 358: library was written by
1.37 kristaps 359: .An Kristaps Dzonsons Aq kristaps@bsd.lv .