Annotation of mandoc/mandoc.3, Revision 1.5
1.5 ! kristaps 1: .\" $Id: mandoc.3,v 1.4 2011/04/19 16:30:00 kristaps Exp $
1.1 kristaps 2: .\"
3: .\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
4: .\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
5: .\"
6: .\" Permission to use, copy, modify, and distribute this software for any
7: .\" purpose with or without fee is hereby granted, provided that the above
8: .\" copyright notice and this permission notice appear in all copies.
9: .\"
10: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
11: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
12: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
13: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
14: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
15: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
16: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
17: .\"
1.5 ! kristaps 18: .Dd $Mdocdate: April 19 2011 $
1.1 kristaps 19: .Dt MANDOC 3
20: .Os
21: .Sh NAME
22: .Nm mandoc ,
1.3 kristaps 23: .Nm mandoc_escape ,
1.1 kristaps 24: .Nm man_meta ,
25: .Nm man_node ,
26: .Nm mdoc_meta ,
27: .Nm mdoc_node ,
28: .Nm mparse_alloc ,
29: .Nm mparse_free ,
30: .Nm mparse_readfd ,
31: .Nm mparse_reset ,
1.2 kristaps 32: .Nm mparse_result ,
33: .Nm mparse_strerror ,
34: .Nm mparse_strlevel
1.1 kristaps 35: .Nd mandoc macro compiler library
36: .Sh SYNOPSIS
37: .In man.h
38: .In mdoc.h
39: .In mandoc.h
1.3 kristaps 40: .Ft "enum mandoc_esc"
41: .Fo mandoc_escape
42: .Fa "const char **end"
43: .Fa "const char **start"
44: .Fa "int *sz"
45: .Fc
1.1 kristaps 46: .Ft "const struct man_meta *"
47: .Fo man_meta
48: .Fa "const struct man *man"
49: .Fc
50: .Ft "const struct man_node *"
51: .Fo man_node
52: .Fa "const struct man *man"
53: .Fc
54: .Ft "const struct mdoc_meta *"
55: .Fo mdoc_meta
56: .Fa "const struct mdoc *mdoc"
57: .Fc
58: .Ft "const struct mdoc_node *"
59: .Fo mdoc_node
60: .Fa "const struct mdoc *mdoc"
61: .Fc
62: .Ft "struct mparse *"
63: .Fo mparse_alloc
64: .Fa "enum mparset type"
65: .Fa "enum mandoclevel wlevel"
66: .Fa "mandocmsg msg"
67: .Fa "void *msgarg"
68: .Fc
69: .Ft void
70: .Fo mparse_free
71: .Fa "struct mparse *parse"
72: .Fc
73: .Ft "enum mandoclevel"
74: .Fo mparse_readfd
75: .Fa "struct mparse *parse"
76: .Fa "int fd"
77: .Fa "const char *fname"
78: .Fc
79: .Ft void
80: .Fo mparse_reset
81: .Fa "struct mparse *parse"
82: .Fc
83: .Ft void
84: .Fo mparse_result
85: .Fa "struct mparse *parse"
86: .Fa "struct mdoc **mdoc"
87: .Fa "struct man **man"
1.2 kristaps 88: .Fc
89: .Ft "const char *"
90: .Fo mparse_strerror
91: .Fa "enum mandocerr"
92: .Fc
93: .Ft "const char *"
94: .Fo mparse_strlevel
95: .Fa "enum mandoclevel"
1.1 kristaps 96: .Fc
97: .Vt extern const char * const * man_macronames;
98: .Vt extern const char * const * mdoc_argnames;
99: .Vt extern const char * const * mdoc_macronames;
1.4 kristaps 100: .Fd "#define ASCII_NBRSP"
101: .Fd "#define ASCII_HYPH"
1.1 kristaps 102: .Sh DESCRIPTION
103: The
104: .Nm mandoc
105: library parses a
106: .Ux
107: manual into an abstract syntax tree (AST).
108: .Ux
109: manuals are composed of
110: .Xr mdoc 7
111: or
112: .Xr man 7 ,
113: and may be mixed with
114: .Xr roff 7 ,
115: .Xr tbl 7 ,
116: and
117: .Xr eqn 7
118: invocations.
119: .Pp
120: The following describes a general parse sequence:
121: .Bl -enum
122: .It
123: initiate a parsing sequence with
124: .Fn mparse_alloc ;
125: .It
126: parse files or file descriptors with
127: .Fn mparse_readfd ;
128: .It
129: retrieve a parsed syntax tree, if the parse was successful, with
130: .Fn mparse_result ;
131: .It
132: iterate over parse nodes with
133: .Fn mdoc_node
134: or
135: .Fn man_node ;
136: .It
137: free all allocated memory with
138: .Fn mparse_free ,
139: or invoke
140: .Fn mparse_reset
141: and parse new files.
1.3 kristaps 142: .El
143: .Sh REFERENCE
144: This section documents the functions, types, and variables available
145: via
146: .In mandoc.h .
147: .Ss Types
148: .Bl -ohang
149: .It Vt "enum mandoc_esc"
150: .It Vt "enum mandocerr"
151: .It Vt "enum mandoclevel"
152: .It Vt "enum mparset"
153: .It Vt "struct mparse"
154: .It Vt "mandocmsg"
155: .El
156: .Ss Functions
157: .Bl -ohang
158: .It Fn mandoc_escape
1.4 kristaps 159: Scan an escape sequence, i.e., a character string beginning with
160: .Sq \e .
161: Pass a pointer to this string as
162: .Va end ;
163: it will be set to the first character past the parsed escape sequence
164: unless the function returns ESCAPE_ERROR, in which case the string is
165: malformed and should be discarded.
166: If the return value is neither ESCAPE_ERROR nor ESCAPE_IGNORE,
167: .Va start
168: is set to the first relevant character of the substring (a font name,
169: glyph name, and so on) of length
170: .Va sz .
171: Both
172: .Va start
173: and
174: .Va sz
175: may be NULL.
1.3 kristaps 176: .It Fn man_meta
1.4 kristaps 177: Obtain the meta-data of a successful parse.
178: This may only be used on a pointer returned by
179: .Fn mparse_result .
1.3 kristaps 180: .It Fn man_node
1.4 kristaps 181: Obtain the root node of a successful parse.
182: This may only be used on a pointer returned by
183: .Fn mparse_result .
1.3 kristaps 184: .It Fn mdoc_meta
1.4 kristaps 185: Obtain the meta-data of a successful parse.
186: This may only be used on a pointer returned by
187: .Fn mparse_result .
1.3 kristaps 188: .It Fn mdoc_node
1.4 kristaps 189: Obtain the root node of a successful parse.
190: This may only be used on a pointer returned by
191: .Fn mparse_result .
1.3 kristaps 192: .It Fn mparse_alloc
1.4 kristaps 193: Allocate a parser.
194: The same parser may be used for multiple files so long as
195: .Fn mparse_reset
196: is called between parses.
197: .Fn mparse_free
198: must be called to free the memory allocated by this function.
1.3 kristaps 199: .It Fn mparse_free
1.4 kristaps 200: Free all memory allocated by
201: .Fn mparse_alloc .
1.3 kristaps 202: .It Fn mparse_readfd
1.4 kristaps 203: Parse a file or file descriptor.
204: If
205: .Va fd
206: is -1,
207: .Va fname
208: is opened for reading.
209: Otherwise,
210: .Va fname
211: is assumed to be the name associated with
212: .Va fd .
213: This may be called multiple times with different parameters; however,
214: .Fn mparse_reset
215: should be invoked between parses.
1.3 kristaps 216: .It Fn mparse_reset
1.4 kristaps 217: Reset a parser so that
218: .Fn mparse_readfd
219: may be used again.
1.3 kristaps 220: .It Fn mparse_result
1.4 kristaps 221: Obtain the result of a parse.
222: Call this function only after a successful parse
223: .Po
224: i.e., one where
225: .Fn mparse_readfd
226: returned less than MANDOCLEVEL_FATAL
227: .Pc ,
228: in which case one of the two pointers will
229: be filled in.
1.3 kristaps 230: .It Fn mparse_strerror
1.4 kristaps 231: Return a statically-allocated string representation of an error code.
1.3 kristaps 232: .It Fn mparse_strlevel
1.4 kristaps 233: Return a statically-allocated string representation of a level code.
1.3 kristaps 234: .El
235: .Ss Variables
236: .Bl -ohang
237: .It Va man_macronames
1.4 kristaps 238: The string representation of a man macro as indexed by
239: .Vt "enum mant" .
1.3 kristaps 240: .It Va mdoc_argnames
1.4 kristaps 241: The string representation of a mdoc macro argument as indexed by
242: .Vt "enum mdocargt" .
1.3 kristaps 243: .It Va mdoc_macronames
1.4 kristaps 244: The string representation of a mdoc macro as indexed by
245: .Vt "enum mdoct" .
1.1 kristaps 246: .El
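.Pp
The enum-indexed lookup described above can be sketched as follows.
The enumeration, the table and the helper are hypothetical stand-ins
(toy_mant, toy_macronames, toy_name), not the library's own
definitions; only the indexing pattern is illustrated.

```c
#include <stddef.h>

/* Hypothetical macro identifiers; the library's enum has its own members. */
enum toy_mant { TOY_TH, TOY_SH, TOY_PP, TOY_MAX };

/* Backing storage: one name per enumerator, kept in step by designators. */
static const char *const toy_names_store[TOY_MAX] = {
	[TOY_TH] = "TH",
	[TOY_SH] = "SH",
	[TOY_PP] = "PP",
};

/* Same "const char * const *" shape as the exported name tables. */
static const char *const *toy_macronames = toy_names_store;

/* Map an identifier to its name, refusing to index past the table. */
static const char *
toy_name(enum toy_mant tok)
{
	if ((int)tok < 0 || (int)tok >= TOY_MAX)
		return "(unknown)";
	return toy_macronames[tok];
}
```

Under these assumptions, toy_name(TOY_SH) yields the string SH, and an
out-of-range value yields a placeholder instead of reading past the table.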
247: .Sh IMPLEMENTATION NOTES
248: This section consists of structural documentation for
249: .Xr mdoc 7
250: and
251: .Xr man 7
252: syntax trees.
253: .Ss Man Abstract Syntax Tree
254: This AST is governed by the ontological rules dictated in
255: .Xr man 7
256: and derives its terminology accordingly.
257: .Pp
258: The AST is composed of
259: .Vt struct man_node
260: nodes with element, root and text types as declared by the
261: .Va type
262: field.
263: Each node also provides its parse point (the
264: .Va line ,
265: .Va sec ,
266: and
267: .Va pos
268: fields), its position in the tree (the
269: .Va parent ,
270: .Va child ,
271: .Va next
272: and
273: .Va prev
274: fields) and some type-specific data.
275: .Pp
276: The tree itself is arranged according to the following normal form,
277: where capitalised non-terminals represent nodes.
278: .Pp
279: .Bl -tag -width "ELEMENTXX" -compact
280: .It ROOT
281: \(<- mnode+
282: .It mnode
283: \(<- ELEMENT | TEXT | BLOCK
284: .It BLOCK
285: \(<- HEAD BODY
286: .It HEAD
287: \(<- mnode*
288: .It BODY
289: \(<- mnode*
290: .It ELEMENT
291: \(<- ELEMENT | TEXT*
292: .It TEXT
293: \(<- [[:alpha:]]*
294: .El
295: .Pp
296: The only elements capable of nesting other elements are those with
297: next-line scope as documented in
298: .Xr man 7 .
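.Pp
The linkage fields lend themselves to a recursive pre-order walk.
The following is a minimal sketch on a hypothetical stand-in structure:
toy_node and its helpers are illustrative names, not part of the
library, and mirror only the parent, child, next and prev fields
named above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a syntax tree node (not the library's type). */
enum toy_type { TOY_ROOT, TOY_ELEMENT, TOY_TEXT };

struct toy_node {
	struct toy_node	*parent;
	struct toy_node	*child;		/* first child */
	struct toy_node	*next;		/* next sibling */
	struct toy_node	*prev;		/* previous sibling */
	enum toy_type	 type;
	const char	*string;	/* macro name or text content */
};

/* Link n in as the last child of p. */
static void
toy_append(struct toy_node *p, struct toy_node *n)
{
	struct toy_node *last;

	n->parent = p;
	if (p->child == NULL) {
		p->child = n;
		return;
	}
	for (last = p->child; last->next != NULL; last = last->next)
		continue;
	last->next = n;
	n->prev = last;
}

/* Pre-order walk: visit a node, descend into its children, then move on
 * to its next sibling.  TEXT nodes are appended to buf, space-separated. */
static void
toy_collect(const struct toy_node *n, char *buf, size_t bufsz)
{
	for ( ; n != NULL; n = n->next) {
		if (n->type == TOY_TEXT) {
			if (buf[0] != 0)
				strncat(buf, " ", bufsz - strlen(buf) - 1);
			strncat(buf, n->string, bufsz - strlen(buf) - 1);
		}
		toy_collect(n->child, buf, bufsz);
	}
}

/* Build ROOT <- ELEMENT(TEXT TEXT) TEXT and flatten it to a string. */
static const char *
toy_demo(void)
{
	static char buf[64];
	struct toy_node root = { .type = TOY_ROOT };
	struct toy_node elem = { .type = TOY_ELEMENT, .string = "B" };
	struct toy_node t1 = { .type = TOY_TEXT, .string = "bold" };
	struct toy_node t2 = { .type = TOY_TEXT, .string = "words" };
	struct toy_node t3 = { .type = TOY_TEXT, .string = "plain" };

	toy_append(&elem, &t1);
	toy_append(&elem, &t2);
	toy_append(&root, &elem);
	toy_append(&root, &t3);

	buf[0] = 0;
	toy_collect(&root, buf, sizeof(buf));
	return buf;
}
```

With this toy tree, toy_demo() returns the text nodes in document order.
A real consumer would start from the root node of a successful parse
instead of a hand-built tree.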
299: .Ss Mdoc Abstract Syntax Tree
300: This AST is governed by the ontological
301: rules dictated in
302: .Xr mdoc 7
303: and derives its terminology accordingly.
304: .Qq In-line
305: elements described in
306: .Xr mdoc 7
307: are described simply as
308: .Qq elements .
309: .Pp
310: The AST is composed of
311: .Vt struct mdoc_node
312: nodes with block, head, body, element, root and text types as declared
313: by the
314: .Va type
315: field.
316: Each node also provides its parse point (the
317: .Va line ,
318: .Va sec ,
319: and
320: .Va pos
321: fields), its position in the tree (the
322: .Va parent ,
323: .Va child ,
324: .Va nchild ,
325: .Va next
326: and
327: .Va prev
328: fields) and some type-specific data, in particular, for nodes generated
329: from macros, the generating macro in the
330: .Va tok
331: field.
332: .Pp
333: The tree itself is arranged according to the following normal form,
334: where capitalised non-terminals represent nodes.
335: .Pp
336: .Bl -tag -width "ELEMENTXX" -compact
337: .It ROOT
338: \(<- mnode+
339: .It mnode
340: \(<- BLOCK | ELEMENT | TEXT
341: .It BLOCK
342: \(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
343: .It ELEMENT
344: \(<- TEXT*
345: .It HEAD
346: \(<- mnode*
347: .It BODY
348: \(<- mnode* [ENDBODY mnode*]
349: .It TAIL
350: \(<- mnode*
351: .It TEXT
352: \(<- [[:printable:],0x1e]*
353: .El
354: .Pp
355: Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
356: the BLOCK production: these refer to punctuation marks.
357: Furthermore, although a TEXT node will generally have a non-zero-length
358: string, in the specific case of
359: .Sq \&.Bd \-literal ,
360: an empty line will produce a zero-length string.
361: Multiple body parts are only found in invocations of
362: .Sq \&Bl \-column ,
363: where a new body introduces a new phrase.
364: .Pp
365: The
366: .Xr mdoc 7
1.5 ! kristaps 367: syntax tree accommodates broken block structures as well.
1.1 kristaps 368: The ENDBODY node is available to end the formatting associated
369: with a given block before the physical end of that block.
370: It has a non-null
371: .Va end
372: field, is of the BODY
373: .Va type ,
374: has the same
375: .Va tok
376: as the BLOCK it is ending, and has a
377: .Va pending
378: field pointing to that BLOCK's BODY node.
379: It is an indirect child of that BODY node
380: and has no children of its own.
381: .Pp
382: An ENDBODY node is generated when a block ends while one of its child
383: blocks is still open, as in the following example:
384: .Bd -literal -offset indent
385: \&.Ao ao
386: \&.Bo bo ac
387: \&.Ac bc
388: \&.Bc end
389: .Ed
390: .Pp
391: This example results in the following block structure:
392: .Bd -literal -offset indent
393: BLOCK Ao
394:     HEAD Ao
395:     BODY Ao
396:         TEXT ao
397:         BLOCK Bo, pending -> Ao
398:             HEAD Bo
399:             BODY Bo
400:                 TEXT bo
401:                 TEXT ac
402:                 ENDBODY Ao, pending -> Ao
403:                 TEXT bc
404: TEXT end
405: .Ed
406: .Pp
407: Here, the formatting of the
408: .Sq \&Ao
409: block extends from TEXT ao to TEXT ac,
410: while the formatting of the
411: .Sq \&Bo
412: block extends from TEXT bo to TEXT bc.
413: It renders as follows in
414: .Fl T Ns Cm ascii
415: mode:
416: .Pp
417: .Dl <ao [bo ac> bc] end
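.Pp
One way to picture how an ENDBODY node yields this output is the toy
renderer sketched here.
It is an illustration under stated assumptions, not the formatter used by
.Xr mandoc 1 :
the node structure, the ended flag and all toy_ names are hypothetical,
and the tree is built by hand to match the block structure shown above.

```c
#include <stddef.h>
#include <string.h>

enum toy_type { TOY_BLOCK, TOY_HEAD, TOY_BODY, TOY_TEXT };
enum toy_tok { TOK_NONE, TOK_AO, TOK_BO };

struct toy_node {
	struct toy_node	*child, *next;
	struct toy_node	*pending;	/* ENDBODY: the BODY it ends early */
	enum toy_type	 type;
	enum toy_tok	 tok;
	int		 end;		/* non-zero marks an ENDBODY node */
	int		 ended;		/* BODY already closed by an ENDBODY */
	const char	*string;
};

struct toy_out {
	char	buf[64];
	int	nospace;	/* suppress the space before the next word */
};

static void
toy_append(struct toy_out *o, const char *s)
{
	strncat(o->buf, s, sizeof(o->buf) - strlen(o->buf) - 1);
}

/* Emit a word, preceded by a space unless one is suppressed. */
static void
toy_word(struct toy_out *o, const char *s)
{
	if (o->buf[0] != 0 && !o->nospace)
		toy_append(o, " ");
	toy_append(o, s);
	o->nospace = 0;
}

static const char *
toy_delim(enum toy_tok tok, int closing)
{
	if (tok == TOK_AO)
		return closing ? ">" : "<";
	return closing ? "]" : "[";
}

static void
toy_render(struct toy_node *n, struct toy_out *o)
{
	for ( ; n != NULL; n = n->next) {
		if (n->type == TOY_TEXT) {
			toy_word(o, n->string);
			continue;
		}
		if (n->type == TOY_BODY && n->end) {
			/* ENDBODY: close the pending BODY's formatting now. */
			toy_append(o, toy_delim(n->tok, 1));
			n->pending->ended = 1;
			continue;
		}
		if (n->type == TOY_BODY) {
			toy_word(o, toy_delim(n->tok, 0));
			o->nospace = 1;
		}
		toy_render(n->child, o);
		/* Skip the closing delimiter if an ENDBODY already wrote it. */
		if (n->type == TOY_BODY && !n->ended)
			toy_append(o, toy_delim(n->tok, 1));
	}
}

static const char *
toy_demo(void)
{
	static struct toy_out out;
	struct toy_node t_end = { .type = TOY_TEXT, .string = "end" };
	struct toy_node t_bc = { .type = TOY_TEXT, .string = "bc" };
	struct toy_node endb = { .type = TOY_BODY, .tok = TOK_AO,
	    .end = 1, .next = &t_bc };
	struct toy_node t_ac = { .type = TOY_TEXT, .string = "ac",
	    .next = &endb };
	struct toy_node t_bo = { .type = TOY_TEXT, .string = "bo",
	    .next = &t_ac };
	struct toy_node bo_body = { .type = TOY_BODY, .tok = TOK_BO,
	    .child = &t_bo };
	struct toy_node bo_head = { .type = TOY_HEAD, .tok = TOK_BO,
	    .next = &bo_body };
	struct toy_node bo_blk = { .type = TOY_BLOCK, .tok = TOK_BO,
	    .child = &bo_head };
	struct toy_node t_ao = { .type = TOY_TEXT, .string = "ao",
	    .next = &bo_blk };
	struct toy_node ao_body = { .type = TOY_BODY, .tok = TOK_AO,
	    .child = &t_ao };
	struct toy_node ao_head = { .type = TOY_HEAD, .tok = TOK_AO,
	    .next = &ao_body };
	struct toy_node ao_blk = { .type = TOY_BLOCK, .tok = TOK_AO,
	    .child = &ao_head, .next = &t_end };

	endb.pending = &ao_body;
	memset(&out, 0, sizeof(out));
	toy_render(&ao_blk, &out);
	return out.buf;
}
```

The key step is the ENDBODY branch: it writes the closing delimiter of the
block named by its tok and marks the pending BODY as ended, so that BODY
does not write the delimiter a second time at its physical end.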
418: .Pp
419: Support for badly-nested blocks is only provided for backward
420: compatibility with some older
421: .Xr mdoc 7
422: implementations.
423: Using badly-nested blocks is
424: .Em strongly discouraged ;
425: for example, the
426: .Fl T Ns Cm html
427: and
428: .Fl T Ns Cm xhtml
429: front-ends to
430: .Xr mandoc 1
431: are unable to render them in any meaningful way.
432: Furthermore, behaviour when encountering badly-nested blocks is not
433: consistent across troff implementations, especially when using multiple
434: levels of badly-nested blocks.
435: .Sh SEE ALSO
436: .Xr mandoc 1 ,
437: .Xr eqn 7 ,
438: .Xr man 7 ,
439: .Xr mdoc 7 ,
440: .Xr roff 7 ,
441: .Xr tbl 7
442: .Sh AUTHORS
443: The
444: .Nm
445: library was written by
446: .An Kristaps Dzonsons Aq kristaps@bsd.lv .