Annotation of mandoc/mandoc_html.3, Revision 1.22
1.22 ! schwarze 1: .\" $Id: mandoc_html.3,v 1.21 2020/04/18 20:44:09 schwarze Exp $
1.1 schwarze 2: .\"
1.11 schwarze 3: .\" Copyright (c) 2014, 2017, 2018 Ingo Schwarze <schwarze@openbsd.org>
1.1 schwarze 4: .\"
5: .\" Permission to use, copy, modify, and distribute this software for any
6: .\" purpose with or without fee is hereby granted, provided that the above
7: .\" copyright notice and this permission notice appear in all copies.
8: .\"
9: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
10: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
11: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
12: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
13: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
14: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
15: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
16: .\"
1.22 ! schwarze 17: .Dd $Mdocdate: April 18 2020 $
1.1 schwarze 18: .Dt MANDOC_HTML 3
19: .Os
20: .Sh NAME
21: .Nm mandoc_html
22: .Nd internals of the mandoc HTML formatter
23: .Sh SYNOPSIS
1.21 schwarze 24: .In sys/types.h
25: .Fd #include """mandoc.h"""
26: .Fd #include """roff.h"""
27: .Fd #include """out.h"""
28: .Fd #include """html.h"""
1.1 schwarze 29: .Ft void
30: .Fn print_gen_decls "struct html *h"
31: .Ft void
1.11 schwarze 32: .Fn print_gen_comment "struct html *h" "struct roff_node *n"
33: .Ft void
1.1 schwarze 34: .Fn print_gen_head "struct html *h"
35: .Ft struct tag *
36: .Fo print_otag
37: .Fa "struct html *h"
38: .Fa "enum htmltag tag"
1.2 schwarze 39: .Fa "const char *fmt"
40: .Fa ...
1.1 schwarze 41: .Fc
42: .Ft void
43: .Fo print_tagq
44: .Fa "struct html *h"
45: .Fa "const struct tag *until"
46: .Fc
47: .Ft void
48: .Fo print_stagq
49: .Fa "struct html *h"
50: .Fa "const struct tag *suntil"
51: .Fc
52: .Ft void
1.21 schwarze 53: .Fn html_close_paragraph "struct html *h"
54: .Ft enum roff_tok
55: .Fo html_fillmode
56: .Fa "struct html *h"
57: .Fa "enum roff_tok want"
58: .Fc
59: .Ft int
60: .Fo html_setfont
61: .Fa "struct html *h"
62: .Fa "enum mandoc_esc font"
63: .Fc
64: .Ft void
1.1 schwarze 65: .Fo print_text
66: .Fa "struct html *h"
67: .Fa "const char *word"
68: .Fc
1.21 schwarze 69: .Ft void
70: .Fo print_tagged_text
71: .Fa "struct html *h"
72: .Fa "const char *word"
73: .Fa "struct roff_node *n"
74: .Fc
1.7 schwarze 75: .Ft char *
76: .Fo html_make_id
77: .Fa "const struct roff_node *n"
1.20 schwarze 78: .Fa "int unique"
1.7 schwarze 79: .Fc
1.20 schwarze 80: .Ft struct tag *
81: .Fo print_otag_id
82: .Fa "struct html *h"
83: .Fa "enum htmltag tag"
84: .Fa "const char *cattr"
85: .Fa "struct roff_node *n"
1.7 schwarze 86: .Fc
1.21 schwarze 87: .Ft void
88: .Fn print_endline "struct html *h"
1.1 schwarze 89: .Sh DESCRIPTION
90: The mandoc HTML formatter is not a formal library.
91: However, as it is compiled into more than one program, in particular
92: .Xr mandoc 1
93: and
94: .Xr man.cgi 8 ,
95: and because it may be security-critical in some contexts,
96: some documentation is useful to help use it correctly and
97: to prevent XSS vulnerabilities.
98: .Pp
99: The formatter produces HTML output on the standard output.
100: Since proper escaping is usually required and best taken care of
101: at one central place, the language-specific formatters
102: .Po
103: .Pa *_html.c ,
104: see
105: .Sx FILES
106: .Pc
107: are not supposed to print directly to
108: .Dv stdout
109: using functions like
110: .Xr printf 3 ,
111: .Xr putc 3 ,
112: .Xr puts 3 ,
113: or
114: .Xr write 2 .
115: Instead, they are expected to use the output functions declared in
116: .Pa html.h
117: and implemented as part of the main HTML formatting engine in
118: .Pa html.c .
119: .Ss Data structures
120: These structures are declared in
121: .Pa html.h .
122: .Bl -tag -width Ds
123: .It Vt struct html
124: Internal state of the HTML formatter.
125: .It Vt struct tag
126: One entry for the LIFO stack of HTML elements.
1.21 schwarze 127: Members include
1.1 schwarze 128: .Fa "enum htmltag tag"
129: and
130: .Fa "struct tag *next" .
131: .El
132: .Ss Private interface functions
133: The function
134: .Fn print_gen_decls
135: prints the opening
136: .Aq Pf \&! Ic DOCTYPE
1.21 schwarze 137: declaration.
1.11 schwarze 138: .Pp
139: The function
140: .Fn print_gen_comment
141: prints the leading comments, usually containing a Copyright notice
142: and license, as an HTML comment.
143: It is intended to be called right after opening the
144: .Aq Ic HTML
145: element.
146: Pass the first
147: .Dv ROFFT_COMMENT
148: node in
149: .Fa n .
1.1 schwarze 150: .Pp
151: The function
152: .Fn print_gen_head
153: prints the opening
154: .Aq Ic META
155: and
156: .Aq Ic LINK
157: elements for the document
158: .Aq Ic HEAD ,
159: using the
160: .Fa style
161: member of
162: .Fa h
163: unless that is
164: .Dv NULL .
165: It uses
166: .Fn print_otag
167: which takes care of properly encoding attributes,
168: which is relevant for the
169: .Fa style
170: link in particular.
171: .Pp
172: The function
173: .Fn print_otag
174: prints the start tag of an HTML element with the name
175: .Fa tag ,
1.2 schwarze 176: optionally including the attributes specified by
177: .Fa fmt .
178: If
179: .Fa fmt
180: is the empty string, no attributes are written.
181: Each letter of
182: .Fa fmt
183: specifies one attribute to write.
184: Most attributes require one
185: .Va char *
186: argument which becomes the value of the attribute.
187: The arguments have to be given in the same order as the attribute letters.
1.5 schwarze 188: If an argument is
189: .Dv NULL ,
190: the respective attribute is not written.
1.2 schwarze 191: .Bl -tag -width 1n -offset indent
192: .It Cm c
193: Print a
194: .Cm class
195: attribute.
196: .It Cm h
197: Print a
198: .Cm href
199: attribute.
1.3 schwarze 200: This attribute letter can optionally be followed by a modifier letter.
201: If followed by
202: .Cm R ,
203: it formats the link as a local one by prefixing a
204: .Sq #
205: character.
206: If followed by
207: .Cm I ,
208: it interprets the argument as a header file name
209: and generates a link using the
210: .Xr mandoc 1
211: .Fl O Cm includes
212: option.
213: If followed by
214: .Cm M ,
215: it takes two arguments instead of one, a manual page name and
216: section, and formats them as a link to a manual page using the
217: .Xr mandoc 1
218: .Fl O Cm man
219: option.
1.2 schwarze 220: .It Cm i
221: Print an
222: .Cm id
223: attribute.
224: .It Cm \&?
225: Print an arbitrary attribute.
226: This format letter requires two
227: .Vt char *
228: arguments, the attribute name and the value.
1.5 schwarze 229: The name must not be
230: .Dv NULL .
1.2 schwarze 231: .El
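The attribute letters above can be illustrated with a minimal, self-contained sketch. It is not the code from html.c: the names sketch_append() and sketch_attrs() are hypothetical, output goes to a caller-supplied buffer instead of standard output, only the letters c, h (with the R modifier), i, and ? are handled, and the set of escaped characters is an assumption about what attribute encoding needs.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/*
 * Append s to the NUL-terminated buffer out.  If encode is non-zero,
 * the characters <, >, &, and " are replaced by HTML entities.
 * Output that does not fit is silently dropped (sketch only).
 */
static void
sketch_append(char *out, size_t outsz, const char *s, int encode)
{
	size_t len = strlen(out);

	for (; *s != '\0'; s++) {
		char one[2] = { *s, '\0' };
		const char *rep = one;

		if (encode) {
			switch (*s) {
			case '<': rep = "&lt;"; break;
			case '>': rep = "&gt;"; break;
			case '&': rep = "&amp;"; break;
			case '"': rep = "&quot;"; break;
			default: break;
			}
		}
		if (len + strlen(rep) >= outsz)
			return;
		memcpy(out + len, rep, strlen(rep) + 1);
		len += strlen(rep);
	}
}

/*
 * Hypothetical stand-in for the attribute part of print_otag():
 * one letter of fmt per attribute, arguments in the same order.
 */
static void
sketch_attrs(char *out, size_t outsz, const char *fmt, ...)
{
	va_list ap;
	const char *name, *val, *prefix;

	out[0] = '\0';
	va_start(ap, fmt);
	while (*fmt != '\0') {
		prefix = "";
		switch (*fmt++) {
		case 'c': name = "class"; break;
		case 'i': name = "id"; break;
		case 'h':
			name = "href";
			if (*fmt == 'R') {	/* modifier: local link */
				prefix = "#";
				fmt++;
			}
			break;
		case '?':	/* arbitrary attribute: name comes first */
			name = va_arg(ap, const char *);
			break;
		default:	/* letter not covered by this sketch */
			va_end(ap);
			return;
		}
		val = va_arg(ap, const char *);
		if (val == NULL)
			continue;	/* NULL value: attribute not written */
		sketch_append(out, outsz, " ", 0);
		sketch_append(out, outsz, name, 0);
		sketch_append(out, outsz, "=\"", 0);
		sketch_append(out, outsz, prefix, 0);
		sketch_append(out, outsz, val, 1);
		sketch_append(out, outsz, "\"", 0);
	}
	va_end(ap);
}
```

With a buffer buf, the call sketch_attrs(buf, sizeof(buf), "chR", "Sx", "a&b") leaves a class attribute followed by a local href attribute in buf, with the ampersand encoded.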
232: .Pp
233: .Fn print_otag
234: uses the private function
1.1 schwarze 235: .Fn print_encode
236: to take care of HTML encoding.
237: If required by the element type, it remembers in
238: .Fa h
239: that the element is open.
240: The function
241: .Fn print_tagq
242: is used to close out all open elements up to and including
243: .Fa until ;
244: .Fn print_stagq
245: is a variant to close out all open elements up to but excluding
246: .Fa suntil .
1.21 schwarze 247: The function
248: .Fn html_close_paragraph
249: closes all open elements that establish phrasing context,
250: thus returning to the innermost flow context.
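The difference between closing up to and including an element and closing up to but excluding it can be shown on a toy LIFO stack. This is a sketch under stated assumptions: struct sketch_tag, sketch_open(), sketch_close_through(), and sketch_close_above() are hypothetical names, the real struct tag has more members, and the real functions take a struct html pointer rather than using a global.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified stack entry. */
struct sketch_tag {
	struct sketch_tag *next;	/* element opened before this one */
	const char *name;
};

static struct sketch_tag *sketch_top;	/* innermost open element */

/* Open an element: print the start tag, push, return the entry. */
static struct sketch_tag *
sketch_open(const char *name)
{
	struct sketch_tag *t;

	if ((t = malloc(sizeof(*t))) == NULL)
		abort();
	t->name = name;
	t->next = sketch_top;
	sketch_top = t;
	printf("<%s>", name);
	return t;
}

/* Close the innermost element and free its stack entry. */
static void
sketch_pop(void)
{
	struct sketch_tag *t = sketch_top;

	sketch_top = t->next;
	printf("</%s>", t->name);
	free(t);
}

/* In the spirit of print_tagq(): close up to and including until. */
static void
sketch_close_through(const struct sketch_tag *until)
{
	while (sketch_top != NULL) {
		int last = (sketch_top == until);

		sketch_pop();
		if (last)
			break;
	}
}

/* In the spirit of print_stagq(): close up to but excluding suntil. */
static void
sketch_close_above(const struct sketch_tag *suntil)
{
	while (sketch_top != NULL && sketch_top != suntil)
		sketch_pop();
}
```

After opening div, p, and b in that order, sketch_close_above(p) closes only b, while sketch_close_through(div) then closes p and div. A pointer returned by sketch_open() must not be used after the element it refers to has been closed, because the entry is freed at that point.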
251: .Pp
252: The function
253: .Fn html_fillmode
254: switches to fill mode if
255: .Fa want
256: is
257: .Dv ROFF_fi
258: or to no-fill mode if
259: .Fa want
260: is
261: .Dv ROFF_nf .
262: Switching from fill mode to no-fill mode closes the current paragraph
263: and opens a
264: .Aq Ic PRE
265: element.
266: Switching in the opposite direction closes the
267: .Aq Ic PRE
268: element, but does not open a new paragraph.
269: If
270: .Fa want
271: matches the mode that is already active, no elements are closed or opened.
272: If
273: .Fa want
274: is
275: .Dv TOKEN_NONE ,
276: the mode remains as it is.
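The mode switching described above amounts to a small state machine. The following sketch models it with flags instead of HTML output; the enum constants and struct sketch_html are hypothetical stand-ins for ROFF_fi, ROFF_nf, TOKEN_NONE, and the real struct html, and the return value follows the RETURN VALUES section.

```c
/* Hypothetical stand-ins for ROFF_fi, ROFF_nf, and TOKEN_NONE. */
enum sketch_tok { SK_FI, SK_NF, SK_NONE };

/* Hypothetical formatter state, reduced to what the sketch needs. */
struct sketch_html {
	int nofill;	/* non-zero while no-fill mode is active */
	int pre_open;	/* non-zero while a PRE element is open */
	int par_open;	/* non-zero while a paragraph is open */
};

/* Switch modes as requested and return the mode active before. */
static enum sketch_tok
sketch_fillmode(struct sketch_html *h, enum sketch_tok want)
{
	enum sketch_tok had = h->nofill ? SK_NF : SK_FI;

	/* Pure query, or already in the wanted mode: nothing to do. */
	if (want == SK_NONE || want == had)
		return had;

	if (want == SK_NF) {
		h->par_open = 0;	/* close the current paragraph */
		h->pre_open = 1;	/* open PRE */
		h->nofill = 1;
	} else {
		h->pre_open = 0;	/* close PRE, open no paragraph */
		h->nofill = 0;
	}
	return had;
}
```

Because the previous mode is returned, a caller can save it, switch temporarily, and restore the saved mode afterwards by passing it back in.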
277: .Pp
278: The function
279: .Fn html_setfont
280: selects the
281: .Fa font ,
282: which can be
283: .Dv ESCAPE_FONTROMAN ,
284: .Dv ESCAPE_FONTBOLD ,
285: .Dv ESCAPE_FONTITALIC ,
286: .Dv ESCAPE_FONTBI ,
287: or
288: .Dv ESCAPE_FONTCW ,
289: for future text output and internally remembers
290: the font that was active before the change.
291: If the
292: .Fa font
293: argument is
294: .Dv ESCAPE_FONTPREV ,
295: the current and the previous font are exchanged.
296: This function only changes the internal state of the
297: .Fa h
298: object; no HTML elements are written yet.
299: Subsequent text output will write font elements when needed.
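The bookkeeping of current and previous font can be sketched in a few lines. The enum constants and struct sketch_fonts are hypothetical stand-ins for the ESCAPE_FONT* values and the relevant members of struct html; the return value of the real function is not modelled here.

```c
/* Hypothetical stand-ins for the ESCAPE_FONT* constants. */
enum sketch_font {
	SK_ROMAN, SK_BOLD, SK_ITALIC, SK_BI, SK_CW, SK_PREV
};

/* Hypothetical font state: no HTML is written by changing it. */
struct sketch_fonts {
	enum sketch_font cur;	/* font for future text output */
	enum sketch_font prev;	/* font active before the last change */
};

static void
sketch_setfont(struct sketch_fonts *f, enum sketch_font font)
{
	enum sketch_font tmp;

	if (font == SK_PREV) {
		/* Exchange the current and the previous font. */
		tmp = f->cur;
		f->cur = f->prev;
		f->prev = tmp;
	} else {
		/* Remember the old font, then select the new one. */
		f->prev = f->cur;
		f->cur = font;
	}
}
```

Selecting bold and then requesting the previous font returns to the font that was active before, with bold remembered as the new previous font.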
1.1 schwarze 300: .Pp
301: The function
302: .Fn print_text
303: prints HTML element content.
304: It uses the private function
305: .Fn print_encode
306: to take care of HTML encoding.
307: If the document has requested a non-standard font, for example using a
308: .Xr roff 7
309: .Ic \ef
310: font escape sequence,
311: .Fn print_text
312: wraps
313: .Fa word
314: in an HTML font selection element using the
315: .Fn print_otag
316: and
317: .Fn print_tagq
318: functions.
319: .Pp
1.7 schwarze 320: The function
1.21 schwarze 321: .Fn print_tagged_text
322: is a variant of
323: .Fn print_text
324: that wraps
325: .Fa word
326: in an
327: .Aq Ic A
328: element of class
329: .Qq permalink
330: if
331: .Fa n
332: is not
333: .Dv NULL
334: and yields a segment identifier when passed to
335: .Fn html_make_id .
336: .Pp
337: The function
1.7 schwarze 338: .Fn html_make_id
1.20 schwarze 339: allocates a string to be used for the
340: .Cm id
341: attribute of an HTML element and/or as a segment identifier for a URI in an
342: .Aq Ic A
343: element.
344: If
345: .Fa n
346: contains a
1.21 schwarze 347: .Fa tag
1.20 schwarze 348: attribute, it is used; otherwise, child nodes are used.
349: If
1.7 schwarze 350: .Fa n
1.20 schwarze 351: is an
352: .Ic \&Sh ,
353: .Ic \&Ss ,
354: .Ic \&Sx ,
355: .Ic SH ,
356: or
357: .Ic SS
358: node, the resulting string is the concatenation of the child strings;
359: for other node types, only the first child is used.
360: Bytes not permitted in URI-fragment strings are replaced by underscores.
361: If any of the children to be used is not a text node,
362: no string is generated and
1.7 schwarze 363: .Dv NULL
1.20 schwarze 364: is returned instead.
365: If the
366: .Fa unique
367: argument is non-zero, deduplication is performed by appending an
368: underscore and a decimal integer, if necessary.
1.22 ! schwarze 369: If the
! 370: .Fa unique
! 371: argument is 1, this is assumed to be the first call for this tag
! 372: at this location, typically for use by
! 373: .Dv NODE_ID ,
! 374: so the integer is incremented before use.
! 375: If the
! 376: .Fa unique
! 377: argument is 2, this is assumed to be the second call for this tag
! 378: at this location, typically for use by
! 379: .Dv NODE_HREF ,
! 380: so the existing integer, if any, is used without incrementing it.
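The byte replacement and the two deduplication modes can be sketched as follows. This is an illustration, not the code from html.c: sketch_sanitize(), sketch_unique(), and struct sketch_entry are hypothetical names, the permitted byte set is taken from the RFC 3986 fragment grammar without percent-encoding, and the numbering scheme (first occurrence unsuffixed, later ones suffixed starting at 2) is an assumption.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a new copy of text with unsuitable bytes replaced by '_'. */
static char *
sketch_sanitize(const char *text)
{
	size_t len = strlen(text);
	char *id, *cp;

	if ((id = malloc(len + 1)) == NULL)
		abort();
	memcpy(id, text, len + 1);
	for (cp = id; *cp != '\0'; cp++)
		if (!isalnum((unsigned char)*cp) &&
		    strchr("!$&'()*+,-./:;=?@_~", *cp) == NULL)
			*cp = '_';
	return id;
}

/* Hypothetical per-identifier counter; ord 0 means "not seen yet". */
struct sketch_entry {
	const char *id;
	int ord;
};

/*
 * Return a newly allocated identifier for e.
 * unique == 1: first call at this location, take the next number.
 * unique == 2: second call at this location, reuse the number.
 */
static char *
sketch_unique(struct sketch_entry *e, int unique)
{
	size_t sz = strlen(e->id) + 16;	/* room for '_' and an int */
	char *buf;

	if (e->ord == 0)
		e->ord = 1;
	else if (unique == 1)
		e->ord++;

	if ((buf = malloc(sz)) == NULL)
		abort();
	if (e->ord > 1)
		snprintf(buf, sz, "%s_%d", e->id, e->ord);
	else
		snprintf(buf, sz, "%s", e->id);
	return buf;
}
```

Calling sketch_unique() with 1 and then with 2 for the same location yields the same string twice, which is what keeps an id attribute and the href pointing at it in agreement. As with html_make_id(), the caller frees the returned strings.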
1.7 schwarze 381: .Pp
382: The function
1.20 schwarze 383: .Fn print_otag_id
384: opens a
385: .Fa tag
386: element of class
387: .Fa cattr
388: for the node
389: .Fa n .
390: If the flag
391: .Dv NODE_ID
392: is set in
393: .Fa n ,
394: it attempts to generate an
395: .Cm id
396: attribute with
397: .Fn html_make_id .
1.21 schwarze 398: If the flag
399: .Dv NODE_HREF
400: is set in
401: .Fa n ,
402: an
1.20 schwarze 403: .Aq Ic A
404: element of class
1.21 schwarze 405: .Qq permalink
406: is added:
1.20 schwarze 407: outside if
408: .Fa n
1.21 schwarze 409: generates an element that can only occur in phrasing context,
410: or inside otherwise.
1.20 schwarze 411: This function is a wrapper around
412: .Fn html_make_id
413: and
414: .Fn print_otag ,
1.22 ! schwarze 415: automatically choosing the
1.20 schwarze 416: .Fa unique
1.22 ! schwarze 417: argument appropriately and setting the
1.20 schwarze 418: .Fa fmt
419: arguments to
420: .Qq chR
421: and
422: .Qq ci ,
423: respectively.
1.7 schwarze 424: .Pp
1.21 schwarze 425: The function
426: .Fn print_endline
427: makes sure subsequent output starts on a new HTML output line.
428: If nothing was printed on the current output line yet, it has no effect.
429: Otherwise, it appends any buffered text to the current output line,
430: ends the line, and updates the internal state of the
431: .Fa h
432: object.
433: .Pp
1.1 schwarze 434: The functions
435: .Fn print_eqn ,
436: .Fn print_tbl ,
437: and
438: .Fn print_tblclose
439: are not yet documented.
1.20 schwarze 440: .Sh RETURN VALUES
441: The functions
442: .Fn print_otag
443: and
444: .Fn print_otag_id
445: return a pointer to a new element on the stack of HTML elements.
446: When
447: .Fn print_otag_id
448: opens two elements, a pointer to the outer one is returned.
449: The memory pointed to is owned by the library and is automatically
450: .Xr free 3 Ns d
451: when
452: .Fn print_tagq
453: is called on it or when
454: .Fn print_stagq
455: is called on a parent element.
456: .Pp
457: The function
1.21 schwarze 458: .Fn html_fillmode
459: returns
460: .Dv ROFF_fi
461: if fill mode was active before the call or
462: .Dv ROFF_nf
463: otherwise.
464: .Pp
465: The function
1.20 schwarze 466: .Fn html_make_id
467: returns a newly allocated string or
468: .Dv NULL
469: if
470: .Fa n
471: lacks text data to create the attribute from.
1.22 ! schwarze 472: The caller is responsible for
1.20 schwarze 473: .Xr free 3 Ns ing
474: the returned string after using it.
475: .Pp
476: In case of
477: .Xr malloc 3
478: failure, these functions do not return but call
479: .Xr err 3 .
1.1 schwarze 480: .Sh FILES
481: .Bl -tag -width mandoc_aux.c -compact
482: .It Pa main.h
483: declarations of public functions for use by the main program,
484: not yet documented
485: .It Pa html.h
486: declarations of data types and private functions
487: for use by language-specific HTML formatters
488: .It Pa html.c
489: main HTML formatting engine and utility functions
490: .It Pa mdoc_html.c
491: .Xr mdoc 7
492: HTML formatter
493: .It Pa man_html.c
494: .Xr man 7
495: HTML formatter
496: .It Pa tbl_html.c
497: .Xr tbl 7
498: HTML formatter
499: .It Pa eqn_html.c
500: .Xr eqn 7
501: HTML formatter
1.21 schwarze 502: .It Pa roff_html.c
503: .Xr roff 7
504: HTML formatter, handling requests like
505: .Ic br ,
506: .Ic ce ,
507: .Ic fi ,
508: .Ic ft ,
509: .Ic nf ,
510: .Ic rj ,
511: and
512: .Ic sp .
1.1 schwarze 513: .It Pa out.h
514: declarations of data types and private functions
515: for shared use by all mandoc formatters,
516: not yet documented
517: .It Pa out.c
518: private functions for shared use by all mandoc formatters
519: .It Pa mandoc_aux.h
520: declarations of common mandoc utility functions, see
521: .Xr mandoc 3
522: .It Pa mandoc_aux.c
523: implementation of common mandoc utility functions
524: .El
525: .Sh SEE ALSO
526: .Xr mandoc 1 ,
527: .Xr mandoc 3 ,
528: .Xr man.cgi 8
529: .Sh AUTHORS
530: .An -nosplit
531: The mandoc HTML formatter was written by
532: .An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .
1.5 schwarze 533: It is maintained by
534: .An Ingo Schwarze Aq Mt schwarze@openbsd.org ,
535: who also wrote this manual.