Annotation of mandoc/mandoc_html.3, Revision 1.21
1.21 ! schwarze 1: .\" $Id: mandoc_html.3,v 1.20 2020/03/13 15:32:28 schwarze Exp $
1.1 schwarze 2: .\"
1.11 schwarze 3: .\" Copyright (c) 2014, 2017, 2018 Ingo Schwarze <schwarze@openbsd.org>
1.1 schwarze 4: .\"
5: .\" Permission to use, copy, modify, and distribute this software for any
6: .\" purpose with or without fee is hereby granted, provided that the above
7: .\" copyright notice and this permission notice appear in all copies.
8: .\"
9: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
10: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
11: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
12: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
13: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
14: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
15: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
16: .\"
1.21 ! schwarze 17: .Dd $Mdocdate: March 13 2020 $
1.1 schwarze 18: .Dt MANDOC_HTML 3
19: .Os
20: .Sh NAME
21: .Nm mandoc_html
22: .Nd internals of the mandoc HTML formatter
23: .Sh SYNOPSIS
1.21 ! schwarze 24: .In sys/types.h
! 25: .Fd #include """mandoc.h"""
! 26: .Fd #include """roff.h"""
! 27: .Fd #include """out.h"""
! 28: .Fd #include """html.h"""
1.1 schwarze 29: .Ft void
30: .Fn print_gen_decls "struct html *h"
31: .Ft void
1.11 schwarze 32: .Fn print_gen_comment "struct html *h" "struct roff_node *n"
33: .Ft void
1.1 schwarze 34: .Fn print_gen_head "struct html *h"
35: .Ft struct tag *
36: .Fo print_otag
37: .Fa "struct html *h"
38: .Fa "enum htmltag tag"
1.2 schwarze 39: .Fa "const char *fmt"
40: .Fa ...
1.1 schwarze 41: .Fc
42: .Ft void
43: .Fo print_tagq
44: .Fa "struct html *h"
45: .Fa "const struct tag *until"
46: .Fc
47: .Ft void
48: .Fo print_stagq
49: .Fa "struct html *h"
50: .Fa "const struct tag *suntil"
51: .Fc
52: .Ft void
1.21 ! schwarze 53: .Fn html_close_paragraph "struct html *h"
! 54: .Ft enum roff_tok
! 55: .Fo html_fillmode
! 56: .Fa "struct html *h"
! 57: .Fa "enum roff_tok want"
! 58: .Fc
! 59: .Ft int
! 60: .Fo html_setfont
! 61: .Fa "struct html *h"
! 62: .Fa "enum mandoc_esc font"
! 63: .Fc
! 64: .Ft void
1.1 schwarze 65: .Fo print_text
66: .Fa "struct html *h"
67: .Fa "const char *word"
68: .Fc
1.21 ! schwarze 69: .Ft void
! 70: .Fo print_tagged_text
! 71: .Fa "struct html *h"
! 72: .Fa "const char *word"
! 73: .Fa "struct roff_node *n"
! 74: .Fc
1.7 schwarze 75: .Ft char *
76: .Fo html_make_id
77: .Fa "const struct roff_node *n"
1.20 schwarze 78: .Fa "int unique"
1.7 schwarze 79: .Fc
1.20 schwarze 80: .Ft struct tag *
81: .Fo print_otag_id
82: .Fa "struct html *h"
83: .Fa "enum htmltag tag"
84: .Fa "const char *cattr"
85: .Fa "struct roff_node *n"
1.7 schwarze 86: .Fc
1.21 ! schwarze 87: .Ft void
! 88: .Fn print_endline "struct html *h"
1.1 schwarze 89: .Sh DESCRIPTION
90: The mandoc HTML formatter is not a formal library.
91: However, as it is compiled into more than one program, in particular
92: .Xr mandoc 1
93: and
94: .Xr man.cgi 8 ,
95: and because it may be security-critical in some contexts,
96: some documentation is useful to help use it correctly and
97: to prevent XSS vulnerabilities.
98: .Pp
99: The formatter produces HTML output on the standard output.
100: Since proper escaping is usually required and best taken care of
101: at one central place, the language-specific formatters
102: .Po
103: .Pa *_html.c ,
104: see
105: .Sx FILES
106: .Pc
107: are not supposed to print directly to
108: .Dv stdout
109: using functions like
110: .Xr printf 3 ,
111: .Xr putc 3 ,
112: .Xr puts 3 ,
113: or
114: .Xr write 2 .
115: Instead, they are expected to use the output functions declared in
116: .Pa html.h
117: and implemented as part of the main HTML formatting engine in
118: .Pa html.c .
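.Pp
As a rough illustration of why a single central encoding step matters,
the following self-contained C sketch replaces the characters that are
special in HTML by entities before they would reach the output.
The function name
.Fn sketch_encode
and its buffer-based interface are assumptions made for this example only;
the actual encoding is done by the private function
.Fn print_encode
in
.Pa html.c ,
which is not reproduced here and may handle more cases.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of html.h: copy src into dst,
 * replacing the characters that are special in HTML by entities.
 * Returns 0 on success or -1 if dst is too small.
 */
int
sketch_encode(const char *src, char *dst, size_t dstsz)
{
    const char *rep;
    size_t len, used;

    assert(src != NULL && dst != NULL);
    if (dstsz == 0)
        return -1;
    used = 0;
    for (; *src != 0; src++) {
        switch (*src) {
        case '<':
            rep = "&lt;";
            len = 4;
            break;
        case '>':
            rep = "&gt;";
            len = 4;
            break;
        case '&':
            rep = "&amp;";
            len = 5;
            break;
        case '"':
            rep = "&quot;";
            len = 6;
            break;
        default:
            rep = src;      /* copy the byte unchanged */
            len = 1;
            break;
        }
        /* Always keep room for the terminating NUL byte. */
        if (used + len >= dstsz)
            return -1;
        memcpy(dst + used, rep, len);
        used += len;
    }
    dst[used] = 0;
    return 0;
}
```
.Ed
A formatter that printed the unescaped input with
.Xr printf 3
instead would pass markup contained in the input through to the browser.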
119: .Ss Data structures
120: These structures are declared in
121: .Pa html.h .
122: .Bl -tag -width Ds
123: .It Vt struct html
124: Internal state of the HTML formatter.
125: .It Vt struct tag
126: One entry for the LIFO stack of HTML elements.
1.21 ! schwarze 127: Members include
1.1 schwarze 128: .Fa "enum htmltag tag"
129: and
130: .Fa "struct tag *next" .
131: .El
132: .Ss Private interface functions
133: The function
134: .Fn print_gen_decls
135: prints the opening
136: .Aq Pf \&! Ic DOCTYPE
1.21 ! schwarze 137: declaration.
1.11 schwarze 138: .Pp
139: The function
140: .Fn print_gen_comment
141: prints the leading comments, usually containing a copyright notice
142: and license, as an HTML comment.
143: It is intended to be called right after opening the
144: .Aq Ic HTML
145: element.
146: Pass the first
147: .Dv ROFFT_COMMENT
148: node in
149: .Fa n .
1.1 schwarze 150: .Pp
151: The function
152: .Fn print_gen_head
153: prints the opening
154: .Aq Ic META
155: and
156: .Aq Ic LINK
157: elements for the document
158: .Aq Ic HEAD ,
159: using the
160: .Fa style
161: member of
162: .Fa h
163: unless that is
164: .Dv NULL .
165: It uses
166: .Fn print_otag
167: which takes care of properly encoding attributes,
168: which is relevant for the
169: .Fa style
170: link in particular.
171: .Pp
172: The function
173: .Fn print_otag
174: prints the start tag of an HTML element with the name
175: .Fa tag ,
1.2 schwarze 176: optionally including the attributes specified by
177: .Fa fmt .
178: If
179: .Fa fmt
180: is the empty string, no attributes are written.
181: Each letter of
182: .Fa fmt
183: specifies one attribute to write.
184: Most attributes require one
185: .Va char *
186: argument which becomes the value of the attribute.
187: The arguments have to be given in the same order as the attribute letters.
1.5 schwarze 188: If an argument is
189: .Dv NULL ,
190: the respective attribute is not written.
1.2 schwarze 191: .Bl -tag -width 1n -offset indent
192: .It Cm c
193: Print a
194: .Cm class
195: attribute.
196: .It Cm h
197: Print a
198: .Cm href
199: attribute.
1.3 schwarze 200: This attribute letter can optionally be followed by a modifier letter.
201: If followed by
202: .Cm R ,
203: it formats the link as a local one by prefixing a
204: .Sq #
205: character.
206: If followed by
207: .Cm I ,
208: it interprets the argument as a header file name
209: and generates a link using the
210: .Xr mandoc 1
211: .Fl O Cm includes
212: option.
213: If followed by
214: .Cm M ,
215: it takes two arguments instead of one, a manual page name and
216: section, and formats them as a link to a manual page using the
217: .Xr mandoc 1
218: .Fl O Cm man
219: option.
1.2 schwarze 220: .It Cm i
221: Print an
222: .Cm id
223: attribute.
224: .It Cm \&?
225: Print an arbitrary attribute.
226: This format letter requires two
227: .Vt char *
228: arguments, the attribute name and the value.
1.5 schwarze 229: The name must not be
230: .Dv NULL .
1.2 schwarze 231: .El
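.Pp
The format letters above can be modelled by a small variadic function.
The following C sketch is a simplified, hypothetical model, not the code of
.Fn print_otag :
the name
.Fn sketch_attrs
and the output buffer are assumptions, only the letters
.Cm c ,
.Cm h ,
.Cm hR ,
.Cm i ,
and
.Cm \&?
are handled, and no HTML encoding is performed.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model of a format-letter-driven attribute writer.
 * Returns the number of bytes written to buf, or -1 if buf is
 * too small or fmt contains a letter this sketch does not know.
 */
int
sketch_attrs(char *buf, size_t bufsz, const char *fmt, ...)
{
    va_list ap;
    const char *name, *val, *pre;
    size_t used;
    int len;

    assert(buf != NULL && bufsz > 0 && fmt != NULL);
    buf[0] = 0;
    used = 0;
    va_start(ap, fmt);
    while (*fmt != 0) {
        pre = "";
        switch (*fmt++) {
        case 'c':
            name = "class";
            break;
        case 'h':
            name = "href";
            if (*fmt == 'R') {
                pre = "#";      /* local link */
                fmt++;
            }
            break;
        case 'i':
            name = "id";
            break;
        case '?':
            /* Arbitrary attribute: the name comes first. */
            name = va_arg(ap, const char *);
            assert(name != NULL);
            break;
        default:
            va_end(ap);
            return -1;
        }
        val = va_arg(ap, const char *);
        if (val == NULL)
            continue;           /* NULL value: skip the attribute */
        len = snprintf(buf + used, bufsz - used, " %s=%c%s%s%c",
            name, '"', pre, val, '"');
        if (len < 0 || (size_t)len >= bufsz - used) {
            va_end(ap);
            return -1;
        }
        used += (size_t)len;
    }
    va_end(ap);
    return (int)used;
}
```
.Ed
Because the arguments are passed through a variable argument list,
a caller of such a function has to pass a null pointer with an explicit
pointer type.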
232: .Pp
233: .Fn print_otag
234: uses the private function
1.1 schwarze 235: .Fn print_encode
236: to take care of HTML encoding.
237: If required by the element type, it remembers in
238: .Fa h
239: that the element is open.
240: The function
241: .Fn print_tagq
242: is used to close out all open elements up to and including
243: .Fa until ;
244: .Fn print_stagq
245: is a variant to close out all open elements up to but excluding
246: .Fa suntil .
1.21 ! schwarze 247: The function
! 248: .Fn html_close_paragraph
! 249: closes all open elements that establish phrasing context,
! 250: thus returning to the innermost flow context.
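.Pp
The difference between closing up to and including an element and
closing up to but excluding an element can be sketched with a minimal
LIFO stack in C.
All names beginning with
.Dq sketch_
are hypothetical stand-ins; the real
.Vt struct tag
has more members, and the real functions also print the end tags,
which this sketch merely counts.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for struct tag. */
struct sketch_tag {
    int tag;                    /* stands in for enum htmltag */
    struct sketch_tag *next;    /* next outer open element */
};

struct sketch_stack {
    struct sketch_tag *top;     /* innermost open element */
    int closed;                 /* number of elements closed so far */
};

/* Push a new innermost element, as opening an element would. */
struct sketch_tag *
sketch_open(struct sketch_stack *s, int tag)
{
    struct sketch_tag *t;

    assert(s != NULL);
    t = malloc(sizeof(*t));
    if (t == NULL)
        abort();
    t->tag = tag;
    t->next = s->top;
    s->top = t;
    return t;
}

/* Pop and free the innermost element. */
static void
sketch_pop(struct sketch_stack *s)
{
    struct sketch_tag *t;

    t = s->top;
    s->top = t->next;
    free(t);
    s->closed++;
}

/* Close all open elements up to and including until. */
void
sketch_close_incl(struct sketch_stack *s, const struct sketch_tag *until)
{
    int done;

    done = 0;
    while (s->top != NULL && !done) {
        done = s->top == until;
        sketch_pop(s);
    }
}

/* Close all open elements up to but excluding suntil. */
void
sketch_close_excl(struct sketch_stack *s, const struct sketch_tag *suntil)
{
    while (s->top != NULL && s->top != suntil)
        sketch_pop(s);
}
```
.Ed
In this model, a pointer returned by
.Fn sketch_open
becomes invalid as soon as the element it refers to has been closed.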
! 251: .Pp
! 252: The function
! 253: .Fn html_fillmode
! 254: switches to fill mode if
! 255: .Fa want
! 256: is
! 257: .Dv ROFF_fi
! 258: or to no-fill mode if
! 259: .Fa want
! 260: is
! 261: .Dv ROFF_nf .
! 262: Switching from fill mode to no-fill mode closes the current paragraph
! 263: and opens a
! 264: .Aq Ic PRE
! 265: element.
! 266: Switching in the opposite direction closes the
! 267: .Aq Ic PRE
! 268: element, but does not open a new paragraph.
! 269: If
! 270: .Fa want
! 271: matches the mode that is already active, no elements are closed or opened.
! 272: If
! 273: .Fa want
! 274: is
! 275: .Dv TOKEN_NONE ,
! 276: the mode remains as it is.
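.Pp
The mode switching described above amounts to a small state machine.
The following C sketch models it under simplifying assumptions:
the enumeration constants stand in for
.Dv ROFF_fi ,
.Dv ROFF_nf ,
and
.Dv TOKEN_NONE ,
the structure
.Vt struct sketch_state
is hypothetical, and opening or closing the
.Aq Ic PRE
element is only counted, not printed.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ROFF_fi, ROFF_nf, and TOKEN_NONE. */
enum sketch_tok { SK_FI, SK_NF, SK_NONE };

struct sketch_state {
    int in_pre;     /* non-zero while a PRE element is open */
    int opened;     /* PRE elements opened so far */
    int closed;     /* PRE elements closed so far */
};

/*
 * Switch to the wanted mode and return the mode that was
 * active before the call.
 */
enum sketch_tok
sketch_fillmode(struct sketch_state *st, enum sketch_tok want)
{
    enum sketch_tok had;

    assert(st != NULL);
    had = st->in_pre ? SK_NF : SK_FI;
    if (want == SK_NONE || want == had)
        return had;             /* nothing to close or open */
    if (want == SK_NF) {
        /* The real formatter closes the paragraph first. */
        st->in_pre = 1;
        st->opened++;
    } else {
        /* Leaving no-fill mode opens no new paragraph. */
        st->in_pre = 0;
        st->closed++;
    }
    return had;
}
```
.Ed
Returning the previous mode lets a caller of such a function save it
and pass it back later to restore the earlier state.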
! 277: .Pp
! 278: The function
! 279: .Fn html_setfont
! 280: selects the
! 281: .Fa font ,
! 282: which can be
! 283: .Dv ESCAPE_FONTROMAN ,
! 284: .Dv ESCAPE_FONTBOLD ,
! 285: .Dv ESCAPE_FONTITALIC ,
! 286: .Dv ESCAPE_FONTBI ,
! 287: or
! 288: .Dv ESCAPE_FONTCW ,
! 289: for future text output and internally remembers
! 290: the font that was active before the change.
! 291: If the
! 292: .Fa font
! 293: argument is
! 294: .Dv ESCAPE_FONTPREV ,
! 295: the current and the previous font are exchanged.
! 296: This function only changes the internal state of the
! 297: .Fa h
! 298: object; no HTML elements are written yet.
! 299: Subsequent text output will write font elements when needed.
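.Pp
The bookkeeping of the current and the previous font can be sketched
in a few lines of C.
The enumeration and the structure below are hypothetical stand-ins for the
.Dv ESCAPE_FONT*
constants and for the relevant part of
.Vt struct html ;
unlike
.Fn html_setfont ,
the sketch returns no value.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ESCAPE_FONT* constants. */
enum sketch_font {
    SK_ROMAN, SK_BOLD, SK_ITALIC, SK_BI, SK_CW, SK_PREV
};

struct sketch_fonts {
    enum sketch_font cur;   /* font for future text output */
    enum sketch_font prev;  /* font active before the last change */
};

/* Record the requested font; nothing is printed here. */
void
sketch_setfont(struct sketch_fonts *f, enum sketch_font font)
{
    enum sketch_font old;

    assert(f != NULL);
    old = f->cur;
    /* SK_PREV exchanges the current and the previous font. */
    f->cur = font == SK_PREV ? f->prev : font;
    f->prev = old;
}
```
.Ed
Requesting the previous font twice in a row returns to the font
that was selected before the first of the two requests.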
1.1 schwarze 300: .Pp
301: The function
302: .Fn print_text
303: prints HTML element content.
304: It uses the private function
305: .Fn print_encode
306: to take care of HTML encoding.
307: If the document has requested a non-standard font, for example using a
308: .Xr roff 7
309: .Ic \ef
310: font escape sequence,
311: .Fn print_text
312: wraps
313: .Fa word
314: in an HTML font selection element using the
315: .Fn print_otag
316: and
317: .Fn print_tagq
318: functions.
319: .Pp
1.7 schwarze 320: The function
1.21 ! schwarze 321: .Fn print_tagged_text
! 322: is a variant of
! 323: .Fn print_text
! 324: that wraps
! 325: .Fa word
! 326: in an
! 327: .Aq Ic A
! 328: element of class
! 329: .Qq permalink
! 330: if
! 331: .Fa n
! 332: is not
! 333: .Dv NULL
! 334: and yields a segment identifier when passed to
! 335: .Fn html_make_id .
! 336: .Pp
! 337: The function
1.7 schwarze 338: .Fn html_make_id
1.20 schwarze 339: allocates a string to be used for the
340: .Cm id
341: attribute of an HTML element and/or as a segment identifier for a URI in an
342: .Aq Ic A
343: element.
344: If
345: .Fa n
346: contains a
1.21 ! schwarze 347: .Fa tag
1.20 schwarze 348: attribute, it is used; otherwise, child nodes are used.
349: If
1.7 schwarze 350: .Fa n
1.20 schwarze 351: is an
352: .Ic \&Sh ,
353: .Ic \&Ss ,
354: .Ic \&Sx ,
355: .Ic SH ,
356: or
357: .Ic SS
358: node, the resulting string is the concatenation of the child strings;
359: for other node types, only the first child is used.
360: Bytes not permitted in URI-fragment strings are replaced by underscores.
361: If any of the children to be used is not a text node,
362: no string is generated and
1.7 schwarze 363: .Dv NULL
1.20 schwarze 364: is returned instead.
365: If the
366: .Fa unique
367: argument is non-zero, deduplication is performed by appending an
368: underscore and a decimal integer, if necessary.
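.Pp
The two steps of replacing unsuitable bytes and of deduplication can be
sketched in C as follows.
Both functions are hypothetical and are not taken from
.Pa html.c :
the set of bytes kept unchanged, the array of already used strings
replacing the ohash table, and the choice to start numbering at 2
are assumptions of this example.
.Bd -literal -offset indent
```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Replace every byte outside a conservative safe set by an
 * underscore.  The safe set is an assumption of this sketch.
 */
void
sketch_sanitize(char *id)
{
    assert(id != NULL);
    for (; *id != 0; id++)
        if (!isalnum((unsigned char)*id) &&
            strchr("-._~", *id) == NULL)
            *id = '_';
}

/*
 * Write a unique variant of id to out: the id itself if it is
 * not among the n strings in seen, or else the id followed by
 * an underscore and the smallest unused integer, starting at 2.
 * Returns 0 on success or -1 if out is too small.
 */
int
sketch_unique(const char *id, const char *const *seen, size_t n,
    char *out, size_t outsz)
{
    size_t i;
    int len, suffix;

    assert(id != NULL && out != NULL);
    for (suffix = 1; ; suffix++) {
        if (suffix == 1)
            len = snprintf(out, outsz, "%s", id);
        else
            len = snprintf(out, outsz, "%s_%d", id, suffix);
        if (len < 0 || (size_t)len >= outsz)
            return -1;
        /* Linear search; the real code uses a hash table. */
        for (i = 0; i < n; i++)
            if (strcmp(seen[i], out) == 0)
                break;
        if (i == n)
            return 0;
    }
}
```
.Ed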
1.7 schwarze 369: .Pp
370: The function
1.20 schwarze 371: .Fn print_otag_id
372: opens a
373: .Fa tag
374: element of class
375: .Fa cattr
376: for the node
377: .Fa n .
378: If the flag
379: .Dv NODE_ID
380: is set in
381: .Fa n ,
382: it attempts to generate an
383: .Cm id
384: attribute with
385: .Fn html_make_id .
1.21 ! schwarze 386: If the flag
! 387: .Dv NODE_HREF
! 388: is set in
! 389: .Fa n ,
! 390: an
1.20 schwarze 391: .Aq Ic A
392: element of class
1.21 ! schwarze 393: .Qq permalink
! 394: is added:
1.20 schwarze 395: outside if
396: .Fa n
1.21 ! schwarze 397: generates an element that can only occur in phrasing context,
! 398: or inside otherwise.
1.20 schwarze 399: This function is a wrapper around
400: .Fn html_make_id
401: and
402: .Fn print_otag ,
403: fixing the
404: .Fa unique
405: argument to 1 and the
406: .Fa fmt
407: arguments to
408: .Qq chR
409: and
410: .Qq ci ,
411: respectively.
1.7 schwarze 412: .Pp
1.21 ! schwarze 413: The function
! 414: .Fn print_endline
! 415: makes sure subsequent output starts on a new HTML output line.
! 416: If nothing was printed on the current output line yet, it has no effect.
! 417: Otherwise, it appends any buffered text to the current output line,
! 418: ends the line, and updates the internal state of the
! 419: .Fa h
! 420: object.
! 421: .Pp
1.1 schwarze 422: The functions
423: .Fn print_eqn ,
424: .Fn print_tbl ,
425: and
426: .Fn print_tblclose
427: are not yet documented.
1.20 schwarze 428: .Sh RETURN VALUES
429: The functions
430: .Fn print_otag
431: and
432: .Fn print_otag_id
433: return a pointer to a new element on the stack of HTML elements.
434: When
435: .Fn print_otag_id
436: opens two elements, a pointer to the outer one is returned.
437: The memory pointed to is owned by the library and is automatically
438: .Xr free 3 Ns d
439: when
440: .Fn print_tagq
441: is called on it or when
442: .Fn print_stagq
443: is called on a parent element.
444: .Pp
445: The function
1.21 ! schwarze 446: .Fn html_fillmode
! 447: returns
! 448: .Dv ROFF_fi
! 449: if fill mode was active before the call or
! 450: .Dv ROFF_nf
! 451: otherwise.
! 452: .Pp
! 453: The function
1.20 schwarze 454: .Fn html_make_id
455: returns a newly allocated string or
456: .Dv NULL
457: if
458: .Fa n
459: lacks text data to create the attribute from.
460: If the
461: .Fa unique
462: argument is 0, the caller is responsible for
463: .Xr free 3 Ns ing
464: the returned string after using it.
465: If the
466: .Fa unique
467: argument is non-zero, the
468: .Va id_unique
469: ohash table is used for de-duplication and owns the returned string.
470: In this case, it will be freed automatically by
471: .Fn html_reset
472: or
473: .Fn html_free .
474: .Pp
475: In case of
476: .Xr malloc 3
477: failure, these functions do not return but call
478: .Xr err 3 .
1.1 schwarze 479: .Sh FILES
480: .Bl -tag -width mandoc_aux.c -compact
481: .It Pa main.h
482: declarations of public functions for use by the main program,
483: not yet documented
484: .It Pa html.h
485: declarations of data types and private functions
486: for use by language-specific HTML formatters
487: .It Pa html.c
488: main HTML formatting engine and utility functions
489: .It Pa mdoc_html.c
490: .Xr mdoc 7
491: HTML formatter
492: .It Pa man_html.c
493: .Xr man 7
494: HTML formatter
495: .It Pa tbl_html.c
496: .Xr tbl 7
497: HTML formatter
498: .It Pa eqn_html.c
499: .Xr eqn 7
500: HTML formatter
1.21 ! schwarze 501: .It Pa roff_html.c
! 502: .Xr roff 7
! 503: HTML formatter, handling requests like
! 504: .Ic br ,
! 505: .Ic ce ,
! 506: .Ic fi ,
! 507: .Ic ft ,
! 508: .Ic nf ,
! 509: .Ic rj ,
! 510: and
! 511: .Ic sp .
1.1 schwarze 512: .It Pa out.h
513: declarations of data types and private functions
514: for shared use by all mandoc formatters,
515: not yet documented
516: .It Pa out.c
517: private functions for shared use by all mandoc formatters
518: .It Pa mandoc_aux.h
519: declarations of common mandoc utility functions, see
520: .Xr mandoc 3
521: .It Pa mandoc_aux.c
522: implementation of common mandoc utility functions
523: .El
524: .Sh SEE ALSO
525: .Xr mandoc 1 ,
526: .Xr mandoc 3 ,
527: .Xr man.cgi 8
528: .Sh AUTHORS
529: .An -nosplit
530: The mandoc HTML formatter was written by
531: .An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .
1.5 schwarze 532: It is maintained by
533: .An Ingo Schwarze Aq Mt schwarze@openbsd.org ,
534: who also wrote this manual.