Annotation of mandoc/mandoc_html.3, Revision 1.23
1.23 ! schwarze 1: .\" $Id: mandoc_html.3,v 1.22 2020/04/19 15:16:56 schwarze Exp $
1.1 schwarze 2: .\"
1.11 schwarze 3: .\" Copyright (c) 2014, 2017, 2018 Ingo Schwarze <schwarze@openbsd.org>
1.1 schwarze 4: .\"
5: .\" Permission to use, copy, modify, and distribute this software for any
6: .\" purpose with or without fee is hereby granted, provided that the above
7: .\" copyright notice and this permission notice appear in all copies.
8: .\"
9: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
10: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
11: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
12: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
13: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
14: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
15: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
16: .\"
1.23 ! schwarze 17: .Dd $Mdocdate: April 19 2020 $
1.1 schwarze 18: .Dt MANDOC_HTML 3
19: .Os
20: .Sh NAME
21: .Nm mandoc_html
22: .Nd internals of the mandoc HTML formatter
23: .Sh SYNOPSIS
1.21 schwarze 24: .In sys/types.h
25: .Fd #include """mandoc.h"""
26: .Fd #include """roff.h"""
27: .Fd #include """out.h"""
28: .Fd #include """html.h"""
1.1 schwarze 29: .Ft void
30: .Fn print_gen_decls "struct html *h"
31: .Ft void
1.11 schwarze 32: .Fn print_gen_comment "struct html *h" "struct roff_node *n"
33: .Ft void
1.1 schwarze 34: .Fn print_gen_head "struct html *h"
35: .Ft struct tag *
36: .Fo print_otag
37: .Fa "struct html *h"
38: .Fa "enum htmltag tag"
1.2 schwarze 39: .Fa "const char *fmt"
40: .Fa ...
1.1 schwarze 41: .Fc
42: .Ft void
43: .Fo print_tagq
44: .Fa "struct html *h"
45: .Fa "const struct tag *until"
46: .Fc
47: .Ft void
48: .Fo print_stagq
49: .Fa "struct html *h"
50: .Fa "const struct tag *suntil"
51: .Fc
52: .Ft void
1.21 schwarze 53: .Fn html_close_paragraph "struct html *h"
54: .Ft enum roff_tok
55: .Fo html_fillmode
56: .Fa "struct html *h"
57: .Fa "enum roff_tok want"
58: .Fc
59: .Ft int
60: .Fo html_setfont
61: .Fa "struct html *h"
62: .Fa "enum mandoc_esc font"
63: .Fc
64: .Ft void
1.1 schwarze 65: .Fo print_text
66: .Fa "struct html *h"
67: .Fa "const char *word"
68: .Fc
1.21 schwarze 69: .Ft void
70: .Fo print_tagged_text
71: .Fa "struct html *h"
72: .Fa "const char *word"
73: .Fa "struct roff_node *n"
74: .Fc
1.7 schwarze 75: .Ft char *
76: .Fo html_make_id
77: .Fa "const struct roff_node *n"
1.20 schwarze 78: .Fa "int unique"
1.7 schwarze 79: .Fc
1.20 schwarze 80: .Ft struct tag *
81: .Fo print_otag_id
82: .Fa "struct html *h"
83: .Fa "enum htmltag tag"
84: .Fa "const char *cattr"
85: .Fa "struct roff_node *n"
1.7 schwarze 86: .Fc
1.21 schwarze 87: .Ft void
88: .Fn print_endline "struct html *h"
1.1 schwarze 89: .Sh DESCRIPTION
90: The mandoc HTML formatter is not a formal library.
91: However, as it is compiled into more than one program, in particular
92: .Xr mandoc 1
93: and
94: .Xr man.cgi 8 ,
95: and because it may be security-critical in some contexts,
96: some documentation is useful to help use it correctly and
97: to prevent XSS vulnerabilities.
98: .Pp
99: The formatter produces HTML output on the standard output.
100: Since proper escaping is usually required and best taken care of
101: at one central place, the language-specific formatters
102: .Po
103: .Pa *_html.c ,
104: see
105: .Sx FILES
106: .Pc
107: are not supposed to print directly to
108: .Dv stdout
109: using functions like
110: .Xr printf 3 ,
111: .Xr putc 3 ,
112: .Xr puts 3 ,
113: or
114: .Xr write 2 .
115: Instead, they are expected to use the output functions declared in
116: .Pa html.h
117: and implemented as part of the main HTML formatting engine in
118: .Pa html.c .
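.Pp
The reason for routing all output through one place can be illustrated
with a minimal sketch of central escaping.
This is not the actual
.Fn print_encode
implementation; the helper name
.Fn sketch_encode
and its buffer-based interface are assumptions made for illustration only.
It replaces the four characters that are special in HTML element content
and in double-quoted attribute values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not declared in html.h:
 * copy in to out, replacing <, >, &, and " by character references.
 * Returns 0 on success or -1 if out is too small.
 */
int
sketch_encode(const char *in, char *out, size_t outsz)
{
	const char	*rep;
	size_t		 len, pos;

	assert(in != NULL && out != NULL);
	if (outsz == 0)
		return -1;
	for (pos = 0; *in != '\0'; in++) {
		switch (*in) {
		case '<':
			rep = "&lt;";
			break;
		case '>':
			rep = "&gt;";
			break;
		case '&':
			rep = "&amp;";
			break;
		case '"':
			rep = "&quot;";
			break;
		default:
			rep = NULL;	/* copy the byte unchanged */
			break;
		}
		len = rep == NULL ? 1 : strlen(rep);
		/* Keep one byte in reserve for the terminating NUL. */
		if (pos + len >= outsz)
			return -1;
		if (rep == NULL)
			out[pos] = *in;
		else
			memcpy(out + pos, rep, len);
		pos += len;
	}
	out[pos] = '\0';
	return 0;
}
```

A language-specific formatter that called
.Xr printf 3
directly would have to repeat this logic at every call site,
and a single omission could result in an XSS vulnerability.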
119: .Ss Data structures
120: These structures are declared in
121: .Pa html.h .
122: .Bl -tag -width Ds
123: .It Vt struct html
124: Internal state of the HTML formatter.
125: .It Vt struct tag
126: One entry for the LIFO stack of HTML elements.
1.21 schwarze 127: Members include
1.1 schwarze 128: .Fa "enum htmltag tag"
129: and
130: .Fa "struct tag *next" .
131: .El
132: .Ss Private interface functions
133: The function
134: .Fn print_gen_decls
135: prints the opening
136: .Aq Pf \&! Ic DOCTYPE
1.21 schwarze 137: declaration.
1.11 schwarze 138: .Pp
139: The function
140: .Fn print_gen_comment
141: prints the leading comments, usually containing a copyright notice
142: and license, as an HTML comment.
143: It is intended to be called right after opening the
144: .Aq Ic HTML
145: element.
146: Pass the first
147: .Dv ROFFT_COMMENT
148: node in
149: .Fa n .
1.1 schwarze 150: .Pp
151: The function
152: .Fn print_gen_head
153: prints the opening
154: .Aq Ic META
155: and
156: .Aq Ic LINK
157: elements for the document
158: .Aq Ic HEAD ,
159: using the
160: .Fa style
161: member of
162: .Fa h
163: unless that is
164: .Dv NULL .
165: It uses
166: .Fn print_otag
167: which takes care of properly encoding attributes,
168: which is relevant for the
169: .Fa style
170: link in particular.
171: .Pp
172: The function
173: .Fn print_otag
174: prints the start tag of an HTML element with the name
175: .Fa tag ,
1.2 schwarze 176: optionally including the attributes specified by
177: .Fa fmt .
178: If
179: .Fa fmt
180: is the empty string, no attributes are written.
181: Each letter of
182: .Fa fmt
183: specifies one attribute to write.
184: Most attributes require one
185: .Vt char *
186: argument which becomes the value of the attribute.
187: The arguments have to be given in the same order as the attribute letters.
1.5 schwarze 188: If an argument is
189: .Dv NULL ,
190: the respective attribute is not written.
1.2 schwarze 191: .Bl -tag -width 1n -offset indent
192: .It Cm c
193: Print a
194: .Cm class
195: attribute.
196: .It Cm h
197: Print a
198: .Cm href
199: attribute.
1.3 schwarze 200: This attribute letter can optionally be followed by a modifier letter.
201: If followed by
202: .Cm R ,
203: it formats the link as a local one by prefixing a
204: .Sq #
205: character.
206: If followed by
207: .Cm I ,
208: it interprets the argument as a header file name
209: and generates a link using the
210: .Xr mandoc 1
211: .Fl O Cm includes
212: option.
213: If followed by
214: .Cm M ,
215: it takes two arguments instead of one, a manual page name and
216: section, and formats them as a link to a manual page using the
217: .Xr mandoc 1
218: .Fl O Cm man
219: option.
1.2 schwarze 220: .It Cm i
221: Print an
222: .Cm id
223: attribute.
224: .It Cm \&?
225: Print an arbitrary attribute.
226: This format letter requires two
227: .Vt char *
228: arguments, the attribute name and the value.
1.5 schwarze 229: The name must not be
230: .Dv NULL .
1.23 ! schwarze 231: .It Cm s
! 232: Print a
! 233: .Cm style
! 234: attribute.
! 235: If present, it must be the last format letter.
! 236: It requires two
! 237: .Vt char *
! 238: arguments.
! 239: The first is the name of the style property, the second its value.
! 240: The name must not be
! 241: .Dv NULL .
! 242: The
! 243: .Cm s
! 244: .Fa fmt
! 245: letter can be repeated, each repetition requiring an additional pair of
! 246: .Vt char *
! 247: arguments.
1.2 schwarze 248: .El
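.Pp
How a format string drives the argument list can be sketched as follows.
This is a simplified stand-in, not the real
.Fn print_otag :
the function name
.Fn sketch_otag ,
the output buffer, and the element name passed as a string are assumptions.
Only the letters
.Cm c ,
.Cm i ,
.Cm h
with the optional modifier
.Cm R ,
and
.Cm \&?
are covered, and values are written without HTML encoding
to keep the example short.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Append formatted text to the NUL-terminated string in buf. */
static void
sketch_append(char *buf, size_t sz, const char *fmt, ...)
{
	va_list	 ap;
	size_t	 len;

	len = strlen(buf);
	if (len >= sz)
		return;
	va_start(ap, fmt);
	vsnprintf(buf + len, sz - len, fmt, ap);
	va_end(ap);
}

/*
 * Hypothetical format-driven start tag writer.
 * Each letter of fmt consumes one char * argument,
 * except for '?', which consumes a name and a value.
 * A NULL value suppresses the respective attribute.
 */
void
sketch_otag(char *buf, size_t sz, const char *name, const char *fmt, ...)
{
	va_list		 ap;
	const char	*attr, *val, *prefix;

	assert(sz > 0);
	buf[0] = '\0';
	sketch_append(buf, sz, "<%s", name);
	va_start(ap, fmt);
	while (*fmt != '\0') {
		prefix = "";
		switch (*fmt++) {
		case 'c':
			attr = "class";
			break;
		case 'i':
			attr = "id";
			break;
		case 'h':
			attr = "href";
			if (*fmt == 'R') {	/* local link */
				prefix = "#";
				fmt++;
			}
			break;
		case '?':
			attr = va_arg(ap, const char *);
			assert(attr != NULL);
			break;
		default:
			abort();	/* letter not covered by this sketch */
		}
		val = va_arg(ap, const char *);
		if (val != NULL)
			sketch_append(buf, sz, " %s=\"%s%s\"",
			    attr, prefix, val);
	}
	va_end(ap);
	sketch_append(buf, sz, ">");
}
```

With this sketch, the format string
.Qq chR
and the arguments
.Qq permalink
and
.Qq NAME
yield a start tag with a
.Cm class
attribute and a local
.Cm href
attribute, in that order.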
249: .Pp
250: .Fn print_otag
251: uses the private function
1.1 schwarze 252: .Fn print_encode
253: to take care of HTML encoding.
254: If required by the element type, it remembers in
255: .Fa h
256: that the element is open.
257: The function
258: .Fn print_tagq
259: is used to close out all open elements up to and including
260: .Fa until ;
261: .Fn print_stagq
262: is a variant to close out all open elements up to but excluding
263: .Fa suntil .
1.21 schwarze 264: The function
265: .Fn html_close_paragraph
266: closes all open elements that establish phrasing context,
267: thus returning to the innermost flow context.
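.Pp
The difference between closing up to and including an element
and closing up to but excluding it can be sketched with a
miniature LIFO stack.
All names below are hypothetical and the structures are much smaller
than the real
.Vt struct html
and
.Vt struct tag ;
writing an end tag is represented by incrementing a counter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of the element stack. */
enum sk_tag { SK_DIV, SK_P, SK_B };

struct sk_elem {
	enum sk_tag	 tag;
	struct sk_elem	*next;	/* element opened before this one */
};

struct sk_html {
	struct sk_elem	*top;		/* innermost open element */
	int		 closed;	/* end tags written so far */
};

/* Open an element: push it and return a handle for closing it later. */
struct sk_elem *
sk_open(struct sk_html *h, enum sk_tag tag)
{
	struct sk_elem	*e;

	if ((e = malloc(sizeof(*e))) == NULL)
		abort();
	e->tag = tag;
	e->next = h->top;
	h->top = e;
	return e;
}

/* Pop and free the innermost element. */
static void
sk_pop(struct sk_html *h)
{
	struct sk_elem	*e;

	e = h->top;
	assert(e != NULL);
	h->top = e->next;
	free(e);
	h->closed++;
}

/* Close all open elements up to and including until. */
void
sk_close_through(struct sk_html *h, const struct sk_elem *until)
{
	int	 last;

	while (h->top != NULL) {
		last = h->top == until;
		sk_pop(h);
		if (last)
			break;
	}
}

/* Close all open elements up to but excluding suntil. */
void
sk_close_before(struct sk_html *h, const struct sk_elem *suntil)
{
	while (h->top != NULL && h->top != suntil)
		sk_pop(h);
}
```

In this sketch, a handle becomes invalid as soon as its element is closed,
so a caller must not use it afterwards.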
268: .Pp
269: The function
270: .Fn html_fillmode
271: switches to fill mode if
272: .Fa want
273: is
274: .Dv ROFF_fi
275: or to no-fill mode if
276: .Fa want
277: is
278: .Dv ROFF_nf .
279: Switching from fill mode to no-fill mode closes the current paragraph
280: and opens a
281: .Aq Ic PRE
282: element.
283: Switching in the opposite direction closes the
284: .Aq Ic PRE
285: element, but does not open a new paragraph.
286: If
287: .Fa want
288: matches the mode that is already active, no elements are closed or opened.
289: If
290: .Fa want
291: is
292: .Dv TOKEN_NONE ,
293: the mode remains as it is.
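.Pp
The mode transitions described above can be summarized in a small
state machine.
The following is a sketch under stated assumptions, not the code in
.Pa html.c :
the enumeration constants stand in for
.Dv ROFF_fi ,
.Dv ROFF_nf ,
and
.Dv TOKEN_NONE ,
and two flags replace the real element stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for TOKEN_NONE, ROFF_fi, and ROFF_nf. */
enum sk_mode { SK_NONE, SK_FI, SK_NF };

struct sk_fill {
	enum sk_mode	 mode;		/* SK_FI or SK_NF */
	int		 para_open;	/* a paragraph is open */
	int		 pre_open;	/* a PRE element is open */
};

/* Switch modes as requested and return the mode active before the call. */
enum sk_mode
sk_fillmode(struct sk_fill *s, enum sk_mode want)
{
	enum sk_mode	 had;

	assert(s != NULL);
	had = s->mode;
	if (want == SK_NONE || want == had)
		return had;		/* nothing closed, nothing opened */
	if (want == SK_NF) {
		s->para_open = 0;	/* close the current paragraph */
		s->pre_open = 1;	/* open PRE */
	} else
		s->pre_open = 0;	/* close PRE, open no paragraph */
	s->mode = want;
	return had;
}
```

Returning the previous mode lets a caller restore it later
by passing the saved value back in.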
294: .Pp
295: The function
296: .Fn html_setfont
297: selects the
298: .Fa font ,
299: which can be
300: .Dv ESCAPE_FONTROMAN ,
301: .Dv ESCAPE_FONTBOLD ,
302: .Dv ESCAPE_FONTITALIC ,
303: .Dv ESCAPE_FONTBI ,
304: or
305: .Dv ESCAPE_FONTCW ,
306: for future text output and internally remembers
307: the font that was active before the change.
308: If the
309: .Fa font
310: argument is
311: .Dv ESCAPE_FONTPREV ,
312: the current and the previous font are exchanged.
313: This function only changes the internal state of the
314: .Fa h
315: object; no HTML elements are written yet.
316: Subsequent text output will write font elements when needed.
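.Pp
The bookkeeping of the current and the previous font can be sketched
in a few lines.
The names are hypothetical, the constants merely mirror the
.Dv ESCAPE_FONT*
values listed above, and the return value of the real function
is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ESCAPE_FONT* constants. */
enum sk_font { SK_ROMAN, SK_BOLD, SK_ITALIC, SK_BI, SK_CW, SK_PREV };

struct sk_fonts {
	enum sk_font	 cur;	/* font for future text output */
	enum sk_font	 prev;	/* font active before the last change */
};

/* Only the state changes; no output is produced here. */
void
sk_setfont(struct sk_fonts *f, enum sk_font font)
{
	enum sk_font	 old;

	assert(f != NULL);
	old = f->cur;
	/* SK_PREV exchanges the two fonts; anything else is selected. */
	f->cur = font == SK_PREV ? f->prev : font;
	f->prev = old;
}
```

Requesting the previous font twice in a row returns to the font
that was active before the first request.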
1.1 schwarze 317: .Pp
318: The function
319: .Fn print_text
320: prints HTML element content.
321: It uses the private function
322: .Fn print_encode
323: to take care of HTML encoding.
324: If the document has requested a non-standard font, for example using a
325: .Xr roff 7
326: .Ic \ef
327: font escape sequence,
328: .Fn print_text
329: wraps
330: .Fa word
331: in an HTML font selection element using the
332: .Fn print_otag
333: and
334: .Fn print_tagq
335: functions.
336: .Pp
1.7 schwarze 337: The function
1.21 schwarze 338: .Fn print_tagged_text
339: is a variant of
340: .Fn print_text
341: that wraps
342: .Fa word
343: in an
344: .Aq Ic A
345: element of class
346: .Qq permalink
347: if
348: .Fa n
349: is not
350: .Dv NULL
351: and yields a segment identifier when passed to
352: .Fn html_make_id .
353: .Pp
354: The function
1.7 schwarze 355: .Fn html_make_id
1.20 schwarze 356: allocates a string to be used for the
357: .Cm id
358: attribute of an HTML element and/or as a segment identifier for a URI in an
359: .Aq Ic A
360: element.
361: If
362: .Fa n
363: contains a
1.21 schwarze 364: .Fa tag
1.20 schwarze 365: attribute, it is used; otherwise, child nodes are used.
366: If
1.7 schwarze 367: .Fa n
1.20 schwarze 368: is an
369: .Ic \&Sh ,
370: .Ic \&Ss ,
371: .Ic \&Sx ,
372: .Ic SH ,
373: or
374: .Ic SS
375: node, the resulting string is the concatenation of the child strings;
376: for other node types, only the first child is used.
377: Bytes not permitted in URI-fragment strings are replaced by underscores.
378: If any of the children to be used is not a text node,
379: no string is generated and
1.7 schwarze 380: .Dv NULL
1.20 schwarze 381: is returned instead.
382: If the
383: .Fa unique
384: argument is non-zero, deduplication is performed by appending an
385: underscore and a decimal integer, if necessary.
1.22 schwarze 386: If the
387: .Fa unique
388: argument is 1, this is assumed to be the first call for this tag
389: at this location, typically for use by
390: .Dv NODE_ID ,
391: so the integer is incremented before use.
392: If the
393: .Fa unique
394: argument is 2, this is ssumed to be the second call for this tag
395: at this location, typically for use by
396: .Dv NODE_HREF ,
397: so the existing integer, if any, is used without incrementing it.
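.Pp
The replacement of forbidden bytes and the meaning of the
.Fa unique
argument can be sketched as follows.
This is an illustration under assumptions, not the actual
.Fn html_make_id :
it takes a plain string instead of a node, keeps seen identifiers
in a small hypothetical array, assumes that the permitted bytes are
those allowed in an RFC 3986 fragment, and assumes that the first
duplicate receives the suffix 2.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SK_MAXIDS 16

/* Hypothetical registry of identifiers already handed out. */
struct sk_ids {
	char	*id[SK_MAXIDS];
	int	 ord[SK_MAXIDS];
	int	 n;
};

/* Return a malloc'ed copy of s. */
static char *
sk_copy(const char *s)
{
	char	*cp;
	size_t	 sz;

	sz = strlen(s) + 1;
	if ((cp = malloc(sz)) == NULL)
		abort();
	memcpy(cp, s, sz);
	return cp;
}

/*
 * unique == 0: no deduplication.
 * unique == 1: first call at this location, advance the counter.
 * unique == 2: second call at this location, reuse the counter.
 * The caller has to free the returned string.
 */
char *
sk_make_id(struct sk_ids *ids, const char *text, int unique)
{
	char	*buf, *cp, *res;
	size_t	 sz;
	int	 i;

	assert(ids != NULL && text != NULL);
	buf = sk_copy(text);

	/* Assumed byte set: what RFC 3986 allows in a fragment. */
	for (cp = buf; *cp != '\0'; cp++)
		if (!isalnum((unsigned char)*cp) &&
		    strchr("!$&'()*+,-./:;=?@_~", *cp) == NULL)
			*cp = '_';
	if (unique == 0)
		return buf;

	for (i = 0; i < ids->n; i++)
		if (strcmp(ids->id[i], buf) == 0)
			break;
	if (i == ids->n) {
		if (ids->n == SK_MAXIDS)
			abort();
		ids->id[i] = sk_copy(buf);
		ids->ord[i] = 1;
		ids->n++;
	} else if (unique == 1)
		ids->ord[i]++;
	if (ids->ord[i] == 1)
		return buf;	/* first use needs no suffix */

	/* Room for the name, an underscore, the digits, and the NUL. */
	sz = strlen(buf) + 32;
	if ((res = malloc(sz)) == NULL)
		abort();
	snprintf(res, sz, "%s_%d", buf, ids->ord[i]);
	free(buf);
	return res;
}
```

Calling the sketch with 1 and then with 2 for the same text
yields the same string twice, which is what keeps an
.Cm id
attribute and the
.Cm href
pointing to it in agreement.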
1.7 schwarze 398: .Pp
399: The function
1.20 schwarze 400: .Fn print_otag_id
401: opens a
402: .Fa tag
403: element of class
404: .Fa cattr
405: for the node
406: .Fa n .
407: If the flag
408: .Dv NODE_ID
409: is set in
410: .Fa n ,
411: it attempts to generate an
412: .Cm id
413: attribute with
414: .Fn html_make_id .
1.21 schwarze 415: If the flag
416: .Dv NODE_HREF
417: is set in
418: .Fa n ,
419: an
1.20 schwarze 420: .Aq Ic A
421: element of class
1.21 schwarze 422: .Qq permalink
423: is added:
1.20 schwarze 424: outside if
425: .Fa n
1.21 schwarze 426: generates an element that can only occur in phrasing context,
427: or inside otherwise.
1.20 schwarze 428: This function is a wrapper around
429: .Fn html_make_id
430: and
431: .Fn print_otag ,
1.22 schwarze 432: automatically choosing the
1.20 schwarze 433: .Fa unique
1.22 schwarze 434: argument appropriately and setting the
1.20 schwarze 435: .Fa fmt
436: arguments to
437: .Qq chR
438: and
439: .Qq ci ,
440: respectively.
1.7 schwarze 441: .Pp
1.21 schwarze 442: The function
443: .Fn print_endline
444: makes sure subsequent output starts on a new HTML output line.
445: If nothing was printed on the current output line yet, it has no effect.
446: Otherwise, it appends any buffered text to the current output line,
447: ends the line, and updates the internal state of the
448: .Fa h
449: object.
450: .Pp
1.1 schwarze 451: The functions
452: .Fn print_eqn ,
453: .Fn print_tbl ,
454: and
455: .Fn print_tblclose
456: are not yet documented.
1.20 schwarze 457: .Sh RETURN VALUES
458: The functions
459: .Fn print_otag
460: and
461: .Fn print_otag_id
462: return a pointer to a new element on the stack of HTML elements.
463: When
464: .Fn print_otag_id
465: opens two elements, a pointer to the outer one is returned.
466: The memory pointed to is owned by the library and is automatically
467: .Xr free 3 Ns d
468: when
469: .Fn print_tagq
470: is called on it or when
471: .Fn print_stagq
472: is called on a parent element.
473: .Pp
474: The function
1.21 schwarze 475: .Fn html_fillmode
476: returns
477: .Dv ROFF_fi
478: if fill mode was active before the call or
479: .Dv ROFF_nf
480: otherwise.
481: .Pp
482: The function
1.20 schwarze 483: .Fn html_make_id
484: returns a newly allocated string or
485: .Dv NULL
486: if
487: .Fa n
488: lacks text data to create the attribute from.
1.22 schwarze 489: The caller is responsible for
1.20 schwarze 490: .Xr free 3 Ns ing
491: the returned string after using it.
492: .Pp
493: In case of
494: .Xr malloc 3
495: failure, these functions do not return but call
496: .Xr err 3 .
1.1 schwarze 497: .Sh FILES
498: .Bl -tag -width mandoc_aux.c -compact
499: .It Pa main.h
500: declarations of public functions for use by the main program,
501: not yet documented
502: .It Pa html.h
503: declarations of data types and private functions
504: for use by language-specific HTML formatters
505: .It Pa html.c
506: main HTML formatting engine and utility functions
507: .It Pa mdoc_html.c
508: .Xr mdoc 7
509: HTML formatter
510: .It Pa man_html.c
511: .Xr man 7
512: HTML formatter
513: .It Pa tbl_html.c
514: .Xr tbl 7
515: HTML formatter
516: .It Pa eqn_html.c
517: .Xr eqn 7
518: HTML formatter
1.21 schwarze 519: .It Pa roff_html.c
520: .Xr roff 7
521: HTML formatter, handling requests like
522: .Ic br ,
523: .Ic ce ,
524: .Ic fi ,
525: .Ic ft ,
526: .Ic nf ,
527: .Ic rj ,
528: and
529: .Ic sp .
1.1 schwarze 530: .It Pa out.h
531: declarations of data types and private functions
532: for shared use by all mandoc formatters,
533: not yet documented
534: .It Pa out.c
535: private functions for shared use by all mandoc formatters
536: .It Pa mandoc_aux.h
537: declarations of common mandoc utility functions, see
538: .Xr mandoc 3
539: .It Pa mandoc_aux.c
540: implementation of common mandoc utility functions
541: .El
542: .Sh SEE ALSO
543: .Xr mandoc 1 ,
544: .Xr mandoc 3 ,
545: .Xr man.cgi 8
546: .Sh AUTHORS
547: .An -nosplit
548: The mandoc HTML formatter was written by
549: .An Kristaps Dzonsons Aq Mt kristaps@bsd.lv .
1.5 schwarze 550: It is maintained by
551: .An Ingo Schwarze Aq Mt schwarze@openbsd.org ,
552: who also wrote this manual.