===================================================================
RCS file: /cvs/mandoc/mandoc_char.7,v
retrieving revision 1.62
retrieving revision 1.78
diff -u -p -r1.62 -r1.78
--- mandoc/mandoc_char.7 2015/03/30 16:06:14 1.62
+++ mandoc/mandoc_char.7 2020/10/31 11:45:16 1.78
@@ -1,8 +1,8 @@
-.\" $Id: mandoc_char.7,v 1.62 2015/03/30 16:06:14 schwarze Exp $
+.\" $Id: mandoc_char.7,v 1.78 2020/10/31 11:45:16 schwarze Exp $
 .\"
 .\" Copyright (c) 2003 Jason McIntyre
 .\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons
-.\" Copyright (c) 2011, 2013, 2015 Ingo Schwarze
+.\" Copyright (c) 2011,2013,2015,2017-2020 Ingo Schwarze
 .\"
 .\" Permission to use, copy, modify, and distribute this software for any
 .\" purpose with or without fee is hereby granted, provided that the above
@@ -16,7 +16,7 @@
 .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: March 30 2015 $
+.Dd $Mdocdate: October 31 2020 $
 .Dt MANDOC_CHAR 7
 .Os
 .Sh NAME
@@ -35,23 +35,37 @@ documents.
 .Pp
 The rendering depends on the
 .Xr mandoc 1
-output mode; in ASCII output, most characters are completely
-unintelligible.
-For that reason, using any of the special characters documented here,
-except those discussed in the
+output mode; it can be inspected by calling
+.Xr man 1
+on the
+.Nm
+manual page with different
+.Fl T
+arguments.
+In ASCII output, the rendering of some characters may be hard
+to interpret for the reader.
+Many are rendered as descriptive strings like
+.Qq ,
+.Qq ,
+or
+.Qq ,
+which may look ugly, and many are replaced by similar ASCII characters.
+In particular, accented characters are usually shown without the accent.
+For that reason, try to avoid using any of the special characters
+documented here except those discussed in the
 .Sx DESCRIPTION ,
-is strongly discouraged; they are supported merely for backwards
-compatibility with existing documents.
+unless they are essential for explaining the subject matter at hand,
+for example when documenting complicated mathematical functions.
 .Pp
 In particular, in English manual pages, do not use special-character
 escape sequences to represent national language characters in author
 names; instead, provide ASCII transcriptions of the names.
 .Ss Dashes and Hyphens
 In typography there are different types of dashes of various width:
-the hyphen (-),
-the minus sign (\-),
+the hyphen (\(hy),
 the en-dash (\(en),
-and the em-dash (\(em).
+the em-dash (\(em),
+and the mathematical minus sign (\(mi).
 .Pp
 Hyphens are used for adjectives;
 to separate the two parts of a compound word;
@@ -62,14 +76,6 @@ blue-eyed
 lorry-driver
 .Ed
 .Pp
-The mathematical minus sign is used for negative numbers or subtraction.
-It should be written as
-.Sq \e- :
-.Bd -unfilled -offset indent
-a = 3 \e- 1;
-b = \e-2;
-.Ed
-.Pp
 The en-dash is used to separate the two elements of a range,
 or can be used the same way as an em-dash.
 It should be written as
@@ -88,10 +94,47 @@ Three things \e(em apples, oranges, and bananas.
 This is not that \e(em rather, this is that.
 .Ed
 .Pp
-Note:
-hyphens, minus signs, and en-dashes look identical under normal ASCII output.
-Other formats, such as PostScript, render them correctly,
-with differing widths.
+In
+.Xr roff 7
+documents, the minus sign is normally written as
+.Sq \e- .
+In manual pages, some style guides recommend to also use
+.Sq \e-
+if an ASCII 0x2d
+.Dq hyphen-minus
+output glyph that can be copied and pasted is desired in output modes
+supporting it, for example in
+.Fl T Cm utf8
+and
+.Fl T Cm html .
+But currently, no practically relevant manual page formatter requires
+that subtlety, so in manual pages, it is sufficient to write plain
+.Sq -
+to represent hyphen, minus, and hyphen-minus.
+.Pp +If a word on a text input line contains a hyphen, a formatter may decide +to insert an output line break after the hyphen if that helps filling +the current output line, but the whole word would overflow the line. +If it is important that the word is not broken across lines in this +way, a zero-width space +.Pq Sq \e& +can be inserted before or after the hyphen. +While +.Xr mandoc 1 +never breaks the output line after hyphens adjacent to a zero-width +space, after any of the other dash- or hyphen-like characters +represented by escape sequences, or after hyphens inside words in +macro arguments, other software may not respect these rules and may +break the line even in such cases. +.Pp +Some +.Xr roff 7 +implementations contains dictionaries allowing to break the line +at syllable boundaries even inside words that contain no hyphens. +Such automatic hyphenation is not supported by +.Xr mandoc 1 , +which only breaks the line at whitespace, and inside words only +after existing hyphens. .Ss Spaces To separate words in normal text, for indenting and alignment in literal context, and when none of the following special cases apply, @@ -145,6 +188,8 @@ even on request and macro lines. 
 .Ss Accents
 In output modes supporting such special output characters, for example
 .Fl T Cm pdf ,
+and sometimes less consistently in
+.Fl T Cm utf8 ,
 some
 .Xr roff 7
 formatters convert the following ASCII input characters to the
@@ -153,6 +198,7 @@ following Unicode special output characters:
 .It \(ga Ta U+2018 Ta left single quotation mark
 .It \(aq Ta U+2019 Ta right single quotation mark
 .It \(ti Ta U+02DC Ta small tilde
+.It \(ha Ta U+02C6 Ta modifier letter circumflex
 .El
 .Pp
 In prose, this automatic substitution is often desirable;
@@ -163,6 +209,7 @@ escaping to render as follows:
 .It \e(ga Ta U+0060 Ta grave accent
 .It \e(aq Ta U+0027 Ta apostrophe
 .It \e(ti Ta U+007E Ta tilde
+.It \e(ha Ta U+005E Ta circumflex accent
 .El
 .Ss Periods
 The period
@@ -214,16 +261,18 @@ subsection of the
 .Xr roff 7
 manual.
 .Pp
-Spacing:
+Spaces, non-breaking unless stated otherwise:
 .Bl -column "Input" "Description" -offset indent -compact
 .It Em Input Ta Em Description
-.It Sq \e\ \& Ta unpaddable non-breaking space
-.It \e\(ti Ta paddable non-breaking space
-.It \e0 Ta unpaddable, breaking digit-width space
+.It Sq \e\ \& Ta unpaddable space
+.It \e\(ti Ta paddable space
+.It \e0 Ta digit-width space
 .It \e| Ta one-sixth \e(em narrow space, zero width in nroff mode
 .It \e^ Ta one-twelfth \e(em half-narrow space, zero width in nroff
 .It \e& Ta zero-width space
+.It \e) Ta zero-width space transparent to end-of-sentence detection
 .It \e% Ta zero-width space allowing hyphenation
+.It \e: Ta zero-width space allowing line break
 .El
 .Pp
 Lines:
@@ -232,6 +281,7 @@ Lines:
 .It \e(ba Ta \(ba Ta bar
 .It \e(br Ta \(br Ta box rule
 .It \e(ul Ta \(ul Ta underscore
+.It \e(ru Ta \(ru Ta underscore (width 0.5m)
 .It \e(rn Ta \(rn Ta overline
 .It \e(bb Ta \(bb Ta broken bar
 .It \e(sl Ta \(sl Ta forward slash
@@ -255,6 +305,10 @@ Text markers:
 .It \e(sh Ta \(sh Ta hash (pound)
 .It \e(CR Ta \(CR Ta carriage return
 .It \e(OK Ta \(OK Ta check mark
+.It \e(CL Ta \(CL Ta club suit
+.It \e(SP Ta \(SP Ta spade suit
+.It \e(HE Ta \(HE Ta heart suit
+.It \e(DI Ta \(DI Ta diamond suit
 .El
 .Pp
 Legal symbols:
@@ -286,8 +340,8 @@ Quotes:
 .It \e(rq Ta \(rq Ta right double-quote
 .It \e(oq Ta \(oq Ta left single-quote
 .It \e(cq Ta \(cq Ta right single-quote
-.It \e(aq Ta \(aq Ta apostrophe quote (text)
-.It \e(dq Ta \(dq Ta double quote (text)
+.It \e(aq Ta \(aq Ta apostrophe quote (ASCII character)
+.It \e(dq Ta \(dq Ta double quote (ASCII character)
 .It \e(Fo Ta \(Fo Ta left guillemet
 .It \e(Fc Ta \(Fc Ta right guillemet
 .It \e(fo Ta \(fo Ta left single guillemet
@@ -303,7 +357,7 @@ Brackets:
 .It \e(rC Ta \(rC Ta right brace
 .It \e(la Ta \(la Ta left angle
 .It \e(ra Ta \(ra Ta right angle
-.It \e(bv Ta \(bv Ta brace extension
+.It \e(bv Ta \(bv Ta brace extension (special font)
 .It \e[braceex] Ta \[braceex] Ta brace extension
 .It \e[bracketlefttp] Ta \[bracketlefttp] Ta top-left hooked bracket
 .It \e[bracketleftbt] Ta \[bracketleftbt] Ta bottom-left hooked bracket
@@ -348,6 +402,7 @@ Arrows:
 .It \e(uA Ta \(uA Ta up double-arrow
 .It \e(dA Ta \(dA Ta down double-arrow
 .It \e(vA Ta \(vA Ta up-down double-arrow
+.It \e(an Ta \(an Ta horizontal arrow extension
 .El
 .Pp
 Logical:
@@ -355,8 +410,8 @@ Logical:
 .It Em Input Ta Em Rendered Ta Em Description
 .It \e(AN Ta \(AN Ta logical and
 .It \e(OR Ta \(OR Ta logical or
-.It \e(no Ta \(no Ta logical not
-.It \e[tno] Ta \[tno] Ta logical not (text)
+.It \e[tno] Ta \[tno] Ta logical not (text font)
+.It \e(no Ta \(no Ta logical not (special font)
 .It \e(te Ta \(te Ta existential quantifier
 .It \e(fa Ta \(fa Ta universal quantifier
 .It \e(st Ta \(st Ta such that
@@ -368,19 +423,20 @@ Logical:
 Mathematical:
 .Bl -column "xxcoproductxx" "Rendered" "Description" -offset indent -compact
 .It Em Input Ta Em Rendered Ta Em Description
-.It \e(pl Ta \(pl Ta plus
-.It \e(mi Ta \(mi Ta minus
-.It \e- Ta \- Ta minus (text)
+.It \e- Ta \- Ta minus (text font)
+.It \e(mi Ta \(mi Ta minus (special font)
+.It + Ta + Ta plus (text font)
+.It \e(pl Ta \(pl Ta plus (special font)
 .It \e(-+ Ta \(-+ Ta minus-plus
-.It \e(+- Ta \(+- Ta plus-minus
-.It \e[t+-] Ta \[t+-] Ta plus-minus (text)
+.It \e[t+-] Ta \[t+-] Ta plus-minus (text font)
+.It \e(+- Ta \(+- Ta plus-minus (special font)
 .It \e(pc Ta \(pc Ta center-dot
-.It \e(mu Ta \(mu Ta multiply
-.It \e[tmu] Ta \[tmu] Ta multiply (text)
+.It \e[tmu] Ta \[tmu] Ta multiply (text font)
+.It \e(mu Ta \(mu Ta multiply (special font)
 .It \e(c* Ta \(c* Ta circle-multiply
 .It \e(c+ Ta \(c+ Ta circle-plus
-.It \e(di Ta \(di Ta divide
-.It \e[tdi] Ta \[tdi] Ta divide (text)
+.It \e[tdi] Ta \[tdi] Ta divide (text font)
+.It \e(di Ta \(di Ta divide (special font)
 .It \e(f/ Ta \(f/ Ta fraction
 .It \e(** Ta \(** Ta asterisk
 .It \e(<= Ta \(<= Ta less-than-equal
@@ -426,11 +482,20 @@ Mathematical:
 .It \e(Ah Ta \(Ah Ta aleph
 .It \e(Im Ta \(Im Ta imaginary
 .It \e(Re Ta \(Re Ta real
+.It \e(wp Ta \(wp Ta Weierstrass p
 .It \e(pd Ta \(pd Ta partial differential
 .It \e(-h Ta \(-h Ta Planck constant over 2\(*p
-.It \e[12] Ta \[12] Ta one-half
-.It \e[14] Ta \[14] Ta one-fourth
-.It \e[34] Ta \[34] Ta three-fourths
+.It \e[hbar] Ta \[hbar] Ta Planck constant over 2\(*p
+.It \e(12 Ta \(12 Ta one-half
+.It \e(14 Ta \(14 Ta one-fourth
+.It \e(34 Ta \(34 Ta three-fourths
+.It \e(18 Ta \(18 Ta one-eighth
+.It \e(38 Ta \(38 Ta three-eighths
+.It \e(58 Ta \(58 Ta five-eighths
+.It \e(78 Ta \(78 Ta seven-eighths
+.It \e(S1 Ta \(S1 Ta superscript 1
+.It \e(S2 Ta \(S2 Ta superscript 2
+.It \e(S3 Ta \(S3 Ta superscript 3
 .El
 .Pp
 Ligatures:
@@ -468,8 +533,8 @@ Accents:
 .It \e(ao Ta \(ao Ta ring
 .It \e(a\(ti Ta \(a~ Ta tilde
 .It \e(ho Ta \(ho Ta ogonek
-.It \e(ha Ta \(ha Ta hat (text)
-.It \e(ti Ta \(ti Ta tilde (text)
+.It \e(ha Ta \(ha Ta hat (ASCII character)
+.It \e(ti Ta \(ti Ta tilde (ASCII character)
 .El
 .Pp
 Accented letters:
@@ -480,11 +545,13 @@ Accented letters:
 .It \e(\(aqI Ta \('I Ta acute I
 .It \e(\(aqO Ta \('O Ta acute O
 .It \e(\(aqU Ta \('U Ta acute U
+.It \e(\(aqY Ta \('Y Ta acute Y
 .It \e(\(aqa Ta \('a Ta acute a
 .It \e(\(aqe Ta \('e Ta acute e
 .It \e(\(aqi Ta \('i Ta acute i
 .It \e(\(aqo Ta \('o Ta acute o
 .It \e(\(aqu Ta \('u Ta acute u
+.It \e(\(aqy Ta \('y Ta acute y
 .It \e(\(gaA Ta \(`A Ta grave A
 .It \e(\(gaE Ta \(`E Ta grave E
 .It \e(\(gaI Ta \(`I Ta grave I
@@ -564,6 +631,8 @@ Units:
 .It \e(fm Ta \(fm Ta minute
 .It \e(sd Ta \(sd Ta second
 .It \e(mc Ta \(mc Ta micro
+.It \e(Of Ta \(Of Ta Spanish female ordinal
+.It \e(Om Ta \(Om Ta Spanish masculine ordinal
 .El
 .Pp
 Greek letters:
@@ -640,11 +709,6 @@ Their syntax is similar to special characters, using
 and
 .Sq \e*[N]
 .Pq N-character .
-For details, see the
-.Em Predefined Strings
-subsection of the
-.Xr roff 7
-manual.
 .Bl -column "Input" "Rendered" "Description" -offset indent
 .It Em Input Ta Em Rendered Ta Em Description
 .It \e*(Ba Ta \*(Ba Ta vertical bar
@@ -696,14 +760,16 @@ For backward compatibility with existing manuals,
 .Xr mandoc 1
 also supports the
 .Pp
-.Dl \eN\(aq Ns Ar number Ns \(aq
+.Dl \eN\(aq Ns Ar number Ns \(aq and \e[ Ns Cm char Ns Ar number ]
 .Pp
-escape sequence, inserting the character
+escape sequences, inserting the character
 .Ar number
 from the current character set into the output.
 Of course, this is inherently non-portable and is already marked
-as deprecated in the Heirloom roff manual.
-For example, do not use \eN\(aq34\(aq, use \e(dq, or even the plain
+as deprecated in the Heirloom roff manual;
+on top of that, the second form is a GNU extension.
+For example, do not use \eN\(aq34\(aq or \e[char34], use \e(dq,
+or even the plain
 .Sq \(dq
 character where possible.
 .Sh COMPATIBILITY
@@ -720,13 +786,11 @@ In
 .Fl T Ns Cm ascii ,
 the \e(ss, \e(nm, \e(nb, \e(nc, \e(ib, \e(ip, \e(pp, \e[sum], \e[product],
-\e[coproduct], \e(gr, \e(\-h, and \e(a. special characters render
+\e[coproduct], \e(gr, \e(-h, and \e(a. special characters render
 differently between mandoc and groff.
 .It
 In
-.Fl T Ns Cm html
-and
-.Fl T Ns Cm xhtml ,
+.Fl T Ns Cm html ,
 the \e(\(ti=, \e(nb, and \e(nc special characters render
 differently between mandoc and groff.
 .It