Annotation of mandoc/mandoc_escape.3, Revision 1.1

1.1     ! schwarze    1: .\"    $Id$
        !             2: .\"
        !             3: .\" Copyright (c) 2014 Ingo Schwarze <schwarze@openbsd.org>
        !             4: .\"
        !             5: .\" Permission to use, copy, modify, and distribute this software for any
        !             6: .\" purpose with or without fee is hereby granted, provided that the above
        !             7: .\" copyright notice and this permission notice appear in all copies.
        !             8: .\"
        !             9: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
        !            10: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
        !            11: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
        !            12: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
        !            13: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
        !            14: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
        !            15: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
        !            16: .\"
        !            17: .Dd $Mdocdate$
        !            18: .Dt MANDOC_ESCAPE 3
        !            19: .Os
        !            20: .Sh NAME
        !            21: .Nm mandoc_escape
        !            22: .Nd parse roff escape sequences
        !            23: .Sh LIBRARY
        !            24: .Lb libmandoc
        !            25: .Sh SYNOPSIS
        !            26: .In sys/types.h
        !            27: .In mandoc.h
        !            28: .Ft "enum mandoc_esc"
        !            29: .Fo mandoc_escape
        !            30: .Fa "const char **end"
        !            31: .Fa "const char **start"
        !            32: .Fa "int *sz"
        !            33: .Fc
        !            34: .Sh DESCRIPTION
        !            35: This function scans a
        !            36: .Xr roff 7
        !            37: escape sequence.
        !            38: .Pp
        !            39: An escape sequence consists of
        !            40: .Bl -dash -compact -width 2n
        !            41: .It
        !            42: an initial backslash character
        !            43: .Pq Sq \e ,
        !            44: .It
        !            45: a single ASCII character called the escape sequence identifier,
        !            46: .It
        !            47: and, with only a few exceptions, an argument.
        !            48: .El
        !            49: .Pp
        !            50: Arguments can be given in the following forms; some escape sequence
        !            51: identifiers only accept some of these forms as specified below.
        !            52: The first three forms are called the standard forms.
        !            53: .Bl -tag -width 2n
        !            54: .It \&In brackets: Ic \&[ Ns Ar argument Ns Ic \&]
        !            55: The argument starts after the initial
        !            56: .Sq \&[ ,
        !            57: ends before the final
        !            58: .Sq \&] ,
        !            59: and the escape sequence ends with the final
        !            60: .Sq \&] .
        !            61: .It Two-character argument short form: Ic \&( Ns Ar ar
        !            62: This form can only be used for arguments
        !            63: consisting of exactly two characters.
        !            64: It has the same effect as
        !            65: .Ic \&[ Ns Ar ar Ns Ic \&] .
        !            66: .It One-character argument short form: Ar a
        !            67: This form can only be used for arguments
        !            68: consisting of exactly one character.
        !            69: It has the same effect as
        !            70: .Ic \&[ Ns Ar a Ns Ic \&] .
        !            71: .It Delimited form: Ar C Ns Ar argument Ns Ar C
        !            72: The argument starts after the initial delimiter character
        !            73: .Ar C ,
        !            74: ends before the next occurrence of the delimiter character
        !            75: .Ar C ,
        !            76: and the escape sequence ends with that second
        !            77: .Ar C .
        !            78: Some escape sequences allow arbitrary characters
        !            79: .Ar C
        !            80: as quoting characters, some restrict the range of characters
        !            81: that can be used as quoting characters.
        !            82: .El
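The four argument forms listed above can be illustrated with a small scanner.
This is a minimal sketch, not the code in mandoc.c: the helper scan_arg(), the
enum arg_form, and the delim_ok flag are hypothetical names introduced here,
and the caller is assumed to know whether the identifier takes the delimited
form or one of the three standard forms.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for illustration only; not part of libmandoc. */
enum arg_form { FORM_BRACKET, FORM_TWO, FORM_ONE, FORM_DELIM, FORM_BAD };

/*
 * Scan one argument starting at p, the character after the identifier.
 * If delim_ok is nonzero, the first character is taken as the delimiter C;
 * otherwise the three standard forms are tried in turn.
 * On success, *start and *sz describe the argument and *end points to the
 * character after the end of the escape sequence.
 */
static enum arg_form
scan_arg(const char *p, int delim_ok,
    const char **start, size_t *sz, const char **end)
{
	const char *q;

	assert(p != NULL);
	if (*p == '\0')
		return FORM_BAD;

	if (delim_ok) {			/* C argument C */
		if ((q = strchr(p + 1, *p)) == NULL)
			return FORM_BAD;
		*start = p + 1;
		*sz = (size_t)(q - *start);
		*end = q + 1;
		return FORM_DELIM;
	}
	if (*p == '[') {		/* [argument] */
		if ((q = strchr(p + 1, ']')) == NULL)
			return FORM_BAD;
		*start = p + 1;
		*sz = (size_t)(q - *start);
		*end = q + 1;
		return FORM_BRACKET;
	}
	if (*p == '(') {		/* (ar */
		if (p[1] == '\0' || p[2] == '\0')
			return FORM_BAD;
		*start = p + 1;
		*sz = 2;
		*end = p + 3;
		return FORM_TWO;
	}
	*start = p;			/* a */
	*sz = 1;
	*end = p + 1;
	return FORM_ONE;
}
```

For instance, scanning the text "[hello]x" in standard form yields the
five-character argument "hello" and leaves the end pointer on the "x".
The sketch does not handle escape sequences nested inside arguments.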
        !            83: .Pp
        !            84: Upon function entry,
        !            85: .Fa end
        !            86: is expected to point to the escape sequence identifier.
        !            87: The values passed in as
        !            88: .Fa start
        !            89: and
        !            90: .Fa sz
        !            91: are ignored and overwritten.
        !            92: .Pp
        !            93: By design, this function cannot handle those
        !            94: .Xr roff 7
        !            95: escape sequences that require in-place expansion, in particular
        !            96: user-defined strings
        !            97: .Ic \e* ,
        !            98: number registers
        !            99: .Ic \en ,
        !           100: width measurements
        !           101: .Ic \ew ,
        !           102: and numerical expression control
        !           103: .Ic \eB .
        !           104: These are handled by
        !           105: .Fn roff_res ,
        !           106: a private preprocessor function called from
        !           107: .Fn roff_parseln ,
        !           108: see the file
        !           109: .Pa roff.c .
        !           110: .Pp
        !           111: The function
        !           112: .Fn mandoc_escape
        !           113: is used
        !           114: .Bl -dash -compact -width 2n
        !           115: .It
        !           116: recursively by itself, because some escape sequence arguments can
        !           117: in turn contain other escape sequences,
        !           118: .It
        !           119: for error detection internally by the
        !           120: .Xr roff 7
        !           121: parser part of the
        !           122: .Lb libmandoc ,
        !           123: see the file
        !           124: .Pa roff.c ,
        !           125: .It
        !           126: most prominently, externally by the
        !           127: .Xr mandoc 1
        !           128: formatting modules, in particular
        !           129: .Fl Tascii
        !           130: and
        !           131: .Fl Thtml ,
        !           132: for formatting purposes, see the files
        !           133: .Pa term.c
        !           134: and
        !           135: .Pa html.c ,
        !           136: .It
        !           137: and rarely externally by high-level utilities using the mandoc library,
        !           138: for example
        !           139: .Xr makewhatis 8 ,
        !           140: to purge escape sequences from text.
        !           141: .El
        !           142: .Sh RETURN VALUES
        !           143: Upon function return, the pointer
        !           144: .Fa end
        !           145: is set to the character after the end of the escape sequence,
        !           146: such that the calling higher-level parser can easily continue.
        !           147: .Pp
        !           148: For escape sequences taking an argument, the pointer
        !           149: .Fa start
        !           150: is set to the beginning of the argument and
        !           151: .Fa sz
        !           152: is set to the length of the argument.
        !           153: For escape sequences not taking an argument,
        !           154: .Fa start
        !           155: is set to the character after the end of the sequence and
        !           156: .Fa sz
        !           157: is set to 0.
        !           158: Both
        !           159: .Fa start
        !           160: and
        !           161: .Fa sz
        !           162: may be
        !           163: .Dv NULL ;
        !           164: in that case, the argument and the length are not returned.
        !           165: .Pp
        !           166: For sequences taking an argument, the function
        !           167: .Fn mandoc_escape
        !           168: returns one of the following values:
        !           169: .Bl -tag -width 2n
        !           170: .It Dv ESCAPE_FONT
        !           171: The escape sequence
        !           172: .Ic \ef
        !           173: taking an argument in standard form:
        !           174: .Ic \ef[ , \ef( , \ef Ns Ar a .
        !           175: Two-character arguments starting with the character
        !           176: .Sq C
        !           177: are reduced to one-character arguments by skipping the
        !           178: .Sq C .
        !           179: More specific values are returned for the most commonly used arguments:
        !           180: .Bl -column "argument" "ESCAPE_FONTITALIC"
        !           181: .It argument Ta return value
        !           182: .It Cm R No or Cm 1 Ta Dv ESCAPE_FONTROMAN
        !           183: .It Cm I No or Cm 2 Ta Dv ESCAPE_FONTITALIC
        !           184: .It Cm B No or Cm 3 Ta Dv ESCAPE_FONTBOLD
        !           185: .It Cm P Ta Dv ESCAPE_FONTPREV
        !           186: .It Cm BI Ta Dv ESCAPE_FONTBI
        !           187: .El
        !           188: .It Dv ESCAPE_SPECIAL
        !           189: The escape sequence
        !           190: .Ic \eC
        !           191: taking an argument delimited with the single quote character
        !           192: and, as a special exception, the escape sequences
        !           193: .Em not
        !           194: having an identifier, that is, those where the argument, in standard
        !           195: form, directly follows the initial backslash:
        !           196: .Ic \eC' , \e[ , \e( , \e Ns Ar a .
        !           197: Note that the one-character argument short form can only be used for
        !           198: argument characters that do not clash with escape sequence identifiers.
        !           199: .Pp
        !           200: If the argument consists of more than one character
        !           201: and starts with the character
        !           202: .Sq u ,
        !           203: .Dv ESCAPE_UNICODE
        !           204: is returned as described below.
        !           205: If the argument is just the single character
        !           206: .Sq u ,
        !           207: .Dv ESCAPE_ERROR
        !           208: is returned.
        !           209: .Pp
        !           210: The
        !           211: .Dv ESCAPE_SPECIAL
        !           212: special character escape sequences can be rendered using the functions
        !           213: .Fn mchars_spec2cp
        !           214: and
        !           215: .Fn mchars_spec2str
        !           216: described in the
        !           217: .Xr mchars_alloc 3
        !           218: manual.
        !           219: .It Dv ESCAPE_UNICODE
        !           220: Escape sequences of the same format as described above under
        !           221: .Dv ESCAPE_SPECIAL ,
        !           222: but with an argument starting with the character
        !           223: .Sq u :
        !           224: .Ic \eC'u , \e[u .
        !           225: As a special exception,
        !           226: .Fa start
        !           227: is set to the character after the
        !           228: .Sq u ,
        !           229: and the
        !           230: .Fa sz
        !           231: return value does not include the
        !           232: .Sq u
        !           233: either.
        !           234: .Pp
        !           235: Such Unicode character escape sequences can be rendered using the function
        !           236: .Fn mchars_num2uc
        !           237: described in the
        !           238: .Xr mchars_alloc 3
        !           239: manual.
        !           240: .It Dv ESCAPE_NUMBERED
        !           241: The escape sequence
        !           242: .Ic \eN
        !           243: followed by a delimited argument.
        !           244: The delimiter character is arbitrary except that digits cannot be used.
        !           245: If a digit is encountered instead of the opening delimiter, that
        !           246: digit is considered to be the argument and the end of the sequence, and
        !           247: .Dv ESCAPE_IGNORE
        !           248: is returned.
        !           249: .Pp
        !           250: Such ASCII character escape sequences can be rendered using the function
        !           251: .Fn mchars_num2char
        !           252: described in the
        !           253: .Xr mchars_alloc 3
        !           254: manual.
        !           255: .It Dv ESCAPE_IGNORE
        !           256: .Bl -bullet -width 2n
        !           257: .It
        !           258: The escape sequence
        !           259: .Ic \es
        !           260: followed by an argument in standard form or by an argument delimited
        !           261: by the single quote character:
        !           262: .Ic \es' , \es[ , \es( , \es Ns Ar a .
        !           263: As a special exception, an optional
        !           264: .Sq +
        !           265: or
        !           266: .Sq \-
        !           267: character is allowed after the
        !           268: .Sq s
        !           269: for all forms.
        !           270: .It
        !           271: The escape sequences
        !           272: .Ic \eF ,
        !           273: .Ic \eg ,
        !           274: .Ic \ek ,
        !           275: .Ic \eM ,
        !           276: .Ic \em ,
        !           277: .Ic \en ,
        !           278: .Ic \eV ,
        !           279: and
        !           280: .Ic \eY
        !           281: followed by an argument in standard form.
        !           282: .It
        !           283: The escape sequences
        !           284: .Ic \eA ,
        !           285: .Ic \eb ,
        !           286: .Ic \eD ,
        !           287: .Ic \eo ,
        !           288: .Ic \eR ,
        !           289: .Ic \eX ,
        !           290: and
        !           291: .Ic \eZ
        !           292: followed by an argument delimited by an arbitrary character.
        !           293: .It
        !           294: The escape sequences
        !           295: .Ic \eH ,
        !           296: .Ic \eh ,
        !           297: .Ic \eL ,
        !           298: .Ic \el ,
        !           299: .Ic \eS ,
        !           300: .Ic \ev ,
        !           301: and
        !           302: .Ic \ex
        !           303: followed by an argument delimited by a character that cannot occur
        !           304: in numerical expressions.
        !           305: However, if any character that can occur in numerical expressions
        !           306: is found instead of a delimiter, the sequence is considered to end
        !           307: with that character, and
        !           308: .Dv ESCAPE_ERROR
        !           309: is returned.
        !           310: .El
        !           311: .It Dv ESCAPE_ERROR
        !           312: Escape sequences taking an argument but not matching any of the above patterns.
        !           313: In particular, that happens if the end of the logical input line
        !           314: is reached before the end of the argument.
        !           315: .El
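The font argument table given under ESCAPE_FONT above can be read as a small
classification routine. The following is a sketch under stated assumptions:
classify_font() and enum font_class are hypothetical names, chosen so as not
to be confused with the real enum mandoc_esc values, and the function only
restates the documented table and the documented rule about a leading "C".

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the documented return values:
 * FONT_OTHER ~ ESCAPE_FONT, FONT_ROMAN ~ ESCAPE_FONTROMAN, and so on.
 */
enum font_class {
	FONT_OTHER, FONT_ROMAN, FONT_ITALIC, FONT_BOLD, FONT_PREV, FONT_BI
};

/* Classify a font argument of length sz according to the table. */
static enum font_class
classify_font(const char *arg, size_t sz)
{
	assert(arg != NULL);

	if (sz == 2) {
		if (arg[0] == 'B' && arg[1] == 'I')
			return FONT_BI;
		if (arg[0] != 'C')
			return FONT_OTHER;
		/* Two-character argument starting with C: skip the C. */
		arg++;
		sz--;
	}
	if (sz != 1)
		return FONT_OTHER;

	switch (arg[0]) {
	case 'R': case '1':
		return FONT_ROMAN;
	case 'I': case '2':
		return FONT_ITALIC;
	case 'B': case '3':
		return FONT_BOLD;
	case 'P':
		return FONT_PREV;
	default:
		return FONT_OTHER;
	}
}
```

With this reading, the arguments "B", "3", and "CB" all map to the bold class,
while an argument such as "CW" falls through to the generic font value.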
        !           316: .Pp
        !           317: For sequences that do not take an argument, the function
        !           318: .Fn mandoc_escape
        !           319: returns one of the following values:
        !           320: .Bl -tag -width 2n
        !           321: .It Dv ESCAPE_SKIPCHAR
        !           322: The escape sequence
        !           323: .Qq \ez .
        !           324: .It Dv ESCAPE_NOSPACE
        !           325: The escape sequence
        !           326: .Qq \ec .
        !           327: .It Dv ESCAPE_IGNORE
        !           328: The escape sequences
        !           329: .Qq \ed
        !           330: and
        !           331: .Qq \eu .
        !           332: .El
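The rule separating ESCAPE_SPECIAL, ESCAPE_UNICODE, and ESCAPE_ERROR for
special character arguments, described earlier in this section, may be easier
to follow as code. This is only a sketch: classify_special() and
enum special_class are hypothetical names, and reporting an empty argument as
an error is a choice made for this sketch, since the text above does not cover
that case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ESCAPE_SPECIAL, ESCAPE_UNICODE, ESCAPE_ERROR. */
enum special_class { SPECIAL_NAMED, SPECIAL_UNICODE, SPECIAL_ERROR };

/*
 * Classify a special character argument.
 * For Unicode arguments, *start and *sz are adjusted to skip the
 * leading 'u', mirroring the documented special exception.
 */
static enum special_class
classify_special(const char **start, size_t *sz)
{
	assert(start != NULL && *start != NULL && sz != NULL);

	if (*sz == 0)
		return SPECIAL_ERROR;	/* assumption of this sketch */
	if (**start != 'u')
		return SPECIAL_NAMED;
	if (*sz == 1)
		return SPECIAL_ERROR;	/* just the single character u */

	(*start)++;			/* argument begins after the u */
	(*sz)--;			/* and the length excludes it */
	return SPECIAL_UNICODE;
}
```

For the argument "u2014" of length 5, the sketch reports a Unicode sequence
and leaves start pointing at "2014" with a length of 4.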
        !           333: .Sh FILES
        !           334: This function is implemented in
        !           335: .Pa mandoc.c .
        !           336: .Sh SEE ALSO
        !           337: .Xr mchars_alloc 3 ,
        !           338: .Xr mandoc_char 7 ,
        !           339: .Xr roff 7
        !           340: .Sh HISTORY
        !           341: This function has been available since mandoc 1.11.2.
        !           342: .Sh AUTHORS
        !           343: .An Kristaps Dzonsons Aq Mt kristaps@bsd.lv
        !           344: .An Ingo Schwarze Aq Mt schwarze@openbsd.org
        !           345: .Sh BUGS
        !           346: The function doesn't cleanly distinguish between sequences that are
        !           347: valid and supported, valid and ignored, valid and unsupported,
        !           348: syntactically invalid, or undefined.
        !           349: For sequences that are ignored or unsupported, it doesn't tell
        !           350: whether that deficiency is likely to cause major formatting problems
        !           351: and/or loss of document content.
        !           352: The function is already rather complicated and still parses some
        !           353: sequences incorrectly.
        !           354: .
        !           355: .ig
        !           356: For these sequences, the list given below specifies a starting string
        !           357: and either the length of the argument or an ending character.
        !           358: The argument starts after the starting string.
        !           359: In the former case, the sequence ends with the end of the argument.
        !           360: In the latter case, the argument ends before the ending character,
        !           361: and the sequence ends with the ending character.
        !           362: ..