CVS log for mandoc/mandoc.c

Default branch: MAIN
Current tag: MAIN


Revision 1.121, Thu May 19 15:37:47 2022 UTC (22 months, 1 week ago) by schwarze
Branch: MAIN
CVS Tags: HEAD
Changes since 1.120: +9 -385 lines

Make roff_expand() parse left-to-right rather than right-to-left.
Some escape sequences have side effects on global state, implying
that the order of evaluation matters.  For example, this fixes the
long-standing bug that "\n+x\n+x\n+x" after ".nr x 0 1" used to
print "321"; now it correctly prints "123".
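
The effect of evaluation order can be shown with a minimal sketch. It is not the roff_expand() code; struct reg and expand_ltr() are hypothetical names, and only the single sequence "\n+x" (increment register x, then interpolate it) is modelled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one number register with auto-increment. */
struct reg {
	int	 val;	/* current value */
	int	 step;	/* increment applied by \n+x */
};

/*
 * Expand every "\n+x" in src from left to right into dst
 * (dstsz bytes including the terminating NUL).
 * Each expansion first increments the register, then prints it.
 * Returns 0 on success, -1 if dst is too small.
 */
static int
expand_ltr(const char *src, struct reg *x, char *dst, size_t dstsz)
{
	size_t	 used = 0;
	int	 n;

	assert(src != NULL && x != NULL && dst != NULL);
	if (dstsz == 0)
		return -1;
	while (*src != '\0') {
		if (strncmp(src, "\\n+x", 4) == 0) {
			x->val += x->step;	/* side effect on state */
			n = snprintf(dst + used, dstsz - used, "%d", x->val);
			if (n < 0 || (size_t)n >= dstsz - used)
				return -1;
			used += (size_t)n;
			src += 4;
		} else {
			if (used + 1 >= dstsz)
				return -1;
			dst[used++] = *src++;
		}
	}
	dst[used] = '\0';
	return 0;
}
```

Because the register changes on every expansion, visiting the three sequences in the opposite order would hand the values 1, 2, 3 to them from right to left, which is the "321" described above.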

Right-to-left parsing was convenient because it implicitly handled
nested escape sequences.  With correct left-to-right parsing, nesting
now requires an explicit implementation, here solved as follows:
1. Handle nested expanding escape sequences iteratively.
When finding one, expand it, then retry parsing the enclosing escape
sequence from the beginning, which will ultimately succeed as soon
as it no longer contains any nested expanding escape sequences.
2. Handle nested non-expanding escape sequences recursively.
When finding one, the escape sequence parser calls itself to find
the end of the inner sequence, then continues parsing the outer
sequence after that point.

This requires the mandoc_escape() function to operate in two different
modes.  The roff(7) parser uses it in a mode where it generates
diagnostics and may return an expansion request instead of a parse
result.  All other callers, in particular the formatters, use it
in a simpler mode that never generates diagnostics and always returns
a definite parsing result, but that requires all expanding escape
sequences to already have been expanded earlier.  The bulk of the
code is the same for both modes.
Since this required a major rewrite of the function anyway, move
it into its own new file roff_escape.c and out of the file mandoc.c,
which was misnamed in the first place and lacks a clear focus.

As a side benefit, this also fixes a number of assertion failures
that tb@ found with afl(1), for example "\n\\\\*0", "\v\-\\*0",
and "\w\-\\\\\$0*0".

As another side benefit, it also resolves some code duplication
between mandoc_escape() and roff_expand() and centralizes all
handling of escape sequences (except for expansion) in roff_escape.c,
hopefully easing maintenance and feature improvements in the future.

While here, also move end-of-input handling out of the complicated
function roff_expand() and into the simpler function roff_parse_comment(),
making the logic easier to understand.

Since this is a major reorganization of a central component of
mandoc(1), stability of the program might slightly suffer for a few
weeks, but i believe that's not a problem at this point of the
release cycle.  The new code already satisfies the regression suite,
but more tweaking and regression testing to further improve the
handling of various escape sequences will likely follow in the near
future.

Revision 1.120, Wed Apr 13 13:19:34 2022 UTC (23 months, 2 weeks ago) by schwarze
Branch: MAIN
Changes since 1.119: +3 -3 lines

Surprisingly, groff supports multiple copy mode escapes at the
beginning of an escape sequence: \, \E, \EE, \EEE, and so on all do
the same outside copy mode, so let them do the same in mandoc(1), too.

This fixes an assertion failure triggered by \EE*X that tb@ found
with afl(1).  The first E was consumed by roff_expand(), but that
function failed to recognize the escape sequence as the expansion
of a user-defined string and handed it over to mandoc_escape(),
which consumed the second E and then died on an assertion because
it is not prepared to handle user-defined strings.  Fix this by
letting *both* functions handle arbitrary numbers of 'E's correctly.
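
A minimal sketch of skipping any number of 'E's, assuming the caller passes a pointer to the character right after the backslash. The helper name skip_copy_escapes() is hypothetical and not taken from the mandoc sources.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Skip any number of 'E' copy mode escape characters and return
 * a pointer to the first character that follows them.
 */
static const char *
skip_copy_escapes(const char *cp)
{
	assert(cp != NULL);
	while (*cp == 'E')
		cp++;
	return cp;
}
```

If both the expander and the escape parser apply the same skip before classifying the sequence, they agree on where the name of the sequence starts, regardless of how many 'E's precede it.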

Revision 1.119, Tue Aug 10 12:55:03 2021 UTC (2 years, 7 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_14_6
Changes since 1.118: +5 -5 lines

Support two-character font names (BI, CW, CR, CB, CI)
in the tbl(7) layout font modifier.

Get rid of the TBL_CELL_BOLD and TBL_CELL_ITALIC flags and use
the usual ESCAPE_FONT* enum mandoc_esc members from mandoc.h instead,
which simplifies and unifies some code.

While here, also support CB and CI in roff(7) \f escape sequences
and in roff(7) .ft requests for all output modes.  Using those is
certainly not recommended because portability is limited even with
groff, but supporting them makes some existing third-party manual
pages look better, in particular in HTML output mode.
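
As an illustration of mapping one- and two-character font names to a single enum, here is a sketch under stated assumptions: enum font_id and font_lookup() are hypothetical names, the real enum members live in mandoc.h, and CW is treated as a synonym for CR.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical font identifiers, not the names used in mandoc.h. */
enum font_id {
	FONT_NONE,	/* unknown or unsupported name */
	FONT_ROMAN,
	FONT_BOLD,
	FONT_ITALIC,
	FONT_BI,
	FONT_CR,
	FONT_CB,
	FONT_CI
};

/* Map a font name of the given length to a font identifier. */
static enum font_id
font_lookup(const char *name, size_t len)
{
	assert(name != NULL);
	if (len == 1) {
		switch (name[0]) {
		case 'R':
			return FONT_ROMAN;
		case 'B':
			return FONT_BOLD;
		case 'I':
			return FONT_ITALIC;
		default:
			return FONT_NONE;
		}
	}
	if (len != 2)
		return FONT_NONE;
	if (memcmp(name, "BI", 2) == 0)
		return FONT_BI;
	if (memcmp(name, "CW", 2) == 0 || memcmp(name, "CR", 2) == 0)
		return FONT_CR;
	if (memcmp(name, "CB", 2) == 0)
		return FONT_CB;
	if (memcmp(name, "CI", 2) == 0)
		return FONT_CI;
	return FONT_NONE;
}
```

Sharing one lookup of this kind between the tbl(7) layout parser, the \f escape, and the .ft request is the sort of unification the message describes.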

Bug-compatible with groff as far as i'm aware, except that i consider
font names starting with the '\n' (ASCII 0x0a line feed) character
so insane that i decided to not support them.

Missing feature reported by nabijaczleweli dot xyz in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992002.
I used none of the code from the initial patch submitted by
nabijaczleweli, but some of their ideas.
Final patch tested by them, too.

Revision 1.118, Sat Oct 24 22:57:39 2020 UTC (3 years, 5 months ago) by schwarze
Branch: MAIN
Changes since 1.117: +16 -9 lines

Treat \*[.T] in the same way as \*(.T rather than calling abort(3).
Bug found because the groff-current manual pages started using the
variant form of this predefined string.

Revision 1.117, Sun Jan 19 16:44:50 2020 UTC (4 years, 2 months ago) by schwarze
Branch: MAIN
Changes since 1.116: +35 -21 lines

Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro:
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.

Revision 1.116, Thu Jun 27 15:07:30 2019 UTC (4 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.115: +9 -4 lines

Fix mandoc_normdate() and the way it is used.
In the past, it could return NULL but the calling code wasn't prepared
to handle that.  Make sure it always returns an allocated string.
While here, simplify the code by handling the "quick" attribute
inside mandoc_normdate() rather than at multiple callsites.

Triggered by deraadt@ pointing out
that snprintf(3) error handling was incomplete in time2a().

Revision 1.115, Tue May 21 08:04:21 2019 UTC (4 years, 10 months ago) by schwarze
Branch: MAIN
Changes since 1.114: +3 -3 lines

Do not print the style message "missing date" when the date is given
as "$Mdocdate$" without an actual date.  That is the canonical way to
write a new manual page and not bad style at all.
Misleading message reported by kn@ on tech@.

Revision 1.114, Sun Dec 30 00:49:55 2018 UTC (5 years, 2 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_14_5
Changes since 1.113: +3 -2 lines

Cleanup, no functional change:

The struct roff_man used to be a bad mixture of internal parser
state and public parsing results.  Move the public results to the
parsing result struct roff_meta, which is already public.  Move the
rest of struct roff_man to the parser-internal header roff_int.h.

Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.

This keeps parser internal state out of three main programs (five
in mandoc portable) and out of eight formatters.

Revision 1.113, Tue Dec 18 22:00:02 2018 UTC (5 years, 3 months ago) by schwarze
Branch: MAIN
Changes since 1.112: +1 -97 lines

As a first step towards making roff_res() callable from mandoc_getarg(),
move the function mandoc_getarg() from mandoc.c to roff.c.  It was
misplaced in mandoc.c in the first place; that file is intended for
utilities needed both by parsers and by formatters, while reading
macro arguments in copy mode is purely a task of the roff(7) parser.
Needed as a preliminary for an upcoming bugfix.
No code change.

Revision 1.112, Sun Dec 16 00:17:02 2018 UTC (5 years, 3 months ago) by schwarze
Branch: MAIN
Changes since 1.111: +55 -42 lines

Yet another round of improvements to manual font selection.

Unify handling of \f and .ft.
Support \f4 (bold+italic).
Support ".ft BI" and ".ft CW" for terminal output.
Support the .ft request in HTML output.
Reject the bogus fonts \f(C1, \f(C2, \f(C3, and \f(CP.
In regress.pl, only strip leading whitespace in math mode.

Revision 1.111, Sat Dec 15 19:30:26 2018 UTC (5 years, 3 months ago) by schwarze
Branch: MAIN
Changes since 1.110: +79 -17 lines

Several improvements to escape sequence handling.

* Add the missing special character \_ (underscore).
* Partial implementations of \a (leader character)
and \E (uninterpreted escape character).
* Parse and ignore \r (reverse line feed).
* Add a WARNING message about undefined escape sequences.
* Add an UNSUPP message about unsupported escape sequences.
* Mark \! and \? (transparent throughput)
and \O (suppress output) as unsupported.
* Treat the various variants of zero-width spaces as one-byte escape
sequences rather than as special characters, to avoid defining bogus
forms with square brackets.
* For special characters with one-byte names, do not define bogus
forms with square brackets, except for \[-], which is valid.
* In the form with square brackets, undefined special characters do not
fall back to printing the name verbatim, not even for one-byte names.
* Starting a special character name with a blank is an error.
* Undefined escape sequences never abort formatting of the input
string, not even in HTML output mode.
* Document the newly handled escapes, and a few that were missing.
* Regression tests for most of the above.

Revision 1.110, Fri Dec 14 06:33:14 2018 UTC (5 years, 3 months ago) by schwarze
Branch: MAIN
Changes since 1.109: +2 -2 lines

Cleanup, no functional change:
Now that message handling is properly encapsulated,
remove struct mparse pointers from four structs (roff, roff_man,
tbl_node, eqn_node) and from the argument lists of five functions
(roff_alloc, roff_man_alloc, mandoc_getarg, tbl_alloc, eqn_alloc).
Except for being passed to the main program as an opaque object,
it now only occurs in read.c, as it should, and not across 15 files
like in the past.

Revision 1.109, Fri Dec 14 05:18:02 2018 UTC (5 years, 3 months ago) by schwarze
Branch: MAIN
Changes since 1.108: +9 -12 lines

Almost mechanical diff to remove the "struct mparse *" argument
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version:  There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.

Revision 1.108, Thu Oct 25 01:32:40 2018 UTC (5 years, 5 months ago) by schwarze
Branch: MAIN
Changes since 1.107: +7 -2 lines

Implement the \f(CW and \f(CR (constant width font) escape sequences
for HTML output.  Somewhat relevant because pod2man(1) relies on this.
Missing feature reported by Pali dot Rohar at gmail dot com.

Note that constant width font was already correctly selected before
this when required by semantic markup.  Only attempting physical
markup with the low-level escape sequence was ineffective.

Revision 1.107, Mon Aug 20 18:06:56 2018 UTC (5 years, 7 months ago) by schwarze
Branch: MAIN
Changes since 1.106: +8 -5 lines

\f[] means \fP, not \fR

Revision 1.106, Thu Aug 16 13:54:06 2018 UTC (5 years, 7 months ago) by schwarze
Branch: MAIN
Changes since 1.105: +8 -1 lines

Implement the \*(.T predefined string (interpolate device name)
by allowing the preprocessor to pass it through to the formatters.
Used for example by the groff_char(7) manual page.

Revision 1.105, Fri Aug 10 22:12:44 2018 UTC (5 years, 7 months ago) by schwarze
Branch: MAIN
Changes since 1.104: +22 -4 lines

handle the non-portable GNU-style \[charNN], \[charNNN] character
escape sequences, used for example in the groff_char(7) manual page
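
A sketch of recognizing such names, assuming the parser has already isolated the text between the square brackets. The function parse_charnn() is a hypothetical helper; it only accepts the two- and three-digit decimal forms named above and does no range checking.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Parse a special character name of the form "charNN" or "charNNN"
 * with decimal digits.  Return the number, or -1 if the name does
 * not have that form.
 */
static int
parse_charnn(const char *name, size_t len)
{
	size_t	 i;
	int	 val = 0;

	assert(name != NULL);
	if (len < 6 || len > 7 || strncmp(name, "char", 4) != 0)
		return -1;
	for (i = 4; i < len; i++) {
		if (!isdigit((unsigned char)name[i]))
			return -1;
		val = val * 10 + (name[i] - '0');
	}
	return val;
}
```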

Revision 1.104, Sat Jul 28 18:34:15 2018 UTC (5 years, 8 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_14_4
Changes since 1.103: +5 -2 lines

Issue a STYLE message when normalizing the date format in .Dd/.TH.
Leah Neukirchen pointed out that mdoclint(1) used to warn about a
leading zero before the day number, so we know that both NetBSD and
Void Linux want the message.  It does no harm on OpenBSD because
Mdocdate always does the right thing anyway.
jmc@ agrees that it makes sense in contexts not using Mdocdate.

Revision 1.103, Mon Jul 3 13:40:19 2017 UTC (6 years, 8 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_14_3, VERSION_1_14_2
Changes since 1.102: +11 -3 lines

warn about time machines; suggested by Thomas Klausner <wiz @ NetBSD>

Revision 1.102, Wed Jun 14 01:31:26 2017 UTC (6 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.101: +3 -1 lines

implement the roff(7) \p (break output line) escape sequence

Revision 1.101, Sun Jun 11 19:37:01 2017 UTC (6 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.100: +11 -7 lines

Style message about legacy man(7) date format in mdoc(7) documents
and operating system dependent messages about missing or unexpected
Mdocdate; inspired by mdoclint(1).

Revision 1.100, Fri Jun 2 19:21:23 2017 UTC (6 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.99: +13 -3 lines

Partial implementation of \l (horizontal line drawing function).
A full implementation would require access to output device properties
and state variables (both only available after the main parser has
finalized the parse tree) before numerical expansions in the roff
preprocessor (i.e., before the main parser is even started).

Not trying to pull that stunt right now because the static-width
implementation committed here is sufficient for tcl-style manual pages
and already more complicated than i would have suspected.

Revision 1.99, Thu Jun 1 19:05:37 2017 UTC (6 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.98: +2 -2 lines

Minimal implementation of the \h (horizontal motion) escape sequence.
Good enough to cope with the average DocBook insanity.

Revision 1.98, Thu Nov 12 22:44:27 2015 UTC (8 years, 4 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_14_1, VERSION_1_13_4, VERSION_1_13
Changes since 1.97: +32 -16 lines

Simplify the logic in mandoc_normdate() and add some comments.
Also add a comment in time2a() explaining why it isn't possible
to use just one single call to strftime().
Do some style cleanup while here.
No functional change.
Triggered by a very different patch from des@FreeBSD.

Revision 1.97, Thu Oct 15 23:35:55 2015 UTC (8 years, 5 months ago) by schwarze
Branch: MAIN
Changes since 1.96: +1 -3 lines

Delete two preprocessor constants that are no longer used.
Patch from Michael Reed <m dot reed at mykolab dot com>.

Revision 1.96, Tue Oct 13 23:30:50 2015 UTC (8 years, 5 months ago) by schwarze
Branch: MAIN
Changes since 1.95: +4 -1 lines

Reject the escape sequences \[uD800] to \[uDFFF] in the parser.
These surrogates are not valid Unicode codepoints,
so treat them just like any other undefined character escapes:
Warn about them and do not produce output.
Issue noticed while talking to stsp@, semarie@, and bentley@.
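
The range test itself is small. The following sketch assumes the code point has already been converted to a number and range-checked; is_surrogate() is a hypothetical name.

```c
#include <assert.h>

/*
 * Return 1 if cp lies in the UTF-16 surrogate range, which does
 * not contain valid Unicode code points, and 0 otherwise.
 * The caller is assumed to have checked the upper bound already.
 */
static int
is_surrogate(unsigned long cp)
{
	assert(cp <= 0x10FFFFUL);
	return cp >= 0xD800UL && cp <= 0xDFFFUL;
}
```

A parser following the description above would warn and produce no output whenever this returns 1, exactly as for any other undefined character escape.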

Revision 1.95, Mon Oct 12 00:08:15 2015 UTC (8 years, 5 months ago) by schwarze
Branch: MAIN
Changes since 1.94: +1 -32 lines

To make the code more readable, delete 283 /* FALLTHROUGH */ comments
that were right between two adjacent case statements.  Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
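
The distinction can be illustrated with a made-up function; classify() is hypothetical and does not appear in the mandoc sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classifier, used only to show the comment style. */
static int
classify(int c, int *seen_sign)
{
	assert(seen_sign != NULL);
	switch (c) {
	case ' ':
	case '\t':		/* adjacent labels: no comment needed */
		return 0;
	case '-':
		*seen_sign = 1;	/* this case executes code first ... */
		/* FALLTHROUGH */
	case '0':
	case '1':
		return 1;
	default:
		return 2;
	}
}
```

Only the '-' case keeps its comment, because it does some work before execution continues into the following case.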

Revision 1.94, Tue Oct 6 18:32:19 2015 UTC (8 years, 5 months ago) by schwarze
Branch: MAIN
Changes since 1.93: +27 -26 lines

modernize style: "return" is not a function

Revision 1.93, Sat Aug 29 22:40:05 2015 UTC (8 years, 7 months ago) by schwarze
Branch: MAIN
Changes since 1.92: +5 -1 lines

Parse and ignore the escape sequences \, and \/ (italic corrections).
Actually using these is very stupid because they are groff extensions
and other roff(7) implementations typically print unintended characters
at the places where they are used.
Nevertheless, some manuals contain them, for example ocserv(8).
Problem reported by Kurt Jaeger <pi at FreeBSD>.

Revision 1.92, Fri Feb 20 23:55:10 2015 UTC (9 years, 1 month ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_13_3
Changes since 1.91: +9 -1 lines

For selecting a two-digit font size, support the historic syntax \s12
in addition to the classic syntax \s(12, the modern syntax \s[12],
and the alternative syntax \s'12'.  The historic syntax only works
for the font sizes 10-39.
Real-world usage found by naddy@ in plan9/rc.
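
One way to read the rule for the historic syntax, as a sketch only: a second digit is consumed only when the first digit is 1, 2, or 3, which yields exactly the sizes 10-39. The helper historic_size_len() is hypothetical, and sign handling is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * cp points at the first character after "\s".  Return how many
 * digit characters belong to the size argument in the historic
 * syntax: 0 if there is no digit, 2 for the sizes 10-39, else 1.
 */
static size_t
historic_size_len(const char *cp)
{
	assert(cp != NULL);
	if (!isdigit((unsigned char)cp[0]))
		return 0;
	if (cp[0] >= '1' && cp[0] <= '3' && isdigit((unsigned char)cp[1]))
		return 2;
	return 1;
}
```

With this reading, "\s12" selects size 12, while in "\s40" only the 4 belongs to the escape and the 0 is ordinary text.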

Revision 1.91, Wed Jan 21 20:33:25 2015 UTC (9 years, 2 months ago) by schwarze
Branch: MAIN
Changes since 1.90: +7 -5 lines

Rudimentary implementation of the roff(7) \o escape sequence (overstrike).
This is of some relevance because the pod2man(1) preamble abuses it
for the Icelandic letter Thorn, instead of simply using \(TP and \(Tp.
Missing feature found by sthen@ in DateTime::Locale::is_IS(3p).

Revision 1.90, Thu Jan 1 18:11:45 2015 UTC (9 years, 2 months ago) by schwarze
Branch: MAIN
Changes since 1.89: +4 -4 lines

Fix a read buffer overrun triggered by trailing \s- or trailing \s+
without the required subsequent argument; found by jsg@ with afl.

Revision 1.89, Mon Dec 15 17:30:30 2014 UTC (9 years, 3 months ago) by schwarze
Branch: MAIN
Changes since 1.88: +3 -1 lines

Catch localtime() failure for additional safety;
patch from Jan Stary <hans at stare dot cz> some time ago.

Revision 1.88, Tue Oct 28 13:24:44 2014 UTC (9 years, 5 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_13_2
Changes since 1.87: +10 -5 lines

Tighten Unicode escape name parsing.
Accept only 0xXXXX, 0xYXXXX, 0x10XXXX with Y != 0.
This simplifies mchars_num2uc().
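
The accepted forms can be sketched as follows. This is an illustration rather than the committed code: parse_unicode_name() is a hypothetical helper, and accepting lowercase hexadecimal digits is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the code point for names of the form uXXXX, uYXXXX with
 * Y != 0, or u10XXXX, where X is a hexadecimal digit.
 * Return -1 for every other name.
 */
static long
parse_unicode_name(const char *name, size_t len)
{
	char	 buf[7];	/* up to six digits plus NUL */

	assert(name != NULL);
	if (len < 5 || len > 7 || name[0] != 'u')
		return -1;
	if (len == 6 && name[1] == '0')
		return -1;	/* five digits: no leading zero */
	if (len == 7 && (name[1] != '1' || name[2] != '0'))
		return -1;	/* six digits: must start with 10 */
	memcpy(buf, name + 1, len - 1);
	buf[len - 1] = '\0';
	if (strspn(buf, "0123456789ABCDEFabcdef") != len - 1)
		return -1;
	return strtol(buf, NULL, 16);
}
```

Because the length and prefix checks already exclude every value above 0x10FFFF, a later conversion step has no range left to verify.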

Revision 1.87, Mon Oct 13 17:17:45 2014 UTC (9 years, 5 months ago) by schwarze
Branch: MAIN
Changes since 1.86: +12 -13 lines

Stricter syntax checking of Unicode character names:
Require exactly 4, 5 or 6 hex digits and allow nothing else.
This avoids mishandling stuff like \[ua] and \C'uA' as Unicode
and also fixes underlining in eqn(7) -Thtml output which uses \[ul].
Problem found and semantics suggested by kristaps@.

Revision 1.86, Mon Aug 18 09:11:47 2014 UTC (9 years, 7 months ago) by kristaps
Branch: MAIN
Changes since 1.85: +3 -2 lines

Fix a corner case where \H<nil> (where <nil> is the \0 character) would
cause mandoc_escape() to read past the end of an allocated string.
Found when a script scanning all Mac OS X manuals accidentally also
scanned binary (gzip'd) files, discussed with schwarze@ on tech@.

Revision 1.85, Sat Aug 16 19:00:01 2014 UTC (9 years, 7 months ago) by schwarze
Branch: MAIN
Changes since 1.84: +2 -2 lines

Improve build system and autodetection.
* Make ./configure standalone, that's what people expect.
* Let people write a ./configure.local from scratch, not edit existing files.
* Autodetect wchar, sqlite3, and manpath and act accordingly.
* Autodetect the need for -L/usr/local/lib and -lutil.
* Get rid of config.h.p{re,ost}, let ./configure only write what's needed.
* Let ./configure write a Makefile.local snippet, that's quite flexible.

Revision 1.84, Sun Aug 10 23:54:41 2014 UTC (9 years, 7 months ago) by schwarze
Branch: MAIN
Branch point for: VERSION_1_12
Changes since 1.83: +1 -3 lines

Get rid of HAVE_CONFIG_H, it is always defined; idea from libnbcompat.
Include <sys/types.h> where needed, it does not belong in config.h.
Remove <stdio.h> from config.h; if it is missing somewhere, it should
be added, but i cannot find a *.c file where it is missing.

Revision 1.83, Sun Jul 6 19:09:00 2014 UTC (9 years, 8 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_13_1
Changes since 1.82: +3 -3 lines

Clean up messages related to plain text and to escape sequences.
* Mention invalid escape sequences and string names, and fallbacks.
* Hierarchical naming.

Revision 1.82, Sun Jul 6 18:37:34 2014 UTC (9 years, 8 months ago) by schwarze
Branch: MAIN
Changes since 1.81: +4 -2 lines

Fix handling of escape sequences taking numeric arguments.
* Repair detection of invalid delimiters.
* Discard the invalid delimiter together with the invalid sequence.

Note to self: In general, strchr("\0...", c) is a thoroughly bad idea.
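
Two properties of C strings make this idiom risky: a literal that starts with "\0" is an empty string as far as strchr(3) is concerned, and strchr(s, '\0') always succeeds because the terminating NUL counts as part of the string. A small sketch, with the hypothetical helpers in_set_buggy() and in_set():

```c
#include <assert.h>
#include <string.h>

/*
 * Buggy: reports a match for c == '\0' no matter what the set
 * contains, because strchr() finds the terminating NUL.
 */
static int
in_set_buggy(const char *set, char c)
{
	assert(set != NULL);
	return strchr(set, c) != NULL;
}

/* Safer: handle the NUL byte explicitly before consulting the set. */
static int
in_set(const char *set, char c)
{
	assert(set != NULL);
	return c != '\0' && strchr(set, c) != NULL;
}
```

With the buggy variant, reaching the end of the input looks like finding a valid delimiter, which is the kind of misdetection this revision repairs.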

Revision 1.81, Tue Jul 1 22:37:15 2014 UTC (9 years, 8 months ago) by schwarze
Branch: MAIN
Changes since 1.80: +2 -2 lines

Clean up the warnings related to document structure.
* Hierarchical naming of the related enum mandocerr items.
* Mention the offending macro, section title, or string.
While here, improve some wordings:
* Descriptive instead of imperative style.
* Uniform style for "missing" and "skipping".
* Where applicable, mention the fallback used.

Revision 1.80, Fri Jun 20 17:24:00 2014 UTC (9 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.79: +3 -3 lines

Start systematic improvements of error reporting.
So far, this covers all WARNINGs related to the prologue.

1) hierarchical naming of MANDOCERR_* constants
2) mention the macro name in messages where that adds clarity
3) add one missing MANDOCERR_DATE_MISSING msg
4) fix the wording of one message related to the man(7) prologue

Started on the plane back from Ottawa.

Revision 1.79, Sun Apr 20 16:46:04 2014 UTC (9 years, 11 months ago) by schwarze
Branch: MAIN
Changes since 1.78: +62 -62 lines

KNF: case (FOO):  ->  case FOO:, remove /* LINTED */ and /* ARGSUSED */,
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change

Revision 1.78, Tue Apr 8 01:37:27 2014 UTC (9 years, 11 months ago) by schwarze
Branch: MAIN
Changes since 1.77: +3 -6 lines

Fully implement the \B (validate numerical expression) and
partially implement the \w (measure text width) escape sequence
in a way that makes them usable in numerical expressions and in
conditional requests, similar to how \n (interpolate number register)
and \* (expand user-defined string) are implemented.

This lets mandoc(1) handle the baroque low-level roff code
found at the beginning of the ggrep(1) manual.
Thanks to pascal@ for the report.

Revision 1.77, Mon Apr 7 17:51:10 2014 UTC (9 years, 11 months ago) by schwarze
Branch: MAIN
Changes since 1.76: +5 -5 lines

Accept arbitrary argument delimiters for various roff(7) escape sequences.
Needed for example by the new Perl pod2man(1) preamble.
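
A minimal sketch of delimiter-agnostic argument scanning. The function delimited_arg() is a hypothetical helper; it treats whatever character follows the escape name as the delimiter and ignores any restrictions roff(7) places on which characters may serve as delimiters.

```c
#include <assert.h>
#include <string.h>

/*
 * cp points at the opening delimiter of an escape argument, for
 * example at the first quote in 'text'.  On success, store the
 * start and length of the argument and return a pointer just past
 * the closing delimiter.  Return NULL if the argument is missing
 * or unterminated.
 */
static const char *
delimited_arg(const char *cp, const char **arg, size_t *len)
{
	const char	*end;
	char		 delim;

	assert(cp != NULL && arg != NULL && len != NULL);
	delim = *cp;
	if (delim == '\0')
		return NULL;
	end = strchr(cp + 1, delim);
	if (end == NULL)
		return NULL;
	*arg = cp + 1;
	*len = (size_t)(end - (cp + 1));
	return end + 1;
}
```

Checking for the NUL byte before calling strchr(3) matters here, since searching for '\0' would always succeed.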

Revision 1.76, Sun Mar 23 11:25:26 2014 UTC (10 years ago) by schwarze
Branch: MAIN
Changes since 1.75: +2 -69 lines

The files mandoc.c and mandoc.h contained both specialised low-level
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions.  Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.

Revision 1.75, Tue Dec 31 23:23:10 2013 UTC (10 years, 2 months ago) by schwarze
Branch: MAIN
Changes since 1.74: +5 -5 lines

Simplify: Remove an unused argument from the mandoc_eos() function.
No functional change.

Revision 1.74, Mon Dec 30 18:30:32 2013 UTC (10 years, 2 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_12_3
Changes since 1.73: +2 -2 lines

Remove duplicate const specifiers from the declaration of mandoc_escape().
Found by Thomas Klausner <wiz at NetBSD dot org> using clang.
No functional change.

Revision 1.73, Thu Dec 26 02:55:28 2013 UTC (10 years, 3 months ago) by schwarze
Branch: MAIN
Changes since 1.72: +6 -8 lines

I have no idea how it happened that \B, \H, \h, \L, and \l got
mapped to ESCAPE_NUMBERED (which is for \N and only for \N), that
made no sense at all.  Properly remap them to ESCAPE_IGNORE.

While here, move \B and \w from the group taking number arguments
to the group taking string arguments; right now, that doesn't imply
any functional change, but if we ever go ahead and implement a
parser for roff(7) numerical expressions, it will suddenly start
to matter, and cause confusion.

Revision 1.72, Wed Dec 25 22:45:33 2013 UTC (10 years, 3 months ago) by schwarze
Branch: MAIN
Changes since 1.71: +9 -1 lines

Parse and ignore the roff(7) escape sequences \d (move half line down)
and \u (move half line up).  Found by bentley@ in some DocBook crap.

Revision 1.71, Wed Dec 25 00:50:05 2013 UTC (10 years, 3 months ago) by schwarze
Branch: MAIN
Changes since 1.70: +4 -4 lines

s/[Nn]ull/NUL/ in comments where appropriate;
suggested by Thomas Klausner <wiz @ NetBSD dot org>.

Revision 1.70, Sun Nov 10 21:34:04 2013 UTC (10 years, 4 months ago) by schwarze
Branch: MAIN
Changes since 1.69: +5 -2 lines

Support the alternative syntax \C'uXXXX' for Unicode characters.
It is already documented in the Heirloom troff manual,
and groff handles it as well.

Bug reported by Bjarni Ingi Gislason <bjarniig at rhi dot hi dot is>
on <bug-groff at gnu dot org>.  Well, admittedly, that bug was reported
against groff, but mandoc was even more broken than groff with respect
to this syntax...

Revision 1.69, Sat Oct 5 20:30:05 2013 UTC (10 years, 5 months ago) by schwarze
Branch: MAIN
Changes since 1.68: +2 -2 lines

Cleanup suggested by gcc-4.8.1, following hints by Christos Zoulas:
- avoid bad qualifier casting in roff.c, roff_parsetext()
  by changing the mandoc_escape arguments to "const char const **"
- avoid bad qualifier casting in mandocdb.c, index_merge()
- do not complain about unused variables in test-*.c
- garbage collect a few unused variables elsewhere

Revision 1.68, Thu Aug 8 20:07:47 2013 UTC (10 years, 7 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_12_2
Changes since 1.67: +15 -9 lines

Implement the roff(7) font-escape sequence \f(BI "bold+italic".
This improves the formatting of about 40 base manuals
and reduces groff-mandoc formatting differences in base by about 5%.

Revision 1.67, Thu Jun 20 22:39:30 2013 UTC (10 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.66: +24 -6 lines

Improve handling of the roff(7) "\t" escape sequence:
* Parsing macro arguments has to be done in copy mode,
  which implies replacing "\t" by a literal tab character.
* Otherwise, render "\t" as the empty string, not as a 't' character.
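
The copy mode half of this can be sketched as below. The function copy_mode_tabs() is a hypothetical helper, not the code committed here; it leaves all escapes other than "\t" untouched, including an escaped backslash.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src to dst, replacing each two-character sequence
 * backslash-t by a literal tab character.  dst must provide at
 * least strlen(src) + 1 bytes.  Other escape sequences are copied
 * unchanged, so backslash-backslash-t does not become a tab.
 */
static void
copy_mode_tabs(const char *src, char *dst)
{
	assert(src != NULL && dst != NULL);
	while (*src != '\0') {
		if (src[0] == '\\' && src[1] == 't') {
			*dst++ = '\t';
			src += 2;
		} else if (src[0] == '\\' && src[1] != '\0') {
			*dst++ = *src++;	/* keep the backslash ... */
			*dst++ = *src++;	/* ... and what it escapes */
		} else
			*dst++ = *src++;
	}
	*dst = '\0';
}
```

Outside copy mode, the same sequence would instead be dropped from the output, matching the second bullet above.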

This fixes formatting of the distfile example in the oldrdist(1) manual.
This also shows up in the unzip(1) manual as one of several issues
preventing the removal of USE_GROFF from the archivers/unzip port.
Thanks to espie@ for attracting my attention to the unzip(1) manual.
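
The two cases described above can be sketched as follows.  This is an illustration only, not the committed code; the function name handle_tab_escape() and the copy_mode flag are hypothetical, and every escape other than "\t" is simply copied through.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: copy in to out (which must hold at least
 * strlen(in) + 1 bytes).  In copy mode, "\t" becomes a literal tab;
 * otherwise it is rendered as the empty string.
 */
void
handle_tab_escape(const char *in, char *out, int copy_mode)
{
	assert(in != NULL && out != NULL);
	while (*in != '\0') {
		if (in[0] == '\\' && in[1] == 't') {
			if (copy_mode)
				*out++ = '\t';	/* literal tab */
			in += 2;		/* else: nothing at all */
		} else if (in[0] == '\\' && in[1] != '\0') {
			/* Any other escape: copy its first two bytes. */
			*out++ = *in++;
			*out++ = *in++;
		} else
			*out++ = *in++;
	}
	*out = '\0';
}
```

Copying the two bytes of every other escape in one step means that an escaped backslash followed by a 't' is not mistaken for "\t".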

Revision 1.66 / (download) - annotate - [select for diffs], Tue Jun 12 20:21:04 2012 UTC (11 years, 9 months ago) by kristaps
Branch: MAIN
Changes since 1.65: +1 -27 lines
Diff to previous 1.65 (unified) to selected 1.60 (unified)

Add `cc' support.
This was reported by espie@ and in the TODO.
Caveat: `cc' has buggy behaviour when invoked in groff(1) and followed
by a line-breaking control character macro, e.g., in a -man doc,

  .cc |
  .B foo
  'B foo
  |cc
  'B foo

will cause groff(1) to behave properly for `.B' but inline the macro
definition for `B' when invoked with the line-breaking macro.

Revision 1.65 / (download) - annotate - [select for diffs], Thu May 31 22:38:16 2012 UTC (11 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.64: +65 -68 lines
Diff to previous 1.64 (unified) to selected 1.60 (unified)

While i already got my fingers dirty on mandoc_escape(),
take the opportunity to pull out some spaghetti, that is,
three confusing variables and fourteen pointless assignments
among them; instead, always operate on the official pointers
**start, **end, and *sz, each of which conveys an obvious meaning.

No functional change intended, and the new tests confirm that
everything still (err...) "works", as far as that word can be
applied to the kind of roff(7) mock-up code i'm polishing here.

"just commit" kristaps@

Revision 1.64 / (download) - annotate - [select for diffs], Thu May 31 22:34:06 2012 UTC (11 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.63: +37 -118 lines
Diff to previous 1.63 (unified) to selected 1.60 (unified)

Make recursive parsing of roff(7) escapes actually work in the general case,
in particular when the inner escapes are preceded or followed by other terms.
While doing so, remove lots of bogus code that was trying to make pointless
distinctions between numeric and non-numeric escape sequences, while both
actually share the same syntax and we ignore the semantics anyway.

This prevents some of the strings defined in the pod2man(1) preamble
from producing garbage output, in particular in Scandinavian words.
Of course, proper rendering of Scandinavian national characters
cannot be expected even with these fixes.

"just commit" kristaps@

Revision 1.63 / (download) - annotate - [select for diffs], Thu May 31 22:29:13 2012 UTC (11 years, 9 months ago) by schwarze
Branch: MAIN
Changes since 1.62: +12 -2 lines
Diff to previous 1.62 (unified) to selected 1.60 (unified)

Implement the roff \z escape sequence, intended to output the next
character without advancing the cursor position; implement it to
simply skip the next character, as it will usually be overwritten.

With this change, the pod2man(1) preamble user-defined string \*:,
intended to render as a diaeresis or umlaut diacritic above the
preceding character, is rendered in a slightly less ugly way,
though still not correctly.  It was rendered as "z.." and is now
rendered as ".".

Given that the definition of \*: uses elaborate manual \h positioning,
there is little chance for mandoc(1) to ever render it correctly,
but at least we can refrain from printing out a spurious "z", and
we can make the \z do something semi-reasonable for easier cases.

"just commit" kristaps@

Revision 1.62 / (download) - annotate - [select for diffs], Sat Dec 3 16:08:51 2011 UTC (12 years, 3 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_12_1
Changes since 1.61: +4 -3 lines
Diff to previous 1.61 (unified) to selected 1.60 (unified)

ISO style "%Y-%m-%d" dates are common in man(7) .TH.
They have been considered valid in the past, but were reformatted
to the mdoc(7) "Month day, year" style.
To make page footers more similar to groff, no longer reformat them,
just print them as they are.
This doesn't change anything with respect to what's considered valid
or what is warned about.

ok kristaps@
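
As an illustration of the pass-through idea, a date in the "%Y-%m-%d" shape can be recognised without reformatting it.  This is a sketch under assumptions, not the committed change: the name is_iso_date() is hypothetical, and only the shape of the string is checked, not whether the month or day is in range.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if s looks like YYYY-MM-DD, else 0.
 * Only the shape is checked; "2011-99-99" is accepted as well.
 */
int
is_iso_date(const char *s)
{
	static const char pat[] = "dddd-dd-dd";
	size_t	 i;

	assert(s != NULL);
	for (i = 0; pat[i] != '\0'; i++) {
		if (pat[i] == 'd') {
			if (!isdigit((unsigned char)s[i]))
				return 0;
		} else if (s[i] != pat[i])
			return 0;
	}
	/* Nothing may follow the day. */
	return s[i] == '\0';
}
```

A caller could print the .TH date verbatim when is_iso_date() returns 1 and fall back to whatever handling it uses for other formats when it returns 0.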

Revision 1.61 / (download) - annotate - [select for diffs], Sun Nov 6 14:43:14 2011 UTC (12 years, 4 months ago) by kristaps
Branch: MAIN
Changes since 1.60: +9 -2 lines
Diff to previous 1.60 (unified)

Accommodate \f(Cx formatting.  Noted by Andreas Vogele, thanks!

Revision 1.60 / (download) - annotate - [selected], Mon Oct 24 20:30:57 2011 UTC (12 years, 5 months ago) by schwarze
Branch: MAIN
Changes since 1.59: +23 -7 lines
Diff to previous 1.59 (unified)

Handle \N numbered character escapes the same way as groff:
If \N is followed by a digit, ignore \N and the digit.
If \N is followed by a non-digit, the next non-digit
ends the character number; the two delimiters need not match.
Kristaps calls that "gross, but not our fault".

For now, i'm fixing \N only.  Other escapes taking numeric arguments
may or may not need similar handling, but \N is by far the most
important for practical purposes.

ok kristaps@
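
The \N rules above can be sketched like this.  It is an illustration, not the code committed in this revision; the name parse_numbered() and its return convention are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: s points just past "\N".  Store the character
 * number in *num, or -1 if there is none, and return how many bytes
 * after "\N" belong to the escape sequence.
 */
size_t
parse_numbered(const char *s, int *num)
{
	size_t	 i;

	assert(s != NULL && num != NULL);
	*num = -1;
	if (*s == '\0')
		return 0;
	/* A digit right after \N: ignore \N and that digit. */
	if (isdigit((unsigned char)*s))
		return 1;
	/* Otherwise *s opens the number; collect the digits. */
	for (i = 1; isdigit((unsigned char)s[i]); i++)
		continue;
	if (i > 1)
		*num = atoi(s + 1);
	/* Any non-digit closes the number; it need not match *s. */
	if (s[i] != '\0')
		i++;
	return i;
}
```

With this sketch, both \N'65' and \N'65| yield character number 65 and consume four bytes after \N, which is the "delimiters need not match" behaviour described above.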

Revision 1.59 / (download) - annotate - [select for diffs], Sun Sep 18 14:14:15 2011 UTC (12 years, 6 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_12_0
Changes since 1.58: +2 -2 lines
Diff to previous 1.58 (unified) to selected 1.60 (unified)

forgotten Copyright bumps; no code change
found while syncing to OpenBSD

Revision 1.58 / (download) - annotate - [select for diffs], Wed Jul 27 07:32:26 2011 UTC (12 years, 8 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_11_7, VERSION_1_11_6
Changes since 1.57: +1 -39 lines
Diff to previous 1.57 (unified) to selected 1.60 (unified)

Move mandoc_hyph() into roff_parsetext() as a single conditional.  While
here, do some function renames for clarity and make all function
prototypes be in one place.

Revision 1.57 / (download) - annotate - [select for diffs], Wed Jul 27 07:06:29 2011 UTC (12 years, 8 months ago) by kristaps
Branch: MAIN
Changes since 1.56: +21 -10 lines
Diff to previous 1.56 (unified) to selected 1.60 (unified)

Update mandoc_hyph() to the extent that numbers on either side of the
hyphen make for a non-breakable hyphen.  Found by random testing.
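
One way to read this rule is that a hyphen is breakable only when it has a letter on both sides, so a digit on either side makes it non-breakable.  The sketch below encodes that reading; the name hyph_breakable() is hypothetical and this is not the body of mandoc_hyph().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: start is the beginning of the word and c
 * points at a '-' inside it.  Return 1 if a line break is allowed
 * after this hyphen, else 0.
 */
int
hyph_breakable(const char *start, const char *c)
{
	assert(start != NULL && c != NULL && *c == '-');
	/* Never break at the very start or end of the word. */
	if (c == start || c[1] == '\0')
		return 0;
	/* Letters on both sides; a digit on either side fails. */
	return isalpha((unsigned char)c[-1]) &&
	    isalpha((unsigned char)c[1]);
}
```

Under this reading, "well-known" may be broken at the hyphen, while "1-2" and "a-1" may not.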

Revision 1.56 / (download) - annotate - [select for diffs], Sun Jul 24 18:15:14 2011 UTC (12 years, 8 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_11_5
Changes since 1.55: +9 -6 lines
Diff to previous 1.55 (unified) to selected 1.60 (unified)

Scary-looking but otherwise harmless changes allow me to build for Windows.
That is to say, with mingw32.  This amounts to the following:

 (1) break compat.c into compat_strlcpy.c and compat_strlcat.c
 (2) add compat_getsubopt.c (from OpenBSD) and test-getsubopt.c
 (3) add test-strptime.c for HAVE_STRPTIME
 (4) add ifdef bits here and there, where necessary
 (5) remove some harmless unportable stuff (u_char, localtime_r)

I've added the appropriate mdocml.zip target to the Makefile, too.

Revision 1.55 / (download) - annotate - [select for diffs], Thu Jul 21 23:30:39 2011 UTC (12 years, 8 months ago) by kristaps
Branch: MAIN
Changes since 1.54: +11 -1 lines
Diff to previous 1.54 (unified) to selected 1.60 (unified)

Complete eqn.7 parsing.  Features all productions from the original 1975
CACM paper in an LR(1) parse (1 -> eqn_rewind()).  Right now the code is
a little jungly, but will clear up as I consolidate parse components.
The AST structure will also be cleaned up, as right now it's pretty ad
hoc (this won't change the parse itself).  I also added the mandoc_strndup()
function while here.

Revision 1.54 / (download) - annotate - [select for diffs], Thu Jul 21 15:21:13 2011 UTC (12 years, 8 months ago) by kristaps
Branch: MAIN
Changes since 1.53: +6 -7 lines
Diff to previous 1.53 (unified) to selected 1.60 (unified)

Support `size' constructs in eqn.7.  Generalise mandoc_strontou to this
effect.

Revision 1.53 / (download) - annotate - [select for diffs], Tue May 24 21:31:23 2011 UTC (12 years, 10 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_11_4, VERSION_1_11_3
Changes since 1.52: +1 -5 lines
Diff to previous 1.52 (unified) to selected 1.60 (unified)

Remove all references to ESCAPE_PREDEF, which is now not exposed past
the libroff point.  This clears up a nice chunk of code.

Revision 1.52 / (download) - annotate - [select for diffs], Sun May 15 15:30:33 2011 UTC (12 years, 10 months ago) by kristaps
Branch: MAIN
Changes since 1.51: +9 -1 lines
Diff to previous 1.51 (unified) to selected 1.60 (unified)

Support groff's escape for Unicode input.  See

  http://mdocml.bsd.lv/archives/tech/0368.html

For the time being, we just throw it away.

Revision 1.51 / (download) - annotate - [select for diffs], Sat May 14 17:54:42 2011 UTC (12 years, 10 months ago) by kristaps
Branch: MAIN
Changes since 1.50: +2 -2 lines
Diff to previous 1.50 (unified) to selected 1.60 (unified)

Make character engine (-Tascii, -Tpdf, -Tps) ready for Unicode: make buffer
consist of type "int".  This will take more work (especially in encode and
friends), but this is a strong start.  This commit also consists of some
harmless lint fixes.

Revision 1.50 / (download) - annotate - [select for diffs], Sat May 14 16:06:09 2011 UTC (12 years, 10 months ago) by kristaps
Branch: MAIN
Changes since 1.49: +35 -1 lines
Diff to previous 1.49 (unified) to selected 1.60 (unified)

Move roff.c's strtol into libmandoc.h for use by other parts of the code
(which will come).

Revision 1.49 / (download) - annotate - [select for diffs], Sat Apr 30 10:18:24 2011 UTC (12 years, 11 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_11_2
Changes since 1.48: +2 -2 lines
Diff to previous 1.48 (unified) to selected 1.60 (unified)

No code change: fixing spelling errors.  From a patch by uqs@.  Thanks!

Revision 1.48 / (download) - annotate - [select for diffs], Tue Apr 19 16:38:48 2011 UTC (12 years, 11 months ago) by kristaps
Branch: MAIN
Changes since 1.47: +4 -14 lines
Diff to previous 1.47 (unified) to selected 1.60 (unified)

Clean up parsing of delimiters in -mdoc.  First, remove the "dowarn"
variable from mandoc_getarg() so that it prints the warning every time.
Then, remove the warning from args_checkpunct().  This way, warnings
are being posted at the correct time.  This makes the flag argument to
mdoc_zargs() superfluous, so make it be zero when it's invoked.  Finally,
move the args() flags into mdoc_argv.c and make them enums.

Revision 1.47 / (download) - annotate - [select for diffs], Sun Apr 17 09:08:19 2011 UTC (12 years, 11 months ago) by kristaps
Branch: MAIN
Changes since 1.46: +7 -6 lines
Diff to previous 1.46 (unified) to selected 1.60 (unified)

Get mdoc_argv.c ready to use [some of] mandoc_getarg() by giving said
function a parameter to suppress warnings.

Revision 1.46 / (download) - annotate - [select for diffs], Sat Apr 9 15:35:30 2011 UTC (12 years, 11 months ago) by kristaps
Branch: MAIN
Changes since 1.45: +4 -4 lines
Diff to previous 1.45 (unified) to selected 1.60 (unified)

Lint catching some potential issues.

Revision 1.45 / (download) - annotate - [select for diffs], Sat Apr 9 15:29:40 2011 UTC (12 years, 11 months ago) by kristaps
Branch: MAIN
Changes since 1.44: +306 -142 lines
Diff to previous 1.44 (unified) to selected 1.60 (unified)

Remove a2roffdeco() and mandoc_special() functions and replace them with
a public (mandoc.h) function mandoc_escape(), which merges the
functionality of both prior functions.

Reason: code duplication.  The a2roffdeco() and mandoc_special()
functions were pretty much the same thing and both quite complex.  This
allows one function to receive improvements in (e.g.) subexpression
handling and performance, instead of having to replicate functionality.

As such, the mandoc_escape() function already handles a superset of the
escapes handled in previous versions and has improvements in performance
(using strcspn(), for example) and reliable handling of subexpressions.

This code Works For Me, but may need work to catch any regressions.
Since the benefits are great (leaner code, simpler API), I'd rather have
it in-tree than floating as a patch.

Revision 1.44 / (download) - annotate - [select for diffs], Mon Mar 28 23:52:13 2011 UTC (13 years ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_11_1
Changes since 1.43: +26 -1 lines
Diff to previous 1.43 (unified) to selected 1.60 (unified)

Have libman and libmdoc use mandoc_getcontrol() to determine whether a
macro has been invoked.  libroff is next.

Revision 1.43 / (download) - annotate - [select for diffs], Tue Mar 22 14:05:45 2011 UTC (13 years ago) by kristaps
Branch: MAIN
Changes since 1.42: +1 -51 lines
Diff to previous 1.42 (unified) to selected 1.60 (unified)

Move mandoc_isdelim() back into libmdoc.h.  This fixes an unreported
error where (1) -man pages were punctuating delimiters (e.g., `.B a ;')
and where (2) standalone punctuation in -mdoc or -man (e.g., ";" on its
own line) would also be punctuated.  This introduces a small amount of
complexity, as mdoc_{html,term}.c must manage their own spacing when
running print_word() or print_text().  The check for delimiting now
happens in mdoc_macro.c's dword().

Revision 1.42 / (download) - annotate - [select for diffs], Sun Mar 20 16:02:05 2011 UTC (13 years ago) by kristaps
Branch: MAIN
Changes since 1.41: +9 -9 lines
Diff to previous 1.41 (unified) to selected 1.60 (unified)

Consolidate messages.  Have all parse-time messages (in libmdoc,
libroff, etc., etc.) route into mandoc_msg() and mandoc_vmsg(), for the
time being in libmandoc.h.  This requires struct mparse to be passed
into the allocation routines instead of mandocmsg and a void pointer.
Then, move some of the functionality of the old mmsg() into read.c's
mparse_mmsg() (check against wlevel and setting of file_status) and use
main.c's mmsg() as simply a printing tool.

Revision 1.41 / (download) - annotate - [select for diffs], Thu Mar 17 09:18:12 2011 UTC (13 years ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_10_10
Changes since 1.40: +3 -3 lines
Diff to previous 1.40 (unified) to selected 1.60 (unified)

Tiny optimisation in mandoc_isdelim() check.

Revision 1.40 / (download) - annotate - [select for diffs], Thu Mar 17 09:16:38 2011 UTC (13 years ago) by kristaps
Branch: MAIN
Changes since 1.39: +52 -6 lines
Diff to previous 1.39 (unified) to selected 1.60 (unified)

Move mdoc_isdelim() into mandoc.h as mandoc_isdelim().  This allows the
removal of manual delimiter checks in html.c and term.c.  Finally, add
the escaped period as a closing delimiter, removing a TODO to this
effect.

Revision 1.39 / (download) - annotate - [select for diffs], Tue Mar 15 16:23:51 2011 UTC (13 years ago) by kristaps
Branch: MAIN
Changes since 1.38: +2 -2 lines
Diff to previous 1.38 (unified) to selected 1.60 (unified)

Make lint shut up a little bit.

Revision 1.38 / (download) - annotate - [select for diffs], Tue Mar 15 03:03:54 2011 UTC (13 years ago) by schwarze
Branch: MAIN
Changes since 1.37: +24 -18 lines
Diff to previous 1.37 (unified) to selected 1.60 (unified)

my $buf = "string"; return $string;  is cool in Perl, but not in C;
found by Ulrich Spoerlein <uqs at freebsd> using the clang static analyzer;
"ok, but please document the numbers" kristaps@

Revision 1.37 / (download) - annotate - [select for diffs], Mon Mar 7 01:35:51 2011 UTC (13 years ago) by schwarze
Branch: MAIN
Changes since 1.36: +49 -30 lines
Diff to previous 1.36 (unified) to selected 1.60 (unified)

Clean up date handling,
as a first step to get rid of the frequent petty warnings in this area:
 - always store dates as strings, not as seconds since the Epoch
 - for input, try the three most common formats everywhere
 - for unrecognized format, just pass the date through verbatim
 - when there is no date at all, still use the current date
Originally triggered by a one-line patch from Tim van der Molen,
<tbvdm at xs4all dot nl>, which is included here.
Feedback and OK on manual parts from jmc@.
"please check this in" kristaps@

Revision 1.36 / (download) - annotate - [select for diffs], Mon Jan 3 22:42:37 2011 UTC (13 years, 2 months ago) by schwarze
Branch: MAIN
CVS Tags: VERSION_1_10_9
Changes since 1.35: +81 -3 lines
Diff to previous 1.35 (unified) to selected 1.60 (unified)

Unify roff macro argument parsing (in roff.c, roff_userdef()) and man macro
argument parsing (in man_argv.c, man_args()), both having different bugs,
to use one common macro argument parser (in mandoc.c, mandoc_getarg()),
because from the point of view of roff, man macros are just roff macros,
hence their arguments are parsed in exactly the same way.

While doing so, fix these bugs:
 * Escaped blanks (i.e. those preceded by an odd number of backslashes)
   were mishandled as argument separators in unquoted arguments to
   user-defined roff macros.
 * Unescaped blanks preceded by an even number of backslashes were not
   recognized as argument separators in unquoted arguments to man macros.
 * Escaped backslashes (i.e. pairs of backslashes) were not reduced
   to single backslashes both in unquoted and quoted arguments both
   to user-defined roff macros and to man macros.
 * Escaped quotes (i.e. pairs of quotes inside quoted arguments) were
   not reduced to single quotes in man macros.
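
The four rules above can be sketched in one left-to-right scan.  This is an illustration under assumptions, not the body of mandoc_getarg(): the name next_arg() is hypothetical, it writes into a caller-supplied buffer, and it keeps an escaped blank together with its backslash for a later stage to interpret.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: copy the next macro argument from *in to out
 * (which must hold at least strlen(*in) + 1 bytes) and advance *in
 * to the start of the following argument.
 */
void
next_arg(const char **in, char *out)
{
	const char	*cp;
	int		 quoted;

	assert(in != NULL && *in != NULL && out != NULL);
	cp = *in;
	quoted = (*cp == '"');
	if (quoted)
		cp++;
	while (*cp != '\0') {
		if (cp[0] == '\\' && cp[1] == '\\') {
			/* Escaped backslash: emit one, skip both. */
			*out++ = '\\';
			cp += 2;
		} else if (!quoted && cp[0] == '\\' && cp[1] == ' ') {
			/* Escaped blank: part of the argument. */
			*out++ = *cp++;
			*out++ = *cp++;
		} else if (!quoted && cp[0] == ' ') {
			break;	/* unescaped blank ends the argument */
		} else if (quoted && cp[0] == '"') {
			if (cp[1] != '"') {
				cp++;	/* closing quote */
				break;
			}
			*out++ = '"';	/* pair of quotes: emit one */
			cp += 2;
		} else
			*out++ = *cp++;
	}
	*out = '\0';
	while (*cp == ' ')
		cp++;
	*in = cp;
}
```

Consuming backslashes in pairs from the left is what distinguishes the odd and even cases: after any number of pairs, a remaining single backslash escapes the blank, while no remaining backslash leaves the blank as a separator.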

OK kristaps@

Note that mdoc macro argument parsing is yet another beast for no good
reason and is probably afflicted by similar bugs.  But i don't attempt
to fix that right now because it is intricately entangled with lots of
unrelated high-level mdoc(7) functionality, like delimiter handling and
column list phrase handling.  Disentangling that would waste too much
time now.

Revision 1.35 / (download) - annotate - [select for diffs], Sat Sep 4 20:18:53 2010 UTC (13 years, 6 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_10_8, VERSION_1_10_7, VERSION_1_10_6
Changes since 1.34: +5 -5 lines
Diff to previous 1.34 (unified) to selected 1.60 (unified)

Churny commit to quiet lint.  No functional changes.

Revision 1.34 / (download) - annotate - [select for diffs], Sun Aug 29 11:28:09 2010 UTC (13 years, 7 months ago) by kristaps
Branch: MAIN
Changes since 1.33: +3 -3 lines
Diff to previous 1.33 (unified) to selected 1.60 (unified)

Remove overstrike `\o'.  This isn't the best solution because we really
should be printing the contents, but for the time being, this is good
enough.

Revision 1.33 / (download) - annotate - [select for diffs], Tue Aug 24 13:56:51 2010 UTC (13 years, 7 months ago) by kristaps
Branch: MAIN
Changes since 1.32: +25 -2 lines
Diff to previous 1.32 (unified) to selected 1.60 (unified)

Handle nested, recursive mathematical subexpressions.  This is
definitely not general, but it's good enough for pod2man definitions
(after I clean up the roff, which will be addressed in later fixes).

Revision 1.32 / (download) - annotate - [select for diffs], Tue Aug 24 13:39:37 2010 UTC (13 years, 7 months ago) by kristaps
Branch: MAIN
Changes since 1.31: +2 -2 lines
Diff to previous 1.31 (unified) to selected 1.60 (unified)

Strip out `\k' escape.

Revision 1.31 / (download) - annotate - [select for diffs], Tue Aug 24 13:07:01 2010 UTC (13 years, 7 months ago) by kristaps
Branch: MAIN
Changes since 1.30: +7 -3 lines
Diff to previous 1.30 (unified) to selected 1.60 (unified)

Stripping out of `\w' groff escape.  Yet another for pod2man...

Revision 1.30 / (download) - annotate - [select for diffs], Tue Aug 24 12:18:48 2010 UTC (13 years, 7 months ago) by kristaps
Branch: MAIN
Changes since 1.29: +8 -1 lines
Diff to previous 1.29 (unified) to selected 1.60 (unified)

Strip out the `\z' escape.  This is the first recursive sequence,
getting mandoc ready to handle pod2man's complex escapes.

Revision 1.29 / (download) - annotate - [select for diffs], Fri Aug 20 01:02:07 2010 UTC (13 years, 7 months ago) by schwarze
Branch: MAIN
Changes since 1.28: +5 -5 lines
Diff to previous 1.28 (unified) to selected 1.60 (unified)

Implement a simple, consistent user interface for error handling.
We now have sufficient practical experience to know what we want,
so this is intended to be final:
- provide -Wlevel (warning, error or fatal) to select what you care about
- provide -Wstop to stop after parsing a file with warnings you care about
- provide consistent exit status codes for those warnings you care about
- fully document what warnings, errors and fatal errors mean
- remove all other cruft from the user interface, less is more:
- remove all -f knobs along with the whole -f option
- remove the old -Werror because calling warnings "fatal" is silly
- always finish parsing each file, unless fatal errors prevent that
This commit also includes a couple of related simplifications behind
the scenes regarding error handling.
Feedback and OK  kristaps@;  Joerg Sonnenberger (NetBSD) and
Sascha Wildner (DragonFly BSD) agree with the general direction.

Revision 1.28 / (download) - annotate - [select for diffs], Mon Aug 16 09:37:58 2010 UTC (13 years, 7 months ago) by kristaps
Branch: MAIN
Changes since 1.27: +12 -10 lines
Diff to previous 1.27 (unified) to selected 1.60 (unified)

Add \v and \h to ignored escapes.  These are in the category of \s.
Also made sign-less \s-style escapes be ok (this is technically against
what's in the groff.7 manual, but seems pretty widespread).  Noted by
Thomas Jeunet as uglifying the gcc.1 manual.

Revision 1.27 / (download) - annotate - [select for diffs], Sun Jul 25 19:05:59 2010 UTC (13 years, 8 months ago) by joerg
Branch: MAIN
CVS Tags: VERSION_1_10_5
Changes since 1.26: +2 -2 lines
Diff to previous 1.26 (unified) to selected 1.60 (unified)

Ensure that isalnum is called with unsigned char argument.
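
For context, the <ctype.h> functions take an int whose value must be representable as unsigned char or equal EOF, so a plain char holding a byte above 0x7f may be passed as a negative value, which is undefined.  A minimal sketch of the safe pattern, with a hypothetical helper name:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the alphanumeric bytes in a string. */
size_t
count_alnum(const char *s)
{
	size_t	 n;

	assert(s != NULL);
	for (n = 0; *s != '\0'; s++) {
		/* Cast first: a negative char argument is undefined. */
		if (isalnum((unsigned char)*s))
			n++;
	}
	return n;
}
```

The cast costs nothing at run time and keeps the call well defined for input such as ISO-8859-1 or UTF-8 bytes.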

Revision 1.26 / (download) - annotate - [select for diffs], Thu Jul 22 14:03:50 2010 UTC (13 years, 8 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_10_5_PREPDF
Changes since 1.25: +4 -1 lines
Diff to previous 1.25 (unified) to selected 1.60 (unified)

Accept "\s0" (i.e., properly ignore it).  Found in the wild (e.g., gfdl.7).

Revision 1.25 / (download) - annotate - [select for diffs], Wed Jul 21 20:35:03 2010 UTC (13 years, 8 months ago) by kristaps
Branch: MAIN
Changes since 1.24: +2 -2 lines
Diff to previous 1.24 (unified) to selected 1.60 (unified)

Accommodate groff's crappy behaviour wherein an unrecognised
single-character escape (and ONLY this type of escape) will map back
into itself:

       "If a backslash is followed by a character that does not
        constitute a defined escape sequence the backslash is silently
        ignored and the character maps to itself."

(From groff.7.)

Found by Jason McIntyre.

Revision 1.24 / (download) - annotate - [select for diffs], Sun Jul 18 22:55:06 2010 UTC (13 years, 8 months ago) by kristaps
Branch: MAIN
Changes since 1.23: +62 -3 lines
Diff to previous 1.23 (unified) to selected 1.60 (unified)

Throw out a2roffdeco() in out.c for a readable version.  The prior one
was completely unmaintainable.  The new one is both readable and quite
similar to mandoc_special(), which in future versions will easily allow
throwing-away of unsupported escapes (such as \m).  It's also a fair bit
smaller.

DECO_SIZE has been removed: this crap, like colours, will not be
supported.

mandoc_special() also has #if 0'd switch branches for ALL groff.7
escapes and some lint fixes.

Revision 1.23 / (download) - annotate - [select for diffs], Sun Jul 18 17:00:26 2010 UTC (13 years, 8 months ago) by schwarze
Branch: MAIN
Changes since 1.22: +13 -10 lines
Diff to previous 1.22 (unified) to selected 1.60 (unified)

Text ending in a full stop, exclamation mark or question mark
should not flag the end of a sentence if:

1) The punctuation is followed by closing delimiters
and not preceded by alphanumeric characters, like in
"There is no full stop (.) in this sentence"

or

2) The punctuation is a child of a macro
and not preceded by alphanumeric characters, like in
"There is no full stop
.Pq \&.
in this sentence"
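
Both cases can be expressed as one backward scan over the word.  The following is a sketch of that idea, not necessarily the committed code; the name ends_sentence() is hypothetical, and the enclosed flag stands for "this text is a child of a macro".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch: return 1 if the sz bytes at p end a sentence.
 * Pass enclosed != 0 when the text is a child of a macro.
 */
int
ends_sentence(const char *p, size_t sz, int enclosed)
{
	int	 found = 0;

	assert(p != NULL);
	while (sz > 0) {
		switch (p[--sz]) {
		case '"': case '\'': case ']': case ')':
			/* Closing delimiters after the punctuation. */
			if (!found)
				enclosed = 1;
			break;
		case '.': case '!': case '?':
			found = 1;
			break;
		default:
			/* First ordinary character from the right. */
			return found && (!enclosed ||
			    isalnum((unsigned char)p[sz]));
		}
	}
	return found && !enclosed;
}
```

With this sketch, "sentence." and "end.)" end a sentence, whereas "(.)" and a macro child "\&." do not, matching the two examples above.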

"looks fine" to kristaps@; tested by jmc@ and sobrado@

Revision 1.22 / (download) - annotate - [select for diffs], Sun Jul 18 12:10:08 2010 UTC (13 years, 8 months ago) by kristaps
Branch: MAIN
Changes since 1.21: +63 -151 lines
Diff to previous 1.21 (unified) to selected 1.60 (unified)

Clean up mandoc_special() (in order later to catch \m).  It also flags
several syntactic errors that weren't caught before.

Also un-puke chars.c on zero-length \[].

Revision 1.21 / (download) - annotate - [select for diffs], Tue Jul 6 22:04:31 2010 UTC (13 years, 8 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_10_4
Changes since 1.20: +0 -0 lines
Diff to previous 1.20 (unified) to selected 1.60 (unified)

Resurrect mandoc.c after bogus removal (was: libmandoc.c).

Revision 1.20, Mon Jul 5 20:00:55 2010 UTC (13 years, 8 months ago) by kristaps
Branch: MAIN
Changes since 1.19: +1 -1 lines
FILE REMOVED

Renamed mandoc.c to libmandoc.c.  This is part of the effort to get a
cleaner namespace for functions across the entire system (mandoc.h:
getting parsed-string values, or declarations necessary for the AST
data), and compiler functions (libmandoc.h: back-end functions and
declarations).

Revision 1.19 / (download) - annotate - [select for diffs], Sat Jun 19 20:46:28 2010 UTC (13 years, 9 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_10_3, VERSION_1_10_2
Changes since 1.18: +2 -2 lines
Diff to previous 1.18 (unified) to selected 1.60 (unified)

Churn as I finish email address migration kth.se -> bsd.lv.

Revision 1.18 / (download) - annotate - [select for diffs], Wed Jun 9 19:22:56 2010 UTC (13 years, 9 months ago) by kristaps
Branch: MAIN
Changes since 1.17: +54 -30 lines
Diff to previous 1.17 (unified) to selected 1.60 (unified)

Squash bug noted by Ulrich Spoerlein where "-" were being converted to
ASCII_HYPH, as per normal, but were screwing up mandoc_special().  Fixed
by making mandoc_special() first check isspace() instead of ! isgraph(),
then normalise its string as it passes out.  This requires de-constifying
some validation routines not already de-constified (those in libman),
but that's ok, because I'd like to be pushing actions into validation
routines to save on space and redundant calculations.

Revision 1.17 / (download) - annotate - [select for diffs], Tue Jun 1 11:47:28 2010 UTC (13 years, 9 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_10_1
Changes since 1.16: +3 -1 lines
Diff to previous 1.16 (unified) to selected 1.60 (unified)

Fixed condition of `\}' closing a conditional at the start of the line.

Fixed flushed-out condition of \} causing subsequent arguments to be
truncated, when in fact the whole line should be passed through (if the
conditional succeeds) to the front-end and the \} ignored there.

Added regression test of this behaviour.

Revision 1.16 / (download) - annotate - [select for diffs], Tue May 25 12:37:20 2010 UTC (13 years, 10 months ago) by kristaps
Branch: MAIN
Changes since 1.15: +29 -1 lines
Diff to previous 1.15 (unified) to selected 1.60 (unified)

Modified version of Ingo Schwarze's patch for hyphen-breaking.
Breakable hyphens are cued in the back-ends (with ASCII_HYPH) and acted
upon in term.c or ignored in html.c.

Also cleaned up XML decl printing (no need for extra vars).

Revision 1.15 / (download) - annotate - [select for diffs], Sat May 15 07:01:51 2010 UTC (13 years, 10 months ago) by kristaps
Branch: MAIN
Changes since 1.14: +3 -1 lines
Diff to previous 1.14 (unified) to selected 1.60 (unified)

Documented EOS buffered spaces and added `]'.

Revision 1.14 / (download) - annotate - [select for diffs], Sat May 15 06:48:13 2010 UTC (13 years, 10 months ago) by kristaps
Branch: MAIN
Changes since 1.13: +27 -13 lines
Diff to previous 1.13 (unified) to selected 1.60 (unified)

More EOS: append_delims() fitted with EOS detection, so ANY macro with appended delimiters will properly EOS.
Fixed mandoc_eos() to accept sentence punctuation followed by close-delim buffers.

Revision 1.13 / (download) - annotate - [select for diffs], Fri May 14 14:09:13 2010 UTC (13 years, 10 months ago) by kristaps
Branch: MAIN
Changes since 1.12: +3 -2 lines
Diff to previous 1.12 (unified) to selected 1.60 (unified)

Block-implicit macros now up-propagate end-of-sentence spacing.  NOTE: GROFF IS NOT SMART ENOUGH TO DO THIS.

Revision 1.12 / (download) - annotate - [select for diffs], Wed May 12 17:08:03 2010 UTC (13 years, 10 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_9_25
Changes since 1.11: +24 -1 lines
Diff to previous 1.11 (unified) to selected 1.60 (unified)

Put the eos-checker into libmandoc.h.
Added bits in mdoc.7 and man.7 about EOS spacing.

Revision 1.11 / (download) - annotate - [select for diffs], Wed Apr 7 11:25:38 2010 UTC (13 years, 11 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_9_24, VERSION_1_9_23
Changes since 1.10: +5 -5 lines
Diff to previous 1.10 (unified) to selected 1.60 (unified)

Add support/ignoring of \f(xy, \f[X...], \F(xy, \FX, \F[X...] roff-style font escapes (noted by Frantisek Holop).
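
The three argument forms share one shape: a single character, an opening parenthesis followed by exactly two characters, or a bracketed name.  A minimal sketch of measuring such an argument follows; the name font_arg() and its interface are hypothetical and do not come from mandoc.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: s points just past "\f" or "\F".  Set *name
 * and *len to the font name and return the number of bytes the
 * argument occupies, or 0 if it is malformed.
 */
size_t
font_arg(const char *s, const char **name, size_t *len)
{
	const char	*end;

	assert(s != NULL && name != NULL && len != NULL);
	switch (*s) {
	case '\0':
		return 0;
	case '(':	/* \f(xy: exactly two characters */
		if (s[1] == '\0' || s[2] == '\0')
			return 0;
		*name = s + 1;
		*len = 2;
		return 3;
	case '[':	/* \f[name]: up to the closing bracket */
		if ((end = strchr(s, ']')) == NULL)
			return 0;
		*name = s + 1;
		*len = (size_t)(end - s) - 1;
		return (size_t)(end - s) + 1;
	default:	/* \fX: a single character */
		*name = s;
		*len = 1;
		return 1;
	}
}
```

A parser that only wants to ignore the escape can skip the returned number of bytes; one that wants to act on it can compare the len bytes at name against the font names it knows.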

Revision 1.10 / (download) - annotate - [select for diffs], Tue Jan 5 19:51:10 2010 UTC (14 years, 2 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_9_22, VERSION_1_9_21, VERSION_1_9_20, VERSION_1_9_19, VERSION_1_9_18, VERSION_1_9_17, VERSION_1_9_16, VERSION_1_9_15
Changes since 1.9: +1 -3 lines
Diff to previous 1.9 (unified) to selected 1.60 (unified)

Removed references to `\\' escape (noted by Jason McIntyre, Ingo Schwarze).

Revision 1.9 / (download) - annotate - [select for diffs], Fri Jan 1 17:14:28 2010 UTC (14 years, 2 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_9_15-pre2
Changes since 1.8: +3 -3 lines
Diff to previous 1.8 (unified) to selected 1.60 (unified)

Big check-in of compatibility layer.  This should work on most major architectures. Thanks to Joerg Sonnenberger.

Revision 1.8 / (download) - annotate - [select for diffs], Thu Nov 5 10:16:01 2009 UTC (14 years, 4 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_9_15-pre1, VERSION_1_9_14
Changes since 1.7: +80 -3 lines
Diff to previous 1.7 (unified) to selected 1.60 (unified)

Documented that `\s' and `\f' don't work in HTML mode (and why).
Added support for recognising the many forms of `\s' (doesn't yet render).

Revision 1.7 / (download) - annotate - [select for diffs], Mon Nov 2 06:22:45 2009 UTC (14 years, 4 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_9_13
Changes since 1.6: +63 -1 lines
Diff to previous 1.6 (unified) to selected 1.60 (unified)

Added mandoc_a2time() for proper date conversion.
Fitted TH and Dd handlers to use mandoc_a2time().
Documented date syntax for -man, fixed documentation for -mdoc.

Revision 1.6 / (download) - annotate - [select for diffs], Sat Oct 31 06:10:58 2009 UTC (14 years, 5 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_9_12
Changes since 1.5: +5 -5 lines
Diff to previous 1.5 (unified) to selected 1.60 (unified)

Using perror() instead of fprintf for failure from library functions.

Revision 1.5 / (download) - annotate - [select for diffs], Fri Oct 30 05:58:38 2009 UTC (14 years, 5 months ago) by kristaps
Branch: MAIN
Changes since 1.4: +1 -14 lines
Diff to previous 1.4 (unified) to selected 1.60 (unified)

libmdoc and libman now using non-recoverable allocations (simpler code).

Revision 1.4 / (download) - annotate - [select for diffs], Wed Oct 28 19:21:59 2009 UTC (14 years, 5 months ago) by kristaps
Branch: MAIN
Changes since 1.3: +74 -1 lines
Diff to previous 1.3 (unified) to selected 1.60 (unified)

Slow movement of internal allocations to fail completely.

Revision 1.3 / (download) - annotate - [select for diffs], Fri Jul 24 20:22:24 2009 UTC (14 years, 8 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_9_9, VERSION_1_9_8, VERSION_1_9_7, VERSION_1_9_6, VERSION_1_9_5, VERSION_1_9_2, VERSION_1_9_11, VERSION_1_9_10, VERSION_1_9_1, VERSION_1_9_0, VERSION_1_8_5, VERSION_1_8_4
Changes since 1.2: +3 -1 lines
Diff to previous 1.2 (unified) to selected 1.60 (unified)

Added `sp' support to libman.
Added `\c' to known escapes (only used in man, but still).

Revision 1.2 / (download) - annotate - [select for diffs], Sun Jul 12 09:48:00 2009 UTC (14 years, 8 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_8_3, VERSION_1_8_2, VERSION_1_8_1, VERSION_1_8_0
Changes since 1.1: +3 -1 lines
Diff to previous 1.1 (unified) to selected 1.60 (unified)

Fix for u_char, FreeBSD 7.2 (uqs@spoerlein.net).

Revision 1.1 / (download) - annotate - [select for diffs], Sat Jul 4 09:01:55 2009 UTC (14 years, 8 months ago) by kristaps
Branch: MAIN
CVS Tags: VERSION_1_7_24, VERSION_1_7_23, VERSION_1_7_22, VERSION_1_7_21
Diff to selected 1.60 (unified)

Moved escape validation into libmandoc.h/mandoc.c (common between libman/libmdoc).
libman supports MAN_IGN_ESCAPE (like MDOC_IGN_ESCAPE).
All popular escapes now handled consistently.
