===================================================================
RCS file: /cvs/mandoc/mandoc.db.5,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -p -r1.4 -r1.5
--- mandoc/mandoc.db.5 2016/07/07 14:35:48 1.4
+++ mandoc/mandoc.db.5 2016/08/01 12:27:15 1.5
@@ -1,6 +1,6 @@
-.\" $Id: mandoc.db.5,v 1.4 2016/07/07 14:35:48 schwarze Exp $
+.\" $Id: mandoc.db.5,v 1.5 2016/08/01 12:27:15 schwarze Exp $
 .\"
-.\" Copyright (c) 2014 Ingo Schwarze
+.\" Copyright (c) 2014, 2016 Ingo Schwarze
 .\"
 .\" Permission to use, copy, modify, and distribute this software for any
 .\" purpose with or without fee is hereby granted, provided that the above
@@ -14,7 +14,7 @@
 .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 .\"
-.Dd $Mdocdate: July 7 2016 $
+.Dd $Mdocdate: August 1 2016 $
 .Dt MANDOC.DB 5
 .Os
 .Sh NAME
@@ -23,7 +23,7 @@
 .Sh DESCRIPTION
 The
 .Nm
-SQLite3 file format is used to store information about installed manual
+file format is used to store information about installed manual
 pages to facilitate semantic searching for manuals.
 Each manual page tree contains its own
 .Nm
@@ -34,88 +34,157 @@ for examples.
 Such database files are generated by
 .Xr makewhatis 8
 and used by
+.Xr man 1 ,
 .Xr apropos 1
 and
 .Xr whatis 1 .
 .Pp
-One line in the following tables describes:
-.Bl -tag -width Ds
-.It Sy mpages
-One physical manual page file, no matter how many times and under which
-names it may appear in the file system.
-.It Sy mlinks
-One entry in the file system, no matter which content it points to.
-.It Sy names
-One manual page name, no matter whether it appears in a page header,
-in a NAME or SYNOPSIS section, or as a file name.
-.It Sy keys
-One chunk of text from some macro invocation.
+The file format uses three datatypes:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+32-bit signed integer numbers in big endian (network) byte ordering
+.It
+NUL-terminated strings
+.It
+lists of NUL-terminated strings, terminated by a second NUL character
 .El
 .Pp
-Each record in the latter three tables uses its
-.Va pageid
-column to point to a record in the
-.Sy mpages
-table.
+Numbers are aligned to four-byte boundaries; where they follow
+strings or lists of strings, padding with additional NUL characters
+occurs.
+Some, but not all, numbers point to positions in the file.
+These pointers are measured in bytes, and the first byte of the
+file is considered to be byte 0.
 .Pp
-The other columns are as follows; unless stated otherwise, they are
-of type
-.Vt TEXT .
-.Bl -tag -width mpages.desc
-.It Sy mpages.desc
-The description line
-.Pq Sq \&Nd
-of the page.
-.It Sy mpages.form
-An
-.Vt INTEGER
-bit field.
-If bit
-.Dv FORM_GZ
-is set, the page is compressed and requires
-.Xr gunzip 1
-for display.
-If bit
-.Dv FORM_SRC
-is set, the page is unformatted, that is in
+Each file consists of:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+One magic number, 0x3a7d0cdb.
+.It
+One version number, currently 1.
+.It
+One pointer to the macros table.
+.It
+One pointer to the final magic number.
+.It
+The pages table (variable length).
+.It
+The macros table (variable length).
+.It
+The magic number once again, 0x3a7d0cdb.
+.El
+.Pp
+The pages table contains one entry for each physical manual page
+file, no matter how many hard and soft links it may have in the
+file system.
+The pages table consists of:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+The number of pages in the database.
+.It
+For each page:
+.Bl -dash -compact -offset 2n -width 1n
+.It
+One pointer to the list of names.
+.It
+One pointer to the list of sections.
+.It
+One pointer to the list of architectures
+or 0 if the page is machine-independent.
+.It
+One pointer to the one-line description string.
+.It
+One pointer to the list of filenames.
+.El
+.It
+For each page, the list of names.
+Each name is preceded by a single byte indicating the sources of the name.
+The meaning of the bits is:
+.Bl -dash -compact -offset 2n -width 1n
+.It
+0x10: The name appears in a filename.
+.It
+0x08: The name appears in a header line, i.e. in a .Dt or .TH macro.
+.It
+0x04: The name is the first one in the title line, i.e. it appears
+in the first .Nm macro in the NAME section.
+.It
+0x02: The name appears in any .Nm macro in the NAME section.
+.It
+0x01: The name appears in an .Nm block in the SYNOPSIS section.
+.El
+.It
+For each page, the list of sections.
+Each section is given as a string, not as a number.
+.It
+For each architecture-dependent page, the list of architectures.
+.It
+For each page, the one-line description string taken from the .Nd macro.
+.It
+For each page, the list of filenames relative to the root of the
+respective manpath.
+This list includes hard links, soft links, and links simulated
+with .so
+.Xr roff 7
+requests.
+The first filename is preceded by a single byte
+having the following significance:
+.Bl -dash -compact -offset 2n -width 1n
+.It
+.Dv FORM_SRC No = 0x01 :
+The file format is
 .Xr mdoc 7
 or
-.Xr man 7
-format, and requires
-.Xr mandoc 1
-for display.
-If bit
-.Dv FORM_SRC
-is not set, the page is formatted, i.e. a
-.Sq cat
-page.
-.It Sy mlinks.sec
-The manual section as found in the subdirectory name.
-.It Sy mlinks.arch
-The manual architecture as found in the subdirectory name, or
-.Qq any .
-.It Sy mlinks.name
-The manual name as found in the file name.
-.It Sy names.bits
-An
-.Vt INTEGER
-bit mask telling whether the name came from a header line, from the
-NAME or SYNOPSIS section, or from a file name.
-Bits are defined in
-.In mansearch.h .
-.It Sy names.name
-The name itself.
-.It Sy keys.bits
-An
-.Vt INTEGER
-bit mask telling which semantic contexts the key was found in;
-defined in
-.In mansearch.h ,
-documented in
+.Xr man 7 .
+.It
+.Dv FORM_CAT No = 0x02 :
+The manual page is preformatted.
+.El
+.It
+Zero to three NUL bytes for padding.
+.El
+.Pp
+The macros table consists of:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+The number of different macro keys, currently 36.
+The ordering of macros is defined in
+.In mansearch.h
+and the significance of the macro keys is documented in
 .Xr apropos 1 .
-.It Sy keys.key
-The string found in those contexts.
+.It
+For each macro key, one pointer to the respective macro table.
+.It
+For each macro key, the macro table (variable length).
 .El
+.Pp
+Each macro table consists of:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+The number of entries in the table.
+.It
+For each entry:
+.Bl -dash -compact -offset 2n -width 1n
+.It
+One pointer to the value of the macro key.
+Each value is a string of text taken from some macro invocation.
+.It
+One pointer to the list of pages.
+.El
+.It
+For each entry, the value of the macro key.
+.It
+Zero to three NUL bytes for padding.
+.It
+For each entry, one or more pointers to pages in the pages table,
+pointing to the pointer to the list of names,
+followed by the number 0.
+.El
 .Sh FILES
 .Bl -tag -width /usr/share/man/mandoc.db -compact
 .It Pa /usr/share/man/mandoc.db
@@ -128,10 +197,16 @@ Window System.
 The same for
 .Xr packages 7 .
 .El
+.Pp
+A program to dump
+.Nm
+files in a human-readable format suitable for
+.Xr diff 1
+is provided in the directory
+.Pa /usr/src/regress/usr.bin/mandoc/db/dbm_dump/ .
 .Sh SEE ALSO
 .Xr apropos 1 ,
 .Xr man 1 ,
-.Xr sqlite3 1 ,
 .Xr whatis 1 ,
 .Xr makewhatis 8
 .Sh HISTORY
@@ -140,7 +215,7 @@ A manual page database
 first appeared in
 .Bx 2 .
 The present format first appeared in
-.Ox 5.6 .
+.Ox 6.1 .
 .Sh AUTHORS
 .An -nosplit
 The original version of
@@ -148,9 +223,6 @@ The original version of
 was written by
 .An Bill Joy
 in 1979.
-An SQLite3 version was first implemented by
-.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv
-in 2012.
 The present database format was designed by
 .An Ingo Schwarze Aq Mt schwarze@openbsd.org
-in 2014.
+in 2016.
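
The file layout introduced by this revision can be illustrated with a small reader. The following is a minimal Python sketch, not the mandoc implementation: the function names (`read_int`, `read_str`, `read_list`, `read_names`, `parse_pages`) and the dictionary keys are hypothetical, and it assumes that the pages table begins at byte 16, directly after the four header numbers, which is how the new DESCRIPTION text reads. It covers only the header and the pages table; the macros table is not parsed.

```python
import struct

MAGIC = 0x3a7d0cdb  # magic number stated in the new DESCRIPTION text


def read_int(buf, pos):
    """Read one 32-bit signed big-endian integer at byte offset pos."""
    return struct.unpack_from(">i", buf, pos)[0]


def read_str(buf, pos):
    """Read one NUL-terminated string; return it and the offset after the NUL."""
    end = buf.index(b"\0", pos)
    return buf[pos:end].decode("utf-8", "replace"), end + 1


def read_list(buf, pos):
    """Read a list of NUL-terminated strings ended by a second NUL."""
    items = []
    while buf[pos] != 0:
        item, pos = read_str(buf, pos)
        items.append(item)
    return items


def read_names(buf, pos):
    """Read the list of names; each name follows one byte of source bits."""
    names = []
    while buf[pos] != 0:
        bits = buf[pos]
        name, pos = read_str(buf, pos + 1)
        names.append((bits, name))
    return names


def parse_pages(buf):
    """Check both magic numbers and the version, then decode the pages table.

    Assumption: the pages table starts at byte 16, right after the header.
    """
    if read_int(buf, 0) != MAGIC:
        raise ValueError("bad initial magic number")
    if read_int(buf, 4) != 1:
        raise ValueError("unsupported version")
    if read_int(buf, read_int(buf, 12)) != MAGIC:
        raise ValueError("bad final magic number")
    npages = read_int(buf, 16)
    pages = []
    for i in range(npages):
        # Five pointers per page: names, sections, architectures,
        # description, filenames.
        p_names, p_sects, p_arch, p_desc, p_files = struct.unpack_from(
            ">5i", buf, 20 + 20 * i)
        pages.append({
            "names": read_names(buf, p_names),
            "sections": read_list(buf, p_sects),
            # A zero pointer marks a machine-independent page.
            "archs": read_list(buf, p_arch) if p_arch else [],
            "desc": read_str(buf, p_desc)[0],
            # One form byte (FORM_SRC = 0x01, FORM_CAT = 0x02) precedes
            # the first filename.
            "form": buf[p_files],
            "files": read_list(buf, p_files + 1),
        })
    return pages
```

Under these assumptions, calling `parse_pages(open(path, "rb").read())` on a database file would return one dictionary per physical manual page. For inspecting real databases, the dump program mentioned in the FILES section of the diff is the authoritative tool.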