1*f14fb602SLionel Sambuc.\" $NetBSD: btree.3,v 1.13 2012/10/13 15:28:33 njoly Exp $ 22639ae9bSBen Gras.\" 32639ae9bSBen Gras.\" Copyright (c) 1990, 1993 42639ae9bSBen Gras.\" The Regents of the University of California. All rights reserved. 52639ae9bSBen Gras.\" 62639ae9bSBen Gras.\" Redistribution and use in source and binary forms, with or without 72639ae9bSBen Gras.\" modification, are permitted provided that the following conditions 82639ae9bSBen Gras.\" are met: 92639ae9bSBen Gras.\" 1. Redistributions of source code must retain the above copyright 102639ae9bSBen Gras.\" notice, this list of conditions and the following disclaimer. 112639ae9bSBen Gras.\" 2. Redistributions in binary form must reproduce the above copyright 122639ae9bSBen Gras.\" notice, this list of conditions and the following disclaimer in the 132639ae9bSBen Gras.\" documentation and/or other materials provided with the distribution. 142639ae9bSBen Gras.\" 3. Neither the name of the University nor the names of its contributors 152639ae9bSBen Gras.\" may be used to endorse or promote products derived from this software 162639ae9bSBen Gras.\" without specific prior written permission. 172639ae9bSBen Gras.\" 182639ae9bSBen Gras.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND 192639ae9bSBen Gras.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 202639ae9bSBen Gras.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 212639ae9bSBen Gras.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE 222639ae9bSBen Gras.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 232639ae9bSBen Gras.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 242639ae9bSBen Gras.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 252639ae9bSBen Gras.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 262639ae9bSBen Gras.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 272639ae9bSBen Gras.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 282639ae9bSBen Gras.\" SUCH DAMAGE. 292639ae9bSBen Gras.\" 302639ae9bSBen Gras.\" @(#)btree.3 8.4 (Berkeley) 8/18/94 312639ae9bSBen Gras.\" 322639ae9bSBen Gras.Dd April 17, 2003 332639ae9bSBen Gras.Dt BTREE 3 342639ae9bSBen Gras.Os 352639ae9bSBen Gras.Sh NAME 362639ae9bSBen Gras.Nm btree 372639ae9bSBen Gras.Nd btree database access method 382639ae9bSBen Gras.Sh SYNOPSIS 392639ae9bSBen Gras.In sys/types.h 402639ae9bSBen Gras.In db.h 412639ae9bSBen Gras.Sh DESCRIPTION 422639ae9bSBen GrasThe routine 432639ae9bSBen Gras.Fn dbopen 442639ae9bSBen Grasis the library interface to database files. 452639ae9bSBen GrasOne of the supported file formats is btree files. 462639ae9bSBen GrasThe general description of the database access methods is in 472639ae9bSBen Gras.Xr dbopen 3 , 482639ae9bSBen Grasthis manual page describes only the btree specific information. 492639ae9bSBen Gras.Pp 502639ae9bSBen GrasThe btree data structure is a sorted, balanced tree structure storing 512639ae9bSBen Grasassociated key/data pairs. 522639ae9bSBen Gras.Pp 532639ae9bSBen GrasThe btree access method specific data structure provided to 542639ae9bSBen Gras.Fn dbopen 552639ae9bSBen Grasis defined in the 562639ae9bSBen Gras.In db.h 572639ae9bSBen Grasinclude file as follows: 582639ae9bSBen Gras.Bd -literal 592639ae9bSBen Grastypedef struct { 602639ae9bSBen Gras u_long flags; 612639ae9bSBen Gras u_int cachesize; 622639ae9bSBen Gras int maxkeypage; 632639ae9bSBen Gras int minkeypage; 642639ae9bSBen Gras u_int psize; 652639ae9bSBen Gras int (*compare)(const DBT *key1, const DBT *key2); 662639ae9bSBen Gras size_t (*prefix)(const DBT *key1, const DBT *key2); 672639ae9bSBen Gras int lorder; 682639ae9bSBen Gras} BTREEINFO; 692639ae9bSBen Gras.Ed 702639ae9bSBen Gras.Pp 712639ae9bSBen GrasThe elements of this structure are as follows: 722639ae9bSBen Gras.Bl -tag -width maxkeypagex 732639ae9bSBen Gras.It Fa flags 742639ae9bSBen GrasThe flag value is specified by or'ing any of the following values: 752639ae9bSBen Gras.Bl -tag -width R_DUP -offset indent 762639ae9bSBen Gras.It Dv R_DUP 772639ae9bSBen GrasPermit duplicate keys in the tree, i.e. permit insertion if the key to 782639ae9bSBen Grasbe inserted already exists in the tree. 792639ae9bSBen GrasThe default behavior, as described in 802639ae9bSBen Gras.Xr dbopen 3 , 812639ae9bSBen Grasis to overwrite a matching key when inserting a new key or to fail if 822639ae9bSBen Grasthe 832639ae9bSBen Gras.Dv R_NOOVERWRITE 842639ae9bSBen Grasflag is specified. 852639ae9bSBen GrasThe 862639ae9bSBen Gras.Dv R_DUP 872639ae9bSBen Grasflag is overridden by the 882639ae9bSBen Gras.Dv R_NOOVERWRITE 892639ae9bSBen Grasflag, and if the 902639ae9bSBen Gras.Dv R_NOOVERWRITE 912639ae9bSBen Grasflag is specified, attempts to insert duplicate keys into the tree 922639ae9bSBen Graswill fail. 932639ae9bSBen Gras.Pp 942639ae9bSBen GrasIf the database contains duplicate keys, the order of retrieval of 952639ae9bSBen Graskey/data pairs is undefined if the 962639ae9bSBen Gras.Em get 972639ae9bSBen Grasroutine is used, however, 982639ae9bSBen Gras.Em seq 992639ae9bSBen Grasroutine calls with the 1002639ae9bSBen Gras.Dv R_CURSOR 1012639ae9bSBen Grasflag set will always return the logical 1022639ae9bSBen Gras.Dq first 1032639ae9bSBen Grasof any group of duplicate keys. 1042639ae9bSBen Gras.El 1052639ae9bSBen Gras.It Fa cachesize 1062639ae9bSBen GrasA suggested maximum size (in bytes) of the memory cache. 1072639ae9bSBen GrasThis value is 1082639ae9bSBen Gras.Em only 1092639ae9bSBen Grasadvisory, and the access method will allocate more memory rather than 1102639ae9bSBen Grasfail. 1112639ae9bSBen GrasSince every search examines the root page of the tree, caching the 1122639ae9bSBen Grasmost recently used pages substantially improves access time. 1132639ae9bSBen GrasIn addition, physical writes are delayed as long as possible, so a 1142639ae9bSBen Grasmoderate cache can reduce the number of I/O operations significantly. 1152639ae9bSBen GrasObviously, using a cache increases (but only increases) the likelihood 1162639ae9bSBen Grasof corruption or lost data if the system crashes while a tree is being 1172639ae9bSBen Grasmodified. 1182639ae9bSBen GrasIf 1192639ae9bSBen Gras.Fa cachesize 1202639ae9bSBen Grasis 0 (no size is specified) a default cache is used. 1212639ae9bSBen Gras.It Fa maxkeypage 1222639ae9bSBen GrasThe maximum number of keys which will be stored on any single page. 1232639ae9bSBen GrasNot currently implemented. 1242639ae9bSBen Gras.\" The maximum number of keys which will be stored on any single page. 1252639ae9bSBen Gras.\" Because of the way the btree data structure works, 1262639ae9bSBen Gras.\" .Fa maxkeypage 1272639ae9bSBen Gras.\" must always be greater than or equal to 2. 1282639ae9bSBen Gras.\" If 1292639ae9bSBen Gras.\" .Fa maxkeypage 1302639ae9bSBen Gras.\" is 0 (no maximum number of keys is specified) the page fill factor is 1312639ae9bSBen Gras.\" made as large as possible (which is almost invariably what is wanted). 1322639ae9bSBen Gras.It Fa minkeypage 1332639ae9bSBen GrasThe minimum number of keys which will be stored on any single page. 1342639ae9bSBen GrasThis value is used to determine which keys will be stored on overflow 1352639ae9bSBen Graspages, i.e., if a key or data item is longer than the pagesize divided 1362639ae9bSBen Grasby the 1372639ae9bSBen Gras.Fa minkeypage 1382639ae9bSBen Grasvalue, it will be stored on overflow pages instead of in the page 1392639ae9bSBen Grasitself. 1402639ae9bSBen GrasIf 1412639ae9bSBen Gras.Fa minkeypage 1422639ae9bSBen Grasis 0 (no minimum number of keys is specified) a value of 2 is used. 1432639ae9bSBen Gras.It Fa psize 1442639ae9bSBen GrasPage size is the size (in bytes) of the pages used for nodes in the 1452639ae9bSBen Grastree. 1462639ae9bSBen GrasThe minimum page size is 512 bytes and the maximum page size is 64K. 1472639ae9bSBen GrasIf 1482639ae9bSBen Gras.Fa psize 1492639ae9bSBen Grasis 0 (no page size is specified) a page size is chosen based on the 1502639ae9bSBen Grasunderlying file system I/O block size. 1512639ae9bSBen Gras.It Fa compare 1522639ae9bSBen GrasCompare is the key comparison function. 1532639ae9bSBen GrasIt must return an integer less than, equal to, or greater than zero if 1542639ae9bSBen Grasthe first key argument is considered to be respectively less than, 1552639ae9bSBen Grasequal to, or greater than the second key argument. 1562639ae9bSBen GrasThe same comparison function must be used on a given tree every time 1572639ae9bSBen Grasit is opened. 1582639ae9bSBen GrasIf 1592639ae9bSBen Gras.Fa compare 1602639ae9bSBen Grasis 1612639ae9bSBen Gras.Dv NULL 1622639ae9bSBen Gras(no comparison function is specified), the keys are compared 1632639ae9bSBen Graslexically, with shorter keys considered less than longer keys. 1642639ae9bSBen Gras.It Fa prefix 1652639ae9bSBen GrasPrefix is the prefix comparison function. 1662639ae9bSBen GrasIf specified, this routine must return the number of bytes of the 1672639ae9bSBen Grassecond key argument which are necessary to determine that it is 1682639ae9bSBen Grasgreater than the first key argument. 1692639ae9bSBen GrasIf the keys are equal, the key length should be returned. 1702639ae9bSBen GrasNote, the usefulness of this routine is very data dependent, but, in 1712639ae9bSBen Grassome data sets can produce significantly reduced tree sizes and search 1722639ae9bSBen Grastimes. 1732639ae9bSBen GrasIf 1742639ae9bSBen Gras.Fa prefix 1752639ae9bSBen Grasis 1762639ae9bSBen Gras.Dv NULL 1772639ae9bSBen Gras(no prefix function is specified), 1782639ae9bSBen Gras.Em and 1792639ae9bSBen Grasno comparison function is specified, a default lexical comparison 1802639ae9bSBen Grasroutine is used. 1812639ae9bSBen GrasIf 1822639ae9bSBen Gras.Fa prefix 1832639ae9bSBen Grasis 1842639ae9bSBen Gras.Dv NULL 1852639ae9bSBen Grasand a comparison routine is specified, no prefix comparison is done. 1862639ae9bSBen Gras.It Fa lorder 1872639ae9bSBen GrasThe byte order for integers in the stored database metadata. 1882639ae9bSBen GrasThe number should represent the order as an integer; for example, 1892639ae9bSBen Grasbig endian order would be the number 4,321. 1902639ae9bSBen GrasIf 1912639ae9bSBen Gras.Fa lorder 1922639ae9bSBen Grasis 0 (no order is specified) the current host order is used. 1932639ae9bSBen Gras.El 1942639ae9bSBen Gras.Pp 1952639ae9bSBen GrasIf the file already exists (and the 1962639ae9bSBen Gras.Dv O_TRUNC 1972639ae9bSBen Grasflag is not specified), the values specified for the parameters flags, 1982639ae9bSBen Graslorder and psize are ignored in favor of the values used when the tree 1992639ae9bSBen Graswas created. 2002639ae9bSBen Gras.Pp 2012639ae9bSBen GrasForward sequential scans of a tree are from the least key to the 2022639ae9bSBen Grasgreatest. 2032639ae9bSBen Gras.Pp 2042639ae9bSBen GrasSpace freed up by deleting key/data pairs from the tree is never 2052639ae9bSBen Grasreclaimed, although it is normally made available for reuse. 2062639ae9bSBen GrasThis means that the btree storage structure is grow-only. 2072639ae9bSBen GrasThe only solutions are to avoid excessive deletions, or to create a 2082639ae9bSBen Grasfresh tree periodically from a scan of an existing one. 2092639ae9bSBen Gras.Pp 2102639ae9bSBen GrasSearches, insertions, and deletions in a btree will all complete in 2112639ae9bSBen GrasO lg base N where base is the average fill factor. 2122639ae9bSBen GrasOften, inserting ordered data into btrees results in a low fill 2132639ae9bSBen Grasfactor. 2142639ae9bSBen GrasThis implementation has been modified to make ordered insertion the 2152639ae9bSBen Grasbest case, resulting in a much better than normal page fill factor. 2162639ae9bSBen Gras.Sh ERRORS 2172639ae9bSBen GrasThe 2182639ae9bSBen Gras.Nm 2192639ae9bSBen Grasaccess method routines may fail and set 2202639ae9bSBen Gras.Va errno 2212639ae9bSBen Grasfor any of the errors specified for the library routine 2222639ae9bSBen Gras.Xr dbopen 3 . 2232639ae9bSBen Gras.Sh SEE ALSO 2242639ae9bSBen Gras.Xr dbopen 3 , 2252639ae9bSBen Gras.Xr hash 3 , 2262639ae9bSBen Gras.Xr mpool 3 , 2272639ae9bSBen Gras.Xr recno 3 2282639ae9bSBen Gras.Rs 2292639ae9bSBen Gras.%T "The Ubiquitous B-tree" 2302639ae9bSBen Gras.%A "Douglas Comer" 2312639ae9bSBen Gras.%J "ACM Comput. Surv." 2322639ae9bSBen Gras.%V 2 2332639ae9bSBen Gras.%N 11 2342639ae9bSBen Gras.%D June 1979 2352639ae9bSBen Gras.%P 121-138 2362639ae9bSBen Gras.Re 2372639ae9bSBen Gras.Rs 2382639ae9bSBen Gras.%T "Prefix B-trees" 2392639ae9bSBen Gras.%A "Bayer" 2402639ae9bSBen Gras.%A "Unterauer" 2412639ae9bSBen Gras.%J "ACM Transactions on Database Systems" 2422639ae9bSBen Gras.%V Vol. 2 2432639ae9bSBen Gras.%N 1 2442639ae9bSBen Gras.%D March 1977 2452639ae9bSBen Gras.%P 11-26 2462639ae9bSBen Gras.Re 2472639ae9bSBen Gras.Rs 2482639ae9bSBen Gras.%B "The Art of Computer Programming Vol. 3: Sorting and Searching" 2492639ae9bSBen Gras.%A "D.E. Knuth" 2502639ae9bSBen Gras.%D 1968 2512639ae9bSBen Gras.%P 471-480 2522639ae9bSBen Gras.Re 2532639ae9bSBen Gras.Sh BUGS 2542639ae9bSBen GrasOnly big and little endian byte order is supported. 255