/*
 * Copyright 2015-2021 Howard Chu, Symas Corp.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted only as authorized by the OpenLDAP
 * Public License.
 *
 * A copy of this license is available in the file LICENSE in the
 * top-level directory of the distribution or, alternatively, at
 * <http://www.OpenLDAP.org/license.html>.
 */
/** @page starting Getting Started

LMDB is compact, fast, powerful, and robust, and implements a simplified
variant of the BerkeleyDB (BDB) API. (BDB is also very powerful, and verbosely
documented in its own right.) After reading this page, the main
\ref mdb documentation should make sense. Thanks to Bert Hubert
for creating the
<a href="https://github.com/ahupowerdns/ahutils/blob/master/lmdb-semantics.md">
initial version</a> of this writeup.

Everything starts with an environment, created by #mdb_env_create().
Once created, this environment must also be opened with #mdb_env_open().

#mdb_env_open() takes a path, which is interpreted as a directory.
This directory must already exist; it is not created for you.
Within that directory, a lock file and a data file will be
created. If you don't want to use a directory, you can pass the
#MDB_NOSUBDIR option, in which case the path you provided is used
directly as the data file, and another file with a "-lock" suffix
added will be used for the lock file.

Once the environment is open, a transaction can be created within it
using #mdb_txn_begin(). Transactions may be read-write or read-only,
and read-write transactions may be nested. A transaction must only
be used by one thread at a time. Transactions are always required,
even for read-only access. The transaction provides a consistent
view of the data.

Once a transaction has been created, a database can be opened within it
using #mdb_dbi_open(). If only one database will ever be used in the
environment, NULL can be passed as the database name. For named
databases, the #MDB_CREATE flag must be used to create the database
if it doesn't already exist. Also, #mdb_env_set_maxdbs() must be
called after #mdb_env_create() and before #mdb_env_open() to set the
maximum number of named databases you want to support.

Note: a single transaction can open multiple databases. Generally,
databases should only be opened once, by the first transaction in
the process. After the first transaction completes, the database
handles can freely be used by all subsequent transactions.

Within a transaction, #mdb_get() and #mdb_put() can retrieve and store
single key/value pairs if that is all you need to do (but see \ref Cursors
below if you want to do more).

A key/value pair is expressed as two #MDB_val structures. This struct
has two fields, \c mv_size and \c mv_data. \c mv_data is a \c void pointer to
an array of \c mv_size bytes.

Because LMDB is very efficient (and usually zero-copy), the data returned
in an #MDB_val structure may be memory-mapped straight from disk. In
other words, <b>look but do not touch</b> (or free() for that matter).
Once a transaction is closed, the values can no longer be used, so
make a copy if you need to keep them after that.

@section Cursors Cursors

To do more powerful things, we must use a cursor.

Within the transaction, a cursor can be created with #mdb_cursor_open().
With this cursor we can store/retrieve/delete (multiple) values using
#mdb_cursor_get(), #mdb_cursor_put(), and #mdb_cursor_del().

#mdb_cursor_get() positions the cursor depending on the cursor operation
requested, and for some operations, on the supplied key. For example,
to list all key/value pairs in a database, use operation #MDB_FIRST for
the first call to #mdb_cursor_get(), and #MDB_NEXT on subsequent calls,
until the end is hit and #MDB_NOTFOUND is returned.

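To retrieve all keys starting from a specified key value, use #MDB_SET,
which requires that key to exist, or #MDB_SET_RANGE, which positions at
the first key greater than or equal to the specified key.
For more cursor operations, see the \ref mdb docs.
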
When using #mdb_cursor_put(), either the function will position the
cursor for you based on the \b key, or you can use operation
#MDB_CURRENT to use the current position of the cursor. Note that
\b key must then match the current position's key.

@subsection summary Summarizing the Opening

So we have a cursor in a transaction which opened a database in an
environment which is opened from a filesystem after it was
separately created.

Or, we create an environment, open it from a filesystem, create a
transaction within it, open a database within that transaction,
and create a cursor within all of the above.

Got it?

@section thrproc Threads and Processes

LMDB uses POSIX locks on files, and these locks have issues if one
process opens a file multiple times. Because of this, do not
#mdb_env_open() a file multiple times from a single process. Instead,
share the LMDB environment that has opened the file across all threads.
Otherwise, if a single process opens the same environment multiple times,
closing it once will remove all the locks held on it, and the other
instances will be vulnerable to corruption from other processes.

Also note that a transaction is tied to one thread by default using
Thread Local Storage. If you want to pass read-only transactions across
threads, you can use the #MDB_NOTLS option on the environment.

@section txns Transactions, Rollbacks, etc.

To actually get anything done, a transaction must be committed using
#mdb_txn_commit(). Alternatively, all of a transaction's operations
can be discarded using #mdb_txn_abort(). In a read-only transaction,
any cursors will \b not automatically be freed. In a read-write
transaction, all cursors will be freed and must not be used again.

For read-only transactions, obviously there is nothing to commit to
storage. The transaction still must eventually be aborted to close
any database handle(s) opened in it, or committed to keep the
database handles around for reuse in new transactions.

In addition, as long as a transaction is open, a consistent view of
the database is kept alive, which requires storage. A read-only
transaction that no longer requires this consistent view should
be terminated (committed or aborted) when the view is no longer
needed (but see below for an optimization).

There can be multiple simultaneously active read-only transactions
but only one that can write. Once a single read-write transaction
is opened, all further attempts to begin one will block until the
first one is committed or aborted. This has no effect on read-only
transactions, however, and they may continue to be opened at any time.

@section dupkeys Duplicate Keys

#mdb_get() has no support, and #mdb_put() only limited support,
for multiple key/value pairs with identical keys. If there are multiple
values for a key, #mdb_get() will only return the first value.

When multiple values for one key are required, pass the #MDB_DUPSORT
flag to #mdb_dbi_open(). In an #MDB_DUPSORT database, by default
#mdb_put() will not replace the value for a key if the key existed
already. Instead it will add the new value to the key. In addition,
#mdb_del() will pay attention to the value field too, allowing for
specific values of a key to be deleted.

Finally, additional cursor operations become available for
traversing through and retrieving duplicate values.

@section optim Some Optimization

If you frequently begin and abort read-only transactions, as an
optimization, it is possible to only reset and renew a transaction.

#mdb_txn_reset() releases any old copies of data kept around for
a read-only transaction. To reuse this reset transaction, call
#mdb_txn_renew() on it. Any cursors in this transaction must also
be renewed using #mdb_cursor_renew().

Note that #mdb_txn_reset() is similar to #mdb_txn_abort() and will
close any databases you opened within the transaction.

To permanently free a transaction, reset or not, use #mdb_txn_abort().

@section cleanup Cleaning Up

In a read-only transaction, any cursors created within it must
be closed using #mdb_cursor_close().

It is very rarely necessary to close a database handle, and in
general they should just be left open.

@section onward The Full API

The full \ref mdb documentation lists further details, like how to:

  \li size a database (the default limits are intentionally small)
  \li drop and clean a database
  \li detect and report errors
  \li optimize (bulk) loading speed
  \li (temporarily) reduce robustness to gain even more speed
  \li gather statistics about the database
  \li define custom sort orders

*/