# $OpenLDAP: pkg/openldap-guide/admin/tuning.sdf,v 1.9.2.8 2009/01/22 00:00:47 kurt Exp $
# Copyright 1999-2009 The OpenLDAP Foundation, All Rights Reserved.
# COPYING RESTRICTIONS APPLY, see COPYRIGHT.

H1: Tuning

This is perhaps one of the most important chapters in the guide, because if
you have not tuned {{slapd}}(8) correctly or grasped how to design your
directory and environment, you can expect very poor performance.

Reading, understanding, and experimenting with the instructions and information
in the following sections will enable you to tailor your directory server to
your specific requirements.

It should be noted that the following information has been collected over time
from our community-based FAQ, so the reader gains the benefit of this
real-world experience and advice.


H2: Performance Factors

Various factors can play a part in how your directory performs on your chosen
hardware and environment. We will attempt to discuss these here.


H3: Memory

Scale your cache to use available memory, and increase system memory if you can.

See {{SECT:Caching}} for more details.


H3: Disks

Use fast disk subsystems. Put each database and its logs on separate disks,
configurable via {{DB_CONFIG}}:

> # Data Directory
> set_data_dir /data/db
>
> # Transaction Log settings
> set_lg_dir /logs


H3: Network Topology

http://www.openldap.org/faq/data/cache/363.html

Drawing here.


H3: Directory Layout Design

Reference to other sections and good/bad drawing here.


H3: Expected Usage

Discussion.


H2: Indexes

H3: Understanding how a search works

If you're searching on a filter that has been indexed, then the search reads
the index and pulls exactly the entries that are referenced by the index.
If the filter term has not been indexed, then the search must read every single
entry in the target scope and test to see if each entry matches the filter.
Obviously indexing can save a lot of work when it's used correctly.

H3: What to index

You should create indices to match the actual filter terms used in
search queries.

> index cn,sn,givenname,mail eq

Each attribute index can be tuned further by selecting the set of index types
to generate. For example, substring and approximate search for organizations
({{o}}) may make little sense (and isn't likely done very often), and searching
for {{userPassword}} likely makes no sense whatsoever.
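As an illustration only (the attribute mix and index types shown here are
hypothetical, not a recommendation), such per-attribute tuning might look like:

> # exact-match lookups only
> index uid,mail eq
> # name attributes that users often search by partial string
> index cn,sn,givenname eq,sub
> # most applications filter on objectClass with equality matches
> index objectClass eq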
General rule: don't go overboard with indexes. Unused indexes must be
maintained and hence can only slow things down.

See {{slapd.conf}}(5) and {{slapindex}}(8) for more information.


H3: Presence indexing

If your client application uses presence filters and if the
target attribute exists on the majority of entries in your target scope, then
all of those entries are going to be read anyway, because they are valid
members of the result set. In a subtree where 100% of the
entries are going to contain the same attributes, the presence index does
absolutely NOTHING to benefit the search, because 100% of the entries match
that presence filter.

So the resource cost of generating the index is a
complete waste of CPU time, disk, and memory. Don't do it unless you know
that it will be used, and that the attribute in question occurs very
infrequently in the target data.

Almost no applications use presence filters in their search queries. Presence
indexing is pointless when the target attribute exists on the majority of
entries in the database. In most LDAP deployments, presence indexing should
not be done; it's just wasted overhead.

See the {{Logging}} section below for what to watch out for if you have a
frequently searched-for attribute that is unindexed.


H2: Logging

H3: What log level to use

The default of {{loglevel stats}} (256) is really the best bet. There's a
corollary to this: when problems *do* arise, don't try to trace them using
syslog. Use the debug flag instead, and capture slapd's stderr output. syslog
is too slow for debug tracing, and it's inherently lossy - it will throw away
messages when it can't keep up.
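For example (the log path is hypothetical; adjust to taste), you can run slapd
in the foreground at the stats debug level and redirect stderr to a file:

> slapd -d 256 2> /tmp/slapd-debug.log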
Contrary to popular belief, {{loglevel 0}} is not ideal for production, as you
won't be able to track when problems first arise.

H3: What to watch out for

The most common message you'll see that you should pay attention to is:

> "<= bdb_equality_candidates: (foo) index_param failed (18)"

That means that some application tried to use an equality filter ({{foo=<somevalue>}})
and attribute {{foo}} does not have an equality index. If you see a lot of these
messages, you should add the index. If you see one every month or so, it may
be acceptable to ignore it.

The default syslog level is stats (256), which logs the basic parameters of each
request; it usually produces 1-3 lines of output. On Solaris and systems that
only provide synchronous syslog, you may want to turn it off completely, but
usually you want to leave it enabled so that you'll be able to see index
messages whenever they arise. On Linux you can configure syslogd to run
asynchronously, in which case the performance hit for moderate syslog traffic
pretty much disappears.

H3: Improving throughput

You can improve logging performance on some systems by configuring syslog not
to sync the file system with every write ({{man syslogd/syslog.conf}}). In Linux,
you can prepend the log file name with a "-" in {{syslog.conf}}. For example,
if you are using the default LOCAL4 logging you could try:

> # LDAP logs
> LOCAL4.* -/var/log/ldap

For syslog-ng, add or modify the following line in {{syslog-ng.conf}}:

> options { sync(n); };

where n is the number of lines which will be buffered before a write.


H2: Caching

We all know what caching is, don't we?

In brief, "A cache is a block of memory for temporary storage of data likely
to be used again" - {{URL:http://en.wikipedia.org/wiki/Cache}}

There are three types of caches: BerkeleyDB's own cache, the {{slapd}}(8)
entry cache, and the {{TERM:IDL}} cache.


H3: Berkeley DB Cache

There are two ways to tune the BDB cache size:

(a) the BDB cache size necessary to load the database via slapadd in optimal time

(b) the BDB cache size necessary for a high-performance running slapd once the data is loaded

For (a), the optimal cachesize is the size of the entire database. If you
already have the database loaded, this is simply a

> du -c -h *.bdb

in the directory containing the OpenLDAP data ({{/usr/local/var/openldap-data}}).

For (b), the optimal cachesize is just the size of the {{id2entry.bdb}} file,
plus about 10% for growth.

The tuning of {{DB_CONFIG}} should be done for each BDB type database
instantiated (back-bdb, back-hdb).
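As a starting point (the size here is purely illustrative, not a
recommendation), a {{DB_CONFIG}} might combine the cache setting with the log
settings shown earlier:

> # 256MB BDB cache: 0 GB plus 268435456 bytes, in 1 segment
> set_cachesize 0 268435456 1
>
> # Transaction logs on a separate disk, as discussed above
> set_lg_dir /logs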
Note that while the {{TERM:BDB}} cache is just raw chunks of memory and
configured as a memory size, the {{slapd}}(8) entry cache holds parsed entries,
and the size of each entry is variable.

There is also an IDL cache which is used for Index Data Lookups.
If you can fit all of your database into slapd's entry cache, and all of your
index lookups fit in the IDL cache, that will provide the maximum throughput.

If not, but you can fit the entire database into the BDB cache, then you
should do that and shrink the slapd entry cache as appropriate.

Failing that, you should balance the BDB cache against the entry cache.

It is worth noting that it is not absolutely necessary to configure a BerkeleyDB
cache equal in size to your entire database. All that you need is a cache
that's large enough for your "working set."

That means, large enough to hold all of the most frequently accessed data,
plus a few less-frequently accessed items.

For more information, please see: {{URL:http://www.oracle.com/technology/documentation/berkeley-db/db/ref/am_conf/cachesize.html}}

H4: Calculating Cachesize

The back-bdb database lives in two main files, {{F:dn2id.bdb}} and {{F:id2entry.bdb}}.
These are B-tree databases. We have never documented the back-bdb internal
layout before, because it didn't seem like something anyone should have to worry
about, nor was it necessarily cast in stone. But here's how it works today,
in OpenLDAP 2.4.

A B-tree is a balanced tree; it stores data in its leaf nodes and bookkeeping
data in its interior nodes. (If you don't know what tree data structures look
like in general, Google for some references, because that's getting far too
elementary for the purposes of this discussion.)

For decent performance, you need enough cache memory to contain all the nodes
along the path from the root of the tree down to the particular data item
you're accessing. That's enough cache for a single search. For the general case,
you want enough cache to contain all the internal nodes in the database.

> db_stat -d

will tell you how many internal pages are present in a database. You should
check this number for both dn2id and id2entry.

Also note that {{id2entry}} always uses 16KB per "page", while {{dn2id}} uses
whatever the underlying filesystem uses, typically 4 or 8KB. To avoid thrashing,
your cache must be at least as large as the number of internal pages in both
the {{dn2id}} and {{id2entry}} databases, plus some extra space to accommodate
the actual leaf data pages.

For example, in my OpenLDAP 2.4 test database, I have an input LDIF file that's
about 360MB. With the back-hdb backend this creates a {{dn2id.bdb}} that's 68MB,
and an {{id2entry}} that's 800MB. db_stat tells me that {{dn2id}} uses 4KB pages,
has 433 internal pages, and 6378 leaf pages. The id2entry uses 16KB pages, has
52 internal pages, and 45912 leaf pages. In order to efficiently retrieve any
single entry in this database, the cache should be at least:

> (433+1) * 4KB + (52+1) * 16KB = 1736KB + 848KB =~ 2.5MB

This doesn't take into account other library overhead, so this is even lower
than the barest minimum. The default cache size, when nothing is configured,
is only 256KB.

This 2.5MB number also doesn't take indexing into account. Each indexed attribute
uses another database file of its own, using a Hash structure.

Unlike the B-trees, where you only need to touch one data page to find an entry
of interest, doing an index lookup generally touches multiple keys, and the
point of a hash structure is that the keys are evenly distributed across the
data space. That means there's no convenient compact subset of the database that
you can keep in the cache to ensure quick operation; you can pretty much expect
references to be scattered across the whole thing. My strategy here would be to
provide enough cache for at least 50% of all of the hash data.

> (Number of hash buckets + number of overflow pages + number of duplicate pages) * page size / 2

The objectClass index for my example database is 5.9MB and uses 3 hash buckets
and 656 duplicate pages. So:

> ( 3 + 656 ) * 4KB / 2 =~ 1.3MB

With only this index enabled, I'd figure at least a 4MB cache for this backend.
(Of course you're using a single cache shared among all of the database files,
so the cache pages will most likely get used for something other than what you
accounted for, but this gives you a fighting chance.)

With this 4MB cache I can slapcat this entire database on my 1.3GHz PIII in
1 minute, 40 seconds. With the cache doubled to 8MB, it still takes the same
1:40s. Once you've got enough cache to fit the B-tree internal pages, increasing
it further won't have any effect until the cache really is large enough to hold
100% of the data pages. I don't have enough free RAM to hold all the 800MB
id2entry data, so 4MB is good enough.

With back-bdb and back-hdb you can use "db_stat -m" to check how well the
database cache is performing.

For more information on {{db_stat}}: {{URL:http://www.oracle.com/technology/documentation/berkeley-db/db/utility/db_stat.html}}

H3: {{slapd}}(8) Entry Cache (cachesize)

The {{slapd}}(8) entry cache operates on decoded entries. The rationale: entries
in the entry cache can be used directly, giving the fastest response. If an entry
isn't in the entry cache but can be extracted from the BDB page cache, that will
avoid an I/O but it will still require parsing, so this will be slower.

If the entry is in neither cache then BDB will have to flush some of its current
cached pages and bring in the needed pages, resulting in a couple of expensive
I/Os as well as parsing.

The optimal value is, of course, the total number of entries in the database.
However, most directory servers don't consistently serve out their entire
database, so setting this to a lesser number that more closely matches the
believed working set of data is sufficient. This is the second most important
parameter for the DB.

As far as balancing the entry cache vs the BDB cache: parsed entries in memory
are generally about twice as large as they are on disk.

As we have already mentioned, not having a proper database cache size will
cause performance issues. These issues are not an indication of corruption
occurring in the database. It is merely the fact that the cache is thrashing
itself that causes performance/response time to slow down.


H3: {{TERM:IDL}} Cache (idlcachesize)

Each IDL holds the search results from a given query, so the IDL cache will
end up holding the most frequently requested search results. For back-bdb,
it is generally recommended to match the "cachesize" setting. For back-hdb,
it is generally recommended to be 3x the "cachesize" value.

Note: The {{idlcachesize}} setting directly affects search performance.
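For instance (the figures are illustrative only, assuming a working set of
about 10,000 entries on back-hdb), the database section of {{slapd.conf}}(5)
might contain:

> # entries kept parsed in the slapd entry cache
> cachesize 10000
> # IDL cache, roughly 3x cachesize for back-hdb
> idlcachesize 30000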
H3: {{slapd}}(8) Threads

{{slapd}}(8) can process requests via a configurable number of threads, which
in turn affects the in/out rate of connections.

This value should generally be a function of the number of "real" cores on
the system; for example, on a server with 2 CPUs of one core each, set this
to 8, or 4 threads per real core. This is a "read"-maximized value. The more
threads that are configured per core, the slower {{slapd}}(8) responds for
"read" operations. On the flip side, it appears to handle write operations
faster in a heavy-write/low-read scenario.

The upper bound for good read performance appears to be 16 threads (which
also happens to be the default setting).
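For example, to set this explicitly in {{slapd.conf}}(5) (16 is already the
default, so the line below is purely illustrative):

> # number of threads servicing client operations
> threads 16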