<!doctype html public "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>

<title>Postfix Bottleneck Analysis</title>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel='stylesheet' type='text/css' href='postfix-doc.css'>

</head>

<body>

<h1><img src="postfix-logo.jpg" width="203" height="98" alt="">Postfix Bottleneck Analysis</h1>

<hr>

<h2>Purpose of this document</h2>

<p> This document is an introduction to Postfix queue congestion analysis.
It explains how the qshape(1) program can help to track down the
reason for queue congestion. qshape(1) is bundled with Postfix
2.1 and later source code, under the "auxiliary" directory. This
document describes qshape(1) as bundled with Postfix 2.4. </p>

<p> This document covers the following topics: </p>

<ul>

<li><a href="#qshape">Introducing the qshape tool</a>

<li><a href="#trouble_shooting">Troubleshooting with qshape</a>

<li><a href="#healthy">Example 1: Healthy queue</a>

<li><a href="#dictionary_bounce">Example 2: Deferred queue full of
dictionary attack bounces</a>

<li><a href="#active_congestion">Example 3: Congestion in the active
queue</a>

<li><a href="#backlog">Example 4: High volume destination backlog</a>

<li><a href="#queues">Postfix queue directories</a>

<ul>

<li> <a href="#maildrop_queue"> The "maildrop" queue </a>

<li> <a href="#hold_queue"> The "hold" queue </a>

<li> <a href="#incoming_queue"> The "incoming" queue </a>

<li> <a href="#active_queue"> The "active" queue </a>

<li> <a href="#deferred_queue"> The "deferred" queue </a>

</ul>

<li><a href="#credits">Credits</a>

</ul>

<h2><a name="qshape">Introducing the qshape tool</a></h2>

<p> When mail is draining slowly or the queue is unexpectedly large,
run qshape(1) as the super-user (root) to help zero in on the problem.
The qshape(1) program displays a tabular view of the Postfix queue
contents. </p>

<ul>

<li> <p> On the horizontal axis, it displays the queue age with
fine granularity for recent messages and (geometrically) less fine
granularity for older messages. </p>

<li> <p> The vertical axis displays the destination (or with the
"-s" switch the sender) domain. Domains with the most messages are
listed first. </p>

</ul>

<p> For example, in the output below we see the top 10 lines of
the (mostly forged) sender domain distribution for captured spam
in the "hold" queue: </p>

<blockquote>
<pre>
$ qshape -s hold | head

                         T  5 10 20 40 80 160 320 640 1280 1280+
                 TOTAL 486  0  0  1  0  0   2   4  20   40   419
             yahoo.com  14  0  0  1  0  0   0   0   1    0    12
  extremepricecuts.net  13  0  0  0  0  0   0   0   2    0    11
        ms35.hinet.net  12  0  0  0  0  0   0   0   0    1    11
      winnersdaily.net  12  0  0  0  0  0   0   0   2    0    10
           hotmail.com  11  0  0  0  0  0   0   0   0    1    10
           worldnet.fr   6  0  0  0  0  0   0   0   0    0     6
        ms41.hinet.net   6  0  0  0  0  0   0   0   0    0     6
                osn.de   5  0  0  0  0  0   1   0   0    0     4
</pre>
</blockquote>

<ul>

<li> <p> The "T" column shows the total (in this case sender) count
for each domain. The columns with numeric headers show counts for
messages aged fewer than that many minutes, but not younger than
the age limit of the previous column. The row labeled "TOTAL"
shows the total count for all domains. </p>

<li> <p> In this example, there are 14 messages allegedly from
yahoo.com, 1 between 10 and 20 minutes old, 1 between 320 and 640
minutes old, and 12 older than 1280 minutes (there are 1440 minutes
in a day). </p>

</ul>

<p> When the output is a terminal, intermediate results showing the
top 20 domains (-n option) are displayed after every 1000 messages
(-N option), and the final output also shows only the top 20 domains.
This makes qshape useful even when the "deferred" queue is very
large and it would otherwise take prohibitively long to read the
entire "deferred" queue. </p>

<p> By default, qshape shows statistics for the union of the
"incoming" and "active" queues, which are the most relevant queues
to look at when analyzing performance. </p>

<p> One can request an alternate list of queues: </p>

<blockquote>
<pre>
$ qshape deferred
$ qshape incoming active deferred
</pre>
</blockquote>

<p> This will show the age distribution of the "deferred" queue or
of the union of the "incoming", "active" and "deferred" queues. </p>

<p> Command line options control the number of display "buckets",
the age limit for the smallest bucket, display of parent domain
counts, and so on. The "-h" option outputs a summary of the available
switches. </p>
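<p> For example, the invocation below is a sketch of how these
switches might be combined; the switch names are those documented
for recent qshape(1) versions, so verify them on your system with
"qshape -h" before relying on them: </p>

<blockquote>
<pre>
# Sender statistics (-s) for the "deferred" queue, with parent
# domain counts (-p), 10 age buckets (-b), the smallest bucket
# covering 5 minutes (-t), and only the top 10 domains shown (-n).
$ qshape -s -p -b 10 -t 5 -n 10 deferred
</pre>
</blockquote>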
<h2><a name="trouble_shooting">Troubleshooting with qshape</a></h2>

<p> Large numbers in the qshape output represent a large number of
messages that are destined to (or alleged to come from) a particular
domain. It should be possible to tell at a glance which domains
dominate the queue sender or recipient counts, approximately when
a burst of mail started, and when it stopped. </p>

<p> The problem destinations or sender domains appear near the top
left corner of the output table. Remember that the "active" queue
can accommodate up to 20000 ($qmgr_message_active_limit) messages.
To check whether this limit has been reached, use: </p>

<blockquote>
<pre>
$ qshape -s active <i>(show sender statistics)</i>
</pre>
</blockquote>

<p> If the total sender count is below 20000, the "active" queue is
not yet saturated, and any high volume sender domains will show near
the top of the output. </p>

<p> With oqmgr(8) the "active" queue is also limited to at most 20000
recipient addresses ($qmgr_message_recipient_limit). To check for
exhaustion of this limit use: </p>

<blockquote>
<pre>
$ qshape active <i>(show recipient statistics)</i>
</pre>
</blockquote>
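<p> The limits on your system need not be the 20000 defaults quoted
above. A quick way to confirm them is to query the configuration
with postconf(1); the output below shows the default settings: </p>

<blockquote>
<pre>
$ postconf qmgr_message_active_limit qmgr_message_recipient_limit
qmgr_message_active_limit = 20000
qmgr_message_recipient_limit = 20000
</pre>
</blockquote>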
<p> Having found the high volume domains, it is often useful to
search the logs for recent messages pertaining to the domains in
question. </p>

<blockquote>
<pre>
# Find deliveries to example.com
#
$ tail -10000 /var/log/maillog |
    grep -E -i ': to=&lt;.*@example\.com&gt;,' |
    less

# Find messages from example.com
#
$ tail -10000 /var/log/maillog |
    grep -E -i ': from=&lt;.*@example\.com&gt;,' |
    less
</pre>
</blockquote>

<p> You may want to drill down on some specific queue ids: </p>

<blockquote>
<pre>
# Find all messages for a specific queue id.
#
$ tail -10000 /var/log/maillog | grep -E ': 2B2173FF68: '
</pre>
</blockquote>

<p> Also look for queue manager warning messages in the log. These
warnings can suggest strategies to reduce congestion. </p>

<blockquote>
<pre>
$ grep -E 'qmgr.*(panic|fatal|error|warning):' /var/log/maillog
</pre>
</blockquote>

<p> When all else fails, try the Postfix mailing list for help, but
please don't forget to include the top 10 or 20 lines of qshape(1)
output. </p>

<h2><a name="healthy">Example 1: Healthy queue</a></h2>

<p> When looking at just the "incoming" and "active" queues, under
normal conditions (no congestion) the "incoming" and "active" queues
are nearly empty. Mail leaves the system almost as quickly as it
comes in or is deferred without congestion in the "active" queue.
</p>

<blockquote>
<pre>
$ qshape <i>(show "incoming" and "active" queue status)</i>

                T  5 10 20 40 80 160 320 640 1280 1280+
        TOTAL   5  0  0  0  1  0   0   0   1    1     2
meri.uwasa.fi   5  0  0  0  1  0   0   0   1    1     2
</pre>
</blockquote>

<p> If one looks at the two queues separately, the "incoming" queue
is empty or perhaps briefly has one or two messages, while the
"active" queue holds more messages, and for a somewhat longer time:
</p>

<blockquote>
<pre>
$ qshape incoming

                T  5 10 20 40 80 160 320 640 1280 1280+
        TOTAL   0  0  0  0  0  0   0   0   0    0     0

$ qshape active

                T  5 10 20 40 80 160 320 640 1280 1280+
        TOTAL   5  0  0  0  1  0   0   0   1    1     2
meri.uwasa.fi   5  0  0  0  1  0   0   0   1    1     2
</pre>
</blockquote>

<h2><a name="dictionary_bounce">Example 2: Deferred queue full of
dictionary attack bounces</a></h2>

<p> This is from a server where recipient validation is not yet
available for some of the hosted domains. Dictionary attacks on
the unvalidated domains result in bounce backscatter. The bounces
dominate the queue, but with proper tuning they do not saturate the
"incoming" or "active" queues. The high volume of deferred mail is not
a direct cause for alarm. </p>

<blockquote>
<pre>
$ qshape deferred | head

                         T  5 10 20 40  80 160 320 640 1280 1280+
                TOTAL 2234  4  2  5  9  31  57 108 201  464  1353
  heyhihellothere.com  207  0  0  1  1   6   6   8  25   68    92
  pleazerzoneprod.com  105  0  0  0  0   0   0   0   5   44    56
       groups.msn.com   63  2  1  2  4   4  14  14  14    8     0
    orion.toppoint.de   49  0  0  0  1   0   2   4   3   16    23
          kali.com.cn   46  0  0  0  0   1   0   2   6   12    25
        meri.uwasa.fi   44  0  0  0  0   1   0   2   8   11    22
    gjr.paknet.com.pk   43  1  0  0  1   1   3   3   6   12    16
 aristotle.algonet.se   41  0  0  0  0   0   1   2  11   12    15
</pre>
</blockquote>

<p> The domains shown are mostly bulk-mailers, and all the volume
is in the tail end of the time distribution, showing that short term
arrival rates are moderate. Larger numbers and lower message ages
would be more indicative of current trouble. Old mail still going
nowhere is largely harmless so long as the "active" and "incoming"
queues are short. We can also see that the groups.msn.com
undeliverables are a low rate steady stream rather than a concentrated
dictionary attack that is now over. </p>

<blockquote>
<pre>
$ qshape -s deferred | head

                 T  5 10 20 40 80 160 320 640 1280 1280+
        TOTAL 2193  4  4  5  8 33  56 104 205  465  1309
MAILER-DAEMON 1709  4  4  5  8 33  55 101 198  452   849
  example.com  263  0  0  0  0  0   0   0   0    2   261
  example.org  209  0  0  0  0  0   1   3   6   11   188
  example.net    6  0  0  0  0  0   0   0   0    0     6
  example.edu    3  0  0  0  0  0   0   0   0    0     3
  example.gov    2  0  0  0  0  0   0   0   1    0     1
  example.mil    1  0  0  0  0  0   0   0   0    0     1
</pre>
</blockquote>

<p> Looking at the sender distribution, we see that, as expected,
most of the messages are bounces. </p>

<h2><a name="active_congestion">Example 3: Congestion in the active
queue</a></h2>

<p> This example is taken from a February 2004 discussion on the
Postfix Users list. Congestion was reported with the "active" and
"incoming" queues large and not shrinking despite very large delivery
agent process limits. The thread is archived at:
http://groups.google.com/groups?threadm=c0b7js$2r65$1@FreeBSD.csie.NCTU.edu.tw
and
http://archives.neohapsis.com/archives/postfix/2004-02/thread.html#1371
</p>

<p> Using an older version of qshape(1) it was quickly determined
that all the messages were for just a few destinations: </p>

<blockquote>
<pre>
$ qshape <i>(show "incoming" and "active" queue status)</i>

                           T     A  5 10 20 40 80 160 320 320+
                TOTAL  11775  9996  0  0  1  1 42  94 221 1420
 user.sourceforge.net   7678  7678  0  0  0  0  0   0   0    0
lists.sourceforge.net   2313  2313  0  0  0  0  0   0   0    0
       gzd.gotdns.com    102     0  0  0  0  0  0   0   2  100
</pre>
</blockquote>

<p> The "A" column showed the count of messages in the "active" queue,
and the numbered columns showed totals for the "deferred" queue. At
10000 messages (the Postfix 1.x "active" queue size limit) the
"active" queue was full. The "incoming" queue was growing rapidly. </p>

<p> With the trouble destinations clearly identified, the administrator
quickly found and fixed the problem. It is substantially harder to
glean the same information from the logs. While a careful reading
of mailq(1) output should yield similar results, it is much harder
to gauge the magnitude of the problem by looking at the queue
one message at a time. </p>

<h2><a name="backlog">Example 4: High volume destination backlog</a></h2>

<p> When a site you send a lot of email to is down or slow, mail
messages will rapidly build up in the "deferred" queue, or worse, in
the "active" queue. The qshape output will show large numbers for
the destination domain in all age buckets that overlap the starting
time of the problem: </p>

<blockquote>
<pre>
$ qshape deferred | head

                    T    5   10   20   40   80  160  320  640 1280 1280+
          TOTAL  5000  200  200  400  800 1600 1000  200  200  200   200
 highvolume.com  4000  160  160  320  640 1280 1440    0    0    0     0
...
</pre>
</blockquote>

<p> Here the "highvolume.com" destination is continuing to accumulate
deferred mail. The "incoming" and "active" queues are fine, but the
"deferred" queue started growing some time between 1 and 2 hours ago
and continues to grow. </p>

<p> If the high volume destination is not down, but is instead
slow, one might see similar congestion in the "active" queue.
"Active" queue congestion is a greater cause for alarm; one might need
to take measures to ensure that the mail is deferred instead, or even
add an access(5) rule asking the sender to try again later. </p>
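<p> As an illustration only (the restriction list and table name
below are hypothetical, and the destination domain is taken from the
example above), such a temporary access(5) rule might defer new mail
for the congested destination at the SMTP port with a 450 reply until
the rule is removed again: </p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    smtpd_recipient_restrictions =
        check_recipient_access hash:/etc/postfix/recipient_access,
        permit_mynetworks,
        reject_unauth_destination

/etc/postfix/recipient_access:
    highvolume.com    450 Destination backlog, try again later
</pre>
</blockquote>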
<p> If a high volume destination exhibits frequent bursts of consecutive
connections refused by all MX hosts, or bursts of "421 Server busy"
errors, it is possible for the queue manager to mark the destination
as "dead" despite the transient nature of the errors. The destination
will be retried again after the expiration of a $minimal_backoff_time
timer. If the error bursts are frequent enough, it may be that only a
small quantity of email is delivered before the destination is again
marked "dead". In some cases, enabling static (not on demand) connection
caching by listing the appropriate nexthop domain in a table included in
"smtp_connection_cache_destinations" may help to reduce the error rate,
because most messages will re-use existing connections. </p>
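<p> A minimal sketch of such a static connection caching setup,
where example.com stands for the affected nexthop domain (a lookup
table may be listed instead of a literal domain): </p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    smtp_connection_cache_destinations = example.com
</pre>
</blockquote>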
<p> The MTA that has been observed most frequently to exhibit such
bursts of errors is Microsoft Exchange, which refuses connections
under load. Some proxy virus scanners in front of the Exchange
server propagate the refused connection to the client as a "421"
error. </p>

<p> Note that it is now possible to configure Postfix to exhibit similarly
erratic behavior by misconfiguring the anvil(8) service. Do not use
anvil(8) for steady-state rate limiting; its purpose is (unintentional)
DoS prevention, and the rate limits set should be very generous! </p>

<p> If one finds oneself needing to deliver a high volume of mail to a
destination that exhibits frequent brief bursts of errors, and connection
caching does not solve the problem, there is a subtle workaround. </p>

<ul>

<li> <p> Postfix version 2.5 and later: </p>

<ul>

<li> <p> In master.cf set up a dedicated clone of the "smtp" transport
for the destination in question. In the example below we will call
it "fragile". </p>

<li> <p> In master.cf configure a reasonable process limit for the
cloned smtp transport (a number in the 10-20 range is typical). </p>

<li> <p> IMPORTANT!!! In main.cf configure a large per-destination
pseudo-cohort failure limit for the cloned smtp transport. </p>

<pre>
/etc/postfix/main.cf:
    transport_maps = hash:/etc/postfix/transport
    fragile_destination_concurrency_failed_cohort_limit = 100
    fragile_destination_concurrency_limit = 20

/etc/postfix/transport:
    example.com  fragile:

/etc/postfix/master.cf:
    # service type private unpriv chroot wakeup maxproc command
    fragile   unix  -       -      n      -      20      smtp
</pre>

<p> See also the documentation for
default_destination_concurrency_failed_cohort_limit and
default_destination_concurrency_limit. </p>

</ul>

<li> <p> Earlier Postfix versions: </p>

<ul>

<li> <p> In master.cf set up a dedicated clone of the "smtp"
transport for the destination in question. In the example below
we will call it "fragile". </p>

<li> <p> In master.cf configure a reasonable process limit for the
transport (a number in the 10-20 range is typical). </p>

<li> <p> IMPORTANT!!! In main.cf configure a very large initial
and destination concurrency limit for this transport (say 2000). </p>

<pre>
/etc/postfix/main.cf:
    transport_maps = hash:/etc/postfix/transport
    initial_destination_concurrency = 2000
    fragile_destination_concurrency_limit = 2000

/etc/postfix/transport:
    example.com  fragile:

/etc/postfix/master.cf:
    # service type private unpriv chroot wakeup maxproc command
    fragile   unix  -       -      n      -      20      smtp
</pre>

<p> See also the documentation for default_destination_concurrency_limit.
</p>

</ul>

</ul>

<p> The effect of this configuration is that up to 2000
consecutive errors are tolerated without marking the destination
dead, while the total concurrency remains reasonable (10-20
processes). This trick is only for a very specialized situation:
high volume delivery into a channel with multi-error bursts
that is capable of high throughput, but is repeatedly throttled by
the bursts of errors. </p>

<p> When a destination is unable to handle the load even after the
Postfix process limit is reduced to 1, a desperate measure is to
insert brief delays between delivery attempts. </p>

<ul>

<li> <p> Postfix version 2.5 and later: </p>

<ul>

<li> <p> In master.cf set up a dedicated clone of the "smtp" transport
for the problem destination. In the example below we call it "slow".
</p>

<li> <p> In main.cf configure a short delay between deliveries to
the same destination. </p>

<pre>
/etc/postfix/main.cf:
    transport_maps = hash:/etc/postfix/transport
    slow_destination_rate_delay = 1
    slow_destination_concurrency_failed_cohort_limit = 100

/etc/postfix/transport:
    example.com  slow:

/etc/postfix/master.cf:
    # service type private unpriv chroot wakeup maxproc command
    slow      unix  -       -      n      -      -       smtp
</pre>

</ul>

<p> See also the documentation for default_destination_rate_delay. </p>

<p> This solution forces the Postfix smtp(8) client to wait for
$slow_destination_rate_delay seconds between deliveries to the same
destination. </p>

<p> IMPORTANT!! The large slow_destination_concurrency_failed_cohort_limit
value is needed. This prevents Postfix from deferring all mail for
the same destination after only one connection or handshake error
(the reason for this is that a non-zero slow_destination_rate_delay
forces a per-destination concurrency of 1). </p>

<li> <p> Earlier Postfix versions: </p>

<ul>

<li> <p> In the transport map entry for the problem destination,
specify a dead host as the primary nexthop. </p>

<li> <p> In the master.cf entry for the transport specify the
problem destination as the fallback_relay and specify a small
smtp_connect_timeout value. </p>

<pre>
/etc/postfix/main.cf:
    transport_maps = hash:/etc/postfix/transport

/etc/postfix/transport:
    example.com  slow:[dead.host]

/etc/postfix/master.cf:
    # service type private unpriv chroot wakeup maxproc command
    slow      unix  -       -      n      -      1       smtp
        -o fallback_relay=problem.example.com
        -o smtp_connect_timeout=1
        -o smtp_connection_cache_on_demand=no
</pre>

</ul>

<p> This solution forces the Postfix smtp(8) client to wait for
$smtp_connect_timeout seconds between deliveries. The connection
caching feature is disabled to prevent the client from skipping
over the dead host. </p>

</ul>

<h2><a name="queues">Postfix queue directories</a></h2>

<p> The following sections describe Postfix queues: their purpose,
what normal behavior looks like, and how to diagnose abnormal
behavior. </p>

<h3> <a name="maildrop_queue"> The "maildrop" queue </a> </h3>

<p> Messages that have been submitted via the Postfix sendmail(1)
command, but not yet brought into the main Postfix queue by the
pickup(8) service, await processing in the "maildrop" queue. Messages
can be added to the "maildrop" queue even when the Postfix system
is not running. They will begin to be processed once Postfix is
started. </p>

<p> The "maildrop" queue is drained by the single-threaded pickup(8)
service, which scans the queue directory periodically or when notified
of new message arrival by the postdrop(1) program. The postdrop(1)
program is a setgid helper that allows the unprivileged Postfix
sendmail(1) program to inject mail into the "maildrop" queue and
to notify the pickup(8) service of its arrival. </p>

<p> All mail that enters the main Postfix queue does so via the
cleanup(8) service. The cleanup service is responsible for envelope
and header rewriting, header and body regular expression checks,
automatic bcc recipient processing, milter content processing, and
reliable insertion of the message into the Postfix "incoming" queue. </p>

<p> In the absence of excessive CPU consumption in cleanup(8) header
or body regular expression checks, or of other software consuming all
available CPU resources, Postfix performance is disk I/O bound.
The rate at which the pickup(8) service can inject messages into
the queue is largely determined by disk access times, since the
cleanup(8) service must commit the message to stable storage before
returning success. The same is true of the postdrop(1) program
writing the message to the "maildrop" directory. </p>

<p> As the pickup service is single-threaded, it can only deliver
one message at a time, at a rate that does not exceed the reciprocal
of the disk I/O latency (plus CPU time, if not negligible) of the
cleanup service. </p>

<p> Congestion in this queue is indicative of an excessive local
message submission rate, or perhaps of excessive CPU consumption in
the cleanup(8) service due to excessive body_checks, or (Postfix
&ge; 2.3) high latency milters. </p>

<p> Note that once the "active" queue is full, the cleanup service
will attempt to slow down message injection by pausing $in_flow_delay
for each message. In this case "maildrop" queue congestion may be
a consequence of congestion downstream, rather than a problem in
its own right. </p>

<p> Note also that you should not attempt to deliver large volumes
of mail via the pickup(8) service. High volume sites should avoid
using "simple" content filters that re-inject scanned mail via Postfix
sendmail(1) and postdrop(1). </p>

<p> A high arrival rate of locally submitted mail may be an indication
of an uncaught forwarding loop or a run-away notification program.
Try to keep the volume of local mail injection to a moderate level.
</p>

<p> The "postsuper -r" command can place selected messages into
the "maildrop" queue for reprocessing. This is most useful for
resetting any stale content_filter settings. Requeuing a large number
of messages using "postsuper -r" can clearly cause a spike in the
size of the "maildrop" queue. </p>
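<p> For instance, the commands below (run as the super-user) requeue
either a single message or, with due caution, all queued mail, so
that each message picks up the current content_filter setting; the
queue id is of course just an example: </p>

<blockquote>
<pre>
# Requeue one message by queue id.
$ postsuper -r 2B2173FF68

# Requeue all queued mail; with a large queue this can itself
# cause a "maildrop" congestion spike.
$ postsuper -r ALL
</pre>
</blockquote>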
<h3> <a name="hold_queue"> The "hold" queue </a> </h3>

<p> The administrator can define "smtpd" access(5) policies, or
cleanup(8) header/body checks, that cause messages to be automatically
diverted from normal processing and placed indefinitely in the
"hold" queue. Messages placed in the "hold" queue stay there until
the administrator intervenes. No periodic delivery attempts are
made for messages in the "hold" queue. The postsuper(1) command
can be used to manually release messages into the "deferred" queue.
</p>

<p> Messages can potentially stay in the "hold" queue longer than
$maximal_queue_lifetime. If such "old" messages need to be released from
the "hold" queue, they should typically be moved into the "maildrop" queue
using "postsuper -r", so that the message gets a new timestamp and
is given more than one opportunity to be delivered. Messages that are
"young" can be moved directly into the "deferred" queue using
"postsuper -H". </p>
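<p> For example (the queue id is again illustrative): </p>

<blockquote>
<pre>
# Release an "old" held message via the "maildrop" queue, giving
# it a fresh timestamp and a full set of delivery attempts.
$ postsuper -r 2B2173FF68

# Release a "young" held message directly into the "deferred" queue.
$ postsuper -H 2B2173FF68
</pre>
</blockquote>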
<p> The "hold" queue plays little role in Postfix performance, and
monitoring of the "hold" queue is typically motivated more by tracking
spam and malware than by performance issues. </p>

<h3> <a name="incoming_queue"> The "incoming" queue </a> </h3>

<p> All new mail entering the Postfix queue is written by the
cleanup(8) service into the "incoming" queue. New queue files are
created owned by the "postfix" user with an access bitmask (or
mode) of 0600. Once a queue file is ready for further processing
the cleanup(8) service changes the queue file mode to 0700 and
notifies the queue manager of new mail arrival. The queue manager
ignores incomplete queue files whose mode is 0600, as these are
still being written by cleanup. </p>

<p> The queue manager scans the "incoming" queue, bringing any new
mail into the "active" queue if the "active" queue resource limits
have not been exceeded. By default, the "active" queue accommodates
at most 20000 messages. Once the "active" queue message limit is
reached, the queue manager stops scanning the "incoming" queue
(and the "deferred" queue, see below). </p>

<p> Under normal conditions the "incoming" queue is nearly empty (has
only mode 0600 files), with the queue manager able to import new
messages into the "active" queue as soon as they become available.
</p>

<p> The "incoming" queue grows when the message input rate spikes
above the rate at which the queue manager can import messages into
the "active" queue. The main factors slowing down the queue manager
are disk I/O and lookup queries to the trivial-rewrite service. If the
queue manager is routinely not keeping up, consider not using "slow"
lookup services (MySQL, LDAP, ...) for transport lookups, or speeding
up the hosts that provide the lookup service. If the problem is I/O
starvation, consider striping the queue over more disks, faster
controllers with a battery-backed write cache, or other hardware
improvements. At the very least, make sure that the queue directory is
mounted with the "noatime" option if applicable to the underlying
filesystem. </p>

<p> The in_flow_delay parameter is used to clamp the input rate
when the queue manager starts to fall behind. The cleanup(8) service
will pause for $in_flow_delay seconds before creating a new queue
file if it cannot obtain a "token" from the queue manager. </p>

<p> Since the number of cleanup(8) processes is limited in most
cases by the SMTP server concurrency, the input rate can exceed
the output rate by at most "SMTP connection count" / $in_flow_delay
messages per second. </p>

<p> With a default process limit of 100 and an in_flow_delay of
1s, the coupling is strong enough to limit a single run-away injector
to 1 message per second, but is not strong enough to deflect an
excessive input rate from many sources at the same time. </p>

<p> If a server is being hammered from multiple directions, consider
raising the in_flow_delay to 10 seconds, but only if the "incoming" queue
is growing even while the "active" queue is not full and the
trivial-rewrite service is using a fast transport lookup mechanism.
</p>
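<p> A minimal sketch of that change (remember to run "postfix reload"
afterwards): </p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    # Pause new cleanup(8) queue file creation for 10 seconds per
    # message when the queue manager falls behind.
    in_flow_delay = 10s
</pre>
</blockquote>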
If such "old" messages need to be released from 656the "hold" queue, they should typically be moved into the "maildrop" queue 657using "postsuper -r", so that the message gets a new timestamp and 658is given more than one opportunity to be delivered. Messages that are 659"young" can be moved directly into the "deferred" queue using 660"postsuper -H". </p> 661 662<p> The "hold" queue plays little role in Postfix performance, and 663monitoring of the "hold" queue is typically more closely motivated 664by tracking spam and malware, than by performance issues. </p> 665 666<h3> <a name="incoming_queue"> The "incoming" queue </a> </h3> 667 668<p> All new mail entering the Postfix queue is written by the 669cleanup(8) service into the "incoming" queue. New queue files are 670created owned by the "postfix" user with an access bitmask (or 671mode) of 0600. Once a queue file is ready for further processing 672the cleanup(8) service changes the queue file mode to 0700 and 673notifies the queue manager of new mail arrival. The queue manager 674ignores incomplete queue files whose mode is 0600, as these are 675still being written by cleanup. </p> 676 677<p> The queue manager scans the "incoming" queue bringing any new 678mail into the "active" queue if the "active" queue resource limits 679have not been exceeded. By default, the "active" queue accommodates 680at most 20000 messages. Once the "active" queue message limit is 681reached, the queue manager stops scanning the "incoming" queue 682(and the "deferred" queue, see below). </p> 683 684<p> Under normal conditions the "incoming" queue is nearly empty (has 685only mode 0600 files), with the queue manager able to import new 686messages into the "active" queue as soon as they become available. 687</p> 688 689<p> The "incoming" queue grows when the message input rate spikes 690above the rate at which the queue manager can import messages into 691the "active" queue. The main factors slowing down the queue manager 692are disk I/O and lookup queries to the trivial-rewrite service. If the queue 693manager is routinely not keeping up, consider not using "slow" 694lookup services (MySQL, LDAP, ...) for transport lookups or speeding 695up the hosts that provide the lookup service. If the problem is I/O 696starvation, consider striping the queue over more disks, faster controllers 697with a battery write cache, or other hardware improvements. At the very 698least, make sure that the queue directory is mounted with the "noatime" 699option if applicable to the underlying filesystem. </p> 700 701<p> The in_flow_delay parameter is used to clamp the input rate 702when the queue manager starts to fall behind. The cleanup(8) service 703will pause for $in_flow_delay seconds before creating a new queue 704file if it cannot obtain a "token" from the queue manager. </p> 705 706<p> Since the number of cleanup(8) processes is limited in most 707cases by the SMTP server concurrency, the input rate can exceed 708the output rate by at most "SMTP connection count" / $in_flow_delay 709messages per second. </p> 710 711<p> With a default process limit of 100, and an in_flow_delay of 7121s, the coupling is strong enough to limit a single run-away injector 713to 1 message per second, but is not strong enough to deflect an 714excessive input rate from many sources at the same time. 
<p> When a high volume destination is served by multiple MX hosts with
typically low delivery latency, performance can suffer dramatically when
one of the MX hosts is unresponsive and SMTP connections to that host
time out. For example, if there are 2 equal weight MX hosts, the SMTP
connection timeout is 30 seconds and one of the MX hosts is down, the
average SMTP connection will take approximately 15 seconds to complete.
With a default per-destination concurrency limit of 20 connections,
throughput falls to just over 1 message per second. </p>

<p> The best way to avoid bottlenecks when one or more MX hosts is
non-responsive is to use connection caching. Connection caching was
introduced with Postfix 2.2 and is by default enabled on demand for
destinations with a backlog of mail in the "active" queue. When connection
caching is in effect for a particular destination, established connections
are re-used to send additional messages; this reduces the number of
connections made per message delivery and maintains good throughput even
in the face of partial unavailability of the destination's MX hosts. </p>

<p> If connection caching is not available (Postfix &lt; 2.2) or does
not provide a sufficient latency reduction, especially for the "relay"
transport used to forward mail to "your own" domains, consider setting
lower than default SMTP connection timeouts (1-5 seconds) and higher
than default destination concurrency limits. This will further reduce
latency and provide more concurrency to maintain throughput should
latency rise. </p>

<p> Setting high concurrency limits for domains that are not your own may
be viewed as hostile by the receiving system, and steps may be taken
to prevent you from monopolizing the destination system's resources.
The defensive measures may substantially reduce your throughput or block
access entirely. Do not set aggressive concurrency limits for remote
domains without coordinating with the administrators of the target
domain. </p>

<p> If necessary, dedicate and tune custom transports for selected high
volume destinations. The "relay" transport is provided for forwarding mail
to domains for which your server is a primary or backup MX host. These can
make up a substantial fraction of your email traffic. Use the "relay" and
not the "smtp" transport to send email to these domains. Using the "relay"
transport allocates a separate delivery agent pool to these destinations
and allows separate tuning of timeouts and concurrency limits. </p>
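<p> A sketch of such tuning for the "relay" transport, along the lines
of the timeout and concurrency advice above; the domain and the values
40 and 5s are illustrative, not recommendations: </p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    # Domains for which this server is primary or backup MX.
    relay_domains = example.com
    # Higher than default per-destination concurrency.
    relay_destination_concurrency_limit = 40

/etc/postfix/master.cf:
    # service type private unpriv chroot wakeup maxproc command
    relay     unix  -       -      n      -      -       smtp
        # Lower than default SMTP connection timeout.
        -o smtp_connect_timeout=5s
</pre>
</blockquote>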
<p> Another common cause of congestion is unwarranted flushing of the
entire "deferred" queue. The "deferred" queue holds messages that are
likely to fail to be delivered, and that are also likely to be slow to
fail delivery (time out). As a result the most common reaction to a
large "deferred" queue (flush it!) is more than likely
counter-productive, and typically makes the congestion worse. Do not
flush the "deferred" queue unless you expect that most of its content
has recently become deliverable (e.g. your relayhost is back up after
an outage)! </p>

<p> Note that whenever the queue manager is restarted, there may
already be messages in the "active" queue directory, but the "real"
"active" queue in memory is empty. In order to recover the in-memory
state, the queue manager moves all the "active" queue messages
back into the "incoming" queue, and then uses its normal "incoming" queue
scan to refill the "active" queue. The process of moving all
the messages back and forth, redoing transport table (trivial-rewrite(8)
resolve service) lookups, and re-importing the messages back into
memory is expensive. At all costs, avoid frequent restarts of the
queue manager (e.g. via frequent execution of "postfix reload"). </p>

<h3> <a name="deferred_queue"> The "deferred" queue </a> </h3>

<p> When all the deliverable recipients for a message are delivered,
and for some recipients delivery failed for a transient reason (it
might succeed later), the message is placed in the "deferred" queue.
</p>

<p> The queue manager scans the "deferred" queue periodically. The scan
interval is controlled by the queue_run_delay parameter. While a
"deferred" queue scan is in progress, if an "incoming" queue scan is
also in progress (ideally these are brief, since the "incoming" queue
should be short), the queue manager alternates between looking for
messages in the "incoming" queue and in the "deferred" queue. This
"round-robin" strategy prevents starvation of either the "incoming"
or the "deferred" queue. </p>

<p> Each "deferred" queue scan only brings a fraction of the "deferred"
queue back into the "active" queue for a retry. This is because each
message in the "deferred" queue is assigned a "cool-off" time when
it is deferred. This is done by time-warping the modification
time of the queue file into the future. The queue file is not
eligible for a retry if its modification time is not yet reached.
</p>

<p> The "cool-off" time is at least $minimal_backoff_time and at
most $maximal_backoff_time. The next retry time is set by doubling
the message's age in the queue, and adjusting up or down to lie
within the limits. This means that young messages are initially
retried more often than old messages. </p>

<p> If a high volume site routinely has large "deferred" queues, it
may be useful to adjust the queue_run_delay, minimal_backoff_time and
maximal_backoff_time parameters to provide short enough delays on first
failure (Postfix &ge; 2.4 has a sensibly low minimal backoff time by
default), with perhaps longer delays after multiple failures, to reduce
the retransmission rate of old messages and thereby reduce the quantity
of previously deferred mail in the "active" queue. If you want a really
low minimal_backoff_time, you may also want to lower queue_run_delay,
but understand that more frequent scans will increase the demand for
disk I/O. </p>
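<p> For reference, a sketch showing these three parameters together;
the values shown are, to the best of our knowledge, the Postfix 2.4
defaults, and queue_run_delay and minimal_backoff_time are usually
lowered together: </p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    queue_run_delay = 300s
    minimal_backoff_time = 300s
    maximal_backoff_time = 4000s
</pre>
</blockquote>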
<p> One common cause of large "deferred" queues is failure to validate
recipients at the SMTP input stage. Since spammers routinely launch
dictionary attacks from unrepliable sender addresses, the bounces
for invalid recipient addresses clog the "deferred" queue (and at high
volumes proportionally clog the "active" queue). Recipient validation
is strongly recommended through use of the local_recipient_maps and
relay_recipient_maps parameters. Even when bounces drain quickly they
inundate innocent victims of forgery with unwanted email. To avoid
this, do not accept mail for invalid recipients. </p>
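<p> A minimal sketch of recipient validation for a relay domain; the
file name and addresses are hypothetical, and the right-hand side of
each table entry merely needs to be non-empty: </p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    relay_domains = example.com
    relay_recipient_maps = hash:/etc/postfix/relay_recipients

/etc/postfix/relay_recipients:
    user01@example.com    x
    user02@example.com    x
</pre>
</blockquote>

<p> Remember to run "postmap /etc/postfix/relay_recipients" after
editing the table. </p>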
<p> When a host with lots of deferred mail is down for some time,
it is possible for the entire "deferred" queue to reach its retry
time simultaneously. This can lead to a very full "active" queue once
the host comes back up. The phenomenon can repeat approximately
every maximal_backoff_time seconds if the messages are again deferred
after a brief burst of congestion. Perhaps a future Postfix release
will add a random offset to the retry time (or use a combination
of strategies) to reduce the odds of repeated complete "deferred" queue
flushes. </p>

<h2><a name="credits">Credits</a></h2>

<p> The qshape(1) program was developed by Victor Duchovni of Morgan
Stanley, who also wrote the initial version of this document. </p>

</body>

</html>