cfd59c5a | 12-Feb-2020 |
Matthew Dillon <dillon@apollo.backplane.com> |
tmpfs - Flush and recycle pages quickly during heavy paging activity
* When the pagedaemon is operating, any write()s made via tmpfs will be forced to operate through the buffer cache via cluster_write() or bdwrite() instead of using buwrite().
This pipelines the pages to backing store (swap) under these conditions, making them clean immediately so that tmpfs does not add further paging pressure to a system that is already under paging pressure.
* In addition, the B_TTC flag is set on these buffers to attempt to recycle the pages directly into PQ_CACHE ASAP after they are flushed.
* Perform the cluster_write() operation by default to try to improve block sizes for physical I/O.
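The combined effect of the three changes above, as a minimal userland sketch; the struct, the B_TTC value, and the stub bodies are illustrative stand-ins for the kernel interfaces named in this commit, not DragonFly's actual definitions:

    #include <stdbool.h>
    #include <stdio.h>

    /* Stand-ins for the kernel interfaces named above; the struct,
     * flag value, and signatures are illustrative only. */
    struct buf {
        int b_flags;
    };
    #define B_TTC   0x00000001  /* try-to-cache: recycle pages to PQ_CACHE */

    static void cluster_write(struct buf *bp) { (void)bp; puts("pipeline to swap"); }
    static void buwrite(struct buf *bp)       { (void)bp; puts("stay dirty in core"); }

    /*
     * Under paging pressure, force the write through the buffer cache
     * so the pages are pipelined to backing store and become clean
     * immediately; B_TTC additionally asks for the pages to be moved
     * to PQ_CACHE once flushed.  Otherwise use the cheap in-core path.
     */
    static void
    tmpfs_write_strategy(struct buf *bp, bool pagedaemon_active)
    {
        if (pagedaemon_active) {
            bp->b_flags |= B_TTC;
            cluster_write(bp);
        } else {
            buwrite(bp);
        }
    }

    int
    main(void)
    {
        struct buf b = { 0 };

        tmpfs_write_strategy(&b, true);   /* pagedaemon running */
        tmpfs_write_strategy(&b, false);  /* normal operation */
        return 0;
    }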
* TMPFS currently must move pages between two VM objects when reclaiming a vnode, and back again upon re-use. The current VM mechanism for renaming VM pages dirties them and this can potentially cause the paging system to thrash on the same page under heavy vnode recycling loads.
Instead of allowing this to happen, TMPFS now frees any clean pages that have backing store assigned when moving from the backing object, and any clean pages that were instantiated from backing store when moving to the backing object.
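A sketch of the resulting rule; the vm_page fields here are illustrative simplifications:

    #include <stdbool.h>

    /* Illustrative page state; the real vm_page tracks dirty bits and
     * swap assignment in much more detail. */
    struct vm_page {
        bool dirty;
        bool swap_assigned;   /* backing store (swap) already allocated */
    };

    /*
     * When migrating pages between the vnode's VM object and the
     * backing object, a clean page whose contents are recoverable
     * from swap can simply be freed rather than renamed.  Renaming
     * would dirty the page and can make the paging system thrash on
     * it under heavy vnode recycling loads.
     */
    static bool
    tmpfs_free_instead_of_rename(const struct vm_page *m)
    {
        return (!m->dirty && m->swap_assigned);
    }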
|
3b6a19b2 | 24-Oct-2017 |
Matthew Dillon <dillon@apollo.backplane.com> |
kernel - Refactor lockmgr()
* Seriously refactor lockmgr() so we can use atomic_fetchadd_*() for shared locks and reduce unnecessary atomic ops and atomic op loops.
The main win here is being able to use atomic_fetchadd_*() when acquiring and releasing shared locks. A simple fstat() loop (which utilizes a LK_SHARED lockmgr lock on the vnode) improves from 191ns to around 110ns per loop with 32 concurrent threads (on a 16-core/32-thread Xeon).
* To accomplish this, the 32-bit lk_count field becomes 64 bits. The shared count is separated into the high 32 bits, allowing it to be manipulated for both blocking shared requests and the shared lock count field. The low count bits are used for exclusive locks. Control bits are adjusted to manage lockmgr features (a sketch follows the flag list below).
LKC_SHARED Indicates shared lock count is active, else excl lock count. Can predispose the lock when the related count is 0 (does not have to be cleared, for example).
LKC_UPREQ Queued upgrade request. Automatically granted by releasing entity (UPREQ -> ~SHARED|1).
LKC_EXREQ Queued exclusive request (only when lock held shared). Automatically granted by releasing entity (EXREQ -> ~SHARED|1).
LKC_EXREQ2 Aggregated exclusive request. When EXREQ cannot be obtained due to the lock being held exclusively or EXREQ already being queued, EXREQ2 is flagged for wakeup/retries.
LKC_CANCEL Cancel API support
LKC_SMASK Shared lock count mask (LKC_SCOUNT increments).
LKC_XMASK Exclusive lock count mask (+1 increments)
The 'no lock' condition occurs when LKC_XMASK is 0 and LKC_SMASK is 0, regardless of the state of LKC_SHARED.
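A compilable sketch of this layout and the fetchadd-based shared fast path; the mask values, the backout policy, and the omission of the EXREQ/UPREQ priority checks are assumptions for illustration, not the real lockmgr code:

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Assumed bit layout: shared count in the high 32 bits, exclusive
     * count in the low bits, control flags in between.  The values
     * are illustrative, not DragonFly's actual masks. */
    #define LKC_SMASK   0xFFFFFFFF00000000ULL  /* shared count mask */
    #define LKC_SCOUNT  0x0000000100000000ULL  /* one shared increment */
    #define LKC_SHARED  0x0000000080000000ULL  /* count field is shared */
    #define LKC_XMASK   0x000000000FFFFFFFULL  /* exclusive count mask */

    /*
     * Shared-lock fast path: one atomic_fetch_add() optimistically
     * bumps the shared count instead of running a cmpset loop.  If
     * the lock turns out to be held exclusively (or not predisposed
     * to shared), back the increment out and fail so the caller can
     * take the blocking slow path.  Maintaining LKC_SHARED itself and
     * honoring pending EXREQ/UPREQ holders is left to the slow path.
     */
    static bool
    lockmgr_shared_try(_Atomic uint64_t *lk_count)
    {
        uint64_t count;

        count = atomic_fetch_add_explicit(lk_count, LKC_SCOUNT,
                                          memory_order_acquire);
        if ((count & LKC_XMASK) == 0 &&
            ((count & LKC_SHARED) != 0 || (count & LKC_SMASK) == 0))
            return true;    /* shared lock granted */

        atomic_fetch_sub_explicit(lk_count, LKC_SCOUNT,
                                  memory_order_release);
        return false;       /* contended: use the slow path */
    }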
* Lockmgr still supports exclusive priority over shared locks. The semantics have slightly changed. The priority mechanism only applies to the EXREQ holder. Once an exclusive lock is obtained, any blocking shared or exclusive locks will have equal priority until the exclusive lock is released. Once released, shared locks can squeeze in, but then the next pending exclusive lock will assert its priority over any new shared locks when it wakes up and loops.
This isn't quite what I wanted, but it seems to work quite well. I had to make a trade-off in the EXREQ lock-grant mechanism to improve performance.
* In addition, we use atomic_fcmpset_long() instead of atomic_cmpset_long() to reduce cache line flip-flopping at least a little.
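C11's atomic_compare_exchange_weak() has the same failure semantics as atomic_fcmpset_long(): on failure it writes the observed value back into the expected operand, so the retry loop gets fresh data without issuing a separate load (and another potential cache line transaction) on every iteration. A sketch of the pattern:

    #include <stdatomic.h>
    #include <stdint.h>

    /* fcmpset-style retry loop: a failed exchange refreshes 'count'
     * with the value actually observed, so no extra re-read is needed
     * before the next attempt. */
    static void
    lock_flags_set(_Atomic uint64_t *lk_count, uint64_t flags)
    {
        uint64_t count = atomic_load_explicit(lk_count,
                                              memory_order_relaxed);

        while (!atomic_compare_exchange_weak_explicit(lk_count, &count,
                   count | flags, memory_order_acq_rel,
                   memory_order_relaxed))
            ;   /* 'count' already holds the current value; retry */
    }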
* Remove lockcount() and lockcountnb(), which tried to count lock refs. Replace with lockinuse(), which simply tells the caller whether the lock is referenced or not.
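Under the new layout, lockinuse() reduces to a simple mask test; a sketch using the assumed masks from above:

    #include <stdbool.h>
    #include <stdint.h>

    #define LKC_SMASK   0xFFFFFFFF00000000ULL  /* assumed shared count mask */
    #define LKC_XMASK   0x000000000FFFFFFFULL  /* assumed exclusive count mask */

    /* The lock is "in use" if either count is nonzero; unlike the
     * removed lockcount()/lockcountnb(), no reference count is
     * reported. */
    static bool
    lockinuse_sketch(uint64_t lk_count)
    {
        return ((lk_count & (LKC_SMASK | LKC_XMASK)) != 0);
    }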
* Expand some of the copyright notices (years and authors) for major rewrites. There are really a lot more, and I need to pay more attention to such adjustments.
|