#
5c8d05e2 |
| 06-Aug-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 2.1:01 - Stability
* Fix a bug in the B-Tree code. Recursive deletions are done prior to marking a node as actually being empty, but setup for the deletion (by calling hammer_cursor_deleted_element()) must still occur prior to the recursion so cursor indexes are properly adjusted for the possible removal. If the recursion is not successful we can just leave the cursors post-adjusted since the subtree has an empty leaf anyway.
* Rename HAMMER_CURSOR_DELBTREE to HAMMER_CURSOR_RETEST so its function is more apparent.
* Properly set the HAMMER_CURSOR_RETEST flag when relocking a cursor that has tracked a ripout, so the cursor's new current element is re-tested by any iteration using the cursor.
* Remove code that allowed a SETUP record to be converted to a FLUSH record if the target inode is already in the correct flush group. The problem is that the target inode has already set up its sync state for the backend and the nlinks count will not be correct if we add another directory ADD/DEL record to the flush. While strictly a temporary nlinks mismatch (the next flush would correct it), a crash occurring here would result in inconsistent nlink counts on the media.
* Reference and release buffers instead of directly calling low level hammer_io_deallocate(), and generally reference and release buffers around reclamations in the buffer/io invalidation code to avoid races. In particular, the buffer must be referenced during a call to hammer_io_clear_modify().
* Fix a buffer leak in hammer_del_buffers() which is not only bad unto itself, but can also cause reblocking assertions on the presence of buffer aliases later on.
* Return ENOTDIR if rmdir attempts to remove a non-directory.
Reported-by: Francois Tigeot <ftigeot@wolfpond.org> (rmdir)
Reported-by: YONETANI Tomokazu <qhwt+dfly@les.ath.cx> (multiple)
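The cursor adjustment described in the first and third items above can be pictured with a minimal sketch. This is not HAMMER's hammer_cursor_deleted_element(); the struct, the function name, and the flag value are hypothetical, and only the idea (shift indexes above the removed slot, flag a cursor sitting on the removed slot for re-test) is taken from the commit message.

```c
#include <assert.h>

#define CURSOR_RETEST 0x1   /* hypothetical flag value */

struct cursor {
    int index;   /* element index within the B-Tree node */
    int flags;
};

/*
 * Adjust one tracked cursor for the removal of the element at
 * del_index. Called before the element is actually removed.
 */
void
cursor_deleted_element(struct cursor *c, int del_index)
{
    assert(del_index >= 0);
    if (c->index == del_index) {
        /* The slot will hold a different element: re-test it. */
        c->flags |= CURSOR_RETEST;
    } else if (c->index > del_index) {
        /* Elements above the removed slot shift down by one. */
        c->index--;
    }
}
```

An iteration that sees the re-test flag would re-examine its current element instead of blindly advancing past it.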
|
#
e469566b |
| 31-Jul-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER: Mirroring work
* Fix an invalidation race that can be triggered by the mirroring or reblocking code. The invalidation was being made before the direct IO completed rather than after.
* Fix an invalidation race. hammer_io_inval() was cleaning out any pre-existing buffer cache buffer aliases but was not cleaning out the VM backing store, resulting in CRC assertions (but no on-media corruption) by the mirroring code.
* Change the bulk-record sequencing to avoid adding the record to the inode's record list until after the direct-io has been initiated.
* Change the mirror_read code to generate PASS records for deleted records whose create_tid is out of bounds, so we do not have to transport the data for deleted data records. This greatly reduces the mirror bandwidth needed to mirror deletions.
The mirror_write code similarly will issue delete_tid updates as appropriate when presented with a PASS record.
* Mirror targets no longer strip deleted records which had yet to be created on the target. The record is now created so snapshot state is retained.
|
#
cdb6e4e6 |
| 18-Jul-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 63/Many: IO Error handling features
This commit removes all the remaining Debugger() calls and KKASSERTs in the I/O error path. Errors are now propagated up the call tree and properly reported.
* Report I/O errors instead of asserting.
* Read or Write errors in the flush path disable flushing and force the mount into read-only mode. Modified buffers are left locked in memory until umount to provide a consistent snapshot of the state of the filesystem.
You must umount and remount to recover the filesystem. The filesystem will automatically rollback to the last valid flush upon remounting.
* umount and umount -f are now able to unmount a HAMMER filesystem that has catastrophic write errors (e.g. pulling the USB cable on an external drive).
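The error policy described above (report the error, disable flushing, force read-only) can be sketched as follows. This is an illustration only: the mount_state struct, write_block(), and flush_one() are hypothetical names, and the real code paths are far more involved.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-mount state; not HAMMER's hammer_mount. */
struct mount_state {
    bool read_only;
    bool flush_disabled;
    int  error;          /* first I/O error seen by the flusher */
};

/* Stand-in for a media write: returns 0 or an errno value. */
int
write_block(int simulated_error)
{
    return simulated_error;
}

/*
 * Flush one buffer. On an I/O error, record it, stop all further
 * flushing, and force the mount read-only instead of asserting.
 */
int
flush_one(struct mount_state *mp, int simulated_error)
{
    int error;

    if (mp->flush_disabled)
        return EROFS;
    error = write_block(simulated_error);
    if (error) {
        mp->error = error;
        mp->flush_disabled = true;
        mp->read_only = true;
    }
    return error;   /* propagated up the call tree */
}
```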
|
#
ce0138a6 |
| 14-Jul-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 61E/Many: Features
* Implement hammer iostats.
|
#
1b0ab2c3 |
| 14-Jul-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 61F/Many: Stabilization w/ simultaneous pruning and reblocking
* BUG FIX: When doing a direct-read, check to see if any device buffers are aliasing the disk block and flush any we find which are dirty. This ensures that reblocked data gets to disk before a direct-read tries to read them FROM the disk.
* BUG FIX: Fix a bug introduced in a recent commit where the flusher will not always completely flush the UNDO FIFO or completely flush all meta-data, resulting in a rollback after a normal umount/mount.
* BUG FIX: Direct-writes queue I/O independent of the in-memory record. When the backend flusher flushes the record, making it available in the B-Tree, make sure that the independent I/O has completed. Otherwise a later reblocking operation might read the media before the direct-write has actually completed.
* BUG FIX: In-memory records are not subject to direct-IO, since their data is not yet on the media.
* BUG FIX: Do not allow mount to succeed unless all volumes have been found. (Reported-by: Sascha Wildner <saw@online.de>)
* BUG FIX: The bd_heatup() call in the reblocker was in the wrong place, potentially causing the cursor to shift unexpectedly.
* Reorient some of the buffer invalidation code by enhancing the reservation code.
* Add read CRC verification logic for some direct-reads, but comment it out because the VM system's bogus-page replacement breaks it.
|
#
ecca949a |
| 29-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 59E/Many: Stabilization pass - fixes for large file issues
* Correct a bug related to inodes moving between flush groups (when truncating a large file). Some records were not being moved, causing an assertion later on.
* Fix a leak of B_LOCKED buffers which could occur under heavy loads. Eventually enough would build up to deadlock the buffer cache.
|
#
f5a07a7a |
| 28-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 59D/Many: Sync with buffer cache changes in HEAD.
* Adjust hammer to limit dirty meta-data buffers based on total bytes rather than total buffers.
* Limit to 1/4 the buffer cache limit (for now)... a better solution is needed.
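A byte-based limit of this kind might look like the sketch below. The function names are hypothetical and the buffer cache limit is passed in as a plain parameter; only the "count bytes, cap at 1/4 of the buffer cache limit" rule comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Dirty meta-data cap: 1/4 of the buffer cache limit, in bytes. */
size_t
dirty_meta_limit(size_t bufcache_limit_bytes)
{
    return bufcache_limit_bytes / 4;
}

/* True when the dirty meta-data byte count exceeds the cap. */
bool
dirty_meta_over_limit(size_t dirty_bytes, size_t bufcache_limit_bytes)
{
    return dirty_bytes > dirty_meta_limit(bufcache_limit_bytes);
}
```

Counting bytes rather than buffers matters once buffers stop being a uniform size, since a fixed buffer count would then correspond to very different amounts of memory.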
|
#
5a930e66 |
| 23-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 57/Many: Pseudofs support
* Finish up implementation of the localization field which is used to split the B-Tree up into domains. Use the upper 16 bits as a pseudo filesystem selector.
* Code hammer_vop_mknod() to be able to create pseudo-filesystems.
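Packing a pseudo-filesystem selector into the upper 16 bits of the localization field can be sketched as below. The 32-bit field width, the macro names, and the helper functions are assumptions made for illustration; the commit message only states that the upper 16 bits are used as the selector.

```c
#include <stdint.h>

/* Assumed layout: 32-bit field, upper 16 bits select the pseudo-fs. */
#define LOCALIZE_PFS_SHIFT 16
#define LOCALIZE_PFS_MASK  0xFFFF0000u
#define LOCALIZE_LOW_MASK  0x0000FFFFu

/* Combine a pseudo-fs id with the low-order localization bits. */
uint32_t
localize_make(uint16_t pfs_id, uint16_t low)
{
    return ((uint32_t)pfs_id << LOCALIZE_PFS_SHIFT) | low;
}

/* Extract the pseudo-fs id from a localization value. */
uint16_t
localize_pfs(uint32_t localization)
{
    return (uint16_t)((localization & LOCALIZE_PFS_MASK) >> LOCALIZE_PFS_SHIFT);
}
```

Because the selector occupies the most significant bits, every key belonging to one pseudo-filesystem sorts into a single contiguous B-Tree domain.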
|
#
43c665ae |
| 21-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 56F/Many: Stabilization pass
* When data is reblocked any related direct-io offsets cached in front-end buffer cache buffers must be cleaned out. This also requires running through any snapshotted inodes referencing the same object.
* The flusher must check that the cached B-Tree node has not been flagged as deleted (HAMMER_NODE_DELETED) before seeking to it.
* hammer_io_direct_read() now requires and asserts that the second-level cached offset in the BIO is a zone-2 offset.
* hammer_io_direct_write() no longer overwrites the second-level cached offset with the third level raw disk offset. It pushes a third level to set the raw disk offset.
* When creating a directory entry, set the localization field for pseudo-fs support (which isn't quite working yet anyway so no biggy).
* Move the Red-Black tree generator for inodes from hammer_ondisk.c to hammer_inode.c.
|
#
4a2796f3 |
| 20-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 56C/Many: Performance tuning - MEDIA STRUCTURES CHANGED!
* Implement variable block sizes. Records at offsets < 1MB use 16K buffers while records at offsets >= 1MB use 64K buffers. This improves performance primarily by reducing the number of B-Tree elements we have to stuff.
* Mess around with the deadlock handling code a bit. It still needs a re-think but it works. Implement low-priority shared locks. A low priority shared lock can only be acquired if no other locks are held by the thread.
* Implement slow-down code for the record backlog to the flusher and reimplement the slow-down code that deals with reclaimed inodes queued to the flusher. This should hopefully fix the kernel memory exhaustion issues for M_HAMMER.
* Update layer2->append_off. It isn't implemented yet but doing this now will prevent media incompatibilities later on when it does get implemented.
* Split hammer_blockmap_free() into hammer_blockmap_free() and hammer_blockmap_finalize().
* Fix a bug in the delayed-CRC handling related to reblocking. When throwing away a modified buffer, pending CRC requests must also be thrown away.
* Fix a bug in the record overlap compare code. If we cannot return 0 due to an overlap because the record has been deleted, we must still return an appropriate formal code so the scan progresses in the correct direction down the red-black tree.
* Make data in the meta-data zone a meta-data buffer structure type so it gets synced to disk at the appropriate time. This may be temporary, it's needed to deal with the atime/mtime code but another commit may soon make it moot.
* Bump the seqcount so cluster_read() does the right thing when reading into a large UIO just after opening a file.
* Do a better job calculating vap->va_bytes. It's still fake, but it's a more accurate fake.
* Fix an issue in the BMAP code related to ranges that do not cover the requested logical offset.
* Fix a bug in the blockmap code. If a reservation is released without finalizing any allocations within that big-block, another zone can steal it out from under the current zone's next_offset, resulting in a zone mismatch panic.
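The variable block size rule from the first item of this commit (16K buffers below the 1MB file offset, 64K buffers at or above it) reduces to a small selection function. The macro and function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SMALL_BUFSIZE   (16 * 1024)     /* used below the threshold */
#define LARGE_BUFSIZE   (64 * 1024)     /* used at or above it */
#define LARGE_THRESHOLD (1024 * 1024)   /* 1MB file offset */

/* Buffer size used for a record at the given file offset. */
int
blocksize_for_offset(int64_t file_offset)
{
    assert(file_offset >= 0);
    return (file_offset < LARGE_THRESHOLD) ? SMALL_BUFSIZE : LARGE_BUFSIZE;
}
```

Under this rule a 1GB file needs 64 elements for its first megabyte and then one element per 64K, a quarter of what uniform 16K blocks would require.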
|
#
bcac4bbb |
| 18-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 56B/Many: Performance tuning - MEDIA STRUCTURES CHANGED!
* MEDIA CHANGE: The atime has been moved back into the inode data proper. The nlinks field has also been moved.
* PERFORMANCE: The CRC for cached B-Tree nodes was being run on every access instead of just the first time. This was the cause of HAMMER's poor directory scanning performance and cpu-intensive write flushes.
Adjusted to only check the CRC on the initial load into the buffer cache.
* PERFORMANCE: The CRC for modified B-Tree nodes was being regenerated every time the node was modified, so a large number of insertions or deletions modifying the same B-Tree node needlessly regenerated the CRC each time.
Adjusted to delay generation of the CRC until just before the buffer is flushed to the physical media.
Just for the record, B-Tree nodes are 4K and it takes ~25uS to run a CRC on them. Needless to say removing the unnecessary calls solved a lot of performance issues.
* PERFORMANCE: Removed limitations in the node caching algorithms. Now more than one inode can cache pointers to the same B-Tree node.
* PERFORMANCE: When calculating the parent B-Tree node we have to scan the element array to locate the index that points back to the child. Use a power-of-2 algorithm instead of a linear scan.
* PERFORMANCE: Clean up the selection of ip->cache[0] or ip->cache[1] based on whether we are trying to cache the location of the inode or the location of the file object's data.
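The delayed CRC generation described above can be sketched with a dirty flag: modifications only mark the node, and the CRC is computed once just before the flush. Everything here is illustrative. The node struct and function names are hypothetical, the payload is 64 bytes rather than a 4K node, and the bitwise CRC-32 is a stand-in for whatever CRC routine the kernel actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t
crc32_calc(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

struct node {
    uint8_t  payload[64];
    uint32_t crc;
    bool     crc_dirty;   /* CRC is stale and must be regenerated */
};

/* Modify the node; the CRC is NOT recomputed here. */
void
node_modify(struct node *n, size_t off, uint8_t value)
{
    assert(off < sizeof(n->payload));
    n->payload[off] = value;
    n->crc_dirty = true;
}

/* Called once, just before the node is written to the media. */
void
node_flush_prepare(struct node *n)
{
    if (n->crc_dirty) {
        n->crc = crc32_calc(n->payload, sizeof(n->payload));
        n->crc_dirty = false;
    }
}
```

With this shape, a thousand insertions into the same node cost one CRC pass at flush time instead of a thousand.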
|
#
cb51be26 |
| 17-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 56A/Many: Performance tuning - MEDIA STRUCTURES CHANGED!
* MEDIA CHANGE: The blockmaps have been folded into the freemap. Allocations are now made directly out of the freemap. More work is expected here.
The blockmaps are still used to sequence allocations, but no block number translation is required any more. This didn't improve performance much but it will make it easier for future optimizations to localize allocations.
* PERFORMANCE: Removed the holes recording code. Another commit will soon take over the functionality.
* PERFORMANCE: The flusher's slave threads now collect a number of inodes into a batch before starting their work, in an attempt to reduce deadlocks between slave threads from adjacent inodes.
* PERFORMANCE: B-Tree positional caching now works much better, greatly reducing the cpu overhead when accessing the filesystem.
* PERFORMANCE: Added a write-append optimization. Do not do a lookup/iteration to locate records being overwritten when no such records should exist. This cuts the cpu overhead of write-append flushes in half.
* PERFORMANCE: Add a vfs.hammer.write_mode sysctl feature to test out two different ways of queueing write I/O's.
* Added B-Tree statistics (hammer bstats 1).
|
#
bf3b416b |
| 14-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 55: Performance tuning and bug fixes - MEDIA STRUCTURES CHANGED!
* BUG-FIX: Fix a race in hammer_rel_mem_record() which could result in a machine lockup. The code could block at an inappropriate time with both the record and a dependency inode pointer left unprotected.
* BUG-FIX: The direct-write code could assert on (*error != 0) due to an incorrect conditional in the in-memory record scanning code.
* Inode data and directory entry data has been given its own zone as a stop-gap until the low level allocator can be rewritten.
* Increase the directory object-id cache from 128 entries to 1024 entries.
* General cleanup.
* Introduce a separate reblocking domain for directories: 'hammer reblock-dirs'.
|
#
7bc5b8c2 |
| 13-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 54D/Many: Performance tuning.
* Remove major barriers to write performance and fix hiccups revealed by blogbench.
Change the HAMMER reclaim-delay algorithm to operate like a FIFO instead of as a free-for-all. The idea of introducing a dynamic delay helped some, but the addition of the wakeup FIFO allows burst completions by the flusher to immediately wake up processes that were waiting for the reclaim count to drain. The result is far, far smoother operation.
* Remove a major blocking conflict between the buffer cache daemon and HAMMER. The buffer cache was getting stuck on trying to overwrite dirty records that had already been queued to the flusher. The flusher might not act on the record(s) for a long period of time, causing the buffer cache daemon to stall.
Fix the problem by properly using the HAMMER_RECF_INTERLOCK_BE flag, which stays on only for a very short period of time, instead of testing the record's flush state (record->flush_state), which can stay in the HAMMER_FST_FLUSH state for a very long time.
* The parent B-Tree node does not need to be locked when inserting into the child.
* Use the new B_AGE semantics to keep meta-data intact longer. This results in another big improvement in random read and write performance.
|
#
a99b9ea2 |
| 11-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 54/Many: Performance tuning
* Implement hammer_vop_bmap().
* Implement cluster_read() support. This should jump up linear read performance almost to the platter speed. I get 100 MB/sec testing vs 35 MB/sec previously.
* Do a better job kicking an inode into the flusher when writing sequentially. This hops up write rate at least +50%. It isn't quite able to run at the platter speed due to B-Tree overheads which will be addressed in a later patch.
* Do not create data fragments at the ends of files greater than 16K, use a full 16K block. The reason is that fragments in HAMMER are allocated out of a wholly different zone and we do not want to lose the chance of making the tail end of the file contiguous.
Files less than 16K still use data fragments.
* Fix a machine lockup related to an interrupt race with biodone() and insertions and deletions from hmp->lose_list.
* Fix a memory exhaustion issue.
Reported-by: Francois Tigeot <ftigeot@wolfpond.org> (machine lockup)
Credit-also: Jonathan Stuart on the 0 byte sized file bug fix.
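The tail-of-file allocation rule from this commit can be sketched as a function that returns how many bytes are allocated for the final piece of a file. The function name is hypothetical, and any rounding of small fragments to an on-disk alignment is deliberately ignored here.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_BLOCK_SIZE (16 * 1024)

/*
 * Bytes allocated for the last piece of a file of the given size.
 * Files under 16K get a fragment sized to the data; larger files
 * always end in a full 16K block so the tail can stay contiguous.
 */
int64_t
tail_alloc_size(int64_t file_size)
{
    assert(file_size >= 0);
    if (file_size == 0)
        return 0;
    if (file_size < DATA_BLOCK_SIZE)
        return file_size;        /* small file: data fragment */
    return DATA_BLOCK_SIZE;      /* larger file: full block */
}
```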
|
#
af209b0f |
| 10-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 53H/Many: Performance tuning, bug fixes
* CHANGE THE ON-MEDIA B-TREE STRUCTURE. The number of elements per node has been increased from 16 to 64. The intent is to reduce the number of seeks required in a heavy random-access loading situation.
* Add a shortcut to the B-Tree node scanning code (requires more testing). Instead of scanning linearly we do a power-of-2 narrowing search.
* Only do clustered reads for DATA types. Do not cluster meta-data (aka B-Tree) I/O. Note that the inode data structure is considered to be a DATA type. Reduce the cluster read size from 256K to 64K to avoid blowing out the buffer cache.
* Augment hammer locks so one can discern between a normal lock blockage and one that is recovering from a deadlock.
* Change the slave work threads for the flusher to pull their work off a single queue. This fixes an issue where one slave work thread would sometimes get a disproportionate percentage of the work and the master thread then had to wait for it to finish while the other work threads were twiddling their thumbs.
* Adjust the wait reclaims code to solve a long standing performance issue. The flusher could get so far behind that the system's buffer cache buffers would no longer have any locality of reference to what was being flushed, causing a massive drop in performance.
* Do not queue a dirty inode to the flusher unconditionally in the strategy write code. Only do it if system resources appear to be stressed. The inode will get flushed when the filesystem syncs.
* Code cleanup.
* Fix a bug reported by Antonio Huete Jimenez related to 0-length writes not working properly.
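One way to write a power-of-2 narrowing search over a node's sorted element array is shown below. It is a generic sketch, not the code from this commit: keys are plain 64-bit integers rather than B-Tree elements, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the index of the first key >= target in a sorted array,
 * or count if every key is smaller. Probes with halving power-of-2
 * steps instead of scanning linearly.
 */
int
node_search(const int64_t *keys, int count, int64_t target)
{
    int base = 0;   /* keys[0..base-1] are known to be < target */
    int step = 1;

    assert(count >= 0);
    while (step * 2 <= count)
        step *= 2;               /* largest power of 2 <= count */
    for (; step > 0; step /= 2) {
        if (base + step <= count && keys[base + step - 1] < target)
            base += step;
    }
    return base;
}
```

For a 64-element node this takes 7 probes where a linear scan averages about 32 comparisons.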
|
#
cebe9493 |
| 10-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 53D/Many: Stabilization
* Fix an overwrite bug with direct write which could result in file corruption.
* Reserve just-freed big blocks for two flush cycles to prevent HAMMER from overwriting destroyed data so it does not become corrupt if the system crashes. This is needed because the recover code does not record UNDOs for data (nor do we want it to).
* More I/O subsystem work. There may still be an elusive panic related to calls to regetblk().
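The two-flush-cycle hold on just-freed big blocks could be tracked along these lines. The struct, the flush sequence counter, and the exact way cycles are counted are assumptions for illustration; the commit message only says that freed big blocks are reserved for two flush cycles.

```c
#include <stdbool.h>
#include <stdint.h>

#define REUSE_DELAY 2   /* flush cycles a freed big block stays reserved */

/* Hypothetical record of a freed big block. */
struct freed_block {
    uint64_t offset;
    uint64_t freed_flush_seq;   /* flush cycle in which it was freed */
};

/* A freed block may be reallocated only after the delay has passed. */
bool
block_reusable(const struct freed_block *b, uint64_t cur_flush_seq)
{
    return cur_flush_seq >= b->freed_flush_seq + REUSE_DELAY;
}
```

Holding the block back means a crash shortly after the free rolls back to a state whose data blocks have not been overwritten, which matters because no UNDO records exist for data.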
|
#
9f5097dc |
| 09-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 53C/Many: Stabilization
* HAMMER queues dirty inodes reclaimed by the kernel to the backend for their final sync. Programs like blogbench can overload the backend and generate more new inodes than the backend can dispose of, running M_HAMMER out of memory.
Add code to stall on vop_open() when this condition is detected to give the backend a chance to catch up. (see NOTE 1 below).
* HAMMER could build up too many meta-data buffers and cause the system to deadlock in newbuf. Recode the flusher to allow a block of UNDOs, the volume header, and all related meta-data buffers to be flushed piecemeal, and then continue the flush loop without closing out the transaction. If a crash occurs the recovery code will undo the partial flushes.
* Fix an issue located by FSX under load. The in-memory/on-disk record merging code was not dealing with in-memory data records properly. The key field for data records is (base_offset + data_len), not just (base_offset), so a 'match' between an in-memory data record and an on-disk data record requires a special case test. This is the case where the in-memory record is intended to overwrite the on-disk record, so the in-memory record must be chosen and the on-disk record discarded for the purposes of read().
* Fix a bug in hammer_io.c related to the handling of B_LOCKED buffers that resulted in an assertion at umount time. Buffer cache buffers were not being properly disassociated from their hammer_buffer counterparts in the direct-write case.
* The frontend's direct-write capability for truncated buffers (such as used with small files) was causing an assertion to occur on the backend. Add an interlock on the related hammer_buffer to prevent the frontend from attempting to modify the buffer while the backend is trying to write it to the media.
* Dynamically size the dirty buffer limit. This still needs some work.
(NOTE 1): On read/write performance issues. Currently HAMMER's frontend VOPs are massively disassociated from modifying B-Tree updates. Even though a direct-write capability now exists, it applies only to bulk data writes to disk and NOT to B-Tree updates. Each direct write creates a record which must be queued to the backend to do the B-Tree update on the media. The flusher is currently single-threaded and when HAMMER gets too far behind doing these updates the current safeties will cause performance to degrade drastically. This is a known issue that will be addressed.
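The data record key issue in the FSX item above can be illustrated with a small sketch. It shows one plausible reading of the special case, namely that two data records cover the same starting offset even though their keys differ; the struct and function names are hypothetical and the actual merge logic is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified data record. */
struct data_rec {
    int64_t base_offset;   /* file offset where the data starts */
    int32_t data_len;      /* length of the data in bytes */
};

/* Key as described in the commit message: base_offset + data_len. */
int64_t
data_rec_key(const struct data_rec *r)
{
    return r->base_offset + r->data_len;
}

/*
 * Assumed special-case test: the in-memory record replaces the
 * on-disk record when both start at the same file offset, even if
 * their keys (which include the length) are not equal.
 */
bool
mem_overrides_disk(const struct data_rec *mem, const struct data_rec *disk)
{
    return (data_rec_key(mem) - mem->data_len) ==
           (data_rec_key(disk) - disk->data_len);
}
```

A plain key comparison would treat a 4K in-memory record and a 16K on-disk record at offset 0 as unrelated, so read() could return the stale on-disk data.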
|
#
0832c9bb |
| 08-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 53B/Many: Complete overhaul of strategy code, reservations, etc
* Completely overhaul the strategy code. Implement direct reads and writes for all cases. REMOVE THE BACKEND BIO QUEUE. BIOs are no longer queued to the flusher under any circumstances.
Remove numerous hacks that were previously emplaced to deal with BIO's being queued to the flusher.
* Add a mechanism to invalidate buffer cache buffers that might be shadowed by direct I/O. e.g. if a strategy write uses the vnode's bio directly there may be a shadow hammer_buffer that will then become stale and must be invalidated.
* Implement a reservation tracking structure (hammer_reserve) to track storage reservations made by the frontend. The backend will not attempt to free or reuse reserved space if it encounters it.
Use reservations to back cached holes (struct hammer_hole) for the same reason.
* Index hammer_buffer on the zone-X offset instead of the zone-2 offset. Base the RB tree in the hammer_mount instead of (zone-2) hammer_volume. This removes nearly all blockmap lookup operations from the critical path.
* Do a much better job tracking cached dirty data for the purposes of calculating whether the filesystem will become full or not.
* Fix a critical bug in the CRC generation of short data buffers.
* Fix a VM deadlock.
* Use 16-byte alignment for all on-disk data instead of 8-byte alignment.
* Major code cleanup.
As of this commit write performance is now extremely good.
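The 16-byte alignment mentioned in the list above amounts to rounding every on-disk data size up to the next multiple of 16. A common way to express this is sketched below; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define DATA_ALIGN 16   /* must be a power of 2 */

/* Round a byte count up to the next multiple of DATA_ALIGN. */
uint64_t
data_align_up(uint64_t bytes)
{
    return (bytes + (DATA_ALIGN - 1)) & ~(uint64_t)(DATA_ALIGN - 1);
}
```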
|
#
47637bff |
| 07-Jun-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 53A/Many: Read and write performance enhancements, etc.
* Add hammer_io_direct_read(). For full-block reads this code allows a high-level frontend buffer cache buffer associated with the regular file vnode to directly access the underlying storage, instead of loading that storage via a hammer_buffer and bcopy()ing it.
* Add a write bypass, allowing the frontend to bypass the flusher and write full-blocks directly to the underlying storage, greatly improving frontend write performance. Caveat: See note at bottom.
The write bypass is implemented by adding a feature whereby the frontend can soft-reserve unused disk space on the physical media without having to interact (much) with on-disk meta-data structures. This allows the frontend to flush high-level buffer cache buffers directly to disk and release the buffer for reuse by the system, resulting in very high write performance.
To properly associate the reserved space with the filesystem so it can be accessed in later reads, an in-memory hammer_record is created referencing it. This record is queued to the backend flusher for final disposition. The backend disposes of the record by inserting the appropriate B-Tree element and marking the storage as allocated. At that point the storage becomes official.
* Clean up numerous procedures to support the above new features. In particular, do a major cleanup of the cached truncation offset code (this is the code which allows HAMMER to implement wholly asynchronous truncate()/ftruncate() support).
Also clean up the flusher triggering code, removing numerous hacks that had been in place to deal with the lack of a direct-write mechanism.
* Start working on statistics gathering to track record and B-Tree operations.
* CAVEAT: The backend flusher creates a significant cpu burden when flushing a large number of in-memory data records. Even though the data itself has already been written to disk, there is currently a great deal of overhead involved in manipulating the B-Tree to insert the new records. Overall write performance will only be modestly improved until these code paths are optimized.
|
#
19b97e01 |
| 18-May-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 46B/Many: Stabilization pass
* Add a feature to vmntvnodescan() to only do one pass on the vnode list. Have HAMMER use it.
* Fix a buffer cache leak. Buffers could wind up disassociated from their HAMMER structures while in a B_LOCKED state, preventing the kernel from reusing them.
|
#
2f85fa4d |
| 18-May-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 46/Many: Performance pass, media changes, bug fixes.
* Add a localization field to the B-Tree element which has sorting priority over the object id.
Use the localization field to separate inode entries from file data. This allows the reblocker to cluster inode information together and greatly improves directory/stat performance.
* Enhance the reblocker to reblock internal B-Tree nodes as well as leaves.
* Enhance the reblocker by adding 'reblock-inodes' in addition to 'reblock-data' and 'reblock-btree', allowing individual types of meta-data to be independently reblocked.
* Fix a bug in hammer_bread(). The buffer's zoneX_offset field was sometimes not being properly masked, resulting in unnecessary blockmap lookups. Also add hammer_clrxlate_buffer() to clear the translation cache for a hammer_buffer.
* Fix numerous issues with hmp->sync_lock.
* Fix a buffer exhaustion issue in the pruner and reblocker due to not counting I/O's in progress as being dirty.
* Enhance the symlink implementation. Take advantage of the extra 24 bytes of space in the inode data to directly store symlinks <= 24 bytes.
* Use cluster_read() to gang read I/O's into 64KB chunks. Rely on localization and the reblocker and pruner to make doing the larger I/O's worthwhile.
These changes reduce ls -lR overhead on 43383 files (half created with cpdup, half totally randomly created with blogbench). Overhead went from 35 seconds after reblocking, before the changes, to 5 seconds after reblocking, after the changes.
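Giving the localization field sorting priority over the object id, as the first item of this commit describes, means the key comparison tests it first. The sketch below uses a hypothetical three-field key and function name; the real B-Tree element has more fields.

```c
#include <stdint.h>

/* Hypothetical, simplified B-Tree key. */
struct btree_key {
    uint32_t localization;   /* compared first */
    int64_t  obj_id;         /* compared second */
    int64_t  key;            /* compared last */
};

/* Return -1, 0, or 1 in the usual comparison convention. */
int
btree_key_cmp(const struct btree_key *a, const struct btree_key *b)
{
    if (a->localization != b->localization)
        return (a->localization < b->localization) ? -1 : 1;
    if (a->obj_id != b->obj_id)
        return (a->obj_id < b->obj_id) ? -1 : 1;
    if (a->key != b->key)
        return (a->key < b->key) ? -1 : 1;
    return 0;
}
```

With inode entries and file data assigned different localization values, all inode entries sort together regardless of object id, which is what lets the reblocker cluster them.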
|
#
09ac686b |
| 15-May-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 45/Many: Stabilization pass, undo sequencing.
* The flusher was improperly requesting a reflush on buffers. The flush request was being deferred for any buffers with active front-end references and then wound up being flushed by the front-end, breaking ordering requirements.
Remove the reflush flag entirely. This fixes numerous crash recovery cases.
* Add a missing unlock in the reblocking ioctl code which was responsible for a number of process lockups.
* Enhance the undo recovery kprintf.
* Validate the CRC in UNDO records.
|
#
77062c8a |
| 06-May-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 41B/Many: Cleanup.
* Disable (most) debugging kprintfs unless a hammer debug sysctl is set.
* Do not allow buffers to be synced on panic.
|
#
c9b9e29d |
| 04-May-2008 |
Matthew Dillon <dillon@dragonflybsd.org> |
HAMMER 40F/Many: UNDO cleanup & stabilization.
* Properly classify UNDO zone buffers so they are flushed at the correct point in time.
* Minor rewrite of the code tracking the UNDO demark for the next flush.
* Introduce a considerably better backend flushing activation algorithm to avoid single-buffer flushes.
* Put a lock around the freemap allocator.
|