History log of /dflybsd-src/sys/vfs/hammer/hammer_reblock.c (Results 26 – 50 of 85)
Revision Date Author Comments
# 745703c7 07-Jul-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

hammer: Remove trailing whitespaces

- (Non-functional commits could make it difficult to git-blame
the history if there are too many of those)


# a981af19 02-Jul-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Change "bigblock" to "big-block"

- There are (or were) several terms for the 8MB chunk, for example
"big-block", "bigblock", "big block", "large-block", etc., but
"big-block" seems to be the canonical term.

- Changes are mostly comments and some in printf and hammer(8).
Variable names (e.g. xxx_bigblock_xxx) remain unchanged.

- The official design document as well as much of the existing
code (excluding variable and macro names) use "big-block".
https://www.dragonflybsd.org/hammer/hammer.pdf

- Also see e04ee2de and the previous commit.


# d165c90a 02-Jul-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Change "big block" to "big-block"

- This word refers to the 8MB chunk in hammer's blockmap layers,
not literally a "big" "block".

- Changes are mostly comments and some in printf and hammer(8).

- The official design document as well as much of the existing
code (excluding variable and macro names) use "big-block".
https://www.dragonflybsd.org/hammer/hammer.pdf

- Also see e04ee2de.


# f1c0ae53 30-Apr-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Add inline functions hammer_modify_buffer|volume_noundo()

- Add noundo wrappers hammer_modify_buffer|volume_noundo() similar to
the existing inline function hammer_modify_node_noundo() for better
readability.

- A pair of args (NULL, 0) indicating that it's not generating undo is
a bit unclear (and there are even comments for them).

- (The compiler doesn't actually inline hammer_modify_node_noundo()
in my environment, but these one-line wrappers are inlined)


# 5e1e1454 24-Apr-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Add -A option to reblock|rebalance all pfs

- The -A option makes certain per-PFS hammer commands operate on all PFSs
of the filesystem that the [filesystem] arg belongs to. Currently the
hammer reblock and rebalance commands support this. It has no effect
on other commands.

- With -A option, above hammer commands use a range of 0 to 0xFFFF
for pfs id (upper 16 bits) of the cursor localization. This makes
it iterate all pfs in the filesystem.

- Above difference in localization range means btree iteration
applies to larger range of nodes in terms of pfs id, since it's
been used as a top priority key that works as a localizing factor
of pfs within the btree. There is no logical difference other than
the range is different. So performing these commands on all pfs is
as simple as changing the localization range (unless other keys are
involved as additional parameters like hammer prune does).
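
The localization range described above can be sketched as follows. This is an illustrative Python model, not the kernel code: the 32-bit width, the helper name `localization_range`, and spanning the whole low 16 bits are assumptions made for the example. Only the 0 to 0xFFFF PFS id range in the upper 16 bits comes from the commit message.

```python
PFS_SHIFT = 16          # assumption: 32-bit localization, PFS id in the upper 16 bits
LOW_MASK = 0xFFFF       # assumption: low 16 bits are spanned in full by the range


def localization_range(pfs_id=None):
    """Return (begin, end) localization bounds for a B-Tree scan.

    pfs_id=None models the -A option: PFS ids 0 through 0xFFFF.
    Otherwise the range covers exactly one PFS.
    """
    lo, hi = (0, 0xFFFF) if pfs_id is None else (pfs_id, pfs_id)
    return (lo << PFS_SHIFT, (hi << PFS_SHIFT) | LOW_MASK)
```

With a single PFS the scan is confined to one 16-bit slice of the key space; with -A the same iteration simply runs over a wider range, which matches the point that no other logic changes.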


# 6540d157 21-Apr-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Cleanup cursor initialization code on reblock

- Just make things a bit more clear.

- The rule is the ioctl caller sets localization type to reblock,
and the ioctl code adds up ip localization to initialize cursor.


# 558a44e2 21-Apr-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Fix comment

- Sync a comment with what's written in reblock_usage().


# 4fa5fb92 21-Apr-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

sys/vfs/hammer: Cleanup sanity checks

- Move sanity checks to the beginning of the function.

- Check 'free_level > HAMMER_BIGBLOCK_SIZE'.
free_level is somewhere between 0 and 8MB (inclusive).
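
A minimal sketch of the range check described above, written in Python for illustration only. The constant name mirrors `HAMMER_BIGBLOCK_SIZE` from the commit message and the 8MB value is taken from it; the function name and the boolean return are assumptions, not the kernel's actual interface.

```python
HAMMER_BIGBLOCK_SIZE = 8 * 1024 * 1024  # 8MB, per the commit message


def free_level_is_valid(free_level):
    """Accept free_level only if it lies in [0, 8MB], inclusive on both ends."""
    return 0 <= free_level <= HAMMER_BIGBLOCK_SIZE
```

Running such a check first means an out-of-range request is rejected before any cursor or lock is set up.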


# e04ee2de 31-Jan-2015 Tomohiro Kusumi <kusumi.tomohiro@gmail.com>

hammer: fix terminology of "large block"

This cleanup patch changes terminology "large block" to "big block".

- Both "large block" and "big block" are widely used in hammer source from
kernel to userspace, however these two refer to the same data structure which
is an 8MB-sized chunk within the low-level blockmapped storage layer.

- The original design document https://www.dragonflybsd.org/hammer/hammer.pdf
uses big block for this data structure. Having two expressions in its
implementation is confusing and makes grep difficult.

Closes: #2782


# f31f6d84 07-Jan-2013 Sascha Wildner <saw@online.de>

kernel/hammer: Remove unused variables and add __debugvar.


# 55b50bd5 16-Jan-2012 Matthew Dillon <dillon@apollo.backplane.com>

kernel - Fix 3:00 a.m. crashes (deadlocks) related to HAMMER VM use

When memory is low and the pageout daemon needs to write things out we still
need to have at least some reserve to perform the supporting operations for
the pageout. HAMMER is particularly memory intensive and could get into a
situation where insufficient reserve memory was available, deadlocking the
system.

With these changes DragonFly should run stable on systems with as little
as 256M of ram, and possibly a bit lower.

* The getblk/bread/bwrite/etc brelse/bqrelse sequence used to manage buffers
had several bugs in it that prevented the low memory handling code from
operating properly. The b[q]relse() sequence was not properly detecting
the low memory condition and freeing or caching the underlying VM pages
(when possible).

* Also change the low memory test used by the buffer cache from 'severe'
to 'min' in kern/vfs_bio.c. We may be able to change this back to 'severe'
at a later date with further testing. These tests are in brelse(),
bqrelse(), and vfs_vmio_release().

* Rewrite bio_page_alloc(). It effectively does the same thing that it did
before but should operate more smoothly. We also no longer try to recover
pages from unrelated buffer cache buffers from this function, which could
lead to deadlocks. The warning kprintf is now also rate-limited.

* Add a buffer overload test in the hammer dedup ioctl. A hammer dedup
could cause a buffer cache deadlock by allowing too many dirty buffers
to build up.

* Add a VM memory test to the core hammer flusher code that was previously
only checking for the UNDO meta-data and buffer overload limits. This
is now done on a per-record basis and should prevent HAMMER from allocating
too much memory during a flusher operation when the VM system is already
too low on memory.

* Add some vm_wait_nominal() calls in critical I/O paths, but make sure we
do not use these calls in any I/O path used by the HAMMER pageout code.

Probably the most important path is the vm_object_page_clean*() code
path, effectively called via either msync() or via the 30-60 second
system sync.

* Properly bawrite() a buffer in hammer_vop_write() when IO_ASYNC is set
(which is used by the pageout daemon), otherwise the pageout daemon will
not be able to directly recover memory in low memory situations when
paging to a HAMMER file mapped SHARED/RW.

Testing-by: tuxillo, lentferj, ftigeot, dillon


# e86903d8 11-Apr-2011 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - Fix degenerate stall condition in flusher during unmount

* Fix a case where the flusher can stall during an unmount.

* Rework the flusher sequence numbers to always allocate a sequence number
when a flush is requested, remove the flusher.act field, and rejigger the
code a bit.

* This also cleans up an edge case when a full sync is inserted (when taking
snapshots, filesystem sync, etc), by inserting several sequence numbers to
completely flush the UNDO/REDO FIFO before moving on to the next active
flush group.

Reported-by: Sepherosa Ziehau <sepherosa@gmail.com>, Francois Tigeot <ftigeot@wolfpond.org>, numerous others.


# 18bee4a2 03-Apr-2011 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - Implement swapcache for HAMMER data in double_buffer mode

* Support swapcache data caching when HAMMER's double_buffer mode is enabled.
Typically the following sysctls:

vfs.hammer.double_buffer=1
vm.swapcache.read_enable=1
vm.swapcache.data_enable=1
vm.swapcache.meta_enable=1 (optional)
vm.swapcache.use_chflags=0 (optional - see man swapcache)

* This causes swapcache to attempt to cache file data from HAMMER
filesystems stored via the block device instead of the individual
file vnodes.

* This allows swapcache to more efficiently cache file data without
vnode recycling from a limited kern.maxvnodes value getting in the way.

If you have a large dataset spread across many smaller files which would
normally overwhelm maxvnodes, and even on large systems handling very
large data sets where you wish to cache the file data for some of the
files (using use_chflags=1 mode), this makes it possible to cache ALL
the file data AND meta-data on the SSD even though the related vnodes
cached by the kernel get recycled.

* Whereas it may have been inefficient to turn on vm.swapcache.data_enable
before, due to filesystem scans and such, it may now be possible to turn this
feature on with double buffering also enabled.

Note that you must still be cognizant of the aggregate amount of file
data being accessed by your system if you have set use_chflags to 0, you
simply no longer need to worry about how many files that data belongs to.

* Enabling HAMMER's double_buffer mode will reduce performance somewhat for
the normal best-case file caching, but it will also greatly improve
performance once you start blowing out your memory caches.


# b4f86ea3 12-Jan-2011 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - Remove B-Tree allocation hints, add double_buffer option.

* Remove the allocation hints when allocating b-tree nodes and
remove over-full test in the blockmap allocator for b-tree and
meta-data elements.

The hinting and leaving some space unused in the big-blocks did
not improve performance. Write performance is actually slightly
better when new allocations are made linearly.

* Either way we have to depend on the reblocker to reorganize the
B-Tree.

* Add a sysctl vfs.hammer.double_buffer, defaulting to off. This
is currently used for debugging and testing live-dedup.

Normally only small-data blocks are run through the device vnode's
buffer cache (allowing us to consolidate many small data blocks
within the device vnode's buffer cache), and large data blocks are
read directly into the file vnode's buffer.

Turning on double_buffer causes ALL file data to run through the
device vnode's buffer cache, resulting in double data caching
which is not normally useful, so leave this off for now. It will
not improve performance.


# b9107f58 17-Aug-2010 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER Utility - Add catastrophic recovery feature

* hammer -f <devices> recover <empty_target_dir>

* Add a catastrophic recovery feature. A HAMMER filesystem image is
scanned (using the -f <blockdevs> specification). Any buffer which
looks like a B-Tree node is then sub-scanned for inode, directory, and
data records and the filesystem is reconstructed in the specified
target directory.

* The files and directories are initially named after the object id
and are renamed and moved as directory entries are found to resolve
the fragmentary information.

* File writes strip trailing 0's (data records are not limited to the
file EOF), but will properly truncate the file if/when the related
inode record is found.

* Currently no attempt is made to restore owner, group, file modes,
softlinks, or hardlinks (only one link will be restored).

TODO: Currently a valid volume header is required, but the only thing
we actually need from it is the vol_buf_beg field. This field
could be guessed or passed in on the command line in a future
update to the recovery code.
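
The file-size handling described above (strip trailing zeros until the inode record is known, then truncate to the recorded size) can be illustrated with a small Python sketch. This is not the recovery tool's code: the function name, the in-memory `bytes` model, and the `inode_size` parameter are all hypothetical.

```python
from typing import Optional


def recovered_contents(data: bytes, inode_size: Optional[int] = None) -> bytes:
    """Model how much of the recovered data is kept for a file.

    Without an inode record the true EOF is unknown, so trailing zero
    bytes are dropped. Once the inode record supplies a size, the data
    is truncated to exactly that length instead.
    """
    if inode_size is not None:
        return data[:inode_size]
    return data.rstrip(b"\x00")
```

The heuristic can drop zeros that genuinely belong to the file, which is why the size from the inode record takes precedence when it is found.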


# 07ed04b5 19-Apr-2010 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - Fix probable corruption case when filesystem becomes nearly full

* The reblocking code was incorrectly assuming the cursor would be pointing
at a valid node element after an unlock/relock sequence, when it could
actually be pointing at the EOF of a node. This case can occur when
the filesystem is nearly full (possibly due to the reblocking operation
itself), when the filesystem is also under load from unrelated
operations.

* This can result in the creation of a corrupted B-Tree leaf node or
data record.

* Corruption can be checked with hammer checkmap and hammer show
(as of this rev):

hammer -f device checkmap

Should output no B-Tree node records or free space mismatches.
You will still get the initial volume summary.

hammer -f device show | egrep '^B' | egrep -v '^BM'

Should output no records.

* Currently the only recourse if corruption is found is to copy off the
filesystem, newfs_hammer, and copy it back.

Full history and snapshots can be retained by using 'hammer -B mirror-read'
to copy off the filesystem and mirror-write to copy it back. However,
please remember you must do this for each PFS individually. Make sure
you have a viable backup before newfsing anything.

Reported-by: Francois Tigeot <ftigeot@wolfpond.org>, Jan Lentfer <Jan.Lentfer@web.de>


# ebbcfba9 01-Apr-2010 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - Fix insufficient cursor change test

* The reblocking code tests whether a cursor has changed after being
unlocked. This test was insufficient and resulted in an assertion
panic. Beef up the test.

Reported-by: Jan Lentfer <Jan.Lentfer@web.de>


# 24cf83d2 20-Mar-2010 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - frontload kmalloc()'s when rebalancing

* The rebalancing code must allocate upwards of 16MB of memory to hold
copies of B-Tree nodes (~64*64*4K). This is enough to blow out the
emergency memory reserve used by the pageout daemon and deadlock the
system in low memory situations.

* Refactor the allocations. Allocate all the memory up-front so no
major allocations occur while nodes in the B-Tree are held locked.

* There are probably other cases where this may become a problem. With
UFS it wasn't an issue because flushing a file was fairly unsophisticated.
But with HAMMER certain aspects of the flush require B-Tree lookups and
can't be dumbed down to a simple raw disk write.

The rebalancing code was the most egregious abuser of kernel memory
though and that should now be fixed.
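
The 16MB figure quoted above follows directly from the ~64*64*4K estimate. A short Python check, using only the factors given in the commit message (the variable names are illustrative, and what each factor of 64 counts is not stated in the message):

```python
FACTOR_A = 64          # first factor of 64 from the commit message
FACTOR_B = 64          # second factor of 64 from the commit message
NODE_BYTES = 4 * 1024  # "4K" per B-Tree node copy

total_bytes = FACTOR_A * FACTOR_B * NODE_BYTES
total_mib = total_bytes // (1024 * 1024)
```

Here `total_bytes` is 16,777,216 and `total_mib` is 16, which is the amount the commit moves to up-front allocation.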

Reported-by: Francois Tigeot <ftigeot@wolfpond.org>


# b8a41159 12-Feb-2010 Matthew Dillon <dillon@apollo.backplane.com>

kernel - SWAP CACHE part 19/many - distinguish bulk data in HAMMER block dev

* Add buf->flags/B_NOTMETA, vm_page->flags/PG_NOTMETA. If set the pages
underlying the buffer will not be considered meta-data from the
point of view of the swapcache.

* HAMMER must sometimes access bulk data via the block device instead of
via a file vnode. For example, the reblocking and mirroring code.
We do not want this data to be misinterpreted as meta-data when
the meta-data-only swapcache is turned on, otherwise it will blow
out the actual meta-data in the swapcache.

HAMMER_RECTYPE_DATA and HAMMER_RECTYPE_DB are considered normal data.
All other record types (e.g. direntry, inode, etc) are meta-data.


# 83f2a3aa 14-Oct-2009 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER - Add version 3 meta-data features

* These features are available for filesystem version 3. Version 2 may be upgraded
to version 3 in-place. These features are not usable until you upgrade.

* Definitively store snapshots in filesystem meta-data. Softlinks still
work. The new snapshot directives (snap, snaplo, snapq, etc) also allow
you to specify up to a 64-character note for each snapshot you create.
The snapls directive may be used to list all snapshots stored in meta-data.

'hammer cleanup' will move all softlink-based snapshots residing in the
<fs>/snapshots directory to meta-data when it next snapshots the filesystem
(within a day of upgrading, usually). The snapshot softlinks are left intact.

Storing snapshot information in meta-data means that accidental wipes of
your <fs>/snapshots directory will NOT cause later hammer cleanup runs to
destroy your snapshots! The meta-data snapshots are also removed if you
do a prune-everything, or through normal pruning expirations, and thus
'hammer snapls' will definitively list your valid snapshots.

This feature also means that you can obtain a definitive list of snapshots
available on mirroring slaves.

* Definitively store the hammer cleanup configuration file in filesystem meta-data.
This meta-data is not mirrored. 'hammer cleanup' will move <fs>/snapshots/config
to the new meta-data config and delete <fs>/snapshots/config after you've upgraded
the filesystem. You can edit the configuration with the 'viconfig' directive.

* The HAMMER utility has new directives: snap, snaplo, snapq, snaprm, snapls,
config, and viconfig.

* WARNING! Filesystems mounted 'nohistory' and files chflagged similarly do not
have snapshots, but the hammer utility still allows the directives to be run.
This is a bug that needs to be fixed.


# c9ce54d6 03-Sep-2009 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER - Fix lost inode issue (primarily with nohistory mounts)

* When a HAMMER cursor is unlocked it becomes tracked and unrelated
B-Tree operations will cause the tracked cursor's nodes and indices
to be updated. The cursor structure also has a leaf element pointer
which was not being properly updated. This could lead to panics and
lost inodes.

Properly adjust the leaf element pointer in tracked cursors.

* The bug primarily occurs with nohistory mounts or nohistory sub-trees
due to the larger number of physical deletions made to the B-Tree, but
could also occur (rarely) with normal mounts.

* Add additional assertions to catch any further occurrences (though I
think all the cases have been covered now).

* Add a new sysctl vfs.hammer.error_panic which can be set to e.g. 9 to
cause critical errors to panic immediately instead of returning
through the call stack, making debugging possible.

Reported-by: Numerous people


# 973c11b9 24-Jun-2009 Matthew Dillon <dillon@apollo.backplane.com>

AMD64 - Fix many compile-time warnings. int/ptr type mismatches, %llx, etc.


# df2ccbac 20-Jun-2009 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - Add hinting capability to block allocator, hint B-Tree

* A hammer_off_t can now be supplied to the blockmap allocator as a hint.

* Use the hinting mechanism to better-localize B-Tree node allocations
and meta-data updates.


# 1775b6a0 15-Mar-2009 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - Add a B-Tree rebalancing feature.

This is the initial commit of B-Tree rebalancing support for HAMMER.
The rebalancer may be run using the 'hammer rebalance' utility directive.

The leafs in a HAMMER B-Tree all reside at the same depth. Insertions and
deletions only collapse the B-Tree when a leaf node becomes empty and then
only if any necessary recursion (possibly reaching the root node) succeeds.
No balancing occurs during normal operation and B-Tree nodes can wind up
with wildly different element counts which bloats the tree and makes
searches less efficient.

The rebalancer effectively does a depth-first traversal of the B-Tree,
visiting leaf nodes first and parent nodes as a trailing function on the
way back up the tree. For any given internal node the sum total of
elements contained in its children is divided by the number of children.
The effective number of children is reduced as is practical to obtain a 75%
fill level. The elements are then packed into the children and any
wholly empty children left over are deleted. The rebalancer does not
create new B-Tree nodes.
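
The per-node packing step in the paragraph above can be approximated with the following Python sketch. It only models element counts, not locking, cursors, or on-media updates. The capacity of 64 elements per node, the function name, and the even-split policy are assumptions for illustration; the 75% fill target and the rule that no new nodes are created come from the commit message.

```python
def rebalance_counts(child_counts, capacity=64, fill=0.75):
    """Return the element count of each child kept after repacking.

    The total is spread over as few children as a 75% fill target
    allows, never over more children than already exist. Children
    beyond the returned list would be empty and deleted.
    """
    total = sum(child_counts)
    if total == 0:
        return []
    target = max(1, int(capacity * fill))      # 48 elements at 75% of 64
    needed = -(-total // target)               # ceiling division
    needed = min(needed, len(child_counts))    # never create new nodes
    base, extra = divmod(total, needed)
    return [base + (1 if i < extra else 0) for i in range(needed)]
```

For example, five children holding 10, 60, 5, 20 and 5 elements (100 in total) would be repacked into three children under these assumptions, and the two left empty would be removed.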

Element packing is fairly complex, requiring tracked cursors, on-media
parent pointers, mirror TIDs, and boundary elements to be updated. The
rebalancer must hold a large number of B-Tree nodes exclusively locked
while running.


# 982be4bf 24-Jan-2009 Matthew Dillon <dillon@apollo.backplane.com>

HAMMER VFS - Remove the unused also_ip argument from the cursor API

