rf_reconstruct.c - OpenGrok history log for /netbsd-src/sys/dev/raidframe/rf

Revision	Date	Author	Comments
# a6071800	17-Sep-2023	oster <oster@NetBSD.org>	Implement hot removal of spares and components. From manu@. Implement a long desired feature of automatically incorporating a used spare into the array after a reconstruct. Given the configuration: Components: /dev/wd0e: failed; /dev/wd1e: optimal; /dev/wd2e: optimal. Spares: /dev/wd3e: spare. Running 'raidctl -F /dev/wd0e raid0' will now result in the following configuration after a successful rebuild: Components: /dev/wd3e: optimal; /dev/wd1e: optimal; /dev/wd2e: optimal. No spares. Thanks to manu@ for the development of the initial set of changes which allowed the changes to automatically incorporate a used spare to come to fruition. Thanks also to manu@ for useful discussions about and additional testing of these changes.
# 9dbaa117	08-Sep-2023	oster <oster@NetBSD.org>	Revision 1.104 actually fixed the issues that were preventing us from freeing the ReconControl structures. So free them and thus also prevent a panic on shutdown due to items not being correctly returned to the pool. Thanks to manu@ for report of the panic, and for initial testing of the changes. XXX pullup-9 XXX pullup-10
# d03fc4a7	27-Jul-2021	oster <oster@NetBSD.org>	rf_CreateDiskQueueData() no longer uses waitflag, and will always succeed. Cleanup the error path for the (no longer needed) PR_NOWAIT cases.
# 2cf3739a	23-Jul-2021	oster <oster@NetBSD.org>	Extensive mechanical changes to the pools used in RAIDframe. Alloclist remains not per-RAID, so initialize that pool separately/differently than the rest. The remainder of pools in RF_Pools_s are now per-RAID pools. Mostly mechanical changes to functions to allocate/destroy per-RAID pools. Needed to make raidPtr available in certain cases to be able to find the per-RAID pools. Extend rf_pool_init() to now populate a per-RAID wchan value that is unique to each pool for a given RAID device. TODO: Complete the analysis of the minimum number of items that are required for each pool to allow IO to progress (i.e. so that a request for pool resources can always be satisfied), and dynamically scale minimum pool sizes based on RAID configuration.
# 03e76925	15-Feb-2021	oster <oster@NetBSD.org>	Fix a long long-standing off-by-one error in computing lastPSID. SUsPerPU is only really supported for a value of 1, and since the first PSID is 0, the last will be numStripe-1. Also update the setting of pending_writes to reflect the change to lastPSID. Needs pullups to -8 and -9.
# 5a52f2af	08-Dec-2019	mlelstv <mlelstv@NetBSD.org>	Switch to vn_bdev_open* functions.
# b67baf4c	10-Oct-2019	christos <christos@NetBSD.org>	fix the function pointer and callback mess: - callback functions return 0 and their result is not checked; make them void. - there are two types of callbacks and they used to overload their parameters and the callback structure; separate them into "function" and "value" callbacks. - make the wait function signature consistent.
# 95d8fbc8	09-Feb-2019	christos <christos@NetBSD.org>	- Change the allocation macros to be more like function calls - Change sizeof(type) -> sizeof(variable) - Use macros for the long buffer length allocations - Remove "bit polishing" memsets() -- do them only once - Remove unnecessary casts Thanks to oster@ for finding bugs and testing.
# 404e5f06	14-Nov-2014	oster <oster@NetBSD.org>	Fix a long-standing bug related to rebooting while a reconstruct-to-spare is underway but not yet complete. The issue was that a component was being marked as a used_spare when the rebuild started, not when the rebuild was actually finished. Marking it as a used_spare meant that the component label on the spare was being updated such that after a reboot the component would be considered up-to-date, regardless of whether the rebuild actually completed! This fix includes: 1) Add an additional state "rf_ds_rebuilding_spare" which is used to denote that a spare is currently being rebuilt from the live components. 2) Update the comments on the disk states, which were out-of-sync with reality. 3) When rebuilding to a spare component, that spare now enters the state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare. 4) When the rebuild is actually complete then the spare component enters the rf_ds_used_spare state. rf_ds_used_spare is now used exclusively for the case where the rebuilding to the spare has completed successfully. XXX: Someday we need to teach raidctl(8) about this new state, and take out the backwards compatibility code in rf_netbsdkintf.c (see RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be generic enough that it can get backported without major grief. XXX: Needs pullup to netbsd-5, netbsd-6, and netbsd-7 Fixes PR#49244.
# 292a0c7f	14-Jun-2014	hannken <hannken@NetBSD.org>	Change dk_lookup() to return an anonymous vnode not associated with any file system. Change all consumers of dk_lookup() to get the device from "v_rdev" instead of VOP_GETATTR() as specfs does not support VOP_GETATTR(). Devices obtained with dk_lookup() will no longer disappear on forced unmounts. Fix for PR kern/48849 (root mirror raid fails on shutdown) Welcome to 6.99.44
# 5c5fb0b1	06-Mar-2013	yamt <yamt@NetBSD.org>	fix parens in a message
# cebdda60	20-Feb-2012	oster <oster@NetBSD.org>	Add logic to the main reconstruction loop to handle RAID5 with rotated spares. While here, observe that we were actually doing one more stripe than we thought we were, and correct that too (it didn't matter for non-RAID5_RS, but it definitely does for RAID5_RS). Add some bounds-checking at the beginning to handle the case where the number of stripes in the set is smaller than the sliding reconstruction window. XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.
# 2cc7a01f	14-Oct-2011	hannken <hannken@NetBSD.org>	Change the vnode locking protocol of VOP_GETATTR() to request at least a shared lock. Make all calls outside of file systems respect it. The calls from file systems need review. No objections from tech-kern.
# 28c3372a	03-Aug-2011	oster <oster@NetBSD.org>	Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!
# 33d93c8d	28-May-2011	yamt <yamt@NetBSD.org>	rf_ReconstructInPlace: don't leave a vnode open on errors. fixes a part of PR/44972.
# 463102d2	24-May-2011	buhrow <buhrow@NetBSD.org>	Suggested to oster@ and approved via private e-mail as a help to people who are getting reconstruction failures.
# 8c36bb4b	11-May-2011	mrg <mrg@NetBSD.org>	convert the main raidPtr mutex to a kmutex, and add a couple of cv's to cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond. convert all remaining simple_lock's to kmutexes (they're not used or compiled right now... even with all options enabled) and remove the support for them. this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.
# 02d186e1	02-May-2011	mrg <mrg@NetBSD.org>	convert rb_mutex to a kmutex/cv.
# ec02ea41	19-Feb-2011	enami <enami@NetBSD.org>	Define accessors for number of blocks and partition size in the component label and use them where appropriate. Discussed on tech-kern.
# 8f6ed30d	19-Nov-2010	dholland <dholland@NetBSD.org>	Introduce struct pathbuf. This is an abstraction to hold a pathname and the metadata required to interpret it. Callers of namei must now create a pathbuf and pass it to NDINIT (instead of a string and a uio_seg), then destroy the pathbuf after the namei session is complete. Update all namei call sites accordingly. Add a pathbuf(9) man page and update namei(9). The pathbuf interface also now appears in a couple of related additional places that were passing string/uio_seg pairs that were later fed into NDINIT. Update other call sites accordingly.
# 4de66268	01-Nov-2010	mrg <mrg@NetBSD.org>	add support for >2TB raid devices. - add two new members to the component label: u_int numBlocksHi u_int partitionSizeHi and store the top 32 bits of the real number of blocks and partition size. modify rf_print_component_label(), rf_does_it_fit(), rf_AutoConfigureDisks() and rf_ReconstructFailedDiskBasic(). - call disk_blocksize() after disk_attach() [ from mlelstv ] - shift the block number relative to DEV_BSHIFT in raidstart() and InitBP() so that accesses work for non 512-byte devices. [ from mlelstv ] - update rf_getdisksize() to use the new getdisksize() [ from mlelstv. this part needs a separate change for netbsd-5. ] reviewed by: oster, christos and darrenr
# f1a1ad33	17-Nov-2009	jld <jld@NetBSD.org>	Finally commit the RAIDframe parity map Summer Of Code project. Drastically reduces the amount of time spent rewriting parity after an unclean shutdown by keeping better track of which regions might have had outstanding writes. Enabled by default; can be disabled on a per-set basis, or tuned, with the new raidctl(8) commands. Discussed on tech-kern@ to a general air of approval; exhortations to commit from mrg@, christos@, and others. Thanks to Google for their sponsorship, oster@ for mentoring the project, assorted developers for trying very hard to break it, and probably more I'm forgetting.
# f17e8d67	11-Feb-2009	oster <oster@NetBSD.org>	If we see a RF_RECON_WRITE_ERROR event we know a write has finished and we need to account for that. Failure to do so means we can end up waiting forever for writes we think are outstanding, but which have already completed. Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler for reporting the issue and verifying the fix.
# 73225b15	20-Dec-2008	oster <oster@NetBSD.org>	When unconfiguring an array where a reconstruct is in progress, abort the reconstruct and wait for IOs to drain before pulling the plug. Should fix the panic reported by der Mouse on tech-kern.
# c4025116	23-Sep-2008	oster <oster@NetBSD.org>	Nuke unneeded printf(). Spotted by pooka@.
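
Note on revisions 4de66268 and ec02ea41: the log says the component label gained numBlocksHi and partitionSizeHi to hold the top 32 bits of the block count and partition size, and that accessors were later defined for those values. The sketch below shows one way such a split and its accessors could look. It is not taken from the RAIDframe source: the struct, the function names, and the low-word field name numBlocks are assumptions made for illustration; only numBlocksHi is named in the log.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the component label. Only numBlocksHi is
 * named in the log; the low-word field name is an assumption.
 */
struct label_sketch {
	uint32_t numBlocks;	/* low 32 bits of the block count (assumed name) */
	uint32_t numBlocksHi;	/* top 32 bits of the block count */
};

/* Combine the two 32-bit halves into the real 64-bit block count. */
static uint64_t
label_numblocks(const struct label_sketch *cl)
{
	return ((uint64_t)cl->numBlocksHi << 32) | cl->numBlocks;
}

/* Split a 64-bit block count across the two 32-bit label fields. */
static void
label_set_numblocks(struct label_sketch *cl, uint64_t blocks)
{
	cl->numBlocks = (uint32_t)(blocks & 0xffffffffu);
	cl->numBlocksHi = (uint32_t)(blocks >> 32);
}
```

With 512-byte sectors, a 32-bit block count tops out just below 2 TiB (2^32 sectors), which is why a second word is needed for larger sets. Routing every read and write of the count through a pair of accessors keeps the split in one place.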