#
a6071800 |
| 17-Sep-2023 |
oster <oster@NetBSD.org> |
Implement hot removal of spares and components. From manu@.
Implement a long desired feature of automatically incorporating a used spare into the array after a reconstruct.
Given the configuration:
Components:
    /dev/wd0e: failed
    /dev/wd1e: optimal
    /dev/wd2e: optimal
Spares:
    /dev/wd3e: spare
Running 'raidctl -F /dev/wd0e raid0' will now result in the following configuration after a successful rebuild:
Components:
    /dev/wd3e: optimal
    /dev/wd1e: optimal
    /dev/wd2e: optimal
No spares.
Thanks to manu@ for the development of the initial set of changes which allowed the changes to automatically incorporate a used spare to come to fruition. Thanks also to manu@ for useful discussions about and additional testing of these changes.
|
#
9dbaa117 |
| 08-Sep-2023 |
oster <oster@NetBSD.org> |
Revision 1.104 actually fixed the issues that were preventing us from freeing the ReconControl structures. So free them and thus also prevent a panic on shutdown due to items not being correctly returned to the pool.
Thanks to manu@ for report of the panic, and for initial testing of the changes.
XXX pullup-9 XXX pullup-10
|
#
d03fc4a7 |
| 27-Jul-2021 |
oster <oster@NetBSD.org> |
rf_CreateDiskQueueData() no longer uses waitflag, and will always succeed. Cleanup the error path for the (no longer needed) PR_NOWAIT cases.
|
#
2cf3739a |
| 23-Jul-2021 |
oster <oster@NetBSD.org> |
Extensive mechanical changes to the pools used in RAIDframe.
Alloclist remains not per-RAID, so initialize that pool separately/differently than the rest.
The remainder of pools in RF_Pools_s are now per-RAID pools. Mostly mechanical changes to functions to allocate/destroy per-RAID pools. Needed to make raidPtr available in certain cases to be able to find the per-RAID pools.
Extend rf_pool_init() to now populate a per-RAID wchan value that is unique to each pool for a given RAID device.
TODO: Complete the analysis of the minimum number of items that are required for each pool to allow IO to progress (i.e. so that a request for pool resources can always be satisfied), and dynamically scale minimum pool sizes based on RAID configuration.
|
#
03e76925 |
| 15-Feb-2021 |
oster <oster@NetBSD.org> |
Fix a long-standing off-by-one error in computing lastPSID.
SUsPerPU is only really supported for a value of 1, and since the first PSID is 0, the last will be numStripe-1. Also update the setting of pending_writes to reflect the change to lastPSID.
Needs pullups to -8 and -9.
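The arithmetic behind this fix can be sketched in a few lines of C. This is an illustration only, not the RAIDframe code: `last_psid()`, `stripes_to_rebuild()` and their parameter are hypothetical names, and the sketch assumes SUsPerPU is 1, so parity stripe IDs map one-to-one onto stripes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of RAIDframe: with parity stripe IDs
 * starting at 0, the last valid ID is num_stripes - 1.
 */
uint64_t
last_psid(uint64_t num_stripes)
{
    assert(num_stripes > 0);    /* an empty set has no last stripe */
    return num_stripes - 1;     /* not num_stripes: IDs start at 0 */
}

/* Number of parity stripes a full pass from 0 to last_psid() visits. */
uint64_t
stripes_to_rebuild(uint64_t num_stripes)
{
    return last_psid(num_stripes) + 1;
}
```

Any counter derived from the last ID, such as a count of writes still expected, has to use the same bound, or the loop and the counter disagree by one.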
|
#
5a52f2af |
| 08-Dec-2019 |
mlelstv <mlelstv@NetBSD.org> |
Switch to vn_bdev_open* functions.
|
#
b67baf4c |
| 10-Oct-2019 |
christos <christos@NetBSD.org> |
fix the function pointer and callback mess:
- callback functions return 0 and their result is not checked; make them void.
- there are two types of callbacks and they used to overload their parameters and the callback structure; separate them into "function" and "value" callbacks.
- make the wait function signature consistent.
|
#
95d8fbc8 |
| 09-Feb-2019 |
christos <christos@NetBSD.org> |
- Change the allocation macros to be more like function calls
- Change sizeof(type) -> sizeof(*variable)
- Use macros for the long buffer length allocations
- Remove "bit polishing" memsets() -- do them only once
- Remove unnecessary casts
Thanks to oster@ for finding bugs and testing.
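The sizeof(*variable) idiom from the list above can be illustrated with a small userland sketch. It does not use the RAIDframe allocation macros: `struct recon_buf` and `recon_buf_alloc()` are hypothetical names, and `calloc()` stands in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical structure, used only for this illustration. */
struct recon_buf {
    int id;
    char data[64];
};

/* Allocate n zeroed buffers; returns NULL if the allocation fails. */
struct recon_buf *
recon_buf_alloc(size_t n)
{
    struct recon_buf *rb;

    assert(n > 0);
    /*
     * sizeof(*rb) follows the pointer's type, so the size stays right
     * if the type of rb changes. calloc() zeroes the memory once, so
     * no extra memset() is needed and no cast is required in C.
     */
    rb = calloc(n, sizeof(*rb));
    return rb;
}
```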
|
#
404e5f06 |
| 14-Nov-2014 |
oster <oster@NetBSD.org> |
Fix a long-standing bug related to rebooting while a reconstruct-to-spare is underway but not yet complete.
The issue was that a component was being marked as a used_spare when the rebuild started, not when the rebuild was actually finished. Marking it as a used_spare meant that the component label on the spare was being updated such that after a reboot the component would be considered up-to-date, regardless of whether the rebuild actually completed!
This fix includes:
1) Add an additional state "rf_ds_rebuilding_spare" which is used to denote that a spare is currently being rebuilt from the live components.
2) Update the comments on the disk states, which were out-of-sync with reality.
3) When rebuilding to a spare component, that spare now enters the state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare.
4) When the rebuild is actually complete then the spare component enters the rf_ds_used_spare state.
rf_ds_used_spare is now used exclusively for the case where the rebuilding to the spare has completed successfully.
XXX: Someday we need to teach raidctl(8) about this new state, and take out the backwards compatibility code in rf_netbsdkintf.c (see RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be generic enough that it can get backported without major grief.
XXX: Needs pullup to netbsd-5*, netbsd-6*, and netbsd-7
Fixes PR#49244.
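The state change described in this commit can be sketched as a tiny state machine. This is a simplified model, not the kernel code: only the names rf_ds_rebuilding_spare and rf_ds_used_spare come from the commit message, while `ds_idle_spare`, the helper functions, and the "up to date" predicate are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified subset of disk states for a spare component. */
enum disk_state {
    ds_idle_spare,          /* hypothetical: spare not yet in use */
    rf_ds_rebuilding_spare, /* rebuild to this spare is in progress */
    rf_ds_used_spare        /* rebuild to this spare has completed */
};

/* State a spare enters when a reconstruct-to-spare starts. */
enum disk_state
spare_on_rebuild_start(enum disk_state s)
{
    assert(s == ds_idle_spare);
    return rf_ds_rebuilding_spare;  /* before the fix: rf_ds_used_spare */
}

/* State a spare enters once the rebuild has actually finished. */
enum disk_state
spare_on_rebuild_done(enum disk_state s)
{
    assert(s == rf_ds_rebuilding_spare);
    return rf_ds_used_spare;
}

/* Only a completed rebuild may be treated as up to date after reboot. */
bool
spare_is_up_to_date(enum disk_state s)
{
    return s == rf_ds_used_spare;
}
```

In this model a reboot during the rebuild finds the spare in rf_ds_rebuilding_spare, so it is not mistaken for a fully rebuilt component.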
|
#
292a0c7f |
| 14-Jun-2014 |
hannken <hannken@NetBSD.org> |
Change dk_lookup() to return an anonymous vnode not associated with any file system. Change all consumers of dk_lookup() to get the device from "v_rdev" instead of VOP_GETATTR() as specfs does not support VOP_GETATTR(). Devices obtained with dk_lookup() will no longer disappear on forced unmounts.
Fix for PR kern/48849 (root mirror raid fails on shutdown)
Welcome to 6.99.44
|
#
5c5fb0b1 |
| 06-Mar-2013 |
yamt <yamt@NetBSD.org> |
fix parens in a message
|
#
cebdda60 |
| 20-Feb-2012 |
oster <oster@NetBSD.org> |
Add logic to the main reconstruction loop to handle RAID5 with rotated spares. While here, observe that we were actually doing one more stripe than we thought we were, and correct that too (it didn't matter for non-RAID5_RS, but it definitely does for RAID5_RS). Add some bounds-checking at the beginning to handle the case where the number of stripes in the set is smaller than the sliding reconstruction window.
XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.
|
#
2cc7a01f |
| 14-Oct-2011 |
hannken <hannken@NetBSD.org> |
Change the vnode locking protocol of VOP_GETATTR() to request at least a shared lock. Make all calls outside of file systems respect it.
The calls from file systems need review.
No objections from tech-kern.
|
#
28c3372a |
| 03-Aug-2011 |
oster <oster@NetBSD.org> |
Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!
|
#
33d93c8d |
| 28-May-2011 |
yamt <yamt@NetBSD.org> |
rf_ReconstructInPlace: don't leave a vnode open on errors. fixes a part of PR/44972.
|
#
463102d2 |
| 24-May-2011 |
buhrow <buhrow@NetBSD.org> |
Suggested to oster@ and approved via private e-mail as a help to people who are getting reconstruction failures.
|
#
8c36bb4b |
| 11-May-2011 |
mrg <mrg@NetBSD.org> |
convert the main raidPtr mutex to a kmutex, and add a couple of cv's to cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond. convert all remaining simple_lock's to kmutexes (they're not used or compiled right now... even with all options enabled) and remove the support for them.
this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.
|
#
02d186e1 |
| 02-May-2011 |
mrg <mrg@NetBSD.org> |
convert rb_mutex to a kmutex/cv.
|
#
ec02ea41 |
| 19-Feb-2011 |
enami <enami@NetBSD.org> |
Define accessors for number of blocks and partition size in the component label and use them where appropriate. Discussed on tech-kern.
|
#
8f6ed30d |
| 19-Nov-2010 |
dholland <dholland@NetBSD.org> |
Introduce struct pathbuf. This is an abstraction to hold a pathname and the metadata required to interpret it. Callers of namei must now create a pathbuf and pass it to NDINIT (instead of a string and a uio_seg), then destroy the pathbuf after the namei session is complete.
Update all namei call sites accordingly. Add a pathbuf(9) man page and update namei(9).
The pathbuf interface also now appears in a couple of related additional places that were passing string/uio_seg pairs that were later fed into NDINIT. Update other call sites accordingly.
|
#
4de66268 |
| 01-Nov-2010 |
mrg <mrg@NetBSD.org> |
add support for >2TB raid devices.
- add two new members to the component label:
      u_int numBlocksHi
      u_int partitionSizeHi
  and store the top 32 bits of the real number of blocks and partition size. modify rf_print_component_label(), rf_does_it_fit(), rf_AutoConfigureDisks() and rf_ReconstructFailedDiskBasic().
- call disk_blocksize() after disk_attach() [ from mlelstv ]
- shift the block number relative to DEV_BSHIFT in raidstart() and InitBP() so that accesses work for non 512-byte devices. [ from mlelstv ]
- update rf_getdisksize() to use the new getdisksize() [ from mlelstv. this part needs a separate change for netbsd-5. ]
reviewed by: oster, christos and darrenr
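Storing the top 32 bits in a separate label member can be sketched as follows. This is a minimal illustration, not the RAIDframe label code: `struct label` and the two helper functions are hypothetical, and the name of the low-order field (`numBlocks`) is an assumption; only numBlocksHi is named in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant component label fields. */
struct label {
    uint32_t numBlocks;     /* low 32 bits (assumed pre-existing field) */
    uint32_t numBlocksHi;   /* top 32 bits (member added by this change) */
};

/* Split a 64-bit block count across the two 32-bit fields. */
void
label_set_numblocks(struct label *l, uint64_t n)
{
    assert(l != NULL);
    l->numBlocks = (uint32_t)(n & 0xffffffffu);
    l->numBlocksHi = (uint32_t)(n >> 32);
}

/* Recombine the two fields into the real 64-bit block count. */
uint64_t
label_get_numblocks(const struct label *l)
{
    assert(l != NULL);
    return ((uint64_t)l->numBlocksHi << 32) | l->numBlocks;
}
```

A label written before this change would have zero in the high field under this scheme, so the combined value equals the old 32-bit count. The same split would apply to the partition size and partitionSizeHi.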
|
#
f1a1ad33 |
| 17-Nov-2009 |
jld <jld@NetBSD.org> |
Finally commit the RAIDframe parity map Summer Of Code project.
Drastically reduces the amount of time spent rewriting parity after an unclean shutdown by keeping better track of which regions might have had outstanding writes. Enabled by default; can be disabled on a per-set basis, or tuned, with the new raidctl(8) commands.
Discussed on tech-kern@ to a general air of approval; exhortations to commit from mrg@, christos@, and others.
Thanks to Google for their sponsorship, oster@ for mentoring the project, assorted developers for trying very hard to break it, and probably more I'm forgetting.
|
#
f17e8d67 |
| 11-Feb-2009 |
oster <oster@NetBSD.org> |
If we see a RF_RECON_WRITE_ERROR event we know a write has finished and we need to account for that. Failure to do so means we can end up waiting forever for writes we think are outstanding, but which have already completed.
Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler for reporting the issue and verifying the fix.
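The accounting problem can be modelled with a small counter sketch. It is a toy model rather than the reconstruction code: only RF_RECON_WRITE_ERROR is named in the commit message, and the other event names, `struct recon_state`, and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Event kinds for the toy model; the first and last are hypothetical. */
enum recon_event {
    RECON_WRITE_DONE,       /* hypothetical: write finished successfully */
    RF_RECON_WRITE_ERROR,   /* write finished with an error */
    RECON_OTHER             /* hypothetical: any non-write event */
};

struct recon_state {
    int pending_writes;     /* writes issued but not yet finished */
};

void
recon_issue_write(struct recon_state *rs)
{
    rs->pending_writes++;
}

void
recon_handle_event(struct recon_state *rs, enum recon_event ev)
{
    switch (ev) {
    case RECON_WRITE_DONE:
    case RF_RECON_WRITE_ERROR:
        /* Either way the write is no longer in flight. */
        assert(rs->pending_writes > 0);
        rs->pending_writes--;
        break;
    case RECON_OTHER:
        break;
    }
}

/* True once nothing is outstanding and a waiter may proceed. */
bool
recon_writes_drained(const struct recon_state *rs)
{
    return rs->pending_writes == 0;
}
```

If the error case skipped the decrement, `recon_writes_drained()` would never become true after a failed write, which is the "waiting forever" behaviour described above.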
|
#
73225b15 |
| 20-Dec-2008 |
oster <oster@NetBSD.org> |
When unconfiguring an array where a reconstruct is in progress, abort the reconstruct and wait for IOs to drain before pulling the plug.
Should fix the panic reported by der Mouse on tech-kern.
|
#
c4025116 |
| 23-Sep-2008 |
oster <oster@NetBSD.org> |
Nuke unneeded printf(). Spotted by pooka@.
|