rf_reconstruct.c - OpenGrok history log for /netbsd-src/sys/dev/raidframe/rf

Revision	Date	Author	Comments
# 396f9f45	19-May-2008	oster <oster@NetBSD.org>	Re-work some of the guts of the reconstruction code. Reconmap used to have one pointer for every reconstruction unit. This does not scale well in the land of 1TB disks, where some 100MB+ of "status pointers" are required for typical configurations. Convert the reconstruction code to use a "sliding status window" which will scale nicely regardless of the number of stripes/reconstruction units in the RAID set. Convert the main reconstruction loop to rebuild the array in chunks rather than in one big lump. As part of these changes, introduce a function to kick any waiters on the head separation callback list, and use that in the main reconstruction event queue to wake up the waiters if things have stalled. (I believe this may fix a race condition that could occur at least at the very end of a disk during reconstruction under heavy IO load.) Thanks to Brian Buhrow for all his help, support, and patience in testing these changes.
# 8fb49f6f	15-Apr-2008	oster <oster@NetBSD.org>	A forced recon read should not default to indicating that the reads for that disk have stopped, since this will bump us out of the normal reconstruction loop prematurely. Fixes the (mostly cosmetic) bug where the reconstruction status values stop updating, and from raidctl it appears that reconstruction has totally stalled (which it actually hasn't -- the reconstruction does complete properly, but not in the normal way).
# 25c8cdfd	14-Apr-2008	oster <oster@NetBSD.org>	Print out the status value if a reconstruction read fails. Don't print out write promotions during reconstruct unless we are debugging reconstructs.
# 287ee4e9	26-Jan-2008	oster <oster@NetBSD.org>	In a land before time, when kernel processes roamed the system, we needed to keep track of the kernel process that opened a device in order to close it with the right credentials. Flash forward to today where curlwp is now quite sufficient.
# 61e8303e	26-Nov-2007	pooka <pooka@NetBSD.org>	Remove the "struct lwp " argument from all VFS and VOP interfaces. The general trend is to remove it from all kernel interfaces and this is a start. In case the calling lwp is desired, curlwp should be used. quick consensus on tech-kern
# 6384685d	21-Sep-2007	oster <oster@NetBSD.org>	Fix wording in a comment and correct a debug line. From Olivier Cherrier (via private mail). Thanks!
# 1c0f1b25	18-Jul-2007	ad <ad@NetBSD.org>	Fix fallout from recent kthread changes.
# 88ab7da9	09-Jul-2007	ad <ad@NetBSD.org>	Merge some of the less invasive changes from the vmlocking branch: - kthread, callout, devsw API changes - select()/poll() improvements - miscellaneous MT safety improvements
# 954bc134	26-Jun-2007	cube <cube@NetBSD.org>	Change dk_lookup() to accept an additional argument of the type enum uio_seg that tells whether the given path is in user space or kernel space, so it can tell NDINIT(). While the raidframe calls were ok, both ccd(4) and cgd(4) were passing pointers to user space data, which leads to strange errors on i386, as reported by Jukka Salmi on current-users. The issue has been there since last August; I'm actually a bit surprised that no one in the meantime has used ccd(4) or cgd(4) on an arch where it would have simply faulted.
# 168cd830	16-Nov-2006	christos <christos@NetBSD.org>	__unused removal on arguments; approved by core.
# 4d595fd7	12-Oct-2006	christos <christos@NetBSD.org>	- sprinkle __unused on function decls. - fix a couple of unused bugs - no more -Wno-unused for i386
# ecdff16f	27-Aug-2006	christos <christos@NetBSD.org>	- use dk_lookup instead of our home-spun version. - allow raid to be configured in a wedge - allow wedges to be configured in a raid - add autoconfiguration of wedges in a raid
# 3029ac48	21-Jul-2006	ad <ad@NetBSD.org>	- Use the LWP cached credentials where sane. - Minor cosmetic changes.
# 2867b68b	14-May-2006	elad <elad@NetBSD.org>	integrate kauth.
# 95e1ffb1	11-Dec-2005	christos <christos@NetBSD.org>	merge ktrace-lwp.
# 97682553	18-Jul-2005	oster <oster@NetBSD.org>	If rf_SubmitReconBuffer indicates the submission was blocked (for whatever reason), return 0 instead of the default RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result in rf_ContinueReconstructFailedDisk() thinking that the given component was "done" and breaking out of the main reconstruction loop far too early. Reconstruction still worked correctly as long as there were no errors, but RAIDframe wouldn't be in a position to properly handle read/write errors during reconstruction. This fixes the "raidctl's progress bar spins at 0% until reconstruction finishes" problem.
# 77708271	08-Jun-2005	oster <oster@NetBSD.org>	- initialize numRUsTotal before we indicate that we are doing a reconstruct. - make numRUsComplete and numRUsTotal 64-bit quantities like everything else that records this information.
# f31bd063	27-Feb-2005	perry <perry@NetBSD.org>	nuke trailing whitespace
# be864067	12-Feb-2005	oster <oster@NetBSD.org>	The 'next' argument to rf_CreateDiskQueueData is always NULL. Since there is no particular reason to pass an extra NULL argument, turf it, and initialize p->next to NULL within the function.
# 0b154709	12-Feb-2005	oster <oster@NetBSD.org>	Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to determine if we are willing to wait for memory to come from the diskqueuedata (dqd) and bufpool pools. Clean up the mess related to code calling rf_CreateDiskQueueData() with different expectations (and/or blatant disregard) of what might happen if there were insufficient pool resources.
# 04a30b5e	06-Feb-2005	oster <oster@NetBSD.org>	It's not a bad idea to update the component labels whether or not the reconstruction was successful.
# 339f61b7	05-Feb-2005	oster <oster@NetBSD.org>	rf_GetNextReconEvent() will return a valid event, so no need for the assert. (we'd have panic'ed in there long before this assert if that wasn't the case). Minor whitespace changes.
# c38bce14	05-Feb-2005	oster <oster@NetBSD.org>	Vastly improve the error handling in the case of a read/write error that occurs during a reconstruction. We go from zero error handling and likely panicking if something goes amiss, to gracefully bailing and leaving the system in the best, usable state possible. - introduce rf_DrainReconEventQueue() to allow easy cleaning of the reconstruction event queue - change how we clean up the floating recon buffers in rf_FreeReconControl(). Detect the end of the list rather than traversing according to a count. - keep track of the number of pending reconstruction writes. In the event of a read error, use this to wait long enough for the pending writes to (hopefully) drain. - more cleanup is still needed on this code, but I didn't want to start mixing major functional changes with minor cleanups. XXX: There is a known issue with pool items left outstanding due to the IO failure, and this can show up in the form of a panic at the tail end of a shutdown. This problem is much less severe than before these changes, and the hope/plan is that this problem will go away once this code gets overhauled again.
# c18a2427	22-Jan-2005	oster <oster@NetBSD.org>	Torch some #define's missed in last commit.
# 31409478	22-Jan-2005	oster <oster@NetBSD.org>	Reconstruction Descriptors are only allocated once per reconstruction, and don't need their own pool or freelist or anything fancier than a malloc/free.