History log of /netbsd-src/sys/dev/raidframe/rf_reconstruct.c (Results 26 – 50 of 129)
Revision Date Author Comments
# 396f9f45 19-May-2008 oster <oster@NetBSD.org>

Re-work some of the guts of the reconstruction code.

Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.

As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur
at least at the very end of a disk during reconstruction under heavy
IO load.)

Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.


# 8fb49f6f 15-Apr-2008 oster <oster@NetBSD.org>

A forced recon read should not default to indicating that the reads
for that disk have stopped, since this will bump us out of the normal
reconstruction loop prematurely.

Fixes the (mostly cosmetic) bug where the reconstruction
status values stop updating, and from raidctl it appears that
reconstruction has totally stalled (which it actually hasn't -- the
reconstruction does complete properly, but not in the normal way).


# 25c8cdfd 14-Apr-2008 oster <oster@NetBSD.org>

Print out the status value if a reconstruction read fails.
Don't print out write promotions during reconstruct unless
we are debugging reconstructs.


# 287ee4e9 26-Jan-2008 oster <oster@NetBSD.org>

In a land before time, when kernel processes roamed the system, we
needed to keep track of the kernel process that opened a device in
order to close it with the right credentials. Flash forward to today
where curlwp is now quite sufficient.


# 61e8303e 26-Nov-2007 pooka <pooka@NetBSD.org>

Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


# 6384685d 21-Sep-2007 oster <oster@NetBSD.org>

Fix wording in a comment and correct a debug line. From Olivier Cherrier
(via private mail). Thanks!


# 1c0f1b25 18-Jul-2007 ad <ad@NetBSD.org>

Fix fallout from recent kthread changes.


# 88ab7da9 09-Jul-2007 ad <ad@NetBSD.org>

Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 954bc134 26-Jun-2007 cube <cube@NetBSD.org>

Change dk_lookup() to accept an additional argument of the type enum uio_seg
that tells whether the given path is in user space or kernel space, so it
can tell NDINIT().

While the raidframe calls were ok, both ccd(4) and cgd(4) were passing
pointers to user space data, which leads to strange errors on i386, as
reported by Jukka Salmi on current-users.

The issue has been there since last August; I'm actually a bit surprised
that no one in the meantime has used ccd(4) or cgd(4) on an arch where it
would have simply faulted.


# 168cd830 16-Nov-2006 christos <christos@NetBSD.org>

__unused removal on arguments; approved by core.


# 4d595fd7 12-Oct-2006 christos <christos@NetBSD.org>

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# ecdff16f 27-Aug-2006 christos <christos@NetBSD.org>

- use dk_lookup instead of our home-spun version.
- allow raid to be configured in a wedge
- allow wedges to be configured in a raid
- add autoconfiguration of wedges in a raid


# 3029ac48 21-Jul-2006 ad <ad@NetBSD.org>

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


# 2867b68b 14-May-2006 elad <elad@NetBSD.org>

integrate kauth.


# 95e1ffb1 11-Dec-2005 christos <christos@NetBSD.org>

merge ktrace-lwp.


# 97682553 18-Jul-2005 oster <oster@NetBSD.org>

If rf_SubmitReconBuffer indicates the submission was blocked (for
whatever reason), return 0 instead of the default
RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result
in rf_ContinueReconstructFailedDisk() thinking that the given
component was "done" and breaking out of the main reconstruction loop
far too early. Reconstruction still worked correctly as long as there
were no errors, but RAIDframe wouldn't be in a position to properly
handle read/write errors during reconstruction.

This fixes the "raidctl's progress bar spins at 0% until
reconstruction finishes" problem.


# 77708271 08-Jun-2005 oster <oster@NetBSD.org>

- initialize numRUsTotal before we indicate that we are doing a reconstruct.

- make numRUsComplete and numRUsTotal 64-bit quantities like
everything else that records this information.


# f31bd063 27-Feb-2005 perry <perry@NetBSD.org>

nuke trailing whitespace


# be864067 12-Feb-2005 oster <oster@NetBSD.org>

The 'next' argument to rf_CreateDiskQueueData is always NULL. Since
there is no particular reason to pass an extra NULL argument, turf it,
and initialize p->next to NULL within the function.


# 0b154709 12-Feb-2005 oster <oster@NetBSD.org>

Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to
determine if we are willing to wait for memory to come from the
diskqueuedata (dqd) and bufpool pools. Clean up the mess related to
code calling rf_CreateDiskQueueData() with different expectations
(and/or blatant disregard) of what might happen if there were
insufficient pool resources.


# 04a30b5e 06-Feb-2005 oster <oster@NetBSD.org>

It's not a bad idea to update the component labels whether or not the
reconstruction was successful.


# 339f61b7 05-Feb-2005 oster <oster@NetBSD.org>

rf_GetNextReconEvent() *will* return a valid event, so no need for
the assert. (we'd have panic'ed in there long before this assert
if that wasn't the case).

Minor whitespace changes.


# c38bce14 05-Feb-2005 oster <oster@NetBSD.org>

Vastly improve the error handling in the case of a read/write error
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best, usable state possible.

- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue

- change how we clean up the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.

- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.

- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.

XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.


# c18a2427 22-Jan-2005 oster <oster@NetBSD.org>

Torch some #define's missed in last commit.


# 31409478 22-Jan-2005 oster <oster@NetBSD.org>

Reconstruction Descriptors are only allocated once per reconstruction,
and don't need their own pool or freelist or anything fancier than a
malloc/free.

