.\"     $NetBSD: raidctl.8,v 1.82 2023/09/25 21:59:38 oster Exp $
.\"
.\" Copyright (c) 1998, 2002 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Greg Oster
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.\"
.\" Copyright (c) 1995 Carnegie-Mellon University.
.\" All rights reserved.
.\"
.\" Author: Mark Holland
.\"
.\" Permission to use, copy, modify and distribute this software and
.\" its documentation is hereby granted, provided that both the copyright
.\" notice and this permission notice appear in all copies of the
.\" software, derivative works or modified versions, and any portions
.\" thereof, and that both notices appear in supporting documentation.
.\"
.\" CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS"
.\" CONDITION.  CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND
.\" FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.
.\"
.\" Carnegie Mellon requests users of this software to return to
.\"
.\"  Software Distribution Coordinator  or  Software.Distribution@CS.CMU.EDU
.\"  School of Computer Science
.\"  Carnegie Mellon University
.\"  Pittsburgh PA 15213-3890
.\"
.\" any improvements or extensions that they make and grant Carnegie the
.\" rights to redistribute these changes.
.\"
.Dd September 25, 2023
.Dt RAIDCTL 8
.Os
.Sh NAME
.Nm raidctl
.Nd configuration utility for the RAIDframe disk driver
.Sh SYNOPSIS
.Nm
.Ar dev
.Ar command
.Op Ar arg Op ...
.Nm
.Op Fl v
.Fl A Op yes | no | forceroot | softroot
.Ar dev
.Nm
.Op Fl v
.Fl a Ar component Ar dev
.Nm
.Op Fl v
.Fl C Ar config_file Ar dev
.Nm
.Op Fl v
.Fl c Ar config_file Ar dev
.Nm
.Op Fl v
.Fl F Ar component Ar dev
.Nm
.Op Fl v
.Fl f Ar component Ar dev
.Nm
.Op Fl v
.Fl G Ar dev
.Nm
.Op Fl v
.Fl g Ar component Ar dev
.Nm
.Op Fl v
.Fl I Ar serial_number Ar dev
.Nm
.Op Fl v
.Fl i Ar dev
.Nm
.Op Fl v
.Fl L Ar dev
.Nm
.Op Fl v
.Fl M
.Oo yes | no | set
.Ar params
.Oc
.Ar dev
.Nm
.Op Fl v
.Fl m Ar dev
.Nm
.Op Fl v
.Fl P Ar dev
.Nm
.Op Fl v
.Fl p Ar dev
.Nm
.Op Fl v
.Fl R Ar component Ar dev
.Nm
.Op Fl v
.Fl r Ar component Ar dev
.Nm
.Op Fl v
.Fl S Ar dev
.Nm
.Op Fl v
.Fl s Ar dev
.Nm
.Op Fl v
.Fl t Ar config_file
.Nm
.Op Fl v
.Fl U Ar unit Ar dev
.Nm
.Op Fl v
.Fl u Ar dev
.Sh DESCRIPTION
.Nm
is the user-land control program for
.Xr raid 4 ,
the RAIDframe disk device.
.Nm
is primarily used to dynamically configure and unconfigure RAIDframe disk
devices.
For more information about the RAIDframe disk device, see
.Xr raid 4 .
.Pp
This document assumes the reader has at least rudimentary knowledge of
RAID and RAID concepts.
.Pp
The simplified command-line options for
.Nm
are as follows:
.Bl -tag -width indent
.It Ic create Ar level Ar component1 Ar component2 Ar ...
where
.Ar level
specifies the RAID level and is one of
.Ar 0 ,
.Ar 1
(or
.Ar mirror ) ,
or
.Ar 5 ,
and each
.Ar componentN
specifies a device to be configured into the RAID set.
.El
.Pp
The advanced command-line options for
.Nm
are as follows:
.Bl -tag -width indent
.It Fl A Ic yes Ar dev
Make the RAID set auto-configurable.
The RAID set will be automatically configured at boot
.Ar before
the root file system is mounted.
Note that all components of the set must be of type
.Dv RAID
in the disklabel.
.It Fl A Ic no Ar dev
Turn off auto-configuration for the RAID set.
.It Fl A Ic forceroot Ar dev
Make the RAID set auto-configurable, and also mark the set as being
eligible to be the root partition.
A RAID set configured this way will
.Ar override
the use of the boot disk as the root device.
All components of the set must be of type
.Dv RAID
in the disklabel.
Note that only certain architectures
(currently arc, alpha, amd64, bebox, cobalt, emips, evbarm, i386, landisk,
ofppc, pmax, riscv, sandpoint, sgimips, sparc, sparc64, and vax)
support booting a kernel directly from a RAID set.
Please note that
.Ic forceroot
mode was referred to as
.Ic root
mode on earlier versions of
.Nx .
For compatibility reasons,
.Ic root
can be used as an alias for
.Ic forceroot .
.It Fl A Ic softroot Ar dev
Like
.Ic forceroot ,
but only change the root device if the boot device is part of the RAID set.
.It Fl a Ar component Ar dev
Add
.Ar component
as a hot spare for the device
.Ar dev .
Component labels (which identify the location of a given
component within a particular RAID set) are automatically added to the
hot spare after it has been used and are not required for
.Ar component
before it is used.
.It Fl C Ar config_file Ar dev
As for
.Fl c ,
but forces the configuration to take place.
Fatal errors due to uninitialized components are ignored.
This is required the first time a RAID set is configured.
.It Fl c Ar config_file Ar dev
Configure the RAIDframe device
.Ar dev
according to the configuration given in
.Ar config_file .
A description of the contents of
.Ar config_file
is given later.
.It Fl F Ar component Ar dev
Fails the specified
.Ar component
of the device, and immediately begins a reconstruction of the failed
disk onto an available hot spare.
This is one of the mechanisms used to start
the reconstruction process if a component does have a hardware failure.
.It Fl f Ar component Ar dev
This marks the specified
.Ar component
as having failed, but does not initiate a reconstruction of that component.
.It Fl G Ar dev
Generate the configuration of the RAIDframe device in a format suitable for
use with the
.Fl c
or
.Fl C
options.
.It Fl g Ar component Ar dev
Get the component label for the specified component.
.It Fl I Ar serial_number Ar dev
Initialize the component labels on each component of the device.
.Ar serial_number
is used as one of the keys in determining whether a
particular set of components belongs to the same RAID set.
While not strictly enforced, different serial numbers should be used for
different RAID sets.
This step
.Em MUST
be performed when a new RAID set is created.
.It Fl i Ar dev
Initialize the RAID device.
In particular, (re-)write the parity on the selected device.
This
.Em MUST
be done for
.Em all
RAID sets before the RAID device is labeled and before
file systems are created on the RAID device.
.It Fl L Ar dev
Rescan all devices on the system, looking for RAID sets that can be
auto-configured.
The RAID device provided here has to be a valid
device, but does not need to be configured.
For example,
.Bd -literal -offset indent
raidctl -L raid0
.Ed
.Pp
is all that is needed to perform a rescan.
.It Fl M Ic yes Ar dev
.\"XXX should there be a section with more info on the parity map feature?
Enable the use of a parity map on the RAID set; this is the default,
and greatly reduces the time taken to check parity after unclean
shutdowns at the cost of some very slight overhead during normal
operation.
Changes to this setting will take effect the next time the set is
configured.
Note that RAID-0 sets, having no parity, will not use a parity map in
any case.
.It Fl M Ic no Ar dev
Disable the use of a parity map on the RAID set; doing this is not
recommended.
This will take effect the next time the set is configured.
.It Fl M Ic set Ar cooldown Ar tickms Ar regions Ar dev
Alter the parameters of the parity map; parameters to leave unchanged
can be given as 0, and trailing zeroes may be omitted.
.\"XXX should this explanation be deferred to another section as well?
The RAID set is divided into
.Ar regions
regions; each region is marked dirty for at most
.Ar cooldown
intervals of
.Ar tickms
milliseconds each after a write to it, and at least
.Ar cooldown
\- 1 such intervals.
Changes to
.Ar regions
take effect the next time the set is configured, while changes to the other
parameters are applied immediately.
The default parameters are expected to be reasonable for most workloads.
.It Fl m Ar dev
Display status information about the parity map on the RAID set, if any.
If used with
.Fl v
then the current contents of the parity map will be output (in
hexadecimal format) as well.
.It Fl P Ar dev
Check the status of the parity on the RAID set, and initialize
(re-write) the parity if the parity is not known to be up-to-date.
This is normally used after a system crash (and before a
.Xr fsck 8 )
to ensure the integrity of the parity.
.It Fl p Ar dev
Check the status of the parity on the RAID set.
Displays a status message,
and returns successfully if the parity is up-to-date.
.It Fl R Ar component Ar dev
Fails the specified
.Ar component ,
if necessary, and immediately begins a reconstruction back to
.Ar component .
This is useful for reconstructing back onto a component after
it has been replaced following a failure.
.It Fl r Ar component Ar dev
Remove the specified
.Ar component
from the RAID set.
The component must be in the failed, spare, or spared state
in order to be removed.
.It Fl S Ar dev
Check the status of parity re-writing and component reconstruction.
The output indicates the amount of progress
achieved in each of these areas.
.It Fl s Ar dev
Display the status of the RAIDframe device for each of the components
and spares.
.It Fl t Ar config_file
Read and parse the
.Ar config_file ,
reporting any errors, then exit.
No RAIDframe operations are performed.
.It Fl U Ar unit Ar dev
Set the
.Dv last_unit
field in all the RAID components, so that the next time the RAID set
is auto-configured it uses that
.Ar unit .
.It Fl u Ar dev
Unconfigure the RAIDframe device.
This does not remove any component labels or change any configuration
settings (e.g. auto-configuration settings) for the RAID set.
.It Fl v
Be more verbose, and provide a progress indicator for operations such
as reconstructions and parity re-writing.
.El
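.Pp
As an illustration of the timing rule given for
.Fl M Ic set ,
consider the following sketch.
The values are arbitrary and are not claimed to be the defaults, and
.Pa raid0
is assumed to be an already configured set:
.Bd -literal -offset indent
raidctl -M set 4 10000 raid0
.Ed
.Pp
This would request a
.Ar cooldown
of 4 and a
.Ar tickms
of 10000, leaving
.Ar regions
unchanged because the trailing zero is omitted.
A region would then stay marked dirty for between 3 and 4 intervals of
10 seconds, that is, roughly 30 to 40 seconds after a write to it.
The resulting parity map settings can be inspected with:
.Bd -literal -offset indent
raidctl -m raid0
.Ed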
.Pp
The device used by
.Nm
is specified by
.Ar dev .
.Ar dev
may be either the full name of the device, e.g.,
.Pa /dev/rraid0d ,
for the i386 architecture, or
.Pa /dev/rraid0c
for many others, or just simply
.Pa raid0
(for
.Pa /dev/rraid0[cd] ) .
It is recommended that the partitions used to represent the
RAID device are not used for file systems.
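.Pp
The following sketch combines several of the options above into a
typical inspection sequence.
It assumes a set that is already configured as
.Pa raid0 ;
the output is not shown because it depends on the system:
.Bd -literal -offset indent
raidctl -s raid0          # status of each component and spare
raidctl -p raid0          # is the parity known to be up-to-date?
raidctl -v -S raid0       # progress of parity re-write or reconstruction
raidctl -A yes raid0      # auto-configure this set at boot
.Ed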
.Ss Simple RAID configuration
For simple RAID configurations using RAID levels 0 (simple striping),
1 (mirroring), or 5 (striping with distributed parity),
.Nm
supports command-line configuration of RAID setups without
the use of a configuration file.
For example,
.Bd -literal -offset indent
raidctl raid0 create 0 /dev/wd0e /dev/wd1e /dev/wd2e
.Ed
.Pp
will create a RAID level 0 set on the device named
.Pa raid0
using the components
.Pa /dev/wd0e ,
.Pa /dev/wd1e ,
and
.Pa /dev/wd2e .
Similarly,
.Bd -literal -offset indent
raidctl raid0 create mirror absent /dev/wd1e
.Ed
.Pp
will create a RAID level 1 (mirror) set with an absent first component
and
.Pa /dev/wd1e
as the second component.
In all cases the resulting RAID device will
be marked as auto-configurable, will have a serial number set (based
on the current time), and parity will be initialized (if the RAID level
has parity and sufficient components are present).
Reasonable
performance values are automatically used by default for other
parameters normally specified in the configuration file.
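.Pp
By the same pattern, a RAID level 5 set could presumably be created from
three components.
The device name
.Pa raid1
and the component names below are placeholders chosen for illustration:
.Bd -literal -offset indent
raidctl raid1 create 5 /dev/wd0e /dev/wd1e /dev/wd2e
raidctl -s raid1
.Ed
.Pp
The second command displays the component status, which is a convenient
way to confirm that the set was configured as intended.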
.Ss Configuration file
The format of the configuration file is complex, and
only an abbreviated treatment is given here.
In the configuration files, a
.Sq #
indicates the beginning of a comment.
.Pp
There are 4 required sections of a configuration file, and 2
optional sections.
Each section begins with a
.Sq START ,
followed by the section name,
and the configuration parameters associated with that section.
The first section is the
.Sq array
section, and it specifies
the number of columns and spare disks in the RAID set.
For example:
.Bd -literal -offset indent
START array
3 0
.Ed
.Pp
indicates an array with 3 columns, and 0 spare disks.
Old configurations specified a 3rd value in front of the
number of columns and spare disks.
This old value, if provided, must be specified as 1:
.Bd -literal -offset indent
START array
1 3 0
.Ed
.Pp
The second section, the
.Sq disks
section, specifies the actual components of the device.
For example:
.Bd -literal -offset indent
START disks
/dev/sd0e
/dev/sd1e
/dev/sd2e
.Ed
.Pp
specifies the three component disks to be used in the RAID device.
Disk wedges may also be specified with the NAME=<wedge name> syntax.
If any of the specified drives cannot be found when the RAID device is
configured, then they will be marked as
.Sq failed ,
and the system will operate in degraded mode.
Note that it is
.Em imperative
that the order of the components in the configuration file does not
change between configurations of a RAID device.
Changing the order of the components will result in data loss
if the set is configured with the
.Fl C
option.
In normal circumstances, the RAID set will not configure if only
.Fl c
is specified, and the components are out-of-order.
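.Pp
As a sketch of the wedge syntax mentioned above, a
.Sq disks
section naming its components by wedge might look like the following.
The wedge names are hypothetical and would have to match names that
actually exist on the system:
.Bd -literal -offset indent
START disks
NAME=raidcomp0
NAME=raidcomp1
NAME=raidcomp2
.Ed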
.Pp
The next section, which is the
.Sq spare
section, is optional, and, if present, specifies the devices to be used as
.Sq hot spares
\(em devices which are on-line,
but are not actively used by the RAID driver unless
one of the main components fails.
A simple
.Sq spare
section might be:
.Bd -literal -offset indent
START spare
/dev/sd3e
.Ed
.Pp
for a configuration with a single spare component.
If no spare drives are to be used in the configuration, then the
.Sq spare
section may be omitted.
.Pp
The next section is the
.Sq layout
section.
This section describes the general layout parameters for the RAID device,
and provides such information as
sectors per stripe unit,
stripe units per parity unit,
stripe units per reconstruction unit,
and the parity configuration to use.
This section might look like:
.Bd -literal -offset indent
START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
32 1 1 5
.Ed
.Pp
The sectors per stripe unit specifies, in blocks, the interleave
factor; i.e., the number of contiguous sectors to be written to each
component for a single stripe.
Appropriate selection of this value (32 in this example)
is the subject of much research in RAID architectures.
The stripe units per parity unit and
stripe units per reconstruction unit are normally each set to 1.
While certain values above 1 are permitted, a discussion of valid
values and the consequences of using anything other than 1 is outside
the scope of this document.
The last value in this section (5 in this example)
indicates the parity configuration desired.
Valid entries include:
.Bl -tag -width inde
.It 0
RAID level 0.
No parity, only simple striping.
.It 1
RAID level 1.
Mirroring.
The parity is the mirror.
.It 4
RAID level 4.
Striping across components, with parity stored on the last component.
.It 5
RAID level 5.
Striping across components, parity distributed across all components.
.El
.Pp
There are other valid entries here, including those for Even-Odd
parity, RAID level 5 with rotated sparing, Chained declustering,
and Interleaved declustering, but as of this writing the code for
those parity operations has not been tested with
.Nx .
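.Pp
To make the layout values in the example above concrete, the following
arithmetic assumes 512-byte sectors and a 3-column RAID level 5 set.
Both are assumptions made for illustration, not properties stated by the
.Sq layout
section itself:
.Bd -literal -offset indent
stripe unit  = 32 sectors * 512 bytes   = 16384 bytes (16 KB)
data columns = 3 columns - 1 for parity = 2
full stripe  = 2 * 16384 bytes          = 32768 bytes (32 KB) of data
.Ed
.Pp
Under these assumptions, a 32 KB write aligned to a stripe boundary
covers exactly one full stripe.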
.Pp
The next required section is the
.Sq queue
section.
This is most often specified as:
.Bd -literal -offset indent
START queue
fifo 100
.Ed
.Pp
where the queuing method is specified as fifo (first-in, first-out),
and the size of the per-component queue is limited to 100 requests.
Other queuing methods may also be specified, but a discussion of them
is beyond the scope of this document.
.Pp
The final section, the
.Sq debug
section, is optional.
For more details on this the reader is referred to
the RAIDframe documentation discussed in the
.Sx HISTORY
section.
.Pp
Since
.Nx 10
RAIDframe has been capable of autoconfiguration of components
originally configured on opposite endian systems.
The current label
endianness will be retained.
.Pp
See
.Sx EXAMPLES
for a more complete configuration file example.
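.Pp
Before a configuration file is used, it can be checked with
.Fl t ,
and the configuration of a running set can be written out with
.Fl G
for comparison.
In this sketch the file names are placeholders, and it is assumed that
.Fl G
writes the configuration to standard output:
.Bd -literal -offset indent
raidctl -t raid0.conf
raidctl -G raid0 > raid0.conf.current
.Ed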
.Sh FILES
.Bl -tag -width /dev/XXrXraidX -compact
.It Pa /dev/{,r}raid*
.Cm raid
device special files.
.El
.Sh EXAMPLES
The examples given in this section are for more complex
setups than can be configured with the simplified command-line
configuration option described earlier.
.Pp
It is highly recommended that, before using the RAID driver for real
file systems, the system administrator(s) become quite familiar
with the use of
.Nm ,
and that they understand how the component reconstruction process works.
The examples in this section will focus on configuring a
number of different RAID sets of varying degrees of redundancy.
By working through these examples, administrators should be able to
develop a good feel for how to configure a RAID set, and how to
initiate reconstruction of failed components.
.Pp
In the following examples
.Sq raid0
will be used to denote the RAID device.
Depending on the architecture,
.Pa /dev/rraid0c
or
.Pa /dev/rraid0d
may be used in place of
.Pa raid0 .
.Ss Initialization and Configuration
The initial step in configuring a RAID set is to identify the components
that will be used in the RAID set.
All components should be the same size.
Each component should have a disklabel type of
.Dv FS_RAID ,
and a typical disklabel entry for a RAID component might look like:
.Bd -literal -offset indent
f:  1800000  200495     RAID              # (Cyl.  405*- 4041*)
.Ed
.Pp
While
.Dv FS_BSDFFS
will also work as the component type, the type
.Dv FS_RAID
is preferred for RAIDframe use, as it is required for features such as
auto-configuration.
As part of the initial configuration of each RAID set,
each component will be given a
.Sq component label .
A
.Sq component label
contains important information about the component, including a
user-specified serial number, the column of that component in
the RAID set, the redundancy level of the RAID set, a
.Sq modification counter ,
and whether the parity information (if any) on that
component is known to be correct.
Component labels are an integral part of the RAID set,
since they are used to ensure that components
are configured in the correct order, and used to keep track of other
vital information about the RAID set.
Component labels are also required for the auto-detection
and auto-configuration of RAID sets at boot time.
For a component label to be considered valid, that
particular component label must be in agreement with the other
component labels in the set.
For example, the serial number,
.Sq modification counter ,
and number of columns must all be in agreement.
If any of these are different, then the component is
not considered to be part of the set.
See
.Xr raid 4
for more information about component labels.
.Pp
Once the components have been identified, and the disks have
appropriate labels,
.Nm
is then used to configure the
.Xr raid 4
device.
To configure the device, a configuration file which looks something like:
.Bd -literal -offset indent
START array
# numCol numSpare
3 1

START disks
/dev/sd1e
/dev/sd2e
/dev/sd3e

START spare
/dev/sd4e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
32 1 1 5

START queue
fifo 100
.Ed
686ed77a60fSoster.Pp
6872fb4b1dbSwizis created in a file.
The above configuration file specifies a RAID 5
set consisting of the components
.Pa /dev/sd1e ,
.Pa /dev/sd2e ,
and
.Pa /dev/sd3e ,
with
.Pa /dev/sd4e
available as a
.Sq hot spare
in case one of the three main drives should fail.
A RAID 0 set would be specified in a similar way:
.Bd -literal -offset indent
START array
# numCol numSpare
4 0

START disks
/dev/sd10e
/dev/sd11e
/dev/sd12e
/dev/sd13e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_0
64 1 1 0

START queue
fifo 100
.Ed
.Pp
In this case, devices
.Pa /dev/sd10e ,
.Pa /dev/sd11e ,
.Pa /dev/sd12e ,
and
.Pa /dev/sd13e
are the components that make up this RAID set.
Note that there are no hot spares for a RAID 0 set,
since there is no way to recover data if any of the components fail.
.Pp
For a RAID 1 (mirror) set, the following configuration might be used:
.Bd -literal -offset indent
START array
# numCol numSpare
2 0

START disks
/dev/sd20e
/dev/sd21e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_1
128 1 1 1

START queue
fifo 100
.Ed
.Pp
In this case,
.Pa /dev/sd20e
and
.Pa /dev/sd21e
are the two components of the mirror set.
While no hot spares have been specified in this
configuration, they easily could be, just as they were specified in
the RAID 5 case above.
Note as well that RAID 1 sets are currently limited to only 2 components.
At present, n-way mirroring is not possible.
.Pp
The first time a RAID set is configured, the
.Fl C
option must be used:
.Bd -literal -offset indent
raidctl -C raid0.conf raid0
.Ed
.Pp
where
.Pa raid0.conf
is the name of the RAID configuration file.
The
.Fl C
option forces the configuration to succeed, even if any of the component
labels are incorrect.
The
.Fl C
option should not be used lightly in
situations other than initial configurations: if
the system is refusing to configure a RAID set, there is probably a
very good reason for it.
After the initial configuration is done (and
appropriate component labels are added with the
.Fl I
option) then raid0 can be configured normally with:
.Bd -literal -offset indent
raidctl -c raid0.conf raid0
.Ed
.Pp
When the RAID set is configured for the first time, it is
necessary to initialize the component labels, and to initialize the
parity on the RAID set.
Initializing the component labels is done with:
.Bd -literal -offset indent
raidctl -I 112341 raid0
.Ed
.Pp
where
.Sq 112341
is a user-specified serial number for the RAID set.
This initialization step is
.Em required
for all RAID sets.
In addition, using different serial numbers between RAID sets is
.Em strongly encouraged ,
as using the same serial number for all RAID sets will only serve to
decrease the usefulness of the component label checking.
.Pp
Initializing the RAID set is done via the
.Fl i
option.
This initialization
.Em MUST
be done for
.Em all
RAID sets, since among other things it verifies that the parity (if
any) on the RAID set is correct.
Since this initialization may be quite time-consuming, the
.Fl v
option may also be used in conjunction with
.Fl i :
.Bd -literal -offset indent
raidctl -iv raid0
.Ed
.Pp
This will give more verbose output on the
status of the initialization:
.Bd -literal -offset indent
Initiating re-write of parity
Parity Re-write status:
 10% |****                                   | ETA:    06:03 /
.Ed
.Pp
The output provides a
.Sq Percent Complete
in both a numeric and graphical format, as well as an estimated time
to completion of the operation.
.Pp
Since it is the parity that provides the
.Sq redundancy
part of RAID, it is critical that the parity be kept correct whenever possible.
If the parity is not correct, then there is no
guarantee that data will not be lost if a component fails.
.Pp
Once the parity is known to be correct, it is then safe to perform
.Xr disklabel 8 ,
.Xr newfs 8 ,
or
.Xr fsck 8
on the device or its file systems, and then to mount the file systems
for use.
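.Pp
As a rough sketch, the first-time setup steps described above might be
run in the following order.
The partition
.Sq e
and the mount point
.Pa /mnt
are assumptions made for illustration only, and the serial number is
the one used earlier:
.Bd -literal -offset indent
raidctl -C raid0.conf raid0   # force the initial configuration
raidctl -I 112341 raid0       # initialize the component labels
raidctl -iv raid0             # initialize (re-write) the parity
disklabel -e raid0            # create partitions; 'e' is assumed below
newfs /dev/rraid0e            # build a file system on the new set
mount /dev/raid0e /mnt        # mount it for use
.Ed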
.Pp
Under certain circumstances (e.g., the additional component has not
arrived, or data is being migrated off of a disk destined to become a
component) it may be desirable to configure a RAID 1 set with only
a single component.
This can be achieved by using the word
.Dq absent
to indicate that a particular component is not present.
In the following:
.Bd -literal -offset indent
START array
# numCol numSpare
2 0

START disks
absent
/dev/sd0e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_1
128 1 1 1

START queue
fifo 100
.Ed
.Pp
.Pa /dev/sd0e
is the real component, and will be the second disk of a RAID 1 set.
The first component is simply marked as being absent.
Configuration (using
.Fl C
and
.Fl I Ar 12345
as above) proceeds normally, but initialization of the RAID set will
have to wait until all physical components are present.
After configuration, this set can be used normally, but will be operating
in degraded mode.
Once a second physical component is obtained, it can be hot-added,
the existing data mirrored, and normal operation resumed.
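.Pp
A minimal sketch of that later hot-add follows.
It assumes that the new disk appears as
.Pa /dev/sd1e
and that
.Nm
.Fl s
reports the absent first component as
.Sq component0 ;
both names are illustrative and should be checked against the actual
status output before use:
.Bd -literal -offset indent
raidctl -a /dev/sd1e raid0     # add the new disk as a hot spare
raidctl -F component0 raid0    # reconstruct the absent component onto it
raidctl -S raid0               # report reconstruction progress
.Ed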
.Pp
The size of the resulting RAID set will depend on the number of data
components in the set.
Space is automatically reserved for the component labels, and
the actual amount of space used
for data on a component will be rounded down to the largest possible
multiple of the sectors per stripe unit (sectPerSU) value.
Thus, the amount of space provided by the RAID set will be less
than the sum of the size of the components.
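.Pp
As a worked illustration only, consider the three-component RAID 5
example above with a sectPerSU of 32, and assume that each component
is 1800100 sectors in size and that 64 sectors of each component are
reserved for the component label.
Both figures are assumptions chosen for the arithmetic, not values
reported by
.Nm :
.Bd -literal -offset indent
usable per component = floor((1800100 - 64) / 32) * 32
                     = 56251 * 32
                     = 1800032 sectors
data components      = 3 - 1 = 2   (one component's worth holds parity)
RAID set size        = 2 * 1800032 = 3600064 sectors
sum of components    = 3 * 1800100 = 5400300 sectors
.Ed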
.Ss Maintenance of the RAID set
After the parity has been initialized for the first time, the command:
.Bd -literal -offset indent
raidctl -p raid0
.Ed
.Pp
can be used to check the current status of the parity.
To check the parity and rebuild it if necessary (for example,
after an unclean shutdown) the command:
.Bd -literal -offset indent
raidctl -P raid0
.Ed
.Pp
is used.
Note that re-writing the parity can be done while
other operations on the RAID set are taking place (e.g., while doing a
.Xr fsck 8
on a file system on the RAID set).
However, for maximum effectiveness of the RAID set, the parity should be
known to be correct before any data on the set is modified.
.Pp
To see how the RAID set is doing, the following command can be used to
show the RAID set's status:
.Bd -literal -offset indent
raidctl -s raid0
.Ed
.Pp
The output will look something like:
.Bd -literal -offset indent
Components:
           /dev/sd1e: optimal
           /dev/sd2e: optimal
           /dev/sd3e: optimal
Spares:
           /dev/sd4e: spare
Component label for /dev/sd1e:
   Row: 0 Column: 0 Num Rows: 1 Num Columns: 3
   Version: 2 Serial Number: 13432 Mod Counter: 65
   Clean: No Status: 0
   sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1
   RAID Level: 5  blocksize: 512 numBlocks: 1799936
   Autoconfig: No
   Last configured as: raid0
Component label for /dev/sd2e:
   Row: 0 Column: 1 Num Rows: 1 Num Columns: 3
   Version: 2 Serial Number: 13432 Mod Counter: 65
   Clean: No Status: 0
   sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1
   RAID Level: 5  blocksize: 512 numBlocks: 1799936
   Autoconfig: No
   Last configured as: raid0
Component label for /dev/sd3e:
   Row: 0 Column: 2 Num Rows: 1 Num Columns: 3
   Version: 2 Serial Number: 13432 Mod Counter: 65
   Clean: No Status: 0
   sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1
   RAID Level: 5  blocksize: 512 numBlocks: 1799936
   Autoconfig: No
   Last configured as: raid0
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
.Ed
.Pp
This indicates that all is well with the RAID set.
Of importance here are the component lines which read
.Sq optimal ,
and the
.Sq Parity status
line.
.Sq Parity status: clean
indicates that the parity is up-to-date for this RAID set,
whether the RAID set is in redundant or degraded mode.
.Sq Parity status: DIRTY
indicates that it is not known if the parity information is
consistent with the data, and that the parity information needs
to be checked.
Note that if there are file systems open on the RAID set,
the individual components will not be
.Sq clean
but the set as a whole can still be clean.
.Pp
To check the component label of
.Pa /dev/sd1e ,
the following is used:
.Bd -literal -offset indent
raidctl -g /dev/sd1e raid0
.Ed
.Pp
The output of this command will look something like:
.Bd -literal -offset indent
Component label for /dev/sd1e:
   Row: 0 Column: 0 Num Rows: 1 Num Columns: 3
   Version: 2 Serial Number: 13432 Mod Counter: 65
   Clean: No Status: 0
   sectPerSU: 32 SUsPerPU: 1 SUsPerRU: 1
   RAID Level: 5  blocksize: 512 numBlocks: 1799936
   Autoconfig: No
   Last configured as: raid0
.Ed
.Ss Dealing with Component Failures
If for some reason
(perhaps to test reconstruction) it is necessary to pretend a drive
has failed, the following will perform that function:
.Bd -literal -offset indent
raidctl -f /dev/sd2e raid0
.Ed
.Pp
The system will then be performing all operations in degraded mode,
where missing data is re-computed from existing data and the parity.
In this case, obtaining the status of raid0 will return (in part):
.Bd -literal -offset indent
Components:
           /dev/sd1e: optimal
           /dev/sd2e: failed
           /dev/sd3e: optimal
Spares:
           /dev/sd4e: spare
.Ed
.Pp
Note that with the use of
.Fl f
a reconstruction has not been started.
To both fail the disk and start a reconstruction, the
.Fl F
option must be used:
.Bd -literal -offset indent
raidctl -F /dev/sd2e raid0
.Ed
.Pp
The
.Fl f
option may be used first, and then the
.Fl F
option used later, on the same disk, if desired.
Immediately after the reconstruction is started, the status will report:
.Bd -literal -offset indent
Components:
           /dev/sd1e: optimal
           /dev/sd2e: reconstructing
           /dev/sd3e: optimal
Spares:
           /dev/sd4e: used_spare
[...]
Parity status: clean
Reconstruction is 10% complete.
Parity Re-write is 100% complete.
.Ed
.Pp
This indicates that a reconstruction is in progress.
To find out how the reconstruction is progressing, the
.Fl S
option may be used.
This will indicate the progress in terms of the
percentage of the reconstruction that is completed.
When the reconstruction is finished, the
.Fl s
option will show:
.Bd -literal -offset indent
Components:
           /dev/sd1e: optimal
           /dev/sd4e: optimal
           /dev/sd3e: optimal
No spares.
[...]
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
.Ed
.Pp
as
.Pa /dev/sd2e
has been removed and replaced with
.Pa /dev/sd4e .
.Pp
If a component fails and there are no hot spares
available on-line, the status of the RAID set might (in part) look like:
.Bd -literal -offset indent
Components:
           /dev/sd1e: optimal
           /dev/sd2e: failed
           /dev/sd3e: optimal
No spares.
.Ed
.Pp
In this case there are a number of options.
The first option is to add a hot spare using:
.Bd -literal -offset indent
raidctl -a /dev/sd4e raid0
.Ed
.Pp
After the hot add, the status would then be:
.Bd -literal -offset indent
Components:
           /dev/sd1e: optimal
           /dev/sd2e: failed
           /dev/sd3e: optimal
Spares:
           /dev/sd4e: spare
.Ed
.Pp
Reconstruction could then take place using
.Fl F
as described above.
.Pp
A second option is to rebuild directly onto
.Pa /dev/sd2e .
Once the disk containing
.Pa /dev/sd2e
has been replaced, one can simply use:
.Bd -literal -offset indent
raidctl -R /dev/sd2e raid0
.Ed
.Pp
to rebuild the
.Pa /dev/sd2e
component.
As the rebuilding is in progress, the status will be:
.Bd -literal -offset indent
Components:
           /dev/sd1e: optimal
           /dev/sd2e: reconstructing
           /dev/sd3e: optimal
No spares.
.Ed
.Pp
and when completed, will be:
.Bd -literal -offset indent
Components:
           /dev/sd1e: optimal
           /dev/sd2e: optimal
           /dev/sd3e: optimal
No spares.
.Ed
.Pp
In circumstances where a particular component is completely
unavailable after a reboot, a special component name will be used to
indicate the missing component.
For example:
.Bd -literal -offset indent
Components:
           /dev/sd2e: optimal
          component1: failed
No spares.
.Ed
.Pp
indicates that the second component of this RAID set was not detected
at all by the auto-configuration code.
The name
.Sq component1
can be used anywhere a normal component name would be used.
For example, to add a hot spare to the above set, and rebuild to that hot
spare, the following could be done:
.Bd -literal -offset indent
raidctl -a /dev/sd3e raid0
raidctl -F component1 raid0
.Ed
.Pp
at which point the data missing from
.Sq component1
would be reconstructed onto
.Pa /dev/sd3e .
.Pp
When more than one component is marked as
.Sq failed
due to a non-component hardware failure (e.g., loss of power to two
components, adapter problems, termination problems, or cabling issues) it
is quite possible to recover the data on the RAID set.
The first thing to be aware of is that the first disk to fail will
almost certainly be out-of-sync with the remainder of the array.
If any IO was performed between the time the first component is considered
.Sq failed
and when the second component is considered
.Sq failed ,
then the first component to fail will
.Em not
contain correct data, and should be ignored.
When the second component is marked as failed, however, the RAID device will
(currently) panic the system.
At this point the data on the RAID set
(not including the first failed component) is still self-consistent,
and will be in no worse state of repair than had the power gone out in
the middle of a write to a file system on a non-RAID device.
The problem, however, is that the component labels may now have 3 different
.Sq modification counters
(one value on the first component that failed, one value on the second
component that failed, and a third value on the remaining components).
In such a situation, the RAID set will not autoconfigure,
and can only be forcibly re-configured
with the
.Fl C
option.
To recover the RAID set, one must first remedy whatever physical
problem caused the multiple-component failure.
After that is done, the RAID set can be restored by forcibly
configuring the RAID set
.Em without
the component that failed first.
For example, if
.Pa /dev/sd1e
and
.Pa /dev/sd2e
fail (in that order) in a RAID set of the following configuration:
.Bd -literal -offset indent
START array
4 0

START disks
/dev/sd1e
/dev/sd2e
/dev/sd3e
/dev/sd4e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
64 1 1 5

START queue
fifo 100
.Ed
.Pp
then the following configuration (say "recover_raid0.conf")
.Bd -literal -offset indent
START array
4 0

START disks
absent
/dev/sd2e
/dev/sd3e
/dev/sd4e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
64 1 1 5

START queue
fifo 100
.Ed
.Pp
can be used with
.Bd -literal -offset indent
raidctl -C recover_raid0.conf raid0
.Ed
.Pp
to force the configuration of raid0.
A
.Bd -literal -offset indent
raidctl -I 12345 raid0
.Ed
.Pp
will be required in order to synchronize the component labels.
At this point the file systems on the RAID set can then be checked and
corrected.
To complete the re-construction of the RAID set,
.Pa /dev/sd1e
is simply hot-added back into the array, and reconstructed
as described earlier.
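.Pp
Taken together, the recovery described above might proceed along the
following lines.
This is only a sketch: the file system partition
.Sq e
and the name
.Sq component0
for the absent first component are assumptions, and the actual names
should be taken from the output of
.Nm
.Fl s :
.Bd -literal -offset indent
raidctl -C recover_raid0.conf raid0   # force configuration without sd1e
raidctl -I 12345 raid0                # synchronize the component labels
fsck /dev/rraid0e                     # check and correct the file system
raidctl -a /dev/sd1e raid0            # hot-add the first failed disk
raidctl -F component0 raid0           # reconstruct onto it
raidctl -S raid0                      # report reconstruction progress
.Ed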
.Ss RAID on RAID
RAID sets can be layered to create more complex and much larger RAID sets.
A RAID 0 set, for example, could be constructed from four RAID 5 sets.
The following configuration file shows such a setup:
.Bd -literal -offset indent
START array
# numCol numSpare
4 0

START disks
/dev/raid1e
/dev/raid2e
/dev/raid3e
/dev/raid4e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_0
128 1 1 0

START queue
fifo 100
.Ed
.Pp
A similar configuration file might be used for a RAID 0 set
constructed from components on RAID 1 sets.
In such a configuration, the mirroring provides a high degree
of redundancy, while the striping provides additional speed benefits.
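.Pp
For illustration, such a RAID 0 set built from two RAID 1 sets might
be described as follows.
The component names
.Pa /dev/raid1e
and
.Pa /dev/raid2e
and the layout values are assumptions carried over from the examples
above:
.Bd -literal -offset indent
START array
# numCol numSpare
2 0

START disks
/dev/raid1e
/dev/raid2e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_0
128 1 1 0

START queue
fifo 100
.Ed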
.Ss Auto-configuration and Root on RAID
RAID sets can also be auto-configured at boot.
To make a set auto-configurable,
simply prepare the RAID set as above, and then do a:
.Bd -literal -offset indent
raidctl -A yes raid0
.Ed
.Pp
to turn on auto-configuration for that set.
To turn off auto-configuration, use:
.Bd -literal -offset indent
raidctl -A no raid0
.Ed
.Pp
RAID sets which are auto-configurable will be configured before the
root file system is mounted.
These RAID sets are thus available for
use as a root file system, or for any other file system.
A primary advantage of using the auto-configuration is that RAID components
become more independent of the disks they reside on.
For example, SCSI IDs can change, but auto-configured sets will always be
configured correctly, even if the SCSI IDs of the component disks
have become scrambled.
.Pp
Having a system's root file system
.Pq Pa /
on a RAID set is also allowed, with the
.Sq a
partition of such a RAID set being used for
.Pa / .
To use raid0a as the root file system, simply use:
.Bd -literal -offset indent
raidctl -A forceroot raid0
.Ed
.Pp
To return raid0 to being just an auto-configuring set, simply use the
.Fl A Ar yes
arguments.
.Pp
Note that kernels can only be directly read from RAID 1 components on
architectures that support that
(currently alpha, i386, pmax, sandpoint, sparc, sparc64, and vax).
On those architectures, the
.Dv FS_RAID
file system is recognized by the bootblocks, and will properly load the
kernel directly from a RAID 1 component.
For other architectures, or to support the root file system
on other RAID sets, some other mechanism must be used to get a kernel booting.
For example, a small partition containing only the secondary boot-blocks
and an alternate kernel (or two) could be used.
Once a kernel is booting, however, and an auto-configuring RAID set is
found that is eligible to be root, then that RAID set will be
auto-configured and used as the root device.
If two or more RAID sets claim to be root devices, then the
user will be prompted to select the root device.
At this time, RAID 0, 1, 4, and 5 sets are all supported as root devices.
.Pp
A typical RAID 1 setup with root on RAID might be as follows:
.Bl -enum
.It
wd0a - a small partition, which contains a complete, bootable, basic
.Nx
installation.
.It
wd1a - also contains a complete, bootable, basic
.Nx
installation.
.It
wd0e and wd1e - a RAID 1 set, raid0, used for the root file system.
.It
wd0f and wd1f - a RAID 1 set, raid1, which will be used only for
swap space.
.It
wd0g and wd1g - a RAID 1 set, raid2, used for
.Pa /usr ,
.Pa /home ,
or other data, if desired.
.It
wd0h and wd1h - a RAID 1 set, raid3, if desired.
.El
.Pp
RAID sets raid0, raid1, and raid2 are all marked as auto-configurable.
raid0 is marked as being a root file system.
When new kernels are installed, the kernel is not only copied to
.Pa / ,
but also to wd0a and wd1a.
The kernel on wd0a is required, since that
is the kernel the system boots from.
The kernel on wd1a is also
required, since that will be the kernel used should wd0 fail.
The important point here is to have redundant copies of the kernel
available, in the event that one of the drives fails.
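.Pp
As a sketch, once raid0, raid1, and raid2 in this layout have been
configured and initialized as described earlier, they might be marked
as follows:
.Bd -literal -offset indent
raidctl -A forceroot raid0   # auto-configure and use as the root device
raidctl -A yes raid1         # auto-configure only (swap)
raidctl -A yes raid2         # auto-configure only (/usr, /home, ...)
.Ed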
.Pp
There is no requirement that the root file system be on the same disk
as the kernel.
For example, obtaining the kernel from wd0a, and using
sd0e and sd1e for raid0, and the root file system, is fine.
It
.Em is
critical, however, that there be multiple kernels available, in the
event of media failure.
.Pp
Multi-layered RAID devices (such as a RAID 0 set made
up of RAID 1 sets) are
.Em not
supported as root devices or auto-configurable devices at this point.
(Multi-layered RAID devices
.Em are
supported in general, however, as mentioned earlier.)
Note that in order to enable component auto-detection and
auto-configuration of RAID devices, the line:
.Bd -literal -offset indent
options    RAID_AUTOCONFIG
.Ed
.Pp
must be in the kernel configuration file.
See
.Xr raid 4
for more details.
.Ss Swapping on RAID
A RAID device can be used as a swap device.
In order to ensure that a RAID device used as a swap device
is correctly unconfigured when the system is shut down or rebooted,
it is recommended that the line
.Bd -literal -offset indent
swapoff=YES
.Ed
.Pp
be added to
.Pa /etc/rc.conf .
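.Pp
For example, if raid1 from the layout above is used for swap, the
corresponding entries might look like the following.
The partition letter
.Sq b
is an assumption and the
.Pa /etc/fstab
line is shown only for illustration:
.Bd -literal -offset indent
# /etc/rc.conf
swapoff=YES

# /etc/fstab
/dev/raid1b  none  swap  sw  0 0
.Ed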
.Ss Unconfiguration
The final operation performed by
.Nm
is to unconfigure a
.Xr raid 4
device.
This is accomplished via a simple:
.Bd -literal -offset indent
raidctl -u raid0
.Ed
.Pp
at which point the device is ready to be reconfigured.
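.Pp
A device that is still in use cannot be unconfigured.
As a minimal sketch, assuming a file system from raid0 is mounted on
.Pa /mnt
(an example mount point), the sequence might be:
.Bd -literal -offset indent
umount /mnt
raidctl -u raid0
.Ed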
.Ss Performance Tuning
Selecting the parameter values which result in the best
performance can be quite tricky, and often requires a bit of
trial-and-error to find the values most appropriate for a given system.
A whole range of factors comes into play, including:
.Bl -enum
.It
Types of components (e.g., SCSI vs. IDE) and their bandwidth
.It
Types of controller cards and their bandwidth
.It
Distribution of components among controllers
.It
IO bandwidth
.It
File system access patterns
.It
CPU speed
.El
.Pp
As with most performance tuning, benchmarking under real-life loads
may be the only way to measure expected performance.
Understanding some of the underlying technology is also useful in tuning.
The goal of this section is to provide pointers to those parameters which may
make significant differences in performance.
.Pp
For a RAID 1 set, a SectPerSU value of 64 or 128 is typically sufficient.
Since data in a RAID 1 set is arranged in a linear
fashion on each component, selecting an appropriate stripe size is
somewhat less critical than it is for a RAID 5 set.
However, a stripe size that is too small will cause large IOs to be
broken up into a number of smaller ones, hurting performance.
At the same time, a large stripe size may cause problems with
concurrent accesses to stripes, which may also affect performance.
Thus values in the range of 32 to 128 are often the most effective.
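.Pp
As a sketch only, the layout section of a configuration file for a
RAID 1 set using a SectPerSU value of 128 might look like the
following; the other three values are the ones commonly used for
RAID 1 and should be checked against the configuration file
description earlier in this manual page:
.Bd -literal -offset indent
START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
128 1 1 1
.Ed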
.Pp
Tuning RAID 5 sets is trickier.
In the best case, IO is presented to the RAID set one stripe at a time.
Since the entire stripe is available at the beginning of the IO,
the parity of that stripe can be calculated before the stripe is written,
and then the stripe data and parity can be written in parallel.
When the amount of data being written is less than a full stripe worth, the
.Sq small write
problem occurs.
Since a
.Sq small write
means only a portion of the stripe on the components is going to
change, the data (and parity) on the components must be updated
slightly differently.
First, the
.Sq old parity
and
.Sq old data
must be read from the components.
Then the new parity is constructed,
using the new data to be written, and the old data and old parity.
Finally, the new data and new parity are written.
All this extra data shuffling results in a serious loss of performance:
a small write is typically 2 to 4 times slower than a full stripe
write (or read).
To combat this problem in the real world, it may be useful
to ensure that stripe sizes are small enough that a
.Sq large IO
from the system will use exactly one large stripe write.
As is seen later, there are some file system dependencies
which may come into play here as well.
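.Pp
The parity update described above can be illustrated with a single
byte per stripe unit.
The following
.Xr sh 1
sketch is not taken from the driver; the byte values and the
three-component layout (two data units and one parity unit) are
assumptions chosen for the example.
It shows that combining the old parity with the old and new data
gives the same result as recomputing parity over the whole stripe:
.Bd -literal -offset indent
# Hypothetical one-byte stripe units on a 3-component RAID 5 set.
old_data=$((0x5a))      # unit about to be overwritten
new_data=$((0x3c))      # replacement contents
other_data=$((0xf0))    # unit that is not touched by the write

# Parity currently on disk: XOR of all data units in the stripe.
old_parity=$((old_data ^ other_data))

# Small write: only old data and old parity are read back.
new_parity=$((old_parity ^ old_data ^ new_data))

# Full-stripe recomputation, for comparison.
full_parity=$((new_data ^ other_data))

echo "small-write parity: $new_parity, full-stripe parity: $full_parity"
.Ed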
.Pp
Since the size of a
.Sq large IO
is often (currently) only 32K or 64K, on a 5-drive RAID 5 set it may
be desirable to select a SectPerSU value of 16 blocks (8K) or 32
blocks (16K).
Since there are 4 data stripe units per stripe, the maximum
data per stripe is 64 blocks (32K) or 128 blocks (64K).
Again, empirical measurement will provide the best indicators of which
values will yield better performance.
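.Pp
The arithmetic above can be reproduced for other geometries with a
short
.Xr sh 1
sketch.
The variable names are invented for this example, and 512-byte
sectors are assumed:
.Bd -literal -offset indent
ncomponents=5       # components in the RAID 5 set
sect_per_su=16      # SectPerSU from the layout section
sector_bytes=512    # assumed sector size

# One stripe unit per stripe holds parity; the rest hold data.
data_units=$((ncomponents - 1))
stripe_sectors=$((sect_per_su * data_units))
stripe_kb=$((stripe_sectors * sector_bytes / 1024))

echo "$stripe_sectors sectors (${stripe_kb}K) of data per stripe"
.Ed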
.Pp
The parameters used for the file system are also critical to good performance.
For
.Xr newfs 8 ,
for example, increasing the block size to 32K or 64K may improve
performance dramatically.
In addition, changing the cylinders-per-group
parameter from 16 to 32 or higher is often not only necessary for
larger file systems, but may also have positive performance implications.
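.Pp
As a hedged example, assuming the file system is to be created on
raid0e, a 32K block size could be requested as follows; the 4K
fragment size is an assumption of this example and both values should
be confirmed by measurement:
.Bd -literal -offset indent
newfs -b 32768 -f 4096 /dev/rraid0e
.Ed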
.Ss Summary
Despite the length of this manual page, configuring a RAID set is a
relatively straightforward process.
Only the following steps are needed:
.Bl -enum
.It
Use
.Xr disklabel 8
to create the components (of type RAID).
.It
Construct a RAID configuration file, e.g.,
.Pa raid0.conf
.It
Configure the RAID set with:
.Bd -literal -offset indent
raidctl -C raid0.conf raid0
.Ed
.It
Initialize the component labels with:
.Bd -literal -offset indent
raidctl -I 123456 raid0
.Ed
.It
Initialize the set, in particular (re-)writing its parity, with:
.Bd -literal -offset indent
raidctl -i raid0
.Ed
.It
Get the default label for the RAID set:
.Bd -literal -offset indent
disklabel raid0 > /tmp/label
.Ed
.It
Edit the label:
.Bd -literal -offset indent
vi /tmp/label
.Ed
.It
Put the new label on the RAID set:
.Bd -literal -offset indent
disklabel -R -r raid0 /tmp/label
.Ed
.It
Create the file system:
.Bd -literal -offset indent
newfs /dev/rraid0e
.Ed
.It
Mount the file system:
.Bd -literal -offset indent
mount /dev/raid0e /mnt
.Ed
.It
Use:
.Bd -literal -offset indent
raidctl -c raid0.conf raid0
.Ed
.Pp
to re-configure the RAID set the next time it is needed, or put
.Pa raid0.conf
into
.Pa /etc
where it will automatically be started by the
.Pa /etc/rc.d
scripts.
.El
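.Pp
Once these steps are complete, the state of the set and its
components can be inspected.
A minimal sketch, again assuming the set is raid0:
.Bd -literal -offset indent
raidctl -s raid0
.Ed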
.Sh SEE ALSO
.Xr ccd 4 ,
.Xr raid 4 ,
.Xr rc 8
.Sh HISTORY
RAIDframe is a framework for rapid prototyping of RAID structures
developed by the folks at the Parallel Data Laboratory at Carnegie
Mellon University (CMU).
A more complete description of the internals and functionality of
RAIDframe is found in the paper "RAIDframe: A Rapid Prototyping Tool
for RAID Systems", by William V. Courtright II, Garth Gibson, Mark
Holland, LeAnn Neal Reilly, and Jim Zelenka, and published by the
Parallel Data Laboratory of Carnegie Mellon University.
The
.Nm
command first appeared as a program in CMU's RAIDframe v1.1 distribution.
This version of
.Nm
is a complete re-write, and first appeared in
.Nx 1.4 .
.Sh COPYRIGHT
.Bd -literal
The RAIDframe Copyright is as follows:

Copyright (c) 1994-1996 Carnegie-Mellon University.
All rights reserved.

Permission to use, copy, modify and distribute this software and
its documentation is hereby granted, provided that both the copyright
notice and this permission notice appear in all copies of the
software, derivative works or modified versions, and any portions
thereof, and that both notices appear in supporting documentation.

CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS"
CONDITION.  CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND
FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.

Carnegie Mellon requests users of this software to return to

 Software Distribution Coordinator  or  Software.Distribution@CS.CMU.EDU
 School of Computer Science
 Carnegie Mellon University
 Pittsburgh PA 15213-3890

any improvements or extensions that they make and grant Carnegie the
rights to redistribute these changes.
.Ed
.Sh WARNINGS
Certain RAID levels (1, 4, 5, 6, and others) can protect against some
data loss due to component failure.
However, the loss of two components of a RAID 4 or 5 system,
or the loss of a single component of a RAID 0 system, will
result in the entire file system being lost.
RAID is
.Em NOT
a substitute for good backup practices.
.Pp
Recomputation of parity
.Em MUST
be performed whenever there is a chance that it may have been compromised.
This includes after system crashes, or before a RAID
device has been used for the first time.
Failure to keep parity correct will be catastrophic should a
component ever fail \(em it is better to use RAID 0 and get the
additional space and speed, than it is to use parity, but
not keep the parity correct.
At least with RAID 0 there is no perception of increased data security.
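.Pp
As a sketch, assuming the set is raid0, the parity status can be
checked, and the parity rewritten if it is not known to be up to date,
with:
.Bd -literal -offset indent
raidctl -p raid0
raidctl -P raid0
.Ed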
.Pp
When replacing a failed component of a RAID set, it is a good
idea to zero out the first 64 blocks of the new component to ensure that the
RAIDframe driver does not erroneously detect a component label in the
new component.
This is particularly true on
.Em RAID 1
sets because there is at most one correct component label in a failed RAID
1 installation, and the RAIDframe driver picks the component label with the
highest serial number and modification value as the authoritative source
for the failed RAID set when choosing which component label to use to
configure the RAID set.
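.Pp
For illustration, assuming 512-byte blocks and a replacement component
at /dev/rsd2e (a hypothetical device name, which must be double-checked
because the command overwrites data), the first 64 blocks could be
cleared with:
.Bd -literal -offset indent
dd if=/dev/zero of=/dev/rsd2e bs=512 count=64
.Ed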
.Sh BUGS
Hot-spare removal is currently not available.