.\" Copyright (c) 1980, 1987, 1991 Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     from: @(#)uda.4	6.6 (Berkeley) 3/27/91
.\"	$Id: uda.4,v 1.2 1993/08/01 07:35:56 mycroft Exp $
.\"
.Dd March 27, 1991
.Dt UDA 4 vax
.Os BSD 4
.Sh NAME
.Nm uda
.Nd
.Tn UDA50
disk controller interface
.Sh SYNOPSIS
.Cd "controller uda0 at uba0 csr 0172150 vector udaintr"
.Cd "disk ra0 at uda0 drive 0"
.Cd "options MSCP_PARANOIA"
.Sh DESCRIPTION
This is a driver for the
.Tn DEC UDA50
disk controller and other
compatible controllers.  The
.Tn UDA50
communicates with the host through
a packet protocol known as the Mass Storage Control Protocol
.Pq Tn MSCP .
Consult the file
.Aq Pa vax/mscp.h
for a detailed description of this protocol.
.Pp
The
.Nm uda
driver
is a typical block-device disk driver; see
.Xr physio 4
for a description of block
.Tn I/O .
The script
.Xr MAKEDEV 8
should be used to create the
.Nm uda
special files; should a special
file need to be created by hand, consult
.Xr mknod 8 .
.Pp
The
.Dv MSCP_PARANOIA
option enables runtime checking on all transfer completion responses
from the controller.  This increases disk
.Tn I/O
overhead and may
be undesirable on slow machines, but is otherwise recommended.
.Pp
The first sector of each disk contains both a first-stage bootstrap program
and a disk label containing geometry information and partition layouts (see
.Xr disklabel 5 ) .
This sector is normally write-protected, and disk-to-disk copies should
avoid copying this sector.
The label may be updated with
.Xr disklabel 8 ,
which can also be used to write-enable and write-disable the sector.
The next 15 sectors contain a second-stage bootstrap program.
.Sh DISK SUPPORT
During autoconfiguration,
as well as when a drive is opened after all partitions are closed,
the first sector of the drive is examined for a disk label.
If a label is found, the geometry of the drive and the partition tables
are taken from it.
If no label is found,
the driver configures the type of each drive when it is first
encountered.  A default partition table in the driver is used for each type
of disk when a pack is not labelled.  The origin and size
(in sectors) of the default pseudo-disks on each
drive are shown below.  Not all partitions begin on cylinder
boundaries, as on other drives, because previous drivers used one
partition table for all drive types.  Variants of the partition tables
are common; check the driver and the file
.Pa /etc/disktab
.Pq Xr disktab 5
for other possibilities.
.Pp
Special file names begin with
.Sq Li ra
and
.Sq Li rra
for the block and character files respectively.
The second
component of the name, a drive unit number in the range of zero to
seven, is represented by a
.Sq Li ?
in the disk layouts below.
The last component of the name is the
file system partition,
designated
by a letter from
.Sq Li a
to
.Sq Li h ,
which corresponds to a minor device number set: zero to seven,
eight to 15, 16 to 23 and so forth for drive zero, drive one and drive
two respectively (see
.Xr physio 4 ) .
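.Pp
As an illustration of the naming rule above, the following Python sketch
maps a drive unit and partition letter to its special file names and
minor device number.
It is not taken from the driver source: the helper name
.Li ra_minor
is hypothetical, and the formula (unit times eight plus the partition
index) is an assumption inferred from the minor number ranges quoted above.
.Bd -literal -offset indent
```python
# Hypothetical helper, not part of the driver: assumes eight partitions
# (a-h) per drive and minor = unit * 8 + partition index.
PARTITIONS = "abcdefgh"


def ra_minor(unit, partition):
    """Return (block name, raw name, minor) for a unit and partition."""
    if not 0 <= unit <= 7:
        raise ValueError("unit must be in the range 0..7")
    if len(partition) != 1 or partition not in PARTITIONS:
        raise ValueError("partition must be a single letter a..h")
    minor = unit * len(PARTITIONS) + PARTITIONS.index(partition)
    block_name = "ra%d%s" % (unit, partition)
    raw_name = "rra%d%s" % (unit, partition)
    return block_name, raw_name, minor


print(ra_minor(0, "a"))
print(ra_minor(1, "a"))
print(ra_minor(2, "h"))
```
.Ed
.Pp
Under these assumptions the three calls yield minor numbers 0, 8 and 23.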
.Pp
The location and size (in sectors) of the partitions:
.Bl -column header diskx undefined length
.Tn RA60 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	15884	33440
	ra?c	0	400176
	ra?d	49324	82080	same as 4.2BSD ra?g
	ra?e	131404	268772	same as 4.2BSD ra?h
	ra?f	49324	350852
	ra?g	242606	157570
	ra?h	49324	193282

.Tn RA70 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	15972	33440
	ra?c	0	547041
	ra?d	34122	15884
	ra?e	357192	55936
	ra?f	413457	133584
	ra?g	341220	205821
	ra?h	49731	29136

.Tn RA80 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	15884	33440
	ra?c	0	242606
	ra?e	49324	193282	same as old Berkeley ra?g
	ra?f	49324	82080	same as 4.2BSD ra?g
	ra?g	49910	192696
	ra?h	131404	111202	same as 4.2BSD

.Tn RA81 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	16422	66880
	ra?c	0	891072
	ra?d	375564	15884
	ra?e	391986	307200
	ra?f	699720	191352
	ra?g	375564	515508
	ra?h	83538	291346

.Tn RA81 No partitions with 4.2BSD-compatible partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	16422	66880
	ra?c	0	891072
	ra?d	49324	82080	same as 4.2BSD ra?g
	ra?e	131404	759668	same as 4.2BSD ra?h
	ra?f	412490	478582	same as 4.2BSD ra?f
	ra?g	375564	515508
	ra?h	83538	291346

.Tn RA82 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	16245	66880
	ra?c	0	1135554
	ra?d	375345	15884
	ra?e	391590	307200
	ra?f	669390	466164
	ra?g	375345	760209
	ra?h	83790	291346
.El
197.Pp
198The ra?a partition is normally used for the root file system, the ra?b
199partition as a paging area, and the ra?c partition for pack-pack
200copying (it maps the entire disk).
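.Pp
Because the default partitions overlap by design, it can help to check a
table mechanically before choosing which partitions to use together.
The Python sketch below does this for the default
.Tn RA60
table listed above.
The function name
.Li overlaps
is hypothetical, and the 512-byte sector size used for the size column
is an assumption.
.Bd -literal -offset indent
```python
# Default RA60 partitions from the table above: letter -> (start, length),
# both in sectors.  SECTOR_BYTES = 512 is an assumption.
RA60 = {
    "a": (0, 15884),
    "b": (15884, 33440),
    "c": (0, 400176),
    "d": (49324, 82080),
    "e": (131404, 268772),
    "f": (49324, 350852),
    "g": (242606, 157570),
    "h": (49324, 193282),
}
SECTOR_BYTES = 512


def overlaps(table, p, q):
    """True if partitions p and q share at least one sector."""
    (s1, n1), (s2, n2) = table[p], table[q]
    return s1 < s2 + n2 and s2 < s1 + n1


disk_end = sum(RA60["c"])  # ra?c maps the entire disk
for name in sorted(RA60):
    start, length = RA60[name]
    # No default partition may run past the end of the disk.
    assert start + length <= disk_end
    kbytes = length * SECTOR_BYTES // 1024
    print(name, start, start + length - 1, kbytes, "KB")

print(overlaps(RA60, "a", "b"))
print(overlaps(RA60, "d", "f"))
```
.Ed
.Pp
The last two lines report that ra?a and ra?b are disjoint, while ra?d
and ra?f overlap and so should not both hold file systems at once.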
.Sh FILES
.Bl -tag -width /dev/rra[0-9][a-h] -compact
.It Pa /dev/ra[0-9][a-h]
.It Pa /dev/rra[0-9][a-h]
.El
.Sh DIAGNOSTICS
.Bl -diag
.It "panic: udaslave"
No command packets were available while the driver was looking
for disk drives.  The controller is not extending enough credits
to use the drives.
.Pp
.It "uda%d: no response to Get Unit Status request"
A disk drive was found, but did not respond to a status request.
This is either a hardware problem or someone pulling unit number
plugs very fast.
.Pp
.It "uda%d: unit %d off line"
While searching for drives, the controller found one that
seems to be manually disabled.  It is ignored.
.Pp
.It "uda%d: unable to get unit status"
Something went wrong while trying to determine the status of
a disk drive.  This is followed by an error detail.
.Pp
.It uda%d: unit %d, next %d
This probably never happens, but I wanted to know if it did.  I
have no idea what one should do about it.
.Pp
.It "uda%d: cannot handle unit number %d (max is %d)"
The controller found a drive whose unit number is too large.
Valid unit numbers are those in the range [0..7].
.Pp
.It "ra%d: don't have a partition table for %s; using (s,t,c)=(%d,%d,%d)"
The controller found a drive whose media identifier (e.g. `RA 25')
does not have a default partition table.  A temporary partition
table containing only an `a' partition has been created covering
the entire disk, which has the indicated numbers of sectors per
track (s), tracks per cylinder (t), and total cylinders (c).
Give the pack a label with the
.Xr disklabel 8
utility.
.Pp
.It "uda%d: uballoc map failed"
Unibus resource map allocation failed during initialisation.  This
can only happen if you have 496 devices on a Unibus.
.Pp
.It uda%d: timeout during init
The controller did not initialise within ten seconds.  A hardware
problem, but it sometimes goes away if you try again.
.Pp
.It uda%d: init failed, sa=%b
The controller refused to initialise.
.Pp
.It uda%d: controller hung
The controller never finished initialisation.  Retrying may sometimes
fix it.
.Pp
.It ra%d: drive will not come on line
The drive will not come on line, probably because it is spun down.
This should be preceded by a message giving details as to why the
drive stayed off line.
.Pp
.It uda%d: still hung
When the controller hangs, the driver occasionally tries to reinitialise
it.  This means it just tried, without success.
.Pp
.It panic: udastart: bp==NULL
A bug in the driver has put an empty drive queue on a controller queue.
.Pp
.It uda%d: command ring too small
If you increase
.Dv NCMDL2 ,
you may see a performance improvement.
(See
.Pa /sys/vaxuba/uda.c . )
.Pp
.It panic: udastart
A drive was found marked for status or on-line functions while performing
status or on-line functions.  This indicates a bug in the driver.
.Pp
.It "uda%d: controller error, sa=0%o (%s)"
The controller reported an error.  The error code is printed in
octal, along with a short description if the code is known (see the
.%T UDA50 Maintenance Guide ,
.Tn DEC
part number
.Tn AA-M185B-TC ,
pp. 18-22).
If this occurs during normal
operation, the driver will reset it and retry pending
.Tn I/O .
If
it occurs during configuration, the controller may be ignored.
.Pp
.It uda%d: stray intr
The controller interrupted when it should have stayed quiet.  The
interrupt has been ignored.
.Pp
.It "uda%d: init step %d failed, sa=%b"
The controller reported an error during the named initialisation step.
The driver will retry initialisation later.
.Pp
.It uda%d: version %d model %d
An informational message giving the revision level of the controller.
.Pp
.It uda%d: DMA burst size set to %d
An informational message showing the
.Tn DMA
burst size, in words.
.Pp
.It panic: udaintr
Indicates a bug in the generic
.Tn MSCP
code.
.Pp
.It uda%d: driver bug, state %d
The driver has a bogus value for the controller state.  Something
is quite wrong.  This is immediately followed by a `panic: udastate'.
.Pp
.It uda%d: purge bdp %d
A benign message tracing BDP purges.  I have been trying to figure
out what BDP purges are for.  You might want to comment out this
call to
.Fn log
in
.Pa /sys/vaxuba/uda.c .
.Pp
.It uda%d: SETCTLRC failed:  `detail'
The Set Controller Characteristics command (the last part of the
controller initialisation sequence) failed.  The
.Em detail
message tells why.
.Pp
.It "uda%d: attempt to bring ra%d on line failed:  `detail'"
The drive could not be brought on line.  The
.Em detail
message tells why.
.Pp
.It uda%d: ra%d: unknown type %d
The type index of the named drive is not known to the driver, so the
drive will be ignored.
.Pp
.It "ra%d: changed types! was %d now %d"
A drive somehow changed from one kind to another, e.g., from an
.Tn RA80
to an
.Tn RA60 .
The numbers printed are the encoded media identifiers (see
.Ao Pa vax/mscp.h Ac
for the encoding).
The driver believes the new type.
.Pp
.It "ra%d: uda%d, unit %d, size = %d sectors"
The named drive is on the indicated controller as the given unit,
and has that many sectors of user-file area.  This is printed
during configuration.
.Pp
.It "uda%d: attempt to get status for ra%d failed:  `detail'"
A status request failed.  The
.Em detail
message should tell why.
.Pp
.It ra%d: bad block report: %d
The drive has reported the given block as bad.  If there are multiple
bad blocks, the drive will report only the first; in this case this
message will be followed by `+ others'.  Get
.Tn DEC
to forward the
block with
.Tn EVRLK .
.Pp
.It ra%d: serious exception reported
I have no idea what this really means.
.Pp
.It panic: udareplace
The controller reported completion of a
.Tn REPLACE
operation.  The
driver never issues any
.Tn REPLACE Ns s ,
so something is wrong.
.Pp
.It panic: udabb
The controller reported completion of bad block related
.Tn I/O .
The
driver never issues any such, so something is wrong.
.Pp
.It uda%d: lost interrupt
The controller has gone out to lunch, and is being reset to try to bring
it back.
.Pp
.It panic: mscp_go: AEB_MAX_BP too small
You defined
.Dv AVOID_EMULEX_BUG
and increased
.Dv NCMDL2
and Emulex has
new firmware.  Raise
.Dv AEB_MAX_BP
or turn off
.Dv AVOID_EMULEX_BUG .
.Pp
.It "uda%d: unit %d: unknown message type 0x%x ignored"
The controller responded with a mysterious message type. See
.Pa /sys/vax/mscp.h
for a list of known message types.  This is probably
a controller hardware problem.
.Pp
.It "uda%d: unit %d out of range"
The disk drive unit number (the unit plug) is higher than the
maximum number the driver allows (currently 7).
.Pp
.It "uda%d: unit %d not configured, message ignored"
The named disk drive has announced its presence to the controller,
but was not, or cannot now be, configured into the running system.
.Em Message
is one of `available attention' (an `I am here' message) or
`stray response op 0x%x status 0x%x' (anything else).
.Pp
.It ra%d: bad lbn (%d)?
The drive has reported an invalid command error, probably due to an
invalid block number.  If the lbn value is very much greater than the
size reported by the drive, this is the problem.  It is probably due to
an improperly configured partition table.  Other invalid commands
indicate a bug in the driver, or hardware trouble.
.Pp
.It ra%d: duplicate ONLINE ignored
The drive has come on-line while already on-line.  This condition
can probably be ignored (and has been).
.Pp
.It ra%d: io done, but no buffer?
Hardware trouble, or a bug; the drive has finished an
.Tn I/O
request,
but the response has an invalid (zero) command reference number.
.Pp
.It "Emulex SC41/MS screwup: uda%d, got %d correct, then changed 0x%x to 0x%x"
You turned on
.Dv AVOID_EMULEX_BUG ,
and the driver successfully
avoided the bug.  The number of correctly-handled requests is
reported, along with the expected and actual values relating to
the bug being avoided.
.Pp
.It panic: unrecoverable Emulex screwup
You turned on
.Dv AVOID_EMULEX_BUG ,
but Emulex was too clever and
avoided the avoidance.  Try turning on
.Dv MSCP_PARANOIA
instead.
.Pp
.It uda%d: bad response packet ignored
You turned on
.Dv MSCP_PARANOIA ,
and the driver caught the controller in
a lie.  The lie has been ignored, and the controller will soon be
reset (after a `lost' interrupt).  This is followed by a hex dump of
the offending packet.
.Pp
.It ra%d: bogus REPLACE end
The drive has reported finishing a bad sector replacement, but the
driver never issues bad sector replacement commands.  The report
is ignored.  This is likely a hardware problem.
.Pp
.It "ra%d: unknown opcode 0x%x status 0x%x ignored"
The drive has reported something that the driver cannot understand.
Perhaps
.Tn DEC
has been inventive, or perhaps your hardware is ill.
This is followed by a hex dump of the offending packet.
.Pp
.It "ra%d%c: hard error %sing fsbn %d [of %d-%d] (ra%d bn %d cn %d tn %d sn %d)."
An unrecoverable error occurred during transfer of the specified
filesystem block number(s),
which are logical block numbers on the indicated partition.
If the transfer involved multiple blocks, the block range is printed as well.
The parenthesized fields list the actual disk sector number
relative to the beginning of the drive,
as well as the cylinder, track and sector number of the block.
.Pp
.It uda%d: %s error datagram
The controller has reported some kind of error, either `hard'
(unrecoverable) or `soft' (recoverable).  If the controller is going on
(attempting to fix the problem), this message includes the remark
`(continuing)'.  Emulex controllers wrongly claim that all soft errors
are hard errors.  This message may be followed by
one of the following 5 messages, depending on its type, and will always
be followed by a failure detail message (also listed below).
.Bd -filled -offset indent
.It memory addr 0x%x
A host memory access error; this is the address that could not be
read.
.Pp
.It "unit %d: level %d retry %d, %s %d"
A typical disk error; the retry count and error recovery levels are
printed, along with the block type (`lbn', or logical block; or `rbn',
or replacement block) and number.  If the string is something else,
.Tn DEC
has been clever, or your hardware has gone to Australia for vacation
(unless you live there; then it might be in New Zealand, or Brazil).
.Pp
.It unit %d: %s %d
Also a disk error, but an `SDI' error, whatever that is.  (I doubt
it has anything to do with Ronald Reagan.)  This lists the block
type (`lbn' or `rbn') and number.  This is followed by a second
message indicating a microprocessor error code and a front panel
code.  These latter codes are drive-specific, and are intended to
be used by field service as an aid in locating failing hardware.
The codes for RA81s can be found in the
.%T RA81 Maintenance Guide ,
DEC order number AA-M879A-TC, in appendices E and F.
.Pp
.It "unit %d: small disk error, cyl %d"
Yet another kind of disk error, but for small disks.  (`That's what
it says, guv'nor.  Dunnask me what it means.')
.Pp
.It "unit %d: unknown error, format 0x%x"
A mysterious error: the given format code is not known.
.Ed
.Pp
The detail messages are as follows:
.Bd -filled -offset indent
.It success (%s) (code 0, subcode %d)
Everything worked, but the controller thought it would let you know
that something went wrong.  No matter what subcode, this can probably
be ignored.
.Pp
.It "invalid command (%s) (code 1, subcode %d)"
This probably cannot occur unless the hardware is out; %s should be
`invalid msg length', meaning some command was too short or too long.
.Pp
.It "command aborted (unknown subcode) (code 2, subcode %d)"
This should never occur, as the driver never aborts commands.
.Pp
.It "unit offline (%s) (code 3, subcode %d)"
The drive is offline, either because it is not around (`unknown
drive'), stopped (`not mounted'), out of order (`inoperative'), has the
same unit number as some other drive (`duplicate'), or has been
disabled for diagnostics (`in diagnosis').
.Pp
.It "unit available (unknown subcode) (code 4, subcode %d)"
The controller has decided to report a perfectly normal event as
an error.  (Why?)
.Pp
.It "media format error (%s) (code 5, subcode %d)"
The drive cannot be used without reformatting.  The Format Control
Table cannot be read (`fct unread - edc'), there is a bad sector
header (`invalid sector header'), the drive is not set for 512-byte
sectors (`not 512 sectors'), the drive is not formatted (`not formatted'),
or the
.Tn FCT
has an uncorrectable
.Tn ECC
error (`fct ecc').
.Pp
.It "write protected (%s) (code 6, subcode %d)"
The drive is write protected, either by the front panel switch
(`hardware') or via the driver (`software').  The driver never
sets software write protect.
.Pp
.It "compare error (unknown subcode) (code 7, subcode %d)"
A compare operation showed some sort of difference.  The driver
never uses compare operations.
.Pp
.It "data error (%s) (code 8, subcode %d)"
Something went wrong reading or writing a data sector.  A `forced
error' is a software-asserted error used to mark a sector that contains
suspect data.  Rewriting the sector will clear the forced error.  This
is normally set only during bad block replacement, and the driver does
no bad block replacement, so these should not occur.  A `header
compare' error probably means the block is shot.  A `sync timeout'
presumably has something to do with sector synchronisation.
An `uncorrectable ecc' error is an ordinary data error that cannot
be fixed via
.Tn ECC
logic.  A `%d symbol ecc' error is a data error
that can be (and presumably has been) corrected by the
.Tn ECC
logic.
It might indicate a sector that is imperfect but usable, or that
is starting to go bad.  If any of these errors recur, the sector
may need to be replaced.
.Pp
.It "host buffer access error (%s) (code %d, subcode %d)"
Something went wrong while trying to copy data to or from the host
(Vax).  The subcode is one of `odd xfer addr', `odd xfer count',
`non-exist. memory', or `memory parity'.  The first two could be a
software glitch; the last two indicate hardware problems.
.Pp
.It controller error (%s) (code %d, subcode %d)
The controller has detected a hardware error in itself.  A
`serdes overrun' is a serialiser / deserialiser overrun; `edc'
probably stands for `error detection code'; and `inconsistent
internal data struct' is obvious.
.Pp
.It "drive error (%s) (code %d, subcode %d)"
Either the controller or the drive has detected a hardware error
in the drive.  I am not sure what an `sdi command timeout' is, but
these seem to occur benignly on occasion.  A `ctlr detected protocol'
error means that the controller and drive do not agree on a protocol;
this could be a cabling problem, or a version mismatch.  A `positioner'
error means the drive seek hardware is ailing; `lost rd/wr ready'
means the drive read/write logic is sick; and `drive clock dropout'
means that the drive clock logic is bad, or the media is hopelessly
scrambled.  I have no idea what `lost recvr ready' means.  A `drive
detected error' is a catch-all for drive hardware trouble; `ctlr
detected pulse or parity' errors are often caused by cabling problems.
.Ed
.El
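.Pp
The block numbers in the hard error message can be cross-checked by hand.
The Python sketch below shows one way to do so, assuming the conventional
linear mapping from a drive-relative block number (bn) to cylinder (cn),
track (tn) and sector (sn), and assuming that fsbn counts sectors from the
start of the partition.
The function names are hypothetical, and the geometry of 51 sectors per
track and 14 tracks per cylinder is an illustrative assumption, chosen
because 51 x 14 x 1248 equals the 891072-sector ra?c length listed for the
.Tn RA81 .
.Bd -literal -offset indent
```python
def fsbn_to_bn(partition_start, fsbn):
    """Drive-relative block number for a block within a partition."""
    return partition_start + fsbn


def decode_bn(bn, sectors_per_track, tracks_per_cylinder):
    """Split a drive-relative block number into (cn, tn, sn)."""
    sectors_per_cylinder = sectors_per_track * tracks_per_cylinder
    cn, rest = divmod(bn, sectors_per_cylinder)
    tn, sn = divmod(rest, sectors_per_track)
    return cn, tn, sn


# Illustrative values only: block 1000 of a partition starting at
# sector 375564 (the default RA81 ra?g origin in DISK SUPPORT).
bn = fsbn_to_bn(375564, 1000)
print(bn, decode_bn(bn, 51, 14))
```
.Ed
.Pp
With these inputs bn is 376564, which decodes to cylinder 527, track 5,
sector 31; a mismatch against a logged message would suggest that the
assumed geometry or mapping differs from what the driver used.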
.Sh SEE ALSO
.Xr disklabel 5 ,
.Xr disklabel 8
.Sh HISTORY
The
.Nm
driver appeared in
.Bx 4.2 .