.\" Copyright (c) 1980, 1987, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)uda.4	8.1 (Berkeley) 6/5/93
.\"
.Dd June 5, 1993
.Dt UDA 4 vax
.Os BSD 4
.Sh NAME
.Nm uda
.Nd
.Tn UDA50
disk controller interface
.Sh SYNOPSIS
.Cd "controller uda0 at uba0 csr 0172150 vector udaintr"
.Cd "disk ra0 at uda0 drive 0"
.Cd "options MSCP_PARANOIA"
.Sh DESCRIPTION
This is a driver for the
.Tn DEC UDA50
disk controller and other
compatible controllers.  The
.Tn UDA50
communicates with the host through
a packet protocol known as the Mass Storage Control Protocol
.Pq Tn MSCP .
Consult the file
.Aq Pa vax/mscp.h
for a detailed description of this protocol.
.Pp
The
.Nm uda
driver
is a typical block-device disk driver; see
.Xr physio 4
for a description of block
.Tn I/O .
The script
.Xr MAKEDEV 8
should be used to create the
.Nm uda
special files; should a special
file need to be created by hand, consult
.Xr mknod 8 .
.Pp
The
.Dv MSCP_PARANOIA
option enables runtime checking on all transfer completion responses
from the controller.  This increases disk
.Tn I/O
overhead and may
be undesirable on slow machines, but is otherwise recommended.
.Pp
The first sector of each disk contains both a first-stage bootstrap program
and a disk label containing geometry information and partition layouts (see
.Xr disklabel 5 ) .
This sector is normally write-protected, and disk-to-disk copies should
avoid copying this sector.
The label may be updated with
.Xr disklabel 8 ,
which can also be used to write-enable and write-disable the sector.
The next 15 sectors contain a second-stage bootstrap program.
.Sh DISK SUPPORT
During autoconfiguration,
as well as when a drive is opened after all partitions are closed,
the first sector of the drive is examined for a disk label.
If a label is found, the geometry of the drive and the partition tables
are taken from it.
If no label is found,
the driver configures the type of each drive when it is first
encountered.  A default partition table in the driver is used for each type
of disk when a pack is not labelled.  The origin and size
(in sectors) of the default pseudo-disks on each
drive are shown below.  Not all partitions begin on cylinder
boundaries, as on other drives, because previous drivers used one
partition table for all drive types.  Variants of the partition tables
are common; check the driver and the file
.Pa /etc/disktab
.Pq Xr disktab 5
for other possibilities.
.Pp
Special file names begin with
.Sq Li ra
and
.Sq Li rra
for the block and character files respectively.
The second
component of the name, a drive unit number in the range of zero to
seven, is represented by a
.Sq Li ?
in the disk layouts below.
The last component of the name is the
file system partition,
designated
by a letter from
.Sq Li a
to
.Sq Li h ,
which corresponds to a minor device number set: zero to seven,
eight to 15, 16 to 23 and so forth for drive zero, drive one and drive
two respectively (see
.Xr physio 4 ) .
The location and size (in sectors) of the partitions:
.Bl -column header diskx undefined length
.Tn RA60 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	15884	33440
	ra?c	0	400176
	ra?d	49324	82080	same as 4.2BSD ra?g
	ra?e	131404	268772	same as 4.2BSD ra?h
	ra?f	49324	350852
	ra?g	242606	157570
	ra?h	49324	193282

.Tn RA70 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	15972	33440
	ra?c	0	547041
	ra?d	34122	15884
	ra?e	357192	55936
	ra?f	413457	133584
	ra?g	341220	205821
	ra?h	49731	29136

.Tn RA80 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	15884	33440
	ra?c	0	242606
	ra?e	49324	193282	same as old Berkeley ra?g
	ra?f	49324	82080	same as 4.2BSD ra?g
	ra?g	49910	192696
	ra?h	131404	111202	same as 4.2BSD

.Tn RA81 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	16422	66880
	ra?c	0	891072
	ra?d	375564	15884
	ra?e	391986	307200
	ra?f	699720	191352
	ra?g	375564	515508
	ra?h	83538	291346

.Tn RA81 No partitions with 4.2BSD-compatible partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	16422	66880
	ra?c	0	891072
	ra?d	49324	82080	same as 4.2BSD ra?g
	ra?e	131404	759668	same as 4.2BSD ra?h
	ra?f	412490	478582	same as 4.2BSD ra?f
	ra?g	375564	515508
	ra?h	83538	291346

.Tn RA82 No partitions
.Sy	disk	start	length
	ra?a	0	15884
	ra?b	16245	66880
	ra?c	0	1135554
	ra?d	375345	15884
	ra?e	391590	307200
	ra?f	669390	466164
	ra?g	375345	760209
	ra?h	83790	291346
.El
.Pp
The ra?a partition is normally used for the root file system, the ra?b
partition as a paging area, and the ra?c partition for pack-to-pack
copying (it maps the entire disk).
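.Pp
The default tables above can be checked mechanically.
The following is a minimal sketch, not driver code: it copies the
RA60 rows into a dictionary and tests the two properties the text
relies on, namely that ra?c maps the entire disk and that some
partitions deliberately overlap.
The helper names (end, overlaps, fits_in_c) are hypothetical and
chosen only for this illustration.

```python
# Default RA60 layout copied from the table above:
# partition letter -> (start, length), both in sectors.
RA60 = {
    "a": (0, 15884),
    "b": (15884, 33440),
    "c": (0, 400176),
    "d": (49324, 82080),
    "e": (131404, 268772),
    "f": (49324, 350852),
    "g": (242606, 157570),
    "h": (49324, 193282),
}


def end(part):
    """Return the first sector past the partition (start + length)."""
    start, length = part
    return start + length


def overlaps(p, q):
    """True if two (start, length) extents share at least one sector."""
    return p[0] < end(q) and q[0] < end(p)


def fits_in_c(table):
    """True if every partition lies inside the whole-disk 'c' partition."""
    return all(end(p) <= end(table["c"]) for p in table.values())
```

With this table, fits_in_c(RA60) holds, ra?a and ra?b do not overlap,
and ra?d and ra?f do, so only one of an overlapping pair should carry
a file system at a time.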
.Sh FILES
.Bl -tag -width /dev/rra[0-9][a-h] -compact
.It Pa /dev/ra[0-9][a-h]
.It Pa /dev/rra[0-9][a-h]
.El
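.Pp
The naming scheme described under DISK SUPPORT (eight minor device
numbers per drive, one per partition letter a through h) can be
sketched as follows.
This is an illustration under that reading of the text, not code
taken from the driver or from MAKEDEV; the function name ra_minor is
hypothetical, and major device numbers are not covered.

```python
import re

PARTITIONS = "abcdefgh"  # eight partitions per drive, 'a' through 'h'


def ra_minor(name):
    """Map a name such as 'ra1c' or 'rra1c' to (is_raw, minor).

    Assumes minor = unit * 8 + partition index, which is how the text
    describes the sets 0-7, 8-15, 16-23 for drives 0, 1 and 2.
    """
    m = re.fullmatch(r"(r?)ra([0-7])([a-h])", name)
    if m is None:
        raise ValueError("not an ra device name: %r" % name)
    raw, unit, part = m.groups()
    return bool(raw), int(unit) * 8 + PARTITIONS.index(part)
```

For example, ra_minor("ra1c") gives a minor number of 10 for the
block device, and ra_minor("rra2h") gives 23 for the character device.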
.Sh DIAGNOSTICS
.Bl -diag
.It "panic: udaslave"
No command packets were available while the driver was looking
for disk drives.  The controller is not extending enough credits
to use the drives.
.Pp
.It "uda%d: no response to Get Unit Status request"
A disk drive was found, but did not respond to a status request.
This is either a hardware problem or someone pulling unit number
plugs very fast.
.Pp
.It "uda%d: unit %d off line"
While searching for drives, the controller found one that
seems to be manually disabled.  It is ignored.
.Pp
.It "uda%d: unable to get unit status"
Something went wrong while trying to determine the status of
a disk drive.  This is followed by an error detail.
.Pp
.It uda%d: unit %d, next %d
This probably never happens, but I wanted to know if it did.  I
have no idea what one should do about it.
.Pp
.It "uda%d: cannot handle unit number %d (max is %d)"
The controller found a drive whose unit number is too large.
Valid unit numbers are those in the range [0..7].
.Pp
.It "ra%d: don't have a partition table for %s; using (s,t,c)=(%d,%d,%d)"
The controller found a drive whose media identifier (e.g. `RA 25')
does not have a default partition table.  A temporary partition
table containing only an `a' partition has been created covering
the entire disk, which has the indicated numbers of sectors per
track (s), tracks per cylinder (t), and total cylinders (c).
Give the pack a label with the
.Xr disklabel 8
utility.
.Pp
.It "uda%d: uballoc map failed"
Unibus resource map allocation failed during initialisation.  This
can only happen if you have 496 devices on a Unibus.
.Pp
.It uda%d: timeout during init
The controller did not initialise within ten seconds.  A hardware
problem, but it sometimes goes away if you try again.
.Pp
.It uda%d: init failed, sa=%b
The controller refused to initialise.
.Pp
.It uda%d: controller hung
The controller never finished initialisation.  Retrying may sometimes
fix it.
.Pp
.It ra%d: drive will not come on line
The drive will not come on line, probably because it is spun down.
This should be preceded by a message giving details as to why the
drive stayed off line.
.Pp
.It uda%d: still hung
When the controller hangs, the driver occasionally tries to reinitialise
it.  This means it just tried, without success.
.Pp
.It panic: udastart: bp==NULL
A bug in the driver has put an empty drive queue on a controller queue.
.Pp
.It uda%d: command ring too small
If you increase
.Dv NCMDL2 ,
you may see a performance improvement.
(See
.Pa /sys/vaxuba/uda.c . )
.Pp
.It panic: udastart
A drive was found marked for status or on-line functions while performing
status or on-line functions.  This indicates a bug in the driver.
.Pp
.It "uda%d: controller error, sa=0%o (%s)"
The controller reported an error.  The error code is printed in
octal, along with a short description if the code is known (see the
.%T UDA50 Maintenance Guide ,
.Tn DEC
part number
.Tn AA-M185B-TC ,
pp. 18-22).
If this occurs during normal
operation, the driver will reset it and retry pending
.Tn I/O .
If
it occurs during configuration, the controller may be ignored.
.Pp
.It uda%d: stray intr
The controller interrupted when it should have stayed quiet.  The
interrupt has been ignored.
.Pp
.It "uda%d: init step %d failed, sa=%b"
The controller reported an error during the named initialisation step.
The driver will retry initialisation later.
.Pp
.It uda%d: version %d model %d
An informational message giving the revision level of the controller.
.Pp
.It uda%d: DMA burst size set to %d
An informational message showing the
.Tn DMA
burst size, in words.
.Pp
.It panic: udaintr
Indicates a bug in the generic
.Tn MSCP
code.
.Pp
.It uda%d: driver bug, state %d
The driver has a bogus value for the controller state.  Something
is quite wrong.  This is immediately followed by a `panic: udastate'.
.Pp
.It uda%d: purge bdp %d
A benign message tracing BDP purges.  I have been trying to figure
out what BDP purges are for.  You might want to comment out this
call to log() in /sys/vaxuba/uda.c.
.Pp
.It uda%d: SETCTLRC failed:  `detail'
The Set Controller Characteristics command (the last part of the
controller initialisation sequence) failed.  The
.Em detail
message tells why.
.Pp
.It "uda%d: attempt to bring ra%d on line failed:  `detail'"
The drive could not be brought on line.  The
.Em detail
message tells why.
.Pp
.It uda%d: ra%d: unknown type %d
The type index of the named drive is not known to the driver, so the
drive will be ignored.
.Pp
.It "ra%d: changed types! was %d now %d"
A drive somehow changed from one kind to another, e.g., from an
.Tn RA80
to an
.Tn RA60 .
The numbers printed are the encoded media identifiers (see
.Ao Pa vax/mscp.h Ac
for the encoding).
The driver believes the new type.
.Pp
.It "ra%d: uda%d, unit %d, size = %d sectors"
The named drive is on the indicated controller as the given unit,
and has that many sectors of user-file area.  This is printed
during configuration.
.Pp
.It "uda%d: attempt to get status for ra%d failed:  `detail'"
A status request failed.  The
.Em detail
message should tell why.
.Pp
.It ra%d: bad block report: %d
The drive has reported the given block as bad.  If there are multiple
bad blocks, the drive will report only the first; in this case this
message will be followed by `+ others'.  Get
.Tn DEC
to forward the
block with
.Tn EVRLK .
.Pp
.It ra%d: serious exception reported
I have no idea what this really means.
.Pp
.It panic: udareplace
The controller reported completion of a
.Tn REPLACE
operation.  The
driver never issues any
.Tn REPLACE Ns s ,
so something is wrong.
.Pp
.It panic: udabb
The controller reported completion of bad block related
.Tn I/O .
The
driver never issues any such, so something is wrong.
.Pp
.It uda%d: lost interrupt
The controller has gone out to lunch, and is being reset to try to bring
it back.
.Pp
.It panic: mscp_go: AEB_MAX_BP too small
You defined
.Dv AVOID_EMULEX_BUG
and increased
.Dv NCMDL2
and Emulex has
new firmware.  Raise
.Dv AEB_MAX_BP
or turn off
.Dv AVOID_EMULEX_BUG .
.Pp
.It "uda%d: unit %d: unknown message type 0x%x ignored"
The controller responded with a mysterious message type. See
.Pa /sys/vax/mscp.h
for a list of known message types.  This is probably
a controller hardware problem.
.Pp
.It "uda%d: unit %d out of range"
The disk drive unit number (the unit plug) is higher than the
maximum number the driver allows (currently 7).
.Pp
.It "uda%d: unit %d not configured, message ignored"
The named disk drive has announced its presence to the controller,
but was not, or cannot now be, configured into the running system.
.Em Message
is one of `available attention' (an `I am here' message) or
`stray response op 0x%x status 0x%x' (anything else).
.Pp
.It ra%d: bad lbn (%d)?
The drive has reported an invalid command error, probably due to an
invalid block number.  If the lbn value is very much greater than the
size reported by the drive, this is the problem.  It is probably due to
an improperly configured partition table.  Other invalid commands
indicate a bug in the driver, or hardware trouble.
.Pp
.It ra%d: duplicate ONLINE ignored
The drive has come on-line while already on-line.  This condition
can probably be ignored (and has been).
.Pp
.It ra%d: io done, but no buffer?
Hardware trouble, or a bug; the drive has finished an
.Tn I/O
request,
but the response has an invalid (zero) command reference number.
.Pp
.It "Emulex SC41/MS screwup: uda%d, got %d correct, then changed 0x%x to 0x%x"
You turned on
.Dv AVOID_EMULEX_BUG ,
and the driver successfully
avoided the bug.  The number of correctly-handled requests is
reported, along with the expected and actual values relating to
the bug being avoided.
.Pp
.It panic: unrecoverable Emulex screwup
You turned on
.Dv AVOID_EMULEX_BUG ,
but Emulex was too clever and
avoided the avoidance.  Try turning on
.Dv MSCP_PARANOIA
instead.
.Pp
.It uda%d: bad response packet ignored
You turned on
.Dv MSCP_PARANOIA ,
and the driver caught the controller in
a lie.  The lie has been ignored, and the controller will soon be
reset (after a `lost' interrupt).  This is followed by a hex dump of
the offending packet.
.Pp
.It ra%d: bogus REPLACE end
The drive has reported finishing a bad sector replacement, but the
driver never issues bad sector replacement commands.  The report
is ignored.  This is likely a hardware problem.
.Pp
.It "ra%d: unknown opcode 0x%x status 0x%x ignored"
The drive has reported something that the driver cannot understand.
Perhaps
.Tn DEC
has been inventive, or perhaps your hardware is ill.
This is followed by a hex dump of the offending packet.
.Pp
.It "ra%d%c: hard error %sing fsbn %d [of %d-%d] (ra%d bn %d cn %d tn %d sn %d)."
An unrecoverable error occurred during transfer of the specified
filesystem block number(s),
which are logical block numbers on the indicated partition.
If the transfer involved multiple blocks, the block range is printed as well.
The parenthesized fields list the actual disk sector number
relative to the beginning of the drive,
as well as the cylinder, track and sector number of the block.
.Pp
.It uda%d: %s error datagram
The controller has reported some kind of error, either `hard'
(unrecoverable) or `soft' (recoverable).  If the controller is going on
(attempting to fix the problem), this message includes the remark
`(continuing)'.  Emulex controllers wrongly claim that all soft errors
are hard errors.  This message may be followed by
one of the following 5 messages, depending on its type, and will always
be followed by a failure detail message (also listed below).
.Bd -filled -offset indent
.It memory addr 0x%x
A host memory access error; this is the address that could not be
read.
.Pp
.It "unit %d: level %d retry %d, %s %d"
A typical disk error; the retry count and error recovery levels are
printed, along with the block type (`lbn', or logical block; or `rbn',
or replacement block) and number.  If the string is something else,
.Tn DEC
has been clever, or your hardware has gone to Australia for vacation
(unless you live there; then it might be in New Zealand, or Brazil).
.Pp
.It unit %d: %s %d
Also a disk error, but an `SDI' error, whatever that is.  (I doubt
it has anything to do with Ronald Reagan.)  This lists the block
type (`lbn' or `rbn') and number.  This is followed by a second
message indicating a microprocessor error code and a front panel
code.  These latter codes are drive-specific, and are intended to
be used by field service as an aid in locating failing hardware.
The codes for RA81s can be found in the
.%T RA81 Maintenance Guide ,
DEC order number AA-M879A-TC, in appendices E and F.
.Pp
.It "unit %d: small disk error, cyl %d"
Yet another kind of disk error, but for small disks.  (`That's what
it says, guv'nor.  Dunnask me what it means.')
.Pp
.It "unit %d: unknown error, format 0x%x"
A mysterious error: the given format code is not known.
.Ed
.Pp
The detail messages are as follows:
.Bd -filled -offset indent
.It success (%s) (code 0, subcode %d)
Everything worked, but the controller thought it would let you know
that something went wrong.  No matter what subcode, this can probably
be ignored.
.Pp
.It "invalid command (%s) (code 1, subcode %d)"
This probably cannot occur unless the hardware is out; %s should be
`invalid msg length', meaning some command was too short or too long.
.Pp
.It "command aborted (unknown subcode) (code 2, subcode %d)"
This should never occur, as the driver never aborts commands.
.Pp
.It "unit offline (%s) (code 3, subcode %d)"
The drive is offline, either because it is not around (`unknown
drive'), stopped (`not mounted'), out of order (`inoperative'), has the
same unit number as some other drive (`duplicate'), or has been
disabled for diagnostics (`in diagnosis').
.Pp
.It "unit available (unknown subcode) (code 4, subcode %d)"
The controller has decided to report a perfectly normal event as
an error.  (Why?)
.Pp
.It "media format error (%s) (code 5, subcode %d)"
The drive cannot be used without reformatting.  The Format Control
Table cannot be read (`fct unread - edc'), there is a bad sector
header (`invalid sector header'), the drive is not set for 512-byte
sectors (`not 512 sectors'), the drive is not formatted (`not formatted'),
or the
.Tn FCT
has an uncorrectable
.Tn ECC
error (`fct ecc').
.Pp
.It "write protected (%s) (code 6, subcode %d)"
The drive is write protected, either by the front panel switch
(`hardware') or via the driver (`software').  The driver never
sets software write protect.
.Pp
.It "compare error (unknown subcode) (code 7, subcode %d)"
A compare operation showed some sort of difference.  The driver
never uses compare operations.
.Pp
.It "data error (%s) (code 8, subcode %d)"
Something went wrong reading or writing a data sector.  A `forced
error' is a software-asserted error used to mark a sector that contains
suspect data.  Rewriting the sector will clear the forced error.  This
is normally set only during bad block replacement, and the driver does
no bad block replacement, so these should not occur.  A `header
compare' error probably means the block is shot.  A `sync timeout'
presumably has something to do with sector synchronisation.
An `uncorrectable ecc' error is an ordinary data error that cannot
be fixed via
.Tn ECC
logic.  A `%d symbol ecc' error is a data error
that can be (and presumably has been) corrected by the
.Tn ECC
logic.
It might indicate a sector that is imperfect but usable, or that
is starting to go bad.  If any of these errors recur, the sector
may need to be replaced.
.Pp
.It "host buffer access error (%s) (code %d, subcode %d)"
Something went wrong while trying to copy data to or from the host
(VAX).  The subcode is one of `odd xfer addr', `odd xfer count',
`non-exist. memory', or `memory parity'.  The first two could be a
software glitch; the last two indicate hardware problems.
.Pp
.It "controller error (%s) (code %d, subcode %d)"
The controller has detected a hardware error in itself.  A
`serdes overrun' is a serialiser / deserialiser overrun; `edc'
probably stands for `error detection code'; and `inconsistent
internal data struct' is obvious.
.Pp
.It "drive error (%s) (code %d, subcode %d)"
Either the controller or the drive has detected a hardware error
in the drive.  I am not sure what an `sdi command timeout' is, but
these seem to occur benignly on occasion.  A `ctlr detected protocol'
error means that the controller and drive do not agree on a protocol;
this could be a cabling problem, or a version mismatch.  A `positioner'
error means the drive seek hardware is ailing; `lost rd/wr ready'
means the drive read/write logic is sick; and `drive clock dropout'
means that the drive clock logic is bad, or the media is hopelessly
scrambled.  I have no idea what `lost recvr ready' means.  A `drive
detected error' is a catch-all for drive hardware trouble; `ctlr
detected pulse or parity' errors are often caused by cabling problems.
.Ed
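.Pp
The code and subcode printed in the detail messages come from a
single status word returned by the controller.
The sketch below shows one way to split such a word and look up the
message family.
It rests on two assumptions that are not stated in this page: that
the code occupies the low five bits of the status word with the
subcode in the bits above it, and that the families listed with
`code %d' continue the numbering in order (8 through 11).
The names ST_MASK, ST_SUBCODE_SHIFT and decode_status are
hypothetical; consult vax/mscp.h for the real definitions.

```python
# Assumed layout of the status word (check vax/mscp.h before relying on it):
# low five bits = status code, remaining high bits = subcode.
ST_MASK = 0x1F
ST_SUBCODE_SHIFT = 5

# Message families in the order this page lists them.  Codes 0-7 are
# printed in the page; 8-11 are assumed to follow in sequence.
CODES = {
    0: "success",
    1: "invalid command",
    2: "command aborted",
    3: "unit offline",
    4: "unit available",
    5: "media format error",
    6: "write protected",
    7: "compare error",
    8: "data error",
    9: "host buffer access error",
    10: "controller error",
    11: "drive error",
}


def decode_status(status):
    """Split a status word into (code, subcode, family name)."""
    code = status & ST_MASK
    subcode = status >> ST_SUBCODE_SHIFT
    return code, subcode, CODES.get(code, "unknown")
```

Under these assumptions a status word of 3 decodes to `unit offline'
with subcode 0, and any code outside the table is reported as unknown
rather than guessed at.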
.El
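.Pp
The fields of the hard error diagnostic above are related by simple
arithmetic once the geometry (s,t,c) is known.
The following sketch illustrates that relationship; it is not taken
from the driver.
It assumes that the absolute block number is the partition start plus
the block number within the partition, and that blocks are laid out
sector by sector, track by track, cylinder by cylinder.
The function names are hypothetical, and the geometry used in the
note below (51 sectors per track, 14 tracks per cylinder) is an
illustrative value only.

```python
def absolute_bn(part_start, fsbn):
    """Block number relative to the start of the drive.

    Assumes fsbn is counted in sectors from the partition origin.
    """
    return part_start + fsbn


def locate(bn, sectors_per_track, tracks_per_cyl):
    """Split an absolute block number into (cylinder, track, sector)."""
    sectors_per_cyl = sectors_per_track * tracks_per_cyl
    cn, rest = divmod(bn, sectors_per_cyl)
    tn, sn = divmod(rest, sectors_per_track)
    return cn, tn, sn


def whole_disk_sectors(s, t, c):
    """Size of a partition covering the entire disk for geometry (s,t,c)."""
    return s * t * c
```

For instance, with 51 sectors per track and 14 tracks per cylinder,
block 768 falls on cylinder 1, track 1, sector 3.
The last helper gives the size of the temporary `a' partition that
the driver creates when it has no default table for a drive.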
.Sh SEE ALSO
.Xr disklabel 5 ,
.Xr disklabel 8
.Sh HISTORY
The
.Nm
driver appeared in
.Bx 4.2 .