Commit Graph

52 Commits

Author SHA1 Message Date
Archie Cobbs
2127f26023 Examine all occurrences of sprintf(), strcat(), and str[n]cpy()
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for other cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.

These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.

Reviewed by:	Bruce Evans <bde@zeta.org.au>
Reviewed by:	Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by:	Mike Spengler <mks@networkcs.com>
1998-12-04 22:54:57 +00:00
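
The kind of change described in the commit above can be sketched as follows. This is a minimal userland illustration, not code from the commit: the helper names format_devname() and copy_vendor() are hypothetical, and the buffer sizes are left to the caller through sizeof().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format "<name><unit>" into a caller-supplied
 * buffer.  Returns 0 if the result fit, -1 if it was truncated.
 */
static int
format_devname(char *buf, size_t buflen, const char *name, int unit)
{
	int n;

	assert(buf != NULL && buflen > 0);
	/* snprintf() never writes more than buflen bytes, NUL included. */
	n = snprintf(buf, buflen, "%s%d", name, unit);
	return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}

/*
 * Hypothetical helper: strncpy() does not NUL-terminate when the source
 * fills the destination, so terminate explicitly.
 */
static void
copy_vendor(char *dst, size_t dstlen, const char *src)
{
	assert(dst != NULL && dstlen > 0);
	strncpy(dst, src, dstlen - 1);
	dst[dstlen - 1] = '\0';
}
```

Callers would pass sizeof(buf) rather than a literal such as 16, so that the bound tracks the declaration if the buffer is ever resized.
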
Kenneth D. Merry
4f1d0ef261 "Fix" a problem with the Quantum Viking. It appears that this drive does
not like the 6-byte read and write commands!  It returns illegal request,
with the field pointer pointing to byte 9 of a 6 byte CDB.

In any case, the work around is to put in a quirk mechanism that makes sure
that we don't send 6-byte reads or writes to this device.  It's rather sad
that this is necessary.  You'd think that they would be able to get
something that basic to work right in their firmware...

Reviewed by:	gibbs
Reported by:	Adam McDougall <bsdx@spawnet.com>
1998-12-02 17:35:28 +00:00
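
A quirk of this kind might be consulted when choosing the CDB size for a read or write. The sketch below is an assumption-laden illustration: the flag name QUIRK_NO_6_BYTE and the function rw_cdb_len() are hypothetical, and only the 6-byte and 10-byte forms are considered. The 6-byte limits used (21-bit block address, 8-bit transfer length) come from the SCSI READ(6)/WRITE(6) command layout.

```c
#include <assert.h>
#include <stdint.h>

#define QUIRK_NO_6_BYTE	0x01	/* hypothetical quirk flag */

/*
 * Pick a CDB length for a read/write.  The 6-byte form is used only
 * when the device tolerates it and the request fits its fields.
 */
static int
rw_cdb_len(uint32_t lba, uint32_t block_count, unsigned quirks)
{
	/* The 10-byte form has a 16-bit transfer length. */
	assert(block_count > 0 && block_count <= 0xffff);
	if ((quirks & QUIRK_NO_6_BYTE) == 0 &&
	    lba <= 0x1fffff && block_count <= 0xff)
		return 6;
	return 10;
}
```
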
Joerg Wunsch
ae1b283631 ...nor does this old TDC3620 like to be asked for compression.
But well, now it's running again!
1998-11-26 10:47:52 +00:00
Joerg Wunsch
36230d67d0 This old firmware of the TDC3620 hangs the SCSI bus upon serial
number requests.  Don't ask it so.
1998-11-25 13:50:10 +00:00
Kenneth D. Merry
22b9c86cfd Fix a few problems that Bruce noticed about a month ago, and fix up one
other problem.

- Hold onto splsoftcam() in the peripheral driver open routines until we
  have locked the periph.  This eliminates a race condition.

- Disallow opening the pass driver when securelevel > 1.

- If a user tries to open the pass driver with O_NONBLOCK set, return
  EINVAL instead of ENODEV.  (noticed by gibbs)
1998-11-22 23:44:47 +00:00
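
The two open-time checks listed above could look roughly like this. It is a userland sketch only: pass_open_check() is a hypothetical name, the securelevel is passed in as a plain integer, and EPERM for the securelevel case is an assumption, since the commit message names only the EINVAL case.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical open-time policy check.  Returns 0 if the open may
 * proceed, otherwise an errno value.
 */
static int
pass_open_check(int securelevel, int oflags)
{
	if (securelevel > 1)
		return EPERM;	/* assumption: errno not stated in the commit */
	if (oflags & O_NONBLOCK)
		return EINVAL;	/* per the commit: EINVAL, not ENODEV */
	return 0;
}
```
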
Kenneth D. Merry
25e5ca272b Generalize the quirk entry that disables multi-lun probing for Sony CDROM
drives.  It seems that quite a few (possibly all?) of their drives respond
to inquiries on multiple luns.  Hopefully we can detect problems like this
in the probe phase at some point.  For now, this is a pretty functional
solution.
1998-11-04 19:56:24 +00:00
Kenneth D. Merry
ee9c90c75c Fix a problem with the way we handled device invalidation when attaching
to a device failed.

In theory, the same steps that happen when we get an AC_LOST_DEVICE async
notification should have been taken when a driver fails to attach.  In
practice, that wasn't the case.

This only affected the da, cd and ch drivers, but the fix affects all
peripheral drivers.

There were several possible problems:
 - In the da driver, we didn't remove the peripheral's softc from the da
   driver's linked list of softcs.  Once the peripheral and softc got
   removed, we'd get a kernel panic the next time the timeout routine
   called dasendorderedtag().
 - In the da, cd and possibly ch drivers, we didn't remove the
   peripheral's devstat structure from the devstat queue.  Once the
   peripheral and softc were removed, this could cause a panic if anyone
   tried to access device statistics.  (one component of the linked list
   wouldn't exist anymore)
 - In the cd driver, we didn't take the peripheral off the changer run
   queue if it was scheduled to run.  In practice, it's highly unlikely,
   and maybe impossible that the peripheral would have been on the
   changer run queue at that stage of the probe process.

The fix is:
 - Add a new peripheral callback function (the "oninvalidate" function)
   that is called the first time cam_periph_invalidate() is called for a
   peripheral.

 - Create new foooninvalidate() routines for each peripheral driver.  This
   routine is always called at splsoftcam(), and contains all the stuff
   that used to be in the AC_LOST_DEVICE case of the async callback
   handler.

 - Move the devstat cleanup call to the destructor/cleanup routines, since
   some of the drivers do I/O in their close routines.

 - Make sure that when we're flushing the buffer queue, we traverse it at
   splbio().

 - Add a check for the invalid flag in the pt driver's open routine.

Reviewed by:	gibbs
1998-10-22 22:16:56 +00:00
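
The "called the first time" behavior of the oninvalidate callback can be sketched as below. The structure layout, the flag PERIPH_INVALID and the function periph_invalidate() are simplified stand-ins for illustration, not the real CAM definitions, and the spl protection mentioned in the commit is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define PERIPH_INVALID	0x01	/* hypothetical flag value */

struct periph {
	unsigned flags;
	void (*oninvalidate)(struct periph *);	/* driver cleanup hook */
	int cleanup_runs;			/* demo counter only */
};

/* Stand-in for a driver's foooninvalidate() routine. */
static void
demo_oninvalidate(struct periph *p)
{
	p->cleanup_runs++;
}

/* Run the driver hook only on the first invalidation. */
static void
periph_invalidate(struct periph *p)
{
	assert(p != NULL);
	if ((p->flags & PERIPH_INVALID) == 0 && p->oninvalidate != NULL)
		p->oninvalidate(p);
	p->flags |= PERIPH_INVALID;
}
```

Because the hook is keyed off the invalid flag, both a lost-device notification and a failed attach can call the same routine without the cleanup running twice.
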
Justin T. Gibbs
bb1f2fe47f Add a mechanism to send a non-tagged transaction even if a device is
currently operating in a tagged mode.  The SIM driver should determine
if a device is in tag mode by looking at the CAM_TAG_ACTION_VALID flag
in the ccb header.  If the flag is set, the tag_action field is either
a SCSI II tag message (simple, ordered, head) or CAM_TAG_ACTION_NONE
to specify that no tagging should be performed.
1998-10-15 23:17:35 +00:00
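
A SIM driver's decision, as described above, might be sketched like this. The structure and the numeric values of CAM_TAG_ACTION_VALID and CAM_TAG_ACTION_NONE are illustrative assumptions rather than copies of the CAM headers; 0x20 is the SCSI-2 simple queue tag message code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values; consult the real CAM headers for actual ones. */
#define CAM_TAG_ACTION_VALID	0x01
#define CAM_TAG_ACTION_NONE	0x00
#define MSG_SIMPLE_Q_TAG	0x20	/* SCSI-2 simple queue tag message */

/* Simplified stand-in for the relevant ccb header fields. */
struct ccb_hdr_sketch {
	uint32_t flags;
	uint8_t tag_action;
};

/* Returns 1 if a tag message should accompany this transaction. */
static int
should_send_tag(const struct ccb_hdr_sketch *h)
{
	assert(h != NULL);
	if ((h->flags & CAM_TAG_ACTION_VALID) == 0)
		return 0;	/* device is not in tagged mode */
	return h->tag_action != CAM_TAG_ACTION_NONE;
}
```
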
Kenneth D. Merry
50642f180c Fix several potential buffer overrun conditions. These changes have been
tested both in the kernel and in userland.  Also, fix a couple of printf
warnings that show up when CAMDEBUG is defined.

Reviewed by:		imp
Partially submitted by:	imp
1998-10-15 19:08:58 +00:00
Kenneth D. Merry
11021a1ab5 Clean up some unused variables.
Reviewed by:	ken
Submitted by:	phk
1998-10-15 17:46:26 +00:00
Kenneth D. Merry
1b6833dbb1 Narrow the quirk entry for the Seagate Elite 9 a bit to just cover drives
with 71* firmware revisions.  Scott Mace <smace@intt.ORG> reports that
drives with 00* firmware revisions do tagged queueing just fine.
1998-10-14 22:51:51 +00:00
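
Matching a firmware revision against a pattern such as "71*" can be sketched with a simple trailing-wildcard comparison. This is a deliberately reduced illustration: quirk_match() is a hypothetical helper that handles only a trailing '*', and the matcher actually used by the quirk table may be more general.

```c
#include <assert.h>
#include <string.h>

/*
 * Match str against pattern.  A trailing '*' in pattern matches any
 * suffix; otherwise the strings must be equal.
 */
static int
quirk_match(const char *pattern, const char *str)
{
	size_t plen;

	assert(pattern != NULL && str != NULL);
	plen = strlen(pattern);
	if (plen > 0 && pattern[plen - 1] == '*')
		return strncmp(pattern, str, plen - 1) == 0;
	return strcmp(pattern, str) == 0;
}
```

With this helper, a revision string beginning with "71" matches "71*" while one beginning with "00" does not, which is the narrowing the commit describes.
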
Kenneth D. Merry
8597eafed0 Disable tagged queueing for the Seagate Elite 9GB drives. They tend to get
hung up when you send tags to them too quickly.  (CAM is able to recover
from the problem, but this just avoids it altogether.)

Reviewed by:	gibbs
Reported by:	Bret Ford <bford@uop.cs.uop.edu>
	and:	Martin Renters <martin@tdc.on.ca>
1998-10-14 21:17:39 +00:00
Kenneth D. Merry
718cd18c53 Disable cache syncs for a broken NEC drive.
Reviewed by:	gibbs
Submitted by:	Blaz Zupan <blaz@gold.amis.net>
1998-10-13 23:34:54 +00:00
Kenneth D. Merry
60a899a075 Fix a bug in the error recovery code. It was possible to have more than
one error recovery action outstanding for a given peripheral.

This is bad for several reasons.  The first problem is that the error
recovery actions would likely be to fix the same problem.  (e.g., we
queue 5 CCBs to a disk, and the first one comes back with 0x04,0x02.  We
start error recovery, and the second one comes back with the same status.
Then the third one comes back, and so on.  Each one causes the drive to get
nailed with a start unit, when we really only need one.)

The other problem is that we only have space to store one CCB while we're
doing error recovery.  The subsequent error recovery actions that got
started were over-writing the CCBs from previous error recovery actions,
but we still tried to call the done routine N times for N error recovery
actions.  Each call to dadone() was done with the same CCB, though.  So on
the second one, we got a "biodone: buffer not busy" panic, since the buffer
in question had already been through biodone().

In any case, this fixes things so that at any given time, there's only one
error recovery action outstanding for any given peripheral driver.

Reviewed by:	gibbs
Reported by:	Philippe Regnauld <regnauld@deepo.prosa.dk>
[ Philippe wins the "bug finder of the week" award ]
1998-10-13 21:41:32 +00:00
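
The "only one error recovery action outstanding" rule can be sketched with a per-peripheral flag. The names RECOVERY_INPROG, begin_recovery() and end_recovery() are hypothetical, and the counter merely stands in for issuing a start unit command.

```c
#include <assert.h>
#include <stddef.h>

#define RECOVERY_INPROG	0x01	/* hypothetical per-peripheral flag */

struct periph_state {
	unsigned flags;
	int start_units_sent;	/* demo counter for recovery commands */
};

/* Returns 1 if recovery was started, 0 if one is already outstanding. */
static int
begin_recovery(struct periph_state *p)
{
	assert(p != NULL);
	if (p->flags & RECOVERY_INPROG)
		return 0;
	p->flags |= RECOVERY_INPROG;
	p->start_units_sent++;
	return 1;
}

/* Called when the recovery action completes. */
static void
end_recovery(struct periph_state *p)
{
	assert(p != NULL);
	p->flags &= ~RECOVERY_INPROG;
}
```

If five CCBs come back with the same error, only the first call starts recovery, so the single saved-CCB slot is never overwritten.
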
Kenneth D. Merry
fce84cb42b Fix a bug in the scan lun code that showed up when we did the following
sequence of things:

- spin up a disk
  - send an async event to refresh the inquiry data
    - run through xpt_scan_lun() to re-probe the device
        - eventually finish the probe, but panic in xpt_done() because the
          periph pointer wasn't set.

Reviewed by:	gibbs
Reported by:	Philippe Regnauld <regnauld@deepo.prosa.dk>
1998-10-13 21:29:04 +00:00
David Greenman
6cde7a165f Fixed two potentially serious classes of bugs:
1) The vnode pager wasn't properly tracking the file size due to
   "size" being page rounded in some cases and not in others.
   This sometimes resulted in corrupted files. First noticed by
   Terry Lambert.
   Fixed by changing the "size" pager_alloc parameter to be a 64bit
   byte value (as opposed to a 32bit page index) and changing the
   pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that
   caused some 64bit offsets and sizes to be scrambled. Removing
   the cast required adding casts at a few dozen callers.
   There may be problems with other bogus casts in close-by
   macros. A quick check seemed to indicate that those were okay,
   however.
1998-10-13 08:24:45 +00:00
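
The second class of bug can be illustrated in isolation. The sketch below assumes a 4096-byte page and uses hypothetical function names; it is not the kernel's macro text. It shows how a 32-bit cast inside a page-truncation helper discards the upper half of a 64-bit offset.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH	UINT64_C(4096)	/* assumed page size */
#define PAGE_MASK_SKETCH	(PAGE_SIZE_SKETCH - 1)

/* Buggy form: the cast drops the upper 32 bits of the offset. */
static uint64_t
trunc_page_bad(uint64_t x)
{
	return (uint32_t)x & ~(uint32_t)PAGE_MASK_SKETCH;
}

/* Cast-free forms that keep all 64 bits. */
static uint64_t
trunc_page64(uint64_t x)
{
	return x & ~PAGE_MASK_SKETCH;
}

static uint64_t
round_page64(uint64_t x)
{
	assert(x <= UINT64_MAX - PAGE_MASK_SKETCH);	/* avoid wraparound */
	return (x + PAGE_MASK_SKETCH) & ~PAGE_MASK_SKETCH;
}
```

For the offset 0x100001234 the buggy form yields 0x1000, while the cast-free form yields 0x100001000.
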
Kenneth D. Merry
621a60d46b Add a "dummy light" (actually two dummy lights) to catch people who don't
have the passthrough device configured in their kernel.

This will hopefully reduce the number of people complaining that they can't
get {camcontrol, xmcd, tosha, cdrecord, etc.} to work.

Reviewed by:	gibbs
1998-10-12 21:54:13 +00:00
Kenneth D. Merry
458c85235c Add quirk entries to disable the synchronize cache command for Micropolis
2217's (reported by Matthew Jacob in NetBSD PR kern/6027) and Fujitsu
M2954's (reported by Tom Jackson).

Some of the Fujitsus at least hang when they get a cache sync command.
(Others just return illegal request.)

Also, make error printing in dashutdown() a little more selective.  Don't
print any error when the sense key is illegal request.  Drives that don't
support the synchronize cache command usually return illegal request.
Also, make sure the scsi status is check condition before going into
scsi_sense_print().

Reviewed by:	gibbs
1998-10-12 17:16:47 +00:00
Kenneth D. Merry
d5ef4c961a Bring over a quirk entry from the old SCSI code for a Chinon CDROM drive
that returns track numbers in BCD.

Reviewed by:	gibbs
1998-10-12 17:02:37 +00:00
Justin T. Gibbs
2863f7b147 If the bus delay is >= 2 seconds, notify the user that we are waiting
for devices to settle.  This will hopefully allay any 'first installation'
fears that the machine has hung.
1998-10-10 21:10:36 +00:00
Kenneth D. Merry
8e35ba93ae Add the quirk entry framework to handle disabling the synchronize cache
command on drives that don't like it.  Right now, there's just a bogus
quirk entry in the table that doesn't do anything, but that should be
changed once we get actual inquiry data for drives that don't like the
synchronize cache command.

Also, add a shutdown hook that runs through all direct access peripherals
and runs a synchronize cache on them if they're still open, and if
synchronize cache isn't disabled via a quirk entry.

Add a synchronize cache call at the end of dadump() (again, conditionalized
on the quirk entry), so we can ensure that the disk cache contents get
flushed to physical media after a dump.

Check the new quirk entry in daclose() to decide whether or not to
synchronize the cache for a disk at final close.

Reviewed by:	gibbs
1998-10-08 05:46:38 +00:00
Justin T. Gibbs
2ba01e6ff4 Add a quirk entry for the CFP2107, another drive with broken
tagged queuing support.

Ensure that we report that a device supports tagged queuing even if
the system is waiting a "command count delay" before starting to use
them.

If a user disables disconnects on a device ensure that tagged queuing
is also turned off.  We did the right thing during initial configuration,
but could be confused by manual changes.
1998-10-07 03:25:21 +00:00
Warner Losh
6a3722a752 Up the read capacity timeout from 20 seconds to 60 seconds to keep my
JAZ drive happy.  This shouldn't impact fast drives, and will keep cam
from failing on very slow ones (that are spinning up, say).  20
seconds was almost long enough, but not in all cases.

Suggested by: gibbs
1998-10-07 03:09:19 +00:00
Kenneth D. Merry
777cfc2acd Some fixes for the CD and DA drivers from bde. (and some of my own as
well)  Among them:

[ cd driver ]
1. Old labeling code was still there.
2. Error handling for dsopen() was broken (no test for the `error'
   returned by dsopen(); bogus test of an `error' that is known to be 0).
3. cdopen() closed the physical device after certain errors although there
   may still be open partitions on it.
4. cdclose() closed the physical device although there may still be open
   partitions on it.
5. Some printf format fixes were incomplete or missing.
6. cdioctl() truncated unit numbers mod 256.
7. cdioctl() was missing locking.

[ da driver ]
1. daclose() closed the physical device although there may still be open
   partitions on it.  This was fixed many years ago in sd.c rev.1.57.
2. A minor optimization (the dk_slices != NULL test) in sdopen() became
   uglier in daopen().  It is not worth doing.  da only regressed compared
   with od and my version of sd, since I never committed the change to sd.
   daopen() should probably do less if some partition is already open.
   This is not addressed by the diffs.
[ ... ]
5. "opt_hw_wdog.h" was not included, so the HW_WDOG code was unreachable.

- Added a getdev CCB call in the cdopen() and daopen() calls so that the
  vendor name and device name are available for the disklabel.  (suggested
  by bde)

- Removed vestigial devfs support in both drivers, since we can't properly
  work with devfs yet.  (ask bde for details on this)

- Cleaned up the probe code in both drivers in the failure cases.  There
  were a number of things wrong here.  The peripheral driver instances
  weren't getting properly cleaned up.  Sometimes the wrong probe message
  would get printed out (with the failure message appended), so it wasn't
  very clear that we failed to attach.  SCSI sense information was printed,
  even when the error in question wasn't a SCSI error.  I put similar fixes
  into the changer driver in revision 1.2 of scsi_ch.c.

Reviewed by:	gibbs
Submitted by:	bde (partially)
1998-10-07 02:57:57 +00:00
Kenneth D. Merry
b1d26455e0 Disable multi-lun probing and serial number inquiries for the Exabyte 8200.
1998-10-06 19:27:19 +00:00
Kenneth D. Merry
6326180fab Fix a printf format warning that shows up when CAMDEBUG is defined.
1998-10-02 21:20:21 +00:00
Kenneth D. Merry
d05caa00c5 Add a new CAM debugging mode, CAM_DEBUG_CDB. This causes the kernel to
print out a one line description/dump of every SCSI CDB sent to a
particular debugging target or targets.

This is a good bit more useful than the other debugging modes, I think.

Change some things in LINT to note the availability of this new option.

Fix an erroneous argument to scsi_cdb_string() in scsi_all.c

Reviewed by:	gibbs
1998-10-02 21:00:58 +00:00
Kenneth D. Merry
696db22238 Modify the changer driver so it can handle (hopefully!) changers that need
block descriptors enabled on mode sense commands.

Basically, we try sending a mode sense with block descriptors disabled (the
previous default), and if it fails, we try sending the mode sense with
block descriptors enabled.  If that works, we note that in a runtime quirk
entry, so we don't bother disabling block descriptors again for the device.

This problem was first reported by Chris Jones <cjones@honors.montana.edu>
on one of the NetBSD lists, but I'd imagine that some FreeBSD users would
have run into it eventually as well, since our changer driver is derived
from the NetBSD changer driver.

Also, change some of the probe logic so that we do the right thing in the
case of a failure to attach.

Fix a memory leak in chgetparams().

Add a couple of inline helper functions to scsi_all.h to correctly return
the start of a mode page.

NetBSD PR:	kern/6214
Reviewed by:	gibbs
1998-10-02 05:25:49 +00:00
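
The try-then-fall-back logic described above, together with the runtime quirk that remembers the outcome, might be sketched as follows. Everything here is hypothetical scaffolding: the flag CH_Q_NO_DBD, the struct layout, the mode_sense hook and the stand-in device are for illustration, not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define CH_Q_NO_DBD	0x01	/* hypothetical runtime quirk flag */

struct changer {
	unsigned quirks;
	/* Hypothetical hook: returns 0 on success.  dbd != 0 means
	 * "disable block descriptors" in the mode sense request. */
	int (*mode_sense)(struct changer *, int dbd);
	int dbd_attempts;	/* demo counter only */
};

/* Stand-in for a device that rejects mode sense with DBD set. */
static int
picky_mode_sense(struct changer *ch, int dbd)
{
	if (dbd) {
		ch->dbd_attempts++;
		return -1;
	}
	return 0;
}

static int
ch_get_params(struct changer *ch)
{
	int error;

	assert(ch != NULL && ch->mode_sense != NULL);
	if ((ch->quirks & CH_Q_NO_DBD) == 0) {
		/* Previous default: block descriptors disabled. */
		error = ch->mode_sense(ch, 1);
		if (error == 0)
			return 0;
	}
	/* Fall back to block descriptors enabled and remember it. */
	error = ch->mode_sense(ch, 0);
	if (error == 0)
		ch->quirks |= CH_Q_NO_DBD;
	return error;
}
```

After the first successful fallback the quirk is set, so later calls skip the request that is known to fail.
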
Kenneth D. Merry
9dfb44710e Patches from DES to create three new kernel config options to control
timeouts in the SA driver (timeouts for space, rewind and erase).  Folks
can lengthen the timeouts if their hardware is especially slow, or shorten
them if they want to be notified of errors a little sooner.

Also, get rid of two OD driver options.  The od driver has been made
obsolete by the da driver.

Reviewed by:	ken, gibbs
Submitted by:	Dag-Erling Coidan Smørgrav <des@FreeBSD.ORG>
1998-10-02 05:15:51 +00:00
Kenneth D. Merry
e5b118dd3d In the bootverbose case, print out error messages for all errors that will
not be retried again, even if the SF_NO_PRINT flag is set.

Reviewed by:	gibbs
1998-09-29 22:11:30 +00:00
Bruce Evans
2e8bf20912 Fixed printf format errors. u_long is not necessarily suitable for casting
pointers to, and %d is not suitable for printing uint32_t's.
1998-09-29 09:18:08 +00:00
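
In present-day C the two format problems named above are usually avoided as sketched below. This is a modern userland illustration, assuming a C99 <inttypes.h>, which the 1998 tree did not have; the helper names are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Print a uint32_t with the matching format macro instead of %d. */
static int
format_count(char *buf, size_t buflen, uint32_t count)
{
	assert(buf != NULL && buflen > 0);
	return snprintf(buf, buflen, "count=%" PRIu32, count);
}

/* Print a pointer with %p rather than casting it to an integer type. */
static int
format_ptr(char *buf, size_t buflen, void *p)
{
	assert(buf != NULL && buflen > 0);
	return snprintf(buf, buflen, "ptr=%p", p);
}
```
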
Justin T. Gibbs
03e3511b47 Correct problems with xpt_set_transfer_settings and async transfer
negotiation changes with wildcarded paths.
1998-09-25 22:35:56 +00:00
Justin T. Gibbs
f0adc79010 Fix a few problems with the tag delay code:
- Tagged devices were limited to one transaction (oops)
 - We revert to untagged with a tag delay if the user changed the
   transfer negotiation values (via perhaps camcontrol some day).
 - xpt_async did not use the expanded path in some cases which could
   cause a panic.
1998-09-24 22:43:54 +00:00
Kenneth D. Merry
54cbee5db2 Treat not ready errors (asc 0x04) as non-fatal errors for attach. We
already allowed medium not present type errors (0x3a), but some Philips and
HP WORM drives return 0x04,0x00 when you issue a read capacity without
media in the drive.
1998-09-23 03:17:08 +00:00
Justin T. Gibbs
fd21cc5ee0 Allow 5 untagged commands to go to a device before enabling tags after
enabling transfer negotiations, a BDR, or a bus reset to allow the controller
driver to negotiate without tagged messages getting in the way.  Some
devices are confused by attempts to negotiate and tag at the same time.
Some controllers (e.g. BT MultiMaster with certain firmware revs) will
never negotiate if you don't give them an untagged "window" to perform
negotiation in.

Bump the maximum tag count to 255.  The system reclaims unused tag space
as the tag count is dropped anyway, so we might as well try the max.

We should probably use a larger type than u_int8_t to hold our tag value
as SCSI over certain mediums allows for higher values.

Reviewed by:	 Kenneth Merry <ken@FreeBSD.org>
1998-09-23 03:03:19 +00:00
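
The untagged "window" can be sketched as a countdown that is restarted after a negotiation change, BDR or bus reset. The count of 5 comes from the commit message; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TAG_DELAY_COUNT	5	/* untagged commands, per the commit text */

struct dev_tags {
	int tags_supported;
	int tag_delay;		/* untagged commands still to send */
};

/* Call after a negotiation change, BDR, or bus reset. */
static void
tag_delay_restart(struct dev_tags *d)
{
	assert(d != NULL);
	d->tag_delay = TAG_DELAY_COUNT;
}

/* Call once per command; returns 1 if it may be sent tagged. */
static int
next_cmd_tagged(struct dev_tags *d)
{
	assert(d != NULL);
	if (!d->tags_supported)
		return 0;
	if (d->tag_delay > 0) {
		d->tag_delay--;
		return 0;
	}
	return 1;
}
```
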
Kenneth D. Merry
f24c39c7d5 Add several quirks:
Western Digital Enterprise drives have sorry performance (1.5MB/sec versus
8MB/sec) when doing tagged queueing.  Disable tagged queueing for them.

Submitted by:	Andrew Gallatin <gallatin@cs.duke.edu>

Some Sony CDROM drives don't like it when we probe more than one LUN.

Verified by:    Jean-Marc Zucconi <jmz@FreeBSD.ORG>

Some Sony CD-R's don't like multi-LUN probing either.

Submitted by:   Parag Patel <parag@cgt.com>
1998-09-22 20:41:12 +00:00
Justin T. Gibbs
53b062d3cc cam.c:
	Clear up trailing NULs in cam_strvis.

cam_xpt.c:
	Nuke an experimental quirk entry for the Toshiba 3401.  The real
	problem with this device turned out to be a bug in the aic7xxx
	driver that was fixed months ago.

	Add a quirk entry to inhibit multiple lun scanning and serial number
	probing of DPT RAID volumes.  My DPT controller hangs up solid when
	I do either of these things to a RAID 1 volume.
1998-09-22 04:53:23 +00:00
Kenneth D. Merry
050279e55a Some fixes to the CD driver that may fix PR kern/7996. The data direction
flags on some of the operations in the driver weren't quite right.  Also,
clean up scsi_cd.h, change u_char to u_int8_t.

I'm surprised this problem didn't show up sooner.  (the code has been in
there almost a year and a half)

PR:		        7996
Reviewed by:	        ken
Submitted (mostly) by:	gibbs
1998-09-20 22:48:15 +00:00
Justin T. Gibbs
0c4930f525 Don't invalidate devices due to unexpected unit attention errors. In
a perfect world, we'd notice the UA and do some device validation to ensure
that the device hasn't changed.  We may get this before the year ends,
but not before 3.0R.  This change gives the administrator ample ammunition
to take off a foot or two, but hey this *is* UN*X.
1998-09-20 07:17:11 +00:00
Justin T. Gibbs
e471e974cc cam_xpt.c:
	Add quirk entry for a Samsung drive that doesn't like experiencing
	the queue full condition.

	Bump the timeouts for all probe activities to 60s.  We don't know
	what the selection timeout (or equivalent on other mediums) is
	for controllers, which can make the transactions at the tail
	end of a parallel probe take a while to complete.  The DPT
	seems to be a card that takes a long time to see a selection timeout.

cam_periph.c:
	Don't call a device "gone" after a single selection timeout.  We
	need to come up with a better policy.  Until that time, you'll
	have to manually re-scan a bus via camcontrol for the system to
	decide that a device is really gone.  This should give devices
	experiencing temporary insanity a chance to escape death.
1998-09-20 07:14:36 +00:00
Justin T. Gibbs
ecdf111388 Only deregister our configuration hook manually if there are no SCSI
busses to configure.
1998-09-20 05:03:34 +00:00
Justin T. Gibbs
9eec0f7d5a Don't leave the device queue in a frozen state if the Synchronize Cache
command on close fails.
1998-09-19 04:59:35 +00:00
Kenneth D. Merry
66411419a6 Fix error recovery in scsi_interpret_sense(). It turns out that ERESTART
wasn't getting sent back for most errors, even if there were retries left
on the command.  I'm not sure how I ever let this slip by before...

In any case, we now send back ERESTART if there are retries left for the
command, and send back the default error code when there are no retries
left.

Reviewed by:	gibbs
1998-09-19 01:23:04 +00:00
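
The corrected decision can be sketched as below. SKETCH_ERESTART is a local stand-in, since the kernel's ERESTART is not portably available in userland <errno.h>, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_ERESTART	(-1)	/* stand-in for the kernel's ERESTART */

struct cmd {
	int retry_count;	/* retries remaining for this command */
};

/*
 * Return SKETCH_ERESTART while retries remain, consuming one each time;
 * otherwise return the default error code for the sense condition.
 */
static int
sense_error_action(struct cmd *c, int default_error)
{
	assert(c != NULL);
	if (c->retry_count > 0) {
		c->retry_count--;
		return SKETCH_ERESTART;
	}
	return default_error;
}
```
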
Kenneth D. Merry
37b9efd37b Fix the CAM code so that people can compile kernels with the CD driver but
without the DA driver.

The problem was that the CD driver depended on scsi_read_write() and
scsi_start_stop(), which were defined in scsi_da.c.

I moved both functions, and their associated data structures and defines
from scsi_da.* to scsi_all.*.  This is technically the "wrong" thing to do
since those commands are really only for direct-access type devices, not
for all SCSI devices.  I think, though, that the advantage (allowing people
to compile kernels without the disk driver) outweighs any architectural
purity arguments.

PR:		kern/7969
Reviewed by:	gibbs
1998-09-18 22:33:59 +00:00
Kenneth D. Merry
6747168dc9 Change the Atlas II quirk entries so they work with differential Atlas
II's.  Also, add a quirk entry for the 2 gig Atlas II.

Partially Submitted by:	Ted Buswell <tbuswell@mediaone.net>
1998-09-18 19:55:34 +00:00
Kenneth D. Merry
a822eb2a8b Fix a formatting error.
Fix a problem reported by bde:  setting SCSI_DELAY to 0 doesn't work.  Now,
when the user sets SCSI_DELAY to 0, we re-set it to the minimum allowable
bus settle delay (100ms).

Fix a potential panic in xptfinishconfigfunc() if the CCB passed in is
NULL.  Reported by, I think, Nicolas Souchu.  Fix a memory leak in the same
function (we created a path, but didn't free it) by allocating the getdev
CCB and path on the stack.

Reviewed by:	gibbs
1998-09-17 23:58:53 +00:00
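
The SCSI_DELAY handling described above amounts to substituting the minimum when the configured value is zero. A minimal sketch, with a hypothetical function name and the 100ms minimum taken from the commit message:

```c
#include <assert.h>

#define SCSI_MIN_DELAY_MS	100	/* minimum settle delay, per the commit */

/* Map a configured delay of 0 to the minimum allowable settle delay. */
static int
effective_bus_delay_ms(int configured_ms)
{
	assert(configured_ms >= 0);
	return configured_ms == 0 ? SCSI_MIN_DELAY_MS : configured_ms;
}
```
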
Kenneth D. Merry
c5e6ca4410 Some Alpha patches for CAM from Doug Rabson.
Reviewed by:	gibbs
Submitted by:	dfr
1998-09-16 23:30:11 +00:00
Justin T. Gibbs
9bc66b835c Properly allocate our, per lun, probe peripheral softc from
the TEMP malloc pool.

Noticed by: Don Lewis <Don.Lewis@tsc.tdk.com>
1998-09-16 13:24:37 +00:00
Kenneth D. Merry
66a0780e8e Check to make sure that this device is opened read-write, not just read
only.  Previously, if the device was chmoded 644, someone could open it
with the O_RDONLY flag and issue any ioctl to the device.

Reviewed by:	imp, gibbs
1998-09-16 00:11:53 +00:00
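
The read-write requirement can be sketched with userland open(2) flags standing in for the kernel's open-mode flags. The function name is hypothetical, and returning EPERM is an assumption, since the commit message does not name the errno.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Allow the open only if the caller asked for both read and write
 * access.  Returns 0 on success, otherwise an errno value.
 */
static int
require_read_write(int oflags)
{
	if ((oflags & O_ACCMODE) != O_RDWR)
		return EPERM;	/* assumption: errno not stated in the commit */
	return 0;
}
```

With this check, a device node with mode 644 no longer lets a user without write permission reach the ioctl path through a read-only open.
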
Justin T. Gibbs
5ee58402df Correct printf format bugs.
1998-09-15 22:05:44 +00:00