Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.
Old 06-22-2008, 04:04 AM
Joshua Baker-LePain
 
Default 3ware 9650 issues

I've been having no end of issues with a 3ware 9650SE-24M8 in a server that's
coming on a year old. I've got 24 WDC WD5001ABYS drives (500GB) hooked to it,
running as a single RAID6 w/ a hot spare. These issues boil down to the card
periodically throwing errors like the following:


sd 1:0:0:0: WARNING: (0x06:0x002C): Command (0x8a) timed out, resetting card.

Usually when this happens, it's followed by:

3w-9xxx: scsi1: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=0.


On the less pleasant occasions, it's followed by:

scsi1: ERROR: (0x06:0x0036): Response queue (large) empty failed during reset sequence.
3w-9xxx: scsi1: ERROR: (0x06:0x002B): Controller reset failed during scsi host reset.

sd 1:0:0:0: scsi: Device offlined - not ready after error recovery

This of course leads to a several hour downtime as the system has to be powered
down (not just rebooted) and then the volume needs to be fscked. I've been back
and forth with both the vendor and (via the vendor) 3ware with this. The card
has been replaced, as well as the whole system. I'm running the latest
firmware and drivers from 3ware.
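
A minimal sketch (not from the thread) of how the coded messages quoted above could be tallied, for example to see how often a timeout is followed by a failed reset. It assumes each message sits on a single line in the log; the names `CODE_RE`, `summarize`, and `SAMPLE` are made up for this illustration, and the sample lines are simply the messages from this post.

```python
import re
from collections import Counter

# Severity word, then the "(0xNN:0xNNNN)" pair, then the message text.
# The colon after the severity is optional because the INFO line omits it.
CODE_RE = re.compile(
    r"\b(WARNING|ERROR|INFO):?\s*"
    r"\((0x[0-9A-Fa-f]{2}:0x[0-9A-Fa-f]{4})\):\s*(.*)"
)


def summarize(log_lines):
    """Count coded driver messages, keyed by (severity, 'code pair')."""
    counts = Counter()
    for line in log_lines:
        match = CODE_RE.search(line)
        if match is None:
            continue  # lines without a code pair are ignored
        severity, pair, _message = match.groups()
        counts[(severity, pair)] += 1
    return counts


# The messages quoted in this post, one per line.
SAMPLE = [
    "sd 1:0:0:0: WARNING: (0x06:0x002C): Command (0x8a) timed out, resetting card.",
    "3w-9xxx: scsi1: AEN: INFO (0x04:0x005E): Cache synchronization completed:unit=0.",
    "scsi1: ERROR: (0x06:0x0036): Response queue (large) empty failed during reset sequence.",
    "3w-9xxx: scsi1: ERROR: (0x06:0x002B): Controller reset failed during scsi host reset.",
    "sd 1:0:0:0: scsi: Device offlined - not ready after error recovery",
]

if __name__ == "__main__":
    for (severity, pair), n in sorted(summarize(SAMPLE).items()):
        print(f"{severity:8} {pair}  x{n}")
```

The sketch only counts what is literally in the log text; it does not try to interpret what the individual codes mean.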


Have other folks had good luck with this card? What sorts of configs are you
running? I'm in the position of needing more storage, and I'm a bit gun shy on
3ware at the moment...


--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
Old 06-22-2008, 04:12 AM
John R Pierce
 
Default 3ware 9650 issues

Joshua Baker-LePain wrote:
> I've been having no end of issues with a 3ware 9650SE-24M8 in a server
> that's coming on a year old. I've got 24 WDC WD5001ABYS drives
> (500GB) hooked to it, running as a single RAID6 w/ a hot spare. These
> issues boil down to the card periodically throwing errors like the
> following:
>
> ....
> Have other folks had good luck with this card? What sorts of configs
> are you running? I'm in the position of needing more storage, and I'm
> a bit gun shy on 3ware at the moment...

I have no experience with that raid card; most of our larger systems use
external SAN storage. But I will say that, IMHO, that is a very large
raid-6. We usually don't make single raid sets much larger than 7-8
drives, and for a very large storage system we will stripe multiple
raid5/6 sets rather than have one huge one.

 
Old 06-22-2008, 04:44 AM
Joshua Baker-LePain
 
Default 3ware 9650 issues

On Sat, 21 Jun 2008 at 9:12pm, John R Pierce wrote


> Joshua Baker-LePain wrote:
>> I've been having no end of issues with a 3ware 9650SE-24M8 in a server
>> that's coming on a year old. I've got 24 WDC WD5001ABYS drives (500GB)
>> hooked to it, running as a single RAID6 w/ a hot spare. These issues boil
>> down to the card periodically throwing errors like the following:
>>
>> ....
>> Have other folks had good luck with this card? What sorts of configs are
>> you running? I'm in the position of needing more storage, and I'm a bit
>> gun shy on 3ware at the moment...
>
> I have no experience with that raid card; most of our larger systems use
> external SAN storage. But I will say that, IMHO, that is a very large
> raid-6. We usually don't make single raid sets much larger than 7-8 drives,
> and for a very large storage system we will stripe multiple raid5/6 sets
> rather than have one huge one.


Would that I had such luxuries. This is a university lab with needs for
massive amounts of data and not much money with which to do it.


--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
 
Old 06-22-2008, 05:01 AM
Ruslan Sivak
 
Default 3ware 9650 issues

Joshua Baker-LePain wrote:

> On Sat, 21 Jun 2008 at 9:12pm, John R Pierce wrote
>
>> I have no experience with that raid card; most of our larger systems
>> use external SAN storage. But I will say that, IMHO, that is a very large
>> raid-6. We usually don't make single raid sets much larger than 7-8
>> drives, and for a very large storage system we will stripe multiple
>> raid5/6 sets rather than have one huge one.
>
> Would that I had such luxuries. This is a university lab with needs
> for massive amounts of data and not much money with which to do it.


Wouldn't striping a bunch of raid6 volumes give you about the same
amount of space?


Russ
 
Old 06-22-2008, 05:09 AM
Joshua Baker-LePain
 
Default 3ware 9650 issues

On Sun, 22 Jun 2008 at 1:01am, Ruslan Sivak wrote


> Joshua Baker-LePain wrote:
>
>> On Sat, 21 Jun 2008 at 9:12pm, John R Pierce wrote
>>
>>> I have no experience with that raid card; most of our larger systems use
>>> external SAN storage. But I will say that, IMHO, that is a very large raid-6.
>>> We usually don't make single raid sets much larger than 7-8 drives, and for
>>> a very large storage system we will stripe multiple raid5/6 sets rather than
>>> have one huge one.
>>
>> Would that I had such luxuries. This is a university lab with needs for
>> massive amounts of data and not much money with which to do it.
>
> Wouldn't striping a bunch of raid6 volumes give you about the same amount of
> space?


No. We have 24 drives. Use one for a hot spare -> leaves 23.

1 array:  23 drives - 2 for parity -> capacity = 21 * drive capacity
2 arrays: array1 = 12 drives - 2 for parity -> 10 drives
          array2 = 11 drives - 2 for parity ->  9 drives
          -> capacity = 19 * drive capacity
3 arrays: array1 = 8 drives - 2 for parity -> 6 drives
          array2 = 8 drives - 2 for parity -> 6 drives
          array3 = 7 drives - 2 for parity -> 5 drives
          -> capacity = 17 * drive capacity

With 1TB drives, you're losing 2TB worth of volume space for each
additional array. That's a lot of space.


Unless I misunderstood you...
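
The arithmetic above can be reproduced with a short Python sketch (not from the thread). It assumes the non-spare drives are split as evenly as possible across the arrays and that every RAID6 set gives up exactly two drives to parity; the function name `raid6_usable_drives` is made up for this illustration.

```python
PARITY_DRIVES_PER_RAID6 = 2


def raid6_usable_drives(total_drives, hot_spares, num_arrays):
    """Return the number of data drives left after hot spares and RAID6 parity.

    The non-spare drives are divided as evenly as possible into
    num_arrays RAID6 sets, and each set loses two drives to parity.
    """
    available = total_drives - hot_spares
    base, extra = divmod(available, num_arrays)
    # The first `extra` arrays get one more drive than the rest.
    sizes = [base + 1 if i < extra else base for i in range(num_arrays)]
    if min(sizes) <= PARITY_DRIVES_PER_RAID6:
        raise ValueError("each RAID6 set needs more drives than its parity count")
    return sum(size - PARITY_DRIVES_PER_RAID6 for size in sizes)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "array(s):", raid6_usable_drives(24, 1, n), "* drive capacity")
    # 1 array(s): 21 * drive capacity
    # 2 array(s): 19 * drive capacity
    # 3 array(s): 17 * drive capacity
```

Each additional array costs two more drives' worth of space, which matches the 21 / 19 / 17 figures worked out by hand.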

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
 
Old 06-22-2008, 05:38 AM
"nate"
 
Default 3ware 9650 issues

Joshua Baker-LePain wrote:

> periodically throwing errors like the following:
>
> sd 1:0:0:0: WARNING: (0x06:0x002C): Command (0x8a) timed out, resetting
> card.

Wondering if you have scheduled automatic media scans of all of the
disks in the array? Perhaps you have a disk that is going bad
causing the issue.

Something else that could be related: I was told by someone who
had an Isilon storage system (a fancy NAS box) that his WD disk
drives would hang on him on occasion. When this occurred, he had
to physically remove the disk from the system and plug it back in.
It was a firmware issue. I don't recall which WD drives he had,
but he eventually got fixed firmware. This was about a year
ago.

I have media scans run once a week for about 7 hours on my 2 disk
3Ware systems (8006-2 controllers). For a 24 disk system you'll
probably need to run it longer. (unless the newer controllers scan
in parallel, the 8000 series seems to be serial).

I ran a couple 9650 series cards not too long ago, I think they
were just two disk systems running RAID 1 (up to 8 disks, but only
used 2). I've been using 3ware cards for about 8 years now and
have not run into those types of errors you describe. Probably
ran about 350 cards over the years, most of them in the 8000
series.

nate

 
Old 06-22-2008, 05:50 AM
Jeff
 
Default 3ware 9650 issues

On Sat, Jun 21, 2008 at 11:04 PM, Joshua Baker-LePain <jlb17@duke.edu> wrote:
> I've been having no end of issues with a 3ware 9650SE-24M8 in a server
> that's coming on a year old. I've got 24 WDC WD5001ABYS drives (500GB)
> hooked to it, running as a single RAID6 w/ a hot spare. These issues boil
> down to the card periodically throwing errors like the following:
>
> sd 1:0:0:0: WARNING: (0x06:0x002C): Command (0x8a) timed out, resetting
> card.
>
> Usually when this happens, it's followed by:
>
> 3w-9xxx: scsi1: AEN: INFO (0x04:0x005E): Cache synchronization
> completed:unit=0.
>
> On the less pleasant occasions, it's followed by:
>
> scsi1: ERROR: (0x06:0x0036): Response queue (large) empty failed during
> reset sequence.
> 3w-9xxx: scsi1: ERROR: (0x06:0x002B): Controller reset failed during scsi
> host reset.
> sd 1:0:0:0: scsi: Device offlined - not ready after error recovery
>
> This of course leads to a several hour downtime as the system has to be
> powered down (not just rebooted) and then the volume needs to be fscked.
> I've been back and forth with both the vendor and (via the vendor) 3ware
> with this. The card has been replaced, as well as the whole system. I'm
> running the latest firmware and drivers from 3ware.
>
> Have other folks had good luck with this card? What sorts of configs are
> you running? I'm in the position of needing more storage, and I'm a bit gun
> shy on 3ware at the moment...

This may be completely irrelevant, but we have a 9550 card running
RAID 5 with a 'prominent non-Linux' operating system that suffers from
the same symptoms (and 4 others that have never done it). We've heard
from our vendor (and 3ware) that there are some upcoming firmware
releases (looks like August) that might help. A 3ware tech told me
that the controller reset happens when communication between the
driver and the firmware times out, which appears to be exactly what is
in your error message.

Meanwhile, we just cross our fingers and thank our lucky stars that the
server in question is in our local office and not one of our
non-tech-staffed remote offices. There are unsupported pre-release
firmware downloads available if you like to gamble. I have not had the
courage to install the beta firmware on our servers. I have not used
3ware with CentOS, but I don't think this is a CentOS issue.

--
Jeff
 
Old 06-22-2008, 06:59 AM
"Joseph L. Casale"
 
Default 3ware 9650 issues

>Have other folks had good luck with this card? What sorts of configs are you
>running? I'm in the position of needing more storage, and I'm a bit gun shy on
>3ware at the moment...

Does that drive have a jumper to slow it down to a 1.5Gb/s transfer rate?
Cheap controllers and drives just can't do it. I had no end of issues,
even with *all* my LSI controllers, until I jumpered all my SATA drives
down.

As far as performance goes, it had no impact on my systems.

jlc
 
Old 06-22-2008, 05:23 PM
Scott Silva
 
Default 3ware 9650 issues

on 6-21-2008 9:04 PM Joshua Baker-LePain spake the following:
> I've been having no end of issues with a 3ware 9650SE-24M8 in a server
> that's coming on a year old. I've got 24 WDC WD5001ABYS drives (500GB)
> hooked to it, running as a single RAID6 w/ a hot spare. These issues
> boil down to the card periodically throwing errors like the following:
>
> sd 1:0:0:0: WARNING: (0x06:0x002C): Command (0x8a) timed out, resetting
> card.
>
> Usually when this happens, it's followed by:
>
> 3w-9xxx: scsi1: AEN: INFO (0x04:0x005E): Cache synchronization
> completed:unit=0.
>
> On the less pleasant occasions, it's followed by:
>
> scsi1: ERROR: (0x06:0x0036): Response queue (large) empty failed during
> reset sequence.
> 3w-9xxx: scsi1: ERROR: (0x06:0x002B): Controller reset failed during
> scsi host reset.
>
> sd 1:0:0:0: scsi: Device offlined - not ready after error recovery
>
> This of course leads to a several hour downtime as the system has to be
> powered down (not just rebooted) and then the volume needs to be fscked.
> I've been back and forth with both the vendor and (via the vendor) 3ware
> with this. The card has been replaced, as well as the whole system.
> I'm running the latest firmware and drivers from 3ware.
>
> Have other folks had good luck with this card? What sorts of configs
> are you running? I'm in the position of needing more storage, and I'm a
> bit gun shy on 3ware at the moment...



That looks like either drive, cabling, or power problems.

--
MailScanner is like deodorant...
You hope everybody uses it, and
you notice quickly if they don't!!!!

 
Old 06-22-2008, 05:37 PM
Peter Arremann
 
Default 3ware 9650 issues

On Sunday 22 June 2008 12:04:47 am Joshua Baker-LePain wrote:
> I've been having no end of issues with a 3ware 9650SE-24M8 in a server
> that's coming on a year old. I've got 24 WDC WD5001ABYS drives (500GB)
> hooked to it, running as a single RAID6 w/ a hot spare.
What size power supply do you have in your server?

Peter.
 

All times are GMT.