Old 07-28-2008, 12:41 PM
"Lucas Mocellin"
 
trouble HP SmartArray 6400

Hi,

I'm having some trouble with the HP SmartArray 6400 controller.

I had a failed drive, so I replaced it, and now I'm getting this error:

sp02:~# hpacucli
=> ctrl slot=4 pd all show

Smart Array 6400 in Slot 4

   array A

      physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 72.8 GB, OK)
      physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 72.8 GB, OK)
      physicaldrive 2:2 (port 2:id 2 , Parallel SCSI, 146.8 GB, OK)
      physicaldrive 2:3 (port 2:id 3 , Parallel SCSI, 146.8 GB, OK)
      physicaldrive 2:4 (port 2:id 4 , Parallel SCSI, 146.8 GB, Predictive Failure)
      physicaldrive 2:5 (port 2:id 5 , Parallel SCSI, 146.8 GB, OK)


Drive 2:4 shows "Predictive Failure", but I don't know what that means.

I searched Google but found no answers.

Can somebody help me?

Thanks in advance,

Lucas.
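
For anyone who wants to watch for this status automatically, the "pd all show" output above can be scanned with a short script. The following is only a minimal sketch in Python: the line format is assumed from the output pasted in this post, and the names parse_physical_drives and drives_needing_attention are made up for illustration. Other controllers or hpacucli versions may print the lines differently.

```python
import re

# Matches lines such as:
#   physicaldrive 2:4 (port 2:id 4 , Parallel SCSI, 146.8 GB, Predictive Failure)
# The pattern is an assumption based only on the output pasted in this thread.
PD_LINE = re.compile(r"physicaldrive\s+(\d+:\d+)[^(]*\(([^)]*)\)")


def parse_physical_drives(text):
    """Return one dict per physicaldrive line found in `text`."""
    drives = []
    for line in text.splitlines():
        match = PD_LINE.search(line)
        if match is None:
            continue
        fields = [field.strip() for field in match.group(2).split(",")]
        if len(fields) < 2:
            continue  # not the layout we expect; skip rather than guess
        drives.append(
            {"id": match.group(1), "size": fields[-2], "status": fields[-1]}
        )
    return drives


def drives_needing_attention(text):
    """Return the drives whose status is anything other than OK."""
    return [d for d in parse_physical_drives(text) if d["status"] != "OK"]


# Sample input copied from the post above.
SAMPLE = """\
Smart Array 6400 in Slot 4

   array A

      physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 72.8 GB, OK)
      physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 72.8 GB, OK)
      physicaldrive 2:2 (port 2:id 2 , Parallel SCSI, 146.8 GB, OK)
      physicaldrive 2:3 (port 2:id 3 , Parallel SCSI, 146.8 GB, OK)
      physicaldrive 2:4 (port 2:id 4 , Parallel SCSI, 146.8 GB, Predictive Failure)
      physicaldrive 2:5 (port 2:id 5 , Parallel SCSI, 146.8 GB, OK)
"""

for drive in drives_needing_attention(SAMPLE):
    print(drive["id"], drive["status"])
```

Treating every status other than "OK" as worth a look is a deliberately cautious choice, since the full list of status strings the tool can print is not shown in this thread.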
 
Old 07-28-2008, 02:18 PM
"Damon L. Chesser"
 

On Mon, 2008-07-28 at 09:41 -0300, Lucas Mocellin wrote:
> Hi,
>
> I'm having some trouble with the HP SmartArray 6400 controller.
>
> I had a failed drive, so I replaced it, and now I'm
> getting this error:
>
> sp02:~# hpacucli
> => ctrl slot=4 pd all show
>
> Smart Array 6400 in Slot 4
>
> array A
>
> physicaldrive 2:0 (port 2:id 0 , Parallel SCSI, 72.8 GB, OK)
> physicaldrive 2:1 (port 2:id 1 , Parallel SCSI, 72.8 GB, OK)
> physicaldrive 2:2 (port 2:id 2 , Parallel SCSI, 146.8 GB, OK)
> physicaldrive 2:3 (port 2:id 3 , Parallel SCSI, 146.8 GB, OK)
> physicaldrive 2:4 (port 2:id 4 , Parallel SCSI, 146.8 GB, Predictive Failure)
> physicaldrive 2:5 (port 2:id 5 , Parallel SCSI, 146.8 GB, OK)
>
> Drive 2:4 shows "Predictive Failure", but I don't know what that means.
>
> I searched Google but found no answers.
>
> Can somebody help me?
>
> Thanks in advance,
>
> Lucas.

Lucas,

I am CCing you also as this could be very bad for you.

I worked on Dell hardware, but I can tell you what a predictive failure
is. It is one of two things:

1. The SMART hardware on the HD is reporting that drive failure is
imminent. The drive may last for hours or months, but it is in a state
that says it is about to fail.

2. I don't remember what the chipset for Dell RAID controllers is,
but I bet it is from the same mfg as HP's. Sometimes when the meta-data
gets corrupted (after a failure and a HD is replaced) the stripe is
punctured (google "punctured stripe"). If this is the case, no matter
what you do, PD 4 will never rebuild correctly and it will always report
a predictive failure.

You did not say if you replaced pd4 or not. If you did, there is a
chance that pd4 is just bad. If you did not, there is a greater chance
that pd4 is bad. The only thing you can do now is replace pd4 and see
if it rebuilds correctly. If it does not and still shows a predictive
failure, there is only one recourse: back up all the data, break the
RAID, rebuild the RAID, restore the data. You MIGHT get away with
clearing the stripe, then rebuilding the stripe in the controller, and
in a perfect world all the data will be there. Slim chance.

If your meta-data is corrupted, you are now gambling with your data.
Regardless of whether pd4 is in a predictive failure state or not, make
a complete backup and prepare for complete loss of that RAID. A
punctured stripe means you have no parity to rebuild from. Or, to put
it differently, a bit of data was made into garbage, then copied as part
of the parity onto the stripe. The corrupted parity stripe faithfully
rebuilds the array, only this time it includes that piece of bogus data.
Everything will work just fine until the machine tries to access that
bit, expecting to find the data it stored there, only to find
nonsensical data, then WHAM! Lock up. You can also experience
seemingly random HD failures; sometimes multiple HDs will get kicked
from the array. Needless to say, this plays havoc with data preservation.
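
To make the parity point concrete, here is a toy sketch in Python using plain single-parity XOR, the textbook RAID 5 idea. It is only an illustration under stated assumptions: it is not how the Smart Array (or any Dell) firmware actually stores data, the block contents are made up, and the helper names xor_blocks and rebuild_missing are hypothetical.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR several equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(survivors, parity):
    """Reconstruct the one missing block from the survivors plus parity."""
    return xor_blocks(*survivors, parity)


# A toy stripe: three data blocks and one parity block (contents are made up).
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Healthy case: the drive holding d1 dies and is rebuilt correctly.
assert rebuild_missing([d0, d2], parity) == d1

# Punctured case: one byte of d0 has silently turned into garbage
# before the drive holding d1 is rebuilt.
d0_bad = b"A?AA"
d1_rebuilt = rebuild_missing([d0_bad, d2], parity)

print(d1_rebuilt == d1)                                # the rebuilt block is wrong
print(xor_blocks(d0_bad, d1_rebuilt, d2) == parity)    # yet parity still checks out
```

The last line is the nasty part: after the rebuild the stripe is self-consistent again, so a parity check alone gives no sign that the rebuilt block now contains bogus data.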

This could be as simple as replacing pd4 and rebuilding (if it is just a
SMART error), or it could be a prelude to complete data loss. You have
to ask yourself, "Do you feel lucky? Well, do you?"

The above was learned through two years working for Dell at the
Gold/Platinum level for server support. Failed HDs comprised about 80%
of the job.

HTH
--
Damon L. Chesser
damon@damtek.com
http://www.linkedin.com/in/dchesser
 
Old 07-28-2008, 02:50 PM
"Lucas Mocellin"
 

Hi Damon,

First of all, thanks for your complete answer.

I replaced a "failed disk", and now I have a "predictive failure".

I understand. I googled "punctured stripe" and found other cases of this problem. So I will replace the drive again and see the result; if it happens again, I will rebuild my array (RAID 6).

I will try it and come back with the solution (or more questions =( ).

Thanks again!!

Lucas.

 
Old 07-29-2008, 12:15 AM
Alex Samad
 

On Mon, Jul 28, 2008 at 11:50:34AM -0300, Lucas Mocellin wrote:
> Hi Damon,
>
> First of all, thanks for your complete answer.

HP has a set of forums (forums.itrc.hp.com/). I believe they have HP
people on there, and it's free, from memory. I would suggest they would
have the best answer for you.

Also, have you installed the ACU? It is part of their PSP package (tools
and stuff for their equipment).

>
> I replaced a "failed disk", and now I have a "predictive failure".
>
> I understand. I googled "punctured stripe" and found other cases of
> this problem. So I will replace the drive again and see the result; if it
> happens again, I will rebuild my array (RAID 6).
>
> I will try it and come back with the solution (or more questions =( ).
>
> Thanks again!!
>
> Lucas.
>

[snip]

> >
> > HTH
> > --
> > Damon L. Chesser
> > damon@damtek.com
> > http://www.linkedin.com/in/dchesser
> >
> >

--
A budget is just a method of worrying before you spend money, as well
as afterward.
 
