PATA Hard Drive woes
Hi All.
Yesterday I was installing Centos 5.5 to my web server, and it looks like the main hard drive has gone AWOL.

Fedora 12 put the file system into r/o mode.

The drive is a Hitachi, still under warranty.

There are bad sectors on it, and running the Hitachi DFT tool confirms this. Also, I cannot repair the bad sectors.

Would this be caused by a faulty I/O chip, or is it safe to say it's definitely the HDD at fault?

Kind Regards,

Keith Roberts

--
In theory, theory and practice are the same; in practice they are not.

This email was sent from my laptop with Centos 5.5
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
PATA Hard Drive woes
On 10/31/2010 3:27 PM, Keith Roberts wrote:
> Hi All.
>
> Yesterday I was installing Centos 5.5 to my web server, and
> it looks like the main hard drive has gone AWOL.
>
> Fedora 12 put the file system into r/o mode.
>
> The drive is a Hitachi, still under warranty.
>
> There are bad sectors on it, and running the Hitachi DFT
> tool confirms this. Also I cannot repair the bad sectors.
>
> Would this be caused by a faulty I/O chip, or is it safe to
> say it's definitely the HDD at fault?
>
> Kind Regards,
>
> Keith Roberts
>

hdd at fault
PATA Hard Drive woes
On Sun, 31 Oct 2010, William Warren wrote:
> To: CentOS mailing list <centos@centos.org>
> From: William Warren <hescominsoon@emmanuelcomputerconsulting.com>
> Subject: Re: [CentOS] PATA Hard Drive woes
>
> On 10/31/2010 3:27 PM, Keith Roberts wrote:
>> Hi All.
>>
>> Yesterday I was installing Centos 5.5 to my web server, and
>> it looks like the main hard drive has gone AWOL.
>>
>> Fedora 12 put the file system into r/o mode.
>>
>> The drive is a Hitachi, still under warranty.
>>
>> There are bad sectors on it, and running the Hitachi DFT
>> tool confirms this. Also I cannot repair the bad sectors.
>>
>> Would this be caused by a faulty I/O chip, or is it safe to
>> say it's definitely the HDD at fault?
>>
>> Kind Regards,
>>
>> Keith Roberts
>>
> hdd at fault

OK - thanks for confirming that Bill.

I'll remove it and take it back for replacement.

Keith
PATA Hard Drive woes
On Sun, 31 Oct 2010, Keith Roberts wrote:
> To: CentOS mailing list <centos@centos.org>
> From: Keith Roberts <keith@karsites.net>
> Subject: Re: [CentOS] PATA Hard Drive woes
>
> On Sun, 31 Oct 2010, William Warren wrote:
>
>> To: CentOS mailing list <centos@centos.org>
>> From: William Warren <hescominsoon@emmanuelcomputerconsulting.com>
>> Subject: Re: [CentOS] PATA Hard Drive woes
>>
>> On 10/31/2010 3:27 PM, Keith Roberts wrote:
>>> Hi All.
>>>
>>> Yesterday I was installing Centos 5.5 to my web server, and
>>> it looks like the main hard drive has gone AWOL.
>>>
>>> Fedora 12 put the file system into r/o mode.
>>>
>>> The drive is a Hitachi, still under warranty.
>>>
>>> There are bad sectors on it, and running the Hitachi DFT
>>> tool confirms this. Also I cannot repair the bad sectors.
>>>
>>> Would this be caused by a faulty I/O chip, or is it safe to
>>> say it's definitely the HDD at fault?
>>>
>>> Kind Regards,
>>>
>>> Keith Roberts
>>>
>> hdd at fault
>
> OK - thanks for confirming that Bill.
>
> I'll remove it and take it back for replacement.
>
> Keith

There were about 79 Seek errors in the SMART logs of the HDD.

I moved the drive from the Primary Master cable to the Secondary Master cable, ran Hitachi's DFT tool, and did a complete disk erase, which terminated with errors.

So to prepare the disk for returning under warranty, I used another HDD utility to clean the disk again, still on the Secondary Master cable. I used vivard 0.4 to do a complete disk erase. That was on the Ultimate Boot CD (http://www.ultimatebootcd.com/index.html), under HDD utils.

vivard did not show any errors when doing a full disk erase.

So I ran an Advanced r/w scan again with Hitachi DFT, and the result was OK.

Any ideas what's happening please?

Is this disk usable, or is it still in need of replacing?

Kind Regards,

Keith Roberts
PATA Hard Drive woes
Keith Roberts wrote, On 11/03/2010 10:32 AM:
> On Sun, 31 Oct 2010, Keith Roberts wrote:
>
<SNIP>
> There were about 79 Seek errors in the SMART logs of the
> HDD.
>
<SNIP>
> vivard did not show any errors when doing a full disk erase.
>
> So I ran an Advanced r/w scan again with Hitachi DFT, and
> the result was OK.
>
> Any ideas what's happening please?

WFG: In writing it all, the seek motor knocked the dust out of its way? (what dust?)

How about checking all the SMART attributes and seeing if others are elevated?
http://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

Are you seeing any block "remap" activity?
http://en.wikipedia.org/wiki/Hard_disk_drive#Error_handling

>
> Is this disk usable, or is it still in need of replacing?
>

http://en.wikipedia.org/wiki/S.M.A.R.T.#Background
You have gotten SMART errors from this drive already, so:
You have to ask yourself, 'Do you feel lucky?' Well, do ya...

And the other question:
If this drive up and dies shortly and I knew about the SMART errors, will the data owner complain more to me about the drive death later, or about the drive replacement hassle now?

Only YOU (and the data owner) know the risk trade-off levels you have to consider.

--
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter
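Editor's sketch: the suggestion above to look through the SMART attributes for elevated values or remap activity could be automated roughly like this. It assumes the table layout printed by smartctl -A from smartmontools; the choice of which attributes to watch, the function name nonzero_watched, and the sample text are all assumptions made for illustration. The sample is invented and is not output from the drive discussed in this thread.

```python
# Attribute names as printed by smartmontools. Which ones to watch is an
# assumption made for this sketch, not something stated in the thread.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}


def nonzero_watched(smartctl_text):
    """Return {attribute_name: raw_value} for watched attributes with a non-zero raw value."""
    found = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


# Made-up sample in the layout of `smartctl -A`; NOT data from this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       2
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    print(nonzero_watched(SAMPLE))
```

In real use the text would come from running smartctl -A on the device in question. Raw values are vendor-specific, so a non-zero count is a prompt to look closer rather than a verdict.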
PATA Hard Drive woes
On Wed, 3 Nov 2010, Todd Denniston wrote:
> To: CentOS mailing list <centos@centos.org>
> From: Todd Denniston <Todd.Denniston@tsb.cranrdte.navy.mil>
> Subject: Re: [CentOS] PATA Hard Drive woes
>
> Keith Roberts wrote, On 11/03/2010 10:32 AM:
>> On Sun, 31 Oct 2010, Keith Roberts wrote:
>>
> <SNIP>
>> There were about 79 Seek errors in the SMART logs of the
>> HDD.
>>
> <SNIP>
>> vivard did not show any errors when doing a full disk erase.
>>
>> So I ran an Advanced r/w scan again with Hitachi DFT, and
>> the result was OK.
>>
>> Any ideas what's happening please?
>
> WFG: In writing it all, the seek motor knocked the dust
> out of its way? (what dust?) How about checking all the
> smart attributes and seeing if others are elevated.
> http://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes
>
> Are you seeing any block "remap" activity?
> http://en.wikipedia.org/wiki/Hard_disk_drive#Error_handling
>
>>
>> Is this disk usable, or is it still in need of replacing?
>>
>
> http://en.wikipedia.org/wiki/S.M.A.R.T.#Background You
> have gotten SMART errors from this drive already, so: You
> have to ask yourself, 'Do you feel lucky?', Well do y'a...
>
> And the other question: If this drive up and dies shortly
> and I knew about the smart errors, will the data owner
> complain more or less to me about the drive death later or
> drive replacement hassle now?
>
> Only YOU (and the data owner) know the risk trade-off
> levels you have to consider.
>

Thanks Todd for the reply.

There were no sectors remapped, which is odd as there were bad sectors originally on the drive. I ran MemTest86+ out of curiosity, and there are 5120 errors, some at 0.4 MB and 0.5 MB.

The BIOS has been playing up, not recognising the Primary Master drive. This is the channel the Hitachi disk was on when it developed the sector read errors.

Could a bad controller or bad RAM cause hard drive sector errors?

The drive is as good as uninstalled, so I may as well send it for replacement.

Regards,

Keith

NB: The box is down now, and I'll try to test and identify the bad memory module next.
PATA Hard Drive woes
On 11/3/2010 8:32 AM, Keith Roberts wrote:
>
> So to prepare the disk for returning under warranty, I used
> another HDD utility to clean the disk again

...

> So I ran an Advanced r/w scan again with Hitachi DFT, and
> the result was OK.

A complete disk wipe brings bad sectors to the drive's attention, forcing it to remap them using spare sectors set aside for the purpose. All drives can do this, and they do it without logging the change. You can't tell, from the outside, when or whether the drive has done this. All you can do is infer it, because a sector that once tested bad now tests good.

As to why this happened only during a format, not during the previous disk test, it's probably because the format zeroed the disk. That particular drive may have a policy to only remap sectors on write, so as to preserve the sector contents for potential recovery later. (See below for one way this can be done.)

It may be that your drive is now fine.

If you put it back into service, at minimum I would set up smartd, from the smartmontools package. Maybe run smartctl on it by hand daily or weekly, too. If you find that errors start happening again, there is something continually degrading the drive's integrity, so the automatic sector remapping will eventually run the drive out of spare sectors.

SpinRite (http://spinrite.com/) does nondestructive sector remapping. At level 4 and above, it reads each sector in and then writes it back out to the drive. Because remapping is silent, it's possible for it to appear to do nothing, yet improve data integrity by bringing dodgy sectors to the drive's attention.

If a sector can't be read without error, SpinRite forces the drive to ignore the CRC and return the data anyway, retrying many times, then making a statistical guess about the most likely contents of the sector. (Reading a bad sector won't necessarily give the same value each try.) Then on writing the reconstructed data back out, the drive automatically remaps the sector, repairing it.

You might want to combine the SMART monitoring with periodic SpinRite runs on the drive until you regain confidence in it.
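Editor's sketch: the idea of "retrying many times, then making a statistical guess" can be illustrated with a per-byte majority vote over repeated reads of the same sector. This is only an illustration of the general idea under that assumption. It is not SpinRite's actual algorithm, which the post does not describe in detail, and the function name majority_vote is hypothetical.

```python
from collections import Counter


def majority_vote(reads):
    """Combine several same-length reads of one sector into a best guess, byte by byte."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must have the same length")
    guess = bytearray(length)
    for i in range(length):
        # Pick the byte value seen most often at this offset across all reads.
        guess[i] = Counter(r[i] for r in reads).most_common(1)[0][0]
    return bytes(guess)
```

For example, three unreliable reads b"hello", b"hellp" and b"jello" combine to b"hello", because each bad byte is outvoted at its position. Writing the reconstructed data back is then what gives the drive the chance to remap the sector.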
PATA Hard Drive woes
Warren Young wrote:
> On 11/3/2010 8:32 AM, Keith Roberts wrote:
>>
>> So to prepare the disk for returning under warranty, I used
>> another HDD utility to clean the disk again
>
> ...
>
>> So I ran an Advanced r/w scan again with Hitachi DFT, and
>> the result was OK.
>
> A complete disk wipe brings bad sectors to the drive's attention,
> forcing it to remap them using spare sectors set aside for the purpose.
<snip>
> If you put it back into service, at minimum I would set up smartd, from
> the smartmontools package. Maybe run smartctl on it by hand daily or
> weekly, too. If you find that errors start happening again, there is
> something continually degrading the drive's integrity, so the automatic
> sector remapping will eventually run the drive out of spare sectors.
<snip>

Yeah, but I have problems with smartmon: for example, I've got a drive in one server that's got two bad sectors, which SMART reports. I've followed the instructions on how to make the log messages go away, and fsck -c... but on reboot, SMART seems to ignore what badblocks found, and the irritating messages are back.

mark
PATA Hard Drive woes
On 11/3/2010 11:27 AM, m.roth@5-cent.us wrote:
> Yeah, but I have problems with smartmon:

More likely, problems with SMART. S.M.A.R.T. is D.U.M.B. :)

It's better than nothing, but sometimes not by a whole lot.

> one server that's got two bad sectors, which SMART reports. I've followed
> the instructions on how to make the log messages go away, and fsck -c...
> but on reboot, SMART seems to ignore what badblocks found, and the
> irritating messages are back.

It may be that SpinRite could fix that by forcing a remap.

Another option -- which I didn't mention because it probably isn't an option for the original poster, but which may work with your servers -- is that some high-end RAID systems can do something like SpinRite at level 4+, as can ZFS. They call it resilvering. I don't think these systems do statistical reconstruction, but periodic read-then-rewrite can stave off the need to reconstruct.
PATA Hard Drive woes
On 11/03/10 17:01, Keith Roberts wrote:
>
> There were no sectors remapped, which is odd as there were
> bad sectors originally on the drive. I ran MemTest86+ out of
> curiosity, and there are 5120 Errors, some at 0.4MB & 0.5
> MB.
>

You should fix that first.

> The BIOS has been playing up, not recognising the Primary
> Master drive. This is the channel the Hitachi disk was on
> when it developed the sector read errors.
>
> Could a bad controller or bad RAM cause Hard Drive sector
> errors?
>

Neither bad RAM nor a bad controller can physically damage a hard drive. A bad controller will not cause reallocated sectors. It can, however, cause UDMA CRC errors and other weird non-SMART-related behaviour.

> The drive is as good as uninstalled, so I may as well send
> it for replacement.
>

Send the output of smartctl -a /dev/yourdisk; that'll give us more factual data than speculation.
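Editor's sketch: the rule of thumb in this reply (media problems show up as reallocated sectors, while controller or cable problems show up as UDMA CRC errors) can be written down as a small heuristic. The attribute names are those printed by smartmontools; the function name likely_suspect, the set of counters consulted, and the return strings are assumptions made for illustration. It is a rough first-pass triage, not a diagnosis.

```python
def likely_suspect(attrs):
    """Rough triage from SMART raw counters, given as {attribute_name: raw_value}.

    Assumption for this sketch: missing attributes count as zero.
    """
    media_counters = (
        "Reallocated_Sector_Ct",
        "Current_Pending_Sector",
        "Offline_Uncorrectable",
    )
    # Media-level counters point at the drive itself.
    if any(attrs.get(name, 0) > 0 for name in media_counters):
        return "drive"
    # CRC errors on the interface point at the cable or controller instead.
    if attrs.get("UDMA_CRC_Error_Count", 0) > 0:
        return "cable or controller"
    return "no evidence in these counters"
```

Note that bad RAM, as MemTest86+ reported on this box, would not show up in any of these counters, which is one reason the reply recommends fixing the memory first.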