02-17-2012, 06:52 PM

smartd and smartctl

A few weeks ago, one of my servers started complaining, via smartd, that
one drive had one unreadable sector. I umounted it, and ran an fsck -c,
then remounted it. Error didn't go away. Now, what's really annoying is
that I've gotten back to it today, and it's reporting the problem, as it
has for weeks now, every half an hour.

However, when I run
> smartctl -q errorsonly -H -l selftest -l error /dev/sdb
it gives me *nothing*. Anyone understand why I get two different results?

mark "and I am waiting for the smartctl -t long /dev/sdb to complete"

_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
02-17-2012, 07:16 PM
"Mike Burger"

> A few weeks ago, one of my servers started complaining, via smartd, that
> one drive had one unreadable sector. I umounted it, and ran an fsck -c,
> then remounted it. Error didn't go away. Now, what's really annoying is
> that I've gotten back to it today, and it's reporting the problem, as it
> has for weeks now, every half an hour.
>
> However, when I run
>> smartctl -q errorsonly -H -l selftest -l error /dev/sdb
> it gives me *nothing*. Anyone understand why I get two different results?
>
> mark "and I am waiting for the smartctl -t long /dev/sdb to complete"

SMART works at the hardware level, reading diagnostic information from the
SMART circuitry on the hard drives themselves. Modern drives will often try
to move the data from bad sectors on the platters to good sectors, and then
mark the bad ones so that they won't be used later.

Running fsck only works at the logical filesystem layer. The fsck tool has
no hooks to deal with the physical layer.

--
Mike Burger
http://www.bubbanfriends.org

 
02-17-2012, 07:25 PM

Mike Burger wrote:
> The smart system works at the hardware level, reading diagnostic
> information from the SMART circuitry on the hard drives, themselves. The
> hard drives will often, now, try to move the data from bad sectors on the
> platters to good sectors, and then mark them so that they won't be used,
> later.
>
> Running fsck only works at the logical filesystem layer. The fsck tool has
> no hooks to deal with the physical layer.

Ok, but my thinking was, first, that after the fsck, the system wouldn't
try to write to the bad sector, thus not provoking smart. The more
annoying thing is that I don't understand why smartctl doesn't give the
same info as smartd. When I do a -a, it does tell me that one sector's
pending, but not that there's any error.

mark

 
02-17-2012, 07:34 PM
Mike VanHorn

FWIW, on some of my workstations, when I have gotten the "sector pending"
messages, I have been able to take the drive out and run the
manufacturer's diagnostics on it (in my case, Seatools), and that fixed
some things and I haven't had any issues since.

---
Mike VanHorn
Senior Computer Systems Administrator
College of Engineering and Computer Science
Wright State University
265 Russ Engineering Center
937-775-5157
michael.vanhorn@wright.edu
http://www.engineering.wright.edu/~mvanhorn/





 
02-17-2012, 07:36 PM
Andrzej Szymański

On 2012-02-17 21:25, m.roth@5-cent.us wrote:
> Ok, but my thinking was, first, that after the fsck, the system wouldn't
> try to write to the bad sector, thus not provoking smart. The more
> annoying thing is that I don't understand why smartctl doesn't give the
> same info as smartd. When I do a -a, it does tell me that one sector's
> pending, but not that there's any error.
>

Actually, smartd is reporting THIS pending sector, and it probably won't
stop until the sector is reallocated, which will happen on the next
write to this sector.

Since the location and contents of this sector are quite hard to find, the
simplest (but most laborious) way of solving the problem is to move
all data off this disk, overwrite the whole surface with zeros (dd),
and move the data back.

However, I would carefully monitor the number of reallocated sectors on this
disk. If it grows steadily, you had better move your valuable data elsewhere.

Andrzej
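
The advice above to watch the reallocated-sector count could be sketched roughly as follows. This assumes the drive exposes the standard ATA attribute 5 (Reallocated_Sector_Ct) in `smartctl -A` output, with the raw value in the tenth column; the function names and the state-file path are hypothetical and not from the thread.

```shell
#!/bin/sh
# Extract the raw Reallocated_Sector_Ct value (attribute ID 5) from
# `smartctl -A` output read on stdin. Assumes the usual 10-column ATA table.
realloc_count() {
    awk '$1 == 5 { print $10 }'
}

# Compare the current count with the one saved in a state file, update the
# file, and report whether the count grew.
# $1 = current count, $2 = state file (hypothetical path chosen by the caller).
check_growth() {
    current=${1:-0}
    state=$2
    previous=0
    [ -f "$state" ] && previous=$(cat "$state")
    echo "$current" > "$state"
    if [ "$current" -gt "$previous" ]; then
        echo "reallocated sectors grew: $previous -> $current"
    else
        echo "reallocated sectors unchanged: $current"
    fi
}

# Typical use (as root; /dev/sdb is the drive discussed in this thread):
#   check_growth "$(smartctl -A /dev/sdb | realloc_count)" /var/tmp/sdb.realloc
```

Run from cron, this gives a simple trend signal. A count that stays flat suggests an isolated defect, while steady growth is the cue to move the data off the disk.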
 
02-17-2012, 09:04 PM

Mike VanHorn wrote:
>
> FWIW, on some of my workstations, when I have gotten the "sector pending"
> messages, I have been able to take the drive out and run the
> manufacturer's diagnostics on it (in my case, Seatools), and that fixed
> some things and I haven't had any issues since.
>
Well, since the server has users on it, I can't really do that, or wipe
the disk.... I'm not really worried - it's stayed at 1 sector. If that
starts growing, then I'll worry, and get ready to replace the disk. Right
now, it's just an annoyance, as I said, that it shows up on email logs
from our loghost twice every hour. And I'm still waiting for anyone to
explain to me what I'm doing using smartctl that results in it *not*
telling me there's an error, or where the error is. In fact, the last long
test I started, early this afternoon, seems to be done, and with smartctl
-a, I see
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error        00%             2536  -
# 2  Short offline       Completed without error        00%             2529  -

So I'm befuddled why it won't tell me anything about this pending error.

mark
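
One likely explanation, offered as an assumption rather than something confirmed in this thread: the pending-sector count is a SMART attribute (ID 197, Current_Pending_Sector). It therefore appears in the attribute table printed by `smartctl -A` (or `-a`), not in the health status, error log, or self-test log selected by `-H -l selftest -l error`. With `-q errorsonly`, an attribute that has not crossed its failure threshold is not printed either. smartd's default `-a` directive includes `-C 197` and `-U 198`, so it reports a non-zero pending count on every check. A minimal sketch for pulling out those two values; the function name is hypothetical:

```shell
#!/bin/sh
# Print name=raw_value for Current_Pending_Sector (197) and
# Offline_Uncorrectable (198) from `smartctl -A` output read on stdin.
# Assumes the usual 10-column ATA attribute table (raw value in column 10).
pending_counts() {
    awk '$1 == 197 || $1 == 198 { print $2 "=" $10 }'
}

# Typical use (as root; /dev/sdb is the drive discussed in this thread):
#   smartctl -A /dev/sdb | pending_counts
```

If the installed smartmontools is recent enough to support it, writing the directive as `-C 197+` in /etc/smartd.conf should make smartd report only when the raw value increases, not on every check. Consult `man smartd.conf` for the version in use before relying on that.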

 
02-17-2012, 10:27 PM
Yves Bellefeuille

On Friday 17 February 2012, Andrzej Szymański <szymans@agh.edu.pl>
wrote:

> As the location and contents of this sector are quite hard to find,
> the simplest, but the most troublesome way of solving the problem is
> moving all data away from this disk, writing the whole surface with
> zeros (dd) and moving the data back.

badblocks -n would also work, I imagine.

--
Yves Bellefeuille <yan@storm.ca>
"The Esperanto Civitas does not refuse in advance the collaboration of those
who have erred, if they are aware of their error." -- Heroldo Komunikas, no. 473.
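
A non-destructive badblocks pass, as suggested above, might look like the sketch below. `-n` selects the non-destructive read-write mode, `-s` shows progress, and `-v` is verbose. The filesystem must be unmounted first, and whether this pass actually gets the pending sector rewritten is not confirmed anywhere in this thread. The wrapper and helper names are hypothetical.

```shell
#!/bin/sh
# Return success if the device, or anything whose name starts with it
# (e.g. /dev/sdb1 for /dev/sdb), appears in a mounts table.
# $1 = device; $2 = mounts file (defaults to /proc/mounts).
# Note: this is a simple prefix match, so it errs on the side of caution.
is_mounted() {
    awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Hypothetical wrapper: refuse to scan a mounted device, otherwise run a
# non-destructive read-write scan with progress and verbose output.
scan_nondestructive() {
    if is_mounted "$1"; then
        echo "refusing: $1 (or one of its partitions) is mounted" >&2
        return 1
    fi
    badblocks -nsv "$1"
}

# Typical use (as root, after unmounting):
#   scan_nondestructive /dev/sdb
```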