03-24-2009, 10:45 AM
Dushyanth

Ext3 - Frequent read-only FS issues

Hi all,

I have a bunch of mail servers running postfix (external SMTP),
qmail (LDA) and courier IMAP/POP. Frequently, the ext3 filesystem goes
into read-only mode, forcing recovery with fsck.

Below are the errors we have seen so far on these systems, along with
each system's configuration. The same ext3 errors recur in many cases.

1.
Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2.6.18-92.1.10.el5 #1 SMP x86_64
Adaptec 2420SA RAID controller
Logical disk write cache disabled, physical disk write cache enabled
Battery not installed
2 * 500GB ST373307LW in RAID1

Instances of EXT3 errors that caused a read-only FS:

a. “EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 8766158 in dir
#8765708”

2.
Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2.6.18-92.1.10.el5 #1 SMP x86_64
Dell PERC 6/i - Write back cache enabled
Battery available
2 * ST3500630AS 500GB in RAID1

Instances of EXT3 errors that caused a read-only FS:

a. “EXT3-fs error (device sda3): ext3_lookup: unlinked inode 89065027 in dir
#89065024”
b. “EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in directory
#65077525: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
name_len=0”
c. “EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in directory
#65077525: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0,
name_len=0”
d. “kernel: EXT3-fs error (device sda3): htree_dirblock_to_tree: bad entry in
directory #65077525: rec_len is smaller than minimal - offset=0, inode=0,
rec_len=0, name_len=0”

3.
Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2.6.18-92.1.13.el5 #1 SMP x86_64
Dell PERC 6/i - Write back cache enabled
Battery available
2 * ST3500630AS 500GB in RAID1

Instances of EXT3 errors that caused a read-only FS:

a. “EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 26968135 in dir
#35127737”
b. "EXT3-fs error (device sdb1): ext3_lookup: unlinked inode 9994260 in dir
#39518393"
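
When comparing incidents across several servers, it can help to pull the device, the reporting function, and the inode numbers out of the kernel log mechanically. The following is a minimal sketch in Python, assuming each error sits on a single log line in the shape quoted above; the function names and the regular expressions are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Patterns are based only on the message shapes quoted in this thread, e.g.
#   EXT3-fs error (device sda3): ext3_lookup: unlinked inode 89065027 in dir #89065024
EXT3_ERROR = re.compile(
    r"EXT3-fs error \(device (?P<device>[^)]+)\): "
    r"(?P<function>\w+): (?P<message>.*)"
)
UNLINKED = re.compile(r"unlinked inode (?P<inode>\d+) in dir #(?P<dir>\d+)")


def parse_ext3_errors(lines):
    """Yield one dict per EXT3-fs error line; all other lines are skipped."""
    for line in lines:
        match = EXT3_ERROR.search(line)
        if match is None:
            continue
        record = match.groupdict()
        unlinked = UNLINKED.search(record["message"])
        if unlinked is not None:
            # Only ext3_lookup-style messages carry these two numbers.
            record["inode"] = int(unlinked.group("inode"))
            record["dir"] = int(unlinked.group("dir"))
        yield record


def count_by_device_and_function(lines):
    """Count errors per (device, reporting function) pair."""
    return Counter(
        (record["device"], record["function"])
        for record in parse_ext3_errors(lines)
    )
```

Feeding it the lines of a saved kernel log (for example an open file object) gives a quick per-device tally, which makes it easier to see whether one error type dominates on a given machine.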

* I am not sure what is causing these errors. These crashes have been happening
for over 6 months now on a fairly regular basis, but on different servers.
* The disks on these systems seem to be fine, though I haven't checked them for
bad blocks yet.
* Each of those servers has its own disks for mail storage; there is no NFS
or cluster FS involved.
* The inode numbers in "ext3_lookup: unlinked inode" seem to refer to a
nonexistent Courier POP/IMAP server cache file (courierpop3dsizelist).
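
One way to check which path, if any, an inode number from such a message currently maps to is to walk the mounted filesystem and compare inode numbers. The sketch below is a hedged Python illustration of that idea; `find_paths_by_inode` is a hypothetical helper name, and on a large mail store the existing tools (`find -xdev -inum`, or `ncheck` in debugfs) are likely to be the more practical choice. An empty result is expected when the inode really is unlinked.

```python
import os


def _lstat_or_none(path):
    """lstat a path, returning None if it vanished or cannot be read."""
    try:
        return os.lstat(path)
    except OSError:
        return None


def find_paths_by_inode(root, inode):
    """Return paths under root whose inode number equals `inode`.

    The walk stays on the filesystem that holds `root`, because inode
    numbers are only unique within a single filesystem.
    """
    root_dev = os.lstat(root).st_dev
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories that live on another filesystem (mount points).
        same_fs_dirs = []
        for name in dirnames:
            info = _lstat_or_none(os.path.join(dirpath, name))
            if info is not None and info.st_dev == root_dev:
                same_fs_dirs.append(name)
        dirnames[:] = same_fs_dirs

        for name in same_fs_dirs + filenames:
            path = os.path.join(dirpath, name)
            info = _lstat_or_none(path)
            if info is None:
                continue
            if info.st_dev == root_dev and info.st_ino == inode:
                matches.append(path)
    return matches
```

For example, `find_paths_by_inode("/mountpoint", 8766158)` would list every hard link to that inode under the mount point, where `/mountpoint` stands in for the real mail store path.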

At this point, I am trying to figure out the possible causes of these ext3
errors. Any pointers/recommendations will be of great help.

P.S.: The logs provided here are not complete, and I would be glad to dig up and
post complete logs of these events as required.

TIA
Dushyanth

_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
 
03-24-2009, 12:58 PM
Dushyanth

Ext3 - Frequent read-only FS issues

> I have a bunch of mail servers running postfix (external smtp),
> qmail (LDA) and courier IMAP/POP. Frequently, Ext3 filesystem goes
> into read-only mode forcing recovery using fsck.
>
> Below are the errors we have seen so far on these systems and those
> systems config. The ext3 errors are common in many cases.

I forgot to mention that all ext3 mounts use data=ordered:

/dev/sdX1 on /mountpoint type ext3 (rw,noatime,usrquota,grpquota,data=ordered)

Dushyanth
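
To audit many servers for the options shown in that mount line, the output can be parsed rather than read by eye. This is a minimal Python sketch, assuming the `device on mountpoint type fstype (options)` layout quoted above; `parse_mount_line` and `barrier_requested` are made-up names for illustration. Note that an option missing from this listing only shows it was not requested explicitly; what the kernel then uses by default should be checked against the documentation for the kernel in question.

```python
import re

# Layout assumed from the line quoted in this thread:
#   /dev/sdX1 on /mountpoint type ext3 (rw,noatime,...,data=ordered)
MOUNT_LINE = re.compile(
    r"^(?P<device>\S+) on (?P<mountpoint>\S+) "
    r"type (?P<fstype>\S+) \((?P<options>[^)]*)\)$"
)


def parse_mount_line(line):
    """Split one line of mount output into its fields and an options dict."""
    match = MOUNT_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised mount line: %r" % line)
    info = match.groupdict()
    options = {}
    for item in info.pop("options").split(","):
        if not item:
            continue
        key, _, value = item.partition("=")
        # Flag-style options such as "noatime" are stored as True.
        options[key] = value if value else True
    info["options"] = options
    return info


def barrier_requested(info):
    """True only if barrier=1 appears explicitly in the listed options."""
    return info["options"].get("barrier") == "1"
```

Applied to the line above, the parsed options contain `data=ordered` and no `barrier` entry, so `barrier_requested` returns False.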



03-24-2009, 02:01 PM
Steven Chervets

Ext3 - Frequent read-only FS issues

Dushyanth,

If you have the disk write cache enabled, any power outage can cause
file system corruption. I would suggest that you put a battery in your
disk controller and enable the write cache on it, and at the same time
disable the write cache on the disk itself.


If you can't get your hands on a battery, then disable the write
cache on the hard disk and see if you get any more file system
corruption errors.


Good Luck,
Steve

On Mar 24, 2009, at 7:58 AM, Dushyanth wrote:

> > I have a bunch of mail servers running postfix (external smtp),
> > qmail (LDA) and courier IMAP/POP. Frequently, Ext3 filesystem goes
> > into read-only mode forcing recovery using fsck.
> >
> > Below are the errors we have seen so far on these systems and those
> > systems config. The ext3 errors are common in many cases.
>
> Forgot to mention that all ext3 mounts are ordered
>
> /dev/sdX1 on /mountpoint type ext3 (rw,noatime,usrquota,grpquota,data=ordered)
>
> Dushyanth



03-24-2009, 02:25 PM
Eric Sandeen

Ext3 - Frequent read-only FS issues

Steven Chervets wrote:
> Dushyanth,
>
> If you have the disk write-cache enabled it means that any power
> outage can cause file system corruption. I would suggest that you put
> a battery in your disk controller and enable the write-cache on it,
> and at the same time, disable the write-cache on the disk itself.
>
> If you can't get your hands on a battery, then disabled the write-
> cache on the hard disk and see if you get any more file system
> corruption errors.
>
> Good Luck,
> Steve

If you think it could be write-cache related, just mount with -o
barrier=1 and see if things get better. (Or disable the write cache on
the disks with hdparm, as Steve suggests.)

Do you lose power much? Do these errors correspond to power loss events?

-Eric
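
The two options above can be written out as command lines. The Python sketch below only builds and prints the commands and does not run them; the device and mount point are the placeholders used earlier in the thread, and the helper names are hypothetical. One assumption to flag: for disks sitting behind a hardware RAID controller such as the ones listed in the first post, hdparm may not reach the physical drives, in which case the controller's own management tool would be needed instead.

```python
import shlex


def barrier_mount_command(device, mountpoint):
    """argv for mounting an ext3 filesystem with write barriers requested."""
    return ["mount", "-t", "ext3", "-o", "barrier=1", device, mountpoint]


def disable_disk_write_cache_command(device):
    """argv for switching off a drive's on-disk write cache via hdparm -W0."""
    return ["hdparm", "-W0", device]


def render(argv):
    """Render an argv list as a shell-safe string for review or logging."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Placeholders only; substitute the real device and mount point.
    print(render(barrier_mount_command("/dev/sdX1", "/mountpoint")))
    print(render(disable_disk_write_cache_command("/dev/sdX")))
```

Keeping the commands as argv lists means they can later be passed to `subprocess.run` unchanged once they have been reviewed.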

03-24-2009, 02:37 PM
Dushyanth

Ext3 - Frequent read-only FS issues

Hi,

Thanks Steve and Eric. My response below.

>
> Steven Chervets wrote:
> > Dushyanth,
> >
> > If you have the disk write-cache enabled it means that any power
> > outage can cause file system corruption. I would suggest that you put
> > a battery in your disk controller and enable the write-cache on it,
> > and at the same time, disable the write-cache on the disk itself.
> >
> > If you can't get your hands on a battery, then disabled the write-
> > cache on the hard disk and see if you get any more file system
> > corruption errors.
> >
> > Good Luck,
> > Steve
>
> If you think it could be write-cache related, just mount with -o
> barrier=1, and see if things get better. (or disable write cache on the
> disks with hdparm as Steve suggests)
>
> Do you lose power much? Do these errors correspond to power loss events?

These errors come up suddenly on running systems. We use IPMI-based reboots to
power-cycle the boxes when they hang.

Is it possible that corruption introduced during one such reboot could cause a
read-only FS later on, with the errors I mentioned? I am guessing yes, because
fsck is not always forced during boot; sometimes ext3 only suggests a FS check
while mounting the disk.

I will disable the disk caches and observe.

Thanks
Dushyanth



03-24-2009, 03:36 PM
Theodore Tso

Ext3 - Frequent read-only FS issues

On Tue, Mar 24, 2009 at 03:37:48PM +0000, Dushyanth wrote:
> > Do you lose power much? Do these errors correspond to power loss events?
>
> These errors come up suddenly on running systems. We do IPMI based reboots for
> power cycling the boxes when they hang.

I assume you do have a full fsck done on the filesystem after you see
these errors?

> Is it possible that some corruption that occurred during one such reboots to
> cause a read only FS later on with the errors i mentioned ? Iam guessing yes,
> cos force fsck during boot is not forced always cos sometimes a FS check is
> suggested by ext3 during mounting the disk.

If an fs check is getting suggested by ext3 during the mounting of the
disk, then your boot scripts must not be set up correctly, or
/etc/fstab has been set up to disable fsck getting run automatically
on said filesystem.

I would strongly recommend running forced fsck to make sure all of
your filesystems are clean first; if the errors have been around for a
while, and for some reason the automatic fsck has been disabled so
they aren't getting checked, that could be really bad.

- Ted
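
Whether /etc/fstab disables the automatic check can be read from the sixth field (fs_passno): a value of 0, or a missing field, means fsck skips that filesystem at boot. The following Python sketch lists ext3 entries in that state; it assumes plain whitespace-separated fstab lines, and the function name is made up for illustration. A forced check itself would still be run by hand with `e2fsck -f` on the unmounted device.

```python
def filesystems_skipping_boot_fsck(fstab_text, fstype="ext3"):
    """Return (device, mountpoint) pairs whose fs_passno is 0 or absent.

    fstab fields: device, mountpoint, type, options, dump, pass.
    """
    skipped = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if len(fields) < 3 or fields[2] != fstype:
            continue  # malformed entry or a different filesystem type
        # A missing sixth field is treated the same as 0.
        passno = int(fields[5]) if len(fields) >= 6 else 0
        if passno == 0:
            skipped.append((fields[0], fields[1]))
    return skipped


if __name__ == "__main__":
    with open("/etc/fstab") as handle:
        for device, mountpoint in filesystems_skipping_boot_fsck(handle.read()):
            print("no boot-time fsck:", device, "on", mountpoint)
```

An empty result suggests fstab is not what is preventing the check, which points back at the boot scripts or at checks being skipped by hand.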

03-24-2009, 05:09 PM
Dushyanth

Ext3 - Frequent read-only FS issues

Hi,

Theodore Tso <tytso <at> mit.edu> writes:
>
> On Tue, Mar 24, 2009 at 03:37:48PM +0000, Dushyanth wrote:
> > > Do you lose power much? Do these errors correspond to power loss
> > > events?
> >
> > These errors come up suddenly on running systems. We do IPMI based
> > reboots for power cycling the boxes when they hang.
>
> I assume you do have a full fsck done on the filesystem after you see
> these errors?

It's quite possible that we did not run the suggested fsck. Ops/DC
staff sometimes skip fsck to get the server up and running quickly.

> > Is it possible that some corruption that occurred during one such
> > reboots to cause a read only FS later on with the errors i mentioned ?
> > Iam guessing yes, cos force fsck during boot is not forced always cos
> > sometimes a FS check is suggested by ext3 during mounting the disk.
>
> If an fs check is getting suggested by ext3 during the mounting of the
> disk, then your boot scripts must not be set up correctly, or
> /etc/fstab has been set up to disable fsck getting run automatically
> on said filesystem.

The boot scripts are Red Hat's defaults, and fstab has fsck enabled
for the disks in question.

If the boot scripts run fsck on ext3 filesystems that are marked unclean,
then I was wrong in guessing that ext3 filesystems get mounted in such cases
with only a suggestion.

> I would strongly recommend running forced fsck to make sure all of
> your filesystems are clean first; if the errors have been around for a
> while, and for some reason the automatic fsck has been disabled so
> they aren't getting checked, that could be really bad.

Ok. I might as well do the preemptive maintenance just to be sure.

Thanks for the suggestions.

TIA
Dushyanth

All times are GMT.