Aborted journal and kernel bug on RHEL AP 5.1 on Sun AMD 64-bit (X4200 M2)
Hi,

I have run into a strange situation on my servers: Sun X4200 M2 machines
running Red Hat Enterprise Linux Advanced Platform 5.1:

Linux fea.localdomain 2.6.18-53.el5 #1 SMP Wed Oct 10 16:34:19 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

This happens on both internal and external disks (Hitachi AMS 200
storage, Emulex HBA, and Hitachi HDLM software for multipath).

After the problem occurs I am unable to use the server because files on
the root file system are corrupted:
-rwxr-xr-x 1 root root  14096 Sep  5  2007 rmmod
-rwxr-xr-x 1 root root 521552 Aug  7  2006 rmt
-rwxr-xr-x 1 root root  14648 Jul 13  2006 rngd
-rwxr-xr-x 1 root root  57920 Aug  7  2006 route
-rwxr-xr-x 1 root root   5904 Sep 25  2007 rpc.lockd
-rwxr-xr-x 1 root root  49352 Sep 25  2007 rpc.statd
?--------- ? ?    ?           ?            ? rrestore
?--------- ? ?    ?           ?            ? rrestore.static
-rwxr-xr-x 1 root root  29976 Jan  9  2007 rtmon
-rwxr-xr-x 1 root root   7736 Oct 13  2006 runlevel
-rwxr-xr-x 1 root root  30840 Nov 27  2006 runuser
-rwxr-xr-x 1 root root  10376 Aug 17  2007 salsa
[root@fea sbin]#
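
The entries shown with question marks are the ones whose metadata ls could not read. As a rough sketch only (the function name unreadable_entries is made up for illustration, and this is not a substitute for e2fsck), something like the following Python could list every such entry in a directory:

```python
import os

def unreadable_entries(directory):
    """Return the names in `directory` whose metadata cannot be read.

    These are the entries that `ls -l` prints with question marks:
    the name is present in the directory, but lstat() on it fails.
    """
    broken = []
    for name in sorted(os.listdir(directory)):
        try:
            os.lstat(os.path.join(directory, name))
        except OSError:
            # The directory entry exists but its inode cannot be read.
            broken.append(name)
    return broken
```

Calling unreadable_entries("/sbin") on the affected server would be expected to return the damaged names; on a healthy directory it returns an empty list.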

The file systems are also mounted read-only.
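
To confirm which file systems were switched to read-only, the contents of /proc/mounts can be checked. This is a minimal sketch, assuming the usual "device mountpoint fstype options dump pass" layout of that file; the function name read_only_mounts and the sample device names in the usage note are hypothetical:

```python
def read_only_mounts(mounts_text, fstype="ext3"):
    """Return (device, mountpoint) pairs of `fstype` file systems mounted read-only.

    `mounts_text` is the text of /proc/mounts, one mount per line:
    device mountpoint fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, kind, options = fields[:4]
        # Mount options are comma-separated; "ro" marks a read-only mount.
        if kind == fstype and "ro" in options.split(","):
            result.append((device, mountpoint))
    return result
```

On the server it could be called as read_only_mounts(open("/proc/mounts").read()).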

The following is an excerpt from the messages file:
Jul 11 16:11:15 fea clurgmgrd[4739]: <notice> Service service:appl-dfdd is disabled
Jul 11 16:29:56 fea clurgmgrd[4739]: <notice> Stopping service service:db-dfdd
Jul 11 16:30:00 fea avahi-daemon[4622]: Withdrawing address record for 10.40.3.40 on eth1.
Jul 11 16:30:11 fea dlm_controld[4281]: uevent message has 3 args
Jul 11 16:30:11 fea clurgmgrd[4739]: <notice> Service service:db-dfdd is disabled
Jul 11 16:31:44 fea clurgmgrd[4739]: <notice> Starting disabled service service:db-dfdd
Jul 11 16:31:44 fea kernel: kjournald starting.  Commit interval 5 seconds
Jul 11 16:31:44 fea kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Jul 11 16:31:44 fea kernel: EXT3 FS on sddlmab, internal journal
Jul 11 16:31:44 fea kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jul 11 16:31:44 fea dlm_controld[4281]: uevent message has 3 args
Jul 11 16:31:44 fea avahi-daemon[4622]: Registering new address record for 10.40.3.40 on eth1.
Jul 11 16:31:48 fea clurgmgrd[4739]: <notice> Service service:db-dfdd started
Jul 11 16:40:23 fea clurgmgrd[4739]: <notice> Stopping service service:db-dfdd
Jul 11 16:40:25 fea avahi-daemon[4622]: Withdrawing address record for 10.40.3.40 on eth1.
Jul 11 16:40:35 fea dlm_controld[4281]: uevent message has 3 args
Jul 11 16:40:35 fea clurgmgrd[4739]: <notice> Service service:db-dfdd is disabled
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382976
Jul 11 17:13:01 fea kernel: Aborting journal on device dm-0.
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382977
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382978
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382979
Jul 11 17:13:01 fea kernel: EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared for block 382980
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_orphan_del: Journal has aborted
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0) in ext3_truncate: Journal has aborted
Jul 11 17:13:02 fea kernel: ext3_abort called.
Jul 11 17:13:02 fea kernel: EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
Jul 11 17:13:02 fea kernel: Remounting filesystem read-only
Jul 11 17:27:30 fea clurgmgrd[4739]: <info> State change: feb.iride DOWN
Jul 11 17:27:30 fea clurgmgrd[4739]: <info> State change: /dev/sddlmac UP
Jul 11 17:27:30 fea clurgmgrd[4739]: <info> Waiting for node #2 to be fenced
Jul 11 17:28:50 fea qdiskd[4191]: <info> Node 2 shutdown
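
To see at a glance which device and which ext3 function reported errors, the messages file can be summarised with a short script. This is only a sketch: it assumes each log record sits on a single line, as it does in /var/log/messages itself, and the names EXT3_ERROR and count_ext3_errors are made up for illustration.

```python
import re
from collections import Counter

# Matches both forms of error line seen in the excerpt above:
#   EXT3-fs error (device dm-0): ext3_free_blocks_sb: bit already cleared ...
#   EXT3-fs error (device dm-0) in ext3_truncate: Journal has aborted
EXT3_ERROR = re.compile(r"EXT3-fs error \(device ([^)]+)\)(?::| in) (\w+):")

def count_ext3_errors(lines):
    """Count EXT3-fs error lines per (device, reporting function)."""
    counts = Counter()
    for line in lines:
        match = EXT3_ERROR.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

For example, count_ext3_errors(open("/var/log/messages")) returns a Counter keyed by pairs such as ("dm-0", "ext3_free_blocks_sb").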

There is also a kernel bug:
Jul  9 16:57:13 fea syslogd 1.4.1: restart.
/trace
Jul 10 17:41:09 fea kernel: EXT3-fs warning (device sddlmaa): ext3_unlink: Deleting nonexistent file (13353077), 0
Jul 10 18:20:04 fea dlm_controld[4260]: uevent message has 3 args
Jul 10 18:20:04 fea kernel: sb orphan head is 13353077
Jul 10 18:20:04 fea kernel: sb_info orphan list:
Jul 10 18:20:04 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 59479 times
Jul 10 18:20:13 fea kernel: BUG: soft lockup detected on CPU#1!
Jul 10 18:20:13 fea kernel:
Jul 10 18:20:13 fea kernel: Call Trace:
Jul 10 18:20:13 fea kernel:  <IRQ>  [<ffffffff800b50fa>] softlockup_tick+0xd5/0xe7
Jul 10 18:20:13 fea kernel:  [<ffffffff800930e2>] update_process_times+0x42/0x68
Jul 10 18:20:13 fea kernel:  [<ffffffff800746e3>] smp_local_timer_interrupt+0x23/0x47
Jul 10 18:20:13 fea kernel:  [<ffffffff80074da5>] smp_apic_timer_interrupt+0x41/0x47
Jul 10 18:20:13 fea kernel:  [<ffffffff8005bc8e>] apic_timer_interrupt+0x66/0x6c
Jul 10 18:20:13 fea kernel:  <EOI>  [<ffffffff8008d4b6>] vprintk+0x29e/0x2ea
Jul 10 18:20:13 fea kernel:  [<ffffffff8008d554>] printk+0x52/0xbd
Jul 10 18:20:13 fea kernel:  [<ffffffff80061a3f>] out_of_line_wait_on_bit+0x6c/0x78
Jul 10 18:20:13 fea kernel:  [<ffffffff880564f4>] :ext3:ext3_put_super+0x13e/0x1e0
Jul 10 18:20:13 fea kernel:  [<ffffffff800d8e1e>] generic_shutdown_super+0x79/0xfb
Jul 10 18:20:13 fea kernel:  [<ffffffff800d8ec6>] kill_block_super+0x26/0x3a
Jul 10 18:20:13 fea kernel:  [<ffffffff800d8f94>] deactivate_super+0x6a/0x82
Jul 10 18:20:13 fea kernel:  [<ffffffff800e1d13>] sys_umount+0x245/0x27b
Jul 10 18:20:13 fea kernel:  [<ffffffff800b27ae>] audit_syscall_entry+0x14d/0x180
Jul 10 18:20:13 fea kernel:  [<ffffffff8005b28d>] tracesys+0xd5/0xe0
Jul 10 18:20:13 fea kernel:
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 50 times
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink , nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel:   in, nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel:   in, nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel:   in, nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times
Jul 10 18:20:13 fea kernel:   in, nlink 1, next 0
Jul 10 18:20:13 fea kernel:   inode dm-0:1010899 at ffff8100df1f3448: mode 100555, nlink 1, next 0
Jul 10 18:20:13 fea last message repeated 54 times

I am planning to reinstall the server.

Can anybody help me?
Thanks a lot,
Fabio
--------------------------------------------
INFORMATIVA SULLA PRIVACY
Ai sensi del D.Lgs. 196/2003 si precisa che le informazioni contenute
in questo messaggio e nei suoi eventuali allegati sono riservate e per
uso esclusivo del destinatario. Nessuno, all'infuori dello stesso,
può copiare o distribuire il messaggio, o parte di esso, a terzi.
Chiunque riceva questo messaggio per errore è pregato di distruggerlo
e di informare il mittente.
PRIVACY NOTICE
According to the D.Lgs. 196/2003 this document and its attachments are
confidential and intended for the named addressee(s) only. If you are
not the intended recipient of this message, any use or dissemination
of this message is prohibited. If you have received this document by
mistake, please notify the sender and destroy all physical and/or
electronic copies.
--------------------------------------------
_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users