02-12-2010, 06:35 PM
Philipp Weis

Bug#569598: XFS corruption

Package: linux-source-2.6.26
Version: 2.6.26-21
Severity: normal

Hi,

I just experienced an XFS corruption on one of my machines during a
remote run of rdiff-backup. No indications of I/O errors or hardware
problems. The XFS volume runs on top of an encrypted loop-aes device,
on top of LVM, on top of software RAID 1. I've been using this setup
for years without any problems.

After this, XFS refuses to mount the volume, and xfs_repair runs only
with -L, which zeroes the log. The filesystem is up again now, and I'm
trying to assess the damage. I still have the original corrupted one
as an LVM snapshot.

Here are the syslog messages and the xfs_repair output.

| Feb 11 23:46:07 marvin kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1650 of file fs/xfs/xfs_alloc.c. Caller 0xc01e6284
| Feb 11 23:46:07 marvin kernel: Pid: 1510, comm: rdiff-backup Not tainted 2.6.26 #1
| Feb 11 23:46:07 marvin kernel: [<c01e479f>] xfs_free_ag_extent+0x5ef/0x730
| Feb 11 23:46:07 marvin kernel: [<c01e6284>] xfs_free_extent+0xb4/0xe0
| Feb 11 23:46:07 marvin kernel: [<c01e6284>] xfs_free_extent+0xb4/0xe0
| Feb 11 23:46:07 marvin kernel: [<c01f6bc3>] xfs_bmap_finish+0x123/0x170
| Feb 11 23:46:07 marvin kernel: [<c0219cea>] xfs_itruncate_finish+0x1ea/0x460
| Feb 11 23:46:07 marvin kernel: [<c0235ad5>] xfs_inactive+0x3c5/0x4e0
| Feb 11 23:46:07 marvin kernel: [<c0183fb7>] inotify_inode_is_dead+0x17/0x80
| Feb 11 23:46:07 marvin kernel: [<c0241766>] xfs_fs_clear_inode+0x36/0x70
| Feb 11 23:46:07 marvin kernel: [<c016d455>] clear_inode+0x65/0x140
| Feb 11 23:46:07 marvin kernel: [<c016d9be>] generic_delete_inode+0xde/0xf0
| Feb 11 23:46:07 marvin kernel: [<c016cb44>] iput+0x44/0x50
| Feb 11 23:46:07 marvin kernel: [<c0163a29>] do_unlinkat+0xf9/0x180
| Feb 11 23:46:07 marvin kernel: [<c0170303>] mntput_no_expire+0x13/0xa0
| Feb 11 23:46:07 marvin kernel: [<c0157a27>] filp_close+0x47/0x80
| Feb 11 23:46:07 marvin kernel: [<c0102f4e>] syscall_call+0x7/0xb
| Feb 11 23:46:07 marvin kernel: =======================
| Feb 11 23:46:07 marvin kernel: xfs_force_shutdown(loop0,0x8) called from line 4261 of file fs/xfs/xfs_bmap.c. Return address = 0xc01f6c00
| Feb 11 23:46:07 marvin kernel: Filesystem "loop0": Corruption of in-memory data detected. Shutting down filesystem: loop0
| Feb 11 23:46:07 marvin kernel: Please umount the filesystem, and rectify the problem(s)
| Feb 11 23:46:08 marvin kernel: Filesystem "loop0": xfs_log_force: error 5 returned.
| Feb 11 23:46:20 marvin kernel: Filesystem "loop0": xfs_log_force: error 5 returned.
| Feb 11 23:46:50 marvin kernel: Filesystem "loop0": xfs_log_force: error 5 returned.
[last line repeats every 30 seconds until volume is unmounted]
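
For anyone triaging a similar report, the key facts in the trace above (error kind, source file and line, caller address, and the call chain) can be pulled out mechanically. The following is only a minimal sketch, assuming the log lines have the same shape as the ones quoted here; the function name `summarize_xfs_errors` and both regular expressions are illustrative and not part of any XFS tooling.

```python
import re

# Matches the "XFS internal error" line in the form quoted in this report.
XFS_ERROR_RE = re.compile(
    r"XFS internal error (?P<kind>\S+) at line (?P<line>\d+) "
    r"of file (?P<file>\S+?)\.\s+Caller (?P<caller>0x[0-9a-fA-F]+)"
)
# Matches stack frames such as "[<c01e479f>] xfs_free_ag_extent+0x5ef/0x730".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_xfs_errors(log_text):
    """Return one dict per XFS internal error, including its call trace."""
    errors = []
    current = None
    for line in log_text.splitlines():
        match = XFS_ERROR_RE.search(line)
        if match:
            current = {
                "kind": match.group("kind"),
                "file": match.group("file"),
                "line": int(match.group("line")),
                "caller": match.group("caller"),
                "trace": [],
            }
            errors.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            # Frames are attached to the most recent error seen.
            current["trace"].append(frame.group("symbol"))
    return errors
```

Fed the syslog excerpt above, this would report a single XFS_WANT_CORRUPTED_GOTO error from fs/xfs/xfs_alloc.c with a trace starting at xfs_free_ag_extent.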


| # xfs_repair -L /dev/loop0
| Phase 1 - find and verify superblock...
| Phase 2 - using internal log
| - zero log...
| ALERT: The filesystem has valuable metadata changes in a log which is being
| destroyed because the -L option was used.
| - scan filesystem freespace and inode maps...
| - found root inode chunk
| Phase 3 - for each AG...
| - scan and clear agi unlinked lists...
| - process known inodes and perform inode discovery...
| - agno = 0
| - agno = 1
| - agno = 2
| - agno = 3
| b45f7b90: Badness in key lookup (length)
| bp=(bno 43856128, len 16384 bytes) key=(bno 43856128, len 8192 bytes)
| - agno = 4
| - agno = 5
| - agno = 6
| - agno = 7
| - agno = 8
| - agno = 9
| - agno = 10
| - agno = 11
| - agno = 12
| - agno = 13
| - agno = 14
| - agno = 15
| - agno = 16
| - agno = 17
| - agno = 18
| - agno = 19
| - agno = 20
| - agno = 21
| - agno = 22
| - agno = 23
| - agno = 24
| - agno = 25
| - agno = 26
| - agno = 27
| - agno = 28
| - agno = 29
| - agno = 30
| - agno = 31
| - agno = 32
| - agno = 33
| - process newly discovered inodes...
| Phase 4 - check for duplicate blocks...
| - setting up duplicate extent list...
| - check for inodes claiming duplicate blocks...
| - agno = 0
| - agno = 1
| - agno = 2
| - agno = 3
| - agno = 4
| - agno = 5
| - agno = 6
| - agno = 7
| - agno = 8
| - agno = 9
| - agno = 10
| - agno = 11
| - agno = 12
| - agno = 13
| - agno = 14
| - agno = 15
| - agno = 16
| - agno = 17
| - agno = 18
| - agno = 19
| - agno = 20
| - agno = 21
| - agno = 22
| - agno = 23
| - agno = 24
| - agno = 25
| - agno = 26
| - agno = 27
| - agno = 28
| - agno = 29
| - agno = 30
| - agno = 31
| - agno = 32
| - agno = 33
| Phase 5 - rebuild AG headers and trees...
| - reset superblock...
| Phase 6 - check inode connectivity...
| - resetting contents of realtime bitmap and summary inodes
| - traversing filesystem ...
| - traversal finished ...
| - moving disconnected inodes to lost+found ...
| disconnected inode 149578249, moving to lost+found
| Phase 7 - verify and correct link counts...
| cache_purge: shake on cache 0x82af840 left 1 nodes!?
| done

lost+found has just one empty file.
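
Since most of the xfs_repair output above is routine per-AG progress, a short script can reduce it to the parts that matter. This is a rough sketch under the assumption that the output looks like the transcript quoted here (including the "| " quoting prefix); `summarize_repair_output` is a hypothetical helper, not something shipped with xfsprogs.

```python
import re

PHASE_RE = re.compile(r"^Phase (\d+) - (.+)$")
DISCONNECTED_RE = re.compile(r"disconnected inode (\d+), moving to lost\+found")


def summarize_repair_output(text):
    """Return the phases reached, disconnected inodes, and ALERT lines."""
    summary = {"phases": [], "disconnected_inodes": [], "alerts": []}
    for raw in text.splitlines():
        # Drop the "| " quoting prefix used in the report, if present.
        line = raw.lstrip("| ").strip()
        phase = PHASE_RE.match(line)
        if phase:
            summary["phases"].append(int(phase.group(1)))
            continue
        inode = DISCONNECTED_RE.search(line)
        if inode:
            summary["disconnected_inodes"].append(int(inode.group(1)))
        elif line.startswith("ALERT:"):
            summary["alerts"].append(line)
    return summary
```

For the run above, this would list phases 1 through 7, the single disconnected inode 149578249, and the ALERT about the log being destroyed by -L.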


-- System Information:
Debian Release: 5.0.4
APT prefers stable
APT policy: (600, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.26
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-source-2.6.26 depends on:
ii binutils 2.18.1~cvs20080103-7 The GNU assembler, linker and bina
ii bzip2 1.0.5-1 high-quality block-sorting file co

Versions of packages linux-source-2.6.26 recommends:
ii gcc 4:4.3.2-2 The GNU C compiler
ii libc6-dev [libc-dev] 2.7-18lenny2 GNU C Library: Development Librari
ii make 3.81-5 The GNU version of the "make" util

Versions of packages linux-source-2.6.26 suggests:
ii kernel-package 11.015 A utility for building Linux kerne
ii libncurses5-dev [ncurses- 5.7+20081213-1 developer's libraries and docs for
pn libqt3-mt-dev <none> (no description available)

-- no debconf information



 
02-14-2010, 06:39 PM
Ben Hutchings

On Fri, 2010-02-12 at 14:35 -0500, Philipp Weis wrote:
> Package: linux-source-2.6.26
> Version: 2.6.26-21
> Severity: normal

Since you are using a custom kernel, please send the build config you
used.

> I just experienced an XFS corruption on one of my machines during a
> remote run of rdiff-backup. No indications of I/O errors or hardware
> problems. The XFS volume runs on top of an encrypted loop-aes device,
> on top of LVM, on top of software RAID 1. I've been using this setup
> for years without any problems.
[...]

So how old are the disks now?

Ben.

--
Ben Hutchings
It is easier to change the specification to fit the program than vice versa.
 
02-14-2010, 06:52 PM
Philipp Weis

On 2010-02-14 19:39, Ben Hutchings <ben@decadent.org.uk> wrote:
> On Fri, 2010-02-12 at 14:35 -0500, Philipp Weis wrote:
> > Package: linux-source-2.6.26
> > Version: 2.6.26-21
> > Severity: normal
>
> Since you are using a custom kernel, please send the build config you
> used.

Attached.

> > I just experienced an XFS corruption on one of my machines during a
> > remote run of rdiff-backup. No indications of I/O errors or hardware
> > problems. The XFS volume runs on top of an encrypted loop-aes device,
> > on top of LVM, on top of software RAID 1. I've been using this setup
> > for years without any problems.
> [...]
>
> So how old are the disks now?

The three disks in the 3-way RAID-1 are 4 years, 2 years and 2 months
old. I keep replacing them as they break. There are no I/O errors or
mdadm messages in the syslog, so I don't think that's the issue here.
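
Besides the syslog, the state of a software RAID can be cross-checked from the contents of /proc/mdstat. The sketch below assumes the usual status line format, such as "[3/3] [UUU]", where an underscore marks a missing member; `degraded_arrays` is a made-up helper name and the parsing is deliberately simplistic.

```python
import re

# Array header, e.g. "md0 : active raid1 sdc1[2] sdb1[1] sda1[0]".
ARRAY_RE = re.compile(r"^(md\w+) : ")
# Status, e.g. "[3/3] [UUU]" (expected/active devices, per-device state).
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return names of md arrays with fewer active than expected members."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            expected = int(status.group(1))
            active = int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded
```

It would typically be called as `degraded_arrays(open("/proc/mdstat").read())`. An empty list only means all members are present. It says nothing about silent data corruption on a disk.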

Thanks!

Philipp
 
02-16-2010, 12:54 AM
Ben Hutchings

On Sun, 2010-02-14 at 14:52 -0500, Philipp Weis wrote:
> On 2010-02-14 19:39, Ben Hutchings <ben@decadent.org.uk> wrote:
> > On Fri, 2010-02-12 at 14:35 -0500, Philipp Weis wrote:
> > > Package: linux-source-2.6.26
> > > Version: 2.6.26-21
> > > Severity: normal
> >
> > Since you are using a custom kernel, please send the build config you
> > used.
>
> Attached.

I notice you set CONFIG_PREEMPT_VOLUNTARY. The kernel image packages of
2.6.26 all use CONFIG_PREEMPT_NONE due to concern about the stability of
preemption. (This has been changed for more recent versions.) So it
may be worth changing this option.
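
To see which preemption model a given build config selects, the three mutually exclusive options can be looked up directly. This is a small sketch that assumes the standard .config syntax ("CONFIG_FOO=y" or "# CONFIG_FOO is not set"); the function name `preemption_model` is illustrative only.

```python
# The three choices of the kernel's "Preemption Model" option.
PREEMPT_OPTIONS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT",
)


def preemption_model(config_text):
    """Return the preemption option set to 'y' in a kernel .config, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_PREEMPT_NONE is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Exact name match, so CONFIG_PREEMPT_NOTIFIERS and similar are ignored.
        if name in PREEMPT_OPTIONS and value == "y":
            return name
    return None
```

For the config discussed here, it would be expected to return "CONFIG_PREEMPT_VOLUNTARY". After the suggested change it should return "CONFIG_PREEMPT_NONE".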

When did you upgrade to kernel version 2.6.26? Which version were you
using before?

Ben.

--
Ben Hutchings
Humour is the best antidote to reality.
 
02-16-2010, 01:12 AM
Philipp Weis

On 2010-02-16 01:54, Ben Hutchings <ben@decadent.org.uk> wrote:
> I notice you set CONFIG_PREEMPT_VOLUNTARY. The kernel image packages of
> 2.6.26 all use CONFIG_PREEMPT_NONE due to concern about the stability of
> preemption. (This has been changed for more recent versions.) So it
> may be worth changing this option.

Ok, thanks for the hint, I'll disable preemption for now.

> When did you upgrade to kernel version 2.6.26? Which version were you
> using before?

I was using 2.6.18 before and switched to 2.6.26 sometime between
August and December 2009. Sorry, I can't be more specific.

Philipp
 
