02-15-2012, 06:40 PM
Jonathan Nieder

Bug#602991: kernel crash with null pointer dereference while umounting nfs

Hi,

George Barnett wrote:

> We maintain a large number of OpenVZ containers on several hosts.
> In the course of running these containers, we keep a number of NFS
> mounts which are presented into the OpenVZ containers.
>
> We currently have 3 test machines we are able to test this on. All
> are running the same image, netbooted. The Stack trace below is
> from a 2 x 12 core AMD box, although we see the exact same crash
> with the same cause on the Intel test nodes too (2 x X5650 6 core).
>
> When we stop all the containers quickly on a host, we see the following repeatable crash:
>
> [ 317.100898] CT: 10018: stopped
> [ 317.912269] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 317.916307] IP: [<ffffffff812ea21e>] _spin_lock_bh+0xe/0x25
> [ 317.916307] PGD 100a5d0067 PUD 100a557067 PMD 0
> [ 317.916307] Oops: 0002 [#1] SMP
> [ 317.916307] last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map
[...]
> [ 317.916307] Pid: 7838, comm: umount Not tainted 2.6.32-5-openvz-amd64 #1 dyomin H8DGU
> [ 317.916307] RIP: 0010:[<ffffffff812ea21e>] [<ffffffff812ea21e>] _spin_lock_bh+0xe/0x25
[...]
> [ 317.916307] Call Trace:
> [ 317.916307] [<ffffffffa01b97a4>] ? rpc_wake_up_queued_task+0x12/0x29 [sunrpc]
> [ 317.916307] [<ffffffffa01b9835>] ? rpc_killall_tasks+0x7a/0x9b [sunrpc]
> [ 317.916307] [<ffffffffa0217fed>] ? nfs_umount_begin+0x34/0x3a [nfs]
> [ 317.916307] [<ffffffff81106844>] ? sys_umount+0x11b/0x2e6
> [ 317.916307] [<ffffffff812ec6a5>] ? do_page_fault+0x2e0/0x2fc
> [ 317.916307] [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
> [ 317.916307] Code: e9 ff 5b c3 53 48 89 fb e8 a6 a4 d6 ff 48 89 df
> f0 83 2f 01 79 05 e8 42 73 e9 ff 5b c3 53 48 89 fb e8 8d a4 d6 ff b8
> 00 00 01 00 <f0> 0f c1 03 0f b7 d0 c1 e8 10 39 c2 74 07 f3 90 0f b7 13
> eb f5
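
For readers following the trace, here is a minimal sketch (not part of the original report) of how the oops above could be read mechanically. It assumes the x86 page-fault error-code bit layout and the 2.6.32-era call-trace line format seen in the quote; the names `parse_oops`, `decode_error_code`, and the `"vmlinux"` label for non-module frames are hypothetical choices made for illustration.

```python
import re

# Excerpt of the oops quoted above, with timestamps and "> " prefixes removed.
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff812ea21e>] _spin_lock_bh+0xe/0x25
Oops: 0002 [#1] SMP
Call Trace:
 [<ffffffffa01b97a4>] ? rpc_wake_up_queued_task+0x12/0x29 [sunrpc]
 [<ffffffffa01b9835>] ? rpc_killall_tasks+0x7a/0x9b [sunrpc]
 [<ffffffffa0217fed>] ? nfs_umount_begin+0x34/0x3a [nfs]
 [<ffffffff81106844>] ? sys_umount+0x11b/0x2e6
"""

# Low bits of the x86 page-fault error code: (mask, meaning if set, meaning if clear).
FAULT_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
]

# Matches e.g. "[<ffffffffa01b9835>] ? rpc_killall_tasks+0x7a/0x9b [sunrpc]".
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def decode_error_code(code):
    """Describe the three low bits of a page-fault error code."""
    return [on if code & mask else off for mask, on, off in FAULT_BITS]


def parse_oops(text):
    """Return (error_code, [(function, module), ...]) from oops text."""
    error_code = None
    frames = []
    in_trace = False
    for line in text.splitlines():
        m = re.search(r"Oops:\s+([0-9a-f]{4})", line)
        if m:
            error_code = int(m.group(1), 16)
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            f = FRAME_RE.search(line)
            if f:
                # Frames without a [module] suffix are labeled "vmlinux" here.
                frames.append((f.group(3), f.group(4) or "vmlinux"))
    return error_code, frames


if __name__ == "__main__":
    code, frames = parse_oops(OOPS)
    print(decode_error_code(code))
    for func, module in frames:
        print(f"{module}: {func}")
```

Under those assumptions, error code 0002 decodes as a kernel-mode write to a not-present page, and the frames run from `sys_umount` through `nfs_umount_begin` and `rpc_killall_tasks` into `rpc_wake_up_queued_task`, which is consistent with `_spin_lock_bh` being handed a NULL lock pointer during the forced unmount.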

Thanks for a clear report. Do you still have access to these systems?
If so, can you still reproduce this?

If this bug is still present, our best bet is probably to get help
from openvz upstream, which might involve trying a different
(alienized RHEL) or newer (3.x.y) kernel.

Sorry for the trouble,
Jonathan



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20120215193927.GA23759@burratino
 
02-15-2012, 11:14 PM
George Barnett

Hi Jonathan,

We ended up moving to blessed openvz kernels on Centos 5.5 after hitting a few more bugs in debian openvz. As such, I no longer have any systems I could reproduce this on. Sorry I'm not able to be of help.

Cheers,

George


--
Archive: http://lists.debian.org/0328E75A-EC66-45AE-8067-B6DFD78B7CBE@atlassian.com
 
02-15-2012, 11:27 PM
Jonathan Nieder

tags 602991 + unreproducible
quit

George Barnett wrote:

> We ended up moving to blessed openvz kernels on Centos 5.5 after
> hitting a few more bugs in debian openvz. As such, I no longer have
> any systems I could reproduce this on. Sorry I'm not able to be of
> help.

No problem. Thanks for the update.

Ola, do the symptoms below look familiar to you? Kernel was 2.6.32-27.

> We maintain a large number of OpenVZ containers on several hosts.
> In the course of running these containers, we keep a number of NFS
> mounts which are presented into the OpenVZ containers.
>
> We currently have 3 test machines we are able to test this on. All
> are running the same image, netbooted. The Stack trace below is
> from a 2 x 12 core AMD box, although we see the exact same crash
> with the same cause on the Intel test nodes too (2 x X5650 6 core).
>
> When we stop all the containers quickly on a host, we see the following repeatable crash:
>
> [ 317.100898] CT: 10018: stopped
> [ 317.912269] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 317.916307] IP: [<ffffffff812ea21e>] _spin_lock_bh+0xe/0x25
> [ 317.916307] PGD 100a5d0067 PUD 100a557067 PMD 0
> [ 317.916307] Oops: 0002 [#1] SMP
> [ 317.916307] last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map
[...]
> [ 317.916307] Pid: 7838, comm: umount Not tainted 2.6.32-5-openvz-amd64 #1 dyomin H8DGU
> [ 317.916307] RIP: 0010:[<ffffffff812ea21e>] [<ffffffff812ea21e>] _spin_lock_bh+0xe/0x25
[...]
> [ 317.916307] Call Trace:
> [ 317.916307] [<ffffffffa01b97a4>] ? rpc_wake_up_queued_task+0x12/0x29 [sunrpc]
> [ 317.916307] [<ffffffffa01b9835>] ? rpc_killall_tasks+0x7a/0x9b [sunrpc]
> [ 317.916307] [<ffffffffa0217fed>] ? nfs_umount_begin+0x34/0x3a [nfs]
> [ 317.916307] [<ffffffff81106844>] ? sys_umount+0x11b/0x2e6
> [ 317.916307] [<ffffffff812ec6a5>] ? do_page_fault+0x2e0/0x2fc
> [ 317.916307] [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
> [ 317.916307] Code: e9 ff 5b c3 53 48 89 fb e8 a6 a4 d6 ff 48 89 df
> f0 83 2f 01 79 05 e8 42 73 e9 ff 5b c3 53 48 89 fb e8 8d a4 d6 ff b8
> 00 00 01 00 <f0> 0f c1 03 0f b7 d0 c1 e8 10 39 c2 74 07 f3 90 0f b7 13
> eb f5



--
Archive: http://lists.debian.org/20120216002659.GB29709@burratino
 
02-16-2012, 12:07 AM
Steven Chamberlain

On 16/02/12 00:27, Jonathan Nieder wrote:
> George Barnett wrote:
>> [ 317.916307] Pid: 7838, comm: umount Not tainted 2.6.32-5-openvz-amd64 #1 dyomin H8DGU
>> [ 317.916307] RIP: 0010:[<ffffffff812ea21e>] [<ffffffff812ea21e>] _spin_lock_bh+0xe/0x25

>> [ 317.916307] [<ffffffffa01b97a4>] ? rpc_wake_up_queued_task+0x12/0x29 [sunrpc]
>> [ 317.916307] [<ffffffffa01b9835>] ? rpc_killall_tasks+0x7a/0x9b [sunrpc]
>> [ 317.916307] [<ffffffffa0217fed>] ? nfs_umount_begin+0x34/0x3a [nfs]
>> [ 317.916307] [<ffffffff81106844>] ? sys_umount+0x11b/0x2e6
>> [ 317.916307] [<ffffffff812ec6a5>] ? do_page_fault+0x2e0/0x2fc
>> [ 317.916307] [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b

Hi,

FWIW, I found NFS to be very buggy before the 'feoktistov' version of
the OpenVZ patchset (introduced in linux-2.6 2.6.32-31); since that
version I've had no problems of this nature, and I use nfs quite heavily
between OpenVZ containers.

The 'dyomin' version mentioned above was based on 2.6.32.22 which I
believe had some NFS issues not even specific to OpenVZ, such as
kernel.org BZ#24302, and another mentioned in Debian's changelog for
2.6.32-31.
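
As a rough illustration of the version reasoning above, the sketch below checks whether a package version predates 2.6.32-31. It is a simplification that only handles plain `X.Y.Z-N` strings; the function names are hypothetical, and real Debian version ordering (epochs, `~`, suffixes such as `+squeeze1`) should be checked with `dpkg --compare-versions` instead.

```python
def parse_simple_version(version):
    """Split 'X.Y.Z-N' into ((X, Y, Z), N). Assumes purely numeric parts."""
    upstream, _, revision = version.rpartition("-")
    return tuple(int(part) for part in upstream.split(".")), int(revision)


def predates(version, fixed_in="2.6.32-31"):
    """True if `version` sorts before `fixed_in` under the simple scheme above."""
    return parse_simple_version(version) < parse_simple_version(fixed_in)


if __name__ == "__main__":
    # 2.6.32-27 is the kernel version named earlier in this thread.
    print(predates("2.6.32-27"))
    print(predates("2.6.32-31"))
```

With these assumptions, the 2.6.32-27 kernel from the report sorts before 2.6.32-31, so it would not include the 'feoktistov' patchset.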

Hope that helps,
Regards,
--
Steven Chamberlain
steven@pyro.eu.org



--
Archive: http://lists.debian.org/4F3C56E8.1010300@pyro.eu.org
 
02-16-2012, 05:21 AM
Ola Lundqvist

Great.

I was just about to say (almost) the same thing: NFS has been rather
buggy. So the approach that Steven proposed is probably the best way to go.

// Ola

On Wed, Feb 15, 2012 at 07:13:28PM -0600, Jonathan Nieder wrote:
> Version: 2.6.32-31
>
> Steven Chamberlain wrote:
> > On 16/02/12 00:27, Jonathan Nieder wrote:
> >> George Barnett wrote:
>
> >>> [ 317.916307] Pid: 7838, comm: umount Not tainted 2.6.32-5-openvz-amd64 #1 dyomin H8DGU
> >>> [ 317.916307] RIP: 0010:[<ffffffff812ea21e>] [<ffffffff812ea21e>] _spin_lock_bh+0xe/0x25
> >
> >>> [ 317.916307] [<ffffffffa01b97a4>] ? rpc_wake_up_queued_task+0x12/0x29 [sunrpc]
> >>> [ 317.916307] [<ffffffffa01b9835>] ? rpc_killall_tasks+0x7a/0x9b [sunrpc]
> >>> [ 317.916307] [<ffffffffa0217fed>] ? nfs_umount_begin+0x34/0x3a [nfs]
> >>> [ 317.916307] [<ffffffff81106844>] ? sys_umount+0x11b/0x2e6
> >>> [ 317.916307] [<ffffffff812ec6a5>] ? do_page_fault+0x2e0/0x2fc
> >>> [ 317.916307] [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
> >
> > Hi,
> >
> > FWIW, I found NFS to be very buggy before the 'feoktistov' version of
> > the OpenVZ patchset (introduced in linux-2.6 2.6.32-31); since that
> > version I've had no problems of this nature, and I use nfs quite heavily
> > between OpenVZ containers.
>
> Thanks, Steven. Let's go with that. ;-)
>

--
--- Inguza Technology AB --- MSc in Information Technology ----
/ ola@inguza.com Annebergsslingan 37
| opal@debian.org 654 65 KARLSTAD |
| http://inguza.com/ Mobile: +46 (0)70-332 1551 |
gpg/f.p.: 7090 A92B 18FE 7994 0C36 4FE4 18A1 B1CF 0FE5 3DD9 /
---------------------------------------------------------------



--
Archive: http://lists.debian.org/20120216062150.GA7376@inguza.net
 
