11-23-2009, 12:25 AM
Philip Manuel

NFS4 issue

We are running kernel 2.6.18-164.6.1.el5 and exporting 3 AoE-provided
ext4 directories. For a couple of weeks a small number of users used
the system with no issues. Today we added 7 users and the system
crashed, and it has not performed correctly since.

Nov 23 10:20:03 sulphur rpc.idmapd[5199]: nfsdcb: id '-2' too big!
Nov 23 10:42:25 sulphur nfsd[27306]: nfssvc: Setting version failed:
errno 16 (Device or resource busy)
Nov 23 10:42:25 sulphur nfsd[27306]: nfssvc: unable to bind UPD socket:
errno 98 (Address already in use)
Nov 23 10:42:26 sulphur kernel: slab error in kmem_cache_destroy():
cache `nfsd4_files': Can't free all objects
Nov 23 10:42:26 sulphur kernel: [<ffffffff88645efd>]
:nfsd:nfsd4_free_slab+0x11/0x4d
Nov 23 10:42:26 sulphur kernel: [<ffffffff88645f55>]
:nfsd:nfsd4_free_slabs+0x1c/0x33
Nov 23 10:42:26 sulphur kernel: [<ffffffff88646ecb>]
:nfsd:nfs4_state_shutdown+0x17e/0x18a
Nov 23 10:42:26 sulphur kernel: [<ffffffff88630570>]
:nfsd:nfsd_last_thread+0x45/0x76
Nov 23 10:42:26 sulphur kernel: [<ffffffff88630856>] :nfsd:nfsd+0x2b5/0x2cb
Nov 23 10:42:26 sulphur kernel: [<ffffffff886305a1>] :nfsd:nfsd+0x0/0x2cb
Nov 23 10:42:26 sulphur kernel: [<ffffffff886305a1>] :nfsd:nfsd+0x0/0x2cb
Nov 23 10:42:26 sulphur kernel: BUG: warning at
fs/nfsd/nfs4state.c:1016/nfsd4_free_slab() (Tainted: G )
Nov 23 10:42:26 sulphur kernel: [<ffffffff88645f55>]
:nfsd:nfsd4_free_slabs+0x1c/0x33
Nov 23 10:42:26 sulphur kernel: [<ffffffff88646ecb>]
:nfsd:nfs4_state_shutdown+0x17e/0x18a
Nov 23 10:42:26 sulphur kernel: [<ffffffff88630570>]
:nfsd:nfsd_last_thread+0x45/0x76
Nov 23 10:42:26 sulphur kernel: [<ffffffff88630856>] :nfsd:nfsd+0x2b5/0x2cb
Nov 23 10:42:26 sulphur kernel: [<ffffffff886305a1>] :nfsd:nfsd+0x0/0x2cb
Nov 23 10:42:26 sulphur kernel: [<ffffffff886305a1>] :nfsd:nfsd+0x0/0x2cb
Nov 23 10:42:26 sulphur kernel: slab error in kmem_cache_destroy():
cache `nfsd4_delegations': Can't free all objects
Nov 23 10:42:26 sulphur kernel: [<ffffffff88645efd>]
:nfsd:nfsd4_free_slab+0x11/0x4d
Nov 23 10:42:26 sulphur kernel: [<ffffffff88646ecb>]
:nfsd:nfs4_state_shutdown+0x17e/0x18a
Nov 23 10:42:26 sulphur kernel: [<ffffffff88630570>]
:nfsd:nfsd_last_thread+0x45/0x76
Nov 23 10:42:26 sulphur kernel: [<ffffffff88630856>] :nfsd:nfsd+0x2b5/0x2cb
Nov 23 10:42:26 sulphur kernel: [<ffffffff886305a1>] :nfsd:nfsd+0x0/0x2cb
Nov 23 10:42:26 sulphur kernel: [<ffffffff886305a1>]
:nfsd:nfsd+0x0/0x2cb
Nov 23 10:42:26 sulphur kernel: BUG: warning at
fs/nfsd/nfs4state.c:1016/nfsd4_free_slab() (Tainted: G )
Nov 23 10:42:26 sulphur kernel: [<ffffffff88646ecb>]
:nfsd:nfs4_state_shutdown+0x17e/0x18a
Nov 23 10:42:26 sulphur kernel: [<ffffffff88630570>]
:nfsd:nfsd_last_thread+0x45/0x76
Nov 23 10:42:26 sulphur kernel: [<ffffffff88630856>]
:nfsd:nfsd+0x2b5/0x2cb
Nov 23 10:42:26 sulphur kernel: [<ffffffff886305a1>]
:nfsd:nfsd+0x0/0x2cb
Nov 23 10:42:26 sulphur kernel: [<ffffffff886305a1>]
:nfsd:nfsd+0x0/0x2cb
Nov 23 10:42:26 sulphur kernel: nfsd: last server has exited
Nov 23 10:42:26 sulphur kernel: nfsd: unexporting all filesystems
Nov 23 10:42:44 sulphur kernel: kmem_cache_create: duplicate cache
nfsd4_files
Nov 23 10:42:44 sulphur kernel: [<ffffffff88646f29>]
:nfsd:nfs4_state_start+0x52/0x18f
Nov 23 10:42:44 sulphur kernel: [<ffffffff886303ae>]
:nfsd:nfsd_svc+0x6c/0x1e9
Nov 23 10:42:44 sulphur kernel: [<ffffffff88630f8e>]
:nfsd:write_threads+0x0/0xa9
Nov 23 10:42:44 sulphur kernel: [<ffffffff88630ffd>]
:nfsd:write_threads+0x6f/0xa9
Nov 23 10:42:44 sulphur kernel: [<ffffffff88630f8e>]
:nfsd:write_threads+0x0/0xa9
Nov 23 10:42:44 sulphur kernel: [<ffffffff88630d59>]
:nfsd:nfsctl_transaction_write+0x42/0x77
Nov 23 10:42:44 sulphur nfsd[27369]: nfssvc: Cannot allocate memory
Nov 23 10:43:55 sulphur nfsd[27495]: nfssvc: Setting version failed:
errno 16 (Device or resource busy)
Nov 23 10:43:55 sulphur nfsd[27495]: nfssvc: unable to bind UPD socket:
errno 98 (Address already in use)

The above shows the original problem, followed by my attempts to restart
the service; eventually I had to reboot the server. Since then it has
been behaving bizarrely: it runs for about 5 minutes and then stops, and
after a restart it runs for a while and then stops again.
Nov 23 11:04:46 sulphur kernel: NFSD: Using /var/lib/nfs/v4recovery as
the NFSv4 state recovery directory
Nov 23 11:17:02 sulphur rpc.idmapd[8178]: nfsdcb: id '-2' too big!
Nov 23 11:29:01 sulphur kernel: nfsd: last server has exited
Nov 23 11:29:01 sulphur kernel: nfsd: unexporting all filesystems
Nov 23 11:29:08 sulphur kernel: NFSD: Using /var/lib/nfs/v4recovery as
the NFSv4 state recovery directory
Nov 23 11:29:08 sulphur rpc.idmapd[8178]: nfsdcb: id '-2' too big!
Nov 23 11:32:03 sulphur kernel: nfsd: last server has exited
Nov 23 11:32:03 sulphur kernel: nfsd: unexporting all filesystems
Nov 23 11:32:34 sulphur kernel: NFSD: Using /var/lib/nfs/v4recovery as
the NFSv4 state recovery directory
Nov 23 11:32:34 sulphur rpc.idmapd[8178]: nfsdcb: id '-2' too big!
Nov 23 11:41:58 sulphur kernel: nfsd: last server has exited
Nov 23 11:41:58 sulphur kernel: nfsd: unexporting all filesystems
Nov 23 11:42:03 sulphur kernel: NFSD: Using /var/lib/nfs/v4recovery as
the NFSv4 state recovery directory
Nov 23 11:42:03 sulphur rpc.idmapd[8178]: nfsdcb: id '-2' too big!
Nov 23 11:47:20 sulphur kernel: nfsd: last server has exited
Nov 23 11:47:20 sulphur kernel: nfsd: unexporting all filesystems
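
To see how long nfsd stayed up between restarts, the start and stop
lines from the excerpt above can be paired up. This is only a rough
sketch: it assumes that each "NFSD: Using ... recovery directory" line
marks a start and each "last server has exited" line marks a stop, and
the function name and the year argument are made up for illustration
(syslog timestamps carry no year).

```python
from datetime import datetime

# Start/stop lines copied from the excerpt above. Only the first physical
# line of each wrapped message is kept, since it carries the timestamp.
LOG = """\
Nov 23 11:04:46 sulphur kernel: NFSD: Using /var/lib/nfs/v4recovery as
Nov 23 11:29:01 sulphur kernel: nfsd: last server has exited
Nov 23 11:29:08 sulphur kernel: NFSD: Using /var/lib/nfs/v4recovery as
Nov 23 11:32:03 sulphur kernel: nfsd: last server has exited
Nov 23 11:32:34 sulphur kernel: NFSD: Using /var/lib/nfs/v4recovery as
Nov 23 11:41:58 sulphur kernel: nfsd: last server has exited
Nov 23 11:42:03 sulphur kernel: NFSD: Using /var/lib/nfs/v4recovery as
Nov 23 11:47:20 sulphur kernel: nfsd: last server has exited
"""


def run_lengths(log_text, year=2009):
    """Return the seconds between each assumed nfsd start and the next exit."""
    runs = []
    started = None
    for line in log_text.splitlines():
        try:
            # The first 15 characters are the syslog timestamp, "Nov 23 11:04:46".
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # continuation line or anything without a timestamp
        if "NFSD: Using" in line:
            started = stamp  # assumption: this message marks a (re)start
        elif "last server has exited" in line and started is not None:
            runs.append(int((stamp - started).total_seconds()))
            started = None
    return runs


if __name__ == "__main__":
    for seconds in run_lengths(LOG):
        print(f"nfsd ran for {seconds // 60}m{seconds % 60:02d}s")
```

The run lengths show whether the stops are periodic or irregular, which
may help narrow down what triggers them.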

I haven't found any reports of issues involving the "nfsdcb: id '-2' too
big!" message, and I don't know what it means either.

On the console we are seeing loads of these messages:

kernel: NFSD: preprocess_seqid_op: magic stateid!

Again, I don't know what this message means or what its implications are.

Any suggestions would be welcome.

At the moment we are up and running, with two users migrated back to the
old servers.

Thanks

Phil.
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
11-23-2009, 04:00 AM
Philip Manuel

NFS4 issue

Just a quick update: 4 hours later, the message

kernel: NFSD: preprocess_seqid_op: magic stateid!

has stopped appearing. Now to find out why.

Thanks


 
11-23-2009, 10:01 PM
Giovanni Tirloni

NFS4 issue

On Mon, Nov 23, 2009 at 3:00 AM, Philip Manuel <phil@zomojo.com> wrote:
>
>
> Philip Manuel wrote:
>> We are running kernel 2.6.18-164.6.1.el5 with exporting 3 aoe provided
>> ext4 directories. For a couple of weeks we had a small number of users
>> using the system with no issues, today we added 7 users and the system
>> crashed and did not perform correctly since.
>>
>> Nov 23 10:20:03 sulphur rpc.idmapd[5199]: nfsdcb: id '-2' too big!
>> Nov 23 10:42:25 sulphur nfsd[27306]: nfssvc: Setting version failed:
>> errno 16 (Device or resource busy)

Check your nfsnobody user and try changing its id to something below
65536, on client and server.

http://www.fedoraforum.org/forum/archive/index.php/t-134487.html
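
A quick way to check this on each machine is sketched below. Note the
assumptions: the thread does not say what uid nfsnobody actually has on
your server, so 4294967294 is used purely as an illustration, being the
unsigned 32-bit value that reads as -2 when treated as signed. The
helper names are made up for this sketch.

```python
import pwd
import struct


def as_signed32(uid):
    """Reinterpret an unsigned 32-bit id as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", uid))[0]


def check_account(name="nfsnobody", limit=65536):
    """Return (uid, uid_is_below_limit) for a local account, or None if absent."""
    try:
        uid = pwd.getpwnam(name).pw_uid
    except KeyError:
        return None  # no such account on this machine
    return uid, uid < limit


if __name__ == "__main__":
    # Illustrative value only: this is what -2 looks like as an unsigned id.
    print(as_signed32(4294967294))
    # Result for the real local account, if one exists.
    print(check_account())
```

Run it on the server and on a client; if the second line reports a uid
that is not below 65536, that is the account to look at.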

--
Giovanni.
 
11-23-2009, 10:18 PM
Philip Manuel

NFS4 issue

That's a little confusing. Does that mean all the clients need to change
as well as the server? Has no one else hit this issue? We are running
all our clients and servers on x86_64.

Thanks

Phil

 
