01-27-2009, 09:44 AM
Chrissie Caulfield

dlm: Allow large nodeids

This patch changes DLM to use its own hash table rather than the idr_
code. It should allow clusters with large nodeids to work correctly.

It also fixes the 1..max_nodeid loops when the DLM shuts down.

This is a slightly different patch to the one I posted to IRC yesterday,
but all I've changed is the use of list_for_each_entry() rather than just
list_for_each().
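
The approach described here, a private hash table keyed by nodeid plus a teardown that walks the buckets instead of counting from 1 to max_nodeid, can be sketched in user-space C. This is only an illustration under stated assumptions: every name below (struct conn, CONN_HASH_SIZE, nodeid_hash, conn_find, conn_get, conn_count) is hypothetical, the bucket count is arbitrary, and none of it is taken from the patch, which is not attached to this archive copy.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bucket count; must be a power of two for the mask below. */
#define CONN_HASH_SIZE 32

/* Minimal stand-in for a per-node connection record (not the DLM struct). */
struct conn {
    int nodeid;
    struct conn *next;              /* chain within one bucket */
};

static struct conn *conn_hash[CONN_HASH_SIZE];

/* Map any nodeid, however large, onto a bucket index. */
static unsigned int nodeid_hash(int nodeid)
{
    assert((CONN_HASH_SIZE & (CONN_HASH_SIZE - 1)) == 0);
    return (unsigned int)nodeid & (CONN_HASH_SIZE - 1);
}

/* Return the record for nodeid, or NULL if there is none. */
static struct conn *conn_find(int nodeid)
{
    struct conn *c;

    for (c = conn_hash[nodeid_hash(nodeid)]; c != NULL; c = c->next)
        if (c->nodeid == nodeid)
            return c;
    return NULL;
}

/* Look up nodeid, allocating and inserting a record on first use. */
static struct conn *conn_get(int nodeid)
{
    struct conn *c = conn_find(nodeid);
    unsigned int b = nodeid_hash(nodeid);

    if (c != NULL)
        return c;
    c = calloc(1, sizeof(*c));
    if (c == NULL)
        return NULL;
    c->nodeid = nodeid;
    c->next = conn_hash[b];
    conn_hash[b] = c;
    return c;
}

/* Visit every record by walking buckets, not by looping 1..max_nodeid. */
static int conn_count(void)
{
    int i, n = 0;
    struct conn *c;

    for (i = 0; i < CONN_HASH_SIZE; i++)
        for (c = conn_hash[i]; c != NULL; c = c->next)
            n++;
    return n;
}
```

With this layout the cost of a full walk depends on the number of buckets and entries, not on the numeric value of the largest nodeid, so nodeids such as 1 and 0x40000001 can coexist in the same bucket without any large array or long loop.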

Chrissie
 
01-27-2009, 10:33 AM
Chrissie Caulfield

dlm: Allow large nodeids

This is an updated patch that uses hlists rather than list_heads to save
memory in the connection structure.
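
The size difference between the two list flavours can be checked with a small user-space sketch. The three structs below copy the shape of the kernel's list types; the helper names (list_table_bytes, hlist_table_bytes, check_list_sizes) are hypothetical. Note that in this sketch the saving is in the head, which shrinks from two pointers to one, while a per-entry hlist_node is the same size as a list_head; exactly where the patch saved space is not visible in the thread because the attachment is missing.

```c
#include <assert.h>
#include <stddef.h>

/* User-space copies of the kernel's two list flavours. */
struct list_head {
    struct list_head *next, *prev;      /* doubly linked, circular */
};

struct hlist_node {
    struct hlist_node *next, **pprev;   /* pprev points at the previous link */
};

struct hlist_head {
    struct hlist_node *first;           /* a single pointer per bucket */
};

/* Bytes needed for a table of `buckets` heads of each flavour. */
static size_t list_table_bytes(size_t buckets)
{
    return buckets * sizeof(struct list_head);
}

static size_t hlist_table_bytes(size_t buckets)
{
    return buckets * sizeof(struct hlist_head);
}

/* Relations that hold wherever all object pointers share one size. */
static void check_list_sizes(void)
{
    assert(sizeof(struct hlist_head) == sizeof(void *));
    assert(sizeof(struct list_head) == 2 * sizeof(void *));
    assert(sizeof(struct hlist_node) == sizeof(struct list_head));
}
```

The trade-off is that an hlist has no tail pointer, which does not matter for hash buckets where entries are only added at the front and found by scanning.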

Thanks to Steven Whitehouse for the suggestion.


Chrissie Caulfield wrote:
> This patch changes DLM to use its own hash table rather than the idr_
> code. It should allow clusters with large nodeids to work correctly.
>
> It also fixes the 1..max_nodeid loops when the DLM shuts down.
>
> This is a slightly different patch to the one I posted to IRC yesterday,
> but all I've changed is the use of list_for_each_entry() rather than just
> list_for_each().



Chrissie
 
01-27-2009, 07:06 PM
David Teigland

dlm: Allow large nodeids

On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
> This is an updated patch that uses hlists rather than list_heads to save
> memory in the connection structure.
>
> Thanks to Steven Whitehouse for the suggestion.

I fixed some checkpatch warnings, tested, and pushed into the "next" branch.
 
01-27-2009, 07:19 PM
David Teigland

dlm: Allow large nodeids

On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
> > This is an updated patch that uses hlists rather than list_heads to save
> > memory in the connection structure.
> >
> > Thanks to Steven Whitehouse for the suggestion.
>
> I fixed some checkpatch warnings, tested, and pushed into the "next" branch.

I take that back after hitting the following on unmount:

Pid: 4484, comm: umount Not tainted 2.6.29-rc2 #1
RIP: 0010:[<ffffffffa04ecfb4>] [<ffffffffa04ecfb4>] foreach_conn+0x20/0x46 [dlm]
RSP: 0018:ffff880072db5d38 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
RDX: ffffffffa04ed0dc RSI: 000000000000006b RDI: ffff880057998de0
RBP: ffff880072db5d58 R08: 0000000000000000 R09: ffff880057998de8
R10: 0000000000000000 R11: ffff88007dd428d8 R12: 0000000000000000
R13: ffffffffa04ecede R14: 0000000000006000 R15: 0000000000000100
FS: 00007fbce8f0b720(0000) GS:ffffffff80a33080(0000) knlGS:00000000f7f7a6c0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007ff8aa8d38e8 CR3: 0000000138c4a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 4484, threadinfo ffff880072db4000, task ffff8800738d4740)
Stack:
ffff88007d187000 0000000000000000 ffff88007d187000 ffff88007c145fa0
ffff880072db5d68 ffffffffa04ed35a ffff880072db5d78 ffffffffa04eaf20
ffff880072db5db8 ffffffffa04eb299 ffff880072db5da8 ffff88007e85e198
Call Trace:
[<ffffffffa04ed35a>] dlm_lowcomms_stop+0x68/0x82 [dlm]
[<ffffffffa04eaf20>] threads_stop+0xe/0x15 [dlm]
[<ffffffffa04eb299>] dlm_release_lockspace+0x372/0x3a4 [dlm]
[<ffffffffa02720e0>] gdlm_unmount+0x28/0x49 [lock_dlm]
[<ffffffffa047270f>] gfs2_unmount_lockproto+0x2d/0x52 [gfs2]
[<ffffffffa0476bcc>] gfs2_lm_unmount+0x16/0x18 [gfs2]
[<ffffffffa047afb7>] gfs2_put_super+0x180/0x190 [gfs2]
[<ffffffff802afadc>] generic_shutdown_super+0x73/0xe8
[<ffffffff802afb73>] kill_block_super+0x22/0x3a
[<ffffffffa0476953>] gfs2_kill_sb+0x63/0x78 [gfs2]
[<ffffffff802afc5c>] deactivate_super+0x68/0x7d
[<ffffffff802c2aaf>] mntput_no_expire+0x103/0x149
[<ffffffff802c3094>] sys_umount+0x2e2/0x341
[<ffffffff8020c05b>] system_call_fastpath+0x16/0x1b
Code: 23 fe df 48 89 d8 5b 41 5c c9 c3 55 48 89 e5 41 55 49 89 fd 41 54 45 31 e4 53 48 83 ec 08 4a 8b 1c e5 e0 79 50 a0 48 85 db 74 15 <48> 8b 03 48 8d bb d0 fe ff ff 0f 18 08 41 ff d5 48 8b 1b eb e6
RIP [<ffffffffa04ecfb4>] foreach_conn+0x20/0x46 [dlm]
RSP <ffff880072db5d38>
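
A note on reading the trace above: RBX holds 6b6b6b6b6b6b6b6b, and the faulting instruction bytes marked <48> 8b 03 decode to a load through RBX. 0x6b is the byte (POISON_FREE) that the kernel's slab allocators write over freed objects when poisoning is enabled, so this pattern is consistent with a list pointer having been read out of a connection that was already freed. That poisoning was enabled on this test kernel is an assumption, and the helper names below (looks_like_freed_slab, check_oops_registers) are hypothetical; the sketch only shows how to recognise the pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as POISON_FREE in the kernel's include/linux/poison.h. */
#define POISON_FREE 0x6b

/* Return 1 if every byte of value is the freed-object poison byte. */
static int looks_like_freed_slab(uint64_t value)
{
    int i;

    for (i = 0; i < 8; i++)
        if (((value >> (8 * i)) & 0xff) != POISON_FREE)
            return 0;
    return 1;
}

/* Register values copied from the oops above. */
static void check_oops_registers(void)
{
    assert(looks_like_freed_slab(UINT64_C(0x6b6b6b6b6b6b6b6b)));   /* RBX */
    assert(!looks_like_freed_slab(UINT64_C(0xffff880057998de0)));  /* RDI */
}
```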
 
01-28-2009, 10:27 AM
Chrissie Caulfield

dlm: Allow large nodeids

David Teigland wrote:
> On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
>> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
>>> This is an updated patch that uses hlists rather than list_heads to save
>>> memory in the connection structure.
>>>
>>> Thanks to Steven Whitehouse for the suggestion.
>> I fixed some checkpatch warnings, tested, and pushed into the "next" branch.
>
> I take that back after hitting the following on unmount,
>
> Pid: 4484, comm: umount Not tainted 2.6.29-rc2 #1
> RIP: 0010:[<ffffffffa04ecfb4>] [<ffffffffa04ecfb4>] foreach_conn+0x20/0x46 [dlm]
> RSP: 0018:ffff880072db5d38 EFLAGS: 00010202

Thanks,

The attached patch should, I hope, fix that.

--

Chrissie
 
03-06-2009, 07:51 PM
David Teigland

dlm: Allow large nodeids

On Wed, Jan 28, 2009 at 11:27:35AM +0000, Chrissie Caulfield wrote:
> David Teigland wrote:
> > On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
> >> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
> >>> This is an updated patch that uses hlists rather than list_heads to save
> >>> memory in the connection structure.

This patch (with fix) seems to cause the following about half of the time when
killing dlm_controld:

dlm: x: leaving the lockspace group...
dlm: x: group event done 0 0
dlm: x: release_lockspace final free
dlm: closing connection to node 1
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/kernel/dlm/x/event_done
CPU 1
Modules linked in: lock_dlm dlm gfs2 configfs autofs4 sunrpc ipv6 cpufreq_ondemand dm_multipath video output sbs sbshc battery ac parport_pc lp parport sg button serio_raw tg3 libphy i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 10416, comm: dlm_controld Not tainted 2.6.29-rc2 #1
RIP: 0010:[<ffffffffa045116a>] [<ffffffffa045116a>] __find_con+0x17/0x35 [dlm]
RSP: 0018:ffff88007b189da8 EFLAGS: 00010202
RAX: ffff880078ccfde8 RBX: 0000000000000001 RCX: 6b6b6b6b6b6b6b6b
RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000022 RDI: 0000000000000001
RBP: ffff88007b189da8 R08: 0000000000000000 R09: ffff88007b189d48
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000001 R14: ffffffffa0462960 R15: ffff88007dd52de0
FS: 00007f71554c06e0(0000) GS:ffff88007f682210(0000) knlGS:00000000f7ef76c0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f111c3ce000 CR3: 000000007e92a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dlm_controld (pid: 10416, threadinfo ffff88007b188000, task ffff88007e4783c0)
Stack:
ffff88007b189dd8 ffffffffa04514ea ffffffffa026d61f 0000000000000001
ffff880078d12b50 ffffffffa04629d0 ffff88007b189df8 ffffffffa045169c
ffff88007b1935f8 ffff880078d12b50 ffff88007b189e18 ffffffffa0446921
Call Trace:
[<ffffffffa04514ea>] nodeid2con+0x29/0x1b7 [dlm]
[<ffffffffa026d61f>] ? configfs_rmdir+0x203/0x277 [configfs]
[<ffffffffa045169c>] dlm_lowcomms_close+0x24/0x48 [dlm]
[<ffffffffa0446921>] drop_comm+0x29/0x55 [dlm]
[<ffffffffa026be0c>] client_drop_item+0x25/0x31 [configfs]
[<ffffffffa026d63d>] configfs_rmdir+0x221/0x277 [configfs]
[<ffffffff804d0609>] ? _spin_unlock+0x26/0x2a
[<ffffffff802b5ca9>] vfs_rmdir+0xc5/0x137
[<ffffffff802b7c00>] do_rmdir+0xb5/0x107
[<ffffffff8026f0a0>] ? audit_syscall_entry+0x16b/0x19e
[<ffffffff802b7c89>] sys_rmdir+0x11/0x13
[<ffffffff8020c05b>] system_call_fastpath+0x16/0x1b
Code: c7 80 34 46 a0 31 db e8 b1 d9 07 e0 48 89 d8 5b 41 5c c9 c3 48 89 f8 55 83 e0 1f 48 8b 14 c5 e0 bb 46 a0 48 89 e5 48 85 d2 74 1a <39> ba d8 fe ff ff 48 8b 0a 48 8d 82 d0 fe ff ff 0f 18 09 74 07
RIP [<ffffffffa045116a>] __find_con+0x17/0x35 [dlm]
RSP <ffff88007b189da8>
 
03-09-2009, 09:01 AM
Chrissie Caulfield

dlm: Allow large nodeids

David Teigland wrote:
> On Wed, Jan 28, 2009 at 11:27:35AM +0000, Chrissie Caulfield wrote:
>> David Teigland wrote:
>>> On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
>>>> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
>>>>> This is an updated patch that uses hlists rather than list_heads to save
>>>>> memory in the connection structure.
>
> This patch (with fix) seems to cause the following about half of the time when
> killing dlm_controld:


I thought you were going to change the iterator in foreach_conn to use
hlist_for_each_entry_safe()?

My guess is that the connection is being freed by free_conn and messing
up the list.
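
The hazard guessed at here can be shown in a user-space sketch: a plain iterator reads pos->next after the callback has freed pos, whereas a "safe" iterator caches the next pointer before the callback runs. The hlist types and the hlist_add_head/hlist_del helpers below are small re-implementations written for this example, not the kernel headers; conn_of, conn_add, conn_free and bucket_free_all are hypothetical names. Unlinking inside conn_free is an assumption about what a correct free path needs, not a statement of what any patch in this thread changed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Minimal stand-in for a connection record (not the DLM struct). */
struct conn {
    int nodeid;
    struct hlist_node list;
};

/* Recover the enclosing struct conn from its embedded list node. */
#define conn_of(node) \
    ((struct conn *)((char *)(node) - offsetof(struct conn, list)))

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first != NULL)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

static void hlist_del(struct hlist_node *n)
{
    *n->pprev = n->next;
    if (n->next != NULL)
        n->next->pprev = n->pprev;
}

/* Allocate a record and put it at the front of bucket h. */
static struct conn *conn_add(struct hlist_head *h, int nodeid)
{
    struct conn *c = calloc(1, sizeof(*c));

    if (c == NULL)
        return NULL;
    c->nodeid = nodeid;
    hlist_add_head(&c->list, h);
    return c;
}

/* Unlink the record before releasing it, so no chain points at it. */
static void conn_free(struct conn *c)
{
    assert(c != NULL);
    hlist_del(&c->list);
    free(c);
}

/* Free every entry in one bucket; returns how many were freed. */
static int bucket_free_all(struct hlist_head *h)
{
    struct hlist_node *pos, *tmp;
    int n = 0;

    for (pos = h->first; pos != NULL; pos = tmp) {
        tmp = pos->next;            /* saved before pos is freed */
        conn_free(conn_of(pos));
        n++;
    }
    return n;
}
```

If the loop instead advanced with pos = pos->next after conn_free(), it would read freed memory, which on a kernel with slab poisoning tends to show up as a pointer made of repeated 0x6b bytes.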

Chrissie
 
03-11-2009, 03:02 PM
Chrissie Caulfield

dlm: Allow large nodeids

David Teigland wrote:
> On Wed, Jan 28, 2009 at 11:27:35AM +0000, Chrissie Caulfield wrote:
>> David Teigland wrote:
>>> On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
>>>> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
>>>>> This is an updated patch that uses hlists rather than list_heads to save
>>>>> memory in the connection structure.
>
> This patch (with fix) seems to cause the following about half of the time when
> killing dlm_controld:
>

Oops, something slightly vital was missing from free_conn() ...


Chrissie
 
