dlm: Allow large nodeids
This patch changes DLM to use its own hash table rather than the idr_ code.
It should allow clusters with large nodeids to work correctly.

It also fixes the 1..max_nodeid loops when the DLM shuts down.

This is a slightly different patch to the one I posted to IRC yesterday,
but all I've changed is the use of list_for_each_entry() rather than just
list_for_each().

Chrissie
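As a rough illustration of the idea (not the patch itself), here is a minimal userspace sketch of a connection table hashed by nodeid. The bucket count, the simplified `struct conn`, and the helper names `nodeid_hash`, `find_conn`, `get_conn` and `count_conns` are assumptions made for this example. The point is that lookup cost and shutdown iteration depend on the number of buckets and connections, not on how large the biggest nodeid is.

```c
#include <assert.h>
#include <stdlib.h>

#define CONN_HASH_SIZE 32   /* assumed bucket count; must be a power of two */

/* Hypothetical, simplified stand-in for the DLM connection structure. */
struct conn {
    int nodeid;
    struct conn *next;      /* chain of connections sharing one bucket */
};

static struct conn *conn_hash[CONN_HASH_SIZE];

/* Map any nodeid, however large, onto a bucket index. */
static unsigned int nodeid_hash(int nodeid)
{
    assert((CONN_HASH_SIZE & (CONN_HASH_SIZE - 1)) == 0);
    return (unsigned int)nodeid & (CONN_HASH_SIZE - 1);
}

/* Look up a connection by nodeid; returns NULL if there is none. */
static struct conn *find_conn(int nodeid)
{
    struct conn *c;

    for (c = conn_hash[nodeid_hash(nodeid)]; c; c = c->next)
        if (c->nodeid == nodeid)
            return c;
    return NULL;
}

/* Return the existing connection for nodeid, or allocate and insert one. */
static struct conn *get_conn(int nodeid)
{
    struct conn *c = find_conn(nodeid);
    unsigned int h;

    if (c)
        return c;
    c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    h = nodeid_hash(nodeid);
    c->nodeid = nodeid;
    c->next = conn_hash[h];
    conn_hash[h] = c;
    return c;
}

/* Visit every connection by walking the buckets rather than 1..max_nodeid. */
static int count_conns(void)
{
    struct conn *c;
    int i, n = 0;

    for (i = 0; i < CONN_HASH_SIZE; i++)
        for (c = conn_hash[i]; c; c = c->next)
            n++;
    return n;
}
```

With this layout a nodeid such as 1000000 costs no more to store or find than nodeid 1, and a shutdown pass touches only the 32 buckets plus the connections that actually exist.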
dlm: Allow large nodeids
This is an updated patch that uses hlists rather than list_heads to save
memory in the connection structure.

Thanks to Steven Whitehouse for the suggestion.

Chrissie Caulfield wrote:
> This patch changes DLM to use its own hash table rather than the idr_
> code. It should allow clusters with large nodeids to work correctly.
>
> It also fixes the 1..max_nodeid loops when the DLM shuts down.
>
> This is a slightly different patch to the one I posted to IRC yesterday,
> but all I've changed is the use of list_for_each_entry( rather than just
> list_for_each().

Chrissie
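For readers unfamiliar with the two list types, the sketch below copies the layouts of the kernel's `struct list_head`, `struct hlist_node` and `struct hlist_head` into userspace and compares their sizes. The bucket count and the two helper functions are assumptions for illustration. In these definitions an hlist head is a single pointer while a list_head is two, and the per-entry hlist_node is still two pointers; where exactly the patch saves memory cannot be checked from this thread, since the patch is not shown.

```c
#include <stddef.h>

/* Userspace copies of the kernel's list layouts. */
struct list_head  { struct list_head *next, *prev; };
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

#define NBUCKETS 32   /* assumed bucket count, for illustration only */

/* Bytes needed for a table of doubly linked list heads. */
size_t list_table_bytes(void)
{
    return NBUCKETS * sizeof(struct list_head);
}

/* Bytes needed for the same table built from hlist heads. */
size_t hlist_table_bytes(void)
{
    return NBUCKETS * sizeof(struct hlist_head);
}
```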
dlm: Allow large nodeids
On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
> This an updated patch that uses hlists rather than list_heads to save
> memory in the connection structure.
>
> Thanks to Steven Whitehouse for the suggestion.

I fixed some checkpatch warnings, tested, and pushed into the "next" branch.
dlm: Allow large nodeids
On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
> > This an updated patch that uses hlists rather than list_heads to save
> > memory in the connection structure.
> >
> > Thanks to Steven Whitehouse for the suggestion.
>
> I fixed some checkpatch warnings, tested, and pushed into the "next" branch.

I take that back after hitting the following on unmount,

Pid: 4484, comm: umount Not tainted 2.6.29-rc2 #1
RIP: 0010:[<ffffffffa04ecfb4>]  [<ffffffffa04ecfb4>] foreach_conn+0x20/0x46 [dlm]
RSP: 0018:ffff880072db5d38  EFLAGS: 00010202
RAX: 0000000000000001 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
RDX: ffffffffa04ed0dc RSI: 000000000000006b RDI: ffff880057998de0
RBP: ffff880072db5d58 R08: 0000000000000000 R09: ffff880057998de8
R10: 0000000000000000 R11: ffff88007dd428d8 R12: 0000000000000000
R13: ffffffffa04ecede R14: 0000000000006000 R15: 0000000000000100
FS:  00007fbce8f0b720(0000) GS:ffffffff80a33080(0000) knlGS:00000000f7f7a6c0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007ff8aa8d38e8 CR3: 0000000138c4a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 4484, threadinfo ffff880072db4000, task ffff8800738d4740)
Stack:
 ffff88007d187000 0000000000000000 ffff88007d187000 ffff88007c145fa0
 ffff880072db5d68 ffffffffa04ed35a ffff880072db5d78 ffffffffa04eaf20
 ffff880072db5db8 ffffffffa04eb299 ffff880072db5da8 ffff88007e85e198
Call Trace:
 [<ffffffffa04ed35a>] dlm_lowcomms_stop+0x68/0x82 [dlm]
 [<ffffffffa04eaf20>] threads_stop+0xe/0x15 [dlm]
 [<ffffffffa04eb299>] dlm_release_lockspace+0x372/0x3a4 [dlm]
 [<ffffffffa02720e0>] gdlm_unmount+0x28/0x49 [lock_dlm]
 [<ffffffffa047270f>] gfs2_unmount_lockproto+0x2d/0x52 [gfs2]
 [<ffffffffa0476bcc>] gfs2_lm_unmount+0x16/0x18 [gfs2]
 [<ffffffffa047afb7>] gfs2_put_super+0x180/0x190 [gfs2]
 [<ffffffff802afadc>] generic_shutdown_super+0x73/0xe8
 [<ffffffff802afb73>] kill_block_super+0x22/0x3a
 [<ffffffffa0476953>] gfs2_kill_sb+0x63/0x78 [gfs2]
 [<ffffffff802afc5c>] deactivate_super+0x68/0x7d
 [<ffffffff802c2aaf>] mntput_no_expire+0x103/0x149
 [<ffffffff802c3094>] sys_umount+0x2e2/0x341
 [<ffffffff8020c05b>] system_call_fastpath+0x16/0x1b
Code: 23 fe df 48 89 d8 5b 41 5c c9 c3 55 48 89 e5 41 55 49 89 fd 41 54 45 31 e4 53 48 83 ec 08 4a 8b 1c e5 e0 79 50 a0 48 85 db 74 15 <48> 8b 03 48 8d bb d0 fe ff ff 0f 18 08 41 ff d5 48 8b 1b eb e6
RIP  [<ffffffffa04ecfb4>] foreach_conn+0x20/0x46 [dlm]
 RSP <ffff880072db5d38>
dlm: Allow large nodeids
David Teigland wrote:
> On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
>> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
>>> This an updated patch that uses hlists rather than list_heads to save
>>> memory in the connection structure.
>>>
>>> Thanks to Steven Whitehouse for the suggestion.
>> I fixed some checkpatch warnings, tested, and pushed into the "next" branch.
>
> I take that back after hitting the following on unmount,
>
> Pid: 4484, comm: umount Not tainted 2.6.29-rc2 #1
> RIP: 0010:[<ffffffffa04ecfb4>]  [<ffffffffa04ecfb4>] foreach_conn+0x20/0x46 [dlm]
> RSP: 0018:ffff880072db5d38  EFLAGS: 00010202

Thanks,

The attached patch should, I hope, fix that

--
Chrissie
dlm: Allow large nodeids
On Wed, Jan 28, 2009 at 11:27:35AM +0000, Chrissie Caulfield wrote:
> David Teigland wrote:
> > On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
> >> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
> >>> This an updated patch that uses hlists rather than list_heads to save
> >>> memory in the connection structure.

This patch (with fix) seems to cause the following about half of the time when
killing dlm_controld:

dlm: x: leaving the lockspace group...
dlm: x: group event done 0 0
dlm: x: release_lockspace final free
dlm: closing connection to node 1
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/kernel/dlm/x/event_done
CPU 1
Modules linked in: lock_dlm dlm gfs2 configfs autofs4 sunrpc ipv6 cpufreq_ondemand dm_multipath video output sbs sbshc battery ac parport_pc lp parport sg button serio_raw tg3 libphy i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 10416, comm: dlm_controld Not tainted 2.6.29-rc2 #1
RIP: 0010:[<ffffffffa045116a>]  [<ffffffffa045116a>] __find_con+0x17/0x35 [dlm]
RSP: 0018:ffff88007b189da8  EFLAGS: 00010202
RAX: ffff880078ccfde8 RBX: 0000000000000001 RCX: 6b6b6b6b6b6b6b6b
RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000022 RDI: 0000000000000001
RBP: ffff88007b189da8 R08: 0000000000000000 R09: ffff88007b189d48
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000001 R14: ffffffffa0462960 R15: ffff88007dd52de0
FS:  00007f71554c06e0(0000) GS:ffff88007f682210(0000) knlGS:00000000f7ef76c0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f111c3ce000 CR3: 000000007e92a000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dlm_controld (pid: 10416, threadinfo ffff88007b188000, task ffff88007e4783c0)
Stack:
 ffff88007b189dd8 ffffffffa04514ea ffffffffa026d61f 0000000000000001
 ffff880078d12b50 ffffffffa04629d0 ffff88007b189df8 ffffffffa045169c
 ffff88007b1935f8 ffff880078d12b50 ffff88007b189e18 ffffffffa0446921
Call Trace:
 [<ffffffffa04514ea>] nodeid2con+0x29/0x1b7 [dlm]
 [<ffffffffa026d61f>] ? configfs_rmdir+0x203/0x277 [configfs]
 [<ffffffffa045169c>] dlm_lowcomms_close+0x24/0x48 [dlm]
 [<ffffffffa0446921>] drop_comm+0x29/0x55 [dlm]
 [<ffffffffa026be0c>] client_drop_item+0x25/0x31 [configfs]
 [<ffffffffa026d63d>] configfs_rmdir+0x221/0x277 [configfs]
 [<ffffffff804d0609>] ? _spin_unlock+0x26/0x2a
 [<ffffffff802b5ca9>] vfs_rmdir+0xc5/0x137
 [<ffffffff802b7c00>] do_rmdir+0xb5/0x107
 [<ffffffff8026f0a0>] ? audit_syscall_entry+0x16b/0x19e
 [<ffffffff802b7c89>] sys_rmdir+0x11/0x13
 [<ffffffff8020c05b>] system_call_fastpath+0x16/0x1b
Code: c7 80 34 46 a0 31 db e8 b1 d9 07 e0 48 89 d8 5b 41 5c c9 c3 48 89 f8 55 83 e0 1f 48 8b 14 c5 e0 bb 46 a0 48 89 e5 48 85 d2 74 1a <39> ba d8 fe ff ff 48 8b 0a 48 8d 82 d0 fe ff ff 0f 18 09 74 07
RIP  [<ffffffffa045116a>] __find_con+0x17/0x35 [dlm]
 RSP <ffff88007b189da8>
dlm: Allow large nodeids
David Teigland wrote:
> On Wed, Jan 28, 2009 at 11:27:35AM +0000, Chrissie Caulfield wrote:
>> David Teigland wrote:
>>> On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
>>>> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
>>>>> This an updated patch that uses hlists rather than list_heads to save
>>>>> memory in the connection structure.
>
> This patch (with fix) seems to cause the following about half of the time when
> killing dlm_controld:

I thought you were going to change the iterator in foreach_conn to use
hlist_for_each_entry_safe()? My guess is that the connection is being freed
by free_conn and messing up the list.

Chrissie
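To make the suspected failure mode concrete, here is a small userspace sketch of a "safe" bucket walk, modelled loosely on what hlist_for_each_entry_safe() does in the kernel. The types `hnode`/`hhead`, the simplified `struct conn`, and the helpers `add_conn` and `table_is_empty` are assumptions for this example, and this `free_conn` unlinks the entry before freeing it purely as the sketch's own choice. The key step is that the iterator saves the next pointer before calling the callback, so it never reads a freed node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's hlist_node and hlist_head. */
struct hnode { struct hnode *next, **pprev; };
struct hhead { struct hnode *first; };

/* Hypothetical, simplified connection structure. */
struct conn {
    int nodeid;
    struct hnode list;
};

#define NBUCKETS 32   /* assumed bucket count */
static struct hhead table[NBUCKETS];

/* Allocate a connection and push it onto the front of its bucket. */
static void add_conn(int nodeid)
{
    struct hhead *h = &table[(unsigned int)nodeid & (NBUCKETS - 1)];
    struct conn *c = calloc(1, sizeof(*c));

    assert(c != NULL);
    c->nodeid = nodeid;
    c->list.next = h->first;
    if (h->first)
        h->first->pprev = &c->list.next;
    h->first = &c->list;
    c->list.pprev = &h->first;
}

/* Unlink the connection from its bucket, then free it. */
static void free_conn(struct conn *c)
{
    *c->list.pprev = c->list.next;
    if (c->list.next)
        c->list.next->pprev = c->list.pprev;
    free(c);
}

/* Call fn on every connection; fn is allowed to free the one it is given. */
static int foreach_conn(void (*fn)(struct conn *))
{
    struct hnode *n, *tmp;
    int i, visited = 0;

    for (i = 0; i < NBUCKETS; i++) {
        for (n = table[i].first; n; n = tmp) {
            tmp = n->next;   /* saved before fn() may free n */
            fn((struct conn *)((char *)n - offsetof(struct conn, list)));
            visited++;
        }
    }
    return visited;
}

/* Returns 1 if no bucket holds any entry. */
static int table_is_empty(void)
{
    int i;

    for (i = 0; i < NBUCKETS; i++)
        if (table[i].first)
            return 0;
    return 1;
}
```

A non-safe walk would instead read `n->next` after `fn()` returned, that is, from memory that has already been freed. The 6b6b6b6b6b6b6b6b values in the register dumps above match the byte pattern the kernel's slab poisoning writes into freed objects.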
dlm: Allow large nodeids
David Teigland wrote:
> On Wed, Jan 28, 2009 at 11:27:35AM +0000, Chrissie Caulfield wrote:
>> David Teigland wrote:
>>> On Tue, Jan 27, 2009 at 02:06:30PM -0600, David Teigland wrote:
>>>> On Tue, Jan 27, 2009 at 11:33:30AM +0000, Chrissie Caulfield wrote:
>>>>> This an updated patch that uses hlists rather than list_heads to save
>>>>> memory in the connection structure.
>
> This patch (with fix) seems to cause the following about half of the time when
> killing dlm_controld:
>

Oops, something slightly vital was missing from free_conn() ...

Chrissie