FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Redhat > Cluster Development

 
 
LinkBack Thread Tools
 
Old 11-12-2009, 12:27 PM
Steven Whitehouse
 
Default dlm: Add down/up_write_non_owner to keep lockdep happy

Hi,

Is there an issue with this patch? It doesn't seem to have appeared in
your dlm git tree yet,

Steve.

On Tue, 2009-10-13 at 15:56 +0100, Steven Whitehouse wrote:
> I looked at possibly changing this to use completions, but
> it seems that the usage here is not easily adapted to that.
> This patch adds suitable annotation to the write side of
> the ls_in_recovery semaphore so that we don't get nasty
> messages from lockdep when mounting a gfs2 filesystem.
>
> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> ---
> fs/dlm/lockspace.c | 2 +-
> fs/dlm/member.c | 2 +-
> fs/dlm/recoverd.c | 2 +-
> include/linux/rwsem.h | 4 ++++
> kernel/rwsem.c | 16 ++++++++++++++++
> 5 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
> index 8dde538..fa0cc22 100644
> --- a/fs/dlm/lockspace.c
> +++ b/fs/dlm/lockspace.c
> @@ -551,7 +551,7 @@ static int new_lockspace(const char *name, int namelen, void **lockspace,
> INIT_LIST_HEAD(&ls->ls_root_list);
> init_rwsem(&ls->ls_root_sem);
>
> - down_write(&ls->ls_in_recovery);
> + down_write_non_owner(&ls->ls_in_recovery);
>
> spin_lock(&lslist_lock);
> ls->ls_create_count = 1;
> diff --git a/fs/dlm/member.c b/fs/dlm/member.c
> index b128775..99bd086 100644
> --- a/fs/dlm/member.c
> +++ b/fs/dlm/member.c
> @@ -318,7 +318,7 @@ int dlm_ls_stop(struct dlm_ls *ls)
> */
>
> if (new)
> - down_write(&ls->ls_in_recovery);
> + down_write_non_owner(&ls->ls_in_recovery);
>
> /*
> * The recoverd suspend/resume makes sure that dlm_recoverd (if
> diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
> index fd677c8..376479a 100644
> --- a/fs/dlm/recoverd.c
> +++ b/fs/dlm/recoverd.c
> @@ -40,7 +40,7 @@ static int enable_locking(struct dlm_ls *ls, uint64_t seq)
> if (ls->ls_recover_seq == seq) {
> set_bit(LSFL_RUNNING, &ls->ls_flags);
> /* unblocks processes waiting to enter the dlm */
> - up_write(&ls->ls_in_recovery);
> + up_write_non_owner(&ls->ls_in_recovery);
> error = 0;
> }
> spin_unlock(&ls->ls_recover_lock);
> diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
> index efd348f..34643df 100644
> --- a/include/linux/rwsem.h
> +++ b/include/linux/rwsem.h
> @@ -80,12 +80,16 @@ extern void down_write_nested(struct rw_semaphore *sem, int subclass);
> * proper abstraction for this case is completions. ]
> */
> extern void down_read_non_owner(struct rw_semaphore *sem);
> +extern void down_write_non_owner(struct rw_semaphore *sem);
> extern void up_read_non_owner(struct rw_semaphore *sem);
> +extern void up_write_non_owner(struct rw_semaphore *sem);
> #else
> # define down_read_nested(sem, subclass) down_read(sem)
> # define down_write_nested(sem, subclass) down_write(sem)
> # define down_read_non_owner(sem) down_read(sem)
> +# define down_write_non_owner(sem) down_write(sem)
> # define up_read_non_owner(sem) up_read(sem)
> +# define up_write_non_owner(sem) up_write(sem)
> #endif
>
> #endif /* _LINUX_RWSEM_H */
> diff --git a/kernel/rwsem.c b/kernel/rwsem.c
> index cae050b..2c57eef 100644
> --- a/kernel/rwsem.c
> +++ b/kernel/rwsem.c
> @@ -126,6 +126,15 @@ void down_read_non_owner(struct rw_semaphore *sem)
>
> EXPORT_SYMBOL(down_read_non_owner);
>
> +void down_write_non_owner(struct rw_semaphore *sem)
> +{
> + might_sleep();
> +
> + __down_write(sem);
> +}
> +
> +EXPORT_SYMBOL(down_write_non_owner);
> +
> void down_write_nested(struct rw_semaphore *sem, int subclass)
> {
> might_sleep();
> @@ -143,6 +152,13 @@ void up_read_non_owner(struct rw_semaphore *sem)
>
> EXPORT_SYMBOL(up_read_non_owner);
>
> +void up_write_non_owner(struct rw_semaphore *sem)
> +{
> + __up_write(sem);
> +}
> +
> +EXPORT_SYMBOL(up_write_non_owner);
> +
> #endif
>
>
 
Old 11-12-2009, 01:29 PM
Steven Whitehouse
 
Default dlm: Add down/up_write_non_owner to keep lockdep happy

Hi,

On Thu, 2009-11-12 at 15:22 +0100, Ingo Molnar wrote:
> * Steven Whitehouse <swhiteho@redhat.com> wrote:
>
> > I looked at possibly changing this to use completions, but
> > it seems that the usage here is not easily adapted to that.
> > This patch adds suitable annotation to the write side of
> > the ls_in_recovery semaphore so that we don't get nasty
> > messages from lockdep when mounting a gfs2 filesystem.
>
> What do those 'nasty messages' say? If they expose some bug and this
> patch works around that bug by hiding it then NAK ...
>
> Ingo
>
The nasty messages are moaning that the lock is being taken in one
thread and unlocked in another. I couldn't see any bugs in the code when
I looked at it. Below are the messages that I get - to reproduce just
mount a GFS2 filesystem with the dlm lock manager. It happens on every
mount,

Steve.

Nov 12 15:10:01 chywoon kernel:
=============================================
Nov 12 15:10:01 chywoon kernel: [ INFO: possible recursive locking
detected ]
Nov 12 15:10:01 chywoon kernel: 2.6.32-rc6 #46
Nov 12 15:10:01 chywoon kernel:
---------------------------------------------
Nov 12 15:10:01 chywoon kernel: mount.gfs2/2996 is trying to acquire
lock:
Nov 12 15:10:01 chywoon kernel: (&ls->ls_in_recovery){+++++.}, at:
[<ffffffffa01cf3b5>] dlm_lock+0x55/0x1b0 [dlm]
Nov 12 15:10:01 chywoon kernel:
Nov 12 15:10:01 chywoon kernel: but task is already holding lock:
Nov 12 15:10:01 chywoon kernel: (&ls->ls_in_recovery){+++++.}, at:
[<ffffffffa01d1e1c>] dlm_new_lockspace+0x96c/0xb80 [dlm]
Nov 12 15:10:01 chywoon kernel:
Nov 12 15:10:01 chywoon kernel: other info that might help us debug
this:
Nov 12 15:10:01 chywoon kernel: 2 locks held by mount.gfs2/2996:
Nov 12 15:10:01 chywoon kernel: #0: (&type->s_umount_key#37/1){+.+.+.},
at: [<ffffffff81168698>] sget+0x258/0x4c0
Nov 12 15:10:01 chywoon kernel: #1: (&ls->ls_in_recovery){+++++.}, at:
[<ffffffffa01d1e1c>] dlm_new_lockspace+0x96c/0xb80 [dlm]
Nov 12 15:10:01 chywoon kernel:
Nov 12 15:10:01 chywoon kernel: stack backtrace:
Nov 12 15:10:01 chywoon kernel: Pid: 2996, comm: mount.gfs2 Not tainted
2.6.32-rc6 #46
Nov 12 15:10:01 chywoon kernel: Call Trace:
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c53e4>] validate_chain
+0xdd4/0x1170
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c2a85>] ? mark_lock
+0x2e5/0x670
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c5cb2>] __lock_acquire
+0x532/0xbc0
Nov 12 15:10:01 chywoon kernel: [<ffffffff81043e2c>] ?
native_sched_clock+0x2c/0x80
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c6404>] lock_acquire
+0xc4/0x1b0
Nov 12 15:10:01 chywoon kernel: [<ffffffffa01cf3b5>] ? dlm_lock
+0x55/0x1b0 [dlm]
Nov 12 15:10:01 chywoon kernel: [<ffffffff81627317>] down_read+0x47/0x80
Nov 12 15:10:01 chywoon kernel: [<ffffffffa01cf3b5>] ? dlm_lock
+0x55/0x1b0 [dlm]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa01d0af6>] ?
dlm_find_lockspace_local+0x46/0x60 [dlm]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa01cf3b5>] dlm_lock+0x55/0x1b0
[dlm]
Nov 12 15:10:01 chywoon kernel: [<ffffffff81043df3>] ? sched_clock
+0x13/0x20
Nov 12 15:10:01 chywoon kernel: [<ffffffff810b4e1d>] ? sched_clock_cpu
+0xed/0x140
Nov 12 15:10:01 chywoon kernel: [<ffffffff810bf6fe>] ? put_lock_stats
+0xe/0x30
Nov 12 15:10:01 chywoon kernel: [<ffffffff810c10e1>] ?
lock_release_holdtime+0xb1/0x160
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0224a75>] gdlm_lock
+0xe5/0x120 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0224b70>] ? gdlm_ast+0x0/0xd0
[gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0224ab0>] ? gdlm_bast
+0x0/0x50 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa02081d3>] do_xmote+0xf3/0x2c0
[gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0208a21>] run_queue
+0x1a1/0x280 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0208cb4>] gfs2_glock_nq
+0x134/0x3b0 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa02099a9>] gfs2_glock_nq_num
+0x69/0x90 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0215a25>] init_locking
+0x45/0x190 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa0216440>] gfs2_get_sb
+0x8d0/0xc10 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffffa02099a1>] ? gfs2_glock_nq_num
+0x61/0x90 [gfs2]
Nov 12 15:10:01 chywoon kernel: [<ffffffff8115f9eb>] ? __alloc_percpu
+0xb/0x10
Nov 12 15:10:01 chywoon kernel: [<ffffffff81167838>] vfs_kern_mount
+0x58/0xf0
Nov 12 15:10:01 chywoon kernel: [<ffffffff8116793d>] do_kern_mount
+0x4d/0x130
Nov 12 15:10:01 chywoon kernel: [<ffffffff81629062>] ? lock_kernel
+0x42/0x109
Nov 12 15:10:01 chywoon kernel: [<ffffffff81186854>] do_mount
+0x2c4/0x820
Nov 12 15:10:01 chywoon kernel: [<ffffffff81186e3b>] sys_mount+0x8b/0xe0
Nov 12 15:10:01 chywoon kernel: [<ffffffff8103d0c2>]
system_call_fastpath+0x16/0x1b
 
Old 11-12-2009, 04:24 PM
Steven Whitehouse
 
Default dlm: Add down/up_write_non_owner to keep lockdep happy

Hi,

On Thu, 2009-11-12 at 11:14 -0600, David Teigland wrote:
> On Thu, Nov 12, 2009 at 02:29:18PM +0000, Steven Whitehouse wrote:
> > Hi,
> >
> > On Thu, 2009-11-12 at 15:22 +0100, Ingo Molnar wrote:
> > > * Steven Whitehouse <swhiteho@redhat.com> wrote:
> > >
> > > > I looked at possibly changing this to use completions, but
> > > > it seems that the usage here is not easily adapted to that.
> > > > This patch adds suitable annotation to the write side of
> > > > the ls_in_recovery semaphore so that we don't get nasty
> > > > messages from lockdep when mounting a gfs2 filesystem.
> > >
> > > What do those 'nasty messages' say? If they expose some bug and this
> > > patch works around that bug by hiding it then NAK ...
> > >
> > > Ingo
> > >
> > The nasty messages are moaning that the lock is being taken in one
> > thread and unlocked in another. I couldn't see any bugs in the code when
> > I looked at it. Below are the messages that I get - to reproduce just
> > mount a GFS2 filesystem with the dlm lock manager. It happens on every
> > mount,
> >
> > Steve.
> >
> > Nov 12 15:10:01 chywoon kernel:
> > =============================================
> > Nov 12 15:10:01 chywoon kernel: [ INFO: possible recursive locking
> > detected ]
>
> That recursive locking trace is something different. up_write_non_owner()
> addresses this trace, which as you say, is from doing the down and up from
> different threads (which is the intention):
>
I don't think it is different, the traces differ due to the ordering of
running of dlm_recoverd and mount.gfs2,

Steve.
 
Old 11-12-2009, 04:27 PM
Steven Whitehouse
 
Default dlm: Add down/up_write_non_owner to keep lockdep happy

Hi,

On Thu, 2009-11-12 at 17:45 +0100, Peter Zijlstra wrote:
> On Thu, 2009-11-12 at 11:14 -0600, David Teigland wrote:
> > up_write_non_owner()
> > addresses this trace, which as you say, is from doing the down and up from
> > different threads (which is the intention):
>
> That's really something I cannot advice to do. Aside from loosing
> lock-dependency validation (not a good thing), asymmetric locking like
> that is generally very hard to analyze since its not clear who 'owns'
> what data when.
>
> There are a few places in the kernel that use the non_owner things, and
> we should generally strive to remove them, not add more.
>
> Please consider solving your problem without adding things like this.
>
The code that does this already exists - it is not being added by the
patch. Its just that in recent kernels lockdep has started noticing the
problem. I did seriously consider changing the locking rather than just
silencing the messages, but it looks rather complicated and not easily
replaced with other primitives.

Any suggestions as to a better solution are welcome,

Steve.
 
Old 11-13-2009, 09:21 AM
Steven Whitehouse
 
Default dlm: Add down/up_write_non_owner to keep lockdep happy

Hi,

On Thu, 2009-11-12 at 12:34 -0600, David Teigland wrote:
> On Thu, Nov 12, 2009 at 05:24:12PM +0000, Steven Whitehouse wrote:
> > > > Nov 12 15:10:01 chywoon kernel: [ INFO: possible recursive locking
> > > > detected ]
> > >
> > > That recursive locking trace is something different. up_write_non_owner()
> > > addresses this trace, which as you say, is from doing the down and up from
> > > different threads (which is the intention):
> > >
> > I don't think it is different, the traces differ due to the ordering of
> > running of dlm_recoverd and mount.gfs2,
>
> I explained the "recursive locking" warning back in Sep:
>
> > I've not looked at how to remove this "recursive" message. What
> > happens is that mount calls dlm_new_lockspace() which returns with
> > in_recovery locked. mount then makes a lock request which blocks on
> > in_recovery (as expected) until the dlm_recoverd thread completes
> > recovery and releases the in_recovery lock (triggering the unlock
> > balance) to allow locking activity.
>
> It doesn't appear to me that up_write_non_owner() would suppress that.
>
> Dave
>
It is simply down to the ordering of the running of the threads as to
which message you get at mount time. There are two possible scenarios:

Scenario 1:

1. mount.gfs2 calls (via mount sys call and gfs2) dlm_newlockspace()
which takes the ls_in_recovery rwsem with a down_write()
2. mount.gfs2 goes on to try and take out a lock on the filesystem, and
calls dlm_lock which tries to do a down_read() on the rwsem. Since this
is from the same thread as the down_write() you get the recursive
locking message reported in the dmesg which I attached to my earlier
email.

In the second scenario, dlm_recoverd runs between step 1 and 2 above.
this results in the trace which you reported, since ls_in_recovery has
then been unlocked from a different thread, which creates the unlocking
balance trace which you posted.

In both cases the cause is the same, its just the running order of the
threads which results in it being reported in a different way. The patch
should fix both of these reports, since it annotates the up & down write
side of the rwsem,

Steve.
 

Thread Tools




All times are GMT. The time now is 02:49 PM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org