FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Redhat > Cluster Development

 
 
LinkBack Thread Tools
 
Old 01-04-2011, 08:27 PM
David Teigland
 
Default dlm: send_bast_queue() skip list loop not only sending basts to convertqueue

On Tue, Jan 04, 2011 at 06:06:51PM -0200, cmaiolino@redhat.com wrote:
> The resource groups got corrupted without this patch:

I could see an extraneous bast leading to confusion in gfs2 about the lock
state, but gfs2 should probably be asserting somewhere before it actually
corrupts anything...

> diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
> index 64e5f3e..565c519 100644
> --- a/fs/dlm/lock.c
> +++ b/fs/dlm/lock.c
> @@ -1847,7 +1847,7 @@ static void send_bast_queue(struct dlm_rsb *r, struct list_head *head,
>
> list_for_each_entry(gr, head, lkb_statequeue) {
> /* skip self when sending basts to convertqueue */
> - if (gr == lkb)
> + if (head == &r->res_grantqueue && gr == lkb)
> continue;
> if (gr->lkb_bastfn && modes_require_bast(gr, lkb)) {
> queue_bast(r, gr, lkb->lkb_rqmode);

I haven't been able to figure out the problem or the fix; some printk's
around the case in question would be revealing.

This is the specific case where a TRY_1CB (NOQUEUBAST) conversion fails.
Here's how I step through the code for that case:

_convert_lock(lkb)
error = do_convert(lkb)
when error equals -EAGAIN, lkb remains on grantqueue
do_convert_effects(lkb, -EAGAIN)
-EAGAIN and NOQUEUEBAST -> send_blocking_asts_all ->
send_bast_queue(grantqueue, lkb)
[lkb is expected to be here, skip sending bast to self]
send_bast_queue(convertqueue, lkb):
[lkb should not be on here, but your patch implies there
are cases where it can be? I think that would be a bug.]

Dave
 
Old 01-05-2011, 08:46 AM
Steven Whitehouse
 
Default dlm: send_bast_queue() skip list loop not only sending basts to convertqueue

Hi,

On Tue, 2011-01-04 at 16:27 -0500, David Teigland wrote:
> On Tue, Jan 04, 2011 at 06:06:51PM -0200, cmaiolino@redhat.com wrote:
> > The resource groups got corrupted without this patch:
>
> I could see an extraneous bast leading to confusion in gfs2 about the lock
> state, but gfs2 should probably be asserting somewhere before it actually
> corrupts anything...
>
I don't believe that this is the right solution, although it might be a
useful clue as to what is really going wrong. Also, nothing is getting
corrupted at all, the issue is that deallocation fails to take place at
the expected time. Thats not the end of the world, since there is
already a mechanism in place to catch that at a later point in time.

Carlos, why do you think that the dlm should be sending basts to the
lock requesting process for its own lock request?

That test was added specifically to prevent performance issues due to
extraneous basts being sent in such cases. I'd suggest getting some
traces of the glocks in question and checking to see what is happening,
and whether it is happening in the correct order,

Steve.

> > diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
> > index 64e5f3e..565c519 100644
> > --- a/fs/dlm/lock.c
> > +++ b/fs/dlm/lock.c
> > @@ -1847,7 +1847,7 @@ static void send_bast_queue(struct dlm_rsb *r, struct list_head *head,
> >
> > list_for_each_entry(gr, head, lkb_statequeue) {
> > /* skip self when sending basts to convertqueue */
> > - if (gr == lkb)
> > + if (head == &r->res_grantqueue && gr == lkb)
> > continue;
> > if (gr->lkb_bastfn && modes_require_bast(gr, lkb)) {
> > queue_bast(r, gr, lkb->lkb_rqmode);
>
> I haven't been able to figure out the problem or the fix; some printk's
> around the case in question would be revealing.
>
> This is the specific case where a TRY_1CB (NOQUEUBAST) conversion fails.
> Here's how I step through the code for that case:
>
> _convert_lock(lkb)
> error = do_convert(lkb)
> when error equals -EAGAIN, lkb remains on grantqueue
> do_convert_effects(lkb, -EAGAIN)
> -EAGAIN and NOQUEUEBAST -> send_blocking_asts_all ->
> send_bast_queue(grantqueue, lkb)
> [lkb is expected to be here, skip sending bast to self]
> send_bast_queue(convertqueue, lkb):
> [lkb should not be on here, but your patch implies there
> are cases where it can be? I think that would be a bug.]
>
> Dave
>
 

Thread Tools




All times are GMT. The time now is 08:33 PM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org