FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Redhat > Cluster Development

 
 
LinkBack Thread Tools
 
Old 04-05-2012, 04:11 PM
Bob Peterson
 
Default GFS2: Instruct DLM to avoid queue convert slowdowns

Hi,

Here's another patch (explanation below). This patch replies upon
a DLM patch that hasn't fully gone upstream yet, so perhaps it
shouldn't be added to the nmw tree until it is. This greatly
improves the performance of gfs2_grow in a clustered gfs2 file system.

Regards,

Bob Peterson
Red Hat File Systems

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
Author: Bob Peterson <rpeterso@redhat.com>
Date: Wed Apr 4 15:14:51 2012 -0500

GFS2: Instruct DLM to avoid queue convert slowdowns

This patch instructs DLM to prevent an "in place" conversion, where the
lock just stays on the granted queue, and instead forces the conversion to
the back of the convert queue. This is done on upward conversions only.

This is useful in cases where, for example, a lock is frequently needed in
PR on one node, but another node needs it temporarily in EX to update it.
This may happen, for example, when the rindex is being updated by gfs2_grow.
The gfs2_grow needs to have the lock in EX, but the other nodes need to
re-read it to retrieve the updates. The glock is already granted in PR on
the non-growing nodes, so this prevents them from continually re-granting
the lock in PR, and forces the EX from gfs2_grow to go through.

diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
index bd5af03..43466d6 100644
--- a/fs/gfs2/lock_dlm.c
+++ b/fs/gfs2/lock_dlm.c
@@ -16,6 +16,25 @@
#include "glock.h"
#include "util.h"

+/*
+ * Borrowed from dlm's lock.c: We need to only pass the QUECVT bit with
+ * requests that are compatible, so we use dlm's check:
+ *
+ * Compatibility matrix for conversions with QUECVT set.
+ * Granted mode is the row; requested mode is the column.
+ * Usage: matrix[grmode+1][rqmode+1]
+ */
+static const int quecvt_compat_matrix[8][8] = {
+ /* UN NL CR CW PR PW EX PD */
+ {0, 0, 0, 0, 0, 0, 0, 0}, /* UN */
+ {0, 0, 1, 1, 1, 1, 1, 0}, /* NL */
+ {0, 0, 0, 1, 1, 1, 1, 0}, /* CR */
+ {0, 0, 0, 0, 1, 1, 1, 0}, /* CW */
+ {0, 0, 0, 1, 0, 1, 1, 0}, /* PR */
+ {0, 0, 0, 0, 0, 0, 1, 0}, /* PW */
+ {0, 0, 0, 0, 0, 0, 0, 0}, /* EX */
+ {0, 0, 0, 0, 0, 0, 0, 0} /* PD */
+};

static void gdlm_ast(void *arg)
{
@@ -104,10 +123,13 @@ static int make_mode(const unsigned int lmstate)
return -1;
}

-static u32 make_flags(const u32 lkid, const unsigned int gfs_flags,
+static u32 make_flags(struct gfs2_glock *gl, const unsigned int gfs_flags,
const int req)
{
u32 lkf = 0;
+ u32 lkid = gl->gl_lksb.sb_lkid;
+ int is_upconvert =
+ quecvt_compat_matrix[make_mode(gl->gl_state) + 1][req + 1];

if (gfs_flags & LM_FLAG_TRY)
lkf |= DLM_LKF_NOQUEUE;
@@ -131,8 +153,11 @@ static u32 make_flags(const u32 lkid, const unsigned int gfs_flags,
BUG();
}

- if (lkid != 0)
+ if (lkid != 0) {
lkf |= DLM_LKF_CONVERT;
+ if (is_upconvert)
+ lkf |= DLM_LKF_QUECVT;
+ }

lkf |= DLM_LKF_VALBLK;

@@ -149,7 +174,7 @@ static unsigned int gdlm_lock(struct gfs2_glock *gl,

gl->gl_req = req_state;
req = make_mode(req_state);
- lkf = make_flags(gl->gl_lksb.sb_lkid, flags, req);
+ lkf = make_flags(gl, flags, req);

/*
* Submit the actual lock request.
 
Old 04-10-2012, 02:01 PM
Bob Peterson
 
Default GFS2: Instruct DLM to avoid queue convert slowdowns

----- Original Message -----
| Hi,
|
| On Thu, 2012-04-05 at 12:11 -0400, Bob Peterson wrote:
| > Hi,
| >
| > Here's another patch (explanation below). This patch replies upon
| > a DLM patch that hasn't fully gone upstream yet, so perhaps it
| > shouldn't be added to the nmw tree until it is. This greatly
| > improves the performance of gfs2_grow in a clustered gfs2 file
| > system.
| >
| > Regards,
| >
| I'm still not very keen on dragging in this bit of dlm. If it is
| really
| needed, then we should use the copy in the dlm itself and not add our
| own copy of it. Also, why can we not use the GLF_BLOCKING flag for
| this
| purpose? Is there some issue with try locks in that case?

To answer your questions:
(1) Yes, this bit is really needed.
(2) We cannot use the actual dlm flags because they're part of
dlm/lock.c and not in a header. I can request they be moved
to a header file, but that's at the whim of upstream dlm.
(3) We can't use GLF_BLOCKING because it's a different issue.
The issue is this: Unless instructed otherwise, dlm will
grant GFS2 (or any caller) locks as it sees fit, and when there
is a hotly contested file (such as rindex, once it's been
updated by gfs2_grow and needed immediately by all nodes in
a cluster) dlm often finds it more convenient to grant the lock
to a process on the current owner node rather than bouncing the
lock to a different node.

That can result in one node "hogging" the lock for long periods of
time. That means a gfs2_grow can take 30 minutes to complete
rather than 0.3 seconds. To use a farming analogy: your animals
are starving because the hogs at the trough are eagerly fighting
over the morsel at the bottom, and not giving the farmer enough
of a break to put their food in. To avoid this fighting behavior
dlm has the DLM_LKF_QUECVT bit which basically says lock upgrades
(the farmer) over the convenience of keeping the lock on the same
node (the hogs). Employing this bit does wonders for gfs2_grow.
However, we can only use the bit on dlm lock upgrades. If we
use it in other cases, dlm goes through an error path that
results in a hang.

| When you say that this relies upon this dlm patch, what does that
| mean?

(4) It means that there's a DLM bug whereby using the DLM_LKF_QUECVT
bit causes DLM (and therefore GFS2) to hang in certain situations.
There is a patch to fix this bad DLM behavior but it hasn't gone
upstream yet.

| What are the consequences of having this patch but not the dlm one?

(5) The consequences of having this patch but not the DLM one are
disastrous: gfs2 is likely to hang.

| I'm
| wondering whether I should hold off on this at least until the dlm
| one
| has been finalised and applied.

(6) Yes, that's what we need to do.

| Aside from those points though, this is a really big step forward and
| should make a measurable difference to performance across the board
| when
| contention between multiple nodes occurs. So I'm very happy to see
| such
| significant progress in this area,
|
| Steve.
|
|
| > Bob Peterson
| > Red Hat File Systems
| >
| > Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| > ---
| > Author: Bob Peterson <rpeterso@redhat.com>
| > Date: Wed Apr 4 15:14:51 2012 -0500
| >
| > GFS2: Instruct DLM to avoid queue convert slowdowns
| >
| > This patch instructs DLM to prevent an "in place" conversion,
| > where the
| > lock just stays on the granted queue, and instead forces the
| > conversion to
| > the back of the convert queue. This is done on upward
| > conversions only.
| >
| > This is useful in cases where, for example, a lock is
| > frequently needed in
| > PR on one node, but another node needs it temporarily in EX to
| > update it.
| > This may happen, for example, when the rindex is being updated
| > by gfs2_grow.
| > The gfs2_grow needs to have the lock in EX, but the other nodes
| > need to
| > re-read it to retrieve the updates. The glock is already
| > granted in PR on
| > the non-growing nodes, so this prevents them from continually
| > re-granting
| > the lock in PR, and forces the EX from gfs2_grow to go through.
| >
| > diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
| > index bd5af03..43466d6 100644
| > --- a/fs/gfs2/lock_dlm.c
| > +++ b/fs/gfs2/lock_dlm.c
| > @@ -16,6 +16,25 @@
| > #include "glock.h"
| > #include "util.h"
| >
| > +/*
| > + * Borrowed from dlm's lock.c: We need to only pass the QUECVT bit
| > with
| > + * requests that are compatible, so we use dlm's check:
| > + *
| > + * Compatibility matrix for conversions with QUECVT set.
| > + * Granted mode is the row; requested mode is the column.
| > + * Usage: matrix[grmode+1][rqmode+1]
| > + */
| > +static const int quecvt_compat_matrix[8][8] = {
| > + /* UN NL CR CW PR PW EX PD */
| > + {0, 0, 0, 0, 0, 0, 0, 0}, /* UN */
| > + {0, 0, 1, 1, 1, 1, 1, 0}, /* NL */
| > + {0, 0, 0, 1, 1, 1, 1, 0}, /* CR */
| > + {0, 0, 0, 0, 1, 1, 1, 0}, /* CW */
| > + {0, 0, 0, 1, 0, 1, 1, 0}, /* PR */
| > + {0, 0, 0, 0, 0, 0, 1, 0}, /* PW */
| > + {0, 0, 0, 0, 0, 0, 0, 0}, /* EX */
| > + {0, 0, 0, 0, 0, 0, 0, 0} /* PD */
| > +};
| >
| > static void gdlm_ast(void *arg)
| > {
| > @@ -104,10 +123,13 @@ static int make_mode(const unsigned int
| > lmstate)
| > return -1;
| > }
| >
| > -static u32 make_flags(const u32 lkid, const unsigned int
| > gfs_flags,
| > +static u32 make_flags(struct gfs2_glock *gl, const unsigned int
| > gfs_flags,
| > const int req)
| > {
| > u32 lkf = 0;
| > + u32 lkid = gl->gl_lksb.sb_lkid;
| > + int is_upconvert =
| > + quecvt_compat_matrix[make_mode(gl->gl_state) + 1][req + 1];
| >
| > if (gfs_flags & LM_FLAG_TRY)
| > lkf |= DLM_LKF_NOQUEUE;
| > @@ -131,8 +153,11 @@ static u32 make_flags(const u32 lkid, const
| > unsigned int gfs_flags,
| > BUG();
| > }
| >
| > - if (lkid != 0)
| > + if (lkid != 0) {
| > lkf |= DLM_LKF_CONVERT;
| > + if (is_upconvert)
| > + lkf |= DLM_LKF_QUECVT;
| > + }
| >
| > lkf |= DLM_LKF_VALBLK;
| >
| > @@ -149,7 +174,7 @@ static unsigned int gdlm_lock(struct gfs2_glock
| > *gl,
| >
| > gl->gl_req = req_state;
| > req = make_mode(req_state);
| > - lkf = make_flags(gl->gl_lksb.sb_lkid, flags, req);
| > + lkf = make_flags(gl, flags, req);
| >
| > /*
| > * Submit the actual lock request.
| >
|
|
|
 

Thread Tools




All times are GMT. The time now is 09:46 PM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org