08-29-2012, 11:28 AM
Steven Whitehouse

GFS2: Use congestion statistics to select rgrps

Hi,

Here is a quick patch I wrote this morning to demonstrate the principle
of selecting rgrps by using the recently added glock statistics.

The main issue here is not how to use the stats, but in fact what to
do when we've first mounted the filesystem and we don't have
enough data gathered in order to generate some meaningful stats. This
patch takes the route of increasing the margin which is used to
make the decision. Maybe not ideal, but possibly "good enough".
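
For readers who want to see the decision rule in isolation, here is a
minimal userspace sketch of it. This is not the kernel code: the struct
lk_sample type and the rgrp_congested() name are made up for illustration,
and the scale factor is computed with signed arithmetic and clamped at 2,
which is a choice of this sketch. The threshold of 8 samples and the
doubling when loops == 1 follow the patch below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one set of glock statistics. */
struct lk_sample {
	int64_t srttb;    /* smoothed round trip time, blocking requests */
	int64_t srttvarb; /* variance estimate for srttb */
	int64_t dcount;   /* number of samples gathered so far */
};

/*
 * Return true if this rgrp's lock looks significantly slower than the
 * average over all rgrp locks. The margin is the sum of both variance
 * estimates, widened when there is little data and again when loops == 1.
 */
static bool rgrp_congested(const struct lk_sample *rgrp,
			   const struct lk_sample *avg, int loops)
{
	int64_t var = rgrp->srttvarb + avg->srttvarb;

	/* Not much data yet: use a larger margin. */
	if (rgrp->dcount < 8 || avg->dcount < 8) {
		int64_t scale = 8 - rgrp->dcount;

		if (scale < 2)
			scale = 2;
		var *= scale;
	}

	/* Second pass over the rgrps: be less picky. */
	if (loops == 1)
		var *= 2;

	return rgrp->srttb > avg->srttb + var;
}
```

With plenty of samples on both sides, an rgrp is rejected only when its
round trip time exceeds the average by more than the combined variance.
With a single sample on the rgrp, the same margin is seven times wider,
so a freshly mounted filesystem rarely skips an rgrp.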

There are any number of tweaks which could be added to this in the
future in order to improve specific cases.

Anyway, this is very much a work in progress, so further thoughts on
how to improve this are very welcome at this stage,

Steve.


diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index defb826..304c6fd 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1651,6 +1651,65 @@ static void try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked, u64 skip
return;
}

+/**
+ * gfs2_rgrp_congested - Use stats to figure out whether an rgrp is congested
+ * @rgd: The rgrp in question
+ * @loops: An indication of how picky we can be (0=very, 1=less so)
+ *
+ * This function uses the recently added glock statistics in order to
+ * figure out whether a particular resource group is suffering from
+ * contention from multiple nodes. This is done purely on the basis
+ * of timings, since this is the only data we have to work with and
+ * our aim here is to reject a resource group which is highly contended
+ * but (very important) not to do this too often in order to ensure that
+ * we do not land up introducing fragmentation by changing resource
+ * groups when not actually required.
+ *
+ * The calculation is fairly simple, we want to know whether the SRTTB
+ * (i.e. smoothed round trip time for blocking operations) to acquire
+ * the lock for this rgrp's glock is significantly greater than the
+ * time taken for resource groups on average. We introduce a margin in
+ * the form of the variable @var which is computed as the sum of the two
+ * respective variances, and multiplied by a factor depending on @loops
+ * and whether we have a lot of data to base the decision on.
+ *
+ * Returns: A boolean verdict on the congestion status
+ */
+
+static bool gfs2_rgrp_congested(const struct gfs2_rgrpd *rgd, int loops)
+{
+ const struct gfs2_glock *gl = rgd->rd_gl;
+ const struct gfs2_sbd *sdp = gl->gl_sbd;
+ struct gfs2_lkstats *st;
+ s64 r_dcount, l_dcount;
+ s64 r_srttb, l_srttb;
+ s64 var;
+
+ preempt_disable();
+ st = &this_cpu_ptr(sdp->sd_lkstats)->lkstats[LM_TYPE_RGRP];
+ r_srttb = st->stats[GFS2_LKS_SRTTB];
+ r_dcount = st->stats[GFS2_LKS_DCOUNT];
+ var = st->stats[GFS2_LKS_SRTTVARB] +
+ gl->gl_stats.stats[GFS2_LKS_SRTTVARB];
+ preempt_enable();
+
+ l_srttb = gl->gl_stats.stats[GFS2_LKS_SRTTB];
+ l_dcount = gl->gl_stats.stats[GFS2_LKS_DCOUNT];
+
+ WARN_ON(var < 0);
+ WARN_ON(l_srttb < 0);
+ WARN_ON(r_srttb < 0);
+
+ /* If we do not have much data then use a larger margin */
+ if ((r_dcount < 8) || (l_dcount < 8))
+ var *= max_t(s64, 8 - l_dcount, 2); /* signed: avoid wrap when l_dcount > 8 */
+
+ if (loops == 1)
+ var *= 2;
+
+ return (l_srttb > (r_srttb + var));
+}
+
static bool gfs2_select_rgrp(struct gfs2_rgrpd **pos, const struct gfs2_rgrpd *begin)
{
struct gfs2_rgrpd *rgd = *pos;
@@ -1677,7 +1736,7 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, u32 requested)
struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
struct gfs2_rgrpd *begin = NULL;
struct gfs2_blkreserv *rs = ip->i_res;
- int error = 0, rg_locked, flags = LM_FLAG_TRY;
+ int error = 0, rg_locked, flags = 0;
u64 last_unlinked = NO_BLOCK;
int loops = 0;

@@ -1704,10 +1763,11 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, u32 requested)
error = gfs2_glock_nq_init(rs->rs_rbm.rgd->rd_gl,
LM_ST_EXCLUSIVE, flags,
&rs->rs_rgd_gh);
- if (error == GLR_TRYFAILED)
- goto next_rgrp;
if (unlikely(error))
return error;
+ if (!gfs2_rs_active(rs) && (loops < 2) &&
+ gfs2_rgrp_congested(rs->rs_rbm.rgd, loops))
+ goto next_rgrp;
if (sdp->sd_args.ar_rgrplvb) {
error = update_rgrp_lvb(rs->rs_rbm.rgd);
if (unlikely(error)) {
@@ -1759,7 +1819,6 @@ next_rgrp:
* then this checks for some less likely conditions before
* trying again.
*/
- flags &= ~LM_FLAG_TRY;
loops++;
/* Check that fs hasn't grown if writing to rindex */
if (ip == GFS2_I(sdp->sd_rindex) && !sdp->sd_rindex_uptodate) {
 
