Linux Archive

Linux Archive (http://www.linux-archive.org/)
-   Cluster Development (http://www.linux-archive.org/cluster-development/)
-   -   GFS2: Use congestion statistics to select rgrps (http://www.linux-archive.org/cluster-development/698807-gfs2-use-congestion-statistics-select-rgrps.html)

Steven Whitehouse 08-29-2012 11:28 AM

GFS2: Use congestion statistics to select rgrps
 
Hi,

Here is a quick patch I wrote this morning to demonstrate the priciple
of selecting rgrps by using the recently added glock statistics.

The main issue here is not how to use the stats, but in fact what to
do when we've first mounted the filesystem and we don't have
enough data gathered in order to generate some meaningful stats. This
patch takes the route of increasing the margin which is used to
make the decision. Maybe not ideal, but possibly "good enough".

There are any number of tweeks which could be added to this in the
future in order to improve specific cases.

Anyway, this is very much a work in progress, so further thoughts on
how to improve this are very welcome at this stage,

Steve.


diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index defb826..304c6fd 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1651,6 +1651,65 @@ static void try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked, u64 skip
return;
}

+/**
+ * gfs2_rgrp_congested - Use stats to figure out whether an rgrp is congested
+ * @rgd: The rgrp in question
+ * @loops: An indication of how picky we can be (0=very, 1=less so)
+ *
+ * This function uses the recently added glock statistics in order to
+ * figure out whether a parciular resource group is suffering from
+ * contention from multiple nodes. This is done purely on the basis
+ * of timings, since this is the only data we have to work with and
+ * our aim here is to reject a resource group which is highly contended
+ * but (very important) not to do this too often in order to ensure that
+ * we do not land up introducing fragmentation by changing resource
+ * groups when not actually required.
+ *
+ * The calculation is fairly simple, we want to know whether the SRTTB
+ * (i.e. smoothed round trip time for blocking operations) to acquire
+ * the lock for this rgrp's glock is significantly greater than the
+ * time taken for resource groups on average. We introduce a margin in
+ * the form of the variable @var which is computed as the sum of the two
+ * respective variences, and multiplied by a factor depending on @loops
+ * and whether we have a lot of data to base the decision on.
+ *
+ * Returns: A boolean verdict on the congestion status
+ */
+
+static bool gfs2_rgrp_congested(const struct gfs2_rgrpd *rgd, int loops)
+{
+ const struct gfs2_glock *gl = rgd->rd_gl;
+ const struct gfs2_sbd *sdp = gl->gl_sbd;
+ struct gfs2_lkstats *st;
+ s64 r_dcount, l_dcount;
+ s64 r_srttb, l_srttb;
+ s64 var;
+
+ preempt_disable();
+ st = &this_cpu_ptr(sdp->sd_lkstats)->lkstats[LM_TYPE_RGRP];
+ r_srttb = st->stats[GFS2_LKS_SRTTB];
+ r_dcount = st->stats[GFS2_LKS_DCOUNT];
+ var = st->stats[GFS2_LKS_SRTTVARB] +
+ gl->gl_stats.stats[GFS2_LKS_SRTTVARB];
+ preempt_enable();
+
+ l_srttb = gl->gl_stats.stats[GFS2_LKS_SRTTB];
+ l_dcount = gl->gl_stats.stats[GFS2_LKS_DCOUNT];
+
+ WARN_ON(var < 0);
+ WARN_ON(l_srttb < 0);
+ WARN_ON(r_srttb < 0);
+
+ /* If we do not have much data then use a larger margin */
+ if ((r_dcount < 8) || (l_dcount < 8))
+ var *= max(8ULL - l_dcount, 2ULL);
+
+ if (loops == 1)
+ var *= 2;
+
+ return (l_srttb > (r_srttb + var));
+}
+
static bool gfs2_select_rgrp(struct gfs2_rgrpd **pos, const struct gfs2_rgrpd *begin)
{
struct gfs2_rgrpd *rgd = *pos;
@@ -1677,7 +1736,7 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, u32 requested)
struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
struct gfs2_rgrpd *begin = NULL;
struct gfs2_blkreserv *rs = ip->i_res;
- int error = 0, rg_locked, flags = LM_FLAG_TRY;
+ int error = 0, rg_locked, flags = 0;
u64 last_unlinked = NO_BLOCK;
int loops = 0;

@@ -1704,10 +1763,11 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, u32 requested)
error = gfs2_glock_nq_init(rs->rs_rbm.rgd->rd_gl,
LM_ST_EXCLUSIVE, flags,
&rs->rs_rgd_gh);
- if (error == GLR_TRYFAILED)
- goto next_rgrp;
if (unlikely(error))
return error;
+ if (!gfs2_rs_active(rs) && (loops < 2) &&
+ gfs2_rgrp_congested(rs->rs_rbm.rgd, loops))
+ goto next_rgrp;
if (sdp->sd_args.ar_rgrplvb) {
error = update_rgrp_lvb(rs->rs_rbm.rgd);
if (unlikely(error)) {
@@ -1759,7 +1819,6 @@ next_rgrp:
* then this checks for some less likely conditions before
* trying again.
*/
- flags &= ~LM_FLAG_TRY;
loops++;
/* Check that fs hasn't grown if writing to rindex */
if (ip == GFS2_I(sdp->sd_rindex) && !sdp->sd_rindex_uptodate) {


All times are GMT. The time now is 05:51 PM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.