Posted 04-16-2008, 02:30 PM

Cluster Project branch, STABLE2, updated. cluster-2.03.00-15-gebdcd11

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "Cluster Project".

http://sources.redhat.com/git/gitweb.cgi?p=cluster.git;a=commitdiff;h=ebdcd110c758fe0c95285708626f02d7d67c1258

The branch, STABLE2 has been updated
via ebdcd110c758fe0c95285708626f02d7d67c1258 (commit)
from c1b13275db8d088ba9604b274fb3e0ec17d426dc (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit ebdcd110c758fe0c95285708626f02d7d67c1258
Author: David Teigland <teigland@redhat.com>
Date: Wed Apr 16 09:22:27 2008 -0500

gfs_controld: retry recovery for withdrawn journal

bz 442451

This is unfortunate, but seems to be the best solution available. The
problem, described more fully in the bz, is that when gfs_controld tries
to do recovery on a journal for a withdraw, the withdrawing node may not
yet have cleared its dlm locks. This means the journal lock may still be
held by the withdrawing node, causing all the recovering node(s) to fail
acquiring it, and no one does the recovery. The solution is for all
recovering nodes to retry recovery of a withdrawn journal until they
succeed (only the first to get the journal lock will actually recover
it, the others will see it's recovered and report success.)

Signed-off-by: David Teigland <teigland@redhat.com>

-----------------------------------------------------------------------
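
The retry behaviour described in the log message can be sketched outside of
gfs_controld as a small model. This is only an illustration under stated
assumptions, not the daemon's code: struct journal, try_recover() and
recover_with_retry() are hypothetical names, time is modelled as integer
"ticks" rather than real sleeps, and the attempt limit is an addition of the
sketch (the commit itself does not state one).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the retry idea; none of these names exist in
 * gfs_controld. */
enum result { RES_GAVEUP, RES_SUCCESS };

struct journal {
    int  lock_release_tick; /* tick at which the withdrawing node frees its lock */
    bool recovered;         /* set once some node has replayed the journal */
};

/* One recovery attempt by one node at the given tick. */
static enum result try_recover(struct journal *j, int tick)
{
    if (tick < j->lock_release_tick)
        return RES_GAVEUP;      /* journal lock still held by the withdrawing node */
    if (!j->recovered)
        j->recovered = true;    /* first node to get the lock does the recovery */
    return RES_SUCCESS;         /* later nodes see it is recovered and report success */
}

/* Retry until an attempt succeeds. Returns the number of attempts used,
 * or -1 if max_attempts (an assumption of this sketch) is exhausted. */
static int recover_with_retry(struct journal *j, int start_tick, int max_attempts)
{
    assert(max_attempts > 0);
    for (int i = 0; i < max_attempts; i++) {
        if (try_recover(j, start_tick + i) == RES_SUCCESS)
            return i + 1;
        /* the actual patch pauses here with usleep(500000) before retrying */
    }
    return -1;
}
```

With lock_release_tick set to 3, a node starting at tick 0 gives up three
times and succeeds on its fourth attempt; a second node calling
recover_with_retry() afterwards succeeds immediately without redoing the
recovery, which mirrors the "others will see it's recovered" case.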

Summary of changes:
group/gfs_controld/recover.c | 19 +++++++++++++++++++
1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/group/gfs_controld/recover.c b/group/gfs_controld/recover.c
index 9ce3aa7..52d96ff 100644
--- a/group/gfs_controld/recover.c
+++ b/group/gfs_controld/recover.c
@@ -1913,6 +1913,25 @@ int kernel_recovery_done(char *table)

 	switch (atoi(buf)) {
 	case LM_RD_GAVEUP:
+		/*
+		 * This is unfortunate; it's needed for bz 442451 where
+		 * gfs-kernel fails to acquire the journal lock on all nodes
+		 * because a withdrawing node has not yet called
+		 * dlm_release_lockspace() to free it's journal lock. With
+		 * this, all nodes should repeatedly try to to recover the
+		 * journal of the withdrawn node until the withdrawing node
+		 * clears its dlm locks, and gfs on each of the remaining nodes
+		 * succeeds in doing the recovery.
+		 */
+
+		if (memb->withdrawing) {
+			log_group(mg, "recovery_done jid %d nodeid %d retry "
+				  "for withdraw", memb->jid, memb->nodeid);
+			memb->tell_gfs_to_recover = 1;
+			memb->wait_gfs_recover_done = 0;
+			usleep(500000);
+		}
+
 		memb->local_recovery_status = RS_GAVEUP;
 		ss = "gaveup";
 		break;


hooks/post-receive
--
Cluster Project