Old 11-30-2010, 03:57 PM
Menyhart Zoltan
 
Patch: making DLM more robust

Hi,

An easy first step to make DLM more robust is to add time-out protection
to the lock space creation operation, while waiting for a "dlm_controld" action.
A new member "ci_dlm_controld_secs" is added to "dlm_config" to set the time-out
in seconds; DEFAULT_DLM_CTRL_SECS is 5 seconds.

At the same time, signals can be enabled and handled, too.

DLM_USER_CREATE_LOCKSPACE will be able to return new error codes:
-EINTR or -ETIMEDOUT.

Could you please tell me why the signals are blocked within "device_write()"?
I think it is safe to allow signals, at least in the original code paths
that wait in an uninterruptible way.

BTW, "sigprocmask()" already calls "recalc_sigpending()", so the explicit
call after it looks redundant:

out_sig:
	sigprocmask(SIG_SETMASK, &tmpsig, NULL);
	recalc_sigpending();


Thanks,

Zoltan Menyhart
diff -Nru linux-2.6.32.x86_64-old/fs/dlm/config.c linux-2.6.32.x86_64/fs/dlm/config.c
--- linux-2.6.32.x86_64-old/fs/dlm/config.c 2010-11-30 16:44:49.000000000 +0100
+++ linux-2.6.32.x86_64/fs/dlm/config.c 2010-11-30 17:12:00.000000000 +0100
@@ -99,6 +99,7 @@
unsigned int cl_log_debug;
unsigned int cl_protocol;
unsigned int cl_timewarn_cs;
+ unsigned int cl_dlm_controld_secs; /* dlm_controld response time-out */
};

enum {
@@ -113,6 +114,7 @@
CLUSTER_ATTR_LOG_DEBUG,
CLUSTER_ATTR_PROTOCOL,
CLUSTER_ATTR_TIMEWARN_CS,
+ CLUSTER_ATTR_DLM_CTRL_SECS, /* dlm_controld response time-out */
};

struct cluster_attribute {
@@ -165,6 +167,7 @@
CLUSTER_ATTR(log_debug, 0);
CLUSTER_ATTR(protocol, 0);
CLUSTER_ATTR(timewarn_cs, 1);
+CLUSTER_ATTR(dlm_controld_secs, 1);

static struct configfs_attribute *cluster_attrs[] = {
[CLUSTER_ATTR_TCP_PORT] = &cluster_attr_tcp_port.attr,
@@ -178,6 +181,7 @@
[CLUSTER_ATTR_LOG_DEBUG] = &cluster_attr_log_debug.attr,
[CLUSTER_ATTR_PROTOCOL] = &cluster_attr_protocol.attr,
[CLUSTER_ATTR_TIMEWARN_CS] = &cluster_attr_timewarn_cs.attr,
+ [CLUSTER_ATTR_DLM_CTRL_SECS] = &cluster_attr_dlm_controld_secs.attr,
NULL,
};

@@ -438,6 +442,7 @@
cl->cl_log_debug = dlm_config.ci_log_debug;
cl->cl_protocol = dlm_config.ci_protocol;
cl->cl_timewarn_cs = dlm_config.ci_timewarn_cs;
+ cl->cl_dlm_controld_secs = dlm_config.ci_dlm_controld_secs;

space_list = &sps->ss_group;
comm_list = &cms->cs_group;
@@ -1010,7 +1015,8 @@
#define DEFAULT_SCAN_SECS 5
#define DEFAULT_LOG_DEBUG 0
#define DEFAULT_PROTOCOL 0
-#define DEFAULT_TIMEWARN_CS 500 /* 5 sec = 500 centiseconds */
+#define DEFAULT_TIMEWARN_CS 500 /* 5 sec = 500 centiseconds */
+#define DEFAULT_DLM_CTRL_SECS 5 /* dlm_controld response time-out */

struct dlm_config_info dlm_config = {
.ci_tcp_port = DEFAULT_TCP_PORT,
@@ -1023,6 +1029,7 @@
.ci_scan_secs = DEFAULT_SCAN_SECS,
.ci_log_debug = DEFAULT_LOG_DEBUG,
.ci_protocol = DEFAULT_PROTOCOL,
- .ci_timewarn_cs = DEFAULT_TIMEWARN_CS
+ .ci_timewarn_cs = DEFAULT_TIMEWARN_CS,
+ .ci_dlm_controld_secs = DEFAULT_DLM_CTRL_SECS,
};

diff -Nru linux-2.6.32.x86_64-old/fs/dlm/config.h linux-2.6.32.x86_64/fs/dlm/config.h
--- linux-2.6.32.x86_64-old/fs/dlm/config.h 2010-11-30 16:44:49.000000000 +0100
+++ linux-2.6.32.x86_64/fs/dlm/config.h 2010-11-30 17:15:43.000000000 +0100
@@ -28,6 +28,7 @@
int ci_log_debug;
int ci_protocol;
int ci_timewarn_cs;
+ int ci_dlm_controld_secs; /* dlm_controld response time-out */
};

extern struct dlm_config_info dlm_config;
diff -Nru linux-2.6.32.x86_64-old/fs/dlm/lockspace.c linux-2.6.32.x86_64/fs/dlm/lockspace.c
--- linux-2.6.32.x86_64-old/fs/dlm/lockspace.c 2010-11-30 16:44:49.000000000 +0100
+++ linux-2.6.32.x86_64/fs/dlm/lockspace.c 2010-11-30 17:35:10.000000000 +0100
@@ -568,7 +568,12 @@
if (error)
goto out_stop;

- wait_for_completion(&ls->ls_members_done);
+ error = wait_for_completion_interruptible_timeout(&ls->ls_members_done,
+ dlm_config.ci_dlm_controld_secs * HZ);
+ if (error <= 0) { /* > 0: completed; 0: timed out; < 0: interrupted */
+ error = error ? -EINTR : -ETIMEDOUT;
+ goto out_members;
+ }
error = ls->ls_members_result;
if (error)
goto out_members;
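
For reference, "wait_for_completion_interruptible_timeout()" reports three outcomes through its return value: a positive number of remaining jiffies when the completion was signalled, 0 when the time-out expired, and a negative value (-ERESTARTSYS) when a signal interrupted the wait. The following is only a userspace model of that mapping, not kernel code; "classify_wait_result()" and "MODEL_ERESTARTSYS" are hypothetical names introduced for the illustration.

```c
#include <errno.h>

/* Assumption: mirrors the kernel-internal ERESTARTSYS value, which
 * userspace <errno.h> does not expose. Used here only for the model. */
#define MODEL_ERESTARTSYS 512

/*
 * Hypothetical helper: translate the result of a
 * wait_for_completion_interruptible_timeout()-style call into the
 * error code the lockspace creation path would report.
 *   ret  > 0 : completed in time   -> 0 (carry on)
 *   ret == 0 : time-out expired    -> -ETIMEDOUT
 *   ret  < 0 : interrupted         -> -EINTR
 */
int classify_wait_result(long ret)
{
	if (ret > 0)
		return 0;
	if (ret == 0)
		return -ETIMEDOUT;
	return -EINTR;
}
```

Note that a plain non-zero test on the return value would treat the successful case (positive) as a failure and let the time-out case (zero) fall through.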
 
Old 11-30-2010, 04:30 PM
David Teigland
 

On Tue, Nov 30, 2010 at 05:57:50PM +0100, Menyhart Zoltan wrote:
> Hi,
>
> An easy first step to make DLM more robust is to add time-out protection
> to the lock space creation operation, while waiting for a "dlm_controld" action.
> A new member "ci_dlm_controld_secs" is added to "dlm_config" to set the time-out
> in seconds; DEFAULT_DLM_CTRL_SECS is 5 seconds.
>
> At the same time, signals can be enabled and handled, too.
>
> DLM_USER_CREATE_LOCKSPACE will be able to return new error codes:
> -EINTR or -ETIMEDOUT.
>
> Could you please tell me why the signals are blocked within "device_write()"?
> I think it is safe to allow signals, at least in the original code paths
> that wait in an uninterruptible way.

Thanks, I'll take a look; as long as it's disabled by default I don't
expect I'd object much. There are two main problems with this idea,
though, that need to be handled before it's generally usable:

1. The kernel can wait on user space indefinitely during completely normal
situations, e.g. the loss of quorum or fencing failures can delay
completion indefinitely. This means you can easily introduce false
failures when using a timeout. EINTR, since it's driven by user
intervention, is a better idea, e.g. killing a mount process.

2. The difficulty, even with EINTR, is correctly and cleanly unwinding the
dlm_controld state.

Dave
 
Old 12-01-2010, 08:23 AM
Menyhart Zoltan
 

David Teigland wrote:


> Thanks, I'll take a look; as long as it's disabled by default I don't
> expect I'd object much. There are two main problems with this idea,
> though, that need to be handled before it's generally usable:
>
> 1. The kernel can wait on user space indefinitely during completely normal
> situations, e.g. the loss of quorum or fencing failures can delay
> completion indefinitely.


In my eyes, a networked application should indicate a failure within a
"human expectable" time delay. E.g.:
- You can try a DLM_USER_CREATE_LOCKSPACE for 5 seconds
- If it times out, you can log it, display some status telling the user
that it has already been retried for H hours M minutes and S seconds
- And retry (if configured to do so on its own) if there is no intervention
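
The retry policy in the list above can be sketched in userspace C. This is only an illustration under stated assumptions: "create_with_retry()", "format_elapsed()" and the "create_fn" callback type are hypothetical names, and "fake_create()" is a stand-in for the real lockspace request so that the example is self-contained.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical callback type: returns 0 on success, -1 with errno set. */
typedef int (*create_fn)(const char *name);

/* Stand-in for the real request: times out a configurable number of times. */
int fake_timeouts_left = 0;

int fake_create(const char *name)
{
	(void)name;
	if (fake_timeouts_left > 0) {
		fake_timeouts_left--;
		errno = ETIMEDOUT;
		return -1;
	}
	return 0;
}

/* Render elapsed seconds as "H hours M minutes S seconds". */
void format_elapsed(long secs, char *buf, size_t len)
{
	snprintf(buf, len, "%ld hours %ld minutes %ld seconds",
		 secs / 3600, (secs % 3600) / 60, secs % 60);
}

/*
 * Retry as long as an attempt fails with ETIMEDOUT, logging how long we
 * have been trying. Stops on success, on any other error, or after
 * max_attempts. timeout_secs is the assumed per-attempt time-out.
 */
int create_with_retry(create_fn create, const char *name,
		      int timeout_secs, int max_attempts)
{
	char elapsed[64];
	int attempt;

	for (attempt = 1; attempt <= max_attempts; attempt++) {
		if (create(name) == 0)
			return 0;
		if (errno != ETIMEDOUT)
			return -1;	/* not a time-out: give up, errno is set */
		format_elapsed((long)attempt * timeout_secs,
			       elapsed, sizeof(elapsed));
		fprintf(stderr, "lockspace %s: retried for %s\n",
			name, elapsed);
	}
	errno = ETIMEDOUT;	/* every attempt timed out */
	return -1;
}
```

A caller would pass its own function in place of "fake_create", for example a thin wrapper around the library call, and choose "max_attempts" according to how long it is willing to wait without intervention.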


> This means you can easily introduce false
> failures when using a timeout.


If we cannot obtain a given resource within a limited time frame,
then it is a real error for the customer: s/he cannot mount an OCFS2
volume, cannot issue a cluster command, etc.


> EINTR, since it's driven by user
> intervention, is a better idea, e.g. killing a mount process.
>
> 2. The difficulty, even with EINTR, is correctly and cleanly unwinding the
> dlm_controld state.


Let's take this example in dlm/libdlm/libdlm.c:

int create_lockspace_v6(const char *name, uint32_t flags)
{
	char reqbuf[sizeof(struct dlm_write_request) + DLM_LOCKSPACE_LEN];
	struct dlm_write_request *req = (struct dlm_write_request *)reqbuf;
	int namelen = strlen(name);

	memset(reqbuf, 0, sizeof(reqbuf));
	set_version_v6(req);
	req->cmd = DLM_USER_CREATE_LOCKSPACE;
	req->i.lspace.flags = flags;
	if (namelen > DLM_LOCKSPACE_LEN) {
		errno = EINVAL;
		return -1;
	}
	memcpy(req->i.lspace.name, name, namelen);
	return write(control_fd, req, sizeof(*req) + namelen);
}

The caller should already be prepared to unwind everything in case
EINVAL is returned due to a name length error.
"write()" can also return several other errors.

We will have two more error codes:

EINTR: there is not much difference whether the signal arrives just before we
call "write()" or inside the system call...
If you already ignore it... If you already handle it...

ETIMEDOUT: see above

There should be a smooth way out from errors, other than hard resetting the
machine :-)
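
A minimal sketch of how a caller might branch on these codes, assuming it captures errno right after the "write()" returns. The enum and "lockspace_next_action()" are hypothetical names used only for this illustration; they are not part of libdlm.

```c
#include <errno.h>

/* Hypothetical action codes, for illustration only. */
enum ls_action {
	LS_DONE,	/* request accepted */
	LS_RETRY,	/* timed out: may be retried later */
	LS_ABORT,	/* interrupted: the user asked to stop */
	LS_FAIL		/* anything else: report and unwind */
};

/*
 * Decide what to do after a create-lockspace request.
 * rc is the return value of write(); err is errno captured right after.
 */
enum ls_action lockspace_next_action(long rc, int err)
{
	if (rc >= 0)
		return LS_DONE;
	switch (err) {
	case ETIMEDOUT:
		return LS_RETRY;
	case EINTR:
		return LS_ABORT;
	default:
		return LS_FAIL;	/* EINVAL and the other write() errors */
	}
}
```

The two new codes only add branches to error handling the caller needs anyway; the existing EINVAL path already falls into the "report and unwind" case.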

Thanks,

Zoltan Menyhart
 
Old 12-01-2010, 04:27 PM
David Teigland
 

On Wed, Dec 01, 2010 at 10:23:25AM +0100, Menyhart Zoltan wrote:
> If we cannot obtain a given resource within a limited time frame,
> then it is a real error for the customer: s/he cannot mount an OCFS2
> volume, cannot issue a cluster command, etc.

Matter of opinion and preference I suppose.

> >2. The difficulty, even with EINTR, is correctly and cleanly unwinding the
> >dlm_controld state.
>
> Let's take this example in dlm/libdlm/libdlm.c:

The problem is not backing out of libdlm, it's leaving the cpg group, etc
in dlm_controld (when the join itself is not even complete). It should
all be possible, but I've never viewed this as a problem worth fixing
given the effort required.

Dave
 
