Linux Archive > Redhat > Cluster Development
Old 10-27-2010, 09:17 PM
Lon Hohberger
 
Default rgmanager: Halt services if CMAN dies

If cman dies because it receives a kill packet (of doom)
from other hosts, rgmanager does not notice. This can
happen if, for example, you are using qdiskd and it hangs
on I/O to the quorum disk due to frequent trespasses or
other SAN interruptions. The other instance of qdiskd
will ask CMAN to evict the hung node, causing it to be
ejected from the cluster and fenced.

Data is safe (which is the top priority). If power-cycle
fencing is in use, there is no issue at all; the node
reboots and service failover occurs fairly quickly.

However, problems can arise if, in the same hung-I/O
situation:

* storage-level fencing is in use

* rgmanager has one or more IP addresses in use
as part of cluster services.

This is because more recent versions of the IP resource
agent actually ping the IP address prior to bringing it
online for use by services. This prevents accidental
take-over of IP addresses in use by other hosts on the
network due to an administrator mistake when setting up
the cluster.

Unfortunately, this behavior also prevents service
failover if the presumed-dead host is still online.

This patch causes rgmanager to use poll() instead of
select() when dealing with the baseline CMAN connection
it uses for receiving membership changes and so forth.

If the socket is closed by CMAN (whether because CMAN died
or for some other reason), rgmanager can now detect that
and treats it as an emergency cluster shutdown request:
it halts all services and exits as quickly as possible.

Unfortunately, there is a race between this emergency
action and recovery on the surviving host. It is not
possible for rgmanager to guarantee that all services will
halt after the node has been fenced from shared storage
(but before the other host attempts to start the
service(s)).

Furthermore, a hung 'stop' request caused by loss of
access to shared storage may very well cause rgmanager
to hang forever, preventing some services (or parts)
from ever actually being killed.

A main use case for storage-level fencing over power-
cycling is the ability to perform a post-mortem RCA of
whatever caused the node to die in the first place. This
implies that having rgmanager kill the host would be an
incorrect resolution.

Resolves: rhbz#639961

Signed-off-by: Lon Hohberger <lhh@redhat.com>
---
rgmanager/src/clulib/msg_cluster.c | 32 ++++++++++++++++++++++----------
1 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/rgmanager/src/clulib/msg_cluster.c b/rgmanager/src/clulib/msg_cluster.c
index 4ec3750..00f28c3 100644
--- a/rgmanager/src/clulib/msg_cluster.c
+++ b/rgmanager/src/clulib/msg_cluster.c
@@ -34,7 +34,9 @@
#include <gettid.h>
#include <cman-private.h>
#include <clulog.h>
+#include <poll.h>

+static void process_cman_event(cman_handle_t handle, void *private, int reason, int arg);
/* Ripped from ccsd's setup_local_socket */

int cluster_msg_close(msgctx_t *ctx);
@@ -165,18 +167,17 @@ static int
poll_cluster_messages(int timeout)
{
int ret = -1;
- fd_set rfds;
- int fd, lfd, max;
+ int fd, lfd;
struct timeval tv;
struct timeval *p = NULL;
cman_handle_t ch;
+ struct pollfd fds[2];

if (timeout >= 0) {
p = &tv;
tv.tv_sec = tv.tv_usec = timeout;
}

- FD_ZERO(&rfds);

/* This sucks - it could cause other threads trying to get a
membership list to block for a long time. Now, that should not
@@ -195,20 +196,31 @@ poll_cluster_messages(int timeout)
cman_unlock(ch);
return 0;
}
- FD_SET(fd, &rfds);
- FD_SET(lfd, &rfds);

- max = (lfd > fd ? lfd : fd);
- if (select(max + 1, &rfds, NULL, NULL, p) > 0) {
+ fds[0].fd = lfd;
+ fds[1].fd = fd;
+ fds[0].events = POLLIN | POLLHUP | POLLERR;
+ fds[1].events = POLLIN | POLLHUP | POLLERR;
+
+ if (poll(fds, 2, timeout * 1000) > 0) {
+
/* Someone woke us up */
- if (FD_ISSET(lfd, &rfds)) {
+ if (fds[0].revents & POLLIN) {
cman_unlock(ch);
errno = EAGAIN;
return -1;
}

- cman_dispatch(ch, 0);
- ret = 0;
+ if (fds[1].revents & (POLLHUP | POLLERR)) {
+ process_cman_event(ch, NULL,
+ CMAN_REASON_TRY_SHUTDOWN,
+ 0);
+ }
+
+ if (fds[1].revents & POLLIN) {
+ cman_dispatch(ch, 0);
+ ret = 0;
+ }
}
cman_unlock(ch);

--
1.7.2.3
 
Old 10-28-2010, 06:38 AM
"Fabio M. Di Nitto"
 
Default rgmanager: Halt services if CMAN dies

Looks sane to me.

Fabio

On 10/27/2010 11:17 PM, Lon Hohberger wrote:
> [snip]
 
Old 10-28-2010, 01:44 PM
Lon Hohberger
 
Default rgmanager: Halt services if CMAN dies

On 10/28/2010 02:38 AM, Fabio M. Di Nitto wrote:

> Looks sane to me.

Argh, I thought it looked familiar:

http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=82faccc5e341c9e7131d8db5a5e524a8ebfe30dc

I'm going to retest RHEL5 with the already-merged STABLE31 commit.

-- Lon
 
