01-07-2008, 04:39 AM
Wendy Cheng

NLM failover unlock commands

We've implemented two new NFSD procfs files:

o /proc/fs/nfsd/unlock_ip
o /proc/fs/nfsd/unlock_filesystem

They are intended to allow an admin or a user-mode script to release NLM
locks based on either a path name or a server in-bound IP address (IPv4
for now), as in:

shell> echo 10.1.1.2 > /proc/fs/nfsd/unlock_ip
shell> echo /mnt/sfs1 > /proc/fs/nfsd/unlock_filesystem

The expected usage is in a High Availability (HA) environment where NFS
servers are clustered together to provide either load balancing or
take-over upon server failure. The task is normally started by transferring a
floating IP address from serverA to serverB with the following sequence:


ServerA:
1. Tear down the IP address
2. Unexport the path
3. Write IP to /proc/fs/nfsd/unlock_ip to unlock files
4. If an unmount is required, write the path name to
/proc/fs/nfsd/unlock_filesystem, then unmount.
5. Signal peer to begin take-over.
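
Steps 3 and 4 on ServerA can be sketched from user space in C. This is only a minimal sketch and not part of the patch: write_control_file() and release_nlm_locks() are hypothetical helper names, and the only things taken from the post are the two procfs paths and the order of the writes.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write one command string (an IP address or a path) to a control file.
 * Returns 0 on success or a negative errno value.
 * Hypothetical helper, not taken from the patch.
 */
int write_control_file(const char *ctl_path, const char *value)
{
	size_t len;
	ssize_t written;
	int fd, err = 0;

	assert(ctl_path != NULL && value != NULL);
	len = strlen(value);

	fd = open(ctl_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* Send the command in a single write() call. */
	written = write(fd, value, len);
	if (written < 0)
		err = -errno;
	else if ((size_t)written != len)
		err = -EIO;

	if (close(fd) < 0 && err == 0)
		err = -errno;
	return err;
}

/*
 * Steps 3 and 4 on ServerA. Pass fs_path == NULL when no unmount is
 * planned; the caller unmounts afterwards if needed.
 */
int release_nlm_locks(const char *floating_ip, const char *fs_path)
{
	int err = write_control_file("/proc/fs/nfsd/unlock_ip", floating_ip);

	if (err || fs_path == NULL)
		return err;
	return write_control_file("/proc/fs/nfsd/unlock_filesystem", fs_path);
}
```

A failover script would call release_nlm_locks("10.1.1.2", "/mnt/sfs1") after tearing down the address and unexporting the path, and signal the peer only if it returns 0.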

For details, check out:
http://people.redhat.com/wcheng/Patches/NFS/NLM/004.txt

Acknowledgment goes to Neil Brown, who has offered support and
guidance during our prototype efforts.


-- Wendy
 
01-08-2008, 04:18 AM
Neil Brown

NLM failover unlock commands

On Monday January 7, wcheng@redhat.com wrote:
> We've implemented two new NFSD procfs files:
>
> o /proc/fs/nfsd/unlock_ip
> o /proc/fs/nfsd/unlock_filesystem
>
> They are intended to allow admin or user mode script to release NLM
> locks based on either a path name or a server in-bound ip address (ipv4
> for now)
> as;
>
> shell> echo 10.1.1.2 > /proc/fs/nfsd/unlock_ip
> shell> echo /mnt/sfs1 > /proc/fs/nfsd/unlock_filesystem

I'm happy with this interface and the code looks credible, so
Acked-by: NeilBrown <neilb@suse.de>

however......


> --- linux-o/fs/nfsd/nfsctl.c 2008-01-04 10:01:08.000000000 -0500
> +++ linux/fs/nfsd/nfsctl.c 2008-01-06 15:27:34.000000000 -0500
> @@ -288,6 +295,56 @@ static ssize_t write_getfd(struct file *
> return err;
> }
>
> +extern __u32 in_aton(const char *str);

Bad. It is "__be32" in linux/inet.h, and the difference can be
important.
Can you just #include <linux/inet.h> ???
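
To illustrate why the annotation matters: in_aton() hands back the address in network (big-endian) byte order, which is what __be32 documents, while a plain __u32 suggests a host-order value. Below is a user-space sketch of the same distinction. It assumes inet_pton() as a stand-in for in_aton(), and parse_ipv4_be() is a made-up name.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * User-space stand-in for the kernel's in_aton(): parse a dotted-quad
 * string into an address kept in network byte order (what the kernel
 * annotates as __be32). Returns 0 on success, -1 on malformed input.
 * Hypothetical helper for this sketch only.
 */
int parse_ipv4_be(const char *str, uint32_t *addr_be)
{
	struct in_addr in;

	assert(str != NULL && addr_be != NULL);
	if (inet_pton(AF_INET, str, &in) != 1)
		return -1;
	*addr_be = in.s_addr;	/* still big-endian, do not do arithmetic on it */
	return 0;
}
```

The bytes in memory are always 10, 1, 1, 2 for "10.1.1.2"; only after ntohl() is the value a host-order integer. Comparing a __be32 against a host-order u32 would silently fail on little-endian machines.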

> +
> +static
> +ssize_t failover_parse(int where, struct file *file, char *buf, size_t size)
> +{
> + char *fo_path, *mesg;
> + __be32 server_ip[4];

Why '4' ???

Also, fo_path is only sometimes a path, so the name choice could be
confusing. You use "data" in the formal parameters for nfsd_fo_cmd,
which is more idiomatic at least.
Maybe we should have a
union unlock_args {
char *path;
__be32 IPv4;
};
and pass around a pointer to such a union?
If you don't like that I would be happy with a 'void*', but not with a
'char *' called path.
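
A rough sketch of how such a union might be used, written as plain user-space C. The enum, the uint32_t member (standing in for __be32) and the validity rules are assumptions made for illustration and are not in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Which member of the union is valid; hypothetical tag for this sketch. */
enum unlock_kind { UNLOCK_BY_PATH, UNLOCK_BY_IP };

union unlock_args {
	const char *path;	/* valid for UNLOCK_BY_PATH */
	uint32_t ipv4_be;	/* valid for UNLOCK_BY_IP, network byte order */
};

/*
 * Returns 1 if the argument looks usable for the given command, else 0.
 * Only the member selected by 'kind' is ever read.
 */
int unlock_args_valid(enum unlock_kind kind, const union unlock_args *args)
{
	assert(args != NULL);
	switch (kind) {
	case UNLOCK_BY_PATH:
		/* assumption: only absolute paths are accepted */
		return args->path != NULL && args->path[0] == '/';
	case UNLOCK_BY_IP:
		/* assumption: 0.0.0.0 is treated as "no address" */
		return args->ipv4_be != 0;
	}
	return 0;
}
```

The point is that the callee's signature says what it may receive, instead of a 'char *' that is sometimes a path and sometimes an address.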

> @@ -717,7 +776,6 @@ static void __exit exit_nfsd(void)
> nfsd4_free_slabs();
> unregister_filesystem(&nfsd_fs_type);
> }
> -
> MODULE_AUTHOR("Olaf Kirch <okir@monad.swb.de>");
> MODULE_LICENSE("GPL");
> module_init(init_nfsd)

Any good reason for removing this blank line?


> +int nlmsvc_fo_match(struct nlm_host *dummy1, struct nlm_host *dummy2)
> +{
> + return 1;
> +}

White space damage. Did you run checkpatch.pl??

> +int
> +nlmsvc_fo_cmd(int cmd, void *datap, int grace_time)
> +{
> + nlm_fo_cmd fo_cmd;
> + int rc=-EINVAL;
> +
> + fo_printk("lockd: nlmsvc_fo_cmd enter, cmd=%d, datap=0x%p, gp=%d\n",
> + cmd, datap, grace_time);
> +
> + fo_cmd.cmd = cmd;
> + fo_cmd.stat = 0;
> + fo_cmd.gp = 0;
> + fo_cmd.datap = datap;
> +
> + /* "if" place holder for NFSD_FO_RESUME */
> + {
> + /* fo_start */
> + rc = nlm_traverse_files((struct nlm_host*) &fo_cmd,
> + nlmsvc_fo_match);
> + fo_printk("nlmsvc_fo_cmd rc=%d, stat=%d\n", rc, fo_cmd.stat);
> + }
> +
> + return rc;
> +}
> +
> +EXPORT_SYMBOL(nlmsvc_fo_cmd);

I think today's convention is to not have a blank line before
EXPORT_SYMBOL. checkpatch.pl should pick this up for you.

> --- linux-o/include/linux/lockd/lockd.h 2008-01-04 10:01:08.000000000 -0500
> +++ linux/include/linux/lockd/lockd.h 2008-01-06 15:14:55.000000000 -0500
> @@ -39,7 +39,7 @@
> struct nlm_host {
> struct hlist_node h_hash; /* doubly linked list */
> struct sockaddr_in h_addr; /* peer address */
> - struct sockaddr_in h_saddr; /* our address (optional) */
> + struct sockaddr_in h_saddr; /* our address (optional) */
> struct rpc_clnt * h_rpcclnt; /* RPC client to talk to peer */
> char * h_name; /* remote hostname */
> u32 h_version; /* interface version */

This change is purely white-space breakage.


> @@ -214,6 +215,17 @@ void nlmsvc_mark_resources(void);
> void nlmsvc_free_host_resources(struct nlm_host *);
> void nlmsvc_invalidate_all(void);
>
> +/* cluster failover support */
> +
> +typedef struct {
> + int cmd;
> + int stat;
> + int gp;
> + void *datap;
> +} nlm_fo_cmd;

gp??? I guess that means 'grace period'. It isn't used at all in
this patch. Ideally it should only be introduced in the patch which
uses it, but it definitely needs a better name - and preferably a
comment.

NeilBrown
 
01-08-2008, 04:02 PM
Christoph Hellwig

NLM failover unlock commands

On Mon, Jan 07, 2008 at 12:39:25AM -0500, Wendy Cheng wrote:
> +#define DEBUG 0
> +#define fo_printk(x...) ((void)(DEBUG && printk(x)))

Please don't introduce more debugging helpers but use the existing
ones.

> +extern __u32 in_aton(const char *str);

This is properly declared in <linux/inet.h>

> + return (nfsd_fo_cmd(where, fo_path, 0));

no braces around the return values please. (happens multiple times)

> +int
> +nfsd_fo_cmd(int cmd, char *datap, int grace_period)
> +{
> + struct nameidata nd;
> + void *objp = (void *)datap;
> + int rc=0;
> +
> + if (cmd == NFSD_FO_PATH) {
> + rc = path_lookup((const char *)datap, 0, &nd);
> + if (rc) {
> + fo_printk("nfsd: nfsd_fo path (%s) not found\n", datap);
> + return rc;
> + }
> + fo_printk("nfsd: nfsd_fo lookup path = (0x%p,0x%p)\n",
> + nd.mnt, nd.dentry);
> + objp = (void *) &nd;
> + }
> + return (nlmsvc_fo_cmd(cmd, objp, grace_period));

this has nothing in common for the two cases except for the final
function call. Please just inline this function into the caller which
gives you quite a bit of nice cleanup by not passing all the parameters
in odd ways as well.

And btw, I think this code has quite a bit too much debug printks,
almost more than code. It'd be more readable by reducing that.

> +static inline int
> +nlmsvc_fo_unlock_match(void *datap, struct nlm_file *file)
> +{
> + nlm_fo_cmd *fo_cmd = (nlm_fo_cmd *) datap;
> + int cmd = fo_cmd->cmd;
> + struct path *f_path;
> +
> + fo_printk("nlm_fo_unlock_match cmd=%d\n", cmd);
> +
> + if (cmd == NFSD_FO_VIP) {

Please split this into two separate functions for the NFSD_FO_VIP/
NFSD_FO_PATH cases as there's just about nothing in common for the two.

> {
> + /* Cluster failover has timing constraints. There is a slight
> + * performance hit if nlm_fo_unlock_match() is implemented as
> + * a match fn (since it will be invoked for each block, share,
> + * and lock later when the lists are traversed). Instead, we
> + * add path-matching logic into the following unlikely clause.
> + * If matches, the dummy nlmsvc_fo_match will always return
> + * true.
> + */
> + dprintk("nlm_inspect_files: file=%p\n", file);
> + if (unlikely(match == nlmsvc_fo_match)) {
> + if (!nlmsvc_fo_unlock_match((void *)host, file))
> + return 0;
> + fo_printk("nlm_fo find lock file entry (0x%p)\n", file);
> + }

That's quite a nasty hack. Did you benchmark the match fn variant
to see if there is actually any measurable difference? Also no need
to downcast pointers to void *, it's implicit in C.

> + /* "if" place holder for NFSD_FO_RESUME */
> + {

no need for such placeholders.

> +/* cluster failover support */
> +
> +typedef struct {
> + int cmd;
> + int stat;
> + int gp;
> + void *datap;
> +} nlm_fo_cmd;

please don't introduce typedefs for struct types.
 
01-08-2008, 04:49 PM
Christoph Hellwig

NLM failover unlock commands

Ok, I played around with this and cleaned up the ip/path code paths to
be entirely separate which helped the code quite a bit. Also a few other
cleanups and two bugfixes (positive error code returned and missing
path_release) fell out of it. I still don't like what's going on in
fs/lockd/svcsubs.c, it would be much better if the cluster unlock code
simply didn't use nlm_traverse_files but did its own loop over
the nlm hosts. That should also obsolete the second patch.
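
As a small illustration of two conventions the handlers in this patch rely on (strip one trailing newline from the written buffer, and report errors as negative errno values), here is a user-space sketch. chomp_newline() is a hypothetical helper and does not appear in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Strip one trailing newline from a write buffer of known size, as the
 * handlers do before parsing. Returns the new length, or -EINVAL for
 * an empty buffer (errors are negative, never positive).
 */
long chomp_newline(char *buf, size_t size)
{
	if (buf == NULL || size == 0)
		return -EINVAL;
	if (buf[size - 1] == '\n') {
		buf[size - 1] = '\0';
		return (long)(size - 1);
	}
	return (long)size;
}
```

This is why "echo 10.1.1.2 > /proc/fs/nfsd/unlock_ip" works even though echo appends a newline.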


Index: linux-2.6/fs/nfsd/nfsctl.c
===================================================================
--- linux-2.6.orig/fs/nfsd/nfsctl.c 2008-01-08 18:36:37.000000000 +0100
+++ linux-2.6/fs/nfsd/nfsctl.c 2008-01-08 18:45:55.000000000 +0100
@@ -22,6 +22,7 @@
#include <linux/seq_file.h>
#include <linux/pagemap.h>
#include <linux/init.h>
+#include <linux/inet.h>
#include <linux/string.h>
#include <linux/smp_lock.h>
#include <linux/ctype.h>
@@ -35,6 +36,7 @@
#include <linux/nfsd/cache.h>
#include <linux/nfsd/xdr.h>
#include <linux/nfsd/syscall.h>
+#include <linux/lockd/lockd.h>

#include <asm/uaccess.h>

@@ -52,6 +54,8 @@ enum {
NFSD_Getfs,
NFSD_List,
NFSD_Fh,
+ NFSD_FO_UnlockIP,
+ NFSD_FO_UnlockFS,
NFSD_Threads,
NFSD_Pool_Threads,
NFSD_Versions,
@@ -88,6 +92,9 @@ static ssize_t write_leasetime(struct fi
static ssize_t write_recoverydir(struct file *file, char *buf, size_t size);
#endif

+static ssize_t failover_unlock_ip(struct file *file, char *buf, size_t size);
+static ssize_t failover_unlock_fs(struct file *file, char *buf, size_t size);
+
static ssize_t (*write_op[])(struct file *, char *, size_t) = {
[NFSD_Svc] = write_svc,
[NFSD_Add] = write_add,
@@ -97,6 +104,8 @@ static ssize_t (*write_op[])(struct file
[NFSD_Getfd] = write_getfd,
[NFSD_Getfs] = write_getfs,
[NFSD_Fh] = write_filehandle,
+ [NFSD_FO_UnlockIP] = failover_unlock_ip,
+ [NFSD_FO_UnlockFS] = failover_unlock_fs,
[NFSD_Threads] = write_threads,
[NFSD_Pool_Threads] = write_pool_threads,
[NFSD_Versions] = write_versions,
@@ -288,6 +297,55 @@ static ssize_t write_getfd(struct file *
return err;
}

+static ssize_t failover_unlock_ip(struct file *file, char *buf, size_t size)
+{
+ __be32 server_ip;
+ char *fo_path;
+ char *mesg;
+
+ /* sanity check */
+ if (size <= 0)
+ return -EINVAL;
+
+ if (buf[size-1] == '\n')
+ buf[size-1] = 0;
+
+ fo_path = mesg = buf;
+ if (qword_get(&mesg, fo_path, size) < 0)
+ return -EINVAL;
+
+ server_ip = in_aton(fo_path);
+ return nlmsvc_failover_ip(server_ip[0]);
+}
+
+static ssize_t failover_unlock_fs(struct file *file, char *buf, size_t size)
+{
+ struct nameidata nd;
+ char *fo_path;
+ char *mesg;
+ int error;
+
+ /* sanity check */
+ if (size <= 0)
+ return -EINVAL;
+
+ if (buf[size-1] == '\n')
+ buf[size-1] = 0;
+
+ fo_path = mesg = buf;
+ if (qword_get(&mesg, fo_path, size) < 0)
+ return -EINVAL;
+
+ error = path_lookup(fo_path, 0, &nd);
+ if (error)
+ return error;
+
+ error = nlmsvc_failover_path(&nd);
+
+ path_release(&nd);
+ return error;
+}
+
static ssize_t write_filehandle(struct file *file, char *buf, size_t size)
{
/* request is:
@@ -646,6 +704,8 @@ static int nfsd_fill_super(struct super_
[NFSD_Getfd] = {".getfd", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Getfs] = {".getfs", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_List] = {"exports", &exports_operations, S_IRUGO},
+ [NFSD_FO_UnlockIP] = {"unlock_ip", &transaction_ops, S_IWUSR|S_IRUSR},
+ [NFSD_FO_UnlockFS] = {"unlock_filesystem", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Threads] = {"threads", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Pool_Threads] = {"pool_threads", &transaction_ops, S_IWUSR|S_IRUSR},
Index: linux-2.6/fs/lockd/svcsubs.c
===================================================================
--- linux-2.6.orig/fs/lockd/svcsubs.c 2008-01-08 18:36:37.000000000 +0100
+++ linux-2.6/fs/lockd/svcsubs.c 2008-01-08 18:44:11.000000000 +0100
@@ -18,6 +18,8 @@
#include <linux/lockd/lockd.h>
#include <linux/lockd/share.h>
#include <linux/lockd/sm_inter.h>
+#include <linux/module.h>
+#include <linux/mount.h>

#define NLMDBG_FACILITY NLMDBG_SVCSUBS

@@ -87,7 +89,7 @@ nlm_lookup_file(struct svc_rqst *rqstp,
unsigned int hash;
__be32 nfserr;

- nlm_debug_print_fh("nlm_file_lookup", f);
+ nlm_debug_print_fh("nlm_lookup_file", f);

hash = file_hash(f);

@@ -123,6 +125,9 @@ nlm_lookup_file(struct svc_rqst *rqstp,

hlist_add_head(&file->f_list, &nlm_files[hash]);

+ /* fill in f_iaddr for nlm lock failover */
+ file->f_iaddr = rqstp->rq_daddr;
+
found:
dprintk("lockd: found file %p (count %d)\n", file, file->f_count);
*result = file;
@@ -194,12 +199,63 @@ again:
return 0;
}

+static int
+nlmsvc_fo_unlock_match_path(void *datap, struct nlm_file *file)
+{
+ struct nameidata *nd = datap;
+ return nd->mnt == file->f_file->f_path.mnt;
+}
+
+static int
+nlmsvc_fo_unlock_match_ip(void *datap, struct nlm_file *file)
+{
+ struct in_addr *in = datap;
+
+ return file->f_iaddr.addr.s_addr == in->s_addr;
+}
+
+/*
+ * To fit the logic into current lockd code structure, we add a
+ * little wrapper function here. The real matching task should be
+ * carried out by nlm_fo_check_fsid().
+ */
+
+static int nlmsvc_fo_match_path(struct nlm_host *dummy1,
+ struct nlm_host *dummy2)
+{
+ return 1;
+}
+
+static int nlmsvc_fo_match_ip(struct nlm_host *dummy1, struct nlm_host *dummy2)
+{
+ return 1;
+}
+
/*
* Inspect a single file
*/
static inline int
nlm_inspect_file(struct nlm_host *host, struct nlm_file *file, nlm_host_match_fn_t match)
{
+ /*
+ * Cluster failover has timing constraints. There is a slight
+ * performance hit if nlm_fo_unlock_match() is implemented as
+ * a match fn (since it will be invoked for each block, share,
+ * and lock later when the lists are traversed). Instead, we
+ * add path-matching logic into the following unlikely clause.
+ * If matches, the dummy nlmsvc_fo_match will always return
+ * true.
+ */
+ dprintk("nlm_inspect_files: file=%p\n", file);
+ if (unlikely(match == nlmsvc_fo_match_path)) {
+ if (!nlmsvc_fo_unlock_match_path(host, file))
+ return 0;
+ }
+ if (unlikely(match == nlmsvc_fo_match_ip)) {
+ if (!nlmsvc_fo_unlock_match_ip(host, file))
+ return 0;
+ }
+
nlmsvc_traverse_blocks(host, file, match);
nlmsvc_traverse_shares(host, file, match);
return nlm_traverse_locks(host, file, match);
@@ -370,3 +426,22 @@ nlmsvc_invalidate_all(void)
*/
nlm_traverse_files(NULL, nlmsvc_is_client);
}
+
+/*
+ * Release locks associated with an export fsid upon failover
+ * invoked via nfsd nfsctl call (write_fo_unlock).
+ */
+int
+nlmsvc_failover_path(struct nameidata *nd)
+{
+ return nlm_traverse_files((struct nlm_host*)nd, nlmsvc_fo_match_path);
+}
+EXPORT_SYMBOL_GPL(nlmsvc_failover_path);
+
+int
+nlmsvc_failover_ip(__be32 server_addr)
+{
+ return nlm_traverse_files((struct nlm_host *)&server_addr,
+ nlmsvc_fo_match_ip);
+}
+EXPORT_SYMBOL_GPL(nlmsvc_failover_ip);
Index: linux-2.6/include/linux/lockd/lockd.h
===================================================================
--- linux-2.6.orig/include/linux/lockd/lockd.h 2008-01-08 18:36:37.000000000 +0100
+++ linux-2.6/include/linux/lockd/lockd.h 2008-01-08 18:44:44.000000000 +0100
@@ -113,6 +113,7 @@ struct nlm_file {
unsigned int f_locks; /* guesstimate # of locks */
unsigned int f_count; /* reference count */
struct mutex f_mutex; /* avoid concurrent access */
+ union svc_addr_u f_iaddr; /* server ip for failover */
};

/*
@@ -214,6 +215,12 @@ void nlmsvc_mark_resources(void);
void nlmsvc_free_host_resources(struct nlm_host *);
void nlmsvc_invalidate_all(void);

+/*
+ * Cluster failover support
+ */
+int nlmsvc_failover_path(struct nameidata *nd);
+int nlmsvc_failover_ip(__be32 server_addr);
+
static __inline__ struct inode *
nlmsvc_file_inode(struct nlm_file *file)
{
 
01-08-2008, 07:57 PM
Wendy Cheng

NLM failover unlock commands

Christoph Hellwig wrote:
> Ok, I played around with this and cleaned up the ip/path code paths to
> be entirely separate which helped the code quite a bit. Also a few other

Thanks for doing this. In the middle of running it with our cluster
test - if passed, will repost it. Get your "signed-off" line ready?

-- Wendy

> cleanups and two bugfixes (positive error code returned and missing
> path_release) fell out of it. I still don't like what's going on in
> fs/lockd/svcsubs.c, it would be much better if the cluster unlock code
> simply didn't use nlm_traverse_files but did its own loop over
> the nlm hosts. That should also obsolete the second patch.



 
01-09-2008, 01:51 AM
Wendy Cheng

NLM failover unlock commands

Neil Brown wrote:
> On Monday January 7, wcheng@redhat.com wrote:
>> We've implemented two new NFSD procfs files:
>>
>> o /proc/fs/nfsd/unlock_ip
>> o /proc/fs/nfsd/unlock_filesystem
>>
>> They are intended to allow admin or user mode script to release NLM
>> locks based on either a path name or a server in-bound ip address (ipv4
>> for now)
>> as;
>>
>> shell> echo 10.1.1.2 > /proc/fs/nfsd/unlock_ip
>> shell> echo /mnt/sfs1 > /proc/fs/nfsd/unlock_filesystem
>
> I'm happy with this interface and the code looks credible, so
> Acked-by: NeilBrown <neilb@suse.de>
>
> however......

[snip]

Thanks .. all points taken .. patch will be re-submitted tomorrow (my
time zone, North Carolina, US) ..


-- Wendy
 
01-09-2008, 02:49 AM
Wendy Cheng

NLM failover unlock commands

Christoph Hellwig wrote:
>> +/* cluster failover support */
>> +
>> +typedef struct {
>> + int cmd;
>> + int stat;
>> + int gp;
>> + void *datap;
>> +} nlm_fo_cmd;
>
> please don't introduce typedefs for struct types.

I don't do much community version of linux code so its coding standard
is new to me. Any reason for this (not doing typedefs)?


-- Wendy
 
01-09-2008, 03:13 PM
"J. Bruce Fields"

NLM failover unlock commands

On Tue, Jan 08, 2008 at 10:49:17PM -0500, Wendy Cheng wrote:
> Christoph Hellwig wrote:
>
>>> +/* cluster failover support */
>>> +
>>> +typedef struct {
>>> + int cmd;
>>> + int stat;
>>> + int gp;
>>> + void *datap;
>>> +} nlm_fo_cmd;
>>>
>>>
>>
>> please don't introduce typedefs for struct types.
>>
>>
>
> I don't do much community version of linux code so its coding standard
> is new to me. Any reason for this (not doing typedefs) ?

The argument is in "Chapter 5: Typedefs" of Documentation/CodingStyle.

--b.
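
The gist of that chapter is that a typedef hides what an object really is, whereas "struct foo *p" tells the reader at every use site that p points to an aggregate. A minimal sketch of the non-typedef form follows. The field name grace_period and the init helper are illustrative assumptions, not the resubmitted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Plain struct tag instead of "typedef struct { ... } nlm_fo_cmd;". */
struct nlm_fo_cmd {
	int cmd;		/* which failover command to run */
	int stat;		/* result status */
	int grace_period;	/* spelled out rather than "gp" */
	void *datap;		/* command-specific argument */
};

/* The "struct" keyword in the signature shows fo_cmd is an aggregate. */
void nlm_fo_cmd_init(struct nlm_fo_cmd *fo_cmd, int cmd, void *datap)
{
	assert(fo_cmd != NULL);
	fo_cmd->cmd = cmd;
	fo_cmd->stat = 0;
	fo_cmd->grace_period = 0;
	fo_cmd->datap = datap;
}
```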
 
01-09-2008, 05:02 PM
Christoph Hellwig

NLM failover unlock commands

On Tue, Jan 08, 2008 at 03:57:45PM -0500, Wendy Cheng wrote:
> Christoph Hellwig wrote:
>> Ok, I played around with this and cleaned up the ip/path code paths to
>> be entirely separate which helped the code quite a bit. Also a few other
>>
> Thanks for doing this . In the middle of running it with our cluster
> test - if passed, will repost it. Get your "signed-off" line ready ?

Not quite yet. I'm not happy with what's going on in svcsubs.c in the
current form.

I've added another (untested) idea patch below which adds a second
match function to nlm_traverse_files to remove the current hardcoded
hack. If that works out we'll just need to incorporate Neil's feedback
to the second patch somehow.
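
The idea of a second, per-file match function can be sketched in user-space C as follows. Everything here is a simplified stand-in: struct demo_file replaces struct nlm_file, releasing locks is reduced to zeroing a counter, and treating a NULL filter as "accept every file" is an assumption of this sketch about how existing callers would keep their current behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct nlm_file; all fields are illustrative. */
struct demo_file {
	int mount_id;		/* which mounted filesystem the file lives on */
	uint32_t server_ip_be;	/* server address the lock request arrived on */
	int locks;		/* number of locks currently held */
};

/*
 * Walk all files and release the locks of every file accepted by
 * file_ok. A NULL file_ok accepts every file. Returns the number of
 * files whose locks were released.
 */
int traverse_files(struct demo_file *files, size_t nfiles, void *data,
		   int (*file_ok)(void *data, struct demo_file *file))
{
	size_t i;
	int released = 0;

	assert(files != NULL || nfiles == 0);
	for (i = 0; i < nfiles; i++) {
		if (file_ok && !file_ok(data, &files[i]))
			continue;	/* filtered out once per file */
		if (files[i].locks > 0) {
			files[i].locks = 0;
			released++;
		}
	}
	return released;
}

/* Per-file filter for the unlock_filesystem case. */
int file_ok_mount(void *data, struct demo_file *file)
{
	return file->mount_id == *(int *)data;
}

/* Per-file filter for the unlock_ip case. */
int file_ok_ip(void *data, struct demo_file *file)
{
	return file->server_ip_be == *(uint32_t *)data;
}
```

The filter runs once per file rather than once per lock, block and share, which is the cost the original comment in nlm_inspect_file was trying to avoid, but without comparing function pointers inside the traversal.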


Index: linux-2.6/fs/nfsd/nfsctl.c
===================================================================
--- linux-2.6.orig/fs/nfsd/nfsctl.c 2008-01-08 18:36:37.000000000 +0100
+++ linux-2.6/fs/nfsd/nfsctl.c 2008-01-08 18:45:55.000000000 +0100
@@ -22,6 +22,7 @@
#include <linux/seq_file.h>
#include <linux/pagemap.h>
#include <linux/init.h>
+#include <linux/inet.h>
#include <linux/string.h>
#include <linux/smp_lock.h>
#include <linux/ctype.h>
@@ -35,6 +36,7 @@
#include <linux/nfsd/cache.h>
#include <linux/nfsd/xdr.h>
#include <linux/nfsd/syscall.h>
+#include <linux/lockd/lockd.h>

#include <asm/uaccess.h>

@@ -52,6 +54,8 @@ enum {
NFSD_Getfs,
NFSD_List,
NFSD_Fh,
+ NFSD_FO_UnlockIP,
+ NFSD_FO_UnlockFS,
NFSD_Threads,
NFSD_Pool_Threads,
NFSD_Versions,
@@ -88,6 +92,9 @@ static ssize_t write_leasetime(struct fi
static ssize_t write_recoverydir(struct file *file, char *buf, size_t size);
#endif

+static ssize_t failover_unlock_ip(struct file *file, char *buf, size_t size);
+static ssize_t failover_unlock_fs(struct file *file, char *buf, size_t size);
+
static ssize_t (*write_op[])(struct file *, char *, size_t) = {
[NFSD_Svc] = write_svc,
[NFSD_Add] = write_add,
@@ -97,6 +104,8 @@ static ssize_t (*write_op[])(struct file
[NFSD_Getfd] = write_getfd,
[NFSD_Getfs] = write_getfs,
[NFSD_Fh] = write_filehandle,
+ [NFSD_FO_UnlockIP] = failover_unlock_ip,
+ [NFSD_FO_UnlockFS] = failover_unlock_fs,
[NFSD_Threads] = write_threads,
[NFSD_Pool_Threads] = write_pool_threads,
[NFSD_Versions] = write_versions,
@@ -288,6 +297,55 @@ static ssize_t write_getfd(struct file *
return err;
}

+static ssize_t failover_unlock_ip(struct file *file, char *buf, size_t size)
+{
+ __be32 server_ip;
+ char *fo_path;
+ char *mesg;
+
+ /* sanity check */
+ if (size <= 0)
+ return -EINVAL;
+
+ if (buf[size-1] == '\n')
+ buf[size-1] = 0;
+
+ fo_path = mesg = buf;
+ if (qword_get(&mesg, fo_path, size) < 0)
+ return -EINVAL;
+
+ server_ip = in_aton(fo_path);
+ return nlmsvc_failover_ip(server_ip[0]);
+}
+
+static ssize_t failover_unlock_fs(struct file *file, char *buf, size_t size)
+{
+ struct nameidata nd;
+ char *fo_path;
+ char *mesg;
+ int error;
+
+ /* sanity check */
+ if (size <= 0)
+ return -EINVAL;
+
+ if (buf[size-1] == '\n')
+ buf[size-1] = 0;
+
+ fo_path = mesg = buf;
+ if (qword_get(&mesg, fo_path, size) < 0)
+ return -EINVAL;
+
+ error = path_lookup(fo_path, 0, &nd);
+ if (error)
+ return error;
+
+ error = nlmsvc_failover_path(&nd);
+
+ path_release(&nd);
+ return error;
+}
+
static ssize_t write_filehandle(struct file *file, char *buf, size_t size)
{
/* request is:
@@ -646,6 +704,8 @@ static int nfsd_fill_super(struct super_
[NFSD_Getfd] = {".getfd", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Getfs] = {".getfs", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_List] = {"exports", &exports_operations, S_IRUGO},
+ [NFSD_FO_UnlockIP] = {"unlock_ip", &transaction_ops, S_IWUSR|S_IRUSR},
+ [NFSD_FO_UnlockFS] = {"unlock_filesystem", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Threads] = {"threads", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Pool_Threads] = {"pool_threads", &transaction_ops, S_IWUSR|S_IRUSR},
Index: linux-2.6/fs/lockd/svcsubs.c
===================================================================
--- linux-2.6.orig/fs/lockd/svcsubs.c 2008-01-08 18:36:37.000000000 +0100
+++ linux-2.6/fs/lockd/svcsubs.c 2008-01-09 18:59:37.000000000 +0100
@@ -18,6 +18,8 @@
#include <linux/lockd/lockd.h>
#include <linux/lockd/share.h>
#include <linux/lockd/sm_inter.h>
+#include <linux/module.h>
+#include <linux/mount.h>

#define NLMDBG_FACILITY NLMDBG_SVCSUBS

@@ -87,7 +89,7 @@ nlm_lookup_file(struct svc_rqst *rqstp,
unsigned int hash;
__be32 nfserr;

- nlm_debug_print_fh("nlm_file_lookup", f);
+ nlm_debug_print_fh("nlm_lookup_file", f);

hash = file_hash(f);

@@ -123,6 +125,9 @@ nlm_lookup_file(struct svc_rqst *rqstp,

hlist_add_head(&file->f_list, &nlm_files[hash]);

+ /* fill in f_iaddr for nlm lock failover */
+ file->f_iaddr = rqstp->rq_daddr;
+
found:
dprintk("lockd: found file %p (count %d)\n", file, file->f_count);
*result = file;
@@ -194,6 +199,12 @@ again:
return 0;
}

+static int
+nlmsvc_always_match(struct nlm_host *dummy1, struct nlm_host *dummy2)
+{
+ return 1;
+}
+
/*
* Inspect a single file
*/
@@ -230,7 +241,8 @@ nlm_file_inuse(struct nlm_file *file)
* Loop over all files in the file table.
*/
static int
-nlm_traverse_files(struct nlm_host *host, nlm_host_match_fn_t match)
+nlm_traverse_files(void *data, nlm_host_match_fn_t match,
+ int (*file_ok)(void *data, struct nlm_file *file))
{
struct hlist_node *pos, *next;
struct nlm_file *file;
@@ -244,8 +256,10 @@ nlm_traverse_files(struct nlm_host *host

/* Traverse locks, blocks and shares of this file
* and update file->f_locks count */
- if (nlm_inspect_file(host, file, match))
- ret = 1;
+ if (file_ok && file_ok(data, file)) {
+ if (nlm_inspect_file(data, file, match))
+ ret = 1;
+ }

mutex_lock(&nlm_file_mutex);
file->f_count--;
@@ -337,7 +351,7 @@ void
nlmsvc_mark_resources(void)
{
dprintk("lockd: nlmsvc_mark_resources\n");
- nlm_traverse_files(NULL, nlmsvc_mark_host);
+ nlm_traverse_files(NULL, nlmsvc_mark_host, NULL);
}

/*
@@ -348,7 +362,7 @@ nlmsvc_free_host_resources(struct nlm_ho
{
dprintk("lockd: nlmsvc_free_host_resources\n");

- if (nlm_traverse_files(host, nlmsvc_same_host)) {
+ if (nlm_traverse_files(host, nlmsvc_same_host, NULL)) {
printk(KERN_WARNING
"lockd: couldn't remove all locks held by %s\n",
host->h_name);
@@ -368,5 +382,36 @@ nlmsvc_invalidate_all(void)
* turn, which is about as inefficient as it gets.
* Now we just do it once in nlm_traverse_files.
*/
- nlm_traverse_files(NULL, nlmsvc_is_client);
+ nlm_traverse_files(NULL, nlmsvc_is_client, NULL);
+}
+
+static int
+nlmsvc_failover_file_ok_path(void *datap, struct nlm_file *file)
+{
+ struct nameidata *nd = datap;
+ return nd->mnt == file->f_file->f_path.mnt;
+}
+
+int
+nlmsvc_failover_path(struct nameidata *nd)
+{
+ return nlm_traverse_files(nd, nlmsvc_always_match,
+ nlmsvc_failover_file_ok_path);
+}
+EXPORT_SYMBOL_GPL(nlmsvc_failover_path);
+
+static int
+nlmsvc_failover_file_ok_ip(void *datap, struct nlm_file *file)
+{
+ struct in_addr *in = datap;
+
+ return file->f_iaddr.addr.s_addr == in->s_addr;
+}
+
+int
+nlmsvc_failover_ip(__be32 server_addr)
+{
+ return nlm_traverse_files(&server_addr, nlmsvc_always_match,
+ nlmsvc_failover_file_ok_ip);
}
+EXPORT_SYMBOL_GPL(nlmsvc_failover_ip);
Index: linux-2.6/include/linux/lockd/lockd.h
===================================================================
--- linux-2.6.orig/include/linux/lockd/lockd.h 2008-01-08 18:36:37.000000000 +0100
+++ linux-2.6/include/linux/lockd/lockd.h 2008-01-08 18:44:44.000000000 +0100
@@ -113,6 +113,7 @@ struct nlm_file {
unsigned int f_locks; /* guesstimate # of locks */
unsigned int f_count; /* reference count */
struct mutex f_mutex; /* avoid concurrent access */
+ union svc_addr_u f_iaddr; /* server ip for failover */
};

/*
@@ -214,6 +215,12 @@ void nlmsvc_mark_resources(void);
void nlmsvc_free_host_resources(struct nlm_host *);
void nlmsvc_invalidate_all(void);

+/*
+ * Cluster failover support
+ */
+int nlmsvc_failover_path(struct nameidata *nd);
+int nlmsvc_failover_ip(__be32 server_addr);
+
static __inline__ struct inode *
nlmsvc_file_inode(struct nlm_file *file)
{
 
Old 01-10-2008, 06:59 AM
Christoph Hellwig
 
Default NLM failover unlock commands

On Wed, Jan 09, 2008 at 06:02:15PM +0000, Christoph Hellwig wrote:
> On Tue, Jan 08, 2008 at 03:57:45PM -0500, Wendy Cheng wrote:
> > Christoph Hellwig wrote:
> >> Ok, I played around with this and cleaned up the ip/path codepathes to
> >> be entirely setup which helped the code quite a bit. Also a few other
> >>
> > Thanks for doing this . In the middle of running it with our cluster
> > test - if passed, will repost it. Get your "signed-off" line ready ?
>
> Not quite yet. I'm not happy with what's going on in svcsubs.c in the
> current form.
>
> I've added another (untested) idea patch below which adds a second
> match function to nlm_traverse_files to remove the current hardcoded
> hack. If that works out we'll just need to incorporate Neil's feedback
> to the second patch somehow.

This patch introduces a new bug: the check for the file_ok function is
reversed, which would break all non-failover lockd functionality. There's
also a bug left over from my previous patch, where the file_ok callback
expects the wrong type to be passed to it and would crash.
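
To make the reversed check concrete, here is a minimal user-space sketch. It
is not the kernel code: struct demo_file, demo_file_ok_ip and the two
traverse_* helpers are hypothetical stand-ins invented for illustration. It
only shows why a NULL file_ok has to mean "accept every file", since the
existing callers (nlmsvc_mark_resources and friends) pass NULL, and that the
callback must cast its void pointer back to the same type the caller handed
in.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct nlm_file: only the field the filter uses. */
struct demo_file {
	unsigned int server_addr;	/* address the lock request arrived on */
};

typedef int (*file_ok_fn)(void *data, struct demo_file *file);

/*
 * Filter in the style of nlmsvc_failover_file_ok_ip(). The caller passes a
 * pointer to unsigned int, so the callback casts back to exactly that type.
 */
static int demo_file_ok_ip(void *data, struct demo_file *file)
{
	unsigned int *server_addr = data;

	return file->server_addr == *server_addr;
}

/* Reversed check: with no filter installed, no file is ever inspected. */
static size_t traverse_reversed(struct demo_file *files, size_t n,
				void *data, file_ok_fn file_ok)
{
	size_t inspected = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (file_ok && file_ok(data, &files[i]))
			inspected++;
	return inspected;
}

/* Corrected check: no filter means every file is inspected. */
static size_t traverse_fixed(struct demo_file *files, size_t n,
			     void *data, file_ok_fn file_ok)
{
	size_t inspected = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (!file_ok || file_ok(data, &files[i]))
			inspected++;
	return inspected;
}
```

With a filter installed both variants behave the same; they differ only for
callers that pass NULL, which is every caller that predates the failover
code.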

Fixed version below:


Index: linux-2.6/fs/nfsd/nfsctl.c
===================================================================
--- linux-2.6.orig/fs/nfsd/nfsctl.c 2008-01-08 18:36:37.000000000 +0100
+++ linux-2.6/fs/nfsd/nfsctl.c 2008-01-10 08:56:55.000000000 +0100
@@ -22,6 +22,7 @@
#include <linux/seq_file.h>
#include <linux/pagemap.h>
#include <linux/init.h>
+#include <linux/inet.h>
#include <linux/string.h>
#include <linux/smp_lock.h>
#include <linux/ctype.h>
@@ -35,6 +36,7 @@
#include <linux/nfsd/cache.h>
#include <linux/nfsd/xdr.h>
#include <linux/nfsd/syscall.h>
+#include <linux/lockd/lockd.h>

#include <asm/uaccess.h>

@@ -52,6 +54,8 @@ enum {
NFSD_Getfs,
NFSD_List,
NFSD_Fh,
+ NFSD_FO_UnlockIP,
+ NFSD_FO_UnlockFS,
NFSD_Threads,
NFSD_Pool_Threads,
NFSD_Versions,
@@ -88,6 +92,9 @@ static ssize_t write_leasetime(struct fi
static ssize_t write_recoverydir(struct file *file, char *buf, size_t size);
#endif

+static ssize_t failover_unlock_ip(struct file *file, char *buf, size_t size);
+static ssize_t failover_unlock_fs(struct file *file, char *buf, size_t size);
+
static ssize_t (*write_op[])(struct file *, char *, size_t) = {
[NFSD_Svc] = write_svc,
[NFSD_Add] = write_add,
@@ -97,6 +104,8 @@ static ssize_t (*write_op[])(struct file
[NFSD_Getfd] = write_getfd,
[NFSD_Getfs] = write_getfs,
[NFSD_Fh] = write_filehandle,
+ [NFSD_FO_UnlockIP] = failover_unlock_ip,
+ [NFSD_FO_UnlockFS] = failover_unlock_fs,
[NFSD_Threads] = write_threads,
[NFSD_Pool_Threads] = write_pool_threads,
[NFSD_Versions] = write_versions,
@@ -288,6 +297,55 @@ static ssize_t write_getfd(struct file *
return err;
}

+static ssize_t failover_unlock_ip(struct file *file, char *buf, size_t size)
+{
+ __be32 server_ip;
+ char *fo_path;
+ char *mesg;
+
+ /* sanity check */
+ if (size <= 0)
+ return -EINVAL;
+
+ if (buf[size-1] == '\n')
+ buf[size-1] = 0;
+
+ fo_path = mesg = buf;
+ if (qword_get(&mesg, fo_path, size) < 0)
+ return -EINVAL;
+
+ server_ip = in_aton(fo_path);
+ return nlmsvc_failover_ip(server_ip);
+}
+
+static ssize_t failover_unlock_fs(struct file *file, char *buf, size_t size)
+{
+ struct nameidata nd;
+ char *fo_path;
+ char *mesg;
+ int error;
+
+ /* sanity check */
+ if (size <= 0)
+ return -EINVAL;
+
+ if (buf[size-1] == '\n')
+ buf[size-1] = 0;
+
+ fo_path = mesg = buf;
+ if (qword_get(&mesg, fo_path, size) < 0)
+ return -EINVAL;
+
+ error = path_lookup(fo_path, 0, &nd);
+ if (error)
+ return error;
+
+ error = nlmsvc_failover_path(&nd);
+
+ path_release(&nd);
+ return error;
+}
+
static ssize_t write_filehandle(struct file *file, char *buf, size_t size)
{
/* request is:
@@ -646,6 +704,8 @@ static int nfsd_fill_super(struct super_
[NFSD_Getfd] = {".getfd", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Getfs] = {".getfs", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_List] = {"exports", &exports_operations, S_IRUGO},
+ [NFSD_FO_UnlockIP] = {"unlock_ip", &transaction_ops, S_IWUSR|S_IRUSR},
+ [NFSD_FO_UnlockFS] = {"unlock_filesystem", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Threads] = {"threads", &transaction_ops, S_IWUSR|S_IRUSR},
[NFSD_Pool_Threads] = {"pool_threads", &transaction_ops, S_IWUSR|S_IRUSR},
Index: linux-2.6/fs/lockd/svcsubs.c
===================================================================
--- linux-2.6.orig/fs/lockd/svcsubs.c 2008-01-08 18:36:37.000000000 +0100
+++ linux-2.6/fs/lockd/svcsubs.c 2008-01-10 08:56:18.000000000 +0100
@@ -18,6 +18,8 @@
#include <linux/lockd/lockd.h>
#include <linux/lockd/share.h>
#include <linux/lockd/sm_inter.h>
+#include <linux/module.h>
+#include <linux/mount.h>

#define NLMDBG_FACILITY NLMDBG_SVCSUBS

@@ -87,7 +89,7 @@ nlm_lookup_file(struct svc_rqst *rqstp,
unsigned int hash;
__be32 nfserr;

- nlm_debug_print_fh("nlm_file_lookup", f);
+ nlm_debug_print_fh("nlm_lookup_file", f);

hash = file_hash(f);

@@ -123,6 +125,9 @@ nlm_lookup_file(struct svc_rqst *rqstp,

hlist_add_head(&file->f_list, &nlm_files[hash]);

+ /* fill in f_iaddr for nlm lock failover */
+ file->f_iaddr = rqstp->rq_daddr;
+
found:
dprintk("lockd: found file %p (count %d)\n", file, file->f_count);
*result = file;
@@ -194,6 +199,12 @@ again:
return 0;
}

+static int
+nlmsvc_always_match(struct nlm_host *dummy1, struct nlm_host *dummy2)
+{
+ return 1;
+}
+
/*
* Inspect a single file
*/
@@ -230,7 +241,8 @@ nlm_file_inuse(struct nlm_file *file)
* Loop over all files in the file table.
*/
static int
-nlm_traverse_files(struct nlm_host *host, nlm_host_match_fn_t match)
+nlm_traverse_files(void *data, nlm_host_match_fn_t match,
+ int (*file_ok)(void *data, struct nlm_file *file))
{
struct hlist_node *pos, *next;
struct nlm_file *file;
@@ -244,8 +256,10 @@ nlm_traverse_files(struct nlm_host *host

/* Traverse locks, blocks and shares of this file
* and update file->f_locks count */
- if (nlm_inspect_file(host, file, match))
- ret = 1;
+ if (!file_ok || file_ok(data, file)) {
+ if (nlm_inspect_file(data, file, match))
+ ret = 1;
+ }

mutex_lock(&nlm_file_mutex);
file->f_count--;
@@ -337,7 +351,7 @@ void
nlmsvc_mark_resources(void)
{
dprintk("lockd: nlmsvc_mark_resources\n");
- nlm_traverse_files(NULL, nlmsvc_mark_host);
+ nlm_traverse_files(NULL, nlmsvc_mark_host, NULL);
}

/*
@@ -348,7 +362,7 @@ nlmsvc_free_host_resources(struct nlm_ho
{
dprintk("lockd: nlmsvc_free_host_resources\n");

- if (nlm_traverse_files(host, nlmsvc_same_host)) {
+ if (nlm_traverse_files(host, nlmsvc_same_host, NULL)) {
printk(KERN_WARNING
"lockd: couldn't remove all locks held by %s\n",
host->h_name);
@@ -368,5 +382,36 @@ nlmsvc_invalidate_all(void)
* turn, which is about as inefficient as it gets.
* Now we just do it once in nlm_traverse_files.
*/
- nlm_traverse_files(NULL, nlmsvc_is_client);
+ nlm_traverse_files(NULL, nlmsvc_is_client, NULL);
+}
+
+static int
+nlmsvc_failover_file_ok_path(void *datap, struct nlm_file *file)
+{
+ struct nameidata *nd = datap;
+ return nd->mnt == file->f_file->f_path.mnt;
+}
+
+int
+nlmsvc_failover_path(struct nameidata *nd)
+{
+ return nlm_traverse_files(nd, nlmsvc_always_match,
+ nlmsvc_failover_file_ok_path);
+}
+EXPORT_SYMBOL_GPL(nlmsvc_failover_path);
+
+static int
+nlmsvc_failover_file_ok_ip(void *datap, struct nlm_file *file)
+{
+ __be32 *server_addr = datap;
+
+ return file->f_iaddr.addr.s_addr == *server_addr;
+}
+
+int
+nlmsvc_failover_ip(__be32 server_addr)
+{
+ return nlm_traverse_files(&server_addr, nlmsvc_always_match,
+ nlmsvc_failover_file_ok_ip);
}
+EXPORT_SYMBOL_GPL(nlmsvc_failover_ip);
Index: linux-2.6/include/linux/lockd/lockd.h
===================================================================
--- linux-2.6.orig/include/linux/lockd/lockd.h 2008-01-08 18:36:37.000000000 +0100
+++ linux-2.6/include/linux/lockd/lockd.h 2008-01-08 18:44:44.000000000 +0100
@@ -113,6 +113,7 @@ struct nlm_file {
unsigned int f_locks; /* guesstimate # of locks */
unsigned int f_count; /* reference count */
struct mutex f_mutex; /* avoid concurrent access */
+ union svc_addr_u f_iaddr; /* server ip for failover */
};

/*
@@ -214,6 +215,12 @@ void nlmsvc_mark_resources(void);
void nlmsvc_free_host_resources(struct nlm_host *);
void nlmsvc_invalidate_all(void);

+/*
+ * Cluster failover support
+ */
+int nlmsvc_failover_path(struct nameidata *nd);
+int nlmsvc_failover_ip(__be32 server_addr);
+
static __inline__ struct inode *
nlmsvc_file_inode(struct nlm_file *file)
{
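
For anyone who wants to drive the two new files from a program instead of
the shell, the following is a rough user-space sketch of the equivalent of
the echo commands quoted earlier in the thread. The helper name
write_unlock_request is made up for illustration and is not part of the
patch. It assumes the request has to be delivered in a single write(), and
that the caller has permission to write the file (the entries are created
S_IWUSR|S_IRUSR, so normally root only).

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: send one request string to a control file such as
 * /proc/fs/nfsd/unlock_ip in a single write() call.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_unlock_request(const char *ctl_path, const char *value)
{
	size_t len = strlen(value);
	ssize_t written;
	int fd;
	int err = 0;

	/* An empty request is rejected here, before touching the file. */
	if (len == 0)
		return -EINVAL;

	fd = open(ctl_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, value, len);
	if (written < 0)
		err = -errno;
	else if ((size_t)written != len)
		err = -EIO;	/* short write: the request was cut off */

	if (close(fd) < 0 && err == 0)
		err = -errno;
	return err;
}
```

A call such as write_unlock_request("/proc/fs/nfsd/unlock_ip", "10.1.1.2\n")
mirrors "echo 10.1.1.2 > /proc/fs/nfsd/unlock_ip"; the trailing newline is
optional because the handlers in the patch strip it. The same helper works
for unlock_filesystem with a path such as "/mnt/sfs1\n".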
 
