FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.

» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Ubuntu > Ubuntu Kernel Team

LinkBack Thread Tools
Old 06-16-2011, 02:10 PM
Al Viro
Default VFS: Fix vfsmount overput on simultaneous automount

BugLink: http://bugs.launchpad.net/bugs/769927

[Kudos to dhowells for tracking that crap down]

If two processes attempt to cause automounting on the same mountpoint at the
same time, the vfsmount holding the mountpoint will be left with one too few
references on it, causing a BUG when the kernel tries to clean up.

The problem is that lock_mount() drops the caller's reference to the
mountpoint's vfsmount in the case where it finds something already mounted on
the mountpoint as it transits to the mounted filesystem and replaces path->mnt
with the new mountpoint vfsmount.

During a pathwalk, however, we don't take a reference on the vfsmount if it is
the same as the one in the nameidata struct, but do_add_mount() doesn't know

The fix is to make sure we have a ref on the vfsmount of the mountpoint before
calling do_add_mount(). However, if lock_mount() doesn't transit, we're then
left with an extra ref on the mountpoint vfsmount which needs releasing.
We can handle that in follow_managed() by not making assumptions about what
we can and what we cannot get from lookup_mnt() as the current code does.

The callers of follow_managed() expect that reference to path->mnt will be
grabbed iff path->mnt has been changed. follow_managed() and follow_automount()
keep track of whether such reference has been grabbed and assume that it'll
happen in those and only those cases that'll have us return with changed
path->mnt. That assumption is almost correct - it breaks in case of
racing automounts and in even harder to hit race between following a mountpoint
and a couple of mount --move. The thing is, we don't need to make that
assumption at all - after the end of loop in follow_manage() we can check
if path->mnt has ended up unchanged and do mntput() if needed.

The BUG can be reproduced with the following test program:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <sys/wait.h>
int main(int argc, char **argv)
int pid, ws;
struct stat buf;
pid = fork();
stat(argv[1], &buf);
if (pid > 0) wait(&ws);
return 0;

and the following procedure:

(1) Mount an NFS volume that on the server has something else mounted on a
subdirectory. For instance, I can mount / from my server:

mount warthog:/ /mnt -t nfs4 -r

On the server /data has another filesystem mounted on it, so NFS will see
a change in FSID as it walks down the path, and will mark /mnt/data as
being a mountpoint. This will cause the automount code to be triggered.

!!! Do not look inside the mounted fs at this point !!!

(2) Run the above program on a file within the submount to generate two
simultaneous automount requests:

/tmp/forkstat /mnt/data/testfile

(3) Unmount the automounted submount:

umount /mnt/data

(4) Unmount the original mount:

umount /mnt

At this point the kernel should throw a BUG with something like the

BUG: Dentry ffff880032e3c5c0{i=2,n=} still in use (1) [unmount of nfs4 0:12]

Note that the bug appears on the root dentry of the original mount, not the
mountpoint and not the submount because sys_umount() hasn't got to its final
mntput_no_expire() yet, but this isn't so obvious from the call trace:

[<ffffffff8117cd82>] shrink_dcache_for_umount+0x69/0x82
[<ffffffff8116160e>] generic_shutdown_super+0x37/0x15b
[<ffffffffa00fae56>] ? nfs_super_return_all_delegations+0x2e/0x1b1 [nfs]
[<ffffffff811617f3>] kill_anon_super+0x1d/0x7e
[<ffffffffa00d0be1>] nfs4_kill_super+0x60/0xb6 [nfs]
[<ffffffff81161c17>] deactivate_locked_super+0x34/0x83
[<ffffffff811629ff>] deactivate_super+0x6f/0x7b
[<ffffffff81186261>] mntput_no_expire+0x18d/0x199
[<ffffffff811862a8>] mntput+0x3b/0x44
[<ffffffff81186d87>] release_mounts+0xa2/0xbf
[<ffffffff811876af>] sys_umount+0x47a/0x4ba
[<ffffffff8109e1ca>] ? trace_hardirqs_on_caller+0x1fd/0x22f
[<ffffffff816ea86b>] system_call_fastpath+0x16/0x1b

as do_umount() is inlined. However, you can see release_mounts() in there.

Note also that it may be necessary to have multiple CPU cores to be able to
trigger this bug.

Tested-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Ian Kent <raven@themaw.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 8aef18845266f5c05904c610088f2d1ed58f6be3)

Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
fs/namei.c | 24 ++++++++++++++++--------
1 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 44890c6..7bf6968 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -923,6 +923,11 @@ static int follow_automount(struct path *path, unsigned flags,
if (!mnt) /* mount collision */
return 0;

+ if (!*need_mntput) {
+ /* lock_mount() may release path->mnt on error */
+ mntget(path->mnt);
+ *need_mntput = true;
+ }
err = finish_automount(mnt, path);

switch (err) {
@@ -930,12 +935,9 @@ static int follow_automount(struct path *path, unsigned flags,
/* Someone else made a mount here whilst we were busy */
return 0;
case 0:
- dput(path->dentry);
- if (*need_mntput)
- mntput(path->mnt);
+ path_put(path);
path->mnt = mnt;
path->dentry = dget(mnt->mnt_root);
- *need_mntput = true;
return 0;
return err;
@@ -955,9 +957,10 @@ static int follow_automount(struct path *path, unsigned flags,
static int follow_managed(struct path *path, unsigned flags)
+ struct vfsmount *mnt = path->mnt; /* held by caller, must be left alone */
unsigned managed;
bool need_mntput = false;
- int ret;
+ int ret = 0;

/* Given that we're not holding a lock here, we retain the value in a
* local variable for each dentry as we look at it so that we don't see
@@ -973,7 +976,7 @@ static int follow_managed(struct path *path, unsigned flags)
ret = path->dentry->d_op->d_manage(path->dentry,
false, false);
if (ret < 0)
- return ret == -EISDIR ? 0 : ret;
+ break;

/* Transit to a mounted filesystem. */
@@ -999,14 +1002,19 @@ static int follow_managed(struct path *path, unsigned flags)
if (managed & DCACHE_NEED_AUTOMOUNT) {
ret = follow_automount(path, flags, &need_mntput);
if (ret < 0)
- return ret == -EISDIR ? 0 : ret;
+ break;

/* We didn't change the current path point */
- return 0;
+ if (need_mntput && path->mnt == mnt)
+ mntput(path->mnt);
+ if (ret == -EISDIR)
+ ret = 0;
+ return ret;

int follow_down_one(struct path *path)

Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

kernel-team mailing list


Thread Tools

All times are GMT. The time now is 03:44 AM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org