This is a note to let you know that I've just added the patch titled
VFS: close race between getcwd() and d_move()
to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git%3Ba=su...
The filename of the patch is: vfs-close-race-between-getcwd-and-d_move.patch and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree, please let stable@vger.kernel.org know about it.
From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: NeilBrown neilb@suse.com Date: Fri, 10 Nov 2017 15:45:41 +1100 Subject: VFS: close race between getcwd() and d_move()
From: NeilBrown neilb@suse.com
[ Upstream commit 61647823aa920e395afcce4b57c32afb51456cab ]
d_move() will call __d_drop() and then __d_rehash() on the dentry being moved. This creates a small window when the dentry appears to be unhashed. Many tests of d_unhashed() are made under ->d_lock and so are safe from racing with this window, but some aren't. In particular, getcwd() calls d_unlinked() (which calls d_unhashed()) without d_lock protection, so it can race.
This races has been seen in practice with lustre, which uses d_move() as part of name lookup. See: https://jira.hpdd.intel.com/browse/LU-9735 It could race with a regular rename(), and result in ENOENT instead of either the 'before' or 'after' name.
The race can be demonstrated with a simple program which has two threads, one renaming a directory back and forth while another calls getcwd() within that directory: it should never fail, but does. See: https://patchwork.kernel.org/patch/9455345/
We could fix this race by taking d_lock and rechecking when d_unhashed() reports true. Alternately when can remove the window, which is the approach this patch takes.
___d_drop() is introduce which does *not* clear d_hash.pprev so the dentry still appears to be hashed. __d_drop() calls ___d_drop(), then clears d_hash.pprev. __d_move() now uses ___d_drop() and only clears d_hash.pprev when not rehashing.
Signed-off-by: NeilBrown neilb@suse.com Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/dcache.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-)
--- a/fs/dcache.c +++ b/fs/dcache.c @@ -461,9 +461,11 @@ static void dentry_lru_add(struct dentry * d_drop() is used mainly for stuff that wants to invalidate a dentry for some * reason (NFS timeouts or autofs deletes). * - * __d_drop requires dentry->d_lock. + * __d_drop requires dentry->d_lock + * ___d_drop doesn't mark dentry as "unhashed" + * (dentry->d_hash.pprev will be LIST_POISON2, not NULL). */ -void __d_drop(struct dentry *dentry) +static void ___d_drop(struct dentry *dentry) { if (!d_unhashed(dentry)) { struct hlist_bl_head *b; @@ -479,12 +481,17 @@ void __d_drop(struct dentry *dentry)
hlist_bl_lock(b); __hlist_bl_del(&dentry->d_hash); - dentry->d_hash.pprev = NULL; hlist_bl_unlock(b); /* After this call, in-progress rcu-walk path lookup will fail. */ write_seqcount_invalidate(&dentry->d_seq); } } + +void __d_drop(struct dentry *dentry) +{ + ___d_drop(dentry); + dentry->d_hash.pprev = NULL; +} EXPORT_SYMBOL(__d_drop);
void d_drop(struct dentry *dentry) @@ -2378,7 +2385,7 @@ EXPORT_SYMBOL(d_delete); static void __d_rehash(struct dentry *entry) { struct hlist_bl_head *b = d_hash(entry->d_name.hash); - BUG_ON(!d_unhashed(entry)); + hlist_bl_lock(b); hlist_bl_add_head_rcu(&entry->d_hash, b); hlist_bl_unlock(b); @@ -2815,9 +2822,9 @@ static void __d_move(struct dentry *dent write_seqcount_begin_nested(&target->d_seq, DENTRY_D_LOCK_NESTED);
/* unhash both */ - /* __d_drop does write_seqcount_barrier, but they're OK to nest. */ - __d_drop(dentry); - __d_drop(target); + /* ___d_drop does write_seqcount_barrier, but they're OK to nest. */ + ___d_drop(dentry); + ___d_drop(target);
/* Switch the names.. */ if (exchange) @@ -2829,6 +2836,8 @@ static void __d_move(struct dentry *dent __d_rehash(dentry); if (exchange) __d_rehash(target); + else + target->d_hash.pprev = NULL;
/* ... and switch them in the tree */ if (IS_ROOT(dentry)) {
Patches currently in stable-queue which might be from neilb@suse.com are
queue-4.9/vfs-close-race-between-getcwd-and-d_move.patch queue-4.9/sunrpc-ensure-correct-error-is-reported-by-xs_tcp_setup_socket.patch
linux-stable-mirror@lists.linaro.org