From: Olga Kornievskaia olga.kornievskaia@gmail.com
commit d03727b248d0dae6199569a8d7b629a681154633 upstream.
Figuring out the root case for the REMOVE/CLOSE race and suggesting the solution was done by Neil Brown.
Currently what happens is that direct IO calls hold a reference on the open context which is decremented as an asynchronous task in the nfs_direct_complete(). Before reference is decremented, control is returned to the application which is free to close the file. When close is being processed, it decrements its reference on the open_context but since directIO still holds one, it doesn't sent a close on the wire. It returns control to the application which is free to do other operations. For instance, it can delete a file. Direct IO is finally releasing its reference and triggering an asynchronous close. Which races with the REMOVE. On the server, REMOVE can be processed before the CLOSE, failing the REMOVE with EACCES as the file is still opened.
Signed-off-by: Olga Kornievskaia kolga@netapp.com Suggested-by: Neil Brown neilb@suse.com CC: stable@vger.kernel.org Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/direct.c | 13 +++++++++---- fs/nfs/file.c | 1 + 2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c index 6b0bf4ebd8124..70cf8c5760c73 100644 --- a/fs/nfs/direct.c +++ b/fs/nfs/direct.c @@ -367,8 +367,6 @@ static void nfs_direct_complete(struct nfs_direct_req *dreq) { struct inode *inode = dreq->inode;
- inode_dio_end(inode); - if (dreq->iocb) { long res = (long) dreq->error; if (dreq->count != 0) { @@ -380,7 +378,10 @@ static void nfs_direct_complete(struct nfs_direct_req *dreq)
complete(&dreq->completion);
+ igrab(inode); nfs_direct_req_release(dreq); + inode_dio_end(inode); + iput(inode); }
static void nfs_direct_read_completion(struct nfs_pgio_header *hdr) @@ -510,8 +511,10 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq, * generic layer handle the completion. */ if (requested_bytes == 0) { - inode_dio_end(inode); + igrab(inode); nfs_direct_req_release(dreq); + inode_dio_end(inode); + iput(inode); return result < 0 ? result : -EIO; }
@@ -923,8 +926,10 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq, * generic layer handle the completion. */ if (requested_bytes == 0) { - inode_dio_end(inode); + igrab(inode); nfs_direct_req_release(dreq); + inode_dio_end(inode); + iput(inode); return result < 0 ? result : -EIO; }
diff --git a/fs/nfs/file.c b/fs/nfs/file.c index 95dc90570786c..7b31367532054 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -83,6 +83,7 @@ nfs_file_release(struct inode *inode, struct file *filp) dprintk("NFS: release(%pD2)\n", filp);
nfs_inc_stats(inode, NFSIOS_VFSRELEASE); + inode_dio_wait(inode); nfs_file_clear_open_context(filp); return 0; }