From: Chuck Lever chuck.lever@oracle.com
commit 11476e9dec39d90fe1e9bf12abc6f3efe35a073d upstream.
At Connectathon 2016, we found that recent upstream Linux clients would occasionally send a LOCK operation with a zero stateid. This appeared to happen in close proximity to another thread returning a delegation before unlinking the same file while it remained open.
Earlier, the client received a write delegation on this file and returned the open stateid. Now, as it is getting ready to unlink the file, it returns the write delegation. But there is still an open file descriptor on that file, so the client must OPEN the file again before it returns the delegation.
Since commit 24311f884189 ('NFSv4: Recovery of recalled read delegations is broken'), nfs_open_delegation_recall() clears the NFS_DELEGATED_STATE flag _before_ it sends the OPEN. This allows a racing LOCK on the same inode to be put on the wire before the OPEN operation has returned a valid open stateid.
To eliminate this race, serialize delegation return with the acquisition of a file lock on the same file. Adopt the same approach as is used in the unlock path.
This patch also eliminates a similar race seen when sending a LOCK operation at the same time as returning a delegation on the same file.
Fixes: 24311f884189 ('NFSv4: Recovery of recalled read ... ') Signed-off-by: Chuck Lever chuck.lever@oracle.com [Anna: Add sentence about LOCK / delegation race] Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/nfs4proc.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -6054,6 +6054,7 @@ static int nfs41_lock_expired(struct nfs static int _nfs4_proc_setlk(struct nfs4_state *state, int cmd, struct file_lock *request) { struct nfs_inode *nfsi = NFS_I(state->inode); + struct nfs4_state_owner *sp = state->owner; unsigned char fl_flags = request->fl_flags; int status = -ENOLCK;
@@ -6068,6 +6069,7 @@ static int _nfs4_proc_setlk(struct nfs4_ status = do_vfs_lock(state->inode, request); if (status < 0) goto out; + mutex_lock(&sp->so_delegreturn_mutex); down_read(&nfsi->rwsem); if (test_bit(NFS_DELEGATED_STATE, &state->flags)) { /* Yes: cache locks! */ @@ -6075,9 +6077,11 @@ static int _nfs4_proc_setlk(struct nfs4_ request->fl_flags = fl_flags & ~FL_SLEEP; status = do_vfs_lock(state->inode, request); up_read(&nfsi->rwsem); + mutex_unlock(&sp->so_delegreturn_mutex); goto out; } up_read(&nfsi->rwsem); + mutex_unlock(&sp->so_delegreturn_mutex); status = _nfs4_do_setlk(state, cmd, request, NFS_LOCK_NEW); out: request->fl_flags = fl_flags;