On Fri, 2024-09-06 at 00:57 +0000, Oleksandr Tymoshenko wrote:
nfs41_init_clientid does not signal a failure condition from nfs4_proc_exchange_id and nfs4_proc_create_session to the client, which may leave the mount syscall blocked indefinitely in the following stack trace:

  nfs_wait_client_init_complete
  nfs41_discover_server_trunking
  nfs4_discover_server_trunking
  nfs4_init_client
  nfs4_set_client
  nfs4_create_server
  nfs4_try_get_tree
  vfs_get_tree
  do_new_mount
  __se_sys_mount

and the client stuck in an uninitialized state.
In addition, all subsequent mount calls would also block in nfs_match_client, waiting for the uninitialized client to finish initialization:

  nfs_wait_client_init_complete
  nfs_match_client
  nfs_get_client
  nfs4_set_client
  nfs4_create_server
  nfs4_try_get_tree
  vfs_get_tree
  do_new_mount
  __se_sys_mount
To avoid this situation, propagate the error condition to the mount thread and let the mount syscall fail properly.
Signed-off-by: Oleksandr Tymoshenko <ovt@google.com>
 fs/nfs/nfs4state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 877f682b45f2..54ad3440ad2b 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -335,8 +335,8 @@ int nfs41_init_clientid(struct nfs_client *clp, const struct cred *cred)
 	if (!(clp->cl_exchange_flags & EXCHGID4_FLAG_CONFIRMED_R))
 		nfs4_state_start_reclaim_reboot(clp);
 	nfs41_finish_session_reset(clp);
-	nfs_mark_client_ready(clp, NFS_CS_READY);
 out:
+	nfs_mark_client_ready(clp, status == 0 ? NFS_CS_READY : status);
 	return status;
 }
NACK. This will break all sorts of recovery scenarios, because it doesn't distinguish between an initial 'mount' and a server reboot recovery situation. Even in the case where we are in the initial mount, it also doesn't distinguish between transient errors such as NFS4ERR_DELAY or reboot errors such as NFS4ERR_STALE_CLIENTID, etc.
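To make the distinction above concrete, here is a rough userspace sketch, not a proposed fix and not kernel code. The enums and classify_init_status() are hypothetical names invented for illustration; only the two NFS4ERR_* values come from the NFSv4 protocol. The retry-during-recovery policy is an assumption about what a more targeted fix would need to preserve.

```c
/* Protocol error codes (the kernel passes them around negated). */
#define NFS4ERR_DELAY           10008
#define NFS4ERR_STALE_CLIENTID  10022

/* Hypothetical: are we in the initial mount or in state recovery? */
enum init_context { CTX_INITIAL_MOUNT, CTX_RECOVERY };

/* Hypothetical outcomes of a client ID setup attempt. */
enum init_action { ACT_MARK_READY, ACT_RETRY, ACT_FAIL_MOUNT };

/*
 * Decide what to do with the status of a client ID setup attempt.
 * The posted patch effectively maps every non-zero status to
 * ACT_FAIL_MOUNT regardless of context.
 */
enum init_action classify_init_status(int status, enum init_context ctx)
{
	if (status == 0)
		return ACT_MARK_READY;

	/* Transient and reboot-related errors should be retried. */
	if (status == -NFS4ERR_DELAY || status == -NFS4ERR_STALE_CLIENTID)
		return ACT_RETRY;

	/* Assumption: recovery never marks the client as failed. */
	if (ctx == CTX_RECOVERY)
		return ACT_RETRY;

	/* Anything else during the initial mount is reported to mount. */
	return ACT_FAIL_MOUNT;
}
```

With this classification, only an unrecoverable status seen during the initial mount would reach the waiting mount thread as an error.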
Exactly what is the scenario that is causing your hang? Let's try to address that with a more targeted fix.