Hi,
please backport the following path to 4.4.x and 4.9.x
subject: cifs: Fix a race condition with cifs_echo_request hash: f2caf901c1b7ce65f9e6aef4217e3241039db768
regards, Henning
On Fri, May 15, 2020 at 01:44:20PM +0200, Henning Schild wrote:
Hi,
please backport the following path to 4.4.x and 4.9.x
subject: cifs: Fix a race condition with cifs_echo_request hash: f2caf901c1b7ce65f9e6aef4217e3241039db768
It does not apply cleanly, can you provide a working backport so that we can queue it up properly?
thanks,
greg k-h
From: Henning Schild henning.schild@siemens.com
commit f2caf901c1b7ce65f9e6aef4217e3241039db768 upstream
There is a race condition with how we send (or supress and don't send) smb echos that will cause the client to incorrectly think the server is unresponsive and thus needs to be reconnected.
Summary of the race condition: 1) Daisy chaining scheduling creates a gap. 2) If traffic comes unfortunate shortly after the last echo, the planned echo is suppressed. 3) Due to the gap, the next echo transmission is delayed until after the timeout, which is set hard to twice the echo interval.
This is fixed by changing the timeouts from 2 to three times the echo interval.
Detailed description of the bug: https://lutz.donnerhacke.de/eng/Blog/Groundhog-Day-with-SMB-remount
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Henning Schild henning.schild@siemens.com --- fs/cifs/connect.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index c9793ce0d336..52884a00d9f0 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -548,10 +548,10 @@ static bool server_unresponsive(struct TCP_Server_Info *server) { /* - * We need to wait 2 echo intervals to make sure we handle such + * We need to wait 3 echo intervals to make sure we handle such * situations right: * 1s client sends a normal SMB request - * 2s client gets a response + * 3s client gets a response * 30s echo workqueue job pops, and decides we got a response recently * and don't need to send another * ... @@ -559,9 +559,9 @@ server_unresponsive(struct TCP_Server_Info *server) * a response in >60s. */ if (server->tcpStatus == CifsGood && - time_after(jiffies, server->lstrp + 2 * SMB_ECHO_INTERVAL)) { + time_after(jiffies, server->lstrp + 3 * SMB_ECHO_INTERVAL)) { cifs_dbg(VFS, "Server %s has not responded in %d seconds. Reconnecting...\n", - server->hostname, (2 * SMB_ECHO_INTERVAL) / HZ); + server->hostname, (3 * SMB_ECHO_INTERVAL) / HZ); cifs_reconnect(server); wake_up(&server->response_q); return true;
From: Henning Schild henning.schild@siemens.com
commit f2caf901c1b7ce65f9e6aef4217e3241039db768 upstream
There is a race condition with how we send (or supress and don't send) smb echos that will cause the client to incorrectly think the server is unresponsive and thus needs to be reconnected.
Summary of the race condition: 1) Daisy chaining scheduling creates a gap. 2) If traffic comes unfortunate shortly after the last echo, the planned echo is suppressed. 3) Due to the gap, the next echo transmission is delayed until after the timeout, which is set hard to twice the echo interval.
This is fixed by changing the timeouts from 2 to three times the echo interval.
Detailed description of the bug: https://lutz.donnerhacke.de/eng/Blog/Groundhog-Day-with-SMB-remount
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Henning Schild henning.schild@siemens.com --- fs/cifs/connect.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index c018d161735c..f54277049512 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -551,10 +551,10 @@ static bool server_unresponsive(struct TCP_Server_Info *server) { /* - * We need to wait 2 echo intervals to make sure we handle such + * We need to wait 3 echo intervals to make sure we handle such * situations right: * 1s client sends a normal SMB request - * 2s client gets a response + * 3s client gets a response * 30s echo workqueue job pops, and decides we got a response recently * and don't need to send another * ... @@ -562,9 +562,9 @@ server_unresponsive(struct TCP_Server_Info *server) * a response in >60s. */ if (server->tcpStatus == CifsGood && - time_after(jiffies, server->lstrp + 2 * server->echo_interval)) { + time_after(jiffies, server->lstrp + 3 * server->echo_interval)) { cifs_dbg(VFS, "Server %s has not responded in %lu seconds. Reconnecting...\n", - server->hostname, (2 * server->echo_interval) / HZ); + server->hostname, (3 * server->echo_interval) / HZ); cifs_reconnect(server); wake_up(&server->response_q); return true;
Am Fri, 15 May 2020 14:57:48 +0200 schrieb Greg KH greg@kroah.com:
On Fri, May 15, 2020 at 01:44:20PM +0200, Henning Schild wrote:
Hi,
please backport the following path to 4.4.x and 4.9.x
subject: cifs: Fix a race condition with cifs_echo_request hash: f2caf901c1b7ce65f9e6aef4217e3241039db768
It does not apply cleanly, can you provide a working backport so that we can queue it up properly?
Sure, just did that. The 4.4 one has been runtime-tested since a few days, but the bug is hard to trigger. The 4.9 has only been compile-time tested.
They are backports, and Pavel already confirmed that this should be done. That was in a personal email though ...
Henning
thanks,
greg k-h
On Fri, May 15, 2020 at 02:57:48PM +0200, Greg KH wrote:
On Fri, May 15, 2020 at 01:44:20PM +0200, Henning Schild wrote:
Hi,
please backport the following path to 4.4.x and 4.9.x
subject: cifs: Fix a race condition with cifs_echo_request hash: f2caf901c1b7ce65f9e6aef4217e3241039db768
It does not apply cleanly, can you provide a working backport so that we can queue it up properly?
I've queued these two for 4.9 and 4.4:
f2caf901c1b7 ("cifs: Fix a race condition with cifs_echo_request") 76e752701a8a ("cifs: Check for timeout on Negotiate stage")
On Fri, May 15, 2020 at 09:31:59AM -0400, Sasha Levin wrote:
On Fri, May 15, 2020 at 02:57:48PM +0200, Greg KH wrote:
On Fri, May 15, 2020 at 01:44:20PM +0200, Henning Schild wrote:
Hi,
please backport the following path to 4.4.x and 4.9.x
subject: cifs: Fix a race condition with cifs_echo_request hash: f2caf901c1b7ce65f9e6aef4217e3241039db768
It does not apply cleanly, can you provide a working backport so that we can queue it up properly?
I've queued these two for 4.9 and 4.4:
f2caf901c1b7 ("cifs: Fix a race condition with cifs_echo_request") 76e752701a8a ("cifs: Check for timeout on Negotiate stage")
Thanks!
On Fri, May 15, 2020 at 09:31:59AM -0400, Sasha Levin wrote: ...
I've queued these two for 4.9 and 4.4:
f2caf901c1b7 ("cifs: Fix a race condition with cifs_echo_request") 76e752701a8a ("cifs: Check for timeout on Negotiate stage")
Thanks!
-- Best regards, Pavel Shilovsky
linux-stable-mirror@lists.linaro.org