From: Chuck Lever chuck.lever@oracle.com
Backport the upstream patch that disables background NFSv4.2 COPY operations.
Unlike on later LTS kernels, the patches that limit the number of background COPY operations do not apply to v5.4 at all. Because v5.4 has no support for server-to-server COPY, disabling background COPY operations should not be noticeable.
Chuck Lever (1):
  NFSD: Force all NFSv4.2 COPY requests to be synchronous

 fs/nfsd/nfs4proc.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 8d915bbf39266bb66082c1e4980e123883f19830 ]
We've discovered that delivering a CB_OFFLOAD operation can be unreliable in some pretty unremarkable situations. Examples include:
 - The server dropped the connection because it lost a forechannel NFSv4 request and wishes to force the client to retransmit
 - The GSS sequence number window under-flowed
 - A network partition occurred
When that happens, all pending callback operations, including CB_OFFLOAD, are lost. NFSD does not retransmit them.
Moreover, the Linux NFS client does not yet support sending an OFFLOAD_STATUS operation to probe whether an asynchronous COPY operation has finished. Thus, on Linux NFS clients, when a CB_OFFLOAD is lost, asynchronous COPY can hang until manually interrupted.
I've tried a couple of remedies, but so far the side-effects are worse than the disease and they have had to be reverted. So temporarily force COPY operations to be synchronous so that the use of CB_OFFLOAD is avoided entirely. This is a fix that can easily be backported to LTS kernels. I am working on client patches that introduce an implementation of OFFLOAD_STATUS.
Note that NFSD arbitrarily limits the size of a copy_file_range to 4MB to avoid indefinitely blocking an nfsd thread. A short COPY result is returned in that case, and the client can present a fresh COPY request for the remainder.
Link: https://nvd.nist.gov/vuln/detail/CVE-2024-49974
[ cel: adjusted to apply to origin/linux-5.4.y ]
Signed-off-by: Chuck Lever chuck.lever@oracle.com
---
 fs/nfsd/nfs4proc.c | 7 +++++++
 1 file changed, 7 insertions(+)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index e38f873f98a7..27e9754ad3b9 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1262,6 +1262,13 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	__be32 status;
 	struct nfsd4_copy *async_copy = NULL;
 
+	/*
+	 * Currently, async COPY is not reliable. Force all COPY
+	 * requests to be synchronous to avoid client application
+	 * hangs waiting for COPY completion.
+	 */
+	copy->cp_synchronous = 1;
+
 	status = nfsd4_verify_copy(rqstp, cstate, &copy->cp_src_stateid,
 				   &copy->nf_src, &copy->cp_dst_stateid,
 				   &copy->nf_dst);
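The commit message notes that a synchronous COPY may return a short result (NFSD caps a single copy at 4MB) and that the client can send a fresh COPY request for the remainder. A minimal userspace simulation of that retry loop might look like the following. This is only an illustration, not code from NFSD or the Linux NFS client: the names sim_server_copy, sim_client_copy_all, and SIM_MAX_SYNC_COPY are hypothetical, and memcpy stands in for the actual data movement.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cap, modeled on the 4MB limit mentioned in the commit message. */
#define SIM_MAX_SYNC_COPY ((size_t)4 * 1024 * 1024)

/*
 * Simulated server side (assumption, not NFSD code): copy at most
 * SIM_MAX_SYNC_COPY bytes and report how many bytes were actually
 * copied. A return value smaller than len is a "short COPY".
 */
size_t sim_server_copy(unsigned char *dst, const unsigned char *src, size_t len)
{
	size_t n = len < SIM_MAX_SYNC_COPY ? len : SIM_MAX_SYNC_COPY;

	memcpy(dst, src, n);
	return n;
}

/*
 * Simulated client side (assumption, not NFS client code): keep
 * issuing fresh COPY requests for the remainder until all len bytes
 * have been copied. Returns the number of requests issued.
 */
unsigned int sim_client_copy_all(unsigned char *dst, const unsigned char *src,
				 size_t len)
{
	size_t done = 0;
	unsigned int requests = 0;

	while (done < len) {
		done += sim_server_copy(dst + done, src + done, len - done);
		requests++;
	}
	return requests;
}
```

With a 10MB source buffer, for example, the loop issues three requests: two capped at 4MB and a final one for the remaining 2MB.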
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 8d915bbf39266bb66082c1e4980e123883f19830
WARNING: Author mismatch between patch and upstream commit:
Backport author: cel@kernel.org
Commit author: Chuck Lever chuck.lever@oracle.com

Status in newer kernel trees:
6.12.y | Present (exact SHA1)
6.11.y | Present (exact SHA1)
6.6.y | Not found
6.1.y | Not found
5.15.y | Not found
5.10.y | Not found
5.4.y | Not found
Note: The patch differs from the upstream commit:
---
--- -	2024-11-20 15:40:02.161540833 -0500
+++ /tmp/tmp.PdmCvDR3aZ	2024-11-20 15:40:02.153931894 -0500
@@ -1,3 +1,5 @@
+[ Upstream commit 8d915bbf39266bb66082c1e4980e123883f19830 ]
+
 We've discovered that delivering a CB_OFFLOAD operation can be
 unreliable in some pretty unremarkable situations. Examples
 include:
@@ -28,16 +30,18 @@
 COPY result is returned in that case, and the client can present
 a fresh COPY request for the remainder.
 
+Link: https://nvd.nist.gov/vuln/detail/CVE-2024-49974
+[ cel: adjusted to apply to origin/linux-5.4.y ]
 Signed-off-by: Chuck Lever chuck.lever@oracle.com
 ---
  fs/nfsd/nfs4proc.c | 7 +++++++
  1 file changed, 7 insertions(+)
 
 diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
-index ea3cc3e870a7f..46bd20fe5c0f4 100644
+index e38f873f98a7..27e9754ad3b9 100644
 --- a/fs/nfsd/nfs4proc.c
 +++ b/fs/nfsd/nfs4proc.c
-@@ -1807,6 +1807,13 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
+@@ -1262,6 +1262,13 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
  	__be32 status;
  	struct nfsd4_copy *async_copy = NULL;
 
@@ -46,8 +50,11 @@
 +	 * requests to be synchronous to avoid client application
 +	 * hangs waiting for COPY completion.
 +	 */
-+	nfsd4_copy_set_sync(copy, true);
++	copy->cp_synchronous = 1;
 +
-	copy->cp_clp = cstate->clp;
-	if (nfsd4_ssc_is_inter(copy)) {
-		trace_nfsd_copy_inter(copy);
+	status = nfsd4_verify_copy(rqstp, cstate, &copy->cp_src_stateid,
+				   &copy->nf_src, &copy->cp_dst_stateid,
+				   &copy->nf_dst);
+--
+2.47.0
+
---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-5.4.y        | Success     | Success    |
On Wed, Nov 20, 2024 at 02:13:15PM -0500, cel@kernel.org wrote:
> From: Chuck Lever chuck.lever@oracle.com
> [ Upstream commit 8d915bbf39266bb66082c1e4980e123883f19830 ]
What about kernel versions greater than 5.4? Like 5.10, 5.15, 6.1, and 6.6 for this change? Shouldn't it also be needed there?
thanks,
greg k-h
On Dec 2, 2024, at 4:09 AM, Greg KH gregkh@linuxfoundation.org wrote:
> On Wed, Nov 20, 2024 at 02:13:15PM -0500, cel@kernel.org wrote:
>> From: Chuck Lever chuck.lever@oracle.com
>> [ Upstream commit 8d915bbf39266bb66082c1e4980e123883f19830 ]
> What about kernel versions greater than 5.4? Like 5.10, 5.15, 6.1, and 6.6 for this change? Shouldn't it also be needed there?
Good catch. My rationale is:
Asynchronous COPY offload is needed to implement NFSv4.2 server-to-server COPY offload.
The upstream patches that address the CVE don't apply cleanly to linux-5.4.y. However, 5.4 kernels do not have NFSv4.2 server-to-server COPY offload, thus this change, which simply disables async COPY, has no user-visible impact. So I decided the easy, low-impact way to address the CVE for v5.4 was applying only this patch.
The newer LTS kernels do have server-to-server COPY offload, thus if this patch is applied, they would see a behavior regression whenever CONFIG_NFSD_V4_2_INTER_SSC is enabled. The upstream patches that address the CVE apply cleanly to these kernels, and I've sent those to stable@ already.
As these were separate patch series, there wasn't an obvious place to add a cover letter that explains this.
-- Chuck Lever
On Mon, Dec 02, 2024 at 02:19:13PM +0000, Chuck Lever III wrote:
> On Dec 2, 2024, at 4:09 AM, Greg KH gregkh@linuxfoundation.org wrote:
>> On Wed, Nov 20, 2024 at 02:13:15PM -0500, cel@kernel.org wrote:
>>> From: Chuck Lever chuck.lever@oracle.com
>>> [ Upstream commit 8d915bbf39266bb66082c1e4980e123883f19830 ]
>> What about kernel versions greater than 5.4? Like 5.10, 5.15, 6.1, and 6.6 for this change? Shouldn't it also be needed there?
> Good catch. My rationale is:
> Asynchronous COPY offload is needed to implement NFSv4.2 server-to-server COPY offload.
> The upstream patches that address the CVE don't apply cleanly to linux-5.4.y. However, 5.4 kernels do not have NFSv4.2 server-to-server COPY offload, thus this change, which simply disables async COPY, has no user-visible impact. So I decided the easy, low-impact way to address the CVE for v5.4 was applying only this patch.
> The newer LTS kernels do have server-to-server COPY offload, thus if this patch is applied, they would see a behavior regression whenever CONFIG_NFSD_V4_2_INTER_SSC is enabled. The upstream patches that address the CVE apply cleanly to these kernels, and I've sent those to stable@ already.
> As these were separate patch series, there wasn't an obvious place to add a cover letter that explains this.
Ok, that's fine, we'll just leave this as-is, thanks!
greg k-h
linux-stable-mirror@lists.linaro.org