The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y git checkout FETCH_HEAD git cherry-pick -x d68886bae76a4b9b3484d23e5b7df086f940fa38 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to 'stable@vger.kernel.org' --in-reply-to '2025101659-wing-paltry-7e9d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d68886bae76a4b9b3484d23e5b7df086f940fa38 Mon Sep 17 00:00:00 2001 From: Sergey Bashirov sergeybashirov@gmail.com Date: Mon, 21 Jul 2025 21:40:56 +0300 Subject: [PATCH] NFSD: Fix last write offset handling in layoutcommit
The data type of loca_last_write_offset is newoffset4 and is switched on a boolean value, no_newoffset, that indicates if a previous write occurred or not. If no_newoffset is FALSE, an offset is not given. This means that client does not try to update the file size. Thus, server should not try to calculate new file size and check if it fits into the segment range. See RFC 8881, section 12.5.4.2.
Sometimes the current incorrect logic may cause clients to hang when trying to sync an inode. If layoutcommit fails, the client marks the inode as dirty again.
Fixes: 9cf514ccfacb ("nfsd: implement pNFS operations") Cc: stable@vger.kernel.org Co-developed-by: Konstantin Evtushenko koevtushenko@yandex.com Signed-off-by: Konstantin Evtushenko koevtushenko@yandex.com Signed-off-by: Sergey Bashirov sergeybashirov@gmail.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c index 4c936132eb44..0822d8a119c6 100644 --- a/fs/nfsd/blocklayout.c +++ b/fs/nfsd/blocklayout.c @@ -118,7 +118,6 @@ nfsd4_block_commit_blocks(struct inode *inode, struct nfsd4_layoutcommit *lcp, struct iomap *iomaps, int nr_iomaps) { struct timespec64 mtime = inode_get_mtime(inode); - loff_t new_size = lcp->lc_last_wr + 1; struct iattr iattr = { .ia_valid = 0 }; int error;
@@ -128,9 +127,9 @@ nfsd4_block_commit_blocks(struct inode *inode, struct nfsd4_layoutcommit *lcp, iattr.ia_valid |= ATTR_ATIME | ATTR_CTIME | ATTR_MTIME; iattr.ia_atime = iattr.ia_ctime = iattr.ia_mtime = lcp->lc_mtime;
- if (new_size > i_size_read(inode)) { + if (lcp->lc_size_chg) { iattr.ia_valid |= ATTR_SIZE; - iattr.ia_size = new_size; + iattr.ia_size = lcp->lc_newsize; }
error = inode->i_sb->s_export_op->commit_blocks(inode, iomaps, diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 656b2e7d8840..7043fc475458 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -2475,7 +2475,6 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp, const struct nfsd4_layout_seg *seg = &lcp->lc_seg; struct svc_fh *current_fh = &cstate->current_fh; const struct nfsd4_layout_ops *ops; - loff_t new_size = lcp->lc_last_wr + 1; struct inode *inode; struct nfs4_layout_stateid *ls; __be32 nfserr; @@ -2491,13 +2490,21 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp, goto out; inode = d_inode(current_fh->fh_dentry);
- nfserr = nfserr_inval; - if (new_size <= seg->offset) - goto out; - if (new_size > seg->offset + seg->length) - goto out; - if (!lcp->lc_newoffset && new_size > i_size_read(inode)) - goto out; + lcp->lc_size_chg = false; + if (lcp->lc_newoffset) { + loff_t new_size = lcp->lc_last_wr + 1; + + nfserr = nfserr_inval; + if (new_size <= seg->offset) + goto out; + if (new_size > seg->offset + seg->length) + goto out; + + if (new_size > i_size_read(inode)) { + lcp->lc_size_chg = true; + lcp->lc_newsize = new_size; + } + }
nfserr = nfsd4_preprocess_layout_stateid(rqstp, cstate, &lcp->lc_sid, false, lcp->lc_layout_type, @@ -2513,13 +2520,6 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp, /* LAYOUTCOMMIT does not require any serialization */ mutex_unlock(&ls->ls_mutex);
- if (new_size > i_size_read(inode)) { - lcp->lc_size_chg = true; - lcp->lc_newsize = new_size; - } else { - lcp->lc_size_chg = false; - } - nfserr = ops->proc_layoutcommit(inode, rqstp, lcp); nfs4_put_stid(&ls->ls_stid); out:
From: Sergey Bashirov sergeybashirov@gmail.com
[ Upstream commit 832738e4b325b742940761e10487403f9aad13e8 ]
Compilers may optimize the layout of C structures, so we should not rely on sizeof struct and memcpy to encode and decode XDR structures. The byte order of the fields should also be taken into account.
This patch adds the correct functions to handle the deviceid4 structure and removes the pad field, which is currently not used by NFSD, from the runtime state. The server's byte order is preserved because the deviceid4 blob on the wire is only used as a cookie by the client.
Signed-off-by: Sergey Bashirov sergeybashirov@gmail.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Stable-dep-of: d68886bae76a ("NFSD: Fix last write offset handling in layoutcommit") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfsd/blocklayoutxdr.c | 7 ++----- fs/nfsd/flexfilelayoutxdr.c | 3 +-- fs/nfsd/nfs4layouts.c | 1 - fs/nfsd/nfs4xdr.c | 14 +------------- fs/nfsd/xdr4.h | 36 +++++++++++++++++++++++++++++++++++- 5 files changed, 39 insertions(+), 22 deletions(-)
diff --git a/fs/nfsd/blocklayoutxdr.c b/fs/nfsd/blocklayoutxdr.c index 1ed2f691ebb90..dd35c472eb37d 100644 --- a/fs/nfsd/blocklayoutxdr.c +++ b/fs/nfsd/blocklayoutxdr.c @@ -29,8 +29,7 @@ nfsd4_block_encode_layoutget(struct xdr_stream *xdr, *p++ = cpu_to_be32(len); *p++ = cpu_to_be32(1); /* we always return a single extent */
- p = xdr_encode_opaque_fixed(p, &b->vol_id, - sizeof(struct nfsd4_deviceid)); + p = svcxdr_encode_deviceid4(p, &b->vol_id); p = xdr_encode_hyper(p, b->foff); p = xdr_encode_hyper(p, b->len); p = xdr_encode_hyper(p, b->soff); @@ -145,9 +144,7 @@ nfsd4_block_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp, for (i = 0; i < nr_iomaps; i++) { struct pnfs_block_extent bex;
- memcpy(&bex.vol_id, p, sizeof(struct nfsd4_deviceid)); - p += XDR_QUADLEN(sizeof(struct nfsd4_deviceid)); - + p = svcxdr_decode_deviceid4(p, &bex.vol_id); p = xdr_decode_hyper(p, &bex.foff); if (bex.foff & (block_size - 1)) { dprintk("%s: unaligned offset 0x%llx\n", diff --git a/fs/nfsd/flexfilelayoutxdr.c b/fs/nfsd/flexfilelayoutxdr.c index bb205328e043d..223a10f37898e 100644 --- a/fs/nfsd/flexfilelayoutxdr.c +++ b/fs/nfsd/flexfilelayoutxdr.c @@ -54,8 +54,7 @@ nfsd4_ff_encode_layoutget(struct xdr_stream *xdr, *p++ = cpu_to_be32(1); /* single mirror */ *p++ = cpu_to_be32(1); /* single data server */
- p = xdr_encode_opaque_fixed(p, &fl->deviceid, - sizeof(struct nfsd4_deviceid)); + p = svcxdr_encode_deviceid4(p, &fl->deviceid);
*p++ = cpu_to_be32(1); /* efficiency */
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c index e4e23b2a3e655..d0fbbd34db689 100644 --- a/fs/nfsd/nfs4layouts.c +++ b/fs/nfsd/nfs4layouts.c @@ -120,7 +120,6 @@ nfsd4_set_deviceid(struct nfsd4_deviceid *id, const struct svc_fh *fhp,
id->fsid_idx = fhp->fh_export->ex_devid_map->idx; id->generation = device_generation; - id->pad = 0; return 0; }
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index d1625a6ff3ce3..57dc5493c8607 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -566,18 +566,6 @@ nfsd4_decode_state_owner4(struct nfsd4_compoundargs *argp, }
#ifdef CONFIG_NFSD_PNFS -static __be32 -nfsd4_decode_deviceid4(struct nfsd4_compoundargs *argp, - struct nfsd4_deviceid *devid) -{ - __be32 *p; - - p = xdr_inline_decode(argp->xdr, NFS4_DEVICEID4_SIZE); - if (!p) - return nfserr_bad_xdr; - memcpy(devid, p, sizeof(*devid)); - return nfs_ok; -}
static __be32 nfsd4_decode_layoutupdate4(struct nfsd4_compoundargs *argp, @@ -1733,7 +1721,7 @@ nfsd4_decode_getdeviceinfo(struct nfsd4_compoundargs *argp, __be32 status;
memset(gdev, 0, sizeof(*gdev)); - status = nfsd4_decode_deviceid4(argp, &gdev->gd_devid); + status = nfsd4_decode_deviceid4(argp->xdr, &gdev->gd_devid); if (status) return status; if (xdr_stream_decode_u32(argp->xdr, &gdev->gd_layout_type) < 0) diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h index 15a617bece00a..c989398eb4d8c 100644 --- a/fs/nfsd/xdr4.h +++ b/fs/nfsd/xdr4.h @@ -459,9 +459,43 @@ struct nfsd4_reclaim_complete { struct nfsd4_deviceid { u64 fsid_idx; u32 generation; - u32 pad; };
+static inline __be32 * +svcxdr_encode_deviceid4(__be32 *p, const struct nfsd4_deviceid *devid) +{ + __be64 *q = (__be64 *)p; + + *q = (__force __be64)devid->fsid_idx; + p += 2; + *p++ = (__force __be32)devid->generation; + *p++ = xdr_zero; + return p; +} + +static inline __be32 * +svcxdr_decode_deviceid4(__be32 *p, struct nfsd4_deviceid *devid) +{ + __be64 *q = (__be64 *)p; + + devid->fsid_idx = (__force u64)(*q); + p += 2; + devid->generation = (__force u32)(*p++); + p++; /* NFSD does not use the remaining octets */ + return p; +} + +static inline __be32 +nfsd4_decode_deviceid4(struct xdr_stream *xdr, struct nfsd4_deviceid *devid) +{ + __be32 *p = xdr_inline_decode(xdr, NFS4_DEVICEID4_SIZE); + + if (unlikely(!p)) + return nfserr_bad_xdr; + svcxdr_decode_deviceid4(p, devid); + return nfs_ok; +} + struct nfsd4_layout_seg { u32 iomode; u64 offset;
From: Sergey Bashirov sergeybashirov@gmail.com
[ Upstream commit 274365a51d88658fb51cca637ba579034e90a799 ]
Remove dprintk in nfsd4_layoutcommit. These are not needed in day to day usage, and the information is also available in Wireshark when capturing NFS traffic.
Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sergey Bashirov sergeybashirov@gmail.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Stable-dep-of: d68886bae76a ("NFSD: Fix last write offset handling in layoutcommit") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfsd/nfs4proc.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 1da06a15b13f5..3353bc2e3d0ef 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -2278,18 +2278,12 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp, inode = d_inode(current_fh->fh_dentry);
nfserr = nfserr_inval; - if (new_size <= seg->offset) { - dprintk("pnfsd: last write before layout segment\n"); + if (new_size <= seg->offset) goto out; - } - if (new_size > seg->offset + seg->length) { - dprintk("pnfsd: last write beyond layout segment\n"); + if (new_size > seg->offset + seg->length) goto out; - } - if (!lcp->lc_newoffset && new_size > i_size_read(inode)) { - dprintk("pnfsd: layoutcommit beyond EOF\n"); + if (!lcp->lc_newoffset && new_size > i_size_read(inode)) goto out; - }
nfserr = nfsd4_preprocess_layout_stateid(rqstp, cstate, &lcp->lc_sid, false, lcp->lc_layout_type,
From: Sergey Bashirov sergeybashirov@gmail.com
[ Upstream commit d68886bae76a4b9b3484d23e5b7df086f940fa38 ]
The data type of loca_last_write_offset is newoffset4 and is switched on a boolean value, no_newoffset, that indicates if a previous write occurred or not. If no_newoffset is FALSE, an offset is not given. This means that client does not try to update the file size. Thus, server should not try to calculate new file size and check if it fits into the segment range. See RFC 8881, section 12.5.4.2.
Sometimes the current incorrect logic may cause clients to hang when trying to sync an inode. If layoutcommit fails, the client marks the inode as dirty again.
Fixes: 9cf514ccfacb ("nfsd: implement pNFS operations") Cc: stable@vger.kernel.org Co-developed-by: Konstantin Evtushenko koevtushenko@yandex.com Signed-off-by: Konstantin Evtushenko koevtushenko@yandex.com Signed-off-by: Sergey Bashirov sergeybashirov@gmail.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com [ removed rqstp parameter from proc_layoutcommit ] Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfsd/blocklayout.c | 5 ++--- fs/nfsd/nfs4proc.c | 30 +++++++++++++++--------------- 2 files changed, 17 insertions(+), 18 deletions(-)
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c index d91a686d2f313..aa9b7ae59a076 100644 --- a/fs/nfsd/blocklayout.c +++ b/fs/nfsd/blocklayout.c @@ -121,7 +121,6 @@ static __be32 nfsd4_block_commit_blocks(struct inode *inode, struct nfsd4_layoutcommit *lcp, struct iomap *iomaps, int nr_iomaps) { - loff_t new_size = lcp->lc_last_wr + 1; struct iattr iattr = { .ia_valid = 0 }; int error;
@@ -131,9 +130,9 @@ nfsd4_block_commit_blocks(struct inode *inode, struct nfsd4_layoutcommit *lcp, iattr.ia_valid |= ATTR_ATIME | ATTR_CTIME | ATTR_MTIME; iattr.ia_atime = iattr.ia_ctime = iattr.ia_mtime = lcp->lc_mtime;
- if (new_size > i_size_read(inode)) { + if (lcp->lc_size_chg) { iattr.ia_valid |= ATTR_SIZE; - iattr.ia_size = new_size; + iattr.ia_size = lcp->lc_newsize; }
error = inode->i_sb->s_export_op->commit_blocks(inode, iomaps, diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 3353bc2e3d0ef..908143cac5b46 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -2262,7 +2262,6 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp, const struct nfsd4_layout_seg *seg = &lcp->lc_seg; struct svc_fh *current_fh = &cstate->current_fh; const struct nfsd4_layout_ops *ops; - loff_t new_size = lcp->lc_last_wr + 1; struct inode *inode; struct nfs4_layout_stateid *ls; __be32 nfserr; @@ -2277,13 +2276,21 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp, goto out; inode = d_inode(current_fh->fh_dentry);
- nfserr = nfserr_inval; - if (new_size <= seg->offset) - goto out; - if (new_size > seg->offset + seg->length) - goto out; - if (!lcp->lc_newoffset && new_size > i_size_read(inode)) - goto out; + lcp->lc_size_chg = false; + if (lcp->lc_newoffset) { + loff_t new_size = lcp->lc_last_wr + 1; + + nfserr = nfserr_inval; + if (new_size <= seg->offset) + goto out; + if (new_size > seg->offset + seg->length) + goto out; + + if (new_size > i_size_read(inode)) { + lcp->lc_size_chg = true; + lcp->lc_newsize = new_size; + } + }
nfserr = nfsd4_preprocess_layout_stateid(rqstp, cstate, &lcp->lc_sid, false, lcp->lc_layout_type, @@ -2299,13 +2306,6 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp, /* LAYOUTCOMMIT does not require any serialization */ mutex_unlock(&ls->ls_mutex);
- if (new_size > i_size_read(inode)) { - lcp->lc_size_chg = 1; - lcp->lc_newsize = new_size; - } else { - lcp->lc_size_chg = 0; - } - nfserr = ops->proc_layoutcommit(inode, lcp); nfs4_put_stid(&ls->ls_stid); out:
linux-stable-mirror@lists.linaro.org