Hi all,
Here's a bunch of bespoke hand-ported bug fixes for 6.12 LTS. These fixes are the ones that weren't automatically picked up by Greg, either because they didn't apply or because they weren't cc'd to stable. If there are any problems please let me know; this is the first time I've ever sent patches to stable.
With a bit of luck, this should all go splendidly. Comments and questions are, as always, welcome.
--D --- Commits in this patchset: * xfs: sb_spino_align is not verified * xfs: fix sparse inode limits on runt AG * xfs: fix off-by-one error in fsmap's end_daddr usage * xfs: fix sb_spino_align checks for large fsblock sizes * xfs: fix zero byte checking in the superblock scrubber --- fs/xfs/libxfs/xfs_ialloc.c | 16 +++++++++------- fs/xfs/libxfs/xfs_sb.c | 15 +++++++++++++++ fs/xfs/scrub/agheader.c | 29 +++++++++++++++++++++++++++-- fs/xfs/xfs_fsmap.c | 29 ++++++++++++++++++----------- 4 files changed, 69 insertions(+), 20 deletions(-)
From: Dave Chinner dchinner@redhat.com
commit 59e43f5479cce106d71c0b91a297c7ad1913176c upstream.
It's just read in from the superblock and used without doing any validity checks at all on the value.
Fixes: fb4f2b4e5a82 ("xfs: add sparse inode chunk alignment superblock field") Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Carlos Maiolino cem@kernel.org [djwong: actually tag for 6.12 because upstream maintainer ignored cc-stable tag] Link: https://lore.kernel.org/linux-xfs/20241024165544.GI21853@frogsfrogsfrogs/ Signed-off-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_sb.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 02ebcbc4882f5b..9e0ae312bc8035 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -391,6 +391,20 @@ xfs_validate_sb_common( sbp->sb_inoalignmt, align); return -EINVAL; } + + if (!sbp->sb_spino_align || + sbp->sb_spino_align > sbp->sb_inoalignmt || + (sbp->sb_inoalignmt % sbp->sb_spino_align) != 0) { + xfs_warn(mp, + "Sparse inode alignment (%u) is invalid.", + sbp->sb_spino_align); + return -EINVAL; + } + } else if (sbp->sb_spino_align) { + xfs_warn(mp, + "Sparse inode alignment (%u) should be zero.", + sbp->sb_spino_align); + return -EINVAL; } } else if (sbp->sb_qflags & (XFS_PQUOTA_ENFD | XFS_GQUOTA_ENFD | XFS_PQUOTA_CHKD | XFS_GQUOTA_CHKD)) {
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 59e43f5479cce106d71c0b91a297c7ad1913176c
WARNING: Author mismatch between patch and upstream commit: Backport author: "Darrick J. Wong" djwong@kernel.org Commit author: Dave Chinner dchinner@redhat.com
Status in newer kernel trees: 6.12.y | Not found
Note: The patch differs from the upstream commit: --- 1: 59e43f5479cc ! 1: 273f71b3b9d4 xfs: sb_spino_align is not verified @@ Metadata ## Commit message ## xfs: sb_spino_align is not verified
+ commit 59e43f5479cce106d71c0b91a297c7ad1913176c upstream. + It's just read in from the superblock and used without doing any validity checks at all on the value.
@@ Commit message Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Carlos Maiolino cem@kernel.org + [djwong: actually tag for 6.12 because upstream maintainer ignored cc-stable tag] + Link: https://lore.kernel.org/linux-xfs/20241024165544.GI21853@frogsfrogsfrogs/ + Signed-off-by: "Darrick J. Wong" djwong@kernel.org
## fs/xfs/libxfs/xfs_sb.c ## @@ fs/xfs/libxfs/xfs_sb.c: xfs_validate_sb_common( ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.12.y | Success | Success | | stable/linux-6.6.y | Success | Success | | stable/linux-6.1.y | Success | Success | | stable/linux-5.15.y | Success | Success | | stable/linux-5.10.y | Failed | N/A | | stable/linux-5.4.y | Failed | N/A |
From: Dave Chinner dchinner@redhat.com
commit 13325333582d4820d39b9e8f63d6a54e745585d9 upstream.
The runt AG at the end of a filesystem is almost always smaller than the mp->m_sb.sb_agblocks. Unfortunately, when setting the max_agbno limit for the inode chunk allocation, we do not take this into account. This means we can allocate a sparse inode chunk that overlaps beyond the end of an AG. When we go to allocate an inode from that sparse chunk, the irec fails validation because the agbno of the start of the irec is beyond valid limits for the runt AG.
Prevent this from happening by taking into account the size of the runt AG when allocating inode chunks. Also convert the various checks for valid inode chunk agbnos to use xfs_ag_block_count() so that they will also catch such issues in the future.
Fixes: 56d1115c9bc7 ("xfs: allocate sparse inode chunks on full chunk allocation failure") Signed-off-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Carlos Maiolino cem@kernel.org [djwong: backport to stable because upstream maintainer ignored cc-stable] Link: https://lore.kernel.org/linux-xfs/20241112231539.GG9438@frogsfrogsfrogs/ Signed-off-by: "Darrick J. Wong" djwong@kernel.org --- fs/xfs/libxfs/xfs_ialloc.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c index 271855227514cb..6258527315f28b 100644 --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -855,7 +855,8 @@ xfs_ialloc_ag_alloc( * the end of the AG. */ args.min_agbno = args.mp->m_sb.sb_inoalignmt; - args.max_agbno = round_down(args.mp->m_sb.sb_agblocks, + args.max_agbno = round_down(xfs_ag_block_count(args.mp, + pag->pag_agno), args.mp->m_sb.sb_inoalignmt) - igeo->ialloc_blks;
@@ -2332,9 +2333,9 @@ xfs_difree( return -EINVAL; } agbno = XFS_AGINO_TO_AGBNO(mp, agino); - if (agbno >= mp->m_sb.sb_agblocks) { - xfs_warn(mp, "%s: agbno >= mp->m_sb.sb_agblocks (%d >= %d).", - __func__, agbno, mp->m_sb.sb_agblocks); + if (agbno >= xfs_ag_block_count(mp, pag->pag_agno)) { + xfs_warn(mp, "%s: agbno >= xfs_ag_block_count (%d >= %d).", + __func__, agbno, xfs_ag_block_count(mp, pag->pag_agno)); ASSERT(0); return -EINVAL; } @@ -2457,7 +2458,7 @@ xfs_imap( */ agino = XFS_INO_TO_AGINO(mp, ino); agbno = XFS_AGINO_TO_AGBNO(mp, agino); - if (agbno >= mp->m_sb.sb_agblocks || + if (agbno >= xfs_ag_block_count(mp, pag->pag_agno) || ino != XFS_AGINO_TO_INO(mp, pag->pag_agno, agino)) { error = -EINVAL; #ifdef DEBUG @@ -2467,11 +2468,12 @@ xfs_imap( */ if (flags & XFS_IGET_UNTRUSTED) return error; - if (agbno >= mp->m_sb.sb_agblocks) { + if (agbno >= xfs_ag_block_count(mp, pag->pag_agno)) { xfs_alert(mp, "%s: agbno (0x%llx) >= mp->m_sb.sb_agblocks (0x%lx)", __func__, (unsigned long long)agbno, - (unsigned long)mp->m_sb.sb_agblocks); + (unsigned long)xfs_ag_block_count(mp, + pag->pag_agno)); } if (ino != XFS_AGINO_TO_INO(mp, pag->pag_agno, agino)) { xfs_alert(mp,
From: Darrick J. Wong djwong@kernel.org
commit a440a28ddbdcb861150987b4d6e828631656b92f upstream.
In commit ca6448aed4f10a, we created an "end_daddr" variable to fix fsmap reporting when the end of the range requested falls in the middle of an unknown (aka free on the rmapbt) region. Unfortunately, I didn't notice that the the code sets end_daddr to the last sector of the device but then uses that quantity to compute the length of the synthesized mapping.
Zizhi Wo later observed that when end_daddr isn't set, we still don't report the last fsblock on a device because in that case (aka when info->last is true), the info->high mapping that we pass to xfs_getfsmap_group_helper has a startblock that points to the last fsblock. This is also wrong because the code uses startblock to compute the length of the synthesized mapping.
Fix the second problem by setting end_daddr unconditionally, and fix the first problem by setting start_daddr to one past the end of the range to query.
Cc: stable@vger.kernel.org # v6.11 Fixes: ca6448aed4f10a ("xfs: Fix missing interval for missing_owner in xfs fsmap") Signed-off-by: "Darrick J. Wong" djwong@kernel.org Reported-by: Zizhi Wo wozizhi@huawei.com Reviewed-by: Christoph Hellwig hch@lst.de --- fs/xfs/xfs_fsmap.c | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index ae18ab86e608b5..8712b891defbc7 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -162,7 +162,8 @@ struct xfs_getfsmap_info { xfs_daddr_t next_daddr; /* next daddr we expect */ /* daddr of low fsmap key when we're using the rtbitmap */ xfs_daddr_t low_daddr; - xfs_daddr_t end_daddr; /* daddr of high fsmap key */ + /* daddr of high fsmap key, or the last daddr on the device */ + xfs_daddr_t end_daddr; u64 missing_owner; /* owner of holes */ u32 dev; /* device id */ /* @@ -306,7 +307,7 @@ xfs_getfsmap_helper( * Note that if the btree query found a mapping, there won't be a gap. */ if (info->last && info->end_daddr != XFS_BUF_DADDR_NULL) - rec_daddr = info->end_daddr; + rec_daddr = info->end_daddr + 1;
/* Are we just counting mappings? */ if (info->head->fmh_count == 0) { @@ -898,7 +899,10 @@ xfs_getfsmap( struct xfs_trans *tp = NULL; struct xfs_fsmap dkeys[2]; /* per-dev keys */ struct xfs_getfsmap_dev handlers[XFS_GETFSMAP_DEVS]; - struct xfs_getfsmap_info info = { NULL }; + struct xfs_getfsmap_info info = { + .fsmap_recs = fsmap_recs, + .head = head, + }; bool use_rmap; int i; int error = 0; @@ -963,9 +967,6 @@ xfs_getfsmap(
info.next_daddr = head->fmh_keys[0].fmr_physical + head->fmh_keys[0].fmr_length; - info.end_daddr = XFS_BUF_DADDR_NULL; - info.fsmap_recs = fsmap_recs; - info.head = head;
/* For each device we support... */ for (i = 0; i < XFS_GETFSMAP_DEVS; i++) { @@ -978,17 +979,23 @@ xfs_getfsmap( break;
/* - * If this device number matches the high key, we have - * to pass the high key to the handler to limit the - * query results. If the device number exceeds the - * low key, zero out the low key so that we get - * everything from the beginning. + * If this device number matches the high key, we have to pass + * the high key to the handler to limit the query results, and + * set the end_daddr so that we can synthesize records at the + * end of the query range or device. */ if (handlers[i].dev == head->fmh_keys[1].fmr_device) { dkeys[1] = head->fmh_keys[1]; info.end_daddr = min(handlers[i].nr_sectors - 1, dkeys[1].fmr_physical); + } else { + info.end_daddr = handlers[i].nr_sectors - 1; } + + /* + * If the device number exceeds the low key, zero out the low + * key so that we get everything from the beginning. + */ if (handlers[i].dev > head->fmh_keys[0].fmr_device) memset(&dkeys[0], 0, sizeof(struct xfs_fsmap));
From: Darrick J. Wong djwong@kernel.org
commit 7f8a44f37229fc76bfcafa341a4b8862368ef44a upstream.
For a sparse inodes filesystem, mkfs.xfs computes the values of sb_spino_align and sb_inoalignmt with the following code:
int cluster_size = XFS_INODE_BIG_CLUSTER_SIZE;
if (cfg->sb_feat.crcs_enabled) cluster_size *= cfg->inodesize / XFS_DINODE_MIN_SIZE;
sbp->sb_spino_align = cluster_size >> cfg->blocklog; sbp->sb_inoalignmt = XFS_INODES_PER_CHUNK * cfg->inodesize >> cfg->blocklog;
On a V5 filesystem with 64k fsblocks and 512 byte inodes, this results in cluster_size = 8192 * (512 / 256) = 16384. As a result, sb_spino_align and sb_inoalignmt are both set to zero. Unfortunately, this trips the new sb_spino_align check that was just added to xfs_validate_sb_common, and the mkfs fails:
# mkfs.xfs -f -b size=64k, /dev/sda meta-data=/dev/sda isize=512 agcount=4, agsize=81136 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=1 = reflink=1 bigtime=1 inobtcount=1 nrext64=1 = exchange=0 metadir=0 data = bsize=65536 blocks=324544, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=65536 ascii-ci=0, ftype=1, parent=0 log =internal log bsize=65536 blocks=5006, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=65536 blocks=0, rtextents=0 = rgcount=0 rgsize=0 extents Discarding blocks...Sparse inode alignment (0) is invalid. Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200 libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1 mkfs.xfs: Releasing dirty buffer to free list! found dirty buffer (bulk) on free list! Sparse inode alignment (0) is invalid. Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200 libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1 mkfs.xfs: writing AG headers failed, err=22
Prior to commit 59e43f5479cce1 this all worked fine, even if "sparse" inodes are somewhat meaningless when everything fits in a single fsblock. Adjust the checks to handle existing filesystems.
Cc: stable@vger.kernel.org # v6.13-rc1 Fixes: 59e43f5479cce1 ("xfs: sb_spino_align is not verified") Signed-off-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de --- fs/xfs/libxfs/xfs_sb.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 9e0ae312bc8035..e27b63281d012a 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -392,12 +392,13 @@ xfs_validate_sb_common( return -EINVAL; }
- if (!sbp->sb_spino_align || - sbp->sb_spino_align > sbp->sb_inoalignmt || - (sbp->sb_inoalignmt % sbp->sb_spino_align) != 0) { + if (sbp->sb_spino_align && + (sbp->sb_spino_align > sbp->sb_inoalignmt || + (sbp->sb_inoalignmt % sbp->sb_spino_align) != 0)) { xfs_warn(mp, - "Sparse inode alignment (%u) is invalid.", - sbp->sb_spino_align); +"Sparse inode alignment (%u) is invalid, must be integer factor of (%u).", + sbp->sb_spino_align, + sbp->sb_inoalignmt); return -EINVAL; } } else if (sbp->sb_spino_align) {
From: Darrick J. Wong djwong@kernel.org
commit c004a793e0ec34047c3bd423bcd8966f5fac88dc upstream.
The logic to check that the region past the end of the superblock is all zeroes is wrong -- we don't want to check only the bytes past the end of the maximally sized ondisk superblock structure as currently defined in xfs_format.h; we want to check the bytes beyond the end of the ondisk as defined by the feature bits.
Port the superblock size logic from xfs_repair and then put it to use in xfs_scrub.
Cc: stable@vger.kernel.org # v4.15 Fixes: 21fb4cb1981ef7 ("xfs: scrub the secondary superblocks") Signed-off-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de --- fs/xfs/scrub/agheader.c | 29 +++++++++++++++++++++++++++-- 1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/scrub/agheader.c b/fs/xfs/scrub/agheader.c index da30f926cbe66d..0f2f1852d58fe7 100644 --- a/fs/xfs/scrub/agheader.c +++ b/fs/xfs/scrub/agheader.c @@ -59,6 +59,30 @@ xchk_superblock_xref( /* scrub teardown will take care of sc->sa for us */ }
+/* + * Calculate the ondisk superblock size in bytes given the feature set of the + * mounted filesystem (aka the primary sb). This is subtlely different from + * the logic in xfs_repair, which computes the size of a secondary sb given the + * featureset listed in the secondary sb. + */ +STATIC size_t +xchk_superblock_ondisk_size( + struct xfs_mount *mp) +{ + if (xfs_has_metauuid(mp)) + return offsetofend(struct xfs_dsb, sb_meta_uuid); + if (xfs_has_crc(mp)) + return offsetofend(struct xfs_dsb, sb_lsn); + if (xfs_sb_version_hasmorebits(&mp->m_sb)) + return offsetofend(struct xfs_dsb, sb_bad_features2); + if (xfs_has_logv2(mp)) + return offsetofend(struct xfs_dsb, sb_logsunit); + if (xfs_has_sector(mp)) + return offsetofend(struct xfs_dsb, sb_logsectsize); + /* only support dirv2 or more recent */ + return offsetofend(struct xfs_dsb, sb_dirblklog); +} + /* * Scrub the filesystem superblock. * @@ -75,6 +99,7 @@ xchk_superblock( struct xfs_buf *bp; struct xfs_dsb *sb; struct xfs_perag *pag; + size_t sblen; xfs_agnumber_t agno; uint32_t v2_ok; __be32 features_mask; @@ -350,8 +375,8 @@ xchk_superblock( }
/* Everything else must be zero. */ - if (memchr_inv(sb + 1, 0, - BBTOB(bp->b_length) - sizeof(struct xfs_dsb))) + sblen = xchk_superblock_ondisk_size(mp); + if (memchr_inv((char *)sb + sblen, 0, BBTOB(bp->b_length) - sblen)) xchk_block_set_corrupt(sc, bp);
xchk_superblock_xref(sc, bp);
linux-stable-mirror@lists.linaro.org