Hi all,
Bug fixes for 6.13.
If you're going to start using this code, I strongly recommend pulling from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy! Comments and questions are, as always, welcome.
--D
kernel git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=xfs...
xfsprogs git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=... --- Commits in this patchset: * xfs: return a 64-bit block count from xfs_btree_count_blocks * xfs: fix error bailout in xfs_rtginode_create * xfs: update btree keys correctly when _insrec splits an inode root block * xfs: don't call xfs_bmap_same_rtgroup in xfs_bmap_add_extent_hole_delay * xfs: fix sb_spino_align checks for large fsblock sizes * xfs: return from xfs_symlink_verify early on V4 filesystems --- libxfs/xfs_bmap.c | 6 ++---- libxfs/xfs_btree.c | 33 +++++++++++++++++++++++++-------- libxfs/xfs_btree.h | 2 +- libxfs/xfs_ialloc_btree.c | 4 +++- libxfs/xfs_rtgroup.c | 2 +- libxfs/xfs_sb.c | 11 ++++++----- libxfs/xfs_symlink_remote.c | 4 +++- 7 files changed, 41 insertions(+), 21 deletions(-)
From: Darrick J. Wong djwong@kernel.org
With the nrext64 feature enabled, it's possible for a data fork to have 2^48 extent mappings. Even with a 64k fsblock size, that maps out to a bmbt containing more than 2^32 blocks. Therefore, this predicate must return a u64 count to avoid an integer wraparound that will cause scrub to do the wrong thing.
It's unlikely that any such filesystem currently exists, because the incore bmbt would consume more than 64GB of kernel memory on its own, and so far nobody except me has driven a filesystem that far, judging from the lack of complaints.
Cc: stable@vger.kernel.org # v5.19 Fixes: df9ad5cc7a5240 ("xfs: Introduce macros to represent new maximum extent counts for data/attr forks") Signed-off-by: "Darrick J. Wong" djwong@kernel.org --- libxfs/xfs_btree.c | 4 ++-- libxfs/xfs_btree.h | 2 +- libxfs/xfs_ialloc_btree.c | 4 +++- 3 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/libxfs/xfs_btree.c b/libxfs/xfs_btree.c index 3d870f3f4a5165..5c293ccf623336 100644 --- a/libxfs/xfs_btree.c +++ b/libxfs/xfs_btree.c @@ -5142,7 +5142,7 @@ xfs_btree_count_blocks_helper( int level, void *data) { - xfs_extlen_t *blocks = data; + xfs_filblks_t *blocks = data; (*blocks)++;
return 0; @@ -5152,7 +5152,7 @@ xfs_btree_count_blocks_helper( int xfs_btree_count_blocks( struct xfs_btree_cur *cur, - xfs_extlen_t *blocks) + xfs_filblks_t *blocks) { *blocks = 0; return xfs_btree_visit_blocks(cur, xfs_btree_count_blocks_helper, diff --git a/libxfs/xfs_btree.h b/libxfs/xfs_btree.h index 3b739459ebb0f4..c5bff273cae255 100644 --- a/libxfs/xfs_btree.h +++ b/libxfs/xfs_btree.h @@ -484,7 +484,7 @@ typedef int (*xfs_btree_visit_blocks_fn)(struct xfs_btree_cur *cur, int level, int xfs_btree_visit_blocks(struct xfs_btree_cur *cur, xfs_btree_visit_blocks_fn fn, unsigned int flags, void *data);
-int xfs_btree_count_blocks(struct xfs_btree_cur *cur, xfs_extlen_t *blocks); +int xfs_btree_count_blocks(struct xfs_btree_cur *cur, xfs_filblks_t *blocks);
union xfs_btree_rec *xfs_btree_rec_addr(struct xfs_btree_cur *cur, int n, struct xfs_btree_block *block); diff --git a/libxfs/xfs_ialloc_btree.c b/libxfs/xfs_ialloc_btree.c index 19fca9fad62b1d..4cccac145dc775 100644 --- a/libxfs/xfs_ialloc_btree.c +++ b/libxfs/xfs_ialloc_btree.c @@ -743,6 +743,7 @@ xfs_finobt_count_blocks( { struct xfs_buf *agbp = NULL; struct xfs_btree_cur *cur; + xfs_filblks_t blocks; int error;
error = xfs_ialloc_read_agi(pag, tp, 0, &agbp); @@ -750,9 +751,10 @@ xfs_finobt_count_blocks( return error;
cur = xfs_finobt_init_cursor(pag, tp, agbp); - error = xfs_btree_count_blocks(cur, tree_blocks); + error = xfs_btree_count_blocks(cur, &blocks); xfs_btree_del_cursor(cur, error); xfs_trans_brelse(tp, agbp); + *tree_blocks = blocks;
return error; }
From: Darrick J. Wong djwong@kernel.org
In commit 2c813ad66a72, I partially fixed a bug wherein xfs_btree_insrec would erroneously try to update the parent's key for a block that had been split if we decided to insert the new record into the new block. The solution was to detect this situation and update the in-core key value that we pass up to the caller so that the caller will (eventually) add the new block to the parent level of the tree with the correct key.
However, I missed a subtlety about the way inode-rooted btrees work. If the full block was a maximally sized inode root block, we'll solve that fullness by moving the root block's records to a new block, resizing the root block, and updating the root to point to the new block. We don't pass a pointer to the new block to the caller because that work has already been done. The new record will /always/ land in the new block, so in this case we need to use xfs_btree_update_keys to update the keys.
This bug can theoretically manifest itself in the very rare case that we split a bmbt root block and the new record lands in the very first slot of the new block, though I've never managed to trigger it in practice. However, it is very easy to reproduce by running generic/522 with the realtime rmapbt patchset if rtinherit=1.
Cc: stable@vger.kernel.org # v4.8 Fixes: 2c813ad66a7218 ("xfs: support btrees with overlapping intervals for keys") Signed-off-by: "Darrick J. Wong" djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de --- libxfs/xfs_btree.c | 29 +++++++++++++++++++++++------ 1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/libxfs/xfs_btree.c b/libxfs/xfs_btree.c index 5c293ccf623336..f4c4db62e2069e 100644 --- a/libxfs/xfs_btree.c +++ b/libxfs/xfs_btree.c @@ -3555,14 +3555,31 @@ xfs_btree_insrec( xfs_btree_log_block(cur, bp, XFS_BB_NUMRECS);
/* - * If we just inserted into a new tree block, we have to - * recalculate nkey here because nkey is out of date. + * Update btree keys to reflect the newly added record or keyptr. + * There are three cases here to be aware of. Normally, all we have to + * do is walk towards the root, updating keys as necessary. * - * Otherwise we're just updating an existing block (having shoved - * some records into the new tree block), so use the regular key - * update mechanism. + * If the caller had us target a full block for the insertion, we dealt + * with that by calling the _make_block_unfull function. If the + * "make unfull" function splits the block, it'll hand us back the key + * and pointer of the new block. We haven't yet added the new block to + * the next level up, so if we decide to add the new record to the new + * block (bp->b_bn != old_bn), we have to update the caller's pointer + * so that the caller adds the new block with the correct key. + * + * However, there is a third possibility-- if the selected block is the + * root block of an inode-rooted btree and cannot be expanded further, + * the "make unfull" function moves the root block contents to a new + * block and updates the root block to point to the new block. In this + * case, no block pointer is passed back because the block has already + * been added to the btree. In this case, we need to use the regular + * key update function, just like the first case. This is critical for + * overlapping btrees, because the high key must be updated to reflect + * the entire tree, not just the subtree accessible through the first + * child of the root (which is now two levels down from the root). */ - if (bp && xfs_buf_daddr(bp) != old_bn) { + if (!xfs_btree_ptr_is_null(cur, &nptr) && + bp && xfs_buf_daddr(bp) != old_bn) { xfs_btree_get_keys(cur, block, lkey); } else if (xfs_btree_needs_key_update(cur, optr)) { error = xfs_btree_update_keys(cur, level);
From: Darrick J. Wong djwong@kernel.org
V4 symlink blocks didn't have headers, so return early if this is a V4 filesystem.
Cc: stable@vger.kernel.org # v5.1 Fixes: 39708c20ab5133 ("xfs: miscellaneous verifier magic value fixups") Signed-off-by: "Darrick J. Wong" djwong@kernel.org --- libxfs/xfs_symlink_remote.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/libxfs/xfs_symlink_remote.c b/libxfs/xfs_symlink_remote.c index 2ad586f3926ad2..1c355f751e1cc7 100644 --- a/libxfs/xfs_symlink_remote.c +++ b/libxfs/xfs_symlink_remote.c @@ -89,8 +89,10 @@ xfs_symlink_verify( struct xfs_mount *mp = bp->b_mount; struct xfs_dsymlink_hdr *dsl = bp->b_addr;
+ /* no verification of non-crc buffers */ if (!xfs_has_crc(mp)) - return __this_address; + return NULL; + if (!xfs_verify_magic(bp, dsl->sl_magic)) return __this_address; if (!uuid_equal(&dsl->sl_uuid, &mp->m_sb.sb_meta_uuid))
On Mon, Dec 09, 2024 at 09:25:02PM -0800, Christoph Hellwig wrote:
Shouldn't the bug fix series go first?
Something got messed up in the patchbombing sequence. Patchset 5/9 should have sorted before this one (6/9) but instead it came at the end.
Ohhh, right. Long Li add a "Cc: stable@vger.kernel.org # v4.19+" tag. This isn't rfc2822 compliant, so the stgit subcommand that converts Cc: tags in the commit message to actual cc: headers tried to put the entire string (minus spaces!) in the cc line, which caused the msmtp to reject the cover letter. I stripped out the stable@ entirely and resent just that message, but forgot that my patchbombing script always updates the send time for each attempt, so the mutt sort order is now incorrect.
String parsing sucks, and our automated tools don't catch it either:
$ ./scripts/checkpatch.pl ../xfsprogs/patches-djwong-dev10/129* ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 652f03db897b ("xfs: remove unknown compat feature check in superblock write validation")' #4: Source kernel commit: 652f03db897ba24f9c4b269e254ccc6cc01ff1b7
total: 1 errors, 0 warnings, 13 lines checked
NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace.
../xfsprogs/patches-djwong-dev10/129-xfs__remove_unknown_compat_feature_check_in_superblock_write_validation.patch has style problems, please review.
NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS.
Yeah, it whines about the commit hash being **too long** which is totally inconsequential but totally misses the incorrect Cc: header specification that confuses the hyell out of MTAs.
Sorry about that.
--D
linux-stable-mirror@lists.linaro.org