On Thu, Dec 16, 2021 at 07:47:35PM +0800, Qu Wenruo wrote:
[BUG] The following super simple script would crash btrfs at unmount time, if CONFIG_BTRFS_ASSERT is set.
    mkfs.btrfs -f $dev
    mount $dev $mnt
    xfs_io -f -c "pwrite 0 4k" $mnt/file
    umount $mnt
    mount -r ro $dev $mnt
    btrfs scrub start -Br $mnt
    umount $mnt
This will trigger the following ASSERT() introduced by commit 0a31daa4b602 ("btrfs: add assertion for empty list of transactions at late stage of umount").
That patch is definitely not the cause; it just makes enough noise for us developers.
[CAUSE] We will start transaction for the following call chain during scrub:
    scrub_enumerate_chunks()
    |- btrfs_inc_block_group_ro()
       |- btrfs_join_transaction()
However, for a RO mount there is no running transaction at all, thus btrfs_join_transaction() will start a new transaction.
Furthermore, since it's a read-only mount, btrfs_sync_fs() will not call btrfs_commit_super() to commit the new but empty transaction.
This leads to the ASSERT() being triggered.
The bug has likely been there for a long time. Only the new ASSERT() makes it noisy enough to be noticed.
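The sequence described under [CAUSE] can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct toy_fs and every toy_* function are hypothetical stand-ins for fs_info, btrfs_join_transaction(), the commit step at sync time, and the unmount check, and they assume nothing beyond what the description above states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; all names are hypothetical, none exist in the kernel. */
struct toy_fs {
	bool rdonly;        /* stands in for sb_rdonly(fs_info->sb) */
	int running_trans;  /* stands in for entries on fs_info->trans_list */
};

/* Models btrfs_join_transaction(): starts a transaction if none is running. */
static void toy_join_transaction(struct toy_fs *fs)
{
	assert(fs != NULL);
	if (fs->running_trans == 0)
		fs->running_trans = 1;
}

/* Models the commit at sync time, which is skipped on a read-only mount. */
static void toy_sync_fs(struct toy_fs *fs)
{
	if (fs->rdonly)
		return;
	fs->running_trans = 0;
}

/* Models scrub marking a block group read-only before the fix. */
static void toy_scrub_unpatched(struct toy_fs *fs)
{
	toy_join_transaction(fs);
}

/*
 * Models unmount: returns how many transactions are still listed.
 * The new ASSERT() corresponds to expecting this to be zero.
 */
static int toy_umount(struct toy_fs *fs)
{
	toy_sync_fs(fs);
	return fs->running_trans;
}
```

In this model, scrubbing a toy_fs with rdonly set leaves one transaction behind at toy_umount(), while the same sequence on a read-write toy_fs leaves none.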
[FIX] For read-only scrub on read-only mount, there is no need to start a transaction nor to allocate new chunks in btrfs_inc_block_group_ro().
Just do an extra read-only mount check in btrfs_inc_block_group_ro(), and if it's read-only, skip all chunk allocation and go to inc_block_group_ro() directly.
Since we're here, also add an extra debug message at unmount for btrfs_fs_info::trans_list. Sometimes just knowing that there are no dirty metadata bytes for an uncommitted transaction can tell us a lot of things.
Cc: stable@vger.kernel.org # 5.4+
Signed-off-by: Qu Wenruo <wqu@suse.com>
 fs/btrfs/block-group.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 1db24e6d6d90..702219361b12 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2544,6 +2544,19 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
 	int ret;
 	bool dirty_bg_running;
 
+	/*
+	 * This can only happen when we are doing read-only scrub on read-only
+	 * mount.
+	 * In that case we should not start a new transaction on read-only fs.
+	 * Thus here we skip all chunk allocation.
+	 */
+	if (sb_rdonly(fs_info->sb)) {

Should this also verify or at least assert that do_chunk_alloc is not set? The scrub code is also used for device replace, which can set the parameter to true.

+		mutex_lock(&fs_info->ro_block_group_mutex);
+		ret = inc_block_group_ro(cache, 0);
+		mutex_unlock(&fs_info->ro_block_group_mutex);
+		return ret;

So this is taking a shortcut and skips a few things done in the function that use the transaction. I'm not sure how safe this is: it depends on the read-only status of the superblock, which can change at any time, so what are further calls to btrfs_inc_block_group_ro() going to do regarding the transaction?

+	}
+
 	do {
 		trans = btrfs_join_transaction(root);
 		if (IS_ERR(trans))
-- 
2.34.1
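
To make the do_chunk_alloc question concrete, here is a rough userspace sketch of one way the read-only shortcut could check the parameter. It is not part of the patch and not kernel code: struct toy_fs and toy_inc_block_group_ro() are hypothetical, and returning -EROFS is just one assumed policy (in the kernel an ASSERT(!do_chunk_alloc) would be the other option mentioned).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; all names are hypothetical, none exist in the kernel. */
struct toy_fs {
	bool rdonly;       /* stands in for sb_rdonly(fs_info->sb) */
	int trans_joined;  /* how many times a transaction was joined */
	int ro_marked;     /* how many block groups were marked read-only */
};

/*
 * Models btrfs_inc_block_group_ro() with the read-only shortcut plus an
 * explicit check of do_chunk_alloc (assumption: reject instead of ignore).
 */
static int toy_inc_block_group_ro(struct toy_fs *fs, bool do_chunk_alloc)
{
	assert(fs != NULL);

	if (fs->rdonly) {
		/* A caller asking for chunk allocation cannot be served here. */
		if (do_chunk_alloc)
			return -EROFS;
		/* Stands in for inc_block_group_ro(cache, 0) under the mutex. */
		fs->ro_marked++;
		return 0;
	}

	/* Regular path: join a transaction, then mark the group read-only. */
	fs->trans_joined++;
	fs->ro_marked++;
	return 0;
}
```

With this shape a read-only scrub still takes the shortcut without joining a transaction, while a caller that passes do_chunk_alloc as true on a read-only filesystem gets an error instead of having the flag silently dropped. The model reads the read-only flag once, so it says nothing about the flag changing between calls.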