From: Qu Wenruo wqu@suse.com
commit 484167da77739a8d0e225008c48e697fd3f781ae upstream.
[BUG]
Users of the autodefrag mount option are reporting an obvious increase in IO:
  If I compare the write average (in total, I don't have it per process)
  when taking idle periods on the same machine:
      Linux 5.16:
          without autodefrag: ~ 10KiB/s
          with autodefrag: between 1 and 2MiB/s.

      Linux 5.15:
          with autodefrag: ~ 10KiB/s (around the same as without
          autodefrag on 5.16)
[CAUSE]
When the autodefrag mount option is enabled, btrfs_defrag_file() is called with @max_sectors = BTRFS_DEFRAG_BATCH (1024) to limit how many sectors can be defragged in one try.
The number of sectors defragged is then used to determine whether a re-defrag is needed.

But commit b18c3ab2343d ("btrfs: defrag: introduce helper to defrag one cluster") uses the wrong unit when increasing @sectors_defragged: the value is added in bytes, but it should be in sectors.
This means that once any sector has been defragged, @sectors_defragged will be >= sectorsize (normally 4096), which is larger than BTRFS_DEFRAG_BATCH.
This makes the @max_sectors check in defrag_one_cluster() underflow, rendering the whole @max_sectors check useless.
The result is far more IO with the autodefrag mount option, as there is no longer any limit on how many sectors can be defragged.
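
To make the unit mismatch concrete, here is a minimal userspace sketch in C. It is not the kernel code: the names buggy_sectors_defragged() and remaining_budget() are hypothetical, uint64_t stands in for the kernel's counter type, and the subtraction in remaining_budget() is an assumption about how the limit is applied to each range.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the pre-fix accounting loop (not kernel code).
 * Every target range is range_len bytes long. Returns the final value
 * of the "sectors defragged" counter.
 */
static uint64_t buggy_sectors_defragged(uint32_t nr_ranges, uint32_t range_len,
					uint64_t max_sectors)
{
	uint64_t sectors_defragged = 0;

	assert(range_len > 0);
	for (uint32_t i = 0; i < nr_ranges; i++) {
		/* Old limit check: only an exact match stops the loop. */
		if (max_sectors && max_sectors == sectors_defragged)
			break;
		/* Bug being modelled: a byte count is added to a sector counter. */
		sectors_defragged += range_len;
	}
	return sectors_defragged;
}

/*
 * Assumed form of the remaining budget: an unsigned subtraction, which
 * wraps around once the counter is already past the limit.
 */
static uint64_t remaining_budget(uint64_t max_sectors, uint64_t sectors_defragged)
{
	return max_sectors - sectors_defragged;
}
```

With a limit of 1024 and 4096-byte ranges, the counter in this model jumps from 0 to 4096 after the first range, so the equality check never matches and the computed budget wraps to a huge value.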
[FIX]
Fix the problems by:
- Using sector as the unit when increasing @sectors_defragged
- Also breaking the loop when @sectors_defragged > @max_sectors
- Adding a comment on the return value of btrfs_defrag_file()
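
The first two points can be sketched as a small userspace model in C. This is an illustration under stated assumptions, not the patched kernel function: fixed_sectors_defragged() is a hypothetical name, uint64_t stands in for the kernel's counter type, and the per-range clamping done by the real code is left out so that the effect of the >= check is visible.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the post-fix accounting loop (not kernel code).
 * Every target range is range_len bytes long; sectorsize_bits is log2 of
 * the sector size (12 for 4096-byte sectors). max_sectors == 0 means
 * "no limit". Returns the number of sectors counted.
 */
static uint64_t fixed_sectors_defragged(uint32_t nr_ranges, uint32_t range_len,
					uint32_t sectorsize_bits,
					uint64_t max_sectors)
{
	uint64_t sectors_defragged = 0;

	assert(sectorsize_bits < 32);
	for (uint32_t i = 0; i < nr_ranges; i++) {
		/* Reached or beyond the limit: stop in both cases. */
		if (max_sectors && sectors_defragged >= max_sectors)
			break;
		/* Convert the byte length to sectors before accounting. */
		sectors_defragged += range_len >> sectorsize_bits;
	}
	return sectors_defragged;
}
```

In this model, 100 ranges of 256KiB with a limit of 1024 sectors stop after 16 ranges, and a counter that overshoots the limit still terminates the loop instead of running through every range.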
Reported-by: Anthony Ruhier aruhier@mailbox.org
Fixes: b18c3ab2343d ("btrfs: defrag: introduce helper to defrag one cluster")
Link: https://lore.kernel.org/linux-btrfs/0a269612-e43f-da22-c5bc-b34b1b56ebe8@mai...
CC: stable@vger.kernel.org # 5.16
Reviewed-by: Filipe Manana fdmanana@suse.com
Signed-off-by: Qu Wenruo wqu@suse.com
Signed-off-by: David Sterba dsterba@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/btrfs/ioctl.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1416,8 +1416,8 @@ static int defrag_one_cluster(struct btr
 	list_for_each_entry(entry, &target_list, list) {
 		u32 range_len = entry->len;
 
-		/* Reached the limit */
-		if (max_sectors && max_sectors == *sectors_defragged)
+		/* Reached or beyond the limit */
+		if (max_sectors && *sectors_defragged >= max_sectors)
 			break;
 
 		if (max_sectors)
@@ -1439,7 +1439,8 @@ static int defrag_one_cluster(struct btr
 				       extent_thresh, newer_than, do_compress);
 		if (ret < 0)
 			break;
-		*sectors_defragged += range_len;
+		*sectors_defragged += range_len >>
+				      inode->root->fs_info->sectorsize_bits;
 	}
 out:
 	list_for_each_entry_safe(entry, tmp, &target_list, list) {
@@ -1458,6 +1459,9 @@ out:
  * @newer_than:	   minimum transid to defrag
  * @max_to_defrag: max number of sectors to be defragged, if 0, the whole inode
  *		   will be defragged.
+ *
+ * Return <0 for error.
+ * Return >=0 for the number of sectors defragged.
  */
 int btrfs_defrag_file(struct inode *inode, struct file_ra_state *ra,
 		      struct btrfs_ioctl_defrag_range_args *range,