On Thu, Jan 07, 2021 at 05:08:30PM -0500, Josef Bacik wrote:
Commit 38d715f494f2 ("btrfs: use btrfs_start_delalloc_roots in shrink_delalloc") cleaned up how we do delalloc shrinking by utilizing some infrastructure we have in place to flush inodes that we use for device replace and snapshot. However, this introduced a pretty serious performance regression. To reproduce it, the user untarred the Firefox source tarball and saw it take anywhere from 5 to 20 times as long on 5.10 as on 5.9.
The root cause is that we previously used the normal writeback path to reclaim delalloc space, and we provided it with the number of pages we wanted to flush. The referenced commit changed this to flush that many inodes, which drastically increased the amount of space we were flushing in certain cases and severely affected performance.
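To make the difference described above concrete, here is a small userspace toy model, not kernel code. Each "inode" is reduced to a count of dirty pages, and the function names flush_nr_inodes() and flush_nr_pages() are made up for illustration. It only shows how the same count flushes very different amounts depending on whether it is read as inodes or as pages.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: dirty[i] is the number of dirty (delalloc) pages of inode i. */

/*
 * Count interpreted as "number of inodes": every selected inode is
 * flushed completely. Returns the total number of pages flushed.
 */
static long flush_nr_inodes(long *dirty, size_t n, long nr)
{
	long flushed = 0;

	assert(nr >= 0);
	for (size_t i = 0; i < n && nr > 0; i++) {
		if (dirty[i] == 0)
			continue;	/* nothing to flush, does not use up the count */
		flushed += dirty[i];
		dirty[i] = 0;
		nr--;
	}
	return flushed;
}

/*
 * Count interpreted as "number of pages": stop as soon as the page
 * budget is used up. Returns the total number of pages flushed.
 */
static long flush_nr_pages(long *dirty, size_t n, long nr)
{
	long flushed = 0;

	assert(nr >= 0);
	for (size_t i = 0; i < n && nr > 0; i++) {
		long take = dirty[i] < nr ? dirty[i] : nr;

		dirty[i] -= take;
		flushed += take;
		nr -= take;
	}
	return flushed;
}
```

With four inodes of 1000 dirty pages each and a count of 3, the first function flushes 3000 pages while the second flushes 3.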
We cannot revert this patch unfortunately because of
btrfs: fix deadlock when cloning inline extent and low on free metadata space
which requires the ability to skip flushing inodes that are being cloned in certain scenarios, which means we need to keep using our flushing infrastructure or risk re-introducing the deadlock.
Instead, to fix this problem we can go back to providing btrfs_start_delalloc_roots with a number of pages to flush, and then set up a writeback_control and utilize sync_inode() to handle the flushing for us. This gives us the same behavior we had prior to the fix, while still allowing us to avoid the deadlock that was fixed by Filipe. I redid the user's original test and got the following results on one of our test machines (256GiB of RAM, 56 cores, 2TiB Intel NVMe drive):

  5.9           0m54.258s
  5.10          1m26.212s
  5.10+patch    0m38.800s

5.10+patch is significantly faster than plain 5.9 because of my patch series "Change data reservations to use the ticketing infra" which contained the patch that introduced the regression, but generally improved the overall ENOSPC flushing mechanisms.
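The approach in this changelog (a page budget carried in a writeback_control and consumed by sync_inode(), with some inodes skipped) can be modeled in userspace roughly as follows. This is only a sketch under assumptions: struct mock_wbc, struct mock_inode, the skip_flush flag, mock_sync_inode() and mock_start_delalloc() are hypothetical stand-ins, not the kernel definitions or the code from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's writeback_control; only a few fields modeled. */
struct mock_wbc {
	long nr_to_write;	/* remaining page budget */
	long long range_start;	/* byte range to write back */
	long long range_end;
};

/* Stand-in for an inode with delalloc; skip_flush is a hypothetical flag. */
struct mock_inode {
	long dirty_pages;
	bool skip_flush;	/* e.g. inode is being cloned, must not be flushed */
};

/* Stand-in for sync_inode(): write at most wbc->nr_to_write pages. */
static void mock_sync_inode(struct mock_inode *inode, struct mock_wbc *wbc)
{
	long n = inode->dirty_pages < wbc->nr_to_write ?
		 inode->dirty_pages : wbc->nr_to_write;

	inode->dirty_pages -= n;
	wbc->nr_to_write -= n;	/* the budget shrinks as pages are written */
}

/* Flush up to nr_pages across all inodes; returns pages actually written. */
static long mock_start_delalloc(struct mock_inode *inodes, size_t count,
				long nr_pages)
{
	struct mock_wbc wbc = {
		.nr_to_write = nr_pages,
		.range_start = 0,
		.range_end = LLONG_MAX,
	};

	assert(nr_pages >= 0);
	for (size_t i = 0; i < count && wbc.nr_to_write > 0; i++) {
		if (inodes[i].skip_flush)
			continue;	/* keeps the deadlock avoidance intact */
		mock_sync_inode(&inodes[i], &wbc);
	}
	return nr_pages - wbc.nr_to_write;
}
```

The point of the model is that one budget is shared across all inodes, so the loop stops once enough pages have been written, while skipped inodes are left alone.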
CC: stable@vger.kernel.org # 5.10
Reported-by: René Rebe <rene@exactcode.de>
Fixes: 38d715f494f2 ("btrfs: use btrfs_start_delalloc_roots in shrink_delalloc")
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
v2->v3:
- modified the changelog to add information about the patches referenced, and detail the specs of the machine I used for the performance numbers.
Great, thanks. Meanwhile I ran some other tests: 'dbench 32' is basically the same, and so is async random write with 'fio --rw=randwrite --size=4g --ioengine=libaio'.
I'm going to send another rc3 pull request with this patch so we can get it to 5.10 stable.