From: Qu Wenruo wqu@suse.com
[ Upstream commit 0528476b6ac7832f31e2ed740a57ae31316b124e ]
[BUG] With current subpage RW support, the following script can hang the fs with 64K page size.
# mkfs.btrfs -f -s 4k $dev # mount $dev -o nospace_cache $mnt # fsstress -w -n 50 -p 1 -s 1607749395 -d $mnt
The kernel will do an infinite loop in btrfs_punch_hole_lock_range().
[CAUSE] In btrfs_punch_hole_lock_range() we:
- Truncate page cache range - Lock extent io tree - Wait any ordered extents in the range.
We exit the loop until we meet all the following conditions:
- No ordered extent in the lock range - No page is in the lock range
The latter condition has a pitfall, it only works for sector size == PAGE_SIZE case.
While can't handle the following subpage case:
0 32K 64K 96K 128K | |///////||//////| ||
lockstart=32K lockend=96K - 1
In this case, although the range crosses 2 pages, truncate_pagecache_range() will invalidate no page at all, but only zero the [32K, 96K) range of the two pages.
Thus filemap_range_has_page(32K, 96K-1) will always return true, thus we will never meet the loop exit condition.
[FIX] Fix the problem by doing page alignment for the lock range.
Function filemap_range_has_page() has already handled lend < lstart case, we only need to round up @lockstart, and round_down @lockend for truncate_pagecache_range().
This modification should not change any thing for sector size == PAGE_SIZE case, as in that case our range is already page aligned.
Tested-by: Ritesh Harjani riteshh@linux.ibm.com # [ppc64] Tested-by: Anand Jain anand.jain@oracle.com # [aarch64] Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/file.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 55f68422061d..7e6ed97a6d08 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2483,6 +2483,17 @@ static int btrfs_punch_hole_lock_range(struct inode *inode, const u64 lockend, struct extent_state **cached_state) { + /* + * For subpage case, if the range is not at page boundary, we could + * have pages at the leading/tailing part of the range. + * This could lead to dead loop since filemap_range_has_page() + * will always return true. + * So here we need to do extra page alignment for + * filemap_range_has_page(). + */ + const u64 page_lockstart = round_up(lockstart, PAGE_SIZE); + const u64 page_lockend = round_down(lockend + 1, PAGE_SIZE) - 1; + while (1) { struct btrfs_ordered_extent *ordered; int ret; @@ -2503,7 +2514,7 @@ static int btrfs_punch_hole_lock_range(struct inode *inode, (ordered->file_offset + ordered->num_bytes <= lockstart || ordered->file_offset > lockend)) && !filemap_range_has_page(inode->i_mapping, - lockstart, lockend)) { + page_lockstart, page_lockend)) { if (ordered) btrfs_put_ordered_extent(ordered); break;