[BUG] Test case generic/475 have a very high chance (almost 100%) to hit a fs hang, where a data page will never be unlocked and hang all later operations.
[CAUSE] In btrfs_do_readpage(), if we hit an error from submit_extent_page() we will try to do the cleanup for our current io range, and exit.
This works fine for PAGE_SIZE == sectorsize cases, but not for subpage.
For subpage btrfs_do_readpage() will lock the full page first, which can contain several different sectors and extents:
btrfs_do_readpage() |- begin_page_read() | |- btrfs_subpage_start_reader(); | Now the page will hage PAGE_SIZE / sectorsize reader pending, | and the page is locked. | |- end_page_read() for different branches | This function will reduce subpage readers, and when readers | reach 0, it will unlock the page.
But when submit_extent_page() failed, we only cleanup the current io range, while the remaining io range will never be cleaned up, and the page remains locked forever.
[FIX] Update the error handling of submit_extent_page() to cleanup all the remaining subpage range before exiting the loop.
CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Qu Wenruo wqu@suse.com --- fs/btrfs/extent_io.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 34073b0ed6ca..8de25ce05606 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3735,8 +3735,12 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, this_bio_flag, force_bio_submit); if (ret) { - unlock_extent(tree, cur, cur + iosize - 1); - end_page_read(page, false, cur, iosize); + /* + * We have to unlock the remaining range, or the page + * will never be unlocked. + */ + unlock_extent(tree, cur, end); + end_page_read(page, false, cur, end + 1 - cur); goto out; } cur = cur + iosize;
On Mon, Apr 11, 2022 at 02:12:50PM +0800, Qu Wenruo wrote:
[BUG] Test case generic/475 have a very high chance (almost 100%) to hit a fs hang, where a data page will never be unlocked and hang all later operations.
Question: how can we even get an error? The submission already stopped return errors with patch 1. btrfs_get_chunk_map called from btrfs_zoned_get_device and calc_bio_boundaries really just has sanity checks that should be fatal if not met, same for btrfs_get_io_geometry.
So yes, we could fix the nasty error handling here. Or just remove it entirely, which would reduce the possibility of bugs even more.
On 2022/4/11 15:18, Christoph Hellwig wrote:
On Mon, Apr 11, 2022 at 02:12:50PM +0800, Qu Wenruo wrote:
[BUG] Test case generic/475 have a very high chance (almost 100%) to hit a fs hang, where a data page will never be unlocked and hang all later operations.
Question: how can we even get an error? The submission already stopped return errors with patch 1. btrfs_get_chunk_map called from btrfs_zoned_get_device and calc_bio_boundaries really just has sanity checks that should be fatal if not met, same for btrfs_get_io_geometry.
Yep, btrfs_get_chunk_map() related call sites can only get error due to sanity check.
But the sanity check still makes sense, and those can not be easily rejected by our existing tree-checkers.
So we still need to do the error handling.
So yes, we could fix the nasty error handling here. Or just remove it entirely, which would reduce the possibility of bugs even more.
I don't see a super good way to remove the sanity check.
The get_chunk_map() one is here to catch possible bad bio which wants to write into non-existing chunk. To completely get rid that, we need a bullet proof solution to make sure all of our bio will never point to some non-existing logical bytenr.
Especially considering we have extra mechanisms to make chunk allocation/deletion more dynamic and harder to catch such situation at other locations.
Which can be more challenging than the error handling.
Anyway, we still need to handle the error for __get_extent_map(), we just need to do the same one here.
Thanks, Qu
linux-stable-mirror@lists.linaro.org