From: Huang Jianan huangjianan@oppo.com
[ Upstream commit 57bbeacdbee72a54eb97d56b876cf9c94059fc34 ]
We observed the following deadlock in the stress test under low memory scenario:
Thread A Thread B - erofs_shrink_scan - erofs_try_to_release_workgroup - erofs_workgroup_try_to_freeze -- A - z_erofs_do_read_page - z_erofs_collection_begin - z_erofs_register_collection - erofs_insert_workgroup - xa_lock(&sbi->managed_pslots) -- B - erofs_workgroup_get - erofs_wait_on_workgroup_freezed -- A - xa_erase - xa_lock(&sbi->managed_pslots) -- B
To fix this, it needs to hold xa_lock before freezing the workgroup since xarray will be touched then. So let's hold the lock before accessing each workgroup, just like what we did with the radix tree before.
[ Gao Xiang: Jianhua Hao also reports this issue at https://lore.kernel.org/r/b10b85df30694bac8aadfe43537c897a@xiaomi.com ]
Link: https://lore.kernel.org/r/20211118135844.3559-1-huangjianan@oppo.com Fixes: 64094a04414f ("erofs: convert workstn to XArray") Reviewed-by: Chao Yu chao@kernel.org Reviewed-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Huang Jianan huangjianan@oppo.com Reported-by: Jianhua Hao haojianhua1@xiaomi.com Signed-off-by: Gao Xiang xiang@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/utils.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c index bd86067a63f7f..3ca703cd5b24a 100644 --- a/fs/erofs/utils.c +++ b/fs/erofs/utils.c @@ -141,7 +141,7 @@ static bool erofs_try_to_release_workgroup(struct erofs_sb_info *sbi, * however in order to avoid some race conditions, add a * DBG_BUGON to observe this in advance. */ - DBG_BUGON(xa_erase(&sbi->managed_pslots, grp->index) != grp); + DBG_BUGON(__xa_erase(&sbi->managed_pslots, grp->index) != grp);
/* last refcount should be connected with its managed pslot. */ erofs_workgroup_unfreeze(grp, 0); @@ -156,15 +156,19 @@ static unsigned long erofs_shrink_workstation(struct erofs_sb_info *sbi, unsigned int freed = 0; unsigned long index;
+ xa_lock(&sbi->managed_pslots); xa_for_each(&sbi->managed_pslots, index, grp) { /* try to shrink each valid workgroup */ if (!erofs_try_to_release_workgroup(sbi, grp)) continue; + xa_unlock(&sbi->managed_pslots);
++freed; if (!--nr_shrink) - break; + return freed; + xa_lock(&sbi->managed_pslots); } + xa_unlock(&sbi->managed_pslots); return freed; }