On Tue, Aug 20, 2024 at 22:22, Andreas Gruenbacher agruenba@redhat.com wrote:
Hi,
On Tue, Aug 20, 2024 at 5:32 AM Julian Sun sunjunchao2870@gmail.com wrote:
When gfs2_fill_super() fails, destroy_workqueue() is called within gfs2_gl_hash_clear(), and the subsequent code path calls destroy_workqueue() on the same work queue again.
This issue can be fixed by setting the work queue pointer to NULL after the first destroy_workqueue() call and checking for a NULL pointer before attempting to destroy the work queue again.
Reported-by: syzbot+d34c2a269ed512c531b0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d34c2a269ed512c531b0
Fixes: 30e388d57367 ("gfs2: Switch to a per-filesystem glock workqueue")
Cc: stable@vger.kernel.org
Signed-off-by: Julian Sun sunjunchao2870@gmail.com

 fs/gfs2/glock.c      | 1 +
 fs/gfs2/ops_fstype.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 12a769077ea0..4775c2cb8ae1 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -2249,6 +2249,7 @@ void gfs2_gl_hash_clear(struct gfs2_sbd *sdp)
 	gfs2_free_dead_glocks(sdp);
 	glock_hash_walk(dump_glock_func, sdp);
 	destroy_workqueue(sdp->sd_glock_wq);
+	sdp->sd_glock_wq = NULL;
Here, sdp->sd_glock_wq is set to NULL,
}
 static const char *state2str(unsigned state)
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index ff1f3e3dc65c..c1a7ff713c84 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1305,7 +1305,8 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
 	gfs2_delete_debugfs_file(sdp);
 	gfs2_sys_fs_del(sdp);
 fail_delete_wq:
-	destroy_workqueue(sdp->sd_delete_wq);
+	if (sdp->sd_delete_wq)
+		destroy_workqueue(sdp->sd_delete_wq);
but here, we check if sdp->sd_delete_wq is NULL? That doesn't make sense.
I'm not sure if I have missed anything important. My understanding is that in gfs2_fill_super(), if execution jumps to the fail_lm label, gfs2_gl_hash_clear() is called first, which internally calls destroy_workqueue(sdp->sd_glock_wq). Subsequently, the code reaches the fail_delete_wq label, where destroy_workqueue(sdp->sd_glock_wq) is called again, leading to a bug. If there is anything important I'm missing, please let me know.
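The double-destroy path described above can be illustrated with a minimal userspace sketch in C. This is not kernel code: `mock_wq`, `mock_sbd`, `mock_destroy_wq()`, `mock_gl_hash_clear()` and `run_fail_path()` are hypothetical stand-ins invented for this example, and the mock only counts destroy calls instead of freeing anything. It assumes the NULL check is applied to the same pointer that was cleared, as the commit message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a workqueue; it only counts destroy calls. */
struct mock_wq {
	int destroy_count;
};

/* Hypothetical stand-in for struct gfs2_sbd, reduced to one pointer. */
struct mock_sbd {
	struct mock_wq *glock_wq;
};

/* Mock of destroy_workqueue(): record the call instead of freeing. */
static void mock_destroy_wq(struct mock_wq *wq)
{
	assert(wq != NULL);
	wq->destroy_count++;
}

/* Mock of gfs2_gl_hash_clear(): destroy the queue, optionally clear the pointer. */
static void mock_gl_hash_clear(struct mock_sbd *sbd, int clear_pointer)
{
	mock_destroy_wq(sbd->glock_wq);
	if (clear_pointer)
		sbd->glock_wq = NULL;
}

/*
 * Model of the error path: the cleanup helper runs first, then control
 * falls through to the label that destroys the same queue again.
 * Returns how many times the queue was destroyed.
 */
int run_fail_path(int patched)
{
	struct mock_wq wq = { 0 };
	struct mock_sbd sbd = { &wq };

	mock_gl_hash_clear(&sbd, patched);

	/* fail_glock_wq: the patched variant skips a queue that is already gone. */
	if (!patched || sbd.glock_wq)
		mock_destroy_wq(sbd.glock_wq);

	return wq.destroy_count;
}
```

Calling `run_fail_path(0)` models the unguarded path and returns 2 (the queue is destroyed twice); `run_fail_path(1)` models the guarded path and returns 1.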
 fail_glock_wq:
 	destroy_workqueue(sdp->sd_glock_wq);
 fail_free:
-- 
2.39.2
Thanks,
Andreas
Thanks,