commit 8cccf05fe857a18ee26e20d11a8455a73ffd4efd upstream.
If a nilfs2 filesystem is downgraded to read-only due to metadata corruption on disk and is remounted read/write, or if emergency read-only remount is performed, detaching a log writer and synchronizing the filesystem can be done at the same time.
In these cases, use-after-free of the log writer (hereinafter nilfs->ns_writer) can happen as shown in the scenario below:
Task1 Task2 -------------------------------- ------------------------------ nilfs_construct_segment nilfs_segctor_sync init_wait init_waitqueue_entry add_wait_queue schedule nilfs_remount (R/W remount case) nilfs_attach_log_writer nilfs_detach_log_writer nilfs_segctor_destroy kfree finish_wait _raw_spin_lock_irqsave __raw_spin_lock_irqsave do_raw_spin_lock debug_spin_lock_before <-- use-after-free
While Task1 is sleeping, nilfs->ns_writer is freed by Task2. After Task1 waked up, Task1 accesses nilfs->ns_writer which is already freed. This scenario diagram is based on the Shigeru Yoshida's post [1].
This patch fixes the issue by not detaching nilfs->ns_writer on remount so that this UAF race doesn't happen. Along with this change, this patch also inserts a few necessary read-only checks with superblock instance where only the ns_writer pointer was used to check if the filesystem is read-only.
Link: https://syzkaller.appspot.com/bug?id=79a4c002e960419ca173d55e863bd09e8112df8... Link: https://lkml.kernel.org/r/20221103141759.1836312-1-syoshida@redhat.com [1] Link: https://lkml.kernel.org/r/20221104142959.28296-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+f816fa82f8783f7a02bb@syzkaller.appspotmail.com Reported-by: Shigeru Yoshida syoshida@redhat.com Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org --- Please apply this patch to stable-4.9 instead of the patch that could not be applied to it last time.
A rejected hunk has been manually resolved, and replaced sb_rdonly() with its equivalent bitwise operation since the function does not yet exist in this kernel. Also, retested against that stable tree.
fs/nilfs2/segment.c | 15 ++++++++------- fs/nilfs2/super.c | 2 -- 2 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c index 450bfdee513b..86769ecfe61f 100644 --- a/fs/nilfs2/segment.c +++ b/fs/nilfs2/segment.c @@ -329,7 +329,7 @@ void nilfs_relax_pressure_in_lock(struct super_block *sb) struct the_nilfs *nilfs = sb->s_fs_info; struct nilfs_sc_info *sci = nilfs->ns_writer;
- if (!sci || !sci->sc_flush_request) + if (sb->s_flags & MS_RDONLY || unlikely(!sci) || !sci->sc_flush_request) return;
set_bit(NILFS_SC_PRIOR_FLUSH, &sci->sc_flags); @@ -2252,7 +2252,7 @@ int nilfs_construct_segment(struct super_block *sb) struct nilfs_transaction_info *ti; int err;
- if (!sci) + if (sb->s_flags & MS_RDONLY || unlikely(!sci)) return -EROFS;
/* A call inside transactions causes a deadlock. */ @@ -2291,7 +2291,7 @@ int nilfs_construct_dsync_segment(struct super_block *sb, struct inode *inode, struct nilfs_transaction_info ti; int err = 0;
- if (!sci) + if (sb->s_flags & MS_RDONLY || unlikely(!sci)) return -EROFS;
nilfs_transaction_lock(sb, &ti, 0); @@ -2788,11 +2788,12 @@ int nilfs_attach_log_writer(struct super_block *sb, struct nilfs_root *root)
if (nilfs->ns_writer) { /* - * This happens if the filesystem was remounted - * read/write after nilfs_error degenerated it into a - * read-only mount. + * This happens if the filesystem is made read-only by + * __nilfs_error or nilfs_remount and then remounted + * read/write. In these cases, reuse the existing + * writer. */ - nilfs_detach_log_writer(sb); + return 0; }
nilfs->ns_writer = nilfs_segctor_new(sb, root); diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c index c95d369e90aa..9e8f186b14d1 100644 --- a/fs/nilfs2/super.c +++ b/fs/nilfs2/super.c @@ -1147,8 +1147,6 @@ static int nilfs_remount(struct super_block *sb, int *flags, char *data) if ((*flags & MS_RDONLY) == (sb->s_flags & MS_RDONLY)) goto out; if (*flags & MS_RDONLY) { - /* Shutting down log writer */ - nilfs_detach_log_writer(sb); sb->s_flags |= MS_RDONLY;
/*
On Mon, Nov 21, 2022 at 09:35:28PM +0900, Ryusuke Konishi wrote:
commit 8cccf05fe857a18ee26e20d11a8455a73ffd4efd upstream.
If a nilfs2 filesystem is downgraded to read-only due to metadata corruption on disk and is remounted read/write, or if emergency read-only remount is performed, detaching a log writer and synchronizing the filesystem can be done at the same time.
In these cases, use-after-free of the log writer (hereinafter nilfs->ns_writer) can happen as shown in the scenario below:
Task1 Task2
nilfs_construct_segment nilfs_segctor_sync init_wait init_waitqueue_entry add_wait_queue schedule nilfs_remount (R/W remount case) nilfs_attach_log_writer nilfs_detach_log_writer nilfs_segctor_destroy kfree finish_wait _raw_spin_lock_irqsave __raw_spin_lock_irqsave do_raw_spin_lock debug_spin_lock_before <-- use-after-free
While Task1 is sleeping, nilfs->ns_writer is freed by Task2. After Task1 waked up, Task1 accesses nilfs->ns_writer which is already freed. This scenario diagram is based on the Shigeru Yoshida's post [1].
This patch fixes the issue by not detaching nilfs->ns_writer on remount so that this UAF race doesn't happen. Along with this change, this patch also inserts a few necessary read-only checks with superblock instance where only the ns_writer pointer was used to check if the filesystem is read-only.
Link: https://syzkaller.appspot.com/bug?id=79a4c002e960419ca173d55e863bd09e8112df8... Link: https://lkml.kernel.org/r/20221103141759.1836312-1-syoshida@redhat.com [1] Link: https://lkml.kernel.org/r/20221104142959.28296-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Reported-by: syzbot+f816fa82f8783f7a02bb@syzkaller.appspotmail.com Reported-by: Shigeru Yoshida syoshida@redhat.com Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org
Please apply this patch to stable-4.9 instead of the patch that could not be applied to it last time.
A rejected hunk has been manually resolved, and replaced sb_rdonly() with its equivalent bitwise operation since the function does not yet exist in this kernel. Also, retested against that stable tree.
Now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org