On Mon, Jul 12, 2021 at 9:12 PM syzbot <syzbot+283ce5a46486d6acdbaf@syzkaller.appspotmail.com> wrote:
syzbot has found a reproducer for the following issue on:
Hmm.
This issue is reported to have been already fixed:
Fix commit: 9b5b8722 file: fix close_range() for unshare+cloexec
and that fix is already in the reported HEAD commit:
HEAD commit: 7fef2edf sd: don't mess with SD_MINORS for CONFIG_DEBUG_BL..
and the oops report clearly is from that:
CPU: 1 PID: 8445 Comm: syz-executor493 Not tainted 5.14.0-rc1-syzkaller #0
so the alleged fix is already there.
So clearly commit 9b5b872215fe ("file: fix close_range() for unshare+cloexec") does *NOT* fix the issue.
This was originally bisected to commit 582f1fb6b721 ("fs, close_range: add flag CLOSE_RANGE_CLOEXEC") in
https://syzkaller.appspot.com/bug?id=1bef50bdd9622a1969608d1090b2b4a588d0c6a...
which is where the "fix" is from.
It would probably be good if syzbot made this kind of "hey, it was reported fixed, but it's not" very clear.
The KASAN report looks like a use-after-free, and that "use" is actually the sanity check that the file count is non-zero, so it's really a "struct file *" that has already been free'd.
That bogus free is a regular close() system call:

    filp_close+0x22/0x170 fs/open.c:1306
    close_fd+0x5c/0x80 fs/file.c:628
    __do_sys_close fs/open.c:1331 [inline]
    __se_sys_close fs/open.c:1329 [inline]
And it was opened by a "creat()" system call:
    Allocated by task 8445:
     __alloc_file+0x21/0x280 fs/file_table.c:101
     alloc_empty_file+0x6d/0x170 fs/file_table.c:150
     path_openat+0xde/0x27f0 fs/namei.c:3493
     do_filp_open+0x1aa/0x400 fs/namei.c:3534
     do_sys_openat2+0x16d/0x420 fs/open.c:1204
     do_sys_open fs/open.c:1220 [inline]
     __do_sys_creat fs/open.c:1294 [inline]
     __se_sys_creat fs/open.c:1288 [inline]
     __x64_sys_creat+0xc9/0x120 fs/open.c:1288
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x44/0xae
But it has apparently already been closed, with the final __fput() running from task work:
    Freed by task 8445:
     __fput+0x288/0x920 fs/file_table.c:280
     task_work_run+0xdd/0x1a0 kernel/task_work.c:164
So it's some kind of confusion and re-use of a struct file pointer.
Which is certainly consistent with the "fix" in 9b5b872215fe ("file: fix close_range() for unshare+cloexec"), but it very much looks like that fix was incomplete and not the full story.
Some fdtable got re-allocated? The fix that wasn't a fix ends up re-checking the maximum file number under the file_lock, but there's clearly something else going on too.
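For reference, the flag combination named in the subject of that fix can be exercised from user space roughly as follows. This is only a minimal sketch and not the syzbot reproducer: the helper name cloexec_from() is made up for illustration, and the fallback syscall number assumes an architecture that uses the common syscall numbering (x86-64 among them).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Fallbacks for older user-space headers. The flag values match
 * include/uapi/linux/close_range.h; the syscall number is an assumption
 * that holds for architectures using the common numbering.
 */
#ifndef __NR_close_range
#define __NR_close_range 436
#endif
#ifndef CLOSE_RANGE_UNSHARE
#define CLOSE_RANGE_UNSHARE (1U << 1)
#endif
#ifndef CLOSE_RANGE_CLOEXEC
#define CLOSE_RANGE_CLOEXEC (1U << 2)
#endif

/*
 * Hypothetical helper: give the caller a private copy of its fd table
 * (CLOSE_RANGE_UNSHARE) and mark every descriptor >= first as
 * close-on-exec (CLOSE_RANGE_CLOEXEC) instead of closing it.
 * Returns 0 on success, -1 with errno set on failure, e.g. ENOSYS or
 * EINVAL on kernels that lack the syscall or the flag.
 */
static int cloexec_from(unsigned int first)
{
	return (int)syscall(__NR_close_range, first, ~0U,
			    CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC);
}
```

A caller would open a file, call cloexec_from() on the returned descriptor, and then check fcntl(fd, F_GETFD) for FD_CLOEXEC. The unshare+cloexec path is the one that the re-check under file_lock was meant to cover.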
Christian?
Linus