On Mon, Aug 14, 2023 at 10:35:57AM +0500, Muhammad Usama Anjum wrote:
> The last refactoring was done by 4e7ea81db53465 on this code in 2013. The code segment in question is present from even before that. It means that this bug has been present for several years. 4.14 is the oldest kernel being maintained today. So it affects all current LTS and mainline kernels. I'll report 4e7ea81db53465 with regzbot for proper tracking. Thus probably the bug report will get associated with all LTS kernels as well.
> #regzbot title: Race condition between buffer write and page_mkwrite
> #regzbot title: ext4: Race condition between buffer write and page_mkwrite
If it's a long-standing bug, then it's really not something I consider a regression. That being said, you're assuming that the refactoring is what introduced the bug; that's not necessarily the case.
*Especially* if it requires a maliciously fuzzed file system, since you have to be root to mount a file system. That's the other thing; the different reports at the console have different reproducers, and at least one of them has a very badly corrupted file system --- and since you need to have root to mount a maliciously fuzzed file system, these are treated with a much lower priority as far as I'm concerned.
(If you think it should be higher priority, and your company is willing to fund such work, patches are gratefully appreciated. :-)
I tried to reproduce this using one of the reproducers on a modern kernel, and it doesn't reproduce there. That being said, it's not entirely clear what the reproducer is doing, since (a) passing -1 as the in_fd and out_fd to sendfile *should* just cause sendfile to return an EBADF error, and (b) when I ran it, it just segfaulted on an mmap() before it executed anything interesting.
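For what it's worth, point (a) is easy to check in isolation. Here is a minimal sketch, assuming Linux with glibc; the helper name sendfile_bad_fds_errno() is made up for illustration and is not taken from any of the reproducers:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/sendfile.h>
#include <sys/types.h>

/*
 * Hypothetical helper: call sendfile() with -1 for both descriptors and
 * report the errno that comes back, or 0 if the call unexpectedly succeeds.
 */
int sendfile_bad_fds_errno(void)
{
	ssize_t ret;

	errno = 0;
	/* out_fd = -1, in_fd = -1, no explicit offset, arbitrary count */
	ret = sendfile(-1, -1, NULL, 4096);

	return ret == -1 ? errno : 0;
}
```

Calling this from a small main() and comparing the result against EBADF should be enough to confirm that the call fails up front, which is why that part of the reproducer does not look like it can be doing anything interesting by itself.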
Please let me know (a) if you can replicate this on the latest upstream kernel, and (b) if there is a reproducer that requires neither a maliciously fuzzed file system nor scribbling on the file system image while it is mounted.
Cheers,
- Ted