On Tue, May 20, 2025 at 10:57:08AM -0400, Theodore Ts'o wrote:
On Mon, May 19, 2025 at 07:42:46AM -0300, Thadeu Lima de Souza Cascardo wrote:
inline data handling has a race between writing and writing to a memory map.
When ext4_page_mkwrite is called, it calls ext4_convert_inline_data, which destroys the inline data, but if block allocation fails, restores the inline data. In that process, we could have:
CPU1 CPU2 destroy_inline_data write_begin (does not see inline data) restory_inline_data write_end (sees inline data)
The conversion inside ext4_page_mkwrite was introduced at commit 7b4cc9787fe3 ("ext4: evict inline data when writing to memory map"). This fixes a documented bug in the commit message, which suggests some alternatives fixes.
Your fix just reverts commit 7b4cc9787fe3, and removes the BUG_ON. While this is great for shutting up the syzbot report, but it causes file writes to an inline data file via a mmap to never get written back to the storage device. So you are replacing BUG_ON that can get triggered on a race condition in case of a failed block allocation, with silent data corruption. This is not an improvement.
Thanks for trying to address this, but I'm not going to accept your proposed fix.
- Ted
Hi, Ted.
I am trying to understand better the circumstances where the data loss might occur with the fix, but might not occur without the fix. Or, even if they occur either way, such that I can work on a better/proper fix.
Right now, if ext4_convert_inline_data (called from ext4_page_mkwrite) fails with ENOSPC, the memory access will lead to a SIGBUS. The same will happen without the fix, if there are no blocks available.
Now, without ext4_convert_inline_data, blocks will be allocated by ext4_page_mkwrite and written by ext4_do_writepages. Are you concerned about a failure between the clearing of the inode data and the writing of the block in ext4_do_writepages?
Or are you concerned about a potential race condition when allocating blocks?
Which of these cannot happen today with the code as is? If I understand correctly, the inline conversion code also calls ext4_destroy_inline_data before allocating and writing to blocks.
Thanks a lot for the review and guidance. Cascardo.