NACK.
On Thu, Oct 17, 2024 at 12:51:03AM +0000, jeffxu@chromium.org wrote:
From: Jeff Xu <jeffxu@google.com>
Two fixes for madvise(MADV_DONTNEED) when sealed.
For PROT_NONE mappings, the previous blocking of madvise(MADV_DONTNEED) is unnecessary. As PROT_NONE already prohibits memory access, madvise(MADV_DONTNEED) should be allowed to proceed in order to free the page.
Except if they are VM_MAYWRITE...
For file-backed, private, read-only memory mappings, we previously did not block the madvise(MADV_DONTNEED). This was based on the assumption that the memory's content, being file-backed, could be retrieved from the file if accessed again. However, this assumption failed to consider scenarios where a mapping is initially created as read-write, modified, and subsequently changed to read-only. The newly introduced VM_WASWRITE flag addresses this oversight.
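The read-write, modify, read-only sequence described above can be reproduced from userspace. The following is only a minimal sketch: it does not call mseal() (so it runs on kernels without it), the helper name dontneed_discards_private_write() and the temporary file path are made up for illustration, and it assumes Linux semantics for MADV_DONTNEED on a private file-backed mapping.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if MADV_DONTNEED threw away a private modification that was made
 * while the mapping was still writable, 0 if the modification survived, and
 * -1 if any setup step failed.
 */
int dontneed_discards_private_write(void)
{
	char path[] = "/tmp/dontneed-demo-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	int fd = mkstemp(path);
	char *p;
	int rc = -1;

	if (fd < 0)
		return -1;
	unlink(path);	/* the open fd keeps the file alive */

	/* One page of file content starting with "file". */
	if (ftruncate(fd, page) != 0 || pwrite(fd, "file", 4, 0) != 4) {
		close(fd);
		return -1;
	}

	/* Private, file-backed, initially read-write. */
	p = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping holds its own reference to the file */
	if (p == MAP_FAILED)
		return -1;

	/* Modify: this triggers copy-on-write, the file itself is untouched. */
	memcpy(p, "anon", 4);

	/* Drop write permission, then discard the now read-only range. */
	if (mprotect(p, page, PROT_READ) == 0 &&
	    madvise(p, page, MADV_DONTNEED) == 0)
		/* The next read faults the page back in from the file. */
		rc = (memcmp(p, "file", 4) == 0);

	munmap(p, page);
	return rc;
}
```

A return value of 1 means the private write was lost even though the mapping was read-only at the time of the madvise() call, which is the case a sealed mapping would need to guard against.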
There's no justification for adding a new VMA flag, especially given it will break VMA merging for everyone.
This whole approach seems broken. What you seem to need is to check whether a mapping _could_ be mapped writably at some stage.
The kernel doesn't need to track whether the mapping was writable at some point in the past, only whether it could be made writable.
Please look at VM_MAYWRITE and mapping_writably_mapped() (to account for memfd seal behaviour).
Also you need to rewrite your tests to be readable.
Jeff Xu (2):
  mseal: Two fixes for madvise(MADV_DONTNEED) when sealed
  selftest/mseal: Add tests for madvise

 include/linux/mm.h                      |   2 +
 mm/mprotect.c                           |   3 +
 mm/mseal.c                              |  42 +++++++--
 tools/testing/selftests/mm/mseal_test.c | 118 +++++++++++++++++++++++-
 4 files changed, 157 insertions(+), 8 deletions(-)

--
2.47.0.rc1.288.g06298d1525-goog