On Thu, Oct 17, 2024 at 12:51:04AM +0000, jeffxu@chromium.org wrote:
From: Jeff Xu jeffxu@chromium.org
Two fixes for madvise(MADV_DONTNEED) when sealed.
Please separate these fixes into two separate patches.
For PROT_NONE mappings, the previous blocking of madvise(MADV_DONTNEED) is unnecessary. As PROT_NONE already prohibits memory access, madvise(MADV_DONTNEED) should be allowed to proceed in order to free the page.
I don't get it. Is there an actual use case for this?
For file-backed, private, read-only memory mappings, we previously did not block the madvise(MADV_DONTNEED). This was based on the assumption that the memory's content, being file-backed, could be retrieved from the file if accessed again. However, this assumption failed to consider scenarios where a mapping is initially created as read-write, modified, and subsequently changed to read-only. The newly introduced VM_WASWRITE flag addresses this oversight.
We *do not* need this. It's sufficient to just block discard operations on read-only private mappings. Sending a possible (fully untested) fix. If you like this approach I can resend properly, or Andrew can pick it up, whatever floats people's boats.
----8<----
From dc5ec662dcb79156f4bdc1cba2a2575dce905ffa Mon Sep 17 00:00:00 2001 From: Pedro Falcato pedro.falcato@gmail.com Date: Thu, 17 Oct 2024 20:21:10 +0100 Subject: [PATCH] mm/mseal: Disallow madvise discard on file-private sealed mappings
Doing an operation such as MADV_DONTNEED on a file-private mapping may forcibly alter data by discarding CoW'd, anon pages and replacing them with page cache pages fresh from the filesystem.
As such, this somewhat bypasses the mseal of a read-only mapping, and should be disallowed.
Signed-off-by: Pedro Falcato pedro.falcato@gmail.com Fixes: 8be7258aad44 ("mseal: add mseal syscall") Cc: stable@vger.kernel.org # 6.11.y --- mm/mseal.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/mm/mseal.c b/mm/mseal.c index 28cd17d7aaf2..d053303c5542 100644 --- a/mm/mseal.c +++ b/mm/mseal.c @@ -36,10 +36,15 @@ static bool is_madv_discard(int behavior) return false; }
-static bool is_ro_anon(struct vm_area_struct *vma) +static bool is_ro_private(struct vm_area_struct *vma) { - /* check anonymous mapping. */ - if (vma->vm_file || vma->vm_flags & VM_SHARED) + /* + * If shared, allow discard operations - it shouldn't + * affect the underlying data. Discard on private VMAs may + * forcibly alter data by replacing CoW'd anonymous pages + * with ones fresh from the page cache. + */ + if (vma->vm_flags & VM_SHARED) return false;
/* @@ -61,7 +66,7 @@ bool can_modify_vma_madv(struct vm_area_struct *vma, int behavior) if (!is_madv_discard(behavior)) return true;
- if (unlikely(!can_modify_vma(vma) && is_ro_anon(vma))) + if (unlikely(!can_modify_vma(vma) && is_ro_private(vma))) return false;
/* Allow by default. */