Hi Pedro
On Thu, Oct 17, 2024 at 12:37 PM Pedro Falcato pedro.falcato@gmail.com wrote:
For PROT_NONE mappings, the previous blocking of madvise(MADV_DONTNEED) is unnecessary. As PROT_NONE already prohibits memory access, madvise(MADV_DONTNEED) should be allowed to proceed in order to free the page.
I don't get it. Is there an actual use case for this?
Sealing should not over-blocking API that it can allow to pass without security concern, this is a case in that principle.
There is a user case for this as well: to seal NX stack on android, Android uses PROT_NONE/madvise to set up a guide page to prevent stack run over boundary. So we need to let madvise to pass.
For file-backed, private, read-only memory mappings, we previously did not block the madvise(MADV_DONTNEED). This was based on the assumption that the memory's content, being file-backed, could be retrieved from the file if accessed again. However, this assumption failed to consider scenarios where a mapping is initially created as read-write, modified, and subsequently changed to read-only. The newly introduced VM_WASWRITE flag addresses this oversight.
We *do not* need this. It's sufficient to just block discard operations on read-only private mappings.
I think you meant blocking madvise(MADV_DONTNEED) on all read-only private file-backed mappings.
I considered that option, but there is a use case for madvise on those mappings that never get modified.
Apps can use that to free up RAM. e.g. Considering read-only .text section, which never gets modified, madvise( MADV_DONTNEED) can free up RAM when memory is in-stress, memory will be reclaimed from a backed-file on next read access. Therefore we can't just block all read-only private file-backed mapping, only those that really need to, such as mapping changed from rw=>r (what you described)