On Tue, Nov 01, 2022 at 04:14:39PM -0700, Jeff Xu wrote:
Sorry for the long overdue reply.
No worries! I am a fan of thread necromancy. :)
[...]
1> memfd_create:
Add two flags:
#define MFD_EXEC 0x0008
#define MFD_NOEXEC_SEAL 0x0010
This lets an application set the executable bit explicitly.
(If an application sets both, the call will be rejected.)
So no MFD_NOEXEC without seal? (I'm fine with that.)
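For illustration, the "both flags set is rejected" rule could be modeled in userspace roughly as below. This is only a sketch: the flag values are the ones proposed in this thread, the SKETCH_ prefix and the function name are made up here, and -EINVAL is an assumption, since the proposal only says the call is rejected.

```c
#include <errno.h>

/* Flag values as proposed in this thread; the SKETCH_ names are hypothetical. */
#define SKETCH_MFD_EXEC        0x0008U
#define SKETCH_MFD_NOEXEC_SEAL 0x0010U

/*
 * Return 0 if the exec-related bits in flags are acceptable, or -EINVAL
 * (an assumed errno) if both bits are set at the same time.
 */
static int sketch_check_exec_flags(unsigned int flags)
{
	const unsigned int both = SKETCH_MFD_EXEC | SKETCH_MFD_NOEXEC_SEAL;

	if ((flags & both) == both)
		return -EINVAL;
	return 0;
}
```

Setting neither bit passes this check; what happens in that case is decided by the sysctl discussed next.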
2> For old applications that don't set the executable bit:
Add a pid-namespaced sysctl kernel.pid_mfd_noexec, with:
bikeshed: vm.memfd_noexec (doesn't belong in "kernel", and seems better suited to "vm" than "fs")
value = 0: Default_EXEC
Honor MFD_EXEC and MFD_NOEXEC_SEAL.
When neither is set, fall back to the original behavior (EXEC).
Yeah. Rephrasing for myself to understand more clearly:
"memfd_create() without either MFD_EXEC or MFD_NOEXEC_SEAL acts as if MFD_EXEC was set."
value = 1: Default_NOEXEC_SEAL
Honor MFD_EXEC and MFD_NOEXEC_SEAL.
When neither is set, default to MFD_NOEXEC_SEAL.
"memfd_create() without either MFD_EXEC or MFD_NOEXEC_SEAL acts as if MFD_NOEXEC_SEAL was set."
Also, I think there should be a pr_warn_ratelimited() when memfd_create() is used without either bit, so that there is some pressure on callers to adjust their API calls to explicitly set a bit.
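To make the two rephrasings concrete, here is a rough userspace model of the defaulting for sysctl values 0 and 1. It is a sketch under assumptions, not kernel code: the SKETCH_ names and the function are hypothetical, the flag values are the ones proposed in this thread, and the warn output only marks where a kernel implementation might call pr_warn_ratelimited().

```c
#include <stdbool.h>

/* Flag values as proposed in this thread; the SKETCH_ names are hypothetical. */
#define SKETCH_MFD_EXEC        0x0008U
#define SKETCH_MFD_NOEXEC_SEAL 0x0010U

/*
 * Fill in an exec bit when the caller set neither flag.
 * sysctl_val: 0 = default to EXEC, 1 = default to NOEXEC_SEAL.
 * *warn is set to true when a default had to be applied, which is the
 * point where the rate-limited warning would be emitted.
 */
static unsigned int sketch_apply_default(unsigned int flags, int sysctl_val,
					 bool *warn)
{
	const unsigned int mask = SKETCH_MFD_EXEC | SKETCH_MFD_NOEXEC_SEAL;

	*warn = false;
	if (flags & mask)
		return flags;	/* caller was explicit: honor the request */

	*warn = true;
	if (sysctl_val == 1)
		return flags | SKETCH_MFD_NOEXEC_SEAL;
	return flags | SKETCH_MFD_EXEC;
}
```

Callers that already pass one of the two bits get their flags back unchanged and trigger no warning, which is the incentive to update.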
3> Add a pid-namespaced sysctl kernel.pid_mfd_noexec_enforced, with:
value = 0: default, not enforced.
value = 1: enforce NOEXEC_SEAL (overrides everything).
How about making this just mode "value 2" for the first sysctl? "memfd_create() without MFD_NOEXEC_SEAL will be rejected."
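Read literally, the "value 2" mode could be modeled like this. Again only a sketch: the SKETCH_ name and the function are hypothetical, the flag value is the one proposed in this thread, and -EACCES is an assumed errno, since the suggestion only says the call is rejected.

```c
#include <errno.h>

/* Flag value as proposed in this thread; the SKETCH_ name is hypothetical. */
#define SKETCH_MFD_NOEXEC_SEAL 0x0010U

/*
 * With sysctl value 2, reject any request that lacks MFD_NOEXEC_SEAL.
 * Values 0 and 1 never reject here. -EACCES is an assumption.
 */
static int sketch_enforce_noexec(unsigned int flags, int sysctl_val)
{
	if (sysctl_val == 2 && !(flags & SKETCH_MFD_NOEXEC_SEAL))
		return -EACCES;
	return 0;
}
```

Under this reading, a call that sets neither bit is also rejected in mode 2 rather than silently defaulted.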
-Kees