From: Jeff Xu jeffxu@google.com
By default, memfd_create() creates a non-sealable MFD, unless the MFD_ALLOW_SEALING flag is set.
When the MFD_NOEXEC_SEAL flag is initially introduced, the MFD created with that flag is sealable, even though MFD_ALLOW_SEALING is not set. This patch changes MFD_NOEXEC_SEAL to be non-sealable by default, unless MFD_ALLOW_SEALING is explicitly set.
This is a non-backward compatible change. However, as MFD_NOEXEC_SEAL is new, we expect not many applications will rely on the nature of MFD_NOEXEC_SEAL being sealable. In most cases, the application already sets MFD_ALLOW_SEALING if they need a sealable MFD.
Additionally, this enhances the useability of pid namespace sysctl vm.memfd_noexec. When vm.memfd_noexec equals 1 or 2, the kernel will add MFD_NOEXEC_SEAL if mfd_create does not specify MFD_EXEC or MFD_NOEXEC_SEAL, and the addition of MFD_NOEXEC_SEAL enables the MFD to be sealable. This means, any application that does not desire this behavior will be unable to utilize vm.memfd_noexec = 1 or 2 to migrate/enforce non-executable MFD. This adjustment ensures that applications can anticipate that the sealable characteristic will remain unmodified by vm.memfd_noexec.
This patch was initially developed by Barnabás Pőcze, and Barnabás used Debian Code Search and GitHub to try to find potential breakages and could only find a single one. Dbus-broker's memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING` behavior, and tries to work around it [1]. This workaround will break. Luckily, this only affects the test suite, it does not affect the normal operations of dbus-broker. There is a PR with a fix[2]. In addition, David Rheinsberg also raised similar fix in [3]
[1]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8... [2]: https://github.com/bus1/dbus-broker/pull/366 [3]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
History ====== V2: update commit message. add testcase for vm.memfd_noexec add documentation.
V1: https://lore.kernel.org/lkml/20240513191544.94754-1-pobrn@protonmail.com/
Jeff Xu (2): memfd: fix MFD_NOEXEC_SEAL to be non-sealable by default memfd:add MEMFD_NOEXEC_SEAL documentation
Documentation/userspace-api/index.rst | 1 + Documentation/userspace-api/mfd_noexec.rst | 90 ++++++++++++++++++++++ mm/memfd.c | 9 +-- tools/testing/selftests/memfd/memfd_test.c | 26 ++++++- 4 files changed, 120 insertions(+), 6 deletions(-) create mode 100644 Documentation/userspace-api/mfd_noexec.rst