From: Jeff Xu jeffxu@chromium.org
When MFD_NOEXEC_SEAL was introduced, there was one big mistake: it didn't have proper documentation. This led to a lot of confusion, especially about whether or not memfd created with the MFD_NOEXEC_SEAL flag is sealable. Before MFD_NOEXEC_SEAL, memfd had to explicitly set MFD_ALLOW_SEALING to be sealable, so it's a fair question.
As one might have noticed, unlike other flags in memfd_create, MFD_NOEXEC_SEAL is actually a combination of multiple flags. The idea is to make it easier to use memfd in the most common way, which is NOEXEC + F_SEAL_EXEC + MFD_ALLOW_SEALING. This works with sysctl vm.noexec to help existing applications move to a more secure way of using memfd.
Proposals have been made to put MFD_NOEXEC_SEAL non-sealable, unless MFD_ALLOW_SEALING is set, to be consistent with other flags [1] [2], Those are based on the viewpoint that each flag is an atomic unit, which is a reasonable assumption. However, MFD_NOEXEC_SEAL was designed with the intent of promoting the most secure method of using memfd, therefore a combination of multiple functionalities into one bit.
Furthermore, the MFD_NOEXEC_SEAL has been added for more than one year, and multiple applications and distributions have backported and utilized it. Altering ABI now presents a degree of risk and may lead to disruption.
MFD_NOEXEC_SEAL is a new flag, and applications must change their code to use it. There is no backward compatibility problem.
When sysctl vm.noexec == 1 or 2, applications that don't set MFD_NOEXEC_SEAL or MFD_EXEC will get MFD_NOEXEC_SEAL memfd. And old-application might break, that is by-design, in such a system vm.noexec = 0 shall be used. Also no backward compatibility problem.
I propose to include this documentation patch to assist in clarifying the semantics of MFD_NOEXEC_SEAL, thereby preventing any potential future confusion.
This patch supersede previous patch which is trying different direction [3], and please remove [2] from mm-unstable branch when applying this patch.
Finally, I would like to express my gratitude to David Rheinsberg and Barnabás Pőcze for initiating the discussion on the topic of sealability.
[1] https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
[2] https://lore.kernel.org/lkml/20240513191544.94754-1-pobrn@protonmail.com/
[3] https://lore.kernel.org/lkml/20240524033933.135049-1-jeffxu@google.com/
v2: Update according to Randy Dunlap' comments.
v1: https://lore.kernel.org/linux-mm/20240607203543.2151433-1-jeffxu@google.com/
Jeff Xu (1): mm/memfd: add documentation for MFD_NOEXEC_SEAL MFD_EXEC
Documentation/userspace-api/index.rst | 1 + Documentation/userspace-api/mfd_noexec.rst | 86 ++++++++++++++++++++++ 2 files changed, 87 insertions(+) create mode 100644 Documentation/userspace-api/mfd_noexec.rst