Hi Barnabás
Is that OK that I work on V2 ? It will be based on your V1 change and I will also add more test cases.
Thanks -Jeff
-
On Thu, May 23, 2024 at 12:45 PM Andrew Morton akpm@linux-foundation.org wrote:
On Wed, 22 May 2024 19:32:35 -0700 Jeff Xu jeffxu@google.com wrote:
It's a change to a userspace API, yes? Please let's have a detailed description of why this is OK. Why it won't affect any existing users.
Unfortunately, this is a breaking change that might break a application if they do below: memfd_create("", MFD_NOEXEC_SEAL) fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE); <-- this will fail in new semantics, due to mfd not being sealable.
However, I still think the new semantics is a better, the reason is due to the sysctl: memfd_noexec_scope Currently, when the sysctl is set to MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL kernel adds MFD_NOEXEC_SEAL to memfd_create, and the memfd becomes sealable. E.g. When the sysctl is set to MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL The app calls memfd_create("",0) application will get sealable memfd, which might be a surprise to application.
If the app doesn't want this behavior, they will need one of two below in current implementation. 1> set the sysctl: memfd_noexec_scope to 0. So the kernel doesn't overwrite the mdmfd_create
2> modify their code to get non-sealable NOEXEC memfd. memfd_create("", MEMFD_NOEXEC_SCOPE_NOEXEC) fcntl(fd, F_ADD_SEALS, F_SEAL_SEAL)
The new semantics works better with the sysctl.
Since memfd noexec is new, maybe there is no application using the MFD_NOEXEC_SEAL to create sealable memfd. They mostly likely use memfd(MFD_NOEXEC_SEAL|MFD_ALLOW_SEALING) instead. I think it might benefit in the long term with the new semantics.
Yes, it's new so I expect any damage will be small. Please prepare a v2 which fully explains/justifies the thinking for this non-backward-compatible change and which include the cc:stable.