On Mon, Jun 13, 2022 at 4:23 PM Jonathan Corbet <corbet@lwn.net> wrote:
Axel Rasmussen <axelrasmussen@google.com> writes:
I think for any approach involving syscalls, we need to be able to control who can call the syscall. Maybe there's another way I'm not aware of, but I think today the only mechanism to do this is capabilities. I proposed adding a CAP_USERFAULTFD for this purpose, but that approach was rejected [1]. So, I'm not sure of another way besides using a device node.
I take it there's a reason why this can't be done with a security module, either a custom module or a policy in one of the existing modules?
That sort of access control is just what security modules are supposed to be for, after all.
Thanks,
jon
Admittedly I haven't tried proposing a patch, but I suspect there would be pushback against adding an entirely new LSM just for this case, for reasons similar to why the CAP_USERFAULTFD approach was rejected.
For existing LSMs, I think SELinux can be used to restrict access to syscalls. But then again, it's fairly heavyweight and difficult to configure, and I suspect migrating production servers which don't use it today would be a nontrivial undertaking. At least to me, it seems unfortunate to say that there isn't an obvious "safe" way to use userfaultfd without enabling and configuring SELinux. (That assumes that by "safe" we mean without granting wider-than-necessary access to userfaultfd, and without granting uffd-using processes extra permissions [root or CAP_SYS_PTRACE] just so they can do their job.) I suspect that if we do that, then in practice many, perhaps most, users will just either run UFFD programs as root or toggle the sysctl to allow unprivileged UFFD usage.
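
To make the sysctl option concrete, here is a minimal Python sketch of how one might check which mode a given host is in. It assumes the knob is exposed at /proc/sys/vm/unprivileged_userfaultfd, as described in userfaultfd(2). The helper names are hypothetical, and the exact meaning of 0 depends on the kernel version, so treat the descriptions as approximate.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the sysctl (per userfaultfd(2)); absent on kernels
# that predate the knob.
SYSCTL_PATH = Path("/proc/sys/vm/unprivileged_userfaultfd")


def read_unprivileged_uffd(path: Path = SYSCTL_PATH) -> Optional[int]:
    """Return the sysctl value as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe_policy(value: Optional[int]) -> str:
    """Map the sysctl value to a rough, human-readable policy description."""
    if value is None:
        return "unknown: sysctl not readable on this system"
    if value == 1:
        return "open: any user may create a userfaultfd"
    if value == 0:
        # Details vary by kernel version; this is only a summary.
        return "restricted: full access needs CAP_SYS_PTRACE"
    return f"unrecognized value {value}"


if __name__ == "__main__":
    print(describe_policy(read_unprivileged_uffd()))
```

A host reporting "open" is exactly the case I'd expect to become common if the only alternatives are root or SELinux.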