Re: [PATCH v3 2/6] userfaultfd: add /dev/userfaultfd for fine grained access control

14 Jun 2022


On Jun 13, 2022, at 3:38 PM, Axel Rasmussen axelrasmussen@google.com wrote:
...
On Mon, Jun 13, 2022 at 3:29 PM Peter Xu peterx@redhat.com wrote:
...
On Mon, Jun 13, 2022 at 02:55:40PM -0700, Andrew Morton wrote:
...
On Wed,  1 Jun 2022 14:09:47 -0700 Axel Rasmussen axelrasmussen@google.com wrote:
...
To achieve this, add a /dev/userfaultfd misc device. This device
provides an alternative to the userfaultfd(2) syscall for the creation
of new userfaultfds. The idea is, any userfaultfds created this way will
be able to handle kernel faults, without the caller having any special
capabilities. Access to this mechanism is instead restricted using e.g.
standard filesystem permissions.
The use of a /dev node isn't pretty.  Why can't this be done by
tweaking sys_userfaultfd() or by adding a sys_userfaultfd2()?
I think for any approach involving syscalls, we need to be able to
control who can call the syscall. Maybe there's another way
I'm not aware of, but I think today the only mechanism to do this is
capabilities. I proposed adding a CAP_USERFAULTFD for this purpose,
but that approach was rejected [1]. So, I'm not sure of another way
besides using a device node.
One thing that could potentially make this cleaner is, as one LWN
commenter pointed out, we could have open() on /dev/userfaultfd just
return a new userfaultfd directly, instead of this multi-step process
of open /dev/userfaultfd, NEW ioctl, then you get a userfaultfd. When
I wrote this originally it wasn't clear to me how to get that to
happen - open() doesn't directly return the result of our custom open
function pointer, as far as I can tell - but it could be investigated.
If this direction is pursued, I think that it would be better to expose it as
/proc/[pid]/userfaultfd, which would allow remote monitors (processes) to
hook into the userfaultfd of remote processes. I have a patch for that which
extends the userfaultfd syscall, but /proc/[pid]/userfaultfd may be cleaner.
