While discussing some userfaultfd relevant issues recently, Andrea noticed a potential ABI breakage with -EAGAIN on almost all userfaultfd ioctl()s.
Quote from Andrea, explaining how -EAGAIN was processed, and how this should fix it (taking example of UFFDIO_COPY ioctl):
The "mmap_changing" and "stale pmd" conditions are already reported as -EAGAIN written in the copy field, this does not change it. This change removes the subnormal case that left copy.copy uninitialized and required apps to explicitly set the copy field to get deterministic behavior (which is a requirement contrary to the documentation in both the manpage and source code). In turn there's no alteration to backwards compatibility as result of this change because userland will find the copy field consistently set to -EAGAIN, and not anymore sometime -EAGAIN and sometime uninitialized.
Even then the change only can make a difference to non cooperative users of userfaultfd, so when UFFD_FEATURE_EVENT_* is enabled, which is not true for the vast majority of apps using userfaultfd or this unintended uninitialized field may have been noticed sooner.
Meanwhile, since this bug existed for years, it also almost affects all ioctl()s that was introduced later. Besides UFFDIO_ZEROPAGE, these also get affected in the same way:
- UFFDIO_CONTINUE - UFFDIO_POISON - UFFDIO_MOVE
This patch should have fixed all of them.
Fixes: df2cc96e7701 ("userfaultfd: prevent non-cooperative events vs mcopy_atomic races") Fixes: f619147104c8 ("userfaultfd: add UFFDIO_CONTINUE ioctl") Fixes: fc71884a5f59 ("mm: userfaultfd: add new UFFDIO_POISON ioctl") Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Cc: linux-stable stable@vger.kernel.org Cc: Mike Rapoport rppt@kernel.org Cc: Axel Rasmussen axelrasmussen@google.com Cc: Suren Baghdasaryan surenb@google.com Reported-by: Andrea Arcangeli aarcange@redhat.com Suggested-by: Andrea Arcangeli aarcange@redhat.com Signed-off-by: Peter Xu peterx@redhat.com --- fs/userfaultfd.c | 28 ++++++++++++++++++++++------ 1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index d80f94346199..22f4bf956ba1 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1585,8 +1585,11 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, user_uffdio_copy = (struct uffdio_copy __user *) arg;
ret = -EAGAIN; - if (atomic_read(&ctx->mmap_changing)) + if (unlikely(atomic_read(&ctx->mmap_changing))) { + if (unlikely(put_user(ret, &user_uffdio_copy->copy))) + return -EFAULT; goto out; + }
ret = -EFAULT; if (copy_from_user(&uffdio_copy, user_uffdio_copy, @@ -1641,8 +1644,11 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx, user_uffdio_zeropage = (struct uffdio_zeropage __user *) arg;
ret = -EAGAIN; - if (atomic_read(&ctx->mmap_changing)) + if (unlikely(atomic_read(&ctx->mmap_changing))) { + if (unlikely(put_user(ret, &user_uffdio_zeropage->zeropage))) + return -EFAULT; goto out; + }
ret = -EFAULT; if (copy_from_user(&uffdio_zeropage, user_uffdio_zeropage, @@ -1744,8 +1750,11 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg) user_uffdio_continue = (struct uffdio_continue __user *)arg;
ret = -EAGAIN; - if (atomic_read(&ctx->mmap_changing)) + if (unlikely(atomic_read(&ctx->mmap_changing))) { + if (unlikely(put_user(ret, &user_uffdio_continue->mapped))) + return -EFAULT; goto out; + }
ret = -EFAULT; if (copy_from_user(&uffdio_continue, user_uffdio_continue, @@ -1801,8 +1810,11 @@ static inline int userfaultfd_poison(struct userfaultfd_ctx *ctx, unsigned long user_uffdio_poison = (struct uffdio_poison __user *)arg;
ret = -EAGAIN; - if (atomic_read(&ctx->mmap_changing)) + if (unlikely(atomic_read(&ctx->mmap_changing))) { + if (unlikely(put_user(ret, &user_uffdio_poison->updated))) + return -EFAULT; goto out; + }
ret = -EFAULT; if (copy_from_user(&uffdio_poison, user_uffdio_poison, @@ -1870,8 +1882,12 @@ static int userfaultfd_move(struct userfaultfd_ctx *ctx,
user_uffdio_move = (struct uffdio_move __user *) arg;
- if (atomic_read(&ctx->mmap_changing)) - return -EAGAIN; + ret = -EAGAIN; + if (unlikely(atomic_read(&ctx->mmap_changing))) { + if (unlikely(put_user(ret, &user_uffdio_move->move))) + return -EFAULT; + goto out; + }
if (copy_from_user(&uffdio_move, user_uffdio_move, /* don't copy "move" last field */
On 24.04.25 23:57, Peter Xu wrote:
While discussing some userfaultfd relevant issues recently, Andrea noticed a potential ABI breakage with -EAGAIN on almost all userfaultfd ioctl()s.
I guess we talk about e.g., "man UFFDIO_COPY" documentation:
"The copy field is used by the kernel to return the number of bytes that was actually copied, or an error (a negated errno-style value). The copy field is output-only; it is not read by the UFFDIO_COPY operation."
I assume -EINVAL/-ESRCH/-EFAULT are excluded from that rule, because there is no sense in user-space trying again on these errors either way. Well, there are cases where we would store -EFAULT, when we receive it from mfill_atomic_copy().
So if we store -EAGAIN to copy.copy it says "we didn't copy anything". (probably just storing 0 would have been better, but I am sure there was a reason to indicate negative errors in addition to returning an error)
Quote from Andrea, explaining how -EAGAIN was processed, and how this should fix it (taking example of UFFDIO_COPY ioctl):
The "mmap_changing" and "stale pmd" conditions are already reported as -EAGAIN written in the copy field, this does not change it. This change removes the subnormal case that left copy.copy uninitialized and required apps to explicitly set the copy field to get deterministic behavior (which is a requirement contrary to the documentation in both the manpage and source code). In turn there's no alteration to backwards compatibility as result of this change because userland will find the copy field consistently set to -EAGAIN, and not anymore sometime -EAGAIN and sometime uninitialized.
Even then the change only can make a difference to non cooperative users of userfaultfd, so when UFFD_FEATURE_EVENT_* is enabled, which is not true for the vast majority of apps using userfaultfd or this unintended uninitialized field may have been noticed sooner.
Meanwhile, since this bug existed for years, it also almost affects all ioctl()s that was introduced later. Besides UFFDIO_ZEROPAGE, these also get affected in the same way:
- UFFDIO_CONTINUE
- UFFDIO_POISON
- UFFDIO_MOVE
This patch should have fixed all of them.
Fixes: df2cc96e7701 ("userfaultfd: prevent non-cooperative events vs mcopy_atomic races") Fixes: f619147104c8 ("userfaultfd: add UFFDIO_CONTINUE ioctl") Fixes: fc71884a5f59 ("mm: userfaultfd: add new UFFDIO_POISON ioctl") Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI") Cc: linux-stable stable@vger.kernel.org Cc: Mike Rapoport rppt@kernel.org Cc: Axel Rasmussen axelrasmussen@google.com Cc: Suren Baghdasaryan surenb@google.com Reported-by: Andrea Arcangeli aarcange@redhat.com Suggested-by: Andrea Arcangeli aarcange@redhat.com Signed-off-by: Peter Xu peterx@redhat.com
fs/userfaultfd.c | 28 ++++++++++++++++++++++------ 1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index d80f94346199..22f4bf956ba1 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1585,8 +1585,11 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, user_uffdio_copy = (struct uffdio_copy __user *) arg; ret = -EAGAIN;
- if (atomic_read(&ctx->mmap_changing))
- if (unlikely(atomic_read(&ctx->mmap_changing))) {
if (unlikely(put_user(ret, &user_uffdio_copy->copy)))
goto out;return -EFAULT;
- }
Nit: It's weird that we do "return -EFAULT" in one case, in the other we do "goto out;" which ends up doing a "return ret" ...
Maybe to keep it consistent:
ret = -EAGAIN; if (unlikely(atomic_read(&ctx->mmap_changing))) { if (unlikely(put_user(ret, &user_uffdio_copy->copy))) ret = -EFAULT; goto out; }
In all of these functions, we should probably just get rid of the "goto out" and just return directly. We have a weird mixture of "goto out;" and return; ... a different cleanup.
Reviewed-by: David Hildenbrand david@redhat.com
On Fri, Apr 25, 2025 at 11:12 AM David Hildenbrand david@redhat.com wrote:
On 24.04.25 23:57, Peter Xu wrote:
While discussing some userfaultfd relevant issues recently, Andrea noticed a potential ABI breakage with -EAGAIN on almost all userfaultfd ioctl()s.
I guess we talk about e.g., "man UFFDIO_COPY" documentation:
"The copy field is used by the kernel to return the number of bytes that was actually copied, or an error (a negated errno-style value). The copy field is output-only; it is not read by the UFFDIO_COPY operation."
I assume -EINVAL/-ESRCH/-EFAULT are excluded from that rule, because there is no sense in user-space trying again on these errors either way. Well, there are cases where we would store -EFAULT, when we receive it from mfill_atomic_copy().
So if we store -EAGAIN to copy.copy it says "we didn't copy anything". (probably just storing 0 would have been better, but I am sure there was a reason to indicate negative errors in addition to returning an error)
IMHO, it makes more sense to store 0 than -EAGAIN (at least it will mean that my userspace[1] won't break).
Userspace will need to know from where to restart the ioctl, and if we put -EAGAIN in `mapped`/`copy`/etc., userspace will need to know that -EAGAIN actually means 0 anyway.
[1]: https://lore.kernel.org/linux-mm/CADrL8HXuZkX0CP6apHLw0A0Ax4b4a+-=XEt0dH5mAK...
On Fri, Apr 25, 2025 at 11:54:47AM -0400, James Houghton wrote:
On Fri, Apr 25, 2025 at 11:12 AM David Hildenbrand david@redhat.com wrote:
On 24.04.25 23:57, Peter Xu wrote:
While discussing some userfaultfd relevant issues recently, Andrea noticed a potential ABI breakage with -EAGAIN on almost all userfaultfd ioctl()s.
I guess we talk about e.g., "man UFFDIO_COPY" documentation:
"The copy field is used by the kernel to return the number of bytes that was actually copied, or an error (a negated errno-style value). The copy field is output-only; it is not read by the UFFDIO_COPY operation."
I assume -EINVAL/-ESRCH/-EFAULT are excluded from that rule, because there is no sense in user-space trying again on these errors either way. Well, there are cases where we would store -EFAULT, when we receive it from mfill_atomic_copy().
So if we store -EAGAIN to copy.copy it says "we didn't copy anything". (probably just storing 0 would have been better, but I am sure there was a reason to indicate negative errors in addition to returning an error)
IMHO, it makes more sense to store 0 than -EAGAIN (at least it will mean that my userspace[1] won't break).
Userspace will need to know from where to restart the ioctl, and if we put -EAGAIN in `mapped`/`copy`/etc., userspace will need to know that -EAGAIN actually means 0 anyway.
Yes agreed, the API might be easier to follow if the kernel will only update >=0 values to copy.copy and only if -EAGAIN is returned, so that errno will be the only source of truth on the type of error that userapp must check first. For now, we may need to stick with the current API.
linux-stable-mirror@lists.linaro.org