The patch titled
Subject: mm/userfaultfd: Fix release hang over concurrent GUP
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-userfaultfd-fix-release-hang-over-concurrent-gup.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/userfaultfd: Fix release hang over concurrent GUP
Date: Wed, 12 Mar 2025 10:51:31 -0400
This patch should fix a possible userfaultfd release() hang during
concurrent GUP.
This problem was initially reported by Dimitris Siakavaras in July 2023
[1] in a firecracker use case. Firecracker has a separate process
handling page faults remotely, and when the process releases the
userfaultfd it can race with a concurrent GUP from KVM trying to fault in
a guest page during the secondary MMU page fault process.
A similar problem was reported recently again by Jinjiang Tu in March 2025
[2], even though the race happened this time with a mlockall() operation,
which does GUP in a similar fashion.
In 2017, commit 656710a60e36 ("userfaultfd: non-cooperative: closing the
uffd without triggering SIGBUS") was trying to fix this issue. AFAIU,
that fixes well the fault paths but may not work yet for GUP. In GUP, the
issue is NOPAGE will be almost treated the same as "page fault resolved"
in faultin_page(), then the GUP will follow page again, seeing page
missing, and it'll keep going into a live lock situation as reported.
This change makes core mm return RETRY instead of NOPAGE for both the GUP
and fault paths, proactively releasing the mmap read lock. This should
guarantee the other release thread make progress on taking the write lock
and avoid the live lock even for GUP.
When at it, rearrange the comments to make sure it's uptodate.
[1] https://lore.kernel.org/r/79375b71-db2e-3e66-346b-254c90d915e2@cslab.ece.nt…
[2] https://lore.kernel.org/r/20250307072133.3522652-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20250312145131.1143062-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Jinjiang Tu <tujinjiang(a)huawei.com>
Cc: Dimitris Siakavaras <jimsiak(a)cslab.ece.ntua.gr>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/userfaultfd.c | 51 ++++++++++++++++++++++-----------------------
1 file changed, 25 insertions(+), 26 deletions(-)
--- a/fs/userfaultfd.c~mm-userfaultfd-fix-release-hang-over-concurrent-gup
+++ a/fs/userfaultfd.c
@@ -396,32 +396,6 @@ vm_fault_t handle_userfault(struct vm_fa
goto out;
/*
- * If it's already released don't get it. This avoids to loop
- * in __get_user_pages if userfaultfd_release waits on the
- * caller of handle_userfault to release the mmap_lock.
- */
- if (unlikely(READ_ONCE(ctx->released))) {
- /*
- * Don't return VM_FAULT_SIGBUS in this case, so a non
- * cooperative manager can close the uffd after the
- * last UFFDIO_COPY, without risking to trigger an
- * involuntary SIGBUS if the process was starting the
- * userfaultfd while the userfaultfd was still armed
- * (but after the last UFFDIO_COPY). If the uffd
- * wasn't already closed when the userfault reached
- * this point, that would normally be solved by
- * userfaultfd_must_wait returning 'false'.
- *
- * If we were to return VM_FAULT_SIGBUS here, the non
- * cooperative manager would be instead forced to
- * always call UFFDIO_UNREGISTER before it can safely
- * close the uffd.
- */
- ret = VM_FAULT_NOPAGE;
- goto out;
- }
-
- /*
* Check that we can return VM_FAULT_RETRY.
*
* NOTE: it should become possible to return VM_FAULT_RETRY
@@ -457,6 +431,31 @@ vm_fault_t handle_userfault(struct vm_fa
if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
goto out;
+ if (unlikely(READ_ONCE(ctx->released))) {
+ /*
+ * If a concurrent release is detected, do not return
+ * VM_FAULT_SIGBUS or VM_FAULT_NOPAGE, but instead always
+ * return VM_FAULT_RETRY with lock released proactively.
+ *
+ * If we were to return VM_FAULT_SIGBUS here, the non
+ * cooperative manager would be instead forced to
+ * always call UFFDIO_UNREGISTER before it can safely
+ * close the uffd, to avoid involuntary SIGBUS triggered.
+ *
+ * If we were to return VM_FAULT_NOPAGE, it would work for
+ * the fault path, in which the lock will be released
+ * later. However for GUP, faultin_page() does nothing
+ * special on NOPAGE, so GUP would spin retrying without
+ * releasing the mmap read lock, causing possible livelock.
+ *
+ * Here only VM_FAULT_RETRY would make sure the mmap lock
+ * be released immediately, so that the thread concurrently
+ * releasing the userfault would always make progress.
+ */
+ release_fault_lock(vmf);
+ goto out;
+ }
+
/* take the reference before dropping the mmap_lock */
userfaultfd_ctx_get(ctx);
_
Patches currently in -mm which might be from peterx(a)redhat.com are
mm-userfaultfd-fix-release-hang-over-concurrent-gup.patch
Hi there,
Hope you're having a great day!
Would you be interested in a recently verified list of NetApp clients to support your outreach?
Let me know, and I'll be happy to share the details.
Best regards,
Kevin Martin
Demand Consultant
If you wish to stop receiving emails, reply with Abolish.
Hi there,
I am following up on my previous email.
Could you kindly share your thoughts when you have a moment?
Regards,
Kristina
________________________________
From: Kristina Williams
Sent: 07 March 2025 07:58
To: linux-stable-mirror(a)lists.linaro.org<mailto:linux-stable-mirror@lists.linaro.org>
Subject: Shopify POS new users
Hi there,
I hope you're doing well.
Would you be open to exploring our freshly verified Shopify POS Users Data? This could be a valuable resource for your marketing and outreach efforts.
Let me know if you're interested, and I'd be happy to share the available count and more details.
Looking forward to your response.
Best Regards,
Kristina Williams
Demand Generation Executive
Please respond with cancel, if not needed.
As part of I3C driver probing sequence for particular device instance,
While adding to queue it is trying to access ibi variable of dev which is
not yet initialized causing "Unable to handle kernel read from unreadable
memory" resulting in kernel panic.
Below is the sequence where this issue happened.
1. During boot up sequence IBI is received at host from the slave device
before requesting for IBI, Usually will request IBI by calling
i3c_device_request_ibi() during probe of slave driver.
2. Since master code trying to access IBI Variable for the particular
device instance before actually it initialized by slave driver,
due to this randomly accessing the address and causing kernel panic.
3. i3c_device_request_ibi() function invoked by the slave driver where
dev->ibi = ibi; assigned as part of function call
i3c_dev_request_ibi_locked().
4. But when IBI request sent by slave device, master code trying to access
this variable before its initialized due to this race condition
situation kernel panic happened.
fixes: dd3c52846d595 (i3c: master: svc: Add Silvaco I3C master driver)
Cc: stable(a)vger.kernel.org
Signed-off-by: Manjunatha Venkatesh <manjunatha.venkatesh(a)nxp.com>
---
Changes since v2:
- Description updated as per the review feedback
drivers/i3c/master/svc-i3c-master.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/i3c/master/svc-i3c-master.c b/drivers/i3c/master/svc-i3c-master.c
index d6057d8c7dec..98c4d2e5cd8d 100644
--- a/drivers/i3c/master/svc-i3c-master.c
+++ b/drivers/i3c/master/svc-i3c-master.c
@@ -534,8 +534,11 @@ static void svc_i3c_master_ibi_work(struct work_struct *work)
switch (ibitype) {
case SVC_I3C_MSTATUS_IBITYPE_IBI:
if (dev) {
- i3c_master_queue_ibi(dev, master->ibi.tbq_slot);
- master->ibi.tbq_slot = NULL;
+ data = i3c_dev_get_master_data(dev);
+ if (master->ibi.slots[data->ibi]) {
+ i3c_master_queue_ibi(dev, master->ibi.tbq_slot);
+ master->ibi.tbq_slot = NULL;
+ }
}
svc_i3c_master_emit_stop(master);
break;
--
2.46.1