I hit a bad pud error and lost a 1GB HugeTLB page when calling swapoff.
The problem can be reproduced by the following steps:
1. Allocate an anonymous 1GB HugeTLB page and some other anonymous memory
   (see the reproducer sketch below).
2. Swap out the above anonymous memory.
3. Run swapoff; a bad pud error shows up in the kernel log:
mm/pgtable-generic.c:42: bad pud 00000000743d215d(84000001400000e7)
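For reference, the following is a minimal reproducer sketch, not taken from
the original report. It covers step 1 only: it maps a 1GB anonymous HugeTLB
region plus some ordinary anonymous memory and then waits, leaving steps 2
and 3 (forcing the anonymous memory to be swapped out, then running swapoff)
to be driven externally. It assumes a 1GB hugepage has been reserved (e.g.
hugepagesz=1G hugepages=1 on the kernel command line) and that a swap device
is enabled:

#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB    (30 << 26)      /* 30 == log2(1GB), MAP_HUGE_SHIFT == 26 */
#endif

int main(void)
{
        size_t huge_sz = 1UL << 30;     /* one 1GB HugeTLB page */
        size_t anon_sz = 256UL << 20;   /* plus some ordinary anonymous memory */
        void *huge, *anon;

        /* step 1: anonymous 1GB HugeTLB mapping plus other anonymous memory */
        huge = mmap(NULL, huge_sz, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
                    -1, 0);
        anon = mmap(NULL, anon_sz, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (huge == MAP_FAILED || anon == MAP_FAILED)
                return 1;
        memset(huge, 0x5a, huge_sz);
        memset(anon, 0x5a, anon_sz);

        /*
         * Steps 2 and 3 are external: apply memory pressure so the anonymous
         * memory gets swapped out, then run swapoff and watch dmesg.
         */
        pause();
        return 0;
}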
Using ftrace, we can tell that pud_clear_bad() is called by
pud_none_or_clear_bad() in unuse_pud_range(). As a result, the HugeTLB
page is lost from the page table and will never be freed. Fix this by
skipping HugeTLB VMAs when calling unuse_vma().
Fixes: 0fe6e20b9c4c ("hugetlb, rmap: add reverse mapping for hugepage")
Signed-off-by: Liu Shixin <liushixin2(a)huawei.com>
---
mm/swapfile.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 0cded32414a1..f4ef91513fc9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2312,7 +2312,7 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
mmap_read_lock(mm);
for_each_vma(vmi, vma) {
- if (vma->anon_vma) {
+ if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
ret = unuse_vma(vma, type);
if (ret)
break;
--
2.34.1
Previously, the domain_context_clear() function incorrectly called
pci_for_each_dma_alias() to clear context entries for non-PCI devices.
This could lead to kernel hangs or other unexpected behavior.
Add a check to only call pci_for_each_dma_alias() for PCI devices. For
non-PCI devices, domain_context_clear_one() is called directly.
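Purely for readability, and not additional code in the patch, this is
roughly how domain_context_clear() reads once the hunk below is applied;
any lines outside the hunk are assumed unchanged:

static void domain_context_clear(struct device_domain_info *info)
{
        if (!dev_is_pci(info->dev)) {
                /* non-PCI device: clear its single context entry directly */
                domain_context_clear_one(info, info->bus, info->devfn);
                return;
        }

        /* PCI device: clear the context entry of every DMA alias */
        pci_for_each_dma_alias(to_pci_dev(info->dev),
                               &domain_context_clear_one_cb, info);
}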
Reported-by: Todd Brandt <todd.e.brandt(a)intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219363
Fixes: 9a16ab9d6402 ("iommu/vt-d: Make context clearing consistent with context mapping")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
---
drivers/iommu/intel/iommu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9f6b0780f2ef..e860bc9439a2 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3340,8 +3340,10 @@ static int domain_context_clear_one_cb(struct pci_dev *pdev, u16 alias, void *op
*/
static void domain_context_clear(struct device_domain_info *info)
{
- if (!dev_is_pci(info->dev))
+ if (!dev_is_pci(info->dev)) {
domain_context_clear_one(info, info->bus, info->devfn);
+ return;
+ }
pci_for_each_dma_alias(to_pci_dev(info->dev),
&domain_context_clear_one_cb, info);
--
2.43.0
The test case added by commit 09bcf9254838 ("selftests/ftrace: Add new
test case which checks non unique symbol") failed; it is fixed by commit
b022f0c7e404 ("tracing/kprobes: Return EADDRNOTAVAIL when func matches
several symbols"). Backport it and its fix commit to 5.4.y together.
Minor context conflicts were resolved. This resends the patch series
after it was first backported to 5.10.y.
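For reference, here is a hypothetical sanity check, not part of the
backported series, which tries to create a kprobe on "name_show", the
symbol the selftest relies on being non-unique, and expects the write to
fail with EADDRNOTAVAIL once the fix is applied. It assumes tracefs is
mounted at /sys/kernel/tracing and that name_show indeed matches several
symbols on the running kernel:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const char *cmd = "p:test_nonunique name_show\n";
        int fd = open("/sys/kernel/tracing/kprobe_events", O_WRONLY | O_APPEND);
        ssize_t ret;

        if (fd < 0) {
                perror("open kprobe_events");
                return 1;
        }
        /* try to register a kprobe on a (presumably) non-unique symbol */
        ret = write(fd, cmd, strlen(cmd));
        if (ret < 0 && errno == EADDRNOTAVAIL)
                printf("fix present: non-unique symbol rejected with EADDRNOTAVAIL\n");
        else if (ret < 0)
                printf("write failed with errno=%d\n", errno);
        else
                printf("kprobe was created; fix may be missing (remember to clean up)\n");
        close(fd);
        return 0;
}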
Andrii Nakryiko (1):
tracing/kprobes: Fix symbol counting logic by looking at modules as
well
Francis Laniel (1):
tracing/kprobes: Return EADDRNOTAVAIL when func matches several
symbols
kernel/trace/trace_kprobe.c | 76 +++++++++++++++++++++++++++++++++++++
kernel/trace/trace_probe.h | 1 +
2 files changed, 77 insertions(+)
--
2.46.0
Hello all,
We are experiencing a boot hang with kernel version 6.1.83+ on a Dell
Inc. PowerEdge R770 equipped with an Intel Xeon 6710E processor. After
extensive testing and use of `git bisect`, we have traced the issue to
this commit:
`586e19c88a0c ("iommu/vt-d: Retrieve IOMMU perfmon capability information")`
This commit appears to be part of a larger patchset, which can be found
here: https://lore.kernel.org/lkml/7c4b3e4e-1c5d-04f1-1891-84f68…
We attempted to boot with the `intel_iommu=off` option, but the system
hangs in the same manner. However, the system boots successfully after
disabling `CONFIG_INTEL_IOMMU_PERF_EVENTS`.
I'm reporting here in case others hit the same issue.
Any assistance or guidance on understanding/resolving this issue would
be greatly appreciated.
Thank you.
Jinpu Wang @ IONOS Cloud
The patch below does not apply to the 6.11-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.11.y
git checkout FETCH_HEAD
git cherry-pick -x c9d952b9103b600ddafc5d1c0e2f2dbd30f0b805
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024101436-avalanche-approach-5c90@gregkh' --subject-prefix 'PATCH 6.11.y' HEAD^..
Possible dependencies:
c9d952b9103b ("io_uring/rw: fix cflags posting for single issue multishot read")
6733e678ba12 ("io_uring/kbuf: pass in 'len' argument for buffer commit")
ecd5c9b29643 ("io_uring/kbuf: add io_kbuf_commit() helper")
03e02e8f95fe ("io_uring/kbuf: use 'bl' directly rather than req->buf_list")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c9d952b9103b600ddafc5d1c0e2f2dbd30f0b805 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Sat, 5 Oct 2024 19:06:50 -0600
Subject: [PATCH] io_uring/rw: fix cflags posting for single issue multishot
read
If multishot gets disabled, and hence the request will get terminated
rather than persist for more iterations, then posting the CQE with the
right cflags is still important. Most notably, the buffer reference
needs to be included.
Refactor the return of __io_read() a bit, so that the provided buffer
is always put correctly, and hence returned to the application.
Reported-by: Sharon Rosner <Sharon Rosner>
Link: https://github.com/axboe/liburing/issues/1257
Cc: stable(a)vger.kernel.org
Fixes: 2a975d426c82 ("io_uring/rw: don't allow multishot reads without NOWAIT support")
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/rw.c b/io_uring/rw.c
index f023ff49c688..93ad92605884 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -972,17 +972,21 @@ int io_read_mshot(struct io_kiocb *req, unsigned int issue_flags)
if (issue_flags & IO_URING_F_MULTISHOT)
return IOU_ISSUE_SKIP_COMPLETE;
return -EAGAIN;
- }
-
- /*
- * Any successful return value will keep the multishot read armed.
- */
- if (ret > 0 && req->flags & REQ_F_APOLL_MULTISHOT) {
+ } else if (ret <= 0) {
+ io_kbuf_recycle(req, issue_flags);
+ if (ret < 0)
+ req_set_fail(req);
+ } else {
/*
- * Put our buffer and post a CQE. If we fail to post a CQE, then
+ * Any successful return value will keep the multishot read
+ * armed, if it's still set. Put our buffer and post a CQE. If
+ * we fail to post a CQE, or multishot is no longer set, then
* jump to the termination path. This request is then done.
*/
cflags = io_put_kbuf(req, ret, issue_flags);
+ if (!(req->flags & REQ_F_APOLL_MULTISHOT))
+ goto done;
+
rw->len = 0; /* similarly to above, reset len to 0 */
if (io_req_post_cqe(req, ret, cflags | IORING_CQE_F_MORE)) {
@@ -1003,6 +1007,7 @@ int io_read_mshot(struct io_kiocb *req, unsigned int issue_flags)
* Either an error, or we've hit overflow posting the CQE. For any
* multishot request, hitting overflow will terminate it.
*/
+done:
io_req_set_res(req, ret, cflags);
io_req_rw_cleanup(req, issue_flags);
if (issue_flags & IO_URING_F_MULTISHOT)