Hi there,
I found a nullptr dereference in perf subsystem and it affects at least
v5.10 and v6.1 stable trees. (the same poc cannot trigger the crash in
the mainline).
I fail to find the root cause the bug. All I know is that it is a race
condition in the logic of moving_groups from pure software-based perf
events to hardware ones. More specifically, when we add a hardware perf
event to a software event group, it will trigger a "move_group" logic in
perf_event_open. When the "move_group" logic happens, it will remove all
existing events from the context first using `perf_remove_from_context`.
And it will invoke `__perf_remove_from_context` through
`event_function_call`.
Notice that `event_function_call` is defined as follow:
~~~
static void event_function_call(struct perf_event *event, event_f func, void *data)
{
...
func(event, NULL, ctx, data);
...
}
~~~
This means `__perf_remove_from_context` will be invoked with
cpuctx==NULL, which leads to invoking `event_sched_out` with cpuctx ==
NULL.
At this moment, as long as the event is active, we are going to invoke
the `if (event->attr.exclusive || !cpuctx->active_oncpu)` logic, which
is a null pointer deference.
I don't know the proper way to patch this bug. So I'm asking for help.
A reproducer is attached to this email.
Best,
Kyle Zeng
If of_clk_add_provider() fails in ca8210_register_ext_clock(),
it calls clk_unregister() to release priv->clk and returns an
error. However, the caller ca8210_probe() then calls ca8210_remove(),
where priv->clk is freed again in ca8210_unregister_ext_clock(). In
this case, a use-after-free may happen in the second time we call
clk_unregister().
Fix this by removing the first clk_unregister(). Also, priv->clk could
be an error code on failure of clk_register_fixed_rate(). Use
IS_ERR_OR_NULL to catch this case in ca8210_unregister_ext_clock().
Fixes: ded845a781a5 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver")
Signed-off-by: Dinghao Liu <dinghao.liu(a)zju.edu.cn>
---
Changelog:
v2: -Remove the first clk_unregister() instead of nulling priv->clk.
v3: -Simplify ca8210_register_ext_clock().
-Add a ';' after return in ca8210_unregister_ext_clock().
---
drivers/net/ieee802154/ca8210.c | 16 +++-------------
1 file changed, 3 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ieee802154/ca8210.c b/drivers/net/ieee802154/ca8210.c
index aebb19f1b3a4..ae44a9133937 100644
--- a/drivers/net/ieee802154/ca8210.c
+++ b/drivers/net/ieee802154/ca8210.c
@@ -2757,18 +2757,8 @@ static int ca8210_register_ext_clock(struct spi_device *spi)
dev_crit(&spi->dev, "Failed to register external clk\n");
return PTR_ERR(priv->clk);
}
- ret = of_clk_add_provider(np, of_clk_src_simple_get, priv->clk);
- if (ret) {
- clk_unregister(priv->clk);
- dev_crit(
- &spi->dev,
- "Failed to register external clock as clock provider\n"
- );
- } else {
- dev_info(&spi->dev, "External clock set as clock provider\n");
- }
- return ret;
+ return of_clk_add_provider(np, of_clk_src_simple_get, priv->clk);
}
/**
@@ -2780,8 +2770,8 @@ static void ca8210_unregister_ext_clock(struct spi_device *spi)
{
struct ca8210_priv *priv = spi_get_drvdata(spi);
- if (!priv->clk)
- return
+ if (IS_ERR_OR_NULL(priv->clk))
+ return;
of_clk_del_provider(spi->dev.of_node);
clk_unregister(priv->clk);
--
2.17.1
The quilt patch titled
Subject: mm: make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
has been removed from the -mm tree. Its filename was
mm-make-pr_mdwe_refuse_exec_gain-an-unsigned-long.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Florent Revest <revest(a)chromium.org>
Subject: mm: make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long
Date: Mon, 28 Aug 2023 17:08:56 +0200
Defining a prctl flag as an int is a footgun because on a 64 bit machine
and with a variadic implementation of prctl (like in musl and glibc), when
used directly as a prctl argument, it can get casted to long with garbage
upper bits which would result in unexpected behaviors.
This patch changes the constant to an unsigned long to eliminate that
possibilities. This does not break UAPI.
I think that a stable backport would be "nice to have": to reduce the
chances that users build binaries that could end up with garbage bits in
their MDWE prctl arguments. We are not aware of anyone having yet
encountered this corner case with MDWE prctls but a backport would reduce
the likelihood it happens, since this sort of issues has happened with
other prctls. But If this is perceived as a backporting burden, I suppose
we could also live without a stable backport.
Link: https://lkml.kernel.org/r/20230828150858.393570-5-revest@chromium.org
Fixes: b507808ebce2 ("mm: implement memory-deny-write-execute as a prctl")
Signed-off-by: Florent Revest <revest(a)chromium.org>
Suggested-by: Alexey Izbyshev <izbyshev(a)ispras.ru>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Ayush Jain <ayush.jain3(a)amd.com>
Cc: Greg Thelen <gthelen(a)google.com>
Cc: Joey Gouly <joey.gouly(a)arm.com>
Cc: KP Singh <kpsingh(a)kernel.org>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Szabolcs Nagy <Szabolcs.Nagy(a)arm.com>
Cc: Topi Miettinen <toiwoton(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/uapi/linux/prctl.h | 2 +-
tools/include/uapi/linux/prctl.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/include/uapi/linux/prctl.h~mm-make-pr_mdwe_refuse_exec_gain-an-unsigned-long
+++ a/include/uapi/linux/prctl.h
@@ -283,7 +283,7 @@ struct prctl_mm_map {
/* Memory deny write / execute */
#define PR_SET_MDWE 65
-# define PR_MDWE_REFUSE_EXEC_GAIN 1
+# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0)
#define PR_GET_MDWE 66
--- a/tools/include/uapi/linux/prctl.h~mm-make-pr_mdwe_refuse_exec_gain-an-unsigned-long
+++ a/tools/include/uapi/linux/prctl.h
@@ -283,7 +283,7 @@ struct prctl_mm_map {
/* Memory deny write / execute */
#define PR_SET_MDWE 65
-# define PR_MDWE_REFUSE_EXEC_GAIN 1
+# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0)
#define PR_GET_MDWE 66
_
Patches currently in -mm which might be from revest(a)chromium.org are
The quilt patch titled
Subject: mm/migrate: fix do_pages_move for compat pointers
has been removed from the -mm tree. Its filename was
mm-migrate-fix-do_pages_move-for-compat-pointers.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Gregory Price <gourry.memverge(a)gmail.com>
Subject: mm/migrate: fix do_pages_move for compat pointers
Date: Tue, 3 Oct 2023 10:48:56 -0400
do_pages_move does not handle compat pointers for the page list.
correctly. Add in_compat_syscall check and appropriate get_user fetch
when iterating the page list.
It makes the syscall in compat mode (32-bit userspace, 64-bit kernel)
work the same way as the native 32-bit syscall again, restoring the
behavior before my broken commit 5b1b561ba73c ("mm: simplify
compat_sys_move_pages").
More specifically, my patch moved the parsing of the 'pages' array from
the main entry point into do_pages_stat(), which left the syscall
working correctly for the 'stat' operation (nodes = NULL), while the
'move' operation (nodes != NULL) is now missing the conversion and
interprets 'pages' as an array of 64-bit pointers instead of the
intended 32-bit userspace pointers.
It is possible that nobody noticed this bug because the few
applications that actually call move_pages are unlikely to run in
compat mode because of their large memory requirements, but this
clearly fixes a user-visible regression and should have been caught by
ltp.
Link: https://lkml.kernel.org/r/20231003144857.752952-1-gregory.price@memverge.com
Fixes: 5b1b561ba73c ("mm: simplify compat_sys_move_pages")
Signed-off-by: Gregory Price <gregory.price(a)memverge.com>
Reported-by: Arnd Bergmann <arnd(a)arndb.de>
Co-developed-by: Arnd Bergmann <arnd(a)arndb.de>
Cc: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/migrate.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
--- a/mm/migrate.c~mm-migrate-fix-do_pages_move-for-compat-pointers
+++ a/mm/migrate.c
@@ -2162,6 +2162,7 @@ static int do_pages_move(struct mm_struc
const int __user *nodes,
int __user *status, int flags)
{
+ compat_uptr_t __user *compat_pages = (void __user *)pages;
int current_node = NUMA_NO_NODE;
LIST_HEAD(pagelist);
int start, i;
@@ -2174,8 +2175,17 @@ static int do_pages_move(struct mm_struc
int node;
err = -EFAULT;
- if (get_user(p, pages + i))
- goto out_flush;
+ if (in_compat_syscall()) {
+ compat_uptr_t cp;
+
+ if (get_user(cp, compat_pages + i))
+ goto out_flush;
+
+ p = compat_ptr(cp);
+ } else {
+ if (get_user(p, pages + i))
+ goto out_flush;
+ }
if (get_user(node, nodes + i))
goto out_flush;
_
Patches currently in -mm which might be from gourry.memverge(a)gmail.com are
mm-migrate-remove-unused-mm-argument-from-do_move_pages_to_node.patch
The quilt patch titled
Subject: mm/mempolicy: fix set_mempolicy_home_node() previous VMA pointer
has been removed from the -mm tree. Its filename was
mm-mempolicy-fix-set_mempolicy_home_node-previous-vma-pointer.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Subject: mm/mempolicy: fix set_mempolicy_home_node() previous VMA pointer
Date: Thu, 28 Sep 2023 13:24:32 -0400
The two users of mbind_range() are expecting that mbind_range() will
update the pointer to the previous VMA, or return an error. However,
set_mempolicy_home_node() does not call mbind_range() if there is no VMA
policy. The fix is to update the pointer to the previous VMA prior to
continuing iterating the VMAs when there is no policy.
Users may experience a WARN_ON() during VMA policy updates when updating
a range of VMAs on the home node.
Link: https://lkml.kernel.org/r/20230928172432.2246534-1-Liam.Howlett@oracle.com
Link: https://lore.kernel.org/linux-mm/CALcu4rbT+fMVNaO_F2izaCT+e7jzcAciFkOvk21HG…
Fixes: f4e9e0e69468 ("mm/mempolicy: fix use-after-free of VMA iterator")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reported-by: Yikebaer Aizezi <yikebaer61(a)gmail.com>
Closes: https://lore.kernel.org/linux-mm/CALcu4rbT+fMVNaO_F2izaCT+e7jzcAciFkOvk21HG…
Reviewed-by: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/mm/mempolicy.c~mm-mempolicy-fix-set_mempolicy_home_node-previous-vma-pointer
+++ a/mm/mempolicy.c
@@ -1543,8 +1543,10 @@ SYSCALL_DEFINE4(set_mempolicy_home_node,
* the home node for vmas we already updated before.
*/
old = vma_policy(vma);
- if (!old)
+ if (!old) {
+ prev = vma;
continue;
+ }
if (old->mode != MPOL_BIND && old->mode != MPOL_PREFERRED_MANY) {
err = -EOPNOTSUPP;
break;
_
Patches currently in -mm which might be from Liam.Howlett(a)oracle.com are
mmap-add-clarifying-comment-to-vma_merge-code.patch
radix-tree-test-suite-fix-allocation-calculation-in-kmem_cache_alloc_bulk.patch
The quilt patch titled
Subject: mmap: fix vma_iterator in error path of vma_merge()
has been removed from the -mm tree. Its filename was
mmap-fix-vma_iterator-in-error-path-of-vma_merge.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Liam R. Howlett" <Liam.Howlett(a)oracle.com>
Subject: mmap: fix vma_iterator in error path of vma_merge()
Date: Fri, 29 Sep 2023 14:30:39 -0400
During the error path, the vma iterator may not be correctly positioned or
set to the correct range. Undo the vma_prev() call by resetting to the
passed in address. Re-walking to the same range will fix the range to the
area previously passed in.
Users would notice increased cycles as vma_merge() would be called an
extra time with vma == prev, and thus would fail to merge and return.
Link: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejt…
Link: https://lkml.kernel.org/r/20230929183041.2835469-2-Liam.Howlett@oracle.com
Fixes: 18b098af2890 ("vma_merge: set vma iterator to correct position.")
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reported-by: Jann Horn <jannh(a)google.com>
Closes: https://lore.kernel.org/linux-mm/CAG48ez12VN1JAOtTNMY+Y2YnsU45yL5giS-Qn=ejt…
Reviewed-by: Lorenzo Stoakes <lstoakes(a)gmail.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mmap.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
--- a/mm/mmap.c~mmap-fix-vma_iterator-in-error-path-of-vma_merge
+++ a/mm/mmap.c
@@ -975,7 +975,7 @@ struct vm_area_struct *vma_merge(struct
/* Error in anon_vma clone. */
if (err)
- return NULL;
+ goto anon_vma_fail;
if (vma_start < vma->vm_start || vma_end > vma->vm_end)
vma_expanded = true;
@@ -988,7 +988,7 @@ struct vm_area_struct *vma_merge(struct
}
if (vma_iter_prealloc(vmi, vma))
- return NULL;
+ goto prealloc_fail;
init_multi_vma_prep(&vp, vma, adjust, remove, remove2);
VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma &&
@@ -1016,6 +1016,12 @@ struct vm_area_struct *vma_merge(struct
vma_complete(&vp, vmi, mm);
khugepaged_enter_vma(res, vm_flags);
return res;
+
+prealloc_fail:
+anon_vma_fail:
+ vma_iter_set(vmi, addr);
+ vma_iter_load(vmi);
+ return NULL;
}
/*
_
Patches currently in -mm which might be from Liam.Howlett(a)oracle.com are
mmap-add-clarifying-comment-to-vma_merge-code.patch
radix-tree-test-suite-fix-allocation-calculation-in-kmem_cache_alloc_bulk.patch
The quilt patch titled
Subject: mm/page_alloc: correct start page when guard page debug is enabled
has been removed from the -mm tree. Its filename was
mm-page_alloc-correct-start-page-when-guard-page-debug-is-enabled.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Kemeng Shi <shikemeng(a)huaweicloud.com>
Subject: mm/page_alloc: correct start page when guard page debug is enabled
Date: Wed, 27 Sep 2023 17:44:01 +0800
When guard page debug is enabled and set_page_guard returns success, we
miss to forward page to point to start of next split range and we will do
split unexpectedly in page range without target page. Move start page
update before set_page_guard to fix this.
As we split to wrong target page, then splited pages are not able to merge
back to original order when target page is put back and splited pages
except target page is not usable. To be specific:
Consider target page is the third page in buddy page with order 2.
| buddy-2 | Page | Target | Page |
After break down to target page, we will only set first page to Guard
because of bug.
| Guard | Page | Target | Page |
When we try put_page_back_buddy with target page, the buddy page of target
if neither guard nor buddy, Then it's not able to construct original page
with order 2
| Guard | Page | buddy-0 | Page |
All pages except target page is not in free list and is not usable.
Link: https://lkml.kernel.org/r/20230927094401.68205-1-shikemeng@huaweicloud.com
Fixes: 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
Signed-off-by: Kemeng Shi <shikemeng(a)huaweicloud.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi(a)nec.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c~mm-page_alloc-correct-start-page-when-guard-page-debug-is-enabled
+++ a/mm/page_alloc.c
@@ -6475,6 +6475,7 @@ static void break_down_buddy_pages(struc
next_page = page;
current_buddy = page + size;
}
+ page = next_page;
if (set_page_guard(zone, current_buddy, high, migratetype))
continue;
@@ -6482,7 +6483,6 @@ static void break_down_buddy_pages(struc
if (current_buddy != target) {
add_to_free_list(current_buddy, zone, high, migratetype);
set_buddy_order(current_buddy, high);
- page = next_page;
}
}
}
_
Patches currently in -mm which might be from shikemeng(a)huaweicloud.com are
mm-page_alloc-remove-unnecessary-check-in-break_down_buddy_pages.patch
mm-page_alloc-remove-unnecessary-next_page-in-break_down_buddy_pages.patch
When a zswap store fails due to the limit, it acquires a pool
reference and queues the shrinker. When the shrinker runs, it drops
the reference. However, there can be multiple store attempts before
the shrinker wakes up and runs once. This results in reference leaks
and eventual saturation warnings for the pool refcount.
Fix this by dropping the reference again when the shrinker is already
queued. This ensures one reference per shrinker run.
Reported-by: Chris Mason <clm(a)fb.com>
Fixes: 45190f01dd40 ("mm/zswap.c: add allocation hysteresis if pool limit is hit")
Cc: stable(a)vger.kernel.org [5.6+]
Cc: Vitaly Wool <vitaly.wool(a)konsulko.com>
Cc: Domenico Cerasuolo <cerasuolodomenico(a)gmail.com>
Cc: Nhat Pham <nphamcs(a)gmail.com>
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
---
mm/zswap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/zswap.c b/mm/zswap.c
index 083c693602b8..37d2b1cb2ecb 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1383,8 +1383,8 @@ bool zswap_store(struct folio *folio)
shrink:
pool = zswap_pool_last_get();
- if (pool)
- queue_work(shrink_wq, &pool->shrink_work);
+ if (pool && !queue_work(shrink_wq, &pool->shrink_work))
+ zswap_pool_put(pool);
goto reject;
}
--
2.42.0