The patch titled
Subject: stackdepot: fix stack_depot_save_flags() in NMI context
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
stackdepot-fix-stack_depot_save_flags-in-nmi-context.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Marco Elver <elver(a)google.com>
Subject: stackdepot: fix stack_depot_save_flags() in NMI context
Date: Fri, 22 Nov 2024 16:39:47 +0100
Per the documentation, stack_depot_save_flags() was meant to be usable from
NMI context if STACK_DEPOT_FLAG_CAN_ALLOC is unset.  However, it would still
try to take the pool_lock in an attempt to save a stack trace in the
current pool (if space is available).
This could result in deadlock if an NMI is handled while pool_lock is
already held.  To avoid the deadlock, in NMI context only attempt to take
the lock (trylock) and give up saving the stack trace if that fails.
The documentation is updated to convey this clearly.
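For illustration, a caller that may run in NMI context is expected to leave
STACK_DEPOT_FLAG_CAN_ALLOC unset; a minimal sketch of such a hypothetical
caller (not taken from the patch):

        unsigned long entries[64];
        unsigned int nr;
        depot_stack_handle_t handle;

        nr = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
        /*
         * No STACK_DEPOT_FLAG_CAN_ALLOC: the depot must not allocate, and
         * with this fix it only trylocks pool_lock in NMI context, so a
         * return value of 0 (nothing stored) has to be tolerated.
         */
        handle = stack_depot_save_flags(entries, nr, 0, 0);
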
Link: https://lkml.kernel.org/r/Z0CcyfbPqmxJ9uJH@elver.google.com
Link: https://lkml.kernel.org/r/20241122154051.3914732-1-elver@google.com
Fixes: 4434a56ec209 ("stackdepot: make fast paths lock-less again")
Signed-off-by: Marco Elver <elver(a)google.com>
Reported-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/stackdepot.h | 6 +++---
lib/stackdepot.c | 10 +++++++++-
2 files changed, 12 insertions(+), 4 deletions(-)
--- a/include/linux/stackdepot.h~stackdepot-fix-stack_depot_save_flags-in-nmi-context
+++ a/include/linux/stackdepot.h
@@ -147,7 +147,7 @@ static inline int stack_depot_early_init
* If the provided stack trace comes from the interrupt context, only the part
* up to the interrupt entry is saved.
*
- * Context: Any context, but setting STACK_DEPOT_FLAG_CAN_ALLOC is required if
+ * Context: Any context, but unsetting STACK_DEPOT_FLAG_CAN_ALLOC is required if
* alloc_pages() cannot be used from the current context. Currently
* this is the case for contexts where neither %GFP_ATOMIC nor
* %GFP_NOWAIT can be used (NMI, raw_spin_lock).
@@ -156,7 +156,7 @@ static inline int stack_depot_early_init
*/
depot_stack_handle_t stack_depot_save_flags(unsigned long *entries,
unsigned int nr_entries,
- gfp_t gfp_flags,
+ gfp_t alloc_flags,
depot_flags_t depot_flags);
/**
@@ -175,7 +175,7 @@ depot_stack_handle_t stack_depot_save_fl
* Return: Handle of the stack trace stored in depot, 0 on failure
*/
depot_stack_handle_t stack_depot_save(unsigned long *entries,
- unsigned int nr_entries, gfp_t gfp_flags);
+ unsigned int nr_entries, gfp_t alloc_flags);
/**
* __stack_depot_get_stack_record - Get a pointer to a stack_record struct
--- a/lib/stackdepot.c~stackdepot-fix-stack_depot_save_flags-in-nmi-context
+++ a/lib/stackdepot.c
@@ -630,7 +630,15 @@ depot_stack_handle_t stack_depot_save_fl
prealloc = page_address(page);
}
- raw_spin_lock_irqsave(&pool_lock, flags);
+ if (in_nmi()) {
+ /* We can never allocate in NMI context. */
+ WARN_ON_ONCE(can_alloc);
+ /* Best effort; bail if we fail to take the lock. */
+ if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+ goto exit;
+ } else {
+ raw_spin_lock_irqsave(&pool_lock, flags);
+ }
printk_deferred_enter();
/* Try to find again, to avoid concurrently inserting duplicates. */
_
Patches currently in -mm which might be from elver(a)google.com are
stackdepot-fix-stack_depot_save_flags-in-nmi-context.patch
The patch titled
Subject: mm: open-code page_folio() in dump_page()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-open-code-page_folio-in-dump_page.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: mm: open-code page_folio() in dump_page()
Date: Mon, 25 Nov 2024 20:17:19 +0000
page_folio() calls page_fixed_fake_head(), which will misidentify this page
(the on-stack copy 'precise') as being a fake head and, in doing so, load
off the end of 'precise'.  We may end up with a pointer to a fake head, but
that's OK because it still contains the right information for dump_page().
gcc-15 is smart enough to catch this with -Warray-bounds:
In function 'page_fixed_fake_head',
    inlined from '_compound_head' at ../include/linux/page-flags.h:251:24,
    inlined from '__dump_page' at ../mm/debug.c:123:11:
../include/asm-generic/rwonce.h:44:26: warning: array subscript 9 is outside array bounds of 'struct page[1]' [-Warray-bounds=]
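For context, bit 0 of page->compound_head marks a tail page and the
remaining bits point at the head page; a simplified sketch of the decoding
the patch open-codes (hypothetical helper name, and deliberately without
the page_fixed_fake_head() probe that reads past the on-stack copy):

        static const struct page *plain_compound_head(const struct page *page)
        {
                unsigned long head = READ_ONCE(page->compound_head);

                if (head & 1)                           /* tail page */
                        return (const struct page *)(head - 1);
                return page;                            /* head or order-0 page */
        }
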
Link: https://lkml.kernel.org/r/20241125201721.2963278-2-willy@infradead.org
Fixes: fae7d834c43c ("mm: add __dump_folio()")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reported-by: Kees Cook <kees(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/debug.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/mm/debug.c~mm-open-code-page_folio-in-dump_page
+++ a/mm/debug.c
@@ -124,19 +124,22 @@ static void __dump_page(const struct pag
{
struct folio *foliop, folio;
struct page precise;
+ unsigned long head;
unsigned long pfn = page_to_pfn(page);
unsigned long idx, nr_pages = 1;
int loops = 5;
again:
memcpy(&precise, page, sizeof(*page));
- foliop = page_folio(&precise);
- if (foliop == (struct folio *)&precise) {
+ head = precise.compound_head;
+ if ((head & 1) == 0) {
+ foliop = (struct folio *)&precise;
idx = 0;
if (!folio_test_large(foliop))
goto dump;
foliop = (struct folio *)page;
} else {
+ foliop = (struct folio *)(head - 1);
idx = folio_page_idx(foliop, page);
}
_
Patches currently in -mm which might be from willy(a)infradead.org are
mm-open-code-pagetail-in-folio_flags-and-const_folio_flags.patch
mm-open-code-page_folio-in-dump_page.patch
mm-page_alloc-cache-page_zone-result-in-free_unref_page.patch
mm-make-alloc_pages_mpol-static.patch
mm-page_alloc-export-free_frozen_pages-instead-of-free_unref_page.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-post_alloc_hook.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-prep_new_page.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-get_page_from_freelist.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_cpuset_fallback.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_may_oom.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_direct_compact.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_direct_reclaim.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_slowpath.patch
mm-page_alloc-move-set_page_refcounted-to-end-of-__alloc_pages.patch
mm-page_alloc-add-__alloc_frozen_pages.patch
mm-mempolicy-add-alloc_frozen_pages.patch
slab-allocate-frozen-pages.patch
The patch titled
Subject: mm: open-code PageTail in folio_flags() and const_folio_flags()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-open-code-pagetail-in-folio_flags-and-const_folio_flags.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Subject: mm: open-code PageTail in folio_flags() and const_folio_flags()
Date: Mon, 25 Nov 2024 20:17:18 +0000
It is unsafe to call PageTail() in dump_page() as page_is_fake_head() will
almost certainly return true when called on a head page that is copied to
the stack. That will cause the VM_BUG_ON_PGFLAGS() in const_folio_flags()
to trigger when it shouldn't. Fortunately, we don't need to call
PageTail() here; it's fine to have a pointer to a virtual alias of the
page's flag word rather than the real page's flag word.
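For illustration, the situation __dump_page() puts these helpers in,
sketched under the assumption that the caller works on a stack copy of the
page (as the companion dump_page() patch in this series does):

        struct page precise;

        memcpy(&precise, page, sizeof(precise));
        /*
         * PageTail(&precise) may consult page_is_fake_head(), which reads
         * the word following 'precise', i.e. arbitrary stack memory, and
         * can report a tail page.  Checking precise.compound_head & 1
         * depends only on the copied word itself and cannot misfire.
         */
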
Link: https://lkml.kernel.org/r/20241125201721.2963278-1-willy@infradead.org
Fixes: fae7d834c43c ("mm: add __dump_folio()")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Cc: Kees Cook <kees(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/page-flags.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/page-flags.h~mm-open-code-pagetail-in-folio_flags-and-const_folio_flags
+++ a/include/linux/page-flags.h
@@ -306,7 +306,7 @@ static const unsigned long *const_folio_
{
const struct page *page = &folio->page;
- VM_BUG_ON_PGFLAGS(PageTail(page), page);
+ VM_BUG_ON_PGFLAGS(page->compound_head & 1, page);
VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
return &page[n].flags;
}
@@ -315,7 +315,7 @@ static unsigned long *folio_flags(struct
{
struct page *page = &folio->page;
- VM_BUG_ON_PGFLAGS(PageTail(page), page);
+ VM_BUG_ON_PGFLAGS(page->compound_head & 1, page);
VM_BUG_ON_PGFLAGS(n > 0 && !test_bit(PG_head, &page->flags), page);
return &page[n].flags;
}
_
Patches currently in -mm which might be from willy(a)infradead.org are
mm-open-code-pagetail-in-folio_flags-and-const_folio_flags.patch
mm-open-code-page_folio-in-dump_page.patch
mm-page_alloc-cache-page_zone-result-in-free_unref_page.patch
mm-make-alloc_pages_mpol-static.patch
mm-page_alloc-export-free_frozen_pages-instead-of-free_unref_page.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-post_alloc_hook.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-prep_new_page.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-get_page_from_freelist.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_cpuset_fallback.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_may_oom.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_direct_compact.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_direct_reclaim.patch
mm-page_alloc-move-set_page_refcounted-to-callers-of-__alloc_pages_slowpath.patch
mm-page_alloc-move-set_page_refcounted-to-end-of-__alloc_pages.patch
mm-page_alloc-add-__alloc_frozen_pages.patch
mm-mempolicy-add-alloc_frozen_pages.patch
slab-allocate-frozen-pages.patch
The patch titled
Subject: mm: fix vrealloc()'s KASAN poisoning logic
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-fix-vreallocs-kasan-poisoning-logic.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Andrii Nakryiko <andrii(a)kernel.org>
Subject: mm: fix vrealloc()'s KASAN poisoning logic
Date: Mon, 25 Nov 2024 16:52:06 -0800
When vrealloc() reuses an already allocated vmap_area, we need to
re-annotate the poisoned and unpoisoned portions of the underlying memory
according to the new size.
Note, hard-coding KASAN_VMALLOC_PROT_NORMAL might not be exactly correct,
but the KASAN flag logic is pretty involved and spread throughout
__vmalloc_node_range_noprof(), so I'm using the bare minimum flag here and
leaving the rest to the mm folks to refactor this logic and reuse it here.
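A small usage sketch of the scenario being fixed (sizes are made up, and a
KASAN_VMALLOC build is assumed):

        char *buf = vmalloc(PAGE_SIZE);
        char *small;

        /* Shrink: the existing vmap_area is reused rather than reallocated. */
        small = vrealloc(buf, 128, GFP_KERNEL);
        /*
         * With this patch, [128, PAGE_SIZE) is re-poisoned and [0, 128) is
         * unpoisoned, so an out-of-bounds access such as small[200] is
         * reported by KASAN instead of going unnoticed.
         */
        vfree(small);
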
Link: https://lkml.kernel.org/r/20241126005206.3457974-1-andrii@kernel.org
Fixes: 3ddc2fefe6f3 ("mm: vmalloc: implement vrealloc()")
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmalloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/vmalloc.c~mm-fix-vreallocs-kasan-poisoning-logic
+++ a/mm/vmalloc.c
@@ -4093,7 +4093,8 @@ void *vrealloc_noprof(const void *p, siz
/* Zero out spare memory. */
if (want_init_on_alloc(flags))
memset((void *)p + size, 0, old_size - size);
-
+ kasan_poison_vmalloc(p + size, old_size - size);
+ kasan_unpoison_vmalloc(p, size, KASAN_VMALLOC_PROT_NORMAL);
return (void *)p;
}
_
Patches currently in -mm which might be from andrii(a)kernel.org are
mm-fix-vreallocs-kasan-poisoning-logic.patch
The patch titled
Subject: Revert "readahead: properly shorten readahead when falling back to do_page_cache_ra()"
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
revert-readahead-properly-shorten-readahead-when-falling-back-to-do_page_cache_ra.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jan Kara <jack(a)suse.cz>
Subject: Revert "readahead: properly shorten readahead when falling back to do_page_cache_ra()"
Date: Tue, 26 Nov 2024 15:52:08 +0100
This reverts commit 7c877586da3178974a8a94577b6045a48377ff25.
Anders and Philippe have reported that recent kernels occasionally hang in
the readahead code when used with NFS.  The problem has been bisected to
7c877586da3 ("readahead: properly shorten readahead when falling back to
do_page_cache_ra()").  The cause of the problem is that ra->size can be
shrunk by a read_pages() call, after which we end up calling
do_page_cache_ra() with a negative (read: huge positive) number of pages.
Let's revert 7c877586da3 for now until we can find a proper way for the
logic in read_pages() and page_cache_ra_order() to coexist.  This can
reduce readahead throughput due to readahead window confusion, but that's
better than outright hangs.
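The arithmetic in the reverted fallback path wraps as follows (numbers are
made up, for illustration only):

        unsigned int size = 16;                 /* ra->size after read_pages() shrank it */
        unsigned long start = 100, index = 140; /* readahead window positions */
        unsigned long nr = size - (index - start);
        /*
         * 16 - 40 in unsigned arithmetic yields 0xffffffffffffffe8 on
         * 64-bit: the "negative (read: huge positive)" page count that
         * ends up being passed to do_page_cache_ra().
         */
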
Link: https://lkml.kernel.org/r/20241126145208.985-1-jack@suse.cz
Fixes: 7c877586da31 ("readahead: properly shorten readahead when falling back to do_page_cache_ra()")
Reported-by: Anders Blomdell <anders.blomdell(a)gmail.com>
Reported-by: Philippe Troin <phil(a)fifi.org>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Tested-by: Philippe Troin <phil(a)fifi.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/readahead.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/readahead.c~revert-readahead-properly-shorten-readahead-when-falling-back-to-do_page_cache_ra
+++ a/mm/readahead.c
@@ -460,8 +460,7 @@ void page_cache_ra_order(struct readahea
struct file_ra_state *ra, unsigned int new_order)
{
struct address_space *mapping = ractl->mapping;
- pgoff_t start = readahead_index(ractl);
- pgoff_t index = start;
+ pgoff_t index = readahead_index(ractl);
unsigned int min_order = mapping_min_folio_order(mapping);
pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
pgoff_t mark = index + ra->size - ra->async_size;
@@ -524,7 +523,7 @@ void page_cache_ra_order(struct readahea
if (!err)
return;
fallback:
- do_page_cache_ra(ractl, ra->size - (index - start), ra->async_size);
+ do_page_cache_ra(ractl, ra->size, ra->async_size);
}
static unsigned long ractl_max_pages(struct readahead_control *ractl,
_
Patches currently in -mm which might be from jack(a)suse.cz are
revert-readahead-properly-shorten-readahead-when-falling-back-to-do_page_cache_ra.patch
The patch titled
Subject: mm: vmscan: ensure kswapd is woken up if the wait queue is active
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Seiji Nishikawa <snishika(a)redhat.com>
Subject: mm: vmscan: ensure kswapd is woken up if the wait queue is active
Date: Wed, 27 Nov 2024 00:06:12 +0900
Even after commit 501b26510ae3 ("vmstat: allow_direct_reclaim should use
zone_page_state_snapshot"), a task may remain indefinitely stuck in
throttle_direct_reclaim() while holding mm->rwsem.
__alloc_pages_nodemask
try_to_free_pages
throttle_direct_reclaim
This can cause numerous other tasks to wait on the same rwsem, leading
to severe system hangups:
[1088963.358712] INFO: task python3:1670971 blocked for more than 120 seconds.
[1088963.365653] Tainted: G OE -------- - - 4.18.0-553.el8_10.aarch64 #1
[1088963.373887] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1088963.381862] task:python3 state:D stack:0 pid:1670971 ppid:1667117 flags:0x00800080
[1088963.381869] Call trace:
[1088963.381872] __switch_to+0xd0/0x120
[1088963.381877] __schedule+0x340/0xac8
[1088963.381881] schedule+0x68/0x118
[1088963.381886] rwsem_down_read_slowpath+0x2d4/0x4b8
The issue arises when allow_direct_reclaim(pgdat) keeps returning false
even though the pgdat->pfmemalloc_wait wait queue is empty, so the task
loops in throttle_direct_reclaim() without ever making progress.
In some cases, reclaimable pages exist (zone_reclaimable_pages() returns
> 0), but calculations of pfmemalloc_reserve and free_pages result in
wmark_ok being false.
And then, despite the pgdat->kswapd_wait queue being non-empty, kswapd
is not woken up, further exacerbating the problem:
crash> px ((struct pglist_data *) 0xffff00817fffe540)->kswapd_highest_zoneidx
$775 = __MAX_NR_ZONES
This patch modifies allow_direct_reclaim() to wake kswapd if the
pgdat->kswapd_wait queue is active, regardless of whether wmark_ok is true
or false. This change ensures kswapd does not miss wake-ups under high
memory pressure, reducing the risk of task stalls in the throttled reclaim
path.
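A condensed sketch of the decision being changed, simplified from
allow_direct_reclaim() (the per-zone reserve/free-page accounting and the
kswapd_highest_zoneidx adjustment are omitted):

        wmark_ok = free_pages > pfmemalloc_reserve / 2;

        /*
         * Previously this wake-up was gated on !wmark_ok; now kswapd is
         * woken whenever it is sleeping, so background reclaim keeps
         * running even while wmark_ok flips back and forth under pressure.
         */
        if (waitqueue_active(&pgdat->kswapd_wait))
                wake_up_interruptible(&pgdat->kswapd_wait);

        return wmark_ok;
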
Link: https://lkml.kernel.org/r/20241126150612.114561-1-snishika@redhat.com
Signed-off-by: Seiji Nishikawa <snishika(a)redhat.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active
+++ a/mm/vmscan.c
@@ -6389,8 +6389,8 @@ static bool allow_direct_reclaim(pg_data
wmark_ok = free_pages > pfmemalloc_reserve / 2;
- /* kswapd must be awake if processes are being throttled */
- if (!wmark_ok && waitqueue_active(&pgdat->kswapd_wait)) {
+ /* Always wake up kswapd if the wait queue is not empty */
+ if (waitqueue_active(&pgdat->kswapd_wait)) {
if (READ_ONCE(pgdat->kswapd_highest_zoneidx) > ZONE_NORMAL)
WRITE_ONCE(pgdat->kswapd_highest_zoneidx, ZONE_NORMAL);
_
Patches currently in -mm which might be from snishika(a)redhat.com are
mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active.patch
The patch titled
Subject: selftests/damon: add _damon_sysfs.py to TEST_FILES
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
selftests-damon-add-_damon_sysfspy-to-test_files.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Maximilian Heyne <mheyne(a)amazon.de>
Subject: selftests/damon: add _damon_sysfs.py to TEST_FILES
Date: Wed, 27 Nov 2024 12:08:53 +0000
When running selftests I encountered the following error message with
some damon tests:
# Traceback (most recent call last):
# File "[...]/damon/./damos_quota.py", line 7, in <module>
# import _damon_sysfs
# ModuleNotFoundError: No module named '_damon_sysfs'
Fix this by adding the _damon_sysfs.py file to TEST_FILES so that it
will be available when running the respective damon selftests.
Link: https://lkml.kernel.org/r/20241127-picks-visitor-7416685b-mheyne@amazon.de
Fixes: 306abb63a8ca ("selftests/damon: implement a python module for test-purpose DAMON sysfs controls")
Signed-off-by: Maximilian Heyne <mheyne(a)amazon.de>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/damon/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/damon/Makefile~selftests-damon-add-_damon_sysfspy-to-test_files
+++ a/tools/testing/selftests/damon/Makefile
@@ -6,7 +6,7 @@ TEST_GEN_FILES += debugfs_target_ids_rea
TEST_GEN_FILES += debugfs_target_ids_pid_leak
TEST_GEN_FILES += access_memory access_memory_even
-TEST_FILES = _chk_dependency.sh _debugfs_common.sh
+TEST_FILES = _chk_dependency.sh _debugfs_common.sh _damon_sysfs.py
# functionality tests
TEST_PROGS = debugfs_attrs.sh debugfs_schemes.sh debugfs_target_ids.sh
_
Patches currently in -mm which might be from mheyne(a)amazon.de are
selftests-damon-add-_damon_sysfspy-to-test_files.patch
The patch titled
Subject: selftest: hugetlb_dio: fix test naming
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
selftest-hugetlb_dio-fix-test-naming.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Mark Brown <broonie(a)kernel.org>
Subject: selftest: hugetlb_dio: fix test naming
Date: Wed, 27 Nov 2024 16:14:22 +0000
The string logged when a test passes or fails is used by the selftest
framework to identify which test is being reported.  The hugetlb_dio test
not only uses the same strings for every test that is run, it also uses
different strings for passes and failures, which means that test
automation is unable to follow what the test is doing at all.
Pull the existing duplicated logging of the number of free huge pages
before and after the test out of the conditional, and replace it and the
logging of the result with a single ksft_test_result() which incorporates
the parameters passed into the test into the output.
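For reference, the kselftest harness keys on one uniquely named result line
per test case; a sketch of the convention (run_case() and the plan count
are hypothetical, not from the patch):

        int i;

        ksft_print_header();
        ksft_set_plan(4);
        for (i = 0; i < 4; i++)
                /* one stable, distinct name per case so automation can track it */
                ksft_test_result(run_case(i) == 0, "hugetlb_dio case %d\n", i);
        ksft_finished();
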
Link: https://lkml.kernel.org/r/20241127-kselftest-mm-hugetlb-dio-names-v1-1-22aa…
Fixes: fae1980347bf ("selftests: hugetlb_dio: fixup check for initial conditions to skip in the start")
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: Donet Tom <donettom(a)linux.ibm.com>
Cc: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Cc: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/hugetlb_dio.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
--- a/tools/testing/selftests/mm/hugetlb_dio.c~selftest-hugetlb_dio-fix-test-naming
+++ a/tools/testing/selftests/mm/hugetlb_dio.c
@@ -76,19 +76,15 @@ void run_dio_using_hugetlb(unsigned int
/* Get the free huge pages after unmap*/
free_hpage_a = get_free_hugepages();
+ ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
+ ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
+
/*
* If the no. of free hugepages before allocation and after unmap does
* not match - that means there could still be a page which is pinned.
*/
- if (free_hpage_a != free_hpage_b) {
- ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
- ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
- ksft_test_result_fail(": Huge pages not freed!\n");
- } else {
- ksft_print_msg("No. Free pages before allocation : %d\n", free_hpage_b);
- ksft_print_msg("No. Free pages after munmap : %d\n", free_hpage_a);
- ksft_test_result_pass(": Huge pages freed successfully !\n");
- }
+ ksft_test_result(free_hpage_a == free_hpage_b,
+ "free huge pages from %u-%u\n", start_off, end_off);
}
int main(void)
_
Patches currently in -mm which might be from broonie(a)kernel.org are
selftest-hugetlb_dio-fix-test-naming.patch
The patch titled
Subject: arch_numa: restore nid checks before registering a memblock with a node
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
arch_numa-restore-nid-checks-before-registering-a-memblock-with-a-node.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Marc Zyngier <maz(a)kernel.org>
Subject: arch_numa: restore nid checks before registering a memblock with a node
Date: Wed, 27 Nov 2024 19:30:00 +0000
Commit 767507654c22 ("arch_numa: switch over to numa_memblks")
significantly cleaned up the NUMA registration code, but also dropped a
check that refused to configure a memblock with an invalid nid.
On "quality hardware" such as my ThunderX machine, this results
in a kernel that dies immediately:
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x431f0a10]
[ 0.000000] Linux version 6.12.0-00013-g8920d74cf8db (maz@valley-girl) (gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #3872 SMP PREEMPT Wed Nov 27 15:25:49 GMT 2024
[ 0.000000] KASLR disabled due to lack of seed
[ 0.000000] Machine model: Cavium ThunderX CN88XX board
[ 0.000000] efi: EFI v2.4 by American Megatrends
[ 0.000000] efi: ESRT=0xffce0ff18 SMBIOS 3.0=0xfffb0000 ACPI 2.0=0xffec60000 MEMRESERVE=0xffc905d98
[ 0.000000] esrt: Reserving ESRT space from 0x0000000ffce0ff18 to 0x0000000ffce0ff50.
[ 0.000000] earlycon: pl11 at MMIO 0x000087e024000000 (options '115200n8')
[ 0.000000] printk: legacy bootconsole [pl11] enabled
[ 0.000000] NODE_DATA(0) allocated [mem 0xff6754580-0xff67566bf]
[ 0.000000] Unable to handle kernel paging request at virtual address 0000000000001d40
[ 0.000000] Mem abort info:
[ 0.000000] ESR = 0x0000000096000004
[ 0.000000] EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.000000] SET = 0, FnV = 0
[ 0.000000] EA = 0, S1PTW = 0
[ 0.000000] FSC = 0x04: level 0 translation fault
[ 0.000000] Data abort info:
[ 0.000000] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 0.000000] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 0.000000] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 0.000000] [0000000000001d40] user address but active_mm is swapper
[ 0.000000] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.12.0-00013-g8920d74cf8db #3872
[ 0.000000] Hardware name: Cavium ThunderX CN88XX board (DT)
[ 0.000000] pstate: a00000c5 (NzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.000000] pc : sparse_init_nid+0x54/0x428
[ 0.000000] lr : sparse_init+0x118/0x240
[ 0.000000] sp : ffff800081da3cb0
[ 0.000000] x29: ffff800081da3cb0 x28: 0000000fedbab10c x27: 0000000000000001
[ 0.000000] x26: 0000000ffee250f8 x25: 0000000000000001 x24: ffff800082102cd0
[ 0.000000] x23: 0000000000000001 x22: 0000000000000000 x21: 00000000001fffff
[ 0.000000] x20: 0000000000000001 x19: 0000000000000000 x18: ffffffffffffffff
[ 0.000000] x17: 0000000001b00000 x16: 0000000ffd130000 x15: 0000000000000000
[ 0.000000] x14: 00000000003e0000 x13: 00000000000001c8 x12: 0000000000000014
[ 0.000000] x11: ffff800081e82860 x10: ffff8000820fb2c8 x9 : ffff8000820fb490
[ 0.000000] x8 : 0000000000ffed20 x7 : 0000000000000014 x6 : 00000000001fffff
[ 0.000000] x5 : 00000000ffffffff x4 : 0000000000000000 x3 : 0000000000000000
[ 0.000000] x2 : 0000000000000000 x1 : 0000000000000040 x0 : 0000000000000007
[ 0.000000] Call trace:
[ 0.000000] sparse_init_nid+0x54/0x428
[ 0.000000] sparse_init+0x118/0x240
[ 0.000000] bootmem_init+0x70/0x1c8
[ 0.000000] setup_arch+0x184/0x270
[ 0.000000] start_kernel+0x74/0x670
[ 0.000000] __primary_switched+0x80/0x90
[ 0.000000] Code: f865d804 d37df060 cb030000 d2800003 (b95d4084)
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
Previous kernel versions, by contrast, were able to recognise how
brain-damaged the machine is and only build a fake node.
Restoring the check brings back some sanity and a "working" system.
Link: https://lkml.kernel.org/r/20241127193000.3702637-1-maz@kernel.org
Fixes: 767507654c22 ("arch_numa: switch over to numa_memblks")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/base/arch_numa.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- a/drivers/base/arch_numa.c~arch_numa-restore-nid-checks-before-registering-a-memblock-with-a-node
+++ a/drivers/base/arch_numa.c
@@ -207,7 +207,21 @@ static void __init setup_node_data(int n
static int __init numa_register_nodes(void)
{
int nid;
+ struct memblock_region *mblk;
+ /* Check that valid nid is set to memblks */
+ for_each_mem_region(mblk) {
+ int mblk_nid = memblock_get_region_node(mblk);
+ phys_addr_t start = mblk->base;
+ phys_addr_t end = mblk->base + mblk->size - 1;
+
+ if (mblk_nid == NUMA_NO_NODE || mblk_nid >= MAX_NUMNODES) {
+ pr_warn("Warning: invalid memblk node %d [mem %pap-%pap]\n",
+ mblk_nid, &start, &end);
+ return -EINVAL;
+ }
+ }
+
/* Finally register nodes. */
for_each_node_mask(nid, numa_nodes_parsed) {
unsigned long start_pfn, end_pfn;
_
Patches currently in -mm which might be from maz(a)kernel.org are
arch_numa-restore-nid-checks-before-registering-a-memblock-with-a-node.patch
The quilt patch titled
Subject: fs/proc/kcore.c: clear ret value in read_kcore_iter after successful iov_iter_zero
has been removed from the -mm tree. Its filename was
fs-proc-kcorec-clear-ret-value-in-read_kcore_iter-after-successful-iov_iter_zero.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Jiri Olsa <jolsa(a)kernel.org>
Subject: fs/proc/kcore.c: clear ret value in read_kcore_iter after successful iov_iter_zero
Date: Fri, 22 Nov 2024 00:11:18 +0100
If iov_iter_zero() succeeds after a failed copy_from_kernel_nofault(), we
need to reset ret to zero, otherwise it will be returned as the final
return value of read_kcore_iter().
This fixes 'objdump -d' dumps over /proc/kcore for me.
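The shape of the bug, sketched generically rather than as the exact kcore
control flow (variable names and the out label are hypothetical):

        if (copy_from_kernel_nofault(buf, src, len)) {
                ret = -EFAULT;
                if (iov_iter_zero(len, iter) != len)
                        goto out;       /* zero-fill also failed: report -EFAULT */
                ret = 0;                /* zero-fill worked: clear the error, or it
                                         * becomes the read's return value */
        }
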
Link: https://lkml.kernel.org/r/20241121231118.3212000-1-jolsa@kernel.org
Fixes: 3d5854d75e31 ("fs/proc/kcore.c: allow translation of physical memory addresses")
Signed-off-by: Jiri Olsa <jolsa(a)kernel.org>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: <hca(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/kcore.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/proc/kcore.c~fs-proc-kcorec-clear-ret-value-in-read_kcore_iter-after-successful-iov_iter_zero
+++ a/fs/proc/kcore.c
@@ -600,6 +600,7 @@ static ssize_t read_kcore_iter(struct ki
ret = -EFAULT;
goto out;
}
+ ret = 0;
/*
* We know the bounce buffer is safe to copy from, so
* use _copy_to_iter() directly.
_
Patches currently in -mm which might be from jolsa(a)kernel.org are