The patch below does not apply to the 5.7-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From da97f2d56bbd880b4138916a7ef96f9881a551b2 Mon Sep 17 00:00:00 2001
From: Pavel Tatashin pasha.tatashin@soleen.com Date: Wed, 3 Jun 2020 15:59:27 -0700 Subject: [PATCH] mm: call cond_resched() from deferred_init_memmap()
Now that deferred pages are initialized with interrupts enabled we can replace touch_nmi_watchdog() with cond_resched(), as it was before 3a2d7fa8a3d5.
For now, we cannot do the same in deferred_grow_zone() as it is still initializes pages with interrupts disabled.
This change fixes RCU problem described in https://lkml.kernel.org/r/20200401104156.11564-2-david@redhat.com
[ 60.474005] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 60.475000] rcu: 1-...0: (0 ticks this GP) idle=02a/1/0x4000000000000000 softirq=1/1 fqs=15000 [ 60.475000] rcu: (detected by 0, t=60002 jiffies, g=-1199, q=1) [ 60.475000] Sending NMI from CPU 0 to CPUs 1: [ 1.760091] NMI backtrace for cpu 1 [ 1.760091] CPU: 1 PID: 20 Comm: pgdatinit0 Not tainted 4.18.0-147.9.1.el8_1.x86_64 #1 [ 1.760091] Hardware name: Red Hat KVM, BIOS 1.13.0-1.module+el8.2.0+5520+4e5817f3 04/01/2014 [ 1.760091] RIP: 0010:__init_single_page.isra.65+0x10/0x4f [ 1.760091] Code: 48 83 cf 63 48 89 f8 0f 1f 40 00 48 89 c6 48 89 d7 e8 6b 18 80 ff 66 90 5b c3 31 c0 b9 10 00 00 00 49 89 f8 48 c1 e6 33 f3 ab <b8> 07 00 00 00 48 c1 e2 36 41 c7 40 34 01 00 00 00 48 c1 e0 33 41 [ 1.760091] RSP: 0000:ffffba783123be40 EFLAGS: 00000006 [ 1.760091] RAX: 0000000000000000 RBX: fffffad34405e300 RCX: 0000000000000000 [ 1.760091] RDX: 0000000000000000 RSI: 0010000000000000 RDI: fffffad34405e340 [ 1.760091] RBP: 0000000033f3177e R08: fffffad34405e300 R09: 0000000000000002 [ 1.760091] R10: 000000000000002b R11: ffff98afb691a500 R12: 0000000000000002 [ 1.760091] R13: 0000000000000000 R14: 000000003f03ea00 R15: 000000003e10178c [ 1.760091] FS: 0000000000000000(0000) GS:ffff9c9ebeb00000(0000) knlGS:0000000000000000 [ 1.760091] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.760091] CR2: 00000000ffffffff CR3: 000000a1cf20a001 CR4: 00000000003606e0 [ 1.760091] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.760091] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.760091] Call Trace: [ 1.760091] deferred_init_pages+0x8f/0xbf [ 1.760091] deferred_init_memmap+0x184/0x29d [ 1.760091] ? deferred_free_pages.isra.97+0xba/0xba [ 1.760091] kthread+0x112/0x130 [ 1.760091] ? kthread_flush_work_fn+0x10/0x10 [ 1.760091] ret_from_fork+0x35/0x40 [ 89.123011] node 0 initialised, 1055935372 pages in 88650ms
Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages") Reported-by: Yiqian Wei yiwei@redhat.com Signed-off-by: Pavel Tatashin pasha.tatashin@soleen.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Tested-by: David Hildenbrand david@redhat.com Reviewed-by: Daniel Jordan daniel.m.jordan@oracle.com Reviewed-by: David Hildenbrand david@redhat.com Reviewed-by: Pankaj Gupta pankaj.gupta.linux@gmail.com Acked-by: Michal Hocko mhocko@suse.com Cc: Dan Williams dan.j.williams@intel.com Cc: James Morris jmorris@namei.org Cc: Kirill Tkhai ktkhai@virtuozzo.com Cc: Sasha Levin sashal@kernel.org Cc: Shile Zhang shile.zhang@linux.alibaba.com Cc: Vlastimil Babka vbabka@suse.cz Cc: stable@vger.kernel.org [4.17+] Link: http://lkml.kernel.org/r/20200403140952.17177-4-pasha.tatashin@soleen.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c75561a3f144..2dbc571f68de 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1870,7 +1870,7 @@ static int __init deferred_init_memmap(void *data) */ while (spfn < epfn) { nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn); - touch_nmi_watchdog(); + cond_resched(); } zone_empty: /* Sanity check that the next zone really is unpopulated */
On Thu, Jun 18, 2020 at 05:55:43PM +0200, gregkh@linuxfoundation.org wrote:
The patch below does not apply to the 5.7-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
Oops, I applied things out of order, I've queued this and the 5.4 version up, but 4.19 doesn't apply as the dependant patch does not apply.
thanks
greg k-h
On Thu, Jun 18, 2020 at 06:26:49PM +0200, Greg KH wrote:
On Thu, Jun 18, 2020 at 05:55:43PM +0200, gregkh@linuxfoundation.org wrote:
The patch below does not apply to the 5.7-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
Oops, I applied things out of order, I've queued this and the 5.4 version up, but 4.19 doesn't apply as the dependant patch does not apply.
Something like this should work?
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 7181dfe76440..b7905a075e79 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1611,11 +1611,13 @@ static int __init deferred_init_memmap(void *data) spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa)); epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa)); nr_pages += deferred_init_pages(nid, zid, spfn, epfn); + cond_resched(); } for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) { spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa)); epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa)); deferred_free_pages(nid, zid, spfn, epfn); + cond_resched(); }
/* Sanity check that the next zone really is unpopulated */
On Thu 18-06-20 22:28:22, Sasha Levin wrote:
On Thu, Jun 18, 2020 at 06:26:49PM +0200, Greg KH wrote:
On Thu, Jun 18, 2020 at 05:55:43PM +0200, gregkh@linuxfoundation.org wrote:
The patch below does not apply to the 5.7-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
Oops, I applied things out of order, I've queued this and the 5.4 version up, but 4.19 doesn't apply as the dependant patch does not apply.
Something like this should work?
Nope. Unless I am misreading the old code you are udner pgdat_resize_lock. Or is there any other change queued before this backport to remove the lock? (I didn't get to check more closely but it would be 3d060856adfc5 IIRC).
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 7181dfe76440..b7905a075e79 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1611,11 +1611,13 @@ static int __init deferred_init_memmap(void *data) spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa)); epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa)); nr_pages += deferred_init_pages(nid, zid, spfn, epfn);
cond_resched();
} for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) { spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa)); epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa)); deferred_free_pages(nid, zid, spfn, epfn);
cond_resched();
}
/* Sanity check that the next zone really is unpopulated */
-- Thanks, Sasha
On 19.06.20 11:21, Michal Hocko wrote:
On Thu 18-06-20 22:28:22, Sasha Levin wrote:
On Thu, Jun 18, 2020 at 06:26:49PM +0200, Greg KH wrote:
On Thu, Jun 18, 2020 at 05:55:43PM +0200, gregkh@linuxfoundation.org wrote:
The patch below does not apply to the 5.7-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
Oops, I applied things out of order, I've queued this and the 5.4 version up, but 4.19 doesn't apply as the dependant patch does not apply.
Something like this should work?
Nope. Unless I am misreading the old code you are udner pgdat_resize_lock. Or is there any other change queued before this backport to remove the lock? (I didn't get to check more closely but it would be 3d060856adfc5 IIRC).
I recently did a similar backport. For pre-5.2, the following commits might be required to backport cleanly
56ec43d8b027 mm: drop meminit_pfn_in_nid as it is redundant 837566e7e08e mm: implement new zone specific memblock iterator 0e56acae4b4d mm: initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections b9705d8778e7 mm/page_alloc.c: fix regression with deferred struct page init 117003c32771 mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init 3d060856adfc mm: initialize deferred pages with interrupts enabled da97f2d56bbd mm: call cond_resched() from deferred_init_memmap()
did not verify, though, if anything else is missing (and which commits can be bypassed with less trouble).
I will send the backport. I think only three patches are needed that were in my original series:
117003c32771 mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init 3d060856adfc mm: initialize deferred pages with interrupts enabled da97f2d56bbd mm: call cond_resched() from deferred_init_memmap()
Pasha
On Fri, Jun 19, 2020 at 5:44 AM David Hildenbrand david@redhat.com wrote:
On 19.06.20 11:21, Michal Hocko wrote:
On Thu 18-06-20 22:28:22, Sasha Levin wrote:
On Thu, Jun 18, 2020 at 06:26:49PM +0200, Greg KH wrote:
On Thu, Jun 18, 2020 at 05:55:43PM +0200, gregkh@linuxfoundation.org wrote:
The patch below does not apply to the 5.7-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
Oops, I applied things out of order, I've queued this and the 5.4 version up, but 4.19 doesn't apply as the dependant patch does not apply.
Something like this should work?
Nope. Unless I am misreading the old code you are udner pgdat_resize_lock. Or is there any other change queued before this backport to remove the lock? (I didn't get to check more closely but it would be 3d060856adfc5 IIRC).
I recently did a similar backport. For pre-5.2, the following commits might be required to backport cleanly
56ec43d8b027 mm: drop meminit_pfn_in_nid as it is redundant 837566e7e08e mm: implement new zone specific memblock iterator 0e56acae4b4d mm: initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections b9705d8778e7 mm/page_alloc.c: fix regression with deferred struct page init 117003c32771 mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init 3d060856adfc mm: initialize deferred pages with interrupts enabled da97f2d56bbd mm: call cond_resched() from deferred_init_memmap()
did not verify, though, if anything else is missing (and which commits can be bypassed with less trouble).
-- Thanks,
David / dhildenb
On 19.06.20 14:15, Pavel Tatashin wrote:
I will send the backport. I think only three patches are needed that were in my original series:
117003c32771 mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init 3d060856adfc mm: initialize deferred pages with interrupts enabled da97f2d56bbd mm: call cond_resched() from deferred_init_memmap()
Just to note, in contrast to the subject, this is about the 4.19 backport IIRC. Is it really only these patches?
117003c32771 mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init 3d060856adfc mm: initialize deferred pages with interrupts enabled da97f2d56bbd mm: call cond_resched() from deferred_init_memmap()
Just to note, in contrast to the subject, this is about the 4.19 backport IIRC. Is it really only these patches?
Ah, I did not realize it was 4.19 backport.
Yeah, it might be more, I have not checked it.
Thanks, Pasha
On Fri, Jun 19, 2020 at 8:43 AM Pavel Tatashin pasha.tatashin@soleen.com wrote:
117003c32771 mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init 3d060856adfc mm: initialize deferred pages with interrupts enabled da97f2d56bbd mm: call cond_resched() from deferred_init_memmap()
Just to note, in contrast to the subject, this is about the 4.19 backport IIRC. Is it really only these patches?
Ah, I did not realize it was 4.19 backport.
Yeah, it might be more, I have not checked it.
5.7 backport: https://lore.kernel.org/stable/20200619122555.372957-1-pasha.tatashin@soleen... 4.19 backport: https://lore.kernel.org/stable/20200619132425.425063-1-pasha.tatashin@soleen...
Thanks, Pasha
linux-stable-mirror@lists.linaro.org